Property-Based Testing
Note:
Important notes are in blue.
Learning outcomes
In this assignment, you will:
- Reason about correctness for a programming problem with multiple valid solutions.
- Implement an algorithm based on an informal, English-language specification.
- Create an automatic test case generator that uses property-based testing to check whether implementations of the algorithm meet the specification.
This is a language-agnostic programming assignment.
Programming language; machine
Programming language
You may use any programming language to complete this assignment, with instructor permission. Permission is granted by default for two programming languages: Python and Rust, for the following reasons:
- Python: Everyone registered for this class has some experience with Python, so this is likely simplest way for many students to complete this assignment.
- Rust: Rust is a (relatively) new, modern language for safer systems programming. The implementation assignments in this class, with the associated Workshop session, provide a chance for you to pick up a new programming language if you would like a bit more of a challenge.
Whatever language you choose, your submission must conform the input-output specification (described in the next section).
The January 31 Workshop session is dedicated to launching this assignment.
Machine
You can complete this assignment on any machine with a UNIX-based file system (your own Mac OS or Linux machine or the department's CS Linux server). If you have a Windows machine, you can still develop locally, but should test your code on the CS Linux server (e.g., using the VSCode remote server extension; post on Zulip if you need help with this).
Assignment description
At a high level, this assignment will have you
- Implement a useful algorithm, then
- Write your own property-based testing oracle to check the correctness of your implementation.
In grading your assignment, I will check your oracle against known-incorrect implementations of the algorithm!
The problem
You are a software engineer at an organization that produces many related software packages. Your team wants you to take a look at implementing an automated dependency management system.
Each software package your team produces has some number of dependencies. If package X
has dependency Y
, that means your system should install package Y
before it can install package Y
. For example, package Y
could be the numpy
package for manipulating data, and package X
could be a scientific computing script.
Your task is to determine, given a list of dependencies, what order to install packages in. ( This is a problem encountered by real package management software, such a pip
for Python, npm
for JavaScript, and Cargo for Rust).
For example, you might be given the following list. You should read each line P Q
as meaning "P
is a dependency of Q
".
B A
B C
A C
D A
For this example, a correct output would be:
B
D
A
C
Meaning: install package B
first, then D
, then A
, then C
.
Is this the only correct output? (Think, then click!)
No! For example, B
and D
have no dependence relationship, so we could be equally correct with the installation order of:
D
B
A
C
In Workshop, we will first stop here and practice coming up with an algorithmic solution to this informal, English-language specification. This is a problem that definitely could come up in a software engineering technical interviews! (It came up at least twice in technical interviews Alexa has had).
Several groups considered the quesiton of cycles in the list of dependencies. In something like an interview context, this is a great question to ask your interviewer! It shows you recognize it as a problem and that you know your solution will be different depending on the answer.
The algorithm: topological sort
The problem from the previous section resolves to a classic graph theory problem. A valid list of dependencies should form a simple, directed acyclic graph, where packages are vertices and dependencies are directed edges (going from the package that must be installed first to the package that must be installed after). Let our graph be . Finding a valid order of installation is equivalent to finding a linear ordering of vertices such that for all edges in , precedes in the ordering. This is known as a topological sort, or topsort for short.
Another example of topological sort is determining the order of spreadsheets formula cell evaluation. You have almost certainly done your own informal topological sort to determine what order to take your Wellesley Computer Science classes, given the list of prerequisites!
Depending on the graph, many valid topological sorts may exist for each of these problems.
Your tasks: topsort
, oracle
, and README
You will write two programs for this assignment:
Part 1: A topological sorting program (named topsort
).
Part 2: An oracle for property-based testing of any topsort implementation (named oracle
).
In addition, good software engineering practice is to document your software. I expect your code to be well-commented, and part of your grade will be based on design and style.
Most public software engineering projects have a top-level file called README.txt
or README.md
. The latter filetype is Markdown, a very useful formatting style used by many open-source software projects (e.g., most projects on Github) as well as the CS340 lecture notes and assignments (the pretty colored boxes are from a particular variant of Markdown popularized by Github).
Your submission should also include a README.txt
or README.md
that describes:
- What language you chose to implement the project in, and what (if any) command line instruction you use to generate your executables.
- What machine you tested your code on (just "MacOS" or "CS Linux" suffices).
- Any known bugs or issues with your implementation.
- A list of other CS340 students you collaborated with (remembering to follow the guidelines in the syllabus).
- A list of external resources/websites you used (again, remembering to follow the syllabus guidelines).
I encourage you to use a private Github repository for this assignment, (1) to practice using Git, which is a useful systems-engineering skill, and (2) to make it easier to recover from mistakes. Refer to the CS240 resources if you need a refresher. Using Git is not required (and be sure to not use a public repository, since everyone in the class is completing the same assignment).
Part 1: Algorithm implementation, topsort
I suggest that you implement the topological sorting program using the following pseudocode:
function topsort(g):
L = empty list to store ordering
S = set of vertices with no incoming edges
while S is not empty:
u = vertex removed from S
append u to L
for each vertex v where edge e = (u,v) in g:
remove e from g
if v has no other incoming edges in g:
insert v in S
return L
Since your program will ultimately just read and write number from standard in and out, much of your implementation is up to you. You use any appropriate data structures in your chosen language to represent the graph and the queue (for example, just Python list
or dict
may suffice, depending on your design).
I just ask that you not use any external/library implementation of topological sort (or any other sort).
Your implementation of topsort
should be a standalone executable. More on that in the Input-output specification section.
For the purposes of this assignment, your topsort
can assume that the input DAG is well-formed. That is, you can assume that the graph does not have cycles!
Part 1: In short, part 1 of this assignment is to create a topological sorting executable (named topsort
) that reads in directed acyclic graphs from standard input and writes a single valid topological sort to standard output.
Part 2: Property-based testing, oracle
For the second part of this assignment, you will build an automated testing oracle for topological sorting programs based on the concepts of property-based testing that we covered.
Your oracle
program will consume the file system path to a topological sorting program and the number of inputs to generate, n
.
Your oracle
should generate n
valid input graphs and write each of them to the provided program's standard input. Your oracle will check that the output provided by the topsort executable is in fact a topological sort of the corresponding input. Note that your oracle
should call the topsort
implementation n
times.
Note: implementing your oracle
to meet this specification will require running a subprocess and piping to/from the subprocesses standard input and output. These are useful systems-engineering skills. We'll talk through this together in Workshop.
Note that this is one component that you can look up via online tutorials. That is, searching and adapting code for subprocess management with attribution (that is, including the link in your README
) is encouraged, as long as you abide by the license of the source. Searching and adapting an existing Python topological sort implementation is not allowed.
If the provided topsort
produces a valid topological sort for all n
graphs that you have generated, then your program should exit with status 0
. Otherwise, your program should exit with status 1
. The oracle should not write anything to standard output or standard error.
For the purposes of this assignment, your oracle
should assume that n
is valid (a non-negative integer).
In a real system, we would expect to write checks to validate this, but that is not the focus in this one-week assignment.
You may be worried about what your oracle
would do about a deviously-slow topsort
implementation. For this assignment, I will not test against any very long-running topsort
implementations.
In general, it would be impossible to write an oracle to get around the "long running" problem.
Why? (Think, then click!)
Testing (in finite time) whether or not a program terminates is impossible in general (see the Halting problem)!
Most real-world testing systems work around this by setting a reasonable threshold (e.g., 5 minutes). If the tested program does not terminate within the threshold, the program is considered to be non-terminating and the testing software kills the offending process. However, I don't expect you to implement this in a week-long assignment.
Part 1: In short, part 2 of this assignment is to create a property-based testing system (named oracle
) that reads the path to a topsort
oracle and a number of inputs n
, generates n
input directed acyclic graphs, and exits with a status depending on whether the topsort
is correct with respect to those n
inputs.
System topological sort implementations model graphs in a simple text format, where each line has two space-delimitated values that represent a single directed edge.
For example, consider the following directed acyclic graph:
The expected text format of this graph is:
7 11
7 8
3 10
5 11
3 8
8 9
11 10
11 2
11 9
where each line represents a directed edge with each vertex name, an alphanumeric, case-sensitive string of any length, separated by a space (in the example of 7 11
, vertex 7
comes before 11
). Note that the order of lines does not matter.
Remember, a valid input should not have cycles!
The expected output of a topological sort in this text-based representation is in this form:
3
5
7
11
8
9
10
2
where each line contains a vertex name in topological order from top to bottom.
Remember that for valid topological sorts, your oracle
should exit with status 0
, and nothing should be printed to standard out or standard error. That is, your oracle
should write graphs in this format to the topsort
input and read lists of vertices from the topsort
program in this format, but the oracle
itself should not print anything to the command line.
Testing your testing
You'll want to perform multiple checks of both of your executables.
Testing your oracle
against the system tsort
First, you can test your oracle
by using a known-correct (although not formally proven correct!) implementation of topological sort.
Most UNIX-based systems also come with a command-line topological sorting utility installed (and this assignment matches the utility's input-outport format). To check if your system has this installed, run which tsort
(note the different spelling). On CS Linux, this is installed at /usr/bin/tsort
.
Test against the system tsort
with:
./oracle /usr/bin/tsort n
where n
is the number of inputs you want to generate. You can experiment to see how big of a n
you can test (though please do not run any program for more than a few minutes on CS Linux).
Your oracle should exit with status 0
on the system tsort
.
You can use the following terminal command to show the return status of the command run previously.
echo $?
Testing your topsort
with your oracle
You also (hopefully!) have your own correct topsort
implementation from part 1.
Using your own oracle
, check that your own implementation of topsort
is correct. Choose your own values for n
.
./oracle ./topsort n
Testing your own BROKEN topsort
s with your oracle
To ensure your oracle detects incorrect topological sorting programs, try running it on deliberately broken implementations of your creation! You can do this by copying your implementation to a different files with different names, and passing those paths instead.
Using your own oracle
, check that your broken variants of topsort
are found to be faulty by your oracle. Choose your own values for n
.
./oracle ./bad-topsort-1 n
Your oracle should exit with status 1
on these broken programs. Detecting incorrect implementations is arguably more important than detecting correct implementations!
Grading
I'll assess your topsort
and oracle
programs by running them against different implementations. I'll also assess your code and documentation.
This assignment is worth 100 points, broken down by:
- 20:
topsort
implementation
- 30:
oracle
: generating input DAGs
- 30:
oracle
: testing the properties of output lists
- 20:
README
, design, and style
Submission
Submit your code and README
file via Gradescope.
Your submission should include:
- All of the source code you use to generate your two executables.
topsort
and oracle
: the executable files themselves (in Python, this is the same as #1, in Rust or other languages these may be distinct files):
README
see earlier Tasks section for what to include.
If you're hungry for more, try out these ideas (which could get up to +5 extra credit). If you do either of these, please (1) implement them in a different copy of your file, and (2) document this in your README
.
- Adapt your
topsort
implementation to detect cyclic input.
- Adapt your
oracle
implementation to terminate topsort
processes that run for longer than 2 minutes.
Real-world context
Out in the wild, many software engineering projects use some form of fuzzing or property-based testing. On Thursday February 1, we'll continue our discussion of this in class.
In particular, there are many libraries that make it easier to generate randomized tests, such as QuickCheck for Haskell (one of the projects that popularized this approach!), Hypothesis for Python, and PropTest or Bolero for Rust.
Software engineers on existing projects often use these libraries rather than implementing their own oracle from scratch like we did here. Implementing from scratch is useful for understanding how the concepts work, but I encourage you to consider these libraries for future implementation projects both inside and outside of CS340.
Attribution
This assignment is based on a previous version of the Oracle assignment from Tim Nelson's Logic for Systems at Brown University.
Property-Based Testing
Note:
Important notes are in blue.
Tasks are in green.
Learning outcomes
In this assignment, you will:
This is a language-agnostic programming assignment.
Programming language; machine
Programming language
You may use any programming language to complete this assignment, with instructor permission. Permission is granted by default for two programming languages: Python and Rust, for the following reasons:
Whatever language you choose, your submission must conform the input-output specification (described in the next section).
The January 31 Workshop session is dedicated to launching this assignment.
Machine
You can complete this assignment on any machine with a UNIX-based file system (your own Mac OS or Linux machine or the department's CS Linux server). If you have a Windows machine, you can still develop locally, but should test your code on the CS Linux server (e.g., using the VSCode remote server extension; post on Zulip if you need help with this).
Assignment description
At a high level, this assignment will have you
In grading your assignment, I will check your oracle against known-incorrect implementations of the algorithm!
The problem
You are a software engineer at an organization that produces many related software packages. Your team wants you to take a look at implementing an automated dependency management system.
Each software package your team produces has some number of dependencies. If package
X
has dependencyY
, that means your system should install packageY
before it can install packageY
. For example, packageY
could be thenumpy
package for manipulating data, and packageX
could be a scientific computing script.Your task is to determine, given a list of dependencies, what order to install packages in. ( This is a problem encountered by real package management software, such a
pip
for Python,npm
for JavaScript, and Cargo for Rust).For example, you might be given the following list. You should read each line
P Q
as meaning "P
is a dependency ofQ
".For this example, a correct output would be:
Meaning: install package
B
first, thenD
, thenA
, thenC
.Is this the only correct output? (Think, then click!)
No! For example,
B
andD
have no dependence relationship, so we could be equally correct with the installation order of:In Workshop, we will first stop here and practice coming up with an algorithmic solution to this informal, English-language specification. This is a problem that definitely could come up in a software engineering technical interviews! (It came up at least twice in technical interviews Alexa has had).
Several groups considered the quesiton of cycles in the list of dependencies. In something like an interview context, this is a great question to ask your interviewer! It shows you recognize it as a problem and that you know your solution will be different depending on the answer.
The algorithm: topological sort
The problem from the previous section resolves to a classic graph theory problem. A valid list of dependencies should form a simple, directed acyclic graph, where packages are vertices and dependencies are directed edges (going from the package that must be installed first to the package that must be installed after). Let our graph be . Finding a valid order of installation is equivalent to finding a linear ordering of vertices such that for all edges in , precedes in the ordering. This is known as a topological sort, or topsort for short.
Another example of topological sort is determining the order of spreadsheets formula cell evaluation. You have almost certainly done your own informal topological sort to determine what order to take your Wellesley Computer Science classes, given the list of prerequisites!
Depending on the graph, many valid topological sorts may exist for each of these problems.
Your tasks:
topsort
,oracle
, andREADME
You will write two programs for this assignment:
Part 1: A topological sorting program (named
topsort
).Part 2: An oracle for property-based testing of any topsort implementation (named
oracle
).In addition, good software engineering practice is to document your software. I expect your code to be well-commented, and part of your grade will be based on design and style.
Most public software engineering projects have a top-level file called
README.txt
orREADME.md
. The latter filetype is Markdown, a very useful formatting style used by many open-source software projects (e.g., most projects on Github) as well as the CS340 lecture notes and assignments (the pretty colored boxes are from a particular variant of Markdown popularized by Github).Your submission should also include a
README.txt
orREADME.md
that describes:I encourage you to use a private Github repository for this assignment, (1) to practice using Git, which is a useful systems-engineering skill, and (2) to make it easier to recover from mistakes. Refer to the CS240 resources if you need a refresher. Using Git is not required (and be sure to not use a public repository, since everyone in the class is completing the same assignment).
Part 1: Algorithm implementation,
topsort
I suggest that you implement the topological sorting program using the following pseudocode:
Since your program will ultimately just read and write number from standard in and out, much of your implementation is up to you. You use any appropriate data structures in your chosen language to represent the graph and the queue (for example, just Python
list
ordict
may suffice, depending on your design).I just ask that you not use any external/library implementation of topological sort (or any other sort).
Your implementation of
topsort
should be a standalone executable. More on that in the Input-output specification section.For the purposes of this assignment, your
topsort
can assume that the input DAG is well-formed. That is, you can assume that the graph does not have cycles!Part 1: In short, part 1 of this assignment is to create a topological sorting executable (named
topsort
) that reads in directed acyclic graphs from standard input and writes a single valid topological sort to standard output.Part 2: Property-based testing,
oracle
For the second part of this assignment, you will build an automated testing oracle for topological sorting programs based on the concepts of property-based testing that we covered.
Your
oracle
program will consume the file system path to a topological sorting program and the number of inputs to generate,n
.Your
oracle
should generaten
valid input graphs and write each of them to the provided program's standard input. Your oracle will check that the output provided by the topsort executable is in fact a topological sort of the corresponding input. Note that youroracle
should call thetopsort
implementationn
times.Note: implementing your
oracle
to meet this specification will require running a subprocess and piping to/from the subprocesses standard input and output. These are useful systems-engineering skills. We'll talk through this together in Workshop.Note that this is one component that you can look up via online tutorials. That is, searching and adapting code for subprocess management with attribution (that is, including the link in your
README
) is encouraged, as long as you abide by the license of the source. Searching and adapting an existing Python topological sort implementation is not allowed.If the provided
topsort
produces a valid topological sort for alln
graphs that you have generated, then your program should exit with status0
. Otherwise, your program should exit with status1
. The oracle should not write anything to standard output or standard error.For the purposes of this assignment, your
oracle
should assume thatn
is valid (a non-negative integer).In a real system, we would expect to write checks to validate this, but that is not the focus in this one-week assignment.
You may be worried about what your
oracle
would do about a deviously-slowtopsort
implementation. For this assignment, I will not test against any very long-runningtopsort
implementations.In general, it would be impossible to write an oracle to get around the "long running" problem.
Why? (Think, then click!)
Testing (in finite time) whether or not a program terminates is impossible in general (see the Halting problem)!
Most real-world testing systems work around this by setting a reasonable threshold (e.g., 5 minutes). If the tested program does not terminate within the threshold, the program is considered to be non-terminating and the testing software kills the offending process. However, I don't expect you to implement this in a week-long assignment.
Part 1: In short, part 2 of this assignment is to create a property-based testing system (named
oracle
) that reads the path to atopsort
oracle and a number of inputsn
, generatesn
input directed acyclic graphs, and exits with a status depending on whether thetopsort
is correct with respect to thosen
inputs.Input-output specification
System topological sort implementations model graphs in a simple text format, where each line has two space-delimitated values that represent a single directed edge.
For example, consider the following directed acyclic graph:
The expected text format of this graph is:
where each line represents a directed edge with each vertex name, an alphanumeric, case-sensitive string of any length, separated by a space (in the example of
7 11
, vertex7
comes before11
). Note that the order of lines does not matter.Remember, a valid input should not have cycles!
The expected output of a topological sort in this text-based representation is in this form:
where each line contains a vertex name in topological order from top to bottom.
Remember that for valid topological sorts, your
oracle
should exit with status0
, and nothing should be printed to standard out or standard error. That is, youroracle
should write graphs in this format to thetopsort
input and read lists of vertices from thetopsort
program in this format, but theoracle
itself should not print anything to the command line.Testing your testing
You'll want to perform multiple checks of both of your executables.
Testing your
oracle
against the systemtsort
First, you can test your
oracle
by using a known-correct (although not formally proven correct!) implementation of topological sort.Most UNIX-based systems also come with a command-line topological sorting utility installed (and this assignment matches the utility's input-outport format). To check if your system has this installed, run
which tsort
(note the different spelling). On CS Linux, this is installed at/usr/bin/tsort
.Test against the system
tsort
with:where
n
is the number of inputs you want to generate. You can experiment to see how big of an
you can test (though please do not run any program for more than a few minutes on CS Linux).Your oracle should exit with status
0
on the systemtsort
.You can use the following terminal command to show the return status of the command run previously.
echo $?
Testing your
topsort
with youroracle
You also (hopefully!) have your own correct
topsort
implementation from part 1.Using your own
oracle
, check that your own implementation oftopsort
is correct. Choose your own values forn
.Testing your own BROKEN
topsort
s with youroracle
To ensure your oracle detects incorrect topological sorting programs, try running it on deliberately broken implementations of your creation! You can do this by copying your implementation to a different files with different names, and passing those paths instead.
Using your own
oracle
, check that your broken variants oftopsort
are found to be faulty by your oracle. Choose your own values forn
.Your oracle should exit with status
1
on these broken programs. Detecting incorrect implementations is arguably more important than detecting correct implementations!Grading
I'll assess your
topsort
andoracle
programs by running them against different implementations. I'll also assess your code and documentation.This assignment is worth 100 points, broken down by:
topsort
implementationoracle
: generating input DAGsoracle
: testing the properties of output listsREADME
, design, and styleSubmission
Submit your code and
README
file via Gradescope.Your submission should include:
topsort
andoracle
: the executable files themselves (in Python, this is the same as #1, in Rust or other languages these may be distinct files):README
see earlier Tasks section for what to include.Extra fun
If you're hungry for more, try out these ideas (which could get up to +5 extra credit). If you do either of these, please (1) implement them in a different copy of your file, and (2) document this in your
README
.topsort
implementation to detect cyclic input.oracle
implementation to terminatetopsort
processes that run for longer than 2 minutes.Real-world context
Out in the wild, many software engineering projects use some form of fuzzing or property-based testing. On Thursday February 1, we'll continue our discussion of this in class.
In particular, there are many libraries that make it easier to generate randomized tests, such as QuickCheck for Haskell (one of the projects that popularized this approach!), Hypothesis for Python, and PropTest or Bolero for Rust.
Software engineers on existing projects often use these libraries rather than implementing their own oracle from scratch like we did here. Implementing from scratch is useful for understanding how the concepts work, but I encourage you to consider these libraries for future implementation projects both inside and outside of CS340.
Attribution
This assignment is based on a previous version of the Oracle assignment from Tim Nelson's Logic for Systems at Brown University.