Property-Based Testing

Note:

Important notes are in blue.

Tasks are in green.

Learning outcomes

In this assignment, you will:

Reason about correctness for a programming problem with multiple valid solutions.
Implement an algorithm based on an informal, English-language specification.
Create an automatic test case generator that uses property-based testing to check whether implementations of the algorithm meet the specification.

This is a language-agnostic programming assignment.

Programming language; machine

Programming language

You may use any programming language to complete this assignment, with instructor permission. Permission is granted by default for two programming languages: Python and Rust, for the following reasons:

Python: Everyone registered for this class has some experience with Python, so this is likely simplest way for many students to complete this assignment.
Rust: Rust is a (relatively) new, modern language for safer systems programming. The implementation assignments in this class, with the associated Workshop session, provide a chance for you to pick up a new programming language if you would like. Because Rust is a bit more of a challenge, completing the assignment in Rust will add up to +10 (out of 100) extra credit points on the assignment.

Whatever language you choose, your submission must conform the input-output specification (described in the next section).

Machine

You can complete this assignment on any machine with a UNIX-based file system (your own Mac OS or Linux machine or the department's CS Linux server). If you have a Windows machine, you can still develop locally, but should test your code on the CS Linux server (e.g., using the VSCode remote server extension; post on Zulip if you need help with this).

Assignment description

At a high level, this assignment will have you

Implement a useful algorithm, then
Write your own property-based testing oracle to check the correctness of your implementation.

In grading your assignment, I will check your oracle against known-incorrect implementations of the algorithm!

The problem

You are a software engineer at an organization that produces many related software packages. Your team wants you to take a look at implementing an automated dependency management system.

Each software package your team produces has some number of dependencies. If package X has dependency Y, that means your system should install package Y before it can install package Y. For example, package Y could be the numpy package for manipulating data, and package X could be a scientific computing script.

Your task is to determine, given a list of dependencies, what order to install packages in. ( This is a problem encountered by real package management software, such a pip for Python, npm for JavaScript, and Cargo for Rust).

For example, you might be given the following list. You should read each line P Q as meaning "P is a dependency of Q" (that is, P should be installed before Q).

B A 
B C 
A C 
D A

For this example, a correct output would be:

B
D
A
C

Meaning: install package B first, then D, then A, then C.

Is this the only correct output? (Think, then click!)

No! For example, B and D have no dependence relationship, so we could be equally correct with the installation order of:

D
B
A
C

In Workshop, we will first stop here and practice coming up with an algorithmic solution to this informal, English-language specification. This is a problem that definitely could come up in a software engineering technical interviews! (It came up at least twice in technical interviews Alexa has had).

The algorithm: topological sort

The problem from the previous section resolves to a classic graph theory problem. A valid list of dependencies should form a simple, directed acyclic graph, where packages are vertices and dependencies are directed edges (going from the package that must be installed first to the package that must be installed after). Let our graph be G = (V, E). Finding a valid order of installation is equivalent to finding a linear ordering of vertices such that for all edges (u, v) in E, u precedes v in the ordering. This is known as a topological sort, or topsort for short.

Another example of topological sort is determining the order of spreadsheets formula cell evaluation. You have almost certainly done your own informal topological sort to determine what order to take your Wellesley Computer Science classes, given the list of prerequisites!

Depending on the graph, many valid topological sorts may exist for each of these problems.

Your tasks: `topsort`, `oracle`, and `README`

The remaining parts of this assignment are individual (though not [Independent]). You will write two programs for this assignment:

Part 1: A topological sorting program (named topsort).

Part 2: An oracle for property-based testing of any topsort implementation (named oracle).

In addition, good software engineering practice is to document your software. I expect your code to be well-commented, and part of your grade will be based on design and style.

Most public software engineering projects have a top-level file called README.txt or README.md. The latter filetype is Markdown, a very useful formatting style used by many open-source software projects (e.g., most projects on Github) as well as the CS340 lecture notes and assignments (the pretty colored boxes are from a particular variant of Markdown popularized by Github).

Your submission should also include a README.txt or README.md that describes:

What language you chose to implement the project in, and what (if any) command line instruction you use to generate your executables.
What machine you tested your code on (just "MacOS" or "CS Linux" suffices).
Any known bugs or issues with your implementation.
A list of other CS340 students you collaborated with (remembering to follow the guidelines in the syllabus).
A list of external resources/websites you used (again, remembering to follow the syllabus guidelines).
How long this assignment took you.

I encourage you to use a private Github repository for this assignment, (1) to practice using Git, which is a useful systems-engineering skill, and (2) to make it easier to recover from mistakes. Refer to the CS240 resources if you need a refresher. Using Git is not required (and be sure to not use a public repository, since everyone in the class is completing the same assignment).

Part 1: Algorithm implementation, `topsort`

I suggest that you implement the topological sorting program using the following pseudocode:











	function topsort(g): // g = directed acyclic graph
	    L = empty list to store ordering
	    S = set of vertices with no incoming edges
	    while S is not empty:
	        u = vertex removed from S
	        append u to L
	        for each vertex v where edge e = (u,v) in g:
	            remove e from g
	            if v has no other incoming edges in g:
	                insert v in S
	    return L

Since your program will ultimately just read and write number from standard in and out, much of your implementation is up to you. You use any appropriate data structures in your chosen language to represent the graph and the queue (for example, just Python list or dict may suffice, depending on your design).

I just ask that you not use any external/library implementation of topological sort (or any other sort).

Your implementation of topsort should be a standalone executable. More on that in the Input-output specification section.

For the purposes of this assignment, your topsort can assume that the input DAG is well-formed. That is, you can assume that the graph does not have cycles!

Part 1: In short, part 1 of this assignment is to create a topological sorting executable (named topsort) that reads in directed acyclic graphs from standard input and writes a single valid topological sort to standard output.

Part 2: Property-based testing, `oracle`

For the second part of this assignment, you will build an automated testing oracle for topological sorting programs based on the concepts of property-based testing that we covered.

Your oracle program will consume the file system path to a topological sorting program and the number of inputs to generate, n.

Your oracle should generate n valid input graphs and write each of them to the provided program's standard input. Your oracle will check that the output provided by the topsort executable is in fact a topological sort of the corresponding input. Note that your oracle should call the topsort implementation n times.

Note: implementing your oracle to meet this specification will require running a subprocess and piping to/from the subprocesses standard input and output. These are useful systems-engineering skills. We'll talk through this together in Workshop.

Note that this is one component that you can look up via online tutorials. That is, searching and adapting code for subprocess management with attribution (that is, including the link in your README) is encouraged, as long as you abide by the license of the source. Searching and adapting an existing Python topological sort implementation is not allowed.

If the provided topsort produces a valid topological sort for all n graphs that you have generated, then your program should exit with status 0. Otherwise, your program should exit with status 1. The oracle should not write anything to standard output or standard error.

For the purposes of this assignment, your oracle should assume that n is valid (a non-negative integer).

In a real system, we would expect to write checks to validate this, but that is not the focus in this one-week assignment.

You may be worried about what your oracle would do about a deviously-slow topsort implementation. For this assignment, I will not test against any very long-running topsort implementations.

In general, it would be impossible to write an oracle to get around the "long running" problem.

Why? (Think, then click!)

Testing (in finite time) whether or not a program terminates is impossible in general (see the Halting problem)!

Most real-world testing systems work around this by setting a reasonable threshold (e.g., 5 minutes). If the tested program does not terminate within the threshold, the program is considered to be non-terminating and the testing software kills the offending process. However, I don't expect you to implement this in a week-long assignment.

Part 1: In short, part 2 of this assignment is to create a property-based testing system (named oracle) that reads the path to a topsort oracle and a number of inputs n, generates n input directed acyclic graphs, and exits with a status depending on whether the topsort is correct with respect to those n inputs.

Input-output specification

System topological sort implementations model graphs in a simple text format, where each line has two space-delimitated values that represent a single directed edge.

For example, consider the following directed acyclic graph:

The expected text format of this graph is:

where each line represents a directed edge with each vertex name, an alphanumeric, case-sensitive string of any length, separated by a space (in the example of 7 11, vertex 7 comes before 11). Note that the order of lines does not matter.

Remember, a valid input should not have cycles!

The expected output of a topological sort in this text-based representation is in this form:

where each line contains a vertex name in topological order from top to bottom.

Remember that for valid topological sorts, your oracle should exit with status 0, and nothing should be printed to standard out or standard error. That is, your oracle should write graphs in this format to the topsort input and read lists of vertices from the topsort program in this format, but the oracle itself should not print anything to the command line.

Testing your testing

You'll want to perform multiple checks of both of your executables.

Testing your `oracle` against the system `tsort`

First, you can test your oracle by using a known-correct (although not formally proven correct!) implementation of topological sort.

Most UNIX-based systems also come with a command-line topological sorting utility installed (and this assignment matches the utility's input-outport format). To check if your system has this installed, run which tsort (note the different spelling). On CS Linux, this is installed at /usr/bin/tsort.

Test against the system tsort with:

./oracle /usr/bin/tsort n

where n is the number of inputs you want to generate. You can experiment to see how big of a n you can test (though please do not run any program for more than a few minutes on CS Linux).

Your oracle should exit with status 0 on the system tsort.

You can use the following terminal command to show the return status of the command run previously.

echo $?

Testing your `topsort` with your `oracle`

You also (hopefully!) have your own correct topsort implementation from part 1.

Using your own oracle, check that your own implementation of topsort is correct. Choose your own values for n.

./oracle ./topsort n

Testing your own BROKEN `topsort`s with your `oracle`

To ensure your oracle detects incorrect topological sorting programs, try running it on deliberately broken implementations of your creation! You can do this by copying your implementation to a different files with different names, and passing those paths instead.

Create two different broken versions of topsort: ./bad-topsort-1 and ./bad-topsort-2. These could even be earlier version of your implementation!

Using your own oracle, check that your broken variants of topsort are found to be faulty by your oracle. Choose your own values for n.

./oracle ./bad-topsort-1 n
./oracle ./bad-topsort-2 n

Your oracle should exit with status 1 on these broken programs. Detecting incorrect implementations is arguably more important than detecting correct implementations!

Grading

I'll assess your topsort and oracle programs by running them against different implementations. I'll also assess your code and documentation.

This assignment is worth 100 points, broken down by:

20: topsort correct implementation
10: bad-topsort-1 bad-topsort-2 incorrect implementations
25: oracle: generating input DAGs
25: oracle: testing the properties of output lists
20: README, design, and style
(optional) up to +10 Rust implementation

Submission

Submit your code and README file via Gradescope.

Your submission should include:

All of the source code you use to generate your two executables.
Update for the autograder, to get automated feedback:
- If you are submitting in Ptyhon, just upload the two files topsort and oracle (be sure the name match, e.g., have no .py).
- If you are submitting in Rust, please uplaod topsort.rs and oracle.rs plus your Cargo.toml file that has both [bin] sections. Reach out to Alexa and Ashley via Zulip if you need help with this.
README see earlier Tasks section for what to include.

Extra fun

If you're hungry for more, try out these ideas (which could get up to +5 extra credit). If you do either of these, please (1) implement them in a different copy of your file, and (2) document this in your README.

Adapt your topsort implementation to detect cyclic input.
Adapt your oracle implementation to terminate topsort processes that run for longer than 2 minutes.

Real-world context

Out in the wild, many software engineering projects use some form of fuzzing or property-based testing. We'll continue our discussion of this in class.

In particular, there are many libraries that make it easier to generate randomized tests, such as QuickCheck for Haskell (one of the projects that popularized this approach!), Hypothesis for Python, and PropTest or Bolero for Rust.

Software engineers on existing projects often use these libraries rather than implementing their own oracle from scratch like we did here. Implementing from scratch is useful for understanding how the concepts work, but I encourage you to consider these libraries for future implementation projects both inside and outside of CS340.

Attribution

This assignment is based on a previous version of the Oracle assignment from Tim Nelson's Logic for Systems at Brown University.