Alloy internals, SAT solving

We know Alloy uses an underlying "SAT solver" to find satisfying instances (or tell you no instances can be found).

The problem of finding an instance is phrased in terms of atoms, sets, and relations, as shown in the analyzer's visualization when we have an instance.

In contrast, the problem of boolean satisfiability is this: given a formula of boolean inputs (a, b, c, etc) that are combined with and, or, and not, is there some assignment of true/false to each input such that the formula is true?

Somehow, Alloy needs to bridge this gap between sets and relations and pure booleans.


Consider this simple model in Alloy:

abstract sig Node {
}
one sig A, B, C extends Node {}
run {} for exactly 3 Node 

How many instances do you expect?

Think, then click!

One, since there is one of each node (this model has no actual options).

When we run the analyzer, we see:

0 vars. 0 primary vars. 9ms.

The part we will focus on today is the 0 primary vars. This corresponds to the idea that there are 0 options, or choices, the analyzer needs to consider in this model.


What if we add some more optionality to the model?

abstract sig Node {
  edges: set Node
}
run {some edges} for exactly 3 Node 

Now, how many instances do you expect?

Think, then click!

There are 9 potential pairs of nodes in edges (3 possible outgoing edges for 3 total nodes). 9 choices of things that may, or may not, be the case in the world. Nothing else. Each edge choice is a boolean, true or false. Like in CS240, we can reason over strings of booleans.

Thus, there are 2^9 = 512 potential instances. (Note: if we did not have named nodes, Alloy would do some symmetry breaking to show fewer instances; we'll discuss this later in the course.)
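We can sanity-check that count with a quick enumeration of the 9 edge booleans (a Python sketch of the counting argument, not how Alloy works internally):

```python
from itertools import product

nodes = ["A", "B", "C"]
# One boolean per potential edge: 3 sources x 3 targets = 9 choices.
edge_slots = [(src, dst) for src in nodes for dst in nodes]

# Every true/false assignment to the 9 edge booleans is a distinct instance.
instances = list(product([False, True], repeat=len(edge_slots)))
print(len(instances))  # 512
```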


If we run Alloy on this model, we'll see:

13 vars. 9 primary vars. 2 clauses. 18ms.

Again, the primary variables correspond to these atomic truths or choices in the model. In this case, these are just which edges exist in our fixed, 3-node world.

If we add another detail to our model, like forcing disj edges (meaning no two nodes will point to the same node), do we expect the number of variables to change?

abstract sig Node {
  edges: disj set Node
}
one sig A, B, C extends Node {}


run {some edges} for exactly 3 Node 
Think, then click!

No, the variables stay the same, but the number of clauses changes:

30 vars. 9 primary vars. 30 clauses. 18ms.

These new clauses encode the idea that two nodes can't point to the same other node, which is what disj captures:

(not A->B) or (not C->B)
(not A->A) or (not C->B)
(not A->A) or (not A->B)

Each of these conditions is implicitly combined with an and.
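To see where these clauses come from: for each target node, every pair of distinct sources generates one "not both" clause. A small Python sketch of that generation (the `x->t` strings are just labels for the edge booleans):

```python
from itertools import combinations

nodes = ["A", "B", "C"]

# disj edges: no two distinct sources may share the same target.
# For each target t and each source pair (x, y): (not x->t) or (not y->t).
clauses = [
    (f"not {x}->{target}", f"not {y}->{target}")
    for target in nodes
    for x, y in combinations(nodes, 2)
]
print(len(clauses))  # 3 targets * 3 source pairs = 9 clauses
```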

We can change the model to increase the number of variables, for example, by allowing a potential 4th node.

For example:

sig Node {
  edges: disj set Node
}
one sig A, B, C extends Node {}


run {some edges} for 4 Node 

Now, how many primary variables do we expect?

Think, then click!

There is another potential Node in the world; the world may be either size 3 or 4. Whether or not this fourth Node exists is 1 new Boolean variable. And since there are 4 potential nodes in the world, there are now 4^2 = 16 potential edges. Note we need variables for all 16 possible edges.

The total number of primary variables is thus 1+16 = 17.



Conjunctive Normal Form

Conjunctive Normal Form (CNF) is a specific format that is conceptually the and of a bunch of or clauses on variables or their negations. Boolean solvers often expect input in CNF, for algorithmic reasons we'll see in the next assignment.

A clause is a set of variables and their negations combined with or.
To satisfy the formula, every clause must be satisfied.
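Concretely, many solvers read CNF in the DIMACS convention: each variable is a positive integer, a negated literal is that integer's negation, and a formula is a list of clauses, each a list of ints. A sketch (the variable numbering x1=1, x2=2 is our own choice here):

```python
# (x1 or not x2) and (x2), in list-of-ints form:
formula = [[1, -2], [2]]

def evaluate(formula, assignment):
    """True iff every clause has at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in formula
    )

print(evaluate(formula, {1: True, 2: True}))   # True
print(evaluate(formula, {1: False, 2: False})) # False
```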

From Alloy models to Boolean formulas

Once we know the set of Boolean variables we'll use, we can translate Alloy constraints to purely Boolean ones via substitution.

The actual Alloy Analyzer is more complex than this, but here's an example of how a basic one might work. Suppose we have the constraint:

all n: Node | A in n.edges

There are no all quantifiers in Boolean logic. How can we get rid of it?

Think, then click!

An all is just a big and over the nodes in the instance.

A in A.edges
A in B.edges
A in C.edges

(Node4 in Node) implies A in Node4.edges
// Implies can be translated to and, or, not
// (p => q) is the same as (not p or q)

There are similar rules for other operators.
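A toy sketch of this substitution step in Python (the atom names and the `Node4` guard mirror the example above; this is illustrative only, not the Analyzer's translation):

```python
def expand_all(definite_atoms, optional_atoms):
    """Ground `all n: Node | A in n.edges` over concrete atom names."""
    constraints = [f"A in {n}.edges" for n in definite_atoms]
    # Optional atoms get a guard: (n in Node) implies (A in n.edges),
    # and (p => q) is the same as (not p or q).
    constraints += [
        f"(not {n} in Node) or (A in {n}.edges)" for n in optional_atoms
    ]
    return constraints

print(expand_all(["A", "B", "C"], ["Node4"]))
```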

Solving SAT

Is the SAT problem decidable? How do you know?

Think, then click!

Yes, it is decidable because we have a finite number of variables. We can describe a brute-force algorithm that always terminates: it loops over all 2^n assignments for the finite number n of variables.


Strategy: brute force with truth tables

We could have a brute force solution that conceptually builds the full truth table for the formula. If we have n variables, the truth table has 2^n rows. We know the answer is SAT if there is some row with output true (with the assignment given by that row); otherwise the answer is UNSAT.
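In Python, with the list-of-ints clause encoding, the brute-force strategy might look like this (a sketch, exponential by design):

```python
from itertools import product

def brute_force_sat(formula, num_vars):
    """Try all 2^n assignments; return a satisfying one, or None (UNSAT)."""
    for values in product([False, True], repeat=num_vars):
        assignment = {v + 1: values[v] for v in range(num_vars)}
        if all(any(assignment[abs(l)] == (l > 0) for l in clause)
               for clause in formula):
            return assignment
    return None

# x1 and (not x1 or x2): satisfiable with x1 = x2 = true.
print(brute_force_sat([[1], [-1, 2]], 2))  # {1: True, 2: True}
```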

However, this brute force strategy is always exponential in both memory and runtime. Can we do better?

A better strategy: intuition

Suppose I asked you to solve a boolean constraint problem. Maybe it comes from Alloy or maybe it doesn't. Here's an example:

x1 and (x1 implies x2)

Is there a satisfying instance for this constraint?

Think, then click!

Yes, there is, x1 = true and x2 = true.

The intuition is that because x1 is on its own in a clause, we know it must be true (since and is strict). Once we know x1 is true, we can use that knowledge to simplify the second clause, showing us x2 must be true.


More complex strategy

If you've taken CS235, you know that SAT is NP-complete: no known algorithm solves arbitrary inputs in time better than exponential in the size of the input (and a polynomial-time algorithm would imply P = NP).

However, that is true for arbitrary inputs, not necessarily common inputs! Plenty of real-world problems are worst case exponential (or even more difficult) and computer systems solve them all the time.

We'll now consider an algorithm that can often do better than exponential on real-world problems. The "code" in today's notes is pseudocode; it shouldn't be viewed as complete. Rather, the goal is to motivate the core ideas you'll implement in the next assignment.

Unit propagation

The idea we as humans used in the last example was to find a variable on its own in a clause, commit to that assignment of true or false, then propagate that knowledge. This is called unit propagation.

Our example can be put into CNF as x1 and (!x1 or x2) by expanding the definition of implies.

We can now more easily spot opportunities to propagate knowledge quite quickly, just by searching for clauses of only one element!

A unit clause is a 1-element clause: a variable or its negation.

The empty clause is a 0-element clause, and is equivalent to false. (A clause is a big "or", and an "or" gives a set of ways the formula could evaluate to True. But an empty clause gives no such options!)

We can check for unit clauses in time linear in the number of clauses.

Suppose we've identified a unit clause, in this case x1. Then, for every other clause in the set, we can check:

  1. Case 1: Does the clause contain the same literal as the unit clause? If so, then that clause is subsumed by the unit clause: it must be satisfied! Delete the entire clause.
  2. Case 2: Does the clause contain the opposite literal as in the unit clause? If so, then the clause cannot possibly be made true by that opposite literal. That is, our variable is unhelpful in making the clause true, so we delete that literal from the clause.
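Both cases can be sketched directly on the list-of-ints encoding. This is one round of propagation for a single unit clause (the assignment's full version will differ):

```python
def propagate_unit(formula, unit):
    """Simplify a CNF (list of int lists) given that literal `unit` is true."""
    simplified = []
    for clause in formula:
        if clause == [unit]:
            simplified.append(clause)   # keep the unit clause itself
        elif unit in clause:
            continue                    # Case 1: clause subsumed, delete it
        else:
            # Case 2: the opposite literal can't help; delete just the literal
            simplified.append([l for l in clause if l != -unit])
    return simplified

print(propagate_unit([[1], [-1, 2], [1, 3]], 1))  # [[1], [2]]
```

Notice that propagation can create new unit clauses (here, `[2]`), which is why the pseudocode below loops until the formula stops changing.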

Combined algorithm: guessing + unit propagation

Unit propagation can be very helpful, but it often only gets us so far. We might have formulas with no unit clauses. In that case, we have to fall back on the previous strategy: brute-force guessing!

We need to be sure we still have a complete solver though, so if our first guess fails, we'll need to try again with the opposite guess. This makes our algorithm a backtracking search (because we backtrack on a wrong guess).

With the combination of unit propagation and backtracking search, is our best-case time still exponential?

Think, then click!

No: the best-case time is now linear! We could guess correctly on every branch. The solver could be lucky: consider a formula like x1 and x2 and x3 and x4 and x5. If we always guess true, we only need 5 recursive steps to return true; the brute force solution would need 2^5.


:star: Note this initial pseudocode is simplified somewhat from class. See the assignment for complete pseudocode.


def solve(formula: List<Clause>) -> bool:
    
    # are there any unit clauses? if so, propagate them
    # keep doing so until there are no more changes
    # Beware: mutation can be a source of bugs here...
    while formula changes:
        formula = unit_propagate(formula)
    
    # Did we produce the empty clause?
    if empty clause in formula:
        # This indicates a contradiction
        return False
    # Did we find a valid solution?
    else if formula is all consistent unit clauses:    
        return True
    # Otherwise, we need to guess 
    else:
        branch = some variable not yet forced by a unit clause
        # Guess that the variable is true
        true_result = solve(formula + {branch})
        
        # Our guess worked!
        if true_result: 
            return True
        # Our guess failed; BACKTRACK and try again
        else: 
            # Guess that the variable is false
            false_result = solve(formula + {!branch})
            return false_result    

This idea, a recursive backtracking search paired with unit propagation, is the foundation of one of the most famous boolean solver algorithms: DPLL (named after its authors: Davis, Putnam, Logemann, and Loveland).

What the algorithm returns

Returning just a boolean is not very useful for the caller. For example, if Alloy were using this solver, it would want the instance itself, not just that the instance exists.

Every time we make a recursive call, there's an implicit set of assumptions involved. That is, there's always a partial instance of previously-selected guesses in effect at any point. We can make this explicit by adding a parameter to the function, and returning the guesses that produce a success.

If our goal is to produce a total solver (and it usually is), we'll need to post-process the result and pick arbitrary values for variables that have no value in the assumption set.

The pseudocode provided in the assignment will include an argument for the mapping itself.
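Putting guessing, propagation, and the explicit assignment together, here is one possible runnable sketch in Python (on the list-of-ints encoding; an illustration, not the assignment's pseudocode):

```python
def simplify(formula, literal):
    """Drop clauses satisfied by `literal`; delete its negation elsewhere."""
    return [[l for l in c if l != -literal]
            for c in formula if literal not in c]

def dpll(formula, assignment=None):
    """Return a satisfying {var: bool} dict, or None if UNSAT."""
    assignment = dict(assignment or {})

    # Unit-propagate until no unit clauses remain.
    while True:
        units = [c[0] for c in formula if len(c) == 1]
        if not units:
            break
        unit = units[0]
        assignment[abs(unit)] = unit > 0
        formula = simplify(formula, unit)

    if any(len(c) == 0 for c in formula):
        return None                     # empty clause: contradiction
    if not formula:
        return assignment               # every clause satisfied

    # Guess an undecided variable, backtracking on failure.
    branch = abs(formula[0][0])
    for literal in (branch, -branch):
        result = dpll(simplify(formula, literal),
                      {**assignment, branch: literal > 0})
        if result is not None:
            return result
    return None

print(dpll([[1], [-1, 2]]))  # {1: True, 2: True}
```

Note that on success the returned assignment may be partial; as described above, a total solver would post-process it by picking arbitrary values for the remaining variables.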


Continuing with Thursday's lecture

Improvements to SAT solving implementation

Modern SAT solvers use many improvements beyond the core DPLL algorithm we have covered so far.

Easy: Pure elimination

Do you spot any way we as humans could simplify this formula?

1, -2, 4, 5, -6
1, 3, -4, 8
1, -7, -8
Think, then click!

Since the literal 1 has the same sign (positive) in every clause, setting 1 to false could never help satisfy a clause. We can safely set 1 to true and use that to simplify the rest of the formula (in this case, that assignment alone satisfies every clause).


The core DPLL algorithm captures this idea as well, with a subroutine called pure elimination.

A variable is pure if for every clause it appears in within the formula, the literal has the same sign (all positive or all negative).

pure_elim(F):   
  for each variable x
    if +/-x is pure in F
      add a unit clause {+/-x}
      remove all clauses containing +/-x

Implementing pure elimination is extra credit for the SAT assignment.
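A runnable Python sketch of that subroutine on the list-of-ints encoding (illustrative only, since implementing it properly is the extra-credit exercise):

```python
def pure_elim(formula):
    """Replace every clause mentioning a pure literal with a unit clause."""
    literals = {l for clause in formula for l in clause}
    pure = [l for l in literals if -l not in literals]
    for p in pure:
        formula = [c for c in formula if p not in c]  # remove its clauses
        formula.append([p])                           # add the unit clause
    return formula

# Variable 1 only ever appears positively, so it is pure:
print(pure_elim([[1, -2], [1, 3], [-3, 2]]))  # [[-3, 2], [1]]
```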

Harder: conflict-driven clause learning

Modern, state-of-the-art SAT solvers use an additional strategy called conflict-driven clause learning, or CDCL.

We won't go into the details of CDCL or implement it, but the key idea is this: a solver can "learn" from conflicts in parts of the search space in order to avoid running into the same conflict multiple times.

In CDCL, the core idea is to:

  1. Unit propagate and pure eliminate as before.
  2. Check for SAT/UNSAT as before.
  3. Guess when needed. As you do so, build an implication graph about what is implied by the choice.
  4. If you reach a conflict:
    • Find the point in the implication graph that led to the conflict (algorithmically, the "cut" in the graph).
    • Derive a new clause which is the negation of the assignments that led to the conflict.
    • Backtrack to that decision level (potentially backtracking more aggressively than the basic DPLL algorithm).

Harder: concurrent/distributed SAT solving

If you had a very large boolean formula to solve, how could you use multiple servers to help?

Think, then click!

Here's an over-simplified idea: divide and conquer based on some subset of the variables in the formula. If you have k servers, pick log2(k) variables; there are then k combinations of true/false values for those variables, so you can assign one combination (one guess) to each server.

This is a basic divide-and-conquer strategy, where we divide the search space up and assign each subspace to one server.
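A sequential Python sketch of this split (a real system would hand each "cube" of fixed values to a separate server; `solve` here is a stand-in brute-force solver):

```python
from itertools import product

def solve(formula, num_vars):
    """Stand-in solver; a real deployment would run DPLL/CDCL per worker."""
    for values in product([False, True], repeat=num_vars):
        a = {v + 1: values[v] for v in range(num_vars)}
        if all(any(a[abs(l)] == (l > 0) for l in c) for c in formula):
            return a
    return None

def split_solve(formula, num_vars, split_vars):
    """Fix each true/false combination of split_vars as extra unit clauses."""
    for signs in product([1, -1], repeat=len(split_vars)):
        cube = [[s * v] for s, v in zip(signs, split_vars)]
        result = solve(formula + cube, num_vars)
        if result is not None:
            return result   # any sub-space SAT => whole formula SAT
    return None             # every sub-space UNSAT => whole formula UNSAT

print(split_solve([[1, 2], [-1, 2]], 2, [1]))  # {1: True, 2: True}
```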


With this strategy, how do you know the formula is SAT?

Think, then click!

As soon as any server returns SAT with an assignment, the entire formula must be SAT, so you can early return.


With this strategy, how do you know the formula is UNSAT?

Think, then click!

You have to wait for every server to return UNSAT (since a valid assignment could be found with any initial guess).


Real systems for concurrent/distributed SAT solving go beyond basic divide-and-conquer to try to actually share information across the different workers. We won't go into this further in lecture, but this could make an interesting final project.

Getting to CNF form

So far, we've been assuming our SAT solvers can always take in boolean formulas in CNF form. Is this a fair assumption?

Given an arbitrary formula, how do we convert it to CNF?

Think, then click!

A brute force strategy is to apply boolean algebraic laws, as seen in CS240: for example, the distributive law, DeMorgan's law, and the double negation law.

This strategy works, but is worst case exponential in the size of the output formula (the repeated application of the distributive law can blow up the size of the formula).


Real systems use versions of the Tseitin Transformation instead, which is linear in the size of the output formula. We won't go into the details of this transformation, but the intuition is that it adds new variables for subformulas to avoid an exponential blowup in the size of the output.
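For a taste of the idea, here are the Tseitin clauses for a single AND gate: introduce a fresh variable y for the subformula a and b, and assert y <-> (a and b) in CNF (a simplified sketch; real implementations do this recursively for every subformula):

```python
def tseitin_and(y, a, b):
    """CNF clauses asserting y <-> (a and b), in list-of-ints form."""
    return [
        [-y, a],      # y implies a
        [-y, b],      # y implies b
        [-a, -b, y],  # (a and b) implies y
    ]

print(tseitin_and(3, 1, 2))  # [[-3, 1], [-3, 2], [-1, -2, 3]]
```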

Additional details + applications

Symmetry breaking in Alloy

Going back to our initial Alloy example:

abstract sig Node {
  edges: one Node
}
one sig A, B, C extends Node {}

run {} for exactly 3 Node 

Consider what happens if we change the model such that the nodes A, B, and C are no longer named/individual sigs.

The model then becomes:

sig Node {
  edges: one Node
}

run {} for exactly 3 Node 

When we run this version, we only get 9 instances, instead of 27 instances!

The reason for this is that Alloy tries to avoid showing you the "same" instance multiple times. If atoms are unnamed and the constraints can never distinguish between different atoms, then instances are considered "the same" if you can arrive at one by renaming elements of the other.

Since the individual Node atoms are now anonymous/unnamed, we have far more equivalent instances. We call these instances isomorphic to each other, or symmetries of each other. Alloy will only show one instance per set of isomorphic instances (in most cases).

Formally, we say that Alloy finds every instance "up to isomorphism". This is useful both for reducing the number of redundant instances shown to the user and for cutting down the underlying search.

Filtering out instances via symmetry breaking is an area of active research.

In talking to researchers who use Alloy in the field, symmetry breaking is a core concern for applying Alloy in practice.


Handling integers and counting in boolean SAT

How does Alloy handle the + operator, for example, from our counter model:

always Counter.counter' = add[Counter.counter, 1]

if the solver only reasons over booleans?

Think, then click!

Like we saw in CS240, you can build a single-bit adder from and, or, and not gates. Alloy builds integer arithmetic out of basic boolean functions using this same type of logic. There is a deep connection between systems implementations and boolean logic!
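For instance, a one-bit full adder built from only and, or, and not (a Python sketch of the gate-level idea, not Alloy's actual encoding):

```python
def full_adder(a, b, carry_in):
    """One-bit full adder using only and/or/not on booleans."""
    # xor from and/or/not: a xor b == (a or b) and not (a and b)
    a_xor_b = (a or b) and not (a and b)
    total = (a_xor_b or carry_in) and not (a_xor_b and carry_in)
    carry_out = (a and b) or (a_xor_b and carry_in)
    return total, carry_out

print(full_adder(True, True, False))  # (False, True): 1 + 1 = 10 in binary
```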


More applications of SAT solving

SAT solving is important from a theoretical perspective (complexity theory) as well as a practical one (applications spanning many subfields of computer science).

Some primary applications include:

  1. Software verification
    • Design model checking/finding, like Alloy.
    • Some forms of static analysis.
    • Bounded model checking on actual implementations (more next week).
    • Unbounded verification (Dafny, F*, etc).
  2. Hardware design + verification
    • Verification and equivalence checking.
    • Example: layout of physical compute elements within a chip design.
  3. Planning/scheduling
    • E.g., air-traffic control, telegraph routing, arranging tournament or visit schedules.

See Practical application of SAT solving for more.

Bounded model checking

Alloy is one example of model checking designs, but there are also approaches tied more closely to software implementations. That is, other forms of model checking run on actual code, rather than requiring a separate design specification like in Alloy.

Classic bounded model checking asks the question: is there a path of length k from an initial state to a state where a desired property does not hold?

Other model checkers analyze concrete implementation code: for example, CBMC, the C Bounded Model Checker (used by AWS as well as other companies).

An example of CBMC in use in industry: the AWS C Common Library.

Next week, we'll cover more on how SAT solvers are used within other, more complex solvers called SMT solvers.