Today, we'll introduce the tool we'll use for the next ~5 weeks: the Alloy language and Analyzer.
Introduction
Alloy
Reminder: this class does have reading, and for some readings later in the semester we may have reading responses!
Today's reading dug into the motivation behind the design of Alloy.
The core idea behind Alloy is to model the design of software, independently from (and perhaps even before) the implementation. In practice, Alloy has been used to find bugs in decades-old, lauded distributed systems protocols and find new security vulnerabilities that could rival spectre and meltdown.
Real world context: Alloy for security vulnerabilities
We spent some time in class discussing the Spectre and Meltdown software vulnerabilities, which rocked the computing world when they were discovered by several research teams in 2018.
From the above website:
While programs are typically not permitted to read data from other programs, a malicious program can exploit Meltdown and Spectre to get hold of secrets stored in the memory of other running programs. This might include your passwords stored in a password manager or browser, your personal photos, emails, instant messages and even business-critical documents.
In short, Spectre and Meltdown are both "timing-channel" vulnerabilities that allow an attacker to infer secret program values by observing the time it takes programs to execute.
While these programs were discovered independently, researchers at Princeton used Alloy to discover previously unknown, similar attacks. You can read the full paper here.
While Alloy is not necessarily the most popular system analysis tool, this case study is just one example of its substantial real-world impact. We'll also start with Alloy as our first modeling tool because it is (relatively) easy to learn, with a simple syntax and a visual interface.
Models as imperfect representations
As the statisticians "say", "all models are wrong, but some models are useful".
A model essentially by definition leaves off some of the details of the real system being modeled!
In physics, what sort of factors do we typically ignore? For example: when asked to predict the outcome of a block sliding down a ramp.
(Think, then click!)
In basic mechanics, we ignore:
- Friction.
- Air resistance.
Physicists ignore these details because they are negligible to the overall system behavior in basic mechanics.
Rather than trying to model every part of a computer system exhaustively, we'll model things on an "as needed" basis. Our models will start as very general, then we will progressively add details.
Modeling vs. programming
Modeling is distinct from "normal" programming.
An empty program does nothing.
An empty model allows all behavior.
(Credit to Daniel Jackson, the principal creator of Alloy, via Tim Nelson, the professor who first taught me Alloy).
Writing models in Alloy will feel quite different from writing programs!
In traditional, imperative programming, you provide specific instructions to be carried out in a designated order. Writing models in Alloy is a example of declarative programming, which describes rules, or constraints, that determine the behavior of the model. The order of these constraints typically does not matter!
Example: our malloc mystery (mystery 2)
Recall the memory allocator implementation (malloc
) mystery that we discussed on the first day of class:
All blocks are "alloc" or "free".
Blocks have at most one "predecessor"
HEAP
+-----------+
b0 |alloc 0b11 +
+-----------+
b1 | ??? |
+-----------+
b2 |free 0b?0 +
+-----------+
The question: given that we don't know the allocation status of ???,
is it possible for us to know if the following is true?
"There is some free block whose predecessor is allocated"
What do we need to model to answer this question?
(Think, then click!)
We'll start with the basics:
- Blocks
- Allocation status
- The predecessor field
We'll use Alloy to perform this analysis for us (to some extent) by creating a new model.
Note that we'll use Alloy 6 in this course.
Sigs
To model a basic concept in Alloy, we'll use a sig
. sig
s are conceptually like classes in an object-oriented language like Java.
We'll start with a sig
for Block
:
sig Block {
}
run {}
We'll also add an empty run {}
command, which tells the Alloy Analyzer to find us some examples of the model we have defined. When we click Execute
, Alloy will pop up a window to view the example. We can view examples as a graph-based visualization, in plain texts, or as a tree of nested data.
With just a Block
sig defined, we might see a single Block
as the first instance found. Alloy has a univ
built-in sig that contains everything in the model's universe, in this case, our single Block
.
Model vs. instance, sig vs atom
A model is the Alloy code we write that defines a universe and some constraints on that universe. An instance is a specific example the Alloy Analyzer finds for that model (the Analyzer usually finds many instances for a given model).
A sig
defines a concept in the model (like a class
in Java). We'll call a specific instance of a sig
in a model an atom (like an instantiated object in Java). Block
is the sig, Block$0
is an atom.
Scope
Alloy allows you to write infinite models, but the analyzer will only check finite subsets of behavior. Analyzer commands must be scoped (by default, if you leave the scope off, Alloy runs with scope 3).
Alloy includes an integer type in the universe as well by default. We can run for up to 5 blocks (not exactly 5) and no integers with
run {} for 5 but 0 Int
Fields
Like in Java or Python, we can add fields to our concept definition (under the hood, these will work very differently in Alloy, but the syntax is about the same).
Let's define a new Status
sig and add it as a field.
sig Status{}
sig Block {
status: Status,
}
When we run this, Alloy finds instances with a whole bunch of different statuses, which we don't want. We want two specific ones: alloc
and free
.
How might we model the allocation status in an object-oriented language?
(Think, then click!)
One answer: a boolean.
Alloy does have a boolean type in the standard library, but let's imagine we might want to extend the status type in the future.
In OOP, we might use an enum or subclass for that.
In Alloy, we'll use an inheritance relationship to describe child sig
s.
sig Status {}
one sig Alloc extends Status{}
one sig Free extends Status{}
However, this model still allows plain Status
atoms, which we don't want.
How might we model a superclass that we never instantiate in an oriented language?
(Think, then click!)
Making it an abstract
class (e.g., in Java).
We can do that in Alloy too:
abstract sig Status {}
one sig Alloc extends Status{}
one sig Free extends Status{}
The one
modifier tells Alloy to only allow one atom each of Alloc
and Free
, since we do not need separate atoms for these concepts.
Fields as relations
Now, we can add another field that points to the predecessor block.
sig Status{}
sig Block {
status: Status,
prede: Block,
}
(pred
is a keyword in Alloy, so we'll use prede
instead).
When we run this, we now see that prede
looks something like this in the text representation:
prede = {Block$0->Block$1, Block$1->Block$2}
Does this look familiar, from last week? prede
is a binary relation relating each block to its predecessor! In Alloy this is explicit: fields are relations between the sig type you are defining and the sig type of the field.
Alloy's modeling
We'll come to understand this incrementally, but: everything in Alloy is a relation.
Instances of objects make up sets (which you can think of a unary relations).
Single instances, or scalars, are singleton, unary relations (e.g., the integer 1 under the hood in alloy is the set ).
Fields are relations between the defining sig type and the listed type of the field.
Note: relations in Alloy are flat: nothing is truly nested. We can picture each relation as a flat spreadsheet with distinct rows (if you print relations in the Alloy Evaluator for a specific instance, it will show in a format like a spreadsheet).
Default multiplicity on fields
If we run the previous model, we see that each Block
atom always has a predecessor, even though in our diagram the first block has no predecessor.
sig Block {
status: Status,
prede: Block,
}
Here, because the fields do not specify a multiplicity (size of set) on the field, Alloy defaults to one
. Instead, we will use the lone
keyword on the prede
field to indicate each Block can have zero or one predecessor.
We'll update this segment to:
sig Block {
status: Status,
prede: lone Block,
}
And with that, we now have a model for our overall (simplified) heap structure. We could imagine restricting the heap further, to force a linear order of blocks. For now, we'll keep the overall heap definition simple, and switch to modeling the specific question around predecessors.
Predicates in Alloy
The question we want to answer is: given our starting conditions on three blocks, is there some free block whose predecessor is allocated?
To model this in Alloy, we'll constrain our starting condition. In Alloy, a predicate that can be true or false is defined with a pred
block, that looks almost like a function. Each statement within the block adds a constraint.
pred init(b0, b1, b2: Block) {
b1.prede = b0
b2.prede = b1
b0.status = Alloc
b2.status = Free
no b0.predecessor -- the first block has no prede
}
Note that we do not need semicolons to separate lines. Each line is a new constraint on the model (so we could equivalently link them with and
operators; the order does not matter).
In the final line, we use the no
multiplicity to indicate that the block b0
has no predecessors.
Predicates in Alloy are like predicates in logic: they can be either true or false. Importantly: just writing the predicate does not constrain the model to have the predicate be true, unless we invoke it (directly or indirectly) in an analyzer command.
Now when we run this model, we see the initial setup we expect. In viewing the instances, we can color-code block by changing the Theme.
Checking an implication
Now, we want to check if the full scenario in our mystery: must some free block have a predecessor that is allocated?
We'll start by writing a new predicate:
pred someFreeWithAllocPred{
some b: Block |
b.status = Free and b.prede.status = Alloc
}
The |
is pronounced "such that", just like in logic.
some
is like the existential quantifier: it says there exists some Block
where the following lines after the |
are true.
We can run the following predicate and command to see examples where someFreeWithAllocPred
is true:
pred show {
some disj b1, b2, b3: Block | init[b1, b2, b3]
someFreeWithAllocPred
}
run show for 1 but 3 Block, 0 Int
When we click through the different instances the Analyzer finds, our property is true in each. BUT, so far, we have only asked the Analyzer to find examples, rather than to actually reason about whether the statement must always be true.
Asserts and checks
Can we instead ask Alloy if there is every a counterexample to this statement?
Yes! We can use an assert
block with a check
command to check for counterexamples to the mystery, written as an implication. We'll use the general setup of: initial setup implies final condition. In Alloy:
assert show {
all b1, b2, b3: Block | init[b1, b2, b3] => someFreeWithAllocPred
}
check show for 1 but 3 Block, but 0 Int
The check
command, which invokes a named assert
or an unnamed block, tries to find instances (within the given) scope where the assertion is NOT true. That is, check
attempts to find counterexamples that would invalidate the assertion. See the comments below for a comparison of run
and check
.
abstract sig Status {}
one sig Alloc extends Status {}
one sig Free extends Status {}
sig Block {
status: Status,
prede: lone Block,
succ: lone Block
}
fact {
~prede = succ
}
pred init(b0, b1, b2: Block) {
no b0.prede
b1.prede = b0
b2.prede = b1
b0.status = Alloc
b2.status = Free
}
pred someFreeWithAllocPred {
some b: Block |
b.status = Free and b.prede.status = Alloc
}
assert showProperty {
all b1, b2, b3 : Block | init[b1, b2, b3] => someFreeWithAllocPred
}
check showProperty for 5 but exactly 10 Block
Summary of useful Alloy Syntax
See the Resources on the website for more!
Some of these we did not get to today, but will see in-class examples of soon.
Built-in sets:
univ
everything in the model
none
the empty set
Quantifiers/Multiplicities
some
>= 1, equivalent to "there exists"
all
every atom, equivalent to "for all"
no
0, equivalent to "there does not exist"
one
exactly 1
lone
0 or 1
Disjointedness/Distinctness
disj
: used to specify that we mean distinct atoms
Set operators:
+
union
&
intersection
-
difference
in
subset
=
equality
Relation operators:
->
cartesian product
.
join
[]
join, flipped syntax
~
transpose
Logical operators:
&&
, and
, ||
, or
, !
: same as Java/Python
=>
implies
<=>
iff
Sig relationships
S extends T
abstract S
: no atoms other than those contained in its subsets
Today, we'll introduce the tool we'll use for the next ~5 weeks: the Alloy language and Analyzer.
Introduction
Alloy
Reminder: this class does have reading, and for some readings later in the semester we may have reading responses!
Today's reading dug into the motivation behind the design of Alloy.
The core idea behind Alloy is to model the design of software, independently from (and perhaps even before) the implementation. In practice, Alloy has been used to find bugs in decades-old, lauded distributed systems protocols and find new security vulnerabilities that could rival spectre and meltdown.
Real world context: Alloy for security vulnerabilities
We spent some time in class discussing the Spectre and Meltdown software vulnerabilities, which rocked the computing world when they were discovered by several research teams in 2018.
From the above website:
In short, Spectre and Meltdown are both "timing-channel" vulnerabilities that allow an attacker to infer secret program values by observing the time it takes programs to execute.
While these programs were discovered independently, researchers at Princeton used Alloy to discover previously unknown, similar attacks. You can read the full paper here.
While Alloy is not necessarily the most popular system analysis tool, this case study is just one example of its substantial real-world impact. We'll also start with Alloy as our first modeling tool because it is (relatively) easy to learn, with a simple syntax and a visual interface.
Models as imperfect representations
As the statisticians "say", "all models are wrong, but some models are useful".
A model essentially by definition leaves off some of the details of the real system being modeled!
In physics, what sort of factors do we typically ignore? For example: when asked to predict the outcome of a block sliding down a ramp.
(Think, then click!)
In basic mechanics, we ignore:
Physicists ignore these details because they are negligible to the overall system behavior in basic mechanics.
Rather than trying to model every part of a computer system exhaustively, we'll model things on an "as needed" basis. Our models will start as very general, then we will progressively add details.
Modeling vs. programming
Modeling is distinct from "normal" programming.
An empty program does nothing.
An empty model allows all behavior.
(Credit to Daniel Jackson, the principal creator of Alloy, via Tim Nelson, the professor who first taught me Alloy).
Writing models in Alloy will feel quite different from writing programs!
In traditional, imperative programming, you provide specific instructions to be carried out in a designated order. Writing models in Alloy is a example of declarative programming, which describes rules, or constraints, that determine the behavior of the model. The order of these constraints typically does not matter!
Example: our malloc mystery (mystery 2)
Recall the memory allocator implementation (
malloc
) mystery that we discussed on the first day of class:All blocks are "alloc" or "free". Blocks have at most one "predecessor" HEAP +-----------+ b0 |alloc 0b11 + +-----------+ b1 | ??? | +-----------+ b2 |free 0b?0 + +-----------+ The question: given that we don't know the allocation status of ???, is it possible for us to know if the following is true? "There is some free block whose predecessor is allocated"
What do we need to model to answer this question?
(Think, then click!)
We'll start with the basics:
We'll use Alloy to perform this analysis for us (to some extent) by creating a new model.
Note that we'll use Alloy 6 in this course.
Sigs
To model a basic concept in Alloy, we'll use a
sig
.sig
s are conceptually like classes in an object-oriented language like Java.We'll start with a
sig
forBlock
:We'll also add an empty
run {}
command, which tells the Alloy Analyzer to find us some examples of the model we have defined. When we clickExecute
, Alloy will pop up a window to view the example. We can view examples as a graph-based visualization, in plain texts, or as a tree of nested data.With just a
Block
sig defined, we might see a singleBlock
as the first instance found. Alloy has auniv
built-in sig that contains everything in the model's universe, in this case, our singleBlock
.Model vs. instance, sig vs atom
A model is the Alloy code we write that defines a universe and some constraints on that universe. An instance is a specific example the Alloy Analyzer finds for that model (the Analyzer usually finds many instances for a given model).
A
sig
defines a concept in the model (like aclass
in Java). We'll call a specific instance of asig
in a model an atom (like an instantiated object in Java).Block
is the sig,Block$0
is an atom.Scope
Alloy allows you to write infinite models, but the analyzer will only check finite subsets of behavior. Analyzer commands must be scoped (by default, if you leave the scope off, Alloy runs with scope 3).
Alloy includes an integer type in the universe as well by default. We can run for up to 5 blocks (not exactly 5) and no integers with
Fields
Like in Java or Python, we can add fields to our concept definition (under the hood, these will work very differently in Alloy, but the syntax is about the same).
Let's define a new
Status
sig and add it as a field.When we run this, Alloy finds instances with a whole bunch of different statuses, which we don't want. We want two specific ones:
alloc
andfree
.How might we model the allocation status in an object-oriented language?
(Think, then click!)
One answer: a boolean.
Alloy does have a boolean type in the standard library, but let's imagine we might want to extend the status type in the future.
In OOP, we might use an enum or subclass for that.
In Alloy, we'll use an inheritance relationship to describe child
sig
s.However, this model still allows plain
Status
atoms, which we don't want.How might we model a superclass that we never instantiate in an oriented language?
(Think, then click!)
Making it an
abstract
class (e.g., in Java).We can do that in Alloy too:
The
one
modifier tells Alloy to only allow one atom each ofAlloc
andFree
, since we do not need separate atoms for these concepts.Fields as relations
Now, we can add another field that points to the predecessor block.
(
pred
is a keyword in Alloy, so we'll useprede
instead).When we run this, we now see that
prede
looks something like this in the text representation:prede = {Block$0->Block$1, Block$1->Block$2}
Does this look familiar, from last week?
prede
is a binary relation relating each block to its predecessor! In Alloy this is explicit: fields are relations between the sig type you are defining and the sig type of the field.Alloy's modeling
We'll come to understand this incrementally, but: everything in Alloy is a relation.
Instances of objects make up sets (which you can think of a unary relations).
Single instances, or scalars, are singleton, unary relations (e.g., the integer 1 under the hood in alloy is the set ).
Fields are relations between the defining sig type and the listed type of the field.
Note: relations in Alloy are flat: nothing is truly nested. We can picture each relation as a flat spreadsheet with distinct rows (if you print relations in the Alloy Evaluator for a specific instance, it will show in a format like a spreadsheet).
Default multiplicity on fields
If we run the previous model, we see that each
Block
atom always has a predecessor, even though in our diagram the first block has no predecessor.Here, because the fields do not specify a multiplicity (size of set) on the field, Alloy defaults to
one
. Instead, we will use thelone
keyword on theprede
field to indicate each Block can have zero or one predecessor.We'll update this segment to:
And with that, we now have a model for our overall (simplified) heap structure. We could imagine restricting the heap further, to force a linear order of blocks. For now, we'll keep the overall heap definition simple, and switch to modeling the specific question around predecessors.
Predicates in Alloy
The question we want to answer is: given our starting conditions on three blocks, is there some free block whose predecessor is allocated?
To model this in Alloy, we'll constrain our starting condition. In Alloy, a predicate that can be true or false is defined with a
pred
block, that looks almost like a function. Each statement within the block adds a constraint.Note that we do not need semicolons to separate lines. Each line is a new constraint on the model (so we could equivalently link them with
and
operators; the order does not matter).In the final line, we use the
no
multiplicity to indicate that the blockb0
has no predecessors.Predicates in Alloy are like predicates in logic: they can be either true or false. Importantly: just writing the predicate does not constrain the model to have the predicate be true, unless we invoke it (directly or indirectly) in an analyzer command.
Now when we run this model, we see the initial setup we expect. In viewing the instances, we can color-code block by changing the Theme.
Checking an implication
Now, we want to check if the full scenario in our mystery: must some free block have a predecessor that is allocated?
We'll start by writing a new predicate:
The
|
is pronounced "such that", just like in logic.some
is like the existential quantifier: it says there exists someBlock
where the following lines after the|
are true.We can run the following predicate and command to see examples where
someFreeWithAllocPred
is true:When we click through the different instances the Analyzer finds, our property is true in each. BUT, so far, we have only asked the Analyzer to find examples, rather than to actually reason about whether the statement must always be true.
Asserts and checks
Can we instead ask Alloy if there is every a counterexample to this statement?
Yes! We can use an
assert
block with acheck
command to check for counterexamples to the mystery, written as an implication. We'll use the general setup of: initial setup implies final condition. In Alloy:The
check
command, which invokes a namedassert
or an unnamed block, tries to find instances (within the given) scope where the assertion is NOT true. That is,check
attempts to find counterexamples that would invalidate the assertion. See the comments below for a comparison ofrun
andcheck
.Summary of useful Alloy Syntax
See the Resources on the website for more!
Some of these we did not get to today, but will see in-class examples of soon.
Built-in sets:
univ
everything in the modelnone
the empty setQuantifiers/Multiplicities
some
>= 1, equivalent to "there exists"all
every atom, equivalent to "for all"no
0, equivalent to "there does not exist"one
exactly 1lone
0 or 1Disjointedness/Distinctness
disj
: used to specify that we mean distinct atomsSet operators:
+
union&
intersection-
differencein
subset=
equalityRelation operators:
->
cartesian product.
join[]
join, flipped syntax~
transposeLogical operators:
&&
,and
,||
,or
,!
: same as Java/Python=>
implies<=>
iffSig relationships
S extends T
abstract S
: no atoms other than those contained in its subsets