Welcome!
Welcome to CS340: Modeling for Computer Systems!
Agenda for today:
- Introductions
- Collaborative discussion of high-level course goal.
- Course expectations, learning outcomes.
- Some mysteries!
Professor introduction
I’m Alexa VanHattum—you can call me “Alexa”, “Prof. Alexa”, or “Prof VanHattum”.
This is my first year as an Assistant Professor at Wellesley. My research focus is the intersection of programming languages & computer systems, with a focus on applying lightweight formal methods (e.g., this course) to compilers for systems languages.
Before Wellesley, a version of this course (taught at Brown University by Tim Nelson) convinced me to pursue a PhD in the area. I spent time as a software engineer for Apple health, then completed grad school at Cornell.
Student introductions
(Names, pronouns, year, favorite programming project or programming language).
This is a small elective, and we want to all get to know eachother!
Correctness exercise
Let’s think about correctness in the context of programs you’ve written before.
Questions:
- How many people have worked on programming projects for a class? [Everyone]
- How many people have worked on programming projects not for a class? In what context? [Almost everyone]
- How many people have worked on programming projects that have had users other than yourself (it’s ok if not! My answer to this question was no until my senior year of college!)? [Some folks]
Now I want you to step back, and think about how you knew what to program in each case.
In groups, discuss the following. Fill in your 2x4 sheet of paper: {class project, real users}
Write your logins in the top right.
A. How did you know what code to write? Did you know it fully ahead of time, or did you change what you were implementing as you went?
B. How did you know you were done programming?
C. How did you know if your design and code were correct?
D. If the code was incorrect, how would you find that out afterward? What were the implications of it being incorrect?
(Group discussion of these 8 cases). Main points: with course projects, you were given a specification of what to program in the form of instructions from your professor. You knew your program was done because you either (1) passsed some tests or (2) ran out of time. You knew your code was correct because of how many tests it passed, manually testing, or the grade you got. The implication was a potentially bad grade.
With external users and software out in the world, a complete specification might never exist. You may know what to program from clients, or your own ideas, or just be experimenting. You might be done when you reach a deadline, or when you pass code review. For code correctness, you have to write your own tests in this case. Maybe you interact with Quality Assurance (QA) or other forms of additional manual testing. If code is incorrect: maybe it’s caught in QA. Maybe your users don’t use your software how you expect, or complain about it online or in your bug tracker. In extreme cases, maybe the code is used in medical software, or avionics, or autonomous driving, and someone’s life is at risk.
Spectrum of correctness
We can think of the correctness of our program as existing along a wide spectrum.
On the far left, near 0% certainty of correctness, we have code that we wrote without every running.
On the far right, near (but typically not at) 100% certainty of correctness, we have implementations that have been proven (with machine-checked proofs) to meet a formal specification. Even in that case, we ultimately have to trust some layers of the proof infrastructure, so there is still a vanishingly small chance that there are correctness issues with our program.
0% approaching 100%
<------------------------------------------------------------------------------------->
Never run the code!
Make sure it compiles
Make sure the code runs one input, manually
Manually check several cases
Write a few tests
Write a LOT of tests
| automatically
generating tests
Write a formal specification
Formally prove that your code
implements your specification.
"interactive theorem proving"
lives here
-----------------------------------
This course!
Sometimes called "lightwight formal methods".
we'll to the nice spot in the middle by
using MODELS of our system, instead of
proving the actual implementation of
every component against a formal
specification.
Expectations
We went over the course website and syllabus, focusing on:
- Assignments
- Collaboration policy
- Expectations around programming maturity, independence
- Workshop sessions
- Final project
- Readings
- Zulip, calendly, office hours
- No exams
Mysteries
We considered some mysterious computer programs.
Mystery 1
int mystery(int x) {
if (x % 2 == 0) {
return x;
} else {
return x - 1;
}
}
Mystery 2
Mystery 3
Let’s model a basic public-key cryptography scheme!
Can someone explain the basic idea of public key cryptography?
For this to work to let two people communicate, they need a way to know they are speaking to the right person. We don’t need to fully understand the details for this example, but the important thread to follow is that each person can “prove” they are who they say they are by decoding messages that were encoded with their public key.
Alice Wendy
Alice picks random "nonce",
initiates communication
----- {NonceA, I'm Alice}PublicW ---->
Wendy sees that Alice
wants to communicate, sends
back old nonce (proving Wendy
can descrypt a hidden message
for Wendy) and a new nonce
<------- {NonceA, NonceW}PublicA -----
Alice sends back the second
nonce to prove Alice can
descrypt a hidden message
for Alice
----- {NonceW}PublicW ----------->
Now, do both parties know they are communicating with the
right person?
This scenario (a simplification of the Needham–Schroeder protocol)
was “proven” correct for over a decade, but it has a bug!
Man-in-the-Middle Attack
If Eve can trick Alice into starting a communication:
Alice Eve Alice's Bank
-- {NonceA, I'm Alice}PublicE --->
---- {NonceA, I'm Alice}PublicB ------>
<------ {NonceA, NonceB}PublicA -------
<--- {NonceA, NonceB}PublicA ----
---- {NonceB}PublicE ----------- >
--------------- {NonceB}PublicB ------>
Now Alice is sending her bank info to Eve!
Mystery 4
We briefly touched on the Chord distributed systems protocol, and how a bug was found in it using Alloy, one of the tools we’ll learn in this class. More on this next time!
Welcome!
Welcome to CS340: Modeling for Computer Systems!
Agenda for today:
Professor introduction
I’m Alexa VanHattum—you can call me “Alexa”, “Prof. Alexa”, or “Prof VanHattum”.
This is my first year as an Assistant Professor at Wellesley. My research focus is the intersection of programming languages & computer systems, with a focus on applying lightweight formal methods (e.g., this course) to compilers for systems languages.
Before Wellesley, a version of this course (taught at Brown University by Tim Nelson) convinced me to pursue a PhD in the area. I spent time as a software engineer for Apple health, then completed grad school at Cornell.
Student introductions
(Names, pronouns, year, favorite programming project or programming language).
This is a small elective, and we want to all get to know eachother!
Correctness exercise
Let’s think about correctness in the context of programs you’ve written before.
Questions:
Now I want you to step back, and think about how you knew what to program in each case.
In groups, discuss the following. Fill in your 2x4 sheet of paper: {class project, real users}
Write your logins in the top right.
A. How did you know what code to write? Did you know it fully ahead of time, or did you change what you were implementing as you went?
B. How did you know you were done programming?
C. How did you know if your design and code were correct?
D. If the code was incorrect, how would you find that out afterward? What were the implications of it being incorrect?
(Group discussion of these 8 cases). Main points: with course projects, you were given a specification of what to program in the form of instructions from your professor. You knew your program was done because you either (1) passsed some tests or (2) ran out of time. You knew your code was correct because of how many tests it passed, manually testing, or the grade you got. The implication was a potentially bad grade.
With external users and software out in the world, a complete specification might never exist. You may know what to program from clients, or your own ideas, or just be experimenting. You might be done when you reach a deadline, or when you pass code review. For code correctness, you have to write your own tests in this case. Maybe you interact with Quality Assurance (QA) or other forms of additional manual testing. If code is incorrect: maybe it’s caught in QA. Maybe your users don’t use your software how you expect, or complain about it online or in your bug tracker. In extreme cases, maybe the code is used in medical software, or avionics, or autonomous driving, and someone’s life is at risk.
Spectrum of correctness
We can think of the correctness of our program as existing along a wide spectrum.
On the far left, near 0% certainty of correctness, we have code that we wrote without every running.
On the far right, near (but typically not at) 100% certainty of correctness, we have implementations that have been proven (with machine-checked proofs) to meet a formal specification. Even in that case, we ultimately have to trust some layers of the proof infrastructure, so there is still a vanishingly small chance that there are correctness issues with our program.
Expectations
We went over the course website and syllabus, focusing on:
Mysteries
We considered some mysterious computer programs.
Mystery 1
Mystery 2
Mystery 3
Let’s model a basic public-key cryptography scheme!
Can someone explain the basic idea of public key cryptography?
For this to work to let two people communicate, they need a way to know they are speaking to the right person. We don’t need to fully understand the details for this example, but the important thread to follow is that each person can “prove” they are who they say they are by decoding messages that were encoded with their public key.
Now, do both parties know they are communicating with the
right person?
This scenario (a simplification of the Needham–Schroeder protocol)
was “proven” correct for over a decade, but it has a bug!
Man-in-the-Middle Attack
If Eve can trick Alice into starting a communication:
Now Alice is sending her bank info to Eve!
Mystery 4
We briefly touched on the Chord distributed systems protocol, and how a bug was found in it using Alloy, one of the tools we’ll learn in this class. More on this next time!