code Reverse engineer compiled x86 runes to recover the Sourceror’s Code.
- Assign: Monday, 13 March
- Checkpoint: aim to complete Phase 4 by 11:59pm Monday, 20 March
- Due: 11:59pm Friday, 24 March
- Collaboration: Individual code assignment policy, as defined by the syllabus and honor code.
- Starter Code:
hg clone ssh://email@example.com/cs240codetub/cs240-x86-owner(Need help?)
- Submit: Commit and push your final revision and double check. (Need help?)
- Relevant Reference:
Silly version: Overhearing a hushed conversation in Prof. McCodegal’s office, you learn that the powerful, yet volatile, Sourceror’s Code lies hidden away in the depths of the Science Castle, guarded by six phases of magical protection. It holds the key to eternal coding enlightenment (and also the key to finishing your homework). While wandering the Science Castle corridors, you find an unassuming door you believe hides the Code. Meanwhile, the tutors who shall not be named are launching their own quest to steal away the Sourceror’s Code for the questionable machinations of their shadowy organization. You must retrieve the Sourceror’s Code before it is too late!
Fortunately, the designers of the six magical protections have left behind their recipes in ancient x86 runes. You must decipher the runes to determine how the protections work and discover the magical incantations that will disarm them. But beware, these protections are not to be taken lightly – speaking the wrong incantation will trip an alarm, ensnare you in a magical trap, and cause your runes to self-destruct! By disarming all six protections, you will recover the Sourceror’s Code and save the day.
Serious version: We have given you a compiled x86 binary executable without the C source code that created it. You must find inputs that satisfy six computational phases of the program to avoid tripping the alarm (and deleting your executable1). To do this, you will disassemble the executable and work with the x86 code to reconstruct what the program computes with each input. You will submit the six inputs plus your descriptions of what each phase computes (in English or C source code). You receive points for each correct solution and clear description.
- Start early. This assignment will be as challenging (and fun) as Bit Transfiguration, but in a different way. We will do the first phase together in lab.
- You cannot do this assignment in one sitting.2 Allocate time to work on it repeatedly with other things between.
- Use the tools and techniques introduced by lab and this document.
- Do not worry if you get stumped temporarily. Go do something else and come back refreshed.
- Share reverse engineering strategies, patterns, and ideas with each other.
- Take plenty of breaks for puns, laughing, sprinting around the microfocus, and other wheezes. Have fun!
Do this assignment in one of the CS 240 computing environments. The runes in this assignment require an x86-64 processor running Linux. (Most insurance policies do not cover loss of time or eyebrows due to use of other environments.) In fact, it appears as though the magical protections may ensnare you immediately (and vanish your runes) when run in other environments.
Clone your repository via the usual process (see the assignment manifest).
Your x86 Runes Adventure Kit contains the following files:
descriptions.txt: File in which you describe the x86 code
incantations.txt: File in which you write your six inputs
runes: Your individualized executable that you will use for the assignment
sample: A sample binary executable that we will use together only in lab
main.c: C source file with the executable’s main function
Complete the associated lab assignment and activity. These will help you learn basic reverse engineering tools and techniques and disarm the first phase of a sample executable very similar to your own.
Your executable has six phases of protection, each requiring an incantation (input) to disarm. You must use various principled tools and techniques to reverse engineer your executable before attempting to disarm it. Entering an incorrect incantation (input) at any phase trips an alarm and notifies the instructors, so guessing randomly or by brute force is inadvisable.
A lab activity helps you learn basic
reverse engineering tools and techniques and disarm the
first phase of a sample executable. Later phases get progressively
harder to disarm, but the expertise you gain with each phase should
offset the increased difficulty. If you are stumped, note that
*hints* about each phase appear as bad puns in the messages that the
executable prints, as well as here: (1) comparison, (2) loops, (3)
switch statements, (4) recursion, (5) pointers and arrays, (6) sorting
Unsubtantiated rumors suggest there may be a secret seventh protection hidden among the six that can be disarmed for extra fun, credit, and reward.
Submit two parts for each phase:
- Incantation (input): Write your disarming incantations for each
of the six phases on separate lines in
incantations.txt, in order.
- Description: In the separate
descriptions.txtfile, write a succinct paragraph or two for each phase, describing:
- What the phase is computing with your input at a high level of abstraction.
- Some key features of the x86 code that correspond to the high-level computation.
Describe at a high level as if you are summarizing whatever C code compiled to this assembly/machine code. (Feel free to write C to describe what is computed.) Do mention a couple assembly details that were particular aha! moments or red flags that alerted you to this high-level structure, but do not give a line-by-line run-down of the assembly code.
Keep notes along the way so you do not need to repeat the reverse engineering process to remember how it worked!
The executable ignores blank input lines. If you run your executable with a command line argument, for example:
$ ./runes incantations.txt
Then it will read input lines from the file
EOF (end of file), when it will switch over to read from
stdin (standard input from the terminal). In a moment of weakness,
the magical designers added this feature so you do not have to retype
the solutions to phases you have already disarmed.
Make sure to add a new line (return) at the end of each phase input in
incantations.txt, otherwise you may trip the alarm.
emacs will do
Do not edit or copy inputs from rich text applications, which sometimes use alternative text encodings of similar-looking characters or line endings.
Tools and Techniques
There are many ways to determine how to disarm your executable. You can examine it extensively to build a detailed mental model and determine exactly what it without ever running the program does. This is a useful skill, but is not always easy. You can also run the executable under a debugger, watch what it does step by step, perform disassembly along the way, and use this information to understand its behavior and disarm it. The most effective approach is often to combine elements of both techniques and develop your own new reasoning tools.
Please do not use brute force! You could try every possible key to find the right one (or write a program to do so), but the number of possibilities is so large that you will not be able to try them all in time.
There are many tools designed to help figure out both how programs work and what is wrong when they do not work. You will find the following tools (and hints on how to use them) useful in analyzing your runes.
To avoid accidentally triggering the protections, becoming ensnared, and temporarily losing your executable1, you must learn how to use a debugger to single-step through the assembly code, inspect state of register and memory, and use breakpoints. We will learn these techniques in lab. One of the nice side effects of this assignment is that you will become skilled at reasoning about the the execution of programs and using a debugger. These are crucial skills that will pay big dividends in the future.
Our key tool is
gdb, the GNU Debugger. To run the executable using
gdb on the executable
alone, then when using the
run command inside gdb (or abbreviated to
r), give the command-line arguments there:
$ gdb runes ... startup messages from gdb... (gdb) run incantations.txt
break trip_alarm will save you time!
Documentation and quick reference for
- CS 240 GDB reference sheet (pdf, txt)
- GDB Quick Reference card
- GDB manual
- For documentation from the command line, type
gdbcommand prompt, or type
man gdb, or
info gdbin the shell. Some people also like to run
gdbunder gdb-mode in emacs.
objdump -t runes: This will print out the executable’s symbol
table. The symbol table includes the names of all functions and
global variables in the executable, the names of all the functions
the executable calls, and their addresses. You may learn something
by looking at the function names!
objdump -d runes: Use this to disassemble all of the code in the
executable. You can also just look at individual functions. Reading
the assembler code can tell you how the executable works. Although
objdump -d gives you a lot of information, it does not tell you
the whole story. Calls to system-level functions may look
cryptic. For example, a call to
sscanf might appear as:
8048c36: e8 99 fc ff ff call 80488d4 <_init+0x1a0>
To determine that the call was to
sscanf, you would need to
strings -t x runes: This utility will display the printable
strings in your executable and their offset within the executable.
Looking for a particular tool? How about documentation? Do not forget,
man are your friends. In particular,
ascii is very useful. If you get stumped, come to office hours or
feel free to post questions to the course’s Google group.
CSAPP Chapter 3 is an excellent resource for understanding x86-64 assembly and machine code. There are several online references as well:
- CSAPP Chapter 3 (draft)
- CS 240 x86-64 exam reference sheet
- exhaustive x86 manuals from Intel and AMD
- x86 Assembly WikiBook
- CSAPP textbook resources
Note that there are two common x86 assembly code syntaxes: Intel and AT&T (we use the latter). CSAPP (p. 177) and Wikipedia describe the differences (including switched operand order).
The assignment is graded from a maximum of 100 points.
You may receive up to 85 points for disarming phases:
- Phase 1: 15 points
- Phase 2: 15 points
- Phase 3: 15 points
- Phase 4: 15 points
- Phase 5: 15 points
- Phase 6: 10 points
You may receive up to 15 points for descriptions:
- We will grade descriptions of one or two phases (chosen randomly), by the criteria above.
If a secret phase exists and you disarm it, you will receive a number of extra points that will become non-secret only if you disarm the hypothetical secret phase…