Binary Whizbangs and x86 Runes
- Checkpoint:
- Complete, commit, and push solutions to phases 1-3 (and optionally more) by 11:00pm Tuesday, 27 October.
- Each of phases 2-3 not submitted by this checkpoint reduces your maximum possible grade by 5 points.
- We also recommend completing phase 4 by this time, but doing so does not affect your grade.
- Due: 11:00pm Thursday, 29 October
- Starter code:
- fork wellesleycs240 / cs240-x86 and add bpw as admin
- (Need help?)
- Submissions: Commit and push your final version.
- Relevant Reading:
- CSAPP Chapter 3
- Collaboration: For this assignment, you may discuss and develop reverse-engineering strategies together with other students. Each student is given a unique version of the assignment, but the techniques to solve it are largely shared in common. You may share reverse engineering advice, but you may not “drive” another student’s reverse engineering process or allow someone else “drive” yours. For example, you should not let someone else type or dictate a series of gdb commands to you or vice versa.
Overview
A pair of prankster twins has planted a slew of binary whizbangs on our machines. A binary whizbang is a compiled executable program that consists of a sequence of phases encoded with ancient x86 runes (a.k.a. machine code). Each phase expects you to type a particular incantation string on stdin
(standard input). If you type the correct string, then the phase is defused and the whizbang proceeds to the next phase. Otherwise, the whizbang explodes by printing BOOM!!!
and then terminating. (Rumor has it that the whizbang silently notifies your instructor every time this happens.) The whizbang is defused when every phase has been defused.
There are too many whizbangs for your instructors to handle, so we are giving each student a whizbang to defuse. Your mission, which you have no choice but to accept, is to defuse your whizbang before the due date. Good luck, and welcome to the Whizbang Workforce!
Defusing a whizbang is no simple task, but you we will give you guidance and powerful tools for interpreting x86 runes. When you finish you will be an x86 runes expert!
- Start early. We will do the first phase together in lab. Start looking at it before then if you have time.
- Meet the checkpoint on time.
- It will set you up to finish the assignment on time.
- Missing the checkpoint caps your grade (see above).
- Do not worry if you get stumped temporarily. Go do something else and come back refreshed.
- Share reverse engineering strategies, patterns, and ideas with each other.
- Take plenty of breaks for puns, laughing, sprinting around the microfocus, and other wheezes.
- Solicit reverse-engineering advice from your classmates, tutors, and instructors.
- You can get a very solid grade without getting the last phase.
- Have fun!
Instructions
The whizbangs were constructed specifically for 32-bit machines using the IA32 dialect of the notorious x86 runes (a.k.a. machine code). Do this assignment in one of the CS 240 computing environments. (Most insurance policies do not cover loss of time due to use of alternate platforms.) In fact, there is a rumor that the prankster twins built in tamper-proofing charms that will cause them to explode immediately when run on other platforms.
Your Whizbang Workforce Welcome Kit contains the following files:
sample-whizbang
: A sample whizbang we will use together in labwhizbangs
: A directory containing one executable binary whizbang per student, named by Bitbucket IDwhizbang.c
: C source file with the whizbang’s main functionincantations.txt
: File in which you write your defusing solutionsdescriptions.txt
: File in which you describe your defusing solutions
Everyone gets a unique whizbang to defuse. Start by selecting your whizbang and removing everyone else’s:
$ hg mv whizbangs/YOUR_BITBUCKET_ID whizbang
$ hg rm whizbangs
$ hg commit -m 'Selected my whizbang'
Your job is to defuse the whizbang. You can use many tools to help you with this. Please look at the tools section below for some tips and ideas. The best way is to use gdb
to step through the disassembled binary.
Phases
Cursory experiments show that the whizbangs have 6 phases. Later phases get progressively harder to defuse. However, the expertise you gain as you move from phase to phase should offset this difficulty. If you are stumped, check the hints section at the end of this document. Unsubstantiated rumors suggest that a secret phase can be revealed and unlocked with truly powerful magic.
Whizbang Usage
The whizbang ignores blank input lines. If you run your whizbang with a command line argument, for example:
$ ./whizbang incantations.txt
Then it will read input lines from incantations.txt
until it reaches EOF
(end of file), and then switch over to read from stdin
(standard input from the terminal). In a moment of weakness, the prankster twins added this feature so you do not have to keep retyping the solutions to phases you have already defused.
To avoid accidentally setting off the whizbang, you will need to learn how to single-step through the assembly code and how to set breakpoints. You will also need to learn how to inspect the state of both registers and memory. One of the nice side effects of defusing your whizbang is that you will get very good at using a debugger. This is a crucial skill that will pay big dividends in the future.
To run the whizbang using your incantations.txt
under gdb
, invoke gdb
on the executable alone, then when using the run
command inside gdb (or abbreviated to r
), give the command-line arguments there:
$ gdb whizbang
... startup messages from gdb...
(gdb) run incantations.txt
Your Tasks
Your job is twofold:
- Determine the input to provide each phase to successfully defuse the whizbang.
- Document your reasoning.
You will submit two parts for each phase:
- Each line in
incantations.txt
will hold your defusing passphrase for a phase. -
Additionally, you must describe what you learned about the phase and how you solved it in the separate
descriptions.txt
file. Write a concise paragraph or two for each phase, describing:- What the phase is doing with your input.
- How you determined this or how you determined what input to provide.
Describe at a high level as if you are summarizing whatever C code compiled to this assembly/machine code. Do mention a couple assembly details that were particular aha! moments or red flags that alerted you to this high-level structure, but do not give a line-by-line run-down of the assembly code.
Keep notes along the way so you don’t have to do the process all over again to remember how it worked!
Grading:
Grading for each phase will be about half from the solution and half from your descriptions.
Each phase in total will be worth:
- Phase 1: 10 points
- Phase 2: 15 points
- Phase 3: 20 points
- Phase 4: 20 points
- Phase 5: 20 points
- Phase 6: 10 points
Some partial credit may be awarded on unsolved phases for succinct documentation of the methods you tried to reverse-engineer the phase and what you learned about it.
Your grade will be capped at 90 if you do not finish Phase 2 (and 95 if you finish Phase 2 but not Phase 3) by the checkpoint.
Resources for Deciphering x86 Runes
There are a number of online resources that will help you understand any assembly instructions you may encounter while examining the whizbang. In particular, the programming manuals for x86 processors distributed by Intel and AMD are exceptionally valuable. They both describe the same ISA, but sometimes one may be easier to understand than the other. Remember that we are using the 32-bit version (IA32) this semester, though the latest version (x86-64) is 64-bit.
Tools
There are many ways of defusing your whizbang. You can examine it in great detail without ever running the program, and figure out exactly what it does. This is a useful technique, but it not always easy to do. You can also run it under a debugger, watch what it does step by step, perform disassembly along the way, and use this information to defuse it. This is probably the most efficient method.
Please do not use brute force! You could try every possible key to find the right one (or write a program to do so), but the number of possibilities is so large that you will not be able to try them all in time. Rumor has it that each whizbang automatically notifies your instructor every time it goes off.
There are many tools designed to help figure out both how programs work and what is wrong when they do not work. You find the following tools (and hints on how to use them) useful in analyzing your whizbang.
gdb
: The GNU debugger is a command line debugger tool available on virtually every platform. You can trace through a program line by line, examine memory and registers, look at both the source code and assembly code (we are not giving you the source code for most of your whizbang), set breakpoints, set memory watch points, and write scripts. We have used GDB to examine the execution of C programs. Here, without C source or debugging symbols, you will examine only the compiled x86 program.
Here are some tips for using gdb
:
- To keep the whizbang from blowing up every time you type in a wrong input, you will want to learn how to set breakpoints.
- Here is a one-page
gdb
reference sheet (pdf, txt. The tools page has more links. - For other documentation, type
help
at thegdb
command prompt, or typeman gdb
, orinfo gdb
in the shell. Some people also like to rungdb
under gdb-mode in emacs
objdump -t whizbang
: This will print out the whizbang’s symbol table. The symbol table includes the names of all functions and global variables in the whizbang, the names of all the functions the whizbang calls, and their addresses. You may learn something by looking at the function names!
objdump -d whizbang
: Use this to disassemble all of the code in the whizbang. You can also just look at individual functions. Reading the assembler code can tell you how the whizbang works. Although objdump -d
gives you a lot of information, it does not tell you the whole story. Calls to system-level functions may look cryptic. For example, a call to sscanf
might appear as:
8048c36: e8 99 fc ff ff call 80488d4 <_init+0x1a0>
To determine that the call was to sscanf
, you would need to disassemble within gdb
.1
strings -t x whizbang
: This utility will display the printable strings in your whizbang and their offset within the whizbang.
Looking for a particular tool? How about documentation? Do not forget, the commands apropos
and man
are your friends. In particular, man ascii
is very useful. If you get stumped, come to office hours or feel free to post questions to the course’s Google group.
Hints: If you are having trouble figuring out what your whizbang is doing, here are some hints for what to think about at each stage: (1) comparison, (2) loops, (3) switch statements, (4) recursion, (5) pointers and arrays, (6) sorting linked lists.
Submitting Your Work
- Share your Bitbucket repository for this assignment with us: add bpw as admin.
- Commit your final revisions of
incantations.txt
anddescriptions.txt
- Push your latest revision to your Bitbucket repository.
- Double check that you have submitted by either:
hg status
andhg outgoing
should both give empty output.- The version of the file that appears on the Bitbucket webpage for your repository is the one we will get.
-
If you are curious about what’s going on, look ahead to linking and loading, CSAPP Chapter 7. ↩