code Reverse engineer compiled x86 runes to recover the Sourceror’s Code.

Contents

Overview

Silly version: Overhearing a hushed conversation in Prof. McCodegal’s office, you learn that the powerful, yet volatile, Sourceror’s Code lies hidden away in the depths of the Science Castle, guarded by six phases of magical protection. It holds the key to eternal coding enlightenment (and also the key to finishing your homework). While wandering the Science Castle corridors, you find an unassuming door you believe hides the Code. Meanwhile, the tutors who shall not be named are launching their own quest to steal away the Sourceror’s Code for the questionable machinations of their shadowy organization. You must retrieve the Sourceror’s Code before it is too late!

Fortunately, the designers of the six magical protections have left behind their recipes in ancient x86 runes. You must decipher the runes to determine how the protections work and discover the magical incantations that will disarm them. But beware, these protections are not to be taken lightly – speaking the wrong incantation will trip an alarm, ensnare you in a magical trap, and cause your runes to self-destruct! By disarming all six protections, you will recover the Sourceror’s Code and save the day.

Serious version: We have given you a compiled x86 binary executable without the C source code that created it. You must find inputs that satisfy six computational phases of the program to avoid tripping the alarm (and deleting your executable1). To do this, you will disassemble the executable and work with the x86 code to reconstruct what the program computes with each input. You will submit the six inputs plus your descriptions of what each phase computes (in English or C source code). You receive points for each correct solution and clear description.

Tips for Success
  • Start early. This assignment will be as challenging (and fun) as Bit Transfiguration, but in a different way. We will do the first phase together in lab.
  • You cannot do this assignment in one sitting.2 Allocate time to work on it repeatedly with other things between.
  • Use the tools and techniques introduced by lab and this document.
  • Do not worry if you get stumped temporarily. Go do something else and come back refreshed.
  • Share reverse engineering strategies, patterns, and ideas with each other.
  • Take plenty of breaks for puns, laughing, sprinting around the microfocus, and other wheezes. Have fun!

Setup

Do this assignment in one of the CS 240 computing environments. The runes in this assignment require an x86-64 processor running Linux. (Most insurance policies do not cover loss of time or eyebrows due to use of other environments.) In fact, it appears as though the magical protections may ensnare you immediately (and vanish your runes) when run in other environments.

Clone your repository via the usual process (see the assignment manifest).

Your x86 Runes Adventure Kit contains the following files:

  • descriptions.txt: File in which you describe the x86 code
  • incantations.txt: File in which you write your six inputs
  • runes: Your individualized executable that you will use for the assignment
  • sample: A sample binary executable that we will use together only in lab
  • main.c: C source file with the executable’s main function

Preparatory Exercises

Complete the associated lab assignment and activity. These will help you learn basic reverse engineering tools and techniques and disarm the first phase of a sample executable very similar to your own.

Tasks

Your executable has six phases of protection, each requiring an incantation (input) to disarm. You must use various principled tools and techniques to reverse engineer your executable before attempting to disarm it. Entering an incorrect incantation (input) at any phase trips an alarm and notifies the instructors, so guessing randomly or by brute force is inadvisable.

A lab activity helps you learn basic reverse engineering tools and techniques and disarm the first phase of a sample executable. Later phases get progressively harder to disarm, but the expertise you gain with each phase should offset the increased difficulty. If you are stumped, note that *hints* about each phase appear as bad puns in the messages that the executable prints, as well as here: (1) comparison, (2) loops, (3) switch statements, (4) recursion, (5) pointers and arrays, (6) sorting linked lists.

Unsubtantiated rumors suggest there may be a secret seventh protection hidden among the six that can be disarmed for extra fun, credit, and reward.

Submit two parts for each phase:

  1. Incantation (input): Write your disarming incantations for each of the six phases on separate lines in incantations.txt, in order.
  2. Description: In the separate descriptions.txt file, write a succinct paragraph or two for each phase, describing:
    • What the phase is computing with your input at a high level of abstraction.
    • Some key features of the x86 code that correspond to the high-level computation.

    Describe at a high level as if you are summarizing whatever C code compiled to this assembly/machine code. (Feel free to write C to describe what is computed.) Do mention a couple assembly details that were particular aha! moments or red flags that alerted you to this high-level structure, but do not give a line-by-line run-down of the assembly code.

    Keep notes along the way so you do not need to repeat the reverse engineering process to remember how it worked!

Usage

The executable ignores blank input lines. If you run your executable with a command line argument, for example:

$ ./runes incantations.txt

Then it will read input lines from the file incantations.txt until it reaches EOF (end of file), when it will switch over to read from stdin (standard input from the terminal). In a moment of weakness, the magical designers added this feature so you do not have to retype the solutions to phases you have already disarmed.

Text Encoding Matters

Make sure to add a new line (return) at the end of each phase input in incantations.txt, otherwise you may trip the alarm. emacs will do this automatically.

Do not edit or copy inputs from rich text applications, which sometimes use alternative text encodings of similar-looking characters or line endings.

Tools and Techniques

There are many ways to determine how to disarm your executable. You can examine it extensively to build a detailed mental model and determine exactly what it without ever running the program does. This is a useful skill, but is not always easy. You can also run the executable under a debugger, watch what it does step by step, perform disassembly along the way, and use this information to understand its behavior and disarm it. The most effective approach is often to combine elements of both techniques and develop your own new reasoning tools.

Please do not use brute force! You could try every possible key to find the right one (or write a program to do so), but the number of possibilities is so large that you will not be able to try them all in time.

There are many tools designed to help figure out both how programs work and what is wrong when they do not work. You will find the following tools (and hints on how to use them) useful in analyzing your runes.

gdb

To avoid accidentally triggering the protections, becoming ensnared, and temporarily losing your executable1, you must learn how to use a debugger to single-step through the assembly code, inspect state of register and memory, and use breakpoints. We will learn these techniques in lab. One of the nice side effects of this assignment is that you will become skilled at reasoning about the the execution of programs and using a debugger. These are crucial skills that will pay big dividends in the future.

Our key tool is gdb, the GNU Debugger. To run the executable using your incantations.txt under gdb, invoke gdb on the executable alone, then when using the run command inside gdb (or abbreviated to r), give the command-line arguments there:

$ gdb runes
... startup messages from gdb...
(gdb) run incantations.txt

Hint: break trip_alarm will save you time!

Documentation and quick reference for gdb:

objdump and strings

objdump -t runes: This will print out the executable’s symbol table. The symbol table includes the names of all functions and global variables in the executable, the names of all the functions the executable calls, and their addresses. You may learn something by looking at the function names!

objdump -d runes: Use this to disassemble all of the code in the executable. You can also just look at individual functions. Reading the assembler code can tell you how the executable works. Although objdump -d gives you a lot of information, it does not tell you the whole story. Calls to system-level functions may look cryptic. For example, a call to sscanf might appear as:

8048c36: e8 99 fc ff ff call 80488d4 <_init+0x1a0>

To determine that the call was to sscanf, you would need to disassemble within gdb.3

strings -t x runes: This utility will display the printable strings in your executable and their offset within the executable.

Looking for a particular tool? How about documentation? Do not forget, the commands apropos and man are your friends. In particular, man ascii is very useful. If you get stumped, come to office hours or feel free to post questions to the course’s Google group.

x86 Resources

CSAPP Chapter 3 is an excellent resource for understanding x86-64 assembly and machine code. There are several online references as well:

Note that there are two common x86 assembly code syntaxes: Intel and AT&T (we use the latter). CSAPP (p. 177) and Wikipedia describe the differences (including switched operand order).

Grading

The assignment is graded from a maximum of 100 points.

You may receive up to 85 points for disarming phases:

  • Phase 1: 15 points
  • Phase 2: 15 points
  • Phase 3: 15 points
  • Phase 4: 15 points
  • Phase 5: 15 points
  • Phase 6: 10 points

You may receive up to 15 points for descriptions:

  • We will grade descriptions of one or two phases (chosen randomly), by the criteria above.

If a secret phase exists and you disarm it, you will receive a number of extra points that will become non-secret only if you disarm the hypothetical secret phase…

  1. Thanks to version control, it will be recoverable. 2

  2. If you can do this assignment all in one sitting, come chat. We need to find you a challenge.