Buffer
Assignment: Buffer
- Assign: Friday, 12 Mar
- Due: 4:00pm Friday, 19 Mar
- Policy: Individual graded synthesis assignment
- Code: cs240 start buffer --solo
- Submit: git commit and git push your completed code.
- Reference:
Contents
- Assignment: Buffer
- Overview
- Setup
- Tasks
- Preparatory Exercises
- The laptop Executable
- Tools for Crafting Exploits
- Running and Testing Exploits
- Exploits
- Submission
- Grading
- Extra Fun Mayhem
Overview [1]
Boring version: This assignment helps you develop a detailed
understanding of the call stack organization by deploying a series
of buffer overrun attacks on a vulnerable executable file called
laptop.
Silly version: Impressed by your recent reverse engineering
adventure, an anonymous Wellesley alum contacts you for assistance
subverting the laptop of an evil mastermind bent on, you know,
something evil. Your task is to exploit vulnerabilities in the
laptop’s software with C’s catch-fire semantics by providing
carefully crafted inputs that will cause buffer overflows and lead to
the self-destruction of the laptop (and its evil whatchyamacallits)
in increasingly alarming ways. (Do not worry, neither your computer
nor ours will explode as a result of this assignment.)
Ethics: In this assignment, you will gain firsthand experience with exploits of a common type of security vulnerability in operating systems and network servers. Our purpose is to help you learn about the runtime operation of programs and to understand the nature and impact of this form of security weakness so that you can avoid it when you write system code. We do not condone the use of these or any other form of attack to gain unauthorized access to any system resources. There are criminal statutes governing such activities.
Goals
- To understand the procedure call abstraction and the details of its implementation with the stack discipline.
- To understand the far-reaching impacts of system design choices, especially through security implications of the call stack in a language that does not enforce memory safety.
- To understand the principles of buffer overrun vulnerabilities through practice exploits in a controlled environment.
- To scare yourself a bit when realizing that the same kind of vulnerability you exploited probably exists somewhere in the software powering your healthcare, transportation, utilities, and more.
Time Reports
According to self-reported times on this assignment from Fall 2018:
- 25% of students spent <= 7 hours.
- 50% of students spent <= 10 hours.
- 75% of students spent <= 10 hours.
Setup
Get your repository with cs240 start buffer --solo.
Your starter repository contains the following files:
- responses.txt: file for English descriptions of your exploits
- exploit1.hex, exploit2.hex, exploit3.hex, exploit4.hex: files for Exploits 1-4
- hex2raw: utility to convert human-readable exploit descriptions written in hexadecimal to raw bytes
- id2cookie: utility to convert user ID to unique “cookie” value
- Makefile: recipes to test your exploits
- laptop: executable you will attack
- laptop.c: important parts of C code used to compile laptop
Create your cookie: Most attacks in this assignment will require
you to make a unique 8-byte “cookie”
value show up in places where it ordinarily would not. This value
will also determine the exact behavior of your executable. To
create your personalized cookie, run make cookie. This will print
your cookie in hex and store your CS username and cookie in the files
id.txt and cookie.txt, respectively.
Tasks
You must craft exploit strings that accomplish four increasingly
sophisticated buffer overrun attacks when provided as input to the
vulnerable laptop executable.
Each exploit is described below.
Submit two parts for each exploit:
- Exploit string (input): Write your exploit string in hex2raw input format in each of the files exploit1.hex through exploit4.hex.
- Description questions: In the separate responses.txt file, answer the questions given with each exploit succinctly to describe how your exploit works.
  - Many of these questions request only a couple of words or an instruction listing from your disassembled code.
  - Prose answers should focus on the general meaning of code and data rather than specific numbers or addresses (e.g., “return address”, not “0x4067c5”).
You may find it helpful to use these questions to help guide your exploit development.
Grading considers both the effectiveness of your exploits and your descriptions of how they work.
The remainder of this document describes:
- Preparatory exercises.
- The executable you will attack.
- Tools and techniques to use while constructing an exploit. (Skim, then return when working on Exploit 1.)
- How to run and test your exploits. (Skim, then return when working on Exploit 1.)
- The requirements and questions for each exploit.
- The grading criteria.
Preparatory Exercises
As you read this document, complete these exercises to familiarize yourself with stack frame layout, details of vulnerable functions, and tools for constructing exploits.
Preparation is your ticket for assistance.
- You must complete the preparatory exercises (and show evidence) before asking questions about code or debugging on the main assignment.
- You may ask questions on preparatory exercises at any time.
- Make sure you completed the setup above, including baking your “cookie.”
- As you read about the laptop executable, disassemble it to find getbuf.
- Draw the call stack frame for a call to getbuf right before it calls Gets, using the conventions from class and lab. Label the positions and sizes of as many parts of the frame as you can recover.
- On your call stack drawing, simulate a call to getbuf with a sample input string of your choosing by following the C code in laptop.c, showing any updates to the call stack bounds or content. Do not bother simulating Gets at the x86 level; take its functionality at face value as documented (or use the C code).
- Remember these later to save time:
  - Which exploits will run alone without GDB? Which exploits work only under GDB?
  - What is the purpose of hex2raw?
  - What are the steps for running your exploit? For testing all exploits?
The laptop Executable
The laptop executable requires a user ID argument on the command
line and reads a string from standard input once it starts up. The
user ID customizes stack layout and verifies a unique “cookie” value
that your attacks must provide.
Usage
To run the laptop executable:
$ ./laptop -u your_cs_username
Type string:
Alternatively, since your username was saved in a file when you made your cookie earlier, you can also use a subshell to pass the contents of this file as an argument:
$ ./laptop -u $(cat id.txt)
Type string:
Input Vulnerability
The laptop executable reads a string from standard input with the
function getbuf():
unsigned long long getbuf() {
    char buf[36];
    // ...
    unsigned long long val = (unsigned long long)Gets(buf);
    // ...
    return val % 40;
}
The full version of this function contains more code for an optional
additional challenge. The part shown here is sufficient for the
required parts of this assignment. The key feature to note is that
getbuf() calls the function Gets(), passing the address of its
local array buf, which is allocated on the stack with space
for 36 chars.
The function char* Gets(char* buf) is similar to the standard C library function
char* gets(char* buf). It reads a string from standard input, terminated by a
newline character ('\n'), and stores the characters of the string,
followed by a null terminator ('\0'), starting at the memory address
given by its argument, buf. It returns its argument.
Neither Gets() nor gets() has any way to determine whether there
is enough space at the destination to store the entire
string. Instead, they simply copy the entire string, assuming the
destination is large enough and thus possibly over-running the bounds
of the storage allocated at the destination.
If the input string read by getbuf() is less than 36 characters
long, it is clear that getbuf() will return some value less than
0x28 (that is, 40 in decimal), as shown by the following execution
example:
$ ./laptop -u your_cs_username
Type string: Acromantula!
Dud: getbuf returned 0x20
The value returned might differ for you, since it is derived from the
address of buf on the stack, which may vary between systems.
Running the laptop under gdb will also yield different values
than it does outside gdb.
Typically, an error occurs if we type a longer string:
$ ./laptop -u your_cs_username
Type string: This string is too long and it starts overwriting things.
Ouch!: You caused a segmentation fault!
As the error message indicates, over-running the buffer typically causes the program state (e.g., the return addresses and other data structures that were stored on the stack) to be corrupted, leading to a memory access error. Your task is to be more clever with the strings you feed laptop so that it does more interesting things. These are called exploit strings.
Disassembly
As in the previous assignment, use gdb or objdump to
disassemble the laptop executable whenever you need to inspect its
contents. You do not need to start, run, or single-step into the
code you want to inspect. For example, to disassemble getbuf, start
GDB with gdb ./laptop then type disas getbuf at the GDB prompt.
You may encounter the leaveq instruction in the laptop
executable. This instruction is a historical artifact tied to how the
%rbp register was used before x86-64 (when it was %ebp). The
leaveq instruction is equivalent to the following pair of
instructions in order:
mov %rbp, %rsp
popq %rbp
Tools for Crafting Exploits
Constructing exploits involves tricky tasks like writing untypeable characters and determining the byte encoding of x86 instructions. Use the techniques below to simplify your job.
Formatting Exploit Strings with hex2raw
Each ASCII character in a string is represented by one byte. For example, 'A' is
represented by the byte value 0x41. While your exploits will be delivered under the guise
of strings, they will embed sequences of bytes encoding addresses,
numbers, or other non-character data. It is hard enough to map each
desired byte value in your exploit back to a character by hand, but
often, the specific bytes required do not even correspond to any
typeable or printable ASCII characters, making it “difficult” to type
your exploit string on a keyboard or view it on the screen. Do not
try to encode your exploit by hand!
We have provided a tool called hex2raw to encode exploit strings:
- The input to hex2raw is a human-readable text description of a byte sequence where each byte is written as a pair of hexadecimal digits. Successive bytes may be separated by spaces.
- The output of hex2raw is a raw byte sequence, where each byte has the hexadecimal value described by the corresponding pair of characters in the input.
Suppose we want the sequence of bytes whose values are the hexadecimal
numbers 0x41, 0x42, 0x43, 0x1b. These same values, when interpreted
with the ASCII encoding, mean the characters 'A', 'B', 'C',
followed by the ASCII “escape” (ESC) character, which is not treated as a
string character when typed on the keyboard or printed to the terminal
output. Given the input 41 42 43 1b, the hex2raw utility will
output the desired 4-byte sequence.
To run hex2raw, type the series of hexadecimal byte value
descriptions you want in a file (e.g., exploit1.hex for Exploit 1).
Following our example, we could save the string 41 42 43 1b into the
file exploit1.hex using Emacs. Then run:
$ ./hex2raw < exploit1.hex > exploit1.bytes
The shell’s input redirection symbol < instructs the command-line shell to
use the contents of exploit1.hex as standard input to hex2raw,
instead of looking for input from the keyboard. The shell’s output
redirection symbol > instructs the command-line shell to store the
standard (printed) output of hex2raw into a file called
exploit1.bytes. Input and output redirection (< and >) are
general features of the command-line shell that can be used
independently and with any executable command.
Once the exploit string byte sequence is stored into the file
exploit1.bytes, run laptop with the contents of the file
exploit1.bytes as input:
$ ./laptop -u your_cs_username < exploit1.bytes
Naturally, as with compiled source code, if you update your
exploit string specification in exploit1.hex, you must run hex2raw
again to translate the new version to a byte sequence in
exploit1.bytes to use this new exploit with the laptop.
Warning: do not use 0A
Your exploit string must not contain byte value 0x0A
(0A in hex2raw input) at any intermediate position,
since this is the ASCII code for newline ('\n'). When Gets()
encounters this byte, it will assume you intended to terminate
the string input. hex2raw will warn you if it encounters this
byte value.
Running and Testing Exploits
Test all exploits (used for grading):
- Save your exploits in hex2raw input format in the proper files.
- Run make test to translate and test each exploit and generate a summary.
Run an individual exploit:
- Write the exploit string in hex2raw input format in, e.g., the file exploit1.hex.
- Translate it to raw bytes with hex2raw:
  $ ./hex2raw < exploit1.hex > exploit1.bytes
- Run it directly (possible for Exploits 1 and 2):
  $ ./laptop -u your_cs_username < exploit1.bytes
  or under gdb (required for Exploits 3 and 4):
  $ gdb ./laptop
  [... gdb startup output ...]
  (gdb) run -u your_cs_username < exploit1.bytes
GDB Scripts
When using gdb, you may find it useful to save a series of gdb
commands to a text file (e.g., commands.txt) and then use the -x
commands.txt flag, which runs each line of the file as a command in
gdb. This saves the trouble of retyping the commands every time you
run gdb. You can read more about the -x flag in gdb’s man
page.
Exploits
Save your buffer overrun exploit strings in hex2raw input format in the files
exploit1.hex, exploit2.hex, exploit3.hex, exploit4.hex.
Exploit 1: Smoke
The function getbuf() is called within laptop by a function test():
void test() {
    volatile unsigned long long val;
    volatile unsigned long long local = 0xdeadbeef;
    char* variable_length;
    entry_check(3); /* Make sure entered this function properly */
    val = getbuf();
    if (val <= 40) {
        variable_length = alloca(val);
    }
    entry_check(3);
    /* Check for corrupted stack */
    if (local != 0xdeadbeef) {
        printf("Sabotaged!: the stack has been corrupted\n");
    } else if (val == cookie) {
        printf("Boom!: getbuf returned 0x%llx\n", val);
        if (local != 0xdeadbeef) {
            printf("Sabotaged!: the stack has been corrupted\n");
        }
        if (val != cookie) {
            printf("Sabotaged!: control flow has been disrupted\n");
        }
        validate(3);
    } else {
        printf("Dud: getbuf returned 0x%llx\n", val);
    }
}
When getbuf() executes its return statement, the program ordinarily resumes execution within function test(). Within the file laptop, there is a function smoke():
void smoke() {
    entry_check(0); /* Make sure entered this function properly */
    printf("Smoke!: You called smoke()\n");
    validate(0);
    exit(0);
}
Your task is to get laptop to execute the code
for smoke() when getbuf() executes its
return statement, rather than returning to test(). You
can do this by supplying an exploit string that overwrites the stored
return pointer in the stack frame for getbuf() with the
address of the first instruction in smoke. Note that
your exploit string may also corrupt other parts of the stack state,
but this will not cause a problem, because smoke() causes
the program to exit directly.
Advice
- All the information you need to devise this exploit string can be determined by examining a disassembled version of laptop.
- Be careful about byte ordering (i.e., endianness).
- You might want to use gdb to step the program through the last few instructions of getbuf() to make sure it is doing the right thing.
- The placement of buf within the stack frame for getbuf() depends on which version of gcc was used to compile laptop. You must pad the beginning of your exploit string with the proper number of bytes to overwrite the return pointer. The values of these bytes can be arbitrary.
- Don’t forget to use hex2raw to simplify your job.
Description Questions for Exploit 1
Answer these questions succinctly in responses.txt.
- (a) During a successfully exploited execution of the laptop, one crucial control-flow instruction is affected by your exploit string data in a way that causes it to choose a different next instruction to execute compared to normal execution (in the absence of buffer overflow), and allows the attack to begin executing different code than usual. What is the instruction address and assembly code for that crucial control-flow instruction in the laptop executable?
- (b) What part of your exploit string (described as a byte offset from the start of the string) causes the instruction from (a) to behave differently than normal? Why? (Write a sentence or two.)
- (c) What instruction executes next after the instruction in (a) in a normal execution? Please provide an instruction address and the assembly code.
- (d) What instruction executes next after the instruction in (a) in your exploited execution? Please provide an instruction address and the assembly code.
Exploit 2: Fizz
Within the laptop there is also a function fizz():
void fizz(int arg1, char arg2, long arg3,
          char* arg4, short arg5, short arg6, unsigned long long val) {
    entry_check(1); /* Make sure entered this function properly */
    if (val == cookie) {
        printf("Fizz!: You called fizz(0x%llx)\n", val);
        validate(1);
    } else {
        printf("Misfire: You called fizz(0x%llx)\n", val);
    }
    exit(0);
}
Similar to Exploit 1, your task is to get laptop to
execute the code for fizz() rather than returning
to test. In this case, however, you must make it appear
to fizz as if you have passed your cookie as its
argument. You can do this by encoding your cookie in the appropriate
place within your exploit string.
Advice
- Recall that the first six arguments are passed in registers and additional arguments are passed on the stack. Your exploit code needs to write to the appropriate place within the stack. This explains our somewhat contrived fizz parameters.
- You can use gdb to get the information you need to construct your exploit string. Set a breakpoint within getbuf() and run to this breakpoint. Determine key features such as the address of val and the location of the buffer.
Description Questions for Exploit 2
Answer these questions succinctly in responses.txt.
- (a) The retq instruction in getbuf uses one word of your exploit string as a return address. Describe how each subsequent word of the exploit string is interpreted by fizz, including how it finds your cookie as val, and why each of these words must be at its position to allow fizz to make this interpretation.
- (b) What instruction in fizz finds the value of the val argument? Give the instruction’s address and assembly code.
- (c) Where does the instruction from (b) find val relative to the top of the call stack? (Give a byte offset.)
Exploit 3: Bang
A much more sophisticated form of buffer attack involves supplying
a string that encodes actual machine instructions. The exploit string
then overwrites the return pointer with the starting address of these
instructions. When the calling function (in this
case getbuf) executes its ret instruction,
the program will start executing the instructions on the stack rather
than returning. With this form of attack, you can get the program to
do almost anything. The code you place on the stack is called
the exploit code. This style of attack is tricky, though,
because you must get machine code onto the stack and set the return
pointer to the start of this code.
You must run laptop under gdb for this exploit to
succeed. (Modern systems use memory protection mechanisms to prevent
execution of memory locations in the stack and guard against exactly
this type of attack. Since gdb works a little differently than
normal program execution, it allows the exploit to succeed.)
Within the file laptop there is a function bang():
unsigned global_value = 0;
void bang(unsigned long long val) {
    entry_check(2); /* Make sure entered this function properly */
    if (global_value == cookie) {
        printf("Bang!: You set global_value to 0x%llx\n", global_value);
        validate(2);
    } else {
        printf("Misfire: global_value = 0x%llx\n", global_value);
    }
    exit(0);
}
Similar to Exploits 1 and 2, your task is to get laptop to execute
the code for bang() rather than returning to test(). Before this,
however, you must set global variable global_value to your
cookie. Your exploit code should set global_value, push the address
of bang() on the stack, and then execute a ret instruction to
cause a jump to the code for bang().
Byte-Encoding Instructions for Exploit Code
When including instructions as part of an exploit
payload, you must use the instruction encoding as machine code, the byte sequence
used to encode an instruction like pushq %rax for the machine. This
is not the byte sequence representing the string "pushq %rax".
Use gcc as an assembler and objdump as a disassembler to generate
the byte codes for instruction sequences. Suppose we
write a file example.s containing the following assembly code:
# Example of hand-generated assembly code
movq $0x1234abcd,%rax # Move 0x1234abcd to %rax
pushq $0x401080 # Push 0x401080 on to the stack
retq # Return
The code can contain a mixture of instructions and data. Anything
to the right of a # character is a comment.
We can now assemble and disassemble this file, saving the disassembler’s description of the binary object code:
$ gcc -c example.s
$ objdump -d example.o > example.d
The generated file example.d contains the following lines:
   0:  48 c7 c0 cd ab 34 12    mov    $0x1234abcd,%rax
   7:  68 80 10 40 00          pushq  $0x401080
   c:  c3                      retq
Each line shows a single instruction. The number on the left indicates
the starting address (starting with 0), while the hex digits after the
: character indicate the byte codes for the instruction, shown as
individual bytes in memory order from left to right. (Do not
flip them.) Thus, we can see that the instruction pushq $0x401080
has a hex-formatted byte code of 68 80 10 40 00 that could be
entered into an exploit string. The entire byte sequence to encode
the above instructions would be: 48 c7 c0 cd ab 34 12 68 80 10 40 00 c3.
Advice
- You must run laptop under gdb for this exploit to succeed.
- Determining the byte encoding of instruction sequences by hand is tedious and prone to errors. You can let tools do all of the work by writing an assembly code file containing the instructions and data you want to put on the stack. Assemble this file with gcc and disassemble it with objdump. This will allow you to see the byte sequence to include in your exploit. (A brief example of how to do this is included in the Byte-Encoding Instructions section above.)
- Keep in mind that your exploit string depends on your computer, your compiler, and even your cookie.
- Watch your use of address modes when writing assembly code. Note that movq $0x4, %rax copies the literal value 0x0000000000000004 into register %rax, whereas movq 0x4, %rax copies the contents of memory at address 0x0000000000000004 into %rax. If you forget a $ character, your code is likely to cause a segmentation fault, because the literal number that you mistakenly wrote as a memory address is most likely not a legal memory address.
- Due to restrictions on the total size of instruction encodings, x86 does not support all combinations of operands. For example, it is not possible to write a movq instruction with a large literal source operand and an absolute memory address. If you get errors from the assembler, try breaking down instructions into multiple steps, storing intermediate values in registers.
- Do not attempt to use either a jmp or a call instruction to jump to the code for bang(). These instructions use PC-relative addressing, which is tricky to set up correctly in this attack. Instead, push an address on the stack and use the ret instruction.
Description Questions for Exploit 3
Answer these questions succinctly in responses.txt.
- (a) Starting from the ret instruction in getbuf, list the sequence of instructions that the computer executes under your exploit up through the first instruction in bang. For each instruction, list the instruction address and its assembly code.
- (b) Describe how the instruction sequence in (a) changes memory contents, register contents, and program counter (i.e., %rip). Write a few sentences or annotate your listing above.
Exploit 4 [Independent Problem]: Boom
You must run laptop under gdb for this exploit to
succeed.
Our preceding attacks have all caused the program to jump to the
code for some other function, which then causes the program to
exit. As a result, it was acceptable to use exploit strings that
corrupt the stack, overwriting the saved value of
register %rbp and the return pointer.
The most sophisticated form of buffer overrun attack causes the
program to execute some exploit code that patches up the stack and
makes the program return to the original calling function
(test() in this case). The calling function is oblivious
to the attack. This style of attack is tricky, though, since you must:
(1) get machine code onto the stack, (2) set the return pointer to the
start of this code, and (3) undo the corruption made to the stack
state.
Your job is to supply an exploit string that will
cause getbuf() to return your cookie back
to test(), rather than its usual small value (less than 0x28). You can see in the
code for test() that this will cause the program to go
Boom!. Your exploit code should set your cookie as the
return value, restore any corrupted state, push the correct return
location on the stack, and execute a ret instruction to
really return to test().
Advice
- You must run laptop under gdb for this exploit to succeed.
- The leaveq instruction (a historical artifact tied to how the %ebp register was used before x86-64) is equivalent to the following pair of instructions in order:
  mov %rbp, %rsp
  popq %rbp
- In order to overwrite the return address slot on the stack, your exploit string must also cover all the items saved on the stack between the buf array and the return address slot. So far, the code we have attempted to run with the exploit has not depended on this data, but a “normal”-looking return to test() may depend on it. Consider carefully what is stored here on the stack during getbuf, its original source, where it is stored after getbuf completes, and how test may use it. Use gdb to inspect the disassembled code of getbuf and test. Determine how you can organize your exploit to avoid disturbing stack data in getbuf on which test later relies.
- Let tools such as gcc and objdump do all of the work of byte-encoding the instructions.
- Keep in mind that your exploit string depends on your cookie, your computer, and your compiler.
Description Questions for Exploit 4
Answer these questions succinctly in responses.txt.
- (a) Consider normal non-exploited execution. Consider how the getbuf procedure passes its return value to the test procedure when test calls getbuf.
  - Give the instruction address and assembly code of the first instruction (a1) in the test procedure that uses the return value of the getbuf procedure call.
  - Give the instruction address and assembly code of the instruction (a2) that stores the return value that is later used by instruction (a1).
  - Does instruction (a2) execute before or after the ret instruction in the getbuf procedure?
- (b) Now, consider exploited execution using your input. Consider how the getbuf procedure passes its return value to the test procedure when test calls getbuf.
  - Give the instruction address and assembly code of the instruction (b1) that stores the return value that is later used by instruction (a1).
  - Does instruction (b1) execute before or after the ret instruction in the getbuf procedure?
- (c) The following instruction (c) in the test procedure appears shortly after the call to the getbuf procedure.
  0x0000000000401019 <+41>: movq -0x18(%rbp), %rbx
  Explain how this instruction uses register %rbp.
- (d) Consider normal non-exploited execution.
  - Many instructions may update register %rbp. Give the instruction address and assembly code of the specific instruction (d1) that puts the value into %rbp that instruction (c) later finds in %rbp. In other words, instruction (d1) is the last instruction to update %rbp before instruction (c) uses %rbp. (See the Exploit 4 Advice section.)
  - Describe the source register or memory location (d2) where instruction (d1) finds the value to put in %rbp. If (d2) is a register, give the register name. If (d2) is a memory location, describe its position in the stack when it is used by instruction (d1).
  - Trace the origin of this value back further. Give the instruction address and assembly code of the instruction (d3) that put the value in the register or memory location (d2) where instruction (d1) found it.
  - Explain how instruction (d1), location (d2), and instruction (d3) relate to conventions around procedure calls.
- (e) Now, consider exploited execution using your input.
  - Give the instruction address and assembly code of the specific instruction (e1) that puts the value into %rbp that instruction (c) later finds in %rbp. In other words, instruction (e1) is the last instruction to update %rbp before instruction (c) uses %rbp. (See the Exploit 4 Advice section.)
  - (e2) Whether or not instruction (e1) is the same as instruction (d1), describe how your exploit preserved the behavior of instruction (c). If your exploit interacts with instruction (d1), location (d2), instruction (d3), or a distinct instruction (e1), explain how.
  - Explain how instruction (c) might cause a segmentation fault or other memory error if your exploit did not take the steps described by (e2).
- (f) Reflect on what you have accomplished. You caused a program to execute arbitrary machine code of your own design simply by choosing a particular input. You have done so in a sufficiently stealthy way that the program did not realize that anything was amiss. Surely this is a significant problem!
Submission
Submit: The course staff will collect your work directly from your hosted repository. To submit your work:
- Test your source code files one last time. Make sure that, at a minimum, submitted source code is free of syntax errors and any other static errors (such as static type errors or name/scope errors). In other words: the code does not need to complete the correct computation when invoked, but it must be a valid program. We will not grade files that do not pass this bar.
- Make sure you have committed your latest changes. (Replace FILES with the files you changed and MESSAGE with your commit message.)
  $ git add FILES
  $ git commit -m "MESSAGE"
- Run the command cs240 sign to sign your work and respond to any assignment survey questions.
  $ cs240 sign
- Push your signature and your latest local commits to the hosted repository.
  $ git push
Confirm: All local changes have been submitted if the output of git status shows both:
- Your branch is up to date with 'origin/master', meaning all local commits have been pushed
- nothing to commit, meaning all local changes have been committed
Resubmit: If you realize you need to change something later, just repeat this process.
Grading
The assignment is graded from a maximum of 100 points:
- Working exploits (80 points): run make test to check all of your exploits.
  - Exploit 1: 25 points
  - Exploit 2: 25 points
  - Exploit 3: 15 points
  - Exploit 4 [Independent Problem]: 15 points
- Questions (20 points):
  - Your answers for Exploit 4 and at least one other exploit will be graded.
Extra Fun Mayhem
This is an optional fun challenge. Try it after finishing the required parts of the assignment if you want to see the full power of buffer exploits.
execve is a system call that replaces the currently running
program with another program inheriting all the open file descriptors. What
are the limitations of the exploits you have performed so far? How could calling
execve allow you to circumvent these limitations? If you have time,
try writing an additional exploit (mayhem.hex) that uses execve and another
program to print a message.
[1] This document is an alternative (s/ia32 pyrotechnics/x86-64 incoherent magic references/g) description for the old-style CSAPP Buffer Lab, which is available in ia32 form on the CSAPP website.