CS 240 Lab 9

Procedures and the Call Stack

CS 240 Lab 9

In lab this week, you will experiment with how the stack is used for procedure/function calls.

Recursive Program

Exercise 1: When compiled, the C program from the lab assignment produces the x86 assembly shown on the right.

C program X86 assembly
#include <stdio.h>
long getAndSumValues() {
    long x;

// start of getAndSumValues; allocate stack space

0x401136 <+0>:       sub $0x18,%rsp
    printf("Enter an integer: ");

// print the prompt

0x40113a <+4>:       mov $0x4006a8,%edi
0x40113f <+9>:       mov $0x0,%eax
0x401144 <+14>:      callq 0x400418 <printf@plt>
    scanf("%ld", &x);

// Read keyboard input as a long int into x.

0x401149 <+19>:      lea 0x8(%rsp),%rsi
0x40114e <+24>:      mov $0x4006b8,%edi
0x401153 <+29>:      mov $0x0,%eax
0x401158 <+34>:      callq  0x400438 <__isoc99_scanf@plt>
    if (x == 0) {
        return 0;
    } else {
        return x + getAndSumValues();
    }
}

// Either recurse (& accumulate) or return 0

0x40115d <+39>:      mov 0x8(%rsp),%rax
0x401162 <+44>:      test %rax,%rax
0x401165 <+47>:      jne 0x40116c <getAndSumValues+54>
0x401167 <+49>:      add $0x18,%rsp
0x40116b <+53>:      ret
0x40116c <+54>:      mov $0x0,%eax
0x401171 <+59>:      call 0x401136 <getAndSumValues>
0x401176 <+64>:      add 0x8(%rsp), %rax
0x40117b <+69>:      jmp 0x401167 <getAndSumValues+49>
int main() {
  long result;

// Start of the main function; align stack pointers

0x40117d <+0>:       sub $0x8,%rsp
    result = getAndSumValues();

// call getAndSumValues

0x401181 <+4>:       mov 0x0,%eax
0x401186 <+9>:       call 0x401136 <getAndSumValues>
    printf("The result = %ld\n", result);

// set parameters; print

0x40118b <+14>:      mov %rax, %rsi
0x40118e <+17>:      mov $0x402028, %edi
0x401193 <+22>:      mov $0x0,%eax
0x401198 <+27>:      callq 0x401030 <printf@plt>
    return 0;
}

// restore stack pointer and return

0x40119d <+32>:     mov $0x0,%eax
0x4011a2 <+37>:     add $0x8, %rsp
0x4011a6 <+41>:     retq

Using VSCode, create the program and save it as stack.c by copying and pasting the following code:

#include <stdio.h>

long getAndSumValues() {
    long x;
    printf("Enter an integer: ");
    scanf(" %ld",&x);
    if (x == 0) {
        return 0;
    } else {
        return x + getAndSumValues();
    }
}

int main() {
    long result;
    result = getAndSumValues();
    printf("The result = %ld\n",result);
    return 0;
}

Exercise 1:

Compile the code:

gcc -Wall --std=c99 -g -O -o stack.bin stack.c

Run the program under gdb, start the program, and disassemble getAndSumValues:

$ gdb ./stack.bin
(gdb) start
(gdb) disas getAndSumValues

Set a breakpoint right before the stack pointer is reset before the return, on the instruction:

add    $0x18,%rsp

(In the assembly above this would be at getAndSumValues+49, but if you got different assembly code just set the breakpoint on that instruction.)

continue the program, and enter the following values:

(gdb) continue
Enter an integer: 10
Enter an integer: 11
Enter an integer: 12
Enter an integer: 0

When you hit the breakpoint, use info reg rsp to find the current value of the stack pointer %rsp. Use the backtrace command (or bt for short) to show a summary of the currently active stack frames. How many stack frames are active?

Correct answer: 5

Of the active stack frames, how many of them are for getAndSumValues?

Correct answer: 4

What gdb command can be used to display multiple 8-byte values starting at the stack pointer?

bt
x /20gx $rsp
print $rsp
print *$rsp

Run that command, and observe the output. Which values from the backtrace output also appear in the raw stack data?

Example answer: The backtrace lists memory addresses of instructions for each function, indicating where code will pick up when we resume that stack frame. These are taken from the stored return address values on the stack.

How many 8-byte values are in each stack frame that you see?

Correct answer: 4

How could we have known this from looking at the assembly code, without examining the stack?

Example answer: The function call itself always pushes one address onto the stack, and the code for getAndSumValues starts with sub $0x18,%rsp, which is the only modification it makes to the stack. Since 0x18 in hex is 16 + 8 = 24 in decimal, that’s an additional 3 8-byte slots being reserved, for a total of four per stack frame.

Here’s what your stack data display should look like (possibly with some variation in the precise addresses and in the left column of values):

0x7fffffffd990: 0x0000000000000000      0x0000000000000000
0x7fffffffd9a0: 0x0000000000000000      0x0000000000401176
0x7fffffffd9b0: 0x0000000000000000      0x000000000000000c
0x7fffffffd9c0: 0x0000000000000000      0x0000000000401176
0x7fffffffd9d0: 0x0000000000000000      0x000000000000000b
0x7fffffffd9e0: 0x0000000000000001      0x0000000000401176
0x7fffffffd9f0: 0x0000000200000006      0x000000000000000a
0x7fffffffda00: 0x0000000000000800      0x000000000040118b
0x7fffffffda10: 0x0000000000000000      0x00007ffff7c3feb0
0x7fffffffda20: 0x0000000000000000      0x000000000040117d

Which of these values are the values that you entered when you ran the program?

Example answer: The ‘a’, ‘b’, and ‘c’ in the right-hand column are the values you entered.

Why might the values in the left-hand column change from computer to computer?

Example answer: These values are not actually used by the program: space is reserved for 3 8-byte values, but only one of them (at 0x8($rsp) which means “8 bytes above the bottom of the stack”) actually gets used. So the slots in the left-hand column (slots 0 and 2 of our stack frame) are never assigned a value, and thus retain whatever value they might have previously had when used by a prior stack frame before it was cleaned up.

Why does gdb seem to display the stack in “backwards” order?

Example answer: gdb shows values in increasing memory address order, but since the stack grows downwards, this is the opposite of how we draw it on the board.

Which register should we watch to see the return value from each function call frame (give the 3-letter name with a ‘%’ beforehand)?

Correct answer: %rax

gdb has a command called display which works like print but which will auto-display the value every time we hit a breakpoint. Which command will display the value of %rax for us automatically?

display $rax
info reg rax
display *$rax
display $rsp

Run that command, and then use c to continue one step at a time until the program terminates. What values of %rax did you observe?

  • 1st value before any continues: Correct answer: 0
  • 2nd value after first continue: Correct answer: 12
  • 3rd value: Correct answer: 23
  • 4th value: Correct answer: 33

Delete your breakpoints and display expressions and set a new breakpoint at the start of getAndSumValues. Then run from the start again:

(gdb) delete break
(gdb) delete display
(gdb) br getAndSumValues
(gdb) run

Enter the same sequence of values as before (10, 11, 12, 0), but this time inspect the stack each time your breakpoint gets hit. Notice how each invocation of the function adds one stack frame.

Mystery Program

Analyze the X86 code (below) for a function mystery, disassembled with gdb.

0x400474 <+0>:     push   %rbx
0x400475 <+1>:     mov    %rdi,%rbx
0x400478 <+4>:     mov    $0x1,%rax
0x40047d <+9>:     test   %rdi,%rdi
0x400480 <+12>:    je     0x40048f <mystery+27>
0x400482 <+14>:    lea    -0x1(%rdi),%rdi
0x400486 <+18>:    callq  0x400474 <mystery>
0x40048b <+23>:    imul   %rbx,%rax
0x40048f <+27>:    pop    %rbx
0x400490 <+28>:    retq

Exercise 2:

How many parameters does this function take?

Correct answer: 1 Explanation: Of the parameter-passing registers, %rdi is the only one used by this code.

Is register %rbx a callee-saved or caller-saved register?

Caller-saved
Callee-saved

Explain which two instructions the compiler included because of your answer above:

Example answer: The initial push %rbx and penultimate pop %rbx implement the callee-saved nature of the %rbx register: since we overwrite that register by doing mov $rdi,%rbx and it’s a callee-saved register, we are responsible for restoring its value by the end of our function call back to the original value it had before the call started.

What is the lea instruction accomplishing?

Example answer: It subtracts 1 from %rdi.

What is the base case for the recursion in this code?

Example answer: The code avoids recursing if it executes the jump at address +12. That jump depends on the test at +9, which in turn depends on the value of %rdi, the first argument to the function. So when the argument is 0, recursion stops. In that case %rax will be 1 because of the mov at address +4.

Describe the overall purpose of this function:

Example answer: It multiplies the number you give it by each smaller integer, down to 1, computing the factorial of the argument using recursion.

Assume that mystery is called from a main function with return address 0x401240, and that %rdi is 4 and %rbx is 0xff. Furthermore, assume that %rsp was 0x7f...538 before the call instruction was executed (you can include the 7f... in your answers).

How many stack frames of mystery will be generated overall?

Correct answer: 5

What values will %rdi take on for the start of each stack frame?

Example answer: 4, 3, 2, 1, and finally 0.

How many 8-byte slots does each stack frame for mystery take up?

Correct answer: 2

Exercise 3:

Draw four stack diagrams on the board (or draw one and take pictures as you update it). In each diagram, note the values of %rdi, %rbx, and %rax, as well as %rsp and %pc (the program counter). Show program counter values for the instruction that will execute next, and indicate with an arrow which block of memory %rsp points to, using 8-byte memory blocks. Draw diagrams for the following moments:

  1. The diagram showing the first function call frame, paused before the very first execution of the +18 callq instruction.
  2. The diagram showing the second function call frame, paused before the second execution of the +18 callq instruction.
  3. The diagram showing all of the stack frames produced, paused before the +27 pop instruction that starts the return process.
  4. The diagram showing the state of the stack just before executing the final iteration of the retq instruction at +28 that will return from the very first function call frame.

Compare your diagrams with these examples as you create each one to check your understanding of stack mechanics:

First diagram:

Example answer:

Stack diagram showing stack and register values, with “Frame”, “Addr”, “Value”, “Arrows”, and “Comments” columns above, and “Reg”, and “Value” columns below (with “Arrows” and “Comments” continuing). Address 0x7f...538 is the first address, with an unknown value. It’s part of the “main” frame. Next is 0x7f…530, which is part of the “mystery_0” stack frame and has value 0x401240 with comment “back to main” and an arrow indicating it’s a pointer value (the arrow doesn’t point to anything). The other half of the mystery_0 frame is address 0x7f…528 in red, with value 0xff and comment “old %rbx.” This line is the destination of a red arrow that comes from the %rsp register. Register %rsp has value 0x7f…528 in red. Register %pc has value 0x400486 with comment “mystery+18” and a gray arrow. Registers %rdi, %rbx, and %rax have values 3, 4, and 1. 

Second diagram:

Example answer:

Continuation of previous diagram, with two extra memory slots shown, belonging to the “mystery_1” stack frame, which is now the active frame. These slots have addresses 0x7f…520 and 0x7f…518, the latter colored red and the destination of the red arrow. These have values 0x40048b and 4, with a comment on the first value stating “after call in mystery.” The %rsp register now has value 0x7f…518 in red, and is still the source of the red arrow. %pc is unchanged, while %rdi has value 2, %rbx has value 3, and %rax has value 1. 

Third diagram:

Example answer:

Continuation of the second diagram, with 6 additional memory slots grouped into 3 stack frames for mystery_2, mystery_3, and mystery_4. %rsp is now 0x7f…4e8, which is still the bottom slot in the diagram; the arrow and red address both refer there. The top value of each stack frame is 0x40048b with comment “after call in mystery” in all 4 cases. The second value in each stack frame counts down: 3 at 0x7f…508 in the mystery_2 frame, 2 at 0x7f…4f8 in the mystery_3 frame, and 1 at 0x7f…4e8 in the mystery_4 frame. Register values for %rdi, %rbx, and %rax are 0, 0, and 1 respectively. %pc is now 0x40048f with comment “mystery+27.” 

Fourth diagram:

Example answer:

Final update to the previous diagram. The red address and value of %rsp is now 0x7f…530, pointing to the top slot in the mystery_0 frame. All slots below that are faded out to gray, and marked as “abandoned frames” in the “Frame” column, although their values haven’t changed. The comment for 0x7f…528 that read “old %rbx” now reads “old %rbx; just popped.” The value of the %pc is 0x400490, with comment “mystery+28.” The values of %rdi, %rbx, and %rax are 0, 0xff, and 24 respectively. The comment for %rdi is now “not restored as we return.” The comment for %rbx is “restored to main’s value.” The comment for %rax is “final result.” 

Read Six Numbers

phase_2 in the x86 assignment uses a function called read_six_numbers to gather input, and allocates space on the stack as part of that process. To disarm phase_2, it will be useful to understand how this works.

Exercise 4:

Go to your x86practice folder from last week’s lab, and start debugging sample.bin:

$ gdb ./sample.bin
(gdb) break trip_alarm
(gdb) start

Disassemble the function phase_2:

Note: phase_2 has one parameter, and it is a pointer to the string that is input by the user to solve phase_2. Therefore, %rdi will contain the pointer to the input when the function begins execution.

(gdb) disas phase_2
Dump of assembler code for function phase_2:
   0x0000000000401232 <+0>:     push   %rbp
   0x0000000000401233 <+1>:     push   %rbx
   0x0000000000401234 <+2>:     sub    $0x28,%rsp
   0x0000000000401238 <+6>:     mov    %rsp,%rsi
   0x000000000040123b <+9>:     callq  0x401465 <read_six_numbers>
   0x0000000000401240 <+14>:   lea 0x4(%rsp),%rbx
   0x0000000000401245 <+19>:    lea    0x18(%rsp),%rbp
   0x000000000040124a <+24>:    mov    -0x4(%rbx),%eax
   0x000000000040124d <+27>:    add    $0x5,%eax
   0x0000000000401250 <+30>:    cmp    %eax,(%rbx)
   0x0000000000401252 <+32>:    je     0x4010c8 <phase_2+39>
   0x0000000000401254 <+34>:    call   0x40181c <trip_alarm>
   0x0000000000401259 <+39>:    add    $0x4,%rbx
   0x000000000040125d <+43>:    cmp    %rbp,%rbx
   0x0000000000401261 <+46>:    jne    0x4010b9 <phase_2+24>
   0x0000000000401263 <+48>:    add    $0x28,%rsp
   0x0000000000401267 <+52>:    pop    %rbx
   0x0000000000401268 <+53>:    pop    %rbp
   0x0000000000401269 <+54>:    ret
End of assembler dump.

Notice lines +2 and +6 above. First stack space is made by subtracting from %rsp, and then the second parameter to read_six_numbers is being set up by copying the current %rsp to %rsi before the call to read_six_numbers.

How many empty 8-byte slots does the sub instruction add to the stack?

Correct answer: 5 Explanation: 0x28 is 2 16s and an 8.

How many 8-byte slots do we need to store 6 integers?

Correct answer: 3 Explanation: Each int is 4 bytes.

Accounting for all stack usage, how many 8-byte slots does the stack frame for a call to phase_2 include?

Correct answer: 8 Hint: Remember to include the return address as well as any pushed values. Explanation: In addition to the 5 slots allocated by sub, the two pushes beforehand each use 8 bytes, and the call instruction that entered the function uses 8 bytes to store the return address.

This code does not mention %rdi, but we know that read_six_numbers will use the %rdi value as its first parameter. Which function set %rdi, and why doesn’t phase_2 need to set it?

Example answer: %rdi was set by main when it called phase_2. Since phase_2 is passing its own first parameter directly to read_six_numbers, it doesn’t need to modify the value before making that call.

What do you think the purpose of read_six_numbers is? Where do you think the results of executing read_six_numbers will be stored?

Example answer: The purpose is probably to read in 6 numbers from the line of text that the user types, since its first argument is the user input string. Given that the pointer argument it gets is a copy of the stack pointer, they will be stored on the stack (in the stack frame for the calling function, i.e., phase_2).

Disassemble read_six_numbers and examine the code:

(gdb) disas read_six_numbers

Dump of assembler code for function read_six_numbers:
   0x0000000000401465 <+0>:    sub    $0x18,%rsp
   0x0000000000401469 <+4>:    mov    %rsi,%rdx
   0x000000000040146c <+7>:    lea    0x4(%rsi),%rcx
   0x0000000000401470 <+11>:   lea    0x14(%rsi),%rax
   0x0000000000401474 <+15>:   mov    %rax,0x8(%rsp)
   0x0000000000401479 <+20>:   lea    0x10(%rsi),%rax
   0x000000000040147d <+24>:   mov    %rax,(%rsp)
   0x0000000000401481 <+28>:   lea    0xc(%rsi),%r9
   0x0000000000401485 <+32>:   lea    0x8(%rsi),%r8
   0x0000000000401489 <+36>:   mov    $0x402325,%esi
   0x000000000040148e <+41>:   mov    $0x0,%eax
   0x0000000000401493 <+46>:   callq  0x400b20 <__isoc99_sscanf@plt>
   0x0000000000401498 <+51>:   cmp    $0x5,%eax
   0x000000000040149b <+54>:   jg     0x4014a2 <read_six_numbers+61>
   0x000000000040149d <+56>:   callq  0x401392 <trip_alarm>
   0x00000000004014a2 <+61>:   add    $0x18,%rsp
   0x00000000004014a6 <+65>:   retq
End of assembler dump.

Which line in the code above calls a C standard library function (give the + offset, including the +)?

Correct answer: +46 Explanation: The __isoc99_ part of the name identifies the specific C library version; ISO stands for “International Standards Organization” and C99 is the 1999 standard for the C language. The @plt part indicates that it’s part of the “process linkage table” which you can read more about via that link if you want, but which we won’t ask you to understand for this class.

Which standard library function is it calling (just the function name as it would have been used in C code)?

Correct answer: sscanf

Look up the behavior of sscanf by searching online, and answer these questions:

What does sscanf do?

Example answer: It reads formatted values from a string, and stores them into variables pointed to by pointers it is given in addition to the string to scan.

How many parameters does sscanf have?

Example answer: It has a variable number of parameters, but always at least 3: the first parameter is the string to scan, the second is the format to recognize, and the third and subsequent parameters are pointers to variables where the scan results should be placed. For each % escape in the format string, the result will be placed into the variable pointed to by the next argument in order.

What are the first and second parameter of sscanf, and how are they used?

Example answer: The first parameter is the string to scan, and the second is the format string. The format string determines what kind(s) of values sscanf will try to read from the string being scanned, using % escapes like printf.

What are the remaining parameters of sscanf, and how are they used?

Example answer: Each additional parameter is a pointer to a variable into which the next scan result should be placed.

What value is returned by sscanf?

Example answer: It returns the number of items successfully scanned.

Notice that a large immediate value is loaded into %esi before the call to sscanf. In the assembly code above (but not necessarily in your code), what is that value?

An “immediate” value refers to a value that’s hard-coded into the assembly code, (and appears as part of the machine instruction). It’s “immediate” because it’s immediately available as part of the instruction, as opposed to being a value that the CPU accesses via either loading from a register or from memory. A $ appears in assembly code to indicate the use of an immediate value, as opposed to a memory reference.

Correct answer: 0x402325

Which parameter of sscanf is this value being used as? How do you know?

Example answer: The second parameter, because %rsi (or in this case the lower half %esi) is always the second parameter to a function call.

Examine memory at the address used by the code you’re running (may not match the value above exactly). Explain what you see:

Example answer: The value is the string "%d %d %d %d %d %d", which is being used as the format string for sscanf. The %d escape is for a “decimal” value, just as we use in printf, so this format string indicates that sscanf should read 6 numbers separated by whitespace from the string it is given.

Assume that you have called phase_2 with a pointer to the string “1 2 3 4 5 6” as its argument (let’s say the pointer’s numeric value is 0x603ba4). The return address for the call is 0x400f32 in main, and the values of %rbp and %rbx are currently 0xfeed and 0xf00d. The value of %rsp is 0x7f...578.

Draw stack diagrams for the following moments in the code (or draw and update one diagram, taking pictures along the way). Check your work as you complete each diagram. Write ‘?’ for any memory locations which need to be included in your diagram but which have unpredictable values. Include the values of registers %pc, %rsp, %rbp, %rbx, %rcx, %rdx, %r8, %r9, %rdi, %rsi, and %rax in each diagram, using ‘?’ values for registers that haven’t been assigned to yet.

  1. Right before the call instruction at phase_2+9 which will call read_six_numbers.
  2. Right before the mov instruction at read_six_numbers+4 after read_six_numbers has allocated additional stack space.
  3. Right before the call instruction at read_six_numbers+46, which will call sscanf.
  4. Right before the cmp instruction at read_six_numbers+51, immediately after sscanf returns (you do NOT need to read the assembly code for sscanf).
  5. Right before the lea instruction at phase_2+14, immediately after read_six_numbers returns.
  6. Right before the je instruction at phase_2+32. Note for this diagram whether the jump will be taken or not.

For now, ask your lab instructor to verify the diagrams you draw at each step.

TODO: Add diagrams for each of these steps to this page.

If you have time left, you should work on the x86 assignment for the rest of the lab.