mprotect
Contents
- Overview
- Part 0: Using xv6
- Part 1: The Life of a Syscall
- Part 2: Protecting Against Null Pointer Dereference
- Part 3: The
mprotect
System Call and Immutable Code - Submission and Grading
Overview
- Assign: Thursday, 12 Nov
- Checkpoint: aim to have started work on null protections; send a progress update Monday, 16 Nov
- Checkpoint: aim to have finished work on null protections and started on mprotect; send a progress update Wednesday, 18 Nov
- Due: Friday, 20 Nov
- Teams: pairs or individuals
- Submit:
git add
,git commit
, andgit push
your completed code. - Reference:
In this project, you will implement some basic memory protection features for xv6, a small Unix-like operating system. The required tasks for the project are to:
- Modify xv6 to ensure that dereferencing a null pointer (
NULL
) in a user program leads to an explicit error and termination of the process, rather than accessing memory contents at address 0. This part is fairly carefully guided. - Modify xv6 to implement the
mprotect
system call and use it to set restrictive permissions for protection of important regions of memory that should not be written, such as the code (text) section. This part is much more open.
Scope: While you may not write all that much code for this project, it will be much more involved and immersive than the preceding C programming tasks since you are working with a large code base. Much of your time in this project will be spent learning your way around the xv6 codebase before you actually get to writing code. Take this seriously and expect to spend a lot of time on it. Expect to feel a little overwhelmed at some moments, especially as you are figuring things out, but trust that we will find our way. Share your knowledge liberally! Ask/answer questions on Zulip; help others find their way; accept help from others to understand the existing code. The code you write should be your own, but please feel free to make navigation of the existing code more of an open class effort. Please check in with me often and do not be shy to report where you really are. We will work together to make the project work and adjust as needed.
Dependency chains: Parts 0 and 1 do not involve producing anything. They are about finding your way in xv6. They should absolutely be done before you attempt Parts 2 or 3. Parts 2 and 3 can be approached separately. Part 2 is more guided and requires less code. I do recommend doing it first, but if you are really stuck and Part 3 makes more sense to you, there’s no technical reason you cannot go there first.
Acknowledgment: This project is based on an OSTEP project plan, with some additional guidance.
Part 0: Using xv6
xv6 is a small educational Unix-like operating system designed and built at MIT. The xv6 book covers its design and implementation. (Throughout the book, numbers in parentheses refer to a specific line of code in the full source code listing of xv6, which matches your starter code.) xv6 is big, in the sense that its a few thousand lines of C and assembly code, but it’s small, in the sense that operating systems like Linux, macOS, or Windows are millions of lines of code. While we will not have time to become acquainted with all parts of xv6 this term as part of CS 341, doing so on your own time in the future is entirely feasible. Note: we are using the version of xv6 for the x86 architecture; newer versions use the RISC-V architecture.
Build and Run xv6
Since xv6, while relatively small, is a real OS, and running (or
debugging!) a real OS on real hardware can get quite mysterious and
painful when things are not working, we will use an emulator called
QEMU. QEMU is just a software program that
pretends to be x86 hardware. xv6 code will not really know the
difference, but this will make our lives much better. The CS server
(cs.wellesley.edu
a.k.a. tempest
) is setup with the tools you
need. If you have a local Linux environment, you can likely get QEMU
and other tools working, but the project alone will have plenty of
places to spend your time, so I strongly advise using tempest
. :-)
Fair warning, tempest
has a pretty old version of QEMU, so newer
fancy features may not be available.
Once I have created your team’s GitHub repository, you will find a copy
of the xv6 code there (with very minor tweaks in the Makefile for our
setup). Clone your repository to somewhere in your CS home directory,
cd
into it, and then run make build
. That will build xv6 and
create a disk image for QEMU.
bpw@tempest $ make
gcc -fno-pic -static -fno-builtin -fno-strict-aliasing -O2 -Wall -MD -ggdb -m32 -Werror -fno-omit-frame-pointer -fc
[… much more excitement …]
dd if=kernel of=xv6.img seek=1 conv=notrunc
347+1 records in
347+1 records out
178088 bytes (178 kB) copied, 0.00536816 s, 33.2 MB/s
To run xv6 in QEMU (and first build anything that’s been updated), use
make run
. This will take over your current terminal session as QEMU
launches. I recommend having two terminal sessions open so you can
poke at the xv6 source code while also working in xv6 itself. (You
will also need two sessions for debugging, later.) After make run
,
you should see some build messages go by, then you should end up with
something like this:
SeaBIOS (version 1.11.0-2.el7)
iPXE (http://ipxe.org) 00:03.0 C980 PCI2.10 PnP PMM+1FF94780+1FED4780 C980
Booting from Hard Disk..xv6...
cpu1: starting 1
cpu0: starting 0
sb: size 1000 nblocks 941 ninodes 200 nlog 30 logstart 2 inodestart 32 bmap start 58
init: starting sh
$
Here you see the BIOS (very low-level pre-OS boot-time setup), then
some output from xv6 as it boots, then a note that it is starting
sh
, its shell, and finally a shell prompt. By default, everything is
in one top-level directory. Try ls
.
$ ls
. 1 1 512
.. 1 1 512
README 2 2 2287
cat 2 3 14452
echo 2 4 13308
forktest 2 5 8132
grep 2 6 15988
init 2 7 14196
kill 2 8 13340
ln 2 9 13280
ls 2 10 16136
mkdir 2 11 13368
rm 2 12 13344
sh 2 13 24776
stressfs 2 14 14296
usertests 2 15 67192
wc 2 16 15116
zombie 2 17 13004
console 3 18 0
$
Yep, that’s it! Note that the console support is very basic as is the shell. No fancy history or convenience editing commands. if you enter odd characters or move around with the arrow keys, you’ll probably see your next command fail, but a prompt will return and keep working. Keep it simple.
Try some commands using the executables you see here.
The next question is: how do we get out? Shutdown? Here, we will enter
QEMU’s “monitor mode” and terminate the emulator. This will also work
even if you have totally wedged the OS with changes that you make. To
enter QEMU monitor mode, type Control-a
then c
. You should see a
QEMU prompt:
QEMU 2.0.0 monitor - type 'help' for more information
(qemu)
Here you can type quit
or just q
to shut things down. There are
many other things you can do here. (See help
.)
Note that changes to the xv6 file system are not durable between runs. Each time we boot from the clean image.
Alternatively, to pause emulation in QEMU (kind of like stopping time
for xv6), enter QEMU monitor mode with Control-a c
, then type stop
at the (qemu)
prompt. When you are ready to resume, type continue
at the (qemu)
monitor prompt, then type Control-a c
again to go
back to the xv6 console.
Please make sure to shutdown QEMU when not in use.
The older version of QEMU we have on tempest
is not especially
efficient. Basically, it uses up 100% of a CPU core whenever it’s
running xv6. This is fine while you are running and interacting with
xv6, but please make sure to shutdown QEMU when you stop working.
If you might have lost your connection and left a QEMU running, you
can SSH to tempest
and run pkill -u $USER qemu*
to terminate all of your
running QEMUs. (Don’t worry, you won’t be able to terminate anyone
else’s!)
See also pausing and continuing emulation.
GDB for the xv6 Kernel
Now, run xv6 with GDB so you can inspect the kernel. The setup for this is a bit more complicated than running a normal program under a debugger, so much of the setup steps are automated in a script for you.
Start xv6 under QEMU by running make gdb
. (We have made some minor
tweaks to the Makefile
to reduce typing and configure for defaults
on tempest
.) On tempest
, you will likely see startup that looks a
bit like this:
bpw@tempest $ make gdb
sed "s/localhost:1234/localhost:26228/" < .gdbinit.tmpl > attach.gdb
which: no qemu in (/net/atlas/bin:/usr/local/bin:/net/bin:/usr/local/smlnj/bin:/usr/local/racket/bin:/opt/rh/)
which: no qemu in (/net/atlas/bin:/usr/local/bin:/net/bin:/usr/local/smlnj/bin:/usr/local/racket/bin:/opt/rh/)
(process:13057): GLib-WARNING **: 17:42:05.252: gmem.c:489: custom memory allocation vtable not supported
*** Now run 'gdb -x attach.gdb' in another terminal.
qemu-system-i386 -nographic -drive file=fs.img,index=1,media=disk,format=raw -drive file=xv6.img,index=0,media=dis8
(process:13060): GLib-WARNING **: 17:42:05.390: gmem.c:489: custom memory allocation vtable not supported
Aside: tmux
One convenient—but totally optional—way to work with multiple
terminal sessions over a single SSH connection is to use
tmux
, a terminal multiplexer, which supports things like
seeing multiple terminals at once, detaching from your session so you
can log out of tempest
but leave things running, then login and
reattach to continue your terminal session later.
Notice that things seem to have paused. There is no further output; we have not reached the xv6 boot output or the xv6 shell prompt. This is because the QEMU startup process has paused to allow an external GDB running outside QEMU/xv6 to attach and observe the emulated machine.
There is some noise here that you can ignore, including lines from
make
and some warning about versions of software on tempest
(the
which
and WARNING
lines). The line of interest is:
*** Now run 'gdb -x attach.gdb' in another terminal.
Do that. Leave the current terminal where it is, then open a second
terminal session on tempest
, cd
to the same directory, and run
the command gdb -x attach.gdb
. (It’s critical that you’re in the
same directory as your xv6 source code.) In the new terminal, running
this command should bring you to a GDB prompt like this:
bpw@tempest $ gdb -x attach.gdb
GNU gdb (GDB) Red Hat Enterprise Linux 7.6.1-119.el7
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
+ target remote localhost:26228
warning: A handler for the OS ABI "GNU/Linux" is not built into this configuration
of GDB. Attempting to continue with the default i8086 settings.
The target architecture is assumed to be i8086
[f000:fff0] 0xffff0: ljmp $0xf000,$0xe05b
0x0000fff0 in ?? ()
+ symbol-file kernel
(gdb)
GDB is now attached and in control. Let’s set a breakpoint for the
exec
function inside the kernel using the break
or b
command in
GDB:
(gdb) break exec
Breakpoint 1 at 0x801009a0: file exec.c, line 12.
(gdb)
Now let the boot process continue until it first runs the exec
function in the xv6 kernel by using the continue
or c
command in GDB:
(gdb) continue
Continuing.
[New Thread 2]
[Switching to Thread 2]
The target architecture is assumed to be i386
=> 0x801009a0 <exec>: push %ebp
Breakpoint 1, exec (path=<error reading variable: can't compute CFA for this frame>,
argv=<error reading variable: can't compute CFA for this frame>, argv@entry=0x8dfffea0) at exec.c:12
12 {
(gdb)
Take a look back at your first terminal. Notice that the boot process
has continued and there is more output from xv6. However, we are still
not at the shell prompt, since an exec
syscall must run to get us
there. Let’s see where we are with backtrace
, bt
, or where
(gdb) backtrace
#0 exec (path=<error reading variable: can't compute CFA for this frame>,
argv=<error reading variable: can't compute CFA for this frame>, argv@entry=0x8dfffea0) at exec.c:12
#1 0x80105173 in sys_exec () at sysfile.c:419
#2 0x8010469a in syscall () at syscall.c:139
#3 0x80105699 in trap (tf=<error reading variable: can't compute CFA for this frame>) at trap.c:43
#4 0x801053e2 in alltraps () at trapasm.S:20
(gdb)
Aha! Interesting. The back trace bottoms out in the entry point for
traps in xv6. Of course, this trap must have been delivered by an
explicit int
(interrupt) instruction somewhere, but we cannot see
beyond the trap handler, which sits at the base of the kernel stack
for this process. (The user stack is elsewhere – or is it? Since this
is the very first process, was this trap really issued from a user
program? Hold that thought for now.)
Follow a few more steps to reach the shell prompt in xv6:
- Take note of the xv6 output you see in the first terminal, then tell
GDB to
continue
. - Now you should see output from xv6 up through
init: starting sh
, but it looks like we have arrived once more at theexec
breakpoint over in GDB. continue
once more. GDB should appear to be stuck or running computation but not yet reaching a breakpoint. Over in xv6, you should see a prompt.
Now every time you run a command in the xv6 shell, you should catch
the exec
breakpoint over in GDB. Of course, you could remove this
breakpoint or install a breakpoint anywhere else in the
kernel. (Usually, it’s best to do that in a fairly “high level”
function in the kernel, closer to the trap handler than to deep kernel
internals.) You can also quit
GDB, which will allow xv6 to continue
without GDB supervision. Terminate the QEMU session as usual when you
are done.
Part 1: The Life of a Syscall
As a warmup, take a tour of the life of a syscall in xv6. You will need to make some small additions along these code paths for the last part of the project.
There are a few things you can do to find the relevant code: use the ones that you find most useful. Whatever guide you follow, you should absolutely open up the files in your own xv6 codebase to follow along.
- This document gives a brief overview of the code involved in getting from user code to kernel system call implementation and back..
- Starting at 12:26, this video (by one of the OSTEP authors) gives
a detailed tour through through the same
code. Note: The version of
xv6 that we are using has a completely flat filesystem structure,
instead of the
include
,kernel
,user
, etc. directories shown in the video, but the code is nearly identical. - The xv6 book covers the entirety of xv6. Chapter 3 includes substantial detail on system calls. (Numbers in parentheses refer to a specific line of code in the full source code listing of xv6.)
Try using GDB to chase through some details of exec
or
another system call. Try placing a breakpoint near the trap entrypoint
in trap
(trap.c
) or one of the system call handlers in
sysproc.c
. Then single-step (s
or step
) for a while to see what
happens. To skip over an entire function call at once (instead of
descending into it step by step
), use n
or next
.
Part 2: Protecting Against Null Pointer Dereference
For simplicity, the standard xv6 implementation sets up the virtual
address space of each new process to store its code (a.k.a., the
text section) starting at virtual address 0 (or 0x00000000
, to be
completely explicit, since we’ll be working in a 32-bit world). This
is simple, as the entrypoint (the first instruction to run) is
predictably located at address 0, but it has a downside associated
with null pointers. The null pointer is generally encoded as 0. Thus
if we accidentally—or intentionally—dereference the null pointer
(the zero address) we will access the code of the process! Furthermore
xv6 does not protect the code (text) section in any way, meaning a
program can not only read, but also overwrite its own code. That
sounds like a recipe for fun or disaster, depending on your
perspective.
Task 2a: Null Pointer Dereference
Write a user program in evilnull.c
that reads some bytes of its own
code using a null pointer, prints a few of those bytes somehow, and
then writes some new value (like 0xBAD4FACE
) into memory through the
null pointer. This requires that you:
- Write
evilnull.c
.- Remember, xv6 provides a few familiar C functions in its user
libraries, but not the full standard C library. You are also
missing the
NULL
macro, which is just short for((void*) 0)
. If you want to writeNULL
you can introduce your own macro definition with#define NULL ((void*) 0)
. - Instead of
#include
ing normal C standard library headers, you will need to#include
xv6 user headers. Check outcat.c
for examples. printf
in xv6 requires an explicit file descriptor argument. 1 isstdout
. Seecat.c
for example.- Your
main
function should end with a call toexit()
. A simplereturn
will not work quite like you expect inmain
for xv6. - You may also wish to write a version that uses standard C headers and compile it manually to run directly on Linux, so you can observe its behavior there as well and compare with xv6.
- Remember, xv6 provides a few familiar C functions in its user
libraries, but not the full standard C library. You are also
missing the
- Update the
Makefile
to ensure anevilnull
program will be built fromevilnull.c
. You should not be writing new rules, just adding your program to the list of user programs. Check where theMakefile
references_cat
to figure out how to add your program to the list. - Launch xv6 in the emulator (
make run
), then run yourevilnull
program and confirm that its evilness succeeds. Be prepared for how to terminate the emulator, just in case your evilness is a bit more successful than you anticipated.
This program will serve as your basic test case to see if your work in later tasks can defeat its evil.
Task 2b: Prohibit Access to Page Zero
As a simple means of protecting against (most) null pointer dereferences, you will modify xv6 so that it reserves page 0 of every process’s virtual address space as inaccessible to a user process. As a result of this reservation, any attempt to access address 0 (or nearby positive offsets from address 0) will be intercepted by the system during a page fault, causing the xv6 to terminate the offending process.
The xv6 book, chapter 2 describes the two-level page
table and virtual memory system of xv6, as well as how important
system calls like exec
manipulate these. Keep it close at hand when
you are reading code, and vice versa.
GDB and QEMU for virtual memory
As you poke around the virtual memory system, you might like to
inspect with GDB. The usual GDB commands like info locals
, print
,
x
, etc., are always helpful. In the QEMU monitor mode (Control-a
c
), you can also use info mem
to inspect the virtual memory
mappings.
Here is a rough plan to approach this task.
First, do some reconnaissance, starting from two system calls that do
key setup of address spaces: exec
and fork
.
- Examine the implementation of
exec
inexec.c
. How doesexec
initialize the address space? This is a good place to practice your skills for filtering out unnecessary detail vs. important detail.- Pay special attention to how
exec
loads a program’s code (or other parts) into memory from the ELF executable. (Remember, the xv6 book can help you interpret the code.) Where in the virtual address space do the executable segments get placed? - Does
exec
make any other pages inaccessible to the process? How? - Use the
grep
command to search for definitions and uses of functions in the xv6 source files, e.g.,grep loaduvm *
.
- Pay special attention to how
- Examine the implementation of
fork
. Find the code that copies the page table (i.e., the virtual address space) of the parent process to create the page table/address space of the child process. Will something here need to change?
Next, modify code to ensure the page based at virtual address 0 is unused and inaccessible:
-
Change how executables are created to ensure that their code is not marked to appear starting at address 0, or anywhere within the first page of memory.
- This change can be made without any change to kernel code. It
just requires a small tweak in the
Makefile
rules that direct the linker where to place the code (text
) section when generating executables for user programs. Look for the_%:
rule, which will be applied to build all theUPROGS
executables. Currently, this build rule follows the assumption that code (text
) is at virtual address0
in the virtual address space. make clean
and then rebuild and boot xv6 again. Runevilnull
in the xv6 shell. Do you expect its behavior to change? If so, how? Did its behavior actually change? If so, how?
Simply linking executables that will avoid placing code at address zero is only the first step. Address 0 is still accessible, so there’s more to do in the next step.
- This change can be made without any change to kernel code. It
just requires a small tweak in the
-
Make changes in
exec
orfork
as needed so that the page at address 0 in the virtual address space is left unused and is definitely marked inaccessible to the user process. Look for existing functions (especially those that are exposed throughdefs.h
) that could accomplish steps you need.
Finally, test your protections. Run evilnull
again. What happens
now? The evilnull
process should terminate with some kind of error
related to accessing address 0, but xv6 itself and the shell should
continue unaffected. Create variants of evilnull
that perform only
reads through a null pointer and only writes through a null pointer,
respectively. Both of these variants should be defeated by your
protections. In other words, your modified kernel must protect against
both writes and reads with the null pointer; protecting against only
writes (but allowing reads) does not suffice.
If successful protection against evilnull
seems to require more than
a few new or changed lines of code, consider rethinking your approach.
Task 2c: Check Assumptions, Safeguard Syscalls
When pointers are passed to the kernel as system call arguments, the kernel must take great care in using them. The kernel code that uses these arguments must avoid using bad pointer arguments from the user that may cause the kernel to clobber its own memory contents, or worse.
A useful experiment would be to make a copy of the cat
program (call
it evilcat
) that, instead of using the read
and write
syscalls
to read input to/write output from its buf
buffer (array), uses
those syscalls to read/write with memory at address zero ((void*) 0
,
or feel free to #define NULL ((void*) 0)
if you’d prefer to use
NULL
) acting as the buffer. How is this different than accessing
address zero directly in the user process? Do your protection
mechanism guard against this? Try it!
After this experiment, review other kernel code—such as code for
those read
/write
system calls that take a pointer argument—that
makes assumptions about the virtual address space. Determine if that
code needs to be updated for your modified approach to protecting page
zero. Update that code if so. Make sure that you can defeat
evilcat
’s attempts to use the null pointer.
Task 2d: Documentation
Document your implementation with inline comments and give a summary
of how it works in a Markdown file null.md
. Also include any known
issues or limitations.
Part 3: The mprotect
System Call and Immutable Code
Linux and most other operating systems ban writing to code (text) pages of memory by default as a safety and security precaution. xv6 leaves read-write permissions open on all mapped portions of the virtual address space by default. In this part of the project, you will implement a pair of system calls that allow changing the permissions for page-granularity sections of the virtual address space:
int mprotect(void* base, int length)
int munprotect(void* base, int length)
The versions of these system calls available in Linux, macOS, and other POSIX systems, allow the caller to dictate specific permissions with a third argument. Our simplified form will implicitly control read access alone:
mprotect
disablesreadwrite access for a section of the address space starting at thebase
address and continuing up through the end of the page containing the addressbase
+length
- 1.munprotect
reenablesreadwrite access for a section of the address space starting at thebase
address and continuing up through the end of the page containing the addressbase
+length
- 1.
After mprotect
ing a section of the address space any reads by the
user program should still succeed, but writes should cause a trap and
termination of the process. After munproctect
ing a section of
memory, writes to that section should succeed.
There are restrictions on what arguments to these two system calls are valid:
- The
base
address argument must be aligned to a page boundary. - The
length
argument must be positive. It shall be rounded up to the nearest multiple of the page size so thatmprotect
andmunprotect
operate over all pages that intersect the specified range. - All pages touched by the range specified by
base
andlength
must be pages that are already mapped and accessible by the user process. In other words, ifmprotect
is called with a range that includes virtual addresses of unmapped or inaccessible pages, it should return in error and leave protections as they were before themprotect
syscall.
The implementations of these system calls must check their arguments
for validity. If the above rules are violated, or if any other error
arises, the system calls should avoid making any changes and instead
return -1
to indicate failure. If the arguments are valid, the
system calls should proceed as requested and return 0
upon success.
Implementation note: after updating the page table, we need to
make sure that the TLB is also up-to-date. With no explicit action
from the system call, the TLB might otherwise continue to hold a
now-stale copy of a page table entry that has out-of-date protection
bits. For our purposes you should simply call the lcr3
function
with the existing physical address of the page table. (grep
around
to find its usage in the existing code.) This function writes to the
x86 cr3
register (the register holding the page table address) and
in turn forces invalidation of the TLB entries. (This general sort of
event is called a “TLB shootdown.”) The first following accesses to
any pages will thus suffer TLB misses that cause the MMU to reload TLB
entries from the (updated) in-memory page table. There are some
subtleties to getting this right on a multicore system, but this level
should suffice for our project.
Suggested Workflow
Build and test this incrementally. Work on mprotect
alone to
start. After mprotect
is working, repeat the steps (probably much
more quickly) for munprotect
.
- Write some simple test cases that try to “break”
mprotect
. - Build an
mprotect
that takes no arguments and does nothing other than printmprotect
. Refer to the syscall tour to remember all the places you need to add info about this system call. - Add arguments to
mprotect
and print them; return0
. - Check the validity of arguments and create the return value accordingly.
- Make the appropriate page table updates.
It may be useful to introduce temporary simplifying assumptions along
the way such as assuming that a single mprotect
call refers to only
1 page, and then remove those assumptions once the simple case is
working.
Demonstrate correctness with tests and documentation: At the end,
your tests should be able to demonstrate the correctness of your
implementation. Document your implementation with inline comments and
give a summary of how it works in a Markdown file
mprotect.md
. Include any known issues, as well as instructions for
how to run your tests and what they should demonstrate.
Optional: reuse your mprotect
implementation to remove write
access on the code (text) sections of memory by default in exec
.
Submission and Grading
Make sure to git add
any new files and changes, git commit
, and git push
.
There will be an interactive demo opportunity. Grading will be based on:
- 80% correctness
- 20% code quality