Malloc
Assignment: Malloc
- Assign: Tuesday 22 April
- Checkpoint: Complete Checkpoint Exercises Parts 1 through 3 by Friday 25 April
- Due: Monday 5 May. Custom extensions through 11:59pm on Friday May 9 may be granted, but that's a hard deadline.
- Policy: Pair graded synthesis assignment
- Finding a partner: Partners encouraged but not required. Use this Find a Partner document to find/document a partner. You can also use the Zulip partner search channel as well.
-
Code:
cs240 start malloc
- Submit:
git commit
andgit push
your completed code. - Reference:
Contents
- Assignment: Malloc
- Overview
- Setup
- Tasks
- Preparatory Exercises
- Specification
- Implementation
- Compiling and Testing
- How To Work
- Submission
- Grading
- Extra Fun Open-Ended Extensions
Overview1
In this assignment you will implement your own dynamic memory allocator
(malloc
/free
) for C programs!
Goals
- To understand general principles and trade-offs in dynamic memory allocation.
- To consider and empirically evaluate the effects of implementation and design choices on competing measures of performance.
- To develop defensive design, coding, and debugging skills for environments where standard structure and amenities are unavailable.
- To build a fundamental system component that you have been using (unwittingly?) since the first time you used a computer.
- To synthesize your accumulated bit-twiddling, pointer-plying, and algorithm-accelerating wisdom to build and optimize a “1-sentence” project instead of taking a final exam.
Advice
This assignment requires careful bit-level manipulation of memory using many of C’s unsafe pointer casting features without much structure. Your allocator code will not be enormous, but it will be subtle. Take planning seriously and work methodically. Ignoring this advice could cost hours or days. Also keep in mind that performance is a major component of the evaluation for this assignment.
We recommend this path for preparation and development of a successful allocator:
- Read the specification, complete the preparatory exercises,, and review effective work strategies.
- Implement an first-fit implicit free list allocator in C (and test it).
- Optionally, modify your first-fit implicit free list allocator to create a next-fit implicit free list allocator or an explicit free list allocator.
- Extra Fun Optionally, add open-ended extensions to the allocator.
Time Reports
According to self-reported times on this assignment from a prior semester:
- 25% of students spent <= 10 hours.
- 50% of students spent <= 20 hours.
- 75% of students spent <= 24 hours.
These times include the malloc lab, checkpoint preparatory assighment, and code submission, and represent a range of implementation strategies and stopping points.
Setup
Use a CS 240 computing environment. The allocator you build might work on similar platforms (no guarantees!), but it must work on our platforms for grading.
Get your repository with cs240 start malloc
.
Your starter repository contains the following files:
Makefile
– recipes for compilingmdriver.c
– testing drivermemlib.h
– memory/heap interfacemm.c
– memory allocator implementationmm.h
– memory allocator interfacetraces/
– several trace files (.rep
) used for simulated testing- remaining files are testing support files you do not need to inspect
Compile the allocator and test driver with make
to produce an
executable called mdriver.bin
. Usage of the mdriver.bin
testing
executable is described later.
Tasks
You will modify mm.c
to implement a memory allocator with the
interface declared in mm.h
and the functionality described by the
Specification section below.
int mm_init();
void* mm_malloc(size_t payload_size);
void mm_free(void* payload_pointer);
The provided mm.c
file partially implements an allocator based on an
implicit free list. It provides skeletons of several helper functions
beyond the interface above and an implementation of mm_malloc
that
uses these helper functions. Your job is to complete the helper
functions (and possibly add more of your own) as well as mm_free()
to implement a memory allocator with good balance of memory
utilization and throughput. You may use any implementation strategy
that works (subject to programming rules).
Performance of your memory allocator is a large
component of the grade. There are many opportunities for extensions,
whether in increasingly sophisticated allocator implementations for
better performance or in analysis and error-checking tools.
Grading considers design, documentation, style, correctness, and performance.
The remainder of this document includes:
- Preparatory exercises for designing and simulating an implicit free list allocator.
- The allocator specification.
- Notes on implementing various allocator strategies.
- Documentation of the test harness for correctness checking and performance evaluation.
- Advice for effective work strategies.
- Grading criteria.
- Extra Fun Ideas for open-ended extensions.
Preparatory Exercises
As you read this document, complete these exercises to familiarize yourself with heap layout, block layout, and the provided starter code. These exercises will help you design your allocator implementation before diving into messy implementation details. Good planning is key when developing low-level systems software.
Preparation is your ticket for assistance.
- You must complete the preparatory exercises (and show evidence) before asking questions about code or debugging on the main assignment.
- You may ask questions on preparatory exercises at any time.
CSAPP Review
Read CSAPP section 9.9 to become familiar with the type of thinking and coding you will do to build the allocator. The book discusses an implicit free list allocator similar in organization to the one we suggest you build first. The valuable part is the descriptions. Do not get too caught up in the code details in Sec 9.9.12, since you will use different starter code.
Note: In CSAPP section 9.9, word means 4 bytes and double-word means 8 bytes, but in the CS240 slides and assignment, word means 8 bytes and double-word means 16 bytes,
Based on the CSAPP reading, you should complete the Lab 12 pre-lab.
In the past, we have also recommended the following practice problems on memory allocation, but you do not need to submit solutions to these. Solutions not in the textbook are available by visiting office hours.
- CSAPP Practice Problem 9.6
- CSAPP Practice Problem 9.7
- CSAPP Homework Problem 9.15
- CSAPP Homework Problem 9.16
Required Exercises to Complete for Checkpoint
These exercises will help you plan your allocator implementation and avoid wasting time writing C code before you understand what you are doing.
These count toward your grade and have a separate checkpoint deadline.
Parts 1 through 3 are required. The first two parts will be submitted via Gradescope; the 3rd part will be submitted as a GoogleDoc. You will start working on these parts during the Malloc lab, but may need time outside of lab to complete them.
Part 4 is an optional ungraded part that asks you to write pseudocode comments
for the functions in mm.c
that you will flesh out for the implicit list first-fit
allocator.
Other tips:
- Take notes on anything you learn during these exercises.
- Read instructions and comments in provided code carefully.
- Draw detailed diagrams of the heap and execute operations on the visual heap by hand to become familiar with the details or to debug.
- Every line of provided code does something meaningful and necessary.
Part 1 (10 points): Understanding Dynamic Memory Allocation (Gradescope)
Checkpoint Part 1 submission
Submit Part 1 of the Malloc Checkpoint exercises on Gradescope by the Checkpoint deadline. Even if you’re working with a partner, each person needs to make their own submission.
To answer the questions in this checkpoint exercise assignment, you should read section 9.9, subsections 9.9.1–9.9.11 of the textbook and review the Memory Allocation slides from lecture, especially slides 17 and 24. As a reminder, you can access chapter 9 online (along with other select chapters of the textbook; see the course topics page). In total there are about 14.5 pages of the textbook to read which describe the basic concepts of heap allocation.
Part 2 (10 points): Understanding the Starter Code in mm.c
(Gradescope)
Checkpoint Part 2 submission
Submit Part 2 of the Malloc Checkpoint exercises on Gradescope by the Checkpoint deadline. Even if you’re working with a partner, each person needs to make their own submission.
To answer the questions in this checkpoint exercise assignment, you will need to
start the cs240 assignment and carefully study the starter code in mm.c
.
A brief overview of this code will be presented during the Malloc lab.
Bug fixes for Gradescope’s Malloc Checkpoint Part 2
There are a few bugs in the Gradescope assignment for Malloc Checkpoint Part 2. (Once students have started to answer the questions, they cannot be edited.)
Here are the bug fixes:
-
In Q2.3 make_status 3, the phrase Base on the answer to 1.6 should be Based on the answers to Q2.1 and Q2.2
-
In Q7.4 block_pred, the orphaned phrase and checks for that word that PRED_USED_BIT and USED_BIT are both set to 0. should be added to the end of the phrase It reads the footer word for the block that comes before the block at block_header for the previous checkbox. That is, the text for the previous checkbox should be It reads the footer word for the block that comes before the block at block_header and checks for that word that PRED_USED_BIT and USED_BIT are both set to 0.
Part 3 (10 points): Simulation heap snapshots (Submission via GoogleDoc)
Checkpoint Part 3 submission
Unlike Parts 1 & 2:
-
Part 3 should be submitted as a Google Doc that’s shared with fturbak@wellesley.edu (with Editor acccess for Lyn).
-
A partner-based team should create a single Google Doc to submit.
-
Your Google Doc should include all of the heap snapshots described below. You can draw the heap snapshot diagrams any way that you like for inclusion in your Google Doc submission. Note that drawing them as tables in a Google Doc makes them easy to copy and edit.
For the following questions, you will draw heap snapshots, where each snapshot
is a diagram that represents the state of the entire heap at a particular
point in time. A heap snapshot summarizes memory as an array of word cells.
For each snapshot, you will use the block layout rules from
mm.c
and the heap setup code in mm_init
.
Read mm.c
to learn about the suggested starting block layout. Words are 8
bytes. Unless specific questions indicate otherwise, pages are 4096
bytes.2 Pay special attention to the early comments in mm.c
about block and
heap layout, as well as the code in mm_init
.
Rather than using the horizontal heap layouts shown in CSAPP, use vertical heap layout consisting of vertically arranged rectangles, where each rectangle represents one or more 8-byte words. See also these sample heap diagrams. The heap should consist of the following:
- An 8-byte heap header
- A sequence of heap blocks, where
- Each free block has:
- an 8-byte block header with
- a block size (multiple of 16)
- status bits. The possible status bits for a free block are:
0b00
previous and current block both free0b10
previous block allocated and current block free
E.g.,
+------------------+ | 48 | 0b10 | # free block of size 48 bytes with prev block allocated +------------------+
-
A rectangle indicating the padding size in bytes between the header and footer. E.g.,
+------------------+ | ... 32 bytes ... | ~------------------+
-
An 8-byte block footer with the block size. (A footer has no status bits.). E.g.,
+------------------+ | 48 | +------------------+
- an 8-byte block header with
- Each allocated block has:
- an 8-byte heap header with block size and status bits. The possible status bits for an allocated block are:
0b01
previous block free and current block allocated0b11
previous and current block allocated
- A rectangle indicating the size in bytes for payload and internal fragmentation.
- Unlike a free block, an allocated block has no footer word at its end (see slide 24 in the Memory Allocation slides).
For example, a block allocated for a payload size between 9 and 24 bytes would be represented as follows, where
p
is0
if the previous block is free and1
if the previous block is allocated.+------------------+ | 32 | 0bp0 | +------------------+ | ... 24 bytes ... | # This includes the payload bytes plus | | # internal fragmentation bytes before next block ~------------------+
- an 8-byte heap header with block size and status bits. The possible status bits for an allocated block are:
- Each free block has:
- An 8-byte heap footer (with size 0 and status bits
0bp1
, where thep
bit indicates whether the previous block is allocated).
Part 3a (1 point): Simulate the starter allocator
For this part, you will simulate the starter code in mm.c
exactly as it is given to you initially to draw just one
heap snapshot that represents the final state of the heap after
executing all of the following statements:
mm_init();
void* p0 = mm_malloc(120);
void* p1 = mm_malloc(40);
void* p2 = mm_malloc(72);
void* p3 = mm_malloc(88);
mm_free(p0);
void* p4 = mm_malloc(24);
void* p5 = mm_malloc(104);
mm_free(p3);
mm_free(p2);
void* p6 = mm_malloc(136);
void* p7 = mm_malloc(56);
mm_free(p1);
mm_free(p6);
mm_free(p4);
mm_free(p7);
mm_free(p5);
Because the starter code is not a complete allocator, it actually does very little. Do not simulate your guess of what the allocator should do. Instead, simulate the starter code exactly as written. This will force you to read and understand the provided code carefully in detail. Your later self will appreciate it when you add comments to the starter code now to help remember what you learn about the starter code during the simulations.
Notes:
-
Here and below, you can draw the heap snapshot diagrams any way that you like for inclusion in your Google Doc submission. Note that drawing them as tables in a Google Doc makes them easy to copy and edit.
-
In the starter code,
mm_init
works correctly. -
Because the
allocate
function called bymm_malloc
in the starter code does not implement splitting, even a small requested payload will end up using a very large block. This means that the next call tomm_malloc
will end up needing to callextend_heap
in order to add one or more pages to the heap. -
The versions of
coalesce
andmm_free
in the starter code do nothing. -
The allocation/deallocation sequence above is exactly represented by the trace file
traces/small2.rep
in themalloc
directory. When theDEBUG
flag inmm.c
is set to1
, running./mdriver.bin -f traces/small2.rep
will show a sequence of heap displays that summarize the actions of the starter code on this allocation/deallocation sequence. It’s worth comparing the final heap in this sequence to your heap snapshot for this part.
Part 3b (9 points): Simulate a correct first-fit implicit list allocator
The simulation in part 3a is not very interesting because it is for a very crude first-fit implicit list allocator that does not implement splitting, freeing, or coalescing. In this part, you should perfrom the simulation of the allocation and deallocation events from part 3a, but instead assume that splitting, freeing, and coalescing work as they are supposed to in a correct first-fit implicit list allocator.
In this part, you should show a heap snapshot after each of the following events:
-
Calling
mm_init()
to create the initial heap. -
Each call to
mm_malloc
, which is assumed to correctly find the first free block that fits a payload of the requested size, potentially splitting it into the smallest initial allocated block that fits the payload and a smaller free block with the leftover space. Note that if no exisiting free block is big enough,extend_heap
will be called to add a free block whose size is a multiple of 4096 that will satisfy the request. -
Each call to
mm_free
, which should update theUSED_BIT
of the freed block and thePRED_USED_BIT
of the following block, as well as add a footer word to the bottom of the newly freed block. But do not show coalescing of freed blocks as part of themm_free
event; instead show the result of coalescing as a separate step (see the next bullet item). -
Show the heap snapshot that results from each call to (a correct implementation of) the
coalesce
function. This will help you to better understand what this function needs to do in various cases when you implement it as part of the first-fit implicit list allocator. You need only show the result of acoalesce
call when it has combined two or three free blocks.
Note:
-
Before you start drawing any heap snapshots for this part, make sure you completely understand this Sample Malloc Trace that we started to review in lecture on Tue Apr 22. This trace is the result of running
./mdriver.bin -f traces/small2.rep
in a correct first-fit implicit list allocator and tracks events similar to the ones you are asked to track in this part. -
When testing your implementation of your first-fit implicit list allocator, set
DEBUG
to1
and run./mdriver.bin -f traces/small2.rep
. This will usecheck_heap
to display text-based representations of the heap that you can compare with your solution to Part 3b to determine if your implementation is working as expected.
Part 4 (Ungraded): Write Features in Pseudocode
For each of the following features, before you starting implementing them
by writing C code, you are encouraged to add comments within the appropriate
parts of mm.c
that sketch the pseudocode for your implementation.
Write pseudocode at a level of detail that is sufficient to simulate and draw concretely, but do not get tangled up in C-level pointer work. Well-written comments at this stage can remain as documentation when you move on to implementation in C.
- Write a pseudocode implementation of
mm_free
as// inline comments
in the body ofmm_free
. - Write a pseudocode implementation of splitting in
allocate
as// inline comments
in the body ofallocate
. - Write a pseudocode implementation of
coalesce
as// inline comments
in the body ofcoalesce
.
Specification
Function Specifications
The three main memory management functions should work as follows.
-
int mm_init()
: Initialize the heap. Return0
on success or-1
if initialization failed. Client applications must callmm_init
once to initialize the system before the first call tomm_malloc()
ormm_free()
. The starter code provides a complete implementation ofmm_init
for implicit free lists with a first-fit allocation policy. void* mm_malloc(size_t payload_size)
: Allocate and return a pointer to a block payload of at leastpayload_size
contiguous bytes or returnNULL
if the requested allocation could not be completed.- Payload addresses must be aligned to 16 bytes.
- Blocks must be multiples of 16 bytes in size.
- Allocated blocks must be non-overlapping and within heap bounds.
size_t
is a type for describing sizes; it is an unsigned integer large enough to represent any size within the address space.
void mm_free(void* payload_pointer)
: Free the allocated block whose payload is referenced by the pointerpayload_pointer
. Assume thatpayload_pointer
was returned by an earlier call tomm_malloc()
and has not been passed tomm_free
since its most recent return frommm_malloc
.
The specifications for mm_malloc
and mm_free
match those for the
malloc(1)
and free(1)
functions, respectively, in the
standard C library.
Programming Rules and Style
-
Do not change any of the
mm_
function types inmm.c
/mm.h
. You may add, remove, or change helper functions inmm.c
as you wish. Declare all helper functions (other than the interface above) asstatic
(visible only within the file, so they do not impact our testing code). -
Do not call any standard memory-management related library functions or system calls such as
malloc
,calloc
,free
, etc. You may use all functions inmemlib.c
, but if you use the provided starter code, you likely will not need additional uses of thememlib.c
functions. -
Avoid declaring global or
static
mutable variables. (Constants are OK:static const MY_CONSTANT = ...;
) If you think you need them, consider how to store them within the heap region instead. -
Write function header comments for new functions (and expand the existing header comments for the main existing functions
mm_malloc
,mm_free
,search
,allocate
, andcoalesce
) to describe what the function does, what policy it follows (if applicable), and what it assumes. Use inline comments to describe details as needed. -
Where possible, use helper functions that are provided within
mm.c
rather than re-implementing the behavior of these helper functions in your code. These helper functions include:static word_t* PADD(word_t* address, word_t distance); static word_t* PSUB(word_t* address, word_t distance); static word_t status_size(word_t status_word); static word_t status_pred(word_t status_word); static word_t status_used(word_t status_word); static word_t make_status(word_t size, word_t pred_used, word_t used); static word_t LOAD(word_t* address); static word_t LOAD_BLOCK_HEADER(word_t* block_address); static word_t* PLOAD(word_t** address); static void STORE(word_t* address, word_t word); static void STORE_BLOCK_HEADER(word_t* block_address, word_t word); static void PSTORE(word_t** address, void* ptr); static word_t block_get_header(word_t* block_address); static void block_set_header(word_t* block_address, word_t header); static word_t block_get_footer(word_t* block_address); static void block_set_footer(word_t* block_address, word_t footer); static word_t* block_succ(word_t* block_address); static word_t* block_pred(word_t* block_address);
-
Since some of the unstructured pointer manipulation inherent to allocators can be confusing, you should write short inline comments on steps of the allocation algorithms that explain the steps. Also, in cases where the helper functions listed above are not sufficient, we recommend defining your own small helper functions.
Implementation
In addition to the following, see Lyn’s Malloc Notes and Advice.
Your implementation will be evaluated based on correctness, performance, documentation, and style. This section suggests an incremental implementation strategy that allows you to choose what balance of implementation sophistication and performance you prefer.
Start with an implicit free list allocator implementation (and commit and push that code once it works!). If you wish to improve the performance of your allocator once you have implemented and tested an implicit free list approach, consider an explicit free list or other search strategies as next steps, or eventually explore more sophisticated improvements.
Basic: First-fit Implicit Free List Allocator
Convert your pseudocode feature implementations to C code one at a time, testing and committing each before starting C code for the next feature.
- Implement and test
mm_free
. - Implement and test splitting in
allocate
. - Implement and test
coalesce
.
Run tests on individual traces for debugging and on all traces for general testing and performance evaluation.
- It is strongly recommended that you test your implementation by running
./mdriver.bin -f traces/small2.rep
. This will usecheck_heap
to display text-based representations of the heap that you can compare with your heap snapshots in Mallo Checkpoint Exercise Part 3b to determine whether your implementation is working as expected. - Some provided traces will cause your allocator to run out of memory unless you have completed all three features.
- Any other errors or crashes indicate correctness problems you must fix.
- The traces
small1.rep
,small2.rep
,short1-bal.rep
, andshort2-bal.rep
should work from the beginning, but they are suitable only as small debugging samples. - Write your own small traces to help test or debug individual cases.
- Use
check_heap
to sanity-check each feature before implementing the next or to help debug when things go wrong.
Implement and test incrementally.
Implement, test, and commit one feature (or sub-feature!) at a time, testing and committing each before you start implementing the next. This will make it much easier to understand what code is involved in a bug.
Optional Improvements [Independent]
You should only consider these optional improvements after you you have a correctly working allocator using a first-fit policy with freeing, splitting, and coalescing.
If you wish to improve the performance of your allocator further, consider any of these directions. Any improvements beyond the basic implicit free list are [Independent] work (though unlike previous independent work, course instructors will be more willing to discuss these extensions).
Be sure to commit your working implicit free list allocator with a recognizable commit message before attempting any improvements.
Basic/Intermediate: Next-Fit Search Strategy [Independent]
If you have only a relatively small amount of time to explore increasing your performance index relative to first-fit implicit list search policy, it’s worthwhile to explore the next-fit search policy as an alternative. This can yield a decent increase in performance for a modest investment of time. (If you have more time to improve performance, it’s better to ignore the next-fit strategy and instead focus on the more challenging (but higher performing) explicit list approach outlined below.)
For a next-fit policy, we suggest:
- Add a
static const int NEXT_FIT_POLICY
that is 1 if using next fit or 0 otherwise. Write code such that you can easily switch policies by switching this value. - Use a global variable—or better, the first heap word—to save a pointer to the block where the next search should begin. Add helper functions for getting and setting this pointer if using the latter.
- See Lyn’s notes about implementing and testing the next-fit policy for more details.
Intermediate: Explicit Free List Allocator [Independent]
An explicit free list allocator can significantly increase your performance index relative to an implicit list allocator, but it is challenging to implement, potentially require lots of time for implementation, testing, and debugging. You should explore this option only if you have signficant time. If your time is more limited, consider implementing the next-fit implicit list option described above.
The best way to develop an explicit free list allocator (or a more sophisticated seglist allocator) is to first develop a working implicit free list allocator and extend it to use explicit free lists.
Memory Order vs. List Order
Explicit free-list allocators must distinguish memory order from list order. Memory order refers to the order of blocks as arranged in the memory address space. We use the terms adjacent, preceding, predecessor, following, and successor to refer to memory order. List order refers to the order of blocks in the explicit free list. We use the terms next and previous to refer to list order. Confusing these orders will lead to tricky bugs.
Some suggestions as you implement explicit free lists:
- An explicit free list allocator (or a next-fit allocator) must store a pointer to the list head node somewhere. The first word of the heap (or a single global pointer variable if necessary) is a good place to store this. Helper functions for getting and setting this pointer will keep list manipulations clean.
- Free blocks must contain next and previous pointers. Consider how
this affects
MIN_BLOCK_SIZE
. Choose a fixed offset within the block to store each of the pointers. Helper functions for getting and setting these pointers will keep list manipulations clean. Using block headers as the target of these pointers in all cases is the simplest, easiest, most efficient option. - Helper functions to insert a block into the free list and remove a block from the free list are advisable, as you will need to do these same steps in multiple places.
- We suggest disabling coalescing, splitting and
mm_free
while you complete initial development of the explicit free list. - Update
mm_init
,search
,allocate
, andextend_heap
to use your explicit free list. Test. - Update and test each of
mm_free
, splitting, and coalescing for the explicit free list. - Study Lyn’s notes on Explicit List Allocation.
Advanced: Seglists, Alternative Policies, and Beyond [Independent]
A clean and efficient working explicit free list allocator should achieve a respectable performance index. To achieve even better performance, you could consider implementing seglists, alternative free list representations, or alternative search policies. If you embark on this path, weigh difficulty of implementation against likely effect on performance index.
Memory System
The memlib.c
package simulates the memory system for your dynamic
memory allocator. You will not need to call these functions directly,
but understanding their effects is necessary for understanding the
provided helper functions above.
-
void* mem_sbrk(int incr)
: Expand the heap byincr
bytes, whereincr
is a positive non-zero integer and returns a pointer to the first byte of the newly allocated heap area.The semantics are identical to the Unix
sbrk
function, except thatmem_sbrk
accepts only a positive non-zero integer argument. (Runman sbrk
if you want to learn more about what this does in Unix.) This function and its approach to managing heap bounds date from before virtual memory. We use it for simplicity (rather than more flexible ways of managing available pages). -
void* mem_heap_lo()
: Return a pointer to the first byte in the heap -
void* mem_heap_hi()
: Return a pointer to the last byte in the heap. -
size_t mem_heapsize()
: Return the current size of the heap in bytes. -
size_t mem_pagesize()
: Return the system’s page size in bytes (4096 on Linux systems).
Compiling and Testing
Test Harness
The mdriver.bin
program is a testing harness that evaluates your mm.c
implementation for correctness, space utilization, and throughput.
The test harness uses trace files to simulate
memory management workloads using your mm.c
implementation. A trace
is a sequence of allocate and free events. mdriver.bin
simulates a
trace by calling mm_malloc
and mm_free
for each corresponding
event in the trace in order.
To build the mdriver.bin
program, run make
or make mdriver.bin
. To
create an optimized version for performance testing (not debugging),
run make mdriver.opt.bin
.
- Use the
mdriver.bin
executable during development, testing, and debugging. It is compiled with minimal compiler optimizations and with assertions enabled. When testing withmdriver.bin
, it helps to set theDEBUG
flag to1
to display heap information for indiviual traces. -
Use the
mdriver.opt.bin
executable for performance evaluation only. It is compiled with advanced compiler optimizations and with assertions disabled, making it more difficult to debug. Its usage is identical to the usage ofmdriver.bin
, so we list only the former in examples. Be sure to set theDEBUG
flag to0
to disable displaying debugging information when measuring performance.make test
is a quick way to test peformance; it generatesmdriver.opt.bin
and then runs
./mdriver.opt.bin -V
.
To run the mdriver.bin
executable, use the command ./mdriver.bin
or
./mdriver.bin -V
. The -V
flag displays helpful summary information as
described below. The mdriver.bin
executable accepts the following
command line arguments:
-t <tracedir>
: Look for the default trace files in directorytracedir
instead of the default directory defined inconfig.h
.-f <tracefile>
: Use one particulartracefile
for testing instead of the default set of tracefiles.-h
: Print a summary of the command line arguments.-l
: Run and measurelibc
malloc in addition to yourmm.c
implementation.-v
: Verbose output. Print a performance breakdown for each tracefile in a compact table.-V
: More verbose output. Prints additional diagnostic information as each trace file is processed. Useful during debugging for determining which trace file is causing yourmm.c
implementation to fail.
Note that the mdriver.bin
test harness runs each trace at least
twice in each session: once to check correctness and once to check
performance. It is important to remember this when examining debugging
output.
Common Testing Tasks
- Run
mdriver.bin
on individual traces for debugging:./mdriver.bin -V -f traces/your-favorite-trace.rep
-
The traces
small1.rep
andsmall2.rep
are especially helpful for initial tests of your allocator. You’ve seen these already in lecture and in Checkpoint Exercise Part 3. - The traces
short1-bal.rep
andshort2-bal.rep
should also work from the beginning (inefficiently). - You may find it useful to write additional traces for debugging.
-
- Run
mdriver.opt.bin
on all traces for correctness testing and performance evaluation:
./mdriver.opt.bin -V
- Note that
make test
is equivalent tomake mdriver.opt.bin
followed by./mdriver.opt.bin -V
- Note that
- Some traces in the
traces
directory will cause your allocator to run out of memory unless you have completed all three implicit allocator features. - Any other errors (other than out-of-memory) or crashes indicate correctness problems you must fix.
Trace Format
When learning about the starter code, you will simulate small traces. When testing and debugging, you may find it useful to write and test your own small traces.
Traces used by mdriver.bin
summarize the execution of a program as a
sequence of mm_malloc
and mm_free
calls in a simple format.
A trace file contains 4 header lines:
- Suggested heap size (any number, ignored by our tests).
- Total number of blocks allocated.
- Total number of malloc/free events.
- Weight (any number, ignored by our tests).
Remaining lines after the header give a sequence of memory management events, one per line. Each event is either an allocate event or a free event:
- Event
a i size
indicates thei
th call tomalloc
in the trace, requesting a payload og the givensize
. The IDi
uniquely identifies the malloc call and the allocation it makes, starting at 0 for the first allocation in the trace. - Event
f i
indicates a call tomm_free
with the pointer that was returned by thei
thmalloc
call in the trace.
The following example C code would generate the corresponding trace below it.
C code:
p0 = malloc(12);
p1 = malloc(16);
p2 = malloc(16);
free(p0);
free(p1);
p3 = malloc(24);
A corresponding trace:
128
4
6
1
a 0 12
a 1 16
a 2 16
f 0
f 1
a 3 24
Heap Consistency Checker
The starter code function check_heap
implements basic heap consistency checks for
an implicit free list allocator and prints out all blocks in the heap
in memory order.
You may find it helpful to insert calls to check_heap
to assist in
testing and debugging allocator features you add. Make sure to
set DEBUG=0
to disable check_heap
when you submit your code. These checks will
otherwise hurt your performance evaluation.
You may also find it helpful to extend the check_heap
function to
check more detailed consistency properties. This becomes most
interesting when considering an explicit free list (or more
sophisticated) allocator. check_heap
should return a nonzero value
if and only if your heap is consistent according to the conditions you
check.
Example checks to add:
- Are all blocks present in the (explicit) free list also marked as free?
- Are all blocks present in the (explicit) free list also valid block addresses?
- Are all blocks marked as free also present in the (explicit) free list?
- Are all free blocks surrounded by allocated blocks (i.e., are all free blocks fully coalesced)?
- Are all blocks non-overlapping (check boundary tags)?
- Are all heap words (except the heap header and heap footer) part of some block?
Feel free to rename or split check_heap
into multiple static
helper functions. When you submit mm.c
, make sure all
calls to check_heap
are conditioned on DEBUG
; otherwise they will impact performance.
Performance
Performance of your allocator is important. Use mdriver.opt.bin
for performance evaluation.
Once you are ready to check your implemention’s performance (once running ./mdriver.bin
reports no errors),
make sure DEBUG
is set to off (set to 0
) to test performance. Performance with a debug flag of 1
will be poor (printing is slow!).
The mdriver.opt.bin
executable is a version of mdriver.bin
that is
compiled with aggressive compiler optimizations enabled and assertions
disabled. We suggest using mdriver.opt.bin
for performance evaluation, but use the
normal mdriver.bin
during testing and debugging.
The testing infrastructure automatically calculates a combined performance index based on two metrics:
- Space utilization: The peak ratio between the aggregate amount of
memory used by the driver (i.e., allocated via
mm_malloc
but not yet freed viamm_free
) and the size of the heap used by your allocator. The optimal ratio is 1. Choose policies that minimize fragmentation to maximize utilization. - Throughput: The average number of allocator operations completed per second.
Your performance index (PI) is a weighted sum of the space utilization and throughput:
PI = 0.6 × Umm.c + 0.4 × min(1, Tmm.c / Tlibc)
where Umm.c is the space utilization of your
mm.c
implementation, Tmm.c is the
throughput of your mm.c
implementation, and
Tlibc is the estimated throughput of the
standard C library (libc
) version of malloc
on the default traces.
The performance index values both good space utilization and good
throughput, with slight preference toward space efficiency.
mdriver.opt.bin
calculates and reports the performance index for you. For example, for an A reference solution:
$ ./mdriver.opt.bin
Using default tracefiles in traces/
Perf index = 48 (util) + 40 (thru) = 88/100
Your allocator must balance utilization and throughput. For reference, in previous semesters, representative implementations have achieved the following performance indices (out of 100):
- Implicit free list with first-fit, splitting, and coalescing: 50 to 52
- Implicit free list with next-fit, splitting, and coalescing: 67 to 72
- Explicit free list with first-fit, splitting, and coalescing: mid to high 80s
Because the highest peformance indices in previous semesters tends to be about 88, your peformance percentage will be measured out of 88 rather than 100. See more details here.
Better performance likely requires a more sophisticated free list implementation. Before haphazardly attempting to “optimize” your implementation, read these hints about performance engineering and optimization..
How To Work
Low-level unguarded memory manipulation code like this allocator is prone to subtle and frustrating bugs. There’s a reason you often use higher-level languages with more protections! The following strategies help to work effectively and responsibly to avoid frustration and confusion.
Plan Carefully
Review allocator concepts, complete the preparatory exercises, and read the provided code carefully. Never write code until you have a good idea of what you are doing and what is already there.
Code Defensively
Expect things to go wrong. Anywhere your code relies on an assumption, check it explicitly. Explain to your partner why your code should work. Use helper functions to help avoid errors.
-
Use assertions. Anywhere you assume a property of data to hold, you can assert that property explicitly. Assertions document your assumptions. They are also checked explicitly as your program runs, making it easier to localize the source of an error if one occurs.
-
Use the provided helpers for memory access (
LOAD
,STORE
,PLOAD
,PSTORE
.) and pointer arithmetic (PADD
andPSUB
). This helps avoid pointer manipulation errors or at least catch them early.
Implement and Test Incrementally
Implement, test, and commit one feature (or sub-feature!) at a time, testing and committing each before you start implementing the next. This will make it much easier to understand what code is involved in a bug.
Debug Methodically
GDB and the
check_heap
function are the tools of choice. Here
are some strategies and other tips.
- As you are debugging, remember that the
mdriver.bin
testing harness runs each trace twice: once to check for correctness and once to measure performance. Keep this in mind when examining any debugging output, especially if it seems to appear twice. - Did your program hit an explicit error? An assertion failure? A segfault?
- Localize the crash. Exactly what line of code, what
operation in this line, what value(s) used by this
operation manifested what error and crashed? This symptom
rarely indicates the cause, but it marks where to begin the
search.
- Use
gdb
torun
the code until it crashes, then find the line that crashed (and the context in which it executed) withbt
,backtrace
, orwhere
. - Inspect the most relevant line of code and the error report to determine what the error means. Was it a segfault? If it was an assertion failure, what was the assertion checking?
- Inspect the arguments shown in the backtrace or
print
variables to determine what ill-formed pointer was dereferenced to cause a segmentation fault or what illegal values failed an assertion.
- Use
- Trace backward through the dependencies of this broken
operation to the original logic error.
- Invalid values: Did any of this operation’s arguments
hold invalid values? Did this operation load any invalid
values from memory?
- What operations produced these invalid values?
- This is easy to answer for invalid arguments.
- For invalid memory values, consider what earlier operations (perhaps in completely separate function calls) may have saved (or failed to save) this value. Look ahead to the memory tips.
- Continue tracing backwards treating these producer operations as broken operations.
- What operations produced these invalid values?
- Invalid choices: Was it invalid to apply this operation
here given correct arguments (or regardless of the
arguments)?
- What control flow decision led to execution of this operation?
- Is the logic of the decision correct?
- If so, continue tracing backwards treating the control flow decision as a broken operation.
- Callers: You may need to trace back beyond the
function where the crash happened.
- Get a
backtrace
to see where the current function was called. (Don’t just assume you know which function called this one. Get the truth.) - If you want to inspect local variables in functions that
called this one, use
up
anddown
ingdb
to step up and down the call stack, thenprint
what you want to see.
- Get a
- Memory: Remember,
mm_malloc
andmm_free
are called many times during a single execution. They depend on their arguments, but also the entire contents of the heap as it exists when they are called, so changes made by earlier calls tomm_malloc
andmm_free
affect data used by this one.- Use the
check_heap
function to scan the heap. - Run it at arbitrary times in
gdb
withcall check_heap()
. - Or hard-code calls to it in your code so you see how the
heap evolves with each
mm_malloc
ormm_free
call.
- Use the
- Invalid values: Did any of this operation’s arguments
hold invalid values? Did this operation load any invalid
values from memory?
- Minimize the input. Try to write a small trace that captures the essence of the large trace on which your allocator has an error. A smaller trace is easier to reason about in full.
- Localize the crash. Exactly what line of code, what
operation in this line, what value(s) used by this
operation manifested what error and crashed? This symptom
rarely indicates the cause, but it marks where to begin the
search.
-
valgrind
will be less useful since you are doing very low-level work, implementing the memory allocator itself.valgrind
excels in analyzing programs that use the memory allocator. -
If resorting to printing (try
gdb
first), usefprintf(stderr, "...", ...)
instead ofprintf("...", ...)
and be sure to remember newlines for clarity. Disable any printing in final version. -
Use the
mdriver.bin
-f
option. During initial development, using tiny trace files will simplify debugging and testing. We have included two such trace files (short1-bal.rep
andshort2-bal.rep
) that you can use for initial debugging. You can also write your own for targeted debugging/testing. -
Use the
mdriver.bin
-V
option. The-V
will option will give a detailed summary for each trace file and indicate when each trace file is read, which will help you isolate errors. - Does your code seem to be running forever?
- That might be because it is no longer broken! Set
DEBUG=0
to disable those calls tocheck_heap
. Printing the full contents of the heap on every memory management event costs orders of magnitude more than the allocations or frees themselves. - Or, it might be because a logical mistake is causing an loop. Reenable
debugging and check with
gdb
.
- That might be because it is no longer broken! Set
Optimize Judiciously
“… avoid premature optimization…” – Donald Knuth (or Tony Hoare?) and countless others.
When your program is not as efficient as you wish, it is easy to jump in and start hacking on the first part that comes to mind as a potential reason for poor performance. A much better approach is to first:
- Make sure you have enabled all available automatic optimizations, such as compiler optimizations.
- If this does not yield sufficient gains, then:
- Gather and analyze empirical data about where your program is spending its time or other resources.
- Use this data to form a hypothesis about what particular implementation features are “bottlenecks” contributing to poor performance.
- Now you are ready to attempt some modifications and use this
same measurement process to evaluate their impact.
- Think first in Big-O terms. Consider lower level implementation optimizations only once you have exhausted opportunities for algorithmic optimizations.
Submission
To submit the required prep exercises, complete the Gradescope pre-lab and bring your diagrams for Part 2 to lab.
Before submitting, disable any diagnostic printing that you added (except in check_heap
).
Make sure DEBUG
is off (set to 0
).
For code, submit the best implementation that you have. If you have a partially implemented improvement or if your most recent version is not your best:
-
Copy your partial or non-best version of
mm.c
to a new file starting withmm-
and ending with.c
. Then add and commit it with a descriptive message. For example:cp mm.c mm-partial-explicit.c git add mm-partial-explicit.c git commit -m "Copied partial explicit free list implementation to mm-partial-explicit.c"
- Use
git log
to find the commit ID (large hexadecimal number) of your best working version. - Use
git checkout ID -- mm.c
(replaceID
with the commit ID you found above) to restore the working version you will submit. - Rebuild it with
make clean
andmake
, then test it to make sure this is the one. -
Add and commit this version of
mm.c
:git add mm.c git commit -m "Restored best working version of mm.c"
Then, to submit your code, follow the usual steps:
Submit: The course staff will collect your work directly from your hosted repository. To submit your work:
-
Test your source code files one last time. Make sure that, at a minimum, submitted source code is free of syntax errors and any other static errors (such as static type errors or name/scope errors). In other words: the code does not need to complete the correct computation when invoked, but it must be a valid program. We will not grade files that do not pass this bar.
-
Make sure you have committed your latest changes. (Replace
FILES
with the files you changed andMESSAGE
with your commit message.)$ git add FILES $ git commit -m "MESSAGE"
-
Run the command
cs240 sign
to sign your work and respond to any assignment survey questions.$ cs240 sign
(If this encounters an error, instead execute
cs240.s25 sign
.) -
Push your signature and your latest local commits to the hosted repository.
$ git push
Confirm: All local changes have been submitted if the output of
git status
shows both:
Your branch is up to date with 'origin/main'
, meaning all local commits have been pushednothing to commit
, meaning all local changes have been committed
Resubmit: If you realize you need to change something later, just repeat this process.
Grading
- Preparatory Exercises (30 points):
- Complete the 3 required Malloc Checkpoint Exercises by the checkpoint deadline.
- You are encouraged to start working on your code as soon as you have the checkpoint exercises done, even if you do not have your grade back for them yet.
-
Code (90 points total):
- Documentation and Style (10 points):
- Document your code clearly. This includes:
- Clear comments explaining the inputs, outputs, and behavior for any new helper functions.
- Comments explaining the cases/steps in multi-case/multi-step function bodies
like
allocate
,coalesce
, andmm_free
. - Comments explaining changes to the layout of heap blocks or the overall heap structure.
- Comments explaining the purpose of any nontrivial local variables.
- Other comments to explain code aspects that aren’t obvious.
- Follow the programming rules.
- Document your code clearly. This includes:
- Correctness (40 points): based on passing
mdriver.bin
tests- 2 points each for
short1-bal.rep
andshort2-bal.rep
. - 4 points each for the 9 remaining traces, excluding the optional
realloc*.rep
traces.
- 2 points each for
- Performance (40 points): _40 * (PI / 88)
- PI is the performance index reported by
mdriver.opt.bin
. - These points are possible only for working implementations.
- See this explanation for why the denominator is 88 rather than 100, and what sort of performance scores you can expect for working versions of different implementation strategies.
- PI is the performance index reported by
- Extra Fun Successful open-ended extensions may earn additional points.
- Documentation and Style (10 points):
You may use any implementation strategy that works (subject to programming rules). Allocator performance is a large component of the grade for this assignment.
Extra Fun Open-Ended Extensions
There are a number of interesting allocator features you could add if looking for more fun or extra credit. Each is annotated with a possible extra credit award for a thorough, well-designed, well-implemented extension. Other improvements to allocator performance on the standard test suite will naturally be reflected in your grade independent of extra credit. Talk to the instructors if you are curious about any of these extensions or if you have ideas for others. Some larger extensions would be larger than the original assignment.
Note that you will only receive credit for extensions if you already have full correctness points; do not attempt extra credit before that point.
Implement realloc
(Up to +10)
Implement another memory-allocation-related function:
mm_realloc
. The signature for this function, which you will find in
your mm.h
file, is:
extern void* mm_realloc(void* ptr, size_t size);
Similarly, you should add the following in your mm.c
file:
void* mm_realloc(void* ptr, size_t size) {
// ... implementation here ...
}
Follow the contract of the standard C library’s realloc
exactly
(pretending that malloc
and free
are mm_malloc
and mm_free
,
etc.). The man page entry for realloc
says:
The realloc() function changes the size of the memory block pointed to by
ptr to size bytes. The contents will be unchanged in the range from the
start of the region up to the minimum of the old and new sizes. If the
new size is larger than the old size, the added memory will not be
initialized. If ptr is NULL, then the call is equivalent to
malloc(size), for all values of size; if size is equal to zero, and ptr
is not NULL, then the call is equivalent to free(ptr). Unless ptr is
NULL, it must have been returned by an earlier call to malloc(), calloc()
or realloc(). If the area pointed to was moved, a free(ptr) is done.
A good test would be to compare the behavior of your mm_realloc
to
that of realloc
, checking each of the above cases. Your
implementation of mm_realloc
should also be performant. Avoid
copying memory if possible, making use of nearby free blocks. You may
not use standard library functions such as memcpy
to copy memory.
Instead, copy WORD_SIZE
bytes at a time to the new destination while
iterating over the existing data.
To run tracefiles that test mm_realloc
, compile using make
mdriver-realloc.bin
. Then, run mdriver-realloc.bin
with the -f
flag to
specify a tracefile, or first edit config.h
to include additional
realloc tracefiles in the default list.
Extended Error-Checking
(Up to +100) Augment the allocator to do additional error-checking on the fly to help detect common memory errors such as invalid frees (bad address), double-free, use-after-free, out-of-bounds access on heap objects corrupting heap metadata, etc. You could insert extra padding, use canary values, place recently freed blocks in a waiting area, track additional information about earlier allocations and frees to detect errors, or more. Come chat if you have ideas or interest here.
-
This document is based on the CSAPP Malloc Lab, which is available on the CSAPP website. The Malloc assignment includes additional structure in the starter code and makes the
realloc
implementation optional. ↩ -
The page size 4096 is the number of bytes returned by
mem_pagesize()
and used by heap enlarging functionmem_sbrk
to make the heap larger. ↩