In this lecture, we'll take a break from Alloy to cover a topic important in systems programming in general: the idea of memory management. This will prepare us to model memory management in next week's assignment.

Memory management in general

To write almost any computer program that runs over a period of time, programmers need to consider how to manage memory use within the program. The phrase "memory management" typically refers to dynamic memory, or memory that is requested while the program is running (as opposed to static memory, such as global variables, which is pre-allocated by the compiler or interpreter before the program runs).

In any program, there are 3 steps a programmer needs to perform to use dynamic memory:

  1. Allocate the memory needed for the specific size of object required.
  2. Use that allocated memory for some period (e.g., initializing it, doing further calculations or computation with it).
  3. Release, or free, the allocated memory when the task is done and the memory is no longer needed.

In some programming languages, like Java and Python, steps (1) and (3) are done automatically for the programmer by the language implementation. In low-level languages like C, the programmer must do all 3 steps manually.

There are a few broad buckets that describe how different languages and settings approach this problem.

Explicit/manual memory management (e.g., C, C++)

When you first learned to program in C (e.g., in CS240), that was likely your first time needing to manually manage all parts of using dynamic memory in your program via calls to malloc and free. The high-level idea is that a memory allocator library implements malloc and free, but the programmer themselves is responsible for calling free on allocations that are no longer needed.

Implicit/automatic memory management: "garbage collection" (e.g., Java, Python, most high-level languages)

In high-level languages like Java and Python, the language implementation itself handles allocating the right size of memory and freeing memory that is no longer in use by the program.

There are two historically common flavors of automatic memory management we'll cover in today's lecture:

  1. Tracing garbage collection (e.g., Java, C#).
  2. Automatic reference counting (e.g., Python (usually), Objective-C).

Some languages, like Rust, allow some forms of garbage collection (e.g., automatic reference counting) while also automating some of the manual components of explicit memory management via the "borrow checker" (we'll cover this if there is time).


Factors for choosing different memory management strategies

The main factors/tradeoffs are:

  1. Efficiency (speed) of the code. Explicit management (e.g., C) tends to be faster than automatic management, and automatic reference counting tends to be faster than tracing GC.
  2. Memory utilization. Different strategies have different amounts of overhead in how much "extra" memory they require.
  3. Most importantly (I would argue), how "safe" each strategy is: that is, how easy it is for the programmer to make mistakes that are costly or violate security properties.

Downsides of explicit/manual memory management (e.g., C)

In C, a programmer wanting to create and use a Cat struct would write something like:

#include <stdlib.h>

typedef struct {
    int age; // example field
} Cat;

void makeCat() {
    Cat *c = (Cat *)malloc(sizeof(Cat));
    // Initialize and use c
    free(c); // Programmer has to remember to free
}

What can go wrong in the use of manual memory management here?

(Think, then click!)

The answers we came up with in class:

  1. Easy to mess up the size (which could lead to buffer overflows in the extreme case!)
  2. Could forget to free, an instance of a "memory leak". The "memory leak" terminology is used especially when it is no longer possible for the programmer to free the memory, because the local variable storing the pointer is gone (popped off the call stack).
  3. Freeing doesn't actually clear the bytes (this is not really handled by any of these strategies alone, though it is exacerbated by other problems like buffer overflows).
  4. Pointers are just integers/raw bytes treated as memory addresses: they do not have any special protection within the language.
  5. Programmers could accidentally call free multiple times on the same pointer (a "double free").
  6. Programmers could use memory after having called free (a "use after free").

Security implications of manual memory management

In computer security, there is a concept called "CVEs", or Common Vulnerabilities and Exposures.

CVEs are essentially a public, tracked list of known security vulnerabilities (bugs that could be exploited).

What percentage of CVEs listed by Microsoft do you think are from memory safety issues?

(Think, then click!)

The answer is around 70%.


The supplemental reading for today is a 2023 press release from the US Cybersecurity and Infrastructure Security Agency describing the urgent risk posed by memory safety issues in modern software.

Details on automatic memory management

The key idea of automatic memory management is that the programmer should not have to track the size of data or decide when data should be freed. Instead, the language implementation itself handles these details.

For example, consider our Cat example in Python:

def make_cat():
    c = Cat()  # Don't need to worry about size
    result = c.do_something()
    return result  # No need to free the Cat memory

This brings up the core question in this lecture: when is it safe for the programming language to automatically free allocated memory?

Note that there is more than one reasonable/correct answer here!

  1. When the function in which it was created returns.
  2. When the thread in which it was created ends.
  3. When the programmer can no longer reference it.
  4. When there are no more pointers to it.
  5. After the last reference to it in the code.

(Think, then click!)

In class, we had groups vote on the answers, and got a wide range!

The two answers that do not quite work are:

  1. When the function in which it was created returns.
  2. When the thread in which it was created ends.


We can't just free memory when the function returns; to see why, consider this modification to the code:


def call_make_cat():
    c2 = make_cat()
    # Use c2 after make_cat returns


def make_cat():
    c = Cat()
    return c

make_cat returns, but other functions can continue to use that memory!

Student question: can we just track what memory is returned by the function?

Answer: no, not quite; consider the following:


def make_cat_list():
    l = []
    make_cat(l)
    # Use the Cat via the list l


def make_cat(l2):
    c = Cat()
    l2.append(c)
    return

Even though we don't return c, the Cat data is still potentially being used because it was added to some list data structure created outside of the function.

We also can't free data when the thread in which it was created ends for similar reasons: the data can be shared between threads, so the creator thread ending does not mean it is safe to free the data!
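
To see why, here is a minimal sketch in Python (the Cat class, shared list, and worker function are made up for illustration): data created on one thread can outlive the thread because it escapes into shared state.

import threading

class Cat:
    pass

shared = []  # list visible to all threads

def worker():
    # Allocate a Cat on the worker thread and publish it via the shared list.
    shared.append(Cat())

t = threading.Thread(target=worker)
t.start()
t.join()   # the worker thread has now ended...

print(shared[0])  # ...but its Cat is still reachable and in use!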

In the list from before, options (3) and (4) are the potentially-correct answers: they correspond to tracing GC (3) and reference counting (4). Option (5) roughly corresponds to Rust's borrow checking, but how that is accomplished is more complicated than we have time to go into today.

  1. Incorrect: When the function in which it was created returns.
  2. Incorrect: When the thread in which it was created ends.
  3. Correct (tracing GC): When the programmer can no longer reference it.
  4. Correct (reference counting): When there are no more pointers to it.
  5. Right idea, but needs more detail (borrow checking): After the last reference to it in the code.

Automatic reference counting

The overall idea of reference counting is that each dynamic allocation in the heap also tracks a bit of extra metadata: the number of incoming references to that object. The key idea is that when the count gets to 0, the language implementation can free that memory.

This proceeds as follows: the language implementation tracks a "root set" of references outside the heap that point to heap data. This includes global variables and local variables (which are stored in the stack frame of the function call where each local variable is defined).

When a new heap allocation is created, its reference count is automatically set to 1 (assuming the data is stored in some variable). When a new reference is created (e.g., by passing the data into a new function call), the count is incremented. When a reference is destroyed (e.g., by a function call returning), the count is decremented.

In short, automatic reference counting tracks the number of incoming edges in the reference graph, and frees an object when its count reaches 0.
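
In CPython, we can watch these counts directly using the standard-library function sys.getrefcount (note: it reports one extra reference, for its own argument). A minimal sketch, with the Cat class made up for illustration:

import sys

class Cat:
    pass

c = Cat()                  # one reference: the local variable c
print(sys.getrefcount(c))  # 2: c itself, plus getrefcount's own argument

d = c                      # creating a second reference increments the count
print(sys.getrefcount(c))  # 3

del d                      # destroying a reference decrements the count
print(sys.getrefcount(c))  # back to 2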

Can you think of an example, using the graph properties we've been examining frequently in Alloy, of a reference graph where this strategy does not free data that could be freed?

(Think, then click!)

When there is a cycle in the reference graph, but the entire cycle is no longer reachable from the root set!


The problem of cycles in automatic reference counting

Consider how reference counting works on the following program modeling linked-list nodes:

def nodes():
    n1 = Node()  # n1 RC = 1
    n2 = Node()  # n2 RC = 1
    n1.next = n2 # n2 RC = 2
    n2.next = n1 # n1 RC = 2
    return       # n1, n2 RC = 1, not 0

Note that when the function returns, the local variables n1 and n2 are destroyed because the call stack frame is popped. This decrements each RC by 1, to 1. However, without further action, this does not bring the reference count to 0, so this data is never freed!

High-level note: cycles in the reference graph can cause these memory leaks in automatic reference counting! These are sometimes called "retain cycles". Alexa has gotten this as an interview question.

In practice, languages like Python have additional mechanisms to try to find and free these cycles, but these mechanisms are heuristic and potentially slow down automatic reference counting.
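
For example, CPython exposes its cycle collector through the standard gc module. A minimal sketch (reusing the Node cycle from above) of the collector reclaiming what reference counting alone cannot:

import gc

class Node:
    def __init__(self):
        self.next = None

def nodes():
    n1 = Node()
    n2 = Node()
    n1.next = n2  # n1 -> n2
    n2.next = n1  # n2 -> n1: a cycle

nodes()              # after returning, the cycle is unreachable, but each RC is 1
print(gc.collect())  # the cycle collector finds and frees the pair; prints the
                     # number of unreachable objects it reclaimed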

Note that while retain cycles can cause worse memory utilization, they do not actually create any memory safety problems. In other words, they cause the collector to not collect all possible data, but they don't free data that should not be freed.
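
One common way programmers avoid retain cycles is to make one direction of the cycle a weak reference, which does not contribute to the reference count. A minimal sketch using Python's standard weakref module (the Node class here is made up for illustration):

import weakref

class Node:
    def __init__(self):
        self.next = None
        self.prev = None  # will hold a weak back-reference

n1 = Node()
n2 = Node()
n1.next = n2               # strong reference: n2's RC is incremented
n2.prev = weakref.ref(n1)  # weak back-reference: n1's RC is unchanged

print(n2.prev())  # call the weakref to get n1 (or None if it was freed)
del n1            # n1's count drops to 0, so CPython frees it immediately
print(n2.prev())  # None: the weak reference was cleared; no cycle leak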


Tracing garbage collection

Tracing garbage collection uses more sophisticated graph algorithms to get around this problem of collecting cycles. Instead of each object always tracking its incoming edges, a tracing GC periodically traces everything reachable from the root set.

Mark and sweep

One core algorithm, called "mark and sweep", proceeds as follows. Everything in the root set is marked reachable. Then, in the "mark" phase, the GC traverses the reference graph, setting a few metadata bits to indicate that an object is still "live". In the "sweep" phase, the GC does a linear scan of the entire heap, freeing any data that was not marked as live.
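
Here is a minimal sketch of the idea over an explicit object graph (the Obj class and the roots/heap lists are made up for illustration; a real GC works over raw memory and per-object header bits):

class Obj:
    def __init__(self, name):
        self.name = name
        self.refs = []       # outgoing references to other heap objects
        self.marked = False  # the "live" metadata bit

def mark(obj):
    # Mark phase: traverse the reference graph from a root, setting live bits.
    if obj.marked:
        return  # already visited (this is what handles cycles!)
    obj.marked = True
    for child in obj.refs:
        mark(child)

def mark_and_sweep(roots, heap):
    for r in roots:
        mark(r)
    # Sweep phase: linear scan of the whole heap; unmarked objects are freed.
    live = [o for o in heap if o.marked]
    for o in live:
        o.marked = False  # reset bits for the next collection
    return live

a, b, c = Obj("a"), Obj("b"), Obj("c")
a.refs.append(b)  # a -> b; c is unreachable garbage
b.refs.append(a)  # b -> a: a cycle, which mark-and-sweep handles fine
heap = mark_and_sweep(roots=[a], heap=[a, b, c])
print([o.name for o in heap])  # ['a', 'b'] -- c was swept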

The downside of all tracing GCs is that they typically require the runtime to "stop the world" in order for garbage collection to periodically take place. More on this next time!