Today, we'll continue our discussion from last week on automatic memory management, specifically reference counting and tracing garbage collection.

Recall from last time

In any program, there are 3 steps a programmer must carry out to use dynamic memory:

  1. Allocate memory of the specific size required for the object.
  2. Use that allocated memory for some period (e.g., initializing it, doing further calculations or computation with it).
  3. Release, or free, the allocated memory when the task is done and the memory is no longer needed (so other parts of the program can use it).
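
To make the three steps concrete, here is a minimal sketch in Rust (a language we will return to at the end of these notes); the array size is arbitrary, and step (3) would normally happen implicitly at the end of the scope, but drop makes it explicit:

fn main() {
    // (1) Allocate: Box::new reserves heap memory sized for this array.
    let data = Box::new([0i32; 1024]);

    // (2) Use: read (and/or write) the allocated memory.
    let sum: i32 = data.iter().sum();
    println!("sum = {sum}");

    // (3) Release: Rust does this automatically at the end of the scope;
    // drop(data) just makes the free explicit for illustration.
    drop(data);
}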

In low-level languages like C, the programmer must do all 3 steps manually. We'll call this manual or explicit memory management: it is not considered "memory safe".

In automatic memory management, which many programming languages (like Java and Python) provide, steps (1) and (3) are done for the programmer by the language implementation itself. These are considered memory safe languages.

Soundness vs. completeness for automatic memory management

Recall that last time, we showed that reference counting does not always successfully free all memory that can be freed. In particular, cycles in the reference graph can cause the bad outcome that unreachable objects are never actually freed.
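
To see this concretely, here is a small Rust sketch using the standard library's reference-counted pointer Rc (the Node struct and its next field are invented for illustration). The two objects form a cycle, so their counts never reach zero and they are never freed:

use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b)); // close the cycle: a -> b -> a

    println!("count(a) = {}", Rc::strong_count(&a)); // 2: the variable a, plus b.next
    println!("count(b) = {}", Rc::strong_count(&b)); // 2: the variable b, plus a.next

    // When a and b go out of scope, each count drops to 1, never 0,
    // so neither Node is ever freed: a leak.
}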

However, while this is an undesirable outcome, it is not actually a memory safety issue. Going back to some terms we saw earlier in CS340, reference counting is "sound, but not complete".

In the context of memory management:

  - Sound: the strategy never frees an object that may still be used. (An unsound strategy can cause use-after-free memory safety issues.)
  - Complete: the strategy frees every object that can no longer be used. (An incomplete strategy can leak memory.)

Consider our earlier, incorrect attempt at a garbage collection strategy:

"Free objects when the function in which they were created returns."

Is this strategy sound? Is it complete?

(Think, then click!)

No, it is not sound, because of the counterexamples we saw previously (objects can live on and be used after that function returns, so it is not safe to free them). Using memory after a potential free is a memory safety issue.

It's also not complete (depending on our definition), because functions can have nested scopes within them after which an object is never used.

For example, consider a case where inside a conditional, we create an object that is never used outside of that conditional. This strategy would not free the object until the end of a potentially long-running function.

def f():
    if rare_case:
        c = Cat()
    # otherwise: long-running work, during which c is never used again


Tracing garbage collection

Recall that in tracing garbage collection, we avoid the issue with cycles by traversing reachable objects starting at the root set.

The core algorithm is:

  1. Starting at the root set, traverse the reference graph to get the reachable set of "live" objects.
  2. Free objects that are not reachable ("dead").

Mark and sweep

The simplest tracing GC implementation is mark-and-sweep.

  1. Mark: keep an extra few bits of metadata on each heap block. Traverse starting at the root set, marking each block that you visit.
  2. Sweep: do a second, linear pass over the entire heap, freeing every block not marked in the previous step.
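
Here is a hedged sketch of the algorithm in Rust (a toy model, not any real runtime's implementation): the heap is a vector of slots, and each live object is simply the list of slot indices it references.

struct Heap {
    objects: Vec<Option<Vec<usize>>>, // Some(refs) = live block; None = freed slot
    marked: Vec<bool>,                // the extra metadata: one mark bit per slot
}

impl Heap {
    // Mark: traverse from the root set, marking every block visited.
    fn mark(&mut self, roots: &[usize]) {
        let mut stack: Vec<usize> = roots.to_vec();
        while let Some(i) = stack.pop() {
            if self.marked[i] {
                continue; // already visited (this is what makes cycles safe)
            }
            self.marked[i] = true;
            if let Some(refs) = &self.objects[i] {
                stack.extend(refs.iter().copied());
            }
        }
    }

    // Sweep: a second, linear pass over the ENTIRE heap, freeing unmarked blocks.
    fn sweep(&mut self) {
        for i in 0..self.objects.len() {
            if !self.marked[i] {
                self.objects[i] = None; // free
            }
            self.marked[i] = false; // reset marks for the next collection
        }
    }
}

fn main() {
    // Objects 0 and 1 reference each other (a cycle); object 2 is unreachable.
    let mut heap = Heap {
        objects: vec![Some(vec![1]), Some(vec![0]), Some(vec![])],
        marked: vec![false; 3],
    };
    heap.mark(&[0]); // the root set reaches the cycle {0, 1}
    heap.sweep();    // frees only slot 2; the reachable cycle survives
    assert!(heap.objects[2].is_none());
}

Note that, unlike reference counting, a cycle causes no trouble here: objects 0 and 1 would be freed as soon as the root set stopped reaching them.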

The downsides of this simple implementation are:

  1. It requires traversing the entire heap, instead of just the live objects: O(size of the entire heap) runtime.

What else? Hint: consider the problem from CS240 malloc where we can't find a free block of the size we need, even though there would be room if we moved the allocated data around.

(Think, then click!)

The second major problem is fragmentation: our still-in-use allocated blocks can form a patchwork that makes it hard to use memory optimally.

The problem of fragmentation is somewhat unavoidable in any language that allows programmers to access raw pointers (e.g., C) rather than only higher-level references (e.g., Java). In languages like C, we can't just move objects around in the heap, because moving an object changes its address, which the program may have stored in a raw pointer.


Moving vs. non-moving

The basic mark-and-sweep algorithm is non-moving: it keeps allocated blocks at the same addresses in memory.

In higher-level languages with references, GCs can be moving GCs that compact/compress blocks as part of collection, substantially reducing fragmentation.

The high level idea is this:

  1. Find live objects via the same root set traversal idea as before.
  2. Copy all live objects, eliminating any space between objects.

This is the somewhat mind-blowing part: what is a little fishy about step 2?

(Think, then click!)

Where are we supposed to copy live objects to? The objects are in the heap, but need to be copied to another part of the heap. How do we not overwrite other objects?

This tension motivates a common type of moving GC: a "semi-space" GC.


Semi-space GC

The simplest semi-space GC has such a strange idea that at first it seems like it would not work:

  1. Divide the heap in half.
  2. Use only the first half. Then run a moving GC step, copying and condensing live objects into the second half.
  3. Consider the entire first half free.

To me, this idea is like deciding you don't like one room's decorations in your house, so you buy a new house and move all your furniture there!

It turns out that because this only uses twice the memory as before, and multiplying by 2 is a single constant multiplicative factor, this is often a reasonable tradeoff to make in practice.

This is especially true because collection is now conceptually O(number of live objects) instead of O(size of the entire heap).
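
For intuition, here is a hedged sketch of the copying step, reusing the same toy index-based heap representation as the mark-and-sweep sketch above. This is roughly Cheney's algorithm; the names (Obj, semi_space_collect) are invented, and the HashMap stands in for the forwarding pointers a real collector stores directly in from-space:

use std::collections::HashMap;

type Obj = Vec<usize>; // an object = the list of heap indices it references

// Copy every live object out of from-space into a fresh, compact to-space.
fn semi_space_collect(from: &[Obj], roots: &[usize]) -> (Vec<Obj>, Vec<usize>) {
    // Evacuate one object, returning its new address (reused if already copied).
    fn copy(o: usize, from: &[Obj], to: &mut Vec<Obj>,
            forward: &mut HashMap<usize, usize>) -> usize {
        if let Some(&n) = forward.get(&o) {
            return n;
        }
        to.push(from[o].clone());
        let n = to.len() - 1;
        forward.insert(o, n);
        n
    }

    let mut to: Vec<Obj> = Vec::new();
    let mut forward: HashMap<usize, usize> = HashMap::new();

    // Evacuate the root set first.
    let new_roots: Vec<usize> =
        roots.iter().map(|&r| copy(r, from, &mut to, &mut forward)).collect();

    // Scan to-space left to right, evacuating everything it references and
    // rewriting each reference to its new address.
    let mut scan = 0;
    while scan < to.len() {
        for i in 0..to[scan].len() {
            let old = to[scan][i];
            let new = copy(old, from, &mut to, &mut forward);
            to[scan][i] = new;
        }
        scan += 1;
    }
    (to, new_roots) // all of from-space may now be considered free
}

fn main() {
    // 0 and 1 form a live cycle; 2 is unreachable garbage.
    let from: Vec<Obj> = vec![vec![1], vec![0], vec![]];
    let (to, roots) = semi_space_collect(&from, &[0]);
    println!("live objects: {}, roots: {:?}", to.len(), roots); // 2 live, roots [0]
}

Notice that the collector never even looks at object 2: the work done is proportional to the number of live objects copied, not to the size of the heap.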

Generational collectors

More complicated moving collectors take advantage of an adage that "most objects die young". This captures the idea that in programming, many objects are only used briefly, while a small proportion of objects might be used for almost the entire duration of the program. "Generational" GCs take advantage of this by further subdividing the heap into different areas for objects expected to live for different durations, and collecting from them at varying frequency.

Tradeoffs in choosing a GC strategy

There is a tradeoff between how much extra memory a GC strategy uses and how much slower it is than optimal. Using only roughly the same space as an optimally manually-managed program makes GC orders of magnitude slower. If a GC can use 2x the memory, it's only 70% slower than ideal; this drops to 17% slower if a GC can use 3x the memory.

The key takeaway: for most applications, the memory safety and convenience offered by an automatic garbage collecting language is worth the additional memory usage and speed cost.

Cases where an automatic garbage collector may not be worth the performance costs:

  1. Very performance sensitive settings, like operating systems, browsers, etc. These are the domains where C/C++ are still common.
  2. Embedded and real-time systems, where the fact that tracing GCs need to "stop the world" in order to collect is too unpredictable in terms of program timing.

These two settings have motivated the programming languages community to begin to popularize a different option.

Rust: semi-manual but safe, via its type system

Rust is a language that builds on many older, experimental programming languages from decades of research.

Rust:

  1. Mostly does step (1) (size tracking) and step (3) (freeing) for you.
  2. Is often as fast as C.
  3. Is memory safe (by default).

This might seem too good to be true, given the tradeoffs we have discussed! Rust is able to provide this via an advanced type system.

At a high level, Rust essentially builds on the idea of "free a variable when it goes out-of-scope", which could be from a function returning or from an if-statement-block ending. However, to avoid the safety issues in the counterexamples we saw previously, the Rust type system imposes strong restrictions on how different references can alias the same memory.

fn f() {
    let s = String::from("hello");
    // ... use s ...
    // at the end of this scope, Rust inserts a call to drop(s)
}

To make programs with these restrictions work in practice, Rust employs a "borrow checker" that essentially tracks an over-approximation of the lifetime of each reference. The details are beyond the scope of this lecture, but come talk to me if you want to hear more!
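
As a small illustration of the kind of aliasing restriction the borrow checker enforces (a sketch; uncommenting the marked line reproduces rustc's E0502 error):

fn main() {
    let mut s = String::from("hello");

    let r1 = &s; // any number of shared, read-only borrows are allowed...
    let r2 = &s;

    // ...but a mutable borrow while r1 and r2 are still live is rejected:
    // let r3 = &mut s; // error[E0502]: cannot borrow `s` as mutable
    //                  // because it is also borrowed as immutable

    println!("{r1} {r2}");
}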


Modeling memory management

For the next assignment, you will be modeling some aspects of automatic memory management (and the soundness and completeness properties) in Alloy.

Today, we'll start with a basic model for reference counting:

Reference counting

abstract sig Memory {
  // Variable, since the reference graph changes over time as the programmer 
  // modifies the heap.
  var references: set HeapObject
}
// references: relation on Memory->HeapObject

// For simplicity, we'll just model the root set as a single sig with 
// potentially many outgoing edges in the references relation.
one sig RootSet extends Memory {}

sig HeapObject extends Memory {
  // Need to use an int here, since we need counting + zero checking
  // Variable, since the reference graph changes over time as the programmer
  // modifies the heap.
  var refCount: Int 
  // Assume we don't care about overflow (Alloy option: Prevent Overflows: Yes)
}
// Objects not in Allocated are freed
// Variable, since what is allocated vs freed changes over time.
var sig Allocated in HeapObject {} 

// Takes the var fields as arguments, so we can use this pred both "now" and "after".
pred reachableFromRootSet[o: HeapObject, refs: Memory->HeapObject] {
  // ^refs is reachability via one or more reference edges; joining with
  // RootSet on the left restricts it to what is reachable from the root set.
  o in RootSet.^refs
}

// Soundness: memory safety: no reachable memory has been freed.
pred memorySafe[alloc: set HeapObject, refs: Memory->HeapObject] {
  // Everything reachable from the root set is still allocated (not freed):
  // an if-then ("reachable implies allocated"), so we use =>.
  all o: HeapObject | reachableFromRootSet[o, refs] => o in alloc
}

// Completeness: no leaks in the given step (everything not freed is reachable). 
pred noLeaks[alloc: set HeapObject, refs: Memory->HeapObject] {
  // Everything still allocated is reachable from the root set
  // (equivalently: nothing unreachable is left allocated).
  all o: HeapObject | o in alloc => reachableFromRootSet[o, refs]
}


// We could model every step of the algorithm to increment and decrement
// the reference count, but that would be a very detailed model.
// Instead: we'll just model the invariant that the reference count is 
// equal to the number of incoming edges, since the invariant alone is 
// all we need to understand the problem with cycles.
fact refCounts {
  always {
    all o: HeapObject | {
      // references.o is the set of memory locations with an edge to o,
      // i.e., o's incoming edges; the count must match their number.
      o.refCount = #(references.o)
    }
  }
}

// We'll break up the changes to the heap into several transition predicates:
// in a given step, either the programmer does something to change the
// heap contents (e.g., changes the reference graph), or reference counting
// and freeing happens (possibly cascading), or "do nothing" happens.
pred heapChangedByProgrammer {
   // Think about our "var" sigs and fields. One possible completion
   // (an assumption, not the only choice): leave references unconstrained,
   // so the programmer may rewire the graph arbitrarily, but this step
   // itself frees nothing:
   Allocated' = Allocated
}

run {} for 5

Addition to make it easier to see steps

One way to make it easier to identify which step the trace is currently taking is to add an explicit sig marking the options for steps:

// Abstract because we only want specific subsigs
abstract sig Step {}

// One because each is a single category of step.
// "extends" because each is mutually exclusive.
one sig Programmer, RefCountFree, RefCountCascade, DoNothing extends Step {}

// Variable, single sig tracks which step we are currently in.
// "in" because it is not mutually exclusive.
var one sig WhichStep in Step {}

We then constrain WhichStep within each predicate, for example:

pred heapChangedByProgrammer {
  Programmer = WhichStep 
  // ...
}

Now, the visualizer will always show which step is the current step within a trace, which can make debugging easier. This is an addition that might be helpful if you use Alloy for your final project.

Modeling a tracing, mark-and-sweep GC

What do we need to change about our work-in-progress reference counting model to instead model a simple mark-and-sweep tracing GC?

First, we'll remove the refCount field on the HeapObject sig.

Instead, what metadata do we now need on heap objects?

(Think, then click!)

We need to track whether a heap object is marked. Rather than adding a boolean to each heap object, we can just track this globally with a Marked set, similar to our previous Allocated sig.

sig HeapObject extends Memory {}

var sig Allocated in HeapObject {} 

// Marked: heap object is in the set if marked.
// Variable: will change over time.
var sig Marked in HeapObject {} 

We had 4 transition predicates for reference counting: heapChangedByProgrammer, referenceCountingFrees, referenceCountingCascade, and doNothing.

Which of these do we need to only slightly update for a mark-and-sweep GC, and which do we need to replace entirely? What should we replace them with?

(Think, then click!)

heapChangedByProgrammer and doNothing need to be only slightly updated.

Both should be updated to include the new variable sig, Marked. In both, we can just say the marked set does not change:

Marked' = Marked 

Because the marked set is only used by the active steps of the garbage collector, here we could also just say the set should be empty in these steps:

no Marked 

The other two steps, referenceCountingFrees and referenceCountingCascade, should be replaced entirely.

What would be a good model for what to replace them with?

The natural two steps in the algorithm: mark and sweep.


New mark and sweep transition predicates

pred mark {
  // Think about var sigs: references, Allocated, Marked
  references' = references 
      
  // The allocated set does not change in this first pass
  Allocated' = Allocated
  
  // Don't call the reachability predicate here; just say directly what
  // Marked should be after this step: everything reachable (^references)
  // from the root set.
  Marked' = RootSet.^references
}

pred sweep {
  // Assume this happens immediately after mark.
  // Think about var sigs: references, Allocated, Marked.
  no Marked'

  // What stays allocated in the next step is what is marked in the current step.
  Allocated' = Marked

  // How should we change the references relation? Hint: it should
  // potentially shrink.
  //
  // Subtract outgoing edges from freed objects and incoming edges to freed objects:
  references' = references - ((HeapObject-Marked)->Memory) - (Memory->(HeapObject-Marked))
  // Same as:
  //   references' = references & ((Marked + RootSet)->Marked)
  // Same as:
  //   references' = references & ((Allocated' + RootSet)->Allocated')
}