Today, we'll continue our discussion from last week on automatic memory management, specifically reference counting and tracing garbage collection.
Recall from last time
In any program, there are 3 steps a programmer needs to do to use dynamic memory:
- Allocate the memory needed for the specific size of object required.
- Use that allocated memory for some period (e.g., initializing it, doing further calculations or computation with it).
- Release, or free, the allocated memory when the task is done and the memory is no longer needed (so other parts of the program can use it).
In low-level languages like C, the programmer must do all 3 steps manually. We'll call this manual or explicit memory management: it is not considered "memory safe".
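For example, here is a minimal C sketch of the 3 steps done manually (any mistake in step (3), like freeing too early or twice, is exactly the kind of bug that makes this unsafe):

#include <stdlib.h>
#include <string.h>

int main(void) {
    char *buf = malloc(16);       /* (1) allocate a specific size        */
    if (buf == NULL) return 1;    /*     allocation can fail             */
    strcpy(buf, "hello");         /* (2) use the allocated memory        */
    free(buf);                    /* (3) release it when done            */
    return 0;                     /* using buf after free() is undefined */
}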
In automatic memory management, which many programming languages (like Java and Python) provide, steps (1) and (3) are done for the programmer by the language implementation itself. These are considered memory safe languages.
Soundness vs. completeness for automatic memory management
Recall that last time, we showed that reference counting does not always successfully free all memory that can be freed. In particular, cycles in the reference graph can cause the bad outcome that unreachable objects are never actually freed.
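As a concrete reminder, here is a minimal Python sketch of a reference cycle. (CPython normally breaks such cycles with a separate backup cycle collector, which we disable here to isolate pure reference counting.)

import gc

class Node:
    def __init__(self):
        self.other = None

def make_cycle():
    a, b = Node(), Node()
    a.other = b   # b's refcount is now 2 (local b, plus a.other)
    b.other = a   # a's refcount is now 2 (local a, plus b.other)
    # When this function returns, the locals disappear, but each object
    # still references the other: both refcounts stay at 1, never 0,
    # so pure reference counting never frees them.

gc.disable()      # turn off CPython's backup cycle collector
make_cycle()      # these objects are now unreachable, yet never freed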
However, while this is an undesirable outcome, it is not actually a memory safety issue. Going back to some terms we saw earlier in CS340, reference counting is "sound, but not complete".
In the context of memory management:
- Soundness is the property that there are no issues that could cause crashes or memory safety issues (e.g., use-after-free, double-free.)
- Completeness is the property that everything that can safely be freed is freed. This is a bit harder to define precisely: do we mean "freed immediately", or "eventually freed"? For now, we'll stick with a definition of "freed on the next run of the garbage collector".
Consider our earlier, incorrect attempt at a garbage collection strategy:
"Free objects when the function in which they were created returns."
Is this strategy sound? Is it complete?
(Think, then click!)
No, it is not sound, because of the counterexamples we saw previously (objects can live on and be used after that function returns, so it is not safe to free them). Using memory after a potential free is a memory safety issue.
It's also not complete (depending on our definition), because functions can have nested scopes within them after which an object is never used.
For example, consider a case where inside a conditional, we create an object that is never used outside of that conditional. This strategy would not free the object until the end of a potentially long-running function.
def f():
    if rare_case:
        c = Cat()  # c is never used after this block
    # otherwise, long-running work continues here
Tracing garbage collection
Recall that in tracing garbage collection, we avoid the issue with cycles by traversing reachable objects starting at the root set.
The core algorithm is:
- Starting at the root set, traverse the reference graph to get the reachable set of "live" objects.
- Free objects that are not reachable ("dead").
Mark and sweep
The simplest tracing GC implementation is mark-and-sweep.
- Mark: keep an extra few bits of metadata on each heap block. Traverse starting at the root set, marking blocks that you visit.
- Sweep: do a second, linear pass over the entire heap, freeing every block not marked in the previous step.
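Here's a sketch in Python-style pseudocode (the `heap` container, its `free` method, and the per-object `marked` bit are hypothetical runtime internals):

def mark_sweep(heap, root_set):
    # Mark: traverse the reference graph from the roots,
    # setting the mark bit on every block we visit.
    worklist = list(root_set)
    while worklist:
        obj = worklist.pop()
        if not obj.marked:
            obj.marked = True
            worklist.extend(obj.references)
    # Sweep: a second, linear pass over the ENTIRE heap,
    # freeing every block the mark phase didn't reach.
    for obj in heap:
        if obj.marked:
            obj.marked = False   # reset for the next collection
        else:
            heap.free(obj)       # unreachable ("dead"): reclaim it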
The downsides of this simple implementation are:
- It requires traversing the entire heap, not just the live objects: the sweep pass is O(size of the heap) rather than O(number of live objects) runtime.
What else? Hint: consider the problem from CS240's `malloc`, where we can't find a free block of the needed size even though there would be room if we moved data around.
(Think, then click!)
The second major problem is fragmentation: our still-in-use allocated blocks can form a patchwork that makes it hard to use memory optimally.
The problem of fragmentation is somewhat unavoidable in any language that allows programmers to access raw pointers (e.g., C) rather than higher-level references (e.g., Java). This is because in languages like C, we can't just move objects in the heap around, because it would change their underlying address.
Moving vs. non-moving
The basic mark-and-sweep algorithm is non-moving: it keeps allocated blocks at the same addresses in memory.
In higher-level languages with references, GCs can be moving GCs that compact/compress blocks as part of collection, substantially reducing fragmentation.
The high level idea is this:
- Find live objects via the same root set traversal idea as before.
- Copy all live objects, eliminating any space between objects.
This is the somewhat mind-blowing part: what is a little fishy about step 2?
(Think, then click!)
Where are we supposed to copy live objects to? The objects are in the heap, but need to be copied to another part of the heap. How do we not overwrite other objects?
This tension motivates a common type of moving GC: a "semi-space" GC.
Semi-space GC
The simplest semi-space GC is such a strange idea that at first it seems like it would not work:
- Divide the heap in half.
- Use only the first half. Then run a moving GC step, copying and condensing live objects into the second half.
- Consider the entire first half free.
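Here is a sketch of the copy step in Python (`bump_allocate` is a hypothetical bump-pointer allocator in the destination half; the `forwarding` table ensures shared objects and cycles are copied exactly once):

def semispace_collect(roots, to_space):
    forwarding = {}   # maps each already-copied object to its new copy

    def copy(obj):
        if obj not in forwarding:
            new = to_space.bump_allocate(obj)   # copies are packed: no gaps
            forwarding[obj] = new               # record before recursing,
            new.references = [copy(r) for r in obj.references]  # so cycles terminate
        return forwarding[obj]

    new_roots = [copy(r) for r in roots]
    return new_roots   # the entire from-space is now considered free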
To me, this idea is like deciding you don't like one room's decorations in your house, so you buy a new house and move all your furniture there!
It turns out that because this uses only twice the memory as before, a single constant multiplicative factor, this is often a reasonable tradeoff to make in practice.
This is especially true because collection is now conceptually O(number of live objects) instead of O(size of the heap).
Generational collectors
More complicated moving collectors take advantage of an adage that "most objects die young". This captures the idea that in programming, many objects are only used briefly, while a small proportion of objects might be used for almost the entire duration of the program. "Generational" GCs take advantage of this by further subdividing the heap into different areas for objects expected to live for different durations, and collecting from them at varying frequency.
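A highly simplified sketch of the layout, in Python (real generational collectors must also track references from the old generation into the nursery, which this omits):

class GenerationalHeap:
    def __init__(self):
        self.nursery = set()   # new objects start here; collected frequently
        self.old_gen = set()   # long-lived survivors; collected rarely

    def minor_collect(self, live_nursery_objects):
        # Objects that survive a nursery collection are "promoted":
        # having lived this long, we expect them to keep living.
        self.old_gen |= live_nursery_objects
        self.nursery.clear()   # everything else "died young"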
Tradeoffs in choosing a GC strategy
There is a tradeoff between how much extra memory a GC strategy uses and how much slower it is than optimal. Using only roughly the same space as an optimal, manually-managed program is orders of magnitude slower. If a GC can use 2x the memory, it's only 70% slower than ideal; this drops to 17% slower if a GC can use 3x the memory.
The key takeaway: for most applications, the memory safety and convenience offered by an automatic garbage collecting language is worth the additional memory usage and speed cost.
Cases where an automatic garbage collector may not be worth the performance costs:
- Very performance-sensitive settings, like operating systems, browsers, etc. These are the domains where C/C++ are still common.
- Embedded and real-time systems, where the fact that tracing GCs need to "stop the world" in order to collect is too unpredictable in terms of program timing.
These two settings have motivated the programming languages community to begin to popularize a different option.
Rust: semi-manual but safe, via its type system
Rust is a language that builds on many older, experimental programming languages from decades of research.
Rust:
- Mostly does step (1), allocation and size tracking, and step (3), freeing, for you.
- Is often as fast as C.
- Is memory safe (by default).
This might seem too good to be true, given the tradeoffs we have discussed! Rust is able to provide this via an advanced type system.
At a high level, Rust essentially builds on the idea of "free a variable when it goes out-of-scope", which could be from a function returning or from an if-statement-block ending. However, to avoid the safety issues in the counterexamples we saw previously, the Rust type system imposes strong restrictions on how different references can alias the same memory.
fn f() {
    let s = String::from("hello");
    // s goes out of scope here, so Rust frees it automatically
}
To make programs with these restrictions work in practice, Rust employs a "borrow checker" that essentially tracks an over-approximation of the lifetime of various references. The details are beyond the scope of this lecture, but come talk to me if you want to hear more!
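For a taste of these aliasing restrictions, here is a classic program the borrow checker rejects at compile time (pushing to the vector might reallocate its storage, which would leave `first` dangling):

fn main() {
    let mut v = vec![1, 2, 3];
    let first = &v[0];  // immutable borrow of v
    v.push(4);          // ERROR: cannot borrow `v` as mutable while the
                        // immutable borrow `first` is still in use below
    println!("{first}");
}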
Modeling memory management
For the next assignment, you will be modeling some aspects of automatic memory management (and the soundness and completeness properties) in Alloy.
Today, we'll start with a basic model for reference counting:
Reference counting
abstract sig Memory {
    var references: set HeapObject  // outgoing references; changes over time
}
one sig RootSet extends Memory {}   // stack variables, globals, etc.
sig HeapObject extends Memory {
    var refCount: Int               // reference counting metadata
}
var sig Allocated in HeapObject {}  // the currently-allocated heap objects
// an object is reachable if some chain of references from the root set hits it
pred reachableFromRootSet[o: HeapObject, refs: Memory->HeapObject] {
    o in RootSet.^refs
}
// soundness: everything reachable is still allocated (no use-after-free)
pred memorySafe[alloc: Allocated, refs: Memory->HeapObject] {
    all o: HeapObject | reachableFromRootSet[o, refs] => o in alloc
}
// completeness: everything still allocated is reachable (no leaks)
pred noLeaks[alloc: Allocated, refs: Memory->HeapObject] {
    all o: HeapObject | o in alloc => reachableFromRootSet[o, refs]
}
// invariant: each object's refCount matches its number of incoming references
fact refCounts {
    always {
        all o: HeapObject | {
            o.refCount = #(references.o)
        }
    }
}
pred heapChangedByProgrammer {
    // work in progress: constrains how the programmer may mutate references
}
run {} for 5
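For instance, once the transition predicates are filled in, one way to check soundness would be an assertion like this (a sketch; the assertion name is mine):

assert soundness {
    always memorySafe[Allocated, references]
}
check soundness for 5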
Addition to make it easier to see steps
One way to make it easier to identify which step the trace is currently taking is to add an explicit sig marking the options for steps:
abstract sig Step {}
one sig Programmer, RefCountFree, RefCountCascade, DoNothing extends Step {}
var one sig WhichStep in Step {}
We then constrain `WhichStep` within each predicate, for example:
pred heapChangedByProgrammer {
    Programmer = WhichStep
}
Now, the visualizer will always show which step is the current step within a trace, which can make debugging easier. This is an addition that might be helpful if you use Alloy for your final project.
Modeling a tracing, mark-and-sweep GC
What do we need to change about our work-in-progress reference counting model to instead model a simple mark-and-sweep tracing GC?
First, we'll remove the `refCount` field on the `HeapObject` sig.
Instead, what metadata do we now need on heap objects?
(Think, then click!)
We need to track whether a heap object is marked. Rather than adding a boolean to each heap object, we can just track this globally with a `Marked` set, similar to our previous `Allocated` sig.
sig HeapObject extends Memory {}
var sig Allocated in HeapObject {}
var sig Marked in HeapObject {}
We had 4 transition predicates for reference counting: `heapChangedByProgrammer`, `referenceCountingFrees`, `referenceCountingCascade`, and `doNothing`.
Which of these do we need to only slightly update for a mark-and-sweep GC, and which do we need to replace entirely? What should we replace them with?
(Think, then click!)
`heapChangedByProgrammer` and `doNothing` need to be only slightly updated. Both should be updated to include the new variable sig, `Marked`. In both, we can just say the marked set does not change:
Marked' = Marked
Because the marked set is only used by the active steps of the garbage collector, here we could also just say the set should be empty in these steps:
no Marked
The other two steps, `referenceCountingFrees` and `referenceCountingCascade`, should be replaced.
What would be a good model for what to replace them with?
The natural two steps in the algorithm: `mark` and `sweep`.
New `mark` and `sweep` transition predicates
pred mark {
    references' = references
    Allocated' = Allocated
    // mark exactly the objects reachable from the root set
    Marked' = RootSet.^references
}
pred sweep {
    no Marked'           // reset the mark bits for the next collection
    Allocated' = Marked  // only marked (reachable) objects survive
    // Remove every reference into or out of a swept (unmarked) object.
    // The three lines below are equivalent ways to say this (recall that
    // Allocated' = Marked); a real model needs only one of them:
    references' = references - ((HeapObject-Marked)->Memory) - (Memory -> (HeapObject-Marked))
    references' = references & ((Marked + RootSet)->Marked)
    references' = references & ((Allocated' + RootSet)->Allocated')
}
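Finally, as with the reference counting model, one might tie the transition predicates together into traces, sketched here under the assumption that we keep `heapChangedByProgrammer` and `doNothing` from before:

// every state takes exactly one of the four kinds of steps
fact traces {
    always (heapChangedByProgrammer or mark or sweep or doNothing)
}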