Techniques for Safe Garbage Collection in Rust
2024-01-13
- Rust and Garbage Collection
- Tracing and Interior Mutability
- Finding GC Roots with Generativity
- Lifetime Projection
- Putting It All Together
- Closing Thoughts
This will be the first in a series of posts describing different design decisions made writing two Rust crates:
- gc-arena - A system for safely fitting Rust with incremental garbage collection, used by Ruffle.
- piccolo - A stackless Lua runtime written in safe Rust with a focus on safe sandboxing.
The goal of this series of posts is not really to sell anyone on these crates,
or even the ideas that went into them, but to document why I[1]in the
case of gc-arena
, we
made the design decisions I did, and to touch
on a web of issues and potential solutions that might[2]might
be applicable to the larger Rust ecosystem.
This is going to be a pretty long post, and also very technical and pretty
dry. I'm not even going to cover every technique developed for gc-arena
, I'm
just aiming to cover the ideas that you are guaranteed to encounter when trying
to use it at all. I may come back and cover things like the details around
Collect
impls and GC "finalization" in a future post.
This post also doesn't really ask any big questions or present any big, unsolved
problems, it is meant mostly as an answer to a set of very common questions,
asking why exactly gc-arena
is the way it is. This is also an essential
pre-requisite to talk at all about piccolo
, so I'm just going to have to get
this out of the way first before a hopefully much more interesting discussion of
piccolo
itself.
Rust and Garbage Collection
Rust as she is spoke,[3]and
English for that
matter
at least currently, does not have "true" garbage collection.
The terms around this are a bit contentious,[4]It is not inaccurate to
refer to Rc
and Arc
as a form of garbage collection without cycle detection,
and some people prefer to view them as part of the same continuum.
but
what I mean is garbage collection with cycle detection, the kind of which allows
you to more or less stop thinking about ownership and instead only think
about reachability. The fact that Rust did not ship with this feature is maybe
unsurprising in retrospect, since it is at odds in several ways with its design
ethos,[5]ownership semantics, mutability XOR aliasing, and minimal
behind-the-scenes runtime
but this wasn't always the plan! Rust used to
have built-in garbage-collected
managed pointers,
and there has been at least
some discussion
on bringing them back. gc-arena
represents my[6]I can no longer take
sole credit for gc-arena
, it has other major contributors since being used in
the Ruffle project, and in fact I have those contributors to thank for me even
continuing to work on it at all. More on this later.
attempt to
retrofit Rust with these types of garbage-collected pointers purely as a
library.
I'm far from the first person to attempt this, I think
this
post by Manish Goregaokar is the best starting
point I know of to read about some of the landscape, gc-arena
is featured in
that article as one point in a complex design space, and I'm only going to cover
the part of the space I'm familiar with.
gc-arena
exists because piccolo
(formerly "luster") needed a garbage
collector, and I had a very specific set of requirements in mind when I started:
- I was heavily borrowing[7]shamelessly stealing
much
of the interpreter design from the Ur Lua (The version
developed at PUC-Rio), and my chief goal was that
piccolo
should never be somehow doomed to be forever slower or inferior to PUC-Rio Lua. What I mean by this is complex and I hope to cover in a later post, but for now, what I mean is that, like PUC-Rio Lua and all of its forks, garbage-collected pointers needed to be zero-cost plain pointers, which in Rust terms means that they areCopy
and don't include bookkeeping during normal use. This was important for the stack-based design of the Lua VM, which was going to be inspired[8]stolen from PUC-Rio Lua. piccolo
needed to be written in (almost) entirely safe Rust. It exists as an experiment to see how far I could go writing a Lua runtime with provable memory safety and performance that was not fundamentally any slower than an equivalent runtime without memory safety.
These two requirements together, that garbage-collected pointers are (when
ignoring the actual process of garbage collection) free, and that the
runtime needed to be majority memory safe,[9]What I mean by this
is much more complex but it's too much to go into here. It needed to have no
infectious unsafety like PUC-Rio Lua has. Normal C code is always
memory unsafe, PUC-Rio Lua and her forks are on an entirely
different plane of existence,
my goal was to avoid things like this.
meant that none of the
existing designs I could find would work.[10]at least as of late 2018,
when these projects were originally conceived
I found a way forward
to satisfy both of these requirements, and with a garbage collector very similar
to the incremental collector in PUC-Rio Lua,[11]Because I stole the
design outright. Stealing is easy and fun!
and thus gc-arena
was
born.
Tracing and Interior Mutability
Designing a tracing garbage collector library for Rust with a safe interface is challenging. Usually, most of the difficulty orbits around the problems of correctly tracing values which hold managed (garbage-collected) pointers and deterministically knowing the set of garbage collector "roots" (the places where some non-managed value holds a pointer to a managed value, and thus garbage-collected values that are known to be externally reachable).[12]These problems often arise when retrofitting garbage collection to systems programming languages generally, the lack of a language runtime or some kind of machinery to know where managed pointers could be held often necessitates conservative stack / heap scanning, which isn't great.
The best way to explore these problems and how gc-arena
solves them is with
an example, and in this post we're going to slowly build up a doubly-linked
list data structure, covering different issues as they come up. We'll start with
defining a single Node
of this doubly-linked list:
use ;
// Node of a doubly-linked list data structure.
//
// Garbage collected pointers in `gc-arena` are called `Gc`, and all carry a 'gc
// lifetime with them. Don't worry too much about 'gc yet, it will be explained
// in detail in just a bit.
//
// The `prev` and `next` fields are plain pointers with no implicit bookkeeping.
// The garbage collector ensures that all unreachable pointers are freed (and
// their values dropped).
//
// When using `gc-arena`, especially since it is focused on single-threaded
// use, types can feel like they belong in a different language entirely. Often
// everything will implement `Copy`, and back or even self references abound.
// This makes sense, as `gc-arena` was designed to implement runtimes for
// languages where this is the norm.
We'll put the problem of deterministically finding roots aside for the moment.
Assuming we know the GC roots, we need to be able to trace every possible
reachable garbage collected pointer from them. This is a safety issue, if we
were to miss any reachable pointer, we might improperly collect it and this
could lead to a UAF (Use After Free). gc-arena
handles this requirement with
its
Collect
trait, which all garbage-collectable types must implement. The trait is unsafe
to implement, because implementations must ensure that every reachable pointer
is traced.
// `unsafe`, because implementing this trait incorrectly can lead to UB.
unsafe
However, this trait can safely be implemented via proc-macro! As long as the
implementation requires that every field itself implement Collect
and the
generated Collect::trace
implementation calls the Collect::trace
method of
every field, nothing can go wrong. The end result looks like this, which allows
defining and using custom garbage collected types without needing any unsafe
code at all!
use ;
// Safely implements the `Collect` trait (only when `T: Collect`) via procedural
// derive.
//
// There is another issue that I haven't covered yet, and it might be pretty
// obvious what it is from the `#[collect(no_drop)]` attribute here.
The only remaining problem is the Drop
trait... since objects can contain
circular references, there is no order in which we can drop them such that every
object would contain only un-dropped pointers. If we allowed users to implement
Drop::drop
for Collect
types, they could cause memory unsafety by accessing
these dropped values. This is easy to see with our linked list example, if there
was a Drop
implementation on our Node
type that accessed either prev
or
next
, in a circularly linked list that would mean that there would be no
possible safe order to drop our nodes in! gc-arena
solves this by forbidding
Drop
impls on types using auto-generated Collect
impls using a trick,[13]which I stole from
pin-project. Fun and easy!
internal to gc-arena
, there is a trait called MustNotImplDrop
that looks
like this:
Auto-generated Collect
impls (without an unsafe buy-in to allow custom Drop
impls) generate an impl for MustNotImplDrop
, thus, if the user tries to
implement Drop
for their type, this causes a conflicting implementation and a
compiler error.
gc-arena
is also an incremental collector, which poses a new problem...
Imagine that we have already traced an object during our incremental tracing
operation, then we allow the user to mutate that object. If the user puts a
pointer to an un-traced object inside the already traced object, we now have
an object which is marked as having been traced pointing to an object that
has not been traced. This also can lead to dangling pointers, so we need
to protect against this somehow. gc-arena
refers to this as the color
invariant, a "black" object (fully traced), may not point to a "white" object
(untraced).
The solution lies again with the Collect
trait. Gc
pointers are always
presumed to be aliased (they have no bookkeeping to prove otherwise) so Gc
provides only shared references to its inner type. Any mutation for garbage
collected values must come solely from types that provide interior mutability
through a shared reference. Such types are not allowed to implement Collect
directly,[14]They also can't implement Collect
via procedural derive
because we purposefully do not include Collect
impls for interior mutability
primitives. In order to implement Collect
for a type with its own interior
mutability, you must have an unsafe Collect
impl which asserts that you follow
the write barrier contract.
instead gc-arena
provides its own
versions of interior mutability (Lock
instead of Cell
, RefLock
instead of
RefCell
) which always require "write barriers" to enable mutation. A "write
barrier" is jargon from garbage collection literature, in our case it is
always a backwards write barrier (but we may add forward write barriers in the
future). If a "black" object is written to, it gets put back into the queue
of to-be-traced objects ("gray" objects, specifically the "gray again" list),
meaning it will be re-traced. In this way, we can't lose track of any reachable
objects. (There is a great
talk about the PUC-Rio garbage
collector which implements the same[15]because that's where I stole it
from
incremental design (as well as his newer generational collector)
by the creator of Lua, Roberto Ierusalimschy.)
This lack of interior mutability by default means that our previous linked list
Node
example is woefully incomplete! As it is written, we would never be able
to mutate any nodes once they were placed in a Gc
, so the data structure as
defined above would be almost useless. We need to enable safe mutation by using
one of the gc-arena
types which provide it:
use ;
// More realistic node type of a doubly-linked list data structure with safe
// interior mutability.
//
// We use `RefLock` here, but if we knew `T: Copy`, we could also use `Lock`.
type NodePtr<'gc, T> = ;
Now, we're able to safely mutate values of this type that live inside Gc
pointers:
// Set the `value` field of a `Node<'gc, T>`.
//
// This example is much more verbose than what you would actually write in order
// to fully illustrate the machinery at work. There are convenience methods on
// `Gc<Lock<_>>` and `Gc<RefLock<_>>` that allow you to skip interacting with
// the `Write` marker at all, which is normally what you'd use.
This strategy of using a custom unsafe trait with safe proc-macro generation
for tracing is not new, it was used by Manish in his
rust-gc library, from which gc-arena
heavily borrows.[16]steals... you might be noticing a pattern
rust-gc
calls its Collect
trait Trace
, and has a similar system to manage
otherwise unsafe mutability, where internally mutable types do not implement
Trace
directly for safety (though since rust-gc
is not an incremental
collector, the mutability safety problems are slightly different).
The Collect
trait forms one of the two pillars of gc-arena
's soundness
story, but it is only half of the story. We need to be able to deterministically
find all of the garbage collector's roots, AND to make sure that no internal
mutability occurs during incremental collection. For this we use another
trick...
Finding GC Roots with Generativity
There is an accidental feature of the Rust borrow checker that introduces something called "generativity" into Rust. This was, AFAIK, first described in Alexis Beingessner's Master's thesis,[17]You can find the description under the section "Hacking Generativity into Rust", right before she says "we really don’t recommend doing this". and you can see her thesis for probably a much better explanation, but I'll try my best to explain the concept very briefly. "Generativity" as it exists in Rust allows us a sort of very limited form of dependent typing: each time a closure is executed at runtime, a brand new type (in this case, a Rust lifetime) is generated that cannot be unified with any other generated instance. Of course, the Rust compiler and thus any notion of types does not exist at runtime, so this is only one way to look at what is happening. Really, the technique can be boiled down to a simple fact about lifetimes in Rust function types: you can construct a function that must work for any given lifetime, so every time such a function is called, the body of such a function only knows that there exists some lifetime that must satisfy its signature and nothing else, so the lifetime from one call of a function cannot be unified with any other. This technique has seen some existing use in Rust, there's even a crate for it,[18]It doesn't use the closure lifetime trick, but it uses something equivalent. but it is not widely used because frankly it can be a bit of a pain in the butt.
This trick is sometimes also called "lifetime branding", which is a name more
appropriate for how it is used in gc-arena
: if you have a generative lifetime
'id
, any type "branded" with this 'id
cannot ever be unified with another
type branded with a different 'id
. In gc-arena
this lifetime is called
'gc
, and it brands all garage-collected values (garbage-collected Gc
pointers and anything transitively containing them).
Describing this trick in thick jargon like this is not very useful, it's much
easier to understand when we look concretely at how it is actually used in
gc-arena
.
The word "arena" is commonly used in systems programming to refer to a technique of allocating memory from a single pool such that every allocation shares the same lifetime. The name "arena" is meant to be evocative of some kind of enclosure, all allocations are kept enclosed in an "arena" and ultimately share the same fate: allocations will always have a lifetime tied to their parent arena, and they are all freed when the arena is destroyed (or reset, depending on how the arena works). There is a great explanation of this "arena technique" as it appears in Rust on Manish's blog, allocated references returned from arenas all share some kind of 'arena lifetime such that all of the allocations can be invalidated at the same point, when the whole arena is reset.
The "arena" in "gc-arena" means sort of the same thing, that allocated pointers
are kept in an enclosure and may not escape it, and all of them must share
the same fate of being forever bound inside the arena they were birthed in.[19]Sometimes programmer-speak can be a bit too evocative.
gc-arena
is different in that pointers do not necessarily have to live for
the entire lifetime of the arena, only reachable pointers, but the idea is
similar enough to other arena allocators that I picked the name "gc-arena".
gc-arena
is also different from other arena allocators in specifically how
it uses its arena lifetime (which gc-arena
calls 'gc
). Rather than only
ensuring that pointers do not outlive some moment in time where every arena
pointer is freed, gc-arena
uses the same technique repeatedly over the
lifetime of the same arena, ensuring that pointers must not escape a single
unit of mutation. "Mutation" here is garbage collector jargon, and simply means
"things that are not collection", which is, well, any actual work your program
will do. Since useful work may mutate the object graph, everything useful
your program actually does is lumped together and called "mutation".[20]Sometimes programmers also can get tunnel vision.
The Gc
pointer in gc-arena
looks something like this:
/// A garbage collected pointer to a type `T`.
We have a simple pointer value (Gc
implements a slew of traits, but notably it
is always Copy
) and a zero-size marker type that both binds the Gc
pointer
to 'gc
and also makes it invariant[21]You can see a great
explanation of lifetime subtyping and variance in the Rustonomicon
here.
over 'gc
(more on this in a second). Since every Gc
pointer is "branded" with this
'gc
lifetime, we can leverage the power of the Rust borrow checker to tightly
control their lifetime.
The top-level type in gc-arena
is called Arena
, and it is parameterized
over an R
which is the arena's root type.[22]Sort of, I'll explain
in a bit
Everything that the arena will ever store must fit into this
single root type, so all long-term reachable pointers must always be reachable
somehow from this single root.[23]This is restrictive, but there are
things like
DynamicRootSet
which make this less restrictive than it sounds.
It provides methods
to mutate this single root, and by using lifetime branding, we can prove that
in-between calls to mutate, every reachable pointer must be reachable from
the root.
Whew, this is a lot setup to describe the main idea of gc-arena
, and this will
make a lot more sense with a concrete example. We'll use the linked list example
from before and decide that a NodePtr<'gc, i32>
should be our root type. I'm
skipping the part where we actually construct the arena because this requires
one final trick which I haven't talked about yet, but a full example will be
provided by the end, I promise.[24]Bare with me for a moment.
Access to an arena is tightly controlled in single units of mutation. This is not as complex as it sounds, what it effectively means is that in order to access an arena, you just need to call a method on the arena which takes a callback:
The signature for Arena::mutate
looks like this:
The most important part of the callback signature is the for<'gc>
part, which
is known as Higher Rank Trait Bounds or HRTBs, and it is how you do "universal
quantification" for a lifetime in Rust. The Rustonomicon
explains it better, but
for<'gc>
means "for all choices of 'gc
",[25]aka
universal quantification
and it requires that the callback be valid for any possible choice
of 'gc
. This means that the callback effectively can know nothing about
the 'gc
lifetime at all, so the borrow checker must prevent types with this
lifetime from escaping out of the single callback call. After all, we could
choose an arbitrarily small 'gc
, so any escape from the callback at all must
be treated as a borrow checker error. This is the key trick for "generativity"
that was discussed at the beginning of this section, and you can check Gankra's
thesis for a more complete explanation.
// This will cause some kind of "lifetime may not live long enough" error,
// because any type with 'gc lifetime is prevented from escaping the arena.
//
// The error message may not be *good*, but we know and can rely on this
// always triggering an error.
//
// BAD, SHOULD NOT COMPILE!
let escape = arena.mutate;
We've also made a big deal about ensuring that Gc
pointers are invariant
over the 'gc
branding lifetime, and here's why we do that: If Gc
pointers
were covariant over 'gc
, which we might accidentally do if we were not
careful, we could accidentally allow the user to shorten 'gc
which could
still cause problems:
So FINALLY we can put all the pieces together... IF we know that all garbage
collector pointers are branded by 'gc
AND we know that we can only access any
garbage collected pointers through a callback which is universally quantified
over 'gc
, THEN we can finally prove all of the properties we need for safe
garbage collection:
WHEW! All that is left is to explain one last trick that allows us to define our root type and thus the type of the arena itself.
Lifetime Projection
The last piece of the puzzle is how to define Arena
itself. I've talked about
Arena
having a root type, and conceptually this is not at all difficult
to understand. During calls to Arena::mutate
, this is the type of the root
callback parameter:
The problem is the 'gc
lifetime. As we have covered, the 'gc
lifetime
is generative, meaning, at least conceptually, that every invocation of
Arena::mutate
must get its own unique 'gc
parameter that can never be
unified with any other:
Besides this, there is another practical problem: the type of the arena
itself
should not have any sort lifetime associated with it at all. It is, after
all, a normal value and doesn't hold any borrows, and if it was branded with a
'gc
lifetime and then that lifetime was used in mutation callbacks, that would
allow pointers to escape the callbacks (assuming they didn't outlive the arena
itself at least), breaking our "mutation XOR collection" soundness system.
What we actually need is some kind of 'static
type that if we give it a
'gc
, we get back whatever our root type is with that 'gc
applied to it.
Hey.. that kinda sounds like a GAT!
So what we need to give Arena
is some kind of type that it can substitute
different 'gc
lifetimes into, enabling it to use lifetime generativity to
protect pointers from leaking out of calls to Arena::mutate
, and indeed, a GAT
fits the bill here:
// We could define some kind of `Rootable` trait that the `R` parameter on
// `Arena` must implement.
This Rootable
trait does, in fact, work perfectly to let us define Arena
types! This is not what gc-arena
does, which I'll get to in just a second,
but there's no reason this couldn't work instead. Going back to our concrete
linked list example, we could define a Rootable
impl this way:
;
and then we could create a new arena of type Arena<NodeRootable>
like this:
Bingo! this is the last piece we need to complete a workable gc-arena
design.
I sometimes refer to this idea of globally substituting a 'gc
lifetime as
"lifetime projection", but it is nothing more complicated than what a GAT
provides. BUT, like I said, this is not exactly what gc-arena
actually does,
so if it does not do this, what DOES it do, and why?
The biggest reason that gc-arena
does not use GATs to define root types
is that gc-arena
predates stable GATs by almost four years[26]gc-arena
didn't start by having a nice API for this either, at the beginning
you had to declare Arena
types with a macro instead and it was horrible.
, but there is another very good reason we do not do this. First, I'll
show you what the Rootable
type actually looks like in gc-arena
:
// Rather than use a GAT, we use a regular ol' lifetime parameter on a trait.
//
// Way before proper GATs were stabilized, there was sort of an "open secret"
// in Rust that GATs were already sort of... """available""" with trickery.
// However, the poor rustc compiler would get um... upset... if you tried to
// use them in anything other than simplistic ways. Here is one of those tricks!
// Rather than use a GAT, you can use a lifetime parameter on a trait, combined
// with a trait bound like this:
//
// `where R: for<'gc> Rootable<'gc>`
//
// Once you do this, instead of referring to `<R as Rootable>::Root<'gc>`, you
// can instead refer to `<R as Rootable<'gc>>::Root` to get basically the same
// result! At the time this feature went into `gc-arena`, rustc had calmed down
// enough to make the technique workable even without fully stabilized proper
// GATs.
//
// The 'static bound is not super important, it's an extra convenience to save
// having to write `R: 'static` bounds in a lot of places.
// Since `<R as Rootable<'gc>>::Root` is a lot to type, we have a type alias
// for it. This is where the `Root<'gc, R>` in the signature for `Arena::mutate`
// comes from.
pub type Root<'gc, R> = Root;
But so far this is just really a GAT with more steps. The real reason that we have not moved off of this technique is this whopper of a cool hack:[27]which I absolutely did not invent
// Instead of requiring the user to declare a new type that implements
// `Rootable` for every new `Arena` type they want to declare, we instead abuse
// trait objects.
//
// Since every trait object always implements its own trait, by simply
// *referring* to a trait object for `Rootable`, we automatically get a type
// that implements this trait without having to declare any additional new
// types, and every identical invocation of `Rootable!` will always refer to the
// same type! Wow!!
//
// Currently, traits with GATs are not object safe (*yet*, this will be fixed).
// However, our Cheater McCheaterson GATs sure *are* object safe!
There's
one
or
two
more small pieces that I have not covered here for the
real
Rootable!
macro, but these details are truly not that important. They
are but small QoL improvements over what I have just shown, which is the
most important part. The Rootable!
macro is also essential in understanding
how to actually use gc-arena
, because without it, using Arena
or
DynamicRootSet
is extremely annoying, and most example code will use it
heavily.
Finally, here is what the actual declaration of an Arena
that holds a
NodePtr<'gc, i32>
would actually look like in practice:
Putting It All Together
Okay, after ALL of this explanation, we can finally see a fully worked example of our doubly-linked list, and understand each piece of it and how it fits together...
use ;
// We define a node of a doubly-linked list data structure.
//
// `Collect` is derived procedurally, meaning that we can't mess up and forget
// to trace our inner `prev`, `next`, or `value`.
// For safety, we agree to not implement `Drop`. We could also use
// `#[collect(unsafe_drop)]` or `#[collect(require_static)]` (if our type were
// 'static) here instead.`
// By default, `Collect` types (other than 'static types) cannot have interior
// mutability. In order to provide safe mutation, we need to use `gc-arena`
// specific types to provide it which guarantee that write barriers are invoked.
//
// We use `RefLock` here as an alternative to `RefCell`.
type NodePtr<'gc, T> = ;
// Create a new `Node` and return a pointer to it.
//
// We need to pass the `&Mutation<'gc>` context here because we are mutating the
// object graph (by creating a new "object" with `Gc::new`).
// Join two nodes together, setting the `left` node's `next` field to `right`,
// and the `right` node's `prev` field to `left`.
//
// Again, we are mutating the object graph, so we must pass in the
// `&Mutation<'gc>` context.
// Use a `NodePtr` as a cursor, move forward through a linked list by following
// `next` pointers.
//
// Returns `true` if there was a `next` pointer and the target node has been
// changed.
// Use a `NodePtr` as a cursor, move backward through a linked list by following
// `prev` pointers.
//
// Returns `true` if there was a `prev` pointer and the target node has been
// changed.
Closing Thoughts
Thank you for reading this far, I hope you found this tour of the design of
gc-arena
interesting. It is a crate whose use requires understanding several
different unusual techniques all at one time, and explaining each piece is
difficult in isolation. Hopefully this post can serve as a good introduction to
someone who wants to use it in the future.
I understand that using gc-arena
comes with a pretty high cognitive and
usability overhead. In particular, satisfying the "mutability XOR collection"
design by mutating via callback can be particularly challenging, and I will
cover this much more in future posts about piccolo
. However, not many
languages can even be used to implement garbage collection at all, and even
fewer could hope to do so with provable memory safety. In many ways, trying
to design "safe, isolated garbage collection as a library" is wading through
uncharted territory, so I hope you can view gc-arena
's rough edges with this
in mind.
In future posts, I want to talk about the way that I wish gc-arena
could
work if there were enough features in Rust to make it possible, using async
and .await
to express GC "safe points", but that discussion is for another
time.
Bye!