Techniques for Safe Garbage Collection in Rust

2024-01-13

This will be the first in a series of posts describing different design decisions made writing two Rust crates:

gc-arena - A system for safely fitting Rust with incremental garbage collection, used by Ruffle.
piccolo - A stackless Lua runtime written in safe Rust with a focus on safe sandboxing.

The goal of this series of posts is not really to sell anyone on these crates, or even the ideas that went into them, but to document why I^[1]in the case of gc-arena, we made the design decisions I did, and to touch on a web of issues and potential solutions that might^[2]might be applicable to the larger Rust ecosystem.

This is going to be a pretty long post, and also very technical and pretty dry. I'm not even going to cover every technique developed for gc-arena, I'm just aiming to cover the ideas that you are guaranteed to encounter when trying to use it at all. I may come back and cover things like the details around Collect impls and GC "finalization" in a future post.

This post also doesn't really ask any big questions or present any big, unsolved problems, it is meant mostly as an answer to a set of very common questions, asking why exactly gc-arena is the way it is. This is also an essential pre-requisite to talk at all about piccolo, so I'm just going to have to get this out of the way first before a hopefully much more interesting discussion of piccolo itself.

Rust and Garbage Collection

Rust as she is spoke,^[3]and English for that matter at least currently, does not have "true" garbage collection. The terms around this are a bit contentious,^[4]It is not inaccurate to refer to Rc and Arc as a form of garbage collection without cycle detection, and some people prefer to view them as part of the same continuum. but what I mean is garbage collection with cycle detection, the kind of which allows you to more or less stop thinking about ownership and instead only think about reachability. The fact that Rust did not ship with this feature is maybe unsurprising in retrospect, since it is at odds in several ways with its design ethos,^[5]ownership semantics, mutability XOR aliasing, and minimal behind-the-scenes runtime but this wasn't always the plan! Rust used to have built-in garbage-collected managed pointers, and there has been at least some discussion on bringing them back. gc-arena represents my^[6]I can no longer take sole credit for gc-arena, it has other major contributors since being used in the Ruffle project, and in fact I have those contributors to thank for me even continuing to work on it at all. More on this later. attempt to retrofit Rust with these types of garbage-collected pointers purely as a library.

I'm far from the first person to attempt this, I think this post by Manish Goregaokar is the best starting point I know of to read about some of the landscape, gc-arena is featured in that article as one point in a complex design space, and I'm only going to cover the part of the space I'm familiar with.

gc-arena exists because piccolo (formerly "luster") needed a garbage collector, and I had a very specific set of requirements in mind when I started:

I was heavily borrowing^[7]shamelessly stealing much of the interpreter design from the Ur Lua (The version developed at PUC-Rio), and my chief goal was that piccolo should never be somehow doomed to be forever slower or inferior to PUC-Rio Lua. What I mean by this is complex and I hope to cover in a later post, but for now, what I mean is that, like PUC-Rio Lua and all of its forks, garbage-collected pointers needed to be zero-cost plain pointers, which in Rust terms means that they are Copy and don't include bookkeeping during normal use. This was important for the stack-based design of the Lua VM, which was going to be inspired^[8]stolen from PUC-Rio Lua.
piccolo needed to be written in (almost) entirely safe Rust. It exists as an experiment to see how far I could go writing a Lua runtime with provable memory safety and performance that was not fundamentally any slower than an equivalent runtime without memory safety.

These two requirements together, that garbage-collected pointers are (when ignoring the actual process of garbage collection) free, and that the runtime needed to be majority memory safe,^[9]What I mean by this is much more complex but it's too much to go into here. It needed to have no infectious unsafety like PUC-Rio Lua has. Normal C code is always memory unsafe, PUC-Rio Lua and her forks are on an entirely different plane of existence, my goal was to avoid things like this. meant that none of the existing designs I could find would work.^[10]at least as of late 2018, when these projects were originally conceived I found a way forward to satisfy both of these requirements, and with a garbage collector very similar to the incremental collector in PUC-Rio Lua,^[11]Because I stole the design outright. Stealing is easy and fun! and thus gc-arena was born.

Tracing and Interior Mutability

Designing a tracing garbage collector library for Rust with a safe interface is challenging. Usually, most of the difficulty orbits around the problems of correctly tracing values which hold managed (garbage-collected) pointers and deterministically knowing the set of garbage collector "roots" (the places where some non-managed value holds a pointer to a managed value, and thus garbage-collected values that are known to be externally reachable).^[12]These problems often arise when retrofitting garbage collection to systems programming languages generally, the lack of a language runtime or some kind of machinery to know where managed pointers could be held often necessitates conservative stack / heap scanning, which isn't great.

The best way to explore these problems and how gc-arena solves them is with an example, and in this post we're going to slowly build up a doubly-linked list data structure, covering different issues as they come up. We'll start with defining a single Node of this doubly-linked list:

use gc_arena::{Collect, Gc};

// Node of a doubly-linked list data structure.
//
// Garbage collected pointers in `gc-arena` are called `Gc`, and all carry a 'gc
// lifetime with them. Don't worry too much about 'gc yet, it will be explained
// in detail in just a bit.
//
// The `prev` and `next` fields are plain pointers with no implicit bookkeeping.
// The garbage collector ensures that all unreachable pointers are freed (and
// their values dropped).
//
// When using `gc-arena`, especially since it is focused on single-threaded
// use, types can feel like they belong in a different language entirely. Often
// everything will implement `Copy`, and back or even self references abound.
// This makes sense, as `gc-arena` was designed to implement runtimes for
// languages where this is the norm.
#[derive(Copy, Clone)]
struct Node<'gc, T: 'gc> {
    prev: Option<Gc<'gc, Node<'gc, T>>>,
    next: Option<Gc<'gc, Node<'gc, T>>>,
    value: T,
}

We'll put the problem of deterministically finding roots aside for the moment. Assuming we know the GC roots, we need to be able to trace every possible reachable garbage collected pointer from them. This is a safety issue, if we were to miss any reachable pointer, we might improperly collect it and this could lead to a UAF (Use After Free). gc-arena handles this requirement with its Collect trait, which all garbage-collectable types must implement. The trait is unsafe to implement, because implementations must ensure that every reachable pointer is traced.

// `unsafe`, because implementing this trait incorrectly can lead to UB.
unsafe impl<'gc, T: Collect> Collect for Node<'gc, T> {
    fn trace(&self, cc: &Collection) {
        self.prev.trace(cc);
        // If we were to forget to trace a field here, for example the `next`
        // field, then the garbage collector might believe that `next` is
        // unreachable and drop it. This could lead to a UAF, so `Collect`
        // *must* be correctly implemented for every garbage collected type
        // for safety.
        self.next.trace(cc);
        self.value.trace(cc);
    }
}

However, this trait can safely be implemented via proc-macro! As long as the implementation requires that every field itself implement Collect and the generated Collect::trace implementation calls the Collect::trace method of every field, nothing can go wrong. The end result looks like this, which allows defining and using custom garbage collected types without needing any unsafe code at all!

use gc_arena::{Collect, Gc};

// Safely implements the `Collect` trait (only when `T: Collect`) via procedural
// derive.
//
// There is another issue that I haven't covered yet, and it might be pretty
// obvious what it is from the `#[collect(no_drop)]` attribute here.
#[derive(Copy, Clone, Collect)]
#[collect(no_drop)]
struct Node<'gc, T: 'gc> {
    // Since `Gc` internally holds a `NonNull`, `Option<Gc<_>>` is pointer
    // sized!
    prev: Option<Gc<'gc, Node<'gc, T>>>,
    next: Option<Gc<'gc, Node<'gc, T>>>,
    value: T,
}

The only remaining problem is the Drop trait... since objects can contain circular references, there is no order in which we can drop them such that every object would contain only un-dropped pointers. If we allowed users to implement Drop::drop for Collect types, they could cause memory unsafety by accessing these dropped values. This is easy to see with our linked list example, if there was a Drop implementation on our Node type that accessed either prev or next, in a circularly linked list that would mean that there would be no possible safe order to drop our nodes in! gc-arena solves this by forbidding Drop impls on types using auto-generated Collect impls using a trick,^[13]which I stole from pin-project. Fun and easy! internal to gc-arena, there is a trait called MustNotImplDrop that looks like this:

pub trait MustNotImplDrop {}

impl<T: Drop> MustNotImplDrop for T {}

Auto-generated Collect impls (without an unsafe buy-in to allow custom Drop impls) generate an impl for MustNotImplDrop, thus, if the user tries to implement Drop for their type, this causes a conflicting implementation and a compiler error.

gc-arena is also an incremental collector, which poses a new problem... Imagine that we have already traced an object during our incremental tracing operation, then we allow the user to mutate that object. If the user puts a pointer to an un-traced object inside the already traced object, we now have an object which is marked as having been traced pointing to an object that has not been traced. This also can lead to dangling pointers, so we need to protect against this somehow. gc-arena refers to this as the color invariant, a "black" object (fully traced), may not point to a "white" object (untraced).

The solution lies again with the Collect trait. Gc pointers are always presumed to be aliased (they have no bookkeeping to prove otherwise) so Gc provides only shared references to its inner type. Any mutation for garbage collected values must come solely from types that provide interior mutability through a shared reference. Such types are not allowed to implement Collect directly,^[14]They also can't implement Collect via procedural derive because we purposefully do not include Collect impls for interior mutability primitives. In order to implement Collect for a type with its own interior mutability, you must have an unsafe Collect impl which asserts that you follow the write barrier contract. instead gc-arena provides its own versions of interior mutability (Lock instead of Cell, RefLock instead of RefCell) which always require "write barriers" to enable mutation. A "write barrier" is jargon from garbage collection literature, in our case it is always a backwards write barrier (but we may add forward write barriers in the future). If a "black" object is written to, it gets put back into the queue of to-be-traced objects ("gray" objects, specifically the "gray again" list), meaning it will be re-traced. In this way, we can't lose track of any reachable objects. (There is a great talk about the PUC-Rio garbage collector which implements the same^[15]because that's where I stole it from incremental design (as well as his newer generational collector) by the creator of Lua, Roberto Ierusalimschy.)

This lack of interior mutability by default means that our previous linked list Node example is woefully incomplete! As it is written, we would never be able to mutate any nodes once they were placed in a Gc, so the data structure as defined above would be almost useless. We need to enable safe mutation by using one of the gc-arena types which provide it:

use gc_arena::{Collect, Gc, lock::RefLock};

// More realistic node type of a doubly-linked list data structure with safe
// interior mutability.
//
// We use `RefLock` here, but if we knew `T: Copy`, we could also use `Lock`.

#[derive(Copy, Clone, Collect)]
#[collect(no_drop)]
struct Node<'gc, T: 'gc> {
    prev: Option<NodePtr<'gc, T>>,
    next: Option<NodePtr<'gc, T>>,
    value: T,
}

type NodePtr<'gc, T> = Gc<'gc, RefLock<Node<'gc, T>>>;

Now, we're able to safely mutate values of this type that live inside Gc pointers:

// Set the `value` field of a `Node<'gc, T>`.
//
// This example is much more verbose than what you would actually write in order
// to fully illustrate the machinery at work. There are convenience methods on
// `Gc<Lock<_>>` and `Gc<RefLock<_>>` that allow you to skip interacting with
// the `Write` marker at all, which is normally what you'd use.
fn node_set_value<'gc, T: Copy>(
    // This is one of the "context" types provided by `gc-arena`. This is almost
    // always required to safely mutate object graph, more on this in a bit.
    mc: &Mutation<'gc>,
    // The node pointer we want to mutate.
    node_ptr: NodePtr<'gc, T>,
    // The value we're going to set.
    value: T
) {
    // Invoke the "write barrier". If the `node` pointer is currently
    // "black" (marked as fully traced), this will make it "gray" again. This
    // is what allows us to safely mutate the pointer, so `gc-arena` must
    // assume that the object may adopt new pointers which might be "white",
    // invalidating the color invariant, so the object *must* be re-traced. This
    // is a "backwards" write barrier. In the future, we should have a way to
    // mark a pointer being *adopted* as gray immediately, also satisfying the
    // color invariant. This would be a "forward" write barrier.
    //
    // The type that this method returns is a bit gnarly (It's
    // `&'gc Write<RefLock<Node<'gc, T>>>` for reference). There are several
    // internal tricks involved which are not important right now for why the
    // type signature can safely look like this, but the important part is the
    // `Write` marker. This is a `#[repr(transparent)]` wrapper type that marks
    // a *reference* from the interior of the object graph as safely mutable,
    // because we know the write barrier has been triggered on its owning `Gc`
    // pointer.
    let write = node_ptr.write(mc);

    // Now that we have a `Write` reference, we can safely call `Write::unlock`
    // and get back a `&RefCell<Node<'gc, T>>`.
    let ref_cell = write.unlock(mc);

    // And we can safely use it like any other `RefCell` type, including
    // potentially adopting new pointers.
    ref_cell.borrow_mut().value = value;
}

This strategy of using a custom unsafe trait with safe proc-macro generation for tracing is not new, it was used by Manish in his rust-gc library, from which gc-arena heavily borrows.^[16]steals... you might be noticing a pattern rust-gc calls its Collect trait Trace, and has a similar system to manage otherwise unsafe mutability, where internally mutable types do not implement Trace directly for safety (though since rust-gc is not an incremental collector, the mutability safety problems are slightly different).

The Collect trait forms one of the two pillars of gc-arena's soundness story, but it is only half of the story. We need to be able to deterministically find all of the garbage collector's roots, AND to make sure that no internal mutability occurs during incremental collection. For this we use another trick...

Finding GC Roots with Generativity

There is an accidental feature of the Rust borrow checker that introduces something called "generativity" into Rust. This was, AFAIK, first described in Alexis Beingessner's Master's thesis,^[17]You can find the description under the section "Hacking Generativity into Rust", right before she says "we really don’t recommend doing this". and you can see her thesis for probably a much better explanation, but I'll try my best to explain the concept very briefly. "Generativity" as it exists in Rust allows us a sort of very limited form of dependent typing: each time a closure is executed at runtime, a brand new type (in this case, a Rust lifetime) is generated that cannot be unified with any other generated instance. Of course, the Rust compiler and thus any notion of types does not exist at runtime, so this is only one way to look at what is happening. Really, the technique can be boiled down to a simple fact about lifetimes in Rust function types: you can construct a function that must work for any given lifetime, so every time such a function is called, the body of such a function only knows that there exists some lifetime that must satisfy its signature and nothing else, so the lifetime from one call of a function cannot be unified with any other. This technique has seen some existing use in Rust, there's even a crate for it,^[18]It doesn't use the closure lifetime trick, but it uses something equivalent. but it is not widely used because frankly it can be a bit of a pain in the butt.

This trick is sometimes also called "lifetime branding", which is a name more appropriate for how it is used in gc-arena: if you have a generative lifetime 'id, any type "branded" with this 'id cannot ever be unified with another type branded with a different 'id. In gc-arena this lifetime is called 'gc, and it brands all garage-collected values (garbage-collected Gc pointers and anything transitively containing them).

Describing this trick in thick jargon like this is not very useful, it's much easier to understand when we look concretely at how it is actually used in gc-arena.

The word "arena" is commonly used in systems programming to refer to a technique of allocating memory from a single pool such that every allocation shares the same lifetime. The name "arena" is meant to be evocative of some kind of enclosure, all allocations are kept enclosed in an "arena" and ultimately share the same fate: allocations will always have a lifetime tied to their parent arena, and they are all freed when the arena is destroyed (or reset, depending on how the arena works). There is a great explanation of this "arena technique" as it appears in Rust on Manish's blog, allocated references returned from arenas all share some kind of 'arena lifetime such that all of the allocations can be invalidated at the same point, when the whole arena is reset.

The "arena" in "gc-arena" means sort of the same thing, that allocated pointers are kept in an enclosure and may not escape it, and all of them must share the same fate of being forever bound inside the arena they were birthed in.^[19]Sometimes programmer-speak can be a bit too evocative. gc-arena is different in that pointers do not necessarily have to live for the entire lifetime of the arena, only reachable pointers, but the idea is similar enough to other arena allocators that I picked the name "gc-arena".

gc-arena is also different from other arena allocators in specifically how it uses its arena lifetime (which gc-arena calls 'gc). Rather than only ensuring that pointers do not outlive some moment in time where every arena pointer is freed, gc-arena uses the same technique repeatedly over the lifetime of the same arena, ensuring that pointers must not escape a single unit of mutation. "Mutation" here is garbage collector jargon, and simply means "things that are not collection", which is, well, any actual work your program will do. Since useful work may mutate the object graph, everything useful your program actually does is lumped together and called "mutation".^[20]Sometimes programmers also can get tunnel vision.

The Gc pointer in gc-arena looks something like this:

/// A garbage collected pointer to a type `T`.
pub struct Gc<'gc, T>
    where T: ?Sized + 'gc
{
    // This type is a pointer (potentially wide) to a struct which contains a
    // value of type `T`. There is a small overhead per allocation for a header
    // section, and the last field is a (potentially unsized) `T`. In the
    // future, we might store a bare `T` pointer and use pointer math to find
    // the header when necessary (such as during tracing).
    ptr: NonNull<GcBox<T>>,

    // We also contain a `PhantomData` type which marks the struct as being
    // *invariant* over the 'gc lifetime. This is essential to prevent lifetime
    // branding from being defeated and for the soundness of `gc-arena`.
    _invariant: Invariant<'gc>,
}

We have a simple pointer value (Gc implements a slew of traits, but notably it is always Copy) and a zero-size marker type that both binds the Gc pointer to 'gc and also makes it invariant^[21]You can see a great explanation of lifetime subtyping and variance in the Rustonomicon here. over 'gc (more on this in a second). Since every Gc pointer is "branded" with this 'gc lifetime, we can leverage the power of the Rust borrow checker to tightly control their lifetime.

The top-level type in gc-arena is called Arena, and it is parameterized over an R which is the arena's root type.^[22]Sort of, I'll explain in a bit Everything that the arena will ever store must fit into this single root type, so all long-term reachable pointers must always be reachable somehow from this single root.^[23]This is restrictive, but there are things like DynamicRootSet which make this less restrictive than it sounds. It provides methods to mutate this single root, and by using lifetime branding, we can prove that in-between calls to mutate, every reachable pointer must be reachable from the root.

Whew, this is a lot setup to describe the main idea of gc-arena, and this will make a lot more sense with a concrete example. We'll use the linked list example from before and decide that a NodePtr<'gc, i32> should be our root type. I'm skipping the part where we actually construct the arena because this requires one final trick which I haven't talked about yet, but a full example will be provided by the end, I promise.^[24]Bare with me for a moment.

Access to an arena is tightly controlled in single units of mutation. This is not as complex as it sounds, what it effectively means is that in order to access an arena, you just need to call a method on the arena which takes a callback:

fn main() {
    // Construct an arena with a root of type `NodePtr<'gc, i32>`. I will cover
    // how to actually do this in the next section.
    let mut arena = todo!("unknowable");

    // We can "mutate" our arena by using the `Arena::mutate` method, which
    // always takes a callback that is passed a `&Root` parameter. "Mutation"
    // again is garbage collector jargon, really *any access at all* must take
    // place inside calls to `Arena::mutate` (or equivalent), since we have
    // full access to the object graph and can mutate it using methods for safe
    // internal mutation.
    //
    // The first `mc` parameter is the "mutation context" of type
    // `&Mutation<'gc>`. Anything that may mutate the object graph generally
    // requires a `mc: &Mutation<'gc>` parameter, and the 'gc brading lifetime
    // of the mutation context and the `Gc` pointer must match.
    //
    // This allows accessing arena-wide bookkeeping data without having to
    // keep an extra pointer for every allocation back to the parent arena, but
    // it is also useful in soundness reasoning. There are multiple different
    // context types, the existence of a `&Mutation<'gc>` acts as a witness
    // that we are doing *mutation* rather than *collection*. These are mostly
    // internal details of `gc-arena`, just know that these annoying context
    // types you have to pass around are very often a load-bearing part of
    // soundness proofs.
    arena.mutate(|mc, root| {
        dbg!(root.prev.is_some());
    });

    // If we need to access the root type mutably, there's a special method
    // for this which is passed a `&mut Root` parameter instead of `&Root`.
    //
    // The root type is a bit special in that it *itself* isn't allocated
    // inside a `Gc` pointer, so there is a special method to safely mutate the
    // root type *itself*, and the write barrier for the root type is handled
    // slightly differently than write barriers for `Gc` pointers. This is
    // not really essential to the design of `gc-arena`, it's mostly just an
    // accident of history. We could instead provide a `Gc<'gc, Root>` here
    // instead of a `&Root` or `&mut Root` to allow internal mutation the same
    // way as everything else, but doing it this way saves having to always wrap
    // the root type in `RefLock` which you would usually always want.
    arena.mutate_root(|mc, root| {
        root.prev = None;
    });
}

The signature for Arena::mutate looks like this:

impl<R> Arena<R> {
    pub fn mutate<T>(
        &self,
        // `Root<'gc, R>` will be explained shortly, but it is the root type you
        // actually expect, rather than `R` which is *not*.
        f: impl for<'gc> FnOnce(&'gc Mutation<'gc>, &'gc Root<'gc, R>) -> T
    ) -> T {
        // Accursed, unutterable unsafe code.
    }
}

The most important part of the callback signature is the for<'gc> part, which is known as Higher Rank Trait Bounds or HRTBs, and it is how you do "universal quantification" for a lifetime in Rust. The Rustonomicon explains it better, but for<'gc> means "for all choices of 'gc",^[25]aka universal quantification and it requires that the callback be valid for any possible choice of 'gc. This means that the callback effectively can know nothing about the 'gc lifetime at all, so the borrow checker must prevent types with this lifetime from escaping out of the single callback call. After all, we could choose an arbitrarily small 'gc, so any escape from the callback at all must be treated as a borrow checker error. This is the key trick for "generativity" that was discussed at the beginning of this section, and you can check Gankra's thesis for a more complete explanation.

    // This will cause some kind of "lifetime may not live long enough" error,
    // because any type with 'gc lifetime is prevented from escaping the arena.
    //
    // The error message may not be *good*, but we know and can rely on this
    // always triggering an error.
    //
    // BAD, SHOULD NOT COMPILE!
    let escape = arena.mutate(|mc, root| root.prev);

We've also made a big deal about ensuring that Gc pointers are invariant over the 'gc branding lifetime, and here's why we do that: If Gc pointers were covariant over 'gc, which we might accidentally do if we were not careful, we could accidentally allow the user to shorten 'gc which could still cause problems:

fn main() {
    // Two arenas with a root of type `NodePtr<'gc, i32>`.
    let mut arena1 = todo!("unknowable");
    let mut arena2 = todo!("unknowable");

    arena1.mutate_root(|_, root1| {
      arena2.mutate_root(|_, root2| {
        // If `Gc` pointers were covariant in 'gc, then the borrow checker
        // would shorten 'gc in its attempt to make everything borrow check,
        // and this would allow us to place a "longer-lived" '`gc' type into
        // a shorter-lived one. We need *invariance* to ensure that the mutate
        // callback knows as *little* about the 'gc lifetime as possible.
        //
        // BAD, SHOULD NOT COMPILE!
        root2.prev = root1.next;
      });
    });
}

So FINALLY we can put all the pieces together... IF we know that all garbage collector pointers are branded by 'gc AND we know that we can only access any garbage collected pointers through a callback which is universally quantified over 'gc, THEN we can finally prove all of the properties we need for safe garbage collection:

fn main() {
    // An arena of some root type, it doesn't matter for this explanation.
    let mut arena = todo!("unknowable");

    arena.mutate(|mc, root| {
        // Inside a call to `Arena::mutate`, we know that we can access `root`
        // and thus the full object graph, and we can change the object graph
        // by using safe, internal mutation which deterministically triggers
        // write barriers.
    });

    // Outside of any call to `Arena::mutate` though, we *know* that we cannot
    // access any `Gc` pointer. No `Gc` pointer can be alive anywhere on the
    // Rust stack, in thread local storage, smuggled into a different arena,
    // anywhere at all *except* the arena's root, which we cannot observe
    // without another call to `Arena::mutate`.

    // Thus, we can safely perform *collection* on this arena, and our collector
    // knows where the (single) GC root is, and it knows that the object graph
    // cannot be changed out from underneath us during collection.
    arena.collect_all();

    arena.mutate(|mc, root| {
        // If any object was collected, then it could not have possibly been
        // safely reachable, so there's no way to (safely) cause any sort of
        // UAF. Every pointer that was reachable from `root` will still be
        // live, and any other dead pointers could not have been safely stored
        // anywhere at all.
    });
}

WHEW! All that is left is to explain one last trick that allows us to define our root type and thus the type of the arena itself.

Lifetime Projection

The last piece of the puzzle is how to define Arena itself. I've talked about Arena having a root type, and conceptually this is not at all difficult to understand. During calls to Arena::mutate, this is the type of the root callback parameter:

fn main() {
    let mut arena = /* something */;

    arena.mutate(|mc, root| {});
    //                ^^^^ In our example, this is `NodePtr<'gc, i32>`.
}

The problem is the 'gc lifetime. As we have covered, the 'gc lifetime is generative, meaning, at least conceptually, that every invocation of Arena::mutate must get its own unique 'gc parameter that can never be unified with any other:

fn main() {
    let mut arena = /* something */;

    arena.mutate(|mc, root| {});
    //                ^^^^ `NodePtr<'gc, i32>`

    arena.mutate(|mc, root| {});
    //                ^^^^ `NodePtr<'gc, i32>` with *different* 'gc.

    arena.mutate(|mc, root| {});
    //                ^^^^ `NodePtr<'gc, i32>` with *yet another* 'gc.
}

Besides this, there is another practical problem: the type of the arena itself should not have any sort lifetime associated with it at all. It is, after all, a normal value and doesn't hold any borrows, and if it was branded with a 'gc lifetime and then that lifetime was used in mutation callbacks, that would allow pointers to escape the callbacks (assuming they didn't outlive the arena itself at least), breaking our "mutation XOR collection" soundness system.

What we actually need is some kind of 'static type that if we give it a 'gc, we get back whatever our root type is with that 'gc applied to it. Hey.. that kinda sounds like a GAT!

So what we need to give Arena is some kind of type that it can substitute different 'gc lifetimes into, enabling it to use lifetime generativity to protect pointers from leaking out of calls to Arena::mutate, and indeed, a GAT fits the bill here:

// We could define some kind of `Rootable` trait that the `R` parameter on
// `Arena` must implement.
pub trait Rootable {
    // And we could have an associated type on the trait that took the 'gc
    // lifetime as a generic parameter. That way, if we needed to make a root
    // with some *specific* 'gc, we could use `<R as Rootable>::Root<'gc>` for
    // such a root.
    type Root<'gc>: Collect + 'gc;
}

This Rootable trait does, in fact, work perfectly to let us define Arena types! This is not what gc-arena does, which I'll get to in just a second, but there's no reason this couldn't work instead. Going back to our concrete linked list example, we could define a Rootable impl this way:

struct NodeRootable;

impl Rootable for NodeRootable {
    type Root<'gc> = NodePtr<'gc, i32>;
}

and then we could create a new arena of type Arena<NodeRootable> like this:

fn main() {
    // Finally, we can actually define an arena! (Or we *could*, if this was
    // actually how `Arena` worked).
    let mut arena = Arena::<NodeRootable>::new(|mc| {
        // The `mc` parameter has a generative 'gc lifetime, and the returned
        // root type needs to match it.
        Node { prev: None, next: None, value: 0 }
    });

    // The `NodePtr<'gc, i32>` stored inside the arena has its 'gc lifetime
    // erased, `Arena<NodeRootable>` is a 'static type.
}

Bingo! this is the last piece we need to complete a workable gc-arena design. I sometimes refer to this idea of globally substituting a 'gc lifetime as "lifetime projection", but it is nothing more complicated than what a GAT provides. BUT, like I said, this is not exactly what gc-arena actually does, so if it does not do this, what DOES it do, and why?

The biggest reason that gc-arena does not use GATs to define root types is that gc-arena predates stable GATs by almost four years^[26]gc-arena didn't start by having a nice API for this either, at the beginning you had to declare Arena types with a macro instead and it was horrible. , but there is another very good reason we do not do this. First, I'll show you what the Rootable type actually looks like in gc-arena:

// Rather than use a GAT, we use a regular ol' lifetime parameter on a trait.
//
// Way before proper GATs were stabilized, there was sort of an "open secret"
// in Rust that GATs were already sort of... """available""" with trickery.
// However, the poor rustc compiler would get um... upset... if you tried to
// use them in anything other than simplistic ways. Here is one of those tricks!
// Rather than use a GAT, you can use a lifetime parameter on a trait, combined
// with a trait bound like this:
//
// `where R: for<'gc> Rootable<'gc>`
//
// Once you do this, instead of referring to `<R as Rootable>::Root<'gc>`, you
// can instead refer to `<R as Rootable<'gc>>::Root` to get basically the same
// result! At the time this feature went into `gc-arena`, rustc had calmed down
// enough to make the technique workable even without fully stabilized proper
// GATs.
//
// The 'static bound is not super important, it's an extra convenience to save
// having to write `R: 'static` bounds in a lot of places.
pub trait Rootable<'gc>: 'static {
    type Root: Collect + 'gc;
}

// Since `<R as Rootable<'gc>>::Root` is a lot to type, we have a type alias
// for it. This is where the `Root<'gc, R>` in the signature for `Arena::mutate`
// comes from.
pub type Root<'gc, R> = <R as Rootable<'gc>>::Root;

But so far this is just really a GAT with more steps. The real reason that we have not moved off of this technique is this whopper of a cool hack:^[27]which I absolutely did not invent

// Instead of requiring the user to declare a new type that implements
// `Rootable` for every new `Arena` type they want to declare, we instead abuse
// trait objects.
//
// Since every trait object always implements its own trait, by simply
// *referring* to a trait object for `Rootable`, we automatically get a type
// that implements this trait without having to declare any additional new
// types, and every identical invocation of `Rootable!` will always refer to the
// same type! Wow!!
//
// Currently, traits with GATs are not object safe (*yet*, this will be fixed).
// However, our Cheater McCheaterson GATs sure *are* object safe!
#[macro_export]
macro_rules! Rootable {
    ($gc:lifetime => $root:ty) => {
        dyn for<$gc> $crate::Rootable<$gc, Root = $root>
    };
}

There's one or two more small pieces that I have not covered here for the real Rootable! macro, but these details are truly not that important. They are but small QoL improvements over what I have just shown, which is the most important part. The Rootable! macro is also essential in understanding how to actually use gc-arena, because without it, using Arena or DynamicRootSet is extremely annoying, and most example code will use it heavily.

Finally, here is what the actual declaration of an Arena that holds a NodePtr<'gc, i32> would actually look like in practice:

fn main() {
    // No tricks this time. This is actually what you would write when using
    // `gc-arena` today.
    //
    // The `Rootable!` macro is in reality a bit nicer than the one that was
    // included above. In the real version, if you don't specify a lifetime with
    // the `'gc => _` syntax, by default just using '_ will work.
    //
    // `Rootable![NodePtr<'_, i32>]` is a type that implements the `Rootable`
    // trait and becomes the `R` parameter in `Arena<R>`. Each time that
    // `Arena::mutate` is called, the callback receives a `Root<'gc, R>` with
    // a *new* 'gc lifetime. `R` is not just one type but an infinite list of
    // types, one for any 'gc we provide, which is exactly what we need.
    let mut arena = Arena::<Rootable![NodePtr<'_, i32>]>::new(|mc| {
        Node { prev: None, next: None, value: 0 }
    });
}

Putting It All Together

Okay, after ALL of this explanation, we can finally see a fully worked example of our doubly-linked list, and understand each piece of it and how it fits together...

use gc_arena::{lock::RefLock, Arena, Collect, Gc, Mutation, Rootable};

// We define a node of a doubly-linked list data structure.
//
// `Collect` is derived procedurally, meaning that we can't mess up and forget
// to trace our inner `prev`, `next`, or `value`.
#[derive(Copy, Clone, Collect)]
// For safety, we agree to not implement `Drop`. We could also use
// `#[collect(unsafe_drop)]` or `#[collect(require_static)]` (if our type were
// 'static) here instead.`
#[collect(no_drop)]
struct Node<'gc, T: 'gc> {
    // The representation of the `prev` and `next` fields is a plain machine
    // pointer that might be NULL.
    //
    // Thanks, niche optimization!
    prev: Option<NodePtr<'gc, T>>,
    next: Option<NodePtr<'gc, T>>,
    value: T,
}

// By default, `Collect` types (other than 'static types) cannot have interior
// mutability. In order to provide safe mutation, we need to use `gc-arena`
// specific types to provide it which guarantee that write barriers are invoked.
//
// We use `RefLock` here as an alternative to `RefCell`.
type NodePtr<'gc, T> = Gc<'gc, RefLock<Node<'gc, T>>>;

// Create a new `Node` and return a pointer to it.
//
// We need to pass the `&Mutation<'gc>` context here because we are mutating the
// object graph (by creating a new "object" with `Gc::new`).
fn new_node<'gc, T: Collect>(mc: &Mutation<'gc>, value: T) -> NodePtr<'gc, T> {
    Gc::new(
        mc,
        RefLock::new(Node {
            prev: None,
            next: None,
            value,
        }),
    )
}

// Join two nodes together, setting the `left` node's `next` field to `right`,
// and the `right` node's `prev` field to `left`.
//
// Again, we are mutating the object graph, so we must pass in the
// `&Mutation<'gc>` context.
fn node_join<'gc, T>(
    mc: &Mutation<'gc>,
    left: NodePtr<'gc, T>,
    right: NodePtr<'gc, T>,
) {
    // This is `Gc<RefLock<_>>::borrow_mut`, which takes the mutation context as
    // a parameter. Write barriers will always be invoked on the target pointer,
    // so we know it is safe to mutate the value behind the pointer.
    left.borrow_mut(mc).next = Some(right);
    right.borrow_mut(mc).prev = Some(left);
}

// Use a `NodePtr` as a cursor, move forward through a linked list by following
// `next` pointers.
//
// Returns `true` if there was a `next` pointer and the target node has been
// changed.
fn node_rotate_right<'gc, T>(node: &mut NodePtr<'gc, T>) -> bool {
    if let Some(next) = node.borrow().next {
        *node = next;
        true
    } else {
        false
    }
}

// Use a `NodePtr` as a cursor, move backward through a linked list by following
// `prev` pointers.
//
// Returns `true` if there was a `prev` pointer and the target node has been
// changed.
fn node_rotate_left<'gc, T>(node: &mut NodePtr<'gc, T>) -> bool {
    if let Some(prev) = node.borrow().prev {
        *node = prev;
        true
    } else {
        false
    }
}

fn main() {
    // Create a new arena with a single `NodePtr<'_, i32>` as the root type.
    // 
    // We can't refer to some *particular* `NodePtr<'gc, i32>`, what we need to
    // be able to refer to is a set of `NodePtr<'_, i32>` for any possible '_
    // that we might pick. We use gc-arena's `Rootable!` macro for this.
    let mut arena = Arena::<Rootable![NodePtr<'_, i32>]>::new(|mc| {
        // Create a simple linked list with three links.
        // 
        // 1 <-> 2 <-> 3 <-> 4

        let one = new_node(mc, 1);
        let two = new_node(mc, 2);
        let three = new_node(mc, 3);
        let four = new_node(mc, 4);

        node_join(mc, one, two);
        node_join(mc, two, three);
        node_join(mc, three, four);

        // We return the pointer to 1 as our root
        one
    });

    // Outside of a call to `Arena::new` or `Arena::mutate`, we have no access
    // to anything *inside* the arena. We have to *visit* the arena with one of
    // the mutation methods in order to access its interior.

    arena.mutate_root(|_, root| {
        // We can examine the root type and see that our linked list is still
        // [1, 2, 3, 4]
        for i in 1..=4 {
            assert_eq!(root.borrow().value, i);
            node_rotate_right(root);
        }
    });

    arena.mutate_root(|_, root| {
        // Also, all of the reverse links work too.
        for i in (1..=4).rev() {
            assert_eq!(root.borrow().value, i);
            node_rotate_left(root);
        }
    });

    arena.mutate(|mc, root| {
        // Make the list circular! We sever the connection to 4 and link 3 back
        // to 1 making a list like this...
        //
        // +-> 1 <-> 2 <-> 3 <-+
        // |                   |
        // +-------------------+
        let one = *root;
        let two = one.borrow().next.unwrap();
        let three = two.borrow().next.unwrap();
        node_join(mc, three, one);
    });

    // The node for 4 is now unreachable!
    //
    // It can be freed during collection, but collection does not happen
    // automatically. We have to trigger collection *outside* of a mutation
    // method.
    //
    // The `Arena::collect_all` finishes the current collection cycle, but this
    // is not the only way to trigger collection.
    //
    // `gc-arena` is an incremental collector, and so keeps track of "debt"
    // during the GC cycle, pacing the collector based on the rate and size of new
    // allocations.
    //
    // We can also call `Arena::collect_debt` to do a *bit* of collection at a
    // time, based on the current collector debt.
    //
    // Since the collector has not yet started its marking phase, calling this
    // will fully mark the arena and collect all the garbage, so this method
    // will always free the 4 node.
    arena.collect_all();

    arena.mutate_root(|_, root| {
        // Now we can see that if we rotate through our circular list, we will
        // get:
        //
        // 1 -> 2 -> 3 -> 1 -> 2 -> 3
        for _ in 0..2 {
            for i in 1..=3 {
                assert_eq!(root.borrow().value, i);
                node_rotate_right(root);
            }
        }
    });
}

Closing Thoughts

Thank you for reading this far, I hope you found this tour of the design of gc-arena interesting. It is a crate whose use requires understanding several different unusual techniques all at one time, and explaining each piece is difficult in isolation. Hopefully this post can serve as a good introduction to someone who wants to use it in the future.

I understand that using gc-arena comes with a pretty high cognitive and usability overhead. In particular, satisfying the "mutability XOR collection" design by mutating via callback can be particularly challenging, and I will cover this much more in future posts about piccolo. However, not many languages can even be used to implement garbage collection at all, and even fewer could hope to do so with provable memory safety. In many ways, trying to design "safe, isolated garbage collection as a library" is wading through uncharted territory, so I hope you can view gc-arena's rough edges with this in mind.

In future posts, I want to talk about the way that I wish gc-arena could work if there were enough features in Rust to make it possible, using async and .await to express GC "safe points", but that discussion is for another time.

Bye!