
Memory Management and Garbage Collection

Memory management in FLIN's VM: garbage collection, string interning, and heap allocation in Rust.

Thales & Claude | March 25, 2026 | 13 min

Every language runtime must answer one question: who cleans up the memory? In FLIN, the answer is a mark-and-sweep garbage collector written in 731 lines of Rust.

Session 011, the day after we built the VM core, was dedicated entirely to memory. The VM could execute arithmetic, push and pop values, call functions, and jump. But every string it created, every list it built, every entity it instantiated -- all of it leaked. Objects went onto the heap and never came off. Run a loop that creates a thousand strings, and you have a thousand strings consuming memory forever.

That had to stop. This article covers how we designed the heap, implemented mark-and-sweep collection, handled the tricky edge cases (transitive references, closures, cycles), and verified it all with 22 targeted tests.

---

The Heap Architecture

The heap is a flat vector of optional objects:

pub struct Heap {
    objects: Vec<Option<HeapObject>>,
    free_list: Vec<usize>,
    bytes_allocated: usize,
    gc_threshold: usize,
}

Each slot in objects is either Some(HeapObject) or None (freed). The free_list tracks which slots are available for reuse. When the VM allocates a new object, it checks the free list first. If there is a free slot, it reuses it. If not, it pushes a new entry onto the vector.

This design avoids the overhead of a traditional allocator (malloc/free) for every object. Instead, the heap grows monotonically, and freed slots are recycled. The trade-off is that the heap vector may contain gaps (freed slots between live objects), but that is acceptable because we never iterate the heap in hot paths -- we only access objects by their ObjectId, which is an index into the vector.
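The slot-reuse behaviour is easy to see in a stripped-down sketch. This is not the real FLIN heap -- MiniHeap is a hypothetical type with plain Strings standing in for HeapObject and bare usize indices standing in for ObjectId -- but the free-list mechanics are the same:

```rust
// Minimal free-list heap sketch (hypothetical MiniHeap type; plain
// Strings stand in for HeapObject, usize for ObjectId).
struct MiniHeap {
    objects: Vec<Option<String>>,
    free_list: Vec<usize>,
}

impl MiniHeap {
    fn new() -> Self {
        MiniHeap { objects: Vec::new(), free_list: Vec::new() }
    }

    // Reuse a freed slot if one exists; otherwise grow the vector.
    fn alloc(&mut self, s: String) -> usize {
        if let Some(slot) = self.free_list.pop() {
            self.objects[slot] = Some(s);
            slot
        } else {
            self.objects.push(Some(s));
            self.objects.len() - 1
        }
    }

    // Freeing leaves a gap (None) and records the slot for reuse.
    fn free(&mut self, id: usize) {
        self.objects[id] = None;
        self.free_list.push(id);
    }
}
```

Allocate two objects, free the first, allocate a third: the third lands in slot 0 and the vector never grows past two entries.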

The ObjectId type is a newtype wrapper around usize:

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct ObjectId(pub usize);

When the VM pushes Value::Object(ObjectId(42)) onto the stack, it means "the object at heap index 42." Dereferencing is a single array index operation -- O(1), with no pointer chasing beyond the one vector access.

---

Allocation

Allocating an object follows a simple path:

impl Heap {
    pub fn alloc(&mut self, data: ObjectData) -> ObjectId {
        let size = data.estimated_size();
        let header = ObjectHeader {
            type_tag: data.type_tag(),
            marked: false,
            size: size as u32,
        };
        let object = HeapObject { header, data };

        let id = if let Some(slot) = self.free_list.pop() {
            self.objects[slot] = Some(object);
            ObjectId(slot)
        } else {
            let slot = self.objects.len();
            self.objects.push(Some(object));
            ObjectId(slot)
        };

        self.bytes_allocated += size;
        id
    }
}

The free list is a LIFO stack. Popping from the end is O(1). This means recently freed slots are reused first, which has a subtle benefit for cache locality: recently freed objects were recently accessed, so their memory region is likely still in the CPU cache.

Typed allocation helpers wrap the generic alloc:

pub fn alloc_string(&mut self, s: String) -> ObjectId {
    self.alloc(ObjectData::String(s))
}

pub fn alloc_list(&mut self, items: Vec<Value>) -> ObjectId {
    self.alloc(ObjectData::List(items))
}

pub fn alloc_map(&mut self, entries: HashMap<String, Value>) -> ObjectId {
    self.alloc(ObjectData::Map(entries))
}

Each returns an ObjectId that the VM stores on the stack or in a variable. The caller never sees a raw pointer. Because an ObjectId is just an index, it always refers to a slot in the heap vector; a slot that has since been freed holds None, which the VM can detect instead of dereferencing dangling memory.

---

The Garbage Collector

FLIN uses mark-and-sweep, the simplest correct GC algorithm. It works in two phases:

1. Mark: Starting from a set of roots (values that are definitely alive), traverse all reachable objects and set their marked bit.
2. Sweep: Walk the entire heap. Any object that is not marked is garbage -- free its slot and add it to the free list.

Roots

The root set is everything the program can currently reach:

  • Every value on the operand stack
  • Every global variable
  • Every local variable in every active call frame

If a value is on the stack, it is alive. If a value is in a global, it is alive. If a value is in a local variable of any function that is currently executing (or waiting for a callee to return), it is alive. Everything else is garbage.

Marking

fn mark_roots(&mut self, stack: &[Value], globals: &HashMap<String, Value>,
              frames: &[CallFrame]) {
    // Mark from stack
    for value in stack {
        if let Value::Object(id) = value {
            self.mark_object(*id);
        }
    }

    // Mark from globals
    for value in globals.values() {
        if let Value::Object(id) = value {
            self.mark_object(*id);
        }
    }

    // Mark from call frame locals
    for frame in frames {
        for value in &frame.locals {
            if let Value::Object(id) = value {
                self.mark_object(*id);
            }
        }
    }
}

fn mark_object(&mut self, id: ObjectId) {
    if let Some(ref mut obj) = self.objects[id.0] {
        if obj.header.marked {
            return; // Already visited -- cycle protection
        }
        obj.header.marked = true;

        // Recursively mark referenced objects
        match &obj.data {
            ObjectData::List(items) => {
                let items_clone: Vec<Value> = items.clone();
                for item in &items_clone {
                    if let Value::Object(child_id) = item {
                        self.mark_object(*child_id);
                    }
                }
            }
            ObjectData::Map(entries) => {
                let values_clone: Vec<Value> = entries.values().cloned().collect();
                for val in &values_clone {
                    if let Value::Object(child_id) = val {
                        self.mark_object(*child_id);
                    }
                }
            }
            ObjectData::Closure(closure) => {
                for captured in &closure.captures.clone() {
                    if let Value::Object(child_id) = captured {
                        self.mark_object(*child_id);
                    }
                }
            }
            ObjectData::Entity(entity) => {
                let fields_clone: Vec<Value> = entity.fields.values().cloned().collect();
                for val in &fields_clone {
                    if let Value::Object(child_id) = val {
                        self.mark_object(*child_id);
                    }
                }
            }
            _ => {} // String, Function, NativeFunction, Iterator -- no child objects
        }
    }
}

The mark_object function is recursive. When it marks a list, it also marks every object in that list. When it marks a map, it marks every object value in that map. When it marks a closure, it marks every captured value. When it marks an entity, it marks every field value. This ensures that the entire transitive closure of reachable objects is marked.

The if obj.header.marked { return; } check at the top is critical for cycle handling. If object A contains a reference to object B, and object B contains a reference back to object A, the naive recursive approach would loop forever. The marked bit breaks the cycle: the second visit to an already-marked object returns immediately.

Notice the cloning. We clone the list items, map values, closure captures, and entity fields before iterating. This is because mark_object takes &mut self (it needs to mutate the marked bit), but iterating over an object's children requires &self access to read the object's data. Rust's borrow checker forbids holding both &self and &mut self simultaneously. The clone-first pattern resolves this conflict at the cost of a temporary allocation during GC.

This is one of the places where Rust's ownership model adds friction to GC implementation. In C, you would just cast away the const. In Rust, you clone. The clone is correct by construction -- no data race, no use-after-free -- but it is not free. For a production GC, we would optimise this with unsafe code or a different data structure. For now, correctness matters more than speed.
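Both ideas -- the mark-bit early return and the clone-first pattern -- can be exercised in isolation. The sketch below uses a hypothetical Node type whose children are plain indices, which is enough to show that marking terminates on a cycle and that cloning the child list releases the mutable borrow before recursing:

```rust
// Standalone sketch of cycle-safe marking (hypothetical Node type).
struct Node {
    marked: bool,
    children: Vec<usize>, // indices of referenced nodes
}

fn mark(nodes: &mut Vec<Node>, id: usize) {
    if nodes[id].marked {
        return; // already visited -- this check is what breaks cycles
    }
    nodes[id].marked = true;
    // Clone-first: copy the child list so the borrow of `nodes`
    // is released before the recursive calls below.
    let children = nodes[id].children.clone();
    for child in children {
        mark(nodes, child);
    }
}
```

With nodes 0 and 1 referencing each other and node 2 unreachable, marking from node 0 marks exactly 0 and 1 and returns without looping.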

Sweeping

fn sweep(&mut self) {
    for i in 0..self.objects.len() {
        if let Some(ref mut obj) = self.objects[i] {
            if obj.header.marked {
                obj.header.marked = false; // Reset for next collection
            } else {
                // Garbage -- reclaim
                self.bytes_allocated -= obj.header.size as usize;
                self.objects[i] = None;
                self.free_list.push(i);
            }
        }
    }
}

The sweep walks every slot in the heap. Marked objects survive -- their mark bit is cleared for the next collection cycle. Unmarked objects are freed: their slot is set to None, their size is subtracted from bytes_allocated, and their index is pushed onto the free list.

After the sweep, every live object has marked: false, and the free list contains the indices of all freed slots. The next allocation will reuse those slots before growing the heap.

---

When to Collect

The GC does not run on every allocation. That would be catastrophically slow. Instead, it runs when bytes_allocated exceeds gc_threshold:

pub fn should_collect(&self) -> bool {
    self.bytes_allocated >= self.gc_threshold
}

The default threshold is 1 MB. After each collection, the threshold is set to twice the current bytes_allocated. This adaptive strategy means that programs with small heaps collect infrequently (the threshold grows slowly), while programs with large heaps collect proportionally (the threshold grows with the heap).

The VM checks should_collect() after every allocation and triggers a collection when the threshold is crossed. This keeps heap growth bounded while ensuring the GC does not run more often than necessary.
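The policy can be sketched with a pair of free functions. The helper names here are hypothetical (the real VM folds this into its collect routine), and the floor at the initial 1 MB threshold is an assumption -- the article only states the doubling rule -- added so that tiny heaps do not drive the threshold toward zero:

```rust
// Adaptive GC threshold sketch (hypothetical helper names).
const INITIAL_THRESHOLD: usize = 1024 * 1024; // 1 MB default

fn should_collect(bytes_allocated: usize, threshold: usize) -> bool {
    bytes_allocated >= threshold
}

// After a collection, the next threshold is twice the surviving bytes.
// The floor at INITIAL_THRESHOLD is an assumption, not in the article.
fn next_threshold(bytes_alive: usize) -> usize {
    (bytes_alive * 2).max(INITIAL_THRESHOLD)
}
```

A program whose live set stabilises at 3 MB would settle into collecting whenever allocation reaches 6 MB, so GC work stays proportional to heap size.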

---

String Interning

Strings deserve special attention because they are the most common heap-allocated object in most programs. Variable names, entity field names, type names, user-facing text -- all of them are strings.

FLIN's constant pool already provides a form of string interning: string literals that appear in the source code are stored once in the constant pool and referenced by index. When the VM loads a constant string, it does not allocate a new heap object -- it retrieves the already-allocated string from the pool.

This means that the expression name = "Thales" does not allocate a new string on every execution. The string "Thales" exists once in the constant pool, and the global variable name holds a Value::Object(ObjectId) pointing to it.

Runtime string operations (concat, replace, upper, lower) do create new heap strings, and those are subject to garbage collection. But the constant pool strings are roots -- they are always reachable and never collected.
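The deduplication a constant pool provides can be sketched with a hypothetical ConstPool type: interning the same literal twice yields the same index, so only one copy of each distinct string is ever stored. (This is an illustrative sketch of the idea, not FLIN's actual constant-pool code.)

```rust
use std::collections::HashMap;

// Constant-pool style interning sketch (hypothetical ConstPool type).
struct ConstPool {
    strings: Vec<String>,          // one canonical copy per literal
    index: HashMap<String, usize>, // literal -> pool index
}

impl ConstPool {
    fn new() -> Self {
        ConstPool { strings: Vec::new(), index: HashMap::new() }
    }

    fn intern(&mut self, s: &str) -> usize {
        if let Some(&i) = self.index.get(s) {
            return i; // already pooled -- no second copy is stored
        }
        let i = self.strings.len();
        self.strings.push(s.to_string());
        self.index.insert(s.to_string(), i);
        i
    }
}
```

Interning "Thales" twice returns the same index both times; a different literal gets a fresh slot.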

---

Entity Lifecycle

Entities are FLIN's primary data type -- they represent domain objects with fields, versions, and timestamps:

pub struct EntityInstance {
    pub entity_type: String,
    pub id: u64,
    pub fields: HashMap<String, Value>,
    pub version: u64,
    pub created_at: DateTime<Utc>,
    pub updated_at: DateTime<Utc>,
}

An entity's fields can contain any Value, including Object references to other heap objects. When the GC marks an entity, it must also mark every object in its fields. If an entity has a field address whose value is a string, that string must survive collection as long as the entity is alive.

This transitive marking is what makes the GC correct. Without it, the entity would survive but its string field would be collected, leaving a dangling ObjectId that points to a freed heap slot. The next access would read garbage data or panic.

---

The Borrow Checker and GC: Lessons Learned

Implementing a garbage collector in Rust is an exercise in negotiating with the borrow checker. The fundamental tension is this: the GC needs to read the heap (to find references) and write the heap (to set mark bits and free objects) at the same time.

In Session 011, we hit this conflict directly. The QueryAll entity operation needed to iterate over the entity store while also allocating new list objects on the heap:

// This does NOT compile:
let entities = self.entity_store.get(&type_name).unwrap();
for entity in entities.values() {
    let obj_id = self.heap.alloc_entity(entity.clone()); // &mut self conflict
    list.push(Value::Object(obj_id));
}

The fix was the clone-first pattern: collect the data you need into a local variable, then operate on the copy:

let entity_clones: Vec<EntityInstance> = self.entity_store
    .get(&type_name)
    .map(|store| store.values().cloned().collect())
    .unwrap_or_default();

for entity in entity_clones {
    let obj_id = self.heap.alloc_entity(entity);
    list.push(Value::Object(obj_id));
}

This pattern recurs throughout the VM wherever the GC interacts with live data. It is not elegant. It is correct. And in a language runtime where correctness means "your users' programs do not crash with use-after-free," correct is what matters.

---

Test Coverage

The GC was tested with 22 dedicated tests covering every root type and every edge case:

  • Basic allocation: Strings, lists, maps allocate correctly and return valid ObjectId values.
  • Free list reuse: After collection, freed slots are reused by subsequent allocations.
  • Stack roots: Objects reachable from the stack survive collection.
  • Global roots: Objects reachable from global variables survive collection.
  • Frame local roots: Objects reachable from call frame locals survive collection.
  • Primitive roots: Primitive values (Int, Float, Bool) on the stack do not interfere with collection.
  • Transitive references: A list containing a string -- both survive if the list is reachable.
  • Closure captures: Objects captured by closures survive as long as the closure is alive.
  • Entity fields: Objects stored in entity fields survive as long as the entity is alive.
  • Iterator sources: The collection backing an iterator survives during iteration.
  • Map values: Object values in maps survive as long as the map is alive.
  • Deeply nested structures: A list of maps of lists of strings -- the entire tree survives if the root is reachable.
  • Cycles: Object A references B, B references A -- both survive if either is reachable, and neither causes infinite recursion in the marker.
  • Multiple collections: Running the GC repeatedly does not corrupt the heap or lose live objects.

These tests were not written after the fact. They were written alongside the implementation, in parallel by a subagent, ensuring that the GC was correct from the moment it was created.

---

Performance Characteristics

The mark-and-sweep algorithm has well-known performance characteristics:

  • Mark phase: O(L), where L is the number of live objects. The marker visits every reachable object exactly once.
  • Sweep phase: O(H), where H is the total heap size. The sweeper visits every slot, including freed ones.
  • Pause time: The entire collection is stop-the-world -- the VM halts while the GC runs.
  • Throughput: Between collections, allocation is O(1) (free list pop or vector push).

For FLIN's target workloads -- web applications with moderate state -- these characteristics are acceptable. A typical FLIN app might have a few hundred live objects (entities, view bindings, cached strings). The GC runs in microseconds.

If FLIN ever needs to handle millions of objects, we would graduate to a generational collector or an incremental marker. But premature optimisation of the GC would be wasted effort. The current implementation is correct, tested, and fast enough.

---

What 731 Lines Bought Us

After Session 011, the VM had a complete memory management system: allocation with free-list recycling, mark-and-sweep collection with transitive marking, cycle handling, adaptive thresholds, and 22 tests proving it all correct.

The heap would grow as the program ran, and the GC would periodically reclaim dead objects. Strings, lists, maps, entities, closures -- all of them could be created and destroyed safely. No leaks. No dangling pointers. No use-after-free.

The next challenge was making those closures work. Higher-order functions -- map, filter, reduce -- need to capture variables from their enclosing scope and carry them along. That is the subject of the next article.

---

This is Part 22 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO built a programming language from scratch.

Next up: [23] Closures and Higher-Order Functions in the VM -- how upvalues and capture semantics make map, filter, and reduce possible.
