
When the VM Deadlocked on Entity Creation

A deadlock in the VM during entity creation that froze the entire runtime.

Thales & Claude | March 25, 2026 | 8 min read | flin

Tags: flin, bug, deadlock, vm, entity-creation, concurrency

A deadlock in the classical sense involves two threads waiting for each other, each holding a resource the other needs. Neither can proceed. The system freezes. But there is a more subtle variant that does not involve threads at all -- a logical deadlock where a sequence of operations creates a circular dependency that prevents any progress.

FLIN encountered this variant during the development of its action system. The symptom was reminiscent of a process writing to a full stderr pipe and blocking forever -- except in our case, it was the VM encountering an opcode it did not understand and wandering through bytecode memory until it hit something that caused a silent return.

The Architecture of the Action System

FLIN's action system bridges the gap between browser interactivity and server-side logic. When a user clicks a button in a FLIN application, the browser sends a POST request to the /_action endpoint. The server then:

1. Reads the action name and the current page state from the request
2. Loads the FLIN source file for the referring page
3. Compiles the source with the function call appended
4. Creates a new VM instance
5. Injects the browser state into the VM
6. Executes the compiled bytecode
7. Checks whether any entities were modified
8. Returns either {"type":"reload"} (entities changed) or {"type":"ok"} (no changes)

The critical detail is step 5: the VM is created fresh for each action request. This means it has no memory of previous requests. It does not share state with the VM that rendered the page. Every action executes in isolation.
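The request lifecycle above can be sketched in Rust. This is a minimal illustration, not FLIN's actual code: the names `Vm`, `inject_state`, and `handle_action` are assumptions, and the real handler does far more, but the shape -- fresh VM, injected state, response decided by observed entity mutations -- matches the steps listed.

```rust
// Hypothetical sketch of the /_action handler flow (illustrative names only).
struct Vm {
    entity_ops: usize, // incremented by entity opcodes (CreateEntity, Save, ...)
    state: std::collections::HashMap<String, String>,
}

impl Vm {
    fn new() -> Self {
        // Step 4: a fresh VM per request -- no memory of previous requests
        Vm { entity_ops: 0, state: std::collections::HashMap::new() }
    }
    fn inject_state(&mut self, key: &str, value: &str) {
        // Step 5: browser state flows into the isolated VM
        self.state.insert(key.to_string(), value.to_string());
    }
    fn execute(&mut self, _bytecode: &[u8]) {
        // Step 6: run the compiled bytecode (elided here)
    }
}

fn handle_action(bytecode: &[u8], browser_state: &[(&str, &str)]) -> &'static str {
    let mut vm = Vm::new();
    for (k, v) in browser_state {
        vm.inject_state(k, v);
    }
    vm.execute(bytecode);
    // Steps 7-8: the response depends only on observed entity mutations
    if vm.entity_ops > 0 { r#"{"type":"reload"}"# } else { r#"{"type":"ok"}"# }
}
```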

The Deadlock Pattern

The deadlock emerged from the interaction between three components: the action handler, the VM, and the bytecode dispatcher.

When the action handler received a request to call addTask(), it compiled the full page source with addTask() appended at the end. The function addTask() contained entity creation:

fn addTask() {
    task = Task { title: newTitle }
    save task
}

The compiler generated correct bytecode for this function. The VM's main execute() method could run this bytecode without issue. But the action handler did not use the main execute() method -- it used execute_until_return, which has its own opcode dispatch table.

When execute_until_return encountered the CreateEntity opcode, it found no handler. The default case advanced the instruction pointer by one byte -- but CreateEntity is a four-byte instruction (opcode + 2-byte type index + 1-byte field count). So the IP advanced to the middle of the instruction's operands. The next byte was interpreted as an opcode. It was not a valid opcode, but the default case handled it by advancing again. The IP wandered through the bytecode like a needle skipping grooves on a record.
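The failure mode can be reconstructed in a few lines. The opcode values below are taken from the trace later in this post (`0x77` for CreateEntity, `0x21` for StoreLocal) but are otherwise assumptions; the point is the default arm that advances by one byte:

```rust
// Minimal reconstruction of the bug: an unhandled opcode falls through to a
// default arm that advances the IP by one byte, so the dispatcher walks
// straight into CreateEntity's operand bytes.
const CREATE_ENTITY: u8 = 0x77; // 4-byte instruction, but has no match arm here
const STORE_LOCAL: u8 = 0x21;

fn buggy_walk(code: &[u8]) -> Vec<usize> {
    let mut ip = 0;
    let mut visited = Vec::new(); // every offset treated as an opcode
    while ip < code.len() {
        visited.push(ip);
        match code[ip] {
            STORE_LOCAL => ip += 2, // opcode + 1-byte slot operand
            _ => ip += 1,           // BUG: CreateEntity's three operand bytes
                                    // are each "executed" as opcodes
        }
    }
    visited
}
```

Feeding it the byte sequence from the trace (`0x77 0x00 0x03 0x02 0x21 ...`) reproduces the one-byte-at-a-time wander through the operands.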

Eventually, the IP landed on a byte sequence that the VM interpreted as a Return or Halt instruction. The function returned None. The action handler received None, concluded no entities were modified, and responded with {"type":"ok"}.

The system was not frozen in the traditional sense -- it responded promptly to every request. But it was logically deadlocked: the user repeatedly clicked "Add Task," the server repeatedly executed the function, and the function repeatedly failed to create any entity. No progress was possible.

The Diagnostic Challenge

What made this bug exceptionally difficult to diagnose was the absence of any error signal. Consider the feedback loop available to the developer:

  • HTTP response: 200 OK with body {"type":"ok"} -- looks normal
  • Server logs: No errors, no warnings
  • Browser console: No JavaScript errors
  • Compilation: Successful, no warnings
  • Test suite: All 2,248 tests passing

Every observable metric indicated a healthy system. The bug existed in the gap between "function executed" and "function executed correctly" -- a gap that is invisible without deep instrumentation.

Tracing the Execution Path

We attacked the problem from three angles simultaneously.

Approach 1: Bytecode Verification

First, we verified that the compiler was generating correct bytecode. We added offset tracking to the emitter:

eprintln!("[EMIT] CreateEntity at offset {}, type_idx={}, fields={}",
    self.chunk.current_offset(), type_idx, field_count);
eprintln!("[EMIT] Save at offset {}",
    self.chunk.current_offset());

The output confirmed correct bytecode generation. The CreateEntity instruction was at the expected offset with the correct operands. The Save instruction followed at the correct offset. The compiler was not the problem.

Approach 2: VM Execution Logging

Next, we instrumented the VM to log every opcode execution inside execute_until_return:

let opcode = OpCode::from_byte(byte);
eprintln!("[VM execute_until_return] ip={} opcode={:?} byte=0x{:02x}",
    self.ip, opcode, byte);

This immediately revealed the problem. The log showed:

[VM] ip=1742 opcode=CreateEntity byte=0x77
[VM] ip=1743 opcode=Unknown byte=0x00    <- reading type_idx high byte
[VM] ip=1744 opcode=Unknown byte=0x03    <- reading type_idx low byte
[VM] ip=1745 opcode=Unknown byte=0x02    <- reading field_count
[VM] ip=1746 opcode=StoreLocal byte=0x21

The VM was reading the operand bytes of CreateEntity as opcodes. The instruction was being skipped entirely, and three operand bytes were being misinterpreted. By coincidence, the StoreLocal at offset 1746 was correctly identified and executed, but it was storing None (because CreateEntity never pushed a value onto the stack).

Approach 3: Entity Operations Counter

The decisive confirmation came from counting entity operations:

let ops_before = vm.get_entity_ops_count();
vm.execute_until_return(&chunk, function_offset)?;
let ops_after = vm.get_entity_ops_count();
eprintln!("[ACTION] Entity ops: before={} after={}", ops_before, ops_after);

When the counter was zero after executing a function that should have saved an entity, we had definitive proof that the Save opcode was never reached.

The Systemic Issue

The bug was not just a missing opcode handler. It revealed a systemic issue in FLIN's architecture: the existence of two parallel opcode dispatch tables that must be kept in sync.

The main execute() method in vm.rs handles every opcode the compiler can generate. It is the canonical, complete dispatcher. The execute_until_return() method is a specialized subset designed for function execution during action requests. It was introduced to provide a controlled execution environment -- run a function, capture its side effects, and stop.

But "specialized subset" is a dangerous concept. Every opcode that can appear inside a function body must be in both dispatchers. And the set of opcodes that can appear inside a function body is, in practice, nearly every opcode in the language. Functions can contain variable declarations, control flow, loops, entity operations, list and map construction, arithmetic, comparisons, function calls, and more.

The initial implementation of execute_until_return included the obvious opcodes -- loads, stores, jumps, calls. Session 269 added entity operations like SetField, Save, and Delete. But CreateEntity was overlooked because testing focused on editing existing entities, not creating new ones.
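The fix has two parts: give CreateEntity a real handler that consumes its full four-byte encoding, and make the default arm fail loudly instead of silently advancing. A simplified sketch (the `Value` type and error handling are illustrative, not FLIN's real internals):

```rust
// Sketch of the corrected dispatch step for execute_until_return.
#[derive(Debug, PartialEq)]
enum Value {
    None,
    Entity { type_idx: u16, field_count: u8 },
}

const CREATE_ENTITY: u8 = 0x77; // assumed encoding, matching the trace above

fn step(code: &[u8], ip: usize, stack: &mut Vec<Value>) -> Result<usize, String> {
    match code[ip] {
        CREATE_ENTITY => {
            // Consume the full encoding: opcode + 2-byte type index + field count
            let type_idx = u16::from_be_bytes([code[ip + 1], code[ip + 2]]);
            let field_count = code[ip + 3];
            stack.push(Value::Entity { type_idx, field_count });
            Ok(ip + 4)
        }
        // An unknown opcode is a bug in the dispatcher, not something to skip
        byte => Err(format!("unhandled opcode 0x{:02x} at ip={}", byte, ip)),
    }
}
```

Failing loudly on unknown bytes is the more important half: had the original default arm returned an error, the bug would have surfaced on the first click instead of hiding behind `{"type":"ok"}`.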

Building the Safety Net

After the fix, we documented every opcode category that must exist in execute_until_return:

// Required opcode categories for execute_until_return
//
// Entities:    CreateEntity, Save, Delete, Destroy, SetField, GetField
// Variables:   LoadLocal, StoreLocal, LoadGlobal, StoreGlobal
// Constants:   LoadConst, LoadInt0, LoadInt1, LoadNone, LoadTrue, LoadFalse
// Stack:       Pop, Dup, Swap
// Control:     Jump, JumpIfFalse, JumpIfTrue, JumpIfNone, JumpFar, Halt
// Arithmetic:  Add, Sub, Mul, Div, Mod, Pow, Neg
// Comparison:  Eq, NotEq, Lt, LtEq, Gt, GtEq
// Logic:       Not, IsNone, IsNotNone
// Data:        CreateMap, CreateList, GetIndex, MapGet
// Calls:       Call, CallNative

More than 30 opcodes, spanning every category. The documentation serves as a checklist for any future modification to execute_until_return.
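A checklist in a comment only helps if someone reads it. One way to mechanize it (a hypothetical test sketch, with a deliberately shortened opcode list; the helper `handled_by_execute_until_return` is an assumption, not an existing FLIN function):

```rust
// Parity check: every opcode the compiler can emit must be claimed by the
// restricted dispatcher. If the two tables drift, this list is non-empty.
#[derive(Debug, Clone, Copy, PartialEq)]
enum OpCode { CreateEntity, Save, Delete, LoadLocal, StoreLocal, Jump, Halt }

const ALL_OPCODES: &[OpCode] = &[
    OpCode::CreateEntity, OpCode::Save, OpCode::Delete,
    OpCode::LoadLocal, OpCode::StoreLocal, OpCode::Jump, OpCode::Halt,
];

// Assumed helper: the restricted dispatcher exports the set it handles.
fn handled_by_execute_until_return(op: OpCode) -> bool {
    matches!(op, OpCode::CreateEntity | OpCode::Save | OpCode::Delete
        | OpCode::LoadLocal | OpCode::StoreLocal | OpCode::Jump | OpCode::Halt)
}

fn missing_opcodes() -> Vec<OpCode> {
    ALL_OPCODES.iter().copied()
        .filter(|&op| !handled_by_execute_until_return(op))
        .collect()
}
```

Run as a unit test, this turns "remember to update both tables" into a compile-and-test failure.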

We also established a verification protocol. After any change to the action system, we test both entity editing (which uses SetField) and entity creation (which uses CreateEntity), then verify the WAL log contains the expected data:

curl -s -X POST "http://127.0.0.1:3000/_action" \
  -H "Content-Type: application/json" \
  -d '{"_action":"addTask","_args":"[]","_state":"{\"newTitle\":\"TEST\"}"}'

# Must return: {"type":"reload"}
# WAL must contain: "TEST"
grep "TEST" .flindb/wal.log

Lessons from the Deadlock

This bug taught us three enduring lessons about VM design.

Parallel dispatch tables are a maintenance hazard. Any time two code paths must handle the same set of cases, they will inevitably drift out of sync. The ideal solution is a single, shared dispatch mechanism. If that is not possible (as in our case, where execute_until_return needs different control flow behavior), then the two tables must be explicitly linked through documentation, tests, or code generation.
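One way to link the tables in code rather than documentation is to generate both from a single source of truth. A sketch of the idea, assuming the same illustrative opcode values as above -- here a macro that owns the opcode-to-width table, so `execute()` and `execute_until_return()` could never disagree about instruction sizes:

```rust
// Illustrative: one macro owns the opcode table; both dispatchers expand it.
macro_rules! dispatch {
    ($byte:expr, $ip:expr, { $($op:pat => $width:expr),+ $(,)? }) => {
        match $byte {
            $($op => $ip + $width,)+
        }
    };
}

const CREATE_ENTITY: u8 = 0x77;
const STORE_LOCAL: u8 = 0x21;

fn next_ip(byte: u8, ip: usize) -> usize {
    // Both execution paths would call through this shared table, so an
    // operand width exists in exactly one place.
    dispatch!(byte, ip, {
        CREATE_ENTITY => 4, // opcode + 2-byte type index + 1-byte field count
        STORE_LOCAL => 2,   // opcode + 1-byte slot
        _ => 1,             // single-byte instructions
    })
}
```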

Silent failures require proactive detection. When a system can fail without producing any error signal, you need instrumentation that actively looks for expected effects. The entity operations counter was the key diagnostic -- it did not check for errors but for the absence of expected results. This "positive detection" pattern is more robust than error detection for this class of bug.

Test the creation path separately from the modification path. Entity creation and entity modification are fundamentally different operations even when they share the same persistence layer. Creation involves constructing an object from nothing. Modification involves loading an existing object, changing it, and saving it back. Testing only modification leaves the creation path unexercised.

The VM did not deadlock in the textbook sense. No threads were blocked. No resources were contended. But the effect was the same: a system that appeared to run but made no progress. The user clicked the button, the server processed the request, the function executed, and nothing happened. A deadlock of intent, if not of implementation.

---

This is Part 160 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO designed and built a programming language from scratch.

Series Navigation:

- [159] The HTML Whitespace Rendering Bug
- [160] When the VM Deadlocked on Entity Creation (you are here)
- [161] The Temporal Version Tracking Bug
