#173 -- The .flinc Binary Format

When you run flin build app.flin, the compiler produces a .flinc file. That file contains everything the VM needs to execute the program: constants, bytecode instructions, and optional debug information. No source code. No AST. Just a compact binary representation that loads in microseconds and runs immediately.

Designing a binary format is an exercise in trade-offs. Too simple and you lose important information. Too complex and deserialization becomes a bottleneck. The .flinc format sits in a deliberate sweet spot: simple enough to implement in a single session, rich enough to carry debug information and detect corruption.

The Header

Every .flinc file begins with a 24-byte header:

.flinc File Format v1.0

HEADER (24 bytes):
  Offset  Size  Field         Description
  0x00    4     Magic         "FLIN" (0x464C494E)
  0x04    1     Major         Format version major (1)
  0x05    1     Minor         Format version minor (0)
  0x06    2     Flags         Feature flags
  0x08    4     ConstCount    Number of constants
  0x0C    4     CodeSize      Bytecode size in bytes
  0x10    4     EntryPoint    Entry point offset
  0x14    4     Checksum      CRC32 of content

The magic number serves two purposes. First, it identifies the file as a FLIN binary rather than arbitrary data. Any tool can check the first four bytes to tell whether a file is meant to be a .flinc. Second, it provides a human-readable marker in hex dumps: the bytes 46 4C 49 4E spell "FLIN" in ASCII.

The Rust implementation defines the header as a struct with serialization methods:

pub const FLINC_MAGIC: [u8; 4] = [0x46, 0x4C, 0x49, 0x4E]; // "FLIN"
pub const FLINC_VERSION_MAJOR: u8 = 1;
pub const FLINC_VERSION_MINOR: u8 = 0;

pub struct FlincHeader {
    pub magic: [u8; 4],
    pub version_major: u8,
    pub version_minor: u8,
    pub flags: FlincFlags,
    pub const_count: u32,
    pub code_size: u32,
    pub entry_point: u32,
    pub checksum: u32,
}

pub struct FlincFlags {
    pub debug_info: bool,
    pub source_map: bool,
}
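
To make the layout concrete, here is a minimal sketch of how the struct could be written to and read from the 24 bytes described in the table. The bit positions for the two flags (bit 0 for debug_info, bit 1 for source_map) and the method names to_bits, from_bits, to_bytes, and from_bytes are assumptions for illustration; the real implementation may differ.

```rust
pub const FLINC_MAGIC: [u8; 4] = [0x46, 0x4C, 0x49, 0x4E]; // "FLIN"

// Assumed bit assignments for the 2-byte Flags field (not given in the article).
const FLAG_DEBUG_INFO: u16 = 1 << 0;
const FLAG_SOURCE_MAP: u16 = 1 << 1;

#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
pub struct FlincFlags {
    pub debug_info: bool,
    pub source_map: bool,
}

impl FlincFlags {
    pub fn to_bits(self) -> u16 {
        let mut bits = 0u16;
        if self.debug_info {
            bits |= FLAG_DEBUG_INFO;
        }
        if self.source_map {
            bits |= FLAG_SOURCE_MAP;
        }
        bits
    }

    pub fn from_bits(bits: u16) -> Self {
        FlincFlags {
            debug_info: (bits & FLAG_DEBUG_INFO) != 0,
            source_map: (bits & FLAG_SOURCE_MAP) != 0,
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FlincHeader {
    pub magic: [u8; 4],
    pub version_major: u8,
    pub version_minor: u8,
    pub flags: FlincFlags,
    pub const_count: u32,
    pub code_size: u32,
    pub entry_point: u32,
    pub checksum: u32,
}

impl FlincHeader {
    /// Lays the fields out at the offsets listed in the header table.
    pub fn to_bytes(&self) -> [u8; 24] {
        let mut out = [0u8; 24];
        out[0x00..0x04].copy_from_slice(&self.magic);
        out[0x04] = self.version_major;
        out[0x05] = self.version_minor;
        out[0x06..0x08].copy_from_slice(&self.flags.to_bits().to_le_bytes());
        out[0x08..0x0C].copy_from_slice(&self.const_count.to_le_bytes());
        out[0x0C..0x10].copy_from_slice(&self.code_size.to_le_bytes());
        out[0x10..0x14].copy_from_slice(&self.entry_point.to_le_bytes());
        out[0x14..0x18].copy_from_slice(&self.checksum.to_le_bytes());
        out
    }

    /// Reads the fields back; checking magic and version is left to the caller.
    pub fn from_bytes(b: &[u8; 24]) -> Self {
        let u32_at = |o: usize| u32::from_le_bytes([b[o], b[o + 1], b[o + 2], b[o + 3]]);
        FlincHeader {
            magic: [b[0], b[1], b[2], b[3]],
            version_major: b[4],
            version_minor: b[5],
            flags: FlincFlags::from_bits(u16::from_le_bytes([b[6], b[7]])),
            const_count: u32_at(0x08),
            code_size: u32_at(0x0C),
            entry_point: u32_at(0x10),
            checksum: u32_at(0x14),
        }
    }
}
```

Working with a fixed [u8; 24] array keeps every offset visible and rules out length errors at compile time, which suits a header whose size never changes within a major version.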

Version numbering follows a strict compatibility contract. The major version changes when the format is incompatible with previous versions. The minor version changes when new optional features are added. A runtime with version 1.2 can load files with version 1.0 through 1.2 but will reject files with version 2.0 or higher.
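
The compatibility rule can be sketched as a single predicate. The Version type and the can_load function are hypothetical names used only for this illustration, and the sketch assumes that a file with a different major version is always rejected.

```rust
/// A format version as stored in bytes 0x04 and 0x05 of the header.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Version {
    pub major: u8,
    pub minor: u8,
}

/// Returns true when a runtime supporting `runtime` may load a file written as `file`.
/// Same major version is required; the file's minor version must not be newer.
pub fn can_load(runtime: Version, file: Version) -> bool {
    file.major == runtime.major && file.minor <= runtime.minor
}
```

With a 1.2 runtime, this accepts files marked 1.0, 1.1, and 1.2, and rejects 1.3 as well as anything with major version 2.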

The Constant Pool

After the header comes the constant pool: a sequence of typed values that the bytecode references by index. Each constant is prefixed with a one-byte type tag:

CONSTANT POOL:
  For each constant:
    Tag (1 byte) + Data (variable)

  Tags: 0x00=Null, 0x01=Bool, 0x02=Int, 0x03=Float,
        0x04=String, 0x05=Identifier, 0x06=EntityName,
        0x07=Function, 0x08=Time, 0x09=Money

Strings are prefixed with their byte length as a 4-byte little-endian integer, followed by the UTF-8 bytes. Numbers are stored in little-endian order: 8 bytes for 64-bit integers and 8 bytes for 64-bit floats. Booleans are a single byte after the tag.

The serialization code handles all ten constant types. The excerpt below shows seven of them:

fn serialize_constant(buf: &mut Vec<u8>, constant: &Constant) -> Result<(), String> {
    match constant {
        Constant::Null => buf.push(0x00),
        Constant::Bool(v) => {
            buf.push(0x01);
            buf.push(if *v { 1 } else { 0 });
        }
        Constant::Int(v) => {
            buf.push(0x02);
            buf.extend_from_slice(&v.to_le_bytes());
        }
        Constant::Float(v) => {
            buf.push(0x03);
            buf.extend_from_slice(&v.to_le_bytes());
        }
        Constant::String(s) | Constant::Identifier(s) | Constant::EntityName(s) => {
            let tag = match constant {
                Constant::String(_) => 0x04,
                Constant::Identifier(_) => 0x05,
                Constant::EntityName(_) => 0x06,
                _ => unreachable!(),
            };
            buf.push(tag);
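            // Reject strings whose byte length does not fit the 4-byte prefix
            let len = u32::try_from(s.len())
                .map_err(|_| "string constant too long for u32 length prefix".to_string())?;
            buf.extend_from_slice(&len.to_le_bytes());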
            buf.extend_from_slice(s.as_bytes());
        }
        // Function, Time, Money follow similar patterns
    }
    Ok(())
}

The distinction between String, Identifier, and EntityName may seem redundant at the binary level -- they are all UTF-8 strings. But preserving the semantic type in the constant pool allows the VM to apply different handling. Entity names trigger database lookups. Identifiers are resolved against the scope chain. Strings are just data.
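
The reading side mirrors the writer. The following is a sketch of a decoder for the seven tags shown above, under a few assumptions: the Constant enum is reduced to those seven variants, Int holds an i64 and Float an f64, and the helper names take and read_constant are hypothetical.

```rust
use std::convert::TryInto;

#[derive(Debug, Clone, PartialEq)]
pub enum Constant {
    Null,
    Bool(bool),
    Int(i64),   // assumed to be i64
    Float(f64), // assumed to be f64
    String(String),
    Identifier(String),
    EntityName(String),
}

/// Returns the next `n` bytes and advances the cursor, or fails on short input.
fn take<'a>(bytes: &'a [u8], pos: &mut usize, n: usize) -> Result<&'a [u8], String> {
    let end = pos.checked_add(n).ok_or("offset overflow")?;
    let slice = bytes.get(*pos..end).ok_or("unexpected end of constant pool")?;
    *pos = end;
    Ok(slice)
}

/// Decodes one tagged constant starting at `*pos`.
pub fn read_constant(bytes: &[u8], pos: &mut usize) -> Result<Constant, String> {
    let tag = take(bytes, pos, 1)?[0];
    match tag {
        0x00 => Ok(Constant::Null),
        0x01 => Ok(Constant::Bool(take(bytes, pos, 1)?[0] != 0)),
        0x02 => {
            let raw: [u8; 8] = take(bytes, pos, 8)?.try_into().unwrap();
            Ok(Constant::Int(i64::from_le_bytes(raw)))
        }
        0x03 => {
            let raw: [u8; 8] = take(bytes, pos, 8)?.try_into().unwrap();
            Ok(Constant::Float(f64::from_le_bytes(raw)))
        }
        0x04..=0x06 => {
            // 4-byte little-endian byte length, then the UTF-8 payload
            let raw: [u8; 4] = take(bytes, pos, 4)?.try_into().unwrap();
            let len = u32::from_le_bytes(raw) as usize;
            let text = std::str::from_utf8(take(bytes, pos, len)?)
                .map_err(|e| format!("invalid UTF-8 in constant: {}", e))?
                .to_owned();
            Ok(match tag {
                0x04 => Constant::String(text),
                0x05 => Constant::Identifier(text),
                _ => Constant::EntityName(text),
            })
        }
        other => Err(format!("unsupported constant tag 0x{:02X}", other)),
    }
}
```

Because the tag survives the roundtrip, an Identifier comes back as an Identifier rather than a plain String, which is what lets the VM treat the three string-like kinds differently.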

The Bytecode Section

The bytecode section contains raw instruction bytes preceded by a 4-byte length:

BYTECODE:
  Code Length: u32 (4 bytes)
  Instructions: [u8] (opcodes + operands)

Each instruction is one or more bytes. The opcode is the first byte, followed by zero to three operand bytes depending on the instruction. The VM reads the opcode, determines the operand count, advances the instruction pointer by the correct amount, and dispatches to the handler.
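
The read-opcode, look-up-operand-count, advance loop can be sketched as follows. The four opcodes and their operand widths are invented for this example and do not reflect FLIN's actual instruction set; only the zero-to-three operand range comes from the format description.

```rust
/// Number of operand bytes for each opcode. These opcodes are hypothetical.
fn operand_len(opcode: u8) -> Option<usize> {
    match opcode {
        0x00 => Some(0), // hypothetical HALT
        0x01 => Some(2), // hypothetical LOAD_CONST with a 2-byte constant index
        0x02 => Some(0), // hypothetical ADD
        0x03 => Some(3), // hypothetical JUMP with a 3-byte target
        _ => None,
    }
}

/// Splits raw bytecode into (offset, opcode, operands) triples.
pub fn decode(code: &[u8]) -> Result<Vec<(usize, u8, &[u8])>, String> {
    let mut out = Vec::new();
    let mut ip = 0usize; // instruction pointer
    while ip < code.len() {
        let opcode = code[ip];
        let n = operand_len(opcode)
            .ok_or_else(|| format!("unknown opcode 0x{:02X} at offset {}", opcode, ip))?;
        let operands = code
            .get(ip + 1..ip + 1 + n)
            .ok_or_else(|| format!("truncated instruction at offset {}", ip))?;
        out.push((ip, opcode, operands));
        ip += 1 + n; // skip the opcode plus its operands
    }
    Ok(out)
}
```

A real VM would dispatch to a handler instead of collecting triples, but the pointer arithmetic is the same: the operand count alone decides where the next opcode starts.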

This section is the core of the binary. Everything else -- the header, the constant pool, the debug info -- exists to support the execution of these instructions.

Debug Information

When the debug_info flag is set in the header, a line number table follows the bytecode. This table maps each bytecode offset to the source line that generated it, enabling meaningful error messages at runtime:

DEBUG INFO (if flags.debug_info):
  Line number table (RLE encoded)

The line table uses Run-Length Encoding (RLE) to compress the data. Most bytecode instructions come from the same source line, so rather than storing one line number per instruction, we store pairs of (line_number, count):

fn write_line_table(buf: &mut Vec<u8>, lines: &[u32]) {
    if lines.is_empty() {
        buf.extend_from_slice(&0u32.to_le_bytes());
        return;
    }

    let mut runs: Vec<(u32, u32)> = Vec::new();
    let mut current_line = lines[0];
    let mut count = 1u32;

    for &line in &lines[1..] {
        if line == current_line {
            count += 1;
        } else {
            runs.push((current_line, count));
            current_line = line;
            count = 1;
        }
    }
    runs.push((current_line, count));

    buf.extend_from_slice(&(runs.len() as u32).to_le_bytes());
    for (line, count) in runs {
        buf.extend_from_slice(&line.to_le_bytes());
        buf.extend_from_slice(&count.to_le_bytes());
    }
}

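RLE compression reduces the debug section size by 60-80% compared to a flat array. A function with 20 bytecode instructions on 5 consecutive source lines stores 5 runs instead of 20 entries. Each run takes 8 bytes against 4 bytes per flat entry, so the saving grows with the number of instructions that share a line.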
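
For the loading side, here is a sketch of how the table written by write_line_table could be read back and queried. The function names read_line_table and line_for are hypothetical, and the sketch assumes the lookup index counts entries of the original lines array.

```rust
/// Parses a run count (u32, little-endian) followed by (line, count) pairs.
/// Returns None if the input is shorter than the run count promises.
pub fn read_line_table(bytes: &[u8]) -> Option<Vec<(u32, u32)>> {
    let u32_at = |offset: usize| -> Option<u32> {
        let raw = bytes.get(offset..offset.checked_add(4)?)?;
        Some(u32::from_le_bytes([raw[0], raw[1], raw[2], raw[3]]))
    };
    let run_count = u32_at(0)? as usize;
    let mut runs = Vec::new();
    for i in 0..run_count {
        let base = 4 + i * 8; // 4-byte count header, then 8 bytes per run
        runs.push((u32_at(base)?, u32_at(base + 4)?));
    }
    Some(runs)
}

/// Finds the source line for the entry at `index` by walking the runs.
pub fn line_for(runs: &[(u32, u32)], index: u32) -> Option<u32> {
    let mut covered = 0u64; // number of entries covered by the runs seen so far
    for &(line, count) in runs {
        covered += u64::from(count);
        if u64::from(index) < covered {
            return Some(line);
        }
    }
    None
}
```

The lookup is linear in the number of runs. That is usually acceptable because it only runs when an error needs a line number, not on the hot path of execution.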

Integrity Verification

The last four bytes of every .flinc file contain a CRC32 checksum of all preceding content. When the VM loads a .flinc file, it recalculates the checksum and compares it to the stored value:

impl FlincFile {
    pub fn from_bytes(bytes: &[u8]) -> Result<Self, FlincError> {
        // Verify magic number
        if bytes.len() < 24 || &bytes[0..4] != &FLINC_MAGIC {
            return Err(FlincError::InvalidMagic);
        }

        // Version check
        let major = bytes[4];
        let minor = bytes[5];
        if major != FLINC_VERSION_MAJOR || minor > FLINC_VERSION_MINOR {
            return Err(FlincError::UnsupportedVersion(major, minor));
        }

        // Verify checksum
        let content = &bytes[..bytes.len() - 4];
        let stored_checksum = u32::from_le_bytes(
            bytes[bytes.len() - 4..].try_into().unwrap()
        );
        let calculated = crc32(content);

        if stored_checksum != calculated {
            return Err(FlincError::ChecksumMismatch {
                expected: stored_checksum,
                actual: calculated,
            });
        }

        // Deserialize content...
        Ok(file)
    }
}

This catches file corruption during transfer, partial writes from interrupted builds, and accidental modifications. The error message is specific: "Checksum mismatch: file corrupted" tells the developer exactly what happened and what to do about it (rebuild the file).
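
The crc32 function itself is not shown above. Assuming it is the standard CRC-32 (the IEEE 802.3 variant used by zip and PNG), a bit-by-bit version fits in a few lines. The seal and verify helpers are hypothetical names that illustrate the trailer convention.

```rust
/// Bitwise CRC-32 with the reflected polynomial 0xEDB88320.
/// Assumes the format uses the common IEEE 802.3 variant.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Appends the little-endian checksum of `content` as a 4-byte trailer.
pub fn seal(mut content: Vec<u8>) -> Vec<u8> {
    let sum = crc32(&content);
    content.extend_from_slice(&sum.to_le_bytes());
    content
}

/// Recomputes the checksum over everything except the last four bytes.
pub fn verify(bytes: &[u8]) -> bool {
    if bytes.len() < 4 {
        return false;
    }
    let (content, trailer) = bytes.split_at(bytes.len() - 4);
    let stored = u32::from_le_bytes([trailer[0], trailer[1], trailer[2], trailer[3]]);
    crc32(content) == stored
}
```

A table-driven variant processes a byte per step instead of a bit and is faster, but for files of a few kilobytes the bitwise form is easy to audit and quick enough.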

Roundtrip Testing

The binary format is tested with roundtrip tests that serialize a chunk, deserialize it, and verify that the result is identical to the original:

#[test]
fn test_flinc_file_roundtrip_all_constant_types() {
    let chunk = Chunk {
        constants: vec![
            Constant::Null,
            Constant::Bool(true),
            Constant::Int(42),
            Constant::Float(3.14),
            Constant::String("hello".into()),
            Constant::Identifier("count".into()),
            Constant::EntityName("User".into()),
        ],
        code: vec![0x01, 0x02, 0x03],
        lines: vec![1, 1, 2],
    };

    let file = FlincFile::from_chunk(chunk.clone(), FlincFlags::default());
    let bytes = file.to_bytes();
    let loaded = FlincFile::from_bytes(&bytes).unwrap();

    assert_eq!(loaded.chunk.constants.len(), chunk.constants.len());
    assert_eq!(loaded.chunk.code, chunk.code);
}

Fourteen tests cover the binary format: flag encoding, header roundtrips, all constant types, debug info preservation, checksum verification, invalid magic detection, unsupported version handling, CRC32 known values, RLE encoding, empty chunks, and full counter.flin compilation. These tests ensure that every .flinc file produced by flin build will be correctly loaded by flin run.

Real-World File Sizes

The .flinc format is compact. Here are the sizes for the example applications:

examples/counter.flin:     487 bytes (.flinc)
examples/calculator.flin:  1,204 bytes (.flinc)
examples/todo.flin:        2,891 bytes (.flinc)

A counter application -- the "Hello World" of reactive frameworks -- compiles to under 500 bytes. A full todo application with entities, CRUD operations, and a view layer compiles to under 3 kilobytes. These sizes are orders of magnitude smaller than the equivalent JavaScript bundle, making FLIN binaries trivial to deploy, cache, and transfer.

Loading Performance

Loading a .flinc file is approximately twice as fast as compiling from source. The source compilation path includes lexing, parsing, type checking, and code generation. The binary loading path includes only deserialization and checksum verification.

For development, this difference is negligible -- both paths complete in under 50 milliseconds for typical files. For production, particularly in serverless environments where cold start time is critical, the difference matters. A Lambda function that loads a pre-compiled .flinc file starts serving requests measurably faster than one that compiles from source on every cold start.

Design Decisions

Several deliberate trade-offs shaped the .flinc format:

Little-endian everywhere. All multi-byte integers use little-endian byte order. This matches the native byte order of x86 and ARM (in its common configuration), avoiding byte-swapping on the platforms where FLIN runs.

No compression. The format stores data uncompressed. The files are small enough that compression provides negligible benefit, and decompression adds latency to loading. If needed, standard file compression (gzip, zstd) can be applied externally.

No encryption. The .flinc format is not obfuscated or encrypted. It is a distribution format, not a protection mechanism. Source code protection, if needed, should be handled at the deployment level, not the file format level.

Forward compatibility through flags. The Flags field in the header reserves space for future features. When optional sections such as source maps or profile information are added, existing files without them will continue to load correctly: a cleared flag simply tells the loader that the section is absent.

The .flinc format is intentionally simple. It does not try to be a general-purpose binary container, a sophisticated compression scheme, or a portable executable format. It is exactly what FLIN needs: a fast, reliable way to serialize bytecode and load it back.

This is Part 173 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO designed and built a programming language from scratch.

