#011 -- Session 1: Project Setup and 42 Keywords

On January 1, 2026, we sat down in Abidjan to start a new Rust project. FLIN did not exist yet. There was no lexer, no parser, no type system, no runtime. There was a PRD -- seven documents totalling thousands of lines -- and an empty directory. Forty-five minutes later, we had a Rust project with 14 files, roughly 1,000 lines of code, 42 keywords defined, 60+ token types enumerated, 25 unit tests, and the skeleton of every compiler phase mapped to its own module. Not a prototype. Not a brainstorm. The foundation of a programming language.

This is the story of that first session -- the decisions we made before writing a single line of code, the project structure that would carry FLIN through dozens of sessions, and the 42 keywords that define what FLIN is.

Why Rust for a Programming Language

The choice of implementation language was the first real decision. We could have written FLIN's compiler in TypeScript, Python, Go, or C. We chose Rust, and the reasoning was simple.

A programming language compiler is one of the most demanding programs you can write. It must be fast -- developers will run it thousands of times a day. It must be correct -- a bug in the compiler produces bugs in every program compiled with it. And it must handle complex data structures -- trees, graphs, hash maps, recursive enumerations -- without leaking memory or corrupting state.

Rust gives you all three. The type system catches entire categories of bugs at compile time. The ownership model rules out use-after-free and data races in safe code without a garbage collector. And the performance is within striking distance of C. When you are two people -- one human CEO, one AI CTO -- building a compiler from scratch, the Rust compiler is your most reliable team member. If the code compiles, a whole class of bugs is already gone.

There was a practical reason too. FLIN is designed to compile to bytecode and run on a custom virtual machine. That VM needs to be fast, embeddable, and deployable as a single binary. Rust produces exactly that.

The Project Skeleton

Before writing any compiler logic, we laid out the module structure. Every compiler phase got its own directory. Every directory got a mod.rs file that would grow over the coming sessions. The principle was clear: each module owns one phase of the compilation pipeline, depends only on what it needs, and can be tested in isolation.

flin-official/
  Cargo.toml              # Project manifest
  src/
    main.rs               # CLI entry point
    lib.rs                # Library exports
    lexer/
      mod.rs              # Lexer module
      token.rs            # Token definitions
    parser/mod.rs          # Parser module (skeleton)
    typechecker/mod.rs     # Type checker (skeleton)
    codegen/mod.rs         # Code generator (skeleton)
    vm/mod.rs              # Virtual machine (skeleton)
    database/mod.rs        # FlinDB / ZEROCORE engine (skeleton)
    server/mod.rs          # HTTP server (skeleton)
    error/mod.rs           # Error handling
  examples/
    counter.flin           # 4-line counter example
    todo.flin              # Full todo app example

Seven phase modules, plus a shared error module. Each phase a placeholder waiting for implementation -- except lexer/token.rs, which we would fill completely in this session.
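To make the "one module per phase" principle concrete, here is a minimal sketch of how two of those modules might relate. Inline modules stand in for the separate files, and `Token`, `Ast`, `tokenize`, and `parse` are placeholder names chosen for this illustration, not FLIN's actual API.

```rust
// Hypothetical sketch: inline modules stand in for src/lexer/mod.rs and
// src/parser/mod.rs. All names and bodies here are placeholders.
pub mod lexer {
    /// Placeholder token; the real definition lives in lexer/token.rs.
    #[derive(Debug, Clone, PartialEq)]
    pub struct Token(pub String);

    /// Stand-in for the lexer: splits on whitespace only.
    pub fn tokenize(source: &str) -> Vec<Token> {
        source
            .split_whitespace()
            .map(|word| Token(word.to_string()))
            .collect()
    }
}

pub mod parser {
    // The parser depends on the lexer's output type and nothing else.
    use super::lexer::Token;

    /// Placeholder AST: records only how many tokens were consumed.
    #[derive(Debug, PartialEq)]
    pub struct Ast {
        pub token_count: usize,
    }

    /// Stand-in for the parser: accepts any token slice.
    pub fn parse(tokens: &[Token]) -> Ast {
        Ast { token_count: tokens.len() }
    }
}
```

Because `parse` takes a slice of tokens rather than source text, a test can hand it tokens built by hand, with no lexer involved. That is the kind of isolation the layout is meant to allow.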

The Cargo.toml was minimal but deliberate:

[package]
name = "flin"
version = "0.1.0"
edition = "2021"
description = "The FLIN programming language compiler and runtime"

[dependencies]
# Phase 0: Minimal dependencies
# Additional dependencies added per phase

[dev-dependencies]
# Testing utilities added as needed

No dependency bloat on day one. Each phase would add exactly what it needed and nothing more. This is a pattern we learned from building sh0.dev: declare dependencies at the workspace level, add them to leaf crates only when required. The fewer moving parts on day one, the fewer things that can break before you have written your first real feature.

Defining the Token: The Atom of Compilation

A compiler's first job is to turn a stream of characters into a stream of tokens. Before you can build the machine that does this -- the lexer -- you need to define what a token actually is.

In FLIN, a token is a struct with three fields:

#[derive(Debug, Clone)]
pub struct Token {
    pub kind: TokenKind,
    pub span: Span,
    pub lexeme: String,
}

#[derive(Debug, Clone, Copy)]
pub struct Span {
    pub start: Position,
    pub end: Position,
}

#[derive(Debug, Clone, Copy)]
pub struct Position {
    pub line: u32,
    pub column: u32,
    pub offset: u32,  // Byte offset in source
}

Three levels of information. The kind tells you what the token is -- a keyword, an operator, a literal, an identifier. The span tells you where it lives in the source file, both as line/column pairs (for human-readable error messages) and as byte offsets (for efficient slicing). The lexeme is the raw text that was matched.

This dual-coordinate system -- line/column plus byte offset -- was a deliberate design decision. Error messages need line numbers ("error on line 42, column 7"). But internal operations like string slicing and source mapping are faster with byte offsets. Carrying both costs a few extra bytes per token, but it eliminates an entire class of coordinate-translation bugs downstream.
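A small sketch of how the two coordinate systems might be used side by side. The `Position` and `Span` structs mirror the definitions above (with `PartialEq` added for convenience); the `describe` and `slice` helpers are hypothetical, and the sketch assumes 1-based lines and columns and an exclusive end offset, neither of which is stated in the spec excerpt.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Position {
    pub line: u32,
    pub column: u32,
    pub offset: u32, // Byte offset in source
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start: Position,
    pub end: Position,
}

/// Hypothetical helper: a human-readable location built from line/column.
pub fn describe(span: &Span) -> String {
    format!("line {}, column {}", span.start.line, span.start.column)
}

/// Hypothetical helper: slice the source using byte offsets only.
/// Assumes `end.offset` is exclusive and both offsets sit on char boundaries.
pub fn slice<'a>(source: &'a str, span: &Span) -> &'a str {
    &source[span.start.offset as usize..span.end.offset as usize]
}
```

With six `u32` fields, a `Span` occupies 24 bytes. The error path reads only `line` and `column`; the slicing path reads only `offset`. Neither has to convert between the two.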

The 42 Keywords

FLIN is not a general-purpose language. It is a domain-specific language for building full-stack web applications with built-in database operations, AI-powered intent queries, temporal data, and a reactive view layer. The 42 keywords reflect this scope exactly. Each keyword earned its place by representing a concept that FLIN handles natively, not through a library or framework.

We organized them into six categories:

Data operations (9 keywords): entity, save, delete, where, find, all, first, count, order. These are the verbs of FLIN's built-in persistence layer. You do not import an ORM. You do not write SQL. You write save user and the compiler knows what to do.

Types (8 keywords): text, int, float, bool, time, file, money, semantic. FLIN's type system is small and practical. money is a first-class type because FLIN targets business applications in West Africa where currency handling is not optional. semantic is a type modifier for AI-powered vector search -- semantic text means "this field can be searched by meaning, not just by exact match."

Control flow (5 keywords): if, else, for, in, match. Familiar territory. No while loop -- FLIN uses for with ranges and collections. No switch -- FLIN uses match with pattern matching.

Temporal references (7 keywords): now, today, yesterday, tomorrow, last_week, last_month, last_year. Time is not an afterthought in FLIN. Every entity is automatically versioned. These keywords are first-class temporal references that the compiler understands at the type level.

Intent / AI (4 keywords): ask, search, by, limit. These power FLIN's AI integration. ask "What are the overdue invoices?" is a valid FLIN expression that compiles to a bytecode instruction invoking an LLM. search "similar products" in Product by description limit 5 performs vector similarity search at the language level.

Literals and context (9 keywords): true, false, none, event, params, body, route, asc, desc. The glue that holds everything together -- boolean values, null handling, HTTP context for route handlers, and sort direction for queries.
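The six categories can be written down as plain data to check that they really add up to 42 with no duplicates. The spellings below are copied from the lists above; the constant and function names are our own for this sketch and are not taken from the FLIN codebase.

```rust
// Keyword spellings copied from the six categories in the text.
pub const DATA: [&str; 9] = [
    "entity", "save", "delete", "where", "find", "all", "first", "count", "order",
];
pub const TYPES: [&str; 8] = [
    "text", "int", "float", "bool", "time", "file", "money", "semantic",
];
pub const CONTROL: [&str; 5] = ["if", "else", "for", "in", "match"];
pub const TEMPORAL: [&str; 7] = [
    "now", "today", "yesterday", "tomorrow", "last_week", "last_month", "last_year",
];
pub const INTENT: [&str; 4] = ["ask", "search", "by", "limit"];
pub const CONTEXT: [&str; 9] = [
    "true", "false", "none", "event", "params", "body", "route", "asc", "desc",
];

/// All keywords in one list, in category order.
pub fn all_keywords() -> Vec<&'static str> {
    let mut all = Vec::new();
    all.extend_from_slice(&DATA);
    all.extend_from_slice(&TYPES);
    all.extend_from_slice(&CONTROL);
    all.extend_from_slice(&TEMPORAL);
    all.extend_from_slice(&INTENT);
    all.extend_from_slice(&CONTEXT);
    all
}

/// True if `word` is one of the reserved spellings.
pub fn is_keyword(word: &str) -> bool {
    all_keywords().iter().any(|keyword| *keyword == word)
}
```

The array lengths in the type annotations double as a compile-time check: if a category gains or loses a keyword, the declaration stops compiling until the count is updated.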

The keyword lookup was implemented as a straightforward match expression in the lexer:

fn scan_identifier(&mut self) -> TokenKind {
    while self.peek().map_or(false, |c| c.is_alphanumeric() || c == '_') {
        self.advance();
    }

    let lexeme = self.current_lexeme();

    match lexeme.as_str() {
        "entity" => TokenKind::Keyword(Keyword::Entity),
        "save"   => TokenKind::Keyword(Keyword::Save),
        "delete" => TokenKind::Keyword(Keyword::Delete),
        "where"  => TokenKind::Keyword(Keyword::Where),
        "find"   => TokenKind::Keyword(Keyword::Find),
        "all"    => TokenKind::Keyword(Keyword::All),
        "first"  => TokenKind::Keyword(Keyword::First),
        "count"  => TokenKind::Keyword(Keyword::Count),
        "order"  => TokenKind::Keyword(Keyword::Order),
        // ... 30 more keywords
        "true"   => TokenKind::Keyword(Keyword::True),
        "false"  => TokenKind::Keyword(Keyword::False),
        "none"   => TokenKind::Keyword(Keyword::None),

        _ => TokenKind::Identifier(lexeme),
    }
}

Simple. A match on a string. No hash table, no trie, no perfect hash function. Could we optimize this later with phf (a compile-time perfect hash map crate)? Yes. Did we need to on day one? No. Rust makes no promise about how a string match is lowered, and in the worst case it is a run of string comparisons, one per arm. For 42 arms on short identifiers, that is fast enough for any realistic source file. Premature optimization in a compiler is doubly dangerous because you are writing the tool that optimizes other people's code -- if you cannot resist the urge in your own codebase, you will never ship.
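The excerpt above relies on lexer state that is not shown. As a rough, self-contained illustration of the same idea, here is a sketch with a minimal `Scanner` of our own invention: the `chars`, `start`, and `current` fields and the bodies of `peek`, `advance`, and `current_lexeme` are assumptions, and only six of the 42 keywords are included. The lookup is split into two steps so that the owned lexeme is moved only after the string comparison has finished.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Keyword { Entity, Save, Delete, True, False, None }

#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Keyword(Keyword),
    Identifier(String),
}

/// Hypothetical lexer state: the source as chars plus two cursors.
pub struct Scanner {
    chars: Vec<char>,
    start: usize,   // index where the current lexeme begins
    current: usize, // index of the next unread character
}

impl Scanner {
    pub fn new(source: &str) -> Self {
        Scanner { chars: source.chars().collect(), start: 0, current: 0 }
    }

    fn peek(&self) -> Option<char> {
        self.chars.get(self.current).copied()
    }

    fn advance(&mut self) {
        self.current += 1;
    }

    fn current_lexeme(&self) -> String {
        self.chars[self.start..self.current].iter().collect()
    }

    /// Consume one identifier-shaped word and classify it.
    /// Does not skip whitespace; the caller is expected to do that.
    pub fn scan_identifier(&mut self) -> TokenKind {
        self.start = self.current;
        while self.peek().map_or(false, |c| c.is_alphanumeric() || c == '_') {
            self.advance();
        }
        let lexeme = self.current_lexeme();

        // Step 1: look the spelling up among the keywords (subset only).
        let keyword = match lexeme.as_str() {
            "entity" => Some(Keyword::Entity),
            "save" => Some(Keyword::Save),
            "delete" => Some(Keyword::Delete),
            "true" => Some(Keyword::True),
            "false" => Some(Keyword::False),
            "none" => Some(Keyword::None),
            _ => None,
        };

        // Step 2: anything that is not a keyword is an identifier.
        match keyword {
            Some(k) => TokenKind::Keyword(k),
            None => TokenKind::Identifier(lexeme),
        }
    }
}
```

Note that the whole word is consumed before the lookup, so `saved_items` is an identifier even though it starts with the keyword `save`.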

60+ Token Types

Keywords are only one category of token. FLIN's TokenKind enum has over 60 variants, covering every syntactic element the language can contain:

Literals: Integer(i64), Float(f64), String(String), Bool(bool). Each literal carries its parsed value directly in the token. The lexer does not just identify that 42 is a number -- it parses it into an i64 and stores the result. This means the parser never has to re-parse literals, eliminating a common source of bugs.

Operators: arithmetic (+, -, *, /, %), comparison (==, !=, <, <=, >, >=), logical (&&, ||, !), increment/decrement (++, --), assignment (=), temporal (@), optional (?), and arrow (->). The groups listed here add up to twenty operator tokens.

Delimiters: parentheses, braces, brackets, comma, colon, semicolon, dot. The standard punctuation of structured code.

View tokens: this is where FLIN diverges from most languages. Because FLIN has a built-in HTML-like view layer, the lexer must handle tag syntax: TagOpen (<), TagClose (>), TagSelfClose (/>), TagEnd (</), plus TagName, AttrName, and Text tokens for content within views. The lexer achieves this through a modal design -- it switches between Code, View, and ViewExpression modes as it encounters HTML-like structures.

View control tokens: {if, {else, {else if, {/if}, {for, {/for}. These are the template control flow constructs that live inside the view layer, distinct from the code-mode if and for statements.
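The modal design mentioned for view tokens can be pictured as a small stack of modes. This is only a sketch under our own assumptions: the three mode names come from the text, but the `ModeStack` type, its method names, and the exact moments at which the real lexer switches modes are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Code,
    View,
    ViewExpression,
}

/// Hypothetical mode tracker. A stack is used so that leaving an
/// expression returns to whichever view enclosed it.
pub struct ModeStack {
    stack: Vec<Mode>,
}

impl ModeStack {
    /// Lexing starts in code mode.
    pub fn new() -> Self {
        ModeStack { stack: vec![Mode::Code] }
    }

    pub fn current(&self) -> Mode {
        *self.stack.last().expect("mode stack is never empty")
    }

    /// Assumed trigger: the lexer recognises the start of a view element.
    pub fn enter_view(&mut self) {
        self.stack.push(Mode::View);
    }

    /// Assumed trigger: an opening brace inside a view.
    pub fn enter_expression(&mut self) {
        self.stack.push(Mode::ViewExpression);
    }

    /// Assumed trigger: the matching closing delimiter.
    /// The base Code mode is never popped.
    pub fn leave(&mut self) {
        if self.stack.len() > 1 {
            self.stack.pop();
        }
    }
}
```

For a line such as the counter's button element, the sequence would be Code, then View at the opening tag, then ViewExpression inside the braces, then back to View and finally Code. How the real lexer tells a tag-opening `<` from a less-than operator is not covered here.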

The full TokenKind enum is the contract between the lexer and the parser. Every syntactic form that FLIN supports must have a corresponding token kind. Getting this right on day one meant the parser -- which we would build in sessions 5 and 6 -- could trust that its input was well-formed and complete.
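One part of that contract is that literal tokens carry their parsed values. A minimal sketch of what that could look like for numbers follows. The four variants match the literal kinds named above, but `parse_number` is a hypothetical helper, and the rule "a dot means float" is a simplifying assumption rather than FLIN's actual literal grammar.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum TokenKind {
    Integer(i64),
    Float(f64),
    String(String),
    Bool(bool),
}

/// Hypothetical helper: turn a numeric lexeme into a token that already
/// holds its value. Returns None when the text is not a valid i64 or f64,
/// for example when an integer overflows.
pub fn parse_number(lexeme: &str) -> Option<TokenKind> {
    if lexeme.contains('.') {
        lexeme.parse::<f64>().ok().map(TokenKind::Float)
    } else {
        lexeme.parse::<i64>().ok().map(TokenKind::Integer)
    }
}
```

A parser receiving `TokenKind::Integer(42)` can use the value directly. Malformed or out-of-range numbers are rejected once, in the lexer, where the source position is still at hand for the error message.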

The First Tests

We wrote 25 unit tests in this session, all focused on token definitions and keyword recognition. These were not integration tests -- we had no lexer to test yet. They tested the token types themselves: that every keyword mapped to the correct enum variant, that Display formatting produced readable output, that span calculations were correct.

This might seem premature. Why test data types before you have any logic? Because the token types are the foundation of every subsequent phase. If Keyword::Save accidentally maps to the string "delete" in a from_str implementation, you will not catch that bug until the parser starts producing wrong ASTs, by which time you will have spent hours debugging the wrong layer.

Testing the foundation first is cheap insurance. Twenty-five tests took minutes to write. They would save hours of debugging over the following sessions.
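As an illustration of the kind of test described here, the sketch below checks that every keyword's spelling parses back to the same variant. It uses three keywords only, and the `ALL` constant, the `round_trips` helper, and the test name are our own inventions, not the 25 tests from the session.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Keyword { Save, Delete, Find }

impl Keyword {
    /// Every variant, so tests can iterate over them.
    pub const ALL: [Keyword; 3] = [Keyword::Save, Keyword::Delete, Keyword::Find];
}

impl std::fmt::Display for Keyword {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        let text = match self {
            Keyword::Save => "save",
            Keyword::Delete => "delete",
            Keyword::Find => "find",
        };
        f.write_str(text)
    }
}

impl std::str::FromStr for Keyword {
    type Err = ();

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "save" => Ok(Keyword::Save),
            "delete" => Ok(Keyword::Delete),
            "find" => Ok(Keyword::Find),
            _ => Err(()),
        }
    }
}

/// Round-trip check: Display followed by FromStr must give back the
/// original variant for every keyword.
pub fn round_trips() -> bool {
    Keyword::ALL
        .iter()
        .all(|k| k.to_string().parse::<Keyword>() == Ok(*k))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn every_keyword_round_trips() {
        assert!(round_trips());
    }
}
```

If `Keyword::Save` were accidentally mapped to "delete" in either direction, this check would fail immediately, long before a parser exists to be confused by it.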

Two Example Programs

We also wrote two .flin example files: a 4-line counter and a 55-line todo application. These were not runnable -- we had no compiler yet. They were specifications. Concrete programs that every compiler phase would be tested against.

The counter example was deliberately minimal:

count = 0
<button click={count++}>{count}</button>

Four lines in the source file. The excerpt above contains a variable declaration, an HTML element, an event handler with an expression, and text interpolation, and together they exercise almost every compiler phase: lexical analysis (keywords, operators, view tokens), parsing (variable declaration, view element, expression), type checking (integer inference, event handler typing), and code generation (bytecode for reactivity and DOM updates).

If the counter example compiles and runs correctly, FLIN works. That single program became our north star for the first nine sessions.

What We Did Not Do

Session 1 was as much about what we skipped as what we built.

We did not install Rust on the development machine. The session focused entirely on code design and generation, so nothing written in it, tests included, had been compiled or run yet. Compilation and testing would come in session 2. This was a deliberate workflow choice: define the architecture completely, then verify it compiles. In our CEO-AI CTO workflow, Claude can draft Rust without a compiler feedback loop -- the type system reasoning happens in the model first, and cargo check confirms it afterwards.

We did not add any external dependencies beyond what the standard library provides. No serde for serialization. No clap for CLI argument parsing. No thiserror for error handling macros. Each dependency would be added in the session that needed it, justified by a specific use case.

We did not optimize anything. The keyword lookup is a plain string match with no precomputed table. The token struct carries a heap-allocated String for every lexeme. The position tracking uses u32 for line and column where u16 might suffice. None of this matters when you have zero users and zero programs to compile. Optimization without measurement is guessing, and guessing is not engineering.

The Session in Numbers

Metric	Value
Duration	~45 minutes
Files created	14
Lines of Rust	~1,000
Keywords defined	42
Token types defined	60+
Unit tests	25
External dependencies	0
Compiler phases scaffolded	7
Tasks completed	12 out of 350 total

Twelve tasks out of 350. Three point four percent of the total implementation plan. But the right 3.4% -- the foundation that every subsequent task depends on.

What Made This Possible

Building the token layer of a programming language in 45 minutes is not typical. Two factors made it possible.

First, the PRD was comprehensive. Before session 1, we had already written seven specification documents covering FLIN's syntax, type system, compiler architecture, bytecode format, runtime behavior, and database engine. The token definitions were not invented during the session -- they were transcribed from a specification that had been designed and reviewed over weeks. The session was pure implementation, not design.

Second, the CEO-AI CTO workflow eliminates the gap between decision and implementation. Thales decided the project structure. Claude implemented it. There was no context-switching, no "let me set up my editor," no "I need to read the docs for this crate." The entire session was a continuous flow from architectural decision to working code.

What Came Next

Session 1 gave us the token definitions. But tokens are inert data -- they do not do anything until a lexer produces them from source code. Session 2 would install Rust, verify compilation, and begin building the scanner: the character-by-character state machine that reads FLIN source files and emits the tokens we defined today.

Before we get there, the next article takes a deeper look at the lexer itself -- not just the tokens it produces, but the algorithm that produces them. How does a lexer handle a language that mixes imperative code with HTML-like views? How do you switch between code mode and view mode without losing your place? How do you scan strings, numbers, and multi-character operators in Rust?

The answers are in the scanner. That is the next article.

This is Part 11 of the "How We Built FLIN" series, documenting how a CEO in Abidjan and an AI CTO built a programming language from scratch.

Series Navigation:
- [11] Session 1: Project Setup and 42 Keywords (you are here)
- [12] Building a Lexer From Scratch in Rust
- [13] Pratt Parsing: How FLIN Reads Your Code
- [14] The Abstract Syntax Tree: FLIN's Internal Representation
- [15] Hindley-Milner Type Inference in a Custom Language