A developer pushes their code. Thirty seconds later, it is running in a container. They did not write a Dockerfile. They did not configure a build pipeline. They did not specify which runtime, which package manager, or which port to expose.
This is the promise of a modern PaaS, and making it work requires solving a deceptively hard problem: looking at a directory of source files and figuring out what the project actually is.
sh0's build engine handles this in pure Rust -- no heuristics service, no LLM, no external API call. A priority-based detection algorithm examines the project, identifies the stack, generates a production-grade Dockerfile with multi-stage builds, and creates an optimised build context. All of it happens in milliseconds, and all of it was built on Day Zero.
The 19 Stacks
The Stack enum defines every technology sh0 can detect and build:
```rust
pub enum Stack {
    Dockerfile, // User-provided Dockerfile (highest priority)
    NextJs,
    Nuxt,
    SvelteKit,
    Astro,
    NodeGeneric,
    Bun,
    Python,
    Django,
    FastApi,
    Go,
    Rust,
    JavaMaven,
    JavaGradle,
    Php,
    DotNet,
    Ruby,
    StaticSite,
    Unknown, // Fallback (lowest priority)
}
```

Each variant carries metadata through methods on the enum:
```rust
impl Stack {
    pub fn label(&self) -> &str {
        match self {
            Stack::NextJs => "Next.js",
            Stack::FastApi => "FastAPI",
            Stack::SvelteKit => "SvelteKit",
            // ...
        }
    }

    pub fn default_port(&self) -> u16 {
        match self {
            Stack::NextJs => 3000,
            Stack::Django => 8000,
            Stack::FastApi => 8000,
            Stack::Go => 8080,
            Stack::Rust => 8080,
            Stack::StaticSite => 80,
            // ...
        }
    }

    pub fn needs_dockerfile(&self) -> bool {
        matches!(self, Stack::Php | Stack::DotNet | Stack::Ruby | Stack::Unknown)
    }
}
```
The needs_dockerfile() method is honest: for PHP, .NET, and Ruby, the build configurations are too varied for a generic template to be reliable. sh0 tells the user "I detected your stack, but you need to provide a Dockerfile" rather than generating a bad one. Honesty beats magic.
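A minimal sketch of how a caller might turn that flag into the user-facing message. The Stack subset, build_plan helper, and message wording here are illustrative, not sh0's actual API:

```rust
// Illustrative subset of the Stack enum; the real enum has 19 variants.
#[derive(Debug, PartialEq)]
enum Stack {
    NextJs,
    Php,
    Ruby,
}

impl Stack {
    fn label(&self) -> &str {
        match self {
            Stack::NextJs => "Next.js",
            Stack::Php => "PHP",
            Stack::Ruby => "Ruby",
        }
    }

    // Stacks whose builds are too varied for a generic template.
    fn needs_dockerfile(&self) -> bool {
        matches!(self, Stack::Php | Stack::Ruby)
    }
}

// Hypothetical caller: refuse to generate rather than guess badly.
fn build_plan(stack: &Stack) -> Result<String, String> {
    if stack.needs_dockerfile() {
        Err(format!(
            "Detected {}, but this stack needs a user-provided Dockerfile",
            stack.label()
        ))
    } else {
        Ok(format!("Generating Dockerfile for {}", stack.label()))
    }
}

fn main() {
    assert!(build_plan(&Stack::NextJs).is_ok());
    assert!(build_plan(&Stack::Php).is_err());
}
```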
The Priority Rule: Dockerfile Always Wins
The detection algorithm has one inviolable rule: if a Dockerfile exists in the project root, that is the stack. Period.
```rust
pub fn detect(path: &Path) -> DetectedStack {
    // Rule 1: User-provided Dockerfile always takes priority
    if path.join("Dockerfile").exists() {
        return DetectedStack {
            stack: Stack::Dockerfile,
            ..Default::default()
        };
    }

    // Rule 2: Detect by signature files
    if path.join("package.json").exists() {
        return detect_node(path);
    }
    if path.join("go.mod").exists() {
        return detect_go(path);
    }
    if path.join("Cargo.toml").exists() {
        return detect_rust(path);
    }
    // ... remaining stacks

    DetectedStack {
        stack: Stack::Unknown,
        ..Default::default()
    }
}
```
This is a deliberate design choice. A user who has written their own Dockerfile has made a conscious decision about how their application should be built. Overriding that decision with auto-detection would be presumptuous and, in many cases, wrong. Their Dockerfile might have custom system dependencies, specific compiler flags, or a proprietary base image that no detection algorithm could guess.
The priority ordering after Dockerfile is based on signature file specificity. We check package.json before looking for Python files because a project could have both (e.g., a Python backend with a JavaScript build step for assets). The most specific match wins.
Node.js: The Sub-Detection Challenge
Node.js is by far the most complex detection target. The ecosystem has four major package managers, dozens of frameworks, and several meta-frameworks, each requiring different build commands and runtime configurations.
The detection proceeds in layers:
```rust
fn detect_node(path: &Path) -> DetectedStack {
    let pkg_json = read_package_json(path);

    // Layer 1: Package manager detection
    let package_manager = if path.join("bun.lockb").exists() {
        PackageManager::Bun
    } else if path.join("pnpm-lock.yaml").exists() {
        PackageManager::Pnpm
    } else if path.join("yarn.lock").exists() {
        PackageManager::Yarn
    } else {
        PackageManager::Npm
    };

    // Layer 2: Meta-framework detection (config files)
    if path.join("next.config.js").exists()
        || path.join("next.config.mjs").exists()
        || path.join("next.config.ts").exists()
    {
        return DetectedStack {
            stack: Stack::NextJs,
            package_manager: Some(package_manager),
            framework: Some("next".into()),
            build_command: Some(build_cmd(&package_manager, "build")),
            start_command: Some("node server.js".into()),
            ..Default::default()
        };
    }

    if path.join("svelte.config.js").exists() || path.join("svelte.config.ts").exists() {
        return DetectedStack {
            stack: Stack::SvelteKit,
            package_manager: Some(package_manager),
            framework: Some("sveltekit".into()),
            build_command: Some(build_cmd(&package_manager, "build")),
            start_command: Some("node build/index.js".into()),
            ..Default::default()
        };
    }

    // Layer 3: Framework detection from dependencies
    if let Some(ref pkg) = pkg_json {
        if has_dependency(pkg, "express") {
            return DetectedStack {
                stack: Stack::NodeGeneric,
                framework: Some("express".into()),
                // ...
            };
        }
        // fastify, hono, koa, nestjs...
    }

    // Fallback: generic Node.js
    DetectedStack {
        stack: Stack::NodeGeneric,
        package_manager: Some(package_manager),
        ..Default::default()
    }
}
```
Notice the ordering. Meta-frameworks (Next.js, Nuxt, SvelteKit, Astro) are detected by config files, not by package.json dependencies. This matters because config files are unambiguous -- if next.config.js exists, this is a Next.js project -- while dependency detection is fuzzy (a project might import express as a sub-dependency without being an Express app).
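The config-file rule can be sketched as a pure function over the set of filenames present. The helper below is illustrative (sh0 checks the filesystem directly, and the Nuxt and Astro config names are assumptions based on those frameworks' defaults):

```rust
// Hypothetical helper: map meta-framework config filenames to a framework
// name. Config files are checked first because they are unambiguous;
// dependency scanning only runs when no config file matches.
fn detect_meta_framework(files: &[&str]) -> Option<&'static str> {
    const RULES: &[(&str, &str)] = &[
        ("next.config.js", "next"),
        ("next.config.mjs", "next"),
        ("next.config.ts", "next"),
        ("svelte.config.js", "sveltekit"),
        ("svelte.config.ts", "sveltekit"),
        ("nuxt.config.ts", "nuxt"),
        ("astro.config.mjs", "astro"),
    ];
    RULES
        .iter()
        .find(|(file, _)| files.contains(file))
        .map(|(_, framework)| *framework)
}

fn main() {
    // A project with a Next config is Next.js, even if express also
    // appears somewhere in its dependency tree.
    assert_eq!(
        detect_meta_framework(&["package.json", "next.config.js"]),
        Some("next")
    );
    assert_eq!(detect_meta_framework(&["package.json"]), None);
}
```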
The build_cmd helper generates the correct invocation for each package manager:
```rust
fn build_cmd(pm: &PackageManager, script: &str) -> String {
    match pm {
        PackageManager::Npm => format!("npm run {}", script),
        PackageManager::Yarn => format!("yarn {}", script),
        PackageManager::Pnpm => format!("pnpm run {}", script),
        PackageManager::Bun => format!("bun run {}", script),
    }
}
```

A small function, but getting it wrong means a pnpm project's build fails because the generator emitted yarn-style syntax.
Python: The wsgi.py Trick
Python detection has its own complexity. A Python project could be Django, FastAPI, Flask, or something entirely custom. We use file-based heuristics:
```rust
fn detect_python(path: &Path) -> DetectedStack {
    // Django: look for wsgi.py or manage.py
    if path.join("manage.py").exists() || find_wsgi(path) {
        return DetectedStack {
            stack: Stack::Django,
            start_command: Some("gunicorn config.wsgi:application --bind 0.0.0.0:8000".into()),
            ..Default::default()
        };
    }

    // FastAPI: look for main.py/app.py with a fastapi import
    if detect_fastapi_entry(path) {
        return DetectedStack {
            stack: Stack::FastApi,
            start_command: Some("uvicorn main:app --host 0.0.0.0 --port 8000".into()),
            ..Default::default()
        };
    }

    // Generic Python
    DetectedStack {
        stack: Stack::Python,
        ..Default::default()
    }
}
```
The Django detection via wsgi.py is robust because every Django project has one, and non-Django projects almost never do. It is a better signal than checking requirements.txt for django (which might be a sub-dependency or an unused leftover).
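A hedged sketch of what a find_wsgi helper could look like (the real implementation is not shown in this article; the depth limit and recursion strategy here are assumptions):

```rust
use std::fs;
use std::path::Path;

// Hypothetical sketch of find_wsgi: walk the project shallowly looking for
// a wsgi.py file. Django places it inside the project package, typically
// one level below the root, so a small depth limit is enough.
fn find_wsgi(root: &Path) -> bool {
    fn scan(dir: &Path, depth: usize) -> bool {
        if depth > 2 {
            return false; // don't walk arbitrarily deep trees
        }
        let Ok(entries) = fs::read_dir(dir) else {
            return false;
        };
        for entry in entries.flatten() {
            let path = entry.path();
            if path.is_dir() {
                if scan(&path, depth + 1) {
                    return true;
                }
            } else if path.file_name().is_some_and(|n| n == "wsgi.py") {
                return true;
            }
        }
        false
    }
    scan(root, 0)
}

fn main() {
    // Demo against a throwaway directory in the system temp dir.
    let root = std::env::temp_dir().join("find_wsgi_demo");
    let pkg = root.join("config");
    fs::create_dir_all(&pkg).unwrap();
    fs::write(pkg.join("wsgi.py"), "application = None\n").unwrap();
    assert!(find_wsgi(&root));
    fs::remove_dir_all(&root).unwrap();
}
```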
Dockerfile Generation: 12 Production-Grade Templates
Once the stack is detected, the build engine generates a Dockerfile optimised for that stack. The templates are not simple "FROM node, COPY, RUN" scripts. They use multi-stage builds, Alpine or slim base images, non-root users, and build cache optimisation.
Here is the Node.js generic template (simplified):
```rust
fn dockerfile_node(detected: &DetectedStack) -> String {
    let pm = detected.package_manager.as_ref().unwrap_or(&PackageManager::Npm);
    let install_cmd = match pm {
        PackageManager::Npm => "npm ci --only=production",
        PackageManager::Yarn => "yarn install --frozen-lockfile --production",
        PackageManager::Pnpm => "pnpm install --frozen-lockfile --prod",
        PackageManager::Bun => "bun install --production",
    };

    format!(
        r#"
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN {install_cmd}
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app
COPY --from=builder /app .
USER app
EXPOSE {port}
CMD ["node", "dist/index.js"]
"#,
        install_cmd = install_cmd,
        port = detected.stack.default_port()
    )
}
```
Key details:
- Multi-stage build: the builder stage installs dependencies and builds; the final stage copies only what is needed. This keeps the production image small.
- npm ci over npm install: ci installs exactly what the lockfile specifies, ensuring reproducible builds; install might update dependencies.
- --frozen-lockfile: the same principle for yarn and pnpm.
- Non-root user: the app user prevents the container process from running as root, a basic security measure.
- Alpine base: roughly 50MB vs 900MB for the full Node image.
The Next.js template is more sophisticated, leveraging Next's standalone output mode:
```rust
fn dockerfile_nextjs(_detected: &DetectedStack) -> String {
    r#"
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
COPY --from=builder /app/public ./public
USER app
EXPOSE 3000
CMD ["node", "server.js"]
"#
    .to_string()
}
```
Three stages: dependency installation, build, and production. The final image contains only the standalone server, static assets, and public files -- no node_modules, no source code, no build tooling.
For Go, the template goes further, using scratch as the final base image:
```rust
fn dockerfile_go(_detected: &DetectedStack) -> String {
    r#"
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /app/server .

FROM scratch
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
"#
    .to_string()
}
```
The final image literally contains nothing except the compiled binary. No shell, no OS utilities, no attack surface. The -ldflags="-s -w" strips debug symbols and DWARF information, further reducing binary size.
The .dockerignore: Preventing Build Context Bloat
A common deployment mistake: the Docker build context includes node_modules, .git, or other large directories, making builds slow and images unnecessarily large. sh0 generates a .dockerignore file tailored to the detected stack:
```rust
fn generate_dockerignore(stack: &Stack) -> String {
    let mut patterns = vec![
        ".git",
        ".gitignore",
        "*.md",
        "LICENSE",
        ".env*",
        ".DS_Store",
        "Thumbs.db",
    ];

    match stack {
        Stack::NodeGeneric | Stack::NextJs | Stack::SvelteKit | Stack::Nuxt | Stack::Astro => {
            patterns.extend_from_slice(&[
                "node_modules",
                ".next",
                ".nuxt",
                "dist",
                "coverage",
                ".cache",
            ]);
        }
        Stack::Python | Stack::Django | Stack::FastApi => {
            patterns.extend_from_slice(&[
                "__pycache__",
                "*.pyc",
                ".venv",
                "venv",
                ".pytest_cache",
            ]);
        }
        Stack::Go => {
            patterns.push("vendor");
        }
        Stack::Rust => {
            patterns.push("target");
        }
        _ => {}
    }

    patterns.join("\n")
}
```
The .env* pattern is particularly important. Environment files often contain secrets -- database passwords, API keys, cloud credentials. Including them in the Docker build context means they end up in the image layer history, readable by anyone with access to the image. sh0 excludes them by default.
Build Context: In-Memory Tar Archives
Docker builds require a tar archive as the build context. sh0 creates this archive in memory, using the Rust tar crate:
```rust
pub fn create_build_context(
    path: &Path,
    dockerfile_content: &str,
    ignore_patterns: &[&str],
) -> Result<Vec<u8>, BuilderError> {
    let mut archive = tar::Builder::new(Vec::new());

    // Inject generated Dockerfile
    let dockerfile_bytes = dockerfile_content.as_bytes();
    let mut header = tar::Header::new_gnu();
    header.set_path("Dockerfile")?;
    header.set_size(dockerfile_bytes.len() as u64);
    header.set_mode(0o644);
    header.set_cksum();
    archive.append(&header, dockerfile_bytes)?;

    // Walk project directory, respecting .dockerignore
    for entry in walkdir::WalkDir::new(path) {
        let entry = entry?;
        let rel_path = entry.path().strip_prefix(path)?;

        if should_ignore(rel_path, ignore_patterns) {
            continue;
        }

        // ... append file to archive
    }

    Ok(archive.into_inner()?)
}
```
The generated Dockerfile is injected directly into the tar archive. The project never needs a Dockerfile on disk. This is cleaner than writing a temporary file and hoping cleanup happens correctly.
The .git/ directory is always excluded, regardless of the .dockerignore patterns. A .git directory can be hundreds of megabytes and has no place in a build context.
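A hedged sketch of what the should_ignore check could look like. The pattern semantics here (first path component only, simple leading/trailing `*` wildcards, unconditional .git exclusion) are assumptions; a real matcher would need fuller .dockerignore semantics:

```rust
use std::path::Path;

// Hypothetical sketch of should_ignore: match the first path component
// against exact names and simple "*suffix" / "prefix*" patterns, and
// always exclude .git regardless of the generated patterns.
fn should_ignore(rel_path: &Path, patterns: &[&str]) -> bool {
    let first = match rel_path.components().next() {
        Some(c) => c.as_os_str().to_string_lossy().into_owned(),
        None => return false,
    };
    // .git is always excluded, no matter what patterns were generated.
    if first == ".git" {
        return true;
    }
    patterns.iter().any(|pat| {
        if let Some(suffix) = pat.strip_prefix('*') {
            first.ends_with(suffix)
        } else if let Some(prefix) = pat.strip_suffix('*') {
            first.starts_with(prefix)
        } else {
            first == *pat
        }
    })
}

fn main() {
    let patterns = ["node_modules", "*.md", ".env*"];
    assert!(should_ignore(Path::new("node_modules/react/index.js"), &patterns));
    assert!(should_ignore(Path::new("README.md"), &patterns));
    assert!(should_ignore(Path::new(".env.local"), &patterns));
    assert!(should_ignore(Path::new(".git/HEAD"), &[]));
    assert!(!should_ignore(Path::new("src/main.rs"), &patterns));
}
```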
The Orchestrator: detect, generate, build
The Builder struct ties everything together in a three-step pipeline:
```rust
impl Builder {
    pub async fn build(&self, opts: BuildOpts) -> Result<BuildOutput, BuilderError> {
        let start = Instant::now();

        // Step 1: Detect stack
        let detected = detect(&opts.source_path);

        // Step 2: Generate Dockerfile (or use existing)
        let dockerfile = match detected.stack {
            Stack::Dockerfile => {
                std::fs::read_to_string(opts.source_path.join("Dockerfile"))?
            }
            stack if stack.needs_dockerfile() => {
                return Err(BuilderError::NeedsDockerfile(stack.label().into()));
            }
            _ => generate_dockerfile(&detected),
        };

        // Step 3: Create context and build
        let ignore = generate_dockerignore(&detected.stack);
        let context = create_build_context(&opts.source_path, &dockerfile, &ignore)?;
        let image_id = self.docker.build_image(&context, &opts.image_tag).await?;

        Ok(BuildOutput {
            image_id,
            stack: detected,
            duration: start.elapsed(),
            // ...
        })
    }
}
```
The pipeline is linear and transparent. Each step has a clear input and output. If any step fails, the error propagates with full context: "stack detection failed because the directory is empty," "Dockerfile generation is not supported for PHP -- provide your own," "Docker build failed with exit code 1 and these logs."
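One way to carry that context is an error enum whose variants each name the failed step. This is a sketch: beyond the NeedsDockerfile variant shown in the pipeline above, the variant names and messages here are assumptions, not sh0's actual error type:

```rust
use std::fmt;

// Sketch of a step-aware error type. NeedsDockerfile appears in the
// pipeline above; the other variants are illustrative.
#[derive(Debug)]
enum BuilderError {
    EmptyProject,
    NeedsDockerfile(String),
    DockerBuildFailed { exit_code: i32 },
}

impl fmt::Display for BuilderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BuilderError::EmptyProject => {
                write!(f, "stack detection failed: the directory is empty")
            }
            BuilderError::NeedsDockerfile(stack) => write!(
                f,
                "Dockerfile generation is not supported for {stack} -- provide your own"
            ),
            BuilderError::DockerBuildFailed { exit_code } => {
                write!(f, "Docker build failed with exit code {exit_code}")
            }
        }
    }
}

fn main() {
    let err = BuilderError::NeedsDockerfile("PHP".into());
    assert_eq!(
        err.to_string(),
        "Dockerfile generation is not supported for PHP -- provide your own"
    );
}
```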
Testing: 23 Unit Tests, No Docker Required
The detection logic is thoroughly tested:
```rust
#[test]
fn test_detect_nextjs() {
    let dir = tempdir().unwrap();
    File::create(dir.path().join("package.json")).unwrap();
    File::create(dir.path().join("next.config.js")).unwrap();

    let detected = detect(dir.path());
    assert_eq!(detected.stack, Stack::NextJs);
}

#[test]
fn test_dockerfile_takes_priority() {
    let dir = tempdir().unwrap();
    File::create(dir.path().join("package.json")).unwrap();
    File::create(dir.path().join("next.config.js")).unwrap();
    File::create(dir.path().join("Dockerfile")).unwrap();

    let detected = detect(dir.path());
    assert_eq!(detected.stack, Stack::Dockerfile); // Not NextJs
}

#[test]
fn test_detect_yarn() {
    let dir = tempdir().unwrap();
    File::create(dir.path().join("package.json")).unwrap();
    File::create(dir.path().join("yarn.lock")).unwrap();

    let detected = detect(dir.path());
    assert_eq!(detected.package_manager, Some(PackageManager::Yarn));
}
```
Each test creates a temporary directory with the minimum files needed to trigger a specific detection path. No real projects, no fixtures, no Docker. The tests run in parallel in under a second.
Why This Matters
Stack detection is the feature that turns a deployment tool into a platform. Without it, every user has to write a Dockerfile, configure a build command, and specify a port. With it, they push code and it works.
Getting it wrong is worse than not having it at all. A misdetected stack means a failed build, a confusing error message, and a user who loses trust. That is why the detection is conservative: specific signals over vague ones, config files over dependency scanning, and "you need to provide a Dockerfile" over a bad guess.
The build engine is also where sh0's health check system plugs in. After detecting the stack but before building the image, sh0 runs 34 static analysis rules to catch deployment mistakes. That is the subject of the next article.
---
This is Part 3 of the "How We Built sh0.dev" series.
Series Navigation:

- [1] Day Zero: 10 Rust Crates in 24 Hours
- [2] Writing a Docker Engine Client from Scratch in Rust
- [3] Auto-Detecting 19 Tech Stacks from Source Code (you are here)
- [4] 34 Rules to Catch Deployment Mistakes Before They Happen