A developer pushes their code. Thirty seconds later, it is running in a container. They did not write a Dockerfile. They did not configure a build pipeline. They did not specify which runtime, which package manager, or which port to expose.
This is the promise of a modern PaaS, and making it work requires solving a deceptively hard problem: looking at a directory of source files and figuring out what the project actually is.
sh0's build engine handles this in pure Rust -- no heuristics service, no LLM, no external API call. A priority-based detection algorithm examines the project, identifies the stack, generates a production-grade Dockerfile with multi-stage builds, and creates an optimised build context. All of it happens in milliseconds, and all of it was built on Day Zero.
The 19 Stacks
The Stack enum defines every technology sh0 can detect and build:
```rust
pub enum Stack {
    Dockerfile, // User-provided Dockerfile (highest priority)
    NextJs,
    Nuxt,
    SvelteKit,
    Astro,
    NodeGeneric,
    Bun,
    Python,
    Django,
    FastApi,
    Go,
    Rust,
    JavaMaven,
    JavaGradle,
    Php,
    DotNet,
    Ruby,
    StaticSite,
    Unknown, // Fallback (lowest priority)
}
```

Each variant carries metadata through methods on the enum:
```rust
impl Stack {
    pub fn label(&self) -> &str {
        match self {
            Stack::NextJs => "Next.js",
            Stack::FastApi => "FastAPI",
            Stack::SvelteKit => "SvelteKit",
            // ...
        }
    }

    pub fn default_port(&self) -> u16 {
        match self {
            Stack::NextJs => 3000,
            Stack::Django => 8000,
            Stack::FastApi => 8000,
            Stack::Go => 8080,
            Stack::Rust => 8080,
            Stack::StaticSite => 80,
            // ...
        }
    }

    pub fn needs_dockerfile(&self) -> bool {
        matches!(self, Stack::Php | Stack::DotNet | Stack::Ruby | Stack::Unknown)
    }
}
```
The needs_dockerfile() method is honest: for PHP, .NET, and Ruby, the build configurations are too varied for a generic template to be reliable. sh0 tells the user "I detected your stack, but you need to provide a Dockerfile" rather than generating a bad one. Honesty beats magic.
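A minimal sketch of how a caller might turn that flag into the user-facing message. The Stack subset, build_plan helper, and message wording here are illustrative, not sh0's actual API:

```rust
// Illustrative subset of the Stack enum; the real enum has 19 variants.
#[derive(Debug, PartialEq)]
enum Stack {
    NextJs,
    Php,
    Ruby,
}

impl Stack {
    fn label(&self) -> &str {
        match self {
            Stack::NextJs => "Next.js",
            Stack::Php => "PHP",
            Stack::Ruby => "Ruby",
        }
    }

    // Stacks whose builds are too varied for a generic template.
    fn needs_dockerfile(&self) -> bool {
        matches!(self, Stack::Php | Stack::Ruby)
    }
}

// Hypothetical caller: refuse to generate rather than guess badly.
fn build_plan(stack: &Stack) -> Result<String, String> {
    if stack.needs_dockerfile() {
        Err(format!(
            "Detected {}, but this stack needs a user-provided Dockerfile",
            stack.label()
        ))
    } else {
        Ok(format!("Generating Dockerfile for {}", stack.label()))
    }
}

fn main() {
    assert!(build_plan(&Stack::NextJs).is_ok());
    assert!(build_plan(&Stack::Php).is_err());
}
```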
The Priority Rule: Dockerfile Always Wins
The detection algorithm has one inviolable rule: if a Dockerfile exists in the project root, that is the stack. Period.
```rust
pub fn detect(path: &Path) -> DetectedStack {
    // Rule 1: User-provided Dockerfile always takes priority
    if path.join("Dockerfile").exists() {
        return DetectedStack {
            stack: Stack::Dockerfile,
            ..Default::default()
        };
    }

    // Rule 2: Detect by signature files
    if path.join("package.json").exists() {
        return detect_node(path);
    }
    if path.join("go.mod").exists() {
        return detect_go(path);
    }
    if path.join("Cargo.toml").exists() {
        return detect_rust(path);
    }
    // ... remaining stacks

    DetectedStack {
        stack: Stack::Unknown,
        ..Default::default()
    }
}
```
This is a deliberate design choice. A user who has written their own Dockerfile has made a conscious decision about how their application should be built. Overriding that decision with auto-detection would be presumptuous and, in many cases, wrong. Their Dockerfile might have custom system dependencies, specific compiler flags, or a proprietary base image that no detection algorithm could guess.
The priority ordering after Dockerfile is based on signature file specificity. We check package.json before looking for Python files because a project could have both (e.g., a Python backend with a JavaScript build step for assets). The most specific match wins.
Node.js: The Sub-Detection Challenge
Node.js is by far the most complex detection target. The ecosystem has four major package managers, dozens of frameworks, and several meta-frameworks, each requiring different build commands and runtime configurations.
The detection proceeds in layers:
```rust
fn detect_node(path: &Path) -> DetectedStack {
    let pkg_json = read_package_json(path);

    // Layer 1: Package manager detection
    let package_manager = if path.join("bun.lockb").exists() {
        PackageManager::Bun
    } else if path.join("pnpm-lock.yaml").exists() {
        PackageManager::Pnpm
    } else if path.join("yarn.lock").exists() {
        PackageManager::Yarn
    } else {
        PackageManager::Npm
    };

    // Layer 2: Meta-framework detection (config files)
    if path.join("next.config.js").exists()
        || path.join("next.config.mjs").exists()
        || path.join("next.config.ts").exists()
    {
        return DetectedStack {
            stack: Stack::NextJs,
            package_manager: Some(package_manager),
            framework: Some("next".into()),
            build_command: Some(build_cmd(&package_manager, "build")),
            start_command: Some("node server.js".into()),
            ..Default::default()
        };
    }

    if path.join("svelte.config.js").exists() || path.join("svelte.config.ts").exists() {
        return DetectedStack {
            stack: Stack::SvelteKit,
            package_manager: Some(package_manager),
            framework: Some("sveltekit".into()),
            build_command: Some(build_cmd(&package_manager, "build")),
            start_command: Some("node build/index.js".into()),
            ..Default::default()
        };
    }

    // Layer 3: Framework detection from dependencies
    if let Some(ref pkg) = pkg_json {
        if has_dependency(pkg, "express") {
            return DetectedStack {
                stack: Stack::NodeGeneric,
                framework: Some("express".into()),
                // ...
            };
        }
        // fastify, hono, koa, nestjs...
    }

    // Fallback: generic Node.js
    DetectedStack {
        stack: Stack::NodeGeneric,
        package_manager: Some(package_manager),
        ..Default::default()
    }
}
```
Notice the ordering. Meta-frameworks (Next.js, Nuxt, SvelteKit, Astro) are detected by config files, not by package.json dependencies. This matters because config files are unambiguous -- if next.config.js exists, this is a Next.js project -- while dependency detection is fuzzy (a project might import express as a sub-dependency without being an Express app).
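The config-file rule can be sketched as a pure function over the set of filenames present. The helper below is illustrative (sh0 checks the filesystem directly, and the Nuxt and Astro config names are assumptions based on those frameworks' defaults):

```rust
// Hypothetical helper: map meta-framework config filenames to a framework
// name. Config files are checked first because they are unambiguous;
// dependency scanning only runs when no config file matches.
fn detect_meta_framework(files: &[&str]) -> Option<&'static str> {
    const RULES: &[(&str, &str)] = &[
        ("next.config.js", "next"),
        ("next.config.mjs", "next"),
        ("next.config.ts", "next"),
        ("svelte.config.js", "sveltekit"),
        ("svelte.config.ts", "sveltekit"),
        ("nuxt.config.ts", "nuxt"),
        ("astro.config.mjs", "astro"),
    ];
    RULES
        .iter()
        .find(|(file, _)| files.contains(file))
        .map(|(_, framework)| *framework)
}

fn main() {
    // A project with a Next config is Next.js, even if express also
    // appears somewhere in its dependency tree.
    assert_eq!(
        detect_meta_framework(&["package.json", "next.config.js"]),
        Some("next")
    );
    assert_eq!(detect_meta_framework(&["package.json"]), None);
}
```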
The build_cmd helper generates the correct invocation for each package manager:
```rust
fn build_cmd(pm: &PackageManager, script: &str) -> String {
    match pm {
        PackageManager::Npm => format!("npm run {}", script),
        PackageManager::Yarn => format!("yarn {}", script),
        PackageManager::Pnpm => format!("pnpm run {}", script),
        PackageManager::Bun => format!("bun run {}", script),
    }
}
```

A small function, but getting it wrong means a pnpm project's build fails because the generator emitted yarn-style syntax.
Python: The wsgi.py Trick
Python detection has its own complexity. A Python project could be Django, FastAPI, Flask, or something entirely custom. We use file-based heuristics:
```rust
fn detect_python(path: &Path) -> DetectedStack {
    // Django: look for wsgi.py or manage.py
    if path.join("manage.py").exists() || find_wsgi(path) {
        return DetectedStack {
            stack: Stack::Django,
            start_command: Some("gunicorn config.wsgi:application --bind 0.0.0.0:8000".into()),
            ..Default::default()
        };
    }

    // FastAPI: look for main.py/app.py with a fastapi import
    if detect_fastapi_entry(path) {
        return DetectedStack {
            stack: Stack::FastApi,
            start_command: Some("uvicorn main:app --host 0.0.0.0 --port 8000".into()),
            ..Default::default()
        };
    }

    // Generic Python
    DetectedStack {
        stack: Stack::Python,
        ..Default::default()
    }
}
```
The Django detection via wsgi.py is robust because every Django project has one, and non-Django projects almost never do. It is a better signal than checking requirements.txt for django (which might be a sub-dependency or an unused leftover).
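A hedged sketch of what a find_wsgi helper could look like (the real implementation is not shown in this article; the depth limit and recursion strategy here are assumptions):

```rust
use std::fs;
use std::path::Path;

// Hypothetical sketch of find_wsgi: walk the project shallowly looking for
// a wsgi.py file. Django places it inside the project package, typically
// one level below the root, so a small depth limit is enough.
fn find_wsgi(root: &Path) -> bool {
    fn scan(dir: &Path, depth: usize) -> bool {
        if depth > 2 {
            return false; // don't walk arbitrarily deep trees
        }
        let Ok(entries) = fs::read_dir(dir) else {
            return false;
        };
        for entry in entries.flatten() {
            let path = entry.path();
            if path.is_dir() {
                if scan(&path, depth + 1) {
                    return true;
                }
            } else if path.file_name().is_some_and(|n| n == "wsgi.py") {
                return true;
            }
        }
        false
    }
    scan(root, 0)
}

fn main() {
    // Demo against a throwaway directory in the system temp dir.
    let root = std::env::temp_dir().join("find_wsgi_demo");
    let pkg = root.join("config");
    fs::create_dir_all(&pkg).unwrap();
    fs::write(pkg.join("wsgi.py"), "application = None\n").unwrap();
    assert!(find_wsgi(&root));
    fs::remove_dir_all(&root).unwrap();
}
```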
Dockerfile Generation: 12 Production-Grade Templates
Once the stack is detected, the build engine generates a Dockerfile optimised for that stack. The templates are not simple "FROM node, COPY, RUN" scripts. They use multi-stage builds, Alpine or slim base images, non-root users, and build cache optimisation.
Here is the Node.js generic template (simplified):
```rust
fn dockerfile_node(detected: &DetectedStack) -> String {
    let pm = detected.package_manager.as_ref().unwrap_or(&PackageManager::Npm);
    let install_cmd = match pm {
        PackageManager::Npm => "npm ci --only=production",
        PackageManager::Yarn => "yarn install --frozen-lockfile --production",
        PackageManager::Pnpm => "pnpm install --frozen-lockfile --prod",
        PackageManager::Bun => "bun install --production",
    };

    format!(
        r#"
FROM node:20-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN {install_cmd}
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app
COPY --from=builder /app .
USER app
EXPOSE {port}
CMD ["node", "dist/index.js"]
"#,
        install_cmd = install_cmd,
        port = detected.stack.default_port()
    )
}
```
Key details:
- Multi-stage build: the builder stage installs dependencies and builds; the final stage copies only what is needed. This keeps the production image small.
- npm ci over npm install: ci installs exactly what the lockfile specifies, ensuring reproducible builds; install might update dependencies.
- --frozen-lockfile: the same principle for yarn and pnpm.
- Non-root user: the app user prevents the container process from running as root, a basic security measure.
- Alpine base: roughly 50MB vs 900MB for the full Node image.
The Next.js template is more sophisticated, leveraging Next's standalone output mode:
```rust
fn dockerfile_nextjs(_detected: &DetectedStack) -> String {
    r#"
FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN npm ci

FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY . .
RUN npm run build

FROM node:20-alpine
WORKDIR /app
RUN addgroup -S app && adduser -S app -G app
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
COPY --from=builder /app/public ./public
USER app
EXPOSE 3000
CMD ["node", "server.js"]
"#
    .to_string()
}
```
Three stages: dependency installation, build, and production. The final image contains only the standalone server, static assets, and public files -- no node_modules, no source code, no build tooling.
For Go, the template goes further, using scratch as the final base image:
```rust
fn dockerfile_go(_detected: &DetectedStack) -> String {
    r#"
FROM golang:1.22-alpine AS builder
WORKDIR /app
COPY go.mod go.sum ./
RUN go mod download
COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -ldflags="-s -w" -o /app/server .

FROM scratch
COPY --from=builder /app/server /server
EXPOSE 8080
ENTRYPOINT ["/server"]
"#
    .to_string()
}
```
The final image literally contains nothing except the compiled binary. No shell, no OS utilities, no attack surface. The -ldflags="-s -w" strips debug symbols and DWARF information, further reducing binary size.
The .dockerignore: Preventing Build Context Bloat
A common deployment mistake: the Docker build context includes node_modules, .git, or other large directories, making builds slow and images unnecessarily large. sh0 generates a .dockerignore file tailored to the detected stack:
```rust
fn generate_dockerignore(stack: &Stack) -> String {
    let mut patterns = vec![
        ".git",
        ".gitignore",
        "*.md",
        "LICENSE",
        ".env*",
        ".DS_Store",
        "Thumbs.db",
    ];

    match stack {
        Stack::NodeGeneric | Stack::NextJs | Stack::SvelteKit | Stack::Nuxt | Stack::Astro => {
            patterns.extend_from_slice(&[
                "node_modules",
                ".next",
                ".nuxt",
                "dist",
                "coverage",
                ".cache",
            ]);
        }
        Stack::Python | Stack::Django | Stack::FastApi => {
            patterns.extend_from_slice(&[
                "__pycache__",
                "*.pyc",
                ".venv",
                "venv",
                ".pytest_cache",
            ]);
        }
        Stack::Go => {
            patterns.push("vendor");
        }
        Stack::Rust => {
            patterns.push("target");
        }
        _ => {}
    }

    patterns.join("\n")
}
```
The .env* pattern is particularly important. Environment files often contain secrets -- database passwords, API keys, cloud credentials. Including them in the Docker build context means they end up in the image layer history, readable by anyone with access to the image. sh0 excludes them by default.
Build Context: In-Memory Tar Archives
Docker builds require a tar archive as the build context. sh0 creates this archive in memory, using the Rust tar crate:
```rust
pub fn create_build_context(
    path: &Path,
    dockerfile_content: &str,
    ignore_patterns: &[&str],
) -> Result<Vec<u8>, BuilderError> {
    let mut archive = tar::Builder::new(Vec::new());

    // Inject generated Dockerfile
    let dockerfile_bytes = dockerfile_content.as_bytes();
    let mut header = tar::Header::new_gnu();
    header.set_path("Dockerfile")?;
    header.set_size(dockerfile_bytes.len() as u64);
    header.set_mode(0o644);
    header.set_cksum();
    archive.append(&header, dockerfile_bytes)?;

    // Walk project directory, respecting .dockerignore
    for entry in walkdir::WalkDir::new(path) {
        let entry = entry?;
        let rel_path = entry.path().strip_prefix(path)?;

        if should_ignore(rel_path, ignore_patterns) {
            continue;
        }

        // ... append file to archive
    }

    Ok(archive.into_inner()?)
}
```
The generated Dockerfile is injected directly into the tar archive. The project never needs a Dockerfile on disk. This is cleaner than writing a temporary file and hoping cleanup happens correctly.
The .git/ directory is always excluded, regardless of the .dockerignore patterns. A .git directory can be hundreds of megabytes and has no place in a build context.
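A hedged sketch of what the should_ignore check could look like. The pattern semantics here (first path component only, simple leading/trailing `*` wildcards, unconditional .git exclusion) are assumptions; a real matcher would need fuller .dockerignore semantics:

```rust
use std::path::Path;

// Hypothetical sketch of should_ignore: match the first path component
// against exact names and simple "*suffix" / "prefix*" patterns, and
// always exclude .git regardless of the generated patterns.
fn should_ignore(rel_path: &Path, patterns: &[&str]) -> bool {
    let first = match rel_path.components().next() {
        Some(c) => c.as_os_str().to_string_lossy().into_owned(),
        None => return false,
    };
    // .git is always excluded, no matter what patterns were generated.
    if first == ".git" {
        return true;
    }
    patterns.iter().any(|pat| {
        if let Some(suffix) = pat.strip_prefix('*') {
            first.ends_with(suffix)
        } else if let Some(prefix) = pat.strip_suffix('*') {
            first.starts_with(prefix)
        } else {
            first == *pat
        }
    })
}

fn main() {
    let patterns = ["node_modules", "*.md", ".env*"];
    assert!(should_ignore(Path::new("node_modules/react/index.js"), &patterns));
    assert!(should_ignore(Path::new("README.md"), &patterns));
    assert!(should_ignore(Path::new(".env.local"), &patterns));
    assert!(should_ignore(Path::new(".git/HEAD"), &[]));
    assert!(!should_ignore(Path::new("src/main.rs"), &patterns));
}
```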
The Orchestrator: detect, generate, build
The Builder struct ties everything together in a three-step pipeline:
```rust
impl Builder {
    pub async fn build(&self, opts: BuildOpts) -> Result<BuildOutput, BuilderError> {
        let start = Instant::now();

        // Step 1: Detect stack
        let detected = detect(&opts.source_path);

        // Step 2: Generate Dockerfile (or use existing)
        let dockerfile = match detected.stack {
            Stack::Dockerfile => {
                std::fs::read_to_string(opts.source_path.join("Dockerfile"))?
            }
            stack if stack.needs_dockerfile() => {
                return Err(BuilderError::NeedsDockerfile(stack.label().into()));
            }
            _ => generate_dockerfile(&detected),
        };

        // Step 3: Create context and build
        let ignore = generate_dockerignore(&detected.stack);
        let context = create_build_context(&opts.source_path, &dockerfile, &ignore)?;
        let image_id = self.docker.build_image(&context, &opts.image_tag).await?;

        Ok(BuildOutput {
            image_id,
            stack: detected,
            duration: start.elapsed(),
            // ...
        })
    }
}
```
The pipeline is linear and transparent. Each step has a clear input and output. If any step fails, the error propagates with full context: "stack detection failed because the directory is empty," "Dockerfile generation is not supported for PHP -- provide your own," "Docker build failed with exit code 1 and these logs."
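One way to carry that context is an error enum whose variants each name the failed step. This is a sketch: beyond the NeedsDockerfile variant shown in the pipeline above, the variant names and messages here are assumptions, not sh0's actual error type:

```rust
use std::fmt;

// Sketch of a step-aware error type. NeedsDockerfile appears in the
// pipeline above; the other variants are illustrative.
#[derive(Debug)]
enum BuilderError {
    EmptyProject,
    NeedsDockerfile(String),
    DockerBuildFailed { exit_code: i32 },
}

impl fmt::Display for BuilderError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BuilderError::EmptyProject => {
                write!(f, "stack detection failed: the directory is empty")
            }
            BuilderError::NeedsDockerfile(stack) => write!(
                f,
                "Dockerfile generation is not supported for {stack} -- provide your own"
            ),
            BuilderError::DockerBuildFailed { exit_code } => {
                write!(f, "Docker build failed with exit code {exit_code}")
            }
        }
    }
}

fn main() {
    let err = BuilderError::NeedsDockerfile("PHP".into());
    assert_eq!(
        err.to_string(),
        "Dockerfile generation is not supported for PHP -- provide your own"
    );
}
```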
Testing: 23 Unit Tests, No Docker Required
The detection logic is thoroughly tested:
```rust
#[test]
fn test_detect_nextjs() {
    let dir = tempdir().unwrap();
    File::create(dir.path().join("package.json")).unwrap();
    File::create(dir.path().join("next.config.js")).unwrap();

    let detected = detect(dir.path());
    assert_eq!(detected.stack, Stack::NextJs);
}

#[test]
fn test_dockerfile_takes_priority() {
    let dir = tempdir().unwrap();
    File::create(dir.path().join("package.json")).unwrap();
    File::create(dir.path().join("next.config.js")).unwrap();
    File::create(dir.path().join("Dockerfile")).unwrap();

    let detected = detect(dir.path());
    assert_eq!(detected.stack, Stack::Dockerfile); // Not NextJs
}

#[test]
fn test_detect_yarn() {
    let dir = tempdir().unwrap();
    File::create(dir.path().join("package.json")).unwrap();
    File::create(dir.path().join("yarn.lock")).unwrap();

    let detected = detect(dir.path());
    assert_eq!(detected.package_manager, Some(PackageManager::Yarn));
}
```
Each test creates a temporary directory with the minimum files needed to trigger a specific detection path. No real projects, no fixtures, no Docker. The tests run in parallel in under a second.
Why This Matters
Stack detection is the feature that turns a deployment tool into a platform. Without it, every user has to write a Dockerfile, configure a build command, and specify a port. With it, they push code and it works.
Getting it wrong is worse than not having it at all. A misdetected stack means a failed build, a confusing error message, and a user who loses trust. That is why the detection is conservative: specific signals over vague ones, config files over dependency scanning, and "you need to provide a Dockerfile" over a bad guess.
The build engine is also where sh0's health check system plugs in. After detecting the stack but before building the image, sh0 runs 34 static analysis rules to catch deployment mistakes. That is the subject of the next article.
---
This is Part 3 of the "How We Built sh0.dev" series.
Series Navigation:

- [1] Day Zero: 10 Rust Crates in 24 Hours
- [2] Writing a Docker Engine Client from Scratch in Rust
- [3] Auto-Detecting 19 Tech Stacks from Source Code (you are here)
- [4] 34 Rules to Catch Deployment Mistakes Before They Happen