
Real-Time Logs: WebSocket Streaming from Docker Containers

How we built real-time log streaming from Docker containers to the browser using WebSocket, with JWT authentication, auto-reconnect, and a terminal-style viewer.

Thales & Claude | March 25, 2026 | 10 min read
websocket · logs · docker · streaming · rust · svelte · real-time

Logs are the first thing you check when a deployment fails. The second thing you check when a service misbehaves. The third thing you check when a customer reports something weird. If your PaaS makes you SSH into a server and run docker logs -f, you have already lost ten seconds of context and a developer's patience.

From Phase 12 onward, sh0 has had real-time log streaming in the browser. Open an app's Logs tab, and output appears as it happens -- no refresh, no polling, no waiting. The implementation spans three layers: a Rust WebSocket endpoint that reads from Docker, a transport layer with JWT authentication and auto-reconnect, and a Svelte component that renders logs in a terminal-style viewer.

This is how we built each layer, and the security decision that changed how we pass authentication tokens over WebSockets.

Layer 1: The WebSocket Log Endpoint

The log streaming endpoint lives in the Rust API server. Its job is simple in concept: connect to Docker's log stream for a container and forward the output to a WebSocket client.

Polling vs. Streaming

Docker offers two ways to read container logs: a one-shot GET /containers/{id}/logs with follow=false, and a streaming GET /containers/{id}/logs with follow=true. The streaming approach holds an open HTTP connection and sends log lines as they are produced.

We chose a polling approach with timestamp tracking. Every two seconds, the handler calls Docker's logs API with since set to the timestamp of the last line it has seen, retrieves any new lines, and sends them over the WebSocket. If there are no new lines, it sends nothing -- the WebSocket stays silent.

Why polling instead of a persistent Docker stream? Three reasons:

1. Resource management. A streaming connection to Docker is a held file descriptor. If 20 users are watching logs for 20 containers, that is 20 persistent connections to the Docker daemon. Polling opens a connection, reads, closes it, and the daemon is free.

2. Reconnect simplicity. If the Docker daemon restarts or a container is redeployed, a streaming connection breaks and requires complex reconnection logic. A polling loop simply retries on the next tick.

3. Timestamp deduplication. By tracking the timestamp of the last log line received, we guarantee no duplicates across poll cycles. The since parameter is an exclusive lower bound -- Docker returns only lines after that timestamp.

```rust
async fn stream_logs(
    ws: &mut WebSocket,
    docker: &DockerClient,
    container_id: &str,
) -> Result<()> {
    let mut last_timestamp = Utc::now() - Duration::seconds(300); // Start with last 5 min

    loop {
        let logs = docker
            .container_logs(
                container_id,
                &LogsQuery {
                    stdout: true,
                    stderr: true,
                    since: Some(last_timestamp),
                    timestamps: true,
                    tail: None,
                },
            )
            .await?;

        for line in &logs {
            if let Some(ts) = line.timestamp {
                if ts > last_timestamp {
                    last_timestamp = ts;
                }
            }
            ws.send(Message::Text(line.message.clone())).await?;
        }

        tokio::time::sleep(Duration::from_secs(2)).await;
    }
}
```

The initial since value is set to five minutes before the current time. This means when you open the Logs tab, you immediately see the last five minutes of output -- enough context to understand what is happening without flooding the browser with hours of history.

Container Stats

Alongside log streaming, we added a container stats endpoint that returns real-time CPU and memory usage:

GET /api/v1/apps/:id/stats

This endpoint calls Docker's GET /containers/{id}/stats?stream=false (single-shot mode), calculates CPU percentage from the delta between cpu_stats and precpu_stats, and returns memory usage as both absolute bytes and a percentage of the container's limit.
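In sketch form, the CPU calculation looks like this. The struct below is a simplified stand-in for the relevant fields of Docker's stats JSON, not sh0's actual types; the formula itself is the one `docker stats` uses:

```rust
// CPU percentage from two Docker stats samples: the container's share of
// system-wide CPU time consumed between the samples, scaled by the number
// of online CPUs. Field comments name the corresponding stats JSON paths.
struct CpuSample {
    total_usage: u64,  // cpu_stats.cpu_usage.total_usage (nanoseconds)
    system_usage: u64, // cpu_stats.system_cpu_usage
    online_cpus: u64,  // cpu_stats.online_cpus
}

fn cpu_percent(cur: &CpuSample, pre: &CpuSample) -> f64 {
    let cpu_delta = cur.total_usage.saturating_sub(pre.total_usage) as f64;
    let system_delta = cur.system_usage.saturating_sub(pre.system_usage) as f64;
    if cpu_delta > 0.0 && system_delta > 0.0 {
        (cpu_delta / system_delta) * cur.online_cpus as f64 * 100.0
    } else {
        0.0 // no measurable delta yet (e.g. first sample)
    }
}
```

With stream=false, Docker fills precpu_stats with the previous internal sample, so a single request is enough to compute the delta.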

The monitoring page uses this endpoint with a 15-second auto-refresh to display CPU and memory gauges. It is not log streaming per se, but it shares the same infrastructure concern: getting live data from Docker to the browser.

Layer 2: WebSocket Authentication

WebSockets do not support custom HTTP headers on the initial handshake in browsers. The Authorization: Bearer pattern that works for REST APIs does not work here. Browsers' WebSocket API only lets you set the URL and the protocol.

The Naive Approach (and Why We Changed It)

Our first implementation passed the JWT token as a URL query parameter:

ws://host/api/v1/apps/:id/logs/stream?token=eyJhbG...

This works, but it has a security problem: the token appears in server access logs, browser history, and any proxy logs between the client and server. For a self-hosted tool where the server and client are often on the same machine, this risk is modest. But it is still a bad habit.

The Sec-WebSocket-Protocol Trick

We moved the JWT token to the Sec-WebSocket-Protocol header. This is a legitimate WebSocket header that browsers allow you to set during the handshake:

```typescript
// Frontend: send the JWT as a WebSocket subprotocol
const ws = new WebSocket(url, [`bearer-${token}`]);
```

```rust
// Backend: pull the token back out of the Sec-WebSocket-Protocol header
fn extract_token_from_protocol(headers: &HeaderMap) -> Option<String> {
    headers
        .get("sec-websocket-protocol")
        .and_then(|v| v.to_str().ok())
        .and_then(|protocols| {
            protocols
                .split(',')
                .map(|p| p.trim())
                .find(|p| p.starts_with("bearer-"))
                .map(|p| p.strip_prefix("bearer-").unwrap().to_string())
        })
}
```

The server extracts the token from the protocol header, validates it as a JWT, and proceeds with the WebSocket upgrade if valid. The response includes the same protocol in the Sec-WebSocket-Protocol response header, completing the subprotocol negotiation.

This approach keeps the token out of URLs and logs. It is a well-known pattern used by Hasura, Supabase, and other platforms that need WebSocket authentication in browsers.

On Failure

If the token is missing, expired, or invalid, the server rejects the WebSocket upgrade with a 401 status code. The browser receives a connection failure, and the frontend's auto-reconnect logic kicks in -- but only after checking whether the auth store still has a valid token. If the token has expired, the client redirects to login instead of retrying forever.

Layer 3: Auto-Reconnect with Exponential Backoff

WebSocket connections drop. Networks are unreliable. Servers restart. Containers redeploy. The frontend must handle disconnections gracefully.

The reconnect logic uses exponential backoff starting at 1 second and capping at 30 seconds:

```typescript
let reconnectDelay = $state(1000);
let reconnectTimer: ReturnType<typeof setTimeout>;

function connect() {
  ws = new WebSocket(url, [`bearer-${token}`]);

  ws.onopen = () => {
    reconnectDelay = 1000; // Reset on successful connection
  };

  ws.onclose = (event) => {
    if (event.code === 4001) {
      // Auth failure -- do not reconnect, redirect to login
      goto('/login');
      return;
    }

    reconnectTimer = setTimeout(() => {
      reconnectDelay = Math.min(reconnectDelay * 2, 30000);
      connect();
    }, reconnectDelay);
  };
}

function disconnect() {
  clearTimeout(reconnectTimer);
  ws?.close();
}
```

The sequence is: 1s, 2s, 4s, 8s, 16s, 30s, 30s, 30s... until the connection succeeds, at which point the delay resets to 1 second. This prevents hammering the server during an outage while recovering quickly when the server comes back.
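The capped doubling is easy to factor out as a pure function. This is a sketch, not sh0's actual code -- the component above keeps the delay in reactive state instead:

```typescript
// Capped exponential backoff: 1s, 2s, 4s, ..., then a flat 30s ceiling.
// `attempt` is zero-based: attempt 0 is the first reconnect try.
function backoffDelay(attempt: number, baseMs = 1000, capMs = 30000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}
```

backoffDelay(0) through backoffDelay(4) yield 1s, 2s, 4s, 8s, 16s; every attempt after that is clamped to 30s.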

A custom close code (4001) distinguishes authentication failures from network failures. Auth failures should not trigger reconnection -- the user needs to log in again.

Layer 4: The LogViewer Component

The LogViewer.svelte component renders log lines in a terminal-style monospaced container:

```svelte
<script lang="ts">
  import { tick } from 'svelte';

  let { appId }: { appId: string } = $props();
  let lines = $state<string[]>([]);
  let autoScroll = $state(true);
  let logContainer: HTMLDivElement;

  const MAX_LINES = 1000;

  function appendLine(line: string) {
    lines = [...lines, line];
    if (lines.length > MAX_LINES) {
      lines = lines.slice(-MAX_LINES);
    }
    if (autoScroll) {
      tick().then(() => {
        logContainer.scrollTop = logContainer.scrollHeight;
      });
    }
  }

  function handleScroll() {
    const { scrollTop, scrollHeight, clientHeight } = logContainer;
    // Auto-scroll if within 50px of the bottom
    autoScroll = scrollHeight - scrollTop - clientHeight < 50;
  }
</script>

<div class="log-viewer" bind:this={logContainer} onscroll={handleScroll}>
  {#each lines as line}
    <pre>{line}</pre>
  {/each}
</div>
```

The 1,000-Line Buffer

Without a buffer limit, a chatty service could push the browser's memory usage into the gigabytes. The MAX_LINES constant caps the rendered output at 1,000 lines. When a new line arrives and the buffer is full, the oldest line is dropped. This is a ring buffer in spirit, implemented as an array slice for simplicity.

One thousand lines covers roughly 20 minutes of output for a typical web server logging about one line per second. For debugging, this is more than enough. For historical analysis, users should use the one-shot log retrieval endpoint or an external log aggregation system.
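The capped-append logic can be stated as a standalone pure function. This is an illustrative sketch (the component itself mutates its reactive state directly):

```typescript
// Capped append: returns a new array holding only the newest `max` lines.
// When the buffer is full, the oldest lines fall off the front.
function appendCapped(lines: string[], line: string, max = 1000): string[] {
  const next = [...lines, line];
  return next.length > max ? next.slice(-max) : next;
}
```

For example, appendCapped(['a', 'b', 'c'], 'd', 3) drops 'a' and keeps the newest three lines.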

Smart Auto-Scroll

The auto-scroll behaviour is the most important UX detail in the LogViewer. The rules are:

1. If the user is at the bottom of the log, new lines scroll into view automatically. This is the default state -- you open the Logs tab and see output streaming by.
2. If the user scrolls up to read older lines, auto-scroll disengages. New lines still arrive and are appended to the buffer, but the viewport stays where the user put it.
3. If the user scrolls back to the bottom (within 50 pixels), auto-scroll re-engages.

The 50-pixel threshold prevents a frustrating edge case: the user scrolls down to "almost the bottom" but is one pixel off, and new lines keep pushing the target further away. With the threshold, being "close enough" to the bottom counts as "at the bottom."

The tick() call before scrolling ensures that Svelte has finished updating the DOM with the new line before we measure scrollHeight. Without it, we would scroll to the bottom of the old content, and the new line would appear below the fold.

Container Stats: The Monitoring Complement

The real-time CPU and memory gauges on the Monitoring tab use a different mechanism than log streaming. Instead of WebSocket, they use a simple HTTP endpoint polled every 15 seconds:

```typescript
$effect(() => {
  const interval = setInterval(async () => {
    const stats = await appsApi.getStats(appId);
    cpuPercent = stats.cpu_percent;
    memoryUsed = stats.memory_usage;
    memoryLimit = stats.memory_limit;
  }, 15000);

  return () => clearInterval(interval);
});
```

We considered using the same WebSocket connection for stats, but the update frequency (every 15 seconds) does not justify a persistent connection. HTTP polling is simpler, more cacheable, and works correctly behind load balancers that might not support WebSocket.

The gauges are SVG circles with a stroke-dashoffset proportional to the percentage. They animate on update using a CSS transition, giving a smooth visual progression rather than a jumpy number change.
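The offset math is a one-liner. A sketch of the idea, with illustrative names -- the dash array is set to the circle's full circumference, and the offset hides the unfilled fraction:

```typescript
// stroke-dashoffset for a circular SVG gauge: with stroke-dasharray equal to
// the circumference, offsetting by the unfilled fraction leaves exactly
// `percent` of the circle drawn.
function gaugeDashOffset(percent: number, radius: number): number {
  const circumference = 2 * Math.PI * radius;
  return circumference * (1 - percent / 100);
}
```

At 100% the offset is 0 (full circle); at 0% it equals the circumference (nothing drawn), and a CSS transition on stroke-dashoffset animates smoothly between any two values.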

Lessons Learned

Polling with timestamp tracking beats persistent streams for logs. It is more resource-efficient, more resilient to Docker daemon restarts, and simpler to implement correctly. The two-second polling interval is fast enough that users perceive the logs as "real-time."

WebSocket authentication via Sec-WebSocket-Protocol is the right pattern. Query parameter tokens are expedient but leave secrets in logs. The protocol header approach is clean, standard-compliant, and well-supported across browsers.

Auto-scroll is a UX problem, not a technical one. The code is trivial. The design decision -- when to auto-scroll and when to stop -- is what makes the component pleasant to use. Getting the 50-pixel threshold right took more thought than the WebSocket implementation.

Buffer limits are not optional. An unbounded log viewer is a memory leak waiting to happen. One thousand lines is a good default: enough for debugging, small enough that the browser stays responsive.

Real-time logs are one of those features that users do not think about until they need them -- and then they need them urgently. Having them one click away, with authentication, reconnection, and a thoughtful viewer, is the kind of detail that separates a tool you tolerate from a tool you enjoy.

---

Next in the series: i18n from Day One: 5 Languages Across 105 Sessions -- why we built sh0 with five-language support from the very first dashboard session, and how we maintained it across 105 development sessions.
