Two weeks ago, we built an AI assistant into sh0 that could query your server, restart your apps, and generate configuration files. It worked. But the architecture had a problem I knew we would have to solve.
The tool-calling loop was complex. Here is what happened every time Claude wanted to check your server status:
1. You send a message to the gateway at sh0.dev/api/ai/chat
2. The gateway calls Claude with 9 tool definitions
3. Claude responds with a tool_use block: "I want to call get_server_status"
4. The gateway emits a tool_call SSE event to the dashboard
5. The dashboard JavaScript catches the event, calls the local sh0-core REST API
6. The dashboard sends the result back to the gateway as tool_results
7. The gateway calls Claude again with the results
8. Claude responds with text
9. Steps 3-8 repeat for every tool call in the conversation
Nine steps. Three network round-trips. Tool definitions duplicated in two files -- one on the gateway, one on the dashboard. The dashboard had a recursive runStreamLoop function that managed the entire agentic loop client-side. It worked, but it was the kind of architecture that makes you nervous when you think about edge cases.
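The shape of that loop can be sketched as follows. This is a hypothetical reconstruction, not the actual sh0 dashboard code: each round trips through the gateway, executes any requested tool against the local sh0-core REST API, posts the result back, and recurses until Claude answers with plain text.

```typescript
// Hypothetical sketch of the client-side agentic loop (all names assumed).
type GatewayReply =
  | { kind: 'tool_call'; id: string; name: string; input: unknown }
  | { kind: 'text'; text: string };

async function runStreamLoop(
  callGateway: (toolResult: { id: string; result: unknown } | null) => Promise<GatewayReply>,
  executeTool: (name: string, input: unknown) => Promise<unknown>,
  pending: { id: string; result: unknown } | null = null,
): Promise<string> {
  const reply = await callGateway(pending); // round trip to the gateway
  if (reply.kind === 'text') return reply.text; // terminal: Claude responded with text
  // Claude wants a tool: execute it locally, then recurse with the result.
  const result = await executeTool(reply.name, reply.input);
  return runStreamLoop(callGateway, executeTool, { id: reply.id, result });
}
```

Every recursion is a fresh network round-trip, which is exactly the choreography the rest of this post removes.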
The Protocol That Already Existed
While we were building this, sh0-core already had a full MCP server. Twenty tools, proper JSON-RPC 2.0, streamable HTTP transport, API key scoping, confirmation tokens for destructive operations. We built it across three phases, audited it twice per phase. It was production-ready.
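For context, tool discovery over MCP is a single JSON-RPC 2.0 exchange. The sketch below shows the wire shape per the MCP spec; the tool entry is a hypothetical example, not sh0's actual tool list.

```typescript
// The client POSTs a JSON-RPC 2.0 request to the streamable HTTP endpoint:
const toolsListRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/list',
};

// The server answers with each tool's name and input JSON Schema:
const toolsListResponse = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    tools: [
      {
        name: 'get_server_status', // hypothetical entry; the real server exposes 20
        description: 'Report host status', // assumed description
        inputSchema: { type: 'object', properties: {} },
      },
    ],
  },
};
```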
And Claude's API had something called the MCP Connector -- a beta feature that lets you tell Claude "here is an MCP server, connect to it yourself."
The solution was obvious. Instead of the gateway defining tools and the dashboard executing them, just tell Claude where the MCP server is. Let Claude handle tool discovery and execution directly. The entire client-side loop disappears.
What the Code Looks Like
The old way:
const response = await client.messages.create({
model: modelString,
max_tokens: 4096,
system: systemPrompt,
messages: apiMessages,
tools: SH0_TOOLS, // 9 tool definitions, manually maintained
stream: true,
});
// Then: parse tool_use blocks, emit SSE events, wait for dashboard
// to execute, receive results, call Claude again...The new way:
const response = await client.beta.messages.create({
model: modelString,
max_tokens: 4096,
system: systemPrompt,
messages: apiMessages,
mcp_servers: [{
type: 'url',
url: `${instanceConfig.instanceUrl}/api/v1/mcp`,
name: 'sh0',
authorization_token: decryptedInstanceKey,
}],
tools: [
{ type: 'mcp_toolset', mcp_server_name: 'sh0' },
...GATEWAY_ONLY_TOOLS,
],
betas: ['mcp-client-2025-11-20'],
stream: true,
});
// Done. Claude discovers tools via MCP, calls them directly.
// The stream contains text + informational mcp_tool_use/result blocks.Claude connects to the MCP server, discovers all 20 tools via tools/list, and calls them directly. The response stream includes mcp_tool_use and mcp_tool_result content blocks that we forward to the dashboard for display purposes only. No execution. No round-trips. No recursive loop.
The dashboard shows processing steps -- "Checking server status", "Listing apps" -- but it is purely cosmetic. The actual work happens between Claude and the MCP server, two machines talking to each other while the user watches text stream in.
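A minimal sketch of that cosmetic mapping, with the label table and event shape assumed: informational mcp_tool_use events become processing-step labels, and everything else is ignored because the dashboard executes nothing.

```typescript
// Hypothetical label table; the two labels shown are the ones from this post.
const TOOL_LABELS: Record<string, string> = {
  get_server_status: 'Checking server status',
  list_apps: 'Listing apps',
};

// Map an informational SSE event to a display label, or null if it is not
// an mcp_tool_use event. No tool execution happens here.
function processingStep(event: { type: string; name?: string }): string | null {
  if (event.type !== 'mcp_tool_use' || !event.name) return null;
  return TOOL_LABELS[event.name] ?? `Running ${event.name}`;
}
```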
The Decisions That Mattered
Gateway-handled tools stay as regular tools
Not everything is an MCP tool. suggest_actions generates follow-up action chips. generate_config_file creates downloadable configuration cards. These are UI features handled by the gateway itself -- they do not touch the sh0-core server.
We keep these as regular Anthropic tool definitions alongside the MCP toolset. When Claude calls them, the gateway processes them and emits the appropriate SSE events. This requires an inner loop -- Claude stops with stop_reason: 'tool_use', we provide the tool results, Claude continues. At most one extra round-trip, and only when Claude suggests follow-up actions.
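A sketch of how that inner round-trip's tool_result blocks might be built. The handler logic here is assumed; only the two tool names come from the post. When the stream stops with stop_reason 'tool_use', the gateway answers these itself and calls Claude once more with the results.

```typescript
type ToolUseBlock = { id: string; name: string; input: unknown };

// Hypothetical gateway-side handlers; these never touch the sh0-core server.
const GATEWAY_HANDLERS: Record<string, (input: unknown) => unknown> = {
  suggest_actions: (input) => ({ chips: input }),     // rendered as follow-up action chips
  generate_config_file: (input) => ({ card: input }), // rendered as a downloadable config card
};

// Build the tool_result blocks for the single extra round-trip back to Claude.
function gatewayToolResults(blocks: ToolUseBlock[]) {
  return blocks.map((block) => ({
    type: 'tool_result',
    tool_use_id: block.id,
    content: JSON.stringify(
      GATEWAY_HANDLERS[block.name]?.(block.input) ?? { error: `unknown tool ${block.name}` },
    ),
  }));
}
```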
The legacy path stays
MCP Connector only works when the sh0 instance is reachable via HTTPS from Anthropic's servers. If you are running sh0 on a local machine, behind a firewall, or on a network that Claude cannot reach, the MCP path fails.
When it fails, the gateway emits an mcp_fallback SSE event with the reason, resets its state, and runs the entire request through the legacy tool-calling path. The dashboard handles both modes transparently -- if mcp_tool_use events arrive, it shows them as processing steps; if tool_call events arrive instead, the recursive agentic loop kicks in as before.
No silent failures. The user knows what happened and still gets a working response.
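The fallback flow reduces to a try/catch around the MCP path. In this sketch the emitter signature is assumed, while streamMcpPath, streamLegacyPath, and the mcp_fallback event name come from the post: a failed MCP attempt surfaces its reason to the dashboard, then the whole request is replayed legacy-style.

```typescript
type Emit = (event: string, data: unknown) => void;

async function streamChat(
  streamMcpPath: () => Promise<string>,
  streamLegacyPath: () => Promise<string>,
  emit: Emit,
): Promise<string> {
  try {
    return await streamMcpPath(); // MCP path: Claude talks to the server directly
  } catch (err) {
    // Tell the dashboard why, then rerun the full request through the legacy loop.
    emit('mcp_fallback', { reason: err instanceof Error ? err.message : String(err) });
    return streamLegacyPath();
  }
}
```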
Token accounting is unchanged
The MCP Connector response includes total token usage just like a regular response. message_start has input_tokens, message_delta has output_tokens. MCP tool call overhead -- the tokens Claude uses to format requests and parse responses from the MCP server -- is included in these counts automatically. The wallet deduction logic did not change at all.
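That accounting can be sketched like this. The event shapes follow the Anthropic streaming API; the wallet deduction itself is omitted. Because MCP overhead is already inside these numbers, no extra bookkeeping is needed.

```typescript
type UsageEvent =
  | { type: 'message_start'; message: { usage: { input_tokens: number } } }
  | { type: 'message_delta'; usage: { output_tokens: number } }
  | { type: string };

function tallyUsage(events: UsageEvent[]): { inputTokens: number; outputTokens: number } {
  let inputTokens = 0;
  let outputTokens = 0;
  for (const ev of events) {
    if (ev.type === 'message_start') {
      inputTokens = (ev as { message: { usage: { input_tokens: number } } }).message.usage.input_tokens;
    } else if (ev.type === 'message_delta') {
      // output_tokens in message_delta is cumulative, so keep the latest value.
      outputTokens = (ev as { usage: { output_tokens: number } }).usage.output_tokens;
    }
  }
  return { inputTokens, outputTokens };
}
```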
What We Shipped
The implementation touches both codebases:
sh0-website (gateway):
- New Prisma model for instance configuration (encrypted API key storage)
- New API route for managing instance config
- Chat endpoint split into streamMcpPath() and streamLegacyPath()
- System prompt variant for MCP mode
- Tool definitions split: client tools, gateway tools, combined
sh0-core dashboard:
- Three new SSE event handlers (informational only)
- MCP tool labels and icons for all 20 server tools
- Bug fix: tool_result blocks now properly tracked in conversation history
The dashboard's agentic loop code was not removed. Not a single line. It simply never executes when MCP is active because the events that trigger it are never emitted. This is the kind of simplification I find most satisfying -- not deleting code, but making it irrelevant through a better abstraction.
The Architectural Lesson
The old architecture had the right separation of concerns but the wrong execution model. The gateway knew about tools. The dashboard executed tools. Claude orchestrated. Every layer was doing its job, but the choreography between them was fragile.
The MCP Connector collapses the execution model. Claude and the MCP server handle the entire tool lifecycle. The gateway becomes a pass-through. The dashboard becomes a display. The complexity does not disappear -- it moves to a protocol boundary where it belongs.
This is what protocol design is for. Not to make things simpler in theory, but to move complexity to where it can be managed by machines instead of maintained by engineers.
Twenty tools. Zero client-side execution. One protocol.