Last week I built a publishing pipeline. My chat agent, Claudia (an in-house thing I run on a Mac Mini), would draft articles, generate a featured image, and push the result to my CMS. It worked. The articles showed up.
Then I tried to edit one in chat. "Tighten the second paragraph of the AI automation post." Claudia had no idea what I was talking about.
This is the part of building tools that almost nobody writes about. The first version of any feature works in a straight line. You wire input to output and the demo lands. But the moment you try to use the thing, you discover that you've built a function and not a workflow. The function doesn't remember anything. It doesn't know what it produced last time. It has no sense of return.
The fix took most of a day. I want to write down what it actually looked like, because the shape of it generalizes.
What was missing
The publisher created rows in a content_entries table: title, slug, external post ID, preview URL, all the stuff a CMS hands back when you push a draft. Good. That row was the canonical record of "this article exists."
But the chat surface had no connection to that table. Every conversation started from zero. There was no way to say "open the thread for this specific article" because conversations didn't know what an article was. They had a projectId and a list of messages and nothing else.
The smallest fix that would solve the problem was a foreign key. Add contentEntryId to the conversations table. Nullable, so existing threads stay untouched. When the user clicks "Chat about this →" from the article list, look up an existing conversation with that contentEntryId; if none, create one. Same thread, every time, for that article.
That part was a Drizzle migration and about thirty lines of route code.
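For anyone who wants the lookup spelled out, here is a minimal sketch of the find-or-create step. It is not the real route code: an in-memory array stands in for the conversations table so the example runs without a database, and the function name, the row shape, and the string type for projectId are all assumptions made for illustration.

```typescript
// Hypothetical row shape. Only projectId and contentEntryId come from the post.
interface Conversation {
  id: number;
  projectId: string;
  contentEntryId: number | null; // nullable, so pre-existing threads stay valid
}

// In-memory stand-in for the conversations table (an assumption for this sketch).
const conversations: Conversation[] = [];
let nextId = 1;

// Return the thread already tied to this article, or create it on first click.
function findOrCreateArticleThread(
  projectId: string,
  contentEntryId: number,
): Conversation {
  const existing = conversations.find(
    (c) => c.contentEntryId === contentEntryId,
  );
  if (existing) return existing;

  const created: Conversation = { id: nextId++, projectId, contentEntryId };
  conversations.push(created);
  return created;
}

// Clicking "Chat about this" twice for the same article lands in one thread.
const first = findOrCreateArticleThread("blog", 42);
const second = findOrCreateArticleThread("blog", 42);
console.log(first.id === second.id); // true
```

Against a real database, a unique index on the foreign key column would probably be worth adding, so that two near-simultaneous clicks cannot create two threads for one article.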
The interesting part
The interesting part was what to put in the system prompt.
An article-scoped thread is not a generic chat. It's a conversation about a specific piece of writing. So at request time, the context builder reads the markdown body off disk and injects it into the prompt. Title, slug, site, post ID, preview URL, and the actual current text of the article. Every turn.
The cost of this is a few thousand tokens per request. The benefit is that the chat agent now has the article in working memory. I can quote a phrase and she knows where it lives in the piece. I can say "the second paragraph" and there is a second paragraph she can see.
This is the move that makes the difference between a chat agent that talks about your work and a chat agent that's actually inside it.
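A rough sketch of what that context builder could look like. The field names follow the list above, but the interface, the function name, the prompt wording, and the body delimiters are assumptions, not the actual prompt. Reading the markdown off disk is left to the caller so the function stays pure and easy to test.

```typescript
// Field names mirror the list in the post; the exact shape is an assumption.
interface ArticleMeta {
  title: string;
  slug: string;
  site: string;
  postId: string;
  previewUrl: string;
}

// Build the article-specific part of the system prompt.
// `body` is the current markdown, already read from disk by the caller.
function buildArticleContext(meta: ArticleMeta, body: string): string {
  return [
    "You are discussing one specific article. Its current state:",
    `Title: ${meta.title}`,
    `Slug: ${meta.slug}`,
    `Site: ${meta.site}`,
    `Post ID: ${meta.postId}`,
    `Preview URL: ${meta.previewUrl}`,
    "",
    "--- BEGIN ARTICLE BODY ---",
    body.trim(),
    "--- END ARTICLE BODY ---",
  ].join("\n");
}
```

Because this runs on every turn, an edit applied by a worker shows up in the very next request without any cache to invalidate. Explicit begin and end markers are one way to help the model tell the article text apart from the instructions around it.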
Editing without buttons
Here's where the architecture stopped being obvious. Once Claudia could see the article and discuss edits, what would "applying" an edit look like?
The obvious answer is a button in the UI. "Apply edit." Pop a diff modal, let me approve, fire a write. That's the path most apps take.
I didn't want it. The button forces a UI for every workflow, and I have a lot of workflows. I wanted the edit to happen the same way every other action happens: through chat.
So the flow looks like this. User asks for an edit. The fast chat tier (a one-shot claude --print call, no tools, no state) drafts the edit inline and asks for confirmation. User replies "apply" or "ship it" or "do it." The fast tier emits a structured action token in its response: [DELEGATE:HeadstringEditor: edit drafts/<site>/<slug>/draft.md to ... then re-run the publisher].
That token gets parsed out of the chat response and spawns a worker. The worker is a real Claude Code instance: tmux, full filesystem, all the tools. It reads the file, makes the edit, runs the publish script, and posts the result back into the conversation as a new message. "Updated draft, preview here."
Two tiers. Tier 1 is cheap and fast and stateless. Tier 2 is expensive and slow and can do anything. The chat surface is Tier 1; the worker is Tier 2. The handoff is a parseable string in the model's output.
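The handoff can be sketched as a small parser over the Tier 1 reply. This is an illustration under assumptions, not the real implementation: the type names and the function name are made up, the regex assumes the instruction never contains a closing bracket, and stripping the token from the text shown to the user is a guess at the behavior.

```typescript
interface Delegation {
  worker: string;      // e.g. "HeadstringEditor"
  instruction: string; // free-text task handed to the Tier 2 worker
}

interface ParsedReply {
  visibleText: string;           // reply with the token removed
  delegation: Delegation | null; // null when the reply is plain chat
}

// Matches [DELEGATE:<worker>: <instruction>].
// Assumption: the instruction itself contains no "]".
const DELEGATE_RE = /\[DELEGATE:([A-Za-z0-9_-]+):\s*([^\]]+)\]/;

function parseReply(reply: string): ParsedReply {
  const match = DELEGATE_RE.exec(reply);
  if (!match) return { visibleText: reply.trim(), delegation: null };

  return {
    visibleText: reply.replace(match[0], "").trim(),
    delegation: { worker: match[1], instruction: match[2].trim() },
  };
}

// Hypothetical reply; the path and wording are invented for the example.
const parsed = parseReply(
  "Applying now. [DELEGATE:HeadstringEditor: edit drafts/example-site/example-slug/draft.md then re-run the publisher]",
);
console.log(parsed.delegation?.worker); // HeadstringEditor
```

A reply without a token comes back with delegation set to null, so the caller can spawn a worker only when the model has explicitly asked for one.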
This pattern is doing a lot of work. It lets me keep chat responsive while still being able to take real action. It keeps state in the conversation, not in the UI. It means every edit, every publish, every change has a written record of why, because the user's request and the agent's draft and the user's confirmation are all just messages in the same thread.
What it feels like to use
I open the articles page. I see a list of everything published, grouped by site. I click into a piece from yesterday. The conversation that's there is the one I had when I first drafted it, plus whatever I've added since. The agent already knows the article. The body is in the prompt every turn.
"This intro is doing too much work. Cut the second sentence."
She rewrites the first paragraph, shows me the diff, asks if I want to apply it.
"Apply."
About forty seconds later: "Updated. Preview here."
That's it. No file picker, no IDE, no copy-paste between tools. The thread is the editorial conversation, and the agent is sitting inside it.
The thing I want to remember from this week is not the foreign key or the system prompt or the delegate pattern. It's that the shape of the architecture turned out to mirror the shape of the work. Articles get threads because articles are conversations across time. Drafts live on disk because drafts have a life outside the database. Edits happen in chat because that's where the thinking happens. Workers spawn from chat because chat is where the decision gets made.
When the architecture matches the work, the tools mostly disappear, and what's left is the thing you were actually trying to do.

