Local-first AI coding: why your transcripts should stay on your machine

Every time you hand a task to an AI coding agent, something leaves your editor: the prompt, the files it reads, the diffs it writes, and the running transcript of the session. The question most developers never ask is where that data actually goes — and for a lot of tools, the answer is "through someone else's cloud, on the way to the model."

"Local-first" is the answer to that question. It is not a privacy slogan; it is a concrete claim about where your code and your conversations live. This post is about what the term means, what is actually at stake, and how to check whether a tool earns the label.

what local-first actually means

A coding agent has to talk to a model, and the model runs in a data center. Local-first does not pretend otherwise. What it means is narrower and more useful: everything except the model call stays on your machine. Your repository is never uploaded. The session transcript is written to local disk, not a vendor database. The only thing that crosses the network is the request to the model provider — and ideally that request goes straight to the provider, not through an intermediary that logs it.

The opposite pattern is a hosted agent: you point it at your repo, and the code, the context, and the transcript are synced to the tool vendor's servers so their backend can orchestrate the run. That design has real upsides — zero setup, easy team sharing — but it means a third party now holds a copy of your source and a transcript of every decision the agent made.

the data trail an AI coding session leaves

It helps to be specific about what a single agent run actually produces:

The prompt and context — your task description, plus whatever files and snippets the agent pulled in to understand the codebase.
The code itself — the files the agent reads and the diffs it generates.
The terminal transcript — if the agent runs commands, the full output of those commands, which routinely includes environment variables, file paths, stack traces, and sometimes secrets printed by a misbehaving process.
The decision log — the chain of reasoning, retries, and tool calls, which is a fairly complete picture of how your software gets built.

In a local-first tool, all four are stored on your disk, and only the part that goes into a model request leaves it. In a hosted tool, all four sit, by design, on a server you do not control. The difference is not theoretical: it is the difference between "my laptop has the transcript" and "a vendor has the transcript."
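To make the four artifacts concrete, here is a minimal sketch of a session record appended to a file on local disk. The `SessionRecord` schema, the helper names, and the JSON-lines layout are all hypothetical, chosen for illustration; they are not any particular tool's on-disk format.

```python
import json
from dataclasses import dataclass, asdict, field
from pathlib import Path


@dataclass
class SessionRecord:
    """One agent run, grouped by the four artifacts (hypothetical schema)."""
    prompt_and_context: str
    diffs: list[str] = field(default_factory=list)
    terminal_transcript: str = ""
    decision_log: list[str] = field(default_factory=list)


def append_record(record: SessionRecord, log_path: Path) -> None:
    """Append the record as one JSON line to a file on local disk."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(record)) + "\n")


def load_records(log_path: Path) -> list[SessionRecord]:
    """Read every record back from the local log file."""
    with log_path.open(encoding="utf-8") as fh:
        return [SessionRecord(**json.loads(line)) for line in fh if line.strip()]
```

Nothing in this sketch opens a network connection, which is the property the local-first claim is about: the record exists only where `log_path` points.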

why it matters beyond paranoia

There are four reasons a working engineer should care, none of which require assuming bad intent from any vendor:

Source IP. For a lot of companies, the codebase is the asset. A blanket policy that proprietary source never leaves managed devices is common, and a hosted agent quietly violates it.

Secrets leakage. Terminal output is the messy one. A failed migration prints a connection string; a misconfigured tool dumps an API key; a stack trace carries a token. When transcripts live locally, that exposure stays on your machine. When they sync, it becomes someone else's log line.
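To get a feel for how easily this happens, here is a rough sketch that redacts secret-like strings from a chunk of terminal output. The two patterns and the `redact` helper are illustrative assumptions, not a complete scanner; real secret detection uses far larger rule sets and will still miss things.

```python
import re

# Illustrative patterns only; they are deliberately simple and not exhaustive.
SECRET_PATTERNS = [
    # Credentials embedded in a URL, e.g. scheme://user:password@host/db
    re.compile(
        r"(?P<prefix>\b[a-z][a-z0-9+.-]*://[^\s:/@]+:)(?P<secret>[^\s@]+)(?=@)"
    ),
    # NAME=value or NAME: value where the name looks sensitive
    re.compile(
        r"(?P<prefix>\b[A-Za-z0-9_]*(?:API_?KEY|TOKEN|SECRET|PASSWORD)"
        r"[A-Za-z0-9_]*\s*[=:]\s*)(?P<secret>\S+)",
        re.IGNORECASE,
    ),
]


def redact(text: str) -> str:
    """Replace the secret part of every match with a fixed placeholder."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub(lambda m: m.group("prefix") + "[REDACTED]", text)
    return text
```

For example, `redact("export API_KEY=sk-12345")` returns `"export API_KEY=[REDACTED]"`. A filter like this reduces exposure but does not remove it, which is why where the transcript is stored still matters.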

Compliance and review. "Where is our code processed, and what is retained?" is a question a security review will ask. "It stays on the developer's device; only the model call goes out, directly to the provider" is a clean answer. "It is synced to a third-party orchestration backend" starts a much longer conversation.

Training and retention. Whether your prompts and code can be retained or used to improve a service depends on whose servers they touch and on what each of those parties' policies allow. The fewer parties in the path, the fewer policies you have to read and trust.

the economics nobody mentions: who you pay

Local-first has a second-order effect that shows up on the invoice. The model is the expensive part of any AI coding tool. There are two ways a tool can handle it.

It can resell you inference: the vendor holds the provider relationship, meters your usage, and bills you — usually at a markup, often behind seat tiers and monthly token caps. Or it can let you bring your own model account: the tool drives the agent, but the model calls run on your own provider key, and you pay the provider directly at cost.

The bring-your-own-account approach tends to travel with local-first architecture, because a tool that is not in the middle of your data is usually not in the middle of your billing either. The practical result is no per-seat reseller margin and no vendor-imposed usage ceiling: you pay for exactly the inference you use, to the company that produced it.
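The arithmetic behind the two billing arrangements is simple enough to sketch. The function names are hypothetical, and every number in the usage note is a made-up placeholder, not a real price from any provider or vendor.

```python
def direct_cost(tokens: int, price_per_million: float) -> float:
    """Cost when you pay the provider directly at its per-token price."""
    return tokens / 1_000_000 * price_per_million


def resold_cost(
    tokens: int, price_per_million: float, markup: float, seat_fee: float
) -> float:
    """Cost when a vendor resells inference: a seat fee plus marked-up usage.

    `markup` is a fraction, so 0.25 means the vendor charges 25% over cost.
    """
    return seat_fee + direct_cost(tokens, price_per_million) * (1 + markup)
```

With placeholder inputs of 20 million tokens at 5.00 per million, `direct_cost` gives 100.0, while `resold_cost` with a 25% markup and a 20.00 seat fee gives 145.0. The gap grows with usage, since the markup applies to every token.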

how to tell if a tool is local-first

The label is easy to claim and harder to verify. A short checklist, vendor-neutral:

Does your repository get uploaded? If the tool needs a copy of your code on its servers to function, it is not local-first.
Where do session transcripts live? Look for an explicit statement that transcripts are written to local disk and not retained server-side.
Who holds the model relationship? If you connect your own provider key, the model call can go straight to the provider, though it is worth confirming that the tool does not proxy it. If the tool meters and bills your inference, it is in the path.
What crosses the network, exactly? A credible local-first tool can name the one thing that leaves — the model request — and tell you everything else stays put.
Does it work offline for non-model tasks? Opening a project, browsing diffs, reading history — if those need the vendor's cloud, your data is more entangled than the label suggests.
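The "what crosses the network" question can be checked rather than taken on trust. The sketch below assumes you have already collected the hostnames a tool contacted during a session, for example from a local proxy or firewall log; how you capture them is outside its scope. The `unexpected_hosts` helper and both hostnames in the usage note are hypothetical placeholders.

```python
def unexpected_hosts(observed: list[str], allowed: set[str]) -> list[str]:
    """Return observed hosts that are not on the allowlist, in first-seen order.

    Hosts are compared case-insensitively and without a trailing dot.
    `allowed` is expected to contain lowercase hostnames.
    """
    seen: set[str] = set()
    flagged: list[str] = []
    for host in observed:
        normalized = host.strip().lower().rstrip(".")
        if normalized and normalized not in allowed and normalized not in seen:
            seen.add(normalized)
            flagged.append(normalized)
    return flagged
```

If the only allowed host is `api.model-provider.example` and the log also shows `sync.vendor.example`, the function returns `["sync.vendor.example"]`. An empty result is consistent with a local-first claim but does not prove it, since it only covers the traffic you captured.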

where defract stands

defract is built local-first by default. It orchestrates Claude Code on your machine: the repository never leaves your device, and the PTY transcripts from each agent session are written to local disk, not to a defract server. The model calls run on your own Anthropic account, so you pay Anthropic directly for inference — there is no reseller markup and no token cap in the middle.

That is a deliberate trade. Hosted orchestration is genuinely easier to set up and to share across a team, and defract is adding team features for the people who want them. But the default is that your code and your transcripts stay where they started: on your machine. If you are evaluating tools, run the checklist above against all of them — the point is not to take any vendor's word for it, including ours.

If you want the longer argument for why a structured, on-device lifecycle beats raw parallelism, "the cognitive load of running parallel Claude Code agents" is a good next read, and "defract vs Claude Code" covers how defract builds on the terminal rather than replacing it.

defract is in open beta

a local-first lifecycle for your parallel claude code agents — your code and transcripts stay on your machine, you pay anthropic directly. free, no caps, no signup.
