How Coding Agents Actually Work
How do tools like Claude Code, Cursor, and Devin actually write code? The ReAct loop, tool calling, and context management — animated.
Neural Download
Installing mental model for coding agents.
You're staring at a bug you can't find. You type "fix the auth bug in login.py" and press Enter. Ten seconds later, the agent opens three files, finds the issue, writes a fix, and runs the tests. All green.
How did it do that?
Most developers assume the answer is complicated. Some kind of neural reasoning engine. A sophisticated multi-phase planner.
But here's what's wild. At the core of every major coding agent — Cursor, Claude Code, Copilot, Devin — there's the same fundamental pattern. And that pattern... is a while loop.
Let me show you.
Here's the core pattern. While not done: think, act, observe. The agent reads your prompt. It decides what to do next. It takes an action. It looks at the result. And it loops — until the task is done, or it gets stuck.
Of course, the real systems have more going on — safety checks, retries, planning steps. But this loop is the beating heart underneath all of it.
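Here's that loop as a sketch in Python. Everything in it is illustrative — `call_model`, the message shapes, and the tool names are placeholders standing in for whatever a real harness uses, not any vendor's actual API:

```python
# A minimal think-act-observe loop. `call_model` stands in for an LLM
# call; `tools` maps tool names to plain Python functions.

def run_agent(task, call_model, tools, max_steps=20):
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):              # real agents cap the loop
        reply = call_model(history)         # think: pick the next action
        if reply.get("done"):               # model says the task is finished
            return reply["answer"]
        tool = tools[reply["tool"]]         # act: run the requested tool
        result = tool(**reply["args"])
        history.append({"role": "tool",     # observe: feed the result back
                        "content": result})
    raise RuntimeError("agent got stuck")
```

Strip away the branding and this skeleton is recognizably what the script describes: the model proposes, the harness disposes, and the results accumulate in `history`.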
Let's trace a real example. You say "fix the auth bug." The model reasons: "I should read login.py first." So it calls a tool — file read — and gets back the source code.
It reasons again: "Line forty-two is checking the token wrong." So it calls another tool — file edit — and patches the line.
Then: "Did that actually work?" It calls bash, runs the test suite. Tests pass. Done.
Three trips around the loop. Think, act, observe. Each time, the model picks the next best action based on everything it's seen so far.
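Written out as the structured messages the model would emit, the three trips look something like this — the tool names and the fix itself are made up for illustration:

```python
# The three loop iterations from the trace, as structured tool calls.
# Names and arguments are illustrative, not a real agent's output.
trace = [
    {"tool": "read_file", "args": {"path": "login.py"}},
    {"tool": "edit_file", "args": {"path": "login.py", "line": 42,
                                   "new_code": "if token == expected:"}},
    {"tool": "bash",      "args": {"command": "pytest tests/"}},
]
```

Read, edit, verify. Each entry is just data — the model writes it, and the harness does the actual work.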
But here's the thing. The model can reason... but it can't actually do anything on its own.
An LLM by itself is a brain in a jar. It can reason, plan, and generate code. But it can't touch a file. It can't run a command. It lives entirely inside text.
So how does it fix bugs and write programs? Tools.
When the model decides to act, it outputs a structured message that says: "Call this tool, with these arguments." A program called the harness catches that message, executes the tool, and feeds the result back into the conversation.
Watch. The model outputs: tool — file read, path — login.py. The harness reads the file and returns the contents. The model outputs: tool — edit, line forty-two, new code. The harness patches the file. Tool — bash, command — run tests. The harness executes and returns the output.
The model never touches your file system directly. It writes instructions. The tools are its hands.
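The harness side can be sketched as a simple dispatcher. Again, the tool names are illustrative — real agents ship their own tool sets — but the shape is the point: the model's message is just data, and the harness is the only thing that touches the machine:

```python
# A sketch of the harness: catch the model's structured message,
# run the matching tool, and return the result as text.
import subprocess

def read_file(path: str) -> str:
    with open(path) as f:
        return f.read()

def run_bash(command: str) -> str:
    done = subprocess.run(command, shell=True,
                          capture_output=True, text=True)
    return done.stdout + done.stderr

TOOLS = {"read_file": read_file, "bash": run_bash}

def dispatch(message: dict) -> str:
    # The model never opens a file or spawns a process.
    # It emits {"tool": ..., "args": ...}; this function does the rest.
    return TOOLS[message["tool"]](**message["args"])
```

Whatever `dispatch` returns gets appended to the conversation, and the loop goes around again.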
Cursor gives its agent around ten tools. Claude Code has closer to twenty. Devin gets an entire virtual machine. Same core pattern, different hands. And of course, the quality of the model matters enormously — a smarter model picks better tools, in a better order, with better arguments.
But all of these tools share one constraint. Every result comes back into the same place.
The context window. This is the agent's entire working memory. Its instructions, your prompt, every file it's read, every tool result — it all has to fit inside this one container. And it's not infinite.
Every trip around the loop adds more. The window fills up. And here's what most developers don't realize: the agent starts degrading before the window is full. Researchers have found that information in the middle of long contexts tends to get lost — the model pays strongest attention to the beginning and the end.
So the agent might forget a file it read ten steps ago. It might drift from instructions you gave earlier. Not because it ran out of space — because its attention didn't reach.
And when you close the session? Gone. The window empties completely. Next session starts from zero. Most agents work around this with a simple trick: a text file — loaded at the start of every conversation — with notes about your project, your preferences, things it should remember.
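That whole memory story — a finite window, a memory file loaded up front, old messages falling off the back — can be sketched in a few lines. The token budget and the 4-characters-per-token heuristic here are rough illustrations, not any real model's numbers:

```python
# A sketch of context-window management: preload the memory file,
# then keep only the most recent history that fits the budget.

MAX_TOKENS = 200_000  # illustrative window size

def estimate_tokens(text: str) -> int:
    # Crude heuristic: roughly 4 characters per token.
    return len(text) // 4

def build_context(memory_notes: str, history: list[str]) -> list[str]:
    budget = MAX_TOKENS - estimate_tokens(memory_notes)
    kept = []
    # Walk backward from the newest message; stop when the budget runs out.
    for message in reversed(history):
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    # The memory notes always ride along; old history silently falls off.
    return [memory_notes] + list(reversed(kept))
```

Notice what this implies: the memory file survives every session, but anything that only ever lived in `history` can vanish — which is exactly why long sessions drift.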
That's the real constraint shaping every coding agent. Not intelligence — the models are already incredibly capable. The bottleneck is memory. How much can it hold, and how well can it use what's inside?
So now you see the full picture. A while loop at the core. Tools for hands. And a context window that's always filling up.
This changes how you should use these tools.
Be specific. Every vague word in your prompt wastes precious context. "Fix the bug" forces the agent to burn loops searching. "Fix the token check on line forty-two of login.py" gets there in one shot.
Break big tasks into small ones. A fresh context window is a sharp context window. The longer the session, the fuzzier the middle gets. Start new sessions often.
And let it loop. The agent doesn't solve problems on the first try. It tries, fails, reads the error, and tries again. That's not a bug — that's the architecture working.
These tools aren't magic. They're a while loop, a set of tools, and a window of memory. And now, every time you press Enter, you know exactly what happens next.
Cognitive architecture... updated.
