Over the past few months, we have seen a lot of coding agents being built. Most of them perform well, not because the underlying tooling is amazing but because the underlying models are extremely good. Building a coding agent gets easier with each model iteration. I am not saying it is trivial, but the barrier to entry for building something good is lower than it was a year ago. In this post, I explain the main components that make up a coding agent.
Opencode is an open-source coding agent. It’s similar to aider, another coding agent that I will review in another post. Note that there are different flavors/forks of opencode; I will use the sst repository to explain the code.
What is an agent?
With all the hype around AI and LLMs, everyone has their own definition of what an agent is. From a technical perspective, an agent is a program that keeps calling a tool-augmented LLM until a condition is met. By “tool-augmented LLM”, I refer to a call to the LLM paired with some tools1 (edit a file, read a file, fetch some documents online, send a text, etc) that augment its capabilities.
From a programming perspective, an agent is a while loop that keeps calling the LLM until a final condition is satisfied (for example, passing a test, getting the answer “yes” by voice, or simply getting a final answer from the model). The LLM is augmented with tools to make progress on a task, be it reading a file, editing a file, getting semantic information or doing a web search. To give a concrete example, when ChatGPT searches the internet for you, it uses a web search tool to find the information. Coding agents use tools that read files, grep the codebase or call LSP-related features (find a symbol definition, list the functions with a specific name, etc).
The mental model to keep in mind is something like the snippet below. The callLLM() function is where the tools are added to the LLM call.
while (true) {
  // Call the tool-augmented LLM with the current message
  result = callLLM(message)

  // Stop once the final condition is satisfied (tests pass, user approves, etc)
  if (isFinalConditionMet(result)) {
    exit()
  }

  // Otherwise, fold the result (text or tool outputs) into the next message
  message = updateMessage()
}
Coding Agent in a Nutshell
Your coding agent simply iterates on messages sent to an LLM until your condition is met. In its message, the agent sends the list of available tools and their parameters. The LLM returns either some text or a request for the agent to invoke some tools. You can see a very simplistic message sequence of how this would work for a very simple function creation.
OpenAI has some good documentation about tools; the important part is that all tools are executed locally.
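To make this exchange concrete, here is a minimal sketch of that loop in TypeScript. The callLLM() helper, the message shape and the tool-call shape are all hypothetical and simplified; the point is only to show the alternation between text answers and tool requests described above.
// Hypothetical message and tool-call shapes, for illustration only.
type ToolCall = { id: string; name: string; args: Record<string, unknown> };
type LLMResponse =
  | { kind: "text"; text: string }             // the model answered with plain text
  | { kind: "tool_calls"; calls: ToolCall[] }; // the model asks the agent to run tools
type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };

// Assumed helpers: a provider call (with the tool definitions attached) and a local tool runner.
declare function callLLM(messages: Message[]): Promise<LLMResponse>;
declare function runToolLocally(call: ToolCall): Promise<string>;

async function agent(task: string): Promise<string> {
  const messages: Message[] = [
    { role: "system", content: "You are a coding agent. Use the available tools." },
    { role: "user", content: task },
  ];

  while (true) {
    const response = await callLLM(messages);

    // Plain text means the model considers the task done.
    if (response.kind === "text") return response.text;

    // Otherwise, execute each requested tool locally and feed the results back.
    for (const call of response.calls) {
      const result = await runToolLocally(call);
      messages.push({ role: "tool", content: `${call.name}: ${result}` });
    }
  }
}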
Model Interoperability
There are many model providers today (OpenAI, Anthropic, xAI, Mistral, etc) and each provider has its own way to call a model and handle tools, which makes switching models hard. Opencode uses the AI SDK from Vercel, which abstracts away model providers and lets developers write provider-agnostic AI agents.
The core of the model orchestration is done in session/index.ts through the streamText() function. Initializing the models along with the tools is done in provider/provider.ts.
The beauty of using such a framework is simple: you can focus on building the core of your application without having to worry about which provider or model you end up using. Any developer building an AI application should rely on such a library (AI SDK for TypeScript, LangChain for Python) to avoid platform lock-in.
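As an illustration, here is a minimal sketch of a provider-agnostic call with the AI SDK. This is not opencode’s actual wiring in session/index.ts; the model identifiers are just examples, and the code assumes a recent AI SDK version, so check the SDK documentation for the exact signatures.
// Sketch of a provider-agnostic call with the AI SDK (npm install ai @ai-sdk/openai @ai-sdk/anthropic).
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import { anthropic } from "@ai-sdk/anthropic";

// Swapping providers is a one-line change; the model IDs are examples.
const model = process.env.USE_ANTHROPIC
  ? anthropic("claude-3-5-sonnet-latest")
  : openai("gpt-4o");

const result = streamText({
  model,
  prompt: "Explain what this repository does in two sentences.",
});

// Stream the answer chunk by chunk.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}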
Prompt quality matters a lot
Prompt size is no longer a major constraint; the quality of the prompt is what really matters. Editing or analyzing a monorepo with 1000 files while trying to update only 2 of them is like finding a needle in a haystack. And to help your LLM find that needle, you had better give it really good instructions.
This is why the quality of your prompt matters. A lot.
The prompt for opencode is quite long and gives a ton of examples of what to follow. There are some general guidelines for writing a good prompt (a short illustrative fragment follows the list):
be extremely explicit about what you want. If something matters, use CAPITAL LETTERS TO INDICATE YOU CARE ABOUT A PARTICULAR INSTRUCTION.
keep using Markdown to write your prompt. Structure information hierarchically with titles using the # character and mark very important information using ***BOLD***
use special tokens to specify examples, like <example>
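Putting these guidelines together, a fragment of a system prompt might look like the sketch below (an invented example, not opencode’s actual prompt):
const SYSTEM_PROMPT = `
# Editing rules

You are a coding agent working inside an existing repository.
- ONLY modify the files required by the task. DO NOT reformat unrelated code.
- ***ALWAYS*** read a file before editing it.

<example>
user: rename the function parseConfig to loadConfig
assistant: I will grep for parseConfig, read every file that uses it, then edit each one.
</example>
`;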
Once you have a good basic prompt, you need to be able to augment it with tools that provide the necessary features to find the needle in the haystack.
Augment LLM with tools
Tools are the real “magic sauce” of a coding agent. An LLM without tools is like Bruce Wayne; with tools, it becomes Batman and can save developers from hours of debugging.
Common tools for coding agents are: grep the codebase, read a file or write a file. All tools for opencode are located in packages/opencode/src/tool and added to the LLM in provider/provider.ts. All coding agents (including Cursor, Claude Code, etc) rely on only two things:
the quality of the prompt
the quality and accuracy of the tools
The important part of a tool is to correctly specify what it does through a clear description (like this one) and parameters (like those ones). These tools are then passed to the LLM, and the program loops over the tool results until the LLM is finished.
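To illustrate, here is a simplified sketch of how a “read file” tool can be declared with the AI SDK’s tool() helper and a zod schema. It assumes an AI SDK v4-style API and is not opencode’s actual read tool.
// Simplified sketch of a "read file" tool (AI SDK v4-style API, not opencode's real tool).
import { readFile } from "node:fs/promises";
import { tool } from "ai";
import { z } from "zod";

export const read = tool({
  // The description is what the model sees: it must state precisely what the tool does.
  description: "Read a file from the workspace and return its full content.",
  // The parameters schema tells the model what arguments to provide when it requests the tool.
  parameters: z.object({
    filePath: z.string().describe("Path of the file to read, relative to the project root"),
  }),
  // Executed locally by the agent when the model requests this tool.
  execute: async ({ filePath }) => readFile(filePath, "utf8"),
});
The tool is then passed to streamText() or generateText() in a tools map, and the SDK feeds the result of execute() back to the model on the next iteration of the loop.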
I insist that the quality of the tools contributes significantly to the quality of your coding agent. This is why all popular coding agents (Cursor, for example) index your codebase: they index symbols to easily feed accurate data to the model. The same applies to the tool that fetches a specific line of code (many LLMs struggle to locate a line accurately) or the tools that edit files: small tweaks can have major effects on the overall tool quality. This is why having good evals helps you quickly iterate on the quality of your product.
Ask the Agent to Plan
Opencode has an interesting approach: it provides a TODO list tool to the agent. The initial prompt explains how the TODO list works and which tools are available to read and write it.
The agent constantly uses this todo list tool to keep track of what to do and in what order.
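To make this more tangible, here is a hypothetical sketch of what such a pair of todo tools could look like, again with the AI SDK’s tool() helper. Opencode’s real implementation in packages/opencode/src/tool differs; this only illustrates the idea.
// Hypothetical todo tools, for illustration only.
import { tool } from "ai";
import { z } from "zod";

type TodoItem = { id: number; description: string; done: boolean };
const todos: TodoItem[] = []; // in a real agent this would live in the session state

export const todoWrite = tool({
  description: "Replace the current todo list with an updated plan.",
  parameters: z.object({
    items: z.array(
      z.object({ id: z.number(), description: z.string(), done: z.boolean() })
    ),
  }),
  execute: async ({ items }) => {
    todos.splice(0, todos.length, ...items);
    return "todo list updated";
  },
});

export const todoRead = tool({
  description: "Return the current todo list so you can decide what to do next.",
  parameters: z.object({}),
  execute: async () => JSON.stringify(todos),
});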
Why do coding agents cost so much to operate?
Since the release of Claude Code, users have complained about its cost.
The team behind Claude Code even discussed this on the Latent Space podcast, explaining that some users at Anthropic can spend thousands of dollars a day. While this is probably an extreme case, it shows that cost is a valid concern. Cursor probably has not hit this problem yet because usage has been heavily subsidized by VC money.
Coding agents are expensive because of how they are implemented: we pass a ton of tokens in the prompt and context. Where a typical ChatGPT request contains a few hundred tokens at most, a coding agent will consume tens of thousands of tokens per request. If you constantly vibe code as the cool kids do today, it very quickly gets extremely expensive.
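A rough back-of-the-envelope calculation shows how quickly this adds up (all the numbers below are illustrative assumptions, not actual provider rates):
// Back-of-the-envelope cost estimate with illustrative prices.
const inputPricePerMTok = 3;   // $ per million input tokens (assumption)
const outputPricePerMTok = 15; // $ per million output tokens (assumption)

const inputTokensPerRequest = 50_000; // prompt + files + tool results (assumption)
const outputTokensPerRequest = 2_000; // generated code and explanations (assumption)
const requestsPerDay = 200;           // a heavy day of vibe coding (assumption)

const dailyCost =
  requestsPerDay *
  ((inputTokensPerRequest / 1_000_000) * inputPricePerMTok +
    (outputTokensPerRequest / 1_000_000) * outputPricePerMTok);

console.log(`~$${dailyCost.toFixed(2)} per day`); // about $36/day with these assumptions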
I believe this is not a real problem, and we have seen this story before. VC money subsidized Uber for years until users were hooked on the platform and ready to pay the real cost of a ride. Amazon subsidized many products to establish dominance and eventually reach economies of scale. There is no doubt that coding agents are here to stay; the current economic model may simply have to evolve.
What is coming next?
I can see mostly two major improvements in the coming months:
Running locally: projects like ollama make it really easy to run LLMs locally (see the small sketch after this list). As LLMs get optimized and CPUs/GPUs get more powerful, there is no question that coding agents will run locally in just a few months.
More accurate tools: today, most tools are just grep’ing the codebase, similar to what you would do when you start working on a new codebase. I can see coding agents getting more tools to analyze the codebase and feed better information into the prompt.
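For the local part, here is a minimal sketch of calling a model served by ollama from TypeScript. It assumes the ollama server is running on its default port with a model already pulled; check the ollama documentation for the exact request and response shape.
// Minimal sketch of calling a local model served by ollama (default port 11434).
const response = await fetch("http://localhost:11434/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama3", // example model name, assumed to be pulled already
    stream: false,
    messages: [{ role: "user", content: "Summarize what a coding agent does." }],
  }),
});

const data = await response.json();
console.log(data.message?.content);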
1. You can learn more about tools in the tools section of the OpenAI API.