Agent Harnesses - Ouachita Labs

Tags: agents, software engineering
Audience: technical

A lot of folks would be better off not rolling their own harness. The main reason is that an agent loop looks simple, which makes it easy to underestimate. The core loop is only a few lines:

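messages = [{"role": "system", "content": "you are a rly smart..."}]
while True:
    prompt = input("> ")
    # Most chat APIs name this role "user" rather than "human"
    messages.append({"role": "user", "content": prompt})
    # LLM is a stand-in for whatever model client you use
    response = LLM.complete(messages)
    # Keep the reply in history so the next turn sees the whole conversation
    messages.append({"role": "assistant", "content": response})
    print(response)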

Then you have to add tools and you're faced with a few questions. For example:

Tooling & Design: Which tools do you want to give your agent? How do you design the tool surface so that the agent can infer how to use it efficiently?
Reliability & Safety: Should you sandbox? Should you allow undo operations? How do you tell whether a tool call is undoable? How are tool approvals handled?
Data & Context: How do you handle long tool call responses that eat up context? How do you bubble up useful error messages to the model? How do you log or trace tool calls effectively?
Implementation Details: Should the tool live in a separate MCP server? Is bash all you need? How do you build a file edit tool? Do you use string replace? Force the model to output a raw diff? Something else entirely?
Multimedia & Streaming: How do you handle multimedia reads? Does the model support image input? Do PDFs have to be converted to text inside read_file first, or can the model handle them? Are you going to support streaming? Stopping mid-stream? Resuming from a previous message?
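To make a few of these questions concrete, here is a minimal sketch of one possible set of answers: a string-replace edit tool, truncation for long tool output, and a dispatcher that hands errors back to the model as text. Every name in it (`run_tool`, `edit_file`, `truncate`, `MAX_TOOL_OUTPUT_CHARS`) is hypothetical and chosen for illustration. It is not how Codex or any other harness does it, and it leaves out sandboxing, approvals, undo, and tracing entirely.

```python
import tempfile
from pathlib import Path

MAX_TOOL_OUTPUT_CHARS = 2000  # assumed budget; tune for your model's context window


def read_file(path: str) -> str:
    return Path(path).read_text()


def edit_file(path: str, old: str, new: str) -> str:
    """String-replace edit: `old` must match exactly once in the file."""
    text = Path(path).read_text()
    count = text.count(old)
    if count == 0:
        raise ValueError(f"`old` not found in {path}; re-read the file and retry")
    if count > 1:
        raise ValueError(f"`old` matches {count} times in {path}; include more context")
    Path(path).write_text(text.replace(old, new, 1))
    return f"edited {path}"


TOOLS = {"read_file": read_file, "edit_file": edit_file}


def truncate(output: str, limit: int = MAX_TOOL_OUTPUT_CHARS) -> str:
    """Keep the head and tail of long output and say how much was cut."""
    if len(output) <= limit:
        return output
    half = limit // 2
    omitted = len(output) - 2 * half
    return f"{output[:half]}\n[... {omitted} characters omitted ...]\n{output[-half:]}"


def run_tool(name: str, args: dict) -> dict:
    """Dispatch one tool call; failures come back as text the model can act on."""
    if name not in TOOLS:
        return {"ok": False, "output": f"unknown tool {name!r}; available: {sorted(TOOLS)}"}
    try:
        return {"ok": True, "output": truncate(str(TOOLS[name](**args)))}
    except Exception as exc:  # deliberately broad: the loop survives, the model decides
        return {"ok": False, "output": f"{type(exc).__name__}: {exc}"}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = str(Path(tmp) / "notes.txt")
        Path(target).write_text("hello world\n")
        print(run_tool("edit_file", {"path": target, "old": "world", "new": "agent"}))
        print(run_tool("edit_file", {"path": target, "old": "missing", "new": "x"}))
```

Even this toy version has to make judgment calls, such as what to do with ambiguous matches, how much output to keep, and which exceptions to swallow. Those calls multiply quickly once real tools are involved.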

This can go on forever. Building agents on my own at my day job has taught me one thing. The usual advice is "buy, don't build", but here buying costs nothing, so the advice shortens to "don't build". This may come as a shock to those of us who enjoy tinkering with agent loops and understanding how context is managed, but there is no reason to rebuild something that has been open sourced and polished by full-time employees.

Take Codex, for example: an Apache 2.0 licensed coding agent that OpenAI built for its models, with 74k stars on GitHub, over 8k closed pull requests, and over 5k commits. The project is exceedingly popular, and I use it every day. It has most of the bells and whistles you'd expect from a modern coding agent: skills, custom slash commands, and built-in session tracking and session resume. If Codex seems too complicated for your simple Slack bot, try something like the pi-mono coding agent package, a much more minimalist take on a general agent harness with a focus on extensibility and ease of customization. It was the harness powering openclaw, the rocket ship of a project you may have heard of.

The point is, unless you're doing experimental harness research or something else extremely niche, a pre-built coding agent lets you skip the line. Don't get stuck building your own agentic loop if you don't have to.

Author's Note: John is anti AI slop. Ouachita Labs blog posts are always written with care by a human brain, proofread with Claude Code.