Field Notes / Multi-Agent Systems

We built multi-agent coordination before it was a feature.

Native multi-agent capabilities are the headline of 2026. We were building the coordination layer manually for years before the platform shipped one. Here is what that experience revealed about what has to exist before the agents matter at all.

By Mike Goetz, April 2026, 10 min read

3 — Things that have to exist before the agents matter at all
1 — Source of truth every agent reads from before output
0 — Native multi-agent features when this system was built

Multi-agent AI is now a headline feature. Anthropic, OpenAI, and every major platform are racing to add it. The pitch is that agents working in parallel, coordinated by an orchestrating layer, will accomplish more than any single context window can hold.

That is true. But the part that does not make the marketing copy is this. Having agents is not what makes multi-agent coordination work. Having methodology is. Most people spinning up agents in 2026 are going to hit the same wall anyone hits when they add people to a project without defining roles, handoffs, and shared context. More agents, more chaos.

This is the story of building multi-agent coordination before it was a feature, and what that experience revealed about what actually has to exist before the agents matter at all.

01

What we built and why

Before Claude Code existed, the only interface was the web app. If you wanted to coordinate multiple AI instances, one handling strategy, one handling social media, one handling execution, one synthesizing everything, you had to build the coordination layer yourself.

The architecture was simple in concept. A Command Center chat acted as the orchestrator. It held the shared context. What was being worked on, what each specialized instance knew, what decisions had been made. Specialized chats handled specific domains. When work needed to cross from one domain to another, it crossed through documents, not through direct communication between agents.

The web app was not designed for this. Spinning up agents required working around limitations. Context windows. Session boundaries. The fact that no two chat instances share memory. Every workaround revealed something about what coordination actually requires.

What got built was not elegant. There was no native tool calling between chats. There was no shared memory. Every handoff was manual, every status update was a fresh document, every coordination decision was something I had to actively maintain instead of something the platform handled. But it worked well enough to run a multi-system operation across social media, strategy, execution, and content simultaneously. Before any of those capabilities were built into the platform.

The improvisation forced clarity that a more elegant tool would have hidden. When the platform does not handle coordination for you, you have to think about coordination explicitly.

That thinking is what survived.

02

What Anthropic eventually built versus what we actually needed

When native multi-agent features arrived, the first reaction was recognition. These are solutions to the same problems we were solving manually. Sub-agents for parallel work. Orchestration layers for coordination. Tool use for grounding agents in real file systems and real data.

The difference is that the platform solutions handle the mechanics. The thing they do not handle, the thing they cannot handle, is the methodology.

An agent needs to know what it is for. Not in a system prompt sense. That is the easy part. In a structural sense. What is this agent's domain, what does it not touch, how does it hand off to the next layer, and what does it do when the answer it needs is somewhere else in the system.

Without that, multi-agent coordination is just multiple single agents running simultaneously without knowing about each other. That is not coordination. That is parallelism. The two look the same from the outside until something breaks. And when something breaks in a parallel-but-uncoordinated system, the failure mode is not one agent producing a wrong answer. It is five agents producing five different answers that all sound reasonable, and now you have to figure out which one was actually right.

The platforms cannot solve that for you. They can give you the tools. The thinking is still yours.

There is also a quieter problem nobody talks about. Native multi-agent features make it trivially easy to spin up agents, which means the cost of adding one is almost zero. When the cost of adding something is zero, you stop asking whether it should exist. You just add it. The manual architecture had a built-in friction that forced the question. Every new chat I created was a chat I had to maintain, document, and route work to. That friction was annoying. It was also the thing that kept the system from sprawling into something nobody could hold in their head. The native features remove that friction, which is good for speed and bad for discipline. The methodology has to supply the discipline the platform no longer requires.

03

The mechanics of building multi-agent coordination in the web app

For readers who want to understand how this worked technically, not to reproduce it but to understand the underlying principles, the architecture had three elements.

The first was the Command Center. A single chat that held the master status document. Every other chat could produce a document that the Command Center could read. Nothing communicated directly between chats. Everything went through shared documents that the Command Center synthesized. This is the key insight. Agents do not need to talk to each other. They need shared state that one orchestrating layer maintains.
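The pattern above can be sketched in code. This is a hypothetical illustration, not the author's actual tooling: specialized chats never talk to each other; each emits a document, and one orchestrator folds those documents into a single master status that everything else reads.

```python
from dataclasses import dataclass, field

@dataclass
class MasterStatus:
    # domain name -> latest status document produced by that chat
    domains: dict[str, str] = field(default_factory=dict)
    decisions: list[str] = field(default_factory=list)

class CommandCenter:
    """Single owner of shared state. Chats write documents;
    only the Command Center merges them."""

    def __init__(self) -> None:
        self.status = MasterStatus()

    def ingest(self, domain: str, document: str) -> None:
        # A chat hands off a document; there is no chat-to-chat channel.
        self.status.domains[domain] = document

    def record_decision(self, decision: str) -> None:
        self.status.decisions.append(decision)

    def snapshot(self) -> MasterStatus:
        # Every agent reads this before producing output.
        return self.status

cc = CommandCenter()
cc.ingest("strategy", "Q2 focus: consolidate the content pipeline.")
cc.ingest("social", "Posting cadence moved to 3x per week.")
cc.record_decision("Execution chat owns the publishing checklist.")
```

The design choice worth noticing: `ingest` is the only write path, so conflicting pictures of current state cannot accumulate in the individual chats.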

The second was specialized chats with defined roles. Each chat had a specific domain and did not operate outside it. The social media chat did not make strategy decisions. The strategy chat did not write code. Role definition is not about capability. Every Claude instance has the same capabilities. It is about preventing scope creep that produces inconsistent outputs. When the social media chat tried to make a strategy call, the answer would be reasonable but it would not match what the strategy chat had decided last week. Defining the boundaries was what kept the system coherent.

The third was document-based handoffs. Work crossed between chats as structured documents, not as copy-pasted conversation. A delegation brief going from the strategy chat to the execution chat contained everything the execution chat needed to start without asking questions. The quality of the document determined the quality of the handoff. A weak brief produced a weak result every time, regardless of how capable the receiving chat was.
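A delegation brief can be sketched as a structure with required fields. The field names here are illustrative assumptions, not the author's actual template; the point is that the handoff carries everything the receiving chat needs to start without follow-up questions.

```python
from dataclasses import dataclass

@dataclass
class DelegationBrief:
    # Hypothetical field names; the principle is completeness, not this schema.
    objective: str       # what the receiving chat must produce
    context: str         # background it cannot be assumed to have
    constraints: str     # boundaries it must not cross
    done_criteria: str   # how both sides know the work is finished

brief = DelegationBrief(
    objective="Draft the launch announcement post",
    context="Strategy chat decided on a low-key launch; see master status",
    constraints="No pricing details; stay inside the social media domain",
    done_criteria="One post under 300 words, ready for orchestrator review",
)
```

A weak brief is one of these with an empty `context` or `done_criteria`: the receiving chat will fill the gap with a reasonable guess, and the guess will not match what the sending chat decided.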

These three elements, shared state, defined roles, structured handoffs, are the coordination layer. The agents themselves are almost incidental.

04

What this means for anyone building multi-agent systems now

The platforms have solved the technical problem. Agents can now communicate natively, share memory, pass tool results, and run in parallel without workarounds. That is genuinely better than what was built manually. I am not nostalgic for the workarounds.

But the methodology problem is unchanged. Spinning up five agents without defining what each one owns, how they hand off, and what shared state they all read from produces the same chaos as adding five developers to a project with no architecture, no documentation, and no defined responsibilities. The agents are faster than developers. The chaos arrives faster.

The Three Things

What has to exist before the agents matter at all

01

Role definition that is structural, not instructional

"You are a senior developer" is an instruction. "This agent handles file system operations and does not make architectural decisions" is structural. The difference shows up when agents encounter something ambiguous. Structural definition produces nothing when the work falls outside its scope, and that nothing is a signal that the work needs to route somewhere else. The agent stopping is the system working, not failing.

02

Shared state with a single owner

The coordination failure mode is not agents producing wrong answers. It is agents producing conflicting answers because each one had a different picture of the current state. One layer has to own the authoritative state and all agents have to read from it. Database, status file, dedicated orchestration agent, the medium does not matter. What matters is that there is one source of truth and every agent reads from it before producing output.

03

Handoffs that are complete, not conversational

A handoff that works is a document an agent can execute from without asking follow-up questions. A handoff that breaks is a summary that assumes context the receiving agent does not have. The test: could a completely fresh agent, with no conversation history, start from this document and produce the right output? If no, the handoff needs more context. The receiving agent will not save you from a weak handoff.
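The fresh-agent test can even be mechanical: a brief passes only if every field a context-free agent would need is present and non-empty. The required fields below are illustrative assumptions, not a canonical template.

```python
# Fields a context-free agent needs before it can execute (hypothetical set).
REQUIRED = ("objective", "context", "constraints", "done_criteria")

def handoff_is_complete(brief: dict) -> bool:
    """True only if every required field exists and is non-blank."""
    return all(brief.get(key, "").strip() for key in REQUIRED)

weak = {"objective": "Write the post", "context": ""}  # assumes shared context
strong = {
    "objective": "Write the launch post",
    "context": "Low-key launch decided by the strategy chat last week",
    "constraints": "No pricing details; under 300 words",
    "done_criteria": "Draft ready for orchestrator review",
}
```

This catches only the obvious failure, a missing field, not a vague one. But in practice most broken handoffs fail exactly here: the sender assumed the receiver already had the context.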

These are not insights about AI. They are insights about coordination that apply to any system where multiple parties need to work together on a shared goal. The AI part just makes the gaps in coordination methodology visible faster, because agents move faster than humans and they do not pause to ask whether the structure is sound before they execute.

What success looks like in a coordinated multi-agent system is mostly invisible. Nothing dramatic happens. The strategy chat hands a brief to the build chat, the build chat produces the output, the result comes back to the orchestrating layer, the next loop starts. No agent is doing anything heroic. No single output is impressive in isolation. The compounding happens across loops, across weeks, across months. That is what well-coordinated work looks like.

When a multi-agent system feels exciting, that usually means something is wrong. The exciting moments are the ones where the coordination broke down and somebody had to do recovery work that should not have been necessary.

05

Closing

The multi-agent future is real. The capability is here and it is going to keep improving. What it will not improve is the thinking required to coordinate work across multiple agents doing different things toward a shared goal. That is a methodology problem, not a capability problem.

The coordination architecture described here evolved into the system that runs today. A command center, specialized chats, delegation briefs, and a status document that keeps everything coherent. The agents got better. The methodology stayed the same because it was right from the start.

The methodology is public.

The framework methodology behind this coordination system, including delegation brief templates and orchestration patterns, lives at HowToFramework.

Visit HowToFramework.com
Mike Goetz

Mike Goetz is the founder of RageDesigner, where he has built systematic thinking methodology since 2003. His framework library now exceeds 600 documented frameworks. He teaches framework generation at whatisaframework.com and howtoframework.com.