Markdown Agent OS (MAOS): Multi-Agent Systems Built from Plain Text

June 1, 2026

📄 Read the full paper (PDF) — Markdown Agent OS (MAOS): Building Multi-Agent Systems from Markdown, Terminals, and Coding Agents.

I wrote this paper back in February 2026 and am only publishing it now. The core idea has only become more relevant since: you don't need a new framework to build powerful multi-agent systems. You need a terminal, a filesystem, and a coding agent — and the willingness to write your workflows as plain Markdown.

I call the pattern Markdown Agent OS (MAOS).

The problem with how we orchestrate agents

Coding agents like Claude Code are everywhere now. They read and write files, run shell commands, and call tools. On top of them, multi-agent frameworks promise specialized agents that plan, collaborate, and maintain long-running workflows.

But almost all of that orchestration is still code. Defining a workflow means writing Python graphs, config files, pipelines, or custom backends. Even when the framework is powerful, the barrier stays high: you have to learn its abstractions, manage dependencies, deploy services, and debug both your code and the model's behaviour.

For non-technical users, that's effectively impossible. For technical users, it's slow and fragile.

So I started from a different question: what if we don't build a new framework at all? What if the terminal, the filesystem, and a coding agent CLI we already have are the runtime — and everything else is just text?

The core idea

In MAOS, the workflow is the document. Tasks, steps, constraints, and expected outputs are written as Markdown, and an agent with local tool access interprets them and carries out the work in the same workspace.

The mental model is simple:

The terminal + filesystem are the body.
The coding agent CLI is the brain.
Markdown files are the nervous system — they define skills, workflows, and constraints in plain language.
Tools and MCP integrations are the hands — they browse, query APIs, update databases, and call external CLIs.

A skill is a natural-language description of a repeatable workflow, realised as a folder containing a SKILL.md file plus optional templates, reference docs, and scripts. A flow is a short trigger — like /new-idea or /client-research — that tells the agent which skill to run. Once the skill is written, no further coding is required.

This turns the terminal into a kind of natural-language operating system: flows are commands, skills are programs, tools are system calls, and the filesystem is shared memory.
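To make the flow-to-skill mapping concrete, here is a minimal Python sketch of how a trigger could be resolved to a skill file. It assumes a layout this post does not specify: one skills folder in which each subfolder is named after its flow and contains SKILL.md. The function names resolve_flow and list_flows are hypothetical, and in practice the coding agent performs this lookup itself rather than calling code like this.

```python
from pathlib import Path


def resolve_flow(trigger: str, skills_root: Path) -> Path:
    """Map a flow trigger such as "/new-idea" to its skill's SKILL.md file.

    Assumption: the skill folder has the same name as the flow.
    """
    name = trigger.strip().lstrip("/")
    # Reject empty names and anything that could point outside skills_root.
    if not name or "/" in name or "\\" in name or name.startswith("."):
        raise ValueError(f"not a valid flow trigger: {trigger!r}")
    skill_file = skills_root / name / "SKILL.md"
    if not skill_file.is_file():
        raise FileNotFoundError(f"no skill found for flow {trigger!r}")
    return skill_file


def list_flows(skills_root: Path) -> list[str]:
    """List every trigger whose folder directly under skills_root has a SKILL.md."""
    return sorted(f"/{p.parent.name}" for p in skills_root.glob("*/SKILL.md"))
```

With a file at skills/new-idea/SKILL.md, resolve_flow("/new-idea", Path("skills")) returns the path to that file, which is the document the agent would then read and follow.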

Why plain text wins

The whole argument of the paper comes down to four claims:

Markdown can be the orchestration layer. A workflow is plain-language steps, constraints, and output contracts that both humans and agents can read and edit.
The local workspace is the runtime. No new framework — just a filesystem plus an agent allowed to read/write files and run tools with your permission.
File-based workflows are portable and inspectable. "State" becomes human-readable artifacts — notes, drafts, logs, checklists — not hidden memory. You can audit, version, and reuse it.
The approach is model- and tool-agnostic. Any agent runtime that can read Markdown and operate on local files can implement it.

Moving a system to another machine means copying a folder. Updating the logic means editing text.

Multi-LLM collaboration, without orchestration code

One of my favourite parts is that MAOS enables multiple models to collaborate inside a single workflow — without any explicit orchestration code, agent graphs, sockets, or schedulers.

Claude Code typically acts as the orchestrator. Other CLIs — Codex, Gemini, Qwen — run headlessly as specialised sub-agents, each writing its output to a structured location in the shared workspace. When they finish, the main agent reads those files, cross-compares the results, and synthesises one cohesive deliverable.

For example, a /start-blog-research flow might trigger Codex for a technical literature scan, Gemini for mainstream sources, and Qwen for community discussions — then Claude aggregates, filters redundancy, and writes the unified draft.

Coordination emerges naturally through the shared filesystem and plain-text instructions. The filesystem is the lingua franca between agents, the Markdown skill is the conductor's score, and the coding agent CLI is the orchestra leader.
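The file-based handoff can be made concrete with a short Python sketch. In MAOS the orchestrating agent issues these commands itself through its shell tool, guided by the skill's Markdown, so this is an illustration of the mechanism and not code a user would need to write. The function names, the one-file-per-sub-agent layout, and the idea of capturing each CLI's standard output are all assumptions. The actual headless invocation of each CLI differs per tool and is deliberately left as a placeholder.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_subagent(name: str, command: list[str], out_dir: Path) -> Path:
    """Run one headless CLI and save its stdout as <out_dir>/<name>.md."""
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    if result.returncode == 0:
        body = result.stdout
    else:
        # Record the failure in the same file so it stays visible to reviewers.
        body = f"FAILED (exit code {result.returncode})\n\n{result.stderr}"
    out_file = out_dir / f"{name}.md"
    out_file.write_text(body, encoding="utf-8")
    return out_file


def fan_out(commands: dict[str, list[str]], out_dir: Path) -> list[Path]:
    """Run all sub-agents in parallel; return their output files in input order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor() as pool:
        futures = [
            pool.submit(run_subagent, name, command, out_dir)
            for name, command in commands.items()
        ]
        return [future.result() for future in futures]


def collect(out_dir: Path) -> str:
    """Concatenate every report in out_dir under a heading named after its file."""
    sections = []
    for path in sorted(out_dir.glob("*.md")):
        text = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {path.stem}\n\n{text}\n")
    return "\n".join(sections)
```

Here commands would map a label such as "codex" or "gemini" to whatever command line runs that CLI non-interactively on your machine. The string returned by collect is the material the main agent would read, cross-compare, and turn into the unified draft.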

Keeping a human in the loop

Because workflows can edit files and run commands, MAOS treats third-party skills as untrusted code and leans on a few safeguards:

Explicit scopes: which folders may be read or written.
Approval gates: required before anything irreversible. For example, the agent writes a human-readable content.md draft first and only commits SQL or takes other irreversible actions after approval.
Lightweight logs: so every run is inspectable and repeatable.
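The three safeguards above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how any particular agent CLI enforces permissions: the names in_scope and gated_write, the boolean approved flag (standing in for whatever way a human signs off), and the JSON-lines log format are all hypothetical.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def in_scope(target: Path, allowed_roots: list[Path]) -> bool:
    """True if target resolves to a location inside one of the allowed folders."""
    resolved = target.resolve()  # collapses ".." and follows symlinks
    return any(resolved.is_relative_to(root.resolve()) for root in allowed_roots)


def gated_write(target: Path, text: str, allowed_roots: list[Path],
                approved: bool, log_file: Path) -> bool:
    """Write text to target only if it is in scope and a human approved it.

    Every attempt, including blocked ones, is appended to log_file.
    """
    if not in_scope(target, allowed_roots):
        decision = "blocked: out of scope"
    elif not approved:
        decision = "skipped: awaiting approval"
    else:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        decision = "written"

    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "target": str(target),
        "decision": decision,
    }
    with log_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return decision == "written"
```

The log is plain text with one JSON object per line, so a reviewer can read it directly or filter it with ordinary command-line tools.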

Memory is just ordinary workspace state: durable Markdown files recording preferences, constraints, and past decisions. Humans can read it; agents can diff it.
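Because memory is plain text, comparing two versions of it needs nothing beyond a standard diff. As a small illustration, the sketch below uses Python's difflib; the function name memory_diff and the file name memory.md are hypothetical, and an agent could equally run a command-line diff or rely on version control.

```python
import difflib


def memory_diff(before: str, after: str, name: str = "memory.md") -> str:
    """Return a unified diff between two versions of a memory file.

    An empty string means nothing changed.
    """
    lines = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=f"{name} (before)",
        tofile=f"{name} (after)",
    )
    return "".join(lines)
```

Lines prefixed with - were removed and lines prefixed with + were added, so both the human and the agent can see exactly which preference or decision changed between runs.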

What it's actually done

In real use, this pattern has cut tasks such as client website analysis and long-form content creation from hours or days to under an hour, and sometimes to minutes, while keeping me in the loop for review and final calls. It is still limited by model context, occasional drift from the written plan, and the fragility of some tool integrations. But those limits are manageable, and they keep shrinking as models improve.

Most importantly, MAOS is not a product but a way of working: a philosophy of building multi-agent systems as collections of readable documents rather than opaque backends or complex graphs. Your workflows stay vendor-flexible and your data stays on your own machine. Even if a specific model or provider disappears, the projects and documents remain usable.

If multi-agent systems are going to become everyday collaborators rather than research projects, making them editable with the same tools we use to think and write may be one of the simplest paths forward.
