The Anthropic Claude complex of 16 AI agents independently created a C compiler.

In an experiment, Anthropic set a group of 16 autonomous AI agents to build a C compiler in Rust from scratch. The result is a "pure" implementation capable of compiling the Linux 6.19 kernel and projects such as PostgreSQL, SQLite, Redis, FFmpeg, and QEMU, though it still lags well behind GCC in quality and efficiency.

How It Was Done
| Stage | What happened |
| --- | --- |
| Preparation | 16 instances of Claude Opus 4.6 were launched in separate Docker containers with no internet access. Each clones a shared Git repository and receives tasks via lock files. |
| Independent planning | No central coordinator: each agent decides on its own which "obvious" chunk of work to tackle next. Merge conflicts are resolved by automatic merging. |
| Development | The agents were tasked with writing a C compiler from scratch. The work took two weeks and nearly 2000 Claude Code sessions. |
| Testing | To avoid cluttering the model's context with long output, tests run in a summary mode (only a few lines of output). A fast mode that runs 1–10 % of the tests was added to speed things up. |
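The article does not show how the lock-file task distribution worked internally, but the idea of agents claiming work without a central coordinator can be sketched with atomic file creation: whichever agent creates a task's lock file first owns that task. The function and file names below are illustrative assumptions, not Anthropic's actual code.

```rust
use std::fs::OpenOptions;
use std::io::{ErrorKind, Write};

/// Try to claim a task by atomically creating its lock file in a
/// shared directory (e.g. inside the shared Git repository).
/// Returns Ok(true) if this agent won the claim, Ok(false) if
/// another agent already holds the lock.
fn try_claim_task(task_id: &str, agent_id: &str) -> std::io::Result<bool> {
    let lock_path = format!("locks/{task_id}.lock");
    match OpenOptions::new()
        .write(true)
        .create_new(true) // fails if the file already exists: the atomic claim
        .open(&lock_path)
    {
        Ok(mut f) => {
            // Record who holds the lock, for debugging and cleanup.
            writeln!(f, "{agent_id}")?;
            Ok(true)
        }
        Err(e) if e.kind() == ErrorKind::AlreadyExists => Ok(false),
        Err(e) => Err(e),
    }
}

fn main() -> std::io::Result<()> {
    std::fs::create_dir_all("locks")?;
    // First claim succeeds; a second agent trying the same task loses.
    println!("agent-07: {}", try_claim_task("parser-switch-stmt", "agent-07")?);
    println!("agent-12: {}", try_claim_task("parser-switch-stmt", "agent-12")?);
    Ok(())
}
```

`create_new(true)` is the key: the operating system guarantees that exactly one `open` call can create the file, so no two agents can claim the same task even without any coordinator process.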

Final Product
* Size – about 100 000 lines of Rust code.

* Functionality – can build Linux 6.19 on x86, ARM, and RISC‑V; compiles PostgreSQL, SQLite, Redis, FFmpeg, QEMU; passes ~99 % of GCC tests.

* Limitations – does not generate 16‑bit machine code (so building Linux still requires GCC for that part), the assembler and linker contain bugs, and the generated code is slower than GCC's. The quality of the Rust source also falls well short of what an experienced programmer would produce.

Cost of the Experiment
| Metric | Cost |
| --- | --- |
| Claude API tokens | ~$20,000 |
| Additional costs (model training, project organization, test suites) | Not included in the stated amount |

Lessons and Takeaways
1. Limits of autonomy – once the codebase grew to ~100 000 lines, the agents could no longer fully understand the project; this appears to be the current ceiling for unsupervised AI development.

2. Need for support – attempts to extend functionality often broke already working parts of the code.

3. Importance of the development environment – isolation from the internet and proper test configuration proved critical for stable agent operation.

Conclusion
The experiment shows that modern AI models can generate complex software systems with minimal oversight. However, they still cannot fully replace seasoned developers: code quality, performance, and reliability remain below those of traditional compilers, and project scale is limited to a few hundred thousand lines. It’s an important step forward but still far from fully autonomous software development.
