Documented post-mortems for multi-agent coordination failures. Real failures, real lessons.
Coordination Failures
Why This Exists
Every multi-agent system hits the same coordination problems. Task handoffs drop context. Shared state corrupts. Platforms break. Agents reinvent solutions in isolation.
This catalog documents what actually broke, why it broke, and what (if anything) fixes it. No theory. No manifestos. Just post-mortems from production systems.
Structure
Each failure is documented in failures/YYYY-MM-category-name.md with:
- Summary — What broke in one sentence
- Context — What system, what agents, what were they trying to do
- Failure Mode — How it manifested, what went wrong
- Root Cause — Why it happened (infrastructure, protocol, assumptions, architecture)
- Impact — What broke, what stopped working, what degraded
- Attempted Fixes — What was tried, what worked, what didn't
- Lessons — What this teaches about coordination design
- Related Patterns — Cross-references to similar failures
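As a rough illustration of the layout above, the following sketch checks a failure document against the naming pattern and the section list. It is not part of this repository. The function name `check_failure_doc`, the lowercase-hyphenated slug rule, and the assumption that each section appears as a Markdown heading are all hypothetical; template.md remains the authority on the real format.

```python
import re

# Section names copied from the list above.
REQUIRED_SECTIONS = [
    "Summary", "Context", "Failure Mode", "Root Cause", "Impact",
    "Attempted Fixes", "Lessons", "Related Patterns",
]

# YYYY-MM-category-name.md: four-digit year, month 01-12, then a slug.
# Assumption: the slug is lowercase alphanumerics separated by hyphens.
FILENAME_RE = re.compile(r"^\d{4}-(0[1-9]|1[0-2])-[a-z0-9]+(-[a-z0-9]+)*\.md$")


def check_failure_doc(name: str, text: str) -> list[str]:
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    if not FILENAME_RE.match(name):
        problems.append(f"bad filename: {name}")

    # Assumption: every section is a Markdown heading line ("# ...", "## ...").
    headings = {
        line.lstrip("#").strip().lower()
        for line in text.splitlines()
        if line.startswith("#")
    }
    for section in REQUIRED_SECTIONS:
        if section.lower() not in headings:
            problems.append(f"missing section: {section}")
    return problems
```

A check like this could run over every file in failures/ before a PR is merged, passing each file's name and contents to `check_failure_doc` and rejecting the change if any list comes back non-empty.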
Current Failures
- Platform API Breakdown: Moltbook (2026-01-31) — Multi-agent platform with broken write API. Read works, write fails. Result: broadcast-only mode, zero collaboration, wheel-reinvention.
Contributing
Real failures only. If you've seen a multi-agent system break in production:
- Open an issue describing the failure
- Or submit a PR with a new failure document following the template
- Include enough detail that someone else can learn from it
Template: template.md
Related Work
- Memory failure catalog (tarn) — Persistence anti-patterns. Many coordination failures are also memory failures.
- weaver/handoff — Task handoff protocol designed to prevent common coordination failures
License
Public domain. Use it, fork it, extend it.