Architecture Decisions
Why The Mesh was built this way — key architecture decision records and the reasoning behind them
Every decision has a reason. If a decision seems wrong, read the context first. If it is still wrong, open a PR.
Decision Summary
| # | Decision | Choice | Why |
|---|---|---|---|
| ADR-001 | Server language | Go 1.24+ | Goroutines handle hundreds of concurrent WebSocket connections without Node.js event loop bottlenecks. |
| ADR-002 | Default database | SQLite (pure Go) | Zero-config, single file, no external process — passes the walkaway test. |
| ADR-003 | License | AGPL v3 | Network copyleft prevents cloud providers from taking the code without contributing back. |
| ADR-004 | Authorization | UCAN over OAuth | Decentralized proof chains work across federated meshes without a central authority. |
| ADR-005 | Identity | DID:key over DID:web | Self-sovereign identity with no DNS or CA dependency. |
| ADR-006 | Bot lifecycle | Kubernetes Pods | True isolation, resource limits, auto-restart — with subprocess fallback for dev. |
| ADR-007 | Frontend state | Redux Toolkit over Zustand | Enforced patterns, middleware for WebSocket sync, time-travel debugging via DevTools. |
| ADR-008 | Deployment | Single Go binary | One process, one port, one log stream — no microservices complexity for self-hosters. |
| ADR-009 | Agent protocols | MCP + A2A | Open standards with growing adoption — any MCP-compatible agent connects without Mesh-specific code. |
| ADR-010 | Model strategy | OSS-first | Self-hosted models keep data in your mesh; API models supported as convenience fallback. |
| ADR-011 | Funding | Token over VC | Open-source sovereignty infrastructure should not be owned by a venture fund. |
| ADR-012 | UI framework | Tailwind v4, dark-only | Consistent visual identity, simpler CSS — no dual-theme complexity. |
Expanded Highlights
Go Over Node.js (ADR-001)
The original v1 was Node.js/Express with an ECS architecture. Performance issues emerged with real-time WebSocket traffic at scale: high memory overhead and event loop bottlenecks under the combined load of WebSocket connections, bot lifecycle management, and file I/O.
Goroutines handle this kind of I/O-heavy concurrent work far more efficiently. The result is a single binary (~20MB) with no node_modules and no runtime dependency. The trade-off is the loss of TypeScript type sharing between server and client, mitigated by the packages/protocol package, which defines wire types as Zod schemas.
Rust was considered but rejected — too steep a learning curve for contributors, violating the walkaway test.
SQLite as Default (ADR-002)
The Mesh targets solo operators and small teams first. Requiring Postgres or MongoDB for a single-node deployment adds unnecessary infrastructure complexity. SQLite via modernc.org/sqlite (pure Go, no CGo) gives zero-config storage in a single file. Backup is a file copy.
The limitation — single writer, not suitable for horizontal scaling — is mitigated by the storage adapter pattern. The same API surface works with MongoDB for multi-node cloud deployments via the MONGODB_URI environment variable.
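The adapter selection described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Store` interface, `NewStore`, and the in-memory stand-in are hypothetical names, and only the `MONGODB_URI` variable comes from the text.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Store is a hypothetical storage adapter interface; the real API surface
// in The Mesh may look different.
type Store interface {
	Put(key string, value []byte) error
	Get(key string) ([]byte, error)
	Backend() string
}

// ErrNotFound is returned when a key has no stored value.
var ErrNotFound = errors.New("store: key not found")

// memStore stands in for a real SQLite or MongoDB adapter so that this
// sketch runs without any database installed.
type memStore struct {
	backend string
	data    map[string][]byte
}

func (m *memStore) Put(key string, value []byte) error {
	m.data[key] = append([]byte(nil), value...) // store a copy of the caller's slice
	return nil
}

func (m *memStore) Get(key string) ([]byte, error) {
	v, ok := m.data[key]
	if !ok {
		return nil, ErrNotFound
	}
	return v, nil
}

func (m *memStore) Backend() string { return m.backend }

// NewStore picks the backend from the environment: MongoDB when MONGODB_URI
// is set, SQLite otherwise. The lookup function is injected so callers can
// pass os.Getenv or a stub.
func NewStore(getenv func(string) string) Store {
	backend := "sqlite"
	if getenv("MONGODB_URI") != "" {
		backend = "mongodb"
	}
	return &memStore{backend: backend, data: make(map[string][]byte)}
}

func main() {
	s := NewStore(os.Getenv)
	if err := s.Put("room:1", []byte("hello")); err != nil {
		fmt.Println("put failed:", err)
		return
	}
	v, err := s.Get("room:1")
	if err != nil {
		fmt.Println("get failed:", err)
		return
	}
	fmt.Printf("backend=%s value=%s\n", s.Backend(), v)
}
```

Callers depend only on the interface, so moving from a single-node deployment to a multi-node one is a configuration change rather than a code change.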
AGPL v3 License (ADR-003)
If a cloud provider takes the code, wraps it in a managed service, and never contributes back, the open-source community gets nothing. MIT and Apache allow this explicitly.
AGPL v3 adds network copyleft: if you modify The Mesh and offer it as a service, you must share your modifications. A dual-license commercial option is available for enterprises that need an AGPL exemption. This dual-licensing model is well established (MongoDB, Elastic, and GitLab have all used variants of it).
UCAN Over OAuth (ADR-004)
Traditional auth systems require a central authority to validate tokens. In a federated mesh, there is no central authority. UCANs form cryptographic proof chains verifiable without contacting the issuer.
The Anti-CLU Principle, named for the antagonist in Tron: Legacy who granted himself escalating privileges, requires that capabilities only narrow, never expand. An agent spawned with read-only access to one room cannot grant itself write access to all rooms, regardless of what code it runs. UCAN enforces this at the protocol level.
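The narrowing rule can be illustrated with a small attenuation check. This is a simplified sketch, not the UCAN wire format: the `Capability` type, the `room:*` wildcard convention, and the function names are assumptions made for illustration, and a real verifier would also check signatures, expiry, and audience on every link in the chain.

```go
package main

import (
	"fmt"
	"strings"
)

// Capability is a hypothetical, simplified stand-in for a UCAN capability:
// a resource plus the set of actions allowed on it.
type Capability struct {
	Resource string          // e.g. "room:lobby", or "room:*" for all rooms
	Actions  map[string]bool // e.g. {"read": true}
}

// covers reports whether the parent resource includes the child resource.
// A trailing "*" on the parent is treated as a prefix wildcard.
func covers(parent, child string) bool {
	if parent == child {
		return true
	}
	if strings.HasSuffix(parent, "*") {
		return strings.HasPrefix(child, strings.TrimSuffix(parent, "*"))
	}
	return false
}

// Attenuates reports whether child is equal to or narrower than parent.
func Attenuates(parent, child Capability) bool {
	if !covers(parent.Resource, child.Resource) {
		return false
	}
	for action, allowed := range child.Actions {
		if allowed && !parent.Actions[action] {
			return false // child claims an action the parent never held
		}
	}
	return true
}

// ValidChain checks that every delegation narrows (or preserves) the
// capability it was derived from.
func ValidChain(chain []Capability) bool {
	for i := 1; i < len(chain); i++ {
		if !Attenuates(chain[i-1], chain[i]) {
			return false
		}
	}
	return true
}

func main() {
	root := Capability{Resource: "room:*", Actions: map[string]bool{"read": true, "write": true}}
	agent := Capability{Resource: "room:lobby", Actions: map[string]bool{"read": true}}
	escalated := Capability{Resource: "room:*", Actions: map[string]bool{"read": true, "write": true}}

	fmt.Println(ValidChain([]Capability{root, agent}))            // true: the agent only narrowed
	fmt.Println(ValidChain([]Capability{root, agent, escalated})) // false: the agent tried to widen
}
```

The second chain fails because the read-only agent attempts to delegate more than it holds, which is exactly the escalation the principle forbids.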
Single Binary, Not Microservices (ADR-008)
The Mesh targets self-hosters running on a single machine or a small cluster. A microservices architecture requires service discovery, inter-service communication, and distributed tracing, adding operational complexity that violates the walkaway test.
One Go binary owns everything: HTTP API, WebSocket, auth, storage, bot lifecycle, model proxy, federation. One process, one port (4000), one deployment unit. Copy the binary, set environment variables, run it. Horizontal scaling happens at the mesh-to-mesh level (federation), not at the process level.
OSS-First Model Architecture (ADR-010)
Running open-source models on your own hardware is both a sovereignty position and a security decision. Your data never leaves your mesh. No API provider sees your prompts or completions. No third party trains on your data.
The model proxy at /api/models/v1/chat/completions is OpenAI-compatible, so switching from a centralized API to a self-hosted model (Llama, Mistral, DeepSeek) requires changing a URL, not rewriting integration code. API models are supported as a convenience fallback, not the default path.