The challenge: orchestrating AI agents in real time
When we started designing Orkestr8's orchestration engine, one thing was clear: every millisecond matters. An AI agent waiting 200ms for a router response means 200ms of added latency per user interaction. Multiply that by 50 requests per session and you've added ten seconds of cumulative waiting: the product feels slow even if the LLMs themselves respond in under a second.
We needed a runtime capable of handling thousands of simultaneous connections, routing requests to the right LLM model in microseconds, and maintaining consistent state for each agent session — all without compromising on reliability.
Why not Node.js or Go?
Node.js was the obvious candidate: rich ecosystem, easy hiring, solid async I/O handling. But the garbage collector is unpredictable. In our benchmarks, GC pauses of 10 to 50ms occurred regularly under load — unacceptable for a real-time router that needs to maintain p99 latencies under one millisecond.
Go offered better performance guarantees with goroutines. But memory management remained non-deterministic, and the lack of algebraic data types made modeling our agent states fragile. Every invalid state the type system couldn't rule out was a potential production bug.
Rust gave us exactly what we needed: predictable performance without GC, a type system that makes invalid states unrepresentable, and a mature async ecosystem with Tokio.
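To make that concrete, here's a minimal sketch of what "invalid states are unrepresentable" looks like in practice. The names and fields are illustrative, not Orkestr8's actual types: each enum variant carries exactly the data its state needs, so a completed session simply has no field for a pending approval, and the compiler rejects any match that forgets a state.

```rust
use std::time::Instant;

// Illustrative only: each variant carries exactly the data that state
// needs, so an invalid combination cannot even be constructed.
enum SessionState {
    Pending { queued_at: Instant },
    AwaitingApproval { request_id: u64 },
    Running { model: &'static str },
    Completed { tokens_used: u32 },
}

fn describe(state: &SessionState) -> String {
    // The compiler rejects this match if any variant is left unhandled.
    match state {
        SessionState::Pending { .. } => "queued".to_string(),
        SessionState::AwaitingApproval { request_id } => {
            format!("waiting on approval #{request_id}")
        }
        SessionState::Running { model } => format!("running on {model}"),
        SessionState::Completed { tokens_used } => {
            format!("done, {tokens_used} tokens used")
        }
    }
}

fn main() {
    let state = SessionState::Running { model: "gpt-4o" };
    println!("{}", describe(&state));
}
```

In Go, the equivalent would be a struct with a status string plus a handful of nullable fields, and nothing stops a "completed" session from carrying a dangling approval ID.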
The starting point: Zeroclaw
Orkestr8 didn't start from scratch. We built on top of Zeroclaw, an open-source multi-agent orchestration framework written in Rust. Zeroclaw gave us the solid foundations we needed: a high-performance async runtime, a modular architecture based on Rust traits, and a security-first philosophy baked in from the start.
From that base, we built the layers specific to Orkestr8: the intelligent LLM router with its circuit breaker, the human approval system, the context engine with vector memory, and the API Gateway that exposes it all. Zeroclaw saved us months of development on the foundations and let us focus our energy on what makes the product different.
The orchestration engine architecture
Orkestr8's core, built on Zeroclaw's foundations, is an asynchronous event loop powered by Tokio. Each incoming request is processed in a lightweight Tokio task (a green thread) with no dynamic allocation on the critical path. The LLM router consults a decision table compiled ahead of time from pre-computed embeddings of task descriptions.
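As a rough sketch of the decision-table idea (toy dimensions and illustrative names, not the production code), routing reduces to a dot-product scan over embeddings baked in at build time. Assuming the vectors are L2-normalized, the dot product equals cosine similarity, and nothing on this path allocates:

```rust
// Illustrative only: a tiny "decision table" of routes, each tagged
// with a pre-computed task-description embedding. Real embeddings are
// hundreds of dimensions; four is enough to show the idea.
struct Route {
    model: &'static str,
    embedding: [f32; 4],
}

fn dot(a: &[f32; 4], b: &[f32; 4]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

// Pick the route whose embedding is most similar to the request's.
// No allocation: just loads, multiplies, and comparisons.
fn pick_route<'a>(routes: &'a [Route], request: &[f32; 4]) -> &'a Route {
    routes
        .iter()
        .max_by(|a, b| {
            dot(&a.embedding, request)
                .partial_cmp(&dot(&b.embedding, request))
                .unwrap_or(std::cmp::Ordering::Equal)
        })
        .expect("decision table is never empty")
}

fn main() {
    let table = [
        Route { model: "fast-small", embedding: [0.9, 0.1, 0.0, 0.0] },
        Route { model: "slow-large", embedding: [0.1, 0.9, 0.0, 0.0] },
    ];
    let request = [0.2, 0.8, 0.0, 0.0];
    println!("routed to {}", pick_route(&table, &request).model);
}
```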
The circuit breaker protecting against LLM provider failures is implemented as a typed finite state machine. Thanks to Rust's enum system, every state transition is verified at compile time. It's literally impossible to go from an 'open' state to 'closed' without going through 'half-open' — the compiler refuses.
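One way to realize that compile-time guarantee, sketched here with illustrative names rather than our exact implementation, is the typestate pattern: each breaker state is its own type, transitions consume `self`, and a transition we never wrote (such as `Open` straight to `Closed`) simply does not exist in the API, so code attempting it fails to compile.

```rust
use std::time::{Duration, Instant};

// Typestate sketch: the state lives in the type parameter, so the
// compiler only accepts the transitions we explicitly define.
struct Closed;
struct Open { since: Instant }
struct HalfOpen;

struct Breaker<S> { state: S }

impl Breaker<Closed> {
    fn trip(self) -> Breaker<Open> {
        Breaker { state: Open { since: Instant::now() } }
    }
}

impl Breaker<Open> {
    // The only exit from Open is HalfOpen. There is no `close()` on
    // Breaker<Open>, so jumping straight to Closed cannot compile.
    fn try_probe(self, cooldown: Duration) -> Result<Breaker<HalfOpen>, Breaker<Open>> {
        if self.state.since.elapsed() >= cooldown {
            Ok(Breaker { state: HalfOpen })
        } else {
            Err(self)
        }
    }
}

impl Breaker<HalfOpen> {
    fn close(self) -> Breaker<Closed> { Breaker { state: Closed } }
    fn reopen(self) -> Breaker<Open> {
        Breaker { state: Open { since: Instant::now() } }
    }
}

fn main() {
    let breaker = Breaker { state: Closed };
    let open = breaker.trip();
    // open.close(); // rejected by the compiler: no such method
    if let Ok(half_open) = open.try_probe(Duration::from_secs(0)) {
        let _closed = half_open.close();
    }
}
```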
For agent session storage, we use a lock-free LRU cache based on atomic operations. Reads are wait-free (never blocked), and writes use a compare-and-swap mechanism that guarantees consistency without a global mutex.
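The production cache has more machinery, but the slot discipline can be sketched as follows. This is a deliberate simplification with assumed names: fixed capacity, 32-bit keys with 0 reserved as "empty", no recency tracking, and no heap values (which would additionally require safe memory reclamation, e.g. epoch-based). Reads are one atomic load; contended writes retry a compare-and-swap instead of taking a lock.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

const SLOTS: usize = 1024;

// One AtomicU64 per slot: high 32 bits hold the session key
// (0 means "empty"), low 32 bits hold the value.
struct SessionCache {
    slots: Vec<AtomicU64>,
}

impl SessionCache {
    fn new() -> Self {
        Self { slots: (0..SLOTS).map(|_| AtomicU64::new(0)).collect() }
    }

    // Wait-free read: a single atomic load, never a lock or a retry.
    fn get(&self, key: u32) -> Option<u32> {
        let packed = self.slots[key as usize % SLOTS].load(Ordering::Acquire);
        ((packed >> 32) as u32 == key).then(|| packed as u32)
    }

    // Lock-free write: claim the slot only if it is empty or already
    // ours, retrying the compare-and-swap if another writer races us.
    fn try_put(&self, key: u32, value: u32) -> bool {
        let slot = &self.slots[key as usize % SLOTS];
        let new = ((key as u64) << 32) | value as u64;
        let mut current = slot.load(Ordering::Relaxed);
        loop {
            if current != 0 && (current >> 32) as u32 != key {
                return false; // slot held by a different session
            }
            match slot.compare_exchange_weak(current, new, Ordering::Release, Ordering::Relaxed) {
                Ok(_) => return true,
                Err(seen) => current = seen,
            }
        }
    }
}

fn main() {
    let cache = SessionCache::new();
    assert!(cache.try_put(42, 7));
    assert_eq!(cache.get(42), Some(7));
}
```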
Production results
In production, the numbers speak for themselves. Median LLM router latency is 0.3ms, and p99 latency is 0.8ms even under 10,000 requests per second. The engine's memory footprint is 45 MB while handling 5,000 simultaneous agent sessions.
Compared to our initial Node.js prototype, we reduced latency by 40x and memory usage by 8x. But the most significant gain is reliability: zero production crashes since launch, thanks to the Rust compiler's guarantees.
The cost of this approach? A steeper learning curve for the team and longer compile times. But for a component as critical as the orchestration engine, the tradeoff was obvious.
What Rust doesn't solve
Rust isn't the solution to every problem. Our web dashboard is in Next.js, our CLI in TypeScript, and our document processing workers in Python. Each tool has its place.
The real lesson from our experience is that language choice should be driven by the problem's constraints, not by hype. For a real-time orchestration engine handling thousands of connections, Rust was the right choice. For a responsive user interface, React remains unbeatable.
Ready to try Orkestr8?
Start for free with the Community plan. No credit card required.