Memory · Learning · Isolation · Convergence — compliant from the ground up.
Live scrape of the control plane — current state of the system as of 2026-05-23 18:55 UTC.
Swarm mode paused · bandit strategy thompson
The same prompt fired twice doesn't re-learn the lesson — the result is durable, not chat-window-bound.
| Task type | Prompt variant | Samples | Win |
|---|---|---|---|
| code-change | test-aware | 9 | 45.5% |
| code-change | plan-first | 10 | 41.7% |
| code-change | minimal | 10 | 41.7% |
| code-change | diff-focused | 9 | 36.4% |
| feature | scaffold | 6 | 30.0% |
| feature | default | 10 | 25.0% |
| bug-fix | test-aware | 6 | 25.0% |
| feature | plan-first | 7 | 22.2% |
Top arms by Thompson-sampling win rate from 30,687 observed task outcomes across 651 arms. No synthetic data on this slide.
| Task type | Model arm | Pulls | Win | Latency |
|---|---|---|---|---|
| CI-gate fix | OpenAI sonar-pro | 65 | 97.2% | 10.6s |
| CI-gate fix | openai_native:gpt-4.1 | 77 | 97.2% | 15.4s |
| code-change | DeepSeek deepseek-chat | 908 | 97.1% | 73.9s |
| code-change | Gemini gemini-3.1-flash-lite-preview | 2,141 | 97.1% | 24.3s |
| code-change | groq:qwen3-32b | 5,015 | 97.1% | 7.3s |
| code-change | groq:gpt-oss-120b | 5,068 | 97.1% | 4.8s |
| CI-gate fix | DeepSeek deepseek-chat | 732 | 97.1% | 34.2s |
| CI-gate fix | Gemini gemini-pro-latest | 392 | 97.1% | 54.8s |
Strategy: thompson. Arm cold-start at 10 pulls; rebalance runs continuously.
| Worker | Type | Capacity | Last seen |
|---|---|---|---|
| sidhe-spark | light | 5000 | 4613s |
| sluagh-d9278465 | heavy | 10 | 131s |
Live snapshot of the task store — same store the bandit, verifiers, and failure-triage coach all read from:
| ID | Type | Status | Title |
|---|---|---|---|
| d462a277-4a50-435e… | research | completed | [lore] X-ray: hf-trl 1.0.0 — medium-risk (score:50) |
| task_verify_532126… | verify | completed | Verify: Update lint-staged: 16.4.0 → 17.0.5 (major) |
| task_verify_532126… | verify | completed | Verify: Update eslint-plugin-jsdoc: 62.9.0 → 63.0.0 (major) |
| task_verify_532126… | verify | completed | Verify: Update @types/node: 24.12.4 → 25.9.1 (major) |
| task_verify_532126… | verify | completed | Verify: Update @google/genai: 1.50.1 → 2.6.0 (major) |
| task_verify_532126… | verify | completed | Verify: Update zod: 4.3.6 → 4.4.3 (minor) |
Failures don't disappear and they don't page anyone first. The control plane has four loops that try to repair before a human is involved:
Cold-start floor is 10 pulls per arm. Bandits don't get to pick until each option has been tried at least that many times — so the win rates on the previous slide are earned, not assumed.
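The cold-start rule above can be sketched in a few lines. This is a minimal illustration, not the control plane's actual router: the `Arm` class, `choose_arm`, `record`, and the uniform Beta(1, 1) prior are all assumptions made for the example. Only the floor of 10 pulls and the Thompson strategy come from the deck.

```python
import random
from dataclasses import dataclass

COLD_START_FLOOR = 10  # from the deck: each arm needs 10 pulls before sampling applies


@dataclass
class Arm:
    name: str
    wins: int = 0
    pulls: int = 0


def choose_arm(arms, rng=random):
    """Pick an arm: force exploration below the floor, then Thompson-sample."""
    # Cold start: any arm under the floor is tried first (least-pulled goes first).
    cold = [a for a in arms if a.pulls < COLD_START_FLOOR]
    if cold:
        return min(cold, key=lambda a: a.pulls)
    # Thompson sampling: draw once from each arm's Beta posterior, keep the largest draw.
    # Beta(1 + wins, 1 + losses) assumes a uniform prior (an assumption of this sketch).
    return max(arms, key=lambda a: rng.betavariate(1 + a.wins, 1 + a.pulls - a.wins))


def record(arm, won):
    """Update an arm with one observed outcome."""
    arm.pulls += 1
    arm.wins += int(won)
```

Under this sketch, an arm with 3 pulls is always selected ahead of a well-performing arm with 10 or more, which is the behaviour the floor is meant to guarantee.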
Calibration state on every task type:
Fully calibrated: true
If a new model is added, the calibration counter drops and the router automatically explores it before letting it influence routing decisions.
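One way to read the calibration flag is as a check that every arm for a task type has cleared the cold-start floor. The sketch below assumes that reading; `calibration_state` and the `new-model` arm are hypothetical names, while the pull counts for the two groq arms are taken from the table earlier in the deck.

```python
COLD_START_FLOOR = 10  # pulls required per arm, as stated in the deck


def calibration_state(pulls_by_arm):
    """Return (fully_calibrated, arms_still_exploring) for one task type."""
    # An arm is pending while it has fewer pulls than the floor.
    pending = sorted(
        name for name, pulls in pulls_by_arm.items() if pulls < COLD_START_FLOOR
    )
    return (not pending, pending)


# Pull counts from the code-change rows of the model-arm table.
code_change = {"groq:qwen3-32b": 5015, "groq:gpt-oss-120b": 5068}
before = calibration_state(code_change)

# Adding a model (hypothetical name) with zero pulls drops the calibrated flag.
code_change["new-model"] = 0
after = calibration_state(code_change)
```

Here `before` is `(True, [])` and `after` is `(False, ["new-model"])`, so the router would know to explore the new arm before trusting its statistics.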
Every dispatch is priced from provider token usage and written to a reward/cost ledger — the same ledger the bandit reads to update arm rewards.
| Task type | Dispatches | Lifetime cost | Avg latency |
|---|---|---|---|
| code-change | 18,597 | $27.55 | 16.4s |
| CI-gate fix | 5,218 | $12.35 | 20.9s |
| failure-triage | 3,377 | $7.96 | 24.5s |
| context-analysis | 2,595 | $5.00 | 10.2s |
| research | 424 | $1.51 | 22.3s |
Zero lifetime fallbacks = bandit's first pick succeeded on every routed task; no degraded-mode dispatch.
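
A reward/cost ledger of the kind described above might look like the following. Everything here is illustrative: the `LedgerEntry` fields, the `example-model` arm, and the per-million-token prices are assumptions, not the system's schema or any provider's real pricing.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens; real provider pricing differs.
PRICES = {"example-model": {"input": 0.50, "output": 1.50}}


@dataclass(frozen=True)
class LedgerEntry:
    task_type: str
    arm: str
    cost_usd: float
    reward: float  # 1.0 for a win, 0.0 for a loss


def price_dispatch(arm, input_tokens, output_tokens):
    """Price one dispatch from the token usage the provider reports."""
    p = PRICES[arm]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000


def summarize(ledger, task_type):
    """Aggregate the rows a cost table or a bandit update would read."""
    rows = [e for e in ledger if e.task_type == task_type]
    return {
        "dispatches": len(rows),
        "cost_usd": sum(e.cost_usd for e in rows),
        "wins": sum(e.reward for e in rows),
    }


ledger = [
    LedgerEntry("code-change", "example-model",
                price_dispatch("example-model", 2000, 1000), 1.0),
    LedgerEntry("code-change", "example-model",
                price_dispatch("example-model", 4000, 500), 0.0),
]
summary = summarize(ledger, "code-change")
```

For scale, the table's own figures work out to roughly $0.0015 per code-change dispatch ($27.55 over 18,597 dispatches).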
PROXY_API_KEY required on internal hops; no anonymous calls.
CLOUDFLARE_ACCOUNT_ID · CF_WORKERS_AI_TOKEN · CF_WORKERS_TOKEN · CF_DNS_TOKEN · DEEPSEEK_API_KEY · ANTHROPIC_API_KEY · GEMINI_API_KEY · PERPLEXITY_API_KEY
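A check of this kind on an internal hop could be sketched as below. The `Authorization: Bearer` header format and the `is_authorized` helper are assumptions for illustration; the deck only states that `PROXY_API_KEY` is required.

```python
import hmac
import os


def is_authorized(headers, env=os.environ):
    """Return True only if the caller presents the configured proxy key."""
    expected = env.get("PROXY_API_KEY", "")
    presented = headers.get("Authorization", "")
    # Fail closed: no configured key or no bearer credential means rejection.
    if not expected or not presented.startswith("Bearer "):
        return False
    token = presented[len("Bearer "):]
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(token.encode(), expected.encode())
```

Reading the key from the environment rather than from source keeps it out of the codebase, and rejecting when the key is unset means a misconfigured hop does not silently accept anonymous calls.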
Closes the three places agentic stacks usually leak:
keys in code · open admin · untrusted callers
The same architecture that runs inside firewalled networks maps directly to SOC 2 / GDPR / HIPAA / ISO 27001 controls. Each property is a property of the build, not a checkbox added later:
| Framework | Control area | How it's already satisfied |
|---|---|---|
| SOC 2 | CC6 / CC7 — access & monitoring | Zero-trust access on admin surface; append-only event store for every dispatch. |
| GDPR | Art. 25 / 32 — data minimization, integrity | No client-side keys, no shadow copies; provider calls proxied through one audited gateway. |
| HIPAA | §164.312 — audit, transmission security | HMAC-verified ingress; bearer-token egress; per-task microVM isolation. |
| ISO 27001 | A.5 / A.8 / A.12 — policy, asset, ops | KV vault as single source of truth (82 keys); rotation in-place; outbound restricted to webhook-pull. |
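
HMAC-verified ingress, as named in the HIPAA row, generally means the receiver recomputes a keyed digest over the raw request body and compares it with the signature the sender attached. The sketch below assumes HMAC-SHA256 with a hex-encoded signature; the deck does not state which hash or encoding the system uses, and `sign` and `verify_ingress` are hypothetical helper names.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_ingress(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Accept the request only if the signature matches the body."""
    expected = sign(secret, body)
    # compare_digest runs in constant time to resist timing attacks.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

A tampered body or a signature produced with a different secret both fail verification, so the receiver can drop the request before any task is created.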
Side effect: the system runs where most agent stacks can't — locked-down corporate networks, regulated environments, air-gapped review.
This deck is informational — a snapshot of what's built and where it's heading. Not a proposal.