
Roadmap

Sharpened scope: ship a small working core first, because a working core activates contributors sooner than breadth does.

| Phase | Content | Rationale |
| --- | --- | --- |
| v0.1: Core | Portable llama-server + Local Chat + browser-only UI + preflight-check (FS/speed/RAM/GPU) + minimal-trace smoke test | A small working core activates contributors |
| v0.2: Council | Local role council + trace UI + demo prompts + bias guard | The wow moment, runnable on weak hardware |
| v0.3: Online | OpenRouter escalation + Privacy-Diff + cost cap + BYO keys (vault) | Premium depth on network, privacy consistent |
| v0.4: Evals | evals/ Council-vs-Single + local multi-model Lab tier | Proof of the core claim |
| v0.5: Multimodal | Voice (whisper.cpp) + Vision (VLM) + Phone Access (TLS) | Only after the Council wow is proven |
| v0.6: Local Distribution | Optional Wi-Fi “hub mode”: a powered stick (e.g. on a power bank) serves signed model-packs, tools, and a local council API to paired nearby devices. A reverse proxy fronts one /v1 endpoint, gated by a QR-paired bearer token | Field teams without per-device installs; internet-off LAN, not air-gapped |
| v1.0 | Signed releases, update/rollback, landing page live, hardware matrix/CI | A dependable spec |
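The v0.6 row describes a single /v1 endpoint gated by a QR-paired bearer token. As a rough illustration of what that gate could look like, here is a minimal Python sketch. The function names (`new_pairing_token`, `is_authorized`), the in-memory token set, and the token length are all assumptions made for the example, not part of the roadmap, and a real hub would enforce this check in the reverse proxy rather than in application code.

```python
import hmac
import secrets


def new_pairing_token() -> str:
    """Create a random token the hub could embed in a pairing QR code (hypothetical helper)."""
    return secrets.token_urlsafe(32)


def is_authorized(path: str, headers: dict[str, str], paired_tokens: set[str]) -> bool:
    """Return True only for /v1 requests that carry a paired bearer token.

    Assumes header names have already been normalized to "Authorization".
    """
    # Only the single /v1 endpoint is exposed; everything else is rejected.
    if not (path == "/v1" or path.startswith("/v1/")):
        return False

    scheme, _, presented = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not presented:
        return False

    # hmac.compare_digest makes each individual comparison constant-time.
    return any(
        hmac.compare_digest(presented.encode(), token.encode())
        for token in paired_tokens
    )
```

A paired device would then send the token it scanned on every request, for example `is_authorized("/v1/chat/completions", {"Authorization": f"Bearer {token}"}, paired)`, where the path is only an illustrative example. Keeping the paired tokens in a set also lets a single device be unpaired without re-pairing the others.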

VLM note: multimodal support in llama.cpp is version-fragile. “Just load a VLM GGUF” underestimates the work, which is why Vision is deliberately scheduled late (v0.5).

Not on this roadmap: browser-side WebGPU inference and a self-hosting board are scoped but unchosen. They live in explorations, not here, until an open question settles them.