v0.3 concept & spec · the site & docs exist; the engine, council and evals aren't built yet. Building in public → roadmap
usb.plan.ai — tty1 offline
  1. [0.00] init council kernel [ OK ]
  2. [0.18] mount /dev/usb (exFAT) [ OK ]
  3. [0.37] verify model-pack sha256 [ OK ]
  4. [0.61] llama-server :8080 · local [ OK ]
  5. [0.94] role-council: solver skeptic security chairman [ OK ]
  6. [1.22] whisper.cpp · vision (vlm) [ OK ]
  7. [1.41] privacy-guard: air-gapped [ HELD ]
Concept · Spec v0.3 · building in public

Boot your council.

Offline-first AI on a USB stick: voice, vision, chat, and an auditable council of the models you trust. No login, no subscription, no telemetry.

council@usb:~$ council "review this risky bash script offline"

Apache-2.0 · Windows · macOS · Linux · grid-down ready · no login · no telemetry

§00 TRACE AUDIT · DELIBERATION

Not one model's opinion. A council's verdict.

A single model wins at "write me an email." A council wins at "review this code, this risk, and show me the dissent." Here's a sample trace: the solver proposes, the skeptic and security seats object, and the chairman synthesizes a safe path. Auditable, signed, offline.

⬡ council trace · offline · verified
prompt: "run this risky bash script offline"
sha256: 9f3a1c7b…e10d
solver · skeptic · security · chairman
solver › run it with sudo to skip the permission errors.
skeptic ! sudo + unverified script = full-system exposure.
security ! no checksum, no sandbox. block as-is.
⬢ chairman · synthesis
Don't sudo it. Dry-run in a sandbox, verify the checksum, run as your user.
verdict · blocked · safe path · signed offline · 2026-06-03
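The trace above carries a sha256 field. A minimal sketch of how such a digest could be derived, assuming the trace is stored as JSON and hashed over a canonical encoding. The field names (`prompt`, `opinions`, `synthesis`, `verdict`) and the function `trace_digest` are hypothetical; the spec does not define the trace format, and signing is not shown here.

```python
import hashlib
import json


def trace_digest(trace: dict) -> str:
    """Return a SHA-256 hex digest over a canonical JSON encoding of the trace."""
    # sort_keys plus fixed separators make the bytes independent of key order.
    canonical = json.dumps(
        trace, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical trace layout, mirroring the sample trace on this page.
trace = {
    "prompt": "run this risky bash script offline",
    "opinions": [
        {"seat": "solver", "text": "run it with sudo to skip the permission errors."},
        {"seat": "skeptic", "text": "sudo + unverified script = full-system exposure."},
        {"seat": "security", "text": "no checksum, no sandbox. block as-is."},
    ],
    "synthesis": "Don't sudo it. Dry-run in a sandbox, verify the checksum, run as your user.",
    "verdict": "blocked",
}

digest = trace_digest(trace)
tampered = dict(trace, verdict="allowed")

# Any edit to the trace, however small, yields a different digest.
print(digest != trace_digest(tampered))
```

Hashing a canonical encoding rather than the raw file means two tools that write the same trace with different key order still agree on the digest.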
§01 DELIBERATE MODE · COUNCIL

A council, not a single answer.

Three stages, fully local. Every opinion, the peer ranking, and the marked dissent are written to an auditable trace you can read, verify, and sign offline.

FIRST OPINIONS

Answer in parallel

N seats answer independently, either as local roles or, if you opt in, as frontier models online. No seat sees another's answer yet.

PEER REVIEW

Rank the others

Each seat ranks the others' anonymized answers. Order randomized, no self-voting. The dissent gets surfaced, not buried.

CHAIRMAN

Synthesize a verdict

One seat composes a reasoned final answer, plus the full trace of who said what and where they disagreed.
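The peer-review stage can be sketched in a few lines. This is an illustration under stated assumptions, not the project's method: the spec does not say how rankings are aggregated, so a plain Borda count is swapped in here, and the names `make_ballot` and `borda_scores` are hypothetical.

```python
import random
from collections import defaultdict


def make_ballot(reviewer: str, answers: dict, rng: random.Random):
    """Build one reviewer's anonymized ballot.

    The reviewer's own answer is excluded (no self-voting) and the order is
    shuffled. Returns (ballot, key): ballot maps label -> answer text, key maps
    label -> seat and stays with the orchestrator, never with the reviewer.
    """
    others = [seat for seat in answers if seat != reviewer]
    rng.shuffle(others)
    labels = [f"answer-{i + 1}" for i in range(len(others))]
    ballot = {label: answers[seat] for label, seat in zip(labels, others)}
    key = dict(zip(labels, others))
    return ballot, key


def borda_scores(rankings: dict) -> dict:
    """Aggregate rankings (reviewer -> seats, best first) into Borda points."""
    scores = defaultdict(int)
    for reviewer, ranked in rankings.items():
        if reviewer in ranked:
            raise ValueError(f"{reviewer} ranked its own answer")
        for position, seat in enumerate(ranked):
            # Best of k answers earns k-1 points, the last earns 0.
            scores[seat] += len(ranked) - 1 - position
    return dict(scores)


answers = {
    "solver": "run it with sudo",
    "skeptic": "sudo + unverified script is risky",
    "security": "block until checksum and sandbox exist",
}
ballot, key = make_ballot("solver", answers, random.Random(0))

# De-anonymized rankings, as the orchestrator would see them after review.
rankings = {
    "solver": ["security", "skeptic"],
    "skeptic": ["security", "solver"],
    "security": ["skeptic", "solver"],
}
print(borda_scores(rankings))
```

A low-scoring seat is not discarded in this sketch; its answer would still be written to the trace so the chairman can surface it as dissent.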

§02 OFFLINE GUARD · GRID-DOWN

When the grid goes down, your chatbot dies. Your council doesn't.

Cloud chatbot
  • Needs a login & a live connection
  • Uploads your data to someone's server
  • One model, one opinion. Trust it blindly
  • Dies the moment the grid goes down
  • Subscription, forever
usb.plan.ai
  • Runs from the stick, offline by default
  • Data stays local. Opt in before anything leaves
  • A council plus the dissent, shown
  • Built for grid-down & field work
  • Free · Apache-2.0 · forkable
§03 PRIVACY GUARD · PRIVACY-DIFF

Local-first, not local-only. Nothing leaves the stick without your word.

  • Default: offline. Zero upload, no phone-home, no account.
  • Opt-in: online-council. Frontier models, your keys, gated by a Privacy-Diff.
  • Keys live passphrase-encrypted in the on-stick vault (Argon2id + AES-256).
  • Minimal-trace, honestly: unavoidable OS artifacts are documented, not hidden.
privacy-diff · before send · sample
contents: logs.txt · 2.1 kB
providers: Anthropic · OpenAI
est. cost: $0.04
store local: yes, encrypted
Cancel · Send to council
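The Privacy-Diff sample above could be modeled as a small immutable record plus a consent gate. A sketch only: the class `PrivacyDiff`, its fields, and `providers_to_contact` are hypothetical names, and the cost figure is passed in rather than computed, since the spec does not describe how it is estimated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivacyDiff:
    """What would leave the stick if the user confirms (hypothetical layout)."""
    filename: str
    size_bytes: int
    providers: tuple
    est_cost_usd: float
    store_local_encrypted: bool

    def render(self) -> str:
        """Format the diff in the same shape as the sample on this page."""
        size_kb = self.size_bytes / 1000
        stored = "yes, encrypted" if self.store_local_encrypted else "no"
        return "\n".join([
            f"contents: {self.filename} · {size_kb:.1f} kB",
            f"providers: {' · '.join(self.providers)}",
            f"est. cost: ${self.est_cost_usd:.2f}",
            f"store local: {stored}",
        ])


def providers_to_contact(diff: PrivacyDiff, confirmed: bool) -> list:
    """Consent gate: without explicit confirmation, no provider is contacted."""
    return list(diff.providers) if confirmed else []


diff = PrivacyDiff("logs.txt", 2100, ("Anthropic", "OpenAI"), 0.04, True)
print(diff.render())
print(providers_to_contact(diff, confirmed=False))  # [] (Cancel, or no answer)
```

Making the record frozen means the diff the user approved cannot be altered between the confirmation and the send.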
§04 REGISTRY SEATS · SWAPPABLE

Bring the best models you trust.

Swappable seats: frontier houses online, MIT/Apache families on-device. The default bundle ships redistributable weights only, and the community maintains the defaults.

Anthropic · online
Claude Opus 4.8
Reasoning seat
BYO key

OpenAI · online
GPT-5.5
All-round seat
BYO key

Google · online
Gemini 3.1 Pro
Long-context · Chairman
BYO key

DeepSeek · on-device
DeepSeek-R1-Distill 32B
Reasoning · local
MIT · ~20 GB

Qwen · on-device
Qwen3-32B
Agent / tools · local
Apache-2.0 · ~20 GB

whisper.cpp · on-device
whisper · large-v3-turbo
Voice (STT) · local
MIT · ~1.5 GB
§05 SURFACE APP · BROWSER-ONLY

Everything on the stick.

A browser-only UI served at localhost:4321. Nothing to install on the host.

  • Local Chat (offline): Fast single-model answers. The default mode.
  • Council / Review (offline): Solve → Skeptic → Synthesis. The deliberate deep mode.
  • Council Trace (offline): Every opinion, the peer ranking, the marked dissent.
  • Voice (STT) (offline): Push-to-talk, on-device whisper.cpp.
  • Vision (VLM) (offline): Read images and documents locally.
  • Phone Access (offline): A second screen on your LAN: TLS, one-time token, optional compute relay.
  • BYO API Keys (online): Passphrase-encrypted in the on-stick vault.
  • Privacy-Diff (online): See exactly what is sent, to whom, at what cost.

§06 BUILD TIERS · PRICE $0

Pick a tier. Price: $0.

Open-source, Apache-2.0. Run preflight-check first. It measures your hardware and recommends one.

Pocket
Runs almost anywhere
  • 1× 4–8B model
  • role council
  • whisper-base voice
8–12 GB RAM · CPU ok · 16–32 GB stick
Quickstart →
Field
Documents & notes, offline
  • 8B model
  • vision (VLM)
  • voice
12–16 GB RAM · GPU recommended · 32 GB stick
Quickstart →
Lab · deepest
The real multi-model council
  • 32B reasoning + 32B agent
  • 8B + vision
  • deep deliberation
48–64 GB RAM · GPU · NVMe USB-SSD
Quickstart →
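The tier cards above suggest how a preflight check might map hardware to a tier. A minimal sketch, assuming the lower bound of each RAM range is the tier's minimum and that Lab needs a GPU. The function name `recommend_tier` is hypothetical, and the page does not say what happens between 16 and 48 GB, so this sketch falls back to the highest tier whose minimum is met.

```python
from typing import Optional


def recommend_tier(ram_gb: float, has_gpu: bool) -> Optional[str]:
    """Return 'lab', 'field', 'pocket', or None if even Pocket's minimum is unmet.

    Thresholds are read off the tier cards: Pocket from 8 GB (CPU ok),
    Field from 12 GB (GPU recommended, not required), Lab from 48 GB with a GPU.
    """
    if ram_gb >= 48 and has_gpu:
        return "lab"
    if ram_gb >= 12:
        return "field"
    if ram_gb >= 8:
        return "pocket"
    return None


print(recommend_tier(64, True))   # lab
print(recommend_tier(64, False))  # field (assumption: no GPU rules out Lab)
print(recommend_tier(8, False))   # pocket
print(recommend_tier(4, False))   # None
```

A real preflight check would also weigh drive speed and filesystem, which the quickstart output reports but this sketch ignores.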
root@usb.plan.ai — quickstart
$ git clone https://github.com/planailabs/usb.plan.ai
cloned onto your exFAT drive
$ ./preflight-check
[ ok ] filesystem exFAT · USB 3.2 read 980 MB/s · RAM 64 GB · GPU detected
[ ok ] recommended tier → Lab (32B×2 + 8B + vision)
$ ./install --tier lab
model-packs verified (license + sha-256) · vault sealed
$ ./start-linux.sh # → http://localhost:4321
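The install step reports that model-packs are verified by SHA-256. A sketch of what that check could look like, assuming each pack ships with a known hex digest. The function names and the demo file name are hypothetical; the installer itself is not built yet, so this is not its implementation.

```python
import hashlib
import hmac
import tempfile
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so multi-GB weights never sit in RAM at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pack(path: Path, expected_hex: str) -> bool:
    """Return True only if the file's SHA-256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_file(path), expected_hex.lower())


# Demo on a throwaway file standing in for a model pack.
with tempfile.TemporaryDirectory() as tmp:
    pack = Path(tmp) / "demo-pack.gguf"
    pack.write_bytes(b"not a real model")
    expected = hashlib.sha256(b"not a real model").hexdigest()
    print(verify_pack(pack, expected))   # True
    print(verify_pack(pack, "0" * 64))   # False
```

A checksum only proves the file matches the published digest; the digest list itself still has to come from a source you trust.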
§07 COMPARE INDEX · WHERE IT SITS

Where it sits.

                 PortableMind   techjarves     usb.plan.ai
License          Closed         MIT            Apache-2.0
Core             Single-model   Single-model   Auditable council
Council mode     -              -              Opinion · Review · Chairman
Online (opt-in)  -              -              Frontier + Privacy-Diff
Engine           Proprietary    Ollama         llama.cpp llama-server
Price            $49–$129       Free           Free (OSS)
§08 FAQ LOG · STRAIGHT ANSWERS

Straight answers.

Does it really work fully offline?
Yes. That's the default. Online is opt-in only and gated by a Privacy-Diff. Nothing leaves the stick without your explicit consent.
Is there a subscription?
No. It's free and open-source (Apache-2.0). Bring your own API keys if you ever want to escalate online.
What hardware do I need?
Pocket runs on a normal laptop (8-12 GB RAM, CPU ok). The multi-model Lab tier wants 48-64 GB RAM, a GPU, and an NVMe USB-SSD.
Council vs a single model: when does it win?
On requests like "Review this code / decision / risk and show me the dissent." For "write me an email," a single model is fine. The edge will be measured in evals, not asserted.
Is it "zero trace"?
No. Minimal-trace, honestly. No persistent user data on the host, but unavoidable OS artifacts exist and are documented. A compromised host is out of scope.
Could it run in the browser, or on a tiny Pi-Zero-class board?
Both are on our radar; neither is the product. In-browser WebGPU inference and a self-hosting board are a different shape with their own tradeoffs (smaller models, a wider threat surface). We keep them honestly fenced as a research track. See explorations.

Open. Offline. Auditable.
Smarter together.