yothere has a small vocabulary. Learn these nine words and the rest of the docs read easily. The one thing to hold onto: yothere is the interface and the attention router. It does not do the work. A brain does. Everything below is either part of the interface, part of the router, or the contract to a brain.
Thread
A thread is one task. You hail yothere with a task and it creates a thread to carry that task to completion.
Each thread is a directory on disk (a dir-per-thread state machine). Its live state is kept in status.json, which is written via an atomic rename so a reader never sees a half-written file. A thread resumes across turns by its session id, so a brain keeps its own context and yothere just points at the same session again.
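The write-then-rename pattern described above can be sketched as follows. This is a minimal illustration, not yothere's actual code: the function name and the fields in the example status are assumptions, and only the status.json filename comes from the text.

```python
import json
import os
import tempfile


def write_status_atomically(thread_dir: str, status: dict) -> str:
    """Write status.json so a reader sees either the old file or the new one."""
    final_path = os.path.join(thread_dir, "status.json")
    # The temp file lives in the same directory so the rename stays on one
    # filesystem; os.replace is only atomic within a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=thread_dir, prefix=".status-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(status, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        os.replace(tmp_path, final_path)  # atomic swap into place
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path
```

A caller would pass the thread's directory and the new state, for example `write_status_atomically(thread_dir, {"state": "running"})`, where the `state` key is a hypothetical field name.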
A thread moves through a fixed set of states:
| State | Meaning |
|---|---|
| running | A worker is advancing it right now. |
| resumed | You just replied; it will advance on the next tick. |
| blocked | It is stalled on you (it asked a question or wants an approval). |
| done | The work finished. |
| stuck | Its worker died or went stale; it needs a look. |
| parked | Settled and set aside (for example an ask nobody answered for a day). |
The board and the attention router read a smaller, normalized vocabulary derived from those states: running, ready (a finished done thread), blocked (covers both blocked and stuck), and idle (a parked thread). So a “ready” or “idle” thread you see in the cockpit is the glance-level view of the underlying thread state.
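The normalization can be pictured as a small lookup table. The sketch below is illustrative only: the names `BOARD_STATE` and `board_state` are hypothetical, and the mapping for `resumed` is an assumption, since the text does not say which glance-level state it falls under.

```python
# Thread state -> glance-level state used by the board and the router.
BOARD_STATE = {
    "running": "running",
    "resumed": "running",  # ASSUMPTION: not stated in the docs
    "done": "ready",       # a finished thread is "ready" for you
    "blocked": "blocked",
    "stuck": "blocked",    # blocked covers both blocked and stuck
    "parked": "idle",
}


def board_state(thread_state: str) -> str:
    """Map an underlying thread state to its normalized board state."""
    try:
        return BOARD_STATE[thread_state]
    except KeyError:
        raise ValueError(f"unknown thread state: {thread_state!r}") from None
```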
Runner
The runner is the headless advance engine. It is the background service that actually moves the fleet forward. It runs a loop, one tick roughly every 30 seconds, and on each tick it:
- Reaps finished worker subprocesses.
- Sweeps stale threads (a `running` thread whose worker died or went silent) to `stuck`.
- Checks the cost caps before spawning anything new.
- Asks the attention router which threads to advance, bounded by a max concurrency and a per-thread cooldown.
- Spawns one worker subprocess per chosen thread.
- Coalesces any new “needs your eyes” transitions into ONE rate-limited nudge (“N threads need your eyes, open the board”) rather than a per-event firehose.
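Two of the steps above, bounded selection and nudge coalescing, can be sketched in a few lines. This is an assumption-heavy illustration rather than the runner's real code: the function names, the `fresh_replies` parameter, and the 300-second nudge gap are all hypothetical. The concurrency of 3 and the 180-second cooldown are the documented defaults.

```python
def choose_threads(ranked_ids, active_workers, last_turn_at, now,
                   fresh_replies=frozenset(), max_concurrency=3, cooldown_s=180.0):
    """Pick threads to spawn this tick, bounded by concurrency and cooldown."""
    free_slots = max(0, max_concurrency - active_workers)
    chosen = []
    for thread_id in ranked_ids:
        if len(chosen) >= free_slots:
            break
        last = last_turn_at.get(thread_id)
        # A first turn (no recorded turn) and a fresh reply bypass the cooldown.
        cooling = last is not None and now - last < cooldown_s
        if cooling and thread_id not in fresh_replies:
            continue
        chosen.append(thread_id)
    return chosen


def coalesced_nudge(newly_needing_eyes, last_nudge_at, now, min_gap_s=300.0):
    """Return one summary nudge, or None if nothing is new or one went out recently."""
    if not newly_needing_eyes:
        return None
    if last_nudge_at is not None and now - last_nudge_at < min_gap_s:
        return None  # rate limit: stay quiet (min_gap_s is a hypothetical value)
    n = len(newly_needing_eyes)
    noun = "thread needs" if n == 1 else "threads need"
    return f"{n} {noun} your eyes, open the board"
```

With one worker already active, `choose_threads(["a", "b", "c", "d"], 1, {"a": 100.0, "b": 0.0}, 200.0)` skips `a` (still cooling down) and fills the two free slots with `b` and `c`.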
Run it with `yothere runner loop`.
Attention router
The attention router is the moat. Other agent cockpits decide which agent runs next. yothere decides which thread wants the human’s eyes next, and it protects your focus while the rest of the fleet churns.
It is deterministic and offline-first (no model call, no online learning), so it is unit-testable and never stalls on a network hop. It ranks the whole fleet by an additive score (a blocked thread that is stalling on you outranks a finished one, urgency and staleness nudge the order), recommends a single focus thread, and honors a pinned focus so a thread you chose to concentrate on is never bumped by the ranker.
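To make the additive score concrete, here is a minimal sketch of a deterministic ranker with a pinned focus. Every weight, field name, and function name below is an assumption chosen for illustration; the only properties taken from the text are that the score is additive, that a blocked thread outranks a finished one, that urgency and staleness only nudge the order, and that a pin always wins.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical base weights per normalized state; blocked outranks ready.
BASE = {"blocked": 100.0, "ready": 50.0, "running": 10.0, "idle": 0.0}


@dataclass(frozen=True)
class ThreadView:
    thread_id: str
    state: str                 # normalized board state
    urgency: float = 0.0       # hypothetical 0..1 signal
    stale_hours: float = 0.0   # hours since the thread last changed


def score(t: ThreadView) -> float:
    # Additive: base weight plus small nudges, capped so they cannot
    # lift a ready thread above a blocked one.
    return BASE[t.state] + 20.0 * t.urgency + min(t.stale_hours, 10.0)


def recommend_focus(threads: Sequence[ThreadView],
                    pinned_id: Optional[str] = None) -> Optional[str]:
    """Return the single focus thread id, honoring a pin if one is set."""
    if pinned_id is not None and any(t.thread_id == pinned_id for t in threads):
        return pinned_id
    if not threads:
        return None
    # Sort by score descending, then by id, so ties resolve deterministically.
    ranked = sorted(threads, key=lambda t: (-score(t), t.thread_id))
    return ranked[0].thread_id
```

Because the function is pure (no clock, no network, no model call), the same fleet always yields the same recommendation, which is what makes this kind of ranker straightforward to unit-test.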
Brain (and the Brain Protocol)
A brain is what actually does the work: it reads data, runs tools, drafts and sends messages. yothere is harness-agnostic by design, so a brain can be anything that speaks Brain Protocol v1 (WebSocket transport, JSON-RPC 2.0 frames).
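For orientation, this is roughly what JSON-RPC 2.0 framing looks like. The envelope fields (`jsonrpc`, `id`, `method`, `params`, `result`, `error`) come from the JSON-RPC 2.0 specification; the method name `thread.advance` and its parameters are hypothetical and are not taken from Brain Protocol v1. The sketch builds and parses frames as strings and leaves the WebSocket transport out.

```python
import itertools
import json

_ids = itertools.count(1)  # request ids, unique within this process


def make_request(method: str, params: dict) -> str:
    """Serialize one JSON-RPC 2.0 request frame."""
    frame = {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    return json.dumps(frame)


def parse_response(raw: str):
    """Return the result of a JSON-RPC 2.0 response, or raise on an error frame."""
    msg = json.loads(raw)
    if msg.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 frame")
    if "error" in msg:
        err = msg["error"]
        raise RuntimeError(f"brain error {err['code']}: {err['message']}")
    return msg["result"]


# Hypothetical method name and params, for illustration only.
request = make_request("thread.advance", {"thread_id": "t1"})
```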
Three brain shapes are in use today:
- Local Claude Code running on your machine (`claude -p` per thread). The default.
- A WebSocket-daemon harness, for example ubob, that yothere drives over its local WebSocket (`ws://127.0.0.1:8765`).
- A remote brain over the internet at a `wss://` endpoint (a hosted sprite, a teammate’s stack, a work platform).
The same cockpit drives all three. See How it works for the full protocol and Agent onboarding for wiring one up.
Voice
Voice lets you hail and reply hands-free. It is a Gemini-Live voice loop that runs one of two ways:
- WebRTC over your tailnet (free, no phone provider). A browser on your network connects directly to the local voice server.
- Twilio / PSTN, so a real phone call can reach the same loop.
Voice is optional. Install the voice extra (pip install 'yothere[voice]') to light it up; without the media stack the cockpit still renders and the Connect button reads “voice unavailable”.
Tunnel
The tunnel is how a Twilio phone call reaches your local voice server. It is a cloudflared tunnel that fronts the loopback voice server (127.0.0.1:8767) at a public wss:// host, so Twilio’s inbound webhook has a stable public endpoint to hit while the server itself stays bound to loopback. You only need it for the Twilio/PSTN path; the tailnet WebRTC path does not use it.
Cockpit (/overview)
The cockpit is the web UI, served at /overview. It is a live fleet board (an inbox of threads that need you, a working lane, and past sessions) plus a voice companion (the Connect button and the call transcript). It is served by the voice service on 127.0.0.1:8767. See the Cockpit tour for a full walkthrough.
Board
The board is a standalone, server-rendered glance surface: the focus recommendation up top, then every thread as a card with its state, live progress line, needs-eyes badge, deliverable link, age, and source. It is the lighter, read-only cousin of the cockpit: it never writes thread state, and every worker-authored field is HTML-escaped, so an agent that emits arbitrary text can never inject markup (XSS-safe by construction). Build and open it with yothere board --open.
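The escaping rule can be illustrated with Python's standard `html.escape`. This is a sketch of the idea only: the `render_card` function and its markup are hypothetical, not the board's real template.

```python
import html


def render_card(state: str, progress_line: str) -> str:
    """Render one thread card, escaping every worker-authored field."""
    return (
        '<div class="card">'
        f'<span class="state">{html.escape(state)}</span>'
        f'<p class="progress">{html.escape(progress_line)}</p>'
        "</div>"
    )


# A worker that emits markup gets it rendered as inert text.
card = render_card("running", "<script>alert(1)</script>")
```

Escaping at render time, for every field and without exceptions, means the guarantee does not depend on any worker behaving well.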
Cost caps
The runner enforces spend limits so a runaway thread cannot quietly burn money. The defaults, from the runner config:
| Cap | Default | What it does |
|---|---|---|
| Daily fleet cap | $10.00 | Once the fleet’s spend today crosses this, the runner stops spawning new turns and nudges you once. |
| Per-thread cap | $1.50 | A thread that has spent this much today is blocked with a “continue, raise the cap, or stop” ask. |
| Tick interval | ~30s | How often the runner loop advances the fleet. |
| Max concurrency | 3 | At most this many worker subprocesses advance at once. |
| Per-thread cooldown | ~180s | A thread waits this long between turns (its first turn and a fresh reply bypass it). |
| Stale after | ~1h | A running thread with a dead worker this old is swept to stuck. |
| Auto-park ask | ~24h | A blocked ask nobody answered for this long is auto-parked to keep the inbox clean; a late reply un-parks it for free. |
cost events and is advisory. For a local Claude Code brain the cap is enforced from the recorded cost ledger. See Configuration to change any of these.
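The two spend caps from the table can be sketched as a pre-spawn check. The dollar defaults are the documented ones; the names `Caps` and `may_spawn`, the reason strings, and the choice of `>=` as the comparison are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Caps:
    daily_fleet_usd: float = 10.00  # documented default
    per_thread_usd: float = 1.50    # documented default


def may_spawn(fleet_spend_today: float, thread_spend_today: float,
              caps: Caps = Caps()) -> Tuple[bool, str]:
    """Decide whether a new turn may be spawned for a thread."""
    # ASSUMPTION: reaching a cap exactly counts as hitting it.
    if fleet_spend_today >= caps.daily_fleet_usd:
        return False, "fleet cap reached"   # stop spawning, nudge once
    if thread_spend_today >= caps.per_thread_usd:
        return False, "thread cap reached"  # block with a continue/raise/stop ask
    return True, "ok"
```

For example, `may_spawn(3.0, 0.5)` allows the spawn, while `may_spawn(2.0, 1.5)` refuses it because that thread has reached its own cap.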