Models that run on your own hardware, with no cloud and no token bill, and the only thing allowed to read raw private data. Claude orchestrates; the local models do the work.
The same MCP tool servers are shared by Claude Code (the orchestrator) and by Open WebUI (your local chat surface), bridged by mcpo and all talking to Ollama on your machine.
The default local reasoner and the Command Center's draft engine. Schema-constrained generation for the everyday structured jobs.
7.6 GB · primaryThe resident driver. Won the 2026-05-24 tool-use bake-off at 96% tool-call correctness and runs the multi-step agent loop behind the Mac-control toolkit.
19 GB · driverFast structured-output worker and bake-off runner-up (86%, needs an XML-parse shim). Strong on code and single-shot tool calls.
18 GB · workerA general-purpose mid-size worker for everyday local jobs.
4.9 GB · generalSmall and fast. Powers the local Open WebUI chat assistant and quick on-device tasks.
3.3 GB · fastVision (image → text), fully on-device. Image bytes never reach the cloud.
4.7 GB · visionThe honest scope: local models started as reliable workers Claude delegates to. The autonomous multi-tool agent was deliberately pared back because multi-step tool loops weren't reliable, but a 2026-05-24 bake-off shows that's changed, which has reopened the question.
Claude orchestrates and hands a local model a single-shot job through the ollama-bridge MCP. Bulk work runs locally; Claude keeps the reasoning.
A deterministic script drives a private-file job; the local model is the only LLM that ever sees raw personal content, and only a sanitized derivative crosses to Brain.
Single-shot tool calls always worked. Verified end-to-end (local model → mcpo → filesystem MCP → a real directory listing), with 56 tools wired across both clients.
Reliable multi-step tool loops were the weak spot, which is why the autonomous-agent ambition got scoped down. But the 2026-05-24 tool-use bake-off changed the verdict: glm-4.7-flash hit 96% tool-call correctness driving a 5-tool multi-step agent loop end-to-end (the Mac-control toolkit: app, messages, calendar, notes, screen-read) behind dry-run/confirm gates.
The hard privacy constraint is unchanged, and it's now the interesting part: an orchestrator must see what it orchestrates, so Claude still can't loop over Claude-denied private files, but a capable local agent can. So the open question is no longer "can local models do this" but "should a local agent take over the private-file orchestration that deterministic scripts do today." That's a scope call now actively reopened.
Raw personal data (the Journal, health, finances) can be processed by an LLM without a single byte leaving the Mac.
Local inference costs zero tokens and works with no network. Bulk jobs don't run up a bill.
ConnorGPT is reachable from your phone over Tailscale, never exposed on the public internet.