TokenShield is a local proxy with a live ledger of every Claude request — real billed tokens, by model and session, on your machine. Point Claude Code at it in one line. Your ANTHROPIC_API_KEY never leaves your shell.
Optional optimization processors are opt-in and experimental — and the same ledger shows you their real effect on your workload before you trust them. We don't promise a percentage; we give you the measurement.
Runs entirely on your machine — cloud sync is opt-in.
Your ANTHROPIC_API_KEY never leaves it.
Illustrative figures. Install the free CLI to see your own spend, broken down by model and token class, updated live as you work.
No signup required to start measuring. The CLI runs entirely on your machine; cloud sync is opt-in.
The CLI boots a local proxy on 127.0.0.1:7777 and a dashboard at 127.0.0.1:7778, which opens automatically in your browser. Leave the terminal running; press Ctrl-C to stop.
npm install -g @curatedmcp/tokenshield
tokenshield up
In the shell you run claude from, set the base URL. Your existing ANTHROPIC_API_KEY stays where it is; we never read it.
export ANTHROPIC_BASE_URL=http://127.0.0.1:7777
Cursor / Windsurf: same idea. Go to Settings → AI → Custom Base URL. Integration guides →
The dashboard at 127.0.0.1:7778 shows live token counts and dollar amounts, broken down by model and token class. Note: 127.0.0.1:7777 is the proxy itself. It forwards traffic to Anthropic and serves no UI, so visit the dashboard URL instead.
The ledger is always on — that's the part you can trust on day one. The optimization processors are opt-in, fail-open, and measured against your own bill. We ship the ones that prove net-positive in your dashboard, not the ones that look good in a slide.
Always on. Records every request — real billed input, cached-read, and output tokens, by model and session — to a local SQLite ledger. This is the part you can trust on day one.
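To picture what a ledger like this might hold, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and helper functions are hypothetical and chosen for illustration; they are not TokenShield's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per proxied request.
SCHEMA = """
CREATE TABLE IF NOT EXISTS requests (
    id            INTEGER PRIMARY KEY,
    session_id    TEXT    NOT NULL,
    model         TEXT    NOT NULL,
    input_tokens  INTEGER NOT NULL,  -- fresh (uncached) input
    cached_tokens INTEGER NOT NULL,  -- cached-read input
    output_tokens INTEGER NOT NULL
)
"""

def open_ledger(path=":memory:"):
    """Open (or create) a ledger database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def record(conn, session_id, model, input_tokens, cached_tokens, output_tokens):
    """Append one request's billed token counts to the ledger."""
    conn.execute(
        "INSERT INTO requests "
        "(session_id, model, input_tokens, cached_tokens, output_tokens) "
        "VALUES (?, ?, ?, ?, ?)",
        (session_id, model, input_tokens, cached_tokens, output_tokens),
    )
    conn.commit()

def totals_by_model(conn):
    """Sum each token class per model."""
    rows = conn.execute(
        "SELECT model, SUM(input_tokens), SUM(cached_tokens), SUM(output_tokens) "
        "FROM requests GROUP BY model ORDER BY model"
    ).fetchall()
    return {m: {"input": i, "cached": c, "output": o} for m, i, c, o in rows}
```

Keeping fresh input, cached reads, and output in separate columns matters because each token class is billed at a different rate, so a single combined total would hide where the spend comes from.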
Content-hashes repeated tool-result blocks and replaces exact repeats with pointers. Experimental and off by default: on prompt-cached sessions the repeats are already discounted, so the win is workload-dependent — the ledger shows you yours.
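The content-hashing idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the shipped processor: blocks are assumed to be plain strings, and the pointer shape (a dict with a "ref" index) is invented for the example.

```python
import hashlib

def dedup_blocks(blocks):
    """Replace exact repeats of a text block with a pointer to its first occurrence."""
    seen = {}  # content hash -> index of the first block with that content
    out = []
    for index, text in enumerate(blocks):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen:
            # Exact repeat: emit a small pointer instead of the full text.
            out.append({"type": "pointer", "ref": seen[digest]})
        else:
            seen[digest] = index
            out.append({"type": "text", "text": text})
    return out
```

Only byte-identical blocks collapse; a block that differs by a single character hashes differently and passes through in full.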
Serves identical, repeated non-streaming tool calls from a local LRU+TTL cache. Note: streaming requests (Claude Code's default) bypass it by design.
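For readers unfamiliar with the term, an LRU+TTL cache evicts the least recently used entry when it is full and also expires entries after a fixed lifetime. A minimal Python sketch follows; the class name, default sizes, and injectable clock are assumptions made for the example, not TokenShield's implementation.

```python
import time
from collections import OrderedDict

class LruTtlCache:
    """Small cache with least-recently-used eviction and per-entry expiry."""

    def __init__(self, max_entries=128, ttl_seconds=60.0, clock=time.monotonic):
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry outlived its TTL: drop it and report a miss.
            del self._items[key]
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._items[key] = (self._clock() + self._ttl, value)
        self._items.move_to_end(key)
        while len(self._items) > self._max:
            self._items.popitem(last=False)  # evict least recently used
```

In a proxy, the key would typically be a hash of the full request body, so only truly identical calls hit the cache.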
When Claude re-reads a file, send a unified diff against the version it already has instead of the whole file. This targets waste caching doesn't cover — the highest-value processor on the roadmap.
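To show how small such a delta can be, here is a sketch that uses Python's standard difflib module to produce a unified diff between two versions of a file. The function name and default path are hypothetical, and since this processor is still on the roadmap, the sketch says nothing about how it will actually be built.

```python
import difflib

def file_delta(previous, current, path="file.txt"):
    """Return a unified diff from `previous` to `current`, or "" if they match."""
    diff = difflib.unified_diff(
        previous.splitlines(keepends=True),
        current.splitlines(keepends=True),
        fromfile=f"a/{path}",
        tofile=f"b/{path}",
    )
    return "".join(diff)
```

For a one-line change in a large file, the diff carries only the changed line plus a few lines of context, while an unchanged file yields an empty string.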
Detect 'Would you like me to continue…' patterns and let you kill generation in one tap. Output tokens cost ~5× input — stopping early is real money.
When a session passes 100K tokens, summarize the early turns into a compact prefix. Powerful but cache-invalidating — shipping only once it's net-positive in measurement.
TokenShield speaks each provider's wire format natively. Anthropic is live today; OpenAI and Gemini are next.
Anthropic: via Claude Code, Cursor, Windsurf, Zed, Aider. All Anthropic SDK clients work today.
OpenAI: via Cursor, Continue, Aider, custom apps. Provider adapter on the roadmap; join the waitlist below.
Gemini: via Gemini CLI, custom apps. Coming after OpenAI.
Tell us which provider matters most and we'll email you the day it goes live (and no other day).
Most token-savings tools lead with a big number you have to take on faith. We think a trust tool should do the opposite — so the measurement is the product, and it's free.
Install the free CLI, point Claude Code at it, and just work. The local dashboard records your real billed tokens — fresh input, cached reads, and output — by model and session.
Turn on conversation-dedup for a few sessions in diff-mode. Watch the ledger. On heavily prompt-cached workloads the effect may be small — and the ledger will tell you that honestly, on your own data.
If it pays for itself, subscribe to Pro for the cloud dashboard, multi-machine sync, and every processor as it ships. If it doesn't, you keep the free visibility forever. No guarantee to argue about — just your bill.
The visibility is free and local forever. Pay only for the cloud dashboard and team sync.
For everyone — start here
$0 forever
MIT licensed · no account
Install the free CLI
For individual ICs and contractors
$19 / month
cancel anytime
Subscribe: $19/mo
Up to 10 developers
$199 / month flat
shared team dashboard
Talk to us
Need org-wide MCP visibility and governance, not just spend? See the Team Plane →
TokenShield runs as a local proxy bound to 127.0.0.1. Your ANTHROPIC_API_KEY lives in the shell that launched Claude Code — we never read it, log it, or persist it.
Optional cloud telemetry is aggregate-only: token counts, dollar amounts, processor IDs. Never prompts. Never tool names. Never content. Off by default; verifiable with `tokenshield doctor`.
Localhost binding by default
Proxy listens on 127.0.0.1 only. Opt-in --bind 0.0.0.0 for trusted team networks.
Fail-open on every middleware
If a processor throws, the request goes through untouched. We never break Claude Code to save tokens.
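The fail-open pattern can be sketched as a simple pipeline in Python. This is an illustration only: the function name is hypothetical, requests are assumed to be plain dicts, and the choice to skip just the failing processor (rather than bypass the whole chain) is an assumption about one reasonable reading of "fail-open".

```python
import copy

def run_fail_open(processors, request):
    """Apply each processor in turn; a processor that raises is skipped."""
    current = request
    for process in processors:
        try:
            # Work on a copy so a half-finished mutation cannot leak out
            # if the processor raises partway through.
            candidate = process(copy.deepcopy(current))
        except Exception:
            # Fail open: keep the request as it was before this processor.
            continue
        current = candidate
    return current
```

The deep copy costs some memory per processor, but it guarantees that a crash leaves the forwarded request exactly as it was.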
Local SQLite ledger
Every request span recorded on your disk. Configurable retention; nuke with one command.
Honest answer: it depends entirely on your workload, and we won't put a percentage in front of you that your own bill might not back up. On prompt-cached workloads — which is Claude Code's default — much of the repeated context is already discounted by Anthropic, so the deduplication win can be small. That's exactly why the ledger is free and local: run it for a week, toggle the experimental processors, and read your own number off your own data. We'd rather you trust your bill than our marketing.
No. TokenShield runs as a local proxy on your machine. Your ANTHROPIC_API_KEY stays in your shell — we never read it, log it, or send it anywhere. Optional cloud telemetry is aggregate-only (token counts and dollar amounts) and the payload schema rejects any field whose name suggests content. It's off by default and verifiable in `tokenshield doctor`.
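A schema check of this kind could look roughly like the Python sketch below. The deny-list of name fragments, the function name, and the sample field names are all hypothetical; the real payload schema may use different rules.

```python
import re

# Hypothetical deny-list: name fragments that hint at user content.
SUSPECT_NAME = re.compile(r"prompt|content|message|text|tool|body", re.IGNORECASE)

def validate_payload(payload):
    """Return the payload unchanged, or raise ValueError if any key looks content-bearing."""
    for key, value in payload.items():
        if SUSPECT_NAME.search(key):
            raise ValueError(f"field {key!r} is not allowed in telemetry")
        if isinstance(value, dict):
            validate_payload(value)  # check nested objects too
    return payload
```

Rejecting by field name is a coarse guard: it catches accidental additions such as a "prompt" field early, before anything leaves the machine.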
Yes. Every middleware is fail-open: if a processor throws, the request goes through untouched, and we preserve SSE streaming byte-faithfully. The optimization processors are off by default, and you can run the first weeks in diff-mode to review every modification side-by-side in the local dashboard. You can also `unset ANTHROPIC_BASE_URL` to bypass TokenShield instantly.
Those are cloud observability and routing gateways — your traffic flows through their servers. TokenShield is local-first: the proxy and ledger run on your machine, your API key never leaves it, and there's no account required to start measuring. The visibility is the product; optimization is an opt-in extra you measure yourself.
Because the value depends on never seeing your prompts. A hosted proxy would mean we receive every token your AI sends and receives — a liability we won't take and a privacy story you shouldn't have to trust. The Pro cloud dashboard syncs aggregate metrics only; the proxy itself is always local.
Install, point Claude Code at localhost:7777, and watch the ledger. No signup, no credit card, no telemetry — until you decide to opt in.