Local-first · your API key never leaves your machine · MIT

See exactly where your Claude tokens go.

TokenShield is a local proxy with a live ledger of every Claude request — real billed tokens, by model and session, on your machine. Point Claude Code at it in one line. Your ANTHROPIC_API_KEY never leaves your shell.

Optional optimization processors are opt-in and experimental — and the same ledger shows you their real effect on your workload before you trust them. We don't promise a percentage; we give you the measurement.

Install free in 60 seconds · See how it works

Pro · $19/mo

Cloud dashboard · multi-machine · cancel anytime

Founding Team · $199/mo flat

Up to 10 devs · shared dashboard

Local-first · Your API key stays on your machine · Fail-open: never breaks Claude

localhost:7778 · your local dashboard · Example

Spend this session

$2.41

Input · fresh · 184K · $0.55

Input · cached read · 1.6M · $0.48

Output · 119K · $1.38

Runs entirely on your machine — cloud sync is opt-in.

Your ANTHROPIC_API_KEY never leaves it.

Illustrative figures. Install the free CLI to see your own spend, broken down by model and token class, updated live as you work.
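As a quick check on the example card, the three token classes should add up to the session total. The sketch below uses only the illustrative figures shown above; the `rows` layout is a hypothetical structure for the example, not TokenShield's ledger format.

```python
from decimal import Decimal

# Illustrative figures copied from the example dashboard card.
# The tuple layout is hypothetical, not TokenShield's ledger schema.
rows = [
    ("input_fresh", 184_000, Decimal("0.55")),
    ("input_cached_read", 1_600_000, Decimal("0.48")),
    ("output", 119_000, Decimal("1.38")),
]

# Decimal avoids float rounding noise when summing dollar amounts.
session_total = sum((cost for _, _, cost in rows), Decimal("0"))

for token_class, tokens, cost in rows:
    share = cost / session_total * 100
    print(f"{token_class:<18} {tokens:>9,} tokens  ${cost}  ({share:.0f}% of spend)")

print(f"session total: ${session_total}")
```

In these figures, cached reads are the largest token class by volume but the smallest cost line, while output is the smallest by volume and the largest by cost. That is the kind of split a per-class breakdown makes visible.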

Three steps. Sixty seconds.

No signup required to start measuring. The CLI runs entirely on your machine; cloud sync is opt-in.

1. Install and start the proxy
The CLI boots a local proxy on 127.0.0.1:7777 and a dashboard at 127.0.0.1:7778, which opens automatically in your browser. Leave the terminal running; press Ctrl-C to stop.
bash
npm install -g @curatedmcp/tokenshield
tokenshield up
2. Point Claude Code at the proxy
In the shell you run claude from, set the base URL. Your existing ANTHROPIC_API_KEY stays where it is — we never read it.
bash
export ANTHROPIC_BASE_URL=http://127.0.0.1:7777
Cursor / Windsurf: same idea — Settings → AI → Custom Base URL. Integration guides →
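If you want to confirm that the current shell is actually pointed at the proxy, a small check like the following may help. It is a sketch only: `points_at_local_proxy` is a hypothetical helper written for this example, not part of the TokenShield CLI, and it assumes the default proxy port 7777 described above.

```python
import os
from typing import Optional
from urllib.parse import urlparse


def points_at_local_proxy(base_url: Optional[str], port: int = 7777) -> bool:
    """Return True if base_url targets the loopback interface on the proxy port."""
    if not base_url:
        return False
    parsed = urlparse(base_url)
    return parsed.hostname in ("127.0.0.1", "localhost") and parsed.port == port


if __name__ == "__main__":
    url = os.environ.get("ANTHROPIC_BASE_URL")
    print("proxied" if points_at_local_proxy(url) else "not proxied")
```

An unset variable reports "not proxied", which matches the bypass behaviour of `unset ANTHROPIC_BASE_URL`.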
3. Watch your spend in real time
The dashboard at 127.0.0.1:7778 shows live token counts and dollar amounts, broken down by model and token class. Note: 127.0.0.1:7777 is the proxy itself. It forwards traffic to Anthropic and serves no UI, so open the dashboard URL instead.
Full quickstart guide · Troubleshooting

What sits between Claude and your wallet

The ledger is always on — that's the part you can trust on day one. The optimization processors are opt-in, fail-open, and measured against your own bill. We ship the ones that prove net-positive in your dashboard, not the ones that look good in a slide.

Core · live

Live spend ledger

Always on. Records every request (real billed input, cached-read, and output tokens, by model and session) to a local SQLite ledger.
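To make the idea of a per-request SQLite ledger concrete, here is a minimal sketch using Python's standard `sqlite3` module. The table name, column names, and model labels are assumptions made for this example; the actual TokenShield schema is not documented on this page.

```python
import sqlite3

# Hypothetical schema: names are illustrative, not TokenShield's.
SCHEMA = """
CREATE TABLE IF NOT EXISTS requests (
    id INTEGER PRIMARY KEY,
    session TEXT NOT NULL,
    model TEXT NOT NULL,
    input_fresh INTEGER NOT NULL,
    input_cached_read INTEGER NOT NULL,
    output INTEGER NOT NULL
)
"""


def record(conn, session, model, input_fresh, input_cached_read, output):
    """Append one request's billed token counts to the ledger."""
    conn.execute(
        "INSERT INTO requests (session, model, input_fresh, input_cached_read, output)"
        " VALUES (?, ?, ?, ?, ?)",
        (session, model, input_fresh, input_cached_read, output),
    )


def totals_by_model(conn):
    """Sum each token class per model."""
    return conn.execute(
        "SELECT model, SUM(input_fresh), SUM(input_cached_read), SUM(output)"
        " FROM requests GROUP BY model ORDER BY model"
    ).fetchall()


conn = sqlite3.connect(":memory:")  # a real ledger would live in a file on disk
conn.execute(SCHEMA)
record(conn, "s1", "model-a", 1200, 8000, 300)
record(conn, "s1", "model-a", 400, 9000, 150)
record(conn, "s2", "model-b", 2000, 0, 700)
print(totals_by_model(conn))
```

Keeping fresh input, cached reads, and output in separate columns is what allows a breakdown by token class as well as by model and session.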

Experimental · opt-in

Conversation deduplication

Content-hashes repeated tool-result blocks and replaces exact repeats with pointers. Experimental and off by default: on prompt-cached sessions the repeats are already discounted, so the win is workload-dependent — the ledger shows you yours.
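The content-hashing idea can be sketched in a few lines. This is an illustration of the general technique, not TokenShield's implementation: the function name and the pointer shape are hypothetical, and real tool-result blocks are structured objects rather than plain strings.

```python
import hashlib


def dedupe_tool_results(blocks):
    """Replace exact repeats of a tool-result string with a pointer to its first use."""
    first_seen = {}  # content hash -> index of first occurrence
    out = []
    for index, text in enumerate(blocks):
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in first_seen:
            # Hypothetical pointer shape; the real wire format is not documented here.
            out.append({"pointer_to": first_seen[digest], "sha256": digest[:12]})
        else:
            first_seen[digest] = index
            out.append(text)
    return out


blocks = ["file A contents", "file B contents", "file A contents"]
result = dedupe_tool_results(blocks)
print(result)
```

Only byte-identical repeats are replaced, so a file that changed by a single character is sent in full again.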

Live · non-streaming

Result cache

Serves repeated, identical non-streaming tool calls from a local LRU+TTL cache. Note: streaming requests (Claude Code's default) bypass it by design.
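For readers unfamiliar with the term, an LRU+TTL cache evicts the least recently used entry when it is full and also expires entries after a fixed time. The sketch below shows one common way to build that on `collections.OrderedDict`. The class name, default sizes, and cache key are assumptions for illustration and say nothing about TokenShield's actual settings.

```python
import time
from collections import OrderedDict


class LruTtlCache:
    def __init__(self, max_entries=128, ttl_seconds=60.0, clock=time.monotonic):
        self._data = OrderedDict()  # key -> (expires_at, value), oldest use first
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict the least recently used entry


cache = LruTtlCache(max_entries=2, ttl_seconds=30.0)
cache.put("read_file:/tmp/a.txt", "contents of a")  # hypothetical cache key
print(cache.get("read_file:/tmp/a.txt"))
```

The TTL bounds how stale a cached tool result can get, and the LRU limit bounds memory.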

Roadmap

Diff-based file reads

When Claude re-reads a file, send a unified diff against the version it already has instead of the whole file. This targets waste caching doesn't cover — the highest-value processor on the roadmap.
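A unified diff against the earlier version can be produced with Python's standard `difflib`. The sketch below is only an illustration of the roadmap idea: `reread_payload` and the "no changes" marker are hypothetical, and the processor itself has not shipped.

```python
import difflib


def reread_payload(previous: str, current: str, path: str) -> str:
    """Return a unified diff if the file changed, or a short marker if it did not."""
    if previous == current:
        return f"(no changes to {path})"
    diff = difflib.unified_diff(
        previous.splitlines(keepends=True),
        current.splitlines(keepends=True),
        fromfile=f"{path} (seen earlier)",
        tofile=f"{path} (now)",
    )
    return "".join(diff)


old = "def add(a, b):\n    return a - b\n"
new = "def add(a, b):\n    return a + b\n"
print(reread_payload(old, new, "calc.py"))
```

For a large file with a one-line change, the diff is a handful of lines instead of the whole file, which is where the saving would come from.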

Roadmap

Streaming early-stop

Detect 'Would you like me to continue…' patterns and let you kill generation in one tap. Output tokens cost ~5× input — stopping early is real money.
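Detecting a continue-prompt in a stream mostly comes down to matching against the text accumulated so far, since the phrase can be split across chunks. A rough sketch, with an assumed pattern list that a real detector would need to tune against actual transcripts:

```python
import re

# Hypothetical patterns, chosen for illustration only.
CONTINUE_PROMPTS = re.compile(
    r"(would you like me to (continue|proceed)"
    r"|shall i (continue|go on)"
    r"|do you want me to (continue|keep going))",
    re.IGNORECASE,
)


def first_stop_offset(chunks):
    """Scan streamed text chunks in order; return the character offset where a
    continue-prompt starts, or None if none appears."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        match = CONTINUE_PROMPTS.search(buffer)
        if match:
            return match.start()
    return None


chunks = ["Done with step 1. Would you ", "like me to continue with step 2?"]
print(first_stop_offset(chunks))
```

The function only reports where the prompt begins. Whether to stop generation stays a user decision, in line with the "one tap" description above.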

Roadmap

Context auto-summarize

When a session passes 100K tokens, summarize the early turns into a compact prefix. Powerful but cache-invalidating — shipping only once it's net-positive in measurement.
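The trigger logic for this kind of summarization might look like the sketch below. The 100K threshold comes from the description above; everything else, including the function name and the choice to keep the last four turns verbatim, is an assumption made for the example. The summarization step itself is out of scope here.

```python
SUMMARIZE_THRESHOLD = 100_000  # tokens, from the description above
KEEP_RECENT_TURNS = 4          # hypothetical: latest turns that stay verbatim


def plan_summary(turn_tokens):
    """Given per-turn token counts, return (early, recent) lists of turn indices.

    `early` holds the turns that would be summarized into a prefix. It is empty
    while the session is at or under the threshold, or too short to split.
    """
    indices = list(range(len(turn_tokens)))
    if sum(turn_tokens) <= SUMMARIZE_THRESHOLD or len(indices) <= KEEP_RECENT_TURNS:
        return [], indices
    return indices[:-KEEP_RECENT_TURNS], indices[-KEEP_RECENT_TURNS:]


print(plan_summary([10_000] * 5))
print(plan_summary([30_000] * 6))
```

Rewriting the early turns changes the start of the conversation, so any cached prefix built on them no longer matches. That is the cache-invalidation cost the net-positive measurement has to account for.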

Works with what you already use

TokenShield speaks each provider's wire format natively. Anthropic is live today; OpenAI and Gemini are next.

Anthropic

Live

via Claude Code, Cursor, Windsurf, Zed, Aider

All Anthropic SDK clients work today.

OpenAI

Coming up

via Cursor, Continue, Aider, custom apps

Provider adapter on the roadmap — join the waitlist below.

Google Gemini

Coming up

via Gemini CLI, custom apps

Coming after OpenAI.

Get notified when OpenAI or Gemini support ships

Tell us which provider matters most and we'll email you the day it goes live (and no other day).

We don't promise a percentage. We give you the ledger.

Most token-savings tools lead with a big number you have to take on faith. We think a trust tool should do the opposite — so the measurement is the product, and it's free.

1. Run your normal week with the ledger on
Install the free CLI, point Claude Code at it, and just work. The local dashboard records your real billed tokens — fresh input, cached reads, and output — by model and session.
2. Toggle the experimental processors
Turn on conversation-dedup for a few sessions in diff-mode. Watch the ledger. On heavily prompt-cached workloads the effect may be small — and the ledger will tell you that honestly, on your own data.
3. Decide with your own numbers
If it pays for itself, subscribe to Pro for the cloud dashboard, multi-machine sync, and every processor as it ships. If it doesn't, you keep the free visibility forever. No guarantee to argue about — just your bill.

Simple pricing. Free to measure.

The visibility is free and local forever. Pay only for the cloud dashboard and team sync.

Free CLI

For everyone — start here

$0 forever

MIT licensed · no account

Install the free CLI

Local live spend ledger
Breakdown by model & session
Experimental processors, opt-in
Your API key never leaves your machine

For daily drivers

Pro

For individual ICs and contractors

$19 / month

cancel anytime

Subscribe — $19/mo

Everything in Free
Cloud dashboard & history
Multi-machine sync
Every new processor as it ships

Founding Team

Up to 10 developers

$199 / month flat

shared team dashboard

Talk to us

Everything in Pro, up to 10 devs
Shared spend dashboard
Priority support

Need org-wide MCP visibility and governance, not just spend? See the Team Plane →

Privacy by architecture

Your API key never leaves your machine.

TokenShield runs as a local proxy bound to 127.0.0.1. Your ANTHROPIC_API_KEY lives in the shell that launched Claude Code — we never read it, log it, or persist it.

Optional cloud telemetry is aggregate-only. Token counts, dollar amounts, processor IDs. Never prompts. Never tool names. Never content. Off by default; verifiable in tokenshield doctor.

Privacy whitepaper · View on npm

Localhost binding by default
Proxy listens on 127.0.0.1 only. Opt-in --bind 0.0.0.0 for trusted team networks.
Fail-open on every middleware
If a processor throws, the request goes through untouched. We never break Claude Code to save tokens.
Local SQLite ledger
Every request span recorded on your disk. Configurable retention; nuke with one command.

Frequently asked

How much will TokenShield actually save me?

Honest answer: it depends entirely on your workload, and we won't put a percentage in front of you that your own bill might not back up. On prompt-cached workloads, which are Claude Code's default, much of the repeated context is already discounted by Anthropic, so the deduplication win can be small. That's exactly why the ledger is free and local: run it for a week, toggle the experimental processors, and read your own number off your own data. We'd rather you trust your bill than our marketing.

Does TokenShield see my prompts?

No. TokenShield runs as a local proxy on your machine, so your prompts are processed there and are never sent to us. Your ANTHROPIC_API_KEY stays in your shell: we never read it, log it, or send it anywhere. Optional cloud telemetry is aggregate-only (token counts, dollar amounts, and processor IDs), and the payload schema rejects any field whose name suggests content. It's off by default and verifiable in `tokenshield doctor`.
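A schema that rejects content-like field names can be sketched as an allowlist plus a name check. The field names and hint list below are assumptions for illustration; the page only names the allowed categories (token counts, dollar amounts, processor IDs), not the actual payload fields.

```python
# Hypothetical field names and hints, chosen for this example only.
ALLOWED_FIELDS = {
    "input_tokens",
    "cached_read_tokens",
    "output_tokens",
    "cost_usd",
    "processor_id",
}
CONTENT_HINTS = ("prompt", "content", "message", "text", "tool_name", "body")


def validate_payload(payload: dict) -> dict:
    """Raise ValueError for any field that looks like content or is not allowlisted."""
    for name in payload:
        lowered = name.lower()
        if any(hint in lowered for hint in CONTENT_HINTS):
            raise ValueError(f"field {name!r} looks like content; refusing to send")
        if name not in ALLOWED_FIELDS:
            raise ValueError(f"field {name!r} is not in the aggregate schema")
    return payload


print(validate_payload({"input_tokens": 1200, "output_tokens": 300, "cost_usd": 0.01}))
```

Checking names against both a deny-hint list and an allowlist means a newly added field is rejected by default rather than silently sent.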

Will Claude still work correctly when a processor is on?

Yes. Every middleware is fail-open: if a processor throws, the request goes through untouched, and we preserve SSE streaming byte-faithfully. The optimization processors are off by default, and you can run the first weeks in diff-mode to review every modification side-by-side in the local dashboard. You can also `unset ANTHROPIC_BASE_URL` to bypass TokenShield instantly.
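Fail-open behaviour is straightforward to express in code: run the processors on a copy, and fall back to the original request if anything raises. The sketch below shows the general pattern under that assumption. `apply_fail_open` and the sample processors are hypothetical and are not TokenShield's middleware API.

```python
import copy
import logging

logger = logging.getLogger("failopen-sketch")


def apply_fail_open(processors, request):
    """Run processors in order on a copy of the request.

    If any processor raises, log it and return the original request unchanged.
    """
    current = copy.deepcopy(request)  # guard against in-place mutation
    try:
        for processor in processors:
            current = processor(current)
    except Exception:
        logger.exception("processor failed; passing the request through untouched")
        return request
    return current


def shout(request):
    return {**request, "body": request["body"].upper()}


def broken(request):
    raise RuntimeError("simulated processor bug")


print(apply_fail_open([shout], {"body": "hi"}))
print(apply_fail_open([shout, broken], {"body": "hi"}))
```

The second call returns the request exactly as it came in, even though the first processor had already succeeded. An optimization bug costs you the saving, not the request.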

How is this different from Helicone, LiteLLM, or Portkey?

Those are cloud observability and routing gateways — your traffic flows through their servers. TokenShield is local-first: the proxy and ledger run on your machine, your API key never leaves it, and there's no account required to start measuring. The visibility is the product; optimization is an opt-in extra you measure yourself.

Why local-first instead of a hosted SaaS?

Because the value depends on never seeing your prompts. A hosted proxy would mean we receive every token your AI sends and receives — a liability we won't take and a privacy story you shouldn't have to trust. The Pro cloud dashboard syncs aggregate metrics only; the proxy itself is always local.

See your Claude spend in sixty seconds.

Install, point Claude Code at localhost:7777, and watch the ledger. No signup, no credit card, no telemetry — until you decide to opt in.

Install free CLI · Subscribe ($19/mo) · Team ($199 flat)
