Every /clear Costs You Tokens. Stop Re-Exploring.

Octopus Team·May 1, 2026

The Hidden Cost of a Fresh Session

Across enterprise deployments, Claude Code averages roughly $13 per developer per active day, with monthly bills landing in the $150 to $250 range per seat. A meaningful slice of that bill is spent on something developers rarely think about: the agent re-discovering a codebase it already explored an hour ago.

Every /clear, every new task, every session restart kicks off the same expensive ritual. The agent opens files. Reads imports. Walks the directory tree. Greps for symbols. By the time it understands enough to start working, it has already consumed thousands of input tokens, and the actual task has not begun.

Why /clear Is Both Necessary and Expensive

Context management in Claude Code is a two-sided trade. As your conversation grows, quality starts to degrade. This is well documented as context rot: accuracy drops, the model forgets earlier decisions, and /compact cycles start to chain. The recommended fix is simple. Use /clear between unrelated tasks. Reset to a clean slate. Start fresh.

The trade-off is that "fresh" means amnesiac. Everything the agent learned about your authentication module, your service boundaries, and your weird circular dependency between billing and subscriptions is gone. The next prompt sends it back into exploration mode.

A vague request like "fix the bug in the auth flow" can trigger broad scanning across dozens of files. A specific request like "fix JWT validation in src/auth/validate.ts at line 42" sends the agent straight to the file. The token gap between those two prompts can be 10x or more, and most of the gap is exploration cost.

This is the bind. You have to clear context to keep quality high, but every clear forces the agent to re-pay the discovery tax. Multiply that by the number of times your team types /clear in a day, and you can see the leak.

The Re-Read Problem

Claude Code reads files to build understanding. Then re-reads them. Within a single session, the same file is often opened, modified, opened again to verify, then opened a third time when working on a related file. One developer measured this and found that 40 to 60 percent of file read tokens went to redundant re-reads.

Across a fresh session, the redundancy is even worse. The agent has zero memory of what it already learned in the previous session, so it starts from scratch. If five engineers on your team each restart a session three times a day, that is fifteen blind explorations of the same codebase, every day, for the same handful of files.

There is a cheaper way to do this.

Context as a Service

Octopus Review indexes your repository into a Qdrant vector store the first time you connect it. Embeddings persist. The codebase is queryable, semantically, by any client that speaks to the Octopus CLI.

The CLI exposes that index through commands that pull relevant context out of Qdrant on demand, instead of forcing the agent to walk the file tree to find it. When you start a new Claude Code task, you can prefetch only the files and snippets the agent actually needs, paste them into the prompt, and skip the discovery phase entirely.

Here is how that looks in practice. Index the repo once:

octopus repo index

When you open a fresh Claude Code session for a task, instead of letting the agent guess where the relevant code lives, ask Octopus first:

octopus repo chat

Usage: octopus repo chat [options] [repo]

Start an interactive chat about a repository

Arguments:
  repo                   Repository name or full name (auto-detects from git remote)

Options:
  -p, --print <message>  Pipeline mode: ask a single question and print the answer (no interactive UI)
  -g, --global           Global mode: ask questions across all repos in your organization
  -h, --help             display help for command

This drops you into an interactive session against the indexed repo. Ask it for the auth flow. Ask it which files implement rate limiting. Ask it where the payment retry logic lives. Octopus pulls the matching chunks from Qdrant and returns them, scoped and ranked. Copy the relevant context into your Claude Code prompt and start the task.

The agent now begins work with the right files already in context, instead of spending the first few thousand tokens looking for them.
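For a scripted version of the same lookup, the -p flag from the help output above can stand in for the interactive session. The following is a minimal sketch, not an official recipe: prefetch_context is a hypothetical helper name, and the sketch assumes that pipeline mode prints the answer to standard output and that the repo is auto-detected from the git remote.

```shell
# Hypothetical helper: ask the Octopus index a single question in pipeline mode.
# Assumption: `octopus repo chat -p` prints the answer to stdout.
prefetch_context() {
  local question="$1"
  if [ -z "$question" ]; then
    echo "usage: prefetch_context <question>" >&2
    return 1
  fi
  # No repo argument is passed, so the CLI auto-detects it from the git remote.
  octopus repo chat -p "$question"
}
```

Calling prefetch_context "Which files implement rate limiting?" > context.md (the file name is arbitrary) saves the answer so it can be pasted into the opening Claude Code prompt.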

What This Saves

Three things, roughly in order of impact.

First, input tokens. Skipping exploration on a fresh session typically removes the first wave of file reads, which on a medium-sized repo can run anywhere from 5K to 30K tokens before the actual task starts. On a Sonnet session, that is real money over a month. On a subscription plan, it is real headroom against your 5-hour usage window.

Second, response quality. The agent gets a curated, semantically relevant slice of the codebase upfront, rather than whatever it stumbled into via grep. Targeted context is denser context, and denser context produces sharper answers. You spend tokens on reasoning, not on rediscovery.

Third, time. Pre-loading context is faster than waiting for the agent to perform a dozen Read and Glob calls in sequence. On large repos especially, the difference is measured in seconds per turn.
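To put rough numbers on the first point, the figures already quoted in this post can be multiplied out: five engineers, three session restarts each per day, and 5K to 30K exploration tokens per fresh session. This is back-of-the-envelope arithmetic, not a measurement. It assumes every restart pays the full exploration cost, and daily_exploration_tokens is a hypothetical helper name.

```shell
# Estimate the tokens a team spends per day on re-exploration.
# Arguments: engineers, restarts per engineer per day, tokens per fresh session.
daily_exploration_tokens() {
  local engineers="$1" restarts="$2" tokens_per_session="$3"
  echo $(( $engineers * $restarts * $tokens_per_session ))
}

daily_exploration_tokens 5 3 5000    # low end:  prints 75000
daily_exploration_tokens 5 3 30000   # high end: prints 450000
```

Under those assumptions the team spends somewhere between 75K and 450K input tokens a day before any task starts.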

A Workflow That Compounds

The pattern is straightforward and survives /clear:

1. Index your repo with Octopus once. Re-index periodically as the codebase moves.
2. When starting a new Claude Code task, query the index first with octopus repo chat or octopus repo analyze.
3. Paste the relevant snippets into your opening prompt with the actual task.
4. Let Claude Code work with the right files already in hand.
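The query and paste steps above can be sketched as one small shell function that asks the index several questions and assembles an opening prompt. This is an illustration under assumptions rather than a documented Octopus workflow: build_opening_prompt and the prompt layout are hypothetical, and the sketch relies on octopus repo chat -p printing each answer to standard output.

```shell
# Hypothetical helper: build an opening prompt from a task plus index queries.
# Usage: build_opening_prompt "<task>" "<question>" ["<question>" ...]
build_opening_prompt() {
  local task="$1" question
  shift
  echo "Relevant context from the Octopus index:"
  for question in "$@"; do
    echo
    echo "Q: $question"
    # Pipeline mode: one question in, one answer out (assumed to go to stdout).
    octopus repo chat -p "$question"
  done
  echo
  echo "Task: $task"
}
```

For example, build_opening_prompt "Fix JWT validation in src/auth/validate.ts" "Which files implement the auth flow?" > prompt.md writes a prompt file (the name is arbitrary) whose contents can be pasted into a fresh Claude Code session.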

The same indexed embeddings that power Octopus Review's PR comments now power your local agent workflow. The Qdrant store is reused. The work of understanding your codebase is paid once and amortized across every session, every developer, every PR review.

Try It

Octopus Review is open source under a Modified MIT license, self-hostable, and works with both Claude and OpenAI through BYOK. The CLI ships as @octp/cli on npm. Index your first repo at octopus-review.ai, star the repo on GitHub, or jump into our Discord if you want to compare notes on context engineering.

Stop paying the discovery tax on every fresh session. Index once, query anywhere.