Stop Dumping Files Into Your Context Window
If you've been building utilities against the Claude API for any length of time, you've developed an instinct for context hygiene. You know that what goes into the prompt determines what comes out — and you've probably learned the hard way that "just throw the whole file in" is a fast path to bloated costs, degraded reasoning, and context windows that hit the ceiling at the worst possible moment.
So when I came across jCodeMunch, I recognized the problem it was solving immediately. Not because the README told me to, but because I've written that expensive code. You probably have too.
The Problem We've All Solved Wrong
Picture a common pattern in local Claude API tooling: you want the model to understand a codebase well enough to answer a question, generate a patch, or audit a function for security issues. The naive implementation goes something like this:
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with open("src/auth/login.py") as f:
    context = f.read()

response = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"{context}\n\nWhere's the session token validated?"}],
)
It works. Until it doesn’t. The file has 800 lines. The function you care about is 30. You just sent 770 lines of boilerplate, imports, and unrelated helpers to a model that charged you for every token of it and then had to mentally filter it out before it could answer.
Scale that to a multi-file agent workflow, and the math gets ugly fast. Multiply it by dozens of API calls per session and you’ve built a tool that’s functionally correct but economically indefensible.
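To make that arithmetic concrete, here's a back-of-the-envelope estimate. The per-token rate below is an illustrative assumption, not current pricing; check your model's actual rates.

```python
def session_cost(calls: int, tokens_per_call: int, usd_per_mtok: float) -> float:
    """Estimate input-token cost for a session of API calls."""
    return calls * tokens_per_call * usd_per_mtok / 1_000_000

# Illustrative rate only -- not real pricing for any specific model.
RATE = 15.0  # USD per million input tokens (assumed)

naive = session_cost(calls=50, tokens_per_call=40_000, usd_per_mtok=RATE)
targeted = session_cost(calls=50, tokens_per_call=200, usd_per_mtok=RATE)

print(f"naive: ${naive:.2f}, targeted: ${targeted:.2f}")
# 50 calls at 40k tokens vs 200 tokens: $30.00 vs $0.15
```

At these assumed rates, the whole-file pattern costs two hundred times more per session for the same answers.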
What jCodeMunch Actually Does
jCodeMunch is an MCP server built around a deceptively simple idea: index once, retrieve by symbol.
Rather than serving files, it builds a structured symbol index of your codebase using tree-sitter AST parsing, then exposes that index through a set of MCP tools that let an agent (or your own API utility) ask for exactly what it needs. Not "give me auth/login.py" but "give me the UserService.validate_token method."
The retrieval is O(1) byte-offset seeking. It isn't searching through files at runtime: it stored the byte position of every symbol during indexing, so it can jump directly to the source without re-reading anything.
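The mechanism is easy to sketch. This is a toy illustration of the idea, not jCodeMunch's actual index format, and the offsets here are computed by string search rather than a real tree-sitter pass:

```python
import tempfile

source = b"""def helper():
    pass

def validate_token(token):
    return token == "secret"
"""

# Index once: record the byte offset and length of each symbol.
# (A real indexer would walk the AST; here the offset is found by search.)
start = source.index(b"def validate_token")
index = {"validate_token": (start, len(source) - start)}

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(source)
    path = f.name

# Retrieve later: seek straight to the symbol -- no scanning, no parsing.
offset, length = index["validate_token"]
with open(path, "rb") as f:
    f.seek(offset)
    snippet = f.read(length).decode()

print(snippet)
```

The expensive work happens once, at index time; every retrieval afterward is a seek and a read.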
The benchmark numbers on the repo are striking. For finding a function in a real-world Python repo, the traditional approach consumed around 40,000 tokens. jCodeMunch served the same information for roughly 200. That’s not a marginal improvement — that’s a different cost structure entirely.
Wiring It Up in a Local API Utility
This is where it gets interesting for people writing API-key-based tooling rather than using Claude Desktop. jCodeMunch is an MCP server, so it integrates natively with MCP-compatible clients, but you can also drop it into your local environment and drive it from your own code.
Install:
pip install git+https://github.com/jgravelle/jcodemunch-mcp.git
Configure environment:
export GITHUB_TOKEN=ghp_…          # optional, raises API rate limits
export ANTHROPIC_API_KEY=sk-ant-…  # optional, enables AI-generated symbol summaries
The ANTHROPIC_API_KEY variable is worth pausing on. When you set it, jCodeMunch uses Claude to generate a one-line summary for each symbol during indexing. These summaries become part of search results, making search_symbols() queries semantically richer. Without it, the tool falls back to docstrings and signatures: still useful, but less descriptive for ambiguous symbols.
Index your codebase:
index_folder: { "path": "/path/to/your/project" }
The index gets stored in ~/.code-index/ by default. From there, your workflow changes fundamentally.
Before:
# Read entire file → send to Claude → hope it finds what it needs
After:
# search_symbols() → get_symbol() → send only the relevant source
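The "after" pattern composes two lookups before the API call ever happens. The sketch below stubs the two MCP tool calls with an in-memory index so the shape of the workflow is visible; in a real utility, search_symbols and get_symbol would be calls into the jCodeMunch MCP server, and the stub contents here are hypothetical.

```python
# Stand-in symbol index; in practice this lives in the jCodeMunch server.
INDEX = {
    "UserService.validate_token": "def validate_token(self, token):\n    return check(token)",
    "UserService.login": "def login(self, user): ...",
}

def search_symbols(query: str) -> list[str]:
    """Stub for the MCP search tool: return matching symbol names."""
    return [name for name in INDEX if query.lower() in name.lower()]

def get_symbol(name: str) -> str:
    """Stub for the MCP retrieval tool: return only that symbol's source."""
    return INDEX[name]

# search_symbols() -> get_symbol() -> send only the relevant source
matches = search_symbols("validate_token")
snippet = get_symbol(matches[0])
prompt = f"{snippet}\n\nWhere's the session token validated?"

print(f"{len(prompt)} chars of context instead of the whole file")
```

The prompt that reaches the model contains one method body, not eight hundred lines of file.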
Real Workflows Where This Changes the Game
Architecture audits. get_repo_outline gives you the high-level shape of an entire repo (module hierarchy, symbol counts, file structure) in roughly 2,000 tokens. What used to require dumping dozens of __init__.py and module files now fits comfortably in a single API call's worth of context.
Targeted code review. Instead of sending an entire PR’s worth of files to Claude for review, use get_file_outline to identify the changed symbols, then get_symbol to retrieve only those implementations. Your review prompt stays lean and focused.
Onboarding workflows. If you're building internal tools that help engineers understand an unfamiliar codebase, jCodeMunch is a natural fit. search_symbols("authenticate") returns every function, method, and class across the repo that matches, complete with signatures and summaries, at a fraction of the cost of loading files to scan manually.
Multi-agent pipelines. This is probably where the savings compound most aggressively. In a pipeline where agents hand off context across steps, each step that reaches back into the codebase without jCodeMunch is paying full file-reading costs. With it, those retrieval calls become cheap enough that you can afford more of them.
A Few Honest Caveats
jCodeMunch is syntactic, not semantic. It parses ASTs, it doesn’t understand data flow, runtime behavior, or cross-language dependencies. If you need to trace a value through an async call chain or understand how a class behaves at runtime, you’ll still need to bring more context in manually. It’s a precision retrieval tool, not a program analyzer.
It also explicitly doesn’t support: real-time file watching, LSP-style diagnostics, or editing workflows. If you’re building something that needs live awareness of file changes, you’re reaching for the wrong tool.
And search_text still reads files. It’s there for when symbol lookup misses, but leaning on it too heavily brings you back toward the naive pattern you were trying to escape.
The Underlying Principle
What jCodeMunch is really selling isn't a specific feature set; it's a philosophy about how agents should interact with codebases.
Structured retrieval beats brute-force context.
Agents don't need larger windows; they need better targeting. That philosophy applies well beyond jCodeMunch itself. If you're building local API utilities that work with code, the same instinct should inform every design decision: what's the minimum unit of context that actually answers this question? Structure your retrieval around that unit, and both your costs and your model's reasoning quality will reflect it.
jCodeMunch just makes that instinct easy to act on.
jCodeMunch MCP is available at github.com/jgravelle/jcodemunch-mcp. MIT licensed, Python 3.10+.
This blog post was compiled with the help of Claude.ai.