CodeIR is open source.
pip install codeir-tools
Works with Claude Code out of the box.
CodeIR compiles every function, class, and method in a codebase into three levels of progressive detail. The agent works top-down: orient broadly, reason about dependencies, expand to source only when needed.
The workflow in practice: the agent reads Index to find what's relevant, inspects Behavior to understand how entities connect, and expands to Source only where it needs to act. Most entities in a session never go past Behavior. The ones that get expanded are the ones that actually matter.
When Claude or GPT tries to understand a large codebase, it does something slightly absurd: it reads files.
Entire files. Thousands of lines of syntax, indentation, and formatting, burning tokens and filling context windows. Most of the information the LLM is taking in carries little information about how the system actually works.
Humans don’t navigate codebases this way. After a few weeks on a project, developers stop thinking in files entirely. They think in architecture: which modules own which responsibilities, which entities call each other, where the system boundaries are.
They only dive into source code when something specific needs to change.
LLMs never get that representation. They get raw code.
CodeIR is essentially a semantic compiler for codebases. It is software that takes raw source code and distills every function, class, and method into a layered representation of what it does, what it touches, and how central it is, no LLM required.
Your coding agent gets to see the whole system's functionality in its context window, and can drill from a 20-token summary down to full source only when it needs to.
Instead of reading code to understand functionality the model starts with understanding and reads code to confirm details.
Instead of searching files and hoping to find the right ones, CodeIR gives the agent every function, class, and method in view with the ability to drill into deeper levels of functional detail on demand.
CodeIR compiles a codebase into a hierarchical representation of functionality with three levels: Index, Behavior, and Source. This allows models to orient themselves across thousands of entities and understand how they function together before expanding only the specific code they need.
CodeIR gives your coding agent a set of structured queries against the index. Instead of reading files and hoping to find what's relevant, it asks directly:
CodeIR is open source.
pip install codeir-tools
Works with Claude Code out of the box.
Since Claude is the primary "user" of CodeIR, I didn't want to just guess if it helped. I asked him. During development, Claude's feedback was so specific that it actually shaped the feature set.
For example, Claude found that while entity search was great, he still missed the "vibe"
of a classic grep. Based on that, we built codeir grep, which returns IR
context alongside regex matches.
Here is what Claude had to say:
If you use the tool we would love to hear about Claude's experience. We encourage you to ask him and share his feedback (and yours) in our Claude Feedback thread.