RepoMap
RepoMap is a default-on tool that builds a lightweight symbol index of the working repository and hands a compact text rendering of that index to the agent at the start of planning steps. It gives the agent a map of what is in the codebase before it decides which files to read.
The implementation lives in crates/wcore-repomap.
What the index contains
When RepoMap::build runs, it walks the repository root with the ignore crate, which respects .gitignore, .git/info/exclude, and other standard ignore rules by default. For each file it finds, it records:
| Field | Type | Notes |
|---|---|---|
| path | relative PathBuf | Sorted deterministically by path; order is stable across runs on any OS. |
| language | Language enum | Rust, TypeScript, JavaScript, or Other. Detected by file extension. |
| lines | usize | Total line count. |
| size_bytes | u64 | On-disk size. |
| symbols | Vec<Symbol> | Extracted declarations. Empty for Language::Other. |
| imports | Vec<String> | Import or use lines. Empty for Language::Other. |
| first_meaningful_line | Option<String> | For files where language is Other: the first non-blank, non-comment line, capped at 200 bytes. |

Files over 5 MiB or over 50,000 lines are recorded with their path, language, and size, but no symbols are extracted. Non-UTF-8 files (binaries) are recorded with size only.
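As a rough illustration of the record described above, the sketch below models one file entry and extension-based language detection. It is not the crate's actual definition: the FileEntry layout, the size_only constructor, the detect_language helper, and the use of Vec<String> in place of Vec<Symbol> are all assumptions made for the example. The extension lists are the ones this page gives for each language; mapping .tsx to TypeScript and .jsx to JavaScript is also an assumption.

```rust
use std::path::{Path, PathBuf};

/// Language buckets named in the field table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Rust,
    TypeScript,
    JavaScript,
    Other,
}

/// Hypothetical per-file record. Field names follow the table;
/// `symbols` uses plain strings as a stand-in for `Vec<Symbol>`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct FileEntry {
    pub path: PathBuf,
    pub language: Language,
    pub lines: usize,
    pub size_bytes: u64,
    pub symbols: Vec<String>,
    pub imports: Vec<String>,
    pub first_meaningful_line: Option<String>,
}

/// Map a file extension to a language bucket (assumed mapping).
pub fn detect_language(path: &Path) -> Language {
    match path.extension().and_then(|e| e.to_str()) {
        Some("rs") => Language::Rust,
        Some("ts") | Some("tsx") => Language::TypeScript,
        Some("js") | Some("mjs") | Some("cjs") | Some("jsx") => Language::JavaScript,
        _ => Language::Other,
    }
}

impl FileEntry {
    /// Size-only record, in the spirit of what the page describes for
    /// non-UTF-8 files. Storing 0 for `lines` is an assumption.
    pub fn size_only(path: PathBuf, size_bytes: u64) -> Self {
        let language = detect_language(&path);
        FileEntry {
            path,
            language,
            lines: 0,
            size_bytes,
            symbols: Vec::new(),
            imports: Vec::new(),
            first_meaningful_line: None,
        }
    }
}
```

Detecting by extension alone means a file with no extension, such as a Makefile, falls into Language::Other in this sketch.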
Symbol extraction
The extractor uses regex patterns compiled once via std::sync::OnceLock, not a full parser. This keeps binary growth in the sub-5 MB range at the cost of not handling every edge case in deeply nested or macro-heavy code. The approach is intentional: the map is a navigation aid, not a type-checked AST.
Comment stripping runs first (line comments //… and block comments /* … */) so declarations inside commented-out code are not surfaced.
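A minimal sketch of such a comment-stripping pass, assuming a small hand-written scanner. The name strip_comments is hypothetical and the crate's own implementation may differ. The sketch keeps newlines so that line numbers found afterwards still match the original file. It does not track string literals or nested block comments.

```rust
/// Remove `//` line comments and `/* ... */` block comments.
/// Newlines are preserved so later line numbers stay aligned with the file.
/// Simplifications: string literals are not tracked, and nested block
/// comments are closed by the first `*/`.
pub fn strip_comments(src: &str) -> String {
    let mut out = String::with_capacity(src.len());
    let mut chars = src.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '/' && chars.peek() == Some(&'/') {
            // Line comment: drop everything up to, but not including, the newline.
            while let Some(&n) = chars.peek() {
                if n == '\n' {
                    break;
                }
                chars.next();
            }
        } else if c == '/' && chars.peek() == Some(&'*') {
            chars.next(); // consume the '*' that opens the block comment
            let mut prev = '\0';
            for n in chars.by_ref() {
                if prev == '*' && n == '/' {
                    break;
                }
                if n == '\n' {
                    out.push('\n'); // keep the line count stable
                }
                prev = n;
            }
        } else {
            out.push(c);
        }
    }
    out
}
```

Because the output has the same number of lines as the input, a declaration matched on line 45 of the stripped text is also on line 45 of the file.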
Extracted symbol kinds from .rs files:
| Kind | What is matched |
|---|---|
| fn | fn, pub fn, pub(crate) fn, async fn, const fn, unsafe fn, extern "C" fn, and combinations |
| struct | struct, pub struct |
| enum | enum, pub enum |
| trait | trait, pub trait, unsafe trait |
| impl | Inherent impl Type blocks |
| impl (trait) | impl Trait for Type. The symbol name is rendered as "Trait for Type". |
| mod | mod, pub mod |
| use (symbol) | pub use … re-exports. Recorded as a SymbolKind::Use entry, not as an import. |

Plain use …; lines (non-pub) are captured in imports, not as symbols. const and static declarations are intentionally not extracted; the design spec did not include them.
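To make the Rust rules concrete, here is a sketch that classifies a single comment-stripped line. The crate itself uses regex patterns. This example swaps in plain prefix matching so that it needs only the standard library. The classify and leading_ident functions are hypothetical, and the SymbolKind variants other than Use are assumed names. Generic impl headers such as impl<T> are not handled.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SymbolKind { Fn, Struct, Enum, Trait, Impl, Mod, Use }

/// Leading identifier of `s`, or None if `s` does not start with one.
fn leading_ident(s: &str) -> Option<String> {
    let name: String = s.chars().take_while(|c| c.is_alphanumeric() || *c == '_').collect();
    if name.is_empty() { None } else { Some(name) }
}

/// Classify one comment-stripped line as (kind, symbol name).
pub fn classify(line: &str) -> Option<(SymbolKind, String)> {
    let mut rest = line.trim();
    let mut is_pub = false;
    // Visibility: `pub` or `pub(...)`.
    if let Some(after) = rest.strip_prefix("pub") {
        if after.starts_with('(') {
            rest = after[after.find(')')? + 1..].trim_start();
            is_pub = true;
        } else if after.starts_with(char::is_whitespace) {
            rest = after.trim_start();
            is_pub = true;
        }
    }
    // Qualifiers that may precede the keyword, in any combination.
    loop {
        let before = rest.len();
        for q in ["async ", "const ", "unsafe ", "extern \"C\" "] {
            if let Some(after) = rest.strip_prefix(q) {
                rest = after.trim_start();
            }
        }
        if rest.len() == before {
            break;
        }
    }
    let (keyword, tail) = rest.split_once(char::is_whitespace)?;
    let tail = tail.trim_start();
    let kind = match keyword {
        "fn" => SymbolKind::Fn,
        "struct" => SymbolKind::Struct,
        "enum" => SymbolKind::Enum,
        "trait" => SymbolKind::Trait,
        "mod" => SymbolKind::Mod,
        "impl" => {
            // Keep the whole header so trait impls read "Trait for Type".
            let header = tail.split('{').next().unwrap_or(tail).trim();
            return if header.is_empty() { None } else { Some((SymbolKind::Impl, header.to_string())) };
        }
        // Only `pub use` is a symbol; plain `use` belongs in imports.
        "use" if is_pub => {
            return Some((SymbolKind::Use, tail.trim_end_matches(';').trim().to_string()));
        }
        _ => return None,
    };
    leading_ident(tail).map(|name| (kind, name))
}
```

A const item falls through to None here because, once the const qualifier is stripped, no recognised keyword follows. That matches the rule that const and static declarations are not extracted.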
TypeScript and JavaScript
Extracted from .ts, .tsx, .js, .mjs, .cjs, .jsx files:
| Kind | What is matched |
|---|---|
| fn | function, async function, export function, export default function |
| class | class, export class, abstract class |
| interface | interface, export interface |
| type | type Alias = …, export type Alias = … |
| export | export const, export let, export var, export { A, B } |

import … from '…' lines are captured in imports.
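The import capture can be sketched as below, assuming the stored value is the module specifier between the quotes (the sample rendering on this page lists react for a TypeScript file, which is consistent with that). The import_source function is hypothetical. It uses string splitting rather than the crate's regex patterns, and handles single-line imports only.

```rust
/// Return the module specifier of an `import ... from '...'` line, if any.
/// Side-effect imports (`import './x'`) have no `from` and yield None here.
pub fn import_source(line: &str) -> Option<&str> {
    let line = line.trim();
    if !line.starts_with("import ") {
        return None;
    }
    // Split on the last " from " so names containing "from" do not confuse it.
    let (_, after_from) = line.rsplit_once(" from ")?;
    let spec = after_from.trim().trim_end_matches(';').trim_end();
    let quote = spec.chars().next()?;
    if quote != '\'' && quote != '"' {
        return None;
    }
    // Require the same quote character on both ends.
    spec.strip_prefix(quote)?.strip_suffix(quote)
}
```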
Other files
For any extension not in the list above (.md, .toml, .json, .py, etc.), the extractor returns empty symbols and empty imports. Instead, first_meaningful_line is set to the first non-blank, non-comment line of the file, capped at 200 bytes on a character boundary. Markdown # headings are not skipped, because they are the meaningful first line of a file such as README.md.
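The 200-byte cap on a character boundary can be sketched as follows. The function names and the MAX_FIRST_LINE_BYTES constant are hypothetical. Which prefixes count as comments is not spelled out on this page, so the sketch assumes only lines starting with // are skipped, and leaves # lines alone so that Markdown headings are kept.

```rust
pub const MAX_FIRST_LINE_BYTES: usize = 200;

/// Truncate `s` to at most `max` bytes without splitting a UTF-8 character.
pub fn truncate_on_char_boundary(s: &str, max: usize) -> &str {
    if s.len() <= max {
        return s;
    }
    let mut end = max;
    // Index 0 is always a boundary, so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    &s[..end]
}

/// First non-blank, non-comment line, capped in length.
/// Assumption: only `//` marks a comment; `#` lines are kept.
pub fn first_meaningful_line(text: &str) -> Option<String> {
    text.lines()
        .map(str::trim)
        .find(|l| !l.is_empty() && !l.starts_with("//"))
        .map(|l| truncate_on_char_boundary(l, MAX_FIRST_LINE_BYTES).to_string())
}
```

Backing off to the previous boundary means the result may be a byte or more shorter than the cap, but it is always valid UTF-8.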
The rendered view
render::render_compact produces the text the agent receives. The format is stable and byte-identical across re-renders of the same map:

    repo: /home/user/myproject
    indexed_at: 1748902400
    files: 3

    src/lib.rs [lang=rust lines=120 size=3840]
    fn: build@12
    fn: render_compact@45
    struct: Options@8
    impl: Options@31
    imports: std::path::PathBuf, std::sync::OnceLock

    web/app.ts [lang=typescript lines=60 size=1920]
    fn: greet@5
    class: App@10
    export: PI@22
    imports: react

    README.md [lang=other lines=8 size=210]
    first: # My Project

Each file entry starts with its relative path and a metadata bracket. Symbol lines follow as kind: name@line. Import lines list the collected import paths, sorted alphabetically for determinism. For Language::Other, a first: line shows the meaningful first line if one exists.
The format is kept deliberately compact: no closing brackets, no extra whitespace. The agent typically receives this rendering at the start of a planning turn rather than reading each file individually.
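A sketch of how one file entry might be rendered deterministically. The Entry struct and render_entry function are hypothetical stand-ins, not the signature of render::render_compact, and the exact line layout (one symbol per line, no indentation) is an assumption. What the sketch does take from the description is the header bracket, the kind: name@line form, and alphabetical sorting of imports.

```rust
use std::fmt::Write;

/// Hypothetical, simplified view of one indexed file.
pub struct Entry<'a> {
    pub path: &'a str,
    pub lang: &'a str,
    pub lines: usize,
    pub size: u64,
    pub symbols: Vec<(&'a str, &'a str, usize)>, // (kind, name, line)
    pub imports: Vec<&'a str>,
    pub first: Option<&'a str>,
}

/// Render one entry. Output depends only on the entry's contents,
/// so repeated calls produce byte-identical text.
pub fn render_entry(e: &Entry<'_>) -> String {
    let mut out = String::new();
    // Writing into a String cannot fail, so unwrap is safe here.
    writeln!(out, "{} [lang={} lines={} size={}]", e.path, e.lang, e.lines, e.size).unwrap();
    for (kind, name, line) in &e.symbols {
        writeln!(out, "{kind}: {name}@{line}").unwrap();
    }
    if !e.imports.is_empty() {
        // Sort a copy so input order never leaks into the rendering.
        let mut sorted = e.imports.clone();
        sorted.sort_unstable();
        writeln!(out, "imports: {}", sorted.join(", ")).unwrap();
    }
    if let Some(first) = e.first {
        writeln!(out, "first: {first}").unwrap();
    }
    out
}
```

Sorting at render time, rather than relying on collection order, is one simple way to get the byte-identical property regardless of how imports were gathered.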
Options
RepoMap::build uses IndexOptions::default(). To adjust the limits, call RepoMap::build_with_options with a custom IndexOptions:
| Option | Default | Effect |
|---|---|---|
| max_file_bytes | 5 MiB (5,242,880 bytes) | Files larger than this are indexed as size-only entries with no symbols. |
| max_lines | 50,000 | Files with more lines are recorded with lines and size_bytes but no symbols. |
| respect_gitignore | true | When false, the walker does not apply .gitignore and related exclusion files, so otherwise-excluded files are indexed too. |

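The defaults in the table can be modelled with a small struct. This is a sketch only: the field types and the allows_symbols helper are assumptions, and the real IndexOptions in the crate may carry more fields or different types.

```rust
/// Hypothetical mirror of the options table; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IndexOptions {
    pub max_file_bytes: u64,
    pub max_lines: usize,
    pub respect_gitignore: bool,
}

impl Default for IndexOptions {
    fn default() -> Self {
        Self {
            max_file_bytes: 5 * 1024 * 1024, // 5 MiB = 5,242,880 bytes
            max_lines: 50_000,
            respect_gitignore: true,
        }
    }
}

impl IndexOptions {
    /// True when a file is within both limits, so symbols would be extracted.
    /// Files exactly at a limit still qualify; only larger ones are skipped.
    pub fn allows_symbols(&self, size_bytes: u64, lines: usize) -> bool {
        size_bytes <= self.max_file_bytes && lines <= self.max_lines
    }
}
```

With a Default impl like this, a caller can override a single field using struct update syntax, for example IndexOptions { max_lines: 10_000, ..IndexOptions::default() }, and leave the rest at their defaults.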
How the agent uses the map
RepoMap runs before the agent selects files to read on a planning or navigation step. The agent receives the compact rendering in its context window. From that rendering it can:
- Identify which files define a specific function or type without reading every file.
- Find all impl Trait for Type blocks across the codebase.
- Locate module boundaries and re-exports.
- See which TypeScript files export a given class or interface.
Files flagged as large (over the line or byte limits) appear in the map with their size and language, signalling the agent that the file exists but that symbol-level navigation requires a direct read.
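The agent does these lookups by reading the text, but the same queries can be run mechanically over a rendering, which may help when testing or debugging the map. The sketch below is illustrative only: find_symbol is a hypothetical helper, not part of the crate, and it assumes each file entry begins with a path followed by a [lang=…] bracket and that symbols appear one per line as kind: name@line.

```rust
/// Find `kind: name@line` entries in a compact rendering.
/// Returns (file path, line number) pairs in the order they appear.
pub fn find_symbol(rendered: &str, kind: &str, name: &str) -> Vec<(String, usize)> {
    let mut hits = Vec::new();
    let mut current: Option<&str> = None;
    for raw in rendered.lines() {
        let line = raw.trim();
        // A file header looks like `path [lang=... lines=... size=...]`.
        if let Some((path, _meta)) = line.split_once(" [lang=") {
            current = Some(path);
            continue;
        }
        // Skip map-level header lines that appear before the first file.
        let Some(path) = current else { continue };
        let Some((k, rest)) = line.split_once(": ") else { continue };
        if k != kind {
            continue;
        }
        // Split on the last '@' so names such as "Trait for Type" stay whole.
        let Some((n, at)) = rest.rsplit_once('@') else { continue };
        if n == name {
            if let Ok(line_no) = at.parse::<usize>() {
                hits.push((path.to_string(), line_no));
            }
        }
    }
    hits
}
```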
Crate boundary
wcore-repomap has zero internal wcore-* dependencies. It does not emit protocol events and does not know about the agent loop. The crate is a pure data library: walk a path, return a RepoMap, render to a string. Wiring the rendered map into the agent’s context window lives in wcore-agent’s tool bootstrap, not in this crate.