Source code knowledge graph for smart code review context — zero dependencies, single binary
AI coding assistants waste tokens reading entire codebases when reviewing changes. codegraph-go builds a structural knowledge graph of your code, then uses that graph to select only the most relevant context for a given diff — giving you smart, token-efficient code review context.
This is a Go reimplementation of code-review-graph (3.7K stars, Python). Our version provides:
- Zero external dependencies — uses Go's
go/astfor Go files, regex for others - Single static binary — no Python, no pip, no tree-sitter C bindings
- Concurrent parsing — goroutine-per-file with configurable parallelism
- Token budget optimization — explicit knapsack-style context selection within a token limit
- DOT graph export — visualize your code's dependency graph with Graphviz
| Feature | Description |
|---|---|
| Multi-language parsing | Go (via go/ast), Python, TypeScript, JavaScript |
| Code knowledge graph | Functions, classes, imports, call relationships, test coverage |
| Diff-aware context | Given a git diff, find the most relevant code context |
| Risk scoring | Security keywords, caller count, test coverage → priority ranking |
| Token budget | Select most relevant context within a configurable token limit |
| DOT export | Export graph in Graphviz DOT format for visualization |
| JSON export | Export stats and subgraphs as JSON |
| Incremental updates | Remove and re-parse only changed files |
| Test gap detection | Identify functions without test coverage in changed code |
go install github.com/JSLEEKR/codegraph-go/cmd/codegraph@latestgit clone https://github.com/JSLEEKR/codegraph-go.git
cd codegraph-go
go build -o codegraph ./cmd/codegraph/# Parse all source files in the current directory
codegraph build-graph
# Or specify a repository root
codegraph build-graph --repo /path/to/projectOutput:
Building code graph for /path/to/project...
Found 42 source files
Build complete in 85ms
Files: 42
Nodes: 287
Edges: 534
Languages: go, python, typescript
Saved to: /path/to/project/.codegraph/graph.json
# Search by name
codegraph query --name "ParseFile"
# Filter by kind
codegraph query --kind Function --limit 10
# List all nodes in a file
codegraph query --file "parser/parser.go"
# JSON output
codegraph query --name "Server" --format jsonOutput:
Found 2 results:
[f] ParseFile
Qualified: parser/parser.go::ParseFile
File: parser/parser.go:35-68
Params: (filePath string, source []byte)
Returns: (*ParseResult, error)
Calls: parseGo, parsePython, parseTypeScript
Called by: (none)
[f] parseGo
Qualified: parser/go_parser.go::parseGo
File: parser/go_parser.go:15-120
Params: (filePath string, source []byte)
Returns: ([]graph.Node, []graph.Edge, error)
# Find relevant context for recent changes
codegraph diff-context --base HEAD~1
# With custom token budget
codegraph diff-context --base main --budget 4000
# JSON output for programmatic use
codegraph diff-context --base HEAD~3 --format jsonOutput:
Changes in 3 files affecting 5 code entities
Impact radius: 12 nodes
Test gaps: 2 functions without tests
Token budget: 3200 / 4000 used
Functions missing tests:
- validateInput (handler.go:45)
- parseConfig (config.go:12)
Review priorities (by risk score):
## validateInput (Function) [risk: 0.50]
File: handler.go (lines 45-60)
Tokens: 120
```
func validateInput(input string) error {
if input == "" {
return fmt.Errorf("empty input")
}
// ... sanitize ...
}
```
codegraph stats
codegraph stats --format jsonOutput:
Graph Statistics:
Total nodes: 287
Total edges: 534
Files: 42
Languages: go, python, typescript
Nodes by kind:
Function 142
File 42
Class 35
Type 28
Test 40
Edges by kind:
CALLS 215
CONTAINS 187
IMPORTS_FROM 89
TESTED_BY 31
INHERITS 12
# Export as DOT (for Graphviz)
codegraph export --format dot > graph.dot
dot -Tpng graph.dot -o graph.png
# Export filtered subgraph
codegraph export --format dot --filter "parser.go::ParseFile,parser.go::parseGo"
# Export as JSON
codegraph export --format json --output stats.jsoncodegraph-go/
├── graph/ # In-memory knowledge graph with JSON persistence
│ └── graph.go # Node/Edge types, BFS impact analysis, save/load
├── parser/ # Multi-language source code parser
│ ├── parser.go # Language detection, file-level orchestration
│ ├── go_parser.go # Go parser using go/ast (stdlib)
│ └── regex_parser.go # Python/TypeScript parser using regex
├── diff/ # Git diff parsing and node mapping
│ └── diff.go # Unified diff parser, line range overlap detection
├── context/ # Token-budget-aware context selection
│ └── context.go # Risk scoring, greedy knapsack, context formatting
├── export/ # Graph export (DOT, JSON)
│ ├── dot.go # Graphviz DOT format with color-coded node types
│ └── json.go # JSON stats and subgraph export
└── cmd/codegraph/ # CLI entry point
└── main.go # Command router and flag parsing
Source Files → Parser (go/ast + regex) → Nodes + Edges
↓
In-Memory Graph
↓
Git Diff → Changed Ranges → Node Mapping → Risk Scoring
↓
Token Budget → Context Selection
↓
Review Priorities
-
Parsing: Go files are parsed with
go/astfor perfect fidelity. Python and TypeScript files use regex patterns to extract classes, functions, imports, and calls. -
Graph Construction: Nodes (File, Class, Function, Type, Test) and Edges (CALLS, IMPORTS_FROM, INHERITS, IMPLEMENTS, CONTAINS, TESTED_BY, DEPENDS_ON) are stored in an in-memory graph with index maps for fast lookup.
-
Diff Analysis: Git diffs are parsed into line ranges. Each range is overlapped against graph nodes to find affected code entities.
-
Risk Scoring: Each affected node gets a risk score (0.0-1.0) based on:
- Caller count (more callers = higher blast radius)
- Test coverage (untested = higher risk)
- Security keywords (auth, password, sql = higher risk)
-
Context Selection: Nodes are sorted by risk and selected greedily within the token budget. Source code is read from disk and included in the output.
| Kind | Description | DOT Shape |
|---|---|---|
| File | Source file | folder |
| Class | Class/struct definition | box |
| Function | Function/method | ellipse |
| Type | Interface/type alias | hexagon |
| Test | Test function | diamond |
| Kind | Description | Example |
|---|---|---|
| CALLS | Function invocation | main() → helper() |
| IMPORTS_FROM | Module import | handler.go → fmt |
| INHERITS | Class/struct embedding | Dog → Animal |
| IMPLEMENTS | Interface implementation | Server → Handler |
| CONTAINS | Structural containment | file → function |
| TESTED_BY | Test coverage link | Parse() → TestParse() |
| DEPENDS_ON | Dependency relationship | module → package |
| Language | Parser | Functions | Classes | Imports | Calls | Tests |
|---|---|---|---|---|---|---|
| Go | go/ast (stdlib) |
Yes | Yes (structs) | Yes | Yes | Yes |
| Python | Regex | Yes | Yes | Yes | Yes | Yes |
| TypeScript | Regex | Yes | Yes | Yes | Yes | Yes |
| JavaScript | Regex | Yes | Yes | Yes | Yes | Yes |
The Go parser provides first-class support via the standard library:
- Functions and methods with receiver types
- Struct and interface declarations
- Struct embedding (→ INHERITS edges)
- Import resolution
- Call extraction including package-qualified calls
- Test function detection (Test*, Benchmark*, Example*)
- Full parameter and return type extraction
The regex parsers extract:
- Class definitions with inheritance
- Function/method definitions with parameters
- Import statements
- Function calls (direct and method calls)
- Arrow functions (TypeScript)
- Interface and type alias declarations (TypeScript)
- Test file and test function detection
The following directories are automatically skipped during parsing:
.git, node_modules, vendor, __pycache__, .codegraph,
.cache, dist, build, .next, .nuxt, target, bin,
.idea, .vscode, .vs
The graph is stored in .codegraph/graph.json in the repository root. This file can be committed to the repository for faster subsequent builds, or added to .gitignore.
| Feature | code-review-graph (Python) | codegraph-go |
|---|---|---|
| Dependencies | tree-sitter, networkx, mcp, watchdog | Zero (stdlib only) |
| Languages | 25+ (via tree-sitter) | 4 (Go, Python, TS, JS) |
| Distribution | pip install + tree-sitter compilation | Single binary |
| Parsing | tree-sitter (C bindings) | go/ast + regex |
| Storage | SQLite | In-memory + JSON |
| Concurrency | Single-threaded | Goroutine-per-file |
| Token Budget | Implicit (bounded BFS) | Explicit knapsack |
| Graph Export | HTML visualization | DOT + JSON |
| MCP Server | Yes (24+ tools) | No (CLI only) |
| File Watcher | Yes (watchdog) | No |
- codegraph-go: When you want a fast, zero-dependency CLI tool for code review context, especially for Go projects
- code-review-graph: When you need MCP integration, 25+ language support, or VS Code extension
# Run tests
go test ./...
# Run tests with verbose output
go test ./... -v
# Build binary
go build -o codegraph ./cmd/codegraph/
# Run vet
go vet ./...MIT License — see LICENSE for details.
Inspired by code-review-graph by tirth8205.