gputrace

gputrace parses and analyzes Apple Metal GPU trace files (.gputrace bundles).

Installation

go install github.com/tmc/gputrace/cmd/gputrace@latest

Verify installation:

gputrace version

Quick Start

# Show trace statistics (dispatch counts, kernel names, timing)
gputrace stats trace.gputrace

# Full profiler breakdown (timing, pipelines, execution cost)
gputrace profiler trace.gputrace

# Export to pprof format for use with go tool pprof
gputrace pprof trace.gputrace -o trace.pb
go tool pprof -http=:8080 trace.pb

# Export Chrome/Perfetto timeline
gputrace timeline trace.gputrace -o trace.json

# Compare two traces
gputrace diff A.gputrace B.gputrace --explain
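The timeline export above produces a JSON file for Chrome/Perfetto. As a rough sketch of post-processing such a file, the Go program below sums the duration of each named event. It assumes the output follows the standard Trace Event Format, with complete events marked "ph":"X" and a "dur" field in microseconds; the exact fields gputrace emits are not documented here, so treat the struct and the sample data as hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// traceEvent holds the Trace Event Format fields this sketch reads.
// Whether gputrace emits exactly these fields is an assumption.
type traceEvent struct {
	Name string  `json:"name"`
	Ph   string  `json:"ph"`  // phase; "X" marks a complete event
	Dur  float64 `json:"dur"` // duration in microseconds
}

// parseEvents accepts both layouts the Trace Event Format allows:
// a bare JSON array, or an object with a "traceEvents" array.
func parseEvents(data []byte) ([]traceEvent, error) {
	var events []traceEvent
	if err := json.Unmarshal(data, &events); err == nil {
		return events, nil
	}
	var wrapped struct {
		TraceEvents []traceEvent `json:"traceEvents"`
	}
	if err := json.Unmarshal(data, &wrapped); err != nil {
		return nil, err
	}
	return wrapped.TraceEvents, nil
}

// totalByName sums the duration of complete ("X") events per event name.
func totalByName(events []traceEvent) map[string]float64 {
	totals := make(map[string]float64)
	for _, e := range events {
		if e.Ph == "X" {
			totals[e.Name] += e.Dur
		}
	}
	return totals
}

func main() {
	// Hypothetical sample input; real use would load trace.json with os.ReadFile.
	sample := []byte(`{"traceEvents":[
		{"name":"kernel_a","ph":"X","ts":0,"dur":120},
		{"name":"kernel_b","ph":"X","ts":130,"dur":40},
		{"name":"kernel_a","ph":"X","ts":200,"dur":80}]}`)

	events, err := parseEvents(sample)
	if err != nil {
		panic(err)
	}
	totals := totalByName(events)

	// Print event names ordered by total duration, largest first.
	names := make([]string, 0, len(totals))
	for name := range totals {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool {
		return totals[names[i]] > totals[names[j]]
	})
	for _, name := range names {
		fmt.Printf("%-10s %8.1f us\n", name, totals[name])
	}
}
```

Only complete events are counted because they carry their own duration; begin/end event pairs ("B"/"E") would need to be matched up first, which this sketch does not attempt.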

Commands

Group               Command          Description
Overview            stats            Comprehensive trace statistics
                    api-calls        API call sequences
                    dump             Raw API call dump
Kernel & Shader     shaders          Shader performance metrics
                    kernels          Kernel functions and pipeline mappings
                    shader-source    Source-level performance attribution
Timing & Profiling  timing           Timing metrics export
                    profiler         GPU profiler data extraction
                    pprof            pprof format export
                    correlate        Correlate timing with hardware metrics
Command Buffers     command-buffers  Command buffer analysis
                    encoders         Compute encoder listing
Buffer Analysis     buffers          Buffer listing and properties
                    buffer-access    Buffer access patterns
                    buffer-timeline  Buffer allocation timeline
Visualization       timeline         Chrome/Perfetto timeline export
                    graph            Graph visualization
                    tree             Execution tree view
                    diff             Compare two traces
                    insights         Actionable performance insights
Capture             capture          Capture GPU trace from a command
                    xcode-profile    Xcode GPU profiler automation
Utilities           serve            Web server for trace browsing
                    mtlb             Metal Library Binary inspection
                    clear-buffers    Zero out buffers to reduce trace size
                    version          Print build version

Run gputrace [command] --help for details on any command.

Trace Diff

Compare two profiled traces and explain performance deltas at dispatch, kernel, encoder, and timeline-window levels:

# Human-readable summary
gputrace diff A.gputrace B.gputrace --explain

# Function and encoder views
gputrace diff A.gputrace B.gputrace --by function,encoder --limit 25

# Dispatch outliers (with source indices)
gputrace diff A.gputrace B.gputrace --by dispatch --min-delta-us 30 --limit 50

# JSON or CSV output
gputrace diff A.gputrace B.gputrace --json > diff.json
gputrace diff A.gputrace B.gputrace --csv --by function > function_deltas.csv

# Auto-discover newest trace pair and run quick triage
gputrace diff --bench-dir /path/to/bench-traces --quick

# Write markdown report
gputrace diff A.gputrace B.gputrace --md-out /tmp/report.md

See docs/TRACE_DIFF_WORKFLOW.md for the full workflow and sample output.

Testing

go test ./...

The repository includes a small canonical fixture set under testdata/traces:

  • 01-single-encoder for basic parsing and diff smoke tests
  • 06-six-encoders for multi-encoder parsing and shader/debug coverage

Some commands only succeed on traces that contain data the small in-repo fixtures do not ship:

  • profiler requires traces with .gpuprofiler_raw
  • shader-source requires traces with source attribution data

Documentation

Detailed format and workflow documentation lives in docs/.

Reverse-engineering notes and implementation status documents live in docs/research/.

GPU Timing Methodology

.gputrace files do not contain pre-computed timing percentages. Xcode Instruments derives shader cost by replaying captured GPU workloads with performance counters enabled. This library extracts timing from profiler streamData (dispatch/kernel duration, execution cost sampling, and GPRWCNTR encoder profiles) when a .gpuprofiler_raw directory is present. For traces without profiler data, only structural information (kernels, encoders, buffers) is available.

Developer Convenience

For local macOS reinstall and permission setup:

make reinstall

License

MIT License. See LICENSE for details.
