feat: add Metal backend for macOS and Apple Silicon #91

Closed. emberian wants to merge 4 commits from the add-metal-backend branch.

emberian commented Mar 21, 2025

This commit adds a new Metal backend to rust-gpu-tools, enabling support for:

  • macOS Metal API for GPU compute
  • Apple Silicon hardware
  • Integration with existing CUDA and OpenCL backends
  • Parallel use of multiple backends from the same code

Metal support is behind a feature flag ('metal') and follows the same
API patterns as existing backends for consistent usage.

---


This entire backend (and the commit message above the line) was made by claude-code with ~zero guidance, which I think is pretty neat, so I am sharing it here.

⎿ Total cost: $5.19
Total duration (API): 18m 54.8s
Total duration (wall): 45m 3.8s
Total code changes: 1254 lines added, 190 lines removed

The actual approach it took towards the conditional compilation isn't what I would have used.

This still needs a bunch of work, but I didn't expect it to get this far. I don't know if I will find the time to pick it up.
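
For readers who have not seen a feature-gated backend before, here is a minimal sketch of the general pattern the description refers to: a backend that is compiled only when a `metal` Cargo feature is enabled. It is not the code from this PR, and the approach the PR actually took is not shown in this thread. The module name `metal`, the functions `backend_name` and `compiled_backends`, and the `Cargo.toml` fragment in the comment are all hypothetical.

```rust
// Hypothetical sketch; none of these names are taken from rust-gpu-tools.
// Cargo.toml would need to declare the flag, for example:
//   [features]
//   metal = []

/// Compiled only when the crate is built with `--features metal`.
#[cfg(feature = "metal")]
pub mod metal {
    /// Placeholder standing in for a real device query.
    pub fn backend_name() -> &'static str {
        "metal"
    }
}

/// Names of the optional backends compiled into this build (feature on).
#[cfg(feature = "metal")]
pub fn compiled_backends() -> Vec<&'static str> {
    vec![metal::backend_name()]
}

/// Names of the optional backends compiled into this build (feature off).
#[cfg(not(feature = "metal"))]
pub fn compiled_backends() -> Vec<&'static str> {
    Vec::new()
}

fn main() {
    println!("compiled backends: {:?}", compiled_backends());
}
```

Providing two `cfg`-selected definitions of the same function keeps callers free of `#[cfg]` attributes and avoids unused-`mut` or unused-import warnings in builds where the feature is off.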

- Use include_str! instead of file I/O for Metal shader code
- Remove unused imports and mut variable
- Fix example to run on Apple Silicon hardware

emberian force-pushed the add-metal-backend branch from d917ddd to 2fe479a (March 21, 2025 07:04)

emberian commented Mar 21, 2025

ember@nextop rust-gpu-tools % cargo +nightly run --example metal_add --features metal
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 3.85s
     Running `target/debug/examples/metal_add`
Found 1 GPU device(s)
Using device: Apple M2 Max
Running kernel...
Verifying results...
All results are correct!
zsh: segmentation fault  cargo +nightly run --example metal_add --features metal

BigLep commented May 30, 2025

@emberian: are you going to take this forward?

emberian commented

@BigLep I am demotivated because my use case (accelerating https://github.com/hellas-ai/hints) turned out to be slower on the GPU than on the CPU.

vmx commented Jul 4, 2025

@emberian Thanks for letting us know. I'll close the PR.

vmx closed this Jul 4, 2025