christiangnrd (Member) commented Aug 4, 2025

Now depends on #688

github-actions bot (Contributor) commented Aug 4, 2025

Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic main) to apply these changes.

Suggested changes:
diff --git a/lib/mtl/capture.jl b/lib/mtl/capture.jl
index c2c1a77a..c101c5b7 100644
--- a/lib/mtl/capture.jl
+++ b/lib/mtl/capture.jl
@@ -59,7 +59,8 @@ function MTLCaptureDescriptor()
 end
 
 # TODO: Add capture state
-function MTLCaptureDescriptor(obj::Union{MTLDevice,MTLCommandQueue,MTLCaptureScope},
+function MTLCaptureDescriptor(
+        obj::Union{MTLDevice, MTLCommandQueue, MTLCaptureScope},
                               destination::MTLCaptureDestination;
                               folder::String=nothing)
     desc = MTLCaptureDescriptor()
@@ -110,7 +111,8 @@ end
 
 Start GPU frame capture using the default capture object and specifying capture descriptor parameters directly.
 """
-function startCapture(obj::Union{MTLDevice,MTLCommandQueue,MTLCaptureScope},
+function startCapture(
+        obj::Union{MTLDevice, MTLCommandQueue, MTLCaptureScope},
                       destination::MTLCaptureDestination=MTLCaptureDestinationGPUTraceDocument;
                       folder::String=nothing)
     if destination == MTLCaptureDestinationGPUTraceDocument && folder === nothing
diff --git a/perf/array.jl b/perf/array.jl
index 008ab4d6..b86a675e 100644
--- a/perf/array.jl
+++ b/perf/array.jl
@@ -63,12 +63,12 @@ gpu_vec_ints = reshape(gpu_mat_ints, length(gpu_mat_ints))
 let group = addgroup!(group, "reverse")
     group["1d"] = @benchmarkable Metal.@sync reverse($gpu_vec)
     group["1dL"] = @benchmarkable Metal.@sync reverse($gpu_vec_long)
-    group["2d"] = @benchmarkable Metal.@sync reverse($gpu_mat; dims=1)
-    group["2dL"] = @benchmarkable Metal.@sync reverse($gpu_mat_long; dims=1)
+    group["2d"] = @benchmarkable Metal.@sync reverse($gpu_mat; dims = 1)
+    group["2dL"] = @benchmarkable Metal.@sync reverse($gpu_mat_long; dims = 1)
     group["1d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_vec)
     group["1dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_vec_long)
-    group["2d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat; dims=1)
-    group["2dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat_long; dims=2)
+    group["2d_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat; dims = 1)
+    group["2dL_inplace"] = @benchmarkable Metal.@sync reverse!($gpu_mat_long; dims = 2)
 end
 
 # 'evals=1' added to prevent hang when running benchmarks of CI
diff --git a/perf/runbenchmarks.jl b/perf/runbenchmarks.jl
index 17bf4ea0..98aa3153 100644
--- a/perf/runbenchmarks.jl
+++ b/perf/runbenchmarks.jl
@@ -1,7 +1,7 @@
 # benchmark suite execution and codespeed submission
 
 using Pkg
-Pkg.add(url="https://github.com/christiangnrd/GPUArrays.jl", rev="reverse")
+Pkg.add(url = "https://github.com/christiangnrd/GPUArrays.jl", rev = "reverse")
 
 using Metal
 
diff --git a/test/runtests.jl b/test/runtests.jl
index 081fc280..42f00908 100644
--- a/test/runtests.jl
+++ b/test/runtests.jl
@@ -1,5 +1,5 @@
 using Pkg
-Pkg.add(url="https://github.com/christiangnrd/GPUArrays.jl", rev="reverse")
+Pkg.add(url = "https://github.com/christiangnrd/GPUArrays.jl", rev = "reverse")
 
 using Distributed
 using Dates

christiangnrd force-pushed the reverse branch 2 times, most recently from 4c15cc1 to 108f6d1 (August 5, 2025 01:24)

github-actions bot (Contributor) left a comment

Metal Benchmarks

| Benchmark suite | Current: 36bd61e | Previous: cd8846e | Ratio |
| --- | --- | --- | --- |
| latency/precompile | 28688656917 ns | 25041664500 ns | 1.15 |
| latency/ttfp | 2291530833 ns | 2123110083 ns | 1.08 |
| latency/import | 1381234520.5 ns | 1219920625 ns | 1.13 |
| integration/metaldevrt | 840667 ns | 833750 ns | 1.01 |
| integration/byval/slices=1 | 1555083 ns | 1545000 ns | 1.01 |
| integration/byval/slices=3 | 8878646 ns | 9534208 ns | 0.93 |
| integration/byval/reference | 1538083 ns | 1538458 ns | 1.00 |
| integration/byval/slices=2 | 2615499.5 ns | 2567417 ns | 1.02 |
| kernel/indexing | 627458 ns | 570625 ns | 1.10 |
| kernel/indexing_checked | 633229 ns | 587875 ns | 1.08 |
| kernel/launch | 12375 ns | 12250 ns | 1.01 |
| kernel/rand | 568167 ns | 559417 ns | 1.02 |
| array/reverse/1d | 632875 ns | | |
| array/reverse/2dL_inplace | 2500979 ns | | |
| array/reverse/1dL | 2114333.5 ns | | |
| array/reverse/2d | 1346708 ns | | |
| array/reverse/1d_inplace | 577000 ns | | |
| array/reverse/2d_inplace | 809208 ns | | |
| array/reverse/2dL | 6548417 ns | | |
| array/reverse/1dL_inplace | 863000 ns | | |
| array/construct | 6333 ns | 6250 ns | 1.01 |
| array/broadcast | 595792 ns | 568853.5 ns | 1.05 |
| array/accumulate/Int64/1d | 1321375 ns | 1252541.5 ns | 1.05 |
| array/accumulate/Int64/dims=1 | 1916729.5 ns | 1812125 ns | 1.06 |
| array/accumulate/Int64/dims=2 | 2272167 ns | 2154541.5 ns | 1.05 |
| array/accumulate/Int64/dims=1L | 11932167 ns | 11676250 ns | 1.02 |
| array/accumulate/Int64/dims=2L | 10049604 ns | 9788583 ns | 1.03 |
| array/accumulate/Float32/1d | 1162333.5 ns | 1110125 ns | 1.05 |
| array/accumulate/Float32/dims=1 | 1661541.5 ns | 1542584 ns | 1.08 |
| array/accumulate/Float32/dims=2 | 2007729 ns | 1844333 ns | 1.09 |
| array/accumulate/Float32/dims=1L | 10013979.5 ns | 9855833 ns | 1.02 |
| array/accumulate/Float32/dims=2L | 8145167 ns | 7549292 ns | 1.08 |
| array/random/randn/Float32 | 810667 ns | 806041 ns | 1.01 |
| array/random/randn!/Float32 | 609708 ns | 604084 ns | 1.01 |
| array/random/rand!/Int64 | 548125 ns | 548625 ns | 1.00 |
| array/random/rand!/Float32 | 582125 ns | 569834 ns | 1.02 |
| array/random/rand/Int64 | 768041 ns | 813479.5 ns | 0.94 |
| array/random/rand/Float32 | 668062.5 ns | 598917 ns | 1.12 |
| array/reductions/reduce/Int64/1d | 1354396 ns | 1246458 ns | 1.09 |
| array/reductions/reduce/Int64/dims=1 | 1113667 ns | 1066250 ns | 1.04 |
| array/reductions/reduce/Int64/dims=2 | 1312083 ns | 1159375 ns | 1.13 |
| array/reductions/reduce/Int64/dims=1L | 2023000 ns | 2051083 ns | 0.99 |
| array/reductions/reduce/Int64/dims=2L | 4377083 ns | 3424666.5 ns | 1.28 |
| array/reductions/reduce/Float32/1d | 986750 ns | 854125 ns | 1.16 |
| array/reductions/reduce/Float32/dims=1 | 837625 ns | 811584 ns | 1.03 |
| array/reductions/reduce/Float32/dims=2 | 881750 ns | 743292 ns | 1.19 |
| array/reductions/reduce/Float32/dims=1L | 1321750.5 ns | 1329354.5 ns | 0.99 |
| array/reductions/reduce/Float32/dims=2L | 1866500 ns | 1745667 ns | 1.07 |
| array/reductions/mapreduce/Int64/1d | 1343500 ns | 1416563 ns | 0.95 |
| array/reductions/mapreduce/Int64/dims=1 | 1074125 ns | 1069687.5 ns | 1.00 |
| array/reductions/mapreduce/Int64/dims=2 | 1299833 ns | 1172437.5 ns | 1.11 |
| array/reductions/mapreduce/Int64/dims=1L | 2026250 ns | 1986541.5 ns | 1.02 |
| array/reductions/mapreduce/Int64/dims=2L | 4311458.5 ns | 3283958 ns | 1.31 |
| array/reductions/mapreduce/Float32/1d | 1049770.5 ns | 992000 ns | 1.06 |
| array/reductions/mapreduce/Float32/dims=1 | 823792 ns | 813708 ns | 1.01 |
| array/reductions/mapreduce/Float32/dims=2 | 879292 ns | 746375 ns | 1.18 |
| array/reductions/mapreduce/Float32/dims=1L | 1331417 ns | 1326333 ns | 1.00 |
| array/reductions/mapreduce/Float32/dims=2L | 1853709 ns | 1752958 ns | 1.06 |
| array/private/copyto!/gpu_to_gpu | 634167 ns | 618000 ns | 1.03 |
| array/private/copyto!/cpu_to_gpu | 803604.5 ns | 784542 ns | 1.02 |
| array/private/copyto!/gpu_to_cpu | 782833 ns | 785458 ns | 1.00 |
| array/private/iteration/findall/int | 1718667 ns | 1561958 ns | 1.10 |
| array/private/iteration/findall/bool | 1503666.5 ns | 1421958 ns | 1.06 |
| array/private/iteration/findfirst/int | 2103042 ns | 1808166 ns | 1.16 |
| array/private/iteration/findfirst/bool | 2080229.5 ns | 1675041 ns | 1.24 |
| array/private/iteration/scalar | 5453000 ns | 4652479 ns | 1.17 |
| array/private/iteration/logical | 2645958 ns | 2505708.5 ns | 1.06 |
| array/private/iteration/findmin/1d | 2271250 ns | 1902125 ns | 1.19 |
| array/private/iteration/findmin/2d | 1565500 ns | 1510458 ns | 1.04 |
| array/private/copy | 600250 ns | 554729 ns | 1.08 |
| array/shared/copyto!/gpu_to_gpu | 83875 ns | 83750 ns | 1.00 |
| array/shared/copyto!/cpu_to_gpu | 82875 ns | 81542 ns | 1.02 |
| array/shared/copyto!/gpu_to_cpu | 83000 ns | 82459 ns | 1.01 |
| array/shared/iteration/findall/int | 1669250 ns | 1577417 ns | 1.06 |
| array/shared/iteration/findall/bool | 1515666.5 ns | 1437209 ns | 1.05 |
| array/shared/iteration/findfirst/int | 1708250 ns | 1321541.5 ns | 1.29 |
| array/shared/iteration/findfirst/bool | 1680583.5 ns | 1308542 ns | 1.28 |
| array/shared/iteration/scalar | 208000 ns | 199708 ns | 1.04 |
| array/shared/iteration/logical | 2610854 ns | 2227625 ns | 1.17 |
| array/shared/iteration/findmin/1d | 1887083 ns | 1410625 ns | 1.34 |
| array/shared/iteration/findmin/2d | 1569666 ns | 1511604 ns | 1.04 |
| array/shared/copy | 241541 ns | 250333 ns | 0.96 |
| array/permutedims/4d | 2638959 ns | 2361500 ns | 1.12 |
| array/permutedims/2d | 1135083.5 ns | 1143583 ns | 0.99 |
| array/permutedims/3d | 1660334 ns | 1654771 ns | 1.00 |
| metal/synchronization/stream | 19667 ns | 18667 ns | 1.05 |
| metal/synchronization/context | 19458 ns | 20000 ns | 0.97 |

This comment was automatically generated by workflow using github-action-benchmark.

codecov bot commented Oct 9, 2025

Codecov Report

❌ Patch coverage is 97.05882% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 80.83%. Comparing base (cd8846e) to head (36bd61e).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/indexing.jl | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #648      +/-   ##
==========================================
- Coverage   80.92%   80.83%   -0.09%     
==========================================
  Files          62       62              
  Lines        2820     2844      +24     
==========================================
+ Hits         2282     2299      +17     
- Misses        538      545       +7     

maleadt marked this pull request as draft (October 14, 2025 08:00)

maleadt (Member) commented Oct 14, 2025

Let's mark this as draft until it pulls from a dev branch on GPUArrays.

christiangnrd force-pushed the reverse branch 8 times, most recently from 36bd61e to 7874058 (November 6, 2025 15:57)
christiangnrd changed the base branch from main to kaintr (November 13, 2025 15:21)
christiangnrd changed the base branch from kaintr to main (November 13, 2025 15:25)