
opencl: mark argsort unsupported if cols exceed workgroup limit #15375


Merged Aug 19, 2025 (1 commit)

Conversation

@lhez (Collaborator) commented Aug 17, 2025

In the current implementation, the workgroup size is (cols, 1, 1). For large tensors, this can exceed the kernel's workgroup size limit, causing the kernel launch to fail. For example, in the new test case added in #15354, the max workgroup size for this kernel on A750 is 896, but 1024 is required.

This PR marks such argsort ops as unsupported. It also prints the max workgroup size supported by the device (an upper bound on the max workgroup size supported by any particular kernel) for reference.
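The check described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the PR: `wg_limits` and `argsort_supported` are hypothetical names, and the limits are passed in as plain numbers so the sketch runs without an OpenCL device. In a real backend, the per-kernel limit would typically be queried with `clGetKernelWorkGroupInfo` (`CL_KERNEL_WORK_GROUP_SIZE`) and the device-wide limit with `clGetDeviceInfo` (`CL_DEVICE_MAX_WORK_GROUP_SIZE`).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the limits reported by the OpenCL runtime.
// kernel_max_wg: limit for one specific kernel (CL_KERNEL_WORK_GROUP_SIZE).
// device_max_wg: device-wide limit (CL_DEVICE_MAX_WORK_GROUP_SIZE), which is
//                only an upper bound on any kernel's limit.
struct wg_limits {
    size_t device_max_wg;
    size_t kernel_max_wg;
};

// Argsort is launched with a workgroup of size (cols, 1, 1), so the op can
// only run when cols fits within the kernel's own workgroup limit.
// Checking against device_max_wg alone would not be enough.
static bool argsort_supported(int64_t cols, const wg_limits & limits) {
    return cols > 0 && static_cast<uint64_t>(cols) <= limits.kernel_max_wg;
}
```

With the numbers from the description (a kernel limit of 896 and 1024 columns required), `argsort_supported(1024, limits)` returns `false`, so the op would be reported as unsupported rather than failing at launch.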

@lhez requested a review from @max-krasnyansky on Aug 17, 2025, 14:48
@github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (issues specific to the OpenCL backend) labels on Aug 17, 2025
@lhez marked this pull request as ready for review on Aug 18, 2025, 14:42
@max-krasnyansky merged commit fb22dd0 into ggml-org:master on Aug 19, 2025
47 checks passed
2 participants