-
Notifications
You must be signed in to change notification settings - Fork 13.6k
Add circular tiling support to conv2d and pad, for Vulkan, CUDA, and CPU (used for making seamless textures) #16985
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| int d1); // dilation dimension 1 | ||
|
|
||
|
|
||
| GGML_API struct ggml_tensor * ggml_conv_2d_circular( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd personally prefer the wrapping to be an option on existing commands (either add an optional parameter to existing functions, or do something like ggml_mul_mat_set_prec to modify it after it's created. But the core maintainers should decide. I just don't want to end up with 2^N different convolution functions as these additional options keep getting added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think an optional parameter is an option (iiuc) since they are c api? (could hack with macros, but not ideal) But I'm open to some state modifying thing if that's what they want
| uint32_t H_idx = OH_idx * p.s1 + KH_idx_b * p.d1 - p.p1; | ||
| uint32_t W_idx = OW_idx * p.s0 + KW_idx_b * p.d0 - p.p0; | ||
| #endif | ||
| H_pos = (p.circular != 0) ? wrap_coord(int(H_idx), p.H) : H_idx; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This causes a pretty significant slowdown in test-backend-ops -o CONV_2D perf for some cases. Once #16978 lands, circular should become a spec constant and this problem will go away.
|
I am wondering, is it possible to add only a variant of Adding circular padding natively to all convolutions on all/most backends is a lot of investment. I'm not sure how common it is, so it would be interesting to know the trade-off. |
This adds extra functions
That have equivalent signatures to the non-circular versions (I considered modifying the existing ones, but didn't want to break existing code). Instead of padding with zeros, they act "on a torus" and loop x and y around.
I implemented this for CUDA, CPU, and Vulkan, as those are the primary backends people use in KoboldCpp/Stable Diffusion Cpp to generate images. For other backends, it'll fall back to non-circular.
This can be used to make seamless textures, see leejet/stable-diffusion.cpp#914 for an example and the changes needed on the image generation side. For some models (Stable Diffusion) simply calling the circular functions is sufficient, for other models (Qwen Image) you need to modify Rope embeddings slightly as well (so they cleanly loop).
I ran CI tests and added tests for these, but happy to answer any questions/modify things as needed.