@ClarkChin08 commented Sep 23, 2025

Adds a safety check to 05_bmg_gemm_with_epilogue_splitk.cpp to handle cases where the N dimension is too small for the split-K fusion logic.

Without this check, small N values can lead to out-of-bounds memory access at this line:

              D1[l * M * NUM_HEAD * NOPE_DIM + i * NUM_HEAD * NOPE_DIM + j * NOPE_DIM + k] =
                  D[l * M * N + i * N + j * (NOPE_DIM + ROPE_DIM) + k];

This happens because N must be at least NUM_HEAD * (NOPE_DIM + ROPE_DIM) for the output to be split properly into the D1 and D2 tensors. If a user specifies a smaller N (e.g., via a command-line argument such as --n=128), the loop accesses indices beyond the bounds of the D array, resulting in a segmentation fault (core dump):

./05_bmg_gemm_with_epilogue_splitk --m=128 --n=128 --k=128 --iterations=0
Segmentation fault (core dumped)
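
For illustration, the bound described above can be written as a small standalone check. This is a minimal sketch, not the code in this PR: `n_fits_split` and `split_nope` are hypothetical names, the tensors are plain `std::vector<float>` on the host, and only the D1 (NOPE) copy quoted above is mirrored.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): true when one row of D, which is
// n elements wide, holds all num_head * (nope_dim + rope_dim) elements
// that the split loop reads.
bool n_fits_split(std::size_t n, std::size_t num_head,
                  std::size_t nope_dim, std::size_t rope_dim) {
  return n >= num_head * (nope_dim + rope_dim);
}

// Hypothetical host-side split of the NOPE part of every head from D into D1,
// using the same indexing as the line quoted above. Returns false without
// reading D when n is too small or D is shorter than L * M * n.
bool split_nope(const std::vector<float>& D, std::vector<float>& D1,
                std::size_t L, std::size_t M, std::size_t n,
                std::size_t num_head, std::size_t nope_dim,
                std::size_t rope_dim) {
  if (!n_fits_split(n, num_head, nope_dim, rope_dim)) return false;
  if (D.size() < L * M * n) return false;

  D1.assign(L * M * num_head * nope_dim, 0.0f);
  for (std::size_t l = 0; l < L; ++l)
    for (std::size_t i = 0; i < M; ++i)
      for (std::size_t j = 0; j < num_head; ++j)
        for (std::size_t k = 0; k < nope_dim; ++k)
          D1[l * M * num_head * nope_dim + i * num_head * nope_dim +
             j * nope_dim + k] =
              D[l * M * n + i * n + j * (nope_dim + rope_dim) + k];
  return true;
}
```

With a check like this in place, a too-small N is reported by the return value instead of being read past the end of D.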

@tdeng5 requested a review from taozha2 on September 26, 2025 06:23
Comment on lines +222 to +224
std::cout << "Error: n < num_head * (nope_dim + rope_dim). Please set a sufficiently large value for n. Skipping the check." << std::endl;
return true;
}

Could this parameter check be moved so that it runs before the kernel is launched?

Also, should we only skip validation of the out-of-range data, or should we prevent the kernel from launching at all?
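
As a rough sketch of the second option (rejecting the arguments before any launch), something along these lines might work. Everything here is hypothetical: `SplitShape`, `RunStatus`, and `run_if_valid` are not names from the example, the kernel launch is stood in for by a callback, and the shape values in the usage note are illustrative only.

```cpp
#include <functional>
#include <iostream>

// Hypothetical bundle of the sizes involved in the check (not from the PR).
struct SplitShape {
  int n;
  int num_head;
  int nope_dim;
  int rope_dim;
};

enum class RunStatus { Rejected, Launched };

// Validates the shape first and only then invokes the launch callback, so an
// undersized n never reaches the kernel or the host-side verification.
RunStatus run_if_valid(const SplitShape& s,
                       const std::function<void()>& launch_kernel) {
  const long long required =
      1LL * s.num_head * (s.nope_dim + s.rope_dim);
  if (s.n < required) {
    std::cerr << "Error: n (" << s.n
              << ") < num_head * (nope_dim + rope_dim) (" << required
              << "); kernel not launched." << std::endl;
    return RunStatus::Rejected;
  }
  launch_kernel();
  return RunStatus::Launched;
}
```

For example, with the assumed values num_head = 16, nope_dim = 128, rope_dim = 64, calling `run_if_valid({128, 16, 128, 64}, launch)` returns `RunStatus::Rejected` without calling `launch`. The first option would instead launch as usual and return early from verification, which is what the diff above does.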
