Skip to content

Conversation

@ngxson
Copy link
Collaborator

@ngxson ngxson commented Nov 29, 2025

This may silently fail if mul_mat return something other than F32 in the future, so I'm using ggml_row_size to be more future-proof.

@ngxson ngxson requested review from CISC and ggerganov November 29, 2025 15:19
@CISC
Copy link
Collaborator

CISC commented Nov 29, 2025

Did I miss something? I don't think we'll ever expect it to return quantized tensors, so this is surely overkill? If expecting anything other than F32 (in which case, we need to fix up a lot of places) using d_head*ggml_element_size(cur) should be enough?

@ngxson
Copy link
Collaborator Author

ngxson commented Nov 29, 2025

@CISC honestly I don't know if hard-coding sizeof(float) is a good practice in GGML overall, even when the tensor is guaranteed to be F32 (who knows, maybe a novel backend will store the tensor using just 31 bits instead of full 32 bits? hmm well bad example, in this case other things will break before this 😄 )

ggml_element_size(cur) * N is also OK, indeed I don't know what's the recommendation.

Not sure if you already came across this pattern somewhere else @ggerganov , maybe in whisper.cpp or somewhere else. What's your recommendation here?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants