Skip to content

gguf: use an unordered_map for faster duplicate tensor name lookups #14818

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 0 commits into from

Conversation

struct
Copy link
Contributor

@struct struct commented Jul 22, 2025

The current implementation iterates the vector of tensors comparing each name using strcmp. This operation gets slower with each new tensor added. We can improve the performance of this using an std::unordered_map which has a better average lookup time as the tensor vector grows in size.

Tested with:

$ build/bin/llama-bench 
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| llama 1B Q4_0                  | 606.54 MiB |     1.10 B | Metal,BLAS |      12 |           pp512 |       5694.21 ± 8.22 |
| llama 1B Q4_0                  | 606.54 MiB |     1.10 B | Metal,BLAS |      12 |           tg128 |        339.54 ± 4.73 |

build: 38d3af1b (5955)
$ build/bin/llama-gguf models/7B/ggml-model-q4_0.gguf r n
$ build/bin/test-gguf 
...
108/108 tests passed
OK
 bash ./ci/run.sh ./tmp/results ./tmp/mnt

@struct struct requested a review from JohannesGaessler as a code owner July 22, 2025 15:15
@github-actions github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Jul 22, 2025
@struct
Copy link
Contributor Author

struct commented Jul 22, 2025

This is a trivial change involving a temporary local unordered_map. I can't help but wonder why gguf_context doesn't hold similar data structures to improve lookups in gguf_get_key or gguf_get_tensor? Is there an explicit preference for these simple O(N) for loops?

@@ -10,6 +10,7 @@
#include <cstdlib>
#include <cstring>
#include <map>
#include <unordered_set>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remnant from earlier iteration?

ok = false;
break;
}
auto [it, result] = tensor_names.emplace(info.t.name, i);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
auto [it, result] = tensor_names.emplace(info.t.name, i);
auto [it, result] = tensor_names.emplace(info.t.name, i);

@struct struct closed this Jul 22, 2025
@struct struct force-pushed the tensor_name_duplicate_lookup branch from 1302318 to 38d3af1 Compare July 22, 2025 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ggml changes relating to the ggml tensor library for machine learning
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants