gguf: use an unordered_map for faster duplicate tensor name lookups #14818

struct · 2025-07-22T15:15:03Z

The current implementation iterates over the vector of tensors and compares each name with strcmp, so every duplicate check is O(N) and gets slower with each new tensor added. We can improve this with a std::unordered_map, whose average lookup time is O(1) and so stays roughly constant as the tensor vector grows.
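As a rough illustration of the idea (a minimal sketch, not the actual change in gguf.cpp), duplicate detection with a hash map might look like the following. The helper `find_duplicate_name`, its return convention, and the tensor names are assumptions made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper for illustration, not the code in gguf.cpp.
// Returns the index of the first name that repeats an earlier one, or -1 if all are unique.
std::ptrdiff_t find_duplicate_name(const std::vector<std::string> & names) {
    std::unordered_map<std::string, std::size_t> seen;
    seen.reserve(names.size());
    for (std::size_t i = 0; i < names.size(); ++i) {
        // emplace() inserts only when the key is absent; `inserted` is false for a duplicate.
        const auto [it, inserted] = seen.emplace(names[i], i);
        if (!inserted) {
            return static_cast<std::ptrdiff_t>(i);
        }
    }
    return -1;
}

// Usage example with made-up tensor names.
void find_duplicate_name_example() {
    assert(find_duplicate_name({"tok_embd.weight", "output.weight", "tok_embd.weight"}) == 2);
    assert(find_duplicate_name({"tok_embd.weight", "output.weight", "output_norm.weight"}) == -1);
}
```

Each name is hashed once, so the whole pass is O(N) on average instead of the O(N^2) total of comparing every new name against all earlier ones.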

Tested with:

$ build/bin/llama-bench 
| model                          |       size |     params | backend    | threads |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | --------------: | -------------------: |
| llama 1B Q4_0                  | 606.54 MiB |     1.10 B | Metal,BLAS |      12 |           pp512 |       5694.21 ± 8.22 |
| llama 1B Q4_0                  | 606.54 MiB |     1.10 B | Metal,BLAS |      12 |           tg128 |        339.54 ± 4.73 |

build: 38d3af1b (5955)

$ build/bin/llama-gguf models/7B/ggml-model-q4_0.gguf r n

$ build/bin/test-gguf 
...
108/108 tests passed
OK

$ bash ./ci/run.sh ./tmp/results ./tmp/mnt

struct · 2025-07-22T17:28:43Z

This is a trivial change involving a temporary local unordered_map. I can't help but wonder why gguf_context doesn't hold similar data structures to improve lookups in gguf_get_key or gguf_get_tensor. Is there an explicit preference for these simple O(N) for loops?
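To make the question concrete, here is a hedged sketch of what a persistent name index could look like. `toy_context`, `add_tensor`, and `find_tensor` are hypothetical names invented for this example; they are not the real gguf_context or any existing ggml API.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a context that owns tensors; not the real gguf_context.
struct toy_context {
    std::vector<std::string> tensor_names;                   // insertion order is preserved
    std::unordered_map<std::string, int64_t> name_to_index;  // side index, O(1) average lookup

    // Appends a tensor name; returns false and leaves the context unchanged on a duplicate.
    bool add_tensor(std::string name) {
        const int64_t idx = static_cast<int64_t>(tensor_names.size());
        if (!name_to_index.emplace(name, idx).second) {
            return false;
        }
        tensor_names.push_back(std::move(name));
        return true;
    }

    // Returns the tensor's index, or -1 if no tensor has that name.
    int64_t find_tensor(const std::string & name) const {
        const auto it = name_to_index.find(name);
        return it == name_to_index.end() ? -1 : it->second;
    }
};

// Usage example with made-up tensor names.
void toy_context_example() {
    toy_context ctx;
    assert(ctx.add_tensor("tok_embd.weight"));
    assert(ctx.add_tensor("output.weight"));
    assert(!ctx.add_tensor("tok_embd.weight"));  // duplicate is rejected
    assert(ctx.find_tensor("output.weight") == 1);
    assert(ctx.find_tensor("missing") == -1);
}
```

The trade-off in a design like this is that the index costs extra memory and has to be kept in sync on every mutation, which the plain O(N) loops avoid.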

CISC · 2025-07-22T17:30:22Z

ggml/src/gguf.cpp

@@ -10,6 +10,7 @@
 #include <cstdlib>
 #include <cstring>
 #include <map>
+#include <unordered_set>


Remnant from earlier iteration?

CISC · 2025-07-22T17:30:36Z

ggml/src/gguf.cpp

-                    ok = false;
-                    break;
-                }
+            auto [it, result]  = tensor_names.emplace(info.t.name, i);


Suggested change

-            auto [it, result]  = tensor_names.emplace(info.t.name, i);
+            auto [it, result] = tensor_names.emplace(info.t.name, i);

struct requested a review from JohannesGaessler as a code owner July 22, 2025 15:15

github-actions bot added the ggml changes relating to the ggml tensor library for machine learning label Jul 22, 2025

CISC reviewed Jul 22, 2025


struct closed this Jul 22, 2025

struct force-pushed the tensor_name_duplicate_lookup branch from 1302318 to 38d3af1 Compare July 22, 2025 17:34
