
[Feature Request] Make token indexing work for batched input and UnifiedTransformer #138

Open

Butanium opened this issue May 21, 2024 · 2 comments

Butanium (Contributor) commented May 21, 2024

Using .token[0] on a directly batched input returns the padding token for the shorter sequences in the batch:

from nnsight import LanguageModel

model = LanguageModel("gpt2", device_map="cpu")

probs = model.trace("a zd zdb", trace=False).logits
with model.trace(["ab dfez zd", "a", "b"]):
    inp = model.input.save()                        # full (args, kwargs) passed to the model
    inp2 = model.input[1]['input_ids'].t[0].save()  # "first token" via token indexing
print(inp, inp2)

out:

(((), {'input_ids': tensor([[  397,   288,  5036,    89,  1976,    67],
         [50256, 50256, 50256, 50256, 50256,    64],
         [50256, 50256, 50256, 50256, 50256,    65]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1],
         [0, 0, 0, 0, 0, 1],
         [0, 0, 0, 0, 0, 1]])}),
 tensor([  397, 50256, 50256]))
Name: nnsight
Version: 0.2.16
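
For context, the batched input above is padded on the left, so position 0 literally holds the pad token (50256) for the two shorter prompts. A minimal plain-PyTorch sketch (not nnsight API; tensors copied from the output above) of what "first real token per row" would look like, using the attention mask:

import torch

# Tensors taken from the output above (left padding, pad id 50256).
input_ids = torch.tensor([[  397,   288,  5036,    89,  1976,    67],
                          [50256, 50256, 50256, 50256, 50256,    64],
                          [50256, 50256, 50256, 50256, 50256,    65]])
attention_mask = torch.tensor([[1, 1, 1, 1, 1, 1],
                               [0, 0, 0, 0, 0, 1],
                               [0, 0, 0, 0, 0, 1]])

# With left padding, the number of zeros in each mask row is the number of
# leading pads, so the first real token sits right after them.
n_pad = (attention_mask == 0).sum(dim=1)
first_tokens = input_ids[torch.arange(input_ids.shape[0]), n_pad]
print(first_tokens)  # tensor([397, 64, 65]) -- what .token[0] would ideally return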
Butanium (Contributor, Author) commented:
Oh wait, it seems like token indexing is only supposed to work with tracer.invoke calls.
It would be nice if it also worked on directly batched input; I'm not sure how easy that would be to add with the current implementation.

@Butanium Butanium changed the title [Bug] Token indexing seems to be broken [Feature Request] Make token indexing work for batched input May 26, 2024
Butanium (Contributor, Author) commented May 26, 2024

OK, but with UnifiedTransformer, token indexing doesn't work even with tracer.invoke, because the padding side is right by default:

from nnsight import LanguageModel
from nnsight.models.UnifiedTransformer import UnifiedTransformer

l = ["ab dfez zd", "a", "b"]
model = LanguageModel("gpt2", device_map="cpu")

# Case 1: directly batched input -- token indexing returns padding for the short prompts.
with model.trace(l):
    inp = model.input[1]['input_ids'].token[0].save()

# Case 2: one tracer.invoke per prompt -- token indexing works as expected.
with model.trace() as tracer:
    inp_l = []
    for s in l:
        with tracer.invoke(s):
            inp_l.append(model.input[1]['input_ids'].token[0].save())

# Case 3: same per-invoke pattern with UnifiedTransformer -- fails because its
# tokenizer pads on the right by default.
umodel = UnifiedTransformer("gpt2", device="cpu")

with umodel.trace() as tracer:
    inp_l2 = []
    for s in l:
        with tracer.invoke(s):
            inp_l2.append(umodel.input[1]['input'].token[0].save())

print(inp)
print([i.item() for i in inp_l])
print([i.item() for i in inp_l2])
out:

You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
Loaded pretrained model gpt2 into HookedTransformer
You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
tensor([  397, 50256, 50256])
[397, 64, 65]
[397, 50256, 50256]
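
A possible workaround to try (an assumption on my part, not verified against nnsight's token-indexing internals) would be to force left padding on the UnifiedTransformer's underlying Hugging Face tokenizer before tracing, since padding_side is a standard tokenizer attribute:

from nnsight.models.UnifiedTransformer import UnifiedTransformer

umodel = UnifiedTransformer("gpt2", device="cpu")
# Assumption: the wrapped HF tokenizer is reachable as umodel.tokenizer and
# token indexing respects its padding side; pad on the left as LanguageModel does.
umodel.tokenizer.padding_side = "left"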

@Butanium Butanium changed the title [Feature Request] Make token indexing work for batched input [Feature Request] Make token indexing work for batched input and UnifiedTransformer May 26, 2024