Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a rename_modules_dict arg to Envoy that allows renaming modules #255

Closed
wants to merge 1 commit into from

Conversation

Butanium
Copy link
Contributor

@Butanium Butanium commented Oct 1, 2024

After discussing with @cadentj some alternatives to transformer_lens (which sometimes has a significant implementation gap with HuggingFace), he suggested adding a renaming feature to nnsight itself, as it might be useful not only for transformers.

The basic usage would be to convert module names from, e.g., GPT-2 to the standard LLaMA / Gemma format using:

The basic usage would be to convert module names from e.g. gpt2 to the standard llama / gemma format using

dict_change = {
    "transformer": "model",
    "self_attention": "self_attn",
    "attn": "self_attn",
    "h": "layers",
}
patched_model = LanguageModel("gpt2", rename_modules_dict=dict_change)
with patched_model.trace("hello"):
    out = patched_model.model.layers[0].output[0].save()

@Butanium
Copy link
Contributor Author

Butanium commented Oct 4, 2024

@JadenFiotto-Kaufman I can try to add tests if needed but I wanted to first check if you were ok with adding this feature to the nnsight core.

@JadenFiotto-Kaufman
Copy link
Member

@JadenFiotto-Kaufman I can try to add tests if needed but I wanted to first check if you were ok with adding this feature to the nnsight core.

Hey this is an awesome idea and something I've wanted to add to nnsight. Essentially we want aliasing right?

I think it should be a dictionary where the keys are valid regex strings that map to an alias. This dictionary would live at Envoy._aliases and would be checked on the Envoy.getattr method. If a key matches the reverse mapping, it returns the real underlying envoy for that alias.

Does that sound like a good solution?

@Butanium
Copy link
Contributor Author

Butanium commented Oct 4, 2024

Hmm, would you still pass the alias dict through the NNSight class init?

I guess one advantage of renaming over aliasing is that when you print the model, you actually see modules' new names.

What advantage do you see of using aliasing instead of renaming?

@Butanium
Copy link
Contributor Author

Butanium commented Nov 3, 2024

@JadenFiotto-Kaufman bump

@JadenFiotto-Kaufman
Copy link
Member

@Butanium Does this work? #298

@Butanium Butanium mentioned this pull request Dec 5, 2024
@Butanium
Copy link
Contributor Author

Butanium commented Dec 7, 2024

Implemented in #298

@Butanium Butanium closed this Dec 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants