Add ability to run Gemma 2 models without post layer norm #40670

amer-sinha · 2025-09-03T18:58:10Z

What does this PR do?

Adds the ability for Gemma2 models to run without post-attention layer normalization and post-feedforward layer normalization.
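To illustrate the idea, here is a minimal, dependency-free sketch of a Gemma 2 style decoder layer in which the two post norms can be switched off. It is not the code from this PR: the flag names `use_post_attention_norm` and `use_post_feedforward_norm` are hypothetical, the RMS norm omits the learned scale, and `attn` and `mlp` are stand-in callables.

```python
import math
from typing import Callable, List

Vector = List[float]


def rms_norm(x: Vector, eps: float = 1e-6) -> Vector:
    """Weight-free RMS normalization: x / sqrt(mean(x^2) + eps)."""
    scale = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / scale for v in x]


def identity(x: Vector) -> Vector:
    """Used in place of a post norm when that norm is disabled."""
    return list(x)


def decoder_layer(
    x: Vector,
    attn: Callable[[Vector], Vector],
    mlp: Callable[[Vector], Vector],
    use_post_attention_norm: bool = True,    # hypothetical flag name
    use_post_feedforward_norm: bool = True,  # hypothetical flag name
) -> Vector:
    post_attn_norm = rms_norm if use_post_attention_norm else identity
    post_ffn_norm = rms_norm if use_post_feedforward_norm else identity

    # Attention sub-block: pre-norm -> attention -> optional post-norm -> residual add.
    h = post_attn_norm(attn(rms_norm(x)))
    x = [a + b for a, b in zip(x, h)]

    # Feedforward sub-block: pre-norm -> MLP -> optional post-norm -> residual add.
    h = post_ffn_norm(mlp(rms_norm(x)))
    return [a + b for a, b in zip(x, h)]
```

With both flags left at `True`, the layer follows the usual Gemma 2 ordering (a norm before and after each sub-block). With both set to `False`, the sub-block outputs are added to the residual stream unnormalized, so their magnitude directly affects the result.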

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a Github issue or the forum? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the
documentation guidelines, and
here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

github-actions · 2025-09-03T23:36:38Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma2, t5gemma

vasqu

Are there any models that use this? We usually only support features that are already included in published (or soon-to-be-published) models.

Cyrilvallez · 2025-09-04T17:26:53Z

Indeed! As @vasqu said, the best approach is to create a new model with modular (it will require very little code and effort)! 🤗

amer-sinha force-pushed the gemma2-nopostnorm branch 7 times, most recently from 940ad76 to f8f3264 · September 3, 2025 23:35

Add ability to run Gemma 2 models without post attention norm and post feedforward norm (eadadbc)

amer-sinha force-pushed the gemma2-nopostnorm branch from f8f3264 to eadadbc · September 3, 2025 23:43

vasqu reviewed Sep 4, 2025
