
Improve T5 encoder tests with more prompts and static context length #976

Draft
sogartar wants to merge 2 commits into main from t5-improve-test-numerics
Conversation

sogartar
Contributor

The set of prompts is not large enough for statistically sound testing of
the T5 encoder; the same is true for the other text encoders.
With the expanded prompt set, the bf16 numerical difference between eager
execution and IREE vanished. IREE even turns out to be more accurate.

In the tests, the tokenizer padding has been changed to always produce
max-length token sequences. This is in line with how T5 is used in the Flux
pipeline. The T5 encoder export has been extended with an option to export
with a static token sequence length.
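
As a rough illustration of the padding change, here is a minimal sketch using a Hugging Face T5 tokenizer. The model name, prompt list, and sequence length of 512 are hypothetical placeholders, not necessarily what the tests use:

```python
from transformers import T5Tokenizer

# Hypothetical model id and max length, for illustration only.
tokenizer = T5Tokenizer.from_pretrained("google/t5-v1_1-xxl")
prompts = ["a photo of a cat", "an astronaut riding a horse on mars"]

# Pad every prompt to the same maximum length, so eager runs, IREE runs,
# and a static-length export all see token sequences of identical shape.
batch = tokenizer(
    prompts,
    padding="max_length",  # always pad to max_length, not just to the longest prompt
    max_length=512,
    truncation=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # -> torch.Size([2, 512])
```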

The tests were refactored to share tolerance values for f32 and bf16.
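
The shared tolerances could follow a pattern like the sketch below; the table name, helper, and tolerance numbers are hypothetical, not the values used in the actual tests:

```python
import torch

# Hypothetical per-dtype tolerances shared by the eager-vs-IREE comparisons.
TOLERANCES = {
    torch.float32: {"atol": 1e-5, "rtol": 1e-5},
    torch.bfloat16: {"atol": 1e-2, "rtol": 1e-2},
}

def assert_encoder_outputs_close(actual, expected, dtype):
    # Both the f32 and bf16 tests look up the same table instead of
    # hard-coding tolerance values in each test.
    torch.testing.assert_close(actual, expected, **TOLERANCES[dtype])
```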

@sogartar
Contributor Author

This PR is on top of #967, which must be merged first.

@sogartar sogartar force-pushed the t5-improve-test-numerics branch from 92c078e to 7226258 on February 17, 2025 at 23:42
We don't want the stack to depend on the conversion tool from the
llama.cpp repo. Also, the conversion to GGUF does not convert all tensors
to bf16; it leaves some in f32. We would like to control that ourselves if
needed.

This change makes any previously generated IRPA files obsolete.
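
The kind of control referred to above could look roughly like this sketch; it is purely illustrative, and the predicate for which tensors stay in f32 is a made-up example rather than what sharktank actually does:

```python
import torch

def convert_to_bf16(state_dict, keep_f32=lambda name: "norm" in name):
    # Convert floating-point tensors to bf16, except the ones the caller
    # chooses to keep in f32 (here, as an example, normalization weights).
    return {
        name: t if (not t.is_floating_point() or keep_f32(name)) else t.to(torch.bfloat16)
        for name, t in state_dict.items()
    }
```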
@sogartar sogartar force-pushed the t5-improve-test-numerics branch from 7226258 to 753414f on February 17, 2025 at 23:44