fix: replace inf with max or min finite value, then do softmax #3059
base: main
It seems that this issue happens on NPU devices.
We faced this issue when the temperature was set to 0. Could you check the value of temperature in your case?
When I encountered this problem, I was using the following configuration.
def _softmax_scores(scores: torch.Tensor):
    """softmax scores."""
    # if score has inf, replace it with max or min finite value, then do softmax
    if torch.isinf(scores).any():
any() would synchronize the stream and hurt performance.
    device = scores.device

    scores = torch.where(scores == float('inf'), torch.tensor(max_finite_value, dtype=dtype, device=device), scores)
clamp should be better.
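A minimal sketch of what the clamp suggestion could look like, for illustration only. The helper name `_softmax_scores_clamped` and the choice of `dim=-1` are assumptions, not code from this PR. Because clamp is elementwise, it needs no host-side `any()` check and therefore no stream synchronization.

```python
import torch


def _softmax_scores_clamped(scores: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: clamp +/-inf to finite extremes, then softmax."""
    # Largest and smallest finite values representable in the scores dtype.
    info = torch.finfo(scores.dtype)
    # Elementwise clamp: +inf becomes info.max, -inf becomes info.min.
    # No data-dependent branch, so nothing is copied back to the host.
    scores = scores.clamp(min=info.min, max=info.max)
    # Assumption: the vocabulary axis is the last dimension.
    return scores.softmax(dim=-1)


scores = torch.tensor([[float('inf'), 1.0, float('-inf')]])
probs = _softmax_scores_clamped(scores)
```

With a single +inf entry, all probability mass ends up on that entry and no nan is produced.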
Please set a non-zero temperature (please check the code in lmdeploy(
In my opinion, checking for inf reduces performance.
You're right. When I set temperature=0, the effective value is actually temperature=1e-6. After calling this function, the scores become inf.
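The following sketch illustrates one way a temperature of 1e-6 can produce inf and then nan. It assumes half-precision (float16) logits, which this thread does not state, and the logit values are made up for the example. Dividing by 1e-6 pushes ordinary logits past the float16 maximum of 65504.

```python
import torch

# Assumption: logits are stored in float16; the values are illustrative.
logits = torch.tensor([10.0, 2.0, -3.0], dtype=torch.float16)
temperature = 1e-6

# 10 / 1e-6 = 1e7, far above the float16 maximum (65504), so it overflows.
scaled = logits / temperature

# Softmax subtracts the row maximum first; inf - inf is nan,
# and the nan then propagates through the normalization.
probs = torch.softmax(scaled.float(), dim=-1)
```

Here every entry of `scaled` is +inf or -inf, and `probs` contains nan.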
Motivation
When I deployed a large model for inference, the scores contained inf values, and calling the softmax function on them produced nan values. This can cause errors such as:
Modification
I added the _softmax_scores function, which wraps the softmax function: if the scores contain inf, it replaces them with the max or min finite value, then applies softmax.
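A rough reconstruction of the described wrapper, pieced together from the diff fragments quoted in the review. It is a sketch, not the actual PR code: `dim=-1`, the use of `torch.finfo`, and dropping the `isinf(...).any()` guard are assumptions.

```python
import torch


def _softmax_scores(scores: torch.Tensor) -> torch.Tensor:
    """Replace +/-inf with the dtype's finite extremes, then softmax."""
    finfo = torch.finfo(scores.dtype)
    dtype, device = scores.dtype, scores.device
    # 0-dim tensors on the same device and dtype as the scores.
    max_finite = torch.tensor(finfo.max, dtype=dtype, device=device)
    min_finite = torch.tensor(finfo.min, dtype=dtype, device=device)
    # Elementwise replacement; finite entries pass through unchanged.
    scores = torch.where(scores == float('inf'), max_finite, scores)
    scores = torch.where(scores == float('-inf'), min_finite, scores)
    # Assumption: softmax is taken over the last dimension.
    return scores.softmax(dim=-1)


finite = torch.tensor([[1.0, 2.0, 3.0]])
with_inf = torch.tensor([[float('inf'), 0.0, float('-inf')]])
out_finite = _softmax_scores(finite)
out_inf = _softmax_scores(with_inf)
```

For finite inputs the result matches a plain softmax; for inputs containing inf the output stays free of nan.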