Adding sampling parameters for vllm generation #3210
Conversation
@qgallouedec This PR is ready for review :)
Thanks @shaipranesh2! I've added a few comments
stop: list[str] = [],
stop_token_ids: list[int] = [],
bad_words: list[str] = [],
Using mutable default arguments can lead to unexpected behavior because the same list is shared across all function calls.
Maybe replace these defaults with None and, in the function:
stop = stop or []
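A minimal sketch of that pattern (the surrounding `generate` signature is assumed from the diff above, not taken verbatim from the PR):

```python
def generate(
    prompts: list[str],
    stop: list[str] | None = None,
    stop_token_ids: list[int] | None = None,
    bad_words: list[str] | None = None,
):
    # Fall back to a fresh list on each call, so no mutable default
    # is shared between invocations.
    stop = stop or []
    stop_token_ids = stop_token_ids or []
    bad_words = bad_words or []
    ...
```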
"help": "Minimum length of the prompt. If the prompt is shorter than this value, it will be truncated left." | ||
}, |
How would truncation solve it? If the prompt is already shorter than this value, truncating it left would only make it shorter.
repetition_penalty: int = field(
    default=1,
    metadata={
        "help": "List of text prompts for which the model will generate completions."
I think there is a mismatch here: this help text describes prompts, not repetition_penalty.
For parameters specific to vllm, I would name them with a vllm_ prefix. Ideally, some are shared by transformers and vllm (like temperature); in that case, they don't need a prefix.
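As a sketch of the suggested naming, assuming dataclass-style config fields like those in the diff (the exact fields are illustrative):

```python
from dataclasses import dataclass, field

@dataclass
class GRPOConfig:
    # Shared by transformers and vllm generation: no prefix needed.
    temperature: float = field(
        default=1.0,
        metadata={"help": "Temperature used for sampling."},
    )
    # vllm-specific parameter: prefixed to make the backend explicit.
    vllm_repetition_penalty: float = field(
        default=1.0,
        metadata={"help": "Penalty applied to repeated tokens (vLLM sampling only)."},
    )
```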
What does this PR do?
Fixes issue #3201. Adds support in the GRPO config for setting additional vLLM sampling parameters.
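Roughly, the idea is to forward these config values to vLLM's SamplingParams. A hedged sketch of that wiring (the model and parameter values are illustrative, not the PR's exact code):

```python
from vllm import LLM, SamplingParams

# Illustrative values; in the PR these would come from the GRPO config.
prompts = ["The capital of France is"]
llm = LLM(model="Qwen/Qwen2.5-0.5B-Instruct")  # model choice is illustrative

sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.9,
    top_k=50,
    repetition_penalty=1.1,
    stop=["\n\n"],
    stop_token_ids=None,
)
outputs = llm.generate(prompts, sampling_params)
print(outputs[0].outputs[0].text)
```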