Enhance terminology in GRPO code #593

Open
nouhadziri opened this issue Mar 3, 2025 · 1 comment

Comments

@nouhadziri
Contributor

The grpo_vllm_thread_ray_gtrl.py script uses --actor_num_gpus_per_node to refer to the training processes, which is a bit confusing. We could use --num_learners_per_node instead.

@vwxyzjn
Collaborator

vwxyzjn commented Mar 3, 2025

Also, I think number_samples_per_prompt should be renamed to num_samples_per_prompt for consistency.
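
One way both renames could be rolled out without breaking existing launch commands is to accept the old flag as a deprecated alias of the new one. The snippet below is only a minimal sketch using plain argparse. The script's real argument handling, the `int` types, and the defaults are assumptions, and `RENAMED_FLAGS`, `WarnIfDeprecated`, and `build_parser` are hypothetical names used for illustration.

```python
import argparse
import warnings

# Old flag -> new flag, as proposed in this issue.
RENAMED_FLAGS = {
    "--actor_num_gpus_per_node": "--num_learners_per_node",
    "--number_samples_per_prompt": "--num_samples_per_prompt",
}


class WarnIfDeprecated(argparse.Action):
    """Store the value as usual, but warn when the old spelling is used."""

    def __call__(self, parser, namespace, values, option_string=None):
        if option_string in RENAMED_FLAGS:
            warnings.warn(
                f"{option_string} is deprecated; use {RENAMED_FLAGS[option_string]}",
                DeprecationWarning,
                stacklevel=2,
            )
        setattr(namespace, self.dest, values)


def build_parser():
    # Types and defaults here are placeholders, not the script's real ones.
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--num_learners_per_node",
        "--actor_num_gpus_per_node",  # old name kept as an alias
        dest="num_learners_per_node",
        type=int,
        default=1,
        action=WarnIfDeprecated,
    )
    parser.add_argument(
        "--num_samples_per_prompt",
        "--number_samples_per_prompt",  # old name kept as an alias
        dest="num_samples_per_prompt",
        type=int,
        default=1,
        action=WarnIfDeprecated,
    )
    return parser
```

With this approach, a command that still passes --actor_num_gpus_per_node keeps working and emits a DeprecationWarning, while the value is stored under the new name, so downstream code only needs to read num_learners_per_node.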
