Enhance terminology in GRPO code #593

nouhadziri · 2025-03-03T21:27:15Z

grpo_vllm_thread_ray_gtrl.py uses --actor_num_gpus_per_node to refer to the training processes, which is a bit confusing. We could rename it to --num_learners_per_node instead.


vwxyzjn · 2025-03-03T21:35:25Z

Also, I think number_samples_per_prompt should be renamed to num_samples_per_prompt for consistency.
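
One way to make both renames without breaking existing launch commands is to keep the old flag names as deprecated aliases for a while. The sketch below assumes plain argparse, which may not match how grpo_vllm_thread_ray_gtrl.py actually defines its arguments. The DeprecatedAlias class, the build_parser helper, and the types and default values are hypothetical and only illustrate the idea.

```python
import argparse
import warnings


class DeprecatedAlias(argparse.Action):
    """Store the value as usual, but warn if an old flag name was used.

    Hypothetical helper: the first option string is treated as the new
    name, and any other option string is treated as a deprecated alias.
    """

    def __call__(self, parser, namespace, values, option_string=None):
        new_flag = self.option_strings[0]
        if option_string != new_flag:
            warnings.warn(
                f"{option_string} is deprecated; use {new_flag} instead",
                DeprecationWarning,
            )
        setattr(namespace, self.dest, values)


def build_parser():
    parser = argparse.ArgumentParser()
    # New name first, old name second. Type and default are placeholders.
    parser.add_argument(
        "--num_learners_per_node",
        "--actor_num_gpus_per_node",
        dest="num_learners_per_node",
        type=int,
        default=1,
        action=DeprecatedAlias,
    )
    parser.add_argument(
        "--num_samples_per_prompt",
        "--number_samples_per_prompt",
        dest="num_samples_per_prompt",
        type=int,
        default=1,
        action=DeprecatedAlias,
    )
    return parser
```

With this approach, a command that still passes --actor_num_gpus_per_node or --number_samples_per_prompt keeps working and emits a DeprecationWarning, while the rest of the code only reads the new attribute names.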
