generated from fastai/nbdev_template
Pull requests: huggingface/trl
Pull requests list
Add On-Policy Distillation from thinking labs to paper index.
#4410 opened Oct 30, 2025 by pramodith · 4 of 5 tasks

Add tip for logging evaluation metrics during regular evaluations
#4367 opened Oct 29, 2025 by cam1llynha

[OpenENV] Openenv rollout_func signature proposal
#4344 opened Oct 27, 2025 by kashif · 5 tasks

Use explicit tiny-Qwen2ForCausalLM-2.5 model_id param in CI tests
#4331 opened Oct 23, 2025 by albertvillanova

refactor: simplify parameter freezing in modeling_base.py
#4305 opened Oct 20, 2025 by Ki-Seki · 2 of 5 tasks

GRPO: ScaleRL -> Support casting LM Head to FP32
#4303 opened Oct 18, 2025 by pramodith · 4 of 5 tasks

[SFT] Log mean token accuracy from Liger kernel
#4302 opened Oct 18, 2025 by kashif · 5 tasks

feat: Add Multi-Token Prediction (MTP) support to SFTTrainer
#4290 opened Oct 15, 2025 by KLGR123

Remove FSDP1 support: use FSDP2 exclusively
#4260 opened Oct 11, 2025 by behroozazarkhalili

Fix DPO Trainer Bug For Qwen2-VL (Issue 2660)
#4257 opened Oct 11, 2025 by FabianSchuetze · 1 of 3 tasks

Update max_length explanation for VLM trainers
#4220 opened Oct 7, 2025 by sergiopaniego · 5 tasks

update guided decoding param to structured outputs
#4117 opened Sep 22, 2025 by jiqing-feng

feat:add support for 'image_grid_thw'(QwenVL) in DPOTrainer
#4091 opened Sep 15, 2025 by ycma8 · 2 of 5 tasks

Add config_init_kwargs option in GRPOConfig
#4069 opened Sep 12, 2025 by hokuyama0106 · 2 of 5 tasks