trainer = trlx.train(config=config, reward_fn=lambda samples, **kwargs: [float(int(sample)) for sample in samples])
trainer = trlx.train(config=config, reward_fn=lambda samples, **kwargs: [len(sample) for sample in samples])
```
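For anything beyond a one-liner, the lambda can be swapped for a named function with the same `samples, **kwargs` signature. A minimal sketch mirroring the length-based reward above:

```python
def reward_fn(samples, **kwargs):
    # Same signal as the len-based lambda above: longer samples earn higher rewards
    return [float(len(sample)) for sample in samples]

trainer = trlx.train(config=config, reward_fn=reward_fn)
```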
To reduce memory usage (for example, if you're hitting CUDA Out of Memory errors), first try the lowest setting for each of the following hyperparameters, then increase them gradually:
```python
# micro batch size per GPU
config.train.batch_size = 1
# freeze all transformer layers
config.model.num_layers_unfrozen = 0
# maximum sample length; prompts or samples longer than this will be truncated
config.train.seq_length = 128

# micro batch size for sampling (specific to PPO)
config.method.chunk_size = 1
# whether to use an additional Q-head (specific to ILQL)
config.method.two_qs = False
```
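Putting it together, here is a minimal sketch of a low-memory PPO run. It assumes your trlx version ships the `default_ppo_config` helper in `trlx.data.default_configs` (older releases load YAML configs via `TRLConfig.load_yaml` instead):

```python
import trlx
from trlx.data.default_configs import default_ppo_config  # assumed helper, see note above

config = default_ppo_config()

# Low-memory overrides from the list above
config.train.batch_size = 1
config.model.num_layers_unfrozen = 0
config.train.seq_length = 128
config.method.chunk_size = 1  # PPO-specific

trainer = trlx.train(
    config=config,
    reward_fn=lambda samples, **kwargs: [len(sample) for sample in samples],
)
```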
#### Save the resulting model as a Hugging Face pretrained language model (ready to upload to the Hub!)
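A one-line sketch of that step, assuming the `trainer` returned by `trlx.train` exposes `save_pretrained` (the output path is a placeholder):

```python
trainer.save_pretrained('path/to/output/folder/')
```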