Commit 5c5abca

feat(readme): add instructions to avoid OOMs with hyperparameters (#470)
Co-authored-by: reciprocated <[email protected]>
1 parent 1446523 commit 5c5abca


1 file changed: README.md (+18 -4 lines)
@@ -68,14 +68,28 @@ trainer.generate(**tokenizer('Q: Who rules the world? A:', return_tensors='pt'),
 #### Configure Hyperparameters
 
 ```python
-from trlx.data.default_configs import default_ppo_config, TrainConfig
+from trlx.data.default_configs import default_ppo_config
 
 config = default_ppo_config()
 config.model.model_path = 'EleutherAI/gpt-neox-20b'
-config.train.seq_length = 32
-config.train.batch_size = 16
+config.tokenizer.tokenizer_path = 'EleutherAI/gpt-neox-20b'
+config.train.seq_length = 2048
 
-trainer = trlx.train(config=config, reward_fn=lambda samples, **kwargs: [float(int(sample)) for sample in samples])
+trainer = trlx.train(config=config, reward_fn=lambda samples, **kwargs: [len(sample) for sample in samples])
+```
+To reduce memory usage (if you're experiencing CUDA Out of Memory errors), first try the lowest setting for the following hyperparameters and gradually increase them:
+```python
+# micro batch size per gpu
+config.train.batch_size = 1
+# freeze all transformer layers
+config.model.num_layers_unfrozen = 0
+# maximum sample length; prompts or samples longer than this will be truncated
+config.train.seq_length = 128
+
+# micro batch size for sampling (specific to PPO)
+config.method.chunk_size = 1
+# use an additional Q-head (specific to ILQL)
+config.method.two_qs = False
 ```
 
 #### Save the resulting model to a Hugging Face pretrained language model. (Ready to upload to the Hub!)
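
For reference, the memory-saving settings introduced in this diff can be folded into the earlier training example. The snippet below is a minimal sketch, not part of the commit itself: it assumes the trlx API shown in the diff (`trlx.train`, `default_ppo_config`) and keeps only the PPO-relevant settings, reusing the toy length-based reward function.

```python
import trlx
from trlx.data.default_configs import default_ppo_config

# Start from the default PPO config and point it at a large model.
config = default_ppo_config()
config.model.model_path = 'EleutherAI/gpt-neox-20b'
config.tokenizer.tokenizer_path = 'EleutherAI/gpt-neox-20b'

# Lowest-memory settings from the diff above; raise them gradually
# once training runs without CUDA OOM errors.
config.train.batch_size = 1            # micro batch size per gpu
config.model.num_layers_unfrozen = 0   # freeze all transformer layers
config.train.seq_length = 128          # truncate prompts/samples beyond 128 tokens
config.method.chunk_size = 1           # micro batch size for sampling (PPO)

# Same toy reward as in the diff: longer samples score higher.
trainer = trlx.train(
    config=config,
    reward_fn=lambda samples, **kwargs: [len(sample) for sample in samples],
)
```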
