
Commit 026d2d0

v1.0.10 Update verl models
1 parent 0d22a7d commit 026d2d0

23 files changed (+2885, -1 lines)

.gitignore

Lines changed: 2 additions & 1 deletion
@@ -5,7 +5,8 @@ logs/
 outputs/
 results/
 wandb/
-sh/
+# sh/
+*verl/models/
 openr1_ckpts/
 *.wandb
 *.out

verl/models/README.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
# Models

Common model zoos such as huggingface/transformers struggle when used with PyTorch-native model parallelism. Following the design principle of vLLM, we keep the model implementations in verl simple, parallelizable, and highly optimized, operating on packed inputs.
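To make the packed-input layout concrete, here is a tiny illustration (not taken from verl; the token ids are made up): variable-length sequences are concatenated into one flat tensor with no padding, and `cu_seqlens` records the sequence boundaries as prefix sums.

```python
import torch

# Two variable-length sequences with made-up token ids.
seq_a = torch.tensor([101, 7592, 2088, 102])  # 4 tokens
seq_b = torch.tensor([101, 2129, 102])        # 3 tokens

# Packed batch: plain concatenation, no padding tokens.
input_ids = torch.cat([seq_a, seq_b])                    # shape (total_nnz,) == (7,)
cu_seqlens = torch.tensor([0, 4, 7], dtype=torch.int32)  # prefix sums, shape (batch_size + 1,)
max_seqlen_in_batch = 4                                   # longest sequence in the pack
```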
## Adding a New Huggingface Model

### Step 1: Copy the model file from HF to verl
- Add a new file under verl/models/hf
- Copy ONLY the model file from huggingface/transformers/models to verl/models/hf

### Step 2: Modify the model file to use packed inputs
- Remove all the code related to inference (kv cache)
- Modify the inputs to include only
  - input_ids (total_nnz,)
  - cu_seqlens (batch_size + 1,)
  - max_seqlen_in_batch: int
- Note that this requires using flash attention with a causal mask (see the sketch below).
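For orientation, here is a minimal sketch of what a self-attention forward can look like after this rewrite. It is not the verl implementation: `PackedSelfAttention`, `hidden_size`, and `num_heads` are illustrative names, rotary embeddings and normalization are omitted, and it assumes the flash-attn package's `flash_attn_varlen_func` is available.

```python
import torch
import torch.nn as nn
from flash_attn import flash_attn_varlen_func  # requires CUDA and fp16/bf16


class PackedSelfAttention(nn.Module):
    # Self-attention over packed (unpadded) inputs; illustrative only.
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = hidden_size // num_heads
        self.qkv_proj = nn.Linear(hidden_size, 3 * hidden_size, bias=False)
        self.o_proj = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, hidden_states, cu_seqlens, max_seqlen_in_batch):
        # hidden_states: (total_nnz, hidden_size) -- no batch or padding dimension
        total_nnz, _ = hidden_states.shape
        q, k, v = self.qkv_proj(hidden_states).chunk(3, dim=-1)
        q = q.view(total_nnz, self.num_heads, self.head_dim)
        k = k.view(total_nnz, self.num_heads, self.head_dim)
        v = v.view(total_nnz, self.num_heads, self.head_dim)
        # cu_seqlens marks the sequence boundaries, so no attention-mask tensor
        # is ever materialized; causal=True applies the causal mask per sequence.
        attn = flash_attn_varlen_func(
            q, k, v,
            cu_seqlens_q=cu_seqlens, cu_seqlens_k=cu_seqlens,
            max_seqlen_q=max_seqlen_in_batch, max_seqlen_k=max_seqlen_in_batch,
            causal=True,
        )
        return self.o_proj(attn.reshape(total_nnz, -1))
```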

### Step 2.5: Add tests
- Add a test comparing this version against the huggingface version
- Follow the existing infrastructure and add the tests to tests/models/hf (see the sketch below)
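A hedged sketch of such an equivalence test, assuming a hypothetical `PackedLlamaForCausalLM` produced by Step 2 and the tests/models/hf location above; the reference is the stock huggingface model run one sequence at a time, so no padding is involved.

```python
import torch
from transformers import LlamaConfig, LlamaForCausalLM


def test_packed_model_matches_hf():
    config = LlamaConfig(hidden_size=128, num_hidden_layers=2, num_attention_heads=4,
                         intermediate_size=256, vocab_size=1000)
    hf_model = LlamaForCausalLM(config).cuda().half().eval()
    packed_model = PackedLlamaForCausalLM(config).cuda().half().eval()  # hypothetical verl class
    packed_model.load_state_dict(hf_model.state_dict())  # assumes identical parameter names

    seqlens = [5, 9, 3]
    seqs = [torch.randint(0, config.vocab_size, (n,), device="cuda") for n in seqlens]

    # Reference: run each sequence through HF with batch size 1 (no padding needed).
    ref_logits = torch.cat([hf_model(s.unsqueeze(0)).logits.squeeze(0) for s in seqs])

    # Packed: one flat forward with cu_seqlens marking the boundaries.
    input_ids = torch.cat(seqs)
    cu_seqlens = torch.tensor([0, 5, 14, 17], dtype=torch.int32, device="cuda")  # prefix sums of seqlens
    packed_logits = packed_model(input_ids, cu_seqlens=cu_seqlens,
                                 max_seqlen_in_batch=max(seqlens))

    # Loose tolerances: fp16 flash attention is not bit-exact with the eager path.
    torch.testing.assert_close(packed_logits, ref_logits, atol=1e-2, rtol=1e-2)
```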

### Step 3: Add a function to apply tensor parallelism
- Please follow
  - https://pytorch.org/docs/stable/distributed.tensor.parallel.html
  - https://pytorch.org/tutorials/intermediate/TP_tutorial.html
- General comments
  - Tensor parallelism in native PyTorch is NOT auto-parallelism. It works by specifying, through configs, how model parameters and module inputs/outputs are resharded; these configs are then registered as hooks that perform the input/output resharding before/after the module forward (see the sketch below).
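A minimal sketch of such a function built on the native PyTorch TP API from the links above; the submodule names follow the huggingface Llama layout, and the plan verl actually uses may differ.

```python
from torch.distributed.device_mesh import init_device_mesh
from torch.distributed.tensor.parallel import (ColwiseParallel, RowwiseParallel,
                                               parallelize_module)


def apply_tensor_parallel(model, tp_size: int = 8):
    # Requires torch.distributed to be initialized (e.g. launched via torchrun).
    tp_mesh = init_device_mesh("cuda", (tp_size,))
    for layer in model.model.layers:
        plan = {
            # attention: shard q/k/v column-wise, output projection row-wise
            "self_attn.q_proj": ColwiseParallel(),
            "self_attn.k_proj": ColwiseParallel(),
            "self_attn.v_proj": ColwiseParallel(),
            "self_attn.o_proj": RowwiseParallel(),
            # mlp: shard gate/up column-wise, down projection row-wise
            "mlp.gate_proj": ColwiseParallel(),
            "mlp.up_proj": ColwiseParallel(),
            "mlp.down_proj": RowwiseParallel(),
        }
        # Registers pre-/post-forward hooks that reshard inputs and outputs
        # around each listed submodule -- the "configs as hooks" idea above.
        parallelize_module(layer, tp_mesh, plan)
    return model
```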

### Step 4: Add a function to apply data parallelism
- Please use the FSDP2 APIs (see the sketch below)
- See the demo here: https://github.com/pytorch/torchtitan/blob/main/torchtitan/parallelisms/parallelize_llama.py#L413
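A sketch of that pattern with the FSDP2 `fully_shard` API, in the spirit of the torchtitan demo linked above; note that the import path of `fully_shard` has moved between PyTorch releases, so treat this as an approximation rather than the exact verl code.

```python
from torch.distributed._composable.fsdp import fully_shard  # location as of torch 2.4


def apply_fsdp2(model, dp_mesh):
    # Shard each transformer block separately so its parameters are gathered
    # only while that block runs forward/backward ...
    for layer in model.model.layers:
        fully_shard(layer, mesh=dp_mesh)
    # ... then shard the remaining parameters (embeddings, final norm, lm head)
    # at the root module.
    fully_shard(model, mesh=dp_mesh)
    return model
```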

### Step 5: Add a function to apply pipeline parallelism
- Comes in PyTorch 2.4
- Currently only available as an alpha feature in the nightly build
- Check torchtitan for more details

verl/models/__init__.py

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
# Copyright 2024 Bytedance Ltd. and/or its affiliates
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
148 Bytes
Binary file not shown.
1.89 KB
Binary file not shown.

verl/models/llama/__init__.py

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
# Copyright 2024 Bytedance Ltd. and/or its affiliates
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
# Copyright 2024 Bytedance Ltd. and/or its affiliates
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from .modeling_llama_megatron import (
    # original model with megatron
    ParallelLlamaModel,
    ParallelLlamaForCausalLM,
    # rmpad with megatron
    ParallelLlamaForCausalLMRmPad,
    ParallelLlamaForValueRmPad,
    # rmpad with megatron and pipeline parallelism
    ParallelLlamaForCausalLMRmPadPP,
    ParallelLlamaForValueRmPadPP)
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
# Copyright 2024 Bytedance Ltd. and/or its affiliates
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
