Skip to content

Commit

Permalink
update example docs
Browse files Browse the repository at this point in the history
  • Loading branch information
hiyouga committed May 6, 2024
1 parent 34d33e2 commit f02f87c
Show file tree
Hide file tree
Showing 35 changed files with 1,049 additions and 595 deletions.
4 changes: 2 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -337,15 +337,15 @@ Please refer to [data/README.md](data/README.md) for checking the details about
### Quickstart

The following 3 commands conduct LoRA fine-tuning, inference and merging for Llama3-8B-Instruct model, respectively.
Use the following 3 commands to conduct LoRA **fine-tuning**, **inference** and **merging** for Llama3-8B-Instruct model, respectively.

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_sft.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```

See [examples/README.md](examples/README.md) for advanced usage.
See [examples/README.md](examples/README.md) for advanced usage (including distributed training).

> [!TIP]
> Use `llamafactory-cli help` to show help information.
Expand Down
6 changes: 3 additions & 3 deletions README_zh.md
Original file line number Diff line number Diff line change
Expand Up @@ -337,18 +337,18 @@ pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/downl
### 快速开始

下面三行命令分别对 Llama3-8B-Instruct 模型进行 LoRA 微调、推理和合并
下面三行命令分别对 Llama3-8B-Instruct 模型进行 LoRA **微调****推理****合并**

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_sft.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```

高级用法请参考 [examples/README_zh.md](examples/README_zh.md)
高级用法请参考 [examples/README_zh.md](examples/README_zh.md)(包括多 GPU 微调)

> [!TIP]
> 使用 `llamafactory-cli help` 显示使用帮助
> 使用 `llamafactory-cli help` 显示帮助信息
### 使用 LLaMA Board 可视化界面(由 [Gradio](https://github.com/gradio-app/gradio) 驱动)

Expand Down
2 changes: 1 addition & 1 deletion data/dataset_info.json
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@
},
"identity": {
"file_name": "identity.json",
"file_sha1": "ffe3ecb58ab642da33fbb514d5e6188f1469ad40"
"file_sha1": "0f67e97fd01612006ab3536cdaf6cfb0d1e7f279"
},
"oaast_sft": {
"file_name": "oaast_sft.json",
Expand Down
170 changes: 85 additions & 85 deletions data/identity.json

Large diffs are not rendered by default.

253 changes: 200 additions & 53 deletions examples/README.md
Original file line number Diff line number Diff line change
@@ -1,57 +1,204 @@
We provide diverse examples about fine-tuning LLMs.

### LoRA Fine-Tuning on A Single GPU

#### (Continuous) Pre-Training

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_pretrain.yaml
```

#### Supervised Fine-Tuning

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_sft.yaml
```

#### Reward Modeling

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_reward.yaml
```

#### PPO Training

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_ppo.yaml
```

#### DPO Training

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_dpo.yaml
```

#### ORPO Training

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_orpo.yaml
```

#### Multimodal Supervised Fine-Tuning

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llava1_5_lora_sft.yaml
```

#### Preprocess Dataset

It is useful for large dataset, use `tokenized_path` in config to load the preprocessed dataset.

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_preprocess.yaml
```

#### Evaluating on MMLU/CMMLU/C-Eval Benchmarks

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli eval examples/lora_single_gpu/llama3_lora_eval.yaml
```

#### Batch Predicting and Computing BLEU and ROUGE Scores

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/lora_single_gpu/llama3_lora_predict.yaml
```

### QLoRA Fine-Tuning on a Single GPU

#### Supervised Fine-Tuning with 4/8-bit Bitsandbytes Quantization (Recommended)

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_bitsandbytes.yaml
```

#### Supervised Fine-Tuning with 4/8-bit GPTQ Quantization

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_gptq.yaml
```

#### Supervised Fine-Tuning with 4-bit AWQ Quantization

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_awq.yaml
```

#### Supervised Fine-Tuning with 2-bit AQLM Quantization

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/qlora_single_gpu/llama3_lora_sft_aqlm.yaml
```

### LoRA Fine-Tuning on Multiple GPUs

#### Supervised Fine-Tuning with Accelerate on Single Node

```bash
bash examples/lora_multi_gpu/single_node.sh
```

#### Supervised Fine-Tuning with Accelerate on Multiple Nodes

```bash
bash examples/lora_multi_gpu/multi_node.sh
```

#### Supervised Fine-Tuning with DeepSpeed ZeRO-3 (Weight Sharding)

```bash
bash examples/lora_multi_gpu/ds_zero3.sh
```

### Full-Parameter Fine-Tuning on Multiple GPUs

#### Supervised Fine-Tuning with Accelerate on Single Node

```bash
bash examples/full_multi_gpu/single_node.sh
```

#### Supervised Fine-Tuning with Accelerate on Multiple Nodes

```bash
bash examples/full_multi_gpu/multi_node.sh
```

#### Batch Predicting and Computing BLEU and ROUGE Scores

```bash
bash examples/full_multi_gpu/predict.sh
```

### Merging LoRA Adapters and Quantization

#### Merge LoRA Adapters

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
```

#### Quantizing Model using AutoGPTQ

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli export examples/merge_lora/llama3_gptq.yaml
```

### Inferring LoRA Fine-Tuned Models

#### Use CLI

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli chat examples/merge_lora/llama3_lora_sft.yaml
```

#### Use Web UI

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli webchat examples/merge_lora/llama3_lora_sft.yaml
```

#### Launch OpenAI-style API

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli api examples/merge_lora/llama3_lora_sft.yaml
```

### Extras

#### Full-Parameter Fine-Tuning using GaLore

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/galore/llama3_full_sft.yaml
```

#### Full-Parameter Fine-Tuning using BAdam

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/badam/llama3_full_sft.yaml
```

#### LoRA+ Fine-Tuning

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/loraplus/llama3_lora_sft.yaml
```

#### Mixture-of-Depths Fine-Tuning

```bash
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/mod/llama3_full_sft.yaml
```

#### LLaMA-Pro Fine-Tuning

```bash
bash examples/extras/llama_pro/expand.sh
CUDA_VISIBLE_DEVICES=0 llamafactory-cli train examples/extras/llama_pro/llama3_freeze_sft.yaml
```

#### FSDP+QLoRA Fine-Tuning

```bash
export CUDA_VISIBLE_DEVICES=0
cd examples/lora_single_gpu
llamafactory-cli train llama3_lora_pretrain.yaml # Do continuous pre-training using LoRA

```

```
examples/
├── lora_single_gpu/
│ ├── `
│ ├── sft.sh: Do supervised fine-tuning using LoRA
│ ├── reward.sh: Do reward modeling using LoRA
│ ├── ppo.sh: Do PPO training using LoRA
│ ├── dpo.sh: Do DPO training using LoRA
│ ├── orpo.sh: Do ORPO training using LoRA
│ ├── sft_mllm.sh: Do supervised fine-tuning on multimodal data using LoRA
│ ├── prepare.sh: Save tokenized dataset
│ └── predict.sh: Do batch predict and compute BLEU and ROUGE scores after LoRA tuning
├── qlora_single_gpu/
│ ├── bitsandbytes.sh: Fine-tune 4/8-bit BNB models using QLoRA
│ ├── gptq.sh: Fine-tune 4/8-bit GPTQ models using QLoRA
│ ├── awq.sh: Fine-tune 4-bit AWQ models using QLoRA
│ └── aqlm.sh: Fine-tune 2-bit AQLM models using QLoRA
├── lora_multi_gpu/
│ ├── single_node.sh: Fine-tune model with Accelerate on single node using LoRA
│ ├── multi_node.sh: Fine-tune model with Accelerate on multiple nodes using LoRA
│ └── ds_zero3.sh: Fine-tune model with DeepSpeed ZeRO-3 using LoRA (weight sharding)
├── full_multi_gpu/
│ ├── single_node.sh: Full fine-tune model with DeepSpeed on single node
│ ├── multi_node.sh: Full fine-tune model with DeepSpeed on multiple nodes
│ └── predict.sh: Do parallel batch predict and compute BLEU and ROUGE scores after full tuning
├── merge_lora/
│ ├── merge.sh: Merge LoRA weights into the pre-trained models
│ └── quantize.sh: Quantize the fine-tuned model with AutoGPTQ
├── inference/
│ ├── cli_demo.sh: Chat with fine-tuned model in the CLI with LoRA adapters
│ ├── api_demo.sh: Chat with fine-tuned model in an OpenAI-style API with LoRA adapters
│ ├── web_demo.sh: Chat with fine-tuned model in the Web browser with LoRA adapters
│ └── evaluate.sh: Evaluate model on the MMLU/CMMLU/C-Eval benchmarks with LoRA adapters
└── extras/
├── galore/
│ └── sft.sh: Fine-tune model with GaLore
├── badam/
│ └── sft.sh: Fine-tune model with BAdam
├── loraplus/
│ └── sft.sh: Fine-tune model using LoRA+
├── mod/
│ └── sft.sh: Fine-tune model using Mixture-of-Depths
├── llama_pro/
│ ├── expand.sh: Expand layers in the model
│ └── sft.sh: Fine-tune the expanded model
└── fsdp_qlora/
└── sft.sh: Fine-tune quantized model with FSDP+QLoRA
bash examples/extras/fsdp_qlora/single_node.sh
```
Loading

0 comments on commit f02f87c

Please sign in to comment.