
Commit 9d85509

committed Aug 4, 2023
make bitsandbytes optional
1 parent f3be995 commit 9d85509

File tree

6 files changed: +125 −426 lines changed


README.md

+52-28
@@ -22,6 +22,17 @@ __Stable Diffusion web UI now seems to support LoRA trained by ``sd-scripts``.__
 
 The feature of SDXL training is now available in sdxl branch as an experimental feature.
 
+Aug 4, 2023: The feature will be merged into the main branch soon. The following are the changes from the previous version.
+
+- `bitsandbytes` is now optional. Please install it if you want to use it. The instructions are in a later section.
+- `albumentations` is no longer required.
+- An issue with the pooled output in Textual Inversion training is fixed.
+- The `--v_pred_like_loss ratio` option is added. This option adds a loss similar to the v-prediction loss in SDXL training; `0.1` means 10% of the v-prediction loss is added. The default value is None (disabled).
+  - In v-prediction, the loss is higher at the early timesteps (near the noise). This option can be used to increase the loss at the early timesteps.
+- Arbitrary options can be passed to Diffusers' schedulers, for example `--lr_scheduler_args "lr_end=1e-8"`.
+- `sdxl_gen_imgs.py` supports batch sizes > 1.
+- Fix ControlNet to work with attention couple and regional LoRA in `gen_img_diffusers.py`.
+
 Summary of the feature:
 
 - `tools/cache_latents.py` is added. This script can be used to cache the latents to disk in advance.
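The `--lr_scheduler_args` change mentioned above forwards free-form `key=value` pairs to Diffusers' schedulers. A minimal sketch of how such strings could be parsed into keyword arguments; the function name and exact behavior are illustrative assumptions, not the actual sd-scripts implementation:

```python
import ast

# Illustrative sketch only: turn "key=value" strings such as "lr_end=1e-8"
# into a kwargs dict that could be forwarded to a Diffusers scheduler.
def parse_scheduler_args(args):
    kwargs = {}
    for arg in args:
        key, _, value = arg.partition("=")
        try:
            # literal_eval converts "1e-8" to a float, "True" to a bool, etc.
            kwargs[key] = ast.literal_eval(value)
        except (SyntaxError, ValueError):
            kwargs[key] = value  # keep non-literal values as plain strings
    return kwargs
```

The resulting dict would then be passed to the scheduler constructor as `**kwargs`.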
@@ -65,12 +76,17 @@ Summary of the feature:
 ### Tips for SDXL training
 
 - The default resolution of SDXL is 1024x1024.
-- The fine-tuning can be done with 24GB GPU memory with the batch size of 1. For 24GB GPU, the following options are recommended:
+- The fine-tuning can be done with 24GB GPU memory with a batch size of 1. The following options are recommended __for fine-tuning with 24GB GPU memory__:
   - Train U-Net only.
   - Use gradient checkpointing.
   - Use the `--cache_text_encoder_outputs` option and cache the latents.
   - Use the Adafactor optimizer. RMSprop 8bit or Adagrad 8bit may work. AdamW 8bit doesn't seem to work.
-- The LoRA training can be done with 12GB GPU memory.
+- The LoRA training can be done with 8GB GPU memory (10GB recommended). To reduce GPU memory usage, the following options are recommended:
+  - Train U-Net only.
+  - Use gradient checkpointing.
+  - Use the `--cache_text_encoder_outputs` option and cache the latents.
+  - Use one of the 8bit optimizers or the Adafactor optimizer.
+  - Use a lower dim (-8 for 8GB GPU).
 - The `--network_train_unet_only` option is highly recommended for SDXL LoRA, because SDXL has two text encoders and training them can produce unexpected results.
 - PyTorch 2 seems to use slightly less GPU memory than PyTorch 1.
 - `--bucket_reso_steps` can be set to 32 instead of the default value 64. Smaller values than 32 will not work for SDXL training.
@@ -93,19 +109,11 @@ state_dict = {"clip_g": embs_for_text_encoder_1280, "clip_l": embs_for_text_enco
 save_file(state_dict, file)
 ```
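The embedding layout in the snippet above can be illustrated with a small framework-free sketch. The helper name and the 768 width for `clip_l` (the snippet truncates that variable name; 768 is CLIP-L's usual hidden size) are assumptions for illustration, and plain nested lists stand in for real tensors:

```python
# Hedged sketch of the SDXL Textual Inversion embedding layout shown above:
# "clip_g" holds embeddings for text encoder 2 (width 1280) and "clip_l" for
# text encoder 1 (assumed width 768). Nested lists stand in for tensors.
def build_ti_state_dict(num_vectors):
    embs_for_text_encoder_1280 = [[0.0] * 1280 for _ in range(num_vectors)]
    embs_for_text_encoder_768 = [[0.0] * 768 for _ in range(num_vectors)]
    return {
        "clip_g": embs_for_text_encoder_1280,
        "clip_l": embs_for_text_encoder_768,
    }
```

In the real script, the values would be trained tensors and the dict would be written with safetensors' `save_file` as shown.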
95111

96-
### TODO
97-
98-
- [ ] Support conversion of Diffusers SDXL models.
99-
- [ ] Support `--weighted_captions` option.
100-
- [ ] Change `--output_config` option to continue the training.
101-
- [ ] Extend `--full_bf16` for all the scripts.
102-
- [x] Support Textual Inversion training.
103-
104112
## About requirements.txt
105113

106114
These files do not contain requirements for PyTorch. Because the versions of them depend on your environment. Please install PyTorch at first (see installation guide below.)
107115

108-
The scripts are tested with PyTorch 1.12.1 and 2.0.1, Diffusers 0.17.1.
116+
The scripts are tested with PyTorch 1.12.1 and 2.0.1, Diffusers 0.18.2.
109117

110118
## Links to how-to-use documents
111119

@@ -151,13 +159,16 @@ pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 --extra-index-url http
 pip install --upgrade -r requirements.txt
 pip install -U -I --no-deps https://github.com/C43H66N12O12S2/stable-diffusion-webui/releases/download/f/xformers-0.0.14.dev0-cp310-cp310-win_amd64.whl
 
-cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
-cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
-cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
-
 accelerate config
 ```
 
+__Note:__ `bitsandbytes` is now optional. Install any version of `bitsandbytes` as needed; installation instructions are in the following section.
+
+<!--
+cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
+cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
+cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
+-->
 Answers to accelerate config:
 
 ```txt
@@ -190,10 +201,6 @@ pip install torch==2.0.1+cu118 torchvision==0.15.2+cu118 --index-url https://dow
 pip install --upgrade -r requirements.txt
 pip install xformers==0.0.20
 
-cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
-cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
-cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
-
 accelerate config
 ```
 
@@ -204,26 +211,43 @@ Answers to accelerate config should be the same as above.
 Other versions of PyTorch and xformers seem to have problems with training.
 Unless you have another reason, please install the specified versions.
 
-### Optional: Use Lion8bit
+### Optional: Use `bitsandbytes` (8bit optimizer)
+
+For the 8bit optimizers, you need to install `bitsandbytes`. On Linux, install `bitsandbytes` as usual (0.41.1 or later is recommended).
+
+For Windows, there are several versions of `bitsandbytes`:
+
+- `bitsandbytes` 0.35.0: Stable version. AdamW8bit is available. `full_bf16` is not available.
+- `bitsandbytes` 0.39.1: Lion8bit, PagedAdamW8bit and PagedLion8bit are available. `full_bf16` is available.
+
+Note: `bitsandbytes` versions above 0.35.0 and up to 0.41.0 seem to have an issue: https://github.com/TimDettmers/bitsandbytes/issues/659
 
-For Lion8bit, you need to upgrade `bitsandbytes` to 0.38.0 or later. Uninstall `bitsandbytes`, and for Windows, install the Windows version whl file from [here](https://github.com/jllllll/bitsandbytes-windows-webui) or other sources, like:
+Follow the instructions below to install `bitsandbytes` for Windows.
+
+### bitsandbytes 0.35.0 for Windows
+
+Open a regular PowerShell terminal and type the following inside:
 
 ```powershell
-pip install https://github.com/jllllll/bitsandbytes-windows-webui/raw/main/bitsandbytes-0.38.1-py3-none-any.whl
+cd sd-scripts
+.\venv\Scripts\activate
+pip install bitsandbytes==0.35.0
+
+cp .\bitsandbytes_windows\*.dll .\venv\Lib\site-packages\bitsandbytes\
+cp .\bitsandbytes_windows\cextension.py .\venv\Lib\site-packages\bitsandbytes\cextension.py
+cp .\bitsandbytes_windows\main.py .\venv\Lib\site-packages\bitsandbytes\cuda_setup\main.py
 ```
 
-For upgrading, upgrade this repo with `pip install .`, and upgrade necessary packages manually.
+This will install `bitsandbytes` 0.35.0 and copy the patched files into the installed `bitsandbytes` package.
 
-### Optional: Use PagedAdamW8bit and PagedLion8bit
+### bitsandbytes 0.39.1 for Windows
 
-For PagedAdamW8bit and PagedLion8bit, you need to upgrade `bitsandbytes` to 0.39.0 or later. Uninstall `bitsandbytes`, and for Windows, install the Windows version whl file from [here](https://github.com/jllllll/bitsandbytes-windows-webui) or other sources, like:
+Install the Windows version whl file from [here](https://github.com/jllllll/bitsandbytes-windows-webui) or other sources, like:
 
 ```powershell
 pip install https://github.com/jllllll/bitsandbytes-windows-webui/releases/download/wheels/bitsandbytes-0.39.1-py3-none-win_amd64.whl
 ```
 
-For upgrading, upgrade this repo with `pip install .`, and upgrade necessary packages manually.
-
 ## Upgrade
 
 When a new release comes out you can upgrade your repo with the following command:
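Since this commit makes `bitsandbytes` optional, a training script has to cope with it being absent. A minimal sketch of such a check, illustrative only and not the actual sd-scripts code (it avoids importing `bitsandbytes` so it works whether or not the package exists):

```python
from importlib import metadata, util

# Illustrative sketch: detect whether the now-optional bitsandbytes package
# is installed before offering 8bit optimizers. Not sd-scripts' actual code.
def bitsandbytes_status():
    """Return (installed, version) without importing bitsandbytes itself."""
    if util.find_spec("bitsandbytes") is None:
        return False, None
    return True, metadata.version("bitsandbytes")

installed, version = bitsandbytes_status()
optimizer_choices = ["Adafactor"]  # does not depend on bitsandbytes
if installed:
    optimizer_choices.append("AdamW8bit")  # 8bit optimizers need bitsandbytes
```

A real script would raise a clear error message when an 8bit optimizer is requested but `bitsandbytes` is missing.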
-15 KB
Binary file not shown.

bitsandbytes_windows/main.py

+70-396
Large diffs are not rendered by default.

library/train_util.py

+1
@@ -2164,6 +2164,7 @@ def cache_batch_latents(
 if flip_aug:
     info.latents_flipped = flipped_latent
 
+# FIXME this slows down caching a lot, specify this as an option
 if torch.cuda.is_available():
     torch.cuda.empty_cache()

requirements.txt

+1-1
@@ -6,7 +6,7 @@ ftfy==6.1.1
 opencv-python==4.7.0.68
 einops==0.6.0
 pytorch-lightning==1.9.0
-bitsandbytes==0.39.1
+# bitsandbytes==0.39.1
 tensorboard==2.10.1
 safetensors==0.3.1
 # gradio==3.16.2

sdxl_minimal_inference.py

+1-1
@@ -213,7 +213,7 @@ def call_text_encoder(text, text2):
 enc_out = text_model2(tokens, output_hidden_states=True, return_dict=True)
 text_embedding2_penu = enc_out["hidden_states"][-2]
 # print("hidden_states2", text_embedding2_penu.shape)
-text_embedding2_pool = enc_out["text_embeds"]  # do not suport Textual Inversion
+text_embedding2_pool = enc_out["text_embeds"]  # do not support Textual Inversion
 
 # concat and finish
 text_embedding = torch.cat([text_embedding1, text_embedding2_penu], dim=2)
