Commit 884e23e

docs: add kontext doc
1 parent c9b5735 commit 884e23e

3 files changed: 51 additions, 6 deletions

README.md

Lines changed: 12 additions & 6 deletions
@@ -13,7 +13,7 @@ Inference of Stable Diffusion and Flux in pure C/C++
1313
- SD1.x, SD2.x, SDXL and [SD3/SD3.5](./docs/sd3.md) support
1414
- !!!The VAE in SDXL encounters NaN issues under FP16, but unfortunately, the ggml_conv_2d only operates under FP16. Hence, a parameter is needed to specify the VAE that has fixed the FP16 NaN issue. You can find it here: [SDXL VAE FP16 Fix](https://huggingface.co/madebyollin/sdxl-vae-fp16-fix/blob/main/sdxl_vae.safetensors).
1515
- [Flux-dev/Flux-schnell Support](./docs/flux.md)
16-
16+
- [FLUX.1-Kontext-dev](./docs/kontext.md)
1717
- [SD-Turbo](https://huggingface.co/stabilityai/sd-turbo) and [SDXL-Turbo](https://huggingface.co/stabilityai/sdxl-turbo) support
1818
- [PhotoMaker](https://github.com/TencentARC/PhotoMaker) support.
1919
- 16-bit, 32-bit float support
@@ -220,7 +220,7 @@ arguments:
220220
-m, --model [MODEL] path to full model
221221
--diffusion-model path to the standalone diffusion model
222222
--clip_l path to the clip-l text encoder
223-
--clip_g path to the clip-l text encoder
223+
--clip_g path to the clip-g text encoder
224224
--t5xxl path to the t5xxl text encoder
225225
--vae [VAE] path to vae
226226
--taesd [TAESD_PATH] path to taesd. Using Tiny AutoEncoder for fast decoding (low quality)
@@ -231,26 +231,32 @@ arguments:
231231
--normalize-input normalize PHOTOMAKER input id images
232232
--upscale-model [ESRGAN_PATH] path to esrgan model. Upscale images after generate, just RealESRGAN_x4plus_anime_6B supported by now
233233
--upscale-repeats Run the ESRGAN upscaler this many times (default 1)
234-
--type [TYPE] weight type (f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0, q2_k, q3_k, q4_k)
234+
--type [TYPE] weight type (examples: f32, f16, q4_0, q4_1, q5_0, q5_1, q8_0, q2_K, q3_K, q4_K)
235235
If not specified, the default is the type of the weight file
236236
--lora-model-dir [DIR] lora model directory
237237
-i, --init-img [IMAGE] path to the input image, required by img2img
238+
--mask [MASK] path to the mask image, required by img2img with mask
238239
--control-image [IMAGE] path to image condition, control net
240+
-r, --ref_image [PATH] reference image for Flux Kontext models (can be used multiple times)
239241
-o, --output OUTPUT path to write result image to (default: ./output.png)
240242
-p, --prompt [PROMPT] the prompt to render
241243
-n, --negative-prompt PROMPT the negative prompt (default: "")
242244
--cfg-scale SCALE unconditional guidance scale: (default: 7.0)
245+
--guidance SCALE guidance scale for img2img (default: 3.5)
246+
--slg-scale SCALE skip layer guidance (SLG) scale, only for DiT models: (default: 0)
247+
0 means disabled, a value of 2.5 is nice for sd3.5 medium
248+
--eta SCALE eta in DDIM, only for DDIM and TCD: (default: 0)
243249
--skip-layers LAYERS Layers to skip for SLG steps: (default: [7,8,9])
244250
--skip-layer-start START SLG enabling point: (default: 0.01)
245251
--skip-layer-end END SLG disabling point: (default: 0.2)
246-
SLG will be enabled at step int([STEPS]*[START]) and disabled at int([STEPS]*[END])
252+
SLG will be enabled at step int([STEPS]*[START]) and disabled at int([STEPS]*[END])
247253
--strength STRENGTH strength for noising/unnoising (default: 0.75)
248254
--style-ratio STYLE-RATIO strength for keeping input identity (default: 20%)
249255
--control-strength STRENGTH strength to apply Control Net (default: 0.9)
250256
1.0 corresponds to full destruction of information in init image
251257
-H, --height H image height, in pixel space (default: 512)
252258
-W, --width W image width, in pixel space (default: 512)
253-
--sampling-method {euler, euler_a, heun, dpm2, dpm++2s_a, dpm++2m, dpm++2mv2, ipndm, ipndm_v, lcm}
259+
--sampling-method {euler, euler_a, heun, dpm2, dpm++2s_a, dpm++2m, dpm++2mv2, ipndm, ipndm_v, lcm, ddim_trailing, tcd}
254260
sampling method (default: "euler_a")
255261
--steps STEPS number of sample steps (default: 20)
256262
--rng {std_default, cuda} RNG (default: cuda)
@@ -267,7 +273,7 @@ arguments:
267273
This might crash if it is not supported by the backend.
268274
--control-net-cpu keep controlnet in cpu (for low vram)
269275
--canny apply canny preprocessor (edge detection)
270-
--color Colors the logging tags according to level
276+
--color colors the logging tags according to level
271277
-v, --verbose print extra info
272278
```
273279
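The SLG note in the help text above gives a formula for when skip layer guidance switches on and off. As a rough illustration of that arithmetic only, here is a minimal Python sketch. The function name `slg_window` is hypothetical and not part of stable-diffusion.cpp, and the actual C++ code may round differently because of floating-point type differences.

```python
from typing import Tuple


def slg_window(steps: int, start: float = 0.01, end: float = 0.2) -> Tuple[int, int]:
    """Apply the help-text formula: int(STEPS * START) and int(STEPS * END).

    The defaults mirror the documented defaults of --skip-layer-start (0.01)
    and --skip-layer-end (0.2). This helper is an illustration, not the
    project's implementation.
    """
    enable_step = int(steps * start)   # step at which SLG is enabled
    disable_step = int(steps * end)    # step at which SLG is disabled
    return enable_step, disable_step


if __name__ == "__main__":
    # With the documented default of 20 steps: int(20 * 0.01) and int(20 * 0.2).
    print(slg_window(20))
```

With the default 20 steps, the formula gives an enabling step of 0 and a disabling step of 4, so under the default settings SLG only affects the first few steps.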

assets/flux/kontext1_dev_output.png

496 KB

docs/kontext.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
# How to Use
2+
3+
You can run Kontext using stable-diffusion.cpp with a GPU that has 6GB or even 4GB of VRAM, without needing to offload to RAM.
4+
5+
## Download weights
6+
7+
- Download Kontext
8+
- If you don't want to do the conversion yourself, download the preconverted gguf model from [FLUX.1-Kontext-dev-GGUF](https://huggingface.co/QuantStack/FLUX.1-Kontext-dev-GGUF)
9+
- Otherwise, download FLUX.1-Kontext-dev from https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/flux1-kontext-dev.safetensors
10+
- Download vae from https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/ae.safetensors
11+
- Download clip_l from https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/clip_l.safetensors
12+
- Download t5xxl from https://huggingface.co/comfyanonymous/flux_text_encoders/blob/main/t5xxl_fp16.safetensors
13+
14+
## Convert Kontext weights
15+
16+
You can download the preconverted gguf weights from [FLUX.1-Kontext-dev-GGUF](https://huggingface.co/QuantStack/FLUX.1-Kontext-dev-GGUF) so that you don't have to do the conversion yourself. To convert the weights yourself, run:
17+
18+
```
19+
.\bin\Release\sd.exe -M convert -m ..\..\ComfyUI\models\unet\flux1-kontext-dev.safetensors -o ..\models\flux1-kontext-dev-q8_0.gguf -v --type q8_0
20+
```
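If you convert to several weight types, it can help to build the convert command programmatically. The following is a minimal Python sketch, not part of stable-diffusion.cpp: the helper name `convert_command` and the output naming scheme (`<model stem>-<type>.gguf`) are assumptions modeled on the command above. It only uses flags that appear in that command (`-M convert`, `-m`, `-o`, `-v`, `--type`).

```python
from pathlib import PureWindowsPath
from typing import List


def convert_command(sd_exe: str, src: str, out_dir: str,
                    weight_type: str = "q8_0") -> List[str]:
    """Build the argument list for a convert run (hypothetical helper).

    The output file name is derived from the source file name plus the
    weight type, mirroring the example command; that naming is a convention
    of this sketch, not a requirement of the tool.
    """
    stem = PureWindowsPath(src).stem  # e.g. "flux1-kontext-dev"
    out_path = PureWindowsPath(out_dir) / f"{stem}-{weight_type}.gguf"
    return [sd_exe, "-M", "convert",
            "-m", src,
            "-o", str(out_path),
            "-v", "--type", weight_type]


if __name__ == "__main__":
    cmd = convert_command(
        r".\bin\Release\sd.exe",
        r"..\..\ComfyUI\models\unet\flux1-kontext-dev.safetensors",
        r"..\models",
    )
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the binary and model files exist. `PureWindowsPath` is used here because the example paths use Windows separators; on other systems the paths would need adjusting.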
21+
22+
## Run
23+
24+
- Setting `--cfg-scale` to 1 is recommended.
25+
26+
### Example
27+
The following command edits the reference image `flux1-dev-q8_0.png` according to the prompt:
28+
29+
```
30+
.\bin\Release\sd.exe -M edit -r .\flux1-dev-q8_0.png --diffusion-model ..\models\flux1-kontext-dev-q8_0.gguf --vae ..\models\ae.sft --clip_l ..\models\clip_l.safetensors --t5xxl ..\models\t5xxl_fp16.safetensors -p "change 'flux.cpp' to 'kontext.cpp'" --cfg-scale 1.0 --sampling-method euler -v
31+
```
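The README help text says `-r` can be given multiple times, and the prompt above contains quotes that are easy to get wrong in a shell. A minimal Python sketch of assembling the same edit command as an argument list, under stated assumptions: `edit_command` is a hypothetical helper, not part of stable-diffusion.cpp, and its insistence on at least one reference image is a choice of this sketch rather than a documented rule.

```python
from typing import List, Sequence


def edit_command(sd_exe: str, ref_images: Sequence[str], diffusion_model: str,
                 vae: str, clip_l: str, t5xxl: str, prompt: str,
                 cfg_scale: float = 1.0,
                 sampling_method: str = "euler") -> List[str]:
    """Build the argument list for an edit run (hypothetical helper)."""
    if not ref_images:
        # Sketch-level check only; not taken from the tool's documentation.
        raise ValueError("expected at least one reference image")
    cmd = [sd_exe, "-M", "edit"]
    for ref in ref_images:
        cmd += ["-r", ref]  # one -r flag per reference image
    cmd += ["--diffusion-model", diffusion_model,
            "--vae", vae,
            "--clip_l", clip_l,
            "--t5xxl", t5xxl,
            "-p", prompt,  # passed as a single argument, no shell quoting needed
            "--cfg-scale", str(cfg_scale),
            "--sampling-method", sampling_method,
            "-v"]
    return cmd


if __name__ == "__main__":
    cmd = edit_command(
        r".\bin\Release\sd.exe",
        [r".\flux1-dev-q8_0.png"],
        r"..\models\flux1-kontext-dev-q8_0.gguf",
        r"..\models\ae.sft",
        r"..\models\clip_l.safetensors",
        r"..\models\t5xxl_fp16.safetensors",
        "change 'flux.cpp' to 'kontext.cpp'",
    )
    print(cmd)
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues, because the prompt travels as one argument regardless of the quotes inside it.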
32+
33+
34+
| ref_image | prompt | output |
35+
| ---- | ---- |---- |
36+
| ![](../assets/flux/flux1-dev-q8_0.png) | change 'flux.cpp' to 'kontext.cpp' |![](../assets/flux/kontext1_dev_output.png) |
37+
38+
39+
