
Commit 2695bb9

Merge pull request #199 from huggingface/main
Merge changes
2 parents 2a1f43f + 7b100ce commit 2695bb9

File tree

304 files changed: +11130 −2378 lines changed


.github/workflows/nightly_tests.yml

Lines changed: 3 additions & 3 deletions

```diff
@@ -265,7 +265,7 @@ jobs:
 
       - name: Run PyTorch CUDA tests
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
           # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
           CUBLAS_WORKSPACE_CONFIG: :16:8
         run: |
@@ -505,7 +505,7 @@ jobs:
 #        shell: arch -arch arm64 bash {0}
 #        env:
 #          HF_HOME: /System/Volumes/Data/mnt/cache
-#          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+#          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
 #        run: |
 #          ${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps \
 #            --report-log=tests_torch_mps.log \
@@ -561,7 +561,7 @@ jobs:
 #        shell: arch -arch arm64 bash {0}
 #        env:
 #          HF_HOME: /System/Volumes/Data/mnt/cache
-#          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+#          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
 #        run: |
 #          ${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps \
 #            --report-log=tests_torch_mps.log \
```

.github/workflows/push_tests.yml

Lines changed: 5 additions & 5 deletions

```diff
@@ -187,7 +187,7 @@ jobs:
 
       - name: Run Flax TPU tests
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
         run: |
           python -m pytest -n 0 \
             -s -v -k "Flax" \
@@ -235,7 +235,7 @@ jobs:
 
       - name: Run ONNXRuntime CUDA tests
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
         run: |
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "Onnx" \
@@ -283,7 +283,7 @@ jobs:
           python utils/print_env.py
       - name: Run example tests on GPU
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
           RUN_COMPILE: yes
         run: |
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
@@ -326,7 +326,7 @@ jobs:
           python utils/print_env.py
       - name: Run example tests on GPU
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
         run: |
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "xformers" --make-reports=tests_torch_xformers_cuda tests/
       - name: Failure short reports
@@ -372,7 +372,7 @@ jobs:
 
       - name: Run example tests on GPU
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
         run: |
           python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m uv pip install timm
```

.github/workflows/release_tests_fast.yml

Lines changed: 8 additions & 8 deletions

```diff
@@ -81,7 +81,7 @@ jobs:
           python utils/print_env.py
       - name: Slow PyTorch CUDA checkpoint tests on Ubuntu
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
           # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
           CUBLAS_WORKSPACE_CONFIG: :16:8
         run: |
@@ -135,7 +135,7 @@ jobs:
 
       - name: Run PyTorch CUDA tests
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
           # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
           CUBLAS_WORKSPACE_CONFIG: :16:8
         run: |
@@ -186,7 +186,7 @@ jobs:
 
       - name: Run PyTorch CUDA tests
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
           # https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
           CUBLAS_WORKSPACE_CONFIG: :16:8
         run: |
@@ -241,7 +241,7 @@ jobs:
 
       - name: Run slow Flax TPU tests
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
         run: |
           python -m pytest -n 0 \
             -s -v -k "Flax" \
@@ -289,7 +289,7 @@ jobs:
 
       - name: Run slow ONNXRuntime CUDA tests
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
         run: |
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
             -s -v -k "Onnx" \
@@ -337,7 +337,7 @@ jobs:
           python utils/print_env.py
       - name: Run example tests on GPU
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
           RUN_COMPILE: yes
         run: |
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
@@ -380,7 +380,7 @@ jobs:
           python utils/print_env.py
       - name: Run example tests on GPU
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
         run: |
           python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "xformers" --make-reports=tests_torch_xformers_cuda tests/
       - name: Failure short reports
@@ -426,7 +426,7 @@ jobs:
 
       - name: Run example tests on GPU
         env:
-          HF_TOKEN: ${{ secrets.HF_TOKEN }}
+          HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
         run: |
           python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
           python -m uv pip install timm
```

docs/source/en/_toctree.yml

Lines changed: 8 additions & 0 deletions

```diff
@@ -79,6 +79,8 @@
   - sections:
     - local: using-diffusers/cogvideox
       title: CogVideoX
+    - local: using-diffusers/consisid
+      title: ConsisID
     - local: using-diffusers/sdxl
       title: Stable Diffusion XL
     - local: using-diffusers/sdxl_turbo
@@ -270,6 +272,8 @@
       title: AuraFlowTransformer2DModel
     - local: api/models/cogvideox_transformer3d
       title: CogVideoXTransformer3DModel
+    - local: api/models/consisid_transformer3d
+      title: ConsisIDTransformer3DModel
     - local: api/models/cogview3plus_transformer2d
       title: CogView3PlusTransformer2DModel
     - local: api/models/dit_transformer2d
@@ -372,6 +376,8 @@
       title: CogVideoX
     - local: api/pipelines/cogview3
       title: CogView3
+    - local: api/pipelines/consisid
+      title: ConsisID
     - local: api/pipelines/consistency_models
       title: Consistency Models
     - local: api/pipelines/controlnet
@@ -592,6 +598,8 @@
       title: Attention Processor
     - local: api/activations
       title: Custom activation functions
+    - local: api/cache
+      title: Caching methods
     - local: api/normalization
       title: Custom normalization layers
     - local: api/utilities
```

docs/source/en/api/cache.md

Lines changed: 49 additions & 0 deletions (new file)

<!-- Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# Caching methods

## Pyramid Attention Broadcast

[Pyramid Attention Broadcast](https://huggingface.co/papers/2408.12588) from Xuanlei Zhao, Xiaolong Jin, Kai Wang, Yang You.

Pyramid Attention Broadcast (PAB) is a method that speeds up inference in diffusion models by systematically skipping attention computations between successive inference steps and reusing cached attention states. The attention states change little between successive inference steps: the difference is most prominent in the spatial attention blocks, smaller in the temporal attention blocks, and smallest in the cross attention blocks. Therefore, many cross attention computations can be skipped, followed by the temporal and spatial attention computations. By combining other techniques like sequence parallelism and classifier-free guidance parallelism, PAB achieves near real-time video generation.

Enable PAB with [`~PyramidAttentionBroadcastConfig`] on any pipeline. For some benchmarks, refer to [this](https://github.com/huggingface/diffusers/pull/9562) pull request.

```python
import torch
from diffusers import CogVideoXPipeline, PyramidAttentionBroadcastConfig

pipe = CogVideoXPipeline.from_pretrained("THUDM/CogVideoX-5b", torch_dtype=torch.bfloat16)
pipe.to("cuda")

# Increasing the value of `spatial_attention_timestep_skip_range[0]` or decreasing the value of
# `spatial_attention_timestep_skip_range[1]` will decrease the interval in which pyramid attention
# broadcast is active, leading to slower inference speeds. However, large intervals can lead to
# poorer quality of generated videos.
config = PyramidAttentionBroadcastConfig(
    spatial_attention_block_skip_range=2,
    spatial_attention_timestep_skip_range=(100, 800),
    current_timestep_callback=lambda: pipe.current_timestep,
)
pipe.transformer.enable_cache(config)
```
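
Caching can be turned back off through [`CacheMixin`], documented below; a minimal sketch, assuming the cache was enabled on the transformer as above:

```python
# Remove the PAB hooks and restore the transformer's original attention behavior.
pipe.transformer.disable_cache()
```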

### CacheMixin

[[autodoc]] CacheMixin

### PyramidAttentionBroadcastConfig

[[autodoc]] PyramidAttentionBroadcastConfig

[[autodoc]] apply_pyramid_attention_broadcast

docs/source/en/api/models/consisid_transformer3d.md

Lines changed: 30 additions & 0 deletions (new file)

<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->

# ConsisIDTransformer3DModel

A Diffusion Transformer model for 3D data from [ConsisID](https://github.com/PKU-YuanGroup/ConsisID), introduced in [Identity-Preserving Text-to-Video Generation by Frequency Decomposition](https://arxiv.org/pdf/2411.17440) by researchers from Peking University, the University of Rochester, and other institutions.

The model can be loaded with the following code snippet.

```python
import torch
from diffusers import ConsisIDTransformer3DModel

transformer = ConsisIDTransformer3DModel.from_pretrained("BestWishYsh/ConsisID-preview", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
```

## ConsisIDTransformer3DModel

[[autodoc]] ConsisIDTransformer3DModel

## Transformer2DModelOutput

[[autodoc]] models.modeling_outputs.Transformer2DModelOutput

docs/source/en/api/pipelines/consisid.md

Lines changed: 60 additions & 0 deletions (new file)

<!--Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
-->

# ConsisID

[Identity-Preserving Text-to-Video Generation by Frequency Decomposition](https://arxiv.org/abs/2411.17440), from Peking University, the University of Rochester, and other institutions, by Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyang Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan.

The abstract from the paper is:

*Identity-preserving text-to-video (IPT2V) generation aims to create high-fidelity videos with consistent human identity. It is an important task in video generation but remains an open problem for generative models. This paper pushes the technical frontier of IPT2V in two directions that have not been resolved in the literature: (1) A tuning-free pipeline without tedious case-by-case finetuning, and (2) A frequency-aware heuristic identity-preserving Diffusion Transformer (DiT)-based control scheme. To achieve these goals, we propose **ConsisID**, a tuning-free DiT-based controllable IPT2V model to keep human-**id**entity **consis**tent in the generated video. Inspired by prior findings in frequency analysis of vision/diffusion transformers, it employs identity-control signals in the frequency domain, where facial features can be decomposed into low-frequency global features (e.g., profile, proportions) and high-frequency intrinsic features (e.g., identity markers that remain unaffected by pose changes). First, from a low-frequency perspective, we introduce a global facial extractor, which encodes the reference image and facial key points into a latent space, generating features enriched with low-frequency information. These features are then integrated into the shallow layers of the network to alleviate training challenges associated with DiT. Second, from a high-frequency perspective, we design a local facial extractor to capture high-frequency details and inject them into the transformer blocks, enhancing the model's ability to preserve fine-grained features. To leverage the frequency information for identity preservation, we propose a hierarchical training strategy, transforming a vanilla pre-trained video generation model into an IPT2V model. Extensive experiments demonstrate that our frequency-aware heuristic scheme provides an optimal control solution for DiT-based models. Thanks to this scheme, our **ConsisID** achieves excellent results in generating high-quality, identity-preserving videos, making strides towards more effective IPT2V. The model weight of ConsisID is publicly available at https://github.com/PKU-YuanGroup/ConsisID.*

<Tip>

Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers.md) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading.md#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.

</Tip>

This pipeline was contributed by [SHYuanBest](https://github.com/SHYuanBest). The original codebase can be found [here](https://github.com/PKU-YuanGroup/ConsisID). The original weights can be found under [hf.co/BestWishYsh](https://huggingface.co/BestWishYsh).

There are two official ConsisID checkpoints for identity-preserving text-to-video.

| checkpoints | recommended inference dtype |
|:---:|:---:|
| [`BestWishYsh/ConsisID-preview`](https://huggingface.co/BestWishYsh/ConsisID-preview) | torch.bfloat16 |
| [`BestWishYsh/ConsisID-1.5`](https://huggingface.co/BestWishYsh/ConsisID-preview) | torch.bfloat16 |

### Memory optimization

ConsisID requires about 44 GB of GPU memory to decode 49 frames (6 seconds of video at 8 FPS) with output resolution 720x480 (W x H), which makes it impossible to run on consumer GPUs or free-tier T4 Colab. The following memory optimizations can be used to reduce the memory footprint. For replication, you can refer to [this](https://gist.github.com/SHYuanBest/bc4207c36f454f9e969adbb50eaf8258) script.

| Feature (applied on top of the previous) | Max Memory Allocated | Max Memory Reserved |
| :--------------------------------------- | :------------------- | :------------------ |
| - | 37 GB | 44 GB |
| enable_model_cpu_offload | 22 GB | 25 GB |
| enable_sequential_cpu_offload | 16 GB | 22 GB |
| vae.enable_slicing | 16 GB | 22 GB |
| vae.enable_tiling | 5 GB | 7 GB |
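
The rows in the table stack, each optimization applied on top of the previous one. A minimal sketch of wiring them up, assuming the `BestWishYsh/ConsisID-preview` checkpoint from the table above (exact savings depend on hardware; see the linked gist for the measurement script):

```python
import torch
from diffusers import ConsisIDPipeline

pipe = ConsisIDPipeline.from_pretrained("BestWishYsh/ConsisID-preview", torch_dtype=torch.bfloat16)

# Offload whole sub-models to the CPU between forward passes; swap in
# pipe.enable_sequential_cpu_offload() instead for lower peaks at slower speed.
pipe.enable_model_cpu_offload()

# Decode the latent video in slices and tiles to cut the VAE's memory peak.
pipe.vae.enable_slicing()
pipe.vae.enable_tiling()
```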

## ConsisIDPipeline

[[autodoc]] ConsisIDPipeline
  - all
  - __call__

## ConsisIDPipelineOutput

[[autodoc]] pipelines.consisid.pipeline_output.ConsisIDPipelineOutput

docs/source/en/api/pipelines/flux.md

Lines changed: 47 additions & 0 deletions

@@ -309,6 +309,53 @@ image.save("output.png")

When unloading the Control LoRA weights, call `pipe.unload_lora_weights(reset_to_overwritten_params=True)` to reset the `pipe.transformer` completely back to its original form. The resultant pipeline can then be used with methods like [`DiffusionPipeline.from_pipe`]. More details about this argument are available in [this PR](https://github.com/huggingface/diffusers/pull/10397).

## IP-Adapter

<Tip>

Check out [IP-Adapter](../../../using-diffusers/ip_adapter) to learn more about how IP-Adapters work.

</Tip>

An IP-Adapter lets you prompt Flux with images in addition to the text prompt. This is especially useful for complex concepts that are difficult to articulate through text alone and for which you have reference images.

```python
import torch
from diffusers import FluxPipeline
from diffusers.utils import load_image

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flux_ip_adapter_input.jpg").resize((1024, 1024))

pipe.load_ip_adapter(
    "XLabs-AI/flux-ip-adapter",
    weight_name="ip_adapter.safetensors",
    image_encoder_pretrained_model_name_or_path="openai/clip-vit-large-patch14"
)
pipe.set_ip_adapter_scale(1.0)

image = pipe(
    width=1024,
    height=1024,
    prompt="wearing sunglasses",
    negative_prompt="",
    true_cfg=4.0,
    generator=torch.Generator().manual_seed(4444),
    ip_adapter_image=image,
).images[0]

image.save('flux_ip_adapter_output.jpg')
```

<div class="justify-center">
  <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flux_ip_adapter_output.jpg"/>
  <figcaption class="mt-2 text-sm text-center text-gray-500">IP-Adapter example with the prompt "wearing sunglasses"</figcaption>
</div>
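
The scale set with `set_ip_adapter_scale` balances the reference image against the text prompt; `1.0` follows the reference closely. A hedged follow-up sketch (the `0.6` value and second prompt are illustrative, not from the original section), lowering the scale to hand more control back to the text:

```python
# Reload the reference, since `image` above now holds the generated result.
ref = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/flux_ip_adapter_input.jpg").resize((1024, 1024))

# Weaken the image conditioning so the text prompt dominates.
pipe.set_ip_adapter_scale(0.6)
image = pipe(
    prompt="wearing sunglasses, studio portrait",
    negative_prompt="",
    true_cfg=4.0,
    ip_adapter_image=ref,
).images[0]
```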

## Running FP16 inference

Flux can generate high-quality images with FP16 (i.e. to accelerate inference on Turing/Volta GPUs) but produces different outputs compared to FP32/BF16. The issue is that some activations in the text encoders have to be clipped when running in FP16, which affects the overall image. Forcing text encoders to run with FP32 inference thus removes this output difference. See [here](https://github.com/huggingface/diffusers/pull/9097#issuecomment-2272292516) for details.

docs/source/en/api/pipelines/mochi.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -115,7 +115,7 @@ export_to_video(frames, "mochi.mp4", fps=30)
 
 ## Reproducing the results from the Genmo Mochi repo
 
-The [Genmo Mochi implementation](https://github.com/genmoai/mochi/tree/main) uses different precision values for each stage in the inference process. The text encoder and VAE use `torch.float32`, while the DiT uses `torch.bfloat16` with the [attention kernel](https://pytorch.org/docs/stable/generated/torch.nn.attention.sdpa_kernel.html#torch.nn.attention.sdpa_kernel) set to `EFFICIENT_ATTENTION`. Diffusers pipelines currently do not support setting different `dtypes` for different stages of the pipeline. In order to run inference in the same way as the the original implementation, please refer to the following example.
+The [Genmo Mochi implementation](https://github.com/genmoai/mochi/tree/main) uses different precision values for each stage in the inference process. The text encoder and VAE use `torch.float32`, while the DiT uses `torch.bfloat16` with the [attention kernel](https://pytorch.org/docs/stable/generated/torch.nn.attention.sdpa_kernel.html#torch.nn.attention.sdpa_kernel) set to `EFFICIENT_ATTENTION`. Diffusers pipelines currently do not support setting different `dtypes` for different stages of the pipeline. In order to run inference in the same way as the original implementation, please refer to the following example.
 
 <Tip>
 The original Mochi implementation zeros out empty prompts. However, enabling this option and placing the entire pipeline under autocast can lead to numerical overflows with the T5 text encoder.
```

docs/source/en/api/utilities.md

Lines changed: 4 additions & 0 deletions

```diff
@@ -41,3 +41,7 @@ Utility and helper functions for working with 🤗 Diffusers.
 ## randn_tensor
 
 [[autodoc]] utils.torch_utils.randn_tensor
+
+## apply_layerwise_casting
+
+[[autodoc]] hooks.layerwise_casting.apply_layerwise_casting
```
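
The newly documented `apply_layerwise_casting` stores a module's weights in a low-precision dtype and upcasts them to a compute dtype on the fly during forward passes. A minimal sketch, assuming the CogVideoX-5b transformer and the FP8/BF16 dtype pair as illustrative choices:

```python
import torch
from diffusers import CogVideoXTransformer3DModel
from diffusers.hooks import apply_layerwise_casting

transformer = CogVideoXTransformer3DModel.from_pretrained(
    "THUDM/CogVideoX-5b", subfolder="transformer", torch_dtype=torch.bfloat16
)

# Store weights in float8 and upcast to bfloat16 for each forward pass,
# trading a little compute for a large cut in weight memory.
apply_layerwise_casting(
    transformer,
    storage_dtype=torch.float8_e4m3fn,
    compute_dtype=torch.bfloat16,
)
```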
