FP8 compatibility / quantized model weights for 24 GB GPUs? #4

@jeolpyeoni

Description

Hi, thanks for the great work! I've been experimenting with it and ran into a VRAM constraint.

The transformer is ~38 GB in bf16, which exceeds the 24 GB VRAM of common GPUs like the RTX A5000/3090/4090. A community fp8 quantization exists (https://huggingface.co/1038lab/Qwen-Image-Edit-2511-FP8) which brings it down to ~20 GB, potentially fitting within 24 GB.
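For context, a quick back-of-envelope check of those numbers, assuming the checkpoint size is dominated by parameter storage (2 bytes/param in bf16, 1 byte/param in fp8):

```python
# Rough footprint estimate: bf16 stores 2 bytes per parameter, fp8 stores 1.
BF16_GB = 38.0                # reported bf16 transformer size
BYTES_BF16, BYTES_FP8 = 2, 1

params_billions = BF16_GB / BYTES_BF16   # implied parameter count, in billions
fp8_gb = params_billions * BYTES_FP8     # weight storage alone in fp8

print(f"~{params_billions:.0f}B params, ~{fp8_gb:.0f} GB of fp8 weights")
```

That gives ~19 GB for the weights themselves, consistent with the ~20 GB community checkpoint once quantization scales and non-quantized layers are included.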

A few questions:

  1. Have you tested PixelSmile's LoRA weights with the fp8-quantized base model? Any noticeable quality degradation?
  2. Are you planning to release an officially validated quantized version (fp8/int8) of the base model or LoRA?
  3. Is there a recommended workaround for sub-24 GB inference in the meantime?
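On question 3, one stopgap I've been considering is diffusers' model CPU offload, which moves submodules to the GPU one at a time and can cut peak VRAM at the cost of speed. A minimal sketch, with the caveats that the base repo id here is my assumption (inferred from the fp8 checkpoint's name) and I haven't validated it with your LoRA:

```python
import torch
from diffusers import DiffusionPipeline

# Sketch only: "Qwen/Qwen-Image-Edit-2511" is an assumed base repo id,
# not confirmed by this project.
pipe = DiffusionPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2511",
    torch_dtype=torch.bfloat16,
)

# Keeps each submodule on CPU until it is needed on the GPU,
# trading inference speed for a lower peak VRAM footprint.
pipe.enable_model_cpu_offload()
```

Whether that alone gets a ~38 GB bf16 transformer under 24 GB in practice is exactly what I'm unsure about, hence the question.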

Thanks!
