
KAIST-Visual-AI-Group/Diffusion-2025-Demo


Instructor: Minhyuk Sung (mhsung [at] kaist.ac.kr)
TA: Seungwoo Yoo (dreamy1534 [at] kaist.ac.kr)


Description

In this demo session, you will gain hands-on experience with Stable Diffusion 3—one of the most powerful text-to-image models—while experimenting with three popular techniques: Classifier-Free Guidance, ControlNet, and LoRA.

(1) Classifier-Free Guidance improves the effect of user-provided conditional inputs, such as text prompts, on the generated images. In this demo, you will explore how changing the guidance scale influences the outputs.
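The guidance update itself is a simple linear combination of two noise predictions. The toy scalar sketch below (an illustration only; the real computation operates on full noise-prediction tensors inside the sampler) shows how the guidance scale interpolates toward, and then extrapolates past, the conditional prediction:

```python
# Classifier-free guidance combines the unconditional and conditional
# noise predictions at each denoising step:
#   eps = eps_uncond + s * (eps_cond - eps_uncond)
# s = 0 ignores the prompt, s = 1 reproduces the conditional prediction,
# and s > 1 extrapolates past it, strengthening the prompt's influence.
def cfg_combine(eps_uncond, eps_cond, scale):
    return eps_uncond + scale * (eps_cond - eps_uncond)

eps_uncond, eps_cond = 0.2, 0.8  # toy scalar stand-ins for the predictions
for s in (0.0, 1.0, 7.5):
    print(f"scale={s}: combined prediction = {cfg_combine(eps_uncond, eps_cond, s)}")
```

In the demo notebook, this corresponds to the `guidance_scale` argument of the pipeline call.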

(2) ControlNet further extends the capabilities of text-to-image diffusion models, such as Stable Diffusion, by allowing them to incorporate additional conditions beyond text prompts, such as sketches or depth maps. In this demo, we will present a minimal example of using ControlNet to generate images that follow depth maps.
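As a rough sketch of what depth conditioning looks like in diffusers, the function below wires a depth ControlNet into a text-to-image pipeline. Note this uses the Stable Diffusion 1.5 checkpoints (`lllyasviel/sd-controlnet-depth`, `runwayml/stable-diffusion-v1-5`) for brevity, not the SD3 pipeline used in the notebook; treat it as an illustrative outline rather than the demo's exact code:

```python
def depth_controlnet_sketch(depth_image, prompt, device="cuda"):
    """Minimal ControlNet sketch: generate an image that follows a depth map.

    `depth_image` is a PIL image of the conditioning depth map.
    Imports are kept local so the sketch stays self-contained.
    """
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Load a depth-conditioned ControlNet and attach it to the base pipeline.
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to(device)

    # The depth map is passed as the `image` conditioning input.
    return pipe(prompt, image=depth_image).images[0]
```

Running this requires a CUDA-capable GPU and downloads the checkpoints from Hugging Face on first use.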

(3) LoRA (Low-Rank Adaptation) is an efficient fine-tuning technique for neural networks that enables the customization of diffusion models with relatively small datasets, ranging from a few images to a few thousand. This repository provides a minimal example, adapted from a LoRA implementation for Stable Diffusion 3.5, to support future projects in visual content creation.
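LoRA's efficiency comes from replacing a full weight update with a product of two much smaller matrices. The arithmetic below (using illustrative layer sizes, not SD3's actual dimensions) shows why so few parameters need training:

```python
# LoRA replaces a full weight update dW of shape (d_out, d_in) with a
# low-rank product B @ A, where B is (d_out, r), A is (r, d_in), and
# r << min(d_out, d_in). Only A and B are trained.
d_out, d_in, r = 1024, 1024, 8  # illustrative sizes

full_update_params = d_out * d_in           # parameters in a full update
lora_params = d_out * r + r * d_in          # parameters LoRA actually trains

print(f"full update: {full_update_params:,} params")
print(f"LoRA (r={r}): {lora_params:,} params "
      f"({full_update_params // lora_params}x fewer)")
```

This is why a few images to a few thousand can suffice: the trainable parameter count shrinks by orders of magnitude, so far less data is needed to fit it.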

This material is heavily based on the diffusers library. You are strongly encouraged to consult materials beyond the scope of this demo, as they will be valuable for your projects.

Setup

Install the required packages listed in requirements.txt.

NOTE: Install PyTorch according to the CUDA version of your environment (See PyTorch Previous Versions)

conda create -n cs492c python=3.10
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install -r requirements.txt

Before proceeding with the demo, we will have a look at Hugging Face, an open-source platform that serves as a hub for machine learning applications, and Diffusers, a go-to library for pretrained diffusion models maintained by Hugging Face. As we'll be downloading the pretrained Stable Diffusion model from Hugging Face, you'll need an access token.

Before running the demo, please do the following:

  • Sign into Hugging Face.
  • Obtain your Access Token at https://huggingface.co/settings/tokens.
  • In your terminal, log into Hugging Face by running $ huggingface-cli login and entering your Access Token.
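If you prefer not to use the CLI, the same login can be done from Python via `huggingface_hub.login` (huggingface_hub is installed as a dependency of diffusers). A small wrapper as a sketch:

```python
def hf_login(token: str) -> None:
    """Programmatic alternative to `huggingface-cli login`.

    Pass your Hugging Face access token; never hard-code or commit it.
    """
    from huggingface_hub import login

    login(token=token)
```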

You can check whether you have access to Hugging Face with the code below, which downloads Stable Diffusion from Hugging Face and generates an image with it.

import torch
from diffusers import StableDiffusionPipeline

model_id = "CompVis/stable-diffusion-v1-4"
device = "cuda"

pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
image.save("astronaut_rides_horse.png")

All set! Please open up the Jupyter Notebook named demo_cfg_controlnet.ipynb to get started!

❗Ethical Usage

While you are encouraged to explore creative possibilities using the above methods, it is crucial that you do not use these personalization techniques for harmful purposes, such as generating content that includes nudity or violence, or that targets specific individuals or groups. It is your responsibility to ensure that these methods are applied ethically.

Credits

This repository is built primarily on the Diffusers library. We also thank the authors of the following resources:

Further Readings

If you are interested in this topic, we encourage you to check out the materials below.
