littlewhitesea/training-free-methods
training-free-methods

This repository collects recent training-free algorithms for image generation and manipulation that can run on a single GPU with no more than 24 GB of memory.

Model Acceleration

PFDiff: Training-free Acceleration of Diffusion Models through the Gradient Guidance of Past and Future.
Guangyi Wang, Yuren Cai, Lijiang Li, Wei Peng, Songzhi Su.
arXiv 2024. [PDF]

Multimodal Large Language Models

ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models.
Mingrui Wu, Xinyue Cai, Jiayi Ji, Jiale Li, Oucheng Huang, Gen Luo, Hao Fei, Xiaoshuai Sun, Rongrong Ji.
arXiv 2024. [PDF] [Code]

Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs.
Shi Liu, Kecheng Zheng, Wei Chen.
ECCV 2024. [PDF] [Project] [Code]

Material Transfer

ZeST: Zero-Shot Material Transfer from a Single Image.
Ta-Ying Cheng, Prafull Sharma, Andrew Markham, Niki Trigoni, Varun Jampani.
ECCV 2024. [PDF] [Project] [Code]

Style Transfer

Artist: Aesthetically Controllable Text-Driven Stylization without Training.
Ruixiang Jiang, Changwen Chen.
arXiv 2024. [PDF] [Project] [Code]

Visual Style Prompting with Swapping Self-Attention.
Jaeseok Jeong, Junho Kim, Yunjey Choi, Gayoung Lee, Youngjung Uh.
arXiv 2024. [PDF] [Project] [Code]

FreeStyle: Free Lunch for Text-guided Style Transfer using Diffusion Models.
Feihong He, Gang Li, Mengyuan Zhang, Leilei Yan, Lingyu Si, Fanzhang Li.
arXiv 2024. [PDF] [Project] [Code]

Eye-for-an-eye: Appearance Transfer with Semantic Correspondence in Diffusion Models.
Sooyeon Go, Kyungmook Choi, Minjung Shin, Youngjung Uh.
arXiv 2024. [PDF] [Project] [Code]

Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance.
Kuan Heng Lin, Sicheng Mo, Ben Klingher, Fangzhou Mu, Bolei Zhou.
NeurIPS 2024. [PDF] [Project] [Code]

RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control.
Litu Rout, Yujia Chen, Nataniel Ruiz, Abhishek Kumar, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu.
arXiv 2024. [PDF] [Project] [Code]

Tuning-Free Adaptive Style Incorporation for Structure-Consistent Text-Driven Style Transfer.
Yanqi Ge, Jiaqi Liu, Qingnan Fan, Xi Jiang, Ye Huang, Shuai Qin, Hong Gu, Wen Li, Lixin Duan.
arXiv 2024. [PDF]

Style Injection in Diffusion: A Training-free Approach for Adapting Large-scale Diffusion Models for Style Transfer.
Jiwoo Chung, Sangeek Hyun, Jae-Pil Heo.
CVPR 2024. [PDF] [Project] [Code]

Image Generation

AP-LDM: Attentive and Progressive Latent Diffusion Model for Training-Free High-Resolution Image Generation.
Boyuan Cao, Jiaxin Ye, Yujie Wei, Hongming Shan.
arXiv 2024. [PDF] [Code]

Training-free Color-Style Disentanglement for Constrained Text-to-Image Synthesis.
Aishwarya Agarwal, Srikrishna Karanam, Balaji Vasan Srinivasan.
arXiv 2024. [PDF]

Smoothed Energy Guidance: Guiding Diffusion Models with Reduced Energy Curvature of Attention.
Susung Hong.
NeurIPS 2024. [PDF] [Code]

MegaFusion: Extend Diffusion Models towards Higher-resolution Image Generation without Further Tuning.
Haoning Wu, Shaocheng Shen, Qiang Hu, Xiaoyun Zhang, Ya Zhang, Yanfeng Wang.
arXiv 2024. [PDF] [Project] [Code]

DiffuseHigh: Training-free Progressive High-Resolution Image Synthesis through Structure Guidance.
Younghyun Kim, Geunmin Hwang, Junyu Zhang, Eunbyung Park.
arXiv 2024. [PDF] [Project] [Code]

TraDiffusion: Trajectory-Based Training-Free Image Generation.
Mingrui Wu, Oucheng Huang, Jiayi Ji, Jiale Li, Xinyue Cai, Huafeng Kuang, Jianzhuang Liu, Xiaoshuai Sun, Rongrong Ji.
arXiv 2024. [PDF] [Code]

MagicFace: Training-free Universal-Style Human Image Customized Synthesis.
Yibin Wang, Weizhong Zhang, Cheng Jin.
arXiv 2024. [PDF] [Project] [Code]

AccDiffusion: An Accurate Method for Higher-Resolution Image Generation.
Zhihang Lin, Mingbao Lin, Meng Zhao, Rongrong Ji.
ECCV 2024. [PDF] [Project] [Code]

ResMaster: Mastering High-Resolution Image Generation via Structural and Fine-Grained Guidance.
Shuwei Shi, Wenbo Li, Yuechen Zhang, Jingwen He, Biao Gong, Yinqiang Zheng.
arXiv 2024. [PDF] [Project] [Code]

Coherent Zero-Shot Visual Instruction Generation.
Quynh Phung, Songwei Ge, Jia-Bin Huang.
arXiv 2024. [PDF] [Project]

FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition.
Ganggui Ding, Canyu Zhao, Wen Wang, Zhen Yang, Zide Liu, Hao Chen, Chunhua Shen.
CVPR 2024. [PDF] [Code]

Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation.
Shengyuan Liu, Bo Wang, Ye Ma, Te Yang, Xipeng Cao, Quan Chen, Han Li, Di Dong, Peng Jiang.
arXiv 2024. [PDF]

DemoFusion: Democratising High-Resolution Image Generation With No $$$.
Ruoyi Du, Dongliang Chang, Timothy Hospedales, Yi-Zhe Song, Zhanyu Ma.
CVPR 2024. [PDF] [Project] [Code]

HiDiffusion: Unlocking High-Resolution Creativity and Efficiency in Low-Resolution Trained Diffusion Models.
Shen Zhang, Zhaowei Chen, Zhenyu Zhao, Zhenyuan Chen, Yao Tang, Yuhao Chen, Wengang Cao, Jiajun Liang.
ECCV 2024. [PDF] [Project] [Code]

Training-Free Consistent Text-to-Image Generation.
Yoad Tewel, Omri Kaduri, Rinon Gal, Yoni Kasten, Lior Wolf, Gal Chechik, Yuval Atzmon.
arXiv 2024. [PDF] [Project]

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis.
Linjiang Huang, Rongyao Fang, Aiping Zhang, Guanglu Song, Si Liu, Yu Liu, Hongsheng Li.
ECCV 2024. [PDF] [Code]

Image Manipulation

360PanT: Training-Free Text-Driven 360-Degree Panorama-to-Panorama Translation.
Hai Wang, Jing-Hao Xue.
WACV 2025. [PDF] [Project] [Code]

OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model.
Runyi Li, Xuhan Sheng, Weiqi Li, Jian Zhang.
ECCV 2024. [PDF] [Project] [Code]

Guide-and-Rescale: Self-Guidance Mechanism for Effective Tuning-Free Real Image Editing.
Vadim Titov, Madina Khalmatova, Alexandra Ivanova, Dmitry Vetrov, Aibek Alanov.
arXiv 2024. [PDF] [Code]

TALE: Training-free Cross-domain Image Composition via Adaptive Latent Manipulation and Energy-guided Optimization.
Kien T. Pham, Jingye Chen, Qifeng Chen.
ACM MM 2024. [PDF] [Project] [Code]

Faster Diffusion via Temporal Attention Decomposition.
Haozhe Liu, Wentian Zhang, Jinheng Xie, Francesco Faccio, Mengmeng Xu, Tao Xiang, Mike Zheng Shou, Juan-Manuel Perez-Rua, Jürgen Schmidhuber.
arXiv 2024. [PDF] [Code]

DiffUHaul: A Training-Free Method for Object Dragging in Images.
Omri Avrahami, Rinon Gal, Gal Chechik, Ohad Fried, Dani Lischinski, Arash Vahdat, Weili Nie.
arXiv 2024. [PDF] [Project]

Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model.
Zheng Gu, Shiyuan Yang, Jing Liao, Jing Huo, Yang Gao.
SIGGRAPH 2024. [PDF] [Project] [Code]

ObjectAdd: Adding Objects into Image via a Training-Free Diffusion Modification Fashion.
Ziyue Zhang, Mingbao Lin, Rongrong Ji.
arXiv 2024. [PDF]

CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method.
Mingbao Lin, Zhihang Lin, Wengyi Zhan, Liujuan Cao, Rongrong Ji.
arXiv 2024. [PDF] [Project]

FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models.
Wei Wu, Qingnan Fan, Shuai Qin, Hong Gu, Ruoyu Zhao, Antoni B. Chan.
arXiv 2024. [PDF]

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition.
Sicheng Mo, Fangzhou Mu, Kuan Heng Lin, Yanli Liu, Bochen Guan, Yin Li, Bolei Zhou.
CVPR 2024. [PDF] [Project] [Code]

Plug-and-Play Diffusion Features for Text-Driven Image-to-Image Translation.
Narek Tumanyan, Michal Geyer, Shai Bagon, Tali Dekel.
CVPR 2023. [PDF] [Project] [Code]

Video Generation

MotionMaster: Training-free Camera Motion Transfer For Video Generation.
Teng Hu, Jiangning Zhang, Ran Yi, Yating Wang, Hongrui Huang, Jieyu Weng, Yabiao Wang, Lizhuang Ma.
ACM MM 2024. [PDF] [Code]

Video Editing

Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices.
Nathaniel Cohen, Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli.
ICML 2024. [PDF] [Project] [Code]

FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing.
Yuren Cong, Mengmeng Xu, Christian Simon, Shoufa Chen, Jiawei Ren, Yanping Xie, Juan-Manuel Perez-Rua, Bodo Rosenhahn, Tao Xiang, Sen He.
ICLR 2024. [PDF] [Project] [Code]

TokenFlow: Consistent Diffusion Features for Consistent Video Editing.
Michal Geyer, Omer Bar-Tal, Shai Bagon, Tali Dekel.
ICLR 2024. [PDF] [Project] [Code]
