JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

xhan77/jpeg-lm


JPEG-LM

JPEG-LM generates the bytes of an image file autoregressively, the same way a language model generates text. Data preprocessing and inference are both handled by run.py.
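
To make the "image file bytes like language" idea concrete, here is a minimal sketch. It is not the repository's code: it assumes one token per byte and assumes that a prefix ratio selects the leading fraction of the sequence as the prompt. The helper names `bytes_to_tokens`, `tokens_to_bytes`, and `split_prefix` are hypothetical, and the actual tokenization in run.py may differ.

```python
# Illustrative sketch only: treats each byte of an image file as one token.
# All helper names are hypothetical; run.py's real tokenization may differ.


def bytes_to_tokens(data: bytes) -> list[int]:
    """Map file bytes to integer token ids in the range 0..255."""
    return list(data)


def tokens_to_bytes(tokens: list[int]) -> bytes:
    """Inverse of bytes_to_tokens: rebuild the file bytes from token ids."""
    return bytes(tokens)


def split_prefix(tokens: list[int], ratio: float) -> tuple[list[int], list[int]]:
    """Split a sequence into (prompt, remainder) at the given fraction."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    cut = int(len(tokens) * ratio)
    return tokens[:cut], tokens[cut:]


# Stand-in for a real file: JPEG streams start with the SOI marker (FF D8)
# and end with the EOI marker (FF D9). The payload here is dummy data.
fake_jpeg = bytes([0xFF, 0xD8]) + bytes(range(12)) + bytes([0xFF, 0xD9])

tokens = bytes_to_tokens(fake_jpeg)
prompt, remainder = split_prefix(tokens, 0.375)

# A model would be conditioned on `prompt` and asked to generate the rest;
# concatenating the prompt with the generated tokens gives back file bytes.
restored = tokens_to_bytes(prompt + remainder)
```

In this sketch the model's job would be to produce a plausible `remainder` given `prompt`; because the output is a byte sequence, it can be written straight to disk and opened by any JPEG decoder if the generated stream is valid.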

Example command:

python run.py --query_vllm_server "local" --prefix_ratio 0.375 --temp 1.0 --topp 0.9 --topk 50 --test_image_path 'example_image_input/*.png' --repeat_generation 10 --seed 42 --output_dir "out"
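
The `--temp`, `--topp`, and `--topk` flags in the command name standard sampling controls. The sketch below is a generic, pure-Python illustration of temperature scaling followed by top-k and top-p (nucleus) filtering. It is not the code used by run.py or vLLM, the function names `filter_probs` and `sample` are hypothetical, and libraries differ in details such as the order of the filters and when probabilities are renormalized.

```python
import math
import random


def filter_probs(logits, temp=1.0, top_k=0, top_p=1.0):
    """Return {token_id: probability} after temperature, top-k, and top-p.

    top_k=0 disables top-k; top_p=1.0 disables top-p.
    """
    if temp <= 0:
        raise ValueError("temp must be positive")

    # Temperature-scaled softmax (subtract the max for numerical stability).
    scaled = [x / temp for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token ids from most to least likely, then keep the top k.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        order = order[:top_k]

    # Renormalize over the survivors before applying top-p (an assumption
    # of this sketch; some implementations skip this step).
    mass = sum(probs[i] for i in order)
    renorm = {i: probs[i] / mass for i in order}

    # Top-p: keep the smallest set of top tokens whose mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += renorm[i]
        if cumulative >= top_p:
            break

    final_mass = sum(renorm[i] for i in kept)
    return {i: renorm[i] / final_mass for i in kept}


def sample(logits, rng, **kwargs):
    """Draw one token id from the filtered distribution."""
    dist = filter_probs(logits, **kwargs)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]


# Toy four-token vocabulary with probabilities 0.5, 0.3, 0.15, 0.05.
toy_logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]
dist = filter_probs(toy_logits, temp=1.0, top_k=3, top_p=0.8)
token = sample(toy_logits, random.Random(42), temp=1.0, top_k=3, top_p=0.8)
```

With these toy values, top-k drops the least likely token and top-p then keeps only token ids 0 and 1. Setting `top_p=1.0` in the sketch corresponds to sampling without top-p.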

Note that we use pillow==10.2.0 (lower versions won't work). torch (2.1.2), transformers (4.38.2), and vllm (0.3.3) must be installed as well. Other sampling hyperparameters are also worth trying (e.g., removing top-p for landscape images).
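
A quick way to compare an environment against the versions listed above is sketched here using only the standard library. The `check_versions` helper and the `EXPECTED` table are hypothetical and not part of the repository. The check treats the listed versions as exact matches, which is stricter than the note requires for anything other than pillow.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions as listed in the note above (hypothetical table, not from run.py).
EXPECTED = {
    "pillow": "10.2.0",
    "torch": "2.1.2",
    "transformers": "4.38.2",
    "vllm": "0.3.3",
}


def check_versions(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    problems = {}
    for name, want in expected.items():
        try:
            have = version(name)
        except PackageNotFoundError:
            have = None
        # Some wheels carry a local suffix such as "+cu121";
        # compare only the part before the "+".
        if have is None or have.split("+")[0] != want:
            problems[name] = (want, have)
    return problems


if __name__ == "__main__":
    for name, (want, have) in check_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

An empty result means every listed package is installed at the listed version.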
