Commit a809596
first commit
0 parents

45 files changed, +41317 −0 lines

LICENSE

+21
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2021 MIT HAN Lab

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

+235
@@ -0,0 +1,235 @@
# Anycost GANs for Interactive Image Synthesis and Editing

### [video](https://youtu.be/_yEziPl9AkM) | [paper](https://arxiv.org/abs/2103.03243) | [website](https://hanlab.mit.edu/projects/anycost-gan/) [![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mit-han-lab/anycost-gan/blob/master/notebooks/intro_colab.ipynb)

**Anycost GANs for Interactive Image Synthesis and Editing**

[Ji Lin](http://linji.me/), [Richard Zhang](https://richzhang.github.io/), Frieder Ganz, [Song Han](https://songhan.mit.edu/), [Jun-Yan Zhu](https://www.cs.cmu.edu/~junyanz/)

In CVPR 2021

**Anycost GAN (flexible)** generates consistent outputs under various, fine-grained computation budgets.

![flexible](https://hanlab.mit.edu/projects/anycost-gan/images/flexible.gif)

**Anycost GAN (uniform)** supports 4 resolutions and 4 channel ratios.

![uniform](https://hanlab.mit.edu/projects/anycost-gan/images/uniform.gif)

We can use the anycost generator for **interactive image editing**. A full generator takes **~3s** to render an image, which is too slow for editing. The anycost generator instead provides a visually similar preview at **5x faster speed**. After the adjustments, we hit the "Finalize" button to render the high-quality, edited output. Check [here](https://youtu.be/_yEziPl9AkM?t=90) for the full demo.

<a href="https://youtu.be/_yEziPl9AkM?t=90"><img src='assets/figures/demo.gif' width=600></a>

## Method

Anycost generators can be run at *diverse computation costs* by using different *channel* and *resolution* configurations. Sub-generators achieve high output consistency compared to the full generator, providing a fast preview.

![overview](https://hanlab.mit.edu/projects/anycost-gan/images/overall.jpg)

With (1) sampling-based multi-resolution training, (2) adaptive-channel training, and (3) a generator-conditioned discriminator, we achieve high image quality and consistency at different resolutions and channel configurations.

![method](https://hanlab.mit.edu/projects/anycost-gan/images/method_pad.gif)
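
To make techniques (1) and (2) concrete, here is a heavily simplified, conceptual sketch of one training step. The training code is not released yet, so `train_step`, `d`, and the loss computation are hypothetical, and the resolution/ratio menus are illustrative values rather than the exact configuration; only `set_uniform_channel_ratio` comes from this repo:

```python
import random

from model.dynamic_channel import set_uniform_channel_ratio

# Example menus (assumptions): 4 resolutions and 4 channel ratios,
# matching the uniform setting described above.
RESOLUTIONS = [128, 256, 512, 1024]
CHANNEL_RATIOS = [0.25, 0.5, 0.75, 1.0]

def train_step(g, d, z):
    # (1) Sampling-based multi-resolution training: sample a random
    #     output resolution for this step.
    g.target_res = random.choice(RESOLUTIONS)
    # (2) Adaptive-channel training: sample a random channel ratio,
    #     so every sub-generator receives gradient updates.
    set_uniform_channel_ratio(g, random.choice(CHANNEL_RATIOS))
    fake, _ = g(z)
    # (3) Generator-conditioned discriminator: d is also told which
    #     sub-generator configuration produced the fake image.
    ...  # compute GAN losses with d(fake, config) and update
```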

## Usage

### Getting Started

- Clone this repo:

```bash
git clone https://github.com/mit-han-lab/anycost-gan.git
cd anycost-gan
```

- Install PyTorch 1.7 and other dependencies.

We recommend setting up the environment using Anaconda: `conda env create -f environment.yml`

### Introduction Notebook

We provide a Jupyter notebook example to show how to use the anycost generator for image synthesis at diverse costs: `notebooks/intro.ipynb`.

We also provide a Colab version of the notebook: [![](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/mit-han-lab/anycost-gan/blob/master/notebooks/intro_colab.ipynb). Be sure to select the GPU as the accelerator in the runtime options.

### Interactive Demo

We provide an interactive demo showing how we can use anycost GAN to enable interactive image editing. To run the demo:

```bash
python demo.py
```

You can find a video recording of the demo [here](https://youtu.be/_yEziPl9AkM?t=90).

### Using Pre-trained Models

To get the pre-trained generator, encoder, and editing directions, run:

```python
import model

pretrained_type = 'generator'  # choose from ['generator', 'encoder', 'boundary']
config_name = 'anycost-ffhq-config-f'  # replace with another config name for other models
model.get_pretrained(pretrained_type, config=config_name)
```

We also provide the face attribute classifier (which is shared across different generators) for computing the editing directions. You can get it by running:

```python
model.get_pretrained('attribute-predictor')
```

The attribute classifier takes face images in FFHQ format.
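
For example, a minimal sketch of scoring one of the demo images with the predictor. This assumes the predictor behaves like a standard image classifier over attribute logits; the exact input size, normalization, and output layout are defined in the repo, so treat the preprocessing below as an assumption:

```python
import torch
from PIL import Image
from torchvision import transforms

import model

predictor = model.get_pretrained('attribute-predictor').eval()

# Assumed preprocessing: an FFHQ-aligned face, resized and normalized.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
])

img = preprocess(Image.open('assets/demo/input_images/00_ryan.jpg')).unsqueeze(0)
with torch.no_grad():
    logits = predictor(img)  # attribute scores; layout depends on the repo
```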

After loading the anycost generator, we can run it at diverse costs. For example:

```python
from model.dynamic_channel import set_uniform_channel_ratio, reset_generator

g = model.get_pretrained('generator', config='anycost-ffhq-config-f')  # anycost uniform
set_uniform_channel_ratio(g, 0.5)  # set the channel ratio of the sub-generator
g.target_res = 512  # set the output resolution
out, _ = g(...)  # generate an image
reset_generator(g)  # restore the full generator
```

For detailed usage and the *flexible-channel* anycost generator, please refer to `notebooks/intro.ipynb`.
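
This pattern is what powers the demo's preview/finalize workflow. A minimal sketch using only the calls shown above; `styles` stands for whatever latent input you are editing, and its exact format is shown in the intro notebook:

```python
import model
from model.dynamic_channel import set_uniform_channel_ratio, reset_generator

g = model.get_pretrained('generator', config='anycost-ffhq-config-f')

def preview(g, styles):
    # Cheap sub-generator: half the channels, lower resolution.
    set_uniform_channel_ratio(g, 0.5)
    g.target_res = 512
    out, _ = g(styles)
    reset_generator(g)  # leave the generator in its full configuration
    return out

def finalize(g, styles):
    # Full generator: all channels, full resolution.
    out, _ = g(styles)
    return out
```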

### Model Zoo

Currently, we provide the following pre-trained generators, encoders, and editing directions. We will add more in the future.

For anycost generators, by default, we refer to the uniform setting.

| config name                    | generator          | encoder            | edit direction     |
| ------------------------------ | ------------------ | ------------------ | ------------------ |
| anycost-ffhq-config-f          | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| anycost-ffhq-config-f-flexible | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| anycost-car-config-f           | :heavy_check_mark: |                    |                    |
| stylegan2-ffhq-config-f        | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

`stylegan2-ffhq-config-f` refers to the official StyleGAN2 generator converted from the [repo](https://github.com/NVlabs/stylegan2).
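
As a sketch of how the three artifacts in the table combine for editing, in the InterFaceGAN style the paper builds on: encode an image, then shift the latent along an attribute direction. The boundary bundle's key names, the latent shapes, and the generator's latent-input interface are all assumptions here; the intro notebook shows the exact calls:

```python
import torch
import model

config = 'anycost-ffhq-config-f'
g = model.get_pretrained('generator', config=config)
encoder = model.get_pretrained('encoder', config=config)
boundaries = model.get_pretrained('boundary', config=config)  # editing directions

img = torch.randn(1, 3, 256, 256)   # placeholder; use a real preprocessed face
w = encoder(img)                    # project the image into latent space
direction = boundaries['Smiling']   # key name is an assumption
edited, _ = g(w + 2.0 * direction)  # move along the attribute direction
```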

### Datasets

We prepare the [FFHQ](https://github.com/NVlabs/ffhq-dataset), [CelebA-HQ](https://github.com/switchablenorms/CelebAMask-HQ), and [LSUN Car](https://github.com/fyu/lsun) datasets as directories of images, so that they can be easily used with `ImageFolder` from `torchvision`. The dataset layout looks like:

```
├── PATH_TO_DATASET
│   ├── images
│   │   ├── 00000.png
│   │   ├── 00001.png
│   │   ├── ...
```

Due to copyright issues, you need to download the datasets from the official sites and process them accordingly.
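
Once laid out this way, loading the images is a few lines of torchvision. A small sketch; `PATH_TO_DATASET` is your own path, and the transform is just an example:

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize(1024),  # match the generator's resolution
    transforms.ToTensor(),
])

# ImageFolder treats each sub-directory ('images') as one class.
dataset = datasets.ImageFolder('PATH_TO_DATASET', transform=transform)
loader = DataLoader(dataset, batch_size=16, shuffle=True, num_workers=8)
```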

### Evaluation

We provide the code to evaluate some metrics presented in the paper. Some of the code is written with [`horovod`](https://github.com/horovod/horovod) to support distributed evaluation and reduce the cost of inter-GPU communication, which greatly improves the speed. Check its website for a proper installation.

#### Fréchet Inception Distance (FID)

Before evaluating the FIDs, you need to compute the inception features of the real images using a script like:

```bash
python tools/calc_inception.py \
    --resolution 1024 --batch_size 64 -j 16 --n_sample 50000 \
    --save_name assets/inceptions/inception_ffhq_res1024_50k.pkl \
    PATH_TO_FFHQ
```

Alternatively, you can download the pre-computed inception features from [here](https://www.dropbox.com/sh/bc8a7ewlvcxa2cf/AAD8NFzDWKmBDpbLef-gGhRZa?dl=0) and put them under `assets/inceptions`.

Then, you can evaluate the FIDs by running:

```bash
horovodrun -np N_GPU \
    python metrics/fid.py \
    --config anycost-ffhq-config-f \
    --batch_size 16 --n_sample 50000 \
    --inception assets/inceptions/inception_ffhq_res1024_50k.pkl
    # --channel_ratio 0.5 --target_res 512  # optionally use a smaller resolution/channel ratio
```
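
Under the hood, the FID between the pre-computed real statistics and the generated statistics is the Fréchet distance between two Gaussians fitted to the inception features. A self-contained sketch of that final computation (numpy/scipy; not the repo's exact code):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """FID between two Gaussians (mean, covariance) fitted to inception features."""
    diff = mu1 - mu2
    # Matrix square root of the product of the two covariances.
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny numerical imaginary parts
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```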

#### Perceptual Path Length (PPL)

Similarly, evaluate the PPL with:

```bash
horovodrun -np N_GPU \
    python metrics/ppl.py \
    --config anycost-ffhq-config-f
```
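
Conceptually, PPL measures how much the output image changes (in LPIPS distance) for a small step in latent space. A hedged sketch of the core quantity; the repo's script handles sampling, cropping, and aggregation, and `lpips` here is the pip package of the same name:

```python
import torch
import lpips  # pip install lpips

percept = lpips.LPIPS(net='vgg')

def ppl_term(g, z0, z1, eps=1e-4):
    # Perceptual distance between images at nearby interpolation points
    # t and t + eps, scaled by 1/eps^2 (the path-length integrand).
    t = torch.rand(()).item()
    img_a, _ = g(torch.lerp(z0, z1, t))
    img_b, _ = g(torch.lerp(z0, z1, t + eps))
    return percept(img_a, img_b) / (eps ** 2)
```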

#### Attribute Consistency

Evaluate the attribute consistency by running:

```bash
horovodrun -np N_GPU \
    python metrics/attribute_consistency.py \
    --config anycost-ffhq-config-f \
    --channel_ratio 0.5 --target_res 512  # config for the sub-generator; necessary
```
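
The metric asks: do the sub-generator and the full generator agree on predicted face attributes for the same latent? A rough conceptual sketch; the latent shape, the predictor's input size, and its output layout are all assumptions:

```python
import torch
import torch.nn.functional as F

import model
from model.dynamic_channel import set_uniform_channel_ratio, reset_generator

g = model.get_pretrained('generator', config='anycost-ffhq-config-f')
predictor = model.get_pretrained('attribute-predictor').eval()

def predict_attrs(img):
    # Resize to the predictor's assumed input size before scoring.
    return predictor(F.interpolate(img, size=256))

z = torch.randn(1, 512)  # latent shape is an assumption

full, _ = g(z)                     # full-generator output
set_uniform_channel_ratio(g, 0.5)  # configure the sub-generator
g.target_res = 512
sub, _ = g(z)                      # sub-generator output, same latent
reset_generator(g)

# Consistency: how often the two sets of attribute predictions agree.
agree = (predict_attrs(full).argmax(-1) == predict_attrs(sub).argmax(-1))
consistency = agree.float().mean()
```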

#### Encoder Evaluation

To evaluate the performance of the encoder, run:

```bash
python metrics/eval_encoder.py \
    --config anycost-ffhq-config-f \
    --data_path PATH_TO_CELEBA_HQ
```
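
The essence of this evaluation is encode-then-regenerate: project an image into latent space, synthesize it back, and measure the reconstruction error. A sketch under the assumption that the generator accepts the encoder's latents directly; the repo's preprocessing and metrics may differ:

```python
import torch.nn.functional as F
import model

config = 'anycost-ffhq-config-f'
g = model.get_pretrained('generator', config=config)
encoder = model.get_pretrained('encoder', config=config)

def reconstruction_error(img):
    # Project to latent space, regenerate, and measure pixel error.
    w = encoder(img)
    recon, _ = g(w)
    recon = F.interpolate(recon, size=img.shape[-2:])  # match sizes
    return F.mse_loss(recon, img)
```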

### Training

The training code will be updated shortly.

## Citation

If you use this code for your research, please cite our paper.

```
@inproceedings{lin2021anycost,
  author    = {Lin, Ji and Zhang, Richard and Ganz, Frieder and Han, Song and Zhu, Jun-Yan},
  title     = {Anycost GANs for Interactive Image Synthesis and Editing},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2021},
}
```

## Acknowledgement

We thank Taesung Park, Zhixin Shu, Muyang Li, and Han Cai for the helpful discussions. Part of the work is supported by NSF CAREER Award #1943349, Adobe, Naver Corporation, and the MIT-IBM Watson AI Lab.

The codebase is built upon a PyTorch implementation of StyleGAN2: [rosinality/stylegan2-pytorch](https://github.com/rosinality/stylegan2-pytorch). For editing-direction extraction, we refer to [InterFaceGAN](https://github.com/genforce/interfacegan).

Binary files added:

assets/demo/input_images/00_ryan.jpg   71.5 KB
assets/demo/input_images/01_anne.jpg   97.4 KB
assets/demo/input_images/02_will.jpg   75.2 KB
assets/demo/loading.gif                62.7 KB
(additional binary files not shown in this view)
