Xingzhi

A pretrained Chinese text-to-image generation model for the Xingzhi competition.

Convert the text to tokens with cogview-data, then pack the IDs and tokens into dataset/{train, val, test}.npy using make_dataset.py.
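
The packing step can be pictured with the minimal sketch below. It is not the actual make_dataset.py: the row layout (sample ID in column 0, followed by the tokens), the fixed length max_len, the padding value, integer sample IDs, and the function name pack_split are all assumptions made for illustration. The toy token IDs stand in for whatever cogview-data produces.

```python
import numpy as np


def pack_split(ids, token_lists, max_len, pad_id=0):
    """Pack sample IDs and token sequences into one int64 array.

    Hypothetical layout: row i is [id_i, tok_0, ..., tok_{max_len-1}],
    with each sequence truncated or right-padded to max_len.
    """
    packed = np.full((len(ids), 1 + max_len), pad_id, dtype=np.int64)
    for row, (sample_id, tokens) in enumerate(zip(ids, token_lists)):
        tokens = list(tokens)[:max_len]          # truncate long sequences
        packed[row, 0] = sample_id               # column 0 holds the ID
        packed[row, 1:1 + len(tokens)] = tokens  # remaining columns hold tokens
    return packed


if __name__ == "__main__":
    # Toy values only; real tokens would come from the tokenizer.
    print(pack_split([101, 102], [[5, 6, 7], [8]], max_len=4))
```

An array built this way could then be written with np.save("dataset/train.npy", packed) and read back with np.load.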

We trained on 6 Nvidia RTX 3090 GPUs. Training takes about 20 days and inference takes about 1 day.

Important: before training, you must put the train and val images into the dataset folder and rename them to {train, val}. Then run:

bash scripts/train.sh
bash scripts/predict.sh
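
Because training runs for a long time, it may be worth checking the image folders before launching scripts/train.sh. The sketch below assumes the images end up in dataset/train and dataset/val as directories; that reading of the renaming step, and the helper name missing_dataset_dirs, are assumptions and not part of this repository.

```python
from pathlib import Path


def missing_dataset_dirs(root="dataset", splits=("train", "val")):
    """Return the split directories under root that are missing or empty."""
    missing = []
    for split in splits:
        split_dir = Path(root) / split
        # A split counts as missing if the folder is absent or holds no files.
        if not split_dir.is_dir() or not any(split_dir.iterdir()):
            missing.append(str(split_dir))
    return missing


if __name__ == "__main__":
    problems = missing_dataset_dirs()
    print("ready" if not problems else f"missing or empty: {problems}")
```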

FID: 40.748
