This is the implementation of ILVR: Conditioning Method for Denoising Diffusion Probabilistic Models (ICCV 2021 Oral).
This repository is heavily based on improved diffusion and guided diffusion. We use PyTorch-Resizer for the resizing function.
ILVR is a learning-free method for controlling the generation of unconditional DDPMs. ILVR refines each generation step with the low-frequency component of a perturbed reference image. Our method enables various tasks (image translation, paint-to-image, editing with scribbles) with only a single model trained on a target dataset.
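The refinement step described above can be sketched in a few lines. This is a minimal illustration, not the code in this repository: it uses NumPy instead of PyTorch, and `low_pass` is a hypothetical stand-in (block averaging followed by pixel repetition) for the resizer-based filter. The names `low_pass`, `ilvr_refine`, `proposal`, and `noisy_ref` are assumptions made for this example.

```python
import numpy as np


def low_pass(img, n):
    """Stand-in low-pass filter: average n x n blocks, then repeat pixels
    to restore the original size. (Assumption: the repository uses a
    resizer-based down/up-sampling instead.)"""
    h, w = img.shape
    assert h % n == 0 and w % n == 0, "image size must be divisible by n"
    small = img.reshape(h // n, n, w // n, n).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, n, axis=0), n, axis=1)


def ilvr_refine(proposal, noisy_ref, n):
    """Swap the low-frequency content of the model's proposal for that of
    the perturbed (noised) reference, keeping the proposal's high frequencies."""
    return proposal - low_pass(proposal, n) + low_pass(noisy_ref, n)


rng = np.random.default_rng(0)
proposal = rng.standard_normal((8, 8))   # stands in for the unconditional sample at one step
noisy_ref = rng.standard_normal((8, 8))  # stands in for the reference noised to the same step
refined = ilvr_refine(proposal, noisy_ref, n=4)

# With this particular filter, the refined sample shares the reference's
# low-frequency content exactly.
print(np.allclose(low_pass(refined, 4), low_pass(noisy_ref, 4)))  # True
```

The exact match holds here because block averaging followed by repetition is a projection; with a bicubic resizer the match would only be approximate.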
Create a folder models/ and download model checkpoints into it.
Here are the unconditional models trained on various datasets: drive
Models are trained on FFHQ, CelebA-HQ, CUB, AFHQ-Dogs, Flowers, and MetFaces with P2-weighting.
You may also try with models from guided diffusion.
First, set the PYTHONPATH variable to point to the root of the repository.
Then, place your input images in a folder and run the ilvr_sample.py script, passing that folder with --base_samples. Specify the folder where you want to save the output with --save_dir.
Here, we provide flags for sampling from the above models.
Feel free to change --down_N and --range_t to adapt the downsampling factor and conditioning range from the paper.
Refer to improved diffusion for
python scripts/ilvr_sample.py --attention_resolutions 16 --class_cond False --diffusion_steps 1000 --dropout 0.0 --image_size 256 --learn_sigma True --noise_schedule linear --num_channels 128 --num_head_channels 64 --num_res_blocks 1 --resblock_updown True --use_fp16 False --use_scale_shift_norm True --timestep_respacing 100 --model_path models/ffhq_10m.pt --base_samples ref_imgs/face --down_N 32 --range_t 20 --save_dir output
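To make the effect of --down_N and --range_t in the command above more concrete, here is a small sketch. It rests on an assumption about the sampling loop that you should check against the source: that refinement is applied while the respaced step index is greater than range_t. The helper `conditioned_steps` is hypothetical and exists only for this illustration.

```python
def conditioned_steps(num_steps, range_t):
    """Step indices, in sampling order (high noise to low), that would
    receive ILVR refinement.

    Assumption: refinement is applied while the step index is greater
    than range_t, and the remaining steps are sampled unconditionally.
    """
    return [i for i in reversed(range(num_steps)) if i > range_t]


image_size, down_n = 256, 32
low_res = image_size // down_n  # side length of the downsampled reference
steps = conditioned_steps(num_steps=100, range_t=20)

print(low_res)                           # 8
print(len(steps), steps[0], steps[-1])   # 79 99 21
```

Under that assumption, a larger --down_N keeps less of the reference (only an 8x8 summary here), and a larger --range_t leaves more of the final steps unconditioned.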
ILVR sampling is implemented in
These are samples generated with N=8 and 16:
These are cat-to-dog samples generated with N=32:
This repo is a re-implementation of our method on guided diffusion. Our initial implementation of the paper was based on denoising-diffusion-pytorch.