Abandon Bayes and Markov
In the review of DDPM, to get the distribution $q(x_{t-1} \mid x_t, x_0)$, we applied Bayes' Theorem:

$$q(x_{t-1} \mid x_t, x_0) = \frac{q(x_t \mid x_{t-1}, x_0)\, q(x_{t-1} \mid x_0)}{q(x_t \mid x_0)}$$

where we set $q(x_t \mid x_{t-1})$ as a series of Gaussian distributions, so that we can derive $q(x_t \mid x_0)$ and $q(x_{t-1} \mid x_0)$ from it and therefore obtain $q(x_{t-1} \mid x_t, x_0)$.
The Gaussian we chose for $q(x_t \mid x_{t-1})$ is

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{\alpha_t}\, x_{t-1},\ (1-\alpha_t) I\right)$$

This transition satisfies the Markov property: $x_t$ depends only on $x_{t-1}$, not on any earlier state.
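As a concrete sketch, the Markov transition and the closed-form marginal it implies (with $\bar\alpha_t = \prod_{s \le t} \alpha_s$) can be simulated in a few lines of NumPy; the linear $\beta_t = 1-\alpha_t$ schedule and the helper names `forward_step` / `forward_jump` are illustrative assumptions, not prescribed here:

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
# Hypothetical linear schedule for beta_t = 1 - alpha_t (illustrative only).
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # \bar{alpha}_t = prod_{s<=t} alpha_s

def forward_step(x_prev, t):
    """One Markov step: x_t ~ N(sqrt(alpha_t) x_{t-1}, (1 - alpha_t) I)."""
    eps = rng.standard_normal(x_prev.shape)
    return np.sqrt(alphas[t]) * x_prev + np.sqrt(betas[t]) * eps

def forward_jump(x0, t):
    """Closed-form marginal implied by the chain: x_t ~ N(sqrt(abar_t) x_0, (1 - abar_t) I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
```

Composing `forward_step` for $t$ steps yields the same distribution as a single `forward_jump`, which is why the closed form can be derived from the transition.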
In DDIM, we choose another way to obtain $q(x_{t-1} \mid x_t, x_0)$, by directly assuming it is a Gaussian:

$$q_\sigma(x_{t-1} \mid x_t, x_0) = \mathcal{N}\left(x_{t-1};\ \kappa\, x_0 + \lambda\, x_t,\ \sigma_t^2 I\right)$$

This is a general form for $q_\sigma(x_{t-1} \mid x_t, x_0)$ with undetermined coefficients $\kappa$ and $\lambda$, which should satisfy the marginal consistency

$$q_\sigma(x_{t-1} \mid x_0) = \int q_\sigma(x_{t-1} \mid x_t, x_0)\, q_\sigma(x_t \mid x_0)\, \mathrm{d}x_t$$
Now we can pin down the form of $q_\sigma(x_{t-1} \mid x_t, x_0)$ by specifying $q(x_t \mid x_0)$ and $q(x_{t-1} \mid x_0)$. To reuse the model we trained in DDPM, we keep them the same as in DDPM:

$$q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar\alpha_t}\, x_0,\ (1-\bar\alpha_t) I\right), \qquad \bar\alpha_t = \prod_{s=1}^{t} \alpha_s$$
Then we can solve for the coefficients:

$$q_\sigma(x_{t-1} \mid x_t, x_0) = \mathcal{N}\left(x_{t-1};\ \sqrt{\bar\alpha_{t-1}}\, x_0 + \sqrt{1-\bar\alpha_{t-1}-\sigma_t^2}\cdot\frac{x_t - \sqrt{\bar\alpha_t}\, x_0}{\sqrt{1-\bar\alpha_t}},\ \sigma_t^2 I\right)$$
So $q_\sigma(x_{t-1} \mid x_t, x_0)$ is a family of distributions that depends on $\sigma$. If we choose $\sigma_t^2 = \frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}(1-\alpha_t)$, the same value as in DDPM, then $q_\sigma(x_{t-1} \mid x_t, x_0)$ will also be the same as the DDPM posterior.
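A minimal NumPy sketch of sampling from this solved posterior; the linear $\beta$ schedule and the function name `ddim_posterior_sample` are illustrative assumptions, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)    # assumed DDPM-style schedule
alpha_bars = np.cumprod(1.0 - betas)  # \bar{alpha}_t

def ddim_posterior_sample(x_t, x0, t, sigma_t):
    """Sample x_{t-1} ~ q_sigma(x_{t-1} | x_t, x_0) from the solved Gaussian."""
    ab_t, ab_prev = alpha_bars[t], alpha_bars[t - 1]
    # The noise implied by x_t and x_0 under q(x_t | x_0).
    eps = (x_t - np.sqrt(ab_t) * x0) / np.sqrt(1.0 - ab_t)
    mean = np.sqrt(ab_prev) * x0 + np.sqrt(1.0 - ab_prev - sigma_t**2) * eps
    return mean + sigma_t * rng.standard_normal(x_t.shape)
```

Note that the $\sigma_t^2$ terms in the mean and in the noise cancel in the marginal variance, which is how the consistency condition is met for every choice of $\sigma_t$.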
Now, by Bayes' Theorem we can obtain the forward process $q_\sigma(x_t \mid x_{t-1}, x_0)$, which is no longer Markovian as in DDPM, since it depends on $x_0$:

$$q_\sigma(x_t \mid x_{t-1}, x_0) = \frac{q_\sigma(x_{t-1} \mid x_t, x_0)\, q_\sigma(x_t \mid x_0)}{q_\sigma(x_{t-1} \mid x_0)}$$
Tip
The general idea of DDIM is to specify $q(x_t \mid x_0)$ and $q(x_{t-1} \mid x_0)$ and construct $q_\sigma(x_{t-1} \mid x_t, x_0)$ from them, rather than defining $q(x_t \mid x_{t-1})$ and deriving all the other distributions from it as in DDPM.
Training and Inference
As in DDPM, DDIM trains a denoising network $\epsilon_\theta(x_t, t)$ to recover $x_0$ from $x_t$. Since the loss function, the same as DDPM's, does not include $\sigma$, we can use a pre-trained DDPM model for DDIM inference and simply change the value of $\sigma$.
In particular, if we set $\sigma_t = 0$, the inference process becomes deterministic, and this is what the "Implicit" in DDIM means.
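With $\sigma_t = 0$ the update collapses to a deterministic map. A sketch of one such step, assuming the network's noise prediction is passed in as `eps_pred` (the schedule and names are illustrative):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)    # assumed schedule
alpha_bars = np.cumprod(1.0 - betas)

def ddim_step_deterministic(x_t, eps_pred, t, t_prev):
    """One sigma = 0 DDIM update from timestep t down to t_prev < t."""
    ab_t, ab_prev = alpha_bars[t], alpha_bars[t_prev]
    # Predicted x_0 under the eps-prediction parameterization.
    x0_hat = (x_t - np.sqrt(1.0 - ab_t) * eps_pred) / np.sqrt(ab_t)
    # sigma = 0: the random noise term vanishes, so the step is deterministic.
    return np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps_pred
```

With a perfect noise prediction, this maps a sample of $q(x_t \mid x_0)$ exactly onto the sample of $q(x_{t_{\text{prev}}} \mid x_0)$ built from the same noise, with no randomness anywhere in the step.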
Accelerate Generation
In DDPM, the generative process is considered an approximation to the reverse process; since the forward process has $T$ steps, the generative process is also forced to sample $T$ steps.
However, the denoising network we trained in DDPM does not rely on any specific forward procedure. That is, as long as $q(x_t \mid x_0)$ is fixed as $\mathcal{N}(\sqrt{\bar\alpha_t}\, x_0, (1-\bar\alpha_t) I)$ with parameters $\bar\alpha_t$, the denoising network works regardless of whether $x_t$ was produced step by step through $q(x_t \mid x_{t-1})$ or in one jump from $q(x_t \mid x_0)$. Therefore, a DDPM trained with $T$ steps already contains all the parameters we need for any subsequence $\{\tau_1, \dots, \tau_S\} \subseteq \{1, \dots, T\}$, and we can build a DDIM model on this new sequence without training a new model.
In this way, we can directly accelerate the generation process to $S$ steps by recalculating the coefficients from $\bar\alpha_{\tau_i}$.
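A sketch of the accelerated sampler: it walks a subsequence of $S$ timesteps while reusing the full $T$-step model's $\bar\alpha$ values. Here `dummy_eps_model` is a hypothetical stand-in for the trained noise predictor $\epsilon_\theta$, and the schedule is again an assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)    # assumed schedule of the full model
alpha_bars = np.cumprod(1.0 - betas)

# Subsequence tau_1 < ... < tau_S, reusing the full model's alpha_bar values.
S = 50
taus = np.linspace(0, T - 1, S, dtype=int)

def dummy_eps_model(x_t, t):
    """Hypothetical stand-in for the trained DDPM noise predictor eps_theta."""
    return np.zeros_like(x_t)

def ddim_sample(shape):
    x = rng.standard_normal(shape)     # start from x_T ~ N(0, I)
    for i in range(S - 1, 0, -1):      # walk the subsequence, not all T steps
        t, t_prev = taus[i], taus[i - 1]
        ab_t, ab_prev = alpha_bars[t], alpha_bars[t_prev]
        eps = dummy_eps_model(x, t)
        # Deterministic (sigma = 0) DDIM update restricted to the subsequence.
        x0_hat = (x - np.sqrt(1.0 - ab_t) * eps) / np.sqrt(ab_t)
        x = np.sqrt(ab_prev) * x0_hat + np.sqrt(1.0 - ab_prev) * eps
    return x
```

The loop body is identical to a full-length DDIM step; the only change is that $t$ and $t_{\text{prev}}$ now index the subsequence, so generation takes $S$ network evaluations instead of $T$.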