Abandon Bayes and Markov

In Review DDPM and DDPM, to get the distribution $q(x_{t-1} \mid x_t, x_0)$, we applied Bayes' Theorem:

$$q(x_{t-1} \mid x_t, x_0) = \frac{q(x_t \mid x_{t-1}, x_0)\, q(x_{t-1} \mid x_0)}{q(x_t \mid x_0)}$$

where we set $q(x_t \mid x_{t-1})$ as a series of Gaussian distributions, so that we can derive $q(x_t \mid x_0)$ and $q(x_{t-1} \mid x_0)$ from it and therefore obtain $q(x_{t-1} \mid x_t, x_0)$.

The Gaussian we chose for $q(x_t \mid x_{t-1})$ is

$$q(x_t \mid x_{t-1}) = \mathcal{N}\left(x_t;\ \sqrt{\alpha_t}\, x_{t-1},\ (1-\alpha_t) I\right)$$

This transition satisfies the Markov property.

In DDIM, we choose another way to obtain $q(x_{t-1} \mid x_t, x_0)$, by assuming it is a Gaussian

$$q_\sigma(x_{t-1} \mid x_t, x_0) = \mathcal{N}\left(x_{t-1};\ \kappa_t\, x_0 + \lambda_t\, x_t,\ \sigma_t^2 I\right)$$

This is a general form for $q_\sigma(x_{t-1} \mid x_t, x_0)$, which should satisfy

$$q(x_{t-1} \mid x_0) = \int q_\sigma(x_{t-1} \mid x_t, x_0)\, q(x_t \mid x_0)\, \mathrm{d}x_t$$

Now we could get the form of $q_\sigma(x_{t-1} \mid x_t, x_0)$ by setting up $q(x_t \mid x_0)$ and $q(x_{t-1} \mid x_0)$. To reuse the model we trained in DDPM, we keep them the same as in DDPM:

$$q(x_t \mid x_0) = \mathcal{N}\left(x_t;\ \sqrt{\bar\alpha_t}\, x_0,\ (1-\bar\alpha_t) I\right)$$
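To make these marginals concrete, here is a minimal sketch of how the $\bar\alpha_t$ table and the closed-form $q(x_t \mid x_0)$ sample are typically computed; the linear beta schedule and the names below are illustrative assumptions, not taken from the post.

```python
import numpy as np

# Hypothetical linear beta schedule (values are illustrative).
# alpha_bars[t] is the cumulative product used in
# q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def q_sample(x0, t, rng):
    """Sample x_t directly from x_0 via the closed-form marginal."""
    eps = rng.standard_normal(np.shape(x0))
    return np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * eps
```

Because $q(x_t \mid x_0)$ is available in closed form, training never needs to simulate the step-by-step chain.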

Then we can solve

$$q_\sigma(x_{t-1} \mid x_t, x_0) = \mathcal{N}\left(x_{t-1};\ \sqrt{\bar\alpha_{t-1}}\, x_0 + \sqrt{1-\bar\alpha_{t-1}-\sigma_t^2}\cdot\frac{x_t - \sqrt{\bar\alpha_t}\, x_0}{\sqrt{1-\bar\alpha_t}},\ \sigma_t^2 I\right)$$

So $q_\sigma(x_{t-1} \mid x_t, x_0)$ is a family of distributions that depends on $\sigma_t$. If we choose $\sigma_t^2 = \frac{1-\bar\alpha_{t-1}}{1-\bar\alpha_t}\beta_t$, the same as in DDPM, then $q_\sigma(x_{t-1} \mid x_t, x_0)$ will also be the same as in DDPM.
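A quick numeric sanity check (my own sketch, with illustrative values) that the solved posterior really satisfies the marginal constraint: sampling $x_t$ from $q(x_t \mid x_0)$ and then $x_{t-1}$ from $q_\sigma(x_{t-1} \mid x_t, x_0)$ should reproduce $q(x_{t-1} \mid x_0) = \mathcal{N}(\sqrt{\bar\alpha_{t-1}}\,x_0, (1-\bar\alpha_{t-1})I)$ for any choice of $\sigma_t$.

```python
import numpy as np

rng = np.random.default_rng(0)
ab_t, ab_prev = 0.5, 0.7   # illustrative alpha_bar values at steps t and t-1
sigma = 0.2                # any sigma with sigma**2 < 1 - ab_prev works
x0 = 1.5                   # a fixed scalar "data point"
n = 200_000

# Sample x_t ~ q(x_t | x_0).
x_t = np.sqrt(ab_t) * x0 + np.sqrt(1 - ab_t) * rng.standard_normal(n)

# Sample x_{t-1} ~ q_sigma(x_{t-1} | x_t, x_0) using the solved mean.
eps = (x_t - np.sqrt(ab_t) * x0) / np.sqrt(1 - ab_t)
mean = np.sqrt(ab_prev) * x0 + np.sqrt(1 - ab_prev - sigma**2) * eps
x_prev = mean + sigma * rng.standard_normal(n)

# Empirically, x_prev should match N(sqrt(ab_prev) * x0, 1 - ab_prev).
```

The check works for any $\sigma$: the variance contributed by the mean term, $1-\bar\alpha_{t-1}-\sigma_t^2$, plus the injected noise $\sigma_t^2$ always sums to $1-\bar\alpha_{t-1}$.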

Now, by Bayes' Theorem we could obtain the forward process $q_\sigma(x_t \mid x_{t-1}, x_0)$, which, since it depends on $x_0$, is no longer Markovian as it was in DDPM:

$$q_\sigma(x_t \mid x_{t-1}, x_0) = \frac{q_\sigma(x_{t-1} \mid x_t, x_0)\, q(x_t \mid x_0)}{q(x_{t-1} \mid x_0)}$$

Tip

The general idea of DDIM is to specify $q(x_t \mid x_0)$ and $q(x_{t-1} \mid x_0)$ and use them to get $q_\sigma(x_{t-1} \mid x_t, x_0)$, rather than defining $q(x_t \mid x_{t-1})$ to get all the other distributions as in DDPM.

Training and Inference

The same as in DDPM, DDIM trains a denoising network $\epsilon_\theta(x_t, t)$ to predict the noise $\epsilon$ from $x_t$. Since the loss function, identical to the one in DDPM, does not involve $\sigma_t$, we can use a pre-trained DDPM model for DDIM inference simply by changing the value of $\sigma_t$.

In particular, if we set $\sigma_t = 0$, the inference process becomes deterministic; this is what the "Implicit" in DDIM means.
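One inference step can be sketched as follows. This is a minimal illustration, assuming the noise prediction $\epsilon_\theta(x_t, t)$ is passed in directly (so the sketch stays self-contained); `ddim_step` and its argument names are my own.

```python
import numpy as np

def ddim_step(x_t, eps, ab_t, ab_prev, sigma=0.0, noise=None):
    """One DDIM update from x_t to x_{t-1}.

    eps     : the network's noise prediction eps_theta(x_t, t)
    ab_t    : alpha_bar at step t
    ab_prev : alpha_bar at step t-1
    sigma = 0 gives the deterministic ("implicit") sampler.
    """
    # Predict x_0 from the noise estimate by inverting q(x_t | x_0).
    x0_pred = (x_t - np.sqrt(1 - ab_t) * eps) / np.sqrt(ab_t)
    # Mean of q_sigma(x_{t-1} | x_t, x_0), with x_0 replaced by its prediction.
    mean = np.sqrt(ab_prev) * x0_pred + np.sqrt(1 - ab_prev - sigma**2) * eps
    if sigma == 0.0:
        return mean          # deterministic update
    if noise is None:
        noise = np.random.default_rng().standard_normal(np.shape(x_t))
    return mean + sigma * noise
```

With `sigma=0.0` the same initial noise always maps to the same sample, which is what makes DDIM generation deterministic.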

Accelerate Generation

In DDPM, the generative process is considered an approximation to the reverse process; since the forward process has $T$ steps, the generative process is also forced to sample $T$ steps.

However, the denoising network we trained in DDPM does not rely on any specific forward procedure. That is, as long as $q(x_t \mid x_0)$ is fixed as $\mathcal{N}\left(x_t;\ \sqrt{\bar\alpha_t}\, x_0,\ (1-\bar\alpha_t) I\right)$ with parameters $\bar\alpha_t$, the denoising network works regardless of whether $x_t$ is sampled from $q(x_t \mid x_{t-1})$ or $q(x_t \mid x_0)$. Therefore, a DDPM trained on $\{1, \dots, T\}$ includes all the parameters we need for any subsequence $\{\tau_1, \dots, \tau_S\} \subseteq \{1, \dots, T\}$, and we could build a DDIM model on this new sequence without training a new model.

In this way, we can directly accelerate the generation process to $S$ steps by recalculating the sampling coefficients from $\bar\alpha_{\tau_{i-1}}$ and $\bar\alpha_{\tau_i}$.
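The subsequence trick can be sketched as below; the schedule values and variable names are illustrative assumptions, not from the original post. The key point is that the accelerated sampler only ever reads entries of the original $\bar\alpha$ table, so no retraining is needed.

```python
import numpy as np

# alpha_bar table from a model trained with T = 1000 steps
# (linear beta schedule assumed for illustration).
T, S = 1000, 50
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

# Pick an S-step subsequence tau_1 < ... < tau_S of {0, ..., T-1}.
tau = np.linspace(0, T - 1, S, dtype=int)

# Each accelerated step only needs alpha_bars[tau[i]] and
# alpha_bars[tau[i-1]]; indexing the trained table is enough.
sub_alpha_bars = alpha_bars[tau]
```

Each pair `(sub_alpha_bars[i-1], sub_alpha_bars[i])` then plays the role of $(\bar\alpha_{t-1}, \bar\alpha_t)$ in the DDIM update, giving an $S$-step sampler from a $T$-step model.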