Direct Guidance
In DDPM, we focus on modeling just the data distribution $p(x)$. However, we are often also interested in learning the conditional distribution $p(x \mid y)$, which would enable us to explicitly control the data we generate through conditioning information $y$.
A natural way to add conditioning information is simply alongside the timestep information, at each iteration. Recall that the joint distribution $p(x_{0:T})$ can be derived from the product of transition distributions

$$p(x_{0:T}) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t)$$
We can simply add arbitrary conditioning information $y$ at each transition step as

$$p(x_{0:T} \mid y) = p(x_T) \prod_{t=1}^{T} p_\theta(x_{t-1} \mid x_t, y)$$
where $y$ could be a text encoding in image-text generation, or a low-resolution image to perform super-resolution on. With the conditioning added as an extra input, we can learn the core neural network of a DDPM as before.
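As a minimal sketch of "conditioning as an extra input" (NumPy, with a toy two-layer network and hypothetical names standing in for the real denoiser), the conditioning vector can simply be concatenated with $x_t$ and the timestep before the first layer:

```python
import numpy as np

def eps_theta(x_t, t, y, W1, W2):
    """Toy conditional noise predictor: the conditioning vector y is fed
    in alongside x_t and the (normalized) timestep t by concatenation."""
    h = np.concatenate([x_t, [t / 1000.0], y])   # [x_t ; t ; y]
    h = np.maximum(0.0, W1 @ h)                  # ReLU hidden layer
    return W2 @ h                                # predicted noise, same shape as x_t

# Example shapes: 4-dim data, 3-dim condition embedding, 16 hidden units
rng = np.random.default_rng(0)
W1 = rng.standard_normal((16, 4 + 1 + 3))
W2 = rng.standard_normal((4, 16))
eps = eps_theta(rng.standard_normal(4), t=500, y=rng.standard_normal(3), W1=W1, W2=W2)
```

Real implementations use learned timestep and condition embeddings rather than raw concatenation, but the training and sampling procedures are otherwise unchanged.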
However, a caveat of this vanilla formulation is that a conditional diffusion model trained this way may learn to ignore or downplay the given conditioning information. Guidance is therefore proposed as a way to more explicitly control the weight the model gives to the conditioning information, at the cost of sample diversity.
Classifier Guidance
Start with the score-based formulation of a diffusion model, where our goal is to learn the conditional score $\nabla_{x_t} \log p(x_t \mid y)$. By Bayes' rule, we can derive

$$\nabla_{x_t} \log p(x_t \mid y) = \nabla_{x_t} \log p(x_t) + \nabla_{x_t} \log p(y \mid x_t)$$
Therefore, in Classifier Guidance, the score of an unconditional diffusion model $\nabla_{x_t} \log p(x_t)$ is learned as previously derived, alongside a classifier $p(y \mid x_t)$ that takes in arbitrary noisy $x_t$ and attempts to predict the conditioning information $y$. Then, during the sampling procedure, the overall conditional score function used for annealed Langevin Dynamics is computed as the sum of the unconditional score function and the adversarial gradient $\nabla_{x_t} \log p(y \mid x_t)$ of the noisy classifier.
To introduce fine-grained control that either encourages or discourages the model from considering the conditioning information, we can scale the adversarial gradient of the noisy classifier by a hyperparameter $\gamma$:

$$\nabla_{x_t} \log p(x_t \mid y) = \nabla_{x_t} \log p(x_t) + \gamma \nabla_{x_t} \log p(y \mid x_t)$$
The higher $\gamma$ is, the more the model produces samples that strongly adhere to the conditioning information, which comes at the cost of sample diversity.
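Concretely, the scaled combination can be sketched as follows (NumPy; the binary logistic classifier here is only an illustrative stand-in for a real noisy classifier, and `gamma` is the scaling hyperparameter discussed above):

```python
import numpy as np

def classifier_grad(x, w, y):
    """Gradient of log p(y|x) w.r.t. x for a toy binary logistic
    classifier p(y=1|x) = sigmoid(w . x)."""
    p = 1.0 / (1.0 + np.exp(-w @ x))
    return (y - p) * w

def guided_score(uncond_score, x, w, y, gamma):
    """Guided score: grad log p(x) + gamma * grad log p(y|x)."""
    return uncond_score + gamma * classifier_grad(x, w, y)

x = np.array([0.5, -1.0])
w = np.array([2.0, 0.0])
s_uncond = -x                       # score of a standard normal prior
s_guided = guided_score(s_uncond, x, w, y=1, gamma=3.0)
```

With `gamma=0` the guided score reduces to the unconditional score; larger values push samples harder toward regions the classifier assigns to `y`.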
Here is a pseudo-code of classifier guidance:
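A runnable sketch of that pseudo-code (NumPy, with Langevin-style updates and a fixed step size and step count chosen purely for illustration; `score_model` and `classifier_grad` are hypothetical callables):

```python
import numpy as np

def sample_with_classifier_guidance(score_model, classifier_grad, y, gamma,
                                    dim=2, n_steps=50, step_size=0.01, seed=0):
    """Langevin-style sampler with classifier guidance (illustrative sketch).

    score_model(x, t)        -> unconditional score, grad log p(x_t)
    classifier_grad(x, t, y) -> grad log p(y | x_t) from the noisy classifier
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)                  # start from pure noise
    for t in reversed(range(n_steps)):
        # guided score = unconditional score + gamma * classifier gradient
        g = score_model(x, t) + gamma * classifier_grad(x, t, y)
        noise = rng.standard_normal(dim)
        x = x + step_size * g + np.sqrt(2.0 * step_size) * noise
    return x

# Toy example: standard-normal prior score, "classifier" that pulls toward y
target = np.array([5.0, 5.0])
sample = sample_with_classifier_guidance(
    score_model=lambda x, t: -x,
    classifier_grad=lambda x, t, y: y - x,
    y=target, gamma=9.0)
```

A full DDPM sampler would additionally anneal the step size with the noise schedule; this sketch keeps it fixed to isolate the guidance term.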
Classifier Guidance can only steer generation toward the categories the classification model was trained on. If the classifier distinguishes 10 classes, then Classifier Guidance can only guide the diffusion model to generate those fixed 10 classes.
To solve this problem, see Classifier-Free Diffusion Guidance.