Paper ID: 2403.12063

Consistency Model is an Effective Posterior Sample Approximation for Diffusion Inverse Solvers

Tongda Xu, Ziran Zhu, Jian Li, Dailan He, Yuanyuan Wang, Ming Sun, Ling Li, Hongwei Qin, Yan Wang, Jingjing Liu, Ya-Qin Zhang

Diffusion Inverse Solvers (DIS) are designed to sample from the conditional distribution $p_{\theta}(X_0|y)$, with a predefined diffusion model $p_{\theta}(X_0)$, an operator $f(\cdot)$, and a measurement $y=f(x'_0)$ derived from an unknown image $x'_0$. Existing DIS estimate the conditional score function by evaluating $f(\cdot)$ with an approximated posterior sample drawn from $p_{\theta}(X_0|X_t)$. However, most prior approximations rely on the posterior means, which may not lie in the support of the image distribution, thereby potentially diverge from the appearance of genuine images. Such out-of-support samples may significantly degrade the performance of the operator $f(\cdot)$, particularly when it is a neural network. In this paper, we introduces a novel approach for posterior approximation that guarantees to generate valid samples within the support of the image distribution, and also enhances the compatibility with neural network-based operators $f(\cdot)$. We first demonstrate that the solution of the Probability Flow Ordinary Differential Equation (PF-ODE) with an initial value $x_t$ yields an effective posterior sample $p_{\theta}(X_0|X_t=x_t)$. Based on this observation, we adopt the Consistency Model (CM), which is distilled from PF-ODE, for posterior sampling. Furthermore, we design a novel family of DIS using only CM. Through extensive experiments, we show that our proposed method for posterior sample approximation substantially enhance the effectiveness of DIS for neural network operators $f(\cdot)$ (e.g., in semantic segmentation). Additionally, our experiments demonstrate the effectiveness of the new CM-based inversion techniques. The source code is provided in the supplementary material.

Submitted: Feb 9, 2024