Paper Study 6: Auto-Encoded Supervision for Perceptual Image Super-Resolution

Introduction

Recent advances in Super-Resolution have branched into two distinct mainstreams: fidelity-oriented SR and perceptual-quality-oriented SR.

The authors propose Auto-Encoded Supervision for Perceptual SR (AESOP), which penalizes the fidelity-bias factor of SR images while preserving their visually important perceptual-variance factors.

Related work

Fidelity-oriented SR. These methods employ \(L_{pix}\) as their sole objective and minimize the expected per-pixel distance, which leads to high PSNR scores.

Perceptual SR. These methods commonly adopt an SRGAN-based framework, and the authors likewise build on a GAN-based SR method.

Revisiting per-pixel loss in perceptual SR

Perceptual SR methods that penalize \(L_{pix}\) often produce blurred textures. Prior work has shown that the per-pixel loss decomposes into a systematic-effect (SE) term and a variance-effect (VE) term, induced by the bias and the variance of the prediction, respectively. The authors therefore analyze the loss in terms of these SE and VE components.
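The SE/VE split above is the standard bias-variance identity for the expected per-pixel loss. A minimal numpy sketch with toy data (not the paper's code) checks it numerically:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: an ensemble of stochastic SR "predictions" for one ground truth.
gt = rng.normal(size=(8, 8))                             # ground-truth image (toy)
preds = gt + 0.3 + 0.5 * rng.normal(size=(1000, 8, 8))   # biased, noisy predictions

mse = np.mean((preds - gt) ** 2)       # expected per-pixel loss over the ensemble
bias = preds.mean(axis=0) - gt         # per-pixel bias of the predictor
se = np.mean(bias ** 2)                # systematic effect: squared bias
ve = np.mean(preds.var(axis=0))        # variance effect: prediction variance

# The expected per-pixel MSE splits exactly into SE + VE.
assert np.isclose(mse, se + ve)
```

Minimizing \(L_{pix}\) drives both terms to zero, which is exactly why the VE term (stochastic texture) gets averaged away into blur.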

SE and VE in terms of perceptual SR

For the perceptual SR task, SE minimization is desired but VE should be preserved. The authors therefore set out to design a novel loss that minimizes SE while preserving VE.

Revisiting prior methods

Previous approaches aim to avoid blurring by 1) introducing an LPF before loss calculation or 2) simply applying a small coefficient to \(L_{pix}\).

First, while LPF-based approaches can avoid the texture blurring induced by VE, they also fail to guide regressable high-frequency components.

Second, a small coefficient also reduces the efficacy of SE minimization, which guides the information flow between the input image and the network output.
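The first failure mode can be illustrated with a toy LPF-based pixel loss. Here a box filter (average pooling plus nearest-neighbour upsampling) stands in for the LPF of prior work; the point is that any zero-mean high-frequency content becomes invisible to the loss, so the loss cannot guide it even when it is regressable:

```python
import numpy as np

rng = np.random.default_rng(0)

def box_lpf(img, k=4):
    """Crude low-pass filter standing in for prior work's LPF:
    k x k average pooling followed by nearest-neighbour upsampling."""
    h, w = img.shape
    pooled = img.reshape(h // k, k, w // k, k).mean(axis=(1, 3))
    return np.repeat(np.repeat(pooled, k, axis=0), k, axis=1)

def lpf_pix_loss(sr, hr, k=4):
    """Pixel loss computed after low-pass filtering both images."""
    return np.abs(box_lpf(sr, k) - box_lpf(hr, k)).mean()

hr = rng.normal(size=(16, 16))

# Two candidate SR outputs that differ only in high-frequency content.
texture = rng.normal(size=(16, 16))
texture -= box_lpf(texture)          # keep only the high-frequency part
sr_textured = hr + texture
sr_flat = hr

# The LPF loss cannot tell them apart, so it provides no gradient
# signal for high-frequency components.
assert np.isclose(lpf_pix_loss(sr_textured, hr), lpf_pix_loss(sr_flat, hr))
```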

Method

The authors first develop a tailored Auto-Encoder (AE) to create a feature space exclusively for fidelity biases; they then compute the pixel loss in this AE space.

Their work aims to improve the GAN-based perceptual SR task, and they follow a recent GAN-based SR model, LDL.

Auto-Encoder pretraining

Designing the fidelity bias feature space

The authors employ a basic Auto-Encoder as a differentiable approximation of the fidelity-bias decomposition operator: once pretrained, the AE decomposes out the fidelity bias of an image and can be plugged directly into the training framework.

Meanwhile, the AE is specifically designed to remove particular low-level features from the pixel space.

AE pretraining

The authors pretrain an AE whose encoder and decoder consecutively estimate the LR and HR images, minimizing \(L_{pix}\).
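A schematic of this pretraining objective, with fixed pooling/upsampling standing in for the learned lightweight-CNN encoder and RRDBNet decoder (so the LR term is trivially zero here; in the paper both terms are minimized over learned weights):

```python
import numpy as np

rng = np.random.default_rng(0)

def encoder(hr, k=4):
    """Toy stand-in for the lightweight CNN encoder:
    maps an HR image to an LR-sized estimate via average pooling."""
    h, w = hr.shape
    return hr.reshape(h // k, k, w // k, k).mean(axis=(1, 3))

def decoder(code, k=4):
    """Toy stand-in for the RRDBNet decoder:
    upsamples the code back to an HR-sized estimate."""
    return np.repeat(np.repeat(code, k, axis=0), k, axis=1)

hr = rng.normal(size=(16, 16))
lr = hr.reshape(4, 4, 4, 4).mean(axis=(1, 3))    # toy downscaled LR target

# Pretraining sketch: the encoder output should match the LR image and
# the decoder output should match the HR image, both under a pixel loss.
l_enc = np.abs(encoder(hr) - lr).mean()
l_dec = np.abs(decoder(encoder(hr)) - hr).mean()
l_pretrain = l_enc + l_dec
```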

AE architecture

The authors employ RRDBNet as the decoder and a lightweight CNN as the encoder.

Auto-Encoder supervision

Defining the AESOP loss

Contrary to \(L_{pix}\), which minimizes both SE and VE, \(L_{AESOP}\) minimizes only SE.

Final objective function

\(L_{total}=\lambda_{AESOP}L_{AESOP}+\lambda_{percep}L_{percep}+\lambda_{artif}L_{artif}+\lambda_{adv}L_{adv}\)
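Assembling the total objective is a plain weighted sum; the weight values below are placeholders, not the paper's settings:

```python
def total_loss(l_aesop, l_percep, l_artif, l_adv,
               w_aesop=1.0, w_percep=1.0, w_artif=1.0, w_adv=0.005):
    """Weighted sum of the four loss terms (hypothetical weights)."""
    return (w_aesop * l_aesop + w_percep * l_percep
            + w_artif * l_artif + w_adv * l_adv)
```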

Conclusion

Most importantly, the authors propose a novel reconstruction loss \(L_{AESOP}\) that disentangles fidelity-bias factors from perceptual-variance factors, enabling the model to focus its reconstruction supervision on fidelity bias while preserving perceptual variance.

posted on 2025-03-22 08:27 by bnbncch