Paper Study 6 — Auto-Encoded Supervision for Perceptual Image Super-Resolution
Introduction
Recent advances in Super-Resolution (SR) have branched into two distinct mainstreams: fidelity-oriented SR and perceptual-quality-oriented SR.
The authors propose Auto-Encoded Supervision for Perceptual SR (AESOP), which penalizes the fidelity bias factor while preserving the visually important perceptual variance factors of SR images.
Related work
Fidelity-oriented SR. These methods employ \(L_{pix}\) as their sole objective and minimize the expected per-pixel distance, which leads to high PSNR scores.
Perceptual SR. These methods commonly adopt an SRGAN-based framework, and the authors also build on a GAN-based SR method.
Revisiting per-pixel loss in perceptual SR
Perceptual SR methods still penalize \(L_{pix}\), which often results in blurred textures. Prior works have shown that the per-pixel loss decomposes into a systematic-effect (SE) term and a variance-effect (VE) term, induced by the bias and the variance of the prediction, respectively. The authors therefore analyze the loss in terms of these two factors.
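For reference, one standard bias-variance style identity behind this decomposition (a sketch assuming a deterministic SR prediction \(\hat{y}\) and ground-truth HR images \(y\) drawn from \(p(y\mid x)\) for a given LR input \(x\); the paper's exact definitions of SE and VE may differ in detail):
\[
\mathbb{E}_{y\sim p(y\mid x)}\big[\|\hat{y}-y\|^2\big]
=\underbrace{\|\hat{y}-\mathbb{E}[y\mid x]\|^2}_{\text{SE (bias)}}
+\underbrace{\mathbb{E}\big[\|y-\mathbb{E}[y\mid x]\|^2\big]}_{\text{VE (variance)}}
\]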
SE and VE in terms of perceptual SR
For the perceptual SR task, SE minimization is desired, but VE should be preserved. The authors therefore design a novel loss that minimizes SE while preserving VE.
Revisiting prior methods
Previous approaches aim to avoid blurring by 1) introducing a low-pass filter (LPF) before loss calculation or 2) simply applying a small coefficient to \(L_{pix}\).
First, although LPF-based approaches can avoid the texture blurring induced by VE, they also fail to guide regressable high-frequency components.
Second, a small coefficient also reduces the efficacy of the SE term, which guides the information flow between the input image and the network output.
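To make option 1) concrete, here is a minimal sketch of an LPF-based pixel loss; the Gaussian kernel size, sigma, and the use of an L1 distance are illustrative assumptions rather than the setup of any specific prior work:

```python
import torch
import torch.nn.functional as F

def gaussian_kernel(size=5, sigma=1.5):
    # 2D Gaussian kernel used as the low-pass filter (LPF).
    coords = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
    g = torch.exp(-coords ** 2 / (2 * sigma ** 2))
    kernel = torch.outer(g, g)
    return (kernel / kernel.sum()).view(1, 1, size, size)

def lpf_pixel_loss(sr, hr, kernel):
    # Blur both images before the per-pixel loss, so only low-frequency
    # (bias-like) differences are penalized and fine texture is left alone.
    c = sr.shape[1]
    k = kernel.to(sr.device).repeat(c, 1, 1, 1)
    pad = kernel.shape[-1] // 2
    sr_lf = F.conv2d(sr, k, padding=pad, groups=c)
    hr_lf = F.conv2d(hr, k, padding=pad, groups=c)
    return F.l1_loss(sr_lf, hr_lf)
```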
Method
First, the authors design a tailored Auto-Encoder (AE) to create a feature space exclusively for fidelity bias. Second, they calculate \(L_{pix}\) in the AE space.
Their work aims to improve the GAN-based perceptual SR task, and they follow LDL, a recent GAN-based SR model.
Auto-Encoder pretraining
Designing the fidelity bias feature space
The authors employ a basic Auto-Encoder to construct a differentiable approximation of the fidelity-bias decomposition operator. The pretrained AE acts as a differentiable module that decomposes the fidelity bias of images and can be directly plugged into the training framework.
Meanwhile, the AE is specifically designed to remove particular low-level features from the pixel space.
AE pretraining
The authors pretrain the AE to consecutively estimate the LR and HR images by minimizing \(L_{pix}\).
AE architecture
The authors employ RRDBNet as the decoder and a lightweight CNN as the encoder.
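A minimal PyTorch sketch of this AE and its consecutive pretraining objective, assuming a x4 SR scale; the lightweight encoder below, the small residual CNN standing in for the RRDBNet decoder, and the equal loss weighting are illustrative assumptions, not the paper's exact configuration:

```python
import torch.nn as nn
import torch.nn.functional as F

class LightEncoder(nn.Module):
    """Lightweight CNN encoder: maps an HR image to an LR-sized, image-like bottleneck."""
    def __init__(self, ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ch, 3, 3, padding=1),  # two stride-2 convs give x4 downsampling
        )

    def forward(self, hr):
        return self.body(hr)

class SimpleDecoder(nn.Module):
    """Stand-in for the RRDBNet decoder: upsamples the bottleneck back to HR resolution."""
    def __init__(self, ch=64, scale=4, n_blocks=4):
        super().__init__()
        self.head = nn.Conv2d(3, ch, 3, padding=1)
        blocks = []
        for _ in range(n_blocks):
            blocks += [nn.Conv2d(ch, ch, 3, padding=1), nn.LeakyReLU(0.2, inplace=True)]
        self.body = nn.Sequential(*blocks)
        self.tail = nn.Conv2d(ch, 3, 3, padding=1)
        self.scale = scale

    def forward(self, lr_like):
        x = self.head(lr_like)
        x = x + self.body(x)  # simple residual body
        x = F.interpolate(x, scale_factor=self.scale, mode="nearest")
        return self.tail(x)

def ae_pretrain_loss(encoder, decoder, hr, lr):
    # Consecutive estimation: the encoder output is supervised by the LR image
    # and the decoder output by the HR image, keeping the bottleneck in pixel space.
    lr_hat = encoder(hr)
    hr_hat = decoder(lr_hat)
    return F.l1_loss(lr_hat, lr) + F.l1_loss(hr_hat, hr)
```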
Auto-Encoder supervision
Defining the AESOP loss
Contrary to \(L_{pix}\), which minimizes both SE and VE, \(L_{AESOP}\) only minimizes SE.
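A minimal sketch of one plausible form of this loss, assuming both the SR output and the HR target are passed through the frozen pretrained AE and compared with an L1 distance; the exact formulation should be taken from the paper:

```python
import torch
import torch.nn.functional as F

def aesop_loss(sr, hr, autoencoder):
    # `autoencoder` is the frozen, pretrained encoder-decoder; it is assumed to
    # keep fidelity-bias factors and discard perceptual-variance factors, so an
    # L1 loss in its output space penalizes SE while leaving VE unconstrained.
    with torch.no_grad():
        hr_bias = autoencoder(hr)   # fidelity-bias component of the ground truth
    sr_bias = autoencoder(sr)       # gradients flow through the AE into the SR network
    return F.l1_loss(sr_bias, hr_bias)
```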
Final objective function
\(L_{total}=\lambda_1 L_{AESOP}+\lambda_2 L_{percep}+\lambda_3 L_{artif}+\lambda_4 L_{adv}\)
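As a usage illustration, a tiny sketch of how this objective might be assembled in a training step; the coefficients are placeholders, and \(L_{percep}\), \(L_{artif}\), and \(L_{adv}\) are assumed to be computed elsewhere (e.g. by LDL-style components), not implemented here:

```python
# Placeholder coefficients; the paper's actual values may differ.
lambda_aesop, lambda_percep, lambda_artif, lambda_adv = 1.0, 1.0, 1.0, 0.005

def total_loss(l_aesop, l_percep, l_artif, l_adv):
    # Weighted sum of the four terms in the final objective.
    return (lambda_aesop * l_aesop
            + lambda_percep * l_percep
            + lambda_artif * l_artif
            + lambda_adv * l_adv)
```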
Conclusion
Most importantly, the authors propose a novel reconstruction loss \(L_{AESOP}\) that disentangles fidelity bias factors from perceptual variance factors, enabling the loss to penalize fidelity bias while preserving perceptual variance.