Learning Deep Representations for Photo Retouching
Photo enhancement is a long-standing and challenging problem in the image processing community. Despite significant achievements in recent years, many methods are built upon supervised learning and thus require expertise in constructing a huge collection of paired data, which is well known to be problematic since acquiring such data in real life can be impractical. We address this issue by proposing a multi-scale GAN framework that can be trained in an unsupervised fashion. Notably, we unify the design principles of the generator and discriminator in our framework so as to maximize their ability to learn deep latent representations. Specifically, rather than maintaining content consistency through a complicated two-way loss, we present a one-way loss that measures the content distance between multi-scale latent representations of inputs and outputs, speeding up training by 1.7×. Furthermore, we redesign the discriminator in a multi-scale, multi-stage manner to strengthen adversarial learning: the main discriminator produces multiple latent features at varying scales, and these features are then sent to auxiliary discriminators for final recognition. Extensive experiments conducted on the well-known MIT-Adobe FiveK and HDR+ datasets demonstrate that the proposed multi-scale representation learning framework achieves outstanding performance on the photo enhancement task.
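To make the one-way loss concrete, the following is a minimal NumPy sketch of how a content distance between multi-scale latent representations of the input and the output might be computed. The function name, the per-scale weighting, and the use of a mean-squared distance are illustrative assumptions, not the paper's exact formulation; the key property it demonstrates is that only one direction (input features against output features) is measured, in contrast to two-way cycle-consistency losses.

```python
import numpy as np

def one_way_content_loss(input_feats, output_feats, weights=None):
    """Hypothetical sketch of a one-way multi-scale content loss.

    input_feats / output_feats: lists of feature maps (np.ndarray),
    one per scale, e.g. extracted from intermediate layers of the
    generator for the input photo and its enhanced output.
    Returns the weighted average of per-scale mean squared distances;
    only the input->output direction is compared, so no second
    (reconstruction) pass through the network is needed.
    """
    if weights is None:
        weights = [1.0] * len(input_feats)  # uniform scale weights (assumption)
    loss = 0.0
    for w, f_in, f_out in zip(weights, input_feats, output_feats):
        loss += w * np.mean((f_in - f_out) ** 2)  # per-scale L2 content distance
    return loss / sum(weights)
```

Because the loss touches each feature pyramid only once, it avoids the second forward pass that a two-way consistency loss would require, which is consistent with the reported training speed-up.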
History
Journal/Conference/Book title
IEEE Transactions on Multimedia
Publication date
2023-08-23
Version
Post-print