Title: Context-aware image super-resolution using deep neural networks
Document type: Thesis/HDR
Authors: Mohammad Saeed Rad, Author; Jean-Philippe Thiran, Thesis supervisor
Publisher: Lausanne: Ecole Polytechnique Fédérale de Lausanne EPFL
Publication year: 2021
Extent: 148 p.
Format: 21 x 30 cm
General note: bibliography
Thesis presented for the degree of Docteur ès Sciences
Language: French (fre)
Descriptor: [IGN subject headings] Optical image processing
[IGN terms] deep learning
[IGN terms] classification by convolutional neural network
[IGN terms] low-resolution image
[IGN terms] high-resolution image
[IGN terms] spectral resolving power
[IGN terms] image reconstruction
[IGN terms] generative adversarial network
[IGN terms] semantic segmentation
[IGN terms] computer vision
Index (decimal): THESE Theses and HDR
Abstract: (author) Image super-resolution is a classic ill-posed computer vision and image processing problem, addressing the question of how to reconstruct a high-resolution image from its low-resolution counterpart. Current state-of-the-art methods have significantly improved performance on the single-image super-resolution task by benefiting from machine learning and AI-powered algorithms, and more specifically from the advent of deep learning-based approaches. Although these advances allow a machine to learn from and better exploit an image and its content, recent methods are still unable to constrain the plausible solution space based on the contextual information available within an image. This limitation mostly results in poor reconstructions, even for well-known types of objects and textures easily recognizable to humans. In this thesis, we aim to prove that the categorical prior, which characterizes the semantic class of a region in an image (e.g., sky, building, plant), is crucial in the super-resolution task for reaching higher reconstruction quality. In particular, we propose several approaches to improve the perceived image quality and generalization capability of deep learning-based methods by exploiting the context and semantic meaning of images. To prove the effectiveness of this categorical information, we first propose a convolutional neural network-based framework that extracts and uses semantic information to super-resolve a given image through multitask learning, learning image super-resolution and semantic segmentation simultaneously. The proposed decoder is forced to explore categorical information during training, as this setting employs a single shared deep network for both the semantic segmentation and super-resolution tasks.
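The shared-network idea described in the abstract (one trunk feeding both a super-resolution head and a segmentation head, so the shared features must encode categorical information) can be sketched minimally. This is an illustrative toy with linear layers and made-up sizes, not the thesis architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

H = W = 8       # low-resolution input size (assumed for illustration)
SCALE = 2       # upscaling factor
N_CLASSES = 3   # e.g. sky / building / plant
FEAT = 16       # width of the shared feature vector

# One shared encoder and two task-specific heads (stand-ins for the
# convolutional trunk and decoders used in multitask SR+segmentation).
w_enc = rng.normal(size=(H * W, FEAT)) * 0.1
w_sr = rng.normal(size=(FEAT, (H * SCALE) * (W * SCALE))) * 0.1
w_seg = rng.normal(size=(FEAT, H * W * N_CLASSES)) * 0.1

def forward(lr_image):
    # Shared ReLU features: both losses backpropagate through w_enc,
    # which is what forces the SR branch to see categorical information.
    feat = np.maximum(lr_image.reshape(-1) @ w_enc, 0.0)
    sr = (feat @ w_sr).reshape(H * SCALE, W * SCALE)      # super-resolved image
    seg = (feat @ w_seg).reshape(H, W, N_CLASSES)         # per-pixel class logits
    return sr, seg

lr = rng.random((H, W))
sr, seg = forward(lr)
print(sr.shape, seg.shape)  # (16, 16) (8, 8, 3)
```

Training would sum an SR reconstruction loss and a segmentation loss over these two outputs; because both gradients flow into `w_enc`, the shared features cannot ignore semantics.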
We further investigate the possibility of using semantic information through a novel objective function that introduces additional spatial control over the training process. We propose penalizing images at different semantic levels using appropriate loss terms, benefiting from our new OBB (Object, Background, and Boundary) labels generated from segmentation labels. Then, we introduce a new test-time adaptation-based technique that leverages high-resolution images with a perceptually similar context to a given test image to improve reconstruction quality. We further validate this approach's effectiveness with a novel numerical experiment analyzing the correlation between filters learned by our network and what we define as 'ideal' filters. Finally, we present a generic solution for adapting all the previous contributions of this thesis, as well as other recent super-resolution works trained on synthetic datasets, to the real-world super-resolution problem. Real-world super-resolution refers to super-resolving images with real degradations caused by physical imaging systems, instead of low-resolution images from simulated datasets that assume a simple and uniform degradation model (i.e., bicubic downsampling). We study and develop an image-to-image translator that maps the distribution of real low-resolution images to the well-understood distribution of bicubically downsampled images. This translator is used as a plug-in to integrate real inputs into any super-resolution framework trained on simulated datasets. We carry out extensive qualitative and quantitative experiments for each of these contributions, including user studies, to compare our proposed approaches to state-of-the-art methods.
Contents: 1- Introduction
2- Brief image super-resolution review
3- Extracting image context by multi-task learning
4- Spatial control over image generation process
5- Test-time adaptation based on perceptual similarity
6- Integrating into real-world SR
7- Conclusion
Record number: 28652
Authors' affiliation: non IGN
Theme: IMAGERY
Nature: Foreign thesis
Thesis note: Doctoral thesis: Sciences: EPFL, Lausanne: 2021
DOI: none
Online: https://infoscience.epfl.ch/record/286804?ln=fr
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=99790
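The OBB (Object, Background, and Boundary) labels mentioned in the abstract are derived from ordinary segmentation labels. A minimal sketch for a binary mask follows; the one-pixel boundary width, the 4-connectivity, and the 0/1/2 encoding are assumptions for illustration, not necessarily the thesis's exact construction:

```python
import numpy as np

def obb_labels(mask):
    """Split a binary segmentation mask into Object / Background / Boundary.

    mask: 2-D array of 0 (background) / 1 (object).
    Returns 0 = background, 1 = object interior, 2 = boundary (object
    pixels with at least one 4-connected background neighbour).
    """
    m = mask.astype(bool)
    # Edge-pad, then shift in the four cardinal directions to test whether
    # any neighbour of each pixel is background.
    p = np.pad(m, 1, mode="edge")
    neigh_bg = (~p[:-2, 1:-1]) | (~p[2:, 1:-1]) | (~p[1:-1, :-2]) | (~p[1:-1, 2:])
    out = np.zeros_like(mask, dtype=np.int64)
    out[m] = 1              # object
    out[m & neigh_bg] = 2   # boundary overrides object
    return out

mask = np.zeros((5, 5), dtype=int)
mask[1:4, 1:4] = 1
print(obb_labels(mask))  # centre pixel 1, ring of 2s, zeros elsewhere
```

A per-region loss can then weight the three label values differently, which is the kind of spatial control over the objective that the abstract describes.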