IGN online catalogue

Author details

Author: François Fleuret

Available documents written by this author (1)

Learning stereo reconstruction with deep neural networks / Stepan Tulyakov (2020)

Public

Title: Learning stereo reconstruction with deep neural networks
Document type: Thesis/HDR
Authors: Stepan Tulyakov, Author; François Fleuret, Thesis supervisor; Anton Ivanov, Thesis supervisor
Publisher: Lausanne: Ecole Polytechnique Fédérale de Lausanne EPFL
Publication year: 2020
Extent: 139 p.
Format: 21 x 30 cm
General note: bibliography
Thesis presented at the Ecole Polytechnique Fédérale de Lausanne for the degree of Docteur ès Sciences
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] deep learning
[IGN terms] semi-supervised classification
[IGN terms] geometric constraint
[IGN terms] stereo pair
[IGN terms] entropy
[IGN terms] estimator
[IGN terms] geometric calibration
[IGN terms] stereoscopic model
[IGN terms] depth
[IGN terms] ground truth
[IGN terms] 3D reconstruction
[IGN terms] image reconstruction
[IGN terms] computer vision
[IGN terms] stereoscopic vision

Abstract: (author) Stereo reconstruction is the problem of recovering the 3D structure of a scene from a pair of images of the scene acquired from different viewpoints. It has been investigated for decades, and many successful methods have been developed. The main drawback of these methods is that they typically utilize a single depth cue, such as parallax, defocus blur or shading, and thus are not as robust as the human visual system, which simultaneously relies on a range of monocular and binocular cues. This is mainly because it is hard to manually design a model that accounts for multiple depth cues. In this work, we address this problem by focusing on deep learning-based stereo methods that can discover a model for multiple depth cues directly from training data with ground truth depth. The complexity of deep learning-based methods, however, requires very large training sets with ground truth depth, which are often hard or costly to collect. Furthermore, even when training data is available, it is often contaminated with noise, which reduces the effectiveness of supervised learning. In this work, in Chapter 3, we show that it is possible to alleviate this problem by using weakly supervised learning that utilizes geometric constraints of the problem instead of ground truth depth. Besides the large training set requirement, deep stereo methods are not as application-friendly as traditional methods. They have a large memory footprint and their disparity range is fixed at training time. For some applications, such as satellite stereo imagery, these are serious problems since satellite images are very large, often reaching tens of megapixels, and have a variable baseline, depending on the time difference between the acquisitions of the stereo images.
In this work, in Chapter 4, we address these problems by introducing a novel network architecture with a bottleneck, capable of processing large images and utilizing more context, and an estimator that makes the network less sensitive to stereo matching ambiguities and applicable to any disparity range without re-training. Because deep learning-based methods discover depth cues directly from training data, they can be adapted to new data modalities without large modifications. In this work, in Chapter 5, we show that our method, developed for a conventional frame-based camera, can be used with a novel event-based camera that has a higher dynamic range, lower latency, and low power consumption. Instead of sampling the intensity of all pixels at a fixed frequency, this camera asynchronously reports events of significant pixel intensity changes. To adapt our method to this new data modality, we propose a novel event sequence embedding module that first aggregates information locally across time, using a novel fully-connected layer for an irregularly sampled continuous domain, and then across the discrete spatial domain. One interesting application of stereo is the reconstruction of a planet’s surface topography from satellite stereo images. In this work, in Chapter 6, we describe a geometric calibration method, as well as mosaicing and stereo reconstruction tools, that we developed in the framework of the doctoral project for the Color and Stereo Surface Imaging System onboard ESA’s Trace Gas Orbiter, orbiting Mars. For the calibration, we propose a novel method relying on starfield images, because the large focal length and complex optical distortion of the instrument rule out standard methods. Scientific and practical results of this work are widely used by the scientific community.
Contents note: 1- Introduction
2- Background
3- Weakly supervised learning of deep patch-matching cost
4- Applications-friendly deep stereo
5- Dense deep event-based stereo
6- Calibration of a satellite stereo system
7- Conclusions
Record number: 25795
Author affiliation: non-IGN
Theme: IMAGERY
Nature: Foreign thesis
Thesis note: Doctoral thesis: Sciences: Lausanne: 2020
Online: https://infoscience.epfl.ch/record/275342?ln=fr
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=95025