Title : |
Learning to represent and reconstruct 3D deformable objects |
Document type : |
Thesis/HDR |
Authors : |
Jan Bednarik, Author ; Pascal Fua, Thesis supervisor ; M. Salzmann, Thesis supervisor |
Publisher : |
Lausanne : Ecole Polytechnique Fédérale de Lausanne EPFL |
Publication year : |
2022 |
Extent : |
138 p. |
Format : |
21 x 30 cm |
General note : |
bibliography
Thesis submitted for the degree of Docteur ès Sciences, Ecole Polytechnique Fédérale de Lausanne |
Languages : |
English (eng) |
Descriptor : |
[IGN subject headings] Optical image processing [IGN terms] shape matching [IGN terms] deep learning [IGN terms] temporal consistency [IGN terms] surface deformation [IGN terms] image distortion [IGN terms] Riemannian geometry [IGN terms] 3D image [IGN terms] object reconstruction [IGN terms] point cloud [IGN terms] computer vision
|
Decimal index : |
THESE Theses and HDR |
Abstract : |
(author) Representing and reconstructing 3D deformable shapes are two tightly linked problems that have long been studied within the computer vision field. Deformable shapes are truly ubiquitous in the real world, whether as specific object classes such as humans, garments and animals or as more abstract ones such as generic materials deforming under stress caused by an external force. Truly practical computer vision algorithms must be able to understand the shapes of objects in the observed scenes to unlock the wide spectrum of much sought-after applications ranging from virtual try-on to automated surgery. Automatic shape reconstruction, however, is known to be an ill-posed problem, especially in the common scenario of a single image input. Therefore, modern approaches rely on the deep learning paradigm, which has proven to be extremely effective even for severely under-constrained computer vision problems. We, too, exploit the success of data-driven approaches; however, we also show that generic deep learning models can greatly benefit from being combined with explicit knowledge originating in traditional computational geometry. We analyze the use of various 3D shape representations for deformable object reconstruction and focus on one of them, the atlas-based representation, which turns out to be especially suitable for modeling deformable shapes and which we further improve and extend to yield higher-quality reconstructions. The atlas-based representation models surfaces as an ensemble of continuous functions and thus allows for arbitrary resolution and analytical surface analysis. We identify major shortcomings of the base formulation, namely the infamous phenomena of patch collapse, patch overlap and arbitrarily strong mapping distortions, and we propose novel regularizers based on analytically computed properties of the reconstructed surfaces.
Our approach counteracts the aforementioned drawbacks while yielding higher reconstruction accuracy in terms of surface normals on the tasks of single-view reconstruction, shape completion and point cloud auto-encoding. We then examine the atlas-based shape representation in greater depth and focus on another pressing design flaw, the global inconsistency among the individual mappings. While this inconsistency is not reflected in the traditional quantitative metrics of reconstruction accuracy, it is detrimental to the visual quality of the reconstructed surfaces. Specifically, we design loss functions that encourage intercommunication among the individual mappings, which pushes the resulting surface towards a C1-smooth function. Our experiments on the tasks of single-view reconstruction and point cloud auto-encoding reveal that our method significantly improves the visual quality when compared to the baselines. Furthermore, we adapt the atlas-based representation and the related training procedure so that it can model a full sequence of a deforming object in a temporally consistent way. In other words, the goal is to produce a reconstruction in which each surface point always represents the same semantic point on the target ground-truth surface. To achieve such behavior, we note that if each surface point deforms close-to-isometrically, its semantic location likely remains unchanged. In practice, we make use of the Riemannian metric, which is computed analytically on the surfaces, and force it to remain point-wise constant throughout the sequence. Our experimental results reveal that our method yields state-of-the-art results on the task of unsupervised dense shape correspondence estimation, while also improving the visual reconstruction quality. Finally, we look into a particular problem of monocular texture-less deformable shape reconstruction, an instance of the Shape-from-Shading problem.
We propose a multi-task learning approach which takes an RGB image of an unknown object as the input and jointly produces a normal map, a depth map and a mesh corresponding to the observed part of the surface. We show that forcing the model to produce multiple different 3D representations of the same object results in higher reconstruction quality. To train the network, we acquire a large real-world annotated dataset of texture-less deforming objects and we release it for public use. We also show through experiments that our approach outperforms a previous optimization-based method on the single-view reconstruction task. |
Contents note : |
1- Introduction
2- Related work
3- Atlas-based representation for deformable shape reconstruction
4- Shape reconstruction by learning differentiable surface representations
5- Better patch stitching for parametric surface reconstruction
6- Temporally-consistent surface reconstruction using metrically-consistent atlases
7- Learning to reconstruct texture-less deformable surfaces from a single view
8- Conclusion |
Record number : |
15761 |
Author affiliation : |
non-IGN |
Theme : |
IMAGING |
Nature : |
Foreign thesis |
Thesis note : |
Doctoral thesis : Sciences : Lausanne, EPFL : 2022 |
DOI : |
10.5075/epfl-thesis-7974 |
Online : |
https://doi.org/10.5075/epfl-thesis-7974 |
Electronic resource format : |
URL |
Permalink : |
https://documentation.ensg.eu/index.php?lvl=notice_display&id=100958 |