Author details
Author: Pascal Fua
Documents available by this author (3)
Title: Learning to represent and reconstruct 3D deformable objects
Document type: Thesis/HDR
Authors: Jan Bednarik, Author; Pascal Fua, Thesis supervisor; M. Salzmann, Thesis supervisor
Publisher: Lausanne: Ecole Polytechnique Fédérale de Lausanne EPFL
Publication year: 2022
Extent: 138 p.
Format: 21 x 30 cm
General note: bibliography
Thesis presented for the degree of Docteur ès Sciences, Ecole Polytechnique Fédérale de Lausanne
Languages: English (eng)
Descriptor: [IGN subject headings] Optical image processing
[IGN terms] shape matching
[IGN terms] deep learning
[IGN terms] temporal consistency
[IGN terms] surface deformation
[IGN terms] image distortion
[IGN terms] Riemannian geometry
[IGN terms] 3D image
[IGN terms] object reconstruction
[IGN terms] point cloud
[IGN terms] computer vision
Decimal index: THESE Theses and HDR
Abstract: (author) Representing and reconstructing 3D deformable shapes are two tightly linked problems that have long been studied in computer vision. Deformable shapes are ubiquitous in the real world, be they specific object classes such as humans, garments and animals, or more abstract ones such as generic materials deforming under stress caused by an external force. Truly practical computer vision algorithms must understand the shapes of objects in the observed scenes to unlock a wide spectrum of much-sought-after applications, ranging from virtual try-on to automated surgery. Automatic shape reconstruction, however, is known to be an ill-posed problem, especially in the common scenario of a single input image. Modern approaches therefore rely on the deep learning paradigm, which has proven extremely effective even for severely under-constrained computer vision problems. We, too, exploit the success of data-driven approaches, but we also show that generic deep learning models can greatly benefit from being combined with explicit knowledge originating in traditional computational geometry. We analyze the use of various 3D shape representations for deformable object reconstruction and focus on one of them, the atlas-based representation, which turns out to be especially suitable for modeling deformable shapes and which we further improve and extend to yield higher-quality reconstructions. The atlas-based representation models surfaces as an ensemble of continuous functions and thus allows for arbitrary resolution and analytical surface analysis. We identify major shortcomings of the base formulation, namely the infamous phenomena of patch collapse, patch overlap and arbitrarily strong mapping distortions, and we propose novel regularizers based on analytically computed properties of the reconstructed surfaces. Our approach counteracts the aforementioned drawbacks while yielding higher reconstruction accuracy in terms of surface normals on the tasks of single-view reconstruction, shape completion and point cloud auto-encoding. We delve deeper into the problems of the atlas-based shape representation and focus on another pressing design flaw, the global inconsistency among the individual mappings. While this inconsistency is not reflected in the traditional quantitative metrics of reconstruction accuracy, it is detrimental to the visual quality of the reconstructed surfaces. Specifically, we design loss functions encouraging intercommunication among the individual mappings, which pushes the resulting surface towards a C1-smooth function. Our experiments on the tasks of single-view reconstruction and point cloud auto-encoding reveal that our method significantly improves visual quality compared to the baselines. Furthermore, we adapt the atlas-based representation and the related training procedure so that it can model a full sequence of a deforming object in a temporally consistent way. In other words, the goal is to produce a reconstruction in which each surface point always represents the same semantic point on the target ground-truth surface. To achieve this behavior, we note that if each surface point deforms close to isometrically, its semantic location likely remains unchanged. Practically, we make use of the Riemannian metric, which is computed analytically on the surfaces, and force it to remain point-wise constant throughout the sequence (a minimal code sketch of this idea follows this record). Our experimental results reveal that our method yields state-of-the-art results on the task of unsupervised dense shape correspondence estimation, while also improving the visual reconstruction quality. Finally, we look into the particular problem of monocular texture-less deformable shape reconstruction, an instance of the shape-from-shading problem. We propose a multi-task learning approach that takes an RGB image of an unknown object as input and jointly produces a normal map, a depth map and a mesh corresponding to the observed part of the surface. We show that forcing the model to produce multiple different 3D representations of the same object results in higher reconstruction quality. To train the network, we acquired a large annotated real-world dataset of texture-less deforming objects, which we release for public use. Finally, we show through experiments that our approach outperforms a previous optimization-based method on the single-view reconstruction task.
Contents: 1- Introduction
2- Related work
3- Atlas-based representation for deformable shape reconstruction
4- Shape reconstruction by learning differentiable surface representations
5- Better patch stitching for parametric surface reconstruction
6- Temporally-consistent surface reconstruction using metrically-consistent atlases
7- Learning to reconstruct texture-less deformable surfaces from a single view
8- Conclusion
Record number: 15761
Authors' affiliation: non IGN
Theme: IMAGERY
Nature: Foreign thesis
Thesis note: Doctoral thesis: Sciences: Lausanne, EPFL: 2022
DOI: 10.5075/epfl-thesis-7974
Online: https://doi.org/10.5075/epfl-thesis-7974
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=100958
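To make the atlas-based ideas in the abstract above concrete, here is a minimal PyTorch sketch, an illustration under assumptions rather than the thesis code: a patch is a learned mapping f from the unit square to R^3, its Jacobian J gives the Riemannian metric g = JᵀJ analytically via autograd, and losses penalize distortion (g far from the identity, which counters patch collapse) or changes of g between frames (the close-to-isometry prior). The names f, uv and the exact loss forms are assumptions.

```python
# Hedged sketch (not the thesis code) of analytical metric-based losses
# for an atlas patch mapping f_theta: [0,1]^2 -> R^3.
import torch

def metric_tensor(f, uv):
    """First fundamental form g = J^T J of patch mapping f at points uv (N, 2)."""
    uv = uv.detach().requires_grad_(True)
    xyz = f(uv)  # (N, 3) surface points
    # One autograd pass per output coordinate yields the Jacobian rows.
    J = torch.stack([
        torch.autograd.grad(xyz[:, i].sum(), uv, create_graph=True)[0]
        for i in range(3)
    ], dim=1)                      # (N, 3, 2)
    return J.transpose(1, 2) @ J   # (N, 2, 2)

def distortion_loss(f, uv):
    """Penalize deviation from an isometric parametrization (g = I)."""
    g = metric_tensor(f, uv)
    eye = torch.eye(2, device=g.device).expand_as(g)
    return ((g - eye) ** 2).mean()

def temporal_metric_loss(f_t, f_t1, uv):
    """Encourage a point-wise constant metric between consecutive frames."""
    return ((metric_tensor(f_t, uv) - metric_tensor(f_t1, uv)) ** 2).mean()
```

Because the metric is computed analytically from the continuous mapping rather than from a sampled mesh, these penalties can be evaluated at arbitrary resolution, which is the property the abstract attributes to the atlas representation.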
Title: Vision-based detection of aircrafts and UAVs
Document type: Thesis/HDR
Authors: Artem Rozantsev, Author; Pascal Fua, Thesis supervisor; Vincent Lepetit, Thesis supervisor
Publisher: Lausanne: Ecole Polytechnique Fédérale de Lausanne EPFL
Publication year: 2017
Extent: 117 p.
Format: 21 x 30 cm
General note: bibliography
Thesis presented at the Ecole Polytechnique Fédérale de Lausanne for the degree of Docteur ès Sciences
Languages: English (eng)
Descriptor: [IGN subject headings] Remote sensing
[IGN terms] machine learning
[IGN terms] deep learning
[IGN terms] convolutional neural network classification
[IGN terms] space-time cube
[IGN terms] object detection
[IGN terms] drone
[IGN terms] aerial image
[IGN terms] moving object
[IGN terms] regression
[IGN terms] computer vision
Abstract: (author) Unmanned Aerial Vehicles are becoming increasingly popular for a broad variety of tasks ranging from aerial imagery to object delivery. As the areas where drones can be used efficiently expand, the risk of collision with other flying objects increases. Avoiding such collisions would be a relatively easy task if all the aircraft in the neighboring airspace could communicate with each other and share their location information. However, it is often the case that either location information is unavailable (e.g. flying in GPS-denied environments) or communication is not possible (e.g. different communication channels or a non-cooperative flight scenario). To ensure flight safety in such situations, drones need a way to autonomously detect other objects intruding into the neighboring airspace. Vision-based collision avoidance is of particular interest, as cameras generally consume less power and are more lightweight than active-sensor alternatives such as radars and lasers. We have therefore developed a set of increasingly sophisticated algorithms to provide drones with a visual collision-avoidance capability. First, we present a novel method for detecting flying objects such as drones and planes that occupy a small part of the camera field of view, possibly move in front of complex backgrounds, and are filmed by a moving camera. Solving this problem requires combining motion and appearance information, as neither of the two alone can provide reliable enough detections. We therefore propose a machine learning technique that operates on spatiotemporal cubes of image intensities in which individual patches are aligned using an object-centric, regression-based motion-stabilization algorithm. Second, to reduce the need to collect a large training dataset and annotate it manually, we introduce a way to generate realistic synthetic images. Given only a small set of real examples and a coarse 3D model of the object, synthetic data can be generated in arbitrary quantities and used to supplement the real examples when training a detector. The key ingredient of our method is that the synthetically generated images must be as close as possible to the real ones, not in terms of image quality, but in terms of the features used by the machine learning algorithm. Third, although the aforementioned approach yields a substantial performance increase with AdaBoost and DPM detectors, it does not generalize well to Convolutional Neural Networks, which have become the state of the art. This happens because, as we add more and more synthetic data, the CNNs begin to overfit to the synthetic images at the expense of the real ones. We therefore propose a novel deep domain-adaptation technique that efficiently combines real and synthetic images without overfitting to either of the two. Most adaptation techniques aim at learning features that are invariant to the differences between images coming from different sources (real and synthetic); unlike those methods, we model this difference with a special two-stream architecture (a minimal sketch of this idea follows this record). We evaluate our approach on three different datasets and show its effectiveness for various classification and regression tasks.
Contents: Introduction
1- Flying Objects Detection
2- Synthetic Data Generation
3- Domain Adaptation for Deep Networks
4- Concluding Remarks
Record number: 25870
Authors' affiliation: non IGN
Theme: IMAGERY
Nature: Foreign thesis
Thesis note: Doctoral thesis: Sciences: Lausanne: Switzerland: 2017
Online: https://infoscience.epfl.ch/record/227934?ln=fr
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=95538
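The two-stream idea mentioned in the abstract above can be illustrated with a short, hedged PyTorch sketch. This is an assumption-laden toy, not the thesis architecture: the layer sizes, the function names make_stream and two_stream_loss, and the penalty weight lam are all illustrative. The point it demonstrates is that real and synthetic images pass through separate streams whose weights are kept close by a penalty, so the real/synthetic difference is modeled rather than forced to vanish.

```python
# Hedged two-stream domain-adaptation sketch; architecture is illustrative.
import torch
import torch.nn as nn

def make_stream():
    # Tiny stand-in CNN; the actual network used in the thesis differs.
    return nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(16, 2),  # flying object vs. background
    )

real_stream, synth_stream = make_stream(), make_stream()
criterion = nn.CrossEntropyLoss()

def two_stream_loss(real_x, real_y, synth_x, synth_y, lam=0.1):
    # Each stream is trained on its own domain...
    task = (criterion(real_stream(real_x), real_y)
            + criterion(synth_stream(synth_x), synth_y))
    # ...while a weight-discrepancy penalty keeps the streams related,
    # so abundant synthetic data cannot pull the model away from the
    # scarce real data.
    discrepancy = sum(((p - q) ** 2).sum()
                      for p, q in zip(real_stream.parameters(),
                                      synth_stream.parameters()))
    return task + lam * discrepancy
```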
Title: Une approche variationnelle pour la reconnaissance d'objets [A variational approach to object recognition]
Document type: Thesis/HDR
Authors: Pascal Fua, Author; Olivier Faugeras, Thesis supervisor
Publisher: Paris-Orsay: Université de Paris 11 Paris-Sud Centre d'Orsay
Publication year: 1989
Extent: 145 p.
Format: 21 x 30 cm
General note: bibliography
Doctoral thesis, Computer Science, Université de Paris 11 Paris-Sud Centre d'Orsay
Languages: French (fre)
Descriptor: [IGN subject headings] Image processing
[IGN terms] building
[IGN terms] image understanding
[IGN terms] aerial image
[IGN terms] implementation (computing)
[IGN terms] object-oriented language
[IGN terms] 3D object
[IGN terms] optimization (mathematics)
[IGN terms] object recognition
[IGN terms] pattern recognition
[IGN terms] road
[IGN terms] computer vision
Decimal index: THESE Theses and HDR
Abstract: (author) In this thesis, we propose a variational formulation of the object recognition problem that allows us, on the one hand, to unify the various elements of our approach within a single theoretical framework and, on the other hand, to develop computational methods that are practical for processing complex images. We describe objects in terms of a language that includes the photometric, geometric and semantic constraints to which these objects and their appearance in the image are subject. We define a statistical criterion that measures the quality of such a description; recognizing objects then amounts to finding the optimal description of the image in terms of our language (a minimal sketch of such an objective follows this record). We validated our approach on the recognition of roads and buildings in aerial images and implemented a system that successfully identifies the majority of target objects in difficult images. In the first chapter we introduce and motivate our approach. We then present articles that document its evolution. In the last chapter, we describe in detail our objective function and the optimization procedures we implemented.
Contents: Introduction
1- Resegmentation
2- Contour generation
3- Objective functions for object recognition
Conclusion
Record number: 21757
Authors' affiliation: non IGN
Theme: IMAGERY/COMPUTING
Nature: French thesis
Thesis note: Doctoral thesis: Computer Science: Paris 11: 1989
nature-HAL: Thesis
DOI: none
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=91127
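The abstract above describes recognition as finding the description that optimizes a criterion mixing photometric and geometric constraints. The following NumPy sketch is a loose, hypothetical illustration of that shape of objective, not the thesis implementation: a candidate road, represented as a polyline, is scored by image evidence along the curve plus a smoothness penalty, and recognition amounts to maximizing this score over candidate descriptions. The function name objective, the terms and the weights w_photo and w_geom are assumptions.

```python
# Hedged sketch of a variational recognition objective; illustrative only.
import numpy as np

def objective(polyline, image, w_photo=1.0, w_geom=0.5):
    """Score a candidate road description; recognition = maximizing this score."""
    pts = np.asarray(polyline, dtype=int)              # (N, 2) row/col vertices
    # Photometric term: image evidence sampled along the hypothesized curve.
    photometric = float(image[pts[:, 0], pts[:, 1]].mean())
    # Geometric term: penalize sharp bends via a discrete curvature proxy.
    seg = np.diff(pts, axis=0).astype(float)
    turn = np.diff(seg, axis=0)
    geometric = -float((turn ** 2).sum())
    return w_photo * photometric + w_geom * geometric
```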