IGN online catalogue

Author details

Author: Mathieu Aubry

Available documents written by this author (7)

Learnable Earth Parser: Discovering 3D Prototypes in Aerial Scans / Romain Loiseau (2023)

Public

Title: Learnable Earth Parser: Discovering 3D Prototypes in Aerial Scans
Document type: Article/Conference paper
Authors: Romain Loiseau, Author; Elliot Vincent, Author; Mathieu Aubry, Author; Loïc Landrieu, Author
Publisher: Ithaca [New York, United States]: arXiv - Cornell University
Publication year: 2023
Extent: 18 p.
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Laser scanning
[IGN terms] lidar data
[IGN terms] 3D georeferenced data
[IGN terms] complex information
[IGN terms] 3D scene
[IGN terms] point cloud
[IGN terms] urban area

Abstract: (author) We propose an unsupervised method for parsing large 3D scans of real-world scenes into interpretable parts. Our goal is to provide a practical tool for analyzing 3D scenes with unique characteristics in the context of aerial surveying and mapping, without relying on application-specific user annotations. Our approach is based on a probabilistic reconstruction model that decomposes an input 3D point cloud into a small set of learned prototypical shapes. Our model provides an interpretable reconstruction of complex scenes and leads to relevant instance and semantic segmentations. To demonstrate the usefulness of our results, we introduce a novel dataset of seven diverse aerial LiDAR scans. We show that our method outperforms state-of-the-art unsupervised methods in terms of decomposition accuracy while remaining visually interpretable. Our method offers significant advantage over existing approaches, as it does not require any manual annotations, making it a practical and efficient tool for 3D scene analysis. Our code and dataset are available at https://imagine.enpc.fr/~loiseaur/learnable-earth-parser
Record number: P2023-005
Author affiliation: UGE-LASTIG+Ext (2020- )
Theme: IMAGING/COMPUTER SCIENCE
Nature: Preprint
nature-HAL: Préprint
DOI: none
Online: https://hal.science/hal-04135416
Electronic resource format: article URL
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=103347

Deep learning based 3D reconstruction: supervision and representation / François Darmon (2022)

Public

Title: Deep learning based 3D reconstruction: supervision and representation
Document type: Thesis/HDR
Authors: François Darmon, Author; Pascal Monasse, Thesis supervisor; Mathieu Aubry, Thesis supervisor
Publisher: Champs-sur-Marne: Ecole des Ponts ParisTech
Publication year: 2022
Extent: 115 p.
Format: 21 x 30 cm
General note: Bibliography
Doctoral thesis, Ecole des Ponts ParisTech, specialty: computer science
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] image matching
[IGN terms] depth map
[IGN terms] convolutional neural network classification
[IGN terms] extraction
[IGN terms] epipolar geometry
[IGN terms] mesh
[IGN terms] stereoscopic model
[IGN terms] interest point
[IGN terms] RANSAC (algorithm)
[IGN terms] 3D reconstruction
[IGN terms] object reconstruction
[IGN terms] point cloud
[IGN terms] SIFT (algorithm)
[IGN terms] structure-from-motion
[IGN terms] voxel

Decimal index: THESE Theses and HDR
Abstract: (author) 3D reconstruction is a long standing problem in computer vision. Yet, state-of-the-art methods still struggle when the images used have large illumination changes, many occlusions or limited textures. Deep Learning holds promises of improving 3D reconstruction in such setups, but classical methods still produce the best results. In this thesis we analyse the specificity of deep learning applied to multiview 3D reconstruction and introduce new deep learning based methods. The first contribution of this thesis is an analysis of the possible supervision for training Deep Learning models for sparse image matching. We introduce a two-step algorithm that first computes low resolution matches using deep learning and then matches classical local features inside the matches regions. We analyze several levels of supervision and show that our new epipolar supervision leads to the best results. The second contribution is also a study of supervision for Deep Learning but applied to another scenario: calibrated 3D reconstruction in the wild. We show that existing unsupervised methods do not work on such data and we introduce a new training technique that solves this issue. We then exhaustively compare unsupervised approach and supervised approaches with different network architectures and training data. Finally, our third contribution is about data representation. Neural implicit representation were recently used for image rendering. We adapt this representation to the multiview reconstruction problem and we introduce a new method that, similar to classical 3D reconstruction techniques, optimizes photo-consistency between projections of multiple images. Our approach outperforms state-of-the-art by a large margin.
Contents note: 1- Introduction
2- Background
3- Deep learning for guiding keypoint matching
4- Deep Learning based Multi-View Stereo in the wild
5- Multi-view reconstruction with implicit surfaces and patch warping
6- Conclusion
Record number: 24085
Author affiliation: non-IGN
Theme: IMAGING/COMPUTER SCIENCE
Nature: French thesis
Thesis note: Doctoral thesis: Computer science: Ponts ParisTech: 2022
Host organisation: Laboratoire d'Informatique Gaspard-Monge LIGM
DOI: none
Online: https://www.theses.fr/2022ENPC0024
Electronic resource format: URL
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=102473

A model you can hear: Audio identification with playable prototypes / Romain Loiseau (2022)

Public

Title: A model you can hear: Audio identification with playable prototypes
Document type: Article/Conference paper
Authors: Romain Loiseau, Author; Baptiste Bouvier, Author; Yann Teytaut, Author; Elliot Vincent, Author; Mathieu Aubry, Author; Loïc Landrieu, Author
Publisher: Ithaca [New York, United States]: arXiv - Cornell University
Publication year: 2022
Projects: READY3D / Landrieu, Loïc
Conference: ISMIR 2022, International Society for Music Information Retrieval Conference, 04/12/2022 - 08/12/2022, Bengaluru, India
Languages: English (eng)
Descriptors: [IGN subject headings] Signal processing
[IGN terms] machine learning
[IGN terms] noise (hearing)
[IGN terms] acoustic wave
[IGN terms] prototype

Abstract: (author) Machine learning techniques have proved useful for classifying and analyzing audio content. However, recent methods typically rely on abstract and high-dimensional representations that are difficult to interpret. Inspired by transformation-invariant approaches developed for image and 3D data, we propose an audio identification model based on learnable spectral prototypes. Equipped with dedicated transformation networks, these prototypes can be used to cluster and classify input audio samples from large collections of sounds. Our model can be trained with or without supervision and reaches state-of-the-art results for speaker and instrument identification, while remaining easily interpretable. The code is available at https://github.com/romainloiseau/a-model-you-can-hear
Record number: P2022-006
Author affiliation: UGE-LASTIG+Ext (2020- )
Theme: COMPUTER SCIENCE
Nature: Preprint
nature-HAL: Préprint
DOI: 10.48550/arXiv.2208.03311
Online: https://doi.org/10.48550/arXiv.2208.03311
Electronic resource format: article URL
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=101330

Online segmentation of LiDAR sequences: dataset and algorithm / Romain Loiseau (2022)

Public

Title: Online segmentation of LiDAR sequences: dataset and algorithm
Document type: Article/Conference paper
Authors: Romain Loiseau, Author; Mathieu Aubry, Author; Loïc Landrieu, Author
Publisher: Berlin, Heidelberg, Vienna, New York, ...: Springer
Publication year: 2022
Projects: READY3D / Landrieu, Loïc
Conference: ECCV 2022, 17th European Conference on Computer Vision, 23/10/2022 - 27/10/2022, Tel Aviv, Israel, Proceedings, Springer
Extent: pp. 301-317
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Laser scanning
[IGN terms] lidar data
[IGN terms] 3D georeferenced data
[IGN terms] dataset
[IGN terms] segmentation

Abstract: (author) Roof-mounted spinning LiDAR sensors are widely used by autonomous vehicles. However, most semantic datasets and algorithms used for LiDAR sequence segmentation operate on 360° frames, causing an acquisition latency incompatible with real-time applications. To address this issue, we first introduce HelixNet, a 10 billion point dataset with fine-grained labels, timestamps, and sensor rotation information necessary to accurately assess the real-time readiness of segmentation algorithms. Second, we propose Helix4D, a compact and efficient spatio-temporal transformer architecture specifically designed for rotating LiDAR sequences. Helix4D operates on acquisition slices corresponding to a fraction of a full sensor rotation, significantly reducing the total latency. Helix4D reaches accuracy on par with the best segmentation algorithms on HelixNet and SemanticKITTI with a reduction of over 5× in terms of latency and 50× in model size. The code and data are available at: https://romainloiseau.fr/helixnet.
Record number: C2022-043
Author affiliation: UGE-LASTIG+Ext (2020- )
Other associated URL: to arXiv
Theme: IMAGING
Nature: Conference paper
nature-HAL: ComAvecCL&ActesPubliésIntl
DOI: 10.1007/978-3-031-19839-7_18
Online publication date: 23/10/2022
Online: https://doi.org/10.1007/978-3-031-19839-7_18
Electronic resource format: article URL
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=101905

Optimization of deep neural networks: A functional perspective with applications in image classification / Simon Roburin (2022)

Public

Title: Optimization of deep neural networks: A functional perspective with applications in image classification
Document type: Thesis/HDR
Authors: Simon Roburin, Author; Mathieu Aubry, Thesis supervisor
Publisher: Champs-sur-Marne: Ecole des Ponts ParisTech
Publication year: 2022
Extent: 141 p.
Format: 21 x 30 cm
General note: Bibliography
Doctoral thesis, Ecole des Ponts ParisTech, specialty: applied mathematics
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] cluster analysis
[IGN terms] deep learning
[IGN terms] dynamic clustering (nuées dynamiques)
[IGN terms] applied mathematics
[IGN terms] optimization (mathematics)
[IGN terms] computer vision

Decimal index: THESE Theses and HDR
Abstract: (author) Despite numerous successes in a wide range of industrial and scientific applications, the learning process of deep neural networks is poorly understood. Loosely speaking, learning aims at finding the network parameters that not only minimize the network errors on a set of training examples but also yield correct predictions on unseen data. Under the prism of optimization, it boils down to minimizing a high dimensional non-convex function. Generalization can generally be expected when one has access to very large datasets and assumes that both training examples and unseen data are sampled from identically independently distributed random variables. The goal of this thesis is to develop analytical tools to better understand neural network optimization and to improve the design of training algorithms in the context of image classification.
Contents note: 1- Introduction
2- Literature review
3- Impact of Normalization Layers on Optimization
4- Avoid learning spurious correlations
5- Conclusion
Record number: 24098
Author affiliation: non-IGN
Theme: IMAGING/COMPUTER SCIENCE
Nature: French thesis
Thesis note: Doctoral thesis: Applied mathematics: Ponts ParisTech: 2022
Host organisation: LIGM-IMAGINE
Online: https://hal.science/tel-03968114v1
Electronic resource format: URL
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=102573

Representing shape collections with alignment-aware linear models / Romain Loiseau (2021)

Permalink
Learning 3D generation and matching / Thibault Groueix (2020)

Permalink