Author details
Author: Lorenzo Bertoni
Available documents written by this author (1)
Title: Reshaping perception for autonomous driving with semantic keypoints
Document type: Thesis/HDR
Authors: Lorenzo Bertoni, Author
Publisher: Lausanne: Ecole Polytechnique Fédérale de Lausanne EPFL
Publication year: 2022
Extent: 177 p.
Format: 21 x 30 cm
General note: bibliography
Thesis presented for the degree of Docteur ès Sciences, Ecole Polytechnique Fédérale de Lausanne
Languages: English (eng)
Descriptors: [IGN subject headings] Artificial intelligence
[IGN terms] convolutional neural network classification
[IGN terms] automatic detection
[IGN terms] object detection
[IGN terms] pedestrian detection
[IGN terms] pose estimation
[IGN terms] autonomous navigation
[IGN terms] multi-agent system
[IGN terms] computer vision
Abstract: (author) The field of artificial intelligence is set to fuel the future of mobility by driving forward the transition from advanced driver-assist systems to fully autonomous vehicles (AV). Yet the current technology, backed by cutting-edge deep learning techniques, still leads to fatal accidents and does not convey trust. Current frameworks for 3D perception tasks, such as 3D object detection, are not adequate as they (i) do not generalize well to new scenarios, (ii) do not take into account measures of confidence in their predictions, and (iii) are not suitable for large-scale deployment as mainly based on costly LiDAR sensors. This doctoral thesis aims to study vision-based deep learning frameworks that can accurately perceive the world in 3D and generalize to new scenarios. We propose to escape the pixel domain using semantic keypoints, a sparse representation for every object in the scene containing meaningful information for 2D and 3D reasoning. The low-dimensionality enables downstream neural networks to focus on essential elements in the scene and improve their generalization capabilities. Furthermore, driven by the limitation of deep learning architectures outputting point estimates, we study how to estimate a confidence interval for each prediction. In particular, we emphasize vulnerable road users, such as pedestrians and cyclists, and explicitly address the long tail of 3D pedestrian detection to contribute to the safety of our roads. We further show the efficacy of our framework on multiple real-world domains by (a) integrating it in an existing AV pipeline, (b) detecting human-robot eye contact in real-world scenarios, and (c) helping verify the compliance of safety measures in the case of the COVID-19 outbreak. Finally, we publicly release the source code of all our projects and develop a unified library to contribute to an open science mission.
Contents note: 1- Introduction
2- Semantic keypoints detection
3- Monocular 3D pedestrian localization and uncertainty estimation
4- Tackling the long tail of 3D pedestrian localization with stereo cameras
5- Autonomous driving applications of pedestrian 3D detection
6- Detecting pedestrians attention: Human-robot eye contact in the wild
7- Beyond autonomous driving: Social interactions and social distancing
8- Conclusion
Record number: 24077
Author affiliation: non-IGN
Theme: COMPUTER SCIENCE
Nature: Foreign thesis
Thesis note: PhD Thesis: Sciences: EPFL: 2022
DOI: 10.5075/epfl-thesis-10072
Online: https://doi.org/10.5075/epfl-thesis-10072
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=102212