Descripteur
Documents disponibles dans cette catégorie (1844)
Contribution to object extraction in cartography : A novel deep learning-based solution to recognise, segment and post-process the road transport network as a continuous geospatial element in high-resolution aerial orthoimagery / Calimanut-Ionut Cira (2022)
Titre : Contribution to object extraction in cartography : A novel deep learning-based solution to recognise, segment and post-process the road transport network as a continuous geospatial element in high-resolution aerial orthoimagery Type de document : Thèse/HDR Auteurs : Calimanut-Ionut Cira, Auteur Editeur : Madrid [Espagne] : Universidad politécnica de Madrid Année de publication : 2022 Importance : 227 p. Format : 21 x 30 cm Note générale : bibliographie
Thèse de Doctorat en Topographie, Géodésie et cartographie, Universidad politécnica de Madrid
Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] analyse d'image orientée objet
[Termes IGN] classification par réseau neuronal convolutif
[Termes IGN] extraction du réseau routier
[Termes IGN] image aérienne
[Termes IGN] orthoimage
[Termes IGN] réseau antagoniste génératif
[Termes IGN] réseau neuronal artificiel
[Termes IGN] route
[Termes IGN] segmentation sémantique
Index. décimale : THESE Thèses et HDR Résumé : (auteur) Remote sensing imagery combined with deep learning strategies is often regarded as an ideal solution for interpreting scenes and monitoring infrastructures with remarkable performance levels. Remote sensing experts have been actively using deep neural networks to solve object extraction tasks in high-resolution aerial imagery by means of supervised operations. However, the extraction operation is imperfect, due to the nature of remotely sensed images (noise, obstructions, etc.), the limitations of sensing resolution, or the occlusions often present in the scenes. The road network plays an important part in transportation and, nowadays, one of the main related challenges is keeping the existing cartographic support up to date. This task can be considered very challenging due to the complex nature of the geospatial object (continuous, with irregular geometry, and significant differences in width). We also need to take into account that secondary roads represent the largest part of the road transport network, but due to the absence of clearly defined edges and the different spectral signatures of the materials used for pavement, monitoring and mapping them represents a great effort for public administrations, and their extraction is often omitted altogether. We believe that recent advancements in machine vision can enable a successful extraction of road structures from high-resolution, remotely sensed imagery and a greater automation of the road mapping operation. In this PhD thesis, we leverage recent computer vision advances and propose a deep learning-based end-to-end solution capable of efficiently extracting the surface area of roads at a large scale.
The novel approach is based on a disjoint execution of three different image processing operations (recognition, semantic segmentation, and post-processing with conditional generative learning) within a common framework. We focused on improving the state-of-the-art results for each of the mentioned components before incorporating the resulting models into the proposed solution architecture. For the recognition operation, we proposed two framework candidates based on convolutional neural networks to classify roads in openly available aerial orthoimages divided into tiles of 256×256 pixels, with a spatial resolution of 0.5 m. The frameworks are based on ensemble learning and transfer learning and combine weak classifiers to leverage the strengths of different state-of-the-art models that we heavily modified for computational efficiency. We evaluated their performance on unseen test data and compared the results with those obtained by state-of-the-art convolutional neural networks trained for the same task, observing improvements in performance metrics of 2-3%. Secondly, we implemented hybrid semantic segmentation models (where the default backbones are replaced by neural networks specialised in image segmentation) and trained them with high-resolution remote sensing imagery and the corresponding ground-truth masks. Our models achieved mean increases in performance metrics of 2.7-3.5% when compared to the original state-of-the-art semantic segmentation architectures trained from scratch for the same task. The best-performing model was integrated into a web platform that handles the evaluation of large areas, the association of the semantic predictions with geographical coordinates, the conversion of the tiles’ format, and the generation of GeoTIFF results (compatible with geospatial databases).
Thirdly, the road surface area extraction task is generally carried out via semantic segmentation over remotely sensed imagery; however, this supervised learning task can be considered very costly because it requires remote sensing images labelled at pixel level, and the results are not always satisfactory (presence of discontinuities, overlooked connection points, or isolated road segments). We consider that unsupervised learning (not requiring labelled data) can be employed for post-processing the geometries of geospatial objects extracted via semantic segmentation. For this reason, we also approached the post-processing of the road surface areas obtained with the best-performing segmentation model to improve the initial segmentation predictions. Along these lines, we proposed two post-processing operations based on conditional generative learning for deep inpainting and image-to-image translation and trained the networks to learn the distribution of the road network present in official cartography, using a novel dataset covering representative areas of Spain. The first proposed conditional Generative Adversarial Network (cGAN) model was trained for the deep inpainting operation and obtained improvements in performance metrics of up to 1.3%. The second cGAN model, trained for image-to-image translation, is based on a popular model heavily modified for computational efficiency (a 92.4% decrease in the number of parameters in the generator network and a 61.3% decrease in the discriminator network) and achieved a maximum increase of 11.6% in performance metrics. We also conducted a qualitative comparison to visually assess the effectiveness of the generative operations and observed great improvements with respect to the initial semantic segmentation predictions.
Lastly, we proposed an end-to-end processing strategy that combines image classification, semantic segmentation, and post-processing operations to extract the road surface area from high-resolution aerial orthophotography. The training of the model components was carried out on a large-scale dataset containing more than 537,500 tiles, covering approximately 20,800 km2 of the Spanish territory, manually tagged at pixel level. The consecutive execution of the resulting deep learning models delivered higher quality results when compared to state-of-the-art implementations trained for the same task. The versatility and flexibility of the solution, given by the disjoint execution of the three separate sub-operations, proved its effectiveness and economic efficiency, and enables the integration of a web application that alleviates the manipulation of geospatial data while allowing for an easy integration of future models and algorithms. In summary, applying the models resulting from this PhD thesis translates into operations that check whether the latest available aerial orthoimages contain the studied continuous geospatial element, obtain an approximation of its surface area using supervised learning, and improve the initial segmentation results with post-processing methods based on conditional generative learning. The results obtained with the proposed end-to-end solution presented in this PhD thesis improve the state of the art in the field of road extraction with deep learning techniques and prove the appropriateness of the proposed extraction workflow for a more robust and more efficient extraction of the road transport network.
We strongly believe that the processing strategy can be applied to enhance other similar extraction tasks of continuous geospatial elements (such as the mapping of riverbeds, or railroads), or serve as a base for developing additional extraction workflows of geospatial objects from remote sensing images. Note de contenu : 1- Introduction
2- Methodology
3- Theoretical framework
4- Literature review
5- Road recognition: A framework based on ensembles of convolutional neural networks and transfer learning to recognise road elements
6- Road segmentation: An approach based on hybrid semantic segmentation models to extract the surface area of road elements from aerial orthoimagery
7- Post-processing of semantic segmentation predictions I: A conditional generative adversarial network to improve the extraction of road surface areas via deep inpainting operations
8- Post-processing of semantic segmentation predictions II: A lightweight conditional generative adversarial network to improve the extraction of road surface areas via image-to-image translation
9- An end-to-end road extraction solution based on recognition, segmentation, and post-processing operations for a large-scale mapping of the road transport network from aerial orthophotography
10- Conclusions
Numéro de notice : 24069 Affiliation des auteurs : non IGN Thématique : IMAGERIE Nature : Thèse étrangère Note de thèse : Thèse de Doctorat : Topographie, Géodésie et cartographie : Universidad politécnica de Madrid : 2022 DOI : 10.20868/UPM.thesis.70152 En ligne : https://doi.org/10.20868/UPM.thesis.70152 Format de la ressource électronique : URL Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=102113
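As a rough illustration of the ensemble-learning recognition stage described in the abstract above, the sketch below fuses several weak road/no-road tile classifiers by soft voting (averaging class probabilities). The classifiers here are hypothetical stand-ins, not the thesis's CNN models:

```python
def ensemble_predict(tile, classifiers, weights=None):
    """Soft-voting ensemble: average the per-class probabilities of
    several weak classifiers and return (argmax label, fused probs)."""
    probs = [clf(tile) for clf in classifiers]
    if weights is None:
        weights = [1.0 / len(classifiers)] * len(classifiers)
    n_classes = len(probs[0])
    fused = [sum(w * p[c] for w, p in zip(weights, probs))
             for c in range(n_classes)]
    return fused.index(max(fused)), fused

# Three mock classifiers returning (P(no-road), P(road)) for a tile.
clf_a = lambda t: (0.30, 0.70)
clf_b = lambda t: (0.45, 0.55)
clf_c = lambda t: (0.20, 0.80)

label, fused = ensemble_predict(None, [clf_a, clf_b, clf_c])
# label == 1 ("road"); fused[1] is the mean of 0.70, 0.55 and 0.80
```

Weighted voting (e.g., weighting each member by its validation accuracy) drops in by passing an explicit `weights` list.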
Titre : Cross-dataset learning for generalizable land use scene classification Type de document : Article/Communication Auteurs : Dimitri Gominski , Auteur ; Valérie Gouet-Brunet , Auteur ; Liming Chen, Auteur Editeur : New York : Institute of Electrical and Electronics Engineers IEEE Année de publication : 2022 Projets : Alegoria / Gouet-Brunet, Valérie Conférence : EarthVision 2022, Large Scale Computer Vision for Remote Sensing Imagery, workshop joint to CVPR 2022 19/06/2022 24/06/2022 New Orleans Louisiane - Etats-Unis OA Proceedings Importance : pp 1382 - 1391 Note générale : bibliographie
in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2022, pp. 1382-1391
Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] cadre conceptuel
[Termes IGN] descripteur
[Termes IGN] données d'entrainement (apprentissage automatique)
[Termes IGN] intelligence artificielle
[Termes IGN] scène urbaine
[Termes IGN] segmentation sémantique
[Termes IGN] utilisation du sol
Résumé : (auteur) Few-shot and cross-domain land use scene classification methods propose solutions to classify unseen classes or unseen visual distributions, but are hardly applicable to real-world situations due to restrictive assumptions. Few-shot methods involve episodic training on restrictive training subsets with small feature extractors, while cross-domain methods are only applied to common classes. The underlying challenge remains open: can we accurately classify new scenes on new datasets? In this paper, we propose a new framework for few-shot, cross-domain classification. Our retrieval-inspired approach exploits the interrelations in both the training and testing data to output class labels using compact descriptors. Results show that our method can accurately produce land-use predictions on unseen datasets and unseen classes, going beyond the traditional few-shot or cross-domain formulation, and allowing cross-dataset training. Numéro de notice : C2022-031 Affiliation des auteurs : UGE-LASTIG+Ext (2020- ) Autre URL associée : vers IEEE Thématique : IMAGERIE/INFORMATIQUE Nature : Communication nature-HAL : ComAvecCL&ActesPubliésIntl DOI : 10.1109/CVPRW56347.2022.00144 En ligne : https://openaccess.thecvf.com/content/CVPR2022W/EarthVision/papers/Gominski_Cros [...] Format de la ressource électronique : URL article Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=101087
DART: An efficient 3D Monte Carlo vector radiative transfer model for remote sensing applications / Yingjie Wang (2022)
Titre : DART: An efficient 3D Monte Carlo vector radiative transfer model for remote sensing applications Titre original : Modélisation 3D du transfert radiatif avec polarisation pour l'étude des surfaces terrestres par télédétection Type de document : Thèse/HDR Auteurs : Yingjie Wang, Auteur ; Jean-Philippe Gastellu-Etchegorry, Directeur de thèse ; A. Deschamps, Directeur de thèse Editeur : Toulouse : Université de Toulouse Année de publication : 2022 Importance : 248 p. Format : 21 x 30 cm Note générale : Bibliographie
Thèse en vue de l'obtention du Doctorat de l'Université de Toulouse, spécialité Surfaces et interfaces continentales, hydrologie
Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] distribution du coefficient de réflexion bidirectionnelle BRDF
[Termes IGN] méthode de Monte-Carlo
[Termes IGN] modèle de transfert radiatif
[Termes IGN] modélisation 3D
[Termes IGN] polarisation
[Termes IGN] radiance
Index. décimale : THESE Thèses et HDR Résumé : (auteur) Accurate understanding of land surface functioning, such as the energy budget, carbon and water cycles, and ecosystem dynamics, is essential to better interpret, predict and mitigate the impact of the expected global changes. It thus requires observing our planet at different spatial and temporal scales, which only remote sensing (RS) can achieve because of its ability to provide systematic and synoptic radiometric observations. These observations can be transformed into surface parameters (e.g., temperature, vegetation biomass, etc.) used as input in process models (e.g., evapotranspiration) or be assimilated in the latter. Understanding the radiation interactions in the land surface and atmosphere is essential in two respects: interpreting RS signals as information about the observed land surfaces, and modelling the land surface functioning processes in which radiation participates. This explains the development of radiative transfer models (RTMs) that simulate the radiative budget and RS observations. The initial 3D RTMs in the 1980s simulated basic radiation mechanisms in very schematic representations of land surfaces (e.g., turbid medium, geometric primitives). Since then, their accuracy and performance have been greatly improved to address the increasing need for accurate information about land surfaces as well as advances in RS instruments. So far, two types of improvements are still needed: 1. More accurate and efficient radiative transfer (RT) modelling (e.g., polarization, specular reflection, atmospheric scattering and emission, etc.) 2. Representation of land surfaces at different degrees of realism and spatial scales. DART is one of the most accurate and comprehensive 3D RTMs (dart.omp.eu). It simulates the radiative budget and RS observations of urban and natural landscapes, with topography and atmosphere, from the ultraviolet to the thermal infrared domains.
Its initial version, DART-FT, in 1992, used the discrete ordinates method to iteratively track the radiation along a finite number of discrete directions in voxelized representations of the landscapes. It has been validated against other RTMs, as well as RS and field measurements. However, it cannot simulate RS observations with the presently needed precision because of its voxelized representation of landscapes and the absence of some physical mechanisms (e.g., polarization). During this thesis, in collaboration with the DART team, I developed in DART a new Monte Carlo vector RT mode called DART-Lux that takes full advantage of the latest advances in RT modelling, especially in computer graphics. The central idea is to recast the radiative transfer problem as a multi-dimensional integral problem and solve it with the Monte Carlo method, which is considerably efficient and accurate in computing multi-dimensional integrals, including complex mechanisms (e.g., polarization) in realistic representations of 3D landscapes. For that, I implemented the bidirectional path tracing algorithm, which generates a group of "source-sensor" paths by connecting two sub-paths, one generated starting from the light source and the other starting from the sensor. Then, the contribution of these paths to the integral is estimated by multiple importance sampling. This method makes it possible to accurately and efficiently simulate polarimetric RS observations of kilometre-scale realistic landscapes coupled with a plane-parallel atmosphere, with consideration of anisotropic scattering, thermal emission, and solar-induced fluorescence. Compared to DART-FT, DART-Lux improves computational efficiency (i.e., computation time and memory) usually by a factor of more than 100 for large-scale and complex landscapes.
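The multiple importance sampling step mentioned above combines estimates drawn from several sampling strategies. A minimal one-dimensional sketch of the balance heuristic, using toy sampling strategies for the integral of x² over [0, 1] (an illustrative assumption, not DART-Lux's actual implementation), is:

```python
import random

def balance_weight(pdfs, i, x):
    """Balance heuristic: w_i(x) = p_i(x) / sum_k p_k(x)."""
    return pdfs[i](x) / sum(p(x) for p in pdfs)

def mis_estimate(f, samplers, pdfs, n=20000, seed=0):
    """Multiple-importance-sampling estimate of the integral of f over
    [0, 1], drawing n samples from each strategy."""
    rng = random.Random(seed)
    total = 0.0
    for i, sample in enumerate(samplers):
        for _ in range(n):
            x = sample(rng)
            total += balance_weight(pdfs, i, x) * f(x) / pdfs[i](x)
    return total / n

f = lambda x: x * x                              # integrand
samplers = [lambda rng: rng.random(),            # uniform on [0, 1]
            lambda rng: rng.random() ** 0.5]     # inverse-CDF sampling of p(x) = 2x
pdfs = [lambda x: 1.0, lambda x: 2.0 * x]

est = mis_estimate(f, samplers, pdfs)
# est ≈ 1/3, the exact value of the integral
```

The weights sum to one at every point, so the combined estimator stays unbiased while favouring whichever strategy has the higher density there.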
It provides new perspectives for studying the land surface functioning and also for preparing Earth observation satellite missions such as the missions TRISHNA (CNES and ISRO), LSTM and next generation Sentinel-2 (ESA), and CHANGE (NASA). Note de contenu : General introduction
1- Radiometry and radiative transfer
2- Numerical models for radiative transfer
3- DART-Lux: theory and implementation
4- Modelling of atmospheric effects
5- Modelling of polarization
Conclusion and perspectives
Numéro de notice : 24106 Affiliation des auteurs : non IGN Thématique : IMAGERIE Nature : Thèse française Organisme de stage : CESBIO DOI : sans En ligne : https://www.theses.fr/2022TOU30173 Format de la ressource électronique : URL Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=103060
Deep learning based 2D and 3D object detection and tracking on monocular video in the context of autonomous vehicles / Zhujun Xu (2022)
Titre : Deep learning based 2D and 3D object detection and tracking on monocular video in the context of autonomous vehicles Type de document : Thèse/HDR Auteurs : Zhujun Xu, Auteur ; Eric Chaumette, Directeur de thèse ; Damien Vivet, Directeur de thèse Editeur : Toulouse : Université de Toulouse Année de publication : 2022 Importance : 136 p. Format : 21 x 30 cm Note générale : bibliographie
Thèse en vue de l'obtention du Doctorat de l'Université de Toulouse, spécialité Informatique et Télécommunications
Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] apprentissage profond
[Termes IGN] apprentissage semi-dirigé
[Termes IGN] architecture de réseau
[Termes IGN] détection d'objet
[Termes IGN] échantillonnage de données
[Termes IGN] objet 3D
[Termes IGN] segmentation d'image
[Termes IGN] véhicule automobile
[Termes IGN] vidéo
[Termes IGN] vision par ordinateur
Index. décimale : THESE Thèses et HDR Résumé : (auteur) The objective of this thesis is to develop deep learning-based 2D and 3D object detection and tracking methods on monocular video and apply them to the context of autonomous vehicles. Actually, when directly using still-image detectors to process a video stream, the accuracy suffers from sampled image quality problems. Moreover, generating 3D annotations is time-consuming and expensive due to the data fusion and large numbers of frames. We therefore take advantage of the temporal information in videos, such as object consistency, to improve the performance. The methods should not introduce too much extra computational burden, since the autonomous vehicle demands real-time performance. Multiple methods can be involved in different steps, for example, data preparation, network architecture and post-processing. First, we propose a post-processing method called heatmap propagation based on a one-stage detector, CenterNet, for video object detection. Our method propagates the previous reliable long-term detection in the form of a heatmap to the upcoming frame. Then, to distinguish different objects of the same class, we propose a frame-to-frame network architecture for video instance segmentation using instance sequence queries. The tracking of instances is achieved without extra post-processing for data association. Finally, we propose a semi-supervised learning method to generate 3D annotations for a 2D video object tracking dataset. This helps to enrich the training process for 3D object detection. Each of the three methods can be individually applied to extend still-image detectors to video applications. We also propose two complete network structures to solve 2D and 3D object detection and tracking on monocular video. Note de contenu : 1- Introduction
2- Video object detection with heatmap propagation
3- Video instance segmentation with instance sequence queries
4- Semi-supervised learning of monocular 3D object detection with 2D video tracking annotations
5- Conclusions and perspectives
Numéro de notice : 24072 Affiliation des auteurs : non IGN Thématique : IMAGERIE Nature : Thèse française Note de thèse : Thèse de Doctorat : Informatique et Télécommunications : Toulouse : 2022 DOI : sans En ligne : https://www.theses.fr/2022ESAE0019 Format de la ressource électronique : URL Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=102136
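The heatmap propagation idea summarised in the abstract above (carrying a previous frame's reliable detections into the next frame) can be sketched very roughly as follows; the decay factor and max-fusion rule are illustrative assumptions, not the method's actual formulation:

```python
def propagate_heatmap(prev_heat, curr_heat, decay=0.8):
    """Propagate the previous frame's detection heatmap into the current
    frame: attenuate past confidence by `decay`, then keep the
    element-wise maximum with the current frame's raw heatmap."""
    return [[max(decay * p, c) for p, c in zip(prow, crow)]
            for prow, crow in zip(prev_heat, curr_heat)]

prev = [[0.0, 0.9], [0.5, 0.0]]   # strong detection at (0, 1) last frame
curr = [[0.0, 0.3], [0.6, 0.0]]   # only a weak response there this frame
fused = propagate_heatmap(prev, curr)
# fused[0][1] = max(0.8 * 0.9, 0.3) = 0.72: the propagated peak survives
```

This keeps a temporarily degraded detection (e.g., a briefly blurred or occluded car) alive for a few frames, while the decay lets stale detections fade away.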
Titre : Deep learning based 3D reconstruction: supervision and representation Type de document : Thèse/HDR Auteurs : François Darmon, Auteur ; Pascal Monasse, Directeur de thèse ; Mathieu Aubry, Directeur de thèse Editeur : Champs-sur-Marne : Ecole des Ponts ParisTech Année de publication : 2022 Importance : 115 p. Format : 21 x 30 cm Note générale : Bibliographie
Thèse de doctorat de l'Ecole des Ponts ParisTech, spécialité informatique
Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] appariement d'images
[Termes IGN] carte de profondeur
[Termes IGN] classification par réseau neuronal convolutif
[Termes IGN] extraction
[Termes IGN] géométrie épipolaire
[Termes IGN] maillage
[Termes IGN] modèle stéréoscopique
[Termes IGN] point d'intérêt
[Termes IGN] Ransac (algorithme)
[Termes IGN] reconstruction 3D
[Termes IGN] reconstruction d'objet
[Termes IGN] semis de points
[Termes IGN] SIFT (algorithme)
[Termes IGN] structure-from-motion
[Termes IGN] voxel
Index. décimale : THESE Thèses et HDR Résumé : (auteur) 3D reconstruction is a long-standing problem in computer vision. Yet, state-of-the-art methods still struggle when the images used have large illumination changes, many occlusions or limited textures. Deep learning holds promise for improving 3D reconstruction in such setups, but classical methods still produce the best results. In this thesis we analyse the specificity of deep learning applied to multiview 3D reconstruction and introduce new deep learning-based methods. The first contribution of this thesis is an analysis of the possible supervision for training deep learning models for sparse image matching. We introduce a two-step algorithm that first computes low-resolution matches using deep learning and then matches classical local features inside the matched regions. We analyze several levels of supervision and show that our new epipolar supervision leads to the best results. The second contribution is also a study of supervision for deep learning, but applied to another scenario: calibrated 3D reconstruction in the wild. We show that existing unsupervised methods do not work on such data and we introduce a new training technique that solves this issue. We then exhaustively compare unsupervised and supervised approaches with different network architectures and training data. Finally, our third contribution is about data representation. Neural implicit representations were recently used for image rendering. We adapt this representation to the multiview reconstruction problem and introduce a new method that, similarly to classical 3D reconstruction techniques, optimizes photo-consistency between projections of multiple images. Our approach outperforms the state of the art by a large margin. Note de contenu : 1- Introduction
2- Background
3- Deep learning for guiding keypoint matching
4- Deep Learning based Multi-View Stereo in the wild
5- Multi-view reconstruction with implicit surfaces and patch warping
6- Conclusion
Numéro de notice : 24085 Affiliation des auteurs : non IGN Thématique : IMAGERIE/INFORMATIQUE Nature : Thèse française Note de thèse : Thèse de Doctorat : Informatique : Ponts ParisTech : 2022 Organisme de stage : Laboratoire d'Informatique Gaspard-Monge LIGM DOI : sans En ligne : https://www.theses.fr/2022ENPC0024 Format de la ressource électronique : URL Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=102473
Detection of windthrown tree stems on UAV-orthomosaics using U-Net convolutional networks / Stefan Reder in Remote sensing, vol 14 n° 1 (January-1 2022)
Development of object detectors for satellite images by deep learning / Alissa Kouraeva (2022)
Effective triplet mining improves training of multi-scale pooled CNN for image retrieval / Federico Vaccaro in Machine Vision and Applications, vol 33 n° 1 (January 2022)
Éléments pour l'analyse et le traitement d'images : application à l'estimation de la qualité du bois / Rémy Decelle (2022)
Exploring data fusion for multi-object detection for intelligent transportation systems using deep learning / Amira Mimouna (2022)
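The photo-consistency criterion mentioned in the 3D reconstruction thesis above is commonly scored with zero-mean normalized cross-correlation (NCC) between corresponding image patches; the sketch below is a generic stand-in for that family of measures, not the thesis's implementation:

```python
import math

def ncc(patch_a, patch_b):
    """Zero-mean normalized cross-correlation between two flattened
    intensity patches: +1 means perfectly photo-consistent (up to gain
    and offset), values near 0 or below indicate a mismatch."""
    ma = sum(patch_a) / len(patch_a)
    mb = sum(patch_b) / len(patch_b)
    da = [a - ma for a in patch_a]
    db = [b - mb for b in patch_b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den if den > 0 else 0.0

same = ncc([10, 20, 30, 40], [110, 120, 130, 140])  # same gradient, brighter exposure
diff = ncc([10, 20, 30, 40], [40, 10, 35, 15])      # unrelated pattern
# same == 1.0: a constant brightness offset does not hurt photo-consistency
```

Because the patches are mean-centred and normalized, the score is invariant to the affine illumination changes that plague in-the-wild image collections.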