Descripteur
Termes IGN > mathématiques > statistique mathématique > analyse de données > segmentation > segmentation sémantique
segmentation sémantique. Synonyme(s) : étiquetage sémantique ; étiquetage de données
Documents disponibles dans cette catégorie (204)
Exploration of reinforcement learning algorithms for autonomous vehicle visual perception and control / Florence Carton (2021)
Titre : Exploration of reinforcement learning algorithms for autonomous vehicle visual perception and control Titre original : Exploration des algorithmes d'apprentissage par renforcement pour la perception et le contrôle d'un véhicule autonome par vision Type de document : Thèse/HDR Auteurs : Florence Carton, Auteur ; David Filliat, Directeur de thèse Editeur : Paris : Ecole Nationale Supérieure des Techniques Avancées ENSTA Année de publication : 2021 Importance : 173 p. Format : 21 x 30 cm Note générale : bibliographie
Thèse de Doctorat de l’Institut Polytechnique de Paris, Spécialité : Informatique, Données, IA. Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Intelligence artificielle
[Termes IGN] apprentissage par renforcement
[Termes IGN] classification dirigée
[Termes IGN] instrument embarqué
[Termes IGN] navigation autonome
[Termes IGN] reconnaissance de formes
[Termes IGN] réseau neuronal profond
[Termes IGN] robot mobile
[Termes IGN] segmentation sémantique
[Termes IGN] vision par ordinateur
Index. décimale : THESE Thèses et HDR Résumé : (auteur) Reinforcement learning is an approach to solving sequential decision-making problems. In this formalism, an autonomous agent interacts with an environment and receives rewards based on the decisions it makes. The goal of the agent is to maximize the total amount of reward it receives. In the reinforcement learning paradigm, the agent learns by trial and error the policy (sequence of actions) that yields the best rewards. In this thesis, we focus on its application to the perception and control of an autonomous vehicle. To stay close to human driving, only the onboard camera is used as an input sensor. We focus in particular on end-to-end training, i.e. a direct mapping between information from the environment and the action chosen by the agent. However, training end-to-end reinforcement learning for autonomous driving poses some challenges: the large dimensions of the state and action spaces, as well as the instability and weakness of the reinforcement learning signal used to train deep neural networks. The approaches we implemented are based on the use of semantic information (image segmentation); in particular, this work explores the joint training of semantic segmentation and navigation. We show that these methods are promising and make it possible to overcome some limitations. On the one hand, combining supervised learning of segmentation with reinforcement learning of navigation improves the performance of the agent and its ability to generalize to an unknown environment. On the other hand, it makes it possible to train an agent that is more robust to unexpected events and able to make risk-limiting decisions. Experiments are conducted in simulation, and numerous comparisons with state-of-the-art methods are made. Note de contenu : 1- Introduction
2- Supervised learning and reinforcement learning background
3- State of the art
4- End-to-end autonomous driving on circuit with reinforcement learning
5- From lane following to robust conditional driving
6- Exploration of methods to reduce overfit
7- Conclusion
Numéro de notice : 28325 Affiliation des auteurs : non IGN Thématique : INFORMATIQUE Nature : Thèse étrangère Note de thèse : Thèse de Doctorat : Informatique, Données, IA : ENSTA : 2021 DOI : sans En ligne : https://tel.hal.science/tel-03273748/ Format de la ressource électronique : URL Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=98363
FuNet: A novel road extraction network with fusion of location data and remote sensing imagery / Kai Zhou in ISPRS International journal of geo-information, vol 10 n° 1 (January 2021)
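The trial-and-error paradigm described in the abstract above can be sketched in a few lines. The following is an illustration only (a tabular Q-learning toy, NOT the thesis's deep, camera-based setup): an agent in a 1-D corridor of five states learns from reward feedback that stepping right reaches the goal.

```python
import random

N_STATES = 5           # states 0..4; state 4 is terminal and rewarded
ACTIONS = [1, -1]      # step right or left

def train(episodes=300, alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while s != N_STATES - 1:
            # epsilon-greedy action selection: explore with probability eps
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: q[(s, act)])
            s2 = min(max(s + a, 0), N_STATES - 1)
            r = 1.0 if s2 == N_STATES - 1 else 0.0
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            best_next = max(q[(s2, b)] for b in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
    return q

q = train()
# The learned greedy policy steps right (+1) in every non-terminal state.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES - 1)}
```

The thesis replaces the table with a deep network and the corridor with a driving simulator, but the update rule above is the same reinforcement signal whose weakness and instability the abstract discusses.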
[article]
Titre : FuNet: A novel road extraction network with fusion of location data and remote sensing imagery Type de document : Article/Communication Auteurs : Kai Zhou, Auteur ; Yan Xie, Auteur ; Zhan Gao, Auteur ; et al., Auteur Année de publication : 2021 Article en page(s) : n° 10 Note générale : bibliographie Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] amélioration du contraste
[Termes IGN] apprentissage profond
[Termes IGN] classification par réseau neuronal convolutif
[Termes IGN] connexité (topologie)
[Termes IGN] extraction du réseau routier
[Termes IGN] fusion d'images
[Termes IGN] itération
[Termes IGN] Pékin (Chine)
[Termes IGN] segmentation sémantique
Résumé : (auteur) Road semantic segmentation is unique and difficult. Road extraction from remote sensing imagery often produces fragmented road segments, leading to road-network disconnection due to occlusion by trees, buildings, shadows, clouds, etc. In this paper, we propose a novel fusion network (FuNet) that fuses remote sensing imagery with location data, in which location data plays an important role in road-connectivity reasoning. A universal iteration reinforcement (IteR) module is embedded into FuNet to enhance the network's learning ability. We designed the IteR formula to repeatedly integrate original information with prediction information, and designed a reinforcement loss function to control the accuracy of the road prediction output. Another contribution of this paper is the use of histogram-equalization pre-processing to enhance image contrast, which improves accuracy by nearly 1%. We take D-LinkNet as the backbone network and design experiments on an open dataset. The results show that our method improves over the compared advanced road extraction methods: it not only increases the accuracy of road extraction but also improves road topological connectivity. Numéro de notice : A2021-147 Affiliation des auteurs : non IGN Thématique : IMAGERIE Nature : Article nature-HAL : ArtAvecCL-RevueIntern DOI : 10.3390/ijgi10010039 Date de publication en ligne : 19/01/2021 En ligne : https://doi.org/10.3390/ijgi10010039 Format de la ressource électronique : url article Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=97055
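The histogram-equalization pre-processing step mentioned in this abstract is the textbook transform sketched below on a single-channel image stored as a list of rows; this is the classic formula, not necessarily FuNet's exact implementation.

```python
def equalize(image, levels=256):
    """Remap grey levels so the cumulative histogram becomes ~linear."""
    pixels = [p for row in image for p in row]
    n = len(pixels)
    # histogram of grey levels
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # cumulative distribution function
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    # classic equalization lookup table
    lut = [round((cdf[g] - cdf_min) / max(n - cdf_min, 1) * (levels - 1))
           for g in range(levels)]
    return [[lut[p] for p in row] for row in image]

# A low-contrast image (values 100..103) is stretched across 0..255.
out = equalize([[100, 100, 101, 101], [102, 102, 103, 103]])
# out == [[0, 0, 85, 85], [170, 170, 255, 255]]
```

Stretching the grey-level range this way is what gives the contrast enhancement the authors credit with the roughly 1% accuracy gain.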
in ISPRS International journal of geo-information > vol 10 n° 1 (January 2021) . - n° 10 [article]
Geometric and semantic joint approach for the reconstruction of digital models of buildings / Pierre-Alain Langlois (2021)
Titre : Geometric and semantic joint approach for the reconstruction of digital models of buildings Type de document : Thèse/HDR Auteurs : Pierre-Alain Langlois, Auteur ; Renaud Marlet, Directeur de thèse ; Alexandre Boulch, Directeur de thèse Editeur : Champs-sur-Marne : Ecole des Ponts ParisTech Année de publication : 2021 Importance : 131 p. Format : 21 x 30 cm Note générale : Bibliographie
Thèse de doctorat de l’Ecole des Ponts ParisTech, Spécialité Informatique. Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Applications photogrammétriques
[Termes IGN] détection du bâti
[Termes IGN] jeu de données localisées
[Termes IGN] modélisation 3D du bâti BIM
[Termes IGN] reconnaissance de surface
[Termes IGN] reconstruction 3D du bâti
[Termes IGN] reconstruction d'objet
[Termes IGN] segmentation sémantique
[Termes IGN] semis de points
[Termes IGN] texture d'image
Index. décimale : THESE Thèses et HDR Résumé : (Auteur) The advent of Building Information Models (BIM) in the field of construction and city management is revolutionizing the way we design, build, operate and maintain our buildings. BIM models include not only the geometric aspect of buildings but also semantic information identifying their logical components (walls, slabs, windows, doors, etc.). While this information is fairly reasonable to create during building design, only 1% of the building stock is renewed each year. There is therefore an increasing need for automated methods to generate BIM models of existing buildings from sensors such as simple RGB cameras or more advanced Lidar sensors, which directly provide a point cloud. In this context, the goal of this thesis is to develop approaches for BIM reconstruction, covering both geometric reconstruction and semantic analysis. These tasks have been explored before, but an important research effort is needed to make them robust to the variety of use cases found in practice. 3D reconstruction is usually performed from direct 3D acquisitions such as Lidar, or by photogrammetry, i.e., using pictures to triangulate key-point locations and then reconstruct the surface. In the context of buildings, the latter case is usually limited by the presence of textureless areas, which make it hard for algorithms to find key points and triangulate them.
Moreover, some parts of the buildings might be missing from the input data because of occlusions or omission by the acquisition operator. Regarding semantics in point clouds, important ambiguities exist between semantic classes: the discontinuity between a wall and a door can be hard to distinguish; a slab, a roof and a ceiling sometimes need additional context to be disentangled. In this thesis, we present three technical contributions to address these issues. First, for 3D reconstruction of building scenes, we propose the first method to reconstruct piecewise-planar scenes from images using line segments as primitives. While wide textureless areas exist in indoor scenes (e.g., walls), making it particularly difficult to detect key points, lines are often more visible and easier to detect (e.g., the change of illumination at the intersection of two walls), and should therefore be used to ensure the robustness of image-based reconstruction approaches. We leverage the presence of these line segments and the possibility to detect and triangulate them. This makes the method robust to textureless surfaces, and we show that we can reconstruct scenes on which point-based methods fail. The second contribution is more theoretical and addresses the problem of mesh reconstruction from multiple calibrated images of low resolution. In this context, traditional methods completely fail, and directly learning priors on a large-scale dataset of 3D shapes allows us to still perform reconstruction. More specifically, our method uses the learned priors to provide an initial rough shape, which is further refined by incorporating geometric constraints. Our method directly outputs a mesh and competes with state-of-the-art methods, which can only output a noisy point cloud. Finally, the third technical contribution is VASAD, a dataset for volumetric and semantic reconstruction, which we created from raw BIM models available online.
It is the first large-scale dataset (62 000 m²) to offer both geometric and semantic annotation at point and mesh level. With this dataset, we propose two methods to jointly reconstruct geometry and semantics from a point cloud, and we show that the proposed dataset is challenging enough to stimulate research. Note de contenu : 1. Introduction
1.1 Motivation
1.2 Approach
1.3 Contributions
1.4 Organization of the dissertation
SURFACE RECONSTRUCTION FROM 3D LINE SEGMENTS
2. Introduction
2.1 Reconstructing textureless surfaces
2.2 Related Work
3. Method
3.1 Line extraction
3.2 Plane detection from 3D line segments
3.3 Surface reconstruction
4. Results
4.1 Datasets
4.2 Observations on the input data
4.3 Qualitative evaluation of reconstructions
4.4 Quantitative evaluation of reconstructions
4.5 Ablation study
4.6 Limitations and perspectives
4.7 Conclusion
3D RECONSTRUCTION BY PARAMETERIZED SURFACE MAPPING
5. Introduction
5.1 Learning 3D reconstruction
5.2 Related work
6. Method
6.1 Learning a Multi-View Parameterized Surface Mapping
6.2 Design choices
7. Results
7.1 Dataset
7.2 Evaluation criteria
7.3 Experimental results
7.4 Ablation study
7.5 Discussion and limitations
7.6 Conclusion
VASAD: A VOLUME AND SEMANTIC DATASET FOR BUILDING RECONSTRUCTION FROM POINT CLOUDS
8. Introduction
8.1 3D Reconstruction for buildings
8.2 Related work
9. DATASET
9.1 Building information models
9.2 Presentation of the dataset
9.3 3D representation
9.4 Point cloud simulation
9.5 Train/test split
10. Method
10.1 Reconstruction approaches
10.2 PVSRNet
10.3 Semantic Convolutional Occupancy Network
10.4 Data preparation
11. RESULTS
11.1 Metrics
11.2 Surface reconstruction
11.3 Semantic segmentation
11.4 Discussion
11.5 Conclusion
EPILOGUE
12. Conclusion
12.1 Looking back
12.2 Looking ahead
Numéro de notice : 26822 Affiliation des auteurs : non IGN Thématique : IMAGERIE/URBANISME Nature : Thèse française Note de thèse : Thèse de Doctorat : informatique : Champs-Sur-Marne : 2021 Organisme de stage : Laboratoire d'Informatique Gaspard Monge LIGM nature-HAL : Thèse DOI : sans Date de publication en ligne : 11/04/2022 En ligne : https://tel.hal.science/tel-03637158/ Format de la ressource électronique : URL Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=100564
Geometric computer vision: omnidirectional visual and remotely sensed data analysis / Pouria Babahajiani (2021)
Titre : Geometric computer vision: omnidirectional visual and remotely sensed data analysis Type de document : Thèse/HDR Auteurs : Pouria Babahajiani, Auteur ; Moncef Gabbouj, Directeur de thèse Editeur : Tampere [Finlande] : Tampere University Année de publication : 2021 Importance : 147 p. Format : 21 x 30 cm ISBN/ISSN/EAN : 978-952-03-1979-3 Note générale : bibliographie
Academic Dissertation, Tampere University, Faculty of Information Technology and Communication Sciences, Finland. Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Applications photogrammétriques
[Termes IGN] apprentissage automatique
[Termes IGN] chaîne de traitement
[Termes IGN] données lidar
[Termes IGN] données localisées 3D
[Termes IGN] effet de profondeur cinétique
[Termes IGN] espace public
[Termes IGN] extraction de traits caractéristiques
[Termes IGN] image panoramique
[Termes IGN] image Streetview
[Termes IGN] image terrestre
[Termes IGN] modèle 3D de l'espace urbain
[Termes IGN] modèle sémantique de données
[Termes IGN] réalité virtuelle
[Termes IGN] scène urbaine
[Termes IGN] segmentation sémantique
[Termes IGN] semis de points
[Termes IGN] vision par ordinateur
[Termes IGN] zone urbaine
Index. décimale : THESE Thèses et HDR Résumé : (auteur) Information about the surrounding environment perceived by the human eye is one of the most important cues provided by sight. The scientific community has put great effort over time into developing methods for scene acquisition and scene understanding using computer vision techniques. The goal of this thesis is to study geometry in computer vision and its applications. In computer vision, geometry describes the topological structure of the environment. Specifically, it concerns measures such as shape, volume, depth, pose, disparity, motion, and optical flow, all of which are essential cues in scene acquisition and understanding.
This thesis focuses on two primary objectives. The first is to assess the feasibility of creating semantic models of urban areas and public spaces using geometrical features coming from LiDAR sensors. The second objective is to develop a practical Virtual Reality (VR) video representation that supports 6-Degrees-of-Freedom (DoF) head motion parallax using geometric computer vision and machine learning. The thesis’s first contribution is the proposal of semantic segmentation of the 3D LiDAR point cloud and its applications. The ever-growing demand for reliable mapping data, especially in urban environments, has motivated the development of mobile mapping systems. These systems acquire high-precision data, in particular 3D LiDAR point clouds and optical images. The large amount and diversity of these data make data processing a complex task. A complete urban map data processing pipeline has been developed, which annotates 3D LiDAR points with semantic labels. The proposed method is made efficient by combining fast rule-based processing for building and street surface segmentation with super-voxel-based feature extraction and classification for the remaining map elements (cars, pedestrians, trees, and traffic signs). Based on the experiments, the rule-based processing stage provides substantial improvement not only in computational time but also in classification accuracy. Furthermore, two back ends are developed for the semantically labeled data that exemplify two important applications: (1) a 3D high-definition urban map that reconstructs a realistic 3D model from the labeled input point cloud, and (2) semantic segmentation of 2D street-view images. The second contribution of the thesis is the development of a practical, fast, and robust method to create high-resolution Depth-Augmented Stereo Panoramas (DASP) from a 360-degree VR camera. A novel and complete optical flow-based pipeline is developed, which provides stereo 360-views of a real-world scene with DASP.
The system consists of a texture and depth panorama for each eye. A bi-directional flow estimation network is explicitly designed for stitching and stereo depth estimation, which yields state-of-the-art results with a limited run-time budget. The proposed architecture explicitly leverages geometry through optical-flow ground truths; building architectures that use this knowledge simplifies the learning problem. Moreover, a 6-DoF testbed for immersive content quality assessment is proposed. Modern machine learning techniques have been used to design the proposed architectures, addressing many core computer vision problems by exploiting the enriched information coming from 3D scene structures. The architectures proposed in this thesis are practical systems that impact today’s technologies, including autonomous vehicles, virtual reality, augmented reality, robots, and smart-city infrastructures. Note de contenu : 1- Introduction
2- Geometry in Computer Vision
3- Contributions
4- Conclusion
Numéro de notice : 28323 Affiliation des auteurs : non IGN Thématique : IMAGERIE Nature : Thèse étrangère Note de thèse : PhD Thesis : Computing and Electrical Engineering : Tampere, Finland : 2021 DOI : sans En ligne : https://trepo.tuni.fi/handle/10024/131379 Format de la ressource électronique : URL Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=98342
LANet: Local attention embedding to improve the semantic segmentation of remote sensing images / Lei Ding in IEEE Transactions on geoscience and remote sensing, vol 59 n° 1 (January 2021)
[article]
Titre : LANet: Local attention embedding to improve the semantic segmentation of remote sensing images Type de document : Article/Communication Auteurs : Lei Ding, Auteur ; Hao Tang, Auteur ; Lorenzo Bruzzone, Auteur Année de publication : 2021 Article en page(s) : pp 426 - 435 Note générale : bibliographie Langues : Anglais (eng) Descripteur : [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] analyse de données
[Termes IGN] apprentissage profond
[Termes IGN] classification par réseau neuronal convolutif
[Termes IGN] décodage
[Termes IGN] distribution spatiale
[Termes IGN] extraction de traits caractéristiques
[Termes IGN] segmentation sémantique
Résumé : (auteur) The trade-off between feature representation power and spatial localization accuracy is crucial for the dense classification/semantic segmentation of remote sensing images (RSIs). High-level features extracted from the late layers of a neural network are rich in semantic information, yet have blurred spatial details; low-level features extracted from the early layers of a network contain more pixel-level information but are isolated and noisy. It is therefore difficult to bridge the gap between high- and low-level features due to their difference in terms of physical information content and spatial distribution. In this article, we contribute to solving this problem by enhancing the feature representation in two ways. On the one hand, a patch attention module (PAM) is proposed to enhance the embedding of context information based on a patchwise calculation of local attention. On the other hand, an attention embedding module (AEM) is proposed to enrich the semantic information of low-level features by embedding local focus from high-level features. Both proposed modules are lightweight and can be applied to process the extracted features of convolutional neural networks (CNNs). Experiments show that, by integrating the proposed modules into a baseline fully convolutional network (FCN), the resulting local attention network (LANet) greatly improves the performance over the baseline and outperforms other attention-based methods on two RSI data sets. Numéro de notice : A2021-035 Affiliation des auteurs : non IGN Thématique : IMAGERIE Nature : Article nature-HAL : ArtAvecCL-RevueIntern DOI : 10.1109/TGRS.2020.2994150 Date de publication en ligne : 27/05/2020 En ligne : https://doi.org/10.1109/TGRS.2020.2994150 Format de la ressource électronique : url article Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=96737
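The patchwise local attention the abstract describes can be sketched as follows. This is an illustrative simplification, not the exact LANet layer: each local patch of a feature map is rescaled by a gate computed from its own pooled activation; the mean pooling and sigmoid gate are assumptions made for the sketch.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def patch_attention(feat, patch=2):
    """feat: H x W feature map (list of lists); returns a gated copy."""
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            ys = range(i, min(i + patch, h))
            xs = range(j, min(j + patch, w))
            # pool the patch, then turn the pooled value into a gate in (0, 1)
            block = [feat[y][x] for y in ys for x in xs]
            gate = sigmoid(sum(block) / len(block))  # local attention weight
            for y in ys:
                for x in xs:
                    out[y][x] = feat[y][x] * gate
    return out
```

Because the gate is computed per patch rather than per image, strongly activated regions keep their context emphasis locally, which is the intuition behind computing attention patchwise.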
in IEEE Transactions on geoscience and remote sensing > vol 59 n° 1 (January 2021) . - pp 426 - 435 [article]
Leveraging class hierarchies with metric-guided prototype learning / Vivien Sainte Fare Garnot (2021)
Panoptic segmentation of satellite image time series with convolutional temporal attention networks / Vivien Sainte Fare Garnot (2021)
Real-time multimodal semantic scene understanding for autonomous UGV navigation / Yifei Zhang (2021)
Semantic segmentation of sea ice type on Sentinel-1 SAR data using convolutional neural networks / Alissa Kouraeva (2021)
Sherloc: a knowledge-driven algorithm for geolocating microblog messages at sub-city level / Laura Di Rocco in International journal of geographical information science IJGIS, vol 35 n° 1 (January 2021)
Supplementary material for: Panoptic segmentation of satellite image time series with convolutional temporal attention networks / Vivien Sainte Fare Garnot (2021)
Automated labeling of schematic maps by optimization with knowledge acquired from existing maps / Tian Lan in Transactions in GIS, vol 24 n° 6 (December 2020)
Mapping forest tree species in high resolution UAV-based RGB-imagery by means of convolutional neural networks / Felix Schiefer in ISPRS Journal of photogrammetry and remote sensing, vol 170 (December 2020)
MS-RRFSegNet: Multiscale regional relation feature segmentation network for semantic segmentation of urban scene point clouds / Haifeng Luo in IEEE Transactions on geoscience and remote sensing, vol 58 n° 12 (December 2020)
Parsing very high resolution urban scene images by learning deep ConvNets with edge-aware loss / Xianwei Zheng in ISPRS Journal of photogrammetry and remote sensing, vol 170 (December 2020)
Semantic trajectory segmentation based on change-point detection and ontology / Yuan Gao in International journal of geographical information science IJGIS, vol 34 n° 12 (December 2020)
Semi-supervised PolSAR image classification based on improved tri-training with a minimum spanning tree / Shuang Wang in IEEE Transactions on geoscience and remote sensing, vol 58 n° 12 (December 2020)
Towards a new generation of digital cartography: the development of neocartography and the geoweb / Marina Tavra in Cartographica, vol 55 n° 4 (Winter 2020)
Unsupervised deep joint segmentation of multitemporal high-resolution images / Sudipan Saha in IEEE Transactions on geoscience and remote sensing, vol 58 n° 12 (December 2020)
Active and incremental learning for semantic ALS point cloud segmentation / Yaping Lin in ISPRS Journal of photogrammetry and remote sensing, vol 169 (November 2020)
Evaluating geo-tagged Twitter data to analyze tourist flows in Styria, Austria / Johannes Scholz in ISPRS International journal of geo-information, vol 9 n° 11 (November 2020)
High-resolution remote sensing image scene classification via key filter bank based on convolutional neural network / Fengpeng Li in IEEE Transactions on geoscience and remote sensing, vol 58 n° 11 (November 2020)