Descriptor
Documents available in this category (12)
3D target detection using dual domain attention and SIFT operator in indoor scenes / Hanshuo Zhao in The Visual Computer, vol 38 n° 11 (November 2022)
[article]
Title: 3D target detection using dual domain attention and SIFT operator in indoor scenes
Document type: Article/Communication
Authors: Hanshuo Zhao, Author; Dedong Yang, Author; Jiankang Yu, Author
Publication year: 2022
Pages: pp 3765 - 3774
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] attention (machine learning)
[IGN terms] object detection
[IGN terms] target detection
[IGN terms] dataset
[IGN terms] 3D object
[IGN terms] indoor scene
[IGN terms] SIFT (algorithm)
Abstract: (author) 3D object detection plays an increasingly important role in many real-life scenes and practical applications. The task requires estimating the position and orientation of 3D objects in a real scene. In this paper, we propose a new network architecture based on VoteNet to detect targets in 3D point clouds. On the one hand, we use a channel and spatial dual-domain attention module to enhance the features of the object to be detected while suppressing other, useless features. On the other hand, the SIFT operator is scale invariant and resistant to occlusion and background interference. The PointSIFT module we use can capture information in different directions of the point cloud in space and is robust to shapes of different proportions, so that partially occluded objects are detected more reliably. Our method is evaluated on the SUN RGB-D and ScanNet indoor-scene datasets. The experimental results show that our method outperforms VoteNet.
Record number: A2022-840
Authors' affiliation: non IGN
Theme: IMAGERIE
Nature: Article
DOI: 10.1007/s00371-021-02217-z
Online publication date: 28/06/2021
Online: https://doi.org/10.1007/s00371-021-02217-z
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=102042
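The abstract does not specify the attention module's internals; as a rough illustration of what a channel-plus-spatial ("dual-domain") attention gate over per-point features can look like, here is a minimal NumPy sketch. The CBAM-style squeeze-excite MLP, the mean/max fusion and all shapes are assumptions for illustration, not the paper's design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def dual_domain_attention(feat, w1, w2):
    """feat: (C, N) per-point features; w1: (C//r, C) and w2: (C, C//r) MLP weights."""
    # Channel attention: squeeze over points, excite through a small bottleneck MLP.
    squeezed = feat.mean(axis=1)                           # (C,)
    ch_att = sigmoid(w2 @ np.maximum(w1 @ squeezed, 0.0))  # (C,) gate per channel
    feat = feat * ch_att[:, None]
    # Spatial attention: per-point gate from the channel-wise mean and max.
    sp = np.stack([feat.mean(axis=0), feat.max(axis=0)])   # (2, N)
    sp_att = sigmoid(sp.mean(axis=0))                      # (N,) simplest possible fusion
    return feat * sp_att[None, :]

rng = np.random.default_rng(0)
C, N, r = 16, 100, 4
feat = rng.standard_normal((C, N))
out = dual_domain_attention(feat,
                            rng.standard_normal((C // r, C)) * 0.1,
                            rng.standard_normal((C, C // r)) * 0.1)
print(out.shape)  # (16, 100)
```

The two gates multiply the feature map in sequence: channels judged uninformative are damped everywhere, then individual points are damped regardless of channel.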
in The Visual Computer > vol 38 n° 11 (November 2022) . - pp 3765 - 3774 [article]

Unsupervised multi-view CNN for salient view selection and 3D interest point detection / Ran Song in International journal of computer vision, vol 130 n° 5 (May 2022)
[article]
Title: Unsupervised multi-view CNN for salient view selection and 3D interest point detection
Document type: Article/Communication
Authors: Ran Song, Author; Wei Zhang, Author; Yitian Zhao, Author; et al.
Publication year: 2022
Pages: pp 1210 - 1227
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] deep learning
[IGN terms] unsupervised classification
[IGN terms] convolutional neural network classification
[IGN terms] object detection
[IGN terms] 3D object
[IGN terms] point of interest
[IGN terms] salience
Abstract: (author) We present an unsupervised 3D deep learning framework based on a ubiquitously true proposition we name view-object consistency: a 3D object and its projected 2D views always belong to the same object class. To validate its effectiveness, we design a multi-view CNN that instantiates it for salient view selection and interest point detection of 3D objects, tasks that essentially cannot be handled by supervised learning because sufficient, consistent training data are hard to collect. Our unsupervised multi-view CNN, UMVCNN, branches into two channels that encode the knowledge within each 2D view and within the 3D object respectively, and exploits both intra-view and inter-view knowledge of the object. It ends with a new loss layer that formulates view-object consistency by impelling the two channels to produce consistent classification outcomes. The UMVCNN is then integrated with a global distinction adjustment scheme to incorporate global cues into salient view selection. We evaluate our method for salient view selection both qualitatively and quantitatively, demonstrating its superiority over several state-of-the-art methods. In addition, we show that it can select salient views of 3D scenes containing multiple objects. We also develop a UMVCNN-based method for 3D interest point detection and conduct comparative evaluations on a publicly available benchmark, showing that the UMVCNN is amenable to different 3D shape understanding tasks.
Record number: A2022-415
Authors' affiliation: non IGN
Theme: IMAGERIE
Nature: Article
DOI: 10.1007/s11263-022-01592-x
Online publication date: 16/03/2022
Online: https://doi.org/10.1007/s11263-022-01592-x
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=100771
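The view-object consistency idea above (the 2D-view channel and the 3D channel should agree on the object class) can be illustrated with a toy consistency loss. The cross-entropy form below is an assumption for illustration, not the paper's actual loss layer:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def view_object_consistency_loss(logits_3d, logits_views):
    """logits_3d: (K,) class logits from the 3D channel;
    logits_views: (V, K) logits from V rendered 2D views.
    Penalizes disagreement between the two channels' class distributions."""
    p3d = softmax(logits_3d)         # (K,)
    pviews = softmax(logits_views)   # (V, K)
    # Cross-entropy of each view's prediction against the 3D prediction,
    # averaged over views: one simple way to "impel" consistent outcomes.
    return float(-(p3d[None, :] * np.log(pviews + 1e-9)).sum(axis=1).mean())

consistent = view_object_consistency_loss(np.array([5.0, 0.0, 0.0]),
                                          np.tile([5.0, 0.0, 0.0], (4, 1)))
inconsistent = view_object_consistency_loss(np.array([5.0, 0.0, 0.0]),
                                            np.tile([0.0, 5.0, 0.0], (4, 1)))
print(consistent < inconsistent)  # True
```

Minimizing such a term needs no class labels, which is what makes the framework unsupervised: only agreement between the two channels is rewarded.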
in International journal of computer vision > vol 130 n° 5 (May 2022) . - pp 1210 - 1227 [article]

GeoRec: Geometry-enhanced semantic 3D reconstruction of RGB-D indoor scenes / Linxi Huan in ISPRS Journal of photogrammetry and remote sensing, vol 186 (April 2022)
[article]
Title: GeoRec: Geometry-enhanced semantic 3D reconstruction of RGB-D indoor scenes
Document type: Article/Communication
Authors: Linxi Huan, Author; Xianwei Zheng, Author; Jianya Gong, Author
Publication year: 2022
Pages: pp 301 - 314
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] deep learning
[IGN terms] 3D spatial data
[IGN terms] geometry
[IGN terms] RGB image
[IGN terms] mesh
[IGN terms] semantic modelling
[IGN terms] 3D object
[IGN terms] 3D reconstruction
[IGN terms] object reconstruction
[IGN terms] indoor scene
Abstract: (author) Semantic indoor 3D modeling with multi-task deep neural networks is an efficient, low-cost way to reconstruct an indoor scene with a geometrically complete room structure and semantic 3D objects. Challenged by the complexity and clutter of indoor scenarios, the semantic reconstruction quality of current methods is still limited by insufficient exploration and learning of 3D geometry information. To this end, this paper proposes an end-to-end multi-task neural network for geometry-enhanced semantic 3D reconstruction of RGB-D indoor scenes, termed GeoRec. In GeoRec, we build a geometry extractor that effectively learns geometry-enhanced feature representations from depth data, improving the estimation accuracy of layout, camera pose and 3D object bounding boxes. We also introduce a novel object mesh generator that strengthens GeoRec's robustness to indoor occlusion with geometry-enhanced implicit shape embedding. With the parsed scene semantics and geometries, GeoRec reconstructs an indoor scene by placing reconstructed object mesh models, guided by the 3D object detection results, in the estimated layout cuboid. Extensive experiments on two benchmark datasets show that GeoRec yields outstanding performance, with mean chamfer distance error for object reconstruction on the challenging Pix3D dataset, 70.45% mAP for 3D object detection and 77.1% 3D mIoU for layout estimation on the commonly used SUN RGB-D dataset. Notably, the mesh reconstruction sub-network of GeoRec trained on Pix3D can be transferred directly to SUN RGB-D without any fine-tuning, demonstrating high generalization ability.
Record number: A2022-235
Authors' affiliation: non IGN
Theme: IMAGERIE
Nature: Article
DOI: 10.1016/j.isprsjprs.2022.02.014
Online publication date: 03/03/2022
Online: https://doi.org/10.1016/j.isprsjprs.2022.02.014
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=100139
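The final scene-assembly step described above (placing reconstructed object meshes into the estimated layout using detected 3D boxes) can be sketched as a simple rigid placement. The center/size/yaw box parametrization below is an assumption for illustration, not GeoRec's actual interface:

```python
import numpy as np

def place_object(points, center, size, yaw):
    """Place a canonical object (points in [-0.5, 0.5]^3) into the scene
    using a detected 3D box (center, size, yaw about the vertical axis)."""
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, -s, 0.0],
                  [s,  c, 0.0],
                  [0.0, 0.0, 1.0]])           # yaw rotation about z
    return (points * size) @ R.T + center     # scale, rotate, translate

# Eight corners of the canonical unit cube as a stand-in "mesh".
cube = np.array([[x, y, z] for x in (-0.5, 0.5)
                           for y in (-0.5, 0.5)
                           for z in (-0.5, 0.5)])
placed = place_object(cube, center=np.array([2.0, 1.0, 0.5]),
                      size=np.array([1.0, 0.6, 1.0]), yaw=np.pi / 2)
print(placed.shape)  # (8, 3)
```

Repeating this for every detected box, then clipping or checking containment against the layout cuboid, composes the full semantic scene.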
in ISPRS Journal of photogrammetry and remote sensing > vol 186 (April 2022) . - pp 301 - 314 [article]

Copies (3)
Barcode / Shelfmark / Medium / Location / Section / Availability
081-2022041 / SL / Journal / Centre de documentation / Revues en salle / Available
081-2022043 / DEP-RECP / Journal / LASTIG / Dépôt en unité / Not for loan
081-2022042 / DEP-RECF / Journal / Nancy / Dépôt en unité / Not for loan

Deep learning based 2D and 3D object detection and tracking on monocular video in the context of autonomous vehicles / Zhujun Xu (2022)
Title: Deep learning based 2D and 3D object detection and tracking on monocular video in the context of autonomous vehicles
Document type: Thesis/HDR
Authors: Zhujun Xu, Author; Eric Chaumette, Thesis supervisor; Damien Vivet, Thesis supervisor
Publisher: Toulouse: Université de Toulouse
Publication year: 2022
Extent: 136 p.
Format: 21 x 30 cm
General note: bibliography
Thesis submitted for the doctorate of the Université de Toulouse, specialty Computer Science and Telecommunications
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] deep learning
[IGN terms] semi-supervised learning
[IGN terms] network architecture
[IGN terms] object detection
[IGN terms] data sampling
[IGN terms] 3D object
[IGN terms] image segmentation
[IGN terms] motor vehicle
[IGN terms] video
[IGN terms] computer vision
Decimal index: THESE Theses and HDRs
Abstract: (author) The objective of this thesis is to develop deep learning based 2D and 3D object detection and tracking methods for monocular video and to apply them in the context of autonomous vehicles. When still-image detectors are applied directly to a video stream, accuracy suffers from sampled-image quality problems. Moreover, generating 3D annotations is time-consuming and expensive because of the required data fusion and the large number of frames. We therefore exploit temporal information in videos, such as object consistency, to improve performance. The methods should not introduce much extra computational burden, since autonomous vehicles demand real-time performance. Multiple steps can be involved: data preparation, network architecture and post-processing. First, we propose a post-processing method called heatmap propagation, based on the one-stage detector CenterNet, for video object detection. Our method propagates previous reliable long-term detections, in the form of heatmaps, to the upcoming frame. Then, to distinguish different objects of the same class, we propose a frame-to-frame network architecture for video instance segmentation using instance sequence queries. Instance tracking is achieved without extra post-processing for data association. Finally, we propose a semi-supervised learning method that generates 3D annotations for a 2D video object tracking dataset, enriching the training process for 3D object detection. Each of the three methods can be applied individually to leverage image detectors for video applications. We also propose two complete network structures to solve 2D and 3D object detection and tracking on monocular video.
Contents: 1- Introduction
2- Video object detection with heatmap propagation
3- Video instance segmentation with instance sequence queries
4- Semi-supervised learning of monocular 3D object detection with 2D video tracking annotations
5- Conclusions and perspectives
Record number: 24072
Authors' affiliation: non IGN
Theme: IMAGERIE
Nature: French thesis
Thesis note: Doctoral thesis: Computer Science and Telecommunications: Toulouse: 2022
DOI: none
Online: https://www.theses.fr/2022ESAE0019
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=102136

An anchor-based graph method for detecting and classifying indoor objects from cluttered 3D point clouds / Fei Su in ISPRS Journal of photogrammetry and remote sensing, vol 172 (February 2021)
[article]
Title: An anchor-based graph method for detecting and classifying indoor objects from cluttered 3D point clouds
Document type: Article/Communication
Authors: Fei Su, Author; Haihong Zhu, Author; Taoyi Chen, Author
Publication year: 2021
Pages: pp 114 - 131
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Lasergrammetry
[IGN terms] adjacency
[IGN terms] graph matching
[IGN terms] edge
[IGN terms] anchor block
[IGN terms] object-oriented classification
[IGN terms] lidar data
[IGN terms] 3D spatial data
[IGN terms] spatial dataset
[IGN terms] maximum likelihood method (estimation)
[IGN terms] node
[IGN terms] 3D object
[IGN terms] orientation
[IGN terms] indoor positioning
[IGN terms] point cloud
Abstract: (author) Most existing 3D indoor object classification methods have shown impressive achievements under the assumption that all objects are oriented upward with respect to the ground. To relax this assumption, great effort has been made to handle arbitrarily oriented objects in terrestrial laser scanning (TLS) point clouds. As one of the most promising solutions, anchor-based graphs can be used to classify freely oriented objects. However, this approach suffers from missed anchor detection, since valid detection relies heavily on the completeness of an anchor's point cloud and is sensitive to missing data. This paper presents an anchor-based graph method to detect and classify arbitrarily oriented indoor objects. The anchors of each object are extracted from the structural adjacency among parts rather than from the parts' geometric metrics. With adjacency, an anchor can be correctly extracted even with missing parts, since the adjacency between an anchor and other parts is retained irrespective of the areal extent of the considered parts. The best graph matching is achieved by finding the optimal corresponding node pairs in a fully connected super-graph based on maximum likelihood. The performance of the proposed method is evaluated with three indicators (object precision, object recall and object F1-score) on seven datasets. The experimental tests demonstrate its effectiveness on TLS point clouds, RGB-D point clouds and panoramic RGB-D point clouds, with scores of approximately 0.8 for object precision and recall and over 0.9 for chair precision and table recall.
Record number: A2021-087
Authors' affiliation: non IGN
Theme: IMAGERIE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1016/j.isprsjprs.2020.12.007
Online publication date: 29/12/2020
Online: https://doi.org/10.1016/j.isprsjprs.2020.12.007
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=96852
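The maximum-likelihood graph matching the abstract describes can be illustrated with a toy brute-force search over node correspondences between two small part graphs. The scoring (log node similarity plus an edge-agreement count) is a stand-in assumption, not the paper's exact formulation:

```python
import numpy as np
from itertools import permutations

def best_graph_match(A1, A2, node_sim):
    """Brute-force maximum-likelihood matching between two small part graphs.
    A1, A2: (n, n) adjacency matrices; node_sim: (n, n) node similarities in (0, 1].
    Returns the node permutation maximizing log node similarity plus
    pairwise edge agreement (a toy stand-in for the super-graph search)."""
    n = len(A1)
    best, best_score = None, -np.inf
    for perm in permutations(range(n)):
        score = sum(np.log(node_sim[i, perm[i]]) for i in range(n))
        score += sum(1.0 for i in range(n) for j in range(n)
                     if A1[i, j] == A2[perm[i], perm[j]])
        if score > best_score:
            best, best_score = perm, score
    return best

# A three-part "chair" path graph matched against itself: the identity wins.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
sim = np.array([[0.9, 0.1, 0.1], [0.1, 0.9, 0.1], [0.1, 0.1, 0.9]])
print(best_graph_match(A, A, sim))  # (0, 1, 2)
```

Because the score uses adjacency rather than part geometry, a node with a truncated point cloud still matches as long as its connections to neighboring parts survive, which is the robustness argument the abstract makes.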
in ISPRS Journal of photogrammetry and remote sensing > vol 172 (February 2021) . - pp 114 - 131 [article]

Copies (2)
Barcode / Shelfmark / Medium / Location / Section / Availability
081-2021021 / SL / Journal / Centre de documentation / Revues en salle / Available
081-2021022 / DEP-RECF / Journal / Nancy / Bibliothèque Nancy IFN / Not for loan

Classification and segmentation of mining area objects in large-scale spares Lidar point cloud using a novel rotated density network / Yueguan Yan in ISPRS International journal of geo-information, vol 9 n° 3 (March 2020)
Learning high-level features by fusing multi-view representation of MLS point clouds for 3D object recognition in road environments / Zhipeng Luo in ISPRS Journal of photogrammetry and remote sensing, vol 150 (April 2019)
Des nouveaux moyens et des opportunités / Laurent Polidori in Géomètre, n° 2137 (juin 2016)
Manifold harmonic transform and spatial relationships for partial 3D object retrieval / Nguyen-Vu Hoang (April 2014)
Une approche variationnelle pour la reconnaissance d'objets / Pascal Fua (1989)