Descriptor
Termes IGN > sciences naturelles > physique > traitement d'image > reconnaissance de formes > reconnaissance d'objets
reconnaissance d'objets
Documents available in this category (57)



Exploring semantic elements for urban scene recognition: Deep integration of high-resolution imagery and OpenStreetMap (OSM) / Wenzhi Zhao in ISPRS Journal of photogrammetry and remote sensing, vol 151 (May 2019)
[article]
Title: Exploring semantic elements for urban scene recognition: Deep integration of high-resolution imagery and OpenStreetMap (OSM)
Document type: Article/Communication
Authors: Wenzhi Zhao; Yanchen Bo; Jiage Chen; et al.
Publication year: 2019
Pages: pp 237 - 250
General note: Bibliography
Language: English (eng)
Descriptors: [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] apprentissage profond
[Termes IGN] classe sémantique
[Termes IGN] compréhension de l'image
[Termes IGN] fusion de données
[Termes IGN] image à haute résolution
[Termes IGN] reconnaissance d'objets
[Termes IGN] scène urbaine
Abstract: (Author) Urban scenes refer to city blocks, the basic units of megacities; they play an important role in citizens' welfare and city management. Remote sensing imagery, with its large-scale coverage and accurate target descriptions, has been regarded as an ideal solution for monitoring the urban environment. However, due to the heterogeneity of remote sensing images, it is difficult to access their geographical content at the object level, let alone to understand urban scenes at the block level. Recently, deep learning-based strategies have been applied to interpret urban scenes with remarkable accuracy. However, deep neural networks require a substantial number of training samples, which are hard to obtain, especially for high-resolution images. Meanwhile, crowd-sourced OpenStreetMap (OSM) data provides rich annotation information about urban targets but may suffer from insufficient sampling (limited by the places where people can go). As a result, the combination of OSM and remote sensing images for efficient urban scene recognition is urgently needed. In this paper, we present a novel strategy to transfer existing OSM data to high-resolution images for semantic element determination and urban scene understanding. Specifically, an object-based convolutional neural network (OCNN) is used for geographical object detection by feeding it rich semantic elements derived from OSM data. Then, geographical objects are further assigned functional labels by integrating points of interest (POIs), which carry rich semantic terms, such as commercial or educational labels. Lastly, the categories of urban scenes are easily derived from the semantic objects they contain. Experimental results indicate that the proposed method can classify complex urban scenes: the classification accuracies on the Beijing dataset are as high as 91% at the object level and 88% at the scene level. Additionally, we are probably the first to investigate object-level semantic mapping by incorporating high-resolution images and OSM data of urban areas. Consequently, the presented method is effective in delineating urban scenes and could further support urban environment monitoring and planning with high-resolution images.
Record number: A2019-209
Author affiliation: non IGN
Theme: IMAGERIE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1016/j.isprsjprs.2019.03.019
Online publication date: 29/03/2019
Online: https://doi.org/10.1016/j.isprsjprs.2019.03.019
Electronic resource format: URL Article
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=92675
in ISPRS Journal of photogrammetry and remote sensing > vol 151 (May 2019) . - pp 237 - 250 [article]
Copies (3)
Barcode | Call number | Medium | Location | Section | Availability
081-2019051 | RAB | Journal | Centre de documentation | En réserve 3L | Available
081-2019053 | DEP-RECP | Journal | LaSTIG | Dépôt en unité | Not for loan
081-2019052 | DEP-RECF | Journal | Nancy | Dépôt en unité | Not for loan
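The abstract describes a pipeline in which OSM annotations supervise an object-based CNN and per-object predictions are then aggregated into a block-level scene label. Below is a minimal, hypothetical PyTorch sketch of that general idea, not the authors' OCNN: the class count, the pre-extracted patches, and the majority-vote scene rule are all illustrative assumptions.

```python
# Hypothetical sketch: weak labels derived from OSM tags drive patch-level
# training, then object predictions are pooled into a scene label.
import torch
import torch.nn as nn

NUM_CLASSES = 6  # assumed number of semantic element classes

class PatchCNN(nn.Module):
    """Tiny stand-in for the paper's object-based CNN (OCNN)."""
    def __init__(self, num_classes=NUM_CLASSES):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(32, num_classes)

    def forward(self, x):
        return self.head(self.features(x).flatten(1))

def scene_label(object_logits: torch.Tensor) -> int:
    """Aggregate per-object predictions inside one block into a scene class
    by majority vote (a simple stand-in for the paper's scene inference)."""
    votes = object_logits.argmax(dim=1)
    return int(torch.bincount(votes, minlength=NUM_CLASSES).argmax())

# Toy usage: 8 image patches (objects) cropped from one city block, with
# class indices assumed to come from matching OSM annotations.
patches = torch.randn(8, 3, 64, 64)
osm_labels = torch.randint(0, NUM_CLASSES, (8,))
model = PatchCNN()
loss = nn.functional.cross_entropy(model(patches), osm_labels)
loss.backward()  # one weakly supervised training step
print("scene class:", scene_label(model(patches).detach()))
```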
Learning high-level features by fusing multi-view representation of MLS point clouds for 3D object recognition in road environments / Zhipeng Luo in ISPRS Journal of photogrammetry and remote sensing, vol 150 (April 2019)
[article]
Title: Learning high-level features by fusing multi-view representation of MLS point clouds for 3D object recognition in road environments
Document type: Article/Communication
Authors: Zhipeng Luo; Jonathan Li; Zhenlong Xiao; et al.
Publication year: 2019
Pages: pp 44 - 58
General note: Bibliography
Language: English (eng)
Descriptors: [Vedettes matières IGN] Lasergrammétrie
[Termes IGN] apprentissage profond
[Termes IGN] données lidar
[Termes IGN] données localisées 3D
[Termes IGN] extraction de traits caractéristiques
[Termes IGN] fusion de données
[Termes IGN] jointure spatiale
[Termes IGN] objet 3D
[Termes IGN] reconnaissance d'objets
[Termes IGN] représentation multiple
[Termes IGN] réseau neuronal convolutif
[Termes IGN] semis de points
Abstract: (Author) Most existing 3D object recognition methods still suffer from low descriptiveness and weak robustness, although remarkable progress has been made in 3D computer vision. The major challenge lies in effectively mining high-level 3D shape features. This paper presents a high-level feature learning framework for 3D object recognition that fuses multiple 2D representations of point clouds. The framework has two key components: (1) three discriminative low-level 3D shape descriptors for obtaining multi-view 2D representations of 3D point clouds; these descriptors preserve both local and global spatial relationships of points from different perspectives and build a bridge between 3D point clouds and 2D convolutional neural networks (CNNs); and (2) a two-stage fusion network, consisting of a deep feature learning module and two fusion modules, for extracting and fusing high-level features. The proposed method was tested on three datasets, one of which is the Sydney Urban Objects dataset; the other two were acquired by a mobile laser scanning (MLS) system along urban roads. Comprehensive experiments demonstrate that our method is superior to state-of-the-art methods in descriptiveness, robustness and efficiency, achieving recognition rates of 94.6%, 93.1% and 74.9% on the three datasets, respectively.
Record number: A2019-137
Author affiliation: non IGN
Theme: IMAGERIE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1016/j.isprsjprs.2019.01.024
Online publication date: 16/02/2019
Online: https://doi.org/10.1016/j.isprsjprs.2019.01.024
Electronic resource format: URL Article
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=92468
in ISPRS Journal of photogrammetry and remote sensing > vol 150 (April 2019) . - pp 44 - 58 [article]
Copies (3)
Barcode | Call number | Medium | Location | Section | Availability
081-2019041 | RAB | Journal | Centre de documentation | En réserve 3L | Available
081-2019043 | DEP-RECP | Journal | LaSTIG | Dépôt en unité | Not for loan
081-2019042 | DEP-RECF | Journal | Nancy | Dépôt en unité | Not for loan
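The abstract's core idea, projecting a 3D point cloud to several 2D views and fusing CNN features across views, can be sketched as follows. This is a hypothetical simplification, not the paper's shape descriptors or two-stage fusion network: the axis-aligned occupancy projections, the shared encoder, and the max-pool fusion are all assumptions.

```python
# Hypothetical multi-view sketch: render a point cloud to 2D occupancy
# images from several viewpoints, encode each view with a shared CNN,
# and fuse the view features for object classification.
import numpy as np
import torch
import torch.nn as nn

def project_view(points: np.ndarray, axes=(0, 1), res=32) -> np.ndarray:
    """Project Nx3 points onto a 2D occupancy grid along the chosen axes."""
    xy = points[:, list(axes)]
    xy = (xy - xy.min(0)) / (np.ptp(xy, 0) + 1e-6)  # normalize to [0, 1]
    ij = np.clip((xy * (res - 1)).astype(int), 0, res - 1)
    grid = np.zeros((res, res), dtype=np.float32)
    grid[ij[:, 0], ij[:, 1]] = 1.0
    return grid

class MultiViewNet(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.encoder = nn.Sequential(  # shared across views
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(16, num_classes)

    def forward(self, views):  # views: (B, V, 1, H, W)
        b, v = views.shape[:2]
        feats = self.encoder(views.flatten(0, 1)).flatten(1).view(b, v, -1)
        return self.classifier(feats.max(dim=1).values)  # max-pool fusion

cloud = np.random.rand(500, 3)  # stand-in for one segmented MLS object
views = np.stack([project_view(cloud, a) for a in [(0, 1), (0, 2), (1, 2)]])
logits = MultiViewNet()(torch.from_numpy(views)[None, :, None])
print(logits.shape)  # (1, 10)
```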
Learning to segment moving objects / Pavel Tokmakov in International journal of computer vision, vol 127 n° 3 (March 2019)
[article]
Title: Learning to segment moving objects
Document type: Article/Communication
Authors: Pavel Tokmakov; Cordelia Schmid; Karteek Alahari
Publication year: 2019
Pages: pp 282 - 301
General note: Bibliography
Language: English (eng)
Descriptors: [Vedettes matières IGN] Traitement d'image
[Termes IGN] apprentissage profond
[Termes IGN] cohérence temporelle
[Termes IGN] image vidéo
[Termes IGN] objet mobile
[Termes IGN] reconnaissance d'objets
[Termes IGN] réseau neuronal convolutif
[Termes IGN] séquence d'images
Abstract: (Author) We study the problem of segmenting moving objects in unconstrained videos. Given a video, the task is to segment all the objects that exhibit independent motion in at least one frame. We formulate this as a learning problem and design our framework around three cues: (1) independent object motion between a pair of frames, which complements object recognition; (2) object appearance, which helps to correct errors in motion estimation; and (3) temporal consistency, which imposes additional constraints on the segmentation. The framework is a two-stream neural network with an explicit memory module. The two streams encode appearance and motion cues in a video sequence respectively, while the memory module captures the evolution of objects over time, exploiting temporal consistency. The motion stream is a convolutional neural network trained on synthetic videos to segment independently moving objects in the optical flow field. The module that builds a "visual memory" of the video, i.e., a joint representation of all the video frames, is realized with a convolutional recurrent unit learned from a small number of training video sequences. For every pixel in a frame of a test video, our approach assigns an object or background label based on the learned spatio-temporal features as well as the "visual memory" specific to the video. We evaluate our method extensively on three benchmarks: DAVIS, the Freiburg-Berkeley motion segmentation dataset, and SegTrack. In addition, we provide an extensive ablation study to investigate both the choice of training data and the influence of each component of the proposed framework.
Record number: A2018-601
Author affiliation: non IGN
Theme: IMAGERIE/INFORMATIQUE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1007/s11263-018-1122-2
Online publication date: 22/09/2018
Online: https://doi.org/10.1007/s11263-018-1122-2
Electronic resource format: URL article
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=92528
in International journal of computer vision > vol 127 n° 3 (March 2019) . - pp 282 - 301 [article]
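A compact sketch of the two-stream-plus-memory architecture the abstract describes: an appearance stream (RGB) and a motion stream (optical flow) feed a convolutional GRU that plays the role of the "visual memory". All layer sizes here are placeholder assumptions; the actual model uses much deeper streams and is trained on synthetic videos.

```python
# Hypothetical sketch: two-stream encoder feeding a ConvGRU memory that
# emits per-pixel object/background scores for each video frame.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.zr = nn.Conv2d(2 * ch, 2 * ch, 3, padding=1)     # update/reset gates
        self.h_new = nn.Conv2d(2 * ch, ch, 3, padding=1)      # candidate state

    def forward(self, x, h):
        z, r = torch.sigmoid(self.zr(torch.cat([x, h], 1))).chunk(2, 1)
        h_tilde = torch.tanh(self.h_new(torch.cat([x, r * h], 1)))
        return (1 - z) * h + z * h_tilde

class TwoStreamSegmenter(nn.Module):
    def __init__(self, ch=8):
        super().__init__()
        self.appearance = nn.Conv2d(3, ch, 3, padding=1)  # RGB stream
        self.motion = nn.Conv2d(2, ch, 3, padding=1)      # optical-flow stream
        self.memory = ConvGRUCell(ch)                     # "visual memory"
        self.head = nn.Conv2d(ch, 1, 1)                   # object vs background

    def forward(self, frames, flows):  # (T,3,H,W), (T,2,H,W)
        h = torch.zeros_like(self.appearance(frames[:1]))
        masks = []
        for rgb, flow in zip(frames, flows):
            x = torch.relu(self.appearance(rgb[None]) + self.motion(flow[None]))
            h = self.memory(x, h)                # evolve memory over time
            masks.append(torch.sigmoid(self.head(h)))
        return torch.cat(masks)                  # (T,1,H,W) soft masks

video = torch.randn(5, 3, 32, 32)
flow = torch.randn(5, 2, 32, 32)
print(TwoStreamSegmenter()(video, flow).shape)  # torch.Size([5, 1, 32, 32])
```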
Do semantic parts emerge in convolutional neural networks? / Abel Gonzalez-Garcia in International journal of computer vision, vol 126 n° 5 (May 2018)
[article]
Title: Do semantic parts emerge in convolutional neural networks?
Document type: Article/Communication
Authors: Abel Gonzalez-Garcia; Davide Modolo; Vittorio Ferrari
Publication year: 2018
Pages: pp 476 - 494
General note: Bibliography
Language: English (eng)
Descriptors: [Vedettes matières IGN] Intelligence artificielle
[Termes IGN] reconnaissance d'objets
[Termes IGN] rectangle englobant minimum
[Termes IGN] réseau neuronal convolutif
[Termes IGN] segmentation sémantique
Abstract: (Author) Semantic object parts can be useful for several visual recognition tasks. Lately, these tasks have been addressed using convolutional neural networks (CNNs), achieving outstanding results. In this work we study whether CNNs learn semantic parts in their internal representation. We investigate the responses of convolutional filters and try to associate their stimuli with semantic parts. We perform two extensive quantitative analyses. First, we use ground-truth part bounding boxes from the PASCAL-Part dataset to determine how many of those semantic parts emerge in the CNN. We explore this emergence for different layers, network depths, and supervision levels. Second, we collect human judgements to study what fraction of all filters systematically fire on any semantic part, even if not annotated in PASCAL-Part. Moreover, we explore several connections between discriminative power and semantics. We find out which filters are the most discriminative for object recognition, and analyze whether they respond to semantic parts or to other image patches. We also investigate the other direction: we determine which semantic parts are the most discriminative and whether they correspond to the parts emerging in the network. This enables us to gain an even deeper understanding of the role of semantic parts in the network.
Record number: A2018-408
Author affiliation: non IGN
Theme: INFORMATIQUE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1007/s11263-017-1048-0
Online publication date: 17/10/2017
Online: https://doi.org/10.1007/s11263-017-1048-0
Electronic resource format: URL article
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=90882
in International journal of computer vision > vol 126 n° 5 (May 2018) . - pp 476 - 494 [article]
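The paper's analysis can be illustrated with a toy version of one of its measurements: treat the high-activation region of a single convolutional filter as a detection and score it against a ground-truth part bounding box with IoU. The feature map, box coordinates, and activation threshold below are fabricated for illustration; the actual study uses PASCAL-Part annotations and real filter responses.

```python
# Hypothetical sketch of the filter-vs-part-box analysis described above.
import numpy as np

def activation_box(fmap: np.ndarray, frac=0.5):
    """Bounding box of cells whose activation exceeds frac * max."""
    ys, xs = np.where(fmap >= frac * fmap.max())
    return xs.min(), ys.min(), xs.max() + 1, ys.max() + 1  # x0, y0, x1, y1

def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

# Toy filter response on a 14x14 feature map and a (scaled) part box.
fmap = np.zeros((14, 14))
fmap[3:7, 4:9] = 1.0
part_box = (4, 2, 9, 8)  # hypothetical ground-truth part, feature-map coords
fired = activation_box(fmap)
print(f"IoU = {iou(fired, part_box):.2f} -> 'emerged' if above a threshold")
```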
contained in: HAL Hyper articles en ligne / Centre pour la Communication Scientifique Directe CCSD (2000)
Title: Effective and annotation efficient deep learning for image understanding
Document type: Thèse/HDR
Authors: Spyridon Gidaris; Nikos Komodakis (thesis supervisor)
Publisher: Champs/Marne: Université Paris-Est
Publication year: 2018
Extent: 236 p.
Format: 21 x 30 cm
General note: Bibliography
Doctoral thesis, Université Paris-Est, field: Traitement du Signal et des Images (Signal and Image Processing)
Language: English (eng)
Descriptors: [Vedettes matières IGN] Traitement d'image optique
[Termes IGN] analyse d'image numérique
[Termes IGN] apprentissage profond
[Termes IGN] classification par réseau neuronal convolutif
[Termes IGN] compréhension de l'image
[Termes IGN] détection d'objet
[Termes IGN] prédiction
[Termes IGN] reconnaissance d'objets
[Termes IGN] segmentation sémantique
Abstract: (author) Recent developments in deep learning have achieved impressive results on image understanding tasks. However, designing deep learning architectures that effectively solve the image understanding tasks of interest is far from trivial. Moreover, the success of deep learning approaches relies heavily on the availability of large amounts of manually labeled data. In this context, the objective of this dissertation is to explore deep learning approaches for core image understanding tasks that increase the effectiveness with which they are performed and make their learning process more annotation-efficient, i.e., less dependent on the availability of large amounts of manually labeled training data. We first focus on improving the state of the art in object detection. More specifically, we attempt to boost the ability of object detection systems to recognize (even difficult) object instances by proposing a multi-region and semantic-segmentation-aware ConvNet-based representation that is able to capture a diverse set of discriminative appearance factors. We also aim to improve the localization accuracy of object detection systems by proposing iterative detection schemes and a novel localization model for estimating the bounding box of objects. We demonstrate that the proposed technical novelties lead to significant improvements in object detection performance on the PASCAL and MS COCO benchmarks. Regarding the pixel-wise image labeling problem, we explore a family of deep neural network architectures that perform structured prediction by learning to (iteratively) improve some initial estimates of the output labels. The goal is to identify the optimal architecture for implementing such deep structured prediction models. In this context, we propose to decompose the label improvement task into three steps: (1) detecting which initial label estimates are incorrect, (2) replacing the incorrect labels with new ones, and finally (3) refining the renewed labels by predicting residual corrections with respect to them. We evaluate the explored architectures on the disparity estimation task and demonstrate that the proposed architecture achieves state-of-the-art results on the KITTI 2015 benchmark. To pursue our goal of annotation-efficient learning, we propose a self-supervised learning approach that learns ConvNet-based image representations by training the ConvNet to recognize the 2D rotation applied to the image it receives as input. We empirically demonstrate that this apparently simple task actually provides a very powerful supervisory signal for semantic feature learning. Specifically, the image features learned from this task give very good results when transferred to the visual tasks of object detection and semantic segmentation, surpassing prior unsupervised learning approaches and thus narrowing the gap with the supervised case. Finally, also in the direction of annotation-efficient learning, we propose a novel few-shot object recognition system that, after training, is capable of dynamically learning novel categories from only a few examples (e.g., only one or five training examples) without forgetting the categories on which it was trained.
To implement the proposed recognition system we introduce two technical novelties: an attention-based few-shot classification weight generator, and a classifier, in the ConvNet-based recognition model, implemented as a cosine-similarity function between feature representations and classification vectors. We demonstrate that the proposed approach achieves state-of-the-art results on relevant few-shot benchmarks.
Contents note: Introduction
1- Effective deep learning for image understanding
2- Annotation-efficient deep learning for image understanding
Record number: 25835
Author affiliation: non IGN
Theme: IMAGERIE
Nature: French thesis
Thesis note: Doctoral thesis: field: Traitement du Signal et des Images: Paris-Est: 2018
nature-HAL: Thèse
DOI: none
Online: http://www.theses.fr/2018PESC1143
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=95174
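Two of the thesis's ideas lend themselves to short sketches: the rotation-prediction pretext task (the network must recover which of four rotations was applied to an unlabeled image) and the cosine-similarity classifier used for few-shot recognition. The hypothetical PyTorch snippet below illustrates both in miniature; the tiny encoder and all sizes are assumptions, not the thesis's actual architecture.

```python
# Hypothetical miniature of two ideas from the thesis (sizes are placeholders).
import torch
import torch.nn as nn

def rotate_batch(images: torch.Tensor):
    """Self-supervision: return 0/90/180/270-degree copies of each image
    together with the rotation index as a free label."""
    rotated = torch.cat([torch.rot90(images, k, dims=(2, 3)) for k in range(4)])
    labels = torch.arange(4).repeat_interleave(images.size(0))
    return rotated, labels

def cosine_classifier(feats, class_vectors, scale=10.0):
    """Few-shot style classifier: scaled cosine similarity between
    L2-normalized features and per-class weight vectors."""
    f = nn.functional.normalize(feats, dim=1)
    w = nn.functional.normalize(class_vectors, dim=1)
    return scale * f @ w.t()

# Tiny stand-in encoder with a 4-way rotation head.
encoder = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten())
rotation_head = nn.Linear(16, 4)

images = torch.randn(8, 3, 32, 32)            # unlabeled images
x, y = rotate_batch(images)                   # 32 samples, labels for free
loss = nn.functional.cross_entropy(rotation_head(encoder(x)), y)
loss.backward()                               # one self-supervised step

# Few-shot phase: score features against 5 (e.g. generated) class vectors.
class_vectors = torch.randn(5, 16)
scores = cosine_classifier(encoder(images).detach(), class_vectors)
print(scores.shape)  # torch.Size([8, 5])
```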
Other documents in this category:
- Localisation d'objets urbains à partir de sources multiples dont des images aériennes / Lionel Pibre (2018)
- Machine learning and pose estimation for autonomous robot grasping with collaborative robots / Victor Talbot (2018)
- SDE: A novel selective, discriminative and equalizing feature representation for visual recognition / Guo-Sen Xie in International journal of computer vision, vol 124 n° 2 (1 September 2017)
- Urban objects classification by spectral library: Feasibility and applications / Walid Ouerghemmi (2017)
- Sparse output coding for scalable visual recognition / Bin Zhao in International journal of computer vision, vol 119 n° 1 (August 2016)
- Object classification and recognition from mobile laser scanning point clouds in a road environment / Matti Lehtomäki in IEEE Transactions on geoscience and remote sensing, vol 54 n° 2 (February 2016)
- A joint Gaussian process model for active visual recognition with expertise estimation in crowdsourcing / Chengjiang Long in International journal of computer vision, vol 116 n° 2 (15th January 2016)
- Forest species recognition based on dynamic classifier selection and dissimilarity feature vector representation / J.G. Martins in Machine Vision and Applications, vol 26 n° 2-3 (April 2015)