Descriptor
Documents available in this category (24)
ChineseTR: A weakly supervised toponym recognition architecture based on automatic training data generator and deep neural network / Qinjun Qiu in Transactions in GIS, vol 26 n° 3 (May 2022)
[article]
Title: ChineseTR: A weakly supervised toponym recognition architecture based on automatic training data generator and deep neural network
Document type: Article/Communication
Authors: Qinjun Qiu; Zhong Xie; Shu Wang; et al.
Year of publication: 2022
Pages: pp. 1256-1279
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Web geomatics
[IGN terms] deep learning
[IGN terms] China
[IGN terms] recurrent neural network classification
[IGN terms] training data (machine learning)
[IGN terms] social media data
[IGN terms] data sampling
[IGN terms] OpenStreetMap
[IGN terms] automatic recognition
[IGN terms] gazetteer
[IGN terms] wiki site
[IGN terms] toponym
Abstract: (author) Toponym recognition is used to extract toponyms from natural language texts, which is a fundamental task of ubiquitous geographic information applications. Existing toponym recognition methods with state-of-the-art performance mainly leverage supervised learning (i.e., deep-learning-based approaches) with parameters learned from massive, labeled datasets that must be annotated manually. This is a great inconvenience when model training needs to fit different domain texts, especially those of social media messaging. To address this issue, this article proposes a weakly supervised Chinese toponym recognition (ChineseTR) architecture that leverages a training dataset creator that generates training datasets automatically based on word collections and associated word frequencies from various texts and an extension recognizer that employs a basic bidirectional recurrent neural network based on particular features designed for toponym recognition. The results show that the proposed ChineseTR achieves a 0.76 F1 score in a corpus with a 0.718 out-of-vocabulary rate and a 0.903 in-vocabulary rate. All comparative experiments demonstrate that ChineseTR is an effective and scalable architecture that recognizes toponyms.
Record number: A2022-462
Author affiliation: non-IGN
Theme: GEOMATICS
Type: Article
DOI: 10.1111/tgis.12902
Date of online publication: 02/02/2022
Online: https://doi.org/10.1111/tgis.12902
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=100796
in Transactions in GIS > vol 26 n° 3 (May 2022), pp. 1256-1279 [article]

Research on machine intelligent perception of urban geographic location based on high resolution remote sensing images / Jun Chen in Photogrammetric Engineering & Remote Sensing, PERS, vol 88 n° 4 (April 2022)
[article]
Title: Research on machine intelligent perception of urban geographic location based on high resolution remote sensing images
Document type: Article/Communication
Authors: Jun Chen; Cunjian Yang; Zengyang Yu
Year of publication: 2022
Pages: pp. 223-231
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] database
[IGN terms] China
[IGN terms] convolutional neural network classification
[IGN terms] cognition
[IGN terms] object detection
[IGN terms] automatic extraction
[IGN terms] geolocation
[IGN terms] high-resolution image
[IGN terms] artificial intelligence
[IGN terms] automatic recognition
[IGN terms] urban area
Abstract: (author) Machine intelligent perception (MIP) provides a novel way for human beings to recognize geographical locations automatically. MIP of geographical locations enables computers to describe locations automatically and quantitatively by extracting Earth's surface features and building relationships. The earth surface fingerprint is established here by mining the relationship between spatial objects with stable characteristics extracted from urban high-resolution remote sensing images, which realizes intelligent perception of geographical location innovatively. Mask Region-based Convolutional Neural Network is used to automatically extract the spatial objects such as playgrounds, crossroads, and bridges from the images. Then, the extracted spatial objects are encoded according to the landuse type, distance, and angle of 24 nearest objects to construct urban surface fingerprint database. The urban surface fingerprint database is used to match the geographical location of spatial objects in local images so that the matching algorithm can be used for machine recognition of the geographical location of specific objects in the target image. Taking the main cities in China as the experimental area, the success rate of location perception is 92%. We have made a useful exploration in the field of MIP of geographical location, hoping to promote the development of human cognition of geographical location.
Record number: A2022-285
Author affiliation: non-IGN
Theme: IMAGERY/COMPUTER SCIENCE
Type: Article
HAL type: ArtAvecCL-RevueIntern
DOI: 10.14358/PERS.21-00017R3
Date of online publication: 04/04/2022
Online: https://doi.org/10.14358/PERS.21-00017R3
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=100319
in Photogrammetric Engineering & Remote Sensing, PERS > vol 88 n° 4 (April 2022), pp. 223-231 [article]
Copies (1)
Barcode: 105-2022041; Call number: SL; Medium: Journal; Location: Documentation centre; Section: Journals in reading room; Availability: Available

Title: Scene understanding and gesture recognition for human-machine interaction
Document type: Thesis/HDR
Authors: Naina Dhingra
Publisher: Zurich : Eidgenossische Technische Hochschule ETH - Ecole Polytechnique Fédérale de Zurich EPFZ
Year of publication: 2022
General note: Bibliography. A dissertation submitted to attain the degree of Doctor of Sciences of ETH Zurich
Languages: French (fre)
Descriptors: [IGN subject headings] Artificial intelligence
[IGN terms] deep learning
[IGN terms] attention (machine learning)
[IGN terms] object-oriented classification
[IGN terms] convolutional neural network classification
[IGN terms] support vector machine classification
[IGN terms] image understanding
[IGN terms] RGB image
[IGN terms] human-machine interaction
[IGN terms] eye tracking
[IGN terms] automatic recognition
[IGN terms] pattern recognition
[IGN terms] gesture recognition
[IGN terms] recurrent neural network
[IGN terms] scene
[IGN terms] computer vision
Abstract: (author) Scene understanding and gesture recognition are useful for a myriad of applications such as human-robotic interaction, assisting blind and visually impaired people, advanced driver assistance systems, and autonomous driving. To work autonomously in real-world environments, automatic systems need to deliver non-verbal information to enhance the verbal communication in particular for blind people. We are exploring the holistic approach for providing the scene as well as gesture related information. We propose that incorporating attention mechanisms in neural networks which behave similarly to attention in the human brain, and conducting an integrated study using neural networks in real-time can yield significant improvements in the scene and gesture understanding, thereby enhancing the user experience. In this thesis, we investigate the understanding of visual scenes and gestures. We explore these two areas, in particular, by proposing novel architectures, training methods, user studies, and thorough evaluations. We show that, for deep learning approaches, attention or self attention mechanisms improve and push the boundaries of network performance for different tasks in consideration. We suggest that the various kinds of gestures can complement and supplement each other’s information to better understand non-verbal conversation; hence integrated gestures comprehension is useful. First, we focus on visual scene understanding using scene graph generation. We propose BGT-Net, a new network that uses an object detection model with 1) bidirectional gated recurrent units for object-object communication and 2) transformer encoders including self attention to classify the objects and their relationships. We address the problem of bias caused by the long tailed distribution in the dataset. This enables the network to perform even for the unseen objects or relationships in the dataset.
Second, we propose to learn hand gesture recognition from RGB and RGB-D videos using attention learning. We present a novel architecture based on residual connections and an attention mechanism. Our approach successfully detects hand gestures when evaluated on three open-source datasets. Third, we explore pointing gesture recognition and localization using open-source software, i.e. OpenPtrack which uses a deep learning based network to track multi-persons in the scene. We use a Kinect sensor as an input device and conduct a user study with 26 users to evaluate the system using two setup types. Fourth, we propose a technique to perform eye gaze tracking using OpenFace which is based on a deep learning model and RGB webcam. We use support vector machine regression to estimate the position of eye gaze on the screen. In a study, we evaluate the system with 28 users and show that this system can perform similarly to commercially expensive eye trackers. Finally, we focus on 3D head pose estimation using two models: 1) headPosr includes residual connections for the base network followed by a transformer encoder. It outperforms existing models but has a drawback of being computationally expensive; 2) lwPosr uses depthwise separable convolutions and transformer encoders. It is a two stream network in fine-grained fashion to estimate the three angles of the head pose. We demonstrate that this method is able to predict head poses better than state-of-the-art lightweight networks.
Contents note:
1- Introduction
2- Background
3- State of the art
4- Scene graph generation
5- 3D hand gesture recognition
6- Pointing gesture recognition
7- Eye-gaze tracking
8- Head pose estimation
9- Lightweight head pose estimation
10- Summary
Record number: 24039
Author affiliation: non-IGN
Theme: IMAGERY/COMPUTER SCIENCE
Type: Foreign thesis
Thesis note: PhD Thesis: Sciences: ETH Zurich: 2022
DOI: none
Online: https://www.research-collection.ethz.ch/handle/20.500.11850/559347
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=101876

A novel method based on deep learning, GIS and geomatics software for building a 3D city model from VHR satellite stereo imagery / Massimiliano Pepe in ISPRS International journal of geo-information, vol 10 n° 10 (October 2021)
[article]
Title: A novel method based on deep learning, GIS and geomatics software for building a 3D city model from VHR satellite stereo imagery
Document type: Article/Communication
Authors: Massimiliano Pepe; Domenica Costantino; Vincenzo Saverio Alfio; et al.
Year of publication: 2021
Pages: n° 697
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] Gram-Schmidt algorithm
[IGN terms] deep learning
[IGN terms] ArcGIS
[IGN terms] building detection
[IGN terms] footprint
[IGN terms] building height
[IGN terms] very-high-resolution image
[IGN terms] WorldView image
[IGN terms] 3D city model
[IGN terms] digital surface model
[IGN terms] Oman
[IGN terms] pansharpening (image fusion)
[IGN terms] automatic recognition
[IGN terms] geographic information system
Abstract: (author) The aim of the paper is to identify a suitable method for the construction of a 3D city model from stereo satellite imagery. In order to reach this goal, it is necessary to build a workflow consisting of three main steps: (1) Increasing the geometric resolution of the color images through the use of pan-sharpening techniques, (2) identification of the buildings’ footprint through deep-learning techniques and, finally, (3) building an algorithm in GIS (Geographic Information System) for the extraction of the elevation of buildings. The developed method was applied to stereo imagery acquired by WorldView-2 (WV-2), a commercial Earth-observation satellite. The comparison of the different pan-sharpening techniques showed that the Gram–Schmidt method provided better-quality color images than the other techniques examined; this result was deduced from both the visual analysis of the orthophotos and the analysis of quality indices (RMSE, RASE and ERGAS). Subsequently, a deep-learning technique was applied for pan sharpening an image in order to extract the footprint of buildings. Performance indices (precision, recall, overall accuracy and the F1 measure) showed an elevated accuracy in automatic recognition of the buildings. Finally, starting from the Digital Surface Model (DSM) generated by satellite imagery, an algorithm built in the GIS environment allowed the extraction of the building height from the elevation model. In this way, it was possible to build a 3D city model where the buildings are represented as prismatic solids with flat roofs, in a fast and precise way.
Record number: A2021-801
Author affiliation: non-IGN
Theme: GEOMATICS/IMAGERY
Type: Article
HAL type: ArtAvecCL-RevueIntern
DOI: 10.3390/ijgi10100697
Date of online publication: 14/10/2021
Online: https://doi.org/10.3390/ijgi10100697
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=98853
in ISPRS International journal of geo-information > vol 10 n° 10 (October 2021), n° 697 [article]

Activity recognition in residential spaces with Internet of things devices and thermal imaging / Kshirasagar Naik in Sensors, vol 21 n° 3 (February 2021)
[article]
Title: Activity recognition in residential spaces with Internet of things devices and thermal imaging
Document type: Article/Communication
Authors: Kshirasagar Naik; Tejas Pandit; Nitin Naik; et al.
Year of publication: 2021
Pages: n° 988
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] image understanding
[IGN terms] remote sensing monitoring
[IGN terms] event detection
[IGN terms] indoor space
[IGN terms] RGB image
[IGN terms] thermal image
[IGN terms] artificial intelligence
[IGN terms] Internet of Things
[IGN terms] iteration
[IGN terms] stereoscopic model
[IGN terms] moving object
[IGN terms] automatic recognition
[IGN terms] object recognition
[IGN terms] 3D scene
Abstract: (author) In this paper, we design algorithms for indoor activity recognition and 3D thermal model generation using thermal images, RGB images, captured from external sensors, and the internet of things setup. Indoor activity recognition deals with two sub-problems: Human activity and household activity recognition. Household activity recognition includes the recognition of electrical appliances and their heat radiation with the help of thermal images. A FLIR ONE PRO camera is used to capture RGB-thermal image pairs for a scene. Duration and pattern of activities are also determined using an iterative algorithm, to explore kitchen safety situations. For more accurate monitoring of hazardous events such as stove gas leakage, a 3D reconstruction approach is proposed to determine the temperature of all points in the 3D space of a scene. The 3D thermal model is obtained using the stereo RGB and thermal images for a particular scene. Accurate results are observed for activity detection, and a significant improvement in the temperature estimation is recorded in the 3D thermal model compared to the 2D thermal image. Results from this research can find applications in home automation, heat automation in smart homes, and energy management in residential spaces.
Record number: A2021-159
Author affiliation: non-IGN
Theme: IMAGERY
Type: Article
HAL type: ArtAvecCL-RevueIntern
DOI: 10.3390/s21030988
Date of online publication: 02/02/2021
Online: https://doi.org/10.3390/s21030988
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=97075
in Sensors > vol 21 n° 3 (February 2021), n° 988 [article]

Multiview automatic target recognition for infrared imagery using collaborative sparse priors / Xuelu Li in IEEE Transactions on geoscience and remote sensing, vol 58 n° 10 (October 2020)
Navigation des personnes aux moyens des technologies des smartphones et des données d’environnements cartographiés / Fadoua Taia Alaoui (2018)
Trajectory-based place-recognition for efficient large scale localization / Simon Lynen in International journal of computer vision, vol 124 n° 1 (August 2017)
Unsupervised feature learning for land-use scene recognition / Jiayuan Fan in IEEE Transactions on geoscience and remote sensing, vol 55 n° 4 (April 2017)
Single Image Super-Resolution based on Neural Networks for text and face recognition / Clément Peyrard (2017)
Image based geo-localization in the Alps / Olivier Saurer in International journal of computer vision, vol 116 n° 3 (February 2016)
Fast robust large-scale mapping from video and internet photo collections / J. Frahm in ISPRS Journal of photogrammetry and remote sensing, vol 65 n° 6 (November - December 2010)
Indexation rapide de documents audio par traitement morphologique de la parole / F. Salama in Ingénierie des systèmes d'information, ISI : Revue des sciences et technologies de l'information, RSTI, vol 15 n° 2 (March - April 2010)
A structure recognition technique in contextual generalisation of buildings and built-up areas / Melih Basaraner in Cartographic journal (the), vol 45 n° 4 (November 2008)
8es rencontres nationales des jeunes chercheurs en intelligence artificielle, RJCIA 2007, 4 - 6 juillet 2007, Grenoble, France / Bruno Zanuttini (2007)