IGN online catalogue

Publisher details

Université Gustave Eiffel

located in:

Champs-sur-Marne

Documents available from this publisher (21)

A benchmark of nested named entity recognition approaches in historical structured documents / Solenn Tual (2023)

Public

Title : A benchmark of nested named entity recognition approaches in historical structured documents
Document type : Article/Communication
Authors : Solenn Tual, Author ; Nathalie Abadie, Author ; Joseph Chazalon, Author ; Bertrand Duménieu, Author ; Edwin Carlinet, Author
Publisher : Champs-sur-Marne [France] : Université Gustave Eiffel
Publication year : 2023
Projects : SODUCO / Perret, Julien
Extent : 18 p.
Format : 21 x 30 cm
General note : Bibliography
Languages : English (eng)
Descriptors : [IGN subject headings] Geomatics
[IGN terms] natural language (computing)
[IGN terms] name recognition
[IGN terms] natural language processing

Abstract : (Author) Named Entity Recognition (NER) is a key step in the creation of structured data from digitised historical documents. Traditional NER approaches deal with flat named entities, whereas entities are often nested. For example, a postal address might contain a street name and a number. This work compares three nested NER approaches, including two state-of-the-art approaches using Transformer-based architectures. We introduce a new Transformer-based approach based on joint labelling and semantic weighting of errors, evaluated on a collection of 19th-century Paris trade directories. We evaluate approaches regarding the impact of supervised fine-tuning, unsupervised pre-training with noisy texts, and variation of IOB tagging formats. Our results show that while nested NER approaches enable extracting structured data directly, they do not benefit from the extra knowledge provided during training and reach a performance similar to the base approach on flat entities. Even though all three approaches perform well in terms of F1 scores, joint labelling is most suitable for hierarchically structured data. Finally, our experiments reveal the superiority of the IO tagging format on such data.
Record number : P2023-001
Author affiliation : UGE-LASTIG+Ext (2020- )
Theme : GEOMATICS/TOPONYMY
Nature : Preprint
HAL nature : Preprint
DOI : none
Online publication date : 20/02/2023
Online : https://hal.science/hal-03994759v1/document
Electronic resource format : URL Article
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=102602

Entry separation using a mixed visual and textual language model: Application to 19th century French trade directories / Bertrand Duménieu (2023)

Public

Title : Entry separation using a mixed visual and textual language model: Application to 19th century French trade directories
Document type : Article/Communication
Authors : Bertrand Duménieu, Author ; Edwin Carlinet, Author ; Nathalie Abadie, Author ; Joseph Chazalon, Author
Publisher : Champs-sur-Marne [France] : Université Gustave Eiffel
Publication year : 2023
Projects : SODUCO / Perret, Julien
Extent : 20 p.
Format : 21 x 30 cm
General note : Bibliography
Languages : English (eng)
Descriptors : [IGN subject headings] Geomatics
[IGN terms] directory
[IGN terms] nineteenth century
[IGN terms] language model
[IGN terms] name recognition

Abstract : (Author) When extracting structured data from repetitively organized documents, such as dictionaries, directories, or even newspapers, a key challenge is to correctly segment what constitutes the basic text regions for the target database. Traditionally, such a problem was tackled as part of the layout analysis and was mostly based on visual clues for dividing (top-down) approaches. Some agglomerating (bottom-up) approaches started to consider textual information to link similar contents, but they required a proper over-segmentation of fine-grained units. In this work, we propose a new pragmatic approach whose efficiency is demonstrated on 19th-century French Trade Directories. We propose to consider two sub-problems: coarse layout detection (text columns and reading order), which is assumed to be effective and not detailed here, and a fine-grained entry separation stage for which we propose to adapt a state-of-the-art Named Entity Recognition (NER) approach. By injecting special visual tokens, coding, for instance, indentation or breaks, into the token stream of the language model used for NER purposes, we can leverage both textual and visual knowledge simultaneously. Code, data, results and models are available at https://github.com/soduco/paper-entryseg-icdar23-code, https://huggingface.co/HueyNemud/ (icdar23-entrydetector* variants).
Record number : P2023-002
Author affiliation : UGE-LASTIG+Ext (2020- )
Theme : GEOMATICS/COMPUTER SCIENCE/TOPONYMY
Nature : Preprint
HAL nature : Preprint
DOI : none
Online publication date : 17/02/2023
Online : https://hal.science/hal-03994702v1/
Electronic resource format : URL Article
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=102609

Exploring the potential of deep learning for map generalization / Azelle Courtial (2023)

Public

Title : Exploring the potential of deep learning for map generalization
Document type : Thesis/HDR
Authors : Azelle Courtial, Author ; Guillaume Touya, Thesis supervisor ; Xiang Zhang, Thesis supervisor
Publisher : Champs-sur-Marne [France] : Université Gustave Eiffel
Publication year : 2023
Extent : 216 p.
General note : bibliography
Doctoral thesis from Université Gustave Eiffel, Doctoral school MSTIC, Specialty "Geographic information sciences"
Languages : English (eng)
Descriptors : [IGN terms] automatic data generalization
[IGN terms] automated cartographic generalization
[IGN terms] spatial relation
[IGN terms] generative adversarial network
[IGN terms] deep neural network
[IGN subject headings] Generalization

Decimal index : THESE Theses and HDR
Abstract : (author) Map generalization is a process that aims to adapt the level of detail of geographic information for cartography at a small scale. Automating the process is complex but essential in map production. We think this research field could benefit from the recent advances in deep learning that make it possible to solve more and more complex tasks, using numerous training examples. This thesis proposes exploring the potential of deep learning for map generalization. This exploration is built upon three map generalization use cases: recognition of spatial relations, graphic generalization of mountain roads, and generalization of topographic maps at medium scales. These three use cases enable us to address research questions relative to the concrete implementation of deep learning models for map generalization (including dataset creation and architecture), the evaluation of such models and their integration in existing generalization processes. In addition to the models and training set adapted for each of our case studies already mentioned, we propose evaluation methods adapted to the challenges of cartographic generalization by deep learning. Finally, we propose a partitioning of the cartographic generalization into sub-problems facilitating the resolution by learning and allowing the generation of generalized map images.
Contents note : Introduction

Part 1 A new paradigm for map generalization
Chapter A. Literature review
Chapter B. Formulating map generalization as a deep learning task
Chapter C. Designing a framework for deep learning based map generalization

Part 2 Exploration of deep learning for map generalization
Chapter D. Can graph neural networks model spatial relations?
Chapter E. CNN for the generalization of roads
Chapter F. The generation of topographic map with several themes

Part 3 The future of map generalization with deep learning
Chapter G. Usages of deep learning models for map generalization
Chapter H. Evaluation of deep learning predictions

Conclusion
Record number : 17752
Author affiliation : UGE-LASTIG (2020- )
Theme : GEOMATICS
Nature : French thesis
Host organization : LASTIG (IGN)
HAL nature : Thesis
DOI : none
Online publication date : 05/05/2023
Online : https://theses.hal.science/tel-04089883v1
Electronic resource format : URL
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=103186

Modern vectorization and alignment of historical maps: An application to Paris Atlas (1789-1950) / Yizi Chen (2023)

Public

Title : Modern vectorization and alignment of historical maps: An application to Paris Atlas (1789-1950)
Original title : Vectorisation et alignement modernes des cartes historiques : Une application à l'Atlas de Paris (1789-1950)
Document type : Thesis/HDR
Authors : Yizi Chen, Author ; Julien Perret, Thesis supervisor ; Joseph Chazalon, Thesis supervisor ; Clément Mallet, Thesis supervisor
Publisher : Champs-sur-Marne [France] : Université Gustave Eiffel
Publication year : 2023
Extent : 124 p.
Format : 21 x 30 cm
General note : bibliography
Languages : English (eng)
Descriptors : [IGN subject headings] Optical image processing
[IGN terms] data alignment
[IGN terms] deep learning
[IGN terms] historical map
[IGN terms] convolutional neural network classification
[IGN terms] local contrast
[IGN terms] automatic extraction
[IGN terms] georeferenced dataset
[IGN terms] mathematical morphology
[IGN terms] Paris (75)
[IGN terms] city map
[IGN terms] pattern recognition
[IGN terms] vectorization
[IGN terms] computer vision

Decimal index : THESE Theses and HDR
Abstract : (author) Maps have been a unique source of knowledge for centuries. These historical documents provide invaluable information for analyzing complex spatial transformations over long periods. This is particularly true for urban areas, which span multiple interwoven research fields: humanities, social sciences, etc. The complexity of maps (text, noise, digitization artifacts, etc.) has for decades hindered the ability to propose versatile and efficient vectorization approaches. In this thesis, we propose a learnable, reproducible, and reusable solution for the automatic transformation of raster maps into vector objects (city blocks, streets, rivers), focusing on the problem of closed shape extraction. Our approach relies on the complementarity of convolutional neural networks, which excel at [...], and mathematical morphology, which offers strong guarantees for closed shape extraction while being very sensitive to noise. To improve the noise robustness of convolutional filters, we compare several loss functions specifically aimed at preserving the topological properties of the results, and propose new ones. To this end, we also introduce a new type of convolutional layer (CConv) that exploits image contrast, in order to explore the possibilities of such improvements through architectural transformations of the networks. Finally, we compare the different approaches and architectures that can be used to implement each step of our processing pipeline, and how best to combine them.
With a working processing pipeline, we propose a new alignment procedure for historical map images, and begin to take advantage of the redundancy of the data extracted from similar images to propagate annotations, improve vectorization quality, and possibly detect cases of evolution for thematic analysis, or automatically estimate vectorization quality. To evaluate the performance of the methods mentioned above, we have published a new dataset of annotated historical map images. It is the first open-access dataset dedicated to the vectorization of historical maps. We hope that, through our publications and the open and public release of our results, source code, and datasets, this research can be useful to a wide range of applications related to historical maps.
Contents note : 1- Introduction
2- Pipeline design for historical map vectorization
3- Learning edges through deep neural architectures
4- Topology-aware loss functions
5- Improving model robustness of deep edge detectors
6- Leveraging redundancies of historical maps
7- Conclusion and perspectives
Record number : 10713
Author affiliation : UGE-LASTIG (2020- )
Theme : IMAGERY/COMPUTER SCIENCE
Nature : French thesis
Thesis note : doctoral thesis : Geographic sciences : UGE : 2023
Host organization : LASTIG (IGN)
HAL nature : Thesis
DOI : none
Online : https://theses.hal.science/tel-04106107
Electronic resource format : URL
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=103264

Structured learning of geospatial data / Loïc Landrieu (2023)

Public

Title : Structured learning of geospatial data
Document type : Thesis/HDR
Authors : Loïc Landrieu, Author
Publisher : Champs-sur-Marne [France] : Université Gustave Eiffel
Publication year : 2023
Extent : 179 p.
Format : 21 x 30 cm
General note : Bibliography
Habilitation à Diriger des Recherches (HDR) awarded by Université Gustave Eiffel, Specialty "Geographic Information Sciences and Technologies"
Languages : English (eng)
Descriptors : [IGN subject headings] Optical image processing
[IGN terms] Cut Pursuit algorithm
[IGN terms] machine learning
[IGN terms] agricultural map
[IGN terms] graph
[IGN terms] laser scanning
[IGN terms] pattern recognition
[IGN terms] semantic segmentation
[IGN terms] time series
[IGN terms] computer vision

Abstract : (author) This manuscript presents an overview of my work in the field of geospatial machine learning, a rapidly growing interdisciplinary field that poses many methodological challenges and has a wide range of impactful applications. Throughout my research, I have focused on developing bespoke approaches that leverage the unique properties of geospatial data to create more efficient, precise, and parsimonious models. This manuscript is divided into four main chapters, each covering a different property of geospatial data structures that can be leveraged algorithmically. The first chapter presents a versatile mathematical framework formalizing the concept of spatial regularity with graphs. We propose an efficient algorithm that tackles a broad family of spatial problems and provides novel convergence guarantees and significant speed-ups compared to generic approaches. The second chapter introduces a deep learning method that extends the idea of exploiting graph regularity to the case of massive 3D point clouds. We simplify the task of large-scale semantic segmentation by formulating it as a small graph labelling problem. Our compact models reach high precision at a fraction of the computational cost of other approaches. In the third chapter, we present a collection of methods designed to take advantage of the data structure inherited from 3D sensors. By considering the sensors’ structure, we develop powerful networks with state-of-the-art accuracy, latency, and robustness for various applications and data types. The last chapter dives into the real-life challenge of automated satellite time series analysis for crop mapping. Recognizing the difference between such data and standard formats used in computer vision, we propose novel and streamlined architectures that achieve unprecedented precision while remaining efficient and economical in memory and preprocessing.
We also introduce the task of panoptic segmentation for satellite time series and an efficient architecture to solve this problem at scale. In summary, this manuscript argues that geospatial problems represent a challenging and impactful venue for evaluating the newest machine learning and vision methods and a fertile source of inspiration for designing novel approaches.
Contents note : 1- Introduction
2- Exploiting graph regularity
3- Exploiting the spatial regularity of 3D data
4- Exploiting the structure of 3D sensors
5- Exploiting the structure of satellite time series
6- Perspectives
7- Curriculum vitae
Record number : 24107
Author affiliation : UGE-LASTIG (2020- )
Theme : IMAGERY
Nature : HDR
Thesis note : HDR : Geographic Information Sciences and Technologies : UGE : 2023
Host organization : LASTIG (IGN)
DOI : none
Online : https://hal.science/tel-04095452v1
Electronic resource format : URL
Permalink : https://documentation.ensg.eu/index.php?lvl=notice_display&id=103248

Évaluation de la qualité des données géographiques d'OpenStreetMap à l'aide des méthodes d'apprentissage automatique : cas de la République de Djibouti / Ibrahim Maidaneh Abdi (2022)

Feature matching for multi-epoch historical aerial images / Lulin Zhang (2022)

Learning spatio-temporal representations of satellite time series for large-scale crop mapping / Vivien Sainte Fare Garnot (2022)

Learning surface reconstruction from point clouds in the wild / Raphaël Sulzer (2022)

Monitoring grassland dynamics by exploiting multi-modal satellite image time series / Anatol Garioud (2022)

Scaling up and evaluating surface reconstruction from point clouds of open scenes / Yanis Marchand (2022)

Apport des données Sentinel-1 pour le suivi continu de la forêt tropicale : Cas de la Guyane / Marie Ballère (2021)

Description et recherche d’image généralisables pour l’interconnexion et l’analyse multi-source / Dimitri Gominski (2021)

Intelligent embedded camera for robust object tracking on mobile platform / Imane Salhi (2021)

Knowledge graph management and streaming in the context of edge computing / Weiqin Xu (2021)
