Author details
Author: Joseph Chazalon
Available documents written by this author (5)



A benchmark of named entity recognition approaches in historical documents : application to 19th century French directories / Nathalie Abadie (2022)
Title: A benchmark of named entity recognition approaches in historical documents : application to 19th century French directories
Document type: Article/Conference paper
Authors: Nathalie Abadie, Author; Edwin Carlinet, Author; Joseph Chazalon, Author; Bertrand Duménieu, Author
Publisher: Berlin, Heidelberg, Vienna, New York, ... : Springer
Publication year: 2022
Series: Lecture Notes in Computer Science, ISSN 0302-9743, no. 13237
Projects: SODUCO / Perret, Julien
Conference: DAS 2022, 15th IAPR International Workshop on Document Analysis Systems, 22/05/2022 - 25/05/2022, La Rochelle, France, Proceedings Springer
Extent: pp. 445-460
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Geomatics
[IGN terms] convolutional neural network classification
[IGN terms] nineteenth century
[IGN terms] training data (machine learning)
[IGN terms] text mining
[IGN terms] geohistorical object
[IGN terms] name recognition
[IGN terms] natural language processing
Abstract: (author) Named entity recognition (NER) is a necessary step in many pipelines targeting historical documents. Indeed, such natural language processing techniques identify which class each text token belongs to, e.g. “person name”, “location”, “number”. Introducing a new public dataset built from 19th century French directories, we first assess how noisy modern, off-the-shelf OCR systems are. Then, we compare modern CNN- and Transformer-based NER techniques which can be reasonably used in the context of historical document analysis. We measure their requirements in terms of training data, the effects of OCR noise on their performance, and show how Transformer-based NER can benefit from unsupervised pre-training and supervised fine-tuning on noisy data. Results can be reproduced using resources available at https://github.com/soduco/paper-ner-bench-das22 and https://zenodo.org/record/6394464.
Record number: C2022-030
Author affiliation: UGE-LASTIG+Ext (2020- )
Other associated URL: to HAL
Theme: GEOMATICS/COMPUTER SCIENCE
Type: Conference paper
HAL type: ComAvecCL&ActesPubliésIntl
DOI: 10.1007/978-3-031-06555-2_30
Online: http://dx.doi.org/10.1007/978-3-031-06555-2_30
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=101088
Combining deep learning and mathematical morphology for historical map segmentation / Yizi Chen (2021)
Title: Combining deep learning and mathematical morphology for historical map segmentation
Document type: Chapter/Contribution
Authors: Yizi Chen, Author; Edwin Carlinet, Author; Joseph Chazalon, Author; Clément Mallet, Author; Bertrand Duménieu, Author; Julien Perret, Author
Publisher: Berlin, Heidelberg, Vienna, New York, ... : Springer
Publication year: 2021
Series: Lecture Notes in Computer Science, ISSN 0302-9743, no. 12708
Projects: SODUCO / Perret, Julien
Conference: DGMM 2021, 1st International Joint Conference on Discrete Geometry and Mathematical Morphology, 24/05/2021 - 27/05/2021, Uppsala, Sweden, Proceedings Springer
Extent: pp. 79-92
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Geomatics
[IGN terms] diachronic analysis
[IGN terms] deep learning
[IGN terms] historical map
[IGN terms] processing pipeline
[IGN terms] convolutional neural network classification
[IGN terms] object detection
[IGN terms] raster data
[IGN terms] mathematical morphology
[IGN terms] vectorization
Abstract: (author) The digitization of historical maps enables the study of ancient, fragile, unique, and hardly accessible information sources. Main map features can be retrieved and tracked through time for subsequent thematic analysis. The goal of this work is the vectorization step, i.e., the extraction of vector shapes of the objects of interest from raster images of maps. We are particularly interested in closed shape detection such as buildings, building blocks, gardens, rivers, etc. in order to monitor their temporal evolution. Historical map images present significant pattern recognition challenges. The extraction of closed shapes by using traditional Mathematical Morphology (MM) is highly challenging due to the overlapping of multiple map features and texts. Moreover, state-of-the-art Convolutional Neural Networks (CNN) are perfectly designed for content image filtering but provide no guarantee about closed shape detection. Also, the lack of textural and color information of historical maps makes it hard for CNN to detect shapes that are represented by only their boundaries. Our contribution is a pipeline that combines the strengths of CNN (efficient edge detection and filtering) and MM (guaranteed extraction of closed shapes) in order to achieve such a task. The evaluation of our approach on a public dataset shows its effectiveness for extracting the closed boundaries of objects in historical maps.
Record number: H2021-001
Author affiliation: UGE-LASTIG+Ext (2020- )
Other associated URL: to HAL
Theme: GEOMATICS
Type: Chapter/Contribution
HAL type: ChOuvrScient
DOI: 10.1007/978-3-030-76657-3_5
Online publication date: 16/05/2021
Online: http://dx.doi.org/10.1007/978-3-030-76657-3_5
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=96739
ICDAR 2021 competition on historical map segmentation / Joseph Chazalon (2021)
Title: ICDAR 2021 competition on historical map segmentation
Document type: Article/Conference paper
Authors: Joseph Chazalon, Author; Edwin Carlinet, Author; Yizi Chen, Author; Julien Perret, Author; Bertrand Duménieu, Author; Clément Mallet, Author; Thierry Géraud, Author; Vincent Nguyen, Author; Nam Nguyen, Author; Josef Baloun, Author; Ladislav Lenc, Author; Pavel Král, Author
Publisher: Le Kremlin Bicêtre : Ecole pour l'Informatique et les Techniques Avancées EPITA
Publication year: 2021
Projects: 1-No project / Perret, Julien
Conference: ICDAR 2021, 16th International Conference on Document Analysis and Recognition, 05/09/2021 - 10/09/2021, Lausanne, Switzerland, Proceedings Springer
Extent: 15 p.
General note: bibliography
Languages: English (eng)
Abstract: (author) This paper presents the final results of the ICDAR 2021 Competition on Historical Map Segmentation (MapSeg), encouraging research on a series of historical atlases of Paris, France, drawn at 1/5000 scale between 1894 and 1937. The competition featured three tasks, awarded separately. Task 1 consists in detecting building blocks and was won by the L3IRIS team using a DenseNet-121 network trained in a weakly supervised fashion. This task is evaluated on 3 large images containing hundreds of shapes to detect. Task 2 consists in segmenting map content from the larger map sheet, and was won by the UWB team using a U-Net-like FCN combined with a binarization method to increase detection edge accuracy. Task 3 consists in locating intersection points of geo-referencing lines, and was also won by the UWB team who used a dedicated pipeline combining binarization, line detection with Hough transform, candidate filtering, and template matching for intersection refinement. Tasks 2 and 3 are evaluated on 95 map sheets with complex content. Dataset, evaluation tools and results are available under permissive licensing at https://icdar21-mapseg.github.io/.
Record number: C2021-022
Author affiliation: UGE-LASTIG+Ext (2020- )
Type: Conference paper
HAL type: ComAvecCL&ActesPubliésIntl
DOI: none
Online: https://hal.archives-ouvertes.fr/hal-03256193/document
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=98032
Introducing the boundary-aware loss for deep image segmentation / Minh On Vu Ngoc (2021)
Title: Introducing the boundary-aware loss for deep image segmentation
Document type: Article/Conference paper
Authors: Minh On Vu Ngoc, Author; Yizi Chen, Author; Nicolas Boutry, Author; Joseph Chazalon, Author; Edwin Carlinet, Author; Jonathan Fabrizio, Author; Clément Mallet, Author; Thierry Géraud, Author
Publisher: The British Machine Vision Association Press (BMVA)
Publication year: 2021
Projects: SODUCO / Perret, Julien
Conference: BMVC 2021, 32nd British Machine Vision Conference, 22/11/2021 - 25/11/2021, online, United Kingdom, OA Proceedings
Extent: 17 p.
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Image processing
[IGN terms] deep learning
[IGN terms] barycentric classification
[IGN terms] image segmentation
Abstract: (author) Most contemporary supervised image segmentation methods do not preserve the initial topology of the given input (like the closeness of the contours). One can generally remark that edge points have been inserted or removed when the binary prediction and the ground truth are compared. This can be critical when accurate localization of multiple interconnected objects is required. In this paper, we present a new loss function, called Boundary-Aware loss (BALoss), based on the Minimum Barrier Distance (MBD) cut algorithm. It is able to locate what we call the leakage pixels and to encode the boundary information coming from the given ground truth. Thanks to this adapted loss, we are able to significantly refine the quality of the predicted boundaries during the learning procedure. Furthermore, our loss function is differentiable and can be applied to any kind of neural network used in image processing. We apply this loss function on the standard U-Net and DC U-Net on Electron Microscopy datasets. They are well-known to be challenging due to their high noise level and to the close or even connected objects covering the image space. Our segmentation performance, in terms of Variation of Information (VOI) and Adapted Rank Index (ARI), is very promising and leads to 15% better scores of VOI and 5% better scores of ARI than the state-of-the-art. The code of boundary-awareness loss is freely available at https://github.com/onvungocminh/MBD_BAL
Record number: C2021-054
Author affiliation: UGE-LASTIG+Ext (2020- )
Theme: IMAGERY
Type: Conference paper
HAL type: ComAvecCL&ActesPubliésIntl
DOI: none
Online: https://www.bmvc2021-virtualconference.com/assets/papers/1546.pdf
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=99411
Vectorization of historical maps using deep edge filtering and closed shape extraction / Yizi Chen (2021)
Title: Vectorization of historical maps using deep edge filtering and closed shape extraction
Document type: Article/Conference paper
Authors: Yizi Chen, Author; Edwin Carlinet, Author; Joseph Chazalon, Author; Clément Mallet, Author; Bertrand Duménieu, Author; Julien Perret, Author
Publisher: Saint-Mandé : Institut national de l'information géographique et forestière - IGN
Publication year: 2021
Projects: SODUCO / Perret, Julien
Conference: ICDAR 2021, 16th International Conference on Document Analysis and Recognition, 05/09/2021 - 10/09/2021, Lausanne, Switzerland, OA Proceedings
Extent: 17 p.
Format: 21 x 30 cm
General note: bibliography
Languages: English (eng)
Descriptors: [IGN subject headings] Geomatics
[IGN terms] deep learning
[IGN terms] historical map
[IGN terms] processing pipeline
[IGN terms] feature extraction
[IGN terms] digital image filtering
[IGN terms] image processing
[IGN terms] vectorization
Abstract: (author) Maps have been a unique source of knowledge for centuries. Such historical documents provide invaluable information for analyzing the complex spatial transformation of landscapes over important time frames. This is particularly true for urban areas that encompass multiple interleaved research domains (social sciences, economy, etc.). The large amount and significant diversity of map sources call for automatic image processing techniques in order to extract the relevant objects under a vectorial shape. The complexity of maps (text, noise, digitization artifacts, etc.) has hindered the capacity of proposing versatile and efficient raster-to-vector approaches for decades. We propose a learnable, reproducible, and reusable solution for the automatic transformation of raster maps into vector objects (building blocks, streets, rivers). It is built upon the complementary strength of mathematical morphology and convolutional neural networks through efficient edge filtering. Even more, we modify ConnNet and combine it with a deep edge filtering architecture to make use of pixel connectivity information and build an end-to-end system without requiring any post-processing techniques. In this paper, we focus on the comprehensive benchmark on various architectures on multiple datasets coupled with a novel vectorization step. Our experimental results on a new public dataset using COCO Panoptic metric exhibit very encouraging results confirmed by a qualitative analysis of the success and failure cases of our approach. Code, dataset, results and extra illustrations are freely available at https://github.com/soduco/ICDAR-2021-Vectorization
Record number: C2021-011
Author affiliation: UGE-LASTIG+Ext (2020- )
Theme: GEOMATICS/IMAGERY/COMPUTER SCIENCE
Type: Conference paper
HAL type: ComAvecCL&ActesPubliésIntl
DOI: none
Online: https://hal.archives-ouvertes.fr/hal-03256073/document
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=97988