Descriptor
Documents available in this category (57)
Title: Dynamic scene understanding using deep neural networks
Document type: Thesis/HDR
Authors: Ye Lyu, Author; M. George Vosselman, Thesis supervisor; Michael Ying Yang, Thesis supervisor
Publisher: Enschede [Netherlands]: International Institute for Geo-Information Science and Earth Observation ITC
Publication year: 2021
Language: English (eng)
Descriptor: [Vedettes matières IGN] Image processing
[Termes IGN] attention (machine learning)
[Termes IGN] processing pipeline
[Termes IGN] conditional random field
[Termes IGN] image understanding
[Termes IGN] object detection
[Termes IGN] UAV imagery
[Termes IGN] video image
[Termes IGN] target tracking
[Termes IGN] regression
[Termes IGN] semantic segmentation
Abstract: (author) Scene understanding is an important and fundamental research field in computer vision, and it is useful for many applications in photogrammetry and remote sensing. It focuses on locating and classifying objects in images and understanding the relationships between them. The higher goal is to interpret what event happens in the scene, when and why it happens, and what we should do based on that information. Dynamic scene understanding uses information from different times to interpret scenes and answer these questions. For modern scene understanding, deep learning has shown great potential. "Deep" in deep learning refers to the use of multiple layers in the neural networks. Deep neural networks are powerful because they are highly non-linear functions that can map from one domain to a quite different domain after proper training, which makes them a strong solution for many fundamental scene understanding tasks. This PhD research also takes advantage of deep learning for dynamic scene understanding. Temporal information plays an important role in dynamic scene understanding. Compared with static scene understanding from images, information distilled from the time dimension provides value in many different ways. Images across consecutive frames are highly correlated, i.e., objects observed in one frame have a very high chance of being observed and identified in nearby frames as well. Such redundancy in observation can reduce the uncertainty of object recognition with deep learning based methods, resulting in more consistent inference. High correlation across frames can also improve the chance of recognizing objects correctly: if the camera or the object moves, the object is observed in multiple views with different poses and appearances, so the information captured for object recognition is more diverse and complementary and can be aggregated to jointly infer the categories and properties of objects. This PhD research involves several tasks related to dynamic scene understanding in computer vision, including semantic segmentation for aerial platform images (chapters 2, 3), video object segmentation and video object detection for common objects in natural scenes (chapters 4, 5), and multi-object tracking and segmentation for cars and pedestrians in driving scenes (chapter 6). Chapter 2 investigates how to establish a semantic segmentation benchmark for UAV images, including data collection, data labeling, dataset construction, and performance evaluation with baseline deep neural networks and the proposed multi-scale dilation net. A conditional random field with feature space optimization is used to achieve consistent semantic segmentation predictions in videos. Chapter 3 investigates how to better extract scene context information for better object recognition performance by proposing novel bidirectional multi-scale attention networks, which achieve better performance by inferring features and attention weights for feature fusion from both higher-level and lower-level branches. Chapter 4 investigates how to simultaneously segment multiple objects across multiple frames by combining memory modules with instance segmentation networks. Our method learns to propagate the target object labels without auxiliary data, such as optical flow, which simplifies the model.
Chapter 5 investigates how to improve the performance of well-trained object detectors with a lightweight and efficient plug-and-play tracker for object detection in video. This chapter also investigates how the proposed model performs when video training data are lacking. Chapter 6 investigates how to improve the performance of detection, segmentation, and tracking by jointly considering top-down and bottom-up inference. The whole pipeline follows a multi-task design, i.e., a single feature extraction backbone with multiple heads for different sub-tasks (a minimal sketch of this design follows this record). Overall, this manuscript has delved into several different computer vision tasks that share fundamental research problems, including detection, segmentation, and tracking. Based on the research experiments and knowledge from the literature review, several reflections regarding dynamic scene understanding are discussed: the range of object context influences the quality of object recognition; the quality of video data affects the choice of method for a specific computer vision task; and detection and tracking are complementary to each other. For future work, a unified dynamic scene understanding task could be a trend, and transformers plus self-supervised learning are one promising research direction. Real-time processing for dynamic scene understanding requires further research in order to put these methods to use in real-world applications.
Record number: 12984
Author affiliation: non IGN
Theme: IMAGERIE/INFORMATIQUE
Nature: Foreign thesis
Thesis note: PhD thesis: Geo-Information Science and Earth Observation: Enschede, University of Twente: 2021
DOI: 10.3990/1.9789036552233
Online publication date: 08/09/2021
Online: https://library.itc.utwente.nl/papers_2021/phd/lyu.pdf
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=100962
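The multi-task design named in Chapter 6 of the record above (a single feature-extraction backbone with multiple heads for different sub-tasks) can be made concrete with a short sketch. This is a minimal PyTorch illustration under assumed choices, not the thesis implementation: the ResNet-50 backbone, channel counts, head layouts, and the names MultiTaskNet, det_head, seg_head, and emb_head are all hypothetical.

# Minimal sketch of a multi-task network: one shared backbone, several heads.
# Illustrative only; it does not reproduce the architecture from the thesis.
import torch
import torch.nn as nn
import torchvision

class MultiTaskNet(nn.Module):
    def __init__(self, num_classes=9, embed_dim=128):
        super().__init__()
        resnet = torchvision.models.resnet50(weights=None)
        # Drop the average-pool and fully connected layers; keep the feature maps.
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])
        self.det_head = nn.Conv2d(2048, num_classes + 4, 1)  # class scores + box offsets
        self.seg_head = nn.Conv2d(2048, num_classes, 1)      # per-pixel class logits (coarse)
        self.emb_head = nn.Conv2d(2048, embed_dim, 1)        # embeddings for track association

    def forward(self, x):
        f = self.backbone(x)  # shared features feed every task head
        return {"det": self.det_head(f), "seg": self.seg_head(f), "emb": self.emb_head(f)}

net = MultiTaskNet()
outputs = net(torch.randn(1, 3, 512, 512))
print({k: tuple(v.shape) for k, v in outputs.items()})

A shared backbone amortizes feature extraction across the detection, segmentation, and tracking heads, which is what makes joint top-down and bottom-up inference affordable in a single pipeline.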
Remotely-sensed rip current dynamics and morphological control in high-energy beach environments / Isaac Rodriguez Padilla (2021)
Title: Remotely-sensed rip current dynamics and morphological control in high-energy beach environments
Original title: Télédétection par système vidéo de la dynamique des courants induits par les vagues et de l'évolution morphologique des plages soumises aux houles énergétiques
Document type: Thesis/HDR
Authors: Isaac Rodriguez Padilla, Author; Bruno Castelle, Thesis supervisor; Philippe Bonneton, Thesis supervisor
Publisher: Bordeaux: Université de Bordeaux 1
Publication year: 2021
Extent: 156 p.
Format: 21 x 30 cm
General note: Bibliography
Thesis presented to obtain the degree of Doctor of the Université de Bordeaux, speciality: Environmental physics
Language: English (eng)
Descriptor: [Vedettes matières IGN] Optical image processing
[Termes IGN] spatial database
[Termes IGN] bathymetric map
[Termes IGN] ocean current
[Termes IGN] bathymetric data
[Termes IGN] topographic data
[Termes IGN] coastal erosion
[Termes IGN] data extraction
[Termes IGN] local geomorphology
[Termes IGN] swell
[Termes IGN] video image
[Termes IGN] dynamic oceanography
[Termes IGN] Pyrénées-Atlantiques (64)
[Termes IGN] image sequence
[Termes IGN] time series
[Termes IGN] coastal monitoring
[Termes IGN] storm
[Termes IGN] coastline
[Termes IGN] wave
Decimal index: THESE Theses and HDR
Abstract: (Author) Understanding the surf zone circulation and the morphological changes within the nearshore is essential for both scientific and societal interests. However, direct measurements with in-situ instruments are logistically challenging and expensive. The development of optical remote sensing techniques, in combination with low-cost imaging platforms and open-source algorithms, offers the possibility of collecting large amounts of information at a reasonable instrumental and computational cost. This work builds on existing and new video monitoring techniques to remotely sense the nearshore bathymetry as well as the surf zone circulation in a high-energy meso-macrotidal beach environment, including storm events. The methods are validated against a dense dataset acquired during an intensive field campaign conducted at Anglet beach, SW France. For the first time, the temporal and spatial variability of concurrent nearshore bathymetry and surface currents are addressed under high-energy wave forcing. (A sketch of the dispersion-based depth inversion used by cBathy follows this record.)
Contents:
1. Introduction
1.1 General context
1.2 Objectives and approach
1.3 Thesis outline
2. Field site and data
2.1 Study site: La Petite Chambre d’Amour (PCA), Anglet Beach
2.2 October 2018 field experiment
3. Image stabilization
3.1 Preamble
3.2 Introduction
3.3 Article: A Simple and Efficient Image Stabilization Method for Coastal Monitoring Video Systems
4. Nearshore bathymetric mapping from video imagery
4.1 Preamble
4.2 Introduction
4.3 Indirect bathymetric mapping
4.4 cBathy algorithm
4.5 cBathy results and previous validation
4.6 cBathy settings for PCA beach field experiment
4.7 Topo-bathymetry surveys comparison
4.8 cBathy results
4.9 cBathy error assessment
4.10 Discussion
4.11 Conclusions
5. Optically derived wave-filtered surface currents
5.1 Preamble
5.2 Introduction
5.3 Article: Wave-Filtered Surf Zone Circulation under High-Energy Waves Derived from Video-Based Optical Systems
5.4 Implications and potential of optically derived wave-filtered surface currents
6. Conclusions and perspectives
6.1 General conclusions
6.2 Research perspectives
Record number: 26726
Author affiliation: non IGN
Theme: IMAGERIE
Nature: French thesis
Thesis note: Doctoral thesis: Environmental physics: Bordeaux: 2021
Host laboratory: Environnements et Paléoenvironnements Océaniques et Continentaux (EPOC)
nature-HAL: Thèse
DOI: none
Online publication date: 19/11/2021
Online: https://hal.science/tel-03436157
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=99523
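The video-derived bathymetric mapping of chapter 4 (the cBathy algorithm) rests on the linear dispersion relation for surface gravity waves: frequency and wavenumber pairs estimated from pixel-intensity time series constrain the local depth through omega^2 = g * k * tanh(k * h). Below is a minimal sketch of that single inversion step, assuming SI units; the full cBathy algorithm (frequency-pair selection, nonlinear fitting, Kalman smoothing) is omitted, and the function name and example values are illustrative.

# Invert the linear dispersion relation omega^2 = g * k * tanh(k * h) for depth h.
import numpy as np

g = 9.81  # gravitational acceleration, m/s^2

def depth_from_dispersion(omega, k):
    """Depth h (m) from angular frequency omega (rad/s) and wavenumber k (rad/m)."""
    ratio = omega**2 / (g * k)
    if ratio >= 1.0:
        # Deep-water limit: tanh(k*h) saturates at 1, so depth is unresolvable.
        return np.inf
    return np.arctanh(ratio) / k

# Example: a 10 s swell observed with a 60 m wavelength in the rectified video.
omega = 2 * np.pi / 10.0  # rad/s
k = 2 * np.pi / 60.0      # rad/m
print(f"estimated depth: {depth_from_dispersion(omega, k):.1f} m")  # about 3.9 m

In shallow water the relation degenerates toward omega^2 = g * k^2 * h, which is why the inversion is most sensitive, and most accurate, in the depth range typical of surf zones.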
Learning to segment moving objects / Pavel Tokmakov in International journal of computer vision, vol 127 n° 3 (March 2019)
[article]
Title: Learning to segment moving objects
Document type: Article/Communication
Authors: Pavel Tokmakov, Author; Cordelia Schmid, Author; Karteek Alahari, Author
Publication year: 2019
Pagination: pp 282 - 301
General note: Bibliography
Language: English (eng)
Descriptor: [Vedettes matières IGN] Image processing
[Termes IGN] deep learning
[Termes IGN] temporal consistency
[Termes IGN] video image
[Termes IGN] moving object
[Termes IGN] object recognition
[Termes IGN] convolutional neural network
[Termes IGN] image sequence
Abstract: (Author) We study the problem of segmenting moving objects in unconstrained videos. Given a video, the task is to segment all the objects that exhibit independent motion in at least one frame. We formulate this as a learning problem and design our framework with three cues: (1) independent object motion between a pair of frames, which complements object recognition, (2) object appearance, which helps to correct errors in motion estimation, and (3) temporal consistency, which imposes additional constraints on the segmentation. The framework is a two-stream neural network with an explicit memory module. The two streams encode appearance and motion cues in a video sequence respectively, while the memory module captures the evolution of objects over time, exploiting the temporal consistency. The motion stream is a convolutional neural network trained on synthetic videos to segment independently moving objects in the optical flow field. The module to build a "visual memory" in video, i.e., a joint representation of all the video frames, is realized with a convolutional recurrent unit learned from a small number of training video sequences (see the ConvGRU sketch following this record). For every pixel in a frame of a test video, our approach assigns an object or background label based on the learned spatio-temporal features as well as the "visual memory" specific to the video. We evaluate our method extensively on three benchmarks: DAVIS, the Freiburg-Berkeley motion segmentation dataset, and SegTrack. In addition, we provide an extensive ablation study to investigate both the choice of the training data and the influence of each component in the proposed framework.
Record number: A2018-601
Author affiliation: non IGN
Theme: IMAGERIE/INFORMATIQUE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1007/s11263-018-1122-2
Online publication date: 22/09/2018
Online: https://doi.org/10.1007/s11263-018-1122-2
Electronic resource format: URL article
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=92528
in International journal of computer vision > vol 127 n° 3 (March 2019). - pp 282 - 301
[article]
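The "visual memory" in the record above is a convolutional gated recurrent unit. Below is a minimal ConvGRU cell sketch in PyTorch with illustrative channel sizes and names; it shows the generic ConvGRU recurrence, not the authors' released code.

# A convolutional GRU cell: the gates and candidate state are convolutions,
# so the hidden state is a spatial feature map rather than a flat vector.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        p = k // 2
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, k, padding=p)  # update z, reset r
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, k, padding=p)       # candidate state

    def forward(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde  # blend old memory with the new candidate

# Fused appearance/motion features arrive frame by frame; the hidden state
# carries the evolving object representation across time.
cell = ConvGRUCell(in_ch=64, hid_ch=64)
h = torch.zeros(1, 64, 32, 32)
for _ in range(5):
    h = cell(torch.randn(1, 64, 32, 32), h)
print(h.shape)  # torch.Size([1, 64, 32, 32])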
Real-time relative mobile target positioning using GPS-assisted stereo videogrammetry / Bahadir Ergun in Survey review, vol 50 n° 361 (July 2018)
[article]
Title: Real-time relative mobile target positioning using GPS-assisted stereo videogrammetry
Document type: Article/Communication
Authors: Bahadir Ergun, Author; Irfan Sayim, Author; Cumhur Sahin, Author; N. Tok, Author
Publication year: 2018
Pagination: pp 326 - 335
General note: Bibliography
Language: English (eng)
Descriptor: [Vedettes matières IGN] Image processing
[Termes IGN] target detection
[Termes IGN] video image
[Termes IGN] metrological photogrammetry
[Termes IGN] videogrammetry
Abstract: (Author) The position of a GPS-equipped (Global Positioning System) moving target was determined by stereo-videogrammetry from the images of two cameras placed on another GPS-equipped moving platform. The computed target positions were compared with the relative positions obtained from two GPS receivers. The target, a small square-like pattern, was tracked from a certain distance depending on the base distance between the cameras. Video files were created from the acquired image data and used in real-time computation to obtain the target's image position for every frame. First, the location of the target was computed within the video frames. Since the target need not be searched over the whole picture, the maximum pixel distance the target can travel between consecutive frames was used as an offset, so the search was made over a small area rather than the whole picture, which improved positioning performance (see the windowed-search sketch following this record). Finally, the videogrammetrically computed coordinates for all epochs were compared with GPS-based relative distances to assess the performance of relative target positioning.
Record number: A2018-443
Author affiliation: non IGN
Theme: IMAGERIE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1080/00396265.2016.1267303
Online publication date: 27/12/2016
Online: https://doi.org/10.1080/00396265.2016.1267303
Electronic resource format: URL article
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=91015
in Survey review > vol 50 n° 361 (July 2018). - pp 326 - 335
[article]
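The offset-bounded search the record above describes (matching the target only within the maximum pixel distance it can travel between consecutive frames) can be illustrated with a restricted template match. This OpenCV sketch is an assumption-laden illustration: the function name, the 20-pixel window, and the normalized cross-correlation criterion are choices made here, since the paper does not publish code.

# Match the target template only inside a small ROI around its previous
# position; max_offset bounds how far it can move between consecutive frames.
import cv2

def track_in_window(frame_gray, template, prev_xy, max_offset=20):
    x, y = prev_xy                 # top-left corner of the target in the last frame
    th, tw = template.shape
    H, W = frame_gray.shape
    # Clip the search window to the image bounds.
    x0, y0 = max(0, x - max_offset), max(0, y - max_offset)
    x1, y1 = min(W, x + tw + max_offset), min(H, y + th + max_offset)
    roi = frame_gray[y0:y1, x0:x1]
    result = cv2.matchTemplate(roi, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, (dx, dy) = cv2.minMaxLoc(result)  # best match inside the ROI
    return (x0 + dx, y0 + dy), score  # full-frame coordinates and match quality

Searching a window of roughly 2 * max_offset plus the template size, instead of the whole frame, is what makes the per-frame position update cheap enough for real-time use.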
Video event recognition and anomaly detection by combining Gaussian process and hierarchical Dirichlet process models / Michael Ying Yang in Photogrammetric Engineering & Remote Sensing, PERS, vol 84 n° 4 (April 2018)
[article]
Title: Video event recognition and anomaly detection by combining Gaussian process and hierarchical Dirichlet process models
Document type: Article/Communication
Authors: Michael Ying Yang, Author; Wentong Liao, Author; Yanpeng Cao, Author; Bodo Rosenhahn, Author
Publication year: 2018
Pagination: pp 203 - 214
General note: Bibliography
Language: English (eng)
Descriptor: [Vedettes matières IGN] Image processing
[Termes IGN] agent (artificial intelligence)
[Termes IGN] unsupervised learning
[Termes IGN] hierarchical approach
[Termes IGN] video image
[Termes IGN] Markov model
[Termes IGN] agent-based model
[Termes IGN] image sequence
Abstract: (Author) In this paper, we present an unsupervised learning framework for analyzing activities and interactions in surveillance videos. In our framework, three levels of video events are connected by a Hierarchical Dirichlet Process (HDP) model: low-level visual features, simple atomic activities, and multi-agent interactions. Atomic activities are represented as distributions of low-level features, while complicated interactions are represented as distributions of atomic activities. This learning process is unsupervised. Given a training video sequence, low-level visual features are extracted based on optical flow and then clustered into different atomic activities, and video clips are clustered into different interactions. The HDP model automatically decides the number of clusters, i.e., the categories of atomic activities and interactions. Based on the learned atomic activities and interactions, a training dataset is generated to train the Gaussian Process (GP) classifier. The trained GP models then work on newly captured video to classify interactions and detect abnormal events in real time. Furthermore, the temporal dependencies between video events learned by HDP-Hidden Markov Models (HMM) are effectively integrated into the GP classifier to enhance the accuracy of classification in newly captured videos. Our framework couples the benefits of the generative model (HDP) with the discriminative model (GP); a toy sketch of this two-stage pipeline follows this record. We provide detailed experiments showing that our framework achieves favorable performance for real-time video event classification in a crowded traffic scene.
Record number: A2018-139
Author affiliation: non IGN
Theme: IMAGERIE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.14358/PERS.84.4.203
Online publication date: 01/04/2018
Online: https://doi.org/10.14358/PERS.84.4.203
Electronic resource format: URL article
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=89689
in Photogrammetric Engineering & Remote Sensing, PERS > vol 84 n° 4 (April 2018). - pp 203 - 214
[article]
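The two-stage design in the record above, a generative topic model feeding a Gaussian Process classifier, can be sketched end to end on toy data. This is a hedged sketch: scikit-learn has no HDP implementation, so a fixed-topic LDA stands in for the HDP stage (the HDP's point is precisely that the topic count is inferred rather than fixed), and the optical-flow "visual word" counts and interaction labels are synthetic.

# Stage 1: a topic model turns clip-level visual-word counts into activity mixtures.
# Stage 2: a Gaussian Process classifier labels clips by interaction category.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(0)
counts = rng.poisson(2.0, size=(200, 50))   # toy bag-of-flow-words per video clip
labels = rng.integers(0, 3, size=200)       # toy interaction categories

lda = LatentDirichletAllocation(n_components=8, random_state=0)
theta = lda.fit_transform(counts)           # clip -> distribution over atomic activities

gp = GaussianProcessClassifier(kernel=1.0 * RBF(1.0), random_state=0)
gp.fit(theta[:150], labels[:150])
print("held-out accuracy:", gp.score(theta[150:], labels[150:]))

With random labels the held-out accuracy hovers near chance; the point of the sketch is the data flow, generative clustering first and discriminative classification second, not the numbers.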
Copies (1)
Barcode: 105-2018041 | Call number: RAB | Type: Journal | Location: Centre de documentation | Section: En réserve L003 | Availability: Available
Self-calibration of omnidirectional multi-cameras including synchronization and rolling shutter / Thanh-Tin Nguyen in Computer Vision and image understanding, vol 162 (September 2017) Permalink
Multi-view performance capture of surface details / Nadia Robertini in International journal of computer vision, vol 124 n° 1 (August 2017) Permalink
Motion priors based on goals hierarchies in pedestrian tracking applications / Francisco Madrigal in Machine Vision and Applications, vol 28 n° 3-4 (May 2017) Permalink
Analysis on the dynamic deformations of the images from digital film sequences / Tomasz Markowski in Geodesy and cartography, vol 64 n° 1 (June 2015) Permalink
Detection of abrupt changes in spatial relationships in video sequences / Abdalbassir Abou-Elailah (2015) Permalink
The Bulger case: A spatial story / Les Roberts in Cartographic journal (the), vol 51 n° 2 (May 2014) Permalink
Using video acquired from an unmanned aerial vehicle (UAV) to measure fracture orientation in an open-pit mine / Tara McLeod in Geomatica, vol 67 n° 3 (September 2013) Permalink
Close range stereophotogrammetry and video imagery analyses in soil ecohydrology modelling / Maria J. Rossi in Photogrammetric record, vol 27 n° 137 (March - May 2012) Permalink
Elaboration d'une carte dynamique des dégâts causés par les lahars au Merapi (Java centre, Indonésie) dans le cadre du programme MIA VITA / A.K. Robin (2011) Permalink
Fast robust large-scale mapping from video and internet photo collections / J. Frahm in ISPRS Journal of photogrammetry and remote sensing, vol 65 n° 6 (November - December 2010) Permalink