Descriptor
Documents available in this category (410)
Title: Scene understanding and gesture recognition for human-machine interaction
Document type: Thesis/HDR
Authors: Naina Dhingra, Author
Publisher: Zurich: Eidgenössische Technische Hochschule ETH - Ecole Polytechnique Fédérale de Zurich EPFZ
Publication year: 2022
General note: Bibliography
A dissertation submitted to attain the degree of Doctor of Sciences of ETH Zurich
Languages: French (fre)
Descriptor: [IGN subject headings] Artificial intelligence
[IGN terms] deep learning
[IGN terms] attention (machine learning)
[IGN terms] object-oriented classification
[IGN terms] convolutional neural network classification
[IGN terms] support vector machine classification
[IGN terms] image understanding
[IGN terms] RGB image
[IGN terms] human-machine interaction
[IGN terms] eye tracking
[IGN terms] automatic recognition
[IGN terms] pattern recognition
[IGN terms] gesture recognition
[IGN terms] recurrent neural network
[IGN terms] scene
[IGN terms] computer vision
Abstract: (author) Scene understanding and gesture recognition are useful for a myriad of applications such as human-robotic interaction, assisting blind and visually impaired people, advanced driver assistance systems, and autonomous driving. To work autonomously in real-world environments, automatic systems need to deliver non-verbal information to enhance the verbal communication, in particular for blind people. We are exploring the holistic approach for providing the scene as well as gesture related information. We propose that incorporating attention mechanisms in neural networks which behave similarly to attention in the human brain, and conducting an integrated study using neural networks in real-time, can yield significant improvements in the scene and gesture understanding, thereby enhancing the user experience. In this thesis, we investigate the understanding of visual scenes and gestures. We explore these two areas, in particular, by proposing novel architectures, training methods, user studies, and thorough evaluations. We show that, for deep learning approaches, attention or self attention mechanisms improve and push the boundaries of network performance for different tasks in consideration. We suggest that the various kinds of gestures can complement and supplement each other’s information to better understand non-verbal conversation; hence integrated gestures comprehension is useful. First, we focus on visual scene understanding using scene graph generation. We propose BGT-Net, a new network that uses an object detection model with 1) bidirectional gated recurrent units for object-object communication and 2) transformer encoders including self attention to classify the objects and their relationships. We address the problem of bias caused by the long tailed distribution in the dataset. This enables the network to perform even for the unseen objects or relationships in the dataset.
Second, we propose to learn hand gesture recognition from RGB and RGB-D videos using attention learning. We present a novel architecture based on residual connections and an attention mechanism. Our approach successfully detects hand gestures when evaluated on three open-source datasets. Third, we explore pointing gesture recognition and localization using open-source software, i.e. OpenPTrack, which uses a deep learning based network to track multi-persons in the scene. We use a Kinect sensor as an input device and conduct a user study with 26 users to evaluate the system using two setup types. Fourth, we propose a technique to perform eye gaze tracking using OpenFace, which is based on a deep learning model and RGB webcam. We use support vector machine regression to estimate the position of eye gaze on the screen. In a study, we evaluate the system with 28 users and show that this system can perform similarly to commercially expensive eye trackers. Finally, we focus on 3D head pose estimation using two models: 1) headPosr includes residual connections for the base network followed by a transformer encoder. It outperforms existing models but has a drawback of being computationally expensive; 2) lwPosr uses depthwise separable convolutions and transformer encoders. It is a two stream network in fine-grained fashion to estimate the three angles of the head pose. We demonstrate that this method is able to predict head poses better than state-of-the-art lightweight networks.
Contents note: 1- Introduction
2- Background
3- State of the art
4- Scene graph generation
5- 3D hand gesture recognition
6- Pointing gesture recognition
7- Eye-gaze tracking
8- Head pose estimation
9- Lightweight head pose estimation
10- Summary
Record number: 24039
Author affiliation: non-IGN
Theme: IMAGING/COMPUTER SCIENCE
Nature: Foreign thesis
Thesis note: PhD Thesis: Sciences: ETH Zurich: 2022
DOI: none
Online: https://www.research-collection.ethz.ch/handle/20.500.11850/559347
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=101876
Towards expressive graph neural networks: Theory, algorithms, and applications / Georgios Dasoulas (2022)
Title: Towards expressive graph neural networks: Theory, algorithms, and applications
Document type: Thesis/HDR
Authors: Georgios Dasoulas, Author; Michalis Vazirgiannis, Thesis supervisor; Aladin Virmaux, Thesis supervisor
Publisher: Paris: Institut Polytechnique de Paris
Publication year: 2022
General note: Bibliography
Doctoral thesis of the Institut Polytechnique de Paris, prepared at the École Polytechnique, specialty: Computer Science
Languages: English (eng)
Descriptor: [IGN subject headings] Artificial intelligence
[IGN terms] machine learning
[IGN terms] attention (machine learning)
[IGN terms] entropy
[IGN terms] graph
[IGN terms] isomorphism
[IGN terms] node
[IGN terms] graph neural network
[IGN terms] graph theory
Decimal index: THESE Theses and HDR
Abstract: (author) As the technological evolution of machine learning is accelerating nowadays, data plays a vital role in building intelligent models, being able to simulate phenomena, predict values and make decisions. In an increasing number of applications, data take the form of networks. The inherent graph structure of network data motivated the evolution of the graph representation learning field. Its scope includes generating meaningful representations for graphs and their components, i.e., the nodes and the edges. The research on graph representation learning was accelerated with the success of message passing frameworks applied on graphs, namely the Graph Neural Networks. Learning informative and expressive representations on graphs plays a critical role in a wide range of real-world applications, from telecommunication and social networks to urban design, chemistry, and biology. In this thesis, we study various aspects from which Graph Neural Networks can be more expressive, and we propose novel approaches to improve their performance in standard graph learning tasks. The main branches of the present thesis include: the universality of graph representations, the increase of the receptive field of graph neural networks, the design of stable deeper graph learning models, and alternatives to the standard message-passing framework. Performing both theoretical and experimental studies, we show how the proposed approaches can become valuable and efficient tools for designing more powerful graph learning models.
In the first part of the thesis, we study the quality of graph representations as a function of their discrimination power, i.e., how easily we can differentiate graphs that are not isomorphic. Firstly, we show that standard message-passing schemes are not universal due to the inability of simple aggregators to separate nodes with ambiguities (similar attribute vectors and neighborhood structures). Based on the found limitations, we propose a simple coloring scheme that can provide universal representations with theoretical guarantees and experimental validations of the performance superiority. Secondly, moving beyond the standard message-passing paradigm, we propose an approach for treating a corpus of graphs as a whole instead of examining graph pairs. To do so, we learn a soft permutation matrix for each graph, and we project all graphs in a common vector space, achieving a solid performance on graph classification tasks.
In the second part of the thesis, our primary focus is concentrated around the receptive field of the graph neural networks, i.e., how much information a node has in order to update its representation. To begin with, we study the spectral properties of operators that encode adjacency information. We propose a novel parametric family of operators that can adapt throughout training and provide a flexible framework for data-dependent neighborhood representations. We show that the incorporation of this approach has a substantial impact on both node classification and graph classification tasks. Next, we study how considering the k-hop neighborhood information for a node representation can output more powerful graph neural network models. The resulting models are proven capable of identifying structural properties, such as connectivity and triangle-freeness.
In the third part of the thesis, we address the problem of long-range interactions, where nodes that lie in distant parts of the graph can affect each other. In this problem, we either need the design of deeper models or the reformulation of how proximity is defined in the graph. Firstly, we study the design of deeper attention models, focusing on graph attention. We calibrate the gradient flow of the model by introducing a novel normalization that enforces Lipschitz continuity. Next, we propose a data augmentation method for enriching the node attributes with information that encloses structural information based on local entropy measures.
Contents note: 1. Introduction
2. Preliminaries
I- Discrimination power
3. Universal approximation on graphs
4. Learning soft permutations for graph representations
II- Receptive field
5. Learning graph shift operators
6. Increasing the receptive field with multiple hops
III- Beyond local interactions
7. Lipschitz continuity of graph attention
8. Structural symmetries in graphs
9. Conclusions and outlook
Record number: 24076
Author affiliation: non-IGN
Theme: COMPUTER SCIENCE/MATHEMATICS
Nature: French thesis
Thesis note: Doctoral thesis: Computer Science: Palaiseau: 2022
DOI: none
Online: https://theses.hal.science/tel-03666690
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=102200
Towards synthetic sensing for smart cities: a machine/deep learning-based approach / Faraz Malik Awan (2022)
Title: Towards synthetic sensing for smart cities: a machine/deep learning-based approach
Document type: Thesis/HDR
Authors: Faraz Malik Awan, Author; Noël Crespi, Thesis supervisor; Roberto Minerva, Thesis supervisor
Publisher: Courcouronnes: Télécom SudParis
Publication year: 2022
Extent: 106 p.
Format: 21 x 30 cm
General note: Bibliography
Doctoral thesis of the Institut Polytechnique de Paris, prepared at Telecom SudParis, specialty: Computer Science
Languages: English (eng)
Descriptor: [IGN subject headings] Artificial intelligence
[IGN terms] comparative analysis
[IGN terms] machine learning
[IGN terms] deep learning
[IGN terms] decision tree classification
[IGN terms] random forest classification
[IGN terms] multilayer perceptron classification
[IGN terms] Spain
[IGN terms] parking
[IGN terms] nearest neighbor algorithm
[IGN terms] noise pollution
[IGN terms] air pollution
[IGN terms] recurrent neural network
[IGN terms] intelligent transportation system
[IGN terms] road traffic
[IGN terms] smart city
Decimal index: THESE Theses and HDR
Abstract: (author) We worked on one of the most significant research directions in Smart City, i.e., Intelligent Transportation System (ITS). ITS encapsulates several domains, such as electronic vehicles notification systems, traffic information, smart parking, and environment. However, in this thesis, we target two of its important domains: i) Smart Parking, and ii) Road Traffic. We started our research with the Smart Parking use case. Performing a literature review, we realized that different Machine Learning (ML) and Deep Learning (DL) approaches have been used for smart parking solutions. In most of these proposed approaches, enclosed parking areas were targeted with different feature sets to predict the "occupancy rate" in parking areas. It inspired us to conduct a comparative analysis to answer the following questions: Given the parking prediction use case, how do the traditional ML models perform as compared to complex DL models? Provided big data, can less complex, traditional ML models outperform complex DL models? How well can these models predict the availability of the individual on-street parking spots rather than predicting the overall occupancy rate of an enclosed parking area? To answer these questions, we choose five well-known classical ML algorithms (K-Nearest Neighbours, Random Forest, Decision Tree) and DL algorithm (Multilayer Perceptron). To take our investigation into depth, we train an Ensemble Learning Model, in which we combine all the above-mentioned ML and DL models. A huge parking dataset of the city of Santander, Spain, has been used, which consists of around 25 million records. We also propose to recommend available parking spots based on the current location of the driver.
Moving forward with our research goals, we performed a literature review on road traffic and found road traffic often associated with air pollution and noise pollution. However, to the best of our knowledge, air pollution & noise pollution have never been used in the traffic prediction problem. In this part of our research, firstly we used air pollution (CO, NO, NO2, NOx, and O3) along with the atmospheric variables, such as wind speed, wind direction, temperature, and pressure to improve the traffic forecasting in the city of Madrid. This successful experiment motivated us to extend our investigation to another factor, which is also strongly correlated with road traffic, i.e., noise pollution. Hence, as an extension of our previous work, in this part of our research, we use noise pollution to improve the traffic prediction in the city of Madrid.
Contents note: 1- Introduction
2- Parking space prediction using classical ML and deep learning models
3- Road traffic prediction improvement using air pollution and atmospheric data
4- Using noise pollution to improve traffic prediction
5- Conclusion and future work
Record number: 20025
Author affiliation: non-IGN
Theme: COMPUTER SCIENCE/URBAN PLANNING
Nature: French thesis
Thesis note: Doctoral thesis: Computer Science: Telecom SudParis: 2022
Host organization: SAMOVAR
DOI: none
Online: https://tel.hal.science/tel-03722891/
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=101825
Unsupervised generative models for data analysis and explainable artificial intelligence / Mohanad Abukmeil (2022)
Title: Unsupervised generative models for data analysis and explainable artificial intelligence
Document type: Thesis/HDR
Authors: Mohanad Abukmeil, Author; Vincenzo Piuri, Thesis supervisor
Publisher: Milan [Italy]: Università di Milano
Publication year: 2022
Extent: 194 p.
Format: 21 x 30 cm
General note: Bibliography
Doctoral thesis, specialty: Computer Science, University of Milan
Languages: English (eng)
Descriptor: [IGN subject headings] Artificial intelligence
[IGN terms] latent Dirichlet allocation
[IGN terms] independent component analysis
[IGN terms] principal component analysis
[IGN terms] machine learning
[IGN terms] unsupervised learning
[IGN terms] stochastic model
[IGN terms] autonomous navigation
[IGN terms] image reconstruction
[IGN terms] generative adversarial network
[IGN terms] blind source separation
Abstract: (author) For more than a century, the methods of learning representation and the exploration of the intrinsic structures of data have developed remarkably and currently include supervised, semi-supervised, and unsupervised methods. However, recent years have witnessed the flourishing of big data, where typical dataset dimensions are high, and the data can come in messy, missing, incomplete, unlabeled, or corrupted forms. Consequently, discovering and learning the hidden structure buried inside such data becomes highly challenging. From this perspective, latent data analysis and dimensionality reduction play a substantial role in decomposing the exploratory factors and learning the hidden structures of data, which encompasses the significant features that characterize the categories and trends among data samples in an ordered manner. That is by extracting patterns, differentiating trends, and testing hypotheses to identify anomalies, learning compact knowledge, and performing many different machine learning (ML) tasks such as classification, detection, and prediction. Unsupervised generative learning (UGL) methods are a class of ML characterized by their possibility of analyzing and decomposing latent data, reducing dimensionality, visualizing the manifold of data, and learning representations with limited levels of predefined labels and prior assumptions. Furthermore, explainable artificial intelligence (XAI) is an emerging field of ML that deals with explaining the decisions and behaviors of learned models. XAI is also associated with UGL models to explain the hidden structure of data, and to explain the learned representations of ML models. However, the current UGL models lack large-scale generalizability and explainability in the testing stage, which leads to restricting their potential in ML and XAI applications.
To overcome the aforementioned limitations, this thesis proposes innovative methods that integrate UGL and XAI to enable data factorization and dimensionality reduction to improve the generalizability of the learned ML models. Moreover, the proposed methods enable visual explainability in modern applications such as anomaly detection and autonomous driving systems. The main research contributions are listed as follows:
* A novel overview of UGL models including blind source separation (BSS), manifold learning (MfL), and neural networks (NNs). Also, the overview considers open issues and challenges among each UGL method.
* An innovative method to identify the dimensions of the compact feature space via a generalized rank in the application of image dimensionality reduction.
* An innovative method to hierarchically reduce and visualize the manifold of data to improve the generalizability in limited data learning scenarios, and computational complexity reduction applications.
* An original method to visually explain autoencoders by reconstructing an attention map in the application of anomaly detection and explainable autonomous driving systems.
The novel methods introduced in this thesis are benchmarked on publicly available datasets, and they outperformed the state-of-the-art methods considering different evaluation metrics. Furthermore, superior results were obtained with respect to the state-of-the-art to confirm the feasibility of the proposed methodologies concerning the computational complexity, availability of learning data, model explainability, and high data reconstruction accuracy.
Contents note: 1- Introduction
2- State of the art of unsupervised generative learning (UGL) models
3- Research challenges and open issues of UGL models
4- UGL models for dimensionality reduction and XAI
5- Conclusion and future works
Record number: 15307
Author affiliation: non-IGN
Theme: COMPUTER SCIENCE
Nature: Foreign thesis
Thesis note: Doctoral thesis: Computer Science: Milan: 2022
DOI: 10.13130/abukmeil-mohanad_phd2022-01-24
Online: http://dx.doi.org/10.13130/abukmeil-mohanad_phd2022-01-24
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=99965
Title: A world model enabling information integrity for autonomous vehicles
Document type: Thesis/HDR
Authors: Corentin Sanchez, Author; Philippe Bonnifait, Thesis supervisor; Philippe Xu, Thesis supervisor
Publisher: Compiègne: Université de Technologie de Compiègne UTC
Publication year: 2022
Extent: 198 p.
Format: 21 x 30 cm
General note: Bibliography
Doctoral thesis of the Université de Technologie de Compiègne, specialty: Automatic Control and Robotics
Languages: English (eng)
Descriptor: [IGN subject headings] Artificial intelligence
[IGN terms] attention (machine learning)
[IGN terms] road map
[IGN terms] multi-source data
[IGN terms] semantic information
[IGN terms] data integrity
[IGN terms] urban environment
[IGN terms] autonomous navigation
[IGN terms] reasoning
[IGN terms] road network
[IGN terms] mobile robot
[IGN terms] road safety
[IGN terms] unmanned vehicle
[IGN terms] computer vision
Decimal index: THESE Theses and HDR
Abstract: (author) To drive in complex urban environments, autonomous vehicles need to understand their driving context. This task, also known as the situation awareness, relies on an internal virtual representation of the world made by the vehicle, called world model. This representation is generally built from information provided by multiple sources. High definition navigation maps supply prior information such as road network topology, geometric description of the carriageway, and semantic information including traffic laws. The perception system provides a description of the space and of road users evolving in the vehicle surroundings. Jointly, they provide representations of the environment (static and dynamic) and make it possible to model interactions. In complex situations, a reliable and non-misleading world model is mandatory to avoid inappropriate decision-making and to ensure safety. The goal of this PhD thesis is to propose a novel formalism on the concept of world model that fulfills the situation awareness requirements for an autonomous vehicle. This world model integrates prior knowledge on the road network topology, a lane-level grid representation, its prediction over time and more importantly a mechanism to control and monitor the integrity of information. The concept of world model is present in many autonomous vehicle architectures but may take many various forms and sometimes only implicitly. In some work, it is part of the perception process when in some other it is part of a decision-making process. The first contribution of this thesis is a survey on the concept of world model for autonomous driving covering different levels of abstraction for information representation and reasoning. Then, a novel representation is proposed for the world model at the tactical level combining dynamic objects and spatial occupancy information. First, a graph based top-down approach using a high-definition map is proposed to extract the areas of interest with respect to the situation from the vehicle's perspective. It is then used to build a Lane Grid Map (LGM), which is an intermediate space state representation from the ego-vehicle point of view. A top-down approach is chosen to assess and characterize the relevant information of the situation. In addition to classical free-occupied states, the unknown state is further characterized by the notions of neutralized and safe areas that provide a deeper level of understanding of the situation. Another contribution to the world model is an integrity management mechanism that is built upon the LGM representation. It consists in managing the spatial sampling of the grid cells in order to take into account localization and perception errors and to avoid misleading information. Regardless of the confidence on localization and perception information, the LGM is capable of providing reliable information to decision making in order not to take hazardous decisions.
The last part of the situation awareness strategy is the prediction of the world model based on the LGM representation. The main contribution is to show how a classical object-level prediction fits this representation and that the integrity can also be extended at the prediction stage. It is also depicted how a neutralized area can be used in the prediction stage to provide a better situation prediction. The work relies on experimental data in order to demonstrate a real application of a complex situation awareness representation. The approach is evaluated with real data obtained thanks to several experimental vehicles equipped with LiDAR sensors and IMU with RTK corrections in the city of Compiègne. A high-definition map has also been used in the framework of the SIVALab joint laboratory between Renault and Heudiasyc CNRS-UTC. The world model module has been implemented (with ROS software) in order to fulfill real-time application and is functional on the experimental vehicles for live demonstrations.
Contents note: General introduction
1- World model for autonomous vehicles
2- An architecture for WM
3- A lane level world model
4- Set-based LGM prediction
General conclusion
Record number: 24089
Author affiliation: non-IGN
Theme: COMPUTER SCIENCE
Nature: French thesis
Thesis note: Doctoral thesis: Automatic Control and Robotics: UTC Compiègne: 2022
Host organization: Laboratoire Heudiasyc
DOI: none
Online: https://www.theses.fr/2022COMP2683
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=102509
Machine learning and geodesy: A survey / Jemil Butt in Journal of applied geodesy, vol 15 n° 2 (April 2021)
Exploration of reinforcement learning algorithms for autonomous vehicle visual perception and control / Florence Carton (2021)
Intelligent sensors for positioning, tracking, monitoring, navigation and smart sensing in smart cities / Li Tiancheng (2021)