Author details
Author: Nikos Komodakis
Available documents written by this author (1)
Title: Effective and annotation efficient deep learning for image understanding
Document type: Thesis/HDR
Authors: Spyridon Gidaris, Author; Nikos Komodakis, Thesis supervisor
Publisher: Champs/Marne: Université Paris-Est
Year of publication: 2018
Extent: 236 p.
Format: 21 x 30 cm
General note: bibliography
Doctoral thesis of Université Paris-Est, Field: Signal and Image Processing
Languages: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] digital image analysis
[IGN terms] deep learning
[IGN terms] convolutional neural network classification
[IGN terms] image understanding
[IGN terms] object detection
[IGN terms] prediction
[IGN terms] object recognition
[IGN terms] semantic segmentation
Decimal index: THESE Theses and HDR
Abstract: (author) Recent developments in deep learning have achieved impressive results on image understanding tasks. However, designing deep learning architectures that effectively solve the image understanding tasks of interest is far from trivial. Moreover, the success of deep learning approaches relies heavily on the availability of large amounts of manually labeled data. In this context, the objective of this dissertation is to explore deep learning based approaches for core image understanding tasks that increase the effectiveness with which these tasks are performed and make their learning process more annotation efficient, i.e., less dependent on the availability of large amounts of manually labeled training data.
We first focus on improving the state of the art in object detection. More specifically, we attempt to boost the ability of object detection systems to recognize (even difficult) object instances by proposing a multi-region and semantic segmentation-aware ConvNet-based representation that is able to capture a diverse set of discriminative appearance factors. We also aim to improve the localization accuracy of object detection systems by proposing iterative detection schemes and a novel localization model for estimating the bounding boxes of objects. We demonstrate that the proposed technical novelties lead to significant improvements in object detection performance on the PASCAL and MS COCO benchmarks.
Regarding the pixel-wise image labeling problem, we explore a family of deep neural network architectures that perform structured prediction by learning to (iteratively) improve some initial estimates of the output labels. The goal is to identify the optimal architecture for implementing such deep structured prediction models.
In this context, we propose to decompose the label improvement task into three steps: 1) detecting the initial label estimates that are incorrect, 2) replacing the incorrect labels with new ones, and finally 3) refining the renewed labels by predicting residual corrections w.r.t. them. We evaluate the explored architectures on the disparity estimation task and demonstrate that the proposed architecture achieves state-of-the-art results on the KITTI 2015 benchmark.
To accomplish our goal of annotation efficient learning, we propose a self-supervised learning approach that learns ConvNet-based image representations by training the ConvNet to recognize the 2D rotation applied to the image it receives as input. We empirically demonstrate that this apparently simple task actually provides a very powerful supervisory signal for semantic feature learning. Specifically, the image features learned from this task exhibit very good results when transferred to the visual tasks of object detection and semantic segmentation, surpassing prior unsupervised learning approaches and thus narrowing the gap with the supervised case.
Finally, also in the direction of annotation efficient learning, we propose a novel few-shot object recognition system that, after training, is capable of dynamically learning novel categories from only a few training examples (e.g., only one or five) without forgetting the categories on which it was trained. To implement the proposed recognition system, we introduce two technical novelties: an attention-based few-shot classification weight generator, and a classifier for the ConvNet-based recognition model implemented as a cosine similarity function between feature representations and classification vectors. We demonstrate that the proposed approach achieves state-of-the-art results on relevant few-shot benchmarks.
Contents note: Introduction
1- Effective deep learning for image understanding
2- Annotation efficient deep learning for image understanding
Record number: 25835
Author affiliation: non-IGN
Theme: IMAGERY
Nature: French thesis
Thesis note: Doctoral thesis: Field: Signal and Image Processing: Paris-Est: 2018
nature-HAL: Thesis
DOI: none
Online: http://www.theses.fr/2018PESC1143
Electronic resource format: URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=95174