Author details
Author: Pavel Tokmakov
Available documents written by this author (1)
Learning to segment moving objects / Pavel Tokmakov, in International journal of computer vision, vol 127 no. 3 (March 2019)
[article]
Title: Learning to segment moving objects
Document type: Article/Communication
Authors: Pavel Tokmakov, Author; Cordelia Schmid, Author; Karteek Alahari, Author
Publication year: 2019
Pages: pp 282 - 301
General note: Bibliography
Languages: English (eng)
Descriptors:
[IGN subject headings] Image processing
[IGN terms] deep learning
[IGN terms] temporal consistency
[IGN terms] video image
[IGN terms] moving object
[IGN terms] object recognition
[IGN terms] convolutional neural network
[IGN terms] image sequence
Abstract: (Author) We study the problem of segmenting moving objects in unconstrained videos. Given a video, the task is to segment all the objects that exhibit independent motion in at least one frame. We formulate this as a learning problem and design our framework with three cues: (1) independent object motion between a pair of frames, which complements object recognition, (2) object appearance, which helps to correct errors in motion estimation, and (3) temporal consistency, which imposes additional constraints on the segmentation. The framework is a two-stream neural network with an explicit memory module. The two streams encode appearance and motion cues in a video sequence respectively, while the memory module captures the evolution of objects over time, exploiting the temporal consistency. The motion stream is a convolutional neural network trained on synthetic videos to segment independently moving objects in the optical flow field. The module to build a “visual memory” in video, i.e., a joint representation of all the video frames, is realized with a convolutional recurrent unit learned from a small number of training video sequences. For every pixel in a frame of a test video, our approach assigns an object or background label based on the learned spatio-temporal features as well as the “visual memory” specific to the video. We evaluate our method extensively on three benchmarks, DAVIS, Freiburg-Berkeley motion segmentation dataset and SegTrack. In addition, we provide an extensive ablation study to investigate both the choice of the training data and the influence of each component in the proposed framework.
Record number: A2018-601
Author affiliation: non-IGN
Theme: IMAGING/COMPUTER SCIENCE
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1007/s11263-018-1122-2
Online publication date: 22/09/2018
Online: https://doi.org/10.1007/s11263-018-1122-2
Electronic resource format: Article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=92528
In: International journal of computer vision, vol 127 no. 3 (March 2019), pp 282 - 301 [article]