Author detail
Author: Bolei Zhou
Available documents written by this author (2)
Semantic hierarchy emerges in deep generative representations for scene synthesis / Ceyuan Yang in International journal of computer vision, vol 129 no. 5 (May 2021)
[article]
Title: Semantic hierarchy emerges in deep generative representations for scene synthesis
Document type: Article/Communication
Authors: Ceyuan Yang, Author; Yujun Shen, Author; Bolei Zhou, Author
Publication year: 2021
Pages: pp 1451 - 1466
General note: bibliography
Language: English (eng)
Descriptors: [IGN subject headings] Optical image processing
[IGN terms] visual analysis
[IGN terms] deep learning
[IGN terms] image understanding
[IGN terms] knowledge representation
[IGN terms] generative adversarial network
[IGN terms] hierarchical segmentation
[IGN terms] semantic segmentation
[IGN terms] image synthesis
Abstract: (author) Despite the great success of Generative Adversarial Networks (GANs) in synthesizing images, there is still little understanding of how photo-realistic images emerge from the layer-wise stochastic latent codes introduced in recent GANs. In this work, we show that a highly structured semantic hierarchy emerges in the deep generative representations of state-of-the-art GANs such as StyleGAN and BigGAN trained for scene synthesis. By probing the per-layer representation with a broad set of semantics at different abstraction levels, we quantify the causality between the layer-wise activations and the semantics occurring in the output image. This quantification identifies human-understandable variation factors that can be used to steer the generation process, such as changing the lighting condition or varying the viewpoint of the scene. Extensive qualitative and quantitative results suggest that the generative representations learned by GANs with layer-wise latent codes are specialized to synthesize various concepts in a hierarchical manner: the early layers tend to determine the spatial layout, the middle layers control the categorical objects, and the later layers render the scene attributes as well as the color scheme. Identifying such a set of steerable variation factors facilitates high-fidelity scene editing based on well-learned GAN models without any retraining (code and demo video are available at https://genforce.github.io/higan).
Record number: A2021-408
Author affiliation: non-IGN
Theme: IMAGERY
Nature: Article
DOI: 10.1007/s11263-020-01429-5
Online publication date: 10/02/2021
Online: https://doi.org/10.1007/s11263-020-01429-5
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=97725
in International journal of computer vision > vol 129 no. 5 (May 2021) . - pp 1451 - 1466 [article]
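As a rough illustration of the layer-wise editing described in the abstract, the sketch below moves a latent code along a semantic direction at the later layers only, so that scene attributes change while the spatial layout (controlled by early layers) is preserved. The layer count, latent dimension, and direction vector are illustrative assumptions, not the authors' released interface; their actual code is at https://genforce.github.io/higan.

```python
# Minimal sketch of layer-wise latent steering, in the spirit of the paper.
# All names and sizes below are illustrative assumptions, not the HiGAN API.
import numpy as np

N_LAYERS = 14     # a StyleGAN-like generator takes one latent code per layer
LATENT_DIM = 512

rng = np.random.default_rng(0)

# Layer-wise latent codes: one 512-d code per synthesis layer.
codes = rng.standard_normal((N_LAYERS, LATENT_DIM))

# A unit direction that (hypothetically) encodes a scene attribute such as
# lighting; in the paper such directions are identified by probing per-layer
# activations with semantic classifiers at several abstraction levels.
direction = rng.standard_normal(LATENT_DIM)
direction /= np.linalg.norm(direction)

def steer(codes, direction, strength, layers):
    """Shift the latent codes along `direction`, but only at `layers`.

    Restricting the edit to later layers changes scene attributes and color
    while leaving the spatial layout (early layers) untouched, mirroring the
    semantic hierarchy the paper reports.
    """
    edited = codes.copy()
    edited[layers] += strength * direction
    return edited

# Edit only the late layers (attribute / color-scheme level).
edited_codes = steer(codes, direction, strength=3.0, layers=slice(8, N_LAYERS))
# A real pipeline would now render: image = generator.synthesize(edited_codes)
print(np.abs(edited_codes - codes).sum(axis=1))  # only layers 8+ have moved
```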
Semantic understanding of scenes through the ADE20K dataset / Bolei Zhou in International journal of computer vision, vol 127 no. 3 (March 2019)
[article]
Title: Semantic understanding of scenes through the ADE20K dataset
Document type: Article/Communication
Authors: Bolei Zhou, Author; Hang Zhao, Author; Xavier Puig, Author; Tete Xiao, Author; Sanja Fidler, Author; et al.
Publication year: 2019
Pages: pp 302 - 321
General note: bibliography
Language: English (eng)
Descriptors: [IGN subject headings] Image processing
[IGN terms] deep learning
[IGN terms] image understanding
[IGN terms] object detection
[IGN terms] georeferenced dataset
[IGN terms] artificial neural network
[IGN terms] scene
[IGN terms] semantic segmentation
Abstract: (author) Semantic understanding of visual scenes is one of the holy grails of computer vision. Despite the community's data-collection efforts, there are still few image datasets covering a wide range of scenes and object categories with pixel-wise annotations for scene understanding. In this work, we present ADE20K, a densely annotated dataset that spans diverse annotations of scenes, objects, parts of objects, and in some cases even parts of parts. In total there are 25k images of complex everyday scenes containing a variety of objects in their natural spatial context. On average there are 19.5 instances and 10.5 object classes per image. Based on ADE20K, we construct benchmarks for scene parsing and instance segmentation. We provide baseline performances on both benchmarks and re-implement state-of-the-art models as open source. We further evaluate the effect of synchronized batch normalization and find that a reasonably large batch size is crucial for semantic segmentation performance. We show that networks trained on ADE20K are able to segment a wide variety of scenes and objects.
Record number: A2018-602
Author affiliation: non-IGN
Theme: IMAGERY
Nature: Article
nature-HAL: ArtAvecCL-RevueIntern
DOI: 10.1007/s11263-018-1140-0
Online publication date: 07/12/2018
Online: https://doi.org/10.1007/s11263-018-1140-0
Electronic resource format: article URL
Permalink: https://documentation.ensg.eu/index.php?lvl=notice_display&id=92529
in International journal of computer vision > vol 127 no. 3 (March 2019) . - pp 302 - 321 [article]
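To make the per-image statistics quoted in the abstract (19.5 instances and 10.5 object classes on average) concrete, here is a hedged sketch that computes the class count per image from annotation maps. It assumes a simplified format, one single-channel PNG label map per image whose pixel values are class indices with 0 meaning unlabeled, which is not ADE20K's actual multi-channel encoding; consult the dataset documentation for the real layout.

```python
# Sketch: average number of object classes per image over a folder of
# label-map PNGs. The annotation format assumed here is simplified and
# the directory path is hypothetical.
from pathlib import Path

import numpy as np
from PIL import Image

def classes_in_label_map(png_path: Path) -> int:
    """Count distinct object classes present in one label map."""
    labels = np.asarray(Image.open(png_path))
    class_ids = np.unique(labels)
    return int((class_ids != 0).sum())  # drop the "unlabeled" index 0

def mean_classes_per_image(annotation_dir: str) -> float:
    """Average class count per image across all PNGs in a directory."""
    counts = [classes_in_label_map(p)
              for p in sorted(Path(annotation_dir).glob("*.png"))]
    return float(np.mean(counts)) if counts else 0.0

if __name__ == "__main__":
    # Hypothetical path; point this at a folder of label-map PNGs.
    print(mean_classes_per_image("annotations/validation"))
```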