Multi-Modal Face Stylization with a Generative Prior

dc.contributor.author: Li, Mengtian (en_US)
dc.contributor.author: Dong, Yi (en_US)
dc.contributor.author: Lin, Minxuan (en_US)
dc.contributor.author: Huang, Haibin (en_US)
dc.contributor.author: Wan, Pengfei (en_US)
dc.contributor.author: Ma, Chongyang (en_US)
dc.contributor.editor: Chaine, Raphaëlle (en_US)
dc.contributor.editor: Deng, Zhigang (en_US)
dc.contributor.editor: Kim, Min H. (en_US)
dc.date.accessioned: 2023-10-09T07:34:52Z
dc.date.available: 2023-10-09T07:34:52Z
dc.date.issued: 2023
dc.description.abstract: In this work, we introduce a new approach for face stylization. Although existing methods achieve impressive results on this task, there is still room for improvement in generating high-quality artistic faces with diverse styles and accurate facial reconstruction. Our proposed framework, MMFS, supports multi-modal face stylization by leveraging the strengths of StyleGAN and integrating it into an encoder-decoder architecture. Specifically, we use the mid-resolution and high-resolution layers of StyleGAN as the decoder to generate high-quality faces, while aligning its low-resolution layer with the encoder to extract and preserve input facial details. We also introduce a two-stage training strategy. In the first stage, we train the encoder to align its feature maps with StyleGAN and enable a faithful reconstruction of input faces. In the second stage, the entire network is fine-tuned with artistic data for stylized face generation. To enable the fine-tuned model to be applied in zero-shot and one-shot stylization tasks, we train an additional mapping network from the large-scale Contrastive Language-Image Pre-training (CLIP) space to the latent w+ space of the fine-tuned StyleGAN. Qualitative and quantitative experiments show that our framework achieves superior performance in both one-shot and zero-shot face stylization tasks, outperforming state-of-the-art methods by a large margin. (en_US)
dc.description.number: 7
dc.description.sectionheaders: Virtual Humans
dc.description.seriesinformation: Computer Graphics Forum
dc.description.volume: 42
dc.identifier.doi: 10.1111/cgf.14952
dc.identifier.issn: 1467-8659
dc.identifier.pages: 10 pages
dc.identifier.uri: https://doi.org/10.1111/cgf.14952
dc.identifier.uri: https://diglib.eg.org:443/handle/10.1111/cgf14952
dc.publisher: The Eurographics Association and John Wiley & Sons Ltd. (en_US)
dc.subject: CCS Concepts: Computing methodologies -> Image processing
dc.subject: Computing methodologies
dc.subject: Image processing
dc.title: Multi-Modal Face Stylization with a Generative Prior (en_US)
Files
Original bundle
Name: v42i7_24_14952.pdf
Size: 15.67 MB
Format: Adobe Portable Document Format