Multi-Modal Face Stylization with a Generative Prior

Li, Mengtian; Dong, Yi; Lin, Minxuan; Huang, Haibin; Wan, Pengfei; Ma, Chongyang

dc.contributor.author	Li, Mengtian	en_US
dc.contributor.author	Dong, Yi	en_US
dc.contributor.author	Lin, Minxuan	en_US
dc.contributor.author	Huang, Haibin	en_US
dc.contributor.author	Wan, Pengfei	en_US
dc.contributor.author	Ma, Chongyang	en_US
dc.contributor.editor	Chaine, Raphaëlle	en_US
dc.contributor.editor	Deng, Zhigang	en_US
dc.contributor.editor	Kim, Min H.	en_US
dc.date.accessioned	2023-10-09T07:34:52Z
dc.date.available	2023-10-09T07:34:52Z
dc.date.issued	2023
dc.description.abstract	In this work, we introduce a new approach for face stylization. Although existing methods achieve impressive results on this task, there is still room for improvement in generating high-quality artistic faces with diverse styles and accurate facial reconstruction. Our proposed framework, MMFS, supports multi-modal face stylization by leveraging the strengths of StyleGAN and integrating it into an encoder-decoder architecture. Specifically, we use the mid-resolution and high-resolution layers of StyleGAN as the decoder to generate high-quality faces, while aligning its low-resolution layer with the encoder to extract and preserve input facial details. We also introduce a two-stage training strategy: in the first stage, we train the encoder to align its feature maps with StyleGAN and enable a faithful reconstruction of input faces; in the second stage, the entire network is fine-tuned on artistic data for stylized face generation. To enable the fine-tuned model to be applied to zero-shot and one-shot stylization tasks, we train an additional mapping network from the large-scale Contrastive Language-Image Pre-training (CLIP) space to the latent w+ space of the fine-tuned StyleGAN. Qualitative and quantitative experiments show that our framework achieves superior performance in both one-shot and zero-shot face stylization tasks, outperforming state-of-the-art methods by a large margin.	en_US
dc.description.number	7
dc.description.sectionheaders	Virtual Humans
dc.description.seriesinformation	Computer Graphics Forum
dc.description.volume	42
dc.identifier.doi	10.1111/cgf.14952
dc.identifier.issn	1467-8659
dc.identifier.pages	10 pages
dc.identifier.uri	https://doi.org/10.1111/cgf.14952
dc.identifier.uri	https://diglib.eg.org:443/handle/10.1111/cgf14952
dc.publisher	The Eurographics Association and John Wiley & Sons Ltd.	en_US
dc.subject	CCS Concepts: Computing methodologies -> Image processing
dc.subject	Computing methodologies
dc.subject	Image processing
dc.title	Multi-Modal Face Stylization with a Generative Prior	en_US

Files

Original bundle

Name: v42i7_24_14952.pdf
Size: 15.67 MB
Format: Adobe Portable Document Format

Collections

42-Issue 7