Paper Title

Fine-Grained Expression Manipulation via Structured Latent Space

Authors

Junshu Tang, Zhiwen Shao, Lizhuang Ma

Abstract

Fine-grained facial expression manipulation is a challenging problem, as fine-grained expression details are difficult to capture. Most existing expression manipulation methods resort to discrete expression labels, which mainly edit global expressions and ignore the manipulation of fine details. To tackle this limitation, we propose an end-to-end expression-guided generative adversarial network (EGGAN), which utilizes structured latent codes and continuous expression labels as input to generate images with expected expressions. Specifically, we adopt an adversarial autoencoder to map a source image into a structured latent space. Then, given the source latent code and the target expression label, we employ a conditional GAN to generate a new image with the target expression. Moreover, we introduce a perceptual loss and a multi-scale structural similarity loss to preserve identity and global shape during generation. Extensive experiments show that our method can manipulate fine-grained expressions and generate continuous intermediate expressions between the source and target expressions.
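To illustrate the multi-scale structural similarity term mentioned in the abstract, below is a minimal NumPy sketch of an MS-SSIM-style loss. It is not the authors' implementation: it uses global image statistics instead of a sliding Gaussian window, a simple 2x2 average-pool for downsampling, and an assumed number of scales, so it only conveys the general idea of penalizing structural differences at several resolutions.

```python
import numpy as np


def ssim(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Simplified SSIM over whole images in [0, 1], using global statistics
    rather than the usual local sliding window."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )


def ms_ssim_loss(x, y, scales=3):
    """MS-SSIM-style loss: average (1 - SSIM) over dyadic downsamplings.
    `scales=3` is an illustrative choice, not taken from the paper."""
    total = 0.0
    for _ in range(scales):
        total += 1.0 - ssim(x, y)
        # 2x2 average pooling to move to the next (coarser) scale.
        h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
        x = x[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        y = y[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return total / scales
```

Identical images give a loss of zero, while structurally dissimilar images (e.g., an image and its negative) give a clearly positive loss, which is the property the paper relies on to preserve global face shape during generation.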
