Paper Title

Omni-supervised Facial Expression Recognition via Distilled Data

Paper Authors

Ping Liu, Yunchao Wei, Zibo Meng, Weihong Deng, Joey Tianyi Zhou, Yi Yang

Abstract

Facial expression plays an important role in understanding human emotions. Recently, deep learning based methods have shown promise for facial expression recognition. However, the performance of current state-of-the-art facial expression recognition (FER) approaches is directly tied to the amount of labeled training data. To address this issue, prior works employ the pretrain-and-finetune strategy, i.e., utilize a large amount of unlabeled data to pretrain the network and then finetune it on the labeled data. Since the labeled data is limited, the final network performance is still restricted. From a different perspective, we propose to perform omni-supervised learning to directly exploit reliable samples in a large amount of unlabeled data for network training. Specifically, a new dataset is first constructed by using a primitive model, trained on a small number of labeled samples, to select samples with high confidence scores from a face dataset, i.e., MS-Celeb-1M, based on feature-wise similarity. We experimentally verify that the new dataset created in such an omni-supervised manner can significantly improve the generalization ability of the learned FER model. However, as the number of training samples grows, the computational cost and training time increase dramatically. To tackle this, we propose to apply a dataset distillation strategy to compress the created dataset into several informative class-wise images, significantly improving the training efficiency. We have conducted extensive experiments on widely used benchmarks, where consistent performance gains are achieved under various settings with the proposed framework. More importantly, the distilled dataset has shown its capability of boosting FER performance with negligible additional computational cost.
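
To make the omni-supervised selection step described in the abstract more concrete, below is a minimal PyTorch-style sketch of how high-confidence samples could be mined from an unlabeled face pool by feature-wise similarity to per-class prototypes. It is not the authors' implementation: `primitive_model`, its `extract_features`/`feature_dim` members, the data loaders, and the similarity threshold are all assumed placeholders.

```python
# Minimal sketch (not the paper's code): pseudo-label unlabeled faces by cosine
# similarity to class prototypes computed from the small labeled set.
import torch
import torch.nn.functional as F

@torch.no_grad()
def build_class_prototypes(primitive_model, labeled_loader, num_classes):
    """Average L2-normalized features of the labeled samples for each expression class."""
    sums = [torch.zeros(primitive_model.feature_dim) for _ in range(num_classes)]  # feature_dim is assumed
    counts = [0] * num_classes
    for images, labels in labeled_loader:
        feats = F.normalize(primitive_model.extract_features(images), dim=1)  # extract_features is assumed
        for f, y in zip(feats, labels):
            sums[y] += f
            counts[y] += 1
    return torch.stack([F.normalize(s / max(c, 1), dim=0) for s, c in zip(sums, counts)])

@torch.no_grad()
def select_confident_samples(primitive_model, unlabeled_loader, prototypes, threshold=0.8):
    """Keep unlabeled faces whose best cosine similarity to any prototype exceeds the threshold."""
    selected = []
    for images in unlabeled_loader:  # loader assumed to yield images only (e.g., MS-Celeb-1M crops)
        feats = F.normalize(primitive_model.extract_features(images), dim=1)
        sims = feats @ prototypes.t()              # (batch, num_classes) cosine similarities
        conf, pseudo_labels = sims.max(dim=1)      # confidence score and pseudo label per face
        for img, c, y in zip(images, conf, pseudo_labels):
            if c.item() >= threshold:
                selected.append((img, int(y)))     # add to the new omni-supervised training set
    return selected
```

In this sketch the confidence score of an unlabeled face is simply its best cosine similarity to any class prototype; only faces above the threshold are pseudo-labeled and added to the enlarged training set, which the paper's dataset distillation step would then compress into a few informative class-wise images.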
