Paper Title

MSV Challenge 2022: NPU-HC Speaker Verification System for Low-resource Indian Languages

Authors

Yue Li, Li Zhang, Namin Wang, Jie Liu, Lei Xie

Abstract

This report describes the NPU-HC speaker verification system submitted to the O-COCOSDA Multi-lingual Speaker Verification (MSV) Challenge 2022, which focuses on developing speaker verification systems for low-resource Asian languages. We participate in the I-MSV track, which targets speaker verification for various Indian languages. We first explore different neural network frameworks for low-resource speaker verification. We then use vanilla fine-tuning and weight transfer fine-tuning to transfer out-of-domain pre-trained models to the in-domain Indian dataset. Weight transfer fine-tuning constrains the distance between the weights of the pre-trained model and those of the fine-tuned model; this retains the discriminative ability acquired from the large-scale out-of-domain datasets while avoiding both catastrophic forgetting and overfitting. Finally, score fusion is adopted to further improve performance. With these contributions, we obtain 0.223% EER on the public evaluation set, ranking 2nd on the leaderboard. On the private evaluation set, our submitted system achieves EERs of 2.123% and 0.630% on the constrained and unconstrained sub-tasks of the I-MSV track, placing 1st and 3rd, respectively.
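The abstract names two techniques without giving their form: a constraint on the distance between pre-trained and fine-tuned weights, and score fusion. The sketch below illustrates one plausible reading of each, not the authors' implementation. The squared L2 distance, the trade-off factor `lam`, the weighted-average fusion rule, and all function names are assumptions introduced here for illustration.

```python
from typing import Dict, List, Sequence


def weight_transfer_penalty(
    pretrained: Dict[str, List[float]],
    finetuned: Dict[str, List[float]],
) -> float:
    """Squared L2 distance between two sets of named weights.

    Assumption: the abstract only says the distance is constrained;
    the choice of squared L2 is ours.
    """
    total = 0.0
    for name, ref in pretrained.items():
        cur = finetuned[name]
        total += sum((c - r) ** 2 for c, r in zip(cur, ref))
    return total


def fine_tuning_loss(task_loss: float, penalty: float, lam: float = 0.01) -> float:
    """Task loss plus the weight-distance penalty, scaled by a hypothetical `lam`."""
    return task_loss + lam * penalty


def fuse_scores(
    system_scores: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Weighted average of per-trial scores from several systems (assumed fusion rule)."""
    norm = sum(weights)
    n_trials = len(system_scores[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, system_scores)) / norm
        for i in range(n_trials)
    ]


if __name__ == "__main__":
    pre = {"layer.weight": [1.0, 2.0]}
    fin = {"layer.weight": [1.5, 2.0]}
    penalty = weight_transfer_penalty(pre, fin)
    print(penalty)
    print(fine_tuning_loss(task_loss=1.0, penalty=penalty, lam=0.1))
    print(fuse_scores([[0.2, 0.8], [0.4, 0.6]], weights=[1.0, 1.0]))
```

A larger `lam` keeps the fine-tuned model closer to the pre-trained weights, which guards against forgetting. A smaller `lam` lets the model adapt more freely to the in-domain data, at a higher risk of overfitting.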
