Waveform Boundary Detection for Partially Spoofed Audio

Paper title

Waveform Boundary Detection for Partially Spoofed Audio

Authors

Zexin Cai, Weiqing Wang, Ming Li

Abstract

The present paper proposes a waveform boundary detection system for audio spoofing attacks containing partially manipulated segments. Partially spoofed/fake audio, where part of the utterance is replaced, either with synthetic or natural audio clips, has recently been reported as one scenario of audio deepfakes. As deepfakes can be a threat to social security, the detection of such spoofing audio is essential. Accordingly, we propose to address the problem with a deep learning-based frame-level detection system that can detect partially spoofed audio and locate the manipulated pieces. Our proposed method is trained and evaluated on data provided by the ADD2022 Challenge. We evaluate our detection model concerning various acoustic features and network configurations. As a result, our detection system achieves an equal error rate (EER) of 6.58% on the ADD2022 challenge test set, which is the best performance in partially spoofed audio detection systems that can locate manipulated clips.
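The abstract reports its result as an equal error rate (EER): the operating point at which the false acceptance rate and the false rejection rate coincide. The sketch below shows one generic way to estimate it from a list of detection scores. It is an illustration only, not the scoring procedure of the ADD2022 Challenge or of the authors. The function name `equal_error_rate`, the convention that a higher score means "more likely spoofed", and the label encoding (1 = spoofed, 0 = genuine) are all assumptions made for this example.

```python
def equal_error_rate(scores, labels):
    """Estimate the EER by sweeping a threshold over the observed scores.

    Assumed conventions (not taken from the paper):
      - a higher score means "more likely spoofed"
      - labels use 1 for spoofed and 0 for genuine utterances
    Returns (eer, threshold) at the point where the two error rates are closest.
    """
    genuine = [s for s, y in zip(scores, labels) if y == 0]
    spoofed = [s for s, y in zip(scores, labels) if y == 1]
    if not genuine or not spoofed:
        raise ValueError("need at least one genuine and one spoofed score")

    best = None  # (gap between the two error rates, eer estimate, threshold)
    for thr in sorted(set(scores)):
        # False acceptance: genuine audio flagged as spoofed.
        far = sum(s >= thr for s in genuine) / len(genuine)
        # False rejection: spoofed audio that slips through as genuine.
        frr = sum(s < thr for s in spoofed) / len(spoofed)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, thr)
    return best[1], best[2]


if __name__ == "__main__":
    toy_scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
    eer, thr = equal_error_rate(toy_scores, toy_labels)
    print(f"EER = {eer:.2%} at threshold {thr}")  # EER = 25.00% at threshold 0.6
```

On a finite score list the two error rates rarely cross exactly, so this sketch reports the average of the two rates at the threshold where they are closest. Evaluation toolkits often interpolate between thresholds instead, which can give slightly different values.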
