Paper Title
Multi-task single channel speech enhancement using speech presence probability as a secondary task training target
Paper Authors
Paper Abstract
To cope with reverberation and noise in single channel acoustic scenarios, typical supervised deep neural network (DNN)-based techniques learn a mapping from reverberant and noisy input features to a user-defined target. Commonly used targets are the desired signal magnitude, a time-frequency mask such as the Wiener gain, or the interference power spectral density and signal-to-interference ratio, which can be used to compute a time-frequency mask. In this paper, we propose to incorporate multi-task learning in such DNN-based enhancement techniques by using speech presence probability (SPP) estimation as a secondary task assisting the target estimation in the main task. The advantage of multi-task learning lies in sharing domain-specific information between the two tasks (i.e., target and SPP estimation) and learning more generalizable and robust representations. To learn both tasks simultaneously, we propose an adaptive loss weighting method derived from the homoscedastic uncertainty of the tasks. Simulation results show that the dereverberation and noise reduction performance of a single-task DNN trained to directly estimate the Wiener gain is higher than that of single-task DNNs trained to estimate the desired signal magnitude, the interference power spectral density, or the signal-to-interference ratio. Incorporating the proposed multi-task learning scheme to jointly estimate the Wiener gain and the SPP further improves dereverberation and noise reduction.
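To illustrate the two ingredients named in the abstract, the following minimal Python sketch shows a Wiener-gain mask computed from a signal-to-interference ratio and a multi-task loss weighted by homoscedastic uncertainty. It assumes the standard formulation of Kendall et al. (2018); the function and variable names (wiener_gain, multitask_loss, log_var_gain, log_var_spp) are hypothetical and not taken from the paper, and the paper's exact loss definitions may differ.

import numpy as np

def wiener_gain(sir):
    # Wiener gain derived from the (a priori) signal-to-interference ratio,
    # one of the time-frequency mask targets mentioned in the abstract.
    return sir / (1.0 + sir)

def multitask_loss(loss_gain, loss_spp, log_var_gain, log_var_spp):
    # Adaptive weighting of the main-task (Wiener gain) and secondary-task (SPP)
    # losses via learnable log-variance parameters: each task loss is scaled by
    # exp(-log_var), and the log-variance is added as a penalty so the task
    # weights cannot collapse to zero.
    return (np.exp(-log_var_gain) * loss_gain + log_var_gain
            + np.exp(-log_var_spp) * loss_spp + log_var_spp)

# Toy example: a mask value and a combined loss with equal initial uncertainties.
print(wiener_gain(sir=2.0))                # ~0.667
print(multitask_loss(0.8, 0.3, 0.0, 0.0))  # 1.1

In training, the log-variance parameters would be optimized jointly with the network weights, so the balance between the Wiener-gain loss and the SPP loss adapts automatically rather than being hand-tuned.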