Paper Title
A Time-domain Monaural Speech Enhancement with Feedback Learning
Authors
Abstract
In this paper, we propose FTNet, a time-domain neural network with feedback learning for monaural speech enhancement. The network consists of three principal components. The first is a stage recurrent neural network, which uses a memory mechanism to aggregate deep feature dependencies across stages and removes the interference stage by stage. The second is a convolutional auto-encoder. The third is a series of concatenated gated linear units, which facilitate information flow and gradually increase the receptive field. Feedback learning is adopted to improve parameter efficiency, so the number of trainable parameters is reduced without sacrificing performance. Experiments on the TIMIT corpus show that the proposed network consistently achieves better PESQ and STOI scores than two state-of-the-art time-domain baselines under different conditions.
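The abstract names two mechanisms that a small example may make more concrete: gated linear units whose receptive field grows layer by layer, and feedback learning, where the same parameters are reused at every stage so that adding stages adds no trainable parameters. The following is a minimal pure-Python sketch under stated assumptions, not the authors' implementation. The function names, the kernel values, the dilation schedule (1, 2, 4), and the simple averaging of the noisy input with the previous estimate (a stand-in for the stage recurrent neural network) are all hypothetical; the convolutional auto-encoder is omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d(signal, kernel, dilation=1):
    """Causal dilated 1-D convolution with implicit zero padding on the left."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


def gated_linear_unit(signal, lin_kernel, gate_kernel, dilation=1):
    """GLU: a linear branch multiplied elementwise by a sigmoid gate."""
    lin = conv1d(signal, lin_kernel, dilation)
    gate = conv1d(signal, gate_kernel, dilation)
    return [a * sigmoid(b) for a, b in zip(lin, gate)]


def receptive_field(layers):
    """Number of input samples that one output sample of the stack depends on."""
    return 1 + sum((len(lin_k) - 1) * d for lin_k, _, d in layers)


def count_parameters(layers):
    """Trainable weights in the stack; independent of the number of stages."""
    return sum(len(lin_k) + len(gate_k) for lin_k, gate_k, _ in layers)


def feedback_enhance(noisy, layers, num_stages=3):
    """Run the SAME layer stack at every stage (weight sharing across stages).

    Assumption: the noisy input and the previous estimate are combined by
    plain averaging, a placeholder for the stage recurrent neural network.
    """
    estimate = list(noisy)
    for _ in range(num_stages):
        x = [0.5 * (n + e) for n, e in zip(noisy, estimate)]
        for lin_k, gate_k, dilation in layers:
            x = gated_linear_unit(x, lin_k, gate_k, dilation)
        estimate = x
    return estimate


# Hypothetical stack: kernel size 2, dilations 1, 2, 4 (illustrative values only).
LAYERS = [([0.6, 0.4], [0.1, -0.1], d) for d in (1, 2, 4)]

if __name__ == "__main__":
    noisy = [0.0, 1.0, 0.5, -0.5, 0.25, 0.0, -1.0, 0.75]
    enhanced = feedback_enhance(noisy, LAYERS, num_stages=3)
    print(len(enhanced), receptive_field(LAYERS), count_parameters(LAYERS))
```

In this sketch, raising `num_stages` changes the amount of computation but leaves `count_parameters(LAYERS)` unchanged, which is the parameter-efficiency argument in miniature. A cascade of separately parameterized stages would instead multiply the parameter count by the number of stages.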