Paper title
Phase-Aware Deep Speech Enhancement: It's All About The Frame Length
Paper authors
Paper abstract
Algorithmic latency in speech processing is dominated by the frame length used for Fourier analysis, which in turn limits the achievable performance of magnitude-centric approaches. As previous studies suggest that the importance of phase grows with decreasing frame length, this work presents a systematic study of the contribution of phase and magnitude in modern Deep Neural Network (DNN)-based speech enhancement at different frame lengths. Results indicate that DNNs can successfully estimate phase when using short frames, with similar or better overall performance compared to using longer frames. Thus, interestingly, modern phase-aware DNNs allow for low-latency speech enhancement at high quality.
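
To make the link between frame length and latency concrete, the following is a minimal sketch that converts a frame length in samples into the duration of one analysis frame. It assumes that the algorithmic latency can be approximated by the frame duration alone and that the sampling rate is 16 kHz. Neither the sampling rate nor the listed frame lengths are taken from the abstract, and the function name `frame_latency_ms` is hypothetical.

```python
def frame_latency_ms(frame_length_samples: int, sample_rate_hz: int = 16000) -> float:
    """Return the duration of one analysis frame in milliseconds.

    Assumption: algorithmic latency is approximated by the frame duration only;
    hop size, lookahead, and processing time are ignored.
    """
    if frame_length_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("frame length and sample rate must be positive")
    return 1000.0 * frame_length_samples / sample_rate_hz


if __name__ == "__main__":
    # Illustrative frame lengths only; these values are not from the paper.
    for n in (64, 128, 256, 512):
        print(f"{n:4d} samples -> {frame_latency_ms(n):.1f} ms")
```

Under these assumptions, halving the frame length halves the latency contribution of the Fourier analysis, which is why short frames are attractive for low-latency enhancement.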