Exploring Optimal DNN Architecture for End-to-End Beamformers Based on Time-frequency References

Paper Title

Exploring Optimal DNN Architecture for End-to-End Beamformers Based on Time-frequency References

Authors

Yuichiro Koyama, Bhiksha Raj

Abstract

Acoustic beamformers have been widely used to enhance audio signals. Currently, the best methods are the deep neural network (DNN)-powered variants of the generalized eigenvalue and minimum-variance distortionless response beamformers and the DNN-based filter-estimation methods that are used to directly compute beamforming filters. Both approaches are effective; however, they have blind spots in their generalizability. Therefore, we propose a novel approach for combining these two methods into a single framework that attempts to exploit the best features of both. The resulting model, called the W-Net beamformer, includes two components; the first computes time-frequency references that the second uses to estimate beamforming filters. The results on data that include a wide variety of room and noise conditions, including static and mobile noise sources, show that the proposed beamformer outperforms other methods on all tested evaluation metrics, which signifies that the proposed architecture allows for effective computation of the beamforming filters.
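
For orientation, the minimum-variance distortionless response (MVDR) beamformer mentioned in the abstract can be sketched for a single frequency bin as below. This is a textbook illustration only, under assumed values: two microphones, a made-up steering vector, and a made-up noise covariance matrix. The function names are hypothetical, and the sketch is not the W-Net beamformer, whose architecture the abstract does not detail.

```python
# Illustrative MVDR beamformer for one frequency bin (pure Python, 2 microphones).
# All names and numbers are hypothetical; this is NOT the W-Net beamformer.

def mvdr_weights_2ch(noise_cov, steering):
    """Return w = R^{-1} d / (d^H R^{-1} d) for a 2x2 noise covariance R."""
    (a, b), (c, d) = noise_cov
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix.
    inv = [[d / det, -b / det], [-c / det, a / det]]
    # Numerator: R^{-1} d
    rinv_d = [
        inv[0][0] * steering[0] + inv[0][1] * steering[1],
        inv[1][0] * steering[0] + inv[1][1] * steering[1],
    ]
    # Denominator: d^H R^{-1} d (real when R is Hermitian positive definite)
    denom = sum(s.conjugate() * v for s, v in zip(steering, rinv_d))
    return [v / denom for v in rinv_d]


def apply_beamformer(weights, frames):
    """Filter-and-sum: y(t) = w^H x(t) for each multichannel frame x(t)."""
    return [
        sum(w.conjugate() * x for w, x in zip(weights, frame))
        for frame in frames
    ]


# Assumed steering vector and Hermitian noise covariance for the example.
steering = [1 + 0j, 0 + 1j]
noise_cov = [[2 + 0j, 0.5 + 0.5j], [0.5 - 0.5j, 1 + 0j]]

weights = mvdr_weights_2ch(noise_cov, steering)

# Distortionless constraint: the response toward the target, w^H d, should be 1.
response = sum(w.conjugate() * s for w, s in zip(weights, steering))

# A noise-free source arriving along the steering vector passes through unchanged.
source = [2 + 0j, -1 + 3j]
frames = [[s * src for s in steering] for src in source]
output = apply_beamformer(weights, frames)
```

In DNN-powered variants of this beamformer, the covariance matrices are typically estimated from network outputs such as time-frequency masks, whereas filter-estimation methods predict the weights directly; the closed-form step above covers only the classical part.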
