Title


FastWave: Accelerating Autoregressive Convolutional Neural Networks on FPGA

Authors

Shehzeen Hussain, Mojan Javaheripi, Paarth Neekhara, Ryan Kastner, Farinaz Koushanfar

Abstract


Autoregressive convolutional neural networks (CNNs) have been widely exploited for sequence generation tasks such as audio synthesis, language modeling, and neural machine translation. WaveNet is a deep autoregressive CNN composed of several stacked layers of dilated convolution, used for sequence generation. While WaveNet produces state-of-the-art audio generation results, the naive inference implementation is quite slow; it takes a few minutes to generate just one second of audio on a high-end GPU. In this work, we develop FastWave, the first accelerator platform for autoregressive convolutional neural networks, and address the associated design challenges. We design the Fast-Wavenet inference model in Vivado HLS and perform a wide range of optimizations including fixed-point implementation, array partitioning, and pipelining. Our model uses a fully parameterized parallel architecture for fast matrix-vector multiplication that enables per-layer customized latency fine-tuning for further throughput improvement. Our experiments comparatively assess the trade-off between throughput and resource utilization for various optimizations. Our best WaveNet design on the Xilinx XCVU13P FPGA, which uses only on-chip memory, achieves 66× faster generation than a CPU implementation and 11× faster generation than a GPU implementation.
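The Fast-Wavenet inference scheme mentioned in the abstract avoids recomputing the full receptive field at every generation step by caching each layer's past activations in a per-layer queue whose length equals that layer's dilation. The sketch below is a minimal, single-channel illustration of this caching idea; the 2-tap weights, the toy tanh nonlinearity, and the class name are illustrative assumptions, not the FastWave implementation (the real network also has gated activations and residual/skip connections).

```cpp
#include <cmath>
#include <cstddef>
#include <deque>

// Minimal single-channel sketch of queue-based Fast-Wavenet inference.
// Each layer keeps a FIFO of its last `dilation` inputs, so generating one
// sample touches O(layers) cached values instead of the whole receptive field.
struct DilatedLayer {
    std::deque<float> queue;   // cached activations from previous time steps
    float w_recent, w_past;    // 2-tap dilated convolution weights (toy)

    DilatedLayer(std::size_t dilation, float wr, float wp)
        : queue(dilation, 0.0f), w_recent(wr), w_past(wp) {}

    // One generation step: pop the activation from `dilation` steps ago,
    // combine it with the current input, and cache the input for later reuse.
    float step(float x) {
        float past = queue.front();
        queue.pop_front();
        queue.push_back(x);
        return std::tanh(w_past * past + w_recent * x);  // toy nonlinearity
    }
};
// Stacking layers with dilations 1, 2, 4, ... doubles the receptive field
// per layer while keeping the per-sample work linear in the number of layers.
```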
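During single-sample generation the dilated convolutions reduce to matrix-vector products, which is why the abstract highlights a fully parameterized parallel matrix-vector multiplication kernel. Below is a hedged Vivado HLS sketch of how fixed-point arithmetic, array partitioning, and pipelining typically combine in such a kernel; the `ap_fixed<16,6>` format and the 128×128 dimensions are assumptions for illustration, not the paper's reported configuration.

```cpp
#include "ap_fixed.h"

// Assumed fixed-point format and layer dimensions (illustrative only).
typedef ap_fixed<16, 6> fixed_t;
const int N_ROWS = 128;
const int N_COLS = 128;

// Parallel matrix-vector multiply: partitioning W along dim=2 and x completely
// gives the fully unrolled inner loop its own access port per element, and
// PIPELINE II=1 on the row loop starts a new output row every cycle.
void matvec(const fixed_t W[N_ROWS][N_COLS],
            const fixed_t x[N_COLS],
            fixed_t y[N_ROWS]) {
#pragma HLS ARRAY_PARTITION variable=W complete dim=2
#pragma HLS ARRAY_PARTITION variable=x complete
Row:
    for (int i = 0; i < N_ROWS; ++i) {
#pragma HLS PIPELINE II=1
        fixed_t acc = 0;
    Col:
        for (int j = 0; j < N_COLS; ++j) {  // fully unrolled under PIPELINE
            acc += W[i][j] * x[j];
        }
        y[i] = acc;
    }
}
```

Choosing a partition factor and initiation interval per layer is the kind of per-layer latency fine-tuning the abstract refers to: smaller factors trade throughput for lower resource utilization, which matters when the whole model must fit in on-chip memory.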
