Paper title
Exploiting RapidWright in the Automatic Generation of Application-Specific FPGA Overlays
Paper authors
Abstract
Overlay architectures implemented on FPGA devices have been proposed as a means to increase FPGA adoption in general-purpose computing. They provide the benefits of software, such as flexibility and programmability, thus making it easier to build dedicated compilers. However, existing overlays are generic and resource- and power-hungry, with performance usually an order of magnitude lower than bare-metal implementations. As a result, FPGA overlays have been confined to research and some niche applications. In this paper, we introduce Application-Specific FPGA Overlays (AS-Overlays), which can provide bare-metal performance to FPGA overlays, thus opening doors for broader adoption. Our approach is based on the automatic extraction of hardware kernels from data flow applications. Extracted kernels are then leveraged for application-specific generation of hardware accelerators. Reconfiguration of the overlay is done with RapidWright, which allows bypassing the HDL design flow. Through prototyping, we demonstrate the viability and relevance of our approach. Experiments show a productivity improvement of up to 20x compared to state-of-the-art FPGA overlays, while achieving over 1.33x higher Fmax than direct FPGA implementation and the possibility of lower resource and power consumption compared to bare metal.