Paper Title

Tearing Down the Memory Wall

Authors

Zaid Qureshi, Vikram Sharma Mailthody, Seung Won Min, I-Hsin Chung, Jinjun Xiong, Wen-mei Hwu

Abstract

We present a vision for the Erudite architecture that redefines the compute and memory abstractions such that memory bandwidth and capacity become first-class citizens along with compute throughput. In this architecture, we envision coupling a high-density, massively parallel memory technology like Flash with programmable near-data accelerators, like the streaming multiprocessors in modern GPUs. Each accelerator has a local pool of storage-class memory that it can access at high throughput by initiating very large numbers of overlapping requests that help to tolerate long access latency. The accelerators can also communicate with each other and remote memory through a high-throughput low-latency interconnect. As a result, systems based on the Erudite architecture scale compute and memory bandwidth at the same rate, tearing down the notorious memory wall that has plagued computer architecture for generations. In this paper, we present the motivation, rationale, design, benefit, and research challenges for Erudite.
