Paper Title
Ripple: A Practical Declarative Programming Framework for Serverless Compute
Paper Authors
Abstract
Serverless computing has emerged as a promising alternative to infrastructure- (IaaS) and platform-as-a-service (PaaS) cloud platforms for applications with ample parallelism and intermittent activity. Serverless promises greater resource elasticity, significant cost savings, and simplified application deployment. All major cloud providers, including Amazon, Google, and Microsoft, have introduced serverless to their public cloud offerings. For serverless to reach its potential, there is a pressing need for programming frameworks that abstract the deployment complexity away from the user. This includes simplifying the process of writing applications for serverless environments, automating task and data partitioning, and handling scheduling and fault tolerance. We present Ripple, a programming framework designed to specifically take applications written for single-machine execution and allow them to take advantage of the task parallelism of serverless. Ripple exposes a simple interface that users can leverage to express the high-level dataflow of a wide spectrum of applications, including machine learning (ML) analytics, genomics, and proteomics. Ripple also automates resource provisioning, meeting user-defined QoS targets, and handles fault tolerance by eagerly detecting straggler tasks. We port Ripple over AWS Lambda and show that, across a set of diverse applications, it provides an expressive and generalizable programming framework that simplifies running data-parallel applications on serverless, and can improve performance by up to 80x compared to IaaS/PaaS clouds for similar costs.