Paper title
Integrating LHCb workflows on HPC resources: status and strategies
Paper authors
Abstract
High Performance Computing (HPC) supercomputers are expected to play an increasingly important role in High Energy Physics (HEP) computing in the coming years. While HPC resources are not necessarily the optimal fit for HEP workflows, computing time at HPC centers has already been available to the Large Hadron Collider (LHC) experiments on an opportunistic basis for some time, and part of the pledged computing resources may also be offered as CPU time allocations at HPC centers in the future. Integrating the experiment workflows so that they make the most efficient use of HPC resources is therefore essential. This paper describes the work that was necessary to integrate LHCb workflows at a specific HPC site, the Marconi-A2 system at CINECA in Italy, where LHCb benefited from a joint PRACE (Partnership for Advanced Computing in Europe) allocation with the other LHC experiments. This required addressing two types of challenges: in the software application workloads, optimising their performance on a many-core hardware architecture that differs significantly from those traditionally used in WLCG (Worldwide LHC Computing Grid), by reducing the memory footprint using a multi-process approach; and in the distributed computing area, submitting these workloads using more than one logical processor per job, which had never been done before in LHCb.