Paper title
On Strong Scaling and Open Source Tools for Analyzing Atom Probe Tomography Data
Paper authors
Abstract
Atom probe tomography (APT) has matured into a versatile nanoanalytical characterization tool with applications that range from materials science to geology and possibly beyond. Well over 100 APT microscopes already exist worldwide. Extracting information from APT data requires post-processing of the reconstructed point cloud, which is realized via basic implementations of data science methods, mostly executed with proprietary software. Limitations of this software have motivated the APT community to develop supplementary post-processing tools to cope with increasing method complexity and higher quality demands. Examples include how to improve method transparency, how to support batch processing, and how to document methods and computational workflows more completely to better align with the FAIR data stewardship principles. One gap in the APT software tool landscape has been a collection of open tools that support scientific computing hardware. Here, we introduce PARAPROBE, an open source, efficient tool for the scientific computing of APT data. We show how to process several computational geometry, spatial statistics, and clustering tasks efficiently for datasets as large as two billion ions. Our parallelization efforts yield orders-of-magnitude performance gains and deliver batch processing capabilities. We contribute these tools in an effort to open up and simplify APT data mining, making tools for rigorous quantification, sensitivity analyses, and cross-method benchmarking available to practitioners.