ML Models of Vibrating H$_2$CO: Comparing Reproducing Kernels, FCHL and PhysNet

Paper Title

ML Models of Vibrating H$_2$CO: Comparing Reproducing Kernels, FCHL and PhysNet

Authors

Käser, Silvan, Koner, Debasish, Christensen, Anders S., von Lilienfeld, O. Anatole, Meuwly, Markus

Abstract

Machine learning (ML) has become a promising tool for improving the quality of atomistic simulations. Using formaldehyde as a benchmark system for intramolecular interactions, a comparative assessment of ML models based on state-of-the-art variants of deep neural networks (NN), reproducing kernel Hilbert space (RKHS+F), and kernel ridge regression (KRR) is presented. Learning curves for energies and atomic forces indicate rapid convergence towards excellent predictions for B3LYP, MP2, and CCSD(T)-F12 reference results for modestly sized (in the hundreds) training sets. Typically, learning curve offsets decay as one goes from NN (PhysNet) to RKHS+F to KRR (FCHL). Conversely, the predictive power for extrapolation of energies towards new geometries increases in the same order, with RKHS+F and FCHL performing almost equally. For harmonic vibrational frequencies, the picture is less clear: PhysNet and FCHL yield flat learning curves at $\sim$1 and $\sim$0.2 cm$^{-1}$, respectively, regardless of the reference method, while RKHS+F models level off for B3LYP and exhibit continued improvements for MP2 and CCSD(T)-F12. Finite-temperature molecular dynamics (MD) simulations with the same initial conditions yield indistinguishable infrared spectra that compare well with experiment, except for the high-frequency modes involving hydrogen stretch motion, which is a known limitation of MD for vibrational spectroscopy. For sufficiently large training set sizes, all three models can detect insufficient convergence ("noise") of the reference electronic structure calculations, in that the learning curves level off. Transfer learning (TL) from B3LYP to CCSD(T)-F12 with PhysNet indicates that additional improvements in data efficiency can be achieved.
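
The abstract compares models by the offsets of their learning curves. As a minimal sketch of what that means, assuming the common power-law form error ≈ a · N^(-b) for training set size N, the offset a and decay exponent b can be estimated with a straight-line fit in log-log space. The function name `fit_learning_curve` and the numbers below are hypothetical and synthetic; they are not results from the paper.

```python
import math


def fit_learning_curve(sizes, errors):
    """Least-squares fit of log(error) = log(a) - b * log(N).

    Returns the offset a and the decay exponent b of the assumed
    power law error = a * N**(-b).
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    count = len(xs)
    x_mean = sum(xs) / count
    y_mean = sum(ys) / count
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope


# Synthetic, illustrative data only (NOT taken from the paper):
# an exact power law with offset 2.0 and exponent 1.0.
sizes = [100, 200, 400, 800, 1600]
errors = [2.0 * n ** -1.0 for n in sizes]

a, b = fit_learning_curve(sizes, errors)
print(f"offset a = {a:.3f}, decay exponent b = {b:.3f}")
```

In this picture, two models with similar exponents but different offsets give parallel lines on a log-log plot, and a curve that levels off at large N (as described for noisy reference data) would show up as a departure from the fitted line.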
