Paper title
Gaussian process model of 51-dimensional potential energy surface for protonated imidazole dimer
Paper authors
Paper abstract
The goal of the present work is to obtain accurate potential energy surfaces (PES) for high-dimensional molecular systems with a small number of ${\it ab}$ ${\it initio}$ calculations in a system-agnostic way. We use probabilistic modeling based on Gaussian processes (GPs). We illustrate that it is possible to build an accurate GP model of a 51-dimensional PES based on $5000$ randomly distributed ${\it ab}$ ${\it initio}$ calculations with a global accuracy of $< 0.2$ kcal/mol. Our approach uses GP models with composite kernels designed to enhance the Bayesian information content and represents the global PES as a sum of a full-dimensional GP and several GP models for molecular fragments of lower dimensionality. We demonstrate the potency of these algorithms by constructing the global PES for the protonated imidazole dimer, a molecular system with $19$ atoms. We illustrate that GP models thus constructed can extrapolate the PES from low energies ($< 10,000$ cm$^{-1}$), yielding a PES at high energies ($> 20,000$ cm$^{-1}$). This opens the prospect for new applications of GPs, such as mapping out phase transitions by extrapolation or accelerating Bayesian optimization, for high-dimensional physics and chemistry problems with a restricted number of inputs, i.e. for high-dimensional problems where obtaining training data is very difficult.
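The decomposition described in the abstract, a full-dimensional GP plus GPs over lower-dimensional fragments, can be illustrated with a toy example. The sketch below is not the authors' implementation: it swaps in a single GP whose kernel is the sum of one full-dimensional squared-exponential kernel and one squared-exponential kernel per fragment, which corresponds to a sum of independent GP priors. The 4-dimensional target `toy_energy`, the index tuples in `FRAGMENTS`, the unit length scale, and the jitter value are all assumptions chosen for illustration, and the plain-Python Cholesky solver is only practical for small training sets.

```python
import math
import random


def rbf(x, y, length_scale=1.0):
    """Squared-exponential kernel between two points given as sequences."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def additive_kernel(x, y, fragments):
    """Full-dimensional RBF plus one RBF per low-dimensional fragment.

    `fragments` is a hypothetical list of index tuples; each tuple selects
    the coordinates that belong to one fragment.
    """
    value = rbf(x, y)
    for idx in fragments:
        value += rbf([x[i] for i in idx], [y[i] for i in idx])
    return value


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def cho_solve(low, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def fit(train_x, train_y, fragments, jitter=1e-8):
    """Return the weights (K + jitter * I)^{-1} y of a zero-mean GP."""
    n = len(train_x)
    gram = [[additive_kernel(train_x[i], train_x[j], fragments) for j in range(n)]
            for i in range(n)]
    for i in range(n):
        gram[i][i] += jitter  # small diagonal term for numerical stability
    return cho_solve(cholesky(gram), train_y)


def predict(x, train_x, weights, fragments):
    """Posterior mean of the GP at a new point x."""
    return sum(w * additive_kernel(x, xi, fragments)
               for w, xi in zip(weights, train_x))


def toy_energy(x):
    """Assumed toy target: a sum of two fragment-like contributions."""
    return math.sin(x[0] + x[1]) + math.cos(x[2] - x[3])


random.seed(0)
FRAGMENTS = [(0, 1), (2, 3)]  # hypothetical fragment coordinate indices
train_x = [[random.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(30)]
train_y = [toy_energy(x) for x in train_x]
weights = fit(train_x, train_y, FRAGMENTS)

# Largest deviation of the posterior mean from the training energies.
max_train_error = max(abs(predict(x, train_x, weights, FRAGMENTS) - y)
                      for x, y in zip(train_x, train_y))
print("max training error:", max_train_error)
```

For a real PES the kernel hyperparameters would normally be optimized, for example by maximizing the marginal likelihood, rather than fixed as they are here, and the fragment terms could be trained as separate models on fragment data.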