论文标题
使用正则化稀疏自动编码器的良好反应坐标和MD轨迹的未来进化的预测:一种新颖的深度学习方法
Prediction of good reaction coordinates and future evolution of MD trajectories using Regularized Sparse Autoencoders: A novel deep learning approach
论文作者
论文摘要
鉴于RCS在确定化学反应的进展中,识别反应坐标(RCS)是研究的活跃领域。反应坐标的选择通常是基于启发式知识。但是,选择的基本标准是坐标应明确捕获反应物和产物状态。同样,坐标应该是最慢的,以便所有其他自由度都可以轻松地沿反应坐标平衡。同样,坐标应该是最慢的,以便所有其他自由度都可以轻松地沿反应坐标平衡。我们使用了一个基于能量的模型的正则稀疏自动编码器来发现一组至关重要的反应坐标。除了发现反应坐标外,我们的模型还预测了分子动力学(MD)轨迹的演变。我们表明,包括实施正则化的稀疏性有助于选择一组少量但重要的反应坐标集。我们使用了两个模型系统来证明我们的方法:丙氨酸二肽系统和prollavine and DNA系统,它们在水性环境中表现出proflavine插入到DNA小凹槽中。我们将MD轨迹建模为多元时间序列,我们的潜在变量模型执行了多步骤时间序列预测的任务。这个想法的灵感来自流行的稀疏编码方法 - 将每个输入样本表示为从一组代表性模式中获取的几个元素的线性组合。
Identifying reaction coordinates(RCs) is an active area of research, given the crucial role RCs play in determining the progress of a chemical reaction. The choice of the reaction coordinate is often based on heuristic knowledge. However, an essential criterion for the choice is that the coordinate should capture both the reactant and product states unequivocally. Also, the coordinate should be the slowest one so that all the other degrees of freedom can easily equilibrate along the reaction coordinate. Also, the coordinate should be the slowest one so that all the other degrees of freedom can easily equilibrate along the reaction coordinate. We used a regularised sparse autoencoder, an energy-based model, to discover a crucial set of reaction coordinates. Along with discovering reaction coordinates, our model also predicts the evolution of a molecular dynamics(MD) trajectory. We showcased that including sparsity enforcing regularisation helps in choosing a small but important set of reaction coordinates. We used two model systems to demonstrate our approach: alanine dipeptide system and proflavine and DNA system, which exhibited intercalation of proflavine into DNA minor groove in an aqueous environment. We model MD trajectory as a multivariate time series, and our latent variable model performs the task of multi-step time series prediction. This idea is inspired by the popular sparse coding approach - to represent each input sample as a linear combination of few elements taken from a set of representative patterns.