Paper title

Gryffin: An algorithm for Bayesian optimization of categorical variables informed by expert knowledge

Authors

Florian Häse, Matteo Aldeghi, Riley J. Hickman, Loïc M. Roch, Alán Aspuru-Guzik

Abstract

Designing functional molecules and advanced materials requires complex design choices: tuning continuous process parameters such as temperatures or flow rates, while simultaneously selecting catalysts or solvents. To date, the development of data-driven experiment planning strategies for autonomous experimentation has largely focused on continuous process parameters despite the urge to devise efficient strategies for the selection of categorical variables. Here, we introduce Gryffin, a general purpose optimization framework for the autonomous selection of categorical variables driven by expert knowledge. Gryffin augments Bayesian optimization based on kernel density estimation with smooth approximations to categorical distributions. Leveraging domain knowledge in the form of physicochemical descriptors, Gryffin can significantly accelerate the search for promising molecules and materials. Gryffin can further highlight relevant correlations between the provided descriptors to inspire physical insights and foster scientific intuition. In addition to comprehensive benchmarks, we demonstrate the capabilities and performance of Gryffin on three examples in materials science and chemistry: (i) the discovery of non-fullerene acceptors for organic solar cells, (ii) the design of hybrid organic-inorganic perovskites for light harvesting, and (iii) the identification of ligands and process parameters for Suzuki-Miyaura reactions. Our results suggest that Gryffin, in its simplest form, is competitive with state-of-the-art categorical optimization algorithms. However, when leveraging domain knowledge provided via descriptors, Gryffin outperforms other approaches while simultaneously refining this domain knowledge to promote scientific understanding.
