Paper Title
Annotating and Extracting Synthesis Process of All-Solid-State Batteries from Scientific Literature
Paper Authors
Paper Abstract
The synthesis process is essential for achieving computational experiment design in the field of inorganic materials chemistry. In this work, we present a novel corpus of synthesis processes for all-solid-state batteries and an automated machine-reading system for extracting the synthesis processes buried in the scientific literature. We define a representation of synthesis processes using flow graphs and create a corpus from the experimental sections of 243 papers. The machine-reading system combines a deep learning-based sequence tagger with a simple heuristic rule-based relation extractor. Our experimental results demonstrate that the sequence tagger with the optimal setting detects entities with a macro-averaged F1 score of 0.826, while the rule-based relation extractor achieves a macro-averaged F1 score of 0.887.
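Both reported scores are macro-averaged F1 values, meaning F1 is computed separately for each class and the per-class scores are then averaged with equal weight. The following is a minimal sketch of that calculation, not the authors' evaluation code. The label names and the true-positive, false-positive, and false-negative counts are hypothetical and chosen only for illustration; they are not taken from the corpus.

```python
from typing import Dict, Tuple


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 for one class from true-positive, false-positive, false-negative counts."""
    if tp == 0:
        # No correct predictions: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(counts: Dict[str, Tuple[int, int, int]]) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as frequent ones."""
    if not counts:
        raise ValueError("at least one class is required")
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores)


# Hypothetical entity labels and counts (tp, fp, fn), for illustration only.
example_counts = {
    "MATERIAL": (8, 2, 2),
    "OPERATION": (5, 5, 0),
    "PROPERTY": (0, 3, 4),
}

if __name__ == "__main__":
    print(f"macro-F1 = {macro_f1(example_counts):.3f}")
```

Because every class contributes equally, a class with few examples and poor performance lowers the macro average as much as a frequent class would, which is why this metric is a common choice when label frequencies are imbalanced.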