Paper Title

A Text-Embedding-based Approach to Measure Patent-to-Patent Technological Similarity -- Workflow, Code, and Applications

Authors

Hain, Daniel; Jurowetzki, Roman; Buchmann, Tobias; Wolf, Patrick

Abstract

This paper describes an efficiently scalable approach to measuring technological similarity between patents by combining embedding techniques from natural language processing with nearest-neighbor approximation. Using this methodology, we are able to compute existing similarities between all patents, which in turn enables us to represent the whole patent universe as a technological network. We validate both technological signature and similarity in various ways and, using the case of electric vehicle technologies, demonstrate their usefulness for measuring knowledge flows, mapping technological change, and creating patent quality indicators. The paper thereby contributes to the growing literature on text-based indicators for patent analysis. We provide thorough documentation of the method, including all code, indicators, and intermediate outputs, at https://github.com/daniel-hain/patent_embedding_research.
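The idea in the abstract (embed each patent as a vector, find each patent's most similar neighbors, and read the result as a network) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the authors' repository: the patent IDs and three-dimensional vectors are made up, the function names are hypothetical, and an exact brute-force cosine search stands in for the approximate nearest-neighbor step, which only becomes necessary at the scale of the full patent corpus.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest_neighbors(
    embeddings: Dict[str, Vector], k: int
) -> Dict[str, List[Tuple[str, float]]]:
    """For each patent, return its k most similar other patents.

    This is an exact O(n^2) search used as a stand-in; an approximate
    nearest-neighbor index would replace this loop for large corpora.
    """
    neighbors: Dict[str, List[Tuple[str, float]]] = {}
    for pid, vec in embeddings.items():
        scored = [
            (other, cosine_similarity(vec, other_vec))
            for other, other_vec in embeddings.items()
            if other != pid
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        neighbors[pid] = scored[:k]
    return neighbors


def similarity_edges(
    neighbors: Dict[str, List[Tuple[str, float]]]
) -> List[Tuple[str, str, float]]:
    """Collapse the neighbor lists into undirected, weighted network edges."""
    edges: Dict[Tuple[str, str], float] = {}
    for src, nbrs in neighbors.items():
        for dst, sim in nbrs:
            a, b = sorted((src, dst))
            edges[(a, b)] = sim
    return [(a, b, sim) for (a, b), sim in sorted(edges.items())]


# Made-up toy embeddings (hypothetical IDs and values, for illustration only).
EMBEDDINGS: Dict[str, Vector] = {
    "patent_A": [0.9, 0.1, 0.0],
    "patent_B": [0.8, 0.2, 0.1],
    "patent_C": [0.0, 0.1, 0.9],
}

NEIGHBORS = nearest_neighbors(EMBEDDINGS, k=1)
EDGES = similarity_edges(NEIGHBORS)
```

In this toy setup, each patent keeps only its single closest neighbor, so the resulting edge list is a sparse similarity network rather than a full pairwise matrix. Keeping only the top-k neighbors per patent is one plausible way to make a network over all patents tractable; the choice of k here is arbitrary.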
