Paper Title

How Different are Pre-trained Transformers for Text Ranking?

Paper Authors

David Rau, Jaap Kamps

Paper Abstract

In recent years, large pre-trained transformers have led to substantial gains in performance over traditional retrieval models and feedback approaches. However, these results are primarily based on the MS MARCO/TREC Deep Learning Track, with its very particular setup, and our understanding of why and how these models work better is fragmented at best. We analyze effective BERT-based cross-encoders versus traditional BM25 ranking for the passage retrieval task, where the largest gains have been observed, and investigate two main questions. On the one hand, what is similar? To what extent does the neural ranker already encompass the capacity of traditional rankers? Is the gain in performance due to a better ranking of the same documents (prioritizing precision)? On the other hand, what is different? Can it effectively retrieve documents missed by traditional systems (prioritizing recall)? We discover substantial differences in the notion of relevance, identifying strengths and weaknesses of BERT that may inspire research for future improvement. Our results contribute to our understanding of (black-box) neural rankers relative to (well-understood) traditional rankers, and help explain the particular experimental setting of MS MARCO-based test collections.
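As a rough illustration of the kind of comparison the abstract describes (not the authors' exact pipeline), the minimal sketch below scores the same passages with BM25 and with a publicly available BERT-based cross-encoder fine-tuned on MS MARCO, then prints the two orderings. The rank_bm25 and sentence-transformers libraries, the specific checkpoint name, and the toy query/passages are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's pipeline): rank identical
# passages with lexical BM25 and with a BERT-based cross-encoder, then
# compare the resulting orderings.
from rank_bm25 import BM25Okapi                 # pip install rank-bm25
from sentence_transformers import CrossEncoder  # pip install sentence-transformers

query = "how do neural rankers differ from bm25"
passages = [
    "BM25 is a lexical ranking function based on term frequency and document length.",
    "Cross-encoders jointly encode the query and passage with a transformer to score relevance.",
    "MS MARCO is a large-scale passage ranking dataset used in the TREC Deep Learning Track.",
]

# Traditional lexical ranking with BM25 (simple whitespace tokenization).
bm25 = BM25Okapi([p.lower().split() for p in passages])
bm25_scores = bm25.get_scores(query.lower().split())
bm25_order = sorted(range(len(passages)), key=lambda i: -bm25_scores[i])

# Neural ranking with a cross-encoder checkpoint trained on MS MARCO
# (model name is an assumption; any BERT-based cross-encoder works here).
ce = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
ce_scores = ce.predict([(query, p) for p in passages])
ce_order = sorted(range(len(passages)), key=lambda i: -ce_scores[i])

print("BM25 order:         ", bm25_order)
print("Cross-encoder order: ", ce_order)
```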
