Paper title
Exploring Dual Encoder Architectures for Question Answering
Paper authors
Paper abstract
Dual encoders have been used for question-answering (QA) and information retrieval (IR) tasks with good results. Previous research focuses on two major types of dual encoders: the Siamese Dual Encoder (SDE), with parameters shared across the two encoders, and the Asymmetric Dual Encoder (ADE), with two distinctly parameterized encoders. In this work, we explore different ways in which a dual encoder can be structured, and evaluate how these differences affect efficacy on QA retrieval tasks. By evaluating on the MS MARCO, open-domain NQ, and MultiReQA benchmarks, we show that SDE performs significantly better than ADE. We further propose three improved versions of ADE that share or freeze parts of the architecture between the two encoder towers. We find that sharing parameters in the projection layers enables ADEs to perform competitively with or outperform SDEs. We further explore and explain why parameter sharing in the projection layer significantly improves the efficacy of dual encoders, by directly probing the embedding spaces of the two encoder towers with the t-SNE algorithm.
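
The structural difference between the variants named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the `Tower` class, the `build_dual_encoder` helper, the single tanh layer standing in for the encoder, the dimensions, and the random initialization are all assumptions made for illustration. The only point it shows is which parameters the question and answer towers share in an SDE, an ADE, and an ADE with a shared projection layer.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class Tower:
    """One tower: a toy encoder layer followed by a projection layer (hypothetical)."""

    def __init__(self, encoder, projection):
        self.encoder = encoder
        self.projection = projection

    def embed(self, features):
        hidden = [math.tanh(h) for h in matvec(self.encoder, features)]
        projected = matvec(self.projection, hidden)
        # L2-normalize so the dot product of two embeddings is a cosine similarity.
        norm = math.sqrt(sum(p * p for p in projected)) or 1.0
        return [p / norm for p in projected]


def build_dual_encoder(kind, in_dim=4, hidden_dim=6, out_dim=3, seed=0):
    """Build (question_tower, answer_tower) for one of three variants."""
    rng = random.Random(seed)
    q_enc = make_matrix(hidden_dim, in_dim, rng)
    q_proj = make_matrix(out_dim, hidden_dim, rng)
    if kind == "sde":
        # Siamese: both towers use the very same parameters.
        a_enc, a_proj = q_enc, q_proj
    elif kind == "ade":
        # Asymmetric: every parameter is distinct.
        a_enc = make_matrix(hidden_dim, in_dim, rng)
        a_proj = make_matrix(out_dim, hidden_dim, rng)
    elif kind == "ade_shared_projection":
        # Asymmetric encoders, but one projection layer shared by both towers.
        a_enc = make_matrix(hidden_dim, in_dim, rng)
        a_proj = q_proj
    else:
        raise ValueError(f"unknown dual encoder kind: {kind}")
    return Tower(q_enc, q_proj), Tower(a_enc, a_proj)


def score(question_tower, answer_tower, question, answer):
    """Cosine similarity between the question and answer embeddings."""
    q = question_tower.embed(question)
    a = answer_tower.embed(answer)
    return sum(x * y for x, y in zip(q, a))
```

In this sketch, sharing is expressed by both towers holding a reference to the same weight matrix, so any update to the shared projection would affect both towers at once. In a real system the encoders would be trained neural networks rather than fixed random matrices.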