Paper title
Unsupervised Learning of KB Queries in Task-Oriented Dialogs
Paper authors
Abstract
Task-oriented dialog (TOD) systems often need to formulate knowledge base (KB) queries corresponding to the user intent and use the query results to generate system responses. Existing approaches require dialog datasets to explicitly annotate these KB queries, and these annotations can be time-consuming and expensive. In response, we define the novel problems of predicting the KB query and training the dialog agent without explicit KB query annotation. For query prediction, we propose a reinforcement learning (RL) baseline, which rewards the generation of those queries whose KB results cover the entities mentioned in the subsequent dialog. Further analysis reveals that correlation among query attributes in the KB can significantly confuse memory-augmented policy optimization (MAPO), an existing state-of-the-art RL agent. To address this, we improve the MAPO baseline with simple but important modifications suited to our task. To train the full TOD system for our setting, we propose a pipelined approach: it independently predicts when to make a KB query (query position predictor), then predicts a KB query at the predicted position (query predictor), and uses the results of the predicted query in the subsequent dialog (next response predictor). Overall, our work proposes the first solutions to our novel problem, and our analysis highlights the research challenges in training TOD systems without query annotation.
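The reward idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact reward, so everything here is an assumption made for illustration: the KB is a list of attribute dictionaries, a query is a set of attribute-value constraints, and the reward is the fraction of later-mentioned entities found in the query results. The names `run_query` and `coverage_reward` are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set

Row = Dict[str, str]


def run_query(kb: List[Row], query: Dict[str, str]) -> List[Row]:
    """Return the KB rows that satisfy every attribute=value constraint."""
    return [row for row in kb if all(row.get(k) == v for k, v in query.items())]


def coverage_reward(kb: List[Row], query: Dict[str, str],
                    later_entities: Set[str]) -> float:
    """Assumed reward: share of later-mentioned entities present in the results."""
    if not later_entities:
        return 0.0
    result_values = {value for row in run_query(kb, query) for value in row.values()}
    return len(later_entities & result_values) / len(later_entities)


# Toy KB (illustrative data only): cuisine and area are perfectly correlated.
kb = [
    {"name": "Roma", "cuisine": "italian", "area": "north"},
    {"name": "Napoli", "cuisine": "italian", "area": "north"},
    {"name": "Tokyo", "cuisine": "japanese", "area": "south"},
]
later_entities = {"Roma", "Napoli"}  # entities mentioned after the query point

r_cuisine = coverage_reward(kb, {"cuisine": "italian"}, later_entities)
r_area = coverage_reward(kb, {"area": "north"}, later_entities)
r_wrong = coverage_reward(kb, {"cuisine": "japanese"}, later_entities)
```

In this toy KB, the queries on `cuisine` and on `area` return the same rows and so receive the same reward, even if the user only ever asked about cuisine. This is one plausible reading of how correlated query attributes could mislead a reward-driven agent; the paper's actual analysis and its MAPO modifications may differ.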