Paper title
The ABBE Corpus: Animate Beings Being Emotional
Paper authors
Paper abstract
Emotion detection is an established NLP task of demonstrated utility for text understanding. However, basic emotion detection leaves out key information, namely, who is experiencing the emotion in question. For example, it may be the author, the narrator, or a character; or the emotion may correspond to something the audience is supposed to feel, or even be unattributable to a specific being, e.g., when emotions are being discussed per se. We provide the ABBE corpus -- Animate Beings Being Emotional -- a new double-annotated corpus of texts that captures this key information for one class of emotion experiencer, namely, animate beings in the world described by the text. Such a corpus is useful for developing systems that seek to model or understand this specific type of expressed emotion. Our corpus contains 30 chapters, comprising 134,513 words, drawn from the Corpus of English Novels, and contains 2,010 unique emotion expressions attributable to 2,227 animate beings. The emotion expressions are categorized according to Plutchik's 8-category emotion model, and the overall inter-annotator agreement for the annotations was 0.83 Cohen's Kappa, indicating excellent agreement. We describe in detail our annotation scheme and procedure, and also release the corpus for use by other researchers.
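The abstract reports inter-annotator agreement as Cohen's Kappa. As a minimal sketch of how that statistic is computed for two annotators who each assign one category to the same items, the following Python can be used. The function name `cohens_kappa`, the label strings, and the toy label lists are illustrative assumptions, not the authors' code or data, and the abstract does not say how annotated spans were aligned before agreement was measured.

```python
from collections import Counter

# Plutchik's eight basic emotion categories (label spellings are assumed here).
PLUTCHIK = ["joy", "trust", "fear", "surprise",
            "sadness", "disgust", "anger", "anticipation"]


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    own label distribution.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")

    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over categories of the product of marginals.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    # Degenerate case: both annotators used a single identical label.
    # Returning 1.0 here is a convention chosen for this sketch.
    if expected == 1.0:
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical toy annotations for four emotion expressions.
annotator_1 = ["joy", "joy", "fear", "fear"]
annotator_2 = ["joy", "joy", "fear", "joy"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy case the annotators agree on 3 of 4 items (observed agreement 0.75) and chance agreement is 0.5, so kappa is 0.5. Values above about 0.8, such as the 0.83 reported in the abstract, are conventionally read as very strong agreement.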