Paper title
What Can We Learn From Almost a Decade of Food Tweets
Paper authors
Paper abstract
We present the Latvian Twitter Eater Corpus, a set of tweets in the narrow domain of food, drinks, eating and drinking. The corpus has been collected over a time span of more than 8 years and includes over 2 million tweets accompanied by additional useful data. We also separate out two sub-corpora: question and answer tweets, and sentiment-annotated tweets. We analyse the contents of the corpus and demonstrate use cases for the sub-corpora by training domain-specific question-answering and sentiment-analysis models on data from the corpus.