On Identifying Hashtags in Disaster Twitter Data

Paper Title

On Identifying Hashtags in Disaster Twitter Data

Authors

Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea

Abstract

Tweet hashtags have the potential to improve the search for information during disaster events. However, a large number of disaster-related tweets do not have any user-provided hashtags. Moreover, only a small number of tweets that contain actionable hashtags are useful for disaster response. To facilitate progress on automatic identification (or extraction) of disaster hashtags for Twitter data, we construct a unique dataset of disaster-related tweets annotated with hashtags useful for filtering actionable information. Using this dataset, we further investigate Long Short-Term Memory (LSTM)-based models within a Multi-Task Learning framework. The best performing model achieves an F1-score as high as 92.22%. The dataset, code, and other resources are available on GitHub.
