Paper title
Automated Fake News Detection using cross-checking with reliable sources
Paper authors
Paper abstract
Over the past decade, fake news and misinformation have become a major problem affecting many aspects of our lives, including politics and public health. Inspired by natural human behavior, which is to cross-check new information against reliable sources, we present an approach that automates the detection of fake news. We use Natural Language Processing (NLP) and build a machine learning (ML) model that automates this cross-checking against a set of predefined reliable sources. We implement the approach for Twitter and build a model that flags fake tweets. Specifically, for a given tweet, we use its text to find relevant news from reliable news agencies. We then train a Random Forest model that checks whether the textual content of the tweet is aligned with the trusted news. If it is not, the tweet is classified as fake. The approach applies to any kind of information and is not limited to a specific news story or category of information. Our implementation achieves $70\%$ accuracy, which outperforms other generic fake-news classification models. These results pave the way towards a more sensible and natural approach to fake news detection.
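The cross-checking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how relevant news is retrieved or which features the Random Forest uses. The sketch assumes a bag-of-words cosine similarity for both retrieval and alignment, and it replaces the trained Random Forest with a fixed similarity threshold as a stand-in. All function names (`tokenize`, `most_relevant`, `flag_tweet`) and the threshold value are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_relevant(tweet, trusted_articles):
    """Return (article, similarity) for the trusted article closest to the tweet.

    Assumes trusted_articles is a non-empty list of strings.
    """
    tweet_vec = Counter(tokenize(tweet))
    scored = [
        (cosine_similarity(tweet_vec, Counter(tokenize(article))), article)
        for article in trusted_articles
    ]
    best_score, best_article = max(scored, key=lambda pair: pair[0])
    return best_article, best_score


def flag_tweet(tweet, trusted_articles, threshold=0.3):
    """Flag the tweet as fake when no trusted article is similar enough.

    Assumption: the fixed threshold stands in for the trained Random Forest
    described in the abstract; 0.3 is an arbitrary illustrative value.
    """
    _, score = most_relevant(tweet, trusted_articles)
    return score < threshold


if __name__ == "__main__":
    trusted = [
        "The city council approved the new budget for public schools on Monday.",
        "Health officials report that the flu vaccine is safe and widely available.",
    ]
    for tweet in [
        "City council approved the new budget for public schools",
        "Aliens landed downtown and stole every bicycle",
    ]:
        print(tweet, "->", "fake" if flag_tweet(tweet, trusted) else "consistent")
```

Word overlap alone cannot tell a tweet that agrees with a trusted article from one that contradicts it in similar words, which is one reason a learned classifier over richer alignment features, such as the Random Forest named in the abstract, would be needed in practice.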