Paper Title

Content analysis of Persian/Farsi Tweets during COVID-19 pandemic in Iran using NLP

Authors

Hosseini, Pedram; Hosseini, Poorya; Broniatowski, David A.

Abstract

Iran, along with China, South Korea, and Italy, was among the countries hit hard in the first wave of the COVID-19 spread. Twitter is one of the online platforms widely used by Iranians, both inside the country and abroad, for sharing their opinions, thoughts, and feelings about a wide range of issues. In this study, using more than 530,000 original tweets in Persian/Farsi on COVID-19, we analyzed the topics discussed among users, who are mainly Iranians, to gauge and track the response to the pandemic and how it evolved over time. We applied a combination of manual annotation of a random sample of tweets and topic modeling tools to classify the content and frequency of each category of topics. We identified the top 25 topics, among which the experience of living under home quarantine emerged as a major talking point. We additionally categorized the broader content of tweets, which shows that satire, followed by news, is the dominant tweet type among Iranian users. While this framework and methodology can be used to track public response to ongoing developments related to COVID-19, a generalization of it can also be useful for gauging Iranian public reaction to ongoing policy measures or events, both locally and internationally.
