Paper Title
ArabGend: Gender Analysis and Inference on Arabic Twitter
Paper Authors
Paper Abstract
Gender analysis of Twitter can reveal important socio-cultural differences between male and female users. Significant effort has gone into analyzing and automatically inferring gender from content in most widely spoken languages; however, to our knowledge, very limited work has been done for Arabic. In this paper, we perform an extensive analysis of differences between male and female users on the Arabic Twitter-sphere. We study differences in user engagement, topics of interest, and the gender gap in professions. Along with gender analysis, we also propose a method to infer gender by utilizing usernames, profile pictures, tweets, and networks of friends. To do so, we manually annotated gender and locations for ~166K Twitter accounts associated with ~92K user locations, which we plan to make publicly available at http://anonymous.com. Our proposed gender inference method achieves an F1 score of 82.1%, which is 47.3% higher than the majority baseline. In addition, we developed a demo and made it publicly available.