Paper Title

Towards Finite-State Morphology of Kurdish

Authors

Ahmadi, Sina; Hassani, Hossein

Abstract

Morphological analysis is the study of the formation and structure of words. It plays a crucial role in various tasks in Natural Language Processing (NLP) and Computational Linguistics (CL), such as machine translation and text and speech generation. Kurdish is a less-resourced, multi-dialect Indo-European language with highly inflectional morphology. In this paper, as the first attempt of its kind, the morphology of the Kurdish language (Sorani dialect) is described from a computational point of view. We extract morphological rules, which are transformed into finite-state transducers for generating and analyzing words. The results of this research support studies on language generation for Kurdish and improve Information Retrieval (IR) for the language, while bringing Kurdish NLP and CL to a more advanced computational level.
