Paper Title
Generating Sense Embeddings for Syntactic and Semantic Analogy for Portuguese
Paper Authors
Paper Abstract
Word embeddings are numerical vectors that represent words or concepts in a low-dimensional continuous space and capture useful syntactic and semantic information. Traditional approaches such as Word2Vec, GloVe, and FastText share a significant drawback: they produce a single vector per word, ignoring the fact that ambiguous words can take on different meanings. In this paper, we apply techniques for generating sense embeddings and present the first such experiments carried out for Portuguese. Our experiments show that sense vectors outperform traditional word vectors on syntactic and semantic analogy tasks, indicating that the language resource generated here can improve the performance of NLP tasks in Portuguese.
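To make the analogy task concrete, the following is a minimal sketch of how an analogy such as "homem is to rei as mulher is to ?" is commonly solved with the vector-offset method, here over a sense-level vocabulary. Everything in it is an assumption made for illustration: the 3-dimensional vectors are invented toy values, the `word#sense` key format is hypothetical, and the helper names `cosine` and `solve_analogy` are not taken from the paper. It does not reproduce the authors' models or evaluation.

```python
import numpy as np

# Toy 3-dimensional vectors, invented for illustration only (not from the paper).
# Keys use a hypothetical "word#sense" label so one word can own several vectors.
sense_vectors = {
    "rei#1": np.array([0.9, 0.8, 0.1]),     # king
    "rainha#1": np.array([0.9, 0.1, 0.8]),  # queen
    "homem#1": np.array([0.1, 0.9, 0.1]),   # man
    "mulher#1": np.array([0.1, 0.1, 0.9]),  # woman
    "banco#1": np.array([0.5, 0.5, 0.0]),   # "banco" as a financial bank
    "banco#2": np.array([0.0, 0.4, 0.4]),   # "banco" as a bench
}


def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def solve_analogy(a, b, c, vectors):
    """Answer 'a is to b as c is to ?' with the vector-offset method.

    Returns the key whose vector is most similar to b - a + c,
    excluding the three query keys themselves.
    """
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = {k: v for k, v in vectors.items() if k not in (a, b, c)}
    return max(candidates, key=lambda k: cosine(candidates[k], target))


answer = solve_analogy("homem#1", "rei#1", "mulher#1", sense_vectors)
print(answer)
```

With these toy values the query resolves to `rainha#1`. The two `banco` entries show the point made in the abstract: a sense-level vocabulary can store one vector per meaning of an ambiguous word, whereas a conventional word-level model would hold a single `banco` vector.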