Paper title
Neural Speech Synthesis for Estonian
Paper authors
Paper abstract
This technical report describes the results of a collaboration between the NLP research group at the University of Tartu and the Institute of Estonian Language on improving neural speech synthesis for Estonian. The report (written in Estonian) describes the project results, which are summarized as follows: (1) Speech synthesis data from 6 speakers, totaling 92.4 hours, were collected and openly released (CC-BY-4.0). The data are available at https://konekorpus.tartunlp.ai and https://www.eki.ee/litsents/. (2) Software and models for neural speech synthesis are released as open source (MIT license) at https://koodivaramu.eesti.ee/tartunlp/text-to-speech. (3) We evaluated the new models and compared them to other existing solutions (HMM-based HTS models from EKI, http://www.eki.ee/heli/, and Google's speech synthesis for Estonian, accessed via https://translate.google.com). The evaluation includes voice acceptability MOS scores for sentence-level and longer excerpts, a detailed error analysis, and an evaluation of the pre-processing module.