Paper title
ItôWave: Itô Stochastic Differential Equation Is All You Need For Wave Generation
Paper authors
Paper abstract
In this paper, we propose a vocoder based on a pair of forward and reverse-time linear stochastic differential equations (SDEs). The solutions of this SDE pair are two stochastic processes. The first turns the distribution of the wave that we want to generate into a simple and tractable distribution. The second is the generation procedure, which turns this simple, tractable signal back into the target wave. The model is called ItôWave. ItôWave uses the Wiener process as a driver to gradually subtract the excess signal from the noise signal, generating realistic and meaningful audio conditioned on the original mel spectrogram. Experiments show that the mean opinion score (MOS) of ItôWave can exceed that of current state-of-the-art (SOTA) methods, reaching $4.35 \pm 0.115$. The generated audio samples are available online.
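To make the forward/reverse SDE pair concrete, here is a minimal toy sketch rather than the paper's implementation. The abstract says only that the SDEs are linear, so the specific form is an assumption: a forward SDE $dx = -\tfrac{1}{2}\beta x\,dt + \sqrt{\beta}\,dW$ and its reverse-time counterpart $dx = [-\tfrac{1}{2}\beta x - \beta \nabla_x \log p_t(x)]\,dt + \sqrt{\beta}\,d\bar{W}$. The "data" is a 1-D Gaussian, so the score $\nabla_x \log p_t(x)$ is available in closed form and stands in for the trained, mel-conditioned network that a real vocoder would need. All constants and function names below are hypothetical.

```python
import numpy as np

# Assumed constants for this toy sketch; none of them come from the paper.
BETA, T, N_STEPS = 10.0, 1.0, 1000
MU, SIGMA = 2.0, 0.5            # toy 1-D "data" distribution N(MU, SIGMA^2)
DT = T / N_STEPS

def drift(x):
    """Linear drift f(x) = -0.5 * BETA * x of the assumed forward SDE."""
    return -0.5 * BETA * x

def score(x, t):
    """Closed-form score of p_t for Gaussian data (stand-in for a trained network)."""
    mean_t = MU * np.exp(-0.5 * BETA * t)
    var_t = SIGMA**2 * np.exp(-BETA * t) + 1.0 - np.exp(-BETA * t)
    return -(x - mean_t) / var_t

def forward(x0, rng):
    """Euler-Maruyama integration of the forward SDE from t=0 to t=T (data -> noise)."""
    x = x0.copy()
    for _ in range(N_STEPS):
        x = x + drift(x) * DT + np.sqrt(BETA * DT) * rng.standard_normal(x.shape)
    return x

def reverse(x_T, rng):
    """Euler-Maruyama integration of the reverse-time SDE from t=T to t=0 (noise -> data)."""
    x = x_T.copy()
    for i in range(N_STEPS, 0, -1):
        t = i * DT
        rev_drift = drift(x) - BETA * score(x, t)   # f - g^2 * score, with g^2 = BETA
        x = x - rev_drift * DT + np.sqrt(BETA * DT) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(0)
data = MU + SIGMA * rng.standard_normal(20_000)
noised = forward(data, rng)                            # should look like N(0, 1)
generated = reverse(rng.standard_normal(20_000), rng)  # should look like N(MU, SIGMA^2)

print("noised:    mean=%.3f std=%.3f" % (noised.mean(), noised.std()))
print("generated: mean=%.3f std=%.3f" % (generated.mean(), generated.std()))
```

In this sketch the forward pass drives the samples toward a standard normal, and the reverse pass, started from pure noise, recovers samples whose mean and standard deviation are close to those of the toy data. In an actual vocoder the analytic `score` would be replaced by a learned function of the noisy waveform, the time, and the mel spectrogram.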