Paper Title

Does human speech follow Benford's Law?

Authors

Hsu, Leo; Berisha, Visar

Abstract

Researchers have observed that the frequencies of leading digits in many man-made and naturally occurring datasets follow a logarithmic curve: numbers that start with the digit 1 account for $\sim 30\%$ of all numbers in the dataset, while numbers that start with the digit 9 account for $\sim 5\%$. This phenomenon, known as Benford's Law, is highly repeatable and appears in lists of numbers from electricity bills, stock prices, tax returns, house prices, death rates, lengths of rivers, and naturally occurring images. In this paper, we demonstrate that human speech spectra also follow Benford's Law on average. That is, when averaged over many speakers, the frequencies of leading digits in speech magnitude spectra follow this distribution, although with some variability at the level of individual samples. We use this observation to motivate a new set of features that can be efficiently extracted from speech, and we demonstrate that these features can be used to distinguish human speech from synthetic speech.
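To make the leading-digit idea concrete, here is a minimal Python sketch, not the authors' implementation. It computes the expected Benford probabilities, $P(d) = \log_{10}(1 + 1/d)$ for $d = 1, \dots, 9$, and compares them with the observed leading-digit frequencies of a list of numbers. The function names are hypothetical, and the powers of two are stand-in data chosen only because they are known to follow Benford's Law closely. In the setting of the abstract, the input would instead be values from speech magnitude spectra, which this sketch does not compute.

```python
import math
from collections import Counter


def benford_expected():
    """Expected leading-digit probabilities under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Return the first significant digit (1-9) of a nonzero finite number."""
    x = abs(x)
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("leading digit is undefined for 0, inf, and NaN")
    # In scientific notation the first character is the leading digit.
    return int(f"{x:.16e}"[0])


def leading_digit_frequencies(values):
    """Observed relative frequency of each leading digit, skipping zeros."""
    digits = [leading_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    # Stand-in data (an assumption for illustration): the first 100 powers of 2.
    data = [2 ** n for n in range(100)]
    expected = benford_expected()
    observed = leading_digit_frequencies(data)
    for d in range(1, 10):
        print(f"digit {d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

The nine observed frequencies form a small fixed-length vector per input, which is one plausible way such a distribution could serve as a feature; the specific features used in the paper are not described in the abstract.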
