Paper Title

Increasing the Inference and Learning Speed of Tsetlin Machines with Clause Indexing

Authors

Saeed Rahimi Gorji, Ole-Christoffer Granmo, Sondre Glimsdal, Jonathan Edwards, Morten Goodwin

Abstract

The Tsetlin Machine (TM) is a machine learning algorithm founded on the classical Tsetlin Automaton (TA) and game theory. It further leverages frequent pattern mining and resource allocation principles to extract common patterns in the data, rather than relying on minimizing output error, which is prone to overfitting. Unlike the intertwined nature of pattern representation in neural networks, a TM decomposes problems into self-contained patterns, represented as conjunctive clauses. The clause outputs, in turn, are combined into a classification decision through summation and thresholding, akin to a logistic regression function, but with binary weights and a unit step output function. In this paper, we exploit this hierarchical structure by introducing a novel algorithm that avoids evaluating the clauses exhaustively. Instead, we use a simple look-up table that indexes the clauses on the features that falsify them. In this manner, we can quickly evaluate a large number of clauses through falsification, simply by iterating through the features and using the look-up table to eliminate those clauses that are falsified. The look-up table is further structured so that it facilitates constant-time updating, thus supporting its use during learning as well. We report up to 15 times faster classification and three times faster learning on MNIST and Fashion-MNIST image classification, and IMDb sentiment analysis.
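The falsification idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ClauseIndex`, its methods, and the literal numbering (literal `k` is feature `x_k`, literal `k + num_features` is its negation) are assumptions made only for this example. The sketch keeps, for each literal, the list of clauses that include it, so that a false literal immediately identifies the clauses it falsifies. A position map lets a clause be removed from a list by swapping with the last entry, which is one plausible way to obtain constant-time updates.

```python
class ClauseIndex:
    """Hypothetical look-up table: for each literal, the clauses that include it."""

    def __init__(self, num_features, num_clauses):
        self.num_features = num_features
        self.num_clauses = num_clauses
        # Assumed numbering: literal k is x_k, literal k + num_features is NOT x_k.
        self.clauses_with = [[] for _ in range(2 * num_features)]
        # position[(clause, literal)] is the slot of clause in clauses_with[literal].
        self.position = {}

    def include(self, clause, literal):
        """Add a literal to a clause in O(1) by appending to the literal's list."""
        if (clause, literal) in self.position:
            return
        self.position[(clause, literal)] = len(self.clauses_with[literal])
        self.clauses_with[literal].append(clause)

    def exclude(self, clause, literal):
        """Remove a literal from a clause in O(1) by swapping with the last entry."""
        slot = self.position.pop((clause, literal), None)
        if slot is None:
            return
        bucket = self.clauses_with[literal]
        last = bucket.pop()
        if last != clause:
            # Move the former last entry into the freed slot.
            bucket[slot] = last
            self.position[(last, literal)] = slot

    def evaluate(self, x):
        """Return clause outputs for a binary input x, using falsification only."""
        # Every clause starts as true; a clause with no literals stays true here,
        # which is a simplification of this sketch.
        output = [1] * self.num_clauses
        for k, value in enumerate(x):
            # Exactly one of the two literals of feature k is false for this input.
            false_literal = k if value == 0 else k + self.num_features
            for clause in self.clauses_with[false_literal]:
                output[clause] = 0
        return output


index = ClauseIndex(num_features=2, num_clauses=3)
index.include(0, 0)  # clause 0 includes x_0
index.include(0, 3)  # clause 0 includes NOT x_1
index.include(1, 1)  # clause 1 includes x_1
index.include(2, 2)  # clause 2 includes NOT x_0
outputs = index.evaluate([1, 0])
```

With this layout, evaluation cost depends on the lengths of the lists visited for the false literals instead of on the total number of clause-literal pairs, which is the kind of saving the abstract describes.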
