论文标题

用标签和种子查询积极学习分类器

Active Learning of Classifiers with Label and Seed Queries

论文作者

Bressan, Marco, Cesa-Bianchi, Nicolò, Lattanzi, Silvio, Paudice, Andrea, Thiessen, Maximilian

论文摘要

我们研究了具有边缘的二进制和多类分类器的精确积极学习。给定一个$ n $ - 点集$ x \ subset \ mathbb {r}^m $,我们想在$ x $上学习任何未知分类器,其类具有有限的强凸额赫尔边距,这是一个扩大了SVM保证金的新概念。在仅允许标签查询的标准主动学习设置中,学习具有强凸壳保证金$γ$的分类器需要在最坏的情况下$ω\ big(1+ \ frac {1}γ\ big)^{(M-1)/2} $查询。另一方面,使用更强大的种子查询(一种等价查询的变体),可以通过littlestone的一半算法在$ o(m \ log n)$查询中学习目标分类器;但是,减半在计算上效率低下。在这项工作中,我们表明,通过仔细将两种类型的查询组合在一起,可以仅使用$ o(m^2 \ log n)$ colies和$ o \ big(m \ log \ log \ frac \ frac {m} frac {m} frac {m}γ\ big)$ seed queries queries;结果以$ k!k^2 $乘法开销的价格扩展到$ k $ class分类器。当输入点的位复杂性界定时,或者只有一个类具有强凸形船体边缘时,相似的结果就会成立。我们通过证明在最坏的情况下,任何算法都需要$ω\ big(k m \ log \ frac {1}γ\ big)$ seed和标签查询来学习$ k $ - 类别的分类器,具有强凸赫尔率$γ$。

We study exact active learning of binary and multiclass classifiers with margin. Given an $n$-point set $X \subset \mathbb{R}^m$, we want to learn any unknown classifier on $X$ whose classes have finite strong convex hull margin, a new notion extending the SVM margin. In the standard active learning setting, where only label queries are allowed, learning a classifier with strong convex hull margin $γ$ requires in the worst case $Ω\big(1+\frac{1}γ\big)^{(m-1)/2}$ queries. On the other hand, using the more powerful seed queries (a variant of equivalence queries), the target classifier could be learned in $O(m \log n)$ queries via Littlestone's Halving algorithm; however, Halving is computationally inefficient. In this work we show that, by carefully combining the two types of queries, a binary classifier can be learned in time $\operatorname{poly}(n+m)$ using only $O(m^2 \log n)$ label queries and $O\big(m \log \frac{m}γ\big)$ seed queries; the result extends to $k$-class classifiers at the price of a $k!k^2$ multiplicative overhead. Similar results hold when the input points have bounded bit complexity, or when only one class has strong convex hull margin against the rest. We complement the upper bounds by showing that in the worst case any algorithm needs $Ω\big(k m \log \frac{1}γ\big)$ seed and label queries to learn a $k$-class classifier with strong convex hull margin $γ$.

扫码加入交流群

加入微信交流群

微信交流群二维码

扫码加入学术交流群,获取更多资源