Paper Title
Ask-n-Learn: Active Learning via Reliable Gradient Representations for Image Classification
Authors
Abstract
Deep predictive models rely on human supervision in the form of labeled training data. Obtaining large amounts of annotated training data can be expensive and time-consuming, and this becomes a critical bottleneck when building such models in practice. In such scenarios, active learning (AL) strategies are used to achieve faster convergence in terms of labeling effort. Existing active learning methods employ a variety of heuristics based on uncertainty and diversity to select query samples. Despite their widespread use, their performance in practice is limited by a number of factors, including non-calibrated uncertainties, an insufficient trade-off between data exploration and exploitation, and the presence of confirmation bias. To address these challenges, we propose Ask-n-Learn, an active learning approach based on gradient embeddings obtained using the pseudo-labels estimated in each iteration of the algorithm. More importantly, we advocate the use of prediction calibration to obtain reliable gradient embeddings, and propose a data augmentation strategy to alleviate the effects of confirmation bias during pseudo-labeling. Through empirical studies on benchmark image classification tasks (CIFAR-10, SVHN, Fashion-MNIST, MNIST), we demonstrate significant improvements over state-of-the-art baselines, including the recently proposed BADGE algorithm.
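To make the idea of a pseudo-label gradient embedding concrete, the following is a minimal sketch, not the authors' implementation. It assumes a linear last layer trained with cross-entropy, in which case the gradient with respect to the last-layer weights is the outer product of (predicted probabilities minus the one-hot pseudo-label) and the penultimate features. The abstract does not say which calibration method is used, so temperature scaling stands in for it here as an assumption; the names `softmax`, `gradient_embedding`, and `temperature` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax over a list of logits, with optional temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def gradient_embedding(logits, features, temperature=1.0):
    """Flattened last-layer gradient embedding for one unlabeled sample.

    Assumption: temperature scaling is an illustrative stand-in for the
    prediction calibration mentioned in the abstract.
    """
    probs = softmax(logits, temperature)
    # The pseudo-label is the most probable class under the current model.
    pseudo_label = max(range(len(probs)), key=probs.__getitem__)
    # Residual between the probabilities and the one-hot pseudo-label.
    # For temperature == 1 this is exactly the cross-entropy gradient
    # with respect to the logits.
    residual = [p - (1.0 if k == pseudo_label else 0.0)
                for k, p in enumerate(probs)]
    # Outer product of residual and features, flattened class by class.
    return [r * h for r in residual for h in features]


def norm(vector):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(v * v for v in vector))
```

Under these assumptions, a confidently predicted sample yields a short embedding and an uncertain one a long embedding, so the embedding norm acts as an uncertainty signal while its direction carries diversity information. Over-confident (poorly calibrated) probabilities shrink the embedding, which gives one intuition for why calibration matters for this kind of representation.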