Paper Title
Adversarial Auto-Augment with Label Preservation: A Representation Learning Principle Guided Approach
Paper Authors
Paper Abstract
Data augmentation is a critical contributing factor to the success of deep learning, but it relies heavily on prior domain knowledge, which is not always available. Recent works on automatic data augmentation learn a policy to form a sequence of augmentation operations, which are still pre-defined and restricted to limited options. In this paper, we show that the objective of prior-free autonomous data augmentation can be derived from a representation learning principle that aims to preserve the minimum sufficient information of the labels. Given an example, the objective aims to create a distant "hard positive example" as the augmentation while still preserving the original label. We then propose a practical surrogate to the objective that can be optimized efficiently and integrated seamlessly into existing methods for a broad class of machine learning tasks, e.g., supervised, semi-supervised, and noisy-label learning. Unlike previous works, our method does not require training an extra generative model but instead leverages the intermediate-layer representations of the end-task model to generate data augmentations. In experiments, we show that our method consistently brings non-trivial improvements, in both efficiency and final performance, to the three aforementioned learning tasks, whether or not it is combined with strong pre-defined augmentations, e.g., on medical images, where domain knowledge is unavailable and existing augmentation techniques perform poorly. Code is available at: https://github.com/kai-wen-yang/LPA3.
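The core idea described in the abstract, adversarially perturbing the end-task model's intermediate-layer representation to create a harder "positive" example while rejecting perturbations that would flip the label, can be sketched in a toy form. Everything below (the two-layer linear model, the normalized gradient-ascent step, the argmax-based rejection rule) is a hypothetical illustration for intuition only, not the paper's actual LPA3 algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "end-task model": encoder f(x) = W1 @ x, classifier head g(h) = softmax(W2 @ h).
W1 = rng.normal(size=(8, 4))
W2 = rng.normal(size=(3, 8))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def head_loss_grad(h, y):
    """Cross-entropy loss of the classifier head at hidden h, and its gradient w.r.t. h."""
    p = softmax(W2 @ h)
    loss = -np.log(p[y])
    grad = W2.T @ (p - np.eye(3)[y])  # d(loss)/d(h)
    return loss, grad

def hard_positive(h, y, step=0.1, n_steps=5):
    """Push the hidden representation up the loss surface (a harder 'positive'),
    rejecting any step whose predicted label no longer matches y (label preservation)."""
    h_adv = h.copy()
    for _ in range(n_steps):
        _, g = head_loss_grad(h_adv, y)
        candidate = h_adv + step * g / (np.linalg.norm(g) + 1e-12)
        if softmax(W2 @ candidate).argmax() == y:  # label still preserved?
            h_adv = candidate
        else:
            break
    return h_adv

x = rng.normal(size=4)
h = W1 @ x
y = int(softmax(W2 @ h).argmax())  # use the model's own prediction as the label
h_adv = hard_positive(h, y)
l0, _ = head_loss_grad(h, y)
l1, _ = head_loss_grad(h_adv, y)
print(l1 >= l0, int(softmax(W2 @ h_adv).argmax()) == y)
```

In this sketch the augmentation lives in representation space rather than input space, matching the abstract's point that no extra generative model is needed: the gradient of the end-task head itself supplies the perturbation direction.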