Paper Title

Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings

Paper Authors

Mayee F. Chen, Daniel Y. Fu, Frederic Sala, Sen Wu, Ravi Teja Mullapudi, Fait Poms, Kayvon Fatahalian, Christopher Ré

Paper Abstract

Our goal is to enable machine learning systems to be trained interactively. This requires models that perform well and train quickly, without large amounts of hand-labeled data. We take a step forward in this direction by borrowing from weak supervision (WS), wherein models can be trained with noisy sources of signal instead of hand-labeled data. But WS relies on training downstream deep networks to extrapolate to unseen data points, which can take hours or days. Pre-trained embeddings can remove this requirement. We do not use the embeddings as features as in transfer learning (TL), which requires fine-tuning for high performance, but instead use them to define a distance function on the data and extend WS source votes to nearby points. Theoretically, we provide a series of results studying how performance scales with changes in source coverage, source accuracy, and the Lipschitzness of label distributions in the embedding space, and compare this rate to standard WS without extension and TL without fine-tuning. On six benchmark NLP and video tasks, our method outperforms WS without extension by 4.1 points, TL without fine-tuning by 12.8 points, and traditionally-supervised deep networks by 13.1 points, and comes within 0.7 points of state-of-the-art weakly-supervised deep networks, all while training in less than half a second.
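The core mechanism the abstract describes, extending each weak source's votes to nearby points under an embedding-space distance, can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-neighbor extension rule, the Euclidean distance, the radius threshold, and the name extend_source_votes are all assumptions made for illustration.

import numpy as np

def extend_source_votes(embeddings, votes, radius):
    """Propagate one weak source's votes to nearby abstaining points.

    embeddings : (n, d) array of pre-trained embeddings, one row per point.
    votes      : (n,) array of the source's votes in {-1, 0, +1}; 0 = abstain.
    radius     : distance threshold for extension (hyperparameter, assumed).

    Returns an extended vote vector: each abstaining point inherits the vote
    of its nearest covered neighbor if that neighbor lies within `radius`.
    """
    covered = np.flatnonzero(votes != 0)
    extended = votes.copy()
    if covered.size == 0:
        return extended  # source never fires; nothing to extend
    for i in np.flatnonzero(votes == 0):
        # Euclidean distance from point i to every point the source covers.
        dists = np.linalg.norm(embeddings[covered] - embeddings[i], axis=1)
        j = dists.argmin()
        if dists[j] <= radius:
            extended[i] = votes[covered[j]]
    return extended

A toy usage: five 2-D points, with one source voting on two of them. The two points near a covered point inherit its vote; the far-away point remains uncovered.

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0], [9.0, 9.0]])
v = np.array([1, 0, -1, 0, 0])
print(extend_source_votes(X, v, radius=1.0))  # -> [ 1  1 -1 -1  0]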
