Paper Title
FastBERT: a Self-distilling BERT with Adaptive Inference Time
Paper Authors
Paper Abstract
Pre-trained language models like BERT have proven to be highly performant. However, they are often computationally expensive in many practical scenarios, since such heavy models can hardly be deployed readily with limited resources. To improve their efficiency while assuring model performance, we propose FastBERT, a novel speed-tunable model with adaptive inference time. The speed at inference can be flexibly adjusted under varying demands, while redundant computation on easy samples is avoided. Moreover, the model adopts a unique self-distillation mechanism during fine-tuning, further enabling greater computational efficiency with minimal loss in performance. Our model achieves promising results on twelve English and Chinese datasets. Given different speedup thresholds that trade speed against performance, it runs from 1 to 12 times faster than BERT.
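To make the two mechanisms named in the abstract concrete, below is a minimal PyTorch sketch of FastBERT-style adaptive (early-exit) inference plus a self-distillation loss. It is illustrative only: the toy encoder, its dimensions, the `speed` parameter name, and the demo input are assumptions, not the paper's configuration; the uncertainty measure (normalized entropy) and the teacher-to-student KL objective follow the paper's general approach.

```python
# A minimal sketch of FastBERT-style early-exit inference and
# self-distillation, assuming a toy PyTorch encoder (all sizes and
# names here are illustrative, not the paper's actual architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EarlyExitEncoder(nn.Module):
    def __init__(self, hidden=128, num_layers=4, num_classes=2):
        super().__init__()
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model=hidden, nhead=4,
                                       batch_first=True)
            for _ in range(num_layers)
        ])
        # One lightweight "student" classifier per layer; the last
        # layer's classifier plays the role of the "teacher" whose
        # soft labels supervise the students during fine-tuning.
        self.students = nn.ModuleList(
            [nn.Linear(hidden, num_classes) for _ in range(num_layers)])

    @staticmethod
    def normalized_entropy(probs):
        # Uncertainty in [0, 1]: entropy divided by its maximum log(C).
        ent = -(probs * probs.clamp_min(1e-9).log()).sum(-1)
        return ent / torch.log(torch.tensor(float(probs.size(-1))))

    @torch.no_grad()
    def adaptive_forward(self, x, speed=0.5):
        # `speed` is the uncertainty threshold: a higher value lets
        # more samples exit at shallow layers, trading accuracy for
        # latency (the "speedup threshold" of the abstract).
        for i, (layer, clf) in enumerate(zip(self.layers, self.students)):
            x = layer(x)
            probs = F.softmax(clf(x[:, 0]), dim=-1)  # [CLS]-style pooling
            last = (i == len(self.layers) - 1)
            if self.normalized_entropy(probs).max() < speed or last:
                return probs, i  # also return the exit layer index

def self_distillation_loss(student_logits, teacher_logits, T=1.0):
    # KL divergence from the teacher's soft labels to a student head,
    # used at fine-tuning time (temperature T is an assumption here).
    t = F.softmax(teacher_logits / T, dim=-1)
    s = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * T * T

# Demo: a single random "sentence" of 16 token embeddings.
model = EarlyExitEncoder()
tokens = torch.randn(1, 16, 128)  # (batch, seq_len, hidden)
probs, exit_layer = model.adaptive_forward(tokens, speed=0.8)
print(f"exited at layer {exit_layer}, probs={probs}")
```

In this sketch the whole batch exits together (via `.max()` over the batch); the paper instead routes each sample independently, which is where the per-sample "redundant computation is avoided" claim comes from.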