Paper Title
Lessons Learned from Applying off-the-shelf BERT: There is no Silver Bullet
Paper Authors
Paper Abstract
One of the challenges in the NLP field is training large classification models, a task that is both difficult and tedious, and harder still when GPU hardware is unavailable. The increased availability of pre-trained, off-the-shelf word embeddings, models, and modules aims to ease the process of training large models and achieving competitive performance. We explore the use of off-the-shelf BERT models, share the results of our experiments, and compare them to those of LSTM networks and simpler baselines. We show that the complexity and computational cost of BERT are no guarantee of improved predictive performance on the classification tasks at hand.
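For context, the off-the-shelf approach the abstract describes can be set up in a few lines of code. The sketch below is illustrative only, assuming the Hugging Face transformers library, the bert-base-uncased checkpoint, and a hypothetical binary classification task; the paper does not specify its exact tooling.

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load an off-the-shelf pre-trained BERT with a classification head.
# "bert-base-uncased" and num_labels=2 are assumptions for illustration.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
model.eval()

texts = ["An example sentence to classify.", "Another input text."]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Run inference on CPU; no fine-tuning or GPU is required for this sketch.
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch_size, num_labels)

predictions = logits.argmax(dim=-1)
print(predictions)

Note that without task-specific fine-tuning, the classification head above is randomly initialized, which is one reason an off-the-shelf model may underperform simpler, well-tuned baselines.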