Paper Title
Compute, Time and Energy Characterization of Encoder-Decoder Networks with Automatic Mixed Precision Training
Paper Authors
Paper Abstract
Deep neural networks have shown great success in many diverse fields. Training these networks can take significant amounts of time, compute, and energy. As datasets grow larger and models become more complex, exploring the space of model architectures becomes prohibitively expensive. In this paper we examine the compute, energy, and time costs of training a UNet-based deep neural network for the problem of predicting short-term weather (known as precipitation nowcasting). By leveraging a combination of data-distributed and mixed-precision training, we explore the design space for this problem. We show that larger models with better performance come at only an incremental cost when appropriate optimizations are used, and that mixed-precision training yields a significant improvement in training time without sacrificing model performance. In particular, we find that a 1549% increase in the number of trainable parameters incurs a comparatively small 63.22% increase in energy usage for a UNet with 4 encoding layers.
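The abstract's central technique, automatic mixed precision (AMP), can be sketched with PyTorch's `torch.autocast` and `GradScaler`. This is a minimal illustrative sketch, not the paper's actual training code: the tiny convolutional model, random tensors, and hyperparameters below are placeholders standing in for the UNet and the nowcasting dataset.

```python
# Minimal sketch of automatic mixed precision (AMP) training in PyTorch.
# The model and data are placeholders, not the paper's UNet or dataset.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
# On GPU, autocast runs eligible ops in float16; on CPU we fall back to bfloat16.
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 1, 3, padding=1),
).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# GradScaler rescales the loss to avoid float16 gradient underflow;
# it becomes a pass-through (enabled=False) when CUDA is unavailable.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(4, 1, 32, 32, device=device)  # stand-in input batch
y = torch.randn(4, 1, 32, 32, device=device)  # stand-in target

optimizer.zero_grad()
with torch.autocast(device_type=device, dtype=amp_dtype):
    # Forward pass and loss run in reduced precision where safe.
    loss = nn.functional.mse_loss(model(x), y)
scaler.scale(loss).backward()  # backward on the scaled loss
scaler.step(optimizer)         # unscales gradients, then steps
scaler.update()                # adjusts the scale factor for the next step
print(float(loss))
```

Reduced-precision arithmetic roughly halves activation memory and, on hardware with tensor cores, substantially shortens each training step, which is the mechanism behind the time and energy savings the abstract reports.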