Paper title
Replacing Mobile Camera ISP with a Single Deep Learning Model
Authors
Abstract
As the popularity of mobile photography keeps growing, considerable effort is being invested in building complex hand-crafted camera ISP solutions. In this work, we demonstrate that even the most sophisticated ISP pipelines can be replaced with a single end-to-end deep learning model trained without any prior knowledge about the sensor and optics used in a particular device. For this, we present PyNET, a novel pyramidal CNN architecture designed for fine-grained image restoration that implicitly learns to perform all ISP steps, such as image demosaicing, denoising, white balancing, color and contrast correction, and demoireing. The model is trained to convert RAW Bayer data obtained directly from the mobile camera sensor into photos captured with a professional high-end DSLR camera, making the solution independent of any particular mobile ISP implementation. To validate the proposed approach on real data, we collected a large-scale dataset consisting of 10 thousand full-resolution RAW-RGB image pairs captured in the wild with the Huawei P20 cameraphone (12.3 MP Sony Exmor IMX380 sensor) and a Canon 5D Mark IV DSLR. The experiments demonstrate that the proposed solution can easily reach the level of the P20's embedded ISP pipeline, which, unlike our approach, combines the data from two (RGB + B/W) camera sensors. The dataset, pre-trained models and code used in this paper are available on the project website.
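The abstract does not say how the RAW Bayer data is presented to the network. As a minimal sketch of one common preprocessing step, the single-plane mosaic can be split into four half-resolution planes, one per position of the repeating 2x2 tile. The function name `pack_bayer` and the toy input are hypothetical, and this is not confirmed to be the authors' own procedure.

```python
import numpy as np


def pack_bayer(raw: np.ndarray) -> np.ndarray:
    """Split an (H, W) Bayer mosaic into an (H/2, W/2, 4) stack of planes.

    Assumption: the mosaic repeats a 2x2 tile. Which tile position holds
    R, G or B depends on the sensor layout and is not decided here.
    """
    if raw.ndim != 2 or raw.shape[0] % 2 or raw.shape[1] % 2:
        raise ValueError("expected a 2-D mosaic with even height and width")
    # Strided slicing picks every second pixel, starting at each tile position.
    return np.stack(
        [raw[0::2, 0::2], raw[0::2, 1::2], raw[1::2, 0::2], raw[1::2, 1::2]],
        axis=-1,
    )


# Toy 4x4 mosaic standing in for real sensor data.
mosaic = np.arange(16, dtype=np.uint16).reshape(4, 4)
planes = pack_bayer(mosaic)
print(planes.shape)
```

Packing keeps every sensor value and only rearranges it, so a model trained on the packed planes still has to learn demosaicing and the other ISP steps itself.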