Paper title
Generative Joint Source-Channel Coding for Semantic Image Transmission
Paper authors
Paper abstract
Recent works have shown that joint source-channel coding (JSCC) schemes using deep neural networks (DNNs), called DeepJSCC, provide promising results in wireless image transmission. However, these methods mostly focus on the distortion of the reconstructed signals with respect to the input image, rather than their perception by humans. Focusing on traditional distortion metrics alone does not necessarily result in high perceptual quality, especially under extreme physical conditions, such as very low bandwidth compression ratio (BCR) and low signal-to-noise ratio (SNR) regimes. In this work, we propose two novel JSCC schemes that leverage the perceptual quality of deep generative models (DGMs) for wireless image transmission, namely InverseJSCC and GenerativeJSCC. While the former is an inverse-problem approach applied to DeepJSCC, the latter is an end-to-end optimized JSCC scheme. In both, we optimize a weighted sum of mean squared error (MSE) and learned perceptual image patch similarity (LPIPS) losses, the latter of which captures semantic similarity better than other distortion metrics. InverseJSCC denoises the distorted reconstructions of a DeepJSCC model by solving an inverse optimization problem using a style-based generative adversarial network (StyleGAN). Our simulation results show that InverseJSCC significantly improves upon the state-of-the-art (SotA) DeepJSCC in terms of perceptual quality in edge cases. In GenerativeJSCC, we carry out end-to-end training of an encoder and a StyleGAN-based decoder, and show that GenerativeJSCC significantly outperforms DeepJSCC in terms of both distortion and perceptual quality.
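The combined objective mentioned in the abstract can be sketched as follows; the weighting coefficient λ and the exact form of the combination are assumptions for illustration, since the abstract states only that a weighted sum of the two losses is optimized:

```latex
\mathcal{L}(x, \hat{x}) \;=\; \lambda \,\mathrm{MSE}(x, \hat{x}) \;+\; (1 - \lambda)\,\mathrm{LPIPS}(x, \hat{x}), \qquad \lambda \in [0, 1]
```

Here $x$ is the input image, $\hat{x}$ its reconstruction, and the MSE term controls pixel-level distortion while the LPIPS term, computed from deep-feature distances, steers the reconstruction toward perceptual/semantic similarity.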