Paper Title
Image Quality Assessment: Unifying Structure and Texture Similarity
Paper Authors
Paper Abstract
Objective measures of image quality generally operate by comparing pixels of a "degraded" image to those of the original. Relative to human observers, these measures are overly sensitive to resampling of texture regions (e.g., replacing one patch of grass with another). Here, we develop the first full-reference image quality model with explicit tolerance to texture resampling. Using a convolutional neural network, we construct an injective and differentiable function that transforms images to multi-scale overcomplete representations. We demonstrate empirically that the spatial averages of the feature maps in this representation capture texture appearance, in that they provide a set of sufficient statistical constraints to synthesize a wide variety of texture patterns. We then describe an image quality method that combines correlations of these spatial averages ("texture similarity") with correlations of the feature maps ("structure similarity"). The parameters of the proposed measure are jointly optimized to match human ratings of image quality, while minimizing the reported distances between subimages cropped from the same texture images. Experiments show that the optimized method explains human perceptual scores, both on conventional image quality databases and on texture databases. The measure also offers competitive performance on related tasks such as texture classification and retrieval. Finally, we show that our method is relatively insensitive to geometric transformations (e.g., translation and dilation), without use of any specialized training or data augmentation. Code is available at https://github.com/dingkeyan93/DISTS.
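To make the combination of "texture similarity" (from spatial averages of feature maps) and "structure similarity" (from feature-map correlations) concrete, here is a minimal NumPy sketch. It assumes the feature maps have already been extracted by some multi-scale CNN representation; the per-map weights `alpha` and `beta` and the stabilizing constants `c1`, `c2` are illustrative placeholders, not the trained parameters of the authors' released model.

```python
import numpy as np

def dists_like_distance(feats_x, feats_y, alpha=0.5, beta=0.5,
                        c1=1e-6, c2=1e-6):
    """Sketch of a DISTS-style distance between two images.

    feats_x, feats_y: lists of 2-D arrays, one feature map per channel,
    assumed precomputed by a multi-scale CNN (not included here).
    alpha, beta: illustrative weights on the texture and structure terms.
    Returns a distance that is 0 when corresponding statistics match.
    """
    similarity = 0.0
    for fx, fy in zip(feats_x, feats_y):
        # Texture term: compare global spatial means (SSIM-like luminance form).
        mx, my = fx.mean(), fy.mean()
        texture = (2 * mx * my + c1) / (mx**2 + my**2 + c1)

        # Structure term: normalized cross-correlation of the feature maps.
        vx, vy = fx.var(), fy.var()
        cov = ((fx - mx) * (fy - my)).mean()
        structure = (2 * cov + c2) / (vx + vy + c2)

        similarity += alpha * texture + beta * structure
    # Average over maps; alpha + beta = 1 keeps the distance in [0, ~1].
    return 1.0 - similarity / len(feats_x)
```

Identical feature maps give a distance of 0; shifting one map's mean degrades the texture term while leaving the structure term untouched, which is the intended decomposition.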