Shape-Guided Diffusion with Inside-Outside Attention

Title

Shape-Guided Diffusion with Inside-Outside Attention

Authors

Park, Dong Huk; Luo, Grace; Toste, Clayton; Azadi, Samaneh; Liu, Xihui; Karalashvili, Maka; Rohrbach, Anna; Darrell, Trevor

Abstract

We introduce precise object silhouette as a new form of user control in text-to-image diffusion models, which we dub Shape-Guided Diffusion. Our training-free method uses an Inside-Outside Attention mechanism during the inversion and generation process to apply a shape constraint to the cross- and self-attention maps. Our mechanism designates which spatial region is the object (inside) vs. background (outside) then associates edits to the correct region. We demonstrate the efficacy of our method on the shape-guided editing task, where the model must replace an object according to a text prompt and object mask. We curate a new ShapePrompts benchmark derived from MS-COCO and achieve SOTA results in shape faithfulness without a degradation in text alignment or image realism according to both automatic metrics and annotator ratings. Our data and code will be made available at https://shape-guided-diffusion.github.io.
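
The inside/outside constraint described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the flattened-pixel layout, the per-token inside/outside labels, and the choice to mask raw scores before the softmax are all assumptions made for illustration. The idea shown is only that a pixel inside the object mask may attend to "inside" entries (object tokens, or other inside pixels), and a pixel outside the mask to "outside" entries.

```python
import math

def softmax(row):
    """Numerically stable softmax; a fully masked row returns all zeros."""
    finite = [x for x in row if x != -math.inf]
    if not finite:
        return [0.0] * len(row)
    m = max(finite)
    exps = [math.exp(x - m) if x != -math.inf else 0.0 for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def constrain_cross_attention(scores, pixel_inside, token_inside):
    """Hypothetical helper. scores[p][t] is the raw score of pixel p for token t.

    A (pixel, token) pair is kept only when both lie on the same side of the
    object mask; every other pair is masked out before normalisation.
    """
    out = []
    for p, row in enumerate(scores):
        masked = [
            s if pixel_inside[p] == token_inside[t] else -math.inf
            for t, s in enumerate(row)
        ]
        out.append(softmax(masked))
    return out

def constrain_self_attention(scores, pixel_inside):
    """Hypothetical helper. scores[p][q] is the raw score of pixel p for pixel q.

    Inside pixels attend only to inside pixels, outside only to outside.
    """
    return constrain_cross_attention(scores, pixel_inside, pixel_inside)

# Toy example: 4 flattened pixels (first two inside the object mask),
# 2 text tokens (first one describes the object, second the background).
pixel_inside = [True, True, False, False]
token_inside = [True, False]
cross = constrain_cross_attention([[0.5, 1.5]] * 4, pixel_inside, token_inside)
self_attn = constrain_self_attention([[0.0] * 4 for _ in range(4)], pixel_inside)
```

In this toy setup, inside pixels place all of their cross-attention weight on the object token, and their self-attention weight is split only among inside pixels. A real diffusion model would apply such a constraint per attention head and per layer on tensors rather than nested lists.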
