AIGC / AI QR Codes: Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation

https://arxiv.org/pdf/2403.06452


ABSTRACT

In the digital era, QR codes serve as a linchpin connecting virtual and physical realms. Their pervasive integration across various applications highlights the demand for aesthetically pleasing codes without compromised scannability. However, prevailing methods grapple with the intrinsic challenge of balancing customization and scannability. Notably, stable-diffusion models have ushered in an epoch of high-quality, customizable content generation. This paper introduces Text2QR, a pioneering approach leveraging these advancements to address a fundamental challenge: concurrently achieving user-defined aesthetics and scanning robustness. To ensure stable generation of aesthetic QR codes, we introduce the QR Aesthetic Blueprint (QAB) module, generating a blueprint image exerting control over the entire generation process. Subsequently, the Scannability Enhancing Latent Refinement (SELR) process refines the output iteratively in the latent space, enhancing scanning robustness. This approach harnesses the potent generation capabilities of stable-diffusion models, navigating the trade-off between image aesthetics and QR code scannability. Our experiments demonstrate the seamless fusion of visual appeal with the practical utility of aesthetic QR codes, markedly outperforming prior methods.

CONTRIBUTION

  • An integrated pipeline, Text2QR, that harmonizes user-defined aesthetics and robust scannability in QR code generation.
  • The introduction of the QR Aesthetic Blueprint (QAB) for creating template images and the Scannability-Enhancing Latent Refinement (SELR) process for optimizing scan robustness while maintaining aesthetics.
  • Superior performance compared to existing techniques, establishing Text2QR as a state-of-the-art solution for QR code generation that excels in both visual quality and scanning robustness.

RELATED WORKS

  • Halftone QR Codes [These QR codes rearrange the conventional black and white modules so that they form an outline matching the semantics of the input image.]
    • Hung-Kuo Chu, Chia-Sheng Chang, Ruen-Rone Lee, and Niloy J Mitra. Halftone QR Codes. ACM Transactions on Graphics (TOG), 32(6):1–8, 2013.
  • QR Image [This approach exploits the redundancy in the QR encoding rules to embed a color image into the QR code, keeping it functional while making it more visually appealing.]
    • Gonzalo J Garateguy, Gonzalo R Arce, Daniel L Lau, and Ofelia P Villarreal. QR Images: Optimized Image Embedding in QR Codes. IEEE Transactions on Image Processing, 23(7):2842–2853, 2014.
    • Mingliang Xu, Qingfeng Li, Jianwei Niu, Hao Su, Xiting Liu, Weiwei Xu, Pei Lv, Bing Zhou, and Yi Yang. ART-UP: A novel method for generating scanning-robust aesthetic QR codes. ACM Trans. Multim. Comput. Commun. Appl., 17(1):25:1–25:23, 2021.
  • Artistic QR Codes (Su et al.) [Combine QR codes with style transfer techniques to create QR codes with an artistic style.]
    • Hao Su, Jianwei Niu, Xuefeng Liu, Qingfeng Li, Ji Wan, and Mingliang Xu. Q-Art Code: Generating Scanning-robust Art-style QR Codes by Deformable Convolution. In Proceedings of the 29th ACM International Conference on Multimedia, pages 722–730, 2021.
    • Hao Su, Jianwei Niu, Xuefeng Liu, Qingfeng Li, Ji Wan, Mingliang Xu, and Tao Ren. ArtCoder: An End-to-End Method for Generating Scanning-Robust Stylized QR Codes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2277–2286, 2021.
  • Locating Pattern Minimization [Designs encoding rules around the sensitivity of the human visual system, so that the code's locating patterns are less conspicuous to the naked eye while the code remains scannable.]
    • Changsheng Chen, Wenjian Huang, Lin Zhang, and Wai Ho Mow. Robust and Unobtrusive Display-to-Camera Communications via Blue Channel Embedding. IEEE Transactions on Image Processing, 28(1):156–169, 2018.
    • Changsheng Chen, Baojian Zhou, and Wai Ho Mow. RA Code: A Robust and Aesthetic Code for Resolution-Constrained Applications. IEEE Transactions on Circuits and Systems for Video Technology, 28(11):3300–3312, 2018.
    • Zehua Ma, Xi Yang, Han Fang, Weiming Zhang, and Nenghai Yu. OAcode: Overall Aesthetic 2D Barcode on Screen. IEEE Transactions on Multimedia, 2023.
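  • TPVM (Temporal Pixel Value Modulation) [Hides the QR code inside a video, exploiting the frame-rate difference between the screen and the human eye so that the code is invisible to the naked eye yet still decodable once captured by a camera.]
    • Zhongpai Gao, Guangtao Zhai, and Chunjia Hu. The Invisible QR Code. In Proceedings of the 23rd ACM International Conference on Multimedia, pages 1047–1050, 2015.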
  • Invisible Information Hiding [Techniques that make information invisible to the naked eye but decodable after being photographed. They manipulate the visibility of the data in a way that does not affect human perception yet can still be detected and read by a digital scanner.]
    • Han Fang, Weiming Zhang, Hang Zhou, Hao Cui, and Nenghai Yu. Screen-Shooting Resilient Watermarking. IEEE Transactions on Information Forensics and Security, 14(6):1403–1418, 2018.
    • Han Fang, Dongdong Chen, Feng Wang, Zehua Ma, Honggu Liu, Wenbo Zhou, Weiming Zhang, and Neng-Hai Yu. TERA: Screen-to-Camera Image Code with Transparency, Efficiency, Robustness and Adaptability. IEEE Transactions on Multimedia, pages 1–1, 2021.
    • Jun Jia, Zhongpai Gao, Kang Chen, Menghan Hu, Xiongkuo Min, Guangtao Zhai, and Xiaokang Yang. RIHOOP: Robust Invisible Hyperlinks in Offline and Online Photographs. IEEE Transactions on Cybernetics, pages 1–13, 2020.
    • Jun Jia, Zhongpai Gao, Dandan Zhu, Xiongkuo Min, Guangtao Zhai, and Xiaokang Yang. Learning Invisible Markers for Hidden Codes in Offline-to-Online Photography. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2273–2282, 2022.
    • Matthew Tancik, Ben Mildenhall, and Ren Ng. StegaStamp: Invisible Hyperlinks in Physical Photographs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2117–2126, 2020.
    • Eric Wengrowski and Kristin Dana. Light Field Messaging with Deep Photographic Steganography. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 1515–1524, 2019.

PRELIMINARY

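The paper also discusses how scanning robustness can be maintained even when colors and shapes change subtly, as long as the color variation of the sampled pixels is preserved.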

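  1. Grayscale conversion: the color QR code image is first converted to a grayscale representation by extracting the luminance channel (the Y channel of the YCbCr color space), denoted $I \in \mathbb{R}^{H \times W}$, with $L$ gray levels (typically 256).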

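    • Y channel: the luma channel, which carries the brightness (black-and-white) content of the image, i.e. its grayscale information. The human visual system is most sensitive to this component.
    • Cb channel: the blue-difference chroma channel, representing the blue-versus-yellow chroma component of the image. Cb is short for "Chroma Blue".
    • Cr channel: the red-difference chroma channel, representing the red-versus-green chroma component of the image. Cr is short for "Chroma Red".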
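  2. Locating the markers: the scanner first locates the Finder and Alignment patterns, which identify the QR code region and yield key information such as the number of modules and the module size.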

  3. Building the grid: the marker information is used to construct a grid of $n^2$ modules, each denoted $M_k$, $k \in [1, 2, \ldots, n^2]$.

    • Each module is $a \times a$ pixels in size, with $n \cdot a \leq \min(H, W)$.
    • The grid partitions the image $I$ into $n^2$ patches, each denoted $I_{M_k} \in \mathbb{R}^{a \times a}$.
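  4. Module decoding: each module $M_k$ is decoded into one bit of information $\tilde{I}_k$, either 0 or 1, where $\tilde{I} \in \mathbb{R}^{n \times n}$ is the resulting binary image. Scanners typically sample pixels within a central sub-region of each module.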

    • Decoded value $\tilde{I}_k$ of a module: the scanner computes the decoded binary value $\tilde{I}_k$ of module $M_k$ from the pixel values inside the region $\theta$, as follows:
      $$v_k = \frac{1}{x^2} \sum_{\mathbf{p} \in \theta} I_{M_k}(\mathbf{p}); \quad \tilde{I}_k = \begin{cases} 0, & \text{if } v_k \leq \mathcal{T}_b, \\ 1, & \text{if } v_k \geq \mathcal{T}_w, \\ -1, & \text{otherwise.} \end{cases}$$

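      • $I_{M_k}(\mathbf{p})$ is the intensity of pixel $\mathbf{p}$ in module $M_k$, and $v_k$ is the mean intensity over all pixels in the region $\theta$.
      • Pixel coordinate: $\mathbf{p}$ is the coordinate of a pixel in image $I$, with $\mathbf{p} \in \{1, 2, \ldots, H\} \times \{1, 2, \ldots, W\}$.
      • Region $\theta$: a square region of $x \times x$ pixels centered on module $M_k$. It is a local patch taken from image $I$ and used to evaluate and decode that module.
      • Binarization thresholds: $\tilde{I}_k$ is decoded as 0 or 1 depending on the value of $v_k$, using two thresholds $\mathcal{T}_b$ and $\mathcal{T}_w$:
        • If $v_k$ is less than or equal to $\mathcal{T}_b$, $\tilde{I}_k$ is decoded as 0.
        • If $v_k$ is greater than or equal to $\mathcal{T}_w$, $\tilde{I}_k$ is decoded as 1.
        • If $v_k$ lies strictly between $\mathcal{T}_b$ and $\mathcal{T}_w$, $\tilde{I}_k$ is set to $-1$ (the "otherwise" case of the formula).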
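
The decoding steps above can be sketched in a few lines of Python with NumPy. This is only a minimal illustration under stated assumptions, not the paper's or any real scanner's implementation: marker localization (step 2) is skipped and the grid is assumed to start at the top-left pixel, the Y channel uses the BT.601 luma weights, and the function names `rgb_to_luma` and `decode_modules` as well as the threshold values 80 and 170 are hypothetical choices made for the example.

```python
import numpy as np


def rgb_to_luma(rgb):
    """Return the Y (luma) channel of an RGB image (assumed BT.601 weights)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]


def decode_modules(gray, n, a, x, t_b, t_w):
    """Decode an n x n module grid from a grayscale image I of shape (H, W).

    a:        module side length in pixels, with n * a <= min(H, W)
    x:        side length of the central sampling region theta (x <= a)
    t_b, t_w: black and white thresholds (T_b and T_w in the text)
    Returns an (n, n) integer array holding 0, 1, or -1 (undecided).
    Assumption: the grid starts at pixel (0, 0); no finder-pattern search.
    """
    gray = np.asarray(gray, dtype=np.float64)
    if n * a > min(gray.shape) or not 0 < x <= a:
        raise ValueError("grid or sampling region does not fit the image")
    off = (a - x) // 2  # offset of theta inside each module
    decoded = np.full((n, n), -1, dtype=int)
    for r in range(n):
        for c in range(n):
            module = gray[r * a:(r + 1) * a, c * a:(c + 1) * a]  # I_{M_k}
            theta = module[off:off + x, off:off + x]
            v = theta.mean()  # v_k = (1 / x^2) * sum of intensities in theta
            if v <= t_b:
                decoded[r, c] = 0
            elif v >= t_w:
                decoded[r, c] = 1
    return decoded


if __name__ == "__main__":
    # Build an ideal 4 x 4 code with 8 x 8 pixel modules, then repaint
    # everything outside each module's central 4 x 4 region in mid-gray.
    bits = np.random.default_rng(0).integers(0, 2, size=(4, 4))
    image = np.kron(bits, np.ones((8, 8))) * 255.0
    border = np.ones((8, 8), dtype=bool)
    border[2:6, 2:6] = False
    image[np.kron(np.ones((4, 4), dtype=int), border.astype(int)) == 1] = 128.0
    result = decode_modules(image, n=4, a=8, x=4, t_b=80, t_w=170)
    print(np.array_equal(result, bits))
```

Because only the central region $\theta$ of each module is sampled, the pixels outside it can change without affecting the decoded bits, which is the freedom an aesthetic QR code relies on.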