[Text Super-Resolution] Reading notes: Improving Text Image Resolution using a Deep Generative Adversarial Network for OCR

Conference: 2019 International Conference on Document Analysis and Recognition (ICDAR)

📖Abstract

To improve OCR accuracy, the paper proposes a GAN-based method.
It uses a perceptual loss made up of three terms: an adversarial loss, a content loss, and an L1 loss.
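The three-term loss can be sketched as a weighted sum. This is a minimal, dependency-free illustration, not the paper's implementation: the weights `w_adv`, `w_content`, `w_l1`, the non-saturating form of the adversarial term, and the `toy_features` extractor (simple pixel differences standing in for features from a pretrained network) are all assumptions, since these notes do not give those details.

```python
import math

def flatten(img):
    """Flatten a 2D image (list of rows) into a list of pixels."""
    return [p for row in img for p in row]

def l1_loss(sr, hr):
    """Mean absolute pixel difference between the SR output and the HR target."""
    a, b = flatten(sr), flatten(hr)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def toy_features(img):
    """Hypothetical feature extractor: horizontal and vertical pixel differences.
    A real content loss would typically use feature maps of a pretrained network."""
    feats = []
    for row in img:
        feats.extend(row[i + 1] - row[i] for i in range(len(row) - 1))
    for r in range(len(img) - 1):
        feats.extend(img[r + 1][c] - img[r][c] for c in range(len(img[r])))
    return feats

def content_loss(sr, hr, features=toy_features):
    """Mean squared error between the features of the SR and HR images."""
    f_sr, f_hr = features(sr), features(hr)
    return sum((x - y) ** 2 for x, y in zip(f_sr, f_hr)) / len(f_sr)

def adversarial_loss(d_fake, eps=1e-8):
    """Generator-side adversarial term, -mean(log D(G(x))) (assumed form).
    d_fake holds the discriminator's 'real' probabilities for generated images."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)

def generator_loss(sr, hr, d_fake, w_adv=1e-3, w_content=1.0, w_l1=1.0):
    """Weighted sum of the three terms; the weights are illustrative assumptions."""
    return (w_adv * adversarial_loss(d_fake)
            + w_content * content_loss(sr, hr)
            + w_l1 * l1_loss(sr, hr))

if __name__ == "__main__":
    hr = [[0.0, 1.0], [1.0, 0.0]]        # toy 2x2 "HR" patch
    sr_blurry = [[0.5, 0.5], [0.5, 0.5]]  # overly smooth reconstruction
    print(generator_loss(sr_blurry, hr, d_fake=[0.5]))
    print(generator_loss(hr, hr, d_fake=[1.0]))
```

In this toy setup the blurry reconstruction is penalized by all three terms, while a perfect reconstruction that also fools the discriminator gets a loss near zero. The L1 term keeps pixels close to the target, the content term compares structure rather than raw pixels, and the adversarial term pushes outputs toward images the discriminator accepts as real.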


📖INTRODUCTION

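With the development of deep learning, many character recognition models have been proposed to improve recognition accuracy. Apart from the recognition model itself, however, accuracy depends largely on image resolution, which earlier work has rarely studied.
For the same recognition model, results on high-resolution (HR) images are better than on low-resolution (LR) images. For some LR images, redesigning the recognition model alone is unlikely to reach the expected accuracy. It is therefore necessary to apply a super-resolution method as a preprocessing step before recognition.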

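The paper proposes a method for improving text image resolution based on a conditional generative adversarial network (cGAN), and uses a more complex loss function during network training.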

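The cGAN-based method has two sub-networks, a generator and a discriminator, which are trained against each other in a game-like (adversarial) manner. After training, the generator is used to map LR text images to HR text images.
The perceptual loss function consists of an adversarial loss, a content loss, and an L1 loss. The adversarial loss is used to ensure that the generated SR image is close to the real …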

