Java OCR image denoising: how to separate noise from text in an image as preprocessing for OCR

I am running OCR on subtitles in TV footage (using Tesseract 3.x with C++). As a preprocessing step for OCR, I am trying to separate the text from the background.

Here's the original image:

sv9Pm.png

And, preprocessed image:

aYoKa.png

The OCR result is: Sicemn clone

As the preprocessed image above shows, some "fog" remains around the letters, which prevents the OCR module from doing its job properly.

Is there any way to detect this "fog" programmatically and remove it, or some image processing that would remove or reduce it in the preprocessed image?

Since the preprocessing logic is heavily tuned to handle many different images, I would rather find a way to "clean" the preprocessed image than modify the preprocessing logic (tuning it for this picture could affect other pictures).

Any suggestions are very welcome.

Update

sixela's answer is great and works in most cases.

It does not work when the background also contains colors similar to the text.

Example of a failing case:

mzjp8.png

Example of result:

TDyPf.png

The Gaussian filter seems to cause a problem with this type of footage.

This implies that different footage may require a different approach.

Solution

I managed to get a cleaner (though not perfect) image by using morphological operations and thresholding.

Here is how:

1. Convert the original image to greyscale.
2. Apply a Gaussian blur (9x9 kernel) to denoise the greyscale image.
3. Apply a top-hat morphological operation (3x3 kernel) to extract the white text.
4. Binarize with Otsu's thresholding method.
5. Dilate the result.
6. Apply an inverted binary threshold to turn the white text black.

I finally obtained the following image:

66C4b.png

Running OCR on it gives this text: "Since vou don'k"

PS: This result can of course be improved by tweaking the parameters (kernel size, for example), but I hope it can guide you. I used OpenCV in Python to quickly try out those methods.

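import cv2

# Read the input directly as a single-channel greyscale image (flag 0)
image = cv2.imread('./inputImg.png', 0)
if image is None:
    raise FileNotFoundError('./inputImg.png could not be read')

# Denoise with a 9x9 Gaussian blur
imgBlur = cv2.GaussianBlur(image, (9, 9), 0)

# Top-hat with a 3x3 rectangular kernel keeps small bright details (the white text)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
imgTH = cv2.morphologyEx(imgBlur, cv2.MORPH_TOPHAT, kernel)

# Otsu picks the threshold automatically; the text becomes white (255) on black
_, imgBin = cv2.threshold(imgTH, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

# Dilate to thicken the text strokes
imgdil = cv2.dilate(imgBin, kernel)

# Invert so that the text is black on a white background
_, imgBin_Inv = cv2.threshold(imgdil, 0, 255, cv2.THRESH_BINARY_INV)

cv2.imshow('original', image)
cv2.imshow('bin', imgBin)
cv2.imshow('dil', imgdil)
cv2.imshow('inv', imgBin_Inv)

cv2.imwrite('./output.png', imgBin_Inv)

cv2.waitKey(0)
cv2.destroyAllWindows()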

After this, I ran Tesseract on the output image with this command:

tesseract output.png stdout
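The same command can also be run from Python, which is convenient when checking several output images in a row. This is a minimal sketch, assuming the tesseract executable is on the PATH; the helper names tesseract_command and run_tesseract are hypothetical, and no options beyond those in the command above are passed.

```python
import shutil
import subprocess


def tesseract_command(image_path):
    """Build the same command line as above: tesseract <image> stdout."""
    return ["tesseract", image_path, "stdout"]


def run_tesseract(image_path):
    """Return the recognized text, or None if tesseract is not installed."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        tesseract_command(image_path),
        capture_output=True,  # collect stdout instead of printing it
        text=True,            # decode the output as a string
        check=True,           # raise if tesseract exits with an error
    )
    return result.stdout.strip()
```

Calling run_tesseract('output.png') would then return the recognized text as a string.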
