OCR

OCR: removing the background texture from ID card images. I found a method through a search and am recording it here.

===========================

There are cases where you, as a human, have trouble telling background from foreground, so no method can be expected to do what you want correctly every time. Since you mention OCR, I assume you want to eliminate everything that is not text. That doesn't make the question any easier, so what I'm really assuming is that you want to keep objects that contrast strongly with their surroundings (foreground against background, or black text on a white background, for example). Again, there is no perfect method for that.

So, all this answer does is present a simple method that might help with your task. The method combines off-the-shelf morphological tools with Otsu's method for binarization, which is statistically optimal in the sense that it picks the threshold maximizing the between-class variance. The result is a set of regions that are potentially worth looking at. Note that you will certainly need to combine these results with many other kinds of analysis; a good OCR system goes well beyond these direct approaches.

The method: 1) convert the image to grayscale (we are not interested in the colors here, though a different method could certainly use them); 2) use the h-dome transform to remove irrelevant maxima (the H-maxima transform, imhmax in Matlab); 3) calculate the morphological gradient; 4) binarize with Otsu's method; 5) remove small objects by area opening.

Removing irrelevant maxima is important for your task, since you can get pretty horrible regions from the combination of a bad camera, a bad flash, and an inexperienced photographer. The h-dome transform is based on morphological reconstruction, so if your library has the latter but not the former, it is straightforward to implement (otherwise you could learn how to implement reconstruction efficiently). The morphological gradient for discrete images is a very simple operation that tends to work fine even with bad illumination, since it is a local method. Thresholding its result with Otsu's method keeps the strongest edges (possibly including noise and other minor features). You could precede all this with Gaussian smoothing, which might serve as an initial noise-suppression step. The small features are readily removed by area opening. In Matlab, this can be done as follows:

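f = rgb2gray(imread(yourimage));      % 1) grayscale
se = strel('square', 3);              % 3x3 square structuring element
g = imhmax(f, 50);                    % 2) H-maxima transform, h = 50: suppresses maxima shallower than 50
g = imdilate(g, se) - imerode(g, se); % 3) morphological gradient
h = im2bw(g, graythresh(g));          % 4) graythresh applies Otsu's method
w = bwareaopen(h, 50);                % 5) area opening: drop components under 50 pixels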

assuming that objects smaller than 50 pixels are irrelevant (which might not always be the case for small text).
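For readers without Matlab, steps 3 and 4 can be sketched in plain C++ with no image library. This is only an illustration under stated assumptions: the GrayImage struct and the function names are hypothetical, the structuring element is a 3x3 square clipped at the image border, and the threshold convention (foreground is strictly above the Otsu level) is one reasonable choice and may not match graythresh/im2bw exactly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: row-major 8-bit grayscale.
struct GrayImage {
    int rows, cols;
    std::vector<std::uint8_t> px;
    std::uint8_t at(int r, int c) const {
        assert(r >= 0 && r < rows && c >= 0 && c < cols);
        return px[r * cols + c];
    }
};

// Step 3: morphological gradient with a 3x3 square, i.e. local max minus
// local min. The window is clipped at the border (an assumption).
GrayImage morphGradient3x3(const GrayImage& f) {
    GrayImage g{f.rows, f.cols, std::vector<std::uint8_t>(f.px.size())};
    for (int r = 0; r < f.rows; ++r)
        for (int c = 0; c < f.cols; ++c) {
            std::uint8_t lo = 255, hi = 0;
            for (int dr = -1; dr <= 1; ++dr)
                for (int dc = -1; dc <= 1; ++dc) {
                    const int rr = r + dr, cc = c + dc;
                    if (rr < 0 || rr >= f.rows || cc < 0 || cc >= f.cols) continue;
                    lo = std::min(lo, f.at(rr, cc));
                    hi = std::max(hi, f.at(rr, cc));
                }
            g.px[r * f.cols + c] = static_cast<std::uint8_t>(hi - lo);
        }
    return g;
}

// Step 4: Otsu's level, the t that maximizes the between-class variance of
// the classes {v <= t} and {v > t}. Returns 0 for a constant image.
int otsuThreshold(const GrayImage& g) {
    std::vector<double> hist(256, 0.0);
    for (std::uint8_t v : g.px) hist[v] += 1.0;
    const double total = static_cast<double>(g.px.size());
    double sumAll = 0.0;
    for (int v = 0; v < 256; ++v) sumAll += v * hist[v];

    double w0 = 0.0, sum0 = 0.0, bestVar = 0.0;
    int best = 0;
    for (int t = 0; t < 256; ++t) {
        w0 += hist[t];
        sum0 += t * hist[t];
        const double w1 = total - w0;
        if (w0 == 0.0 || w1 == 0.0) continue;  // one class is empty
        const double m0 = sum0 / w0, m1 = (sumAll - sum0) / w1;
        const double var = w0 * w1 * (m0 - m1) * (m0 - m1);
        if (var > bestVar) { bestVar = var; best = t; }
    }
    return best;
}

// Foreground = gradient strictly above the threshold.
std::vector<bool> binarize(const GrayImage& g, int t) {
    std::vector<bool> out(g.px.size());
    for (std::size_t i = 0; i < g.px.size(); ++i) out[i] = g.px[i] > t;
    return out;
}
```

Calling binarize(g, otsuThreshold(g)) on the output of morphGradient3x3 plays the role of h in the Matlab snippet. Because the gradient only looks at a 3x3 neighborhood, a slow illumination ramp across the image contributes very little to it, which is the "local method" point made above.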

Here are the w images for your examples:

[images: the w output for each of the five examples]

These outputs give an indication of where you should look for text, i.e., the interior of the connected components.
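One way to turn "the interior of the connected components" into something an OCR stage can consume is to take the bounding box of each component of w and crop the original image there. The sketch below is a hedged illustration in plain C++: the Box struct and componentBoxes are hypothetical names, the mask layout (row-major, nonzero means foreground) and the use of 8-connectivity are assumptions, and a bounding box is only a rough stand-in for the true interior.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <queue>
#include <utility>
#include <vector>

// Inclusive pixel coordinates of a component's bounding box.
struct Box { int top, left, bottom, right; };

// Returns one bounding box per 8-connected foreground component,
// in the order the components are first met in a row-major scan.
std::vector<Box> componentBoxes(const std::vector<unsigned char>& mask,
                                int rows, int cols) {
    assert(rows >= 0 && cols >= 0 &&
           mask.size() == static_cast<std::size_t>(rows) * cols);
    std::vector<char> seen(mask.size(), 0);
    std::vector<Box> boxes;
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            if (!mask[r * cols + c] || seen[r * cols + c]) continue;
            // New component: flood-fill it breadth-first and grow the box.
            Box box{r, c, r, c};
            std::queue<std::pair<int, int>> pending;
            pending.push({r, c});
            seen[r * cols + c] = 1;
            while (!pending.empty()) {
                const std::pair<int, int> p = pending.front();
                pending.pop();
                box.top = std::min(box.top, p.first);
                box.bottom = std::max(box.bottom, p.first);
                box.left = std::min(box.left, p.second);
                box.right = std::max(box.right, p.second);
                for (int dr = -1; dr <= 1; ++dr)
                    for (int dc = -1; dc <= 1; ++dc) {
                        const int nr = p.first + dr, nc = p.second + dc;
                        if (nr < 0 || nr >= rows || nc < 0 || nc >= cols) continue;
                        if (!mask[nr * cols + nc] || seen[nr * cols + nc]) continue;
                        seen[nr * cols + nc] = 1;
                        pending.push({nr, nc});
                    }
            }
            boxes.push_back(box);
        }
    }
    return boxes;
}
```

Each returned box can then be used to crop the grayscale image before handing the crop to a recognizer. Very large boxes (for example, the card border) would probably need to be filtered out separately.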


=====================
#include <opencv2/opencv.hpp>
using namespace cv;

// OpenCV counterpart of Matlab's bwareaopen: keeps only the connected
// components with at least `size` pixels (Matlab removes those with fewer).
void bwareaopen(Mat& image, double size)
{
        image.convertTo(image, CV_8UC1);
        Mat labels, stats, centroids;
        int nLabels = connectedComponentsWithStats(image, labels, stats, centroids);

        Mat result = Mat::zeros(image.size(), CV_8UC1);
        // Label 0 is the background, so start at 1 and test every real component.
        for (int label = 1; label < nLabels; label++)
        {
                if (stats.at<int>(label, CC_STAT_AREA) >= size)
                        result.setTo(255, labels == label);
        }

        image = result;
}
