MSER+NMS 文本检测(身份证+发票+火车票)

 

版权

    此篇文章不细说MSER和NMS原理,以实战为主。

       MSER最大稳定极值区域:是对一幅灰度图像(灰度值为0~255)取阈值进行二值化处理,阈值从0到255依次递增。阈值的递增类似于分水岭算法中的水面的上升,随着水面的上升,有一些较矮的丘陵会被淹没,如果从天空往下看,则大地分为陆地和水域两个部分,这类似于二值图像。在得到的所有二值图像中,图像中的某些连通区域变化很小,甚至没有变化,则该区域就被称为最大稳定极值区域。具体算法的原理参考:Opencv2.4.9源码分析——MSER

       NMS是经常伴随图像区域检测的算法,作用是去除重复的区域,在人脸识别、物体检测等领域都经常使用,全称是非极大值抑制(non maximum suppression),顾名思义就是抑制不是极大值的元素,所以用在这里就是抑制不是最大框的框,也就是去除大框中包含的小框。NMS的基本思想是遍历将所有的框得分排序,选中其中得分最高的框,然后遍历其余框找到和当前最高分的框的重叠面积(IOU)大于一定阈值的框,删除。然后继续这个过程,找另一个得分高的框,再删除IOU大于阈值的框,循环。在这个例子中,就是设定一个IOU阈值(比如0.5,也就是如果两个框的重叠面积大于其中一个框的50%,那么就删除那个框),然后遍历所有框,对剩下的每个框,遍历判断其余框中与他重叠面积大于阈值的,则删除。最后剩下的就是不包含重叠部分的文本框了。具体算法的原理参考:目标检测之非极大值抑制(NMS)各种变体

一,代码:

1.1 MSER.py

import cv2
import matplotlib.pyplot as plt
import numpy as np
import nms

img = cv2.imread('D:/6.jpg')
orig = img.copy()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

mser = cv2.MSER_create(_min_area=10, _max_area=600)

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
regions, boxes = mser.detectRegions(gray)
keep = []
for box in boxes:
    x, y, w, h = box
    keep.append([x, y, x + w, y + h])
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)

plt.imshow(img, 'brg')
plt.show()

keep2=np.array(keep)
pick = nms.nms(keep2, 0.3)
for (startX, startY, endX, endY) in pick:
    cv2.rectangle(orig, (startX, startY), (endX, endY), (255, 0, 0), 1)
cv2.imshow("After NMS", orig)
cv2.waitKey(0)

opencv中MSER参数:

注意上面代码中我们是用“cv2.MSER_create()”得到了一个默认的MSER算法对象,这个对象也是可以设置参数的:

  • _delta it compares (sizei−sizei−delta)/sizei−delta
  • _min_area prune the area which smaller than minArea
  • _max_area prune the area which bigger than maxArea
  • _max_variation prune the area have similar size to its children
  • _min_diversity for color image, trace back to cut off mser with diversity less than min_diversity
  • _max_evolution for color image, the evolution steps
  • _area_threshold for color image, the area threshold to cause re-initialize
  • _min_margin for color image, ignore too small margin
  • _edge_blur_size for color image, the aperture size for edge blur

1.2 nms.py 

# import the necessary packages
import numpy as np


# Malisiewicz et al.
def nms(boxes, overlapThresh):
    # if there are no boxes, return an empty list
    if len(boxes) == 0:
        return []

    # if the bounding boxes integers, convert them to floats --
    # this is important since we'll be doing a bunch of divisions
    if boxes.dtype.kind == "i":
        boxes = boxes.astype("float")

    # initialize the list of picked indexes
    pick = []

    # grab the coordinates of the bounding boxes
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    # compute the area of the bounding boxes and sort the bounding
    # boxes by the bottom-right y-coordinate of the bounding box
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(y2)

    # keep looping while some indexes still remain in the indexes
    # list
    while len(idxs) > 0:
        # grab the last index in the indexes list and add the
        # index value to the list of picked indexes
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)

        # find the largest (x, y) coordinates for the start of
        # the bounding box and the smallest (x, y) coordinates
        # for the end of the bounding box
        xx1 = np.maximum(x1[i], x1[idxs[:last]])
        yy1 = np.maximum(y1[i], y1[idxs[:last]])
        xx2 = np.minimum(x2[i], x2[idxs[:last]])
        yy2 = np.minimum(y2[i], y2[idxs[:last]])

        # compute the width and height of the bounding box
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)

        # compute the ratio of overlap
        overlap = (w * h) / area[idxs[:last]]

        # delete all indexes from the index list that have
        idxs = np.delete(idxs, np.concatenate(([last],
                                               np.where(overlap > overlapThresh)[0])))

    # return only the bounding boxes that were picked using the
    # integer data type
    return boxes[pick].astype("int")

二,效果:

2.1,发票

 

2.2,火车票

2.3,身份证

 

MSER对图片检测效果不好,在检测身份证的文本信息时,头像区域会有错误,在实际检测时可以先将照片区域遮挡,再进行文本检测。 

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值