HALCON Learning and Practice (OCR Recognition)

Author: 嵌入式-老费

Last modified: 2022-09-07 09:00:29

First published: 2022-09-07 08:58:21

Article link: https://blog.csdn.net/feixiaoxing/article/details/126737764

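OCR is an important part of industrial vision. Besides positioning, measurement, and defect detection, it is one of the major application areas in the field. OCR stands for optical character recognition. Labels that have already been applied, printed text, invoices, and ID card information can all be read with OCR algorithms.

There are currently two main families of recognition methods. One is traditional machine learning, which includes decision trees, multilayer perceptrons (MLP), SVMs, and so on. The other is deep learning based on CNNs and RNNs. These models behave like black boxes, but their recognition results are good and they are used more and more. Whichever approach is chosen, it involves the following steps:

1) Collect a sufficient number of samples; generally, the more samples the better;

2) Split the samples into a training set and a test set: the training set is used only for training, and the test set is used to evaluate the model after training;

3) Deploy the trained model on site for use in actual production.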
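Step 2 can be sketched in a few lines of plain Python. This is only an illustration of the idea, not part of HALCON: the helper `split_samples`, the 20% test ratio, and the file names are all hypothetical.

```python
import random

def split_samples(samples, test_ratio=0.2, seed=0):
    """Shuffle the samples and split them into (training set, test set)."""
    shuffled = list(samples)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labelled samples: (image file name, character label).
samples = [("char_%03d.png" % i, "ABCDEFGHIJ"[i % 10]) for i in range(50)]
train_set, test_set = split_samples(samples)
print(len(train_set), len(test_set))
```

The test set must stay out of training entirely; otherwise the accuracy measured on it says little about how the model will behave on site.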

HALCON's example programs include many OCR cases, and ocr_wafer_semi_font.hdev is one of them. Let's analyze it today:

* 
* This example describes one step from the semiconductor product chain.
* In the front-end-of-line step, the ICs are printed on the wafer. To
* tag a single wafer from the production life line, each wafer receives
* an ID number, printed with the SEMI font. This ID number is read here.
* 
dev_update_off ()
dev_close_window ()
read_image (Image, 'ocr/wafer_semi_font_01')
dev_open_window_fit_image (Image, 0, 0, -1, -1, WindowHandle)
dev_set_draw ('margin')
set_display_font (WindowHandle, 16, 'mono', 'true', 'false')
dev_set_line_width (2)
dev_set_colored (12)
* 
read_ocr_class_mlp ('SEMI_NoRej.omc', OCRHandle)
NumImages := 10
for Index := 1 to NumImages by 1
    * 
    * Segment characters
    read_image (Image, 'ocr/wafer_semi_font_' + Index$'02')
    * Characters must be black-on-white, i.e., dark characters on a light background
    invert_image (Image, ImageInvert)
    mean_image (Image, ImageMean, 31, 31)
    dyn_threshold (Image, ImageMean, RegionDynThresh, 7, 'light')
    * Characters are often dotted. Therefore, we first merge close dots
    * that belong to the same character just before calling the operator connection
    closing_circle (RegionDynThresh, RegionClosing, 2.0)
    connection (RegionClosing, ConnectedRegions)
    * Filter out characters based on two facts:
    * 1. Characters are printed in SEMI-12. Therefore we can make strong assumptions
    *    on the dimensions of the characters
    * 2. Characters are printed along a straight line
    select_shape (ConnectedRegions, SelectedRegions1, ['height','width'], 'and', [29,15], [60,40])
    area_center (SelectedRegions1, Area, RowCh, ColumnCh)
    MedianRow := median(RowCh)
    select_shape (SelectedRegions1, Chars, 'row', 'and', MedianRow - 30, MedianRow + 30)
    * 
    * Read out segmented characters
    enhance_contrast (Chars, ImageInvert, ImageRead)
    sort_region (Chars, CharsSorted, 'character', 'true', 'column')
    do_ocr_multi_class_mlp (CharsSorted, ImageRead, OCRHandle, Class, Confidence)
    * 
    dev_display (ImageInvert)
    dev_display (CharsSorted)
    area_center (CharsSorted, Area1, Row, Column)
    MeanRow := mean(Row)
    disp_message (WindowHandle, Class, 'image', MeanRow + 42, Column - 11, 'yellow', 'false')
    disp_message (WindowHandle, Class, 'image', MeanRow + 40, Column - 10, 'slate blue', 'false')
    if (Index != NumImages)
        disp_continue_message (WindowHandle, 'black', 'true')
        stop ()
    endif
endfor
clear_ocr_class_mlp (OCRHandle)

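The whole program is 55 lines long. Let's go through it step by step.

The code proper starts at line 7. Lines 7 to 14 mainly check that an image can be read. A single image is loaded as a trial: the loop that follows reads images repeatedly, so it is worth confirming first that the image path is correct. These lines also open the display window and set the font, line width, and colors.

Line 16 reads the classifier file into OCRHandle, which can be understood as loading the trained model (here an MLP classifier, SEMI_NoRej.omc).

Lines 17-18 set up the loop over the images, 10 in total.

Line 21 reads the current image.

Line 23 inverts the image, that is, inverts its gray values. As the comment in the code says, the characters must be dark on a light background for recognition.

Line 24 smooths the image with a 31 x 31 mean filter (mean_image). The result serves as an estimate of the local background brightness.

Line 25 binarizes the image with dyn_threshold: pixels that are at least 7 gray levels brighter than the local mean are selected.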
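The combination of lines 24 and 25 is a local (dynamic) threshold. The following plain-Python sketch illustrates the idea on a tiny synthetic image. It is not HALCON's implementation: the function names are made up for this example, a 3 x 3 window stands in for the 31 x 31 mask, and the window is simply clipped at the image border, which is an assumption of this sketch.

```python
def mean_filter(img, half):
    """Box mean over a (2*half+1) x (2*half+1) window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            rows = range(max(0, r - half), min(h, r + half + 1))
            cols = range(max(0, c - half), min(w, c + half + 1))
            vals = [img[rr][cc] for rr in rows for cc in cols]
            out[r][c] = sum(vals) / len(vals)
    return out

def dyn_threshold_light(img, mean, offset):
    """Return (row, col) of pixels at least `offset` brighter than the local mean."""
    return [(r, c)
            for r, row in enumerate(img)
            for c, g in enumerate(row)
            if g >= mean[r][c] + offset]

# Synthetic 3 x 6 image (illustrative data): the background brightens from
# left to right, and two "character" pixels are 30 gray levels brighter.
img = [[50 + 10 * c for c in range(6)] for _ in range(3)]
img[1][1] += 30
img[1][4] += 30

local_mean = mean_filter(img, half=1)
selected = dyn_threshold_light(img, local_mean, offset=7)
print(selected)
```

In this data the left character pixel has gray value 90 while the background on the right reaches 100, so no single global threshold separates characters from background. Comparing each pixel with its own neighborhood selects only the two character pixels, (1, 1) and (1, 4).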

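Line 28 applies a closing with a circular structuring element of radius 2.0, which merges the separate dots of a dotted character into one region.

Line 29 splits the region into its connected components (connection), so that each character becomes a separate region.

Line 34 keeps only the regions whose height lies between 29 and 60 and whose width lies between 15 and 40.

Lines 35-37 compute the center of each remaining region, take the median of the center rows, and then keep only the regions whose center row lies within 30 pixels of that median.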
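The two filtering steps (size first, then distance from the median row) can be sketched in plain Python as follows. The region list and the helper `select_characters` are hypothetical stand-ins for HALCON regions and select_shape; only the numeric limits come from the example program.

```python
from statistics import median

def select_characters(regions, h_range=(29, 60), w_range=(15, 40), row_tol=30):
    """Keep regions of plausible character size that lie near the median row."""
    sized = [r for r in regions
             if h_range[0] <= r["height"] <= h_range[1]
             and w_range[0] <= r["width"] <= w_range[1]]
    if not sized:
        return []
    med = median(r["row"] for r in sized)
    return [r for r in sized if med - row_tol <= r["row"] <= med + row_tol]

# Hypothetical connected regions: bounding-box height and width, center row.
regions = [
    {"height": 45, "width": 25, "row": 198.0},  # character
    {"height": 46, "width": 27, "row": 200.0},  # character
    {"height": 5,  "width": 4,  "row": 199.0},  # small noise dot
    {"height": 44, "width": 26, "row": 201.0},  # character
    {"height": 45, "width": 24, "row": 203.0},  # character
    {"height": 40, "width": 30, "row": 400.0},  # right size, wrong text line
]
chars = select_characters(regions)
print([r["row"] for r in chars])
```

The median is the important detail here. With these numbers the median row is 201, so the outlier at row 400 is dropped. The mean of the same five rows would be 240.4, and a band of 30 pixels around it would exclude every real character.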