Basic OCR in OpenCV

hilter

Tags: basic, image, float, file, function, vector

Categories: matlab, image, OpenCV (2012-02-23 20:10)

http://blog.damiles.com/category/tutorials/opencv-tutorials/

In this tutorial we are going to create a basic number OCR. It consists of classifying a handwritten digit into its class.

To do it, we will use everything we learned in previous tutorials: the simple basic painter, and the basic pattern recognition and classification with OpenCV tutorial.

A typical pattern recognition classifier consists of three modules:

Preprocessing: in this module we process our input image, for example normalizing its size or converting color to black and white.

Feature extraction: in this module we convert the processed image into a feature vector to classify. It can be the pixel matrix converted to a vector, or a contour chain code representation.

Classification: this module takes the feature vectors and either trains our system or classifies an input feature vector with a classification method such as kNN.

This basic OCR follows this scheme: we take a training set and a test set of images to train and test our classifier (kNN).

We have 1000 handwritten images, 100 of each digit. We use 50 images of each digit (class) to train and the other 50 to test our system.

The first task is to pre-process all the training images. To do it we create a preprocessing function that takes an image plus the new width and height we want as a result, and returns the bounding-box content normalized to that size. The process is: find the bounding box of the digit, copy it centered into a white square whose side is the larger of the box's width and height, then scale that square to the requested size.

Pre-processing code:

 
    #include <math.h>
    #include <opencv/cv.h>

    // Find the first (min) and last (max) column that contains ink
    void findX(IplImage* imgSrc, int* min, int* max){
        int i;
        int minFound = 0;
        CvMat data;
        // A column holds height pixels, so an all-white column sums to height*255
        CvScalar maxVal = cvRealScalar(imgSrc->height * 255);
        CvScalar val = cvRealScalar(0);
        // For each column sum: the first sum < height*255 gives the min,
        // then every later column with sum < height*255 becomes the new max
        for (i = 0; i < imgSrc->width; i++){
            cvGetCol(imgSrc, &data, i);
            val = cvSum(&data);
            if (val.val[0] < maxVal.val[0]){
                *max = i;
                if (!minFound){
                    *min = i;
                    minFound = 1;
                }
            }
        }
    }

    // Find the first (min) and last (max) row that contains ink
    void findY(IplImage* imgSrc, int* min, int* max){
        int i;
        int minFound = 0;
        CvMat data;
        // A row holds width pixels, so an all-white row sums to width*255
        CvScalar maxVal = cvRealScalar(imgSrc->width * 255);
        CvScalar val = cvRealScalar(0);
        // For each row sum: the first sum < width*255 gives the min,
        // then every later row with sum < width*255 becomes the new max
        for (i = 0; i < imgSrc->height; i++){
            cvGetRow(imgSrc, &data, i);
            val = cvSum(&data);
            if (val.val[0] < maxVal.val[0]){
                *max = i;
                if (!minFound){
                    *min = i;
                    minFound = 1;
                }
            }
        }
    }

    CvRect findBB(IplImage* imgSrc){
        CvRect aux;
        int xmin, xmax, ymin, ymax;
        xmin = xmax = ymin = ymax = 0;

        findX(imgSrc, &xmin, &xmax);
        findY(imgSrc, &ymin, &ymax);

        // +1 so that the last ink column and row fall inside the box
        aux = cvRect(xmin, ymin, xmax - xmin + 1, ymax - ymin + 1);

        //printf("BB: %d,%d - %d,%d\n", aux.x, aux.y, aux.width, aux.height);

        return aux;
    }

    IplImage preprocessing(IplImage* imgSrc, int new_width, int new_height){
        IplImage* result;
        IplImage* scaledResult;

        CvMat data;
        CvMat dataA;
        CvRect bb; // bounding box

        // Find bounding box
        bb = findBB(imgSrc);

        // Get the bounding-box data (aspect ratio is not preserved yet)
        cvGetSubRect(imgSrc, &data, cvRect(bb.x, bb.y, bb.width, bb.height));
        // Create a square image (aspect ratio 1) whose side is the larger
        // of the bounding box width and height
        int size = (bb.width > bb.height) ? bb.width : bb.height;
        result = cvCreateImage(cvSize(size, size), 8, 1);
        cvSet(result, CV_RGB(255, 255, 255), NULL);
        // Copy the data into the center of the square image
        int x = (int)floor((float)(size - bb.width) / 2.0f);
        int y = (int)floor((float)(size - bb.height) / 2.0f);
        cvGetSubRect(result, &dataA, cvRect(x, y, bb.width, bb.height));
        cvCopy(&data, &dataA, NULL);
        // Scale result
        scaledResult = cvCreateImage(cvSize(new_width, new_height), 8, 1);
        cvResize(result, scaledResult, CV_INTER_NN);
        cvReleaseImage(&result); // the intermediate square is no longer needed

        // Return processed data (a copy of the header; the pixel buffer
        // allocated for scaledResult stays alive for the caller)
        return *scaledResult;
    }
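The bounding-box search can also be tried without OpenCV. The following is a minimal sketch, not the tutorial's code: it assumes a row-major 8-bit grayscale buffer with a white (255) background, and `Box` and `findInkBox` are hypothetical names introduced only for illustration. It tests each pixel directly instead of summing columns and rows, and the box it returns includes the last ink row and column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for CvRect, so the sketch needs no OpenCV headers.
struct Box { int x, y, width, height; };

// Scan a row-major 8-bit grayscale image (assumed white background = 255)
// and return the smallest box that contains every non-white pixel.
// Returns an all-zero box when the image is completely white.
Box findInkBox(const std::vector<std::uint8_t>& pixels, int width, int height)
{
    assert(width > 0 && height > 0);
    assert(pixels.size() == static_cast<std::size_t>(width) * height);

    int xmin = width, xmax = -1, ymin = height, ymax = -1;
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            if (pixels[static_cast<std::size_t>(y) * width + x] < 255) {
                if (x < xmin) xmin = x;
                if (x > xmax) xmax = x;
                if (y < ymin) ymin = y;
                if (y > ymax) ymax = y;
            }
        }
    }
    if (xmax < 0) return Box{0, 0, 0, 0};  // no ink found
    return Box{xmin, ymin, xmax - xmin + 1, ymax - ymin + 1};
}
```

For a 5x4 image with ink at (2,1) and (3,2), this gives a 2x2 box whose top-left corner is (2,1).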

We use the getData function of the basicOCR class to create the training data and training classes. This function reads all the images under the OCR folder to build the training data. The OCR folder contains one folder per class, and each file is a pbm file named cnn.pbm, where c is the class {0..9} and nn is the image number {00..99}.

Each image we load is pre-processed and then converted into the feature vector we use.
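The naming scheme can be written down as a small helper. This is only a sketch: `samplePath` is a hypothetical name, and the base path with a trailing slash (here `"OCR/"`) is an assumption about how the folder is passed in.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Build "<base><c>/<c><nn>.pbm" for class c (0..9) and image number nn (00..99).
// The base path is assumed to end with a slash.
std::string samplePath(const std::string& base, int digitClass, int index)
{
    assert(digitClass >= 0 && digitClass <= 9);
    assert(index >= 0 && index <= 99);
    char name[32];
    // %02d zero-pads the image number to two digits
    std::snprintf(name, sizeof(name), "%d/%d%02d.pbm", digitClass, digitClass, index);
    return base + name;
}
```

With these assumptions, image 7 of class 3 maps to `OCR/3/307.pbm`.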

basicOCR.cpp getData code:

 
    void basicOCR::getData()
    {
        IplImage* src_image;
        IplImage prs_image;
        CvMat row, data;
        char file[255];
        int i, j;
        for (i = 0; i < classes; i++){
            for (j = 0; j < train_samples; j++){

                // Load file <file_path><c>/<c><nn>.pbm (%02d zero-pads nn)
                sprintf(file, "%s%d/%d%02d.pbm", file_path, i, i, j);
                src_image = cvLoadImage(file, 0);
                if (!src_image){
                    printf("Error: Cant load image %s\n", file);
                    // Skip this sample (its row stays unset);
                    // preprocessing a NULL image would crash
                    continue;
                }
                // Process file
                prs_image = preprocessing(src_image, size, size);

                // Set class label
                cvGetRow(trainClasses, &row, i*train_samples + j);
                cvSet(&row, cvRealScalar(i));
                // Set data
                cvGetRow(trainData, &row, i*train_samples + j);

                IplImage* img = cvCreateImage(cvSize(size, size), IPL_DEPTH_32F, 1);
                // Convert 8-bit image to 32-bit float image (0.0039215 = 1/255)
                cvConvertScale(&prs_image, img, 0.0039215, 0);

                cvGetSubRect(img, &data, cvRect(0, 0, size, size));

                CvMat row_header, *row1;
                // Convert the size x size data matrix to a row vector
                row1 = cvReshape(&data, &row_header, 0, 1);
                cvCopy(row1, &row, NULL);

                // The row now holds a copy, so the temporaries can be freed
                cvReleaseImage(&img);
                cvReleaseImage(&src_image);
            }
        }
    }
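Stripped of the OpenCV types, the feature extraction in getData is a scale by 1/255 followed by a flatten, stored at row `i*train_samples + j`. A minimal sketch of those two steps, assuming the image is already a row-major 8-bit buffer; `toFeatureVector` and `trainingRow` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Map 8-bit pixels (0..255) to floats in [0, 1]. A row-major buffer is
// already flat, so the result is the feature vector for one sample.
std::vector<float> toFeatureVector(const std::vector<std::uint8_t>& pixels)
{
    std::vector<float> features;
    features.reserve(pixels.size());
    for (std::uint8_t p : pixels)
        features.push_back(static_cast<float>(p) / 255.0f);
    return features;
}

// Row of the training matrix that holds sample j of class i,
// following the i*train_samples + j layout used in getData.
std::size_t trainingRow(int digitClass, int sample, int samplesPerClass)
{
    assert(digitClass >= 0 && sample >= 0 && sample < samplesPerClass);
    return static_cast<std::size_t>(digitClass) * samplesPerClass + sample;
}
```

With 50 training samples per class, sample 7 of class 3 lands in row 157.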

After processing the images and obtaining the training data and classes, we train our model with this data. In our sample we use the kNN method:

    knn = new CvKNearest(trainData, trainClasses, 0, false, K);
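To make the k-nearest-neighbours idea concrete, here is a small self-contained sketch of it. It is not how CvKNearest is implemented: `Sample` and `knnClassify` are hypothetical names, and the squared Euclidean distance and the tie-breaking rule (lowest label wins) are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// Hypothetical training sample: a feature vector plus its class label.
struct Sample { std::vector<float> features; int label; };

// Return the label that wins a majority vote among the k training samples
// closest to the query. Ties go to the lowest label (an assumption).
int knnClassify(const std::vector<Sample>& train,
                const std::vector<float>& query, std::size_t k)
{
    assert(!train.empty() && k > 0);
    k = std::min(k, train.size());

    // (squared distance, label) for every training sample
    std::vector<std::pair<float, int>> dist;
    dist.reserve(train.size());
    for (const Sample& s : train) {
        assert(s.features.size() == query.size());
        float d = 0.0f;
        for (std::size_t i = 0; i < query.size(); ++i) {
            const float diff = s.features[i] - query[i];
            d += diff * diff;
        }
        dist.push_back(std::make_pair(d, s.label));
    }
    // Move the k smallest distances to the front
    std::partial_sort(dist.begin(),
                      dist.begin() + static_cast<std::ptrdiff_t>(k),
                      dist.end());

    // Count votes among the k nearest neighbours
    std::map<int, int> votes;
    for (std::size_t i = 0; i < k; ++i) ++votes[dist[i].second];

    int best = dist[0].second, bestVotes = 0;
    for (const auto& v : votes)
        if (v.second > bestVotes) { best = v.first; bestVotes = v.second; }
    return best;
}
```

There is no separate training computation in this sketch: the samples are simply stored, and all the work happens at query time.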

Now we can test our model, and use the test result to compare against other methods, or to see the effect of reducing the image scale or similar changes. The basicOCR class has a function for this, the test function.

This function takes the other 500 samples, classifies them with our selected method and checks the result.

 
    void basicOCR::test(){
        IplImage* src_image;
        IplImage prs_image;
        char file[255];
        int i, j;
        int error = 0;
        int testCount = 0;
        for (i = 0; i < classes; i++){
            // Images 50..99 of each class are the test samples
            for (j = 50; j < 50 + train_samples; j++){

                sprintf(file, "%s%d/%d%d.pbm", file_path, i, i, j);
                src_image = cvLoadImage(file, 0);
                if (!src_image){
                    printf("Error: Cant load image %s\n", file);
                    continue; // skip: preprocessing a NULL image would crash
                }
                // Process file (classify() runs preprocessing() again on it)
                prs_image = preprocessing(src_image, size, size);
                float r = classify(&prs_image, 0);
                if ((int)r != i)
                    error++;

                testCount++;
                cvReleaseImage(&src_image);
            }
        }
        float totalerror = 100*(float)error/(float)testCount;
        printf("System Error: %.2f%%\n", totalerror);
    }
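The figure printed by test is the share of misclassified samples. As a sketch under the assumption that the expected and predicted labels have been collected into two vectors (`errorRatePercent` is a hypothetical helper, not part of basicOCR), the same calculation reads:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Percentage of positions where the predicted label differs from the expected one.
float errorRatePercent(const std::vector<int>& expected,
                       const std::vector<int>& predicted)
{
    assert(!expected.empty() && expected.size() == predicted.size());
    std::size_t errors = 0;
    for (std::size_t i = 0; i < expected.size(); ++i)
        if (expected[i] != predicted[i]) ++errors;
    return 100.0f * static_cast<float>(errors)
                  / static_cast<float>(expected.size());
}
```

One wrong answer out of four test samples gives an error of 25%.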

test uses the classify function, which takes the image to classify, processes it, gets the feature vector and classifies it with the find_nearest method of the knn class. This is also the function we use to classify the user's input images:

 
    float basicOCR::classify(IplImage* img, int showResult)
    {
        IplImage prs_image;
        CvMat data;
        CvMat* nearest = cvCreateMat(1, K, CV_32FC1);
        float result;
        // Process file
        prs_image = preprocessing(img, size, size);

        // Set data: convert to 32-bit float and reshape to a row vector
        IplImage* img32 = cvCreateImage(cvSize(size, size), IPL_DEPTH_32F, 1);
        cvConvertScale(&prs_image, img32, 0.0039215, 0);
        cvGetSubRect(img32, &data, cvRect(0, 0, size, size));
        CvMat row_header, *row1;
        row1 = cvReshape(&data, &row_header, 0, 1);

        // nearest receives the labels of the K nearest neighbours
        result = knn->find_nearest(row1, K, 0, 0, nearest, 0);

        // Count how many of the K neighbours agree with the result
        int accuracy = 0;
        for (int i = 0; i < K; i++){
            if (nearest->data.fl[i] == result)
                accuracy++;
        }
        float pre = 100*((float)accuracy/(float)K);
        if (showResult == 1){
            printf("|\t%.0f\t| \t%.2f%%  \t| \t%d of %d \t| \n", result, pre, accuracy, K);
            printf(" ---------------------------------------------------------------\n");
        }

        // Free the temporaries created in this call
        cvReleaseImage(&img32);
        cvReleaseMat(&nearest);

        return result;
    }
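The percentage that classify prints is the share of the K neighbours whose label matches the returned class. A minimal sketch of that calculation outside OpenCV, assuming the neighbour labels have been copied into a vector (`neighborAgreement` is a hypothetical name):

```cpp
#include <cassert>
#include <vector>

// Percentage of neighbour labels equal to the predicted class.
float neighborAgreement(const std::vector<float>& neighborLabels, float predicted)
{
    assert(!neighborLabels.empty());
    int agree = 0;
    for (float label : neighborLabels)
        if (label == predicted) ++agree;
    return 100.0f * static_cast<float>(agree)
                  / static_cast<float>(neighborLabels.size());
}
```

For example, if four of five neighbours carry the predicted label, the agreement is 80%. A low value suggests the sample lies close to the training examples of another digit.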

All the training and test work is in the basicOCR class. Once we create a basicOCR instance, we only need to call the classify function to classify our input image. Then we use the basic Painter we created in an earlier tutorial to let the user interactively draw an image and classify it.
