python数字识别_OpenCV-Python中的简单数字识别OCR

1586010002-jmsa.png

I am trying to implement a "Digit Recognition OCR" in OpenCV-Python (cv2). It is just for learning purposes. I would like to learn both KNearest and SVM features in OpenCV.

I have 100 samples (i.e. images) of each digit. I would like to train with them.

There is a sample letter_recog.py that comes with OpenCV sample. But i still couldn't figure out on how to use it. I don't understand what are the samples, responses etc. Also, it loads a txt file at first, which i didn't understand first.

Later on searching a little bit, i could find a letter_recognition.data in cpp samples. I used it and made a code for cv2.KNearest in the model of letter_recog.py (just for testing):

import numpy as np

import cv2

fn = 'letter-recognition.data'

a = np.loadtxt(fn, np.float32, delimiter=',', converters={ 0 : lambda ch : ord(ch)-ord('A') })

samples, responses = a[:,1:], a[:,0]

model = cv2.KNearest()

retval = model.train(samples,responses)

retval, results, neigh_resp, dists = model.find_nearest(samples, k = 10)

print results.ravel()

It gave me an array of size 20000, i don't understand what it is.

Questions:

1) What is letter_recognition.data file ? How to build that file from my own data set?

2) What does results.reval() denote?

3) How we can write a simple digit recognition tool using letter_recognition.data file (either KNearest or SVM)?

解决方案

Well, I decided to workout myself on my question to solve above problem. What i wanted is to implement a simpl OCR using KNearest or SVM features in OpenCV. And below is what i did and how. ( it is just for learning how to use KNearest for simple OCR purposes).

1) My first question was about letter_recognition.data file that comes with OpenCV samples. I wanted to know what is inside that file.

It contains a letter, along with 16 features of that letter.

And this SOF helped me to find it. These 16 features are explained in the paperLetter Recognition Using Holland-Style Adaptive Classifiers.

( Although i didn't understand some of the features at end)

2) Since i knew, without understanding all those features, it is difficult to do that method. i tried some other papers, but all were a little difficult for a beginner.

So I just decided to take all the pixel values as my features. (I was not worried about accuracy or performance, i just wanted it to work, at least with the least accuracy)

I took below image for my training data:

IwQY6.png

( I know the amount of training data is less. But, since all letters are of same font and size, i decided to try on this).

To prepare the data for training, i made a small code in OpenCV. It does following things:

a) It loads the image.

b) Selects the digits ( obviously by contour finding and applying constraints on area and height of letters to avoid false detections).

c) Draws the bounding rectangle around one letter and wait for key press manually. This time we press the digit key ourselves corresponding to the letter in box.

d) Once corresponding digit key is pressed, it resizes this box to 10x10 and saves 100 pixel values in an array (here, samples) and corresponding manually entered digit in another array(here, responses).

e) Then save both the arrays in separate txt files.

At the end of manual classification of digits, all the digits in the train data( train.png) are labeled manually by ourselves, image will look like below:

jyAhT.png

Below is the code i used for above purpose ( of course, not so clean):

import sys

import numpy as np

import cv2

im = cv2.imread('pitrain.png')

im3 = im.copy()

gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)

blur = cv2.GaussianBlur(gray,(5,5),0)

thresh = cv2.adaptiveThreshold(blur,255,1,1,11,2)

################# Now finding Contours ###################

contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)

samples = np.empty((0,100))

responses = []

keys = [i for i in range(48,58)]

for cnt in contours:

if cv2.contourArea(cnt)>50:

[x,y,w,h] = cv2.boundingRect(cnt)

if h>28:

cv2.rectangle(im,(x,y),(x+w,y+h),(0,0,255),2)

roi = thresh[y:y+h,x:x+w]

roismall = cv2.resize(roi,(10,10))

cv2.imshow('norm',im)

key = cv2.waitKey(0)

if key == 27: # (escape to quit)

sys.exit()

elif key in keys:

responses.append(int(chr(key)))

sample = roismall.reshape((1,100))

samples = np.append(samples,sample,0)

responses = np.array(responses,np.float32)

responses = responses.reshape((responses.size,1))

print "training complete"

np.savetxt('generalsamples.data',samples)

np.savetxt('generalresponses.data',responses)

Now we enter in to training and testing part.

For testing part i used below image, which has same type of letters i used to train.

dPaE8.png

For training we do as follows:

a) Load the txt files we already saved earlier

b) create a instance of classifier we are using ( here, it is KNearest)

c) Then we use KNearest.train function to train the data

For testing purposes, we do as follows:

a) We load the image used for testing

b) process the image as earlier and extract each digit using contour methods

c) Draw bounding box for it, then resize to 10x10, and store its pixel values in an array as done earlier.

d) Then we use KNearest.find_nearest() function to find the nearest item to the one we gave. ( If lucky, it recognises the correct digit.)

I included last two steps ( training and testing) in single code below:

import cv2

import numpy as np

####### training part ###############

samples = np.loadtxt('generalsamples.data',np.float32)

responses = np.loadtxt('generalresponses.data',np.float32)

responses = responses.reshape((responses.size,1))

model = cv2.KNearest()

model.train(samples,responses)

############################# testing part #########################

im = cv2.imread('pi.png')

out = np.zeros(im.shape,np.uint8)

gray = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)

thresh = cv2.adaptiveThreshold(gray,255,1,1,11,2)

contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)

for cnt in contours:

if cv2.contourArea(cnt)>50:

[x,y,w,h] = cv2.boundingRect(cnt)

if h>28:

cv2.rectangle(im,(x,y),(x+w,y+h),(0,255,0),2)

roi = thresh[y:y+h,x:x+w]

roismall = cv2.resize(roi,(10,10))

roismall = roismall.reshape((1,100))

roismall = np.float32(roismall)

retval, results, neigh_resp, dists = model.find_nearest(roismall, k = 1)

string = str(int((results[0][0])))

cv2.putText(out,string,(x,y+h),0,1,(0,255,0))

cv2.imshow('im',im)

cv2.imshow('out',out)

cv2.waitKey(0)

And it worked , below is the result i got:

xS3gF.png

Here it worked with 100% accuracy, for which the reason, i assume, is all digits are of same kind and same size.

But any way, this is a good start to go for beginners ( i hope so).

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值