What technique do you recommend for segmenting the characters in this image so that they can be fed to a model like the ones used with the MNIST dataset, which take one character at a time? This question is separate from how the image should be transformed and binarized.
Thanks!
Solution
As a starting point, I would try the following:
1. Apply an Otsu threshold.
2. Then apply some morphological operations to remove noise and isolate each digit.
3. Run connected component labeling.
4. Feed each connected component to your classifier to recognize the digit; if the classification score is low, discard the component.
5. As a final validation, check that all the digits lie more or less on one line and at a more or less constant distance from each other.
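The final validation step could be sketched roughly as follows. This is only an illustration: the function name `looks_like_digit_row` and the two tolerance parameters are hypothetical, and the boxes are assumed to be `(left, top, width, height)` tuples such as those reported by connected component statistics.

```python
import numpy as np

def looks_like_digit_row(boxes, max_center_dev=0.5, max_gap_dev=0.5):
    """Check that boxes lie roughly on one line with roughly even spacing.

    boxes: iterable of (left, top, width, height) tuples.
    max_center_dev, max_gap_dev: hypothetical tolerances, to be tuned per image.
    """
    # Sort the boxes left to right so neighbours are adjacent in the array
    boxes = np.asarray(sorted(boxes, key=lambda b: b[0]), dtype=float)
    if len(boxes) < 3:
        return True  # too few boxes to judge alignment or spacing
    left, top, width, height = boxes.T

    # On one line: vertical centres stay within a fraction of the median height
    centers_y = top + height / 2.0
    median_height = np.median(height)
    on_line = np.all(
        np.abs(centers_y - np.median(centers_y)) <= max_center_dev * median_height
    )

    # Even spacing: distances between neighbouring horizontal centres
    # stay close to the median distance
    centers_x = left + width / 2.0
    steps = np.diff(centers_x)
    median_step = np.median(steps)
    evenly_spaced = np.all(np.abs(steps - median_step) <= max_gap_dev * median_step)

    return bool(on_line and evenly_spaced)
```

A stricter variant could drop only the offending boxes instead of rejecting the whole row; which is better depends on how noisy the segmentation is.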
Here are the first 4 stages. Now you need to add your recognition software to recognize the digits.
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Params
EPSSILON = 0.4
MIN_AREA = 10
BIG_AREA = 75
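# Read the image as grayscale
img = cv2.imread('i.jpg', cv2.IMREAD_GRAYSCALE)
# Otsu threshold (inverted, so dark pixels become the white foreground)
_, thI = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# Morphological closing
# Note: a 1x1 structuring element leaves the image unchanged; use a larger one to close small gaps
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 1))
thIMor = cv2.morphologyEx(thI, cv2.MORPH_CLOSE, se)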
# Connected component labeling
stats = cv2.connectedComponentsWithStats(thIMor, connectivity=8)
num_labels = stats[0]
labels = stats[1]
labelStats = stats[2]
# We expect the connected components of the digits to have a more or less constant width/height ratio,
# so we take the median ratio over all components, because the majority of them are digits
ratios = []
for label in range(num_labels):
    connectedCompoentWidth = labelStats[label, cv2.CC_STAT_WIDTH]
    connectedCompoentHeight = labelStats[label, cv2.CC_STAT_HEIGHT]
    ratios.append(float(connectedCompoentWidth) / float(connectedCompoentHeight))
# Find the median ratio
medianRatio = np.median(np.asarray(ratios))
# Go over all the connected components again and filter out those whose ratio is far from the median
filterdI = np.zeros_like(thIMor)
filterdI[labels != 0] = 255
for label in range(num_labels):
    # Ignore the biggest component (label 1 in this image)
    if label == 1:
        filterdI[labels == label] = 0
        continue
    connectedCompoentWidth = labelStats[label, cv2.CC_STAT_WIDTH]
    connectedCompoentHeight = labelStats[label, cv2.CC_STAT_HEIGHT]
    ratio = float(connectedCompoentWidth) / float(connectedCompoentHeight)
    if ratio > medianRatio + EPSSILON or ratio < medianRatio - EPSSILON:
        filterdI[labels == label] = 0
    # Filter out components that are too small or too large
    if labelStats[label, cv2.CC_STAT_AREA] < MIN_AREA or labelStats[label, cv2.CC_STAT_AREA] > BIG_AREA:
        filterdI[labels == label] = 0
plt.imshow(filterdI)
# Now go over each of the remaining components and run the digit recognition
stats = cv2.connectedComponentsWithStats(filterdI, connectivity=8)
num_labels = stats[0]
labels = stats[1]
labelStats = stats[2]
# Label 0 is the background, so start from 1
for label in range(1, num_labels):
    # Crop the bounding box around the component
    left = labelStats[label, cv2.CC_STAT_LEFT]
    top = labelStats[label, cv2.CC_STAT_TOP]
    width = labelStats[label, cv2.CC_STAT_WIDTH]
    height = labelStats[label, cv2.CC_STAT_HEIGHT]
    # Keep only the pixels of this component as a binary mask (255 = digit, 0 = background)
    candidateDigit = (labels[top:top+height, left:left+width] == label).astype(np.uint8) * 255
    # plt.figure(label)
    # plt.imshow(candidateDigit)
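Before a crop can go to an MNIST-style classifier, it usually has to be brought into the MNIST layout: a bright digit on a dark background, scaled to fit a 20x20 box and placed in a 28x28 image. Below is a minimal sketch of that step. The function name `to_mnist_input` is hypothetical, the resize is a plain nearest-neighbour resize written in NumPy so the snippet has no other dependency (`cv2.resize` would be the more natural choice next to the code above), and the digit is centered by its bounding box, whereas MNIST itself centers by center of mass.

```python
import numpy as np

def to_mnist_input(crop, size=28, inner=20):
    """Turn a binary crop (digit > 0, background 0) into a size x size float array in [0, 1]."""
    crop = (np.asarray(crop) > 0).astype(np.float32)
    h, w = crop.shape

    # Scale so that the longer side fits the inner box, keeping the aspect ratio
    scale = inner / float(max(h, w))
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))

    # Nearest-neighbour resize via index arithmetic
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    resized = crop[np.ix_(rows, cols)]

    # Paste the resized digit into the middle of an empty canvas
    canvas = np.zeros((size, size), dtype=np.float32)
    top = (size - new_h) // 2
    left = (size - new_w) // 2
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas
```

Inside the loop above this would be called as `to_mnist_input(candidateDigit)`; the result still has to be reshaped to whatever input shape your particular model expects.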