Python image segmentation and reassembly_Python: Image segmentation as preprocessing for classification

What technique do you recommend for segmenting the characters in this image so that they are ready to be fed to a model like the ones used with the MNIST dataset, since such models take one character at a time? This question is regardless of the importance of transforming and binarizing the image.

Thanks!

Solution

As a starting point I would try the following:

1. Apply Otsu thresholding.
2. Then do some morphological operations to get rid of noise and to isolate each digit.
3. Run connected component labeling.
4. Feed each connected component to your classifier to recognize the digit; if the classification score is low, discard it.
5. Final validation: you expect all the digits to lie more or less on one line and at a more or less constant distance from each other.
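The final validation is not covered by the answer's code. As a rough sketch only, the check could look like the following, assuming each surviving component is described by its bounding box `(left, top, width, height)`. The function name `plausible_digit_row` and both tolerance parameters are hypothetical and would need tuning for a real image.

```python
import numpy as np


def plausible_digit_row(boxes, max_center_dev=0.5, max_gap_dev=0.5):
    """Return True if the boxes look like one evenly spaced row of digits.

    boxes: rows whose first four columns are (left, top, width, height),
    e.g. stats[1:] from cv2.connectedComponentsWithStats (background dropped).
    """
    boxes = np.asarray(boxes, dtype=float)
    if len(boxes) < 3:
        return True  # too few boxes to judge alignment or spacing

    boxes = boxes[np.argsort(boxes[:, 0])]  # sort left to right
    left, top, width, height = boxes[:, :4].T

    # Vertical centres should stay within a fraction of the median height.
    centers_y = top + height / 2.0
    median_height = np.median(height)
    on_line = np.all(
        np.abs(centers_y - np.median(centers_y)) <= max_center_dev * median_height
    )

    # Horizontal centre-to-centre steps should be roughly constant.
    steps = np.diff(left + width / 2.0)
    median_step = np.median(steps)
    evenly_spaced = np.all(np.abs(steps - median_step) <= max_gap_dev * median_step)

    return bool(on_line and evenly_spaced)
```

A stricter variant could drop only the offending boxes instead of rejecting the whole row; which behaviour is better depends on how noisy the segmentation is.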

Here is code for the first four stages. You now need to add your own recognition software to classify the digits.

import cv2
import numpy as np
from matplotlib import pyplot as plt

# Params
EPSILON = 0.4   # allowed deviation from the median width/height ratio
MIN_AREA = 10   # components with a smaller area are treated as noise
BIG_AREA = 75   # components with a larger area are treated as non-digits

# Read the image as grayscale
img = cv2.imread('i.jpg', 0)

# Otsu threshold (inverted, so pixels darker than the threshold become 255)
_, thI = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Morphological closing (a 1x1 kernel leaves the image unchanged; enlarge it for noisier input)
se = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (1, 1))
thIMor = cv2.morphologyEx(thI, cv2.MORPH_CLOSE, se)

# Connected component labeling
num_labels, labels, labelStats, _ = cv2.connectedComponentsWithStats(thIMor, connectivity=8)

# We expect the connected components of the digits to have a more or less constant
# width/height ratio, so we take the median ratio over all components, because the
# majority of the connected components are digits.
ratios = []
for label in range(num_labels):
    connectedComponentWidth = labelStats[label, cv2.CC_STAT_WIDTH]
    connectedComponentHeight = labelStats[label, cv2.CC_STAT_HEIGHT]
    ratios.append(float(connectedComponentWidth) / float(connectedComponentHeight))

# Find the median ratio
medianRatio = np.median(np.asarray(ratios))

# Go over all the connected components again and filter out those far from the median ratio
filteredI = np.zeros_like(thIMor)
filteredI[labels != 0] = 255
for label in range(num_labels):
    # Ignore the biggest label (hard-coded as label 1, which is specific to this input image)
    if label == 1:
        filteredI[labels == label] = 0
        continue
    connectedComponentWidth = labelStats[label, cv2.CC_STAT_WIDTH]
    connectedComponentHeight = labelStats[label, cv2.CC_STAT_HEIGHT]
    ratio = float(connectedComponentWidth) / float(connectedComponentHeight)
    if ratio > medianRatio + EPSILON or ratio < medianRatio - EPSILON:
        filteredI[labels == label] = 0
    # Filter out components that are too small or too large
    area = labelStats[label, cv2.CC_STAT_AREA]
    if area < MIN_AREA or area > BIG_AREA:
        filteredI[labels == label] = 0

plt.imshow(filteredI)
plt.show()

# Now go over each of the remaining components and run the digit recognition
num_labels, labels, labelStats, _ = cv2.connectedComponentsWithStats(filteredI, connectivity=8)
for label in range(1, num_labels):  # label 0 is the background, so skip it
    # Crop the bounding box around the component
    left = labelStats[label, cv2.CC_STAT_LEFT]
    top = labelStats[label, cv2.CC_STAT_TOP]
    width = labelStats[label, cv2.CC_STAT_WIDTH]
    height = labelStats[label, cv2.CC_STAT_HEIGHT]
    # Keep only this component's pixels, as a 0/255 mask
    candidateDigit = (labels[top:top + height, left:left + width] == label).astype(np.uint8) * 255
    # plt.figure(label)
    # plt.imshow(candidateDigit)
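The recognition step itself is left open above. As a hedged sketch of how a cropped component might be prepared for an MNIST-style classifier: MNIST digits are 28x28 images in which the digit is size-normalized into a 20x20 box, so the crop is scaled to fit that box and placed on a 28x28 canvas. The helper names `resize_nearest`, `to_mnist_format` and `accept_digit`, and the `min_score` threshold, are hypothetical. The resize is a plain NumPy nearest-neighbour version to keep the sketch dependency-free (in an OpenCV pipeline `cv2.resize` would be the usual choice), and the digit is centred by its bounding box rather than by centre of mass as in MNIST.

```python
import numpy as np


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a 2-D array using index arithmetic only."""
    h, w = img.shape
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[rows[:, None], cols]


def to_mnist_format(crop, box=20, canvas=28):
    """Fit a binary crop (digit = 255, background = 0) into a box x box area,
    keeping the aspect ratio, and centre it on a canvas x canvas image
    with values in [0, 1]."""
    h, w = crop.shape
    scale = box / max(h, w)
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    small = resize_nearest(crop, new_h, new_w)

    out = np.zeros((canvas, canvas), dtype=np.float32)
    top = (canvas - new_h) // 2
    left = (canvas - new_w) // 2
    out[top:top + new_h, left:left + new_w] = small / 255.0
    return out


def accept_digit(probabilities, min_score=0.8):
    """Return the predicted digit, or None when the top score is too low."""
    probabilities = np.asarray(probabilities)
    best = int(np.argmax(probabilities))
    return best if probabilities[best] >= min_score else None
```

In the loop above, `to_mnist_format(candidateDigit)` would produce the classifier input, and `accept_digit` would be applied to whatever class probabilities your model returns for it.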

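Image segmentation divides an image into several non-overlapping regions, where the pixels within each region share similar features such as color, texture, and brightness. Segmentation can serve as a preprocessing step for classification: more accurate features can be extracted from the segmented regions, which can improve classification accuracy.

In Python, image segmentation can be done with the OpenCV library. The following is a simple example of image segmentation and reassembly:

```python
import cv2
import numpy as np

# Read the image
img = cv2.imread('test.jpg')

# Segment the image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
kernel = np.ones((3, 3), np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=2)
sure_bg = cv2.dilate(opening, kernel, iterations=3)
dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
ret, sure_fg = cv2.threshold(dist_transform, 0.7 * dist_transform.max(), 255, 0)
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg, sure_fg)
ret, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1          # background becomes 1 instead of 0
markers[unknown == 255] = 0    # the unknown region is marked with 0
markers = cv2.watershed(img, markers)
img[markers == -1] = [255, 0, 0]  # draw the watershed boundaries

# Show the segmentation result
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
```

The code above binarizes the image and applies morphological operations and a distance transform, then uses the watershed algorithm to produce the final segmented image. Adjust and tune it to suit your needs.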
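To use such a segmentation as classifier preprocessing, each region still has to be cut out of the image. A minimal sketch, assuming the marker conventions of the watershed snippet above (boundaries are -1, the background is label 1, objects are labels 2 and up); the function name `crop_regions` is hypothetical:

```python
import numpy as np


def crop_regions(image, markers, background_label=1):
    """Return {label: crop} with the bounding-box crop of every object region."""
    crops = {}
    for label in np.unique(markers):
        if label <= background_label:
            continue  # skip boundaries (-1), unknown (0) and background (1)
        ys, xs = np.nonzero(markers == label)
        top, bottom = ys.min(), ys.max() + 1
        left, right = xs.min(), xs.max() + 1
        crops[int(label)] = image[top:bottom, left:right]
    return crops
```

Each crop can then be resized and passed to a feature extractor or classifier one region at a time.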
