python-opencv image segmentation steps: How do I split an image into clean paragraphs in Python/OpenCV?

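This post describes a way to split an image into clean paragraphs with Python and OpenCV. A horizontal blur followed by thresholding locates the text lines; the left edge of each line is then used to spot indented lines, which mark the paragraph boundaries. Finally, each paragraph is outlined with a rectangle and the result is saved.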

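This is specific to the paragraph structure in the attached image. I'm not sure whether you need a more general solution, but that would probably take additional work:

import cv2
import numpy as np

# load the page as a single-channel grayscale image
image = cv2.imread('paragraphs.png', cv2.IMREAD_GRAYSCALE)
if image is None:
    raise FileNotFoundError('paragraphs.png could not be read')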

# find lines by horizontally blurring the image and thresholding
blur = cv2.blur(image, (91, 9))
# mean brightness of each image row, scaled to roughly [0, 1)
b_mean = np.mean(blur, axis=1) / 256

# hist, bin_edges = np.histogram(b_mean, bins=100)
# threshold = bin_edges[66]
threshold = np.percentile(b_mean, 66)
# True for bright rows (whitespace), False for dark rows (text)
t = b_mean > threshold

# Get the image row numbers that contain text. A text line is a
# consecutive group of image rows that are not above the threshold
# (dark rows); it is defined by its first and last row numbers.
tix = np.where(~t)[0]

lines = []
start_ix = tix[0]
for ix in range(1, tix.shape[0]):
    if tix[ix] == tix[ix-1] + 1:
        continue
    # identified gap between lines, close previous line and start a new one
    end_ix = tix[ix-1]
    lines.append([start_ix, end_ix])
    start_ix = tix[ix]
end_ix = tix[-1]
lines.append([start_ix, end_ix])

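# for each text line, find the first column that contains dark pixels
l_starts = []
max_x = min(500, image.shape[1])
for line in lines:
    xx = max_x
    for x in range(max_x):
        # line[1] is the last row of the line, so include it in the slice
        col = image[line[0]:line[1] + 1, x]
        if np.min(col) < 64:
            xx = x
            break
    l_starts.append(xx)

median_ls = np.median(l_starts)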

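# a line that starts well to the right of the median is treated as
# indented, i.e. as the first line of a new paragraph
paragraphs = []
p_start = lines[0][0]
for ix in range(1, len(lines)):
    if l_starts[ix] > median_ls * 2:
        p_end = lines[ix][0] - 10
        paragraphs.append([p_start, p_end])
        p_start = lines[ix][0]
# close the final paragraph, which the loop itself never appends
paragraphs.append([p_start, lines[-1][1]])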

# draw a rectangle around each paragraph on a copy of the image and save it
p_img = np.array(image)
n_cols = p_img.shape[1]
for paragraph in paragraphs:
    cv2.rectangle(p_img, (5, paragraph[0]), (n_cols - 5, paragraph[1]), (128, 128, 0), 5)

cv2.imwrite('paragraphs_out.png', p_img)
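
The step that groups consecutive text rows into lines can also be written without an explicit loop. The following is only a minimal sketch, assuming the input is a sorted array of integer row indices such as the one `np.where` returns; `group_runs` is a hypothetical helper name and is not part of the script above.

```python
import numpy as np

def group_runs(rows):
    """Group sorted integer row indices into inclusive (first, last) runs."""
    rows = np.asarray(rows)
    if rows.size == 0:
        return []
    # positions where the next index is not the previous index plus one
    breaks = np.flatnonzero(np.diff(rows) != 1)
    starts = np.concatenate(([0], breaks + 1))
    ends = np.concatenate((breaks, [rows.size - 1]))
    return [(int(rows[s]), int(rows[e])) for s, e in zip(starts, ends)]

# toy row profile: True marks bright (blank) rows, False marks text rows
t = np.array([True, False, False, True, True, False, False, False, True, False])
print(group_runs(np.where(~t)[0]))  # [(1, 2), (5, 7), (9, 9)]
```

Each tuple plays the same role as a `[start_ix, end_ix]` entry in `lines`, so the rest of the script could probably consume the result unchanged.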

Input/output

7IbOy.png
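
The script only outlines the paragraphs. If separate images are wanted instead, the row ranges can be sliced out of the page directly. This is a sketch under assumptions: the ranges are taken to be `[first_row, last_row]` pairs with inclusive rows, and `crop_paragraphs` and its `margin` parameter are hypothetical names introduced here for illustration.

```python
import numpy as np

def crop_paragraphs(image, paragraphs, margin=0):
    """Return one sub-image per [first_row, last_row] pair (rows inclusive)."""
    n_rows = image.shape[0]
    crops = []
    for first, last in paragraphs:
        top = max(0, first - margin)
        bottom = min(n_rows, last + 1 + margin)
        if bottom > top:
            # copy so that later edits to a crop do not touch the source image
            crops.append(image[top:bottom, :].copy())
    return crops

# toy 20x8 "page" with two paragraph row ranges
page = np.full((20, 8), 255, dtype=np.uint8)
crops = crop_paragraphs(page, [[2, 7], [11, 18]], margin=1)
print([c.shape for c in crops])  # [(8, 8), (10, 8)]
```

Each crop could then be saved on its own, for example with `cv2.imwrite(f'paragraph_{i}.png', crop)` inside an `enumerate` loop.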
