Text Detection in Practice: Implementing Text Detection with OpenCV (EAST Text Detector) (4)

Latest recommended article published on 2024-05-05 04:15:49

2401_84008965


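Category: Programmer
Article tags: opencv, artificial intelligence, computer vision

Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.

Article link: https://blog.csdn.net/2401_84008965/article/details/138093492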

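This article explains in detail how to use OpenCV's EAST text detector in Python to detect text regions automatically, covering model loading, image preprocessing, bounding box extraction, non-maximum suppression, and application to images and video.

Summary generated automatically by CSDN.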

end = time.time()

# show timing information on text prediction
print("[INFO] text detection took {:.6f} seconds".format(end - start))

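We load the neural network into memory with cv2.dnn.readNet by passing it the path to the EAST detector.

We then prepare our image by converting it to a blob. To read more about this step, see Deep learning: How OpenCV's blobFromImage works. To predict text, we simply set the blob as the input and call net.forward. These lines are wrapped in timestamps so that we can print the elapsed time. By passing layerNames as an argument to net.forward, we instruct OpenCV to return the two feature maps we are interested in:

The output geometry map, used to derive the bounding box coordinates of text in the input image
The scores map, which contains the probability that a given region contains text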
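
The load, blob, and forward steps described above can be sketched roughly as follows. This is a hedged illustration, not the article's exact code: the function name run_east is hypothetical, and the two layer names and the mean-subtraction values are the ones commonly used with the pre-trained EAST graph; they are assumptions here because the corresponding lines are not shown in this part of the article.

```python
import time

# Output layers of the pre-trained EAST graph (assumed, not shown in the text):
# the first yields the scores map, the second the geometry map.
LAYER_NAMES = [
    "feature_fusion/Conv_7/Sigmoid",
    "feature_fusion/concat_3",
]


def run_east(image, model_path, layer_names=None):
    """Run one EAST forward pass and return (scores, geometry, seconds)."""
    # Imported here so that this sketch can be loaded without OpenCV installed.
    import cv2

    if layer_names is None:
        layer_names = LAYER_NAMES

    # The caller is assumed to have already resized the image to dimensions
    # the network accepts (multiples of 32 for EAST).
    (H, W) = image.shape[:2]

    # Load the network and convert the image to a blob.
    net = cv2.dnn.readNet(model_path)
    blob = cv2.dnn.blobFromImage(
        image, 1.0, (W, H), (123.68, 116.78, 103.94), swapRB=True, crop=False
    )

    # Time only the forward pass, as the surrounding text describes.
    start = time.time()
    net.setInput(blob)
    (scores, geometry) = net.forward(layer_names)
    end = time.time()

    return scores, geometry, end - start
```

A call would look like run_east(image, path_to_east_model), where path_to_east_model points at the frozen EAST model file on disk.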

We need to loop over each of these values, one at a time:

# grab the number of rows and columns from the scores volume, then
# initialize our set of bounding box rectangles and corresponding
# confidence scores
(numRows, numCols) = scores.shape[2:4]
rects = []
confidences = []

# loop over the number of rows
for y in range(0, numRows):
    # extract the scores (probabilities), followed by the geometrical
    # data used to derive potential bounding box coordinates that
    # surround text
    scoresData = scores[0, 0, y]
    xData0 = geometry[0, 0, y]
    xData1 = geometry[0, 1, y]
    xData2 = geometry[0, 2, y]
    xData3 = geometry[0, 3, y]
    anglesData = geometry[0, 4, y]

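We start by grabbing the dimensions of the scores volume, then initialize two lists:

rects: stores the bounding box (x, y)-coordinates of the text regions
confidences: stores the probability associated with each bounding box in rects

We will later apply non-maximum suppression to these regions. We loop over the rows and extract the scores and geometry data for the current row y. Next, we loop over each column index of the currently selected row: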

    # loop over the number of columns
    for x in range(0, numCols):
        # if our score does not have sufficient probability, ignore it
        if scoresData[x] < args["min_confidence"]:
            continue

        # compute the offset factor as our resulting feature maps will
        # be 4x smaller than the input image
        (offsetX, offsetY) = (x * 4.0, y * 4.0)

        # extract the rotation angle for the prediction and then
        # compute the sin and cosine
        angle = anglesData[x]
        cos = np.cos(angle)
        sin = np.sin(angle)

        # use the geometry volume to derive the width and height of
        # the bounding box
        h = xData0[x] + xData2[x]
        w = xData1[x] + xData3[x]

        # compute both the starting and ending (x, y)-coordinates for
        # the text prediction bounding box
        endX = int(offsetX + (cos * xData1[x]) + (sin * xData2[x]))
        endY = int(offsetY - (sin * xData1[x]) + (cos * xData2[x]))
        startX = int(endX - w)
        startY = int(endY - h)

        # add the bounding box coordinates and probability score to
        # our respective lists
        rects.append((startX, startY, endX, endY))
        confidences.append(scoresData[x])

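For each row, we loop over the columns. We filter out weak text detections by ignoring regions whose probability is not high enough.

The EAST text detector naturally reduces the volume size as the image passes through the network: the output feature maps are 4x smaller than the input image, so we multiply by 4 to bring the coordinates back into the input image's space.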
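
To make the arithmetic concrete, here is a small worked sketch of the same computation for a single feature-map cell. The function name decode_cell and its parameter names are hypothetical; reading the four geometry channels as distances to the top, right, bottom, and left edges is an interpretation that matches how the code sums them into a height and a width.

```python
import math


def decode_cell(x, y, top, right, bottom, left, angle, stride=4.0):
    """Turn one feature-map cell into (startX, startY, endX, endY)."""
    # Map the cell back to input-image coordinates (the feature map is
    # 4x smaller than the input, hence the default stride of 4).
    offset_x, offset_y = x * stride, y * stride

    cos, sin = math.cos(angle), math.sin(angle)

    # Height and width are the sums of the opposing edge distances.
    h = top + bottom
    w = right + left

    # Same formulas as in the loop above, written for one cell.
    end_x = int(offset_x + cos * right + sin * bottom)
    end_y = int(offset_y - sin * right + cos * bottom)
    return (int(end_x - w), int(end_y - h), end_x, end_y)


# Cell (10, 5) maps to pixel offset (40, 20) in the input image.
box = decode_cell(x=10, y=5, top=8.0, right=20.0, bottom=12.0, left=30.0, angle=0.0)
print(box)  # (10, 12, 60, 32)
```

With a zero angle the box is axis-aligned: 50 pixels wide, 20 pixels tall, and ending at (60, 32).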

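We extract the angle data, derive the width and height of the bounding box from the geometry volume, and compute its starting and ending coordinates. We then update our rects and confidences lists, respectively. We are almost done! The final step is to apply non-maximum suppression to our bounding boxes to suppress weak, overlapping boxes, and then display the resulting text predictions: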

# apply non-maxima suppression to suppress weak, overl
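
As an illustration of what non-maximum suppression does with the rects and confidences lists, here is a minimal greedy sketch in plain Python. It is a generic IoU-based variant written for clarity, not necessarily the routine this article uses; the function names and the 0.3 overlap threshold are assumptions.

```python
def iou(a, b):
    """Intersection over union of two (startX, startY, endX, endY) boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(rects, confidences, iou_threshold=0.3):
    """Keep the highest-scoring boxes, dropping boxes that overlap a kept one."""
    # Visit boxes from most to least confident.
    order = sorted(range(len(rects)), key=lambda i: confidences[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(rects[i], rects[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return [rects[i] for i in keep]


rects = [(10, 10, 60, 30), (12, 11, 62, 31), (100, 100, 150, 120)]
confidences = [0.9, 0.8, 0.95]
print(non_max_suppression(rects, confidences))
# [(100, 100, 150, 120), (10, 10, 60, 30)]
```

The second box overlaps the first almost completely and has a lower score, so it is suppressed; the third box is far away and survives.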
