Text Detection in Practice: Implementing Text Detection with OpenCV (EAST Text Detector) (2)

Latest recommended article published on 2024-08-16 09:28:22

2401_84008929

Category: Programmers. Article tags: opencv, artificial intelligence, computer vision

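Copyright notice: This is an original article by the blog author, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.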

Article link: https://blog.csdn.net/2401_84008929/article/details/138093445

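We load our input image and make a copy of it. We then compute the ratio between the original image dimensions and the new dimensions (taken from the --width and --height command-line arguments) and resize the image, ignoring the aspect ratio. To perform text detection with OpenCV and the EAST deep learning model, we need to extract the output feature maps of two layers: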

# define the two output layer names for the EAST detector model that
# we are interested in -- the first is the output probabilities and the
# second can be used to derive the bounding box coordinates of text
layerNames = [
    "feature_fusion/Conv_7/Sigmoid",
    "feature_fusion/concat_3"]

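We build a layerNames list:

The first layer is our output sigmoid activation, which gives us the probability that a region contains text.

The second layer is the output feature map that represents the "geometry" of the image. We will use this geometry to derive the bounding box coordinates of the text in the input image.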
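The resizing step described at the start of this section can be sketched in a few lines. This is only an illustration: the helper name compute_resize_ratios and the example dimensions are hypothetical, and the multiple-of-32 check reflects a commonly stated requirement of the EAST model rather than anything shown in this part of the article.

```python
def compute_resize_ratios(orig_w, orig_h, new_w, new_h):
    """Return (rW, rH), the factors that map coordinates in the
    resized image back to the original image."""
    # Assumption: EAST is usually run on inputs whose sides are
    # multiples of 32, so we reject anything else early.
    if new_w % 32 != 0 or new_h % 32 != 0:
        raise ValueError("new_w and new_h should be multiples of 32")
    rW = orig_w / float(new_w)
    rH = orig_h / float(new_h)
    return rW, rH


# Hypothetical example: a 1280x720 image resized to 320x320
rW, rH = compute_resize_ratios(orig_w=1280, orig_h=720, new_w=320, new_h=320)
print(rW, rH)  # 4.0 2.25
```

Because the aspect ratio is ignored during resizing, the two ratios generally differ, which is why they are kept separately.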

Let's load OpenCV's EAST text detector:

# load the pre-trained EAST text detector
print("[INFO] loading EAST text detector...")
net = cv2.dnn.readNet(args["east"])

# construct a blob from the image and then perform a forward pass of
# the model to obtain the two output layer sets
blob = cv2.dnn.blobFromImage(image, 1.0, (W, H),
    (123.68, 116.78, 103.94), swapRB=True, crop=False)
start = time.time()
net.setInput(blob)
(scores, geometry) = net.forward(layerNames)
end = time.time()

# show timing information on text prediction
print("[INFO] text detection took {:.6f} seconds".format(end - start))

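We load the neural network into memory with cv2.dnn.readNet by passing it the path to the EAST detector.

We then prepare our image by converting it to a blob. To read more about this step, see "Deep learning: How OpenCV's blobFromImage works". To predict text, we simply set the blob as the input and call net.forward. These calls are wrapped in timestamps so that we can print the elapsed time. By supplying layerNames as an argument to net.forward, we instruct OpenCV to return the two feature maps we are interested in: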

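The output geometry map, used to derive the bounding box coordinates of text in the input image
The scores map, containing the probability that a given region contains text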
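To make the blob preparation step more concrete, here is a rough NumPy approximation of what cv2.dnn.blobFromImage does with the arguments used above (scale factor 1.0, swapRB=True, crop=False). It is a sketch, not OpenCV's implementation: the helper name make_blob is hypothetical, and it assumes the image has already been resized to (W, H) and that the mean values are given in R, G, B order.

```python
import numpy as np


def make_blob(image_bgr, mean_rgb=(123.68, 116.78, 103.94)):
    """Approximate blobFromImage for an image already at the target size."""
    # swapRB=True: reorder the channels from BGR to RGB
    rgb = image_bgr[:, :, ::-1].astype(np.float32)
    # subtract the per-channel mean (assumed to be in R, G, B order)
    rgb -= np.array(mean_rgb, dtype=np.float32)
    # rearrange height x width x channels into batch x channels x height x width
    blob = rgb.transpose(2, 0, 1)[np.newaxis, ...]
    return np.ascontiguousarray(blob)


# Synthetic gray image standing in for the resized input
image = np.full((320, 320, 3), 128, dtype=np.uint8)
blob = make_blob(image)
print(blob.shape)  # (1, 3, 320, 320)
```

The resulting 4-D array is the layout the network expects as input, which is why the scores and geometry outputs are also 4-D volumes.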

We need to loop over the values in the scores and geometry maps one by one:

# grab the number of rows and columns from the scores volume, then
# initialize our set of bounding box rectangles and corresponding
# confidence scores
(numRows, numCols) = scores.shape[2:4]
rects = []
confidences = []

# loop over the number of rows
for y in range(0, numRows):
    # extract the scores (probabilities), followed by the geometrical
    # data used to derive potential bounding box coordinates that
    # surround text
    scoresData = scores[0, 0, y]
    xData0 = geometry[0, 0, y]
    xData1 = geometry[0, 1, y]
    xData2 = geometry[0, 2, y]
    xData3 = geometry[0, 3, y]
    anglesData = geometry[0, 4, y]

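We first grab the dimensions of the scores volume, then initialize two lists:

rects: stores the bounding box (x, y)-coordinates of the text regions
confidences: stores the probability associated with each bounding box in rects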
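The indexing used in the row loop is easier to follow with concrete shapes. The sketch below uses zero-filled arrays as stand-ins for the network output. The shapes are an assumption based on a 320x320 input and feature maps that are 4x smaller than the input: one score channel and five geometry channels.

```python
import numpy as np

# Assumed input size; the feature maps are taken to be 4x smaller
H, W = 320, 320
scores = np.zeros((1, 1, H // 4, W // 4), dtype=np.float32)
geometry = np.zeros((1, 5, H // 4, W // 4), dtype=np.float32)

# Axes 2 and 3 hold the rows and columns of the feature map
(numRows, numCols) = scores.shape[2:4]
print(numRows, numCols)  # 80 80

# Indexing with [0, channel, y] yields one row as a 1-D array,
# with one entry per column
y = 10
scoresData = scores[0, 0, y]
anglesData = geometry[0, 4, y]
print(scoresData.shape, anglesData.shape)  # (80,) (80,)
```

Each row slice is then scanned column by column, so scoresData[x] is the text probability for the cell at row y, column x.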

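We will apply non-maximum suppression to these regions later. We loop over the rows and, for the current row y, extract the scores and geometry data. Next, we loop over every column index of the currently selected row: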

    # loop over the number of columns
    for x in range(0, numCols):
        # if our score does not have sufficient probability, ignore it
        if scoresData[x] < args["min_confidence"]:
            continue

        # compute the offset factor as our resulting feature maps will
        # be 4x smaller than the input image
        (offsetX, offsetY) = (x * 4.0, y * 4.0)

        # extract the rotation angle for the prediction and th
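The confidence filter and the offset computation can also be written in vectorized form. The following is a minimal sketch rather than the article's code: the function name confident_cells, the default threshold of 0.5, and the synthetic scores volume are all assumptions made for illustration. It keeps the same rule as the loop (cells scoring below the threshold are skipped) and the same 4x offset.

```python
import numpy as np


def confident_cells(scores, min_confidence=0.5, stride=4.0):
    """Return (offsetX, offsetY, score) for each feature-map cell
    whose score is at or above min_confidence."""
    score_map = scores[0, 0]
    # row (y) and column (x) indices of the cells that pass the filter
    ys, xs = np.where(score_map >= min_confidence)
    # scale feature-map indices up to input-image coordinates
    return [(float(x * stride), float(y * stride), float(score_map[y, x]))
            for y, x in zip(ys, xs)]


# Synthetic scores volume with one confident cell and one weak cell
scores = np.zeros((1, 1, 80, 80), dtype=np.float32)
scores[0, 0, 10, 20] = 0.9
scores[0, 0, 30, 5] = 0.3
cells = confident_cells(scores)
print(len(cells), cells[0][:2])  # 1 (80.0, 40.0)
```

Only the cell at row 10, column 20 survives, and its offsets are 20 * 4 = 80 horizontally and 10 * 4 = 40 vertically.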
