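Requirement
Use PaddleOCR to detect the text on a train ticket and draw a box around each recognized text region. The ticket looks like this: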
!pip install paddlehub==2.2.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
!hub install chinese_ocr_db_crnn_server==1.0.0
!pip install shapely
!pip install pyclipper
import paddlehub as hub
import cv2
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
img_list = ['ticket.jpg']
ocr = hub.Module(name="chinese_ocr_db_crnn_server")
result = ocr.recognize_text(paths=img_list)  # passing image file paths this way also works
print(result)
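The output is a JSON-like structure (a Python list of dicts), printed after a series of deprecation warnings: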
[2024-10-05 18:13:48,827] [ WARNING] - The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object
[2024-10-05 18:13:49,237] [ WARNING] - The _initialize method in HubModule will soon be deprecated, you can use the __init__() to handle the initialization of the object
/home/aistudio/.paddlehub/modules/chinese_text_detection_db_server/processor.py:171: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
xmin = np.clip(np.floor(box[:, 0].min()).astype(np.int), 0, w - 1)
/home/aistudio/.paddlehub/modules/chinese_text_detection_db_server/processor.py:172: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
xmax = np.clip(np.ceil(box[:, 0].max()).astype(np.int), 0, w - 1)
/home/aistudio/.paddlehub/modules/chinese_text_detection_db_server/processor.py:173: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ymin = np.clip(np.floor(box[:, 1].min()).astype(np.int), 0, h - 1)
/home/aistudio/.paddlehub/modules/chinese_text_detection_db_server/processor.py:174: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
ymax = np.clip(np.ceil(box[:, 1].max()).astype(np.int), 0, h - 1)
/home/aistudio/.paddlehub/modules/chinese_text_detection_db_server/module.py:212: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
res['data'] = boxes.astype(np.int).tolist()
/home/aistudio/.paddlehub/modules/chinese_ocr_db_crnn_server/module.py:226: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
boxes[index].astype(np.int).tolist()
[{'save_path': '', 'data': [
{'text': 'Y082619', 'confidence': 0.9972462058067322, 'text_box_position': [[89, 102], [229, 105], [229, 132], [89, 129]]},
{'text': '天津', 'confidence': 0.923931360244751, 'text_box_position': [[551, 109], [612, 111], [611, 144], [550, 142]]},
{'text': '售', 'confidence': 0.9392770528793335, 'text_box_position': [[622, 113], [654, 113], [654, 145], [622, 145]]},
{'text': '2016年05月16日08:03开', 'confidence': 0.9847583770751953, 'text_box_position': [[86, 142], [414, 149], [414, 173], [86, 166]]},
{'text': '12车034号', 'confidence': 0.9958025217056274, 'text_box_position': [[516, 150], [655, 154], [654, 182], [515, 177]]},
{'text': 'G177次', 'confidence': 0.9930286407470703, 'text_box_position': [[315, 184], [433, 187], [432, 221], [314, 218]]},
{'text': '二等座', 'confidence': 0.9949294924736023, 'text_box_position': [[579, 180], [657, 186], [655, 214], [577, 207]]},
{'text': '天津', 'confidence': 0.9746274948120117, 'text_box_position': [[135, 195], [233, 195], [233, 238], [135, 238]]},
{'text': '青岛', 'confidence': 0.9864411354064941, 'text_box_position': [[475, 197], [574, 203], [572, 244], [473, 239]]},
{'text': 'TianJin', 'confidence': 0.9756895899772644, 'text_box_position': [[135, 244], [240, 248], [239, 275], [134, 272]]},
{'text': 'QingDao', 'confidence': 0.9983763694763184, 'text_box_position': [[480, 250], [588, 254], [587, 285], [479, 281]]},
{'text': '¥259.00元', 'confidence': 0.9569356441497803, 'text_box_position': [[83, 277], [259, 282], [258, 316], [82, 311]]},
{'text': '限乘当日当次车', 'confidence': 0.976534903049469, 'text_box_position': [[74, 323], [261, 327], [261, 354], [74, 351]]},
{'text': '张伟', 'confidence': 0.9746922850608826, 'text_box_position': [[73, 362], [134, 362], [134, 394], [73, 394]]},
{'text': '2323241981***+30IX', 'confidence': 0.8457385897636414, 'text_box_position': [[82, 397], [347, 399], [347, 426], [82, 424]]},
{'text': '30671211920Y082619', 'confidence': 0.9982701539993286, 'text_box_position': [[80, 439], [325, 440], [325, 466], [80, 465]]}
]}]
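The printed result is a list with one dict per input image. Each dict has a 'data' list, and every entry in it carries 'text', 'confidence' and 'text_box_position' (four corner points). As a minimal sketch of working with that structure, the snippet below keeps only the entries above a confidence threshold. The helper name `filter_by_confidence` and the 0.9 threshold are assumptions made for illustration and are not part of PaddleHub; the sample data is three entries copied from the output above.

```python
# Three entries copied from the printed output above (the rest are omitted).
sample_result = [{
    'save_path': '',
    'data': [
        {'text': 'Y082619', 'confidence': 0.9972462058067322,
         'text_box_position': [[89, 102], [229, 105], [229, 132], [89, 129]]},
        {'text': 'G177次', 'confidence': 0.9930286407470703,
         'text_box_position': [[315, 184], [433, 187], [432, 221], [314, 218]]},
        {'text': '2323241981***+30IX', 'confidence': 0.8457385897636414,
         'text_box_position': [[82, 397], [347, 399], [347, 426], [82, 424]]},
    ],
}]


def filter_by_confidence(result, min_confidence=0.9):
    """Hypothetical helper: return (text, box) pairs at or above min_confidence."""
    kept = []
    for image_result in result:              # one dict per input image
        for entry in image_result['data']:   # one dict per recognized text line
            if entry['confidence'] >= min_confidence:
                kept.append((entry['text'], entry['text_box_position']))
    return kept


for text, box in filter_by_confidence(sample_result):
    print(text, box)
```

With the sample above, the partially masked ID number (confidence about 0.85) is dropped, while the ticket number and train number are kept.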
Extract the relevant fields from the returned JSON object
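result_dict = dict(result[0])  # result for the first (and only) image
rects = []
words = result_dict["data"]
for word in words:  # each word is a dict describing one recognized text line
    print(word["text"])
    print(word["confidence"])
    points = word["text_box_position"]
    # print(points)
    rects.append(points)  # collect the four corner points of this box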
# Draw the boxes on the original image
im = cv2.imread(img_list[0], 1)
for rect in rects:
    # The four corners of the detected quadrilateral
    p1 = (rect[0][0], rect[0][1])
    p2 = (rect[1][0], rect[1][1])
    p3 = (rect[2][0], rect[2][1])
    p4 = (rect[3][0], rect[3][1])
    # Connect the corners with red lines (BGR color order), 2 px thick
    cv2.line(im, p1, p2, (0, 0, 255), 2)
    cv2.line(im, p2, p3, (0, 0, 255), 2)
    cv2.line(im, p3, p4, (0, 0, 255), 2)
    cv2.line(im, p4, p1, (0, 0, 255), 2)
cv2.imwrite("ticket_result.jpg", im)
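The detected boxes are slightly skewed quadrilaterals, which is why the code above connects the four corners with separate lines. If an axis-aligned rectangle is good enough, the corners can first be reduced to a bounding rectangle. The following is only a sketch: `bounding_rect` is a hypothetical helper, not something provided by PaddleHub or OpenCV, and the sample box is the first one from the OCR output.

```python
def bounding_rect(points):
    """Hypothetical helper: return (xmin, ymin, xmax, ymax) enclosing [x, y] points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


# First box from the OCR output (the text 'Y082619').
quad = [[89, 102], [229, 105], [229, 132], [89, 129]]
xmin, ymin, xmax, ymax = bounding_rect(quad)
print((xmin, ymin), (xmax, ymax))  # (89, 102) (229, 132)
```

The two corners could then be passed to a single `cv2.rectangle(im, (xmin, ymin), (xmax, ymax), (0, 0, 255), 2)` call in place of the four `cv2.line` calls, at the cost of a box that no longer follows the tilt of the text.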
Result (the annotated image is saved as ticket_result.jpg):