Tencent (Table OCR) API Call Workflow

憨憨师兄

Last modified: 2022-10-25 09:32:16


First published: 2022-10-24 18:16:09

Article link: https://blog.csdn.net/weixin_48568323/article/details/127497239


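1. Pricing:

Once the recognition service is activated, you receive a free resource pack of 1,000 calls per month; for newly activated users it is credited to the account automatically on the day of activation.
Usage beyond the free quota (resource packs): 0~1,000 calls --> 120 CNY; 0~10,000 calls --> 800 CNY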
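As a rough sketch of what these figures mean per call, the snippet below computes how many calls fall outside the free quota and the per-call price of each pack. It assumes the two figures are prepaid packs of 1,000 and 10,000 calls, which is my reading of the line above; the names `billable_calls` and `unit_price` are made up for illustration, and the official billing page remains the authority.

```python
# Figures taken from the pricing notes above; the pack interpretation is an assumption.
FREE_CALLS_PER_MONTH = 1000
PACKS = {1000: 120, 10000: 800}  # pack size (calls) -> price in CNY


def billable_calls(monthly_calls):
    """Calls that are not covered by the monthly free quota."""
    return max(0, monthly_calls - FREE_CALLS_PER_MONTH)


def unit_price(pack_calls):
    """Price per call (CNY) for a pack of the given size."""
    return PACKS[pack_calls] / pack_calls


if __name__ == "__main__":
    print(billable_calls(2500))
    for size in sorted(PACKS):
        print(size, unit_price(size))
```

Under that assumption the larger pack is cheaper per call, so it is only worth buying if you expect to use most of it.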

Official link: OCR Billing Overview, Purchase Guide, Documentation Center, Tencent Cloud (tencent.com)

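2. Call workflow

1) Register a Tencent Cloud account

Tencent Cloud link: Tencent Cloud home page (tencent.com)

2) Activate the OCR service

2.1) Open the OCR console: https://console.cloud.tencent.com/ocr/overview

2.2) Read the OCR Terms of Service, tick the box to agree, and click Activate Now to enable the service in one step

3) Obtain the SecretId and SecretKey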
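Once you have the key pair, it is safer to keep it out of the source file. Here is a minimal sketch that reads it from environment variables instead. The variable names `TENCENTCLOUD_SECRET_ID` and `TENCENTCLOUD_SECRET_KEY` and the helper `load_credentials` are assumptions chosen for this example, not something the SDK requires.

```python
import os


def load_credentials(env=os.environ):
    """Return (secret_id, secret_key) read from environment variables.

    The variable names are an assumption made for this sketch.
    """
    secret_id = env.get("TENCENTCLOUD_SECRET_ID", "")
    secret_key = env.get("TENCENTCLOUD_SECRET_KEY", "")
    if not secret_id or not secret_key:
        raise RuntimeError("SecretId / SecretKey not found in the environment")
    return secret_id, secret_key


if __name__ == "__main__":
    try:
        SecretId, SecretKey = load_credentials()
        print("credentials loaded")
    except RuntimeError as err:
        print(err)
```

The returned pair can then be passed wherever the SecretId and SecretKey are needed, so the keys never end up in a published script.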

4) Run the code

Values to change in the code before running: SecretId, SecretKey, image_path, save_path

import base64
import json
import pandas as pd
import re
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models

# Inputs: SecretId, SecretKey, and the image (local path or URL)
SecretId = 'AKIDEGApGSSEl37wa0C9XsAHH7oN83yra9Zn'
SecretKey = ''
image_path = "D:\\table_generation-master\\demo\\table_recog_test\\0M208145r02110zw_1.png"
imageurl = None
save_path = "D:\\table_generation-master\\demo\\baidu_result\\jit_tencent.json"


def change_img_to_base64(image_path):
    """Read an image file and return its base64-encoded bytes."""
    with open(image_path, 'rb') as f:
        image_data = f.read()
    base64_data: bytes = base64.b64encode(image_data)  # base64 encoding
    return base64_data


def json_to_dict(my_json: str) -> dict:
    """Convert a JSON string to a dict."""
    return json.loads(my_json)


def tencent_ocr(suffix, image_based_64, imageurl, SecretId, SecretKey, mode):
    """Call Tencent table OCR and return the response as a JSON string.
    :param suffix: image file extension, e.g. png or jpg
    :param image_based_64: base64-encoded image (bytes)
    :param imageurl: image URL, used when mode is "ImageUrl"
    :param mode: "ImageBase64" or "ImageUrl"
    :return: JSON string, or None if the SDK reports an error
    """
    try:
        cred = credential.Credential(SecretId, SecretKey)
        httpProfile = HttpProfile()
        httpProfile.endpoint = "ocr.tencentcloudapi.com"

        clientProfile = ClientProfile()
        clientProfile.httpProfile = httpProfile
        client = ocr_client.OcrClient(cred, "ap-beijing", clientProfile)

        # Request model paired with client.TableOCR, whose response carries TextDetections
        req = models.TableOCRRequest()
        # One of ImageUrl / ImageBase64 must be provided; if both are given, only ImageUrl is used
        if mode == "ImageBase64":
            params = {
                "ImageBase64": "data:image/{suffix};base64,{image_based_64}".format(
                    suffix=suffix, image_based_64=image_based_64.decode("utf8")),
            }
        elif mode == "ImageUrl":
            params = {
                "ImageUrl": imageurl
            }
        else:
            raise ValueError("mode must be 'ImageBase64' or 'ImageUrl'")
        req.from_json_string(json.dumps(params))
        resp = client.TableOCR(req)
        print("resp_1", resp)
        return resp.to_json_string()

    except TencentCloudSDKException as err:
        print(err)
        return None

def formation(json_data, image_path):
    """Arrange the Tencent OCR result into a grid and export it to Excel."""
    if json_data is None:
        return
    dict_data = json_to_dict(json_data)
    print(dict_data)

    rowIndex = []
    colIndex = []
    content = []
    for item in dict_data['TextDetections']:
        rowIndex.append(item['RowTl'])
        colIndex.append(item['ColTl'])
        content.append(item['Text'])

    # Empty frame indexed by the sorted, distinct row and column indexes
    data = pd.DataFrame(index=sorted(set(rowIndex)), columns=sorted(set(colIndex)))
    for row, col, text in zip(rowIndex, colIndex, content):
        data.loc[row, col] = text.replace(" ", "")  # strip spaces inside the cell text

    # Export to Excel next to the image: same file name with an .xlsx extension
    excel_path = image_path.rsplit(".", 1)[0] + ".xlsx"
    with pd.ExcelWriter(excel_path, engine='xlsxwriter') as writer:
        data.to_excel(writer, sheet_name='Sheet1', index=False, header=False)

if __name__ == '__main__':

    if image_path:
        mode = 'ImageBase64'
        image_base64 = change_img_to_base64(image_path)  # Step 1: encode the image as base64
        suffix = image_path.split('.')[-1]  # file extension
    elif imageurl:
        mode = 'ImageUrl'
        image_base64 = None
        suffix = None
    else:
        raise ValueError("Input Image Message Error!")
    tencent_result: str = tencent_ocr(suffix, image_base64, imageurl, SecretId, SecretKey, mode)
    print(tencent_result)
    # formation(tencent_result, image_path)  # call this function if you need an Excel export
    if tencent_result is not None:  # None means the SDK call failed
        with open(save_path, 'w', encoding='utf-8') as fp:
            fp.write(tencent_result + '\r\n')
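The rule that the request needs either an image URL or a base64 image, with the URL winning when both are present, can be isolated into a small helper that is easy to check without calling the service. The sketch below is only an illustration: `build_params` is a hypothetical name, and the `data:image/...;base64,` prefix simply mirrors what the script above sends.

```python
import base64


def build_params(image_bytes=None, suffix="png", image_url=None):
    """Build the request parameters, preferring the URL when both inputs are given."""
    if image_url:
        return {"ImageUrl": image_url}
    if image_bytes:
        encoded = base64.b64encode(image_bytes).decode("utf8")
        # Same prefix format as the script above
        return {"ImageBase64": "data:image/{};base64,{}".format(suffix, encoded)}
    raise ValueError("Provide either image bytes or an image URL")


if __name__ == "__main__":
    print(build_params(image_bytes=b"abc", suffix="png"))
    print(build_params(image_url="https://example.com/table.png"))
```

Raising an error when neither input is present avoids sending an empty request and getting a less readable error back from the service.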

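3. Main fields in the JSON file (example information given in dictionary form)

For other fields, see: OCR Table Recognition (V2), Server API Documentation, Documentation Center, Tencent Cloud (tencent.com)

Tencent API:
TextDetections:
        ColTl: column index of the cell's top-left corner
        RowTl: row index of the cell's top-left corner
        ColBr: column index of the cell's bottom-right corner (ColTl + span)
        RowBr: row index of the cell's bottom-right corner (RowTl + span)
        <for header entries, the four fields above are all -1>
        Text: cell content
        Type: header --> table header; body --> table body; footer --> table footer
        Confidence: confidence score (0~100)
        Polygon: coordinates of the text line (four vertices)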
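To show how these fields fit together, here is a minimal sketch that turns a list of `TextDetections` entries into a plain two-dimensional grid without pandas. It assumes, following the notes above, that `RowBr`/`ColBr` equal the top-left index plus the span and that entries with an index of -1 sit outside the grid. The sample data is invented for the example and is not real API output, and `detections_to_grid` is a hypothetical helper name.

```python
def detections_to_grid(detections):
    """Place each cell's text at its (RowTl, ColTl) position in a 2D list."""
    # Skip entries whose indexes are -1 (e.g. header text outside the grid)
    cells = [d for d in detections if d["RowTl"] >= 0 and d["ColTl"] >= 0]
    if not cells:
        return []
    n_rows = max(d["RowBr"] for d in cells)  # assumes RowBr = RowTl + span
    n_cols = max(d["ColBr"] for d in cells)  # assumes ColBr = ColTl + span
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for d in cells:
        grid[d["RowTl"]][d["ColTl"]] = d["Text"]
    return grid


# Invented sample in the shape described above
sample = [
    {"RowTl": -1, "ColTl": -1, "RowBr": -1, "ColBr": -1, "Text": "Title", "Type": "header"},
    {"RowTl": 0, "ColTl": 0, "RowBr": 1, "ColBr": 1, "Text": "Name", "Type": "body"},
    {"RowTl": 0, "ColTl": 1, "RowBr": 1, "ColBr": 2, "Text": "Qty", "Type": "body"},
    {"RowTl": 1, "ColTl": 0, "RowBr": 2, "ColBr": 1, "Text": "Pen", "Type": "body"},
    {"RowTl": 1, "ColTl": 1, "RowBr": 2, "ColBr": 2, "Text": "3", "Type": "body"},
]

if __name__ == "__main__":
    for row in detections_to_grid(sample):
        print(row)
```

A merged cell only fills its top-left position here; the other positions it covers stay empty, which is usually enough for a quick look at the result.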