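Calling the Tencent API from python3: basic text analysis and computer vision (image/text/CAPTCHA/business card/driver's license) recognition

Article link: https://blog.csdn.net/ITBigGod/article/details/103209496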

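Preface

The previous articles in this series are:
Image/text recognition in python3 with Google tesseract-ocr 4.0

Image/text/CAPTCHA recognition in python3 with the Baidu OCR API

I put all of this together back in January and it sat in my drafts folder for a long time, but it still works.
Having called the Google and Baidu OCR APIs, let's now try Tencent's.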

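The one relatively complicated part of the Tencent YouTu OCR API is generating the signature.

This is the authentication signature.

Signature requirements:

1. Sort the <key, value> request parameter pairs in ascending dictionary order by key to get the ordered parameter pair list N.
2. Join the pairs in N into a URL key=value string T, URL-encoding each value with uppercase hex digits (for example %E8, not %e8).
3. Append app_key to the end of T to get the string S, then compute the MD5 digest of S.
4. Convert every character of the MD5 value to uppercase to get the request signature. The signature is valid for 5 minutes.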
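
The signing steps can be sketched in a few lines. This is a minimal illustration rather than the official implementation: the helper names build_sign_string and make_sign are made up for this example, and the sample values are the ones from the documentation excerpt quoted later in the source code.

```python
import hashlib
from urllib.parse import urlencode


def build_sign_string(params, app_key):
    """Return the string S: sorted, URL-encoded pairs with app_key appended."""
    ordered = sorted(params.items(), key=lambda kv: kv[0])
    # urlencode percent-encodes values using uppercase hex digits (e.g. %E8)
    return urlencode(ordered) + "&app_key=" + app_key


def make_sign(params, app_key):
    """Return the uppercase MD5 hex digest of S."""
    s = build_sign_string(params, app_key)
    return hashlib.md5(s.encode("utf-8")).hexdigest().upper()


# Sample values taken from the documentation excerpt, not real credentials
params = {
    "text": "腾讯开放平台",
    "time_stamp": "1493449657",
    "nonce_str": "20e3408a79",
    "app_id": "10000",
}
app_key = "a95eceb1ac8c24ee28b70f7dbba912bf"

print(build_sign_string(params, app_key))
print(make_sign(params, app_key))
```

Because the signature is only valid for 5 minutes, time_stamp should be generated fresh for every request instead of being reused.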

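Before using the code, you need to create an application on the Tencent AI platform and connect the capabilities you need.

Pick whichever capabilities you like; this example needs OCR.
Once the application is created, you can refer to the code below.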

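Method 1: computer vision recognition (works for all kinds of recognition)

You need your own API ID and key:

APPID = 'xxxxx'
APPKEY = 'xxxxxx'

Where to get them:

https://ai.qq.com/console/application/2111953790/data-info

See the official interface documentation for the details of each interface.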

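This is a combined multi-interface version that covers calls to both the basic text analysis and the computer vision interfaces.

The image interfaces take the base64-encoded data of the original image (the original image is limited to 1 MB; JPG, PNG, and BMP are supported).
[screenshot]

and

[screenshot]

All of the capabilities shown in the screenshots above can be called.

To use the code, replace APPID and APPKEY with your own values, then call the interface that matches the capability you want.

This only works if you added the corresponding capabilities to your application when you created it.

After that, all that is left is to call the interface code.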

The source code is as follows:

#!/usr/bin/python3

'''
create : results include the text, coordinates, and other information
        pixel coordinates, including the top-left x, y plus width and height
Purpose: python3 -- Tencent AI API
'''

import requests
requests.packages.urllib3.disable_warnings()

import base64
import hashlib
import time
import random
import os, string
from io import BytesIO
from urllib.parse import urlencode
import json
from PIL import Image



class MsgTencent(object):
    def __init__(self, AppID, AppKey):
        self.app_id = AppID
        self.app_key = AppKey
        self.img_base64str = None

    def get_random_str(self):
        # Generate a random 16-character string
        rule = string.ascii_lowercase + string.digits
        chars = random.sample(rule, 16)   # do not shadow the built-in str
        return "".join(chars)

    def get_time_stamp(self):
        return str(int(time.time()))

    def __get_image_base64str__(self, image):
        # Encode an in-memory PIL image as base64
        if not isinstance(image, Image.Image): return None
        outputBuffer = BytesIO()
        # JPEG cannot store an alpha channel, so convert to RGB before saving
        image.convert('RGB').save(outputBuffer, format='JPEG')
        imgbase64 = base64.b64encode(outputBuffer.getvalue())
        print("Image encoded as base64:", imgbase64)
        return imgbase64

    def __get_imgfile_base64str__(self, image):
        # Encode an image file on disk as base64
        print("Entering the image-to-base64 function, source image path:", image)
        if not isinstance(image, str): return None
        if not os.path.isfile(image): return None

        with open(image, 'rb') as fp:
            imgbase64 = base64.b64encode(fp.read())
            print("Image encoded as base64:", imgbase64)
            return imgbase64

    def get_img_base64str(self, image):
        # Dispatch on the argument type: a file path or a PIL image
        if isinstance(image, str):
            self.img_base64str = self.__get_imgfile_base64str__(image)
        elif isinstance(image, Image.Image):
            self.img_base64str = self.__get_image_base64str__(image)
        return self.img_base64str.decode()


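    # Build the dictionary; MD5 signing method
    '''
    ======================================
    Tencent: obtain the parameter pair list N (dictionary in ascending order)
    ======================================
    1. Following step 1 of the algorithm, sort the parameter pairs to get the parameter pair list N:
    Parameter     Value
    app_id     10000
    nonce_str     20e3408a79
    text     腾讯开放平台
    time_stamp     1493449657

    2. Join the URL key-value pairs into string T
    Following step 2 of the algorithm, join the parameter pairs in list N as URL key-value pairs. The values are URL-encoded, and the URL encoding uses uppercase letters, for example %E8 rather than lowercase %e8. This gives string T:
    app_id=10000&nonce_str=20e3408a79&text=%E8%85%BE%E8%AE%AF%E5%BC%80%E6%94%BE%E5%B9%B3%E5%8F%B0&time_stamp=1493449657

    3. Append the application key to get string S
    Following step 3 of the algorithm, append the application key to the end of string T to get string S:
    app_id=10000&nonce_str=20e3408a79&text=%E8%85%BE%E8%AE%AF%E5%BC%80%E6%94%BE%E5%B9%B3%E5%8F%B0&time_stamp=1493449657&app_key=a95eceb1ac8c24ee28b70f7dbba912bf

    4. Compute the MD5 digest to get the signature string
    Following step 4 of the algorithm, compute the MD5 digest of string S to get the signature string:
    e8f6f347d549fe514f0c9c452c95da9d

    5. Convert the MD5 signature value to uppercase
    Convert every letter of the signature string to uppercase to get the request signature. The algorithm ends here.
    E8F6F347D549FE514F0C9C452C95DA9D

    6. Final request data
    Once the signature is computed, all of the request data is available and the API call can proceed.
    text     腾讯开放平台     request data, UTF-8 encoded
    app_id     10000     application identifier
    time_stamp     1493449657     request timestamp (in seconds), used to prevent request replay
    nonce_str     20e3408a79     random request string, used to keep the signature unpredictable
    sign     E8F6F347D549FE514F0C9C452C95DA9D     request signature
    '''
    # Generate the sign used for authentication. This is the hardest part.
    def gen_dict_md5(self, req_dict, app_key):
        if not isinstance(req_dict, dict): return None
        if not isinstance(app_key, str) or not app_key: return None
        try:
            # Approach: sort the dictionary, append app_key after sorting, then urlencode
            sort_dict = sorted(req_dict.items(), key=lambda item: item[0], reverse=False)
            sort_dict.append(('app_key', app_key))
            md5 = hashlib.md5()
            rawtext = urlencode(sort_dict).encode()
            md5.update(rawtext)
            md5text = md5.hexdigest().upper()

            # A dict can be modified inside the function; store the signature under the 'sign' key.
            if md5text: req_dict['sign'] = md5text
            return md5text
        except Exception:
            return None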

    # Build the request dictionary
    def gen_req_dict(self, req_dict, app_id=None, app_key=None, time_stamp=None, nonce_str=None):
        """Generate the security signature with the MD5 algorithm"""
        if not req_dict.get('app_id'):
            if not app_id: app_id = self.app_id
            req_dict['app_id'] = app_id

        # Fill in time_stamp if the dictionary has no value for it
        if not req_dict.get('time_stamp'):
            if not time_stamp: time_stamp = self.get_time_stamp()
            req_dict['time_stamp'] = time_stamp

        # Fill in nonce_str if the dictionary has no value for it
        if not req_dict.get('nonce_str'):
            if not nonce_str: nonce_str = self.get_random_str()
            req_dict['nonce_str'] = nonce_str
        # Fall back to the app_key stored on the instance
        if not app_key: app_key = self.app_key
        md5key = self.gen_dict_md5(req_dict, app_key)
        return md5key


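'''
Basic text analysis
===================
Word segmentation     Intelligent word segmentation of text; supports basic-word and mixed-word granularity     https://api.ai.qq.com/fcgi-bin/nlp/nlp_wordseg text
Part-of-speech tagging     Segments the text and tags each token with its part of speech     https://api.ai.qq.com/fcgi-bin/nlp/nlp_wordpos text
Proper noun recognition     Segments the text and finds the proper nouns in it     https://api.ai.qq.com/fcgi-bin/nlp/nlp_wordner text
Synonym recognition     Finds tokens in the text that have synonyms and returns those synonyms     https://api.ai.qq.com/fcgi-bin/nlp/nlp_wordsyn text

Computer vision -- OCR recognition
==================================
General OCR     Recognizes the fields in an uploaded image     https://api.ai.qq.com/fcgi-bin/ocr/ocr_generalocr image
ID card OCR     Recognizes the detailed identity information on an ID card image     https://api.ai.qq.com/fcgi-bin/ocr/ocr_idcardocr image,card_type (ID card side: 0 = front, 1 = back)
Business card OCR     Recognizes the fields on a business card image     https://api.ai.qq.com/fcgi-bin/ocr/ocr_bcocr image
Vehicle license / driver's license OCR     Recognizes the fields on a vehicle license or driver's license image     https://api.ai.qq.com/fcgi-bin/ocr/ocr_driverlicenseocr image,type (recognition type: 0 = vehicle license, 1 = driver's license)
Business license OCR     Recognizes the fields on a business license     https://api.ai.qq.com/fcgi-bin/ocr/ocr_bizlicenseocr image
Bank card OCR     Recognizes the fields on a bank card     https://api.ai.qq.com/fcgi-bin/ocr/ocr_creditcardocr image
'''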


# Change these to your own API ID and key; they can be made global variables
APPID = 'xxxxx'
APPKEY = 'xxxxxxxx'

TencentAPI = {
    # Basic text analysis APIs
    "nlp_wordseg": {
        'APINAME': 'Word segmentation',
        'APIDESC': 'Intelligent word segmentation of text; supports basic-word and mixed-word granularity',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/nlp/nlp_wordseg',
        'APIPARA': 'text'
    },
    "nlp_wordpos": {
        'APINAME': 'Part-of-speech tagging',
        'APIDESC': 'Segments the text and tags each token with its part of speech',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/nlp/nlp_wordpos',
        'APIPARA': 'text'
    },
    'nlp_wordner': {
        'APINAME': 'Proper noun recognition',
        'APIDESC': 'Segments the text and finds the proper nouns in it',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/nlp/nlp_wordner',
        'APIPARA': 'text'
    },
    'nlp_wordsyn': {
        'APINAME': 'Synonym recognition',
        'APIDESC': 'Finds tokens in the text that have synonyms and returns those synonyms',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/nlp/nlp_wordsyn',
        'APIPARA': 'text'
    },

    # Computer vision -- OCR recognition APIs
    "ocr_generalocr": {
        'APINAME': 'General OCR',
        'APIDESC': 'Recognizes the fields in an uploaded image',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/ocr/ocr_generalocr',
        'APIPARA': 'image'
    },
    "ocr_idcardocr": {
        'APINAME': 'ID card OCR',
        'APIDESC': 'Recognizes the detailed identity information on an ID card image',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/ocr/ocr_idcardocr',
        'APIPARA': 'image,card_type'
    },
    "ocr_bcocr": {
        'APINAME': 'Business card OCR',
        'APIDESC': 'Recognizes the fields on a business card image',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/ocr/ocr_bcocr',
        'APIPARA': 'image'
    },
    "ocr_driverlicenseocr": {
        'APINAME': "Vehicle license / driver's license OCR",
        'APIDESC': "Recognizes the fields on a vehicle license or driver's license image",
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/ocr/ocr_driverlicenseocr',
        'APIPARA': 'image,type'
    },
    "ocr_bizlicenseocr": {
        'APINAME': 'Business license OCR',
        'APIDESC': 'Recognizes the fields on a business license',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/ocr/ocr_bizlicenseocr',
        'APIPARA': 'image'
    },
    "ocr_creditcardocr": {
        'APINAME': 'Bank card OCR',
        'APIDESC': 'Recognizes the fields on a bank card',
        'APIURL': 'https://api.ai.qq.com/fcgi-bin/ocr/ocr_creditcardocr',
        'APIPARA': 'image'
    },
}


def ExecTecentAPI(*arg, **kwds):

    # Apiname is required; fail early with a clear message if it is missing or unknown
    apiname = kwds.pop('Apiname', None)
    if apiname not in TencentAPI:
        raise ValueError('Unknown or missing Apiname: %r' % (apiname,))
    url = TencentAPI[apiname]['APIURL']
    name = TencentAPI[apiname]['APINAME']
    desc = TencentAPI[apiname]['APIDESC']
    para = TencentAPI[apiname]['APIPARA']

    tx = MsgTencent(APPID, APPKEY)

    Req_Dict = {}

    for key in para.split(','):
        # Test for presence rather than truthiness so that values such as type=0 are kept
        if key not in kwds:
            raise ValueError('Missing required parameter: ' + key)
        value = kwds.pop(key)
        if key == 'image':
            # Get the base64 encoding of the image
            value = tx.get_img_base64str(value)
        if key == 'text':
            # Encode the text as GBK
            value = value.encode('gbk')
        # 'type' and 'card_type' are plain integers and are passed through unchanged

        Req_Dict[key] = value
        #print(key, value, Req_Dict[key])

    # Build the signed request
    sign = tx.gen_req_dict(req_dict=Req_Dict)
    resp = requests.post(url, data=Req_Dict, verify=False)
    print(name + ' call result:')
    #print(name + ' call result: ' + resp.text)
    #print(type(resp))  # <class 'requests.models.Response'>
    #print(type(resp.text))  # <class 'str'>

    # Parse the nested response
    result = json.loads(resp.text)
    #print(type(result))
    print("Recognition result as a dict:", result)

    # Extract the recognized text; only OCR responses carry item_list
    for i in (result.get("data") or {}).get("item_list", []):
        text = i["itemstring"]          # text
        location = i["itemcoord"]       # coordinates
        # print(type(location), type(text))  # <class 'list'> <class 'str'>
        #print(i["itemcoord"],i["itemstring"])   # top-left x, y plus width and height; text
        print("Coordinates:", location[0]['x'], "Text:", text)

    return resp.text

if __name__ == "__main__":

    # Business card OCR
    # file = r'./img/mp.png'
    # rest = ExecTecentAPI(Apiname='ocr_bcocr', image=file)

    # Text analysis
    # rest = ExecTecentAPI(Apiname='nlp_wordseg', text='上帝保佑你')

    # Driver's license (type: 0 = vehicle license, 1 = driver's license)
    # file = r'./img/jsz.jpg'
    # rest = ExecTecentAPI(Apiname='ocr_driverlicenseocr', image=file, type=1)

    # General OCR
    file = r'./img/1.png'
    rest = ExecTecentAPI(Apiname='ocr_generalocr', image=file)

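This code is borrowed from another author.

The result looks like this:

[screenshot]

Business card or driver's license:
[screenshot]

Also, compared with Baidu's OCR, Tencent's has no limit on the number of calls; it only applies concurrency control.

Tencent requires the original image to be supplied as base64-encoded data (the original image is limited to 1 MB; JPG, PNG, and BMP are supported).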
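
A small pre-flight check can catch oversized or unsupported images before any request is sent. The sketch below is only an illustration: the function name encode_image_checked is hypothetical, 1 MB is assumed to mean 1024 * 1024 bytes, and the format is judged by file extension alone, which is a rough proxy for the real image type.

```python
import base64
import os

MAX_BYTES = 1024 * 1024                       # assumed meaning of the 1 MB cap
ALLOWED_EXTS = {".jpg", ".jpeg", ".png", ".bmp"}


def encode_image_checked(path):
    """Return the base64 text of an image file, or raise ValueError if a limit is broken."""
    ext = os.path.splitext(path)[1].lower()
    if ext not in ALLOWED_EXTS:
        raise ValueError("unsupported image format: %r" % ext)
    size = os.path.getsize(path)
    if size > MAX_BYTES:
        raise ValueError("image is %d bytes, over the 1 MB limit" % size)
    with open(path, "rb") as fp:
        # b64encode returns bytes; decode to str so it can go straight into a form field
        return base64.b64encode(fp.read()).decode("ascii")
```

The returned string can be used as the image parameter in either of the two calling styles in this article.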

Method 2: general OCR recognition (works for CAPTCHA/image-text/text recognition)

Code:

#!/usr/bin/python3

"""
desc: call the Tencent OCR API to recognize text
#@Readme : keep images under 1 MB; JPG, PNG, and BMP are supported
"""

import base64, hashlib, json, random, string, time
from urllib import parse, request


def GetAccessToken(formdata, app_key):
    '''
    Compute the signature
    :param formdata: request parameter key-value pairs
    :param app_key: application key
    :return: the signature for the interface call
    '''
    dic = sorted(formdata.items(), key=lambda d: d[0])
    sign = parse.urlencode(dic) + '&app_key=' + app_key
    m = hashlib.md5()
    m.update(sign.encode('utf8'))
    return m.hexdigest().upper()

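# Change these to your own API ID and key; they can be made global variables
app_id = 'xxxxx'
app_key = 'xxxxxxxx'
def RecogniseGeneral(app_id, time_stamp, nonce_str, image, app_key):
    '''
    Tencent general OCR interface
    :param app_id: application identifier, a positive integer
    :param time_stamp: request timestamp (in seconds), a positive integer
    :param nonce_str: random string, non-empty, at most 32 bytes long
    :param image: base64 encoding of the original image
    :param app_key: application key, used only to compute the signature
    :return: dict mapping each recognized string to its result item (empty if nothing was recognized)
    '''
    host = 'https://api.ai.qq.com/fcgi-bin/ocr/ocr_generalocr'
    formdata = {'app_id': app_id, 'time_stamp': time_stamp, 'nonce_str': nonce_str, 'image': image}
    sign = GetAccessToken(formdata=formdata, app_key=app_key)
    formdata['sign'] = sign
    req = request.Request(method='POST', url=host, data=parse.urlencode(formdata).encode('utf8'))
    response = request.urlopen(req)
    recognise = {}
    if response.status == 200:
        json_str = response.read().decode()
        #print('Tencent general OCR interface result:', json_str)
        jobj = json.loads(json_str)
        # item_list may be absent when the call fails, for example on a bad signature
        datas = (jobj.get('data') or {}).get('item_list', [])
        for obj in datas:
            recognise[obj['itemstring']] = obj
    # Always return a dict so that callers can iterate over it safely
    return recognise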


def Recognise(img_path):
    with open(file=img_path, mode='rb') as file:
        base64_data = base64.b64encode(file.read())
    nonce = ''.join(random.sample(string.digits + string.ascii_letters, 32))
    stamp = int(time.time())
    recognise = RecogniseGeneral(app_id=app_id, time_stamp=stamp, nonce_str=nonce, image=base64_data,
                                 app_key=app_key)  # replace with your own app_id and app_key
    # Print the extracted items for inspection
    for k, v in recognise.items():
        print('Tencent general OCR interface result:', k, v)

    return recognise

# The relatively complicated part of the Tencent YouTu API is generating the signature

if __name__ == '__main__':
    img_path = r'./img/timg.jpeg'
    recognise_dic = Recognise(img_path)
    for k, value in recognise_dic.items():
        print('Recognized text:', k)
        for v in value['itemcoord']:
            print('Text coordinates:', v)

The result looks like this:
[screenshot]

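A side note:

Some tutorials say you need the following:

appid = 'xxxxx'
secret_id = 'xxxxxxxxxxxxxxxx'
secret_key = 'xxxxxxxxxxxxxxxxxxxxx'

These three values can be obtained at the address below. Many people cannot find them, so I mention it here. That said, the code in this article does not need them.

Where to get them:

https://console.cloud.tencent.com/cam/capi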

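Fixes for various errors:

Interface authentication: https://ai.qq.com/doc/auth.shtml

If a later call fails as shown: [screenshot]

According to the return codes listed at https://ai.qq.com/doc/returncode.shtml, 16388 means the request signature is invalid; check whether the signature (sign) in the request is valid.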
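
Checking the return code before reading the data makes signature problems show up as a clear error instead of a KeyError further down. The sketch below assumes the response body is JSON with a top-level "ret" code and a "msg" string; those two field names are assumptions that are not shown in this article, the helper name check_response is made up, and the sample body is invented for illustration.

```python
import json

# Only the code discussed in the text is listed here
KNOWN_CODES = {16388: "invalid request signature, check the sign field"}


def check_response(body):
    """Parse a JSON response body and raise RuntimeError if the return code is non-zero."""
    result = json.loads(body)
    ret = result.get("ret", 0)   # assumed field name for the return code
    if ret != 0:
        hint = KNOWN_CODES.get(ret, result.get("msg", "unknown error"))
        raise RuntimeError("API call failed with code %s: %s" % (ret, hint))
    return result.get("data") or {}


sample = '{"ret": 16388, "msg": "req sign invalid", "data": {}}'  # made-up sample body
try:
    check_response(sample)
except RuntimeError as err:
    print(err)
```

When 16388 does appear, the usual suspects are the parameter sort order, the URL encoding of the values, a wrong app_key, or a stale time_stamp.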

If you get an error like the one shown:
[screenshot]
Convert the Unicode string to UTF-8 bytes:
bytes(secret_key, 'utf-8')
and that fixes it.
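
The screenshot for this error is not preserved, so exactly which message it showed is a guess. One common case where the conversion is needed is hashing: in Python 3, hashlib accepts bytes and rejects str. A minimal sketch with a placeholder key:

```python
import hashlib

secret_key = "xxxxxxxx"  # placeholder value, not a real key

try:
    hashlib.md5(secret_key)              # passing a str raises TypeError in Python 3
except TypeError as err:
    print("TypeError:", err)

key_bytes = bytes(secret_key, "utf-8")   # same result as secret_key.encode("utf-8")
digest = hashlib.md5(key_bytes).hexdigest()
print(digest)
```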

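If you see this: [screenshot]
'NoneType' object has no attribute 'decode'
This error means that some variable's value is None.
None has the type NoneType, which has no decode method.
First, check whether the data failed to load. For example, was the image file actually found?
Second, convert the value at the failing location, repo_dict['description'], to str.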
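
Both checks can be made explicit so that the failure points at the real cause. This is a sketch under assumptions: the helper names load_image_base64 and to_text are hypothetical and are not part of the code above.

```python
import base64
import os


def load_image_base64(path):
    """Return base64 text for the file at path, raising instead of returning None."""
    if not isinstance(path, str) or not os.path.isfile(path):
        raise FileNotFoundError("image not found: %r" % (path,))
    with open(path, "rb") as fp:
        return base64.b64encode(fp.read()).decode("ascii")


def to_text(value):
    """Coerce a value that may be None or bytes into str."""
    if value is None:
        return ""
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    return str(value)
```

With a wrong path, load_image_base64 raises FileNotFoundError naming the path, which is easier to act on than an AttributeError on None.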