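Using Tencent OCR (Cards and ID Documents)
Note: the example uses the front side of an ID card. Because of privacy concerns, I only post the code and an explanation of it.
Why use Tencent OCR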
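As an RPA engineer, I know that most people use Baidu OCR. However, after running recognition on more than a thousand ID cards and bank cards, I found that Tencent OCR recognized them more accurately than Baidu OCR. That is why I spent half a day working out how to use Tencent OCR. This post covers that, along with the problems I ran into in practice and how I fixed them.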
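First, both Baidu OCR and Tencent OCR offer 1,000 free calls per month, so if you want to experiment on your own, you can look them up and give them a try. Note that both Tencent OCR and Baidu OCR require you to bind a WeChat account and complete real-name verification.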
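Baidu OCR's API is somewhat richer. For example, a Baidu OCR call returns its data as a dict, whereas Tencent OCR returns a string (the JSON text from resp.to_json_string() in the code below), which makes the data processing a bit frustrating.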
Without further ado, here is the code. I use Python; if you use Java, start from the sample code on the official site and debug it yourself.
import ast
import base64
import json
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models
# The image must be base64-encoded before it is sent
def path2base64(path):
    with open(path, "rb") as f:
        byte_data = f.read()
    base64_str = base64.b64encode(byte_data).decode("ascii")  # base64 text as an ASCII string
    return base64_str

try:
    # Note: these are not your login account and password. They are the uid and key
    # you unlock after logging in (Tencent Cloud calls them SecretId and SecretKey).
    cred = credential.Credential("enter your own uid", "enter your own key")
    httpProfile = HttpProfile()
    httpProfile.endpoint = "ocr.tencentcloudapi.com"
    clientProfile = ClientProfile()
    clientProfile.httpProfile = httpProfile
    client = ocr_client.OcrClient(cred, "ap-guangzhou", clientProfile)
    req = models.IDCardOCRRequest()
    file = "C:\\Users\\tianyi.zhang\\Desktop\\963.jpg"
    image = path2base64(file)
    params = {
        "ImageBase64": image,
        "CardSide": "FRONT"
    }
    req.from_json_string(json.dumps(params))
    resp = client.IDCardOCR(req)
    # This step converts the string-format data into a dict
    resp_info = ast.literal_eval(resp.to_json_string())
    print(resp.to_json_string())
    print(resp_info)
except TencentCloudSDKException as err:
    print(err)
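Once resp_info is a dict, the individual fields can be read by key. The following is a minimal sketch, not part of the original script: the field names (Name, Sex, Nation, Birth, Address, IdNum) are assumed from the IDCardOCR response for the front side and should be checked against your own output, pick_front_fields is a hypothetical helper, and the sample dict holds placeholders only.

```python
# Field names assumed for CardSide=FRONT; verify them against the real response.
FRONT_FIELDS = ("Name", "Sex", "Nation", "Birth", "Address", "IdNum")


def pick_front_fields(resp_info):
    """Keep only the front-side fields, using "" for any that are missing."""
    return {key: resp_info.get(key, "") for key in FRONT_FIELDS}


# Placeholder dict standing in for resp_info; it contains no real data.
example = {"Name": "<name>", "IdNum": "<id number>", "RequestId": "<request id>"}
front = pick_front_fields(example)
print(front)
```

Using .get with a default keeps the helper from raising KeyError when a field is absent from the response.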
Now for a problem I ran into in real use.
The following code performs bank card recognition:
import ast
import base64
import json
from tencentcloud.common import credential
from tencentcloud.common.profile.client_profile import ClientProfile
from tencentcloud.common.profile.http_profile import HttpProfile
from tencentcloud.common.exception.tencent_cloud_sdk_exception import TencentCloudSDKException
from tencentcloud.ocr.v20181119 import ocr_client, models
def path2base64(path):
    with open(path, "rb") as f:
        byte_data = f.read()
    base64_str = base64.b64encode(byte_data).decode("ascii")  # base64 text as an ASCII string
    return base64_str

try:
    cred = credential.Credential("", "")
    httpProfile = HttpProfile()
    httpProfile.endpoint = "ocr.tencentcloudapi.com"
    clientProfile = ClientProfile()
    clientProfile.httpProfile = httpProfile
    client = ocr_client.OcrClient(cred, "ap-guangzhou", clientProfile)
    file = "C:\\Users\\tianyi.zhang\\Desktop\\456.jpg"
    image = path2base64(file)
    req = models.BankCardOCRRequest()
    params = {
        "ImageBase64": image
    }
    req.from_json_string(json.dumps(params))
    resp = client.BankCardOCR(req)
    resp_info = ast.literal_eval(resp.to_json_string())
    print(resp.to_json_string())
except TencentCloudSDKException as err:
    print(err)
Running it raises the following error:
Traceback (most recent call last):
  File "C:/Users/tianyi.zhang/PycharmProjects/数据分析/tencentOCR银行卡.py", line 36, in <module>
    resp_info = ast.literal_eval(resp.to_json_string())
  File "C:\Users\tianyi.zhang\AppData\Local\Programs\Python\Python38\lib\ast.py", line 99, in literal_eval
    return _convert(node_or_string)
  File "C:\Users\tianyi.zhang\AppData\Local\Programs\Python\Python38\lib\ast.py", line 88, in _convert
    return dict(zip(map(_convert, node.keys),
  File "C:\Users\tianyi.zhang\AppData\Local\Programs\Python\Python38\lib\ast.py", line 98, in _convert
    return _convert_signed_num(node)
  File "C:\Users\tianyi.zhang\AppData\Local\Programs\Python\Python38\lib\ast.py", line 75, in _convert_signed_num
    return _convert_num(node)
  File "C:\Users\tianyi.zhang\AppData\Local\Programs\Python\Python38\lib\ast.py", line 66, in _convert_num
    _raise_malformed_node(node)
  File "C:\Users\tianyi.zhang\AppData\Local\Programs\Python\Python38\lib\ast.py", line 63, in _raise_malformed_node
    raise ValueError(f'malformed node or string: {node!r}')
ValueError: malformed node or string: <_ast.Name object at 0x0000026DF1391850>
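The error occurs because the string returned for the bank card contains null values. null is a JSON literal, not a Python literal, so ast.literal_eval treats it as a bare name and cannot convert the string into a dict.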
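The failure can be reproduced without calling the API at all. The sketch below uses a hypothetical, trimmed-down string in place of the real response (the key names are placeholders, not a claim about which bank card fields come back as null), and try_literal_eval is a helper introduced only for this illustration.

```python
import ast

# Hypothetical stand-in for resp.to_json_string(); not real card data.
sample = '{"CardNo": "<card number>", "Extra": null}'


def try_literal_eval(text):
    """Return (ok, result_or_message) instead of letting ValueError propagate."""
    try:
        return True, ast.literal_eval(text)
    except ValueError as err:
        return False, str(err)


ok, detail = try_literal_eval(sample)
print(ok)
print(detail)

# The same string without null parses fine, which isolates null as the cause.
ok_without_null, _ = try_literal_eval('{"CardNo": "<card number>"}')
print(ok_without_null)
```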
The fix is to change:
resp_info = ast.literal_eval(resp.to_json_string())
to:
resp_info = ast.literal_eval(resp.to_json_string().replace('null','""'))
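An alternative worth considering, offered as a suggestion rather than as the method used in this post: since the string is JSON, the standard library's json.loads can parse it directly and maps null to None. The sketch below compares the two approaches on a hypothetical sample string with placeholder keys. One thing to keep in mind with the replace-based fix is that it rewrites every occurrence of the text null, including any that happen to appear inside a recognized value.

```python
import ast
import json

# Hypothetical stand-in for resp.to_json_string(); not real card data.
sample = '{"CardNo": "<card number>", "Extra": null}'

# json.loads understands JSON literals, so null becomes None.
parsed = json.loads(sample)
print(parsed["Extra"] is None)

# The replace-based fix also parses, but the missing value becomes "".
patched = ast.literal_eval(sample.replace('null', '""'))
print(patched["Extra"] == "")
```

With json.loads the line in the script would read resp_info = json.loads(resp.to_json_string()), and the json module is already imported there.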
That is all for card and ID recognition with Tencent OCR. I hope it helps.