[Python] ID Card Recognition

yzzheng_60125

Article link: https://blog.csdn.net/Alearn_/article/details/108587572

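0. Requirements

My uncle works at a travel agency. I once happened to see him typing in every customer's details by hand, which looked tedious, so I wrote him a Python script that reads the name and ID number from a photo of each customer's ID card.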

1. Tools

Python standard library
pandas, for writing the Excel file
Baidu AI Cloud OCR API

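2. Approach

Put the images in a local directory, then read each image in binary mode.
Send each image to Baidu AI Cloud with requests and get a JSON object back.
Extract the user information from the JSON object; here I only use the name and the ID number.
Wrap each user's name and ID number in a dict; the dicts for all users form a list.
Write that list to an Excel file. I use pandas for this because it is simpler than the other Python modules for the job.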
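
The third and fourth steps above can be tried without any network call. The sketch below is only an illustration: the helper name `extract_record` is hypothetical, and the sample response is made up (placeholder name and number). Only the `words_result` → field → `words` nesting is taken from the script in this post.

```python
def extract_record(data):
    """Pull the name and ID number out of one parsed OCR response (a dict)."""
    words = data.get('words_result') or {}
    return {
        # '姓名' = name, '公民身份号码' = citizen ID number (keys used in the response)
        '姓名': words.get('姓名', {}).get('words'),
        '身份证号码': words.get('公民身份号码', {}).get('words'),
    }

# Made-up response for illustration only; not real data
sample = {
    'words_result': {
        '姓名': {'words': '张三'},
        '公民身份号码': {'words': '11010519491231002X'},
    }
}

records = [extract_record(sample)]  # one dict per ID card
print(records)
```

Using `.get()` at every level means a response without `words_result` (for example an error payload) produces a record with empty values instead of raising `KeyError`.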

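3. Implementation

First, use your own AK (API Key) and SK (Secret Key) to obtain an access token, which tells Baidu AI Cloud which application is making the requests. This is what getAccessToken() does.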

import requests
import base64
import json
import os
import pandas as pd

def getAccessToken():
    # Exchange the API Key (AK) and Secret Key (SK) for an OAuth access token
    url = "https://aip.baidubce.com/oauth/2.0/token"
    data = {
        'grant_type': 'client_credentials',
        'client_id': 'your AK here',
        'client_secret': 'your SK here',
    }
    response = requests.post(url=url, data=data)
    data2 = response.json()
    accesstoken = data2['access_token']
    return accesstoken

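def get_images(path):
    files = os.listdir(path)  # names of all entries in the folder
    images = []
    for file in files:
        try:
            filePath = os.path.join(path, file)
            with open(filePath, 'rb') as f:
                # The image is sent to the API as base64-encoded bytes
                image = base64.b64encode(f.read())
                images.append(image)
        except Exception as e:
            # Entries that cannot be read (e.g. subfolders) are reported and skipped
            print(str(e))
    return images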

def recognize_Pic(path):
    # step 1: get the access token
    access_token = getAccessToken()
    # step 2: load the base64-encoded images
    images = get_images(path)
    
    request_url = "https://aip.baidubce.com/rest/2.0/ocr/v1/idcard"
    request_url = request_url + "?access_token=" + access_token
    headers = {'content-type': 'application/x-www-form-urlencoded'}
    dic = []
    for image in images:
        name,id_num = getText(image,request_url,headers)
        dic.append({'姓名':name,'身份证号码':id_num})
    print(dic)
    writeExcel(dic)
    
def writeExcel(dic):
    pf = pd.DataFrame(dic)
    order = ['姓名','身份证号码']  # column order: name, ID number
    pf = pf[order]
    pf = pf.fillna(' ')
    # to_excel accepts a path directly; its `encoding` argument and
    # ExcelWriter.save() were removed in pandas 2.0
    out_path = r'C:\Users\Administrator\Desktop\test2.xlsx'
    pf.to_excel(out_path, index=False, sheet_name="sheet1")
    
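def getText(image,request_url,headers):
    params = {"id_card_side": "front", "image": image}
    response = requests.post(request_url, data=params, headers=headers)
    data = response.json() if response else {}
    print(data)
    words = data.get('words_result')
    if words:
        # '姓名' = name, '公民身份号码' = citizen ID number (keys in the API response)
        name = words.get('姓名', {}).get('words')
        id_num = words.get('公民身份号码', {}).get('words')
        return (name, id_num)
    # HTTP failure or an error payload without 'words_result':
    # return empty values so the caller's tuple unpacking still works
    print('Recognition failed')
    return (None, None)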
        
if __name__ == '__main__':     
    path = r'C:\Users\Administrator\Desktop\ID'
    recognize_Pic(path)
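
One thing to watch: get_images() opens every entry in the folder, so a stray non-image file would also be sent to the OCR API. Below is a minimal sketch of one way to restrict the input to image files. The helper name `list_image_files` and the extension set are assumptions for illustration, not part of the original script.

```python
import os

# Assumed set of extensions; adjust to whatever formats you actually collect
IMAGE_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.bmp'}

def list_image_files(path):
    """Return full paths of regular files in `path` with an image-like extension."""
    result = []
    for name in sorted(os.listdir(path)):
        full = os.path.join(path, name)
        ext = os.path.splitext(name)[1].lower()  # case-insensitive match
        if os.path.isfile(full) and ext in IMAGE_EXTENSIONS:
            result.append(full)
    return result
```

get_images() could then loop over `list_image_files(path)` instead of `os.listdir(path)`, which also skips subfolders without relying on the exception handler.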