TextIn (合合) platform: https://www.textin.com/experience/pdf-to-word
Converting a Base64 file to a document: "Python implementation of Base64-to-docx conversion", cjjmt's blog, CSDN
Sample code:
import base64
import json

import requests


def get_file_content(filePath):
    # Read the whole file as raw bytes.
    with open(filePath, 'rb') as fp:
        return fp.read()


class CommonOcr(object):
    def __init__(self, img_path):
        # Log in and open "Workbench - Account Settings - Developer Info" to find your x-ti-app-id.
        # The x-ti-app-id in this sample is not a real value.
        self._app_id = '1exxxxxxxxxxxxxx7a7c'
        # Log in and open "Workbench - Account Settings - Developer Info" to find your x-ti-secret-code.
        # The x-ti-secret-code in this sample is not a real value.
        self._secret_code = '80ddxxxxx3f'
        self._img_path = img_path

    def recognize(self):
        # PDF to Word
        url = 'https://api.textin.com/ai/service/v1/file-convert/pdf-to-word'
        head = {
            'x-ti-app-id': self._app_id,
            'x-ti-secret-code': self._secret_code,
        }
        # Send the raw PDF bytes as the request body.
        # Request errors propagate to the caller instead of being returned as a value.
        image = get_file_content(self._img_path)
        result = requests.post(url, data=image, headers=head)
        return result.text


if __name__ == "__main__":
    response = CommonOcr(r'D:\文件\结果2021.pdf')
    result2 = response.recognize()
    # Parse the JSON body with json.loads; eval is unsafe and fails on true/false/null.
    result2 = json.loads(result2)
    aa = result2.get("result")
    if aa is None:
        raise SystemExit("no 'result' field in response: %s" % (result2,))
    # "result" holds the converted Word file as a Base64 string.
    with open(r'tdt.doc', 'wb') as f:
        f.write(base64.b64decode(aa))
    print("saved tdt.doc")
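The last steps of the script (parse the response, take the "result" field, decode the Base64 string, write the bytes) can be pulled into a small helper and tried without calling the API. The sketch below is only an illustration: the function name `save_word_from_response` is hypothetical, and treating a missing "result" field as a failure is an assumption, not documented platform behavior.

```python
import base64
import json
import os
import tempfile


def save_word_from_response(response_text, out_path):
    """Decode the Base64 "result" field of a JSON response and write it to out_path.

    Returns the number of bytes written. The function name is hypothetical.
    """
    payload = json.loads(response_text)  # parse JSON text without eval
    encoded = payload.get("result")
    if not encoded:
        # Assumption: a response without "result" means the conversion failed.
        raise ValueError(f"response has no 'result' field: {payload!r}")
    data = base64.b64decode(encoded)
    with open(out_path, "wb") as f:
        f.write(data)
    return len(data)


if __name__ == "__main__":
    # Offline demo with a fake response; no request is sent to the API.
    fake = json.dumps({"result": base64.b64encode(b"hello").decode("ascii")})
    target = os.path.join(tempfile.mkdtemp(), "demo.doc")
    print(save_word_from_response(fake, target), "bytes written to", target)
```

In the real script, the text returned by `recognize()` would be passed in place of the fake response.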
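This article provides a Python example that converts a PDF to a Word document through the TextIn platform API and handles the returned file content as a Base64 string. The code covers the credentials needed to call the API (x-ti-app-id and x-ti-secret-code), reading the file, sending the POST request, and decoding the Base64 result.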
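Because the request needs the x-ti-app-id and x-ti-secret-code headers, one option is to read them from environment variables rather than keeping them in the source file. This is a minimal sketch under stated assumptions: the variable names `TEXTIN_APP_ID` and `TEXTIN_SECRET_CODE` and the helper `build_headers` are hypothetical and are not defined by the platform.

```python
import os


def build_headers(env=None):
    """Build the two auth headers from environment variables.

    TEXTIN_APP_ID and TEXTIN_SECRET_CODE are hypothetical names chosen
    for this sketch; use whatever names fit your deployment.
    """
    env = os.environ if env is None else env
    names = ("TEXTIN_APP_ID", "TEXTIN_SECRET_CODE")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "x-ti-app-id": env["TEXTIN_APP_ID"],
        "x-ti-secret-code": env["TEXTIN_SECRET_CODE"],
    }


if __name__ == "__main__":
    # Demo with an explicit mapping so no real credentials are required.
    demo_env = {"TEXTIN_APP_ID": "demo-id", "TEXTIN_SECRET_CODE": "demo-code"}
    print(sorted(build_headers(demo_env)))
```

The returned dictionary can be passed as `headers=` to `requests.post` in place of the hardcoded values.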