Sogou OCR Recognition API

Latest recommended article published on 2025-04-13 07:56:29

褶皱的包子

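This article describes a way to recognize text in images with the Sogou OCR API. Using Python's requests library, it uploads an image and retrieves the recognition result. The code shows how to upload a local image to Sogou's server and then call its OCR service to extract the text in the image.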

The details are explained in the code comments. If you would rather not use TensorFlow yourself, you can use the API below.

This is the image to recognize:

The final recognition result:

This is a lot of 12 point text to test the
ocr code and see if it works on all types
of file format.
The quick brown dog jumped over the
lazy fox.The quick brown dog jumped
over the lazy fox.The quick brown dog
jumped over the lazy fox.The quick
brown dog jumped over the lazy fox.

Code:

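# -*- coding: utf-8 -*-
# Time: 2019.4.25
# Author: maxiaohui
# Title: Sogou OCR recognition API
# This code involves Fiddler, which was used for packet capture.

import requests  # third-party HTTP library


def post_image():
    img = "one.png"  # path to the local image
    url = "http://pic.sogou.com/pic/upload_pic.jsp"  # upload endpoint (POST)
    with open(img, "rb") as f:  # close the file once the upload finishes
        files = {"pic_path": f}  # multipart form field, passed much like data
        html = requests.post(url, files=files).text.strip()  # upload the image
    print('html is ', html)
    # The response body is the URL of the image: Sogou stores the uploaded
    # local file on its own server. Pass that URL to the parsing function.
    get_content(html)


def get_content(keywords):
    # keywords is the image URL; the OCR endpoint is called with a GET request
    url = "http://pic.sogou.com/pic/ocr/ocrOnline.jsp"
    # params= percent-encodes the image URL instead of concatenating it raw
    ocrResult = requests.get(url, params={"query": keywords}).json()  # decode JSON
    contents = ocrResult['result']  # a list of dicts, each holding recognized text
    for content in contents:  # iterate over all results
        print(content['content'].strip())  # strip whitespace; each result ends with a newline


post_image()  # run the upload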
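
The two parts of get_content that do not need the network, building the request URL and reading the decoded response, can be checked on their own. What follows is a minimal sketch rather than the author's code: `build_ocr_url` and `extract_lines` are hypothetical helper names, and the sample response is hand-written. It only assumes the shape the script relies on, a `result` list whose items carry a `content` string.

```python
from urllib.parse import urlencode

# Endpoint taken from the script above.
OCR_ENDPOINT = "http://pic.sogou.com/pic/ocr/ocrOnline.jsp"


def build_ocr_url(image_url):
    """Return the OCR request URL with the image URL percent-encoded."""
    # urlencode escapes ':', '/', '&' and '?', so an image URL that has its
    # own query string cannot leak extra parameters into the request.
    return OCR_ENDPOINT + "?" + urlencode({"query": image_url.strip()})


def extract_lines(ocr_result):
    """Collect recognized text lines from a decoded OCR response.

    Assumes the shape {"result": [{"content": "..."}, ...]}; a missing
    "result" key yields an empty list instead of raising KeyError.
    """
    lines = []
    for item in ocr_result.get("result", []):
        text = item.get("content", "").strip()  # drop the trailing newline
        if text:
            lines.append(text)
    return lines


# Hand-written sample in the assumed shape, not a captured server response.
sample = {"result": [{"content": "The quick brown dog jumped over the\n"},
                     {"content": "lazy fox.\n"}]}
print(build_ocr_url("http://example.com/a.png?x=1&y=2\n"))
print(extract_lines(sample))
```

Keeping these steps as pure functions means the parsing logic can be exercised even when the Sogou endpoints are unreachable.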