Spring Boot: Image Text Recognition with Tesseract (OCR)

大牛哥哥

Tags: spring boot

Article link: https://blog.csdn.net/weixin_54586234/article/details/128052754

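Use case: As artificial intelligence has developed, many convenient and efficient applications have appeared in everyday life, such as face recognition, ID document verification, business card recognition, and license plate recognition. The same technology can also provide intelligent, efficient solutions for routine business workflows. Day-to-day trading and clearing operations involve many documents received by email or fax, mainly trade confirmations, time deposit agreements, matched-trade instructions, interbank fee statements, offline allotment announcements, and overseas broker confirmations. Today, staff read these documents and key the relevant fields into a system by hand. OCR can recognize and correct the content of the document image, extract the key fields, and pass them to the relevant systems, which reduces manual entry, improves efficiency, and lowers the rate of data-entry errors.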
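
As a rough illustration of the "extract key fields" step in the use case, here is a minimal sketch that pulls two fields out of already recognized text with regular expressions. The class name `FieldExtractor`, the field names, and the `Trade Date:` / `Amount:` label layout are all hypothetical; real documents would need patterns written for their own layouts.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Tesseract or Tess4J.
final class FieldExtractor {
    // Assumed label/value layouts. Adjust to the documents actually processed.
    private static final Pattern TRADE_DATE =
            Pattern.compile("Trade Date\\s*:\\s*(\\d{4}-\\d{2}-\\d{2})");
    private static final Pattern AMOUNT =
            Pattern.compile("Amount\\s*:\\s*([0-9][0-9,]*(?:\\.[0-9]+)?)");

    // Returns the fields found in the OCR text; fields that are missing are simply omitted.
    static Map<String, String> extract(String ocrText) {
        Map<String, String> fields = new LinkedHashMap<>();
        capture(fields, "tradeDate", TRADE_DATE, ocrText);
        capture(fields, "amount", AMOUNT, ocrText);
        return fields;
    }

    private static void capture(Map<String, String> fields, String name, Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        if (m.find()) {
            // Strip thousands separators so the value can be parsed as a number later.
            fields.put(name, m.group(1).replace(",", ""));
        }
    }

    public static void main(String[] args) {
        String sample = "Trade Date: 2022-11-26\nAmount: 1,250,000.50\n";
        System.out.println(extract(sample));
    }
}
```

OCR output is noisy, so in practice the extracted values would still need validation before being written to a downstream system.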

Tesseract-OCR (the package includes the official Chinese language data; you need to configure the environment variable path to tessdata yourself)
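
Since the tessdata location has to be configured by hand, one option is to read it from the environment instead of hard-coding it. The sketch below assumes the variable is named `TESSDATA_PREFIX` (the name Tesseract itself reads; whether it should point at `tessdata` or at its parent directory depends on the Tesseract version, so treat that as an assumption to verify). `TessdataLocator` and its methods are hypothetical helpers, not part of any library.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;

// Hypothetical helper for illustration only.
final class TessdataLocator {
    // Picks the tessdata directory from the given environment map,
    // falling back to a default path when the variable is unset or blank.
    static Path resolve(Map<String, String> env, String fallback) {
        String configured = env.get("TESSDATA_PREFIX");
        boolean missing = configured == null || configured.trim().isEmpty();
        return Paths.get(missing ? fallback : configured.trim());
    }

    // True if the trained data file for the language (e.g. chi_sim.traineddata) is present.
    static boolean hasLanguage(Path tessdata, String lang) {
        return Files.isRegularFile(tessdata.resolve(lang + ".traineddata"));
    }

    public static void main(String[] args) {
        Path dir = resolve(System.getenv(), "C:\\Program Files (x86)\\Tesseract-OCR\\tessdata");
        System.out.println("tessdata directory: " + dir);
        System.out.println("chi_sim available: " + hasLanguage(dir, "chi_sim"));
    }
}
```

Checking for the language file up front gives a clearer error than letting recognition fail later with a missing-data message.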

Aliyun Drive share (extraction code: v18l)

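    import java.awt.image.BufferedImage;
    import java.io.File;
    import javax.imageio.ImageIO;

    import net.sourceforge.tess4j.Tesseract;
    import net.sourceforge.tess4j.util.ImageHelper;

    // The class name OcrDemo is a placeholder; the original snippet showed only the methods.
    public class OcrDemo {

        // Runs OCR on the image at srImage; zh = true selects Simplified Chinese.
        public static String FindOCR(String srImage, boolean zh) {
            try {
                System.out.println("Recognizing");
                File imageFile = new File(srImage);
                if (!imageFile.exists()) {
                    return "Image does not exist";
                }
                BufferedImage textImage = ImageIO.read(imageFile);
                if (textImage == null) { // ImageIO.read returns null for unsupported formats
                    return "Recognition failed";
                }
                textImage = ImageHelper.convertImageToGrayscale(textImage); // convert to grayscale
                // textImage = textImage.getSubimage(0, 0, 300, 80); // optionally crop to a region
                // Enlarge 10x to help with small text; large inputs may need a smaller factor.
                textImage = ImageHelper.getScaledInstance(textImage, textImage.getWidth() * 10, textImage.getHeight() * 10);
                // Newer Tess4J releases deprecate Tesseract.getInstance() in favor of the constructor.
                Tesseract instance = new Tesseract();
                instance.setDatapath("C:\\Program Files (x86)\\Tesseract-OCR\\tessdata"); // trained data directory
                if (zh) {
                    instance.setLanguage("chi_sim"); // Simplified Chinese
                }
                return instance.doOCR(textImage);
            } catch (Exception e) {
                e.printStackTrace();
                return "Recognition failed";
            }
        }

        public static void main(String[] args) throws Exception {
            String result = FindOCR("D:\\WWWROOTYYKJ\\oc\\1111111111111111111111111111111111111.png", true);
            System.out.println(result);
        }
    }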
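
The fixed 10x enlargement in `FindOCR` works for small crops, but it multiplies the pixel count by 100, so a large scan can exhaust memory. A minimal sketch of one way to cap the factor follows; `ScaleLimiter`, `cappedFactor`, and the pixel budget are hypothetical names and values chosen for illustration, not part of Tess4J.

```java
// Hypothetical helper for illustration only.
final class ScaleLimiter {
    // Returns the largest integer factor not above `requested` for which the
    // scaled image (width * factor by height * factor) stays within maxPixels.
    // Never returns less than 1, so the image is at worst left at its original size.
    static int cappedFactor(int width, int height, int requested, long maxPixels) {
        int factor = Math.max(1, requested);
        while (factor > 1 && (long) width * factor * height * factor > maxPixels) {
            factor--;
        }
        return factor;
    }

    public static void main(String[] args) {
        long budget = 48_000_000L; // assumed pixel budget; tune to the available heap
        System.out.println("300x80 crop: factor " + cappedFactor(300, 80, 10, budget));
        System.out.println("4000x3000 scan: factor " + cappedFactor(4000, 3000, 10, budget));
    }
}
```

The result could replace the literal `10` in the `getScaledInstance` call, for example `textImage.getWidth() * factor`.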
