Scenario
Add the dependency:
<dependency>
    <groupId>net.sourceforge.tess4j</groupId>
    <artifactId>tess4j</artifactId>
    <version>4.1.1</version>
</dependency>
Download location for the Chinese language data (traineddata): click here
Test
```java
import net.sourceforge.tess4j.Tesseract;
import net.sourceforge.tess4j.TesseractException;

import java.io.File;

/**
 * @author 飞宇千虹
 * @date 2023-06-28 20:38
 */
// Recognize the text in an image
public class tess4j {

    public static void main(String[] args) throws TesseractException {
        // Create the instance
        Tesseract tesseract = new Tesseract();
        // Set the directory that holds the language data (tessdata)
        tesseract.setDatapath("E:\\桌面\\learning-files\\code\\toutiao\\day4\\tessdata");
        // Set the recognition language
        tesseract.setLanguage("chi_sim");
        // Run OCR on the image
        File file = new File("E:\\桌面\\learning-files\\code\\toutiao\\day4\\text.png");
        // replaceAll takes a regex, so every \r or \n becomes "-"
        String s = tesseract.doOCR(file).replaceAll("\\r|\\n", "-");
        System.out.println(s);
    }
}
```
Error message
```
Error opening data file E:\桌面\learning-files\code\toutiao\day4\tessdata/chi_sim.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language 'chi_sim'
Tesseract couldn't load any languages!
Exception in thread "main" java.lang.Error: Invalid memory access
	at com.sun.jna.Native.invokePointer(Native Method)
	at com.sun.jna.Function.invokePointer(Function.java:470)
	at com.sun.jna.Function.invoke(Function.java:404)
	at com.sun.jna.Function.invoke(Function.java:315)
	at com.sun.jna.Library$Handler.invoke(Library.java:212)
	at com.sun.proxy.$Proxy0.TessBaseAPIGetUTF8Text(Unknown Source)
	at net.sourceforge.tess4j.Tesseract.getOCRText(Tesseract.java:495)
	at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:358)
	at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:227)
	at net.sourceforge.tess4j.Tesseract.doOCR(Tesseract.java:194)
	at tess4j.main(tess4j.java:22)

Process finished with exit code 1
```
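Solution
The directory that holds chi_sim.traineddata has Chinese characters in its path (桌面), so the language data fails to load. Moving the tessdata directory to a path without Chinese characters and pointing setDatapath at the new location fixed it.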
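Based on the diagnosis above, a small guard can catch this before Tesseract is called, so the failure shows up as a readable Java exception instead of "Invalid memory access". This is only a sketch: `TessdataPathCheck`, `isAsciiOnly`, and `requireAsciiPath` are made-up names, and `E:\tessdata` is a hypothetical location for the moved directory. It also assumes that any non-ASCII character in the path is a problem, which is stricter than what this post actually observed (Chinese characters only).

```java
import java.nio.charset.StandardCharsets;

public class TessdataPathCheck {

    // Returns true when every character in the path can be encoded as US-ASCII.
    public static boolean isAsciiOnly(String path) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(path);
    }

    // Fails fast with a readable message when the path contains non-ASCII characters.
    public static String requireAsciiPath(String path) {
        if (!isAsciiOnly(path)) {
            throw new IllegalArgumentException(
                    "tessdata path contains non-ASCII characters: " + path);
        }
        return path;
    }

    public static void main(String[] args) {
        // The path from this post: contains 桌面, so the check fails.
        System.out.println(isAsciiOnly("E:\\桌面\\learning-files\\code\\toutiao\\day4\\tessdata")); // false
        // Hypothetical relocated directory: ASCII only, so the check passes.
        System.out.println(isAsciiOnly("E:\\tessdata")); // true
    }
}
```

In the test class above, the call would then look like `tesseract.setDatapath(TessdataPathCheck.requireAsciiPath("E:\\tessdata"));`.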