tesseract ocr java: Image Text Recognition in Java with the Open-Source tesseract-ocr Engine

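This Java program uses the Tesseract OCR engine to recognize text in images. The user enters an image folder and an output folder. The program then walks the image folder, runs text recognition on every file whose extension is in its list of image formats, and saves each result as a text file.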
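The folder scan described above can be sketched in isolation. This is a minimal sketch, not the author's code: the class `ImageBatch`, its method names, and the extension set are all hypothetical, and the set is only an assumption about which formats the OCR step can read. Unlike the program in this post, it strips only the last extension when deriving the output file name.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Hypothetical helper, not part of the original program. */
final class ImageBatch {
    // Assumed set of extensions the OCR step can read; adjust to your setup.
    private static final Set<String> EXTENSIONS =
            new HashSet<>(Arrays.asList("png", "jpg", "jpeg", "bmp", "tif", "tiff", "gif"));

    private ImageBatch() {}

    /** Returns the lower-cased extension without the dot, or "" if there is none. */
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        return dot < 0 ? "" : fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    /** True if the file name ends in one of the assumed image extensions. */
    static boolean isSupportedImage(String fileName) {
        return EXTENSIONS.contains(extensionOf(fileName));
    }

    /** Maps an image to saveDir/(base name).txt, stripping only the last extension. */
    static File outputFileFor(File image, File saveDir) {
        String name = image.getName();
        int dot = name.lastIndexOf('.');
        String base = dot <= 0 ? name : name.substring(0, dot);
        return new File(saveDir, base + ".txt");
    }

    /** Lists supported images directly inside dir; empty if dir is missing or unreadable. */
    static List<File> listImages(File dir) {
        List<File> images = new ArrayList<>();
        File[] entries = dir.listFiles(); // null when dir is not a readable directory
        if (entries == null) {
            return images;
        }
        for (File entry : entries) {
            if (entry.isFile() && isSupportedImage(entry.getName())) {
                images.add(entry);
            }
        }
        return images;
    }
}
```

A caller would loop over `ImageBatch.listImages(inputDir)` and write each recognition result to `ImageBatch.outputFileFor(image, saveDir)`.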

package major;

import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.io.File;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

import javax.swing.Box;
import javax.swing.JButton;
import javax.swing.JFormattedTextField;
import javax.swing.JFrame;
import javax.swing.JLabel;
import javax.swing.JOptionPane;

import tools.update; // the author's helper for writing text to a file (not shown in this post)

public class Test {

    // Loop guard; the disabled "Stop" button further down was meant to clear it.
    private static boolean t = true;

    // Extensions that are passed on to the OCR step, compared case-insensitively.
    private static final Set<String> IMAGE_TYPES = new HashSet<>(Arrays.asList(
            "png", "jpg", "bmp", "tiff", "gif", "pcx", "tga", "fpx", "svg", "psd",
            "cdr", "pcd", "dxf", "ufo", "eps", "ai", "raw"));

    /**
     * @param args unused
     */
    public static void main(String[] args) {
        // Earlier single-file experiment, left disabled by the author:
        // try {
        //     String maybe2 = new OCR().recognizeText(new File("E:\\temp\\222.png"), "png");
        //     update.updateFromWeb(maybe2, "E:\\temp\\222.txt", true);
        //     System.out.println(maybe2);
        //     System.out.println("**********");
        //     MyString str = new MyString();
        //     System.out.println(str.getString(maybe2));
        // } catch (Exception e) {
        //     e.printStackTrace();
        // }
        // SoundServer s = new SoundServer();
        // s.playSound("E:\\111\\HOOK1.wav");

        Box form = Box.createVerticalBox();

        form.add(new JLabel("Image folder:"));
        final JFormattedTextField webUrl = new JFormattedTextField();
        webUrl.setValue("e:/temp/");
        form.add(webUrl);

        form.add(new JLabel("Output folder:"));
        final JFormattedTextField saveUrl = new JFormattedTextField();
        saveUrl.setValue("e:/temp/");
        form.add(saveUrl);

        JButton button = new JButton("Start");
        button.addActionListener(new ActionListener() {
            @Override
            public void actionPerformed(ActionEvent e) {
                String fileUrls = (String) webUrl.getValue();
                String saveUrls = (String) saveUrl.getValue();
                try {
                    File[] tempList = new File(fileUrls).listFiles();
                    // listFiles() returns null when the path is not a readable directory.
                    if (tempList == null) {
                        JOptionPane.showMessageDialog(null, "The folder does not exist or cannot be read.",
                                "Notice", JOptionPane.INFORMATION_MESSAGE);
                        return;
                    }
                    for (int i = 0; t && i < tempList.length; i++) {
                        String[] parts = tempList[i].getName().split("\\.");
                        String type = parts[parts.length - 1]; // text after the last dot
                        if (!IMAGE_TYPES.contains(type.toLowerCase(Locale.ROOT))) {
                            continue; // not in the image format list
                        }
                        String name = parts[0]; // text before the first dot
                        // OCR is the author's Tesseract wrapper (not shown in this post).
                        String maybe2 = new OCR().recognizeText(tempList[i], type);
                        update.updateFromWeb(maybe2, saveUrls + "/" + name + ".txt", true);
                    }
                } catch (Exception e1) {
                    e1.printStackTrace();
                }
            }
        });
        form.add(button);

        // Planned "Stop" button, left disabled by the author:
        // JButton button1 = new JButton("Stop");
        // button1.addActionListener(new ActionListener() {
        //     @Override
        //     public void actionPerformed(ActionEvent e) {
        //         d.setT(false);
        //         d = new Demo();
        //     }
        // });
        // form.add(button1);

        JFrame frame = new JFrame("User Information");
        frame.getContentPane().add(form);
        frame.pack();
        frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
        frame.setVisible(true);
    }
}
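The post does not show the `OCR` class that the listing calls. One common way to reach Tesseract from Java is to run its command-line program, which takes an image path and an output base name and writes the recognized text to that base name plus `.txt`. The following is a minimal sketch under that assumption, not the author's implementation: the class `TesseractCli` and its methods are hypothetical, a `tesseract` executable is assumed to be on the `PATH`, and the second argument here is a language code such as `eng`, whereas the listing above passes the image format.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for the OCR class that the post does not show. */
final class TesseractCli {
    private TesseractCli() {}

    /** Builds the command line: tesseract IMAGE OUTPUT_BASE -l LANG. */
    static List<String> buildCommand(File image, File outputBase, String lang) {
        return Arrays.asList(
                "tesseract", image.getAbsolutePath(), outputBase.getAbsolutePath(), "-l", lang);
    }

    /** Runs Tesseract on one image and returns the recognized text. */
    static String recognizeText(File image, String lang) throws IOException, InterruptedException {
        // Tesseract appends ".txt" to the output base name on its own.
        File outputBase = File.createTempFile("ocr", "");
        File outputText = new File(outputBase.getPath() + ".txt");
        try {
            Process process = new ProcessBuilder(buildCommand(image, outputBase, lang))
                    .inheritIO() // forward Tesseract's console output so the child cannot block on a full pipe
                    .start();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("tesseract exited with code " + exitCode);
            }
            return new String(Files.readAllBytes(outputText.toPath()), StandardCharsets.UTF_8);
        } finally {
            // Best-effort cleanup of the temporary files.
            outputBase.delete();
            outputText.delete();
        }
    }
}
```

With Tesseract and the matching language data installed, `TesseractCli.recognizeText(new File("e:/temp/222.png"), "eng")` would return the text of that image. Starting one process per image is simple but slow for large folders.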
