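Importing data from Excel into a database is not hard in Java or C#, and even importing data from a txt file is no big deal. Recently, though, I took on a task: the school is building an English-language teaching platform, so every college's course materials and course descriptions are in English. The school has 20 colleges; the larger ones have around two hundred courses each, and even the smaller ones have eighty to a hundred. The catch is that all of this came as Word files, and very inconsistently formatted ones at that. That had me worried. The first thing I thought of was POI, so I googled it, and reading the text turned out to be really easy. The code is posted below and handles both the 2003 and the 2007 formats. The main difference between the two is the jar dependencies: the 2003 format needs 3 jars, the 2007 format needs 7.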
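import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;

import org.apache.poi.POIXMLDocument;
import org.apache.poi.POIXMLTextExtractor;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.openxml4j.opc.OPCPackage;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;

/**
 * Test class that uses POI to read the text content of Word 2003 and Word 2007 files.
 * @createDate 2009-07-25
 * @author Carl He
 */
public class Test {
    public static void main(String[] args) {
        try {
            // Word 2003: images are not read
            InputStream is = new FileInputStream(new File("files\\2003.doc"));
            WordExtractor ex = new WordExtractor(is); // is: the InputStream of the Word file
            String text2003 = ex.getText();
            System.out.println(text2003);
            // Split the string into its parts here
            // Word 2007: images are not read; table data is placed at the end of the string
            OPCPackage opcPackage = POIXMLDocument.openPackage("files\\2007.docx");
            POIXMLTextExtractor extractor = new XWPFWordExtractor(opcPackage);
            String text2007 = extractor.getText();
            System.out.println(text2007);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}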
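The code above only marks where the extracted string should be split and does not show that step. As a minimal sketch, assuming the extractor returns paragraphs separated by line breaks, the text could first be broken into trimmed, non-empty lines. `TextLines` and `splitLines` are hypothetical names used for illustration and are not part of POI; the sample string in `main` is made up.

```java
import java.util.ArrayList;
import java.util.List;

public class TextLines {
    // Splits text on \r\n, \r or \n, trims each piece, and drops blank lines.
    public static List<String> splitLines(String text) {
        List<String> lines = new ArrayList<String>();
        for (String raw : text.split("\\r\\n|\\r|\\n")) {
            String line = raw.trim();
            if (!line.isEmpty()) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Illustrative input only; real input would be the String returned by getText().
        String sample = "Course Name: Biochemistry\r\n\r\n  Course Code: B1700025  \n";
        for (String line : splitLines(sample)) {
            System.out.println(line);
        }
    }
}
```

Working on clean lines keeps the later field matching independent of whether the text came from the 2003 or the 2007 extractor.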
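The real problem, though, is how to pull the individual fields out of the Word text. The files they provided are Word documents, not Excel sheets, so they cannot be imported directly. Let me just show one of the Word files, which makes this easier to follow: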
Course Description of Biochemistry
Course Name: Biochemistry Nature of Course:Compulsory course
Course Code: B1700025 Total Credits: 5.0
Total Credit Hours:80 Lecture Hours:80
Experimental Hours: 0 Oriented Majors: Bioscience, Biotechnology
Prerequisite Courses:
Penner: Validator(s):
Briefing of Course Content:
Biochemistry is a science exploring the chemical compositions and chemical reactions during life activities of living organisms. It is an important compulsive fundamental course for undergraduates majoring in bioscience and biotechnology. The main content of this course includes 1. The structure, function and the relationship between the structure and function of biological macromolecule such as protein and nucleic acid; 2. The metabolisms and regulation of biological macromolecules including carbohydrate, lipid, protein, nucleic acid etc.; 3. The transfer and expression of genetic information.
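One possible way to pull the fields out of text like the sample above is to treat the known labels as delimiters: each value runs from the colon after its label up to the start of the next label. The sketch below is only an illustration of that idea, not necessarily the approach taken in the code that follows. The class name `CourseFieldParser` is hypothetical, and the label list is an assumption taken from this one sample, so other files may need more labels.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CourseFieldParser {
    // Labels as they appear in the sample file (assumed to be the full set).
    private static final String[] LABELS = {
        "Course Name", "Nature of Course", "Course Code", "Total Credits",
        "Total Credit Hours", "Lecture Hours", "Experimental Hours",
        "Oriented Majors", "Prerequisite Courses", "Penner", "Validator(s)",
        "Briefing of Course Content"
    };

    // Returns label -> value, in the order the labels occur in the text.
    public static Map<String, String> parse(String text) {
        StringBuilder alternation = new StringBuilder();
        for (String label : LABELS) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(label)); // quote "(s)" and similar
        }
        // A label, optional whitespace, then a colon.
        Pattern labelPattern = Pattern.compile("(" + alternation + ")\\s*:");
        Matcher m = labelPattern.matcher(text);

        Map<String, String> fields = new LinkedHashMap<String, String>();
        String currentLabel = null;
        int valueStart = 0;
        while (m.find()) {
            if (currentLabel != null) {
                // The previous value ends where this label begins.
                fields.put(currentLabel, text.substring(valueStart, m.start()).trim());
            }
            currentLabel = m.group(1);
            valueStart = m.end();
        }
        if (currentLabel != null) {
            // The last label takes everything up to the end of the text.
            fields.put(currentLabel, text.substring(valueStart).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        String sample = "Course Name: Biochemistry Nature of Course:Compulsory course\n"
                + "Course Code: B1700025 Total Credits: 5.0\n"
                + "Total Credit Hours:80 Lecture Hours:80\n";
        for (Map.Entry<String, String> e : parse(sample).entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Because the match is anchored on labels rather than on line breaks or spacing, it tolerates two fields on one line and a missing space after the colon, both of which occur in the sample. Empty fields such as `Prerequisite Courses:` simply come back as empty strings.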
import java.io.File;
import java.util.ArrayList;
public class Directory {
private ArrayList nameList = new ArrayList();
private static String dirName = "d:\\Eclip