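boolean ICTCLAS_FileProcess(byte[] sSrcFilename, int eCodeType, int bPOSTagged, byte[] sDestFilename) interface: this interface is similar to ICTCLAS_ParagraphProcess, except that it processes files. It segments and POS-tags the entire content of the source file and writes the result to the destination file.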
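* Method: ICTCLAS_FileProcess<!Segment a text file>
* Parameter: byte[] sSrcFilename<!The file to segment>
* Parameter: int eCodeType<!Encoding type of the text to process>
* Parameter: int bPOSTagged<!Whether to add POS tags: 0 = no tagging, 1 = tagging. The tag set used is determined by the value set with ICTCLAS_SetPOSmap>
* Parameter: byte[] sDestFilename<!Where to store the result file>
* Returns: ICTCLAS_API bool<!Whether segmentation succeeded>
* Description: 1. If the user does not specify where to save the segmentation result, the system saves it to test_result.txt in the current directory.
2. sDestFilename: if the file does not exist, it is created automatically; otherwise its existing content is cleared first.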
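The two destination-file rules in the description can be modelled with the Java standard library alone. The following is a minimal sketch, not ICTCLAS code: the class `DestFileSketch` and its methods `resolveDest` and `writeResult` are hypothetical names, and treating a null or blank name as "not specified" is an assumption, since the description does not say how an unspecified destination is passed.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical helper: models the documented destination-file rules only.
// It does not call ICTCLAS.
final class DestFileSketch {
    // Default file name taken from rule 1 of the description.
    static final String DEFAULT_RESULT = "test_result.txt";

    // Rule 1: with no destination given, fall back to test_result.txt
    // in the current directory. Treating null/blank as "not given" is
    // an assumption of this sketch.
    static Path resolveDest(String dest) {
        if (dest == null || dest.trim().isEmpty()) {
            return Paths.get(DEFAULT_RESULT);
        }
        return Paths.get(dest);
    }

    // Rule 2: create the file if it is missing, otherwise clear the
    // existing content before writing the new result.
    static void writeResult(Path dest, String result) throws IOException {
        Files.write(dest, result.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE,
                StandardOpenOption.TRUNCATE_EXISTING,
                StandardOpenOption.WRITE);
    }

    public static void main(String[] args) throws IOException {
        Path dest = Files.createTempFile("ictclas_demo", ".txt");
        writeResult(dest, "old content that is longer");
        writeResult(dest, "new"); // the earlier content is discarded
        System.out.println(new String(Files.readAllBytes(dest), StandardCharsets.UTF_8));
        Files.delete(dest);
        System.out.println(resolveDest("")); // falls back to the default name
    }
}
```

The practical consequence of rule 2 is that reusing the same destination name across runs discards the earlier result, so callers that need to keep previous output should pass a fresh file name each time.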
Usage example:
- package ICTCLAS.I3S.test;
- import java.io.BufferedReader;
- import java.io.IOException;
- import java.io.InputStreamReader;
- import java.io.UnsupportedEncodingException;
- import ICTCLAS.I3S.AC.ICTCLAS50;
- public class Test_ICTCLAS_FileProcess {
- /**
- * @param args
- */
- public static void main(String[] args) {
- // Create the segmenter instance
- ICTCLAS50 ictclas = new ICTCLAS50();
- String usage = "java Test_ICTCLAS_FileProcess sPath [nPOSmap]";
- if (args.length < 1) {
- System.err.println(usage);
- return;
- }
- try {
- if (!ictclas.ICTCLAS_Init(args[0].getBytes("GB2312"))) {
- System.err.println("Initialization failed!");
- return;
- }
- System.out.println("Initialization succeeded!");
- /* Set the POS tag set (0: ICT level-2 tag set, 1: ICT level-1 tag set, 2: PKU level-2 tag set, 3: PKU level-1 tag set) */
- int nPosmap = args.length == 2 ? Integer.parseInt(args[1]) : 1;
- ictclas.ICTCLAS_SetPOSmap(nPosmap);
- BufferedReader reader = new BufferedReader(new InputStreamReader(
- System.in, "GB2312"));
- System.out.print("input the src file:");
- String srcFilename = reader.readLine();
- System.out.print("input the dst file:");
- String dstFilename = reader.readLine();
- if (ictclas.ICTCLAS_FileProcess(srcFilename.getBytes("GB2312"), 0,
- 1, dstFilename.getBytes("GB2312"))) {
- System.out.println("Processing succeeded!");
- } else {
- System.err.println("Processing failed!");
- }
- } catch (UnsupportedEncodingException e) {
- // GB2312 is not supported by this JVM
- e.printStackTrace();
- } catch (IOException e) {
- // Reading a file name from standard input failed
- e.printStackTrace();
- } finally {
- ictclas.ICTCLAS_Exit();
- }
- }
- }
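The example above passes user input straight to ICTCLAS_FileProcess. As a hedged sketch, the arguments could be checked and encoded before the call. The class `FileProcessArgs` and its factory `of` below are hypothetical and not part of the ICTCLAS API. The sketch only prepares the `byte[]` file names and the 0/1 `bPOSTagged` flag described in the parameter list, and it assumes the caller chooses the charset for the file names (the example uses GB2312).

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight helper; not part of the ICTCLAS API.
final class FileProcessArgs {
    final byte[] srcFilename;  // encoded source file name
    final byte[] destFilename; // encoded destination file name
    final int posTagged;       // 0: no POS tagging, 1: POS tagging

    private FileProcessArgs(byte[] src, byte[] dest, int posTagged) {
        this.srcFilename = src;
        this.destFilename = dest;
        this.posTagged = posTagged;
    }

    // Rejects a missing source file early and maps the boolean flag to
    // the 0/1 value documented for bPOSTagged.
    static FileProcessArgs of(String src, String dest, boolean tag, Charset cs) {
        if (!Files.isRegularFile(Paths.get(src))) {
            throw new IllegalArgumentException("source file not found: " + src);
        }
        return new FileProcessArgs(src.getBytes(cs), dest.getBytes(cs), tag ? 1 : 0);
    }

    public static void main(String[] args) throws IOException {
        // Fall back to the platform default if GB2312 is unavailable.
        Charset cs = Charset.isSupported("GB2312")
                ? Charset.forName("GB2312") : Charset.defaultCharset();
        Path src = Files.createTempFile("ictclas_src", ".txt");
        FileProcessArgs a = of(src.toString(), "result.txt", true, cs);
        System.out.println("bPOSTagged = " + a.posTagged);
        Files.delete(src);
    }
}
```

With such a helper, the call in the example would take `a.srcFilename`, the chosen `eCodeType`, `a.posTagged`, and `a.destFilename` as its four arguments.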