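1. First, extract ICTCLAS_api_part1.rar and ICTCLAS_api_part2.rar.
2. Put the files extracted from ICTCLAS_api_part1.rar into the root directory of the Java project (as shown in the figure below).
3. Then put the files extracted from ICTCLAS_api_part2.rar into the src folder (as shown in the figure below).
4. You can now call the ICTCLAS API from your program. The steps below cover the most common operation: segmenting a sentence and getting the result back as a String[] array: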
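A. First, import the segmenter class (and the exception type used by the code below):
import java.io.UnsupportedEncodingException;
import ICTCLAS.I3S.AC.ICTCLAS50;
B. Then initialize the segmenter and segment a sentence: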
ICTCLAS50 testICTCLAS50 = new ICTCLAS50();
// Directory that contains Configure.xml and the Data directory
String argu = ".";
// Initialize the segmenter
try {
    if (!testICTCLAS50.ICTCLAS_Init(argu.getBytes("GB2312"))) {
        System.out.println("Init Fail!");
        throw new Exception("Initialization error");
    }
} catch (UnsupportedEncodingException e1) {
    // The JRE does not support the GB2312 encoding
    e1.printStackTrace();
} catch (Exception e1) {
    // Reached when initialization fails
    e1.printStackTrace();
}
String s = "我想要去北京旅游";
// Segment before importing any user dictionary
byte[] nativeBytes;
try {
    // Run the segmenter on the GB2312-encoded sentence
    nativeBytes = testICTCLAS50.ICTCLAS_ParagraphProcess(s.getBytes("GB2312"), 0, 0);
    //System.out.println(nativeBytes.length);
    // Decode the returned bytes, then split the space-separated words
    String nativeStr = new String(nativeBytes, 0, nativeBytes.length, "GB2312");
    String[] wordStrings = nativeStr.split(" ");
    for (String string : wordStrings) {
        System.out.println(string);
    }
} catch (UnsupportedEncodingException e1) {
    // The JRE does not support the GB2312 encoding
    e1.printStackTrace();
}
Program output:
我
想
要
去
北京
旅游
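
The decode-and-split part of step B can be pulled into a small helper. The following is only a minimal sketch and not part of ICTCLAS: the class name SegmentedOutput and the methods fromGb2312 and toWords are hypothetical names made up for illustration, and it assumes the segmenter returns GB2312 bytes with words separated by spaces, as in the code above. It uses java.nio.charset.Charset so that no UnsupportedEncodingException has to be caught, and it splits on runs of whitespace so that a trailing or doubled space does not produce empty entries. The sample bytes in main are built by hand as a stand-in for real segmenter output; the Unicode escapes spell the words of the sample sentence.

```java
import java.nio.charset.Charset;

final class SegmentedOutput {

    // Decodes GB2312 bytes (assumed to be what the segmenter returns)
    // and splits the decoded text into words.
    static String[] fromGb2312(byte[] nativeBytes) {
        // Charset.forName throws an unchecked exception if GB2312 is missing,
        // so no checked UnsupportedEncodingException is involved here.
        Charset gb2312 = Charset.forName("GB2312");
        return toWords(new String(nativeBytes, gb2312));
    }

    // Splits whitespace-separated text into words, ignoring leading and
    // trailing whitespace and runs of several spaces.
    static String[] toWords(String segmented) {
        String trimmed = segmented.trim();
        if (trimmed.isEmpty()) {
            return new String[0];
        }
        return trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        // Hand-built stand-in for segmenter output (NOT produced by ICTCLAS):
        // the six words of the sample sentence, with a doubled and a trailing space.
        String sample = "\u6211 \u60f3  \u8981 \u53bb \u5317\u4eac \u65c5\u6e38 ";
        byte[] fakeNativeBytes = sample.getBytes(Charset.forName("GB2312"));

        String[] words = fromGb2312(fakeNativeBytes);
        System.out.println(words.length + " words");
        for (String word : words) {
            System.out.println(word);
        }
    }
}
```

In the code of step B, the result of ICTCLAS_ParagraphProcess would be passed where fakeNativeBytes is used here. With the hand-built sample, the helper yields six words despite the extra spaces.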