How the regex for word parsing evolved (raw attempts, not cleaned up).
[code]
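import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordParsing {
    public static void main(String[] args) {
        String testStr = "user input some text but havn't don't didn't i'm she's slslslkdfj we're";
        System.out.println(testStr);

        // Attempt 1: replace whole contractions (a word plus 't, 's, 'm, 're, or a bare apostrophe).
        String contraction = "(\\w+'t)|(\\w+'s)|(\\w+'m)|(\\w+'re)|(\\w+')";
        String sub1 = testStr.replaceAll(contraction, "----");
        // Attempt 2: replace only the suffix. This needs alternation, not a character class:
        // ['t|'re|'m|'s] would match any single one of the characters ' t | r e m s.
        String sub2 = testStr.replaceAll("'t|'re|'m|'s", "----");
        // Attempt 3: split on the contractions, keeping the text between them.
        String[] sub3 = testStr.split(contraction);
        System.out.println(sub1);

        Pattern p = Pattern.compile(contraction);
        // Abandoned attempt: Pattern.compile("(\\s[0,1]\\w+&&^)[n't|'re|'m|'s]");
        String mixChar = "user input some& _ text but <<<>>>- - havn't don't didn't i'm she's slslslkdfj we're";
        // Attempt 4: split on anything that is not a lowercase letter, a digit, or an apostrophe.
        sub3 = mixChar.split("[^a-z0-9']");

        // find() scans for successive matches; matches() tests the whole string and is not needed here.
        Matcher m = p.matcher(testStr);
        while (m.find()) {
            int t = m.start();
            int b = m.end();
            String sub = testStr.substring(t, b);
            System.out.println(sub);
        }
    }
}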
[/code]
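One possible next step in this evolution, sketched here as an assumption rather than as the approach the post settled on, is to stop splitting on separators and instead match the words directly, treating an apostrophe suffix as an optional part of the word. The class name `WordExtractor`, the method `extractWords`, and the pattern itself are hypothetical and do not appear in the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the original post.
class WordExtractor {
    // A run of lowercase letters or digits, optionally followed by an
    // apostrophe and more letters (covers 't, 're, 'm, 's and similar).
    private static final Pattern WORD = Pattern.compile("[a-z0-9]+(?:'[a-z]+)?");

    // Returns every word found in the input, in order of appearance.
    static List<String> extractWords(String input) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(input);
        while (m.find()) {
            words.add(m.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String mixChar = "user input some& _ text but <<<>>>- - havn't don't didn't i'm she's slslslkdfj we're";
        // Punctuation, underscores, and angle brackets are skipped because they never match.
        System.out.println(extractWords(mixChar));
    }
}
```

Compared with `mixChar.split("[^a-z0-9']")`, matching words this way should avoid the empty strings that `split` yields between adjacent separators, and it never returns a lone apostrophe as a word. It only handles lowercase ASCII input, as the test strings above do.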
From "ITPUB Blog", link: http://blog.itpub.net/46332/viewspace-1007824/. If you repost this, please credit the source; otherwise legal liability will be pursued.