A summary of common ways to handle strings
1 StringUtils
Null-safe; the commonly used methods are equals, isBlank/isNotBlank, split/join, indexOf, replace, trim/strip and isNumeric.
import org.apache.commons.lang3.StringUtils;
// Assumed to be the lang3 class; it is deprecated since 3.6 in favor of org.apache.commons.text.StringEscapeUtils
import org.apache.commons.lang3.StringEscapeUtils;

public class StringUtilsDemo {
    public static void main(String[] args) {
        // 1 equals
        String demoStr1 = "one";
        String demoStr2 = null;
        // System.out.println(demoStr2.equals(demoStr1)); // would throw NullPointerException
        System.out.println(StringUtils.equals(demoStr2, demoStr1));
        // 2 split
        String demoStr3 = "one.two.three.four";
        String[] strings = StringUtils.split(demoStr3, "."); // the argument is a set of literal separator characters
        System.out.println(strings.length); // 4
        System.out.println(demoStr3.split(".").length); // 0: the argument is a regex, and "." matches any character
        // 3 join
        System.out.println(StringUtils.join(strings, ","));
        // 4 indexOf
        System.out.println(StringUtils.indexOf(demoStr1, "o", 1)); // -1: there is no "o" at or after index 1
        // 5 unescapeHtml4 and replace
        String demoStr4 = "&nbsp;Mrs. Zhang";
        String unescapeHtml4Str = StringEscapeUtils.unescapeHtml4(demoStr4);
        System.out.println(demoStr4);
        System.out.println(unescapeHtml4Str);
        System.out.println("after replace:" + StringUtils.replace(unescapeHtml4Str, "\u00a0", ""));
        // 6 trim vs strip
        System.out.println((int) StringEscapeUtils.unescapeHtml4("&nbsp;").charAt(0)); // 160, not the usual space (32), so trim does not remove it
        System.out.println(Character.isWhitespace('\u3000'));
        // U+00A0 is a non-breaking space, which Character.isWhitespace excludes, so strip leaves it in place
        System.out.println(Character.isWhitespace(StringEscapeUtils.unescapeHtml4("&nbsp;").charAt(0)));
        String demoStr6 = "\u3000" + unescapeHtml4Str + " "; // \u3000 is the ideographic (full-width Chinese) space
        System.out.println(demoStr6);
        System.out.println("trim:" + StringUtils.trim(demoStr6));
        System.out.println("strip:" + StringUtils.strip(demoStr6));
        System.out.println("trim:" + StringUtils.trim(demoStr6).replace("\u00A0", ""));
        System.out.println("strip:" + StringUtils.strip(demoStr6).replace("\u00A0", ""));
        // 7 isNumeric
        String demoStr7 = "1234";
        System.out.println(StringUtils.isNumeric(demoStr7));
    }
}
equals:
false
split:
4
0
join:
one,two,three,four
indexOf:
-1
unescapeHtml:
&nbsp;Mrs. Zhang
Mrs. Zhang
after replace:Mrs. Zhang
trim or strip:
160
true
false
Mrs. Zhang
trim: Mrs. Zhang
strip: Mrs. Zhang
trim: Mrs. Zhang
strip:Mrs. Zhang
isNumeric:
true
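The 4 versus 0 in the split output comes from String.split treating its argument as a regular expression, in which an unescaped "." matches any character. A minimal JDK-only sketch of the same pitfall follows; String.replace and String.replaceAll show the same literal-versus-regex distinction. The class name RegexVsLiteralDemo and the helper splitLiteral are made up for illustration and are not part of any library.

```java
import java.util.regex.Pattern;

public class RegexVsLiteralDemo {

    // Hypothetical helper: splits on a literal separator by quoting it,
    // so regex metacharacters in the separator lose their special meaning.
    static String[] splitLiteral(String input, String separator) {
        return input.split(Pattern.quote(separator));
    }

    public static void main(String[] args) {
        String s = "one.two.three.four";

        // "." as a regex matches every character, so only empty tokens remain,
        // and split discards trailing empty tokens.
        System.out.println(s.split(".").length);         // 0
        System.out.println(s.split("\\.").length);       // 4
        System.out.println(splitLiteral(s, ".").length); // 4

        // replace takes a literal, replaceAll takes a regex.
        System.out.println(s.replace(".", "-"));      // one-two-three-four
        System.out.println(s.replaceAll(".", "-"));   // every character becomes "-"
        System.out.println(s.replaceAll("\\.", "-")); // one-two-three-four
    }
}
```

Quoting with Pattern.quote is one option; escaping the separator by hand ("\\.") works equally well when the separator is a fixed constant.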
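1 For line breaks, spaces and the like, StringUtils provides the constants EMPTY, SPACE, LF and CR, which can be used directly and make the code more readable.
2 To decode HTML entities, use StringEscapeUtils.unescapeHtml4. Some decoded characters need extra care: &nbsp; becomes the non-breaking space, code 160 (\u00a0), which strip does not remove. Use replace to get rid of it.
3 The difference between trim and strip: trim removes leading and trailing characters whose code is 32 or lower (the ASCII control characters and the space), while strip removes leading and trailing characters for which Character.isWhitespace() returns true, which covers the ASCII space, tab and line breaks as well as Unicode spaces.
trim removes the ASCII space; strip removes both the ideographic space (\u3000) and the ASCII space (\u0020), but not the \u00a0 space, because Character.isWhitespace() returns false for non-breaking spaces.
4 The difference between replace and replaceAll in the String class: replace takes a literal string, while replaceAll takes a regular expression. Both replace every occurrence.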
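Notes 2 and 3 can be checked with the JDK alone. The sketch below is not StringUtils code: isAnySpace and stripAllSpaces are hypothetical helpers that combine Character.isWhitespace with Character.isSpaceChar, so that the non-breaking space is removed from both ends as well.

```java
public class WhitespaceDemo {

    // Hypothetical helper: true for anything Character.isWhitespace accepts,
    // plus the non-breaking spaces that isWhitespace deliberately excludes.
    static boolean isAnySpace(char c) {
        return Character.isWhitespace(c) || Character.isSpaceChar(c);
    }

    // Hypothetical helper: removes leading and trailing characters
    // for which isAnySpace is true.
    static String stripAllSpaces(String s) {
        if (s == null) {
            return null;
        }
        int start = 0;
        int end = s.length();
        while (start < end && isAnySpace(s.charAt(start))) {
            start++;
        }
        while (end > start && isAnySpace(s.charAt(end - 1))) {
            end--;
        }
        return s.substring(start, end);
    }

    public static void main(String[] args) {
        String s = "\u3000\u00A0Mrs. Zhang ";
        System.out.println(Character.isWhitespace('\u3000')); // true
        System.out.println(Character.isWhitespace('\u00A0')); // false
        System.out.println(Character.isSpaceChar('\u00A0'));  // true
        // String.trim only removes characters with code <= 32,
        // so only the trailing ASCII space goes away.
        System.out.println(s.trim().length());          // 12
        System.out.println(stripAllSpaces(s).length()); // 10
    }
}
```

Unlike replace("\u00A0", ""), this only touches the ends of the string, so a non-breaking space in the middle of the text is kept.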
Appendix: the official documentation for org.apache.commons.lang3.StringUtils
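2 Regex matching: extracting the substrings that match
Pattern+Matcher
private static final Pattern DatePattern = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");
Matcher matcher = DatePattern.matcher(str);
// find() moves to the next match; call it in a loop to visit every match
if (matcher.find()) {
    // the pattern has no capturing group, so group(1) would throw; group() returns the whole match
    return matcher.group();
}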
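To pull out every match rather than only the first one, find() can be called in a loop. A minimal sketch, assuming the same date pattern; the class name DateExtractor and the method findAllDates are illustrative names only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateExtractor {

    private static final Pattern DATE_PATTERN = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // Collects every non-overlapping match, in order of appearance.
    static List<String> findAllDates(String text) {
        List<String> dates = new ArrayList<>();
        Matcher matcher = DATE_PATTERN.matcher(text);
        while (matcher.find()) {
            dates.add(matcher.group()); // group() is the whole match
        }
        return dates;
    }

    public static void main(String[] args) {
        System.out.println(findAllDates("from 2020-01-02 to 2021-03-04"));
        // prints [2020-01-02, 2021-03-04]
    }
}
```

Compiling the Pattern once into a static final field, as the original snippet does, avoids recompiling the regex on every call.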
3 Character encoding conversion
getBytes() of the String class
demoStr.getBytes("GBK")
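Note that getBytes(String charsetName) declares the checked UnsupportedEncodingException, while the getBytes(Charset) overload does not. The sketch below uses the Charset overload to convert bytes between encodings; TranscodeDemo and transcode are illustrative names, and the GBK branch is guarded because GBK is not among the charsets every Java runtime is required to provide.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class TranscodeDemo {

    // Re-encodes bytes from one charset to another by decoding to a String first.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        return new String(input, from).getBytes(to);
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        System.out.println(utf8.length); // 6: three bytes per character in UTF-8

        // GBK ships with typical JDK builds but is not a guaranteed charset.
        if (Charset.isSupported("GBK")) {
            Charset gbk = Charset.forName("GBK");
            byte[] gbkBytes = transcode(utf8, StandardCharsets.UTF_8, gbk);
            System.out.println(gbkBytes.length);                        // 4: two bytes per character in GBK
            System.out.println(new String(gbkBytes, gbk).equals(text)); // true
        }
    }
}
```

The bytes must be decoded with the charset they were written in; decoding GBK bytes as UTF-8 (or the reverse) is the usual source of garbled text.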
4 Determining what kind of character a char is
Checking for Chinese characters
private static boolean isChineseByScript(char c) {
Character.UnicodeScript sc = Character.UnicodeScript.of(c);
return sc == Character.UnicodeScript.HAN;
}
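The char-based check above looks at one UTF-16 code unit at a time, so an ideograph outside the Basic Multilingual Plane would be examined as two separate surrogates. A sketch of a string-level variant that iterates over code points instead (Java 8 or later); HanCheck and containsHan are illustrative names.

```java
public class HanCheck {

    // True if any code point in the text belongs to the Han script.
    static boolean containsHan(String text) {
        return text.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN);
    }

    public static void main(String[] args) {
        System.out.println(containsHan("abc\u4e2d"));    // true
        System.out.println(containsHan("abc"));          // false
        System.out.println(containsHan("\uD840\uDC00")); // true: U+20000, outside the BMP
    }
}
```

The Han script is shared by Chinese, Japanese kanji and Korean hanja, so this tells you the text contains ideographs, not that it is written in Chinese.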
5 To be continued... some handy utility functions in the StringUtils class
1 equalsIgnoreCase(): null-safe, case-insensitive string comparison
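What "null-safe" means here can be sketched without the library. The helper below is hypothetical, not the StringUtils source: it follows the convention that two nulls compare equal and that null never equals a non-null string.

```java
public class NullSafeCompare {

    // Hypothetical helper: two nulls are equal, a single null equals nothing else.
    static boolean equalsIgnoreCaseNullSafe(String a, String b) {
        if (a == null) {
            return b == null;
        }
        return a.equalsIgnoreCase(b); // String.equalsIgnoreCase(null) returns false
    }

    public static void main(String[] args) {
        System.out.println(equalsIgnoreCaseNullSafe(null, null));   // true
        System.out.println(equalsIgnoreCaseNullSafe(null, "abc"));  // false
        System.out.println(equalsIgnoreCaseNullSafe("abc", "ABC")); // true
        // By contrast, calling equalsIgnoreCase on a null reference
        // would throw NullPointerException.
    }
}
```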