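In Java, a common way to check whether a character is Chinese used to be a code point range test.
However, the range check [0x4e00, 0x9fbb] covers only ideographs, so it does not recognize Chinese punctuation marks.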
private static boolean isChinese(char c) {
    // Range test only: ideographs from U+4E00 through U+9FBB
    return c >= 0x4e00 && c <= 0x9fbb;
}
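The limitation can be seen with a small sketch. The class name RangeCheckDemo and the method name isChineseByRange are made up for this illustration, and the characters are written as Unicode escapes to avoid source-encoding problems.

```java
public class RangeCheckDemo {
    // Same range test as above, renamed for this illustration.
    static boolean isChineseByRange(char c) {
        return c >= 0x4e00 && c <= 0x9fbb;
    }

    public static void main(String[] args) {
        // U+4E2D is an ideograph inside the range.
        System.out.println(isChineseByRange('\u4e2d')); // true
        // U+3002 (ideographic full stop) lies outside the range.
        System.out.println(isChineseByRange('\u3002')); // false
        // U+FF0C (fullwidth comma) lies outside the range.
        System.out.println(isChineseByRange('\uff0c')); // false
    }
}
```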
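Checking the character's Character.UnicodeBlock against the CJK blocks is more accurate, and it also covers punctuation.
CJK stands for Chinese, Japanese, and Korean.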
/**
 * Checks whether a character is a Chinese character.
 * CJK: Chinese, Japanese, Korean.
 *
 * Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
 * 4E00-9FFF: CJK Unified Ideographs
 *
 * Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
 * F900-FAFF: CJK Compatibility Ideographs
 *
 * Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
 * 3400-4DBF: CJK Unified Ideographs Extension A
 *
 * Character.UnicodeBlock.GENERAL_PUNCTUATION
 * 2000-206F: General Punctuation
 *
 * Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
 * 3000-303F: CJK Symbols and Punctuation
 *
 * Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS
 * FF00-FFEF: Halfwidth and Fullwidth Forms
 */
private static boolean isChinese(char c) {
    Character.UnicodeBlock ub = Character.UnicodeBlock.of(c);
    return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
            || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
            || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
            || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
            || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS;
}
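As a rough usage sketch, the block check can be applied to a whole string. The class ChineseTextDemo and the helpers containsChinese and isHanIdeograph are hypothetical names, not part of the original code. This version takes an int code point instead of a char, on the assumption that the input may contain characters outside the Basic Multilingual Plane. The isHanIdeograph variant is a different technique, based on Character.UnicodeScript.HAN, for cases where only ideographs should match and punctuation should not.

```java
public class ChineseTextDemo {
    // Same block list as above, but accepting an int code point.
    static boolean isChinese(int codePoint) {
        Character.UnicodeBlock ub = Character.UnicodeBlock.of(codePoint);
        return ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_COMPATIBILITY_IDEOGRAPHS
                || ub == Character.UnicodeBlock.CJK_UNIFIED_IDEOGRAPHS_EXTENSION_A
                || ub == Character.UnicodeBlock.GENERAL_PUNCTUATION
                || ub == Character.UnicodeBlock.CJK_SYMBOLS_AND_PUNCTUATION
                || ub == Character.UnicodeBlock.HALFWIDTH_AND_FULLWIDTH_FORMS;
    }

    // Hypothetical helper: true if any code point in the text passes isChinese.
    static boolean containsChinese(String text) {
        return text.codePoints().anyMatch(ChineseTextDemo::isChinese);
    }

    // Alternative check (not the block-based method): Han script only,
    // which also matches ideographs in the supplementary planes.
    static boolean isHanIdeograph(int codePoint) {
        return Character.UnicodeScript.of(codePoint) == Character.UnicodeScript.HAN;
    }

    public static void main(String[] args) {
        System.out.println(containsChinese("hello"));             // false
        System.out.println(containsChinese("hello\u4e16\u754c")); // true
        System.out.println(isChinese('\u3002'));                  // true
        System.out.println(isHanIdeograph(0x4e2d));               // true
    }
}
```

Note that the block list is broad: HALFWIDTH_AND_FULLWIDTH_FORMS also contains fullwidth Latin letters such as U+FF21, so isChinese returns true for them as well. Whether that is acceptable depends on the use case.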