一、Oracle数据库
就一般情况来说,Oracle存储中英文的字段用varchar2类型就可以了,但有些时候,遇到生僻字就不行了, 在默认字符集环境下,实现Oracle储存生僻字: 㛃、䶮…(使用nvarchar2字段类型实现)
首先,把生僻字转换为Unicode。工具链接https://tool.chinaz.com/tools/unicode.aspx
“㛃” 转为Unicode为 “\u36c3”(注意: \u 是Unicode的转义字符,使用的时候要去掉)然后,插入SQL进行比较一下。
insert into TEST(SNAME,TNAME) values('u36c3','㛃');
二、mysql数据库
将该表或者改字段设置为字符集修改为utf8mb4
三、postgresql数据库
该数据库设置varchar可以存储生僻字
四、java处理生僻字
public class RareCharacterUtility {
public static boolean containsUserDefinedUnicode(String string) {
if (string == null) {
throw new NullPointerException("Stirng must be non-null");
}
int[] code = toCodePointArray(string);
// U+E000..U+F8FF
for (int c : code) {
if (c >= '\ue000' && c <= '\uf8ff') {
return true;
}
}
return false;
}
static int[] toCodePointArray(String str) {
int len = str.length();
int[] acp = new int[str.codePointCount(0, len)];
for (int i = 0, j = 0; i < len; i = str.offsetByCodePoints(i, 1)) {
acp[j++] = str.codePointAt(i);
}
return acp;
}
static String toHex(int[] chars) {
String r = "[";
for (int i=0; i<chars.length; i++) {
if (r.length() > 1) {
r += ",";
}
r += Integer.toHexString(chars[i]);
}
r += "]";
return r;
}
public static void main(String[] argu) {
String rr = ("\u5f20\ue0bf\uD86C\uDE70\uD840\uDC10\uD86D\uDF44\uD87E\uDCAC\u9fc6");
System.out.println("Unicode = " + toHex(toCodePointArray(rr)));
boolean r = (containsUserDefinedUnicode(rr));
System.out.println("Test result = " + r + " should be true");
}
}
参考:https://blog.csdn.net/qq_37312208/article/details/81869779?from=timeline