解决问题:SQLException: Incorrect string value: ‘\xF0\x9F\x8C\x9D\xE8\xA1...‘

1、问题背景

1.1 问题描述

在使用保存特殊字符时,数据库已经设置了utf8mb4编码格式,但是在调用接口保存含有额数字符的字段时,还是会报格式转化问题:SQLException: Incorrect string value: '\xF0\x9F\x8C\x9D\xE8\xA1...'

Caused by: java.sql.SQLException: Incorrect string value: '\xF0\x9F\x8C\x9D\xE8\xA1...' for column 'company_name' at row 1
	at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:964) ~[mysql-connector-java-5.1.43.jar:5.1.43]
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3973) ~[mysql-connector-java-5.1.43.jar:5.1.43]
	at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3909) ~[mysql-connector-java-5.1.43.jar:5.1.43]
	at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2527) ~[mysql-connector-java-5.1.43.jar:5.1.43]
	at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2680) ~[mysql-connector-java-5.1.43.jar:5.1.43]
	at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2494) ~[mysql-connector-java-5.1.43.jar:5.1.43]
	at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:1858) ~[mysql-connector-java-5.1.43.jar:5.1.43]
	at com.mysql.jdbc.PreparedStatement.execute(PreparedStatement.java:1197) ~[mysql-connector-java-5.1.43.jar:5.1.43]

1.2、问题原因

mysql-connector-java-5.1.43.jar:5.1.43,版本过低导致,数据库驱动在提交sql之前参数码值校验不支持utf8mb4编码格式导致。

2、问题定位

MySQL mysql-connector-java-5.1.x驱动中,校验设置字符串过程涉及 一下几个关键步骤:

2.1 解析连接参数

当用户通过JDBC URL指定characterEncoding参数时,驱动会解析该参数。

相关代码在ConnectionImpl类的构造函数中:

configureClientCharacterSet(boolean dontCheckServerMatch) throws SQLException {
configureClientCharacterSet(boolean dontCheckServerMatch) 处理的客户端的编码格式,5.1.47之前,默认不支持utf8mb4;
    boolean utf8mb4Supported = versionMeetsMinimum(5, 5, 2);
    boolean useutf8mb4 = utf8mb4Supported && (CharsetMapping.UTF8MB4_INDEXES.contains(this.io.serverCharsetIndex));
// UTF8MB4_INDEXES 集合中的utf8mb4 的码值没有33 的, serverCharsetIndex 初始化值为33

初始化数据

    // ConnectionImpl.java
    private void coreConnect(Properties mergedProps) throws SQLException, IOException {
        int newPort = 3306;
        String newHost = "localhost";

        String protocol = mergedProps.getProperty(NonRegisteringDriver.PROTOCOL_PROPERTY_KEY);

       
        this.port = newPort;
        this.host = newHost;

        // reset max-rows to default value
        this.sessionMaxRows = -1;

        // preconfigure some server variables which are consulted before their initialization from server
        this.serverVariables = new HashMap<String, String>();
        this.serverVariables.put("character_set_server", "utf8");

        this.io = new MysqlIO(newHost, newPort, mergedProps, getSocketFactoryClassName(), getProxy(), getSocketTimeout(),
         this.largeRowSizeThreshold.getValueAsInt());
        // 初始化 serverCharsetIndex 为33
        this.io.doHandshake(this.user, this.password, this.database);
        if (versionMeetsMinimum(5, 5, 0)) {
            // error messages are returned according to character_set_results which, at this point, is set from the response packet
            this.errorMessageEncoding = this.io.getEncodingForHandshake();
        }
    }

// MysqlIO.java
 void doHandshake(String user, String password, String database) throws SQLException {
        // Read the first packet
        this.checkPacketSequence = false;
        this.readPacketSequence = 0;

        Buffer buf = readPacket();

        // Get the protocol version
        this.protocolVersion = buf.readByte();


        if ((versionMeetsMinimum(4, 1, 1) || ((this.protocolVersion > 9) && (this.serverCapabilities & CLIENT_PROTOCOL_41) != 0))) {

            /* New protocol with 16 bytes to describe server characteristics */
            // read character set (1 byte) s设置serverCharsetIndex  为 33
            this.serverCharsetIndex = buf.readByte() & 0xff;
            // read status flags (2 bytes)
            this.serverStatus = buf.readInt();
            checkTransactionState(0);

            // read capability flags (upper 2 bytes)
            this.serverCapabilities |= buf.readInt() << 16;

            if ((this.serverCapabilities & CLIENT_PLUGIN_AUTH) != 0) {
                // read length of auth-plugin-data (1 byte)
                this.authPluginDataLength = buf.readByte() & 0xff;
            } else {
                // read filler ([00])
                buf.readByte();
            }
            // next 10 bytes are reserved (all [00])
            buf.setPosition(buf.getPosition() + 10);

        }

    }

 5.1.47之后版本的判断逻辑:改变了判断逻辑通过utf8mb4Supported 来判断是否为utf8mb4

boolean utf8mb4Supported = versionMeetsMinimum(5, 5, 2);
String utf8CharsetName = connectionCollationSuffix.length() > 0 ? connectionCollationCharset
        : (utf8mb4Supported ? "utf8mb4" : "utf8");
// 是通过utf8mb4Supported 转化编码费事支持了utf6mb4


if (!getUseOldUTF8Behavior()) {
      if (dontCheckServerMatch || !characterSetNamesMatches("utf8") || (utf8mb4Supported && !characterSetNamesMatches("utf8mb4"))
               || (connectionCollationSuffix.length() > 0
                                                && !getConnectionCollation().equalsIgnoreCase(this.serverVariables.get("collation_server"))))
        {
                 execSQL(null, "SET NAMES " + utf8CharsetName + connectionCollationSuffix, -1, null, DEFAULT_RESULT_SET_TYPE,
                                            DEFAULT_RESULT_SET_CONCURRENCY, false, this.database, null, false);
                 this.serverVariables.put("character_set_client", utf8CharsetName);
                 this.serverVariables.put("character_set_connection", utf8CharsetName);
       }
      } else {
                execSQL(null, "SET NAMES latin1", -1, null, DEFAULT_RESULT_SET_TYPE, DEFAULT_RESULT_SET_CONCURRENCY, false, this.database, null,
                                        false);
                 this.serverVariables.put("character_set_client", "latin1");
                 this.serverVariables.put("character_set_connection", "latin1");
                            }

2.2字符集映射

驱动使用CharsetMapping类将Java字符编码名称映射到MySQL字符集名称。例如,将"UTF-8"映射为"utf8":通过configureClientCharacterSet(boolean dontCheckServerMatch) 方法

关键源代码:

// 获取对应的MySQL字符集
String mysqlCharset = CharsetMapping.getMysqlCharsetForJavaEncoding(characterEncoding, this.getServerVersion());
if (mysqlCharset == null) {
    // 抛出异常,编码不支持
    throw SQLError.createSQLException("Unsupported character encoding: " + characterEncoding, ...);
}

其中还会分组设置一些编码格式;tempUTF8MB4Indexes 会放入与utf8mb4相关数据

        // 此处分组封装数据
        Collation notUsedCollation = new Collation(0, COLLATION_NOT_DEFINED, 0, NOT_USED);
        for (int i = 1; i < MAP_SIZE; i++) {
            Collation coll = collation[i] != null ? collation[i] : notUsedCollation;
            COLLATION_INDEX_TO_COLLATION_NAME[i] = coll.collationName;
            COLLATION_INDEX_TO_CHARSET[i] = coll.mysqlCharset;
            String charsetName = coll.mysqlCharset.charsetName;

            if (!charsetNameToCollationIndexMap.containsKey(charsetName) || charsetNameToCollationPriorityMap.get(charsetName) < coll.priority) {
                charsetNameToCollationIndexMap.put(charsetName, i);
                charsetNameToCollationPriorityMap.put(charsetName, coll.priority);
            }

            // Filling indexes of utf8mb4 collations
            if (charsetName.equals(MYSQL_CHARSET_NAME_utf8mb4)) {
                tempUTF8MB4Indexes.add(i); // utf8mb4 编码格式
            }
        }

3、问题解决

3.1 方案一

升级版本5.1.47以上版本,可以解决上述问题

3.2 方案二

升级大版本8.以上版本,但是需要修改配置数据库的驱动。

注:记录工作中遇到的问题及解决方案,大家有更优的解决方案,欢迎补充!

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值