GBase8a执行insert异常Incorrect string value: ‘\xF0\xA5\xA2\x89‘ for column ‘name‘ at row 1

异常信息:

Incorrect string value: ‘\xF0\xA5\xA2\x89’ for column ‘name’ at row 1
在这里插入图片描述

排查思路:

1、先排除目标表是否为utf8mb4的表,GBase8a只支持在utf8mb4上插入4字节字符
2、排除源表的情况,正常情况下,4字节的数据没办法直接插入utf8的表中
3、一般是使用load data方式入库,怀疑是否为load data导致

测试步骤:

1)导入数据

# 制造测试数据
[gbase@keep-gbase8a-node5 ~]$ echo '1,𧒽
> 2,𥢉
> 3,𥦉
> 4,𥞉
> 5,正
> 6,常' > t_mb4.txt


[gbase@keep-gbase8a-node5 ~]$ gccli
# 创建库表
gbase> drop database if exists test_shengpi;
Query OK, 2 rows affected (Elapsed: 00:00:00.06)

gbase> create database test_shengpi;
Query OK, 1 row affected (Elapsed: 00:00:00.01)

gbase> use test_shengpi;
Query OK, 0 rows affected (Elapsed: 00:00:00.01)

gbase> create table t_utf8(id int, name varchar(100)) charset=utf8;
Query OK, 0 rows affected (Elapsed: 00:00:00.04)

gbase> create table t_utf8mb4(id int, name varchar(100)) charset=utf8mb4;
Query OK, 0 rows affected (Elapsed: 00:00:00.04)

# 加载数据
gbase> load data infile 'file://192.168.122.31/home/gbase/t_mb4.txt' into table test_shengpi.t_utf8 data_format 3 fields terminated by ',' lines terminated by '\n';
Query OK, 6 rows affected (Elapsed: 00:00:00.09)
Task 582219 finished, Loaded 6 records, Skipped 0 records

2)测试insert into

gbase> insert into t_utf8mb4 select * from t_utf8;
ERROR 1708 (HY000): [192.168.122.31:5050](GBA-02AD-0005)Failed to query in gnode:
DETAIL: Incorrect string value: '\xF0\xA7\x92\xBD' for column 'name' at row 1
SQL: SELECT /*192.168.122.31_920_8_2024-07-03_15:03:45*/ /*+ TID('1968442') */ `vcname000001.test_shengpi.t_utf8`.`id` AS `id`, `vcname000001.test_shengpi.t_utf8`.`name` AS `name` FROM `test_shengpi`.`t_utf8_n2` `vcname000001.test_shengpi.t_utf8` target into server (HOST '192.168.122.31,192.168.122.47', PORT 5050, USER 'root', PASSWORD '', DATABASE 'test_shengpi', TABLE 't_utf8mb4_n2', COMMENT 'table_host 0 0 1, scn 582220, distribution 1' )

3)排查日志信息

[gbase@keep-gbase8a-node5 ~]$ vim /data/192.168.122.31/gnode/log/gbase/express.log

2024-07-03 15:03:45.590 Table test_shengpi.t_utf8mb4_n2, col 1, name, convert exception, value 𧒽, src type PA_VARCHAR(100), src charset utf8_general_ci, dst type PA_VARCHAR(100), dst charset 45: \xF0\xA7\x92\xBD, query SELECT /*192.168.122.31_920_8_2024-07-03_15:03:45*/ /*+ TID('1968442') */ `vcname000001.test_shengpi.t_utf8`.`id` AS `id`, `vcname000001.test_shengpi.t_utf8`.`name` AS `name` FROM `test_shengpi`.`t_utf8_n2` `vcname000001.test_shengpi.t_utf8` target into server (HOST '192.168.122.31,192.168.122.47', PORT 5050, USER 'root', PASSWORD '', DATABASE 'test_shengpi', TABLE 't_utf8mb4_n2', COMMENT 'table_host 0 0 1, scn 582220, distribution 1' )
2024-07-03 15:03:45.590 Incorrect string value: '\xF0\xA7\x92\xBD' for column 'name' at row 1
2024-07-03 15:03:45.591 (gns_host: 192.168.122.31) Incorrect string value: '\xF0\xA7\x92\xBD' for column 'name' at row 1 table:./test_shengpi/metadata/t_utf8mb4_n2
2024-07-03 15:03:45.598 (gns_host: 192.168.122.32) Incorrect string value: '\xF0\xA5\xA6\x89' for column 'name' at row 1 table:./test_shengpi/metadata/t_utf8mb4_n1
2024-07-03 15:03:45.636 Table ./test_shengpi/metadata/t_utf8mb4_n2, recv locks 1, flush rollback return 0.
2024-07-03 15:03:45.636 Table ./test_shengpi/metadata/t_utf8mb4_n1, recv locks 1, flush rollback return 0.

日志分析: 在insert的过程中,对name字段做了convert操作,由于字符集问题导致报错

4)解决方法

直接执行insert into 会报错,如下:

[gbase@keep-gbase8a-node5 ~]$ gccli

GBase client 9.5.3.28.149123692. Copyright (c) 2004-2024, GBase.  All Rights Reserved.

gbase> use test_shengpi;
Query OK, 0 rows affected (Elapsed: 00:00:00.00)

gbase> insert into t_utf8mb4 select * from t_utf8;
ERROR 1708 (HY000): [192.168.122.31:5050](GBA-02AD-0005)Failed to query in gnode:
DETAIL: Incorrect string value: '\xF0\xA5\xA6\x89' for column 'name' at row 1
SQL: SELECT /*192.168.122.31_1456_2_2024-07-03_15:23:03*/ /*+ TID('1968483') */ `vcname000001.test_shengpi.t_utf8`.`id` AS `id`, `vcname000001.test_shengpi.t_utf8`.`name` AS `name` FROM `test_shengpi`.`t_utf8_n2` `vcname000001.test_shengpi.t_utf8` target into server (HOST '192.168.122.31,192.168.122.47', PORT 5050, USER 'root', PASSWORD '', DATABASE 'test_shengpi', TAB
gbase>
解决方法1:

(1)把t_utf8表的数据导出成数据文件,

select * from test_shengpi.t_utf8 into outfile '/home/gbase/t_oututf8.txt' 
fields terminated by ',' LINES TERMINATED BY '\N'  WRITEMODE BY OVERWRITES;

(2)把数据文件再导入t_utf8mb4表

load data infile 'file://192.168.122.31/home/gbase/t_oututf8.txt/t_oututf8.txt' into table test_shengpi.t_utf8 
data_format 3 fields terminated by ',' lines terminated by '\n';

具体代码:

[gbase@keep-gbase8a-node5 ~]$ gccli

gbase> select * from test_shengpi.t_utf8 into outfile '/home/gbase/t_oututf8.txt'
    -> fields terminated by ','  WRITEMODE BY OVERWRITES;
Query OK, 6 rows affected (Elapsed: 00:00:00.05)

gbase> load data infile 'file://192.168.122.31/home/gbase/t_oututf8.txt/t_oututf8.txt' into table test_shengpi.t_utf8mb4
    -> data_format 3 fields terminated by ',' lines terminated by '\n';
Query OK, 6 rows affected (Elapsed: 00:00:00.08)
Task 582228 finished, Loaded 6 records, Skipped 0 records

gbase> set names utf8mb4;
Query OK, 0 rows affected (Elapsed: 00:00:00.00)

gbase> select * from test_shengpi.t_utf8mb4;
+------+------+
| id   | name |
+------+------+
|    1 | 𧒽     |
|    5 ||
|    3 | 𥦉     |
|    4 | 𥞉     |
|    6 ||
|    2 | 𥢉     |
+------+------+
6 rows in set (Elapsed: 00:00:00.02)

在这里插入图片描述

解决方法2:

具体方法: 在执行SQL时,对数据进行十六进制转换操作
具体代码:

# 使用hex对字段的数据进行十六进制转换
# 再用unhex对十六进制转为正常数据
gbase> insert into t_utf8mb4 select id,unhex(hex(name)) from t_utf8;
Query OK, 6 rows affected (Elapsed: 00:00:00.10)
Records: 6  Duplicates: 0  Warnings: 0

5)数据查询对比

(1)会话为utf8字符集
[gbase@keep-gbase8a-node5 ~]$ gccli

GBase client 9.5.3.28.149123692. Copyright (c) 2004-2024, GBase.  All Rights Reserved.

gbase> set names utf8;
Query OK, 0 rows affected (Elapsed: 00:00:00.00)

gbase> show variables like '%character%';
+--------------------------+------------------------------------------------------------+
| Variable_name            | Value                                                      |
+--------------------------+------------------------------------------------------------+
| character_set_client     | utf8                                                       |
| character_set_connection | utf8                                                       |
| character_set_database   | utf8                                                       |
| character_set_filesystem | binary                                                     |
| character_set_results    | utf8                                                       |
| character_set_server     | utf8                                                       |
| character_set_sort       | binary                                                     |
| character_set_system     | utf8mb4                                                    |
| character_sets_dir       | /data/192.168.122.31/gcluster/server/share/gbase/charsets/ |
+--------------------------+------------------------------------------------------------+
9 rows in set (Elapsed: 00:00:00.00)

gbase> select * from test_shengpi.t_utf8;
+------+------+
| id   | name |
+------+------+
|    3 | 𥦉     |
|    6 ||
|    1 | 𧒽     |
|    4 | 𥞉     |
|    2 | 𥢉     |
|    5 ||
+------+------+
6 rows in set (Elapsed: 00:00:00.04)

gbase> select * from test_shengpi.t_utf8mb4;
+------+------+
| id   | name |
+------+------+
|    1 | ?    |
|    4 | ?    |
|    2 | ?    |
|    5 ||
|    3 | ?    |
|    6 ||
+------+------+
6 rows in set (Elapsed: 00:00:00.03)

在这里插入图片描述

(2)会话为utf8mb4字符集
gbase> set names utf8mb4;
Query OK, 0 rows affected (Elapsed: 00:00:00.00)

gbase> show variables like '%character%';
+--------------------------+------------------------------------------------------------+
| Variable_name            | Value                                                      |
+--------------------------+------------------------------------------------------------+
| character_set_client     | utf8mb4                                                    |
| character_set_connection | utf8mb4                                                    |
| character_set_database   | utf8                                                       |
| character_set_filesystem | binary                                                     |
| character_set_results    | utf8mb4                                                    |
| character_set_server     | utf8                                                       |
| character_set_sort       | binary                                                     |
| character_set_system     | utf8mb4                                                    |
| character_sets_dir       | /data/192.168.122.31/gcluster/server/share/gbase/charsets/ |
+--------------------------+------------------------------------------------------------+
9 rows in set (Elapsed: 00:00:00.00)

gbase> select * from test_shengpi.t_utf8;
+------+------+
| id   | name |
+------+------+
|    3 | ???? |
|    6 ||
|    2 | ???? |
|    5 ||
|    1 | ???? |
|    4 | ???? |
+------+------+
6 rows in set (Elapsed: 00:00:00.02)

gbase> select * from test_shengpi.t_utf8mb4;
+------+------+
| id   | name |
+------+------+
|    3 | 𥦉     |
|    6 ||
|    1 | 𧒽     |
|    4 | 𥞉     |
|    2 | 𥢉     |
|    5 ||
+------+------+
6 rows in set (Elapsed: 00:00:00.02)

在这里插入图片描述

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值