mysql load warning_mysqlimport show warnings

本文代码运行环境:MySQL:5.1.26-rc-community,Windows 2003

无意中在 emule 的安装目录下看到了个 ip-to-country.csv 文件。打开后,发现是世界各国及IP段对照的文件。格式如下:33996344,33996351,GB,GBR,英国

50331648,69956103,US,USA,美国

69956104,69956111,BM,BMU,百慕达群岛

69956112,83886079,US,USA,美国

94585424,94585439,SE,SWE,瑞典

100663296,121195295,US,USA,美国

121195296,121195327,IT,ITA,意大利

121195328,152305663,US,USA,美国

真是个好东东!正好一个项目要用到,就准备把数据导入到 MySQL 数据库中。首先在 MySQL 数据库中建立表结构:use testdb;

create table ip_to_country

(

ip1 int unsigned not null

,ip2 int unsigned not null

,cname1 varchar(10) not null

,cname2 varchar(10) not null

,cname3 varchar(50) not null

)

engine=innodb default charset=utf8;

我准备用 MySQL 导入工具:mysqlimport 来完成导入任务。首先把数据文件 ip-to-country.csv copy 到 d:\,为了使其和 MySQL 中表名匹配,重命名为 ip_to_country.csv。然后根据数据文件格式,编写并执行下面的 mysqlimport 脚本:mysqlimport --local

--user=root

--password=123456

--delete

--default-character-set="gb2312"

--fields-terminated-by=","

--fields-enclosed-by=""

--fields-escaped-by=""

--lines-terminated-by="\r\n"

testdb

"d:/ip_to_country.csv"

注意:上面的 mysqlimport 参数应写成一行,为了便于阅读我有意分成多行。执行上面的 mysqlimport 命令后,发现有 Warnings:testdb.ip_to_country: Records: 65290 Deleted: 0 Skipped: 0 Warnings: 15

数据已经导入,不管那么多了, 先进 MySQL 数据库看下效果。首先设置 character_set_results=gb2312;然后查询 10 条记录出来看看,步骤如下:mysql> set character_set_results=gb2312;

mysql> show variables like '%char%';

+--------------------------+--------------------------+

| Variable_name | Value |

+--------------------------+--------------------------+

| character_set_client | utf8 |

| character_set_connection | utf8 |

| character_set_database | utf8 |

| character_set_filesystem | binary |

| character_set_results | gb2312 |

| character_set_server | utf8 |

| character_set_system | utf8 |

| character_sets_dir | D:\MySQL\share\charsets\ |

+--------------------------+--------------------------+mysql> select * from ip_to_country limit 10;

+-----------+-----------+--------+--------+--------+

| ip1 | ip2 | cname1 | cname2 | cname3 |

+-----------+-----------+--------+--------+--------+

| 33996344 | 33996351 | GB | GBR | ? |

| 50331648 | 69956103 | US | USA | |

| 69956104 | 69956111 | BM | BMU | |

| 69956112 | 83886079 | US | USA | |

| 94585424 | 94585439 | SE | SWE | |

| 100663296 | 121195295 | US | USA | |

| 121195296 | 121195327 | IT | ITA | |

| 121195328 | 152305663 | US | USA | |

| 152305664 | 152338431 | GB | GBR | ? |

| 152338432 | 167772159 | US | USA | |

+-----------+-----------+--------+--------+--------+

结果发现国家的中文名称都是乱码。奇怪,已经把 mysqlimport 的 default-character-set 参数设为:gb2312,为什么会有乱码?最后不得以,只好在 MySQL 数据库中,把表 ip_to_country 的字符集改为 gb2312。mysql> alter table ip_to_country default character set gb2312;

mysql> alter table ip_to_country convert to character set gb2312;

然后重新执行导入命令 mysqlimport,这时候发现 MySQL 乱码问题已解决,中文国家名字可以正常显示:mysql> select * from ip_to_country limit 10;

+-----------+-----------+--------+--------+------------+

| ip1 | ip2 | cname1 | cname2 | cname3 |

+-----------+-----------+--------+--------+------------+

| 33996344 | 33996351 | GB | GBR | 英国 |

| 50331648 | 69956103 | US | USA | 美国 |

| 69956104 | 69956111 | BM | BMU | 百慕达群岛 |

| 69956112 | 83886079 | US | USA | 美国 |

| 94585424 | 94585439 | SE | SWE | 瑞典 |

| 100663296 | 121195295 | US | USA | 美国 |

| 121195296 | 121195327 | IT | ITA | 意大利 |

| 121195328 | 152305663 | US | USA | 美国 |

| 152305664 | 152338431 | GB | GBR | 英国 |

| 152338432 | 167772159 | US | USA | 美国 |

+-----------+-----------+--------+--------+------------+

留下一个问题:mysqlimport 到底能不能把文本文件中的 gb2312 字符转换成 utf8 导入到 MySQL 数据库中?

虽然问题看起来已经解决了,但我还想试下 MySQL load data 命令。 mysqlimport 虽然把数据导入数据库了,但还有 15 Warnings 在闹心。我本想利用 mysqlimport 自身的功能来查看这些 Warnings 到底是怎么回事,但翻翻手册,仍无计可施。MySQL 中有个 show warnings 给我一线希望。我这样想:先在 MySQL 中执行 Load data,然后 show warnings 不就可以找到问题所在了吗?mysql> truncate table ip_to_country;

mysql>

mysql> load data infile "d:/ip_to_country.csv"

replace into table ip_to_country

character set gb2312

fields terminated by "," enclosed by ""

lines terminated by "\r\n";

ERROR 1262 (01000): Row 6737 was truncated; it contained more data than there were input columns

晕,又出现个拦路虎:ERROR 1262 (01000): Row 6737 was truncated; it contained more data than there were input columns. 最后发现问题是 sql_mode 的问题。mysql> show variables like '%sql_mode%';

+---------------+----------------------------------------------------------------+

| Variable_name | Value |

+---------------+----------------------------------------------------------------+

| sql_mode | strict_trans_tables,no_auto_create_user,no_engine_substitution |

+---------------+----------------------------------------------------------------+

mysql> set sql_mode='no_auto_create_user,no_engine_substitution';

把 strict_trans_tables 从 sql_mode 中去掉,再次执行 MySQL Load datamysql> load data infile "d:/ip_to_country.csv"

replace into table ip_to_country

character set gb2312

fields terminated by "," enclosed by ""

lines terminated by "\r\n";

Query OK, 65290 rows affected, 15 warnings (0.63 sec)

Records: 65290 Deleted: 0 Skipped: 0 Warnings: 15

接下来,用 MySQL show warnings 命令,来找警告的详细描述:mysql> show warnings;

+---------+------+-------------------------------------------------------------------------------+

| Level | Code | Message |

+---------+------+-------------------------------------------------------------------------------+

| Warning | 1262 | Row 6737 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 6817 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 6914 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 6916 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 6918 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 6988 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 7028 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 7226 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 7569 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 7791 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 47856 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 47885 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 49331 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 49539 was truncated; it contained more data than there were input columns |

| Warning | 1262 | Row 49547 was truncated; it contained more data than there were input columns |

+---------+------+-------------------------------------------------------------------------------+

根据行号 Row 6737 到 ip_to_country.csv 中查看,发现果然有问题,国家中文名中间多了个逗号 “,”1089579216,1089579223,VI,VIR,维京群岛,美国

看来,需要在表 ip_to_country 中再增加一列,来存放多出的内容。于是修改表结构:mysql> alter table ip_to_country add column cname4 varchar(50) null;

再次执行 mysql load data,数据顺利导入。这时仍有警告,这些警告是因为文件中的大部分数据行只有 5 列,而表中有 6 列,因此 MySQL 才 Warning。

把表 ip_to_country 的字符集改为 utf8,看有没有乱码:truncate table ip_to_country;

alter table ip_to_country default character set utf8;

alter table ip_to_country convert to character set utf8;

再次,执行 MySQL Load data 命令:mysql> load data infile "d:/ip_to_country.csv"

replace into table ip_to_country

character set gb2312

fields terminated by "," enclosed by ""

lines terminated by "\r\n";

Query OK, 65290 rows affected, 65275 warnings (0.64 sec)

Records: 65290 Deleted: 0 Skipped: 0 Warnings: 65275

怀着激动的心情,select:mysql> select * from ip_to_country where cname4 is not null limit 10;

+------------+------------+--------+--------+----------+--------+

| ip1 | ip2 | cname1 | cname2 | cname3 | cname4 |

+------------+------------+--------+--------+----------+--------+

| 1089579216 | 1089579223 | VI | VIR | 维京群岛 | 美国 |

| 1093062144 | 1093062399 | VI | VIR | 维京群岛 | 美国 |

| 1097896192 | 1097897215 | VI | VIR | 维京群岛 | 美国 |

| 1097947136 | 1097949183 | VI | VIR | 维京群岛 | 美国 |

| 1097951232 | 1097953279 | VI | VIR | 维京群岛 | 美国 |

| 1101625344 | 1101625407 | VI | VIR | 维京群岛 | 美国 |

| 1101971072 | 1101971079 | VI | VIR | 维京群岛 | 美国 |

| 1113864768 | 1113864783 | VI | VIR | 维京群岛 | 美国 |

| 1119428608 | 1119432703 | VI | VIR | 维京群岛 | 美国 |

| 1123590144 | 1123594239 | VI | VIR | 维京群岛 | 美国 |

+------------+------------+--------+--------+----------+--------+

可见,MySQL load data infile 指令,可以实现不同字符集之间的转换。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值