Importing a .txt file with line breaks into MySQL: strange formatting

I'm importing some data from a .txt file into a MySQL database table, using mysqlimport. It seems to import OK (no error messages) but looks very odd when displayed, and can't be searched as expected.

Here are the details. The original text file is saved in UTF-8, with records that look (in a text editor) like this. The second field includes line breaks:

WAR-16,52 ~~~~~ Lorem ipsum dolor sit.

Lorem ipsum dolor sit.

~~~~~ ENDOFRECORD

WAR-16,53~~~~~Lorem ipsum dolor sit.

Lorem ipsum dolor sit.

Lorem ipsum dolor sit.

Lorem ipsum dolor sit.

~~~~~ ENDOFRECORD

The database table into which I am importing is very simple:

+-------+---------------+------+-----+---------+-------+
| Field | Type          | Null | Key | Default | Extra |
+-------+---------------+------+-----+---------+-------+
| id    | varchar(100)  | YES  |     | NULL    |       |
| text  | varchar(5000) | YES  |     | NULL    |       |
+-------+---------------+------+-----+---------+-------+
2 rows in set (0.00 sec)

When I import the file, here's the command I use:

$ mysqlimport -u root db textfile.txt --fields-terminated-by="~~~~~" --lines-terminated-by="ENDOFTHELINE" --default-character-set='utf8'

db.records_list: Records: 18778 Deleted: 0 Skipped: 0 Warnings: 18787

Here's what I see if I then ask MySQL to display the records:

mysql> select * from textfile;

|

W A R - 1 6 , 5 2 | L o r e m i p s u m d o l o r s i t .

L o r e m i p s u m d o l o r s i t .

(etc)

So, it looks as though spaces, or some strange encoding extras, are being added to the text.

And here's the problem with the database query:

mysql> select * from textfile where id like "%WAR%";

returns nothing; nor does adding spaces:

mysql> select * from textfile where id like "%W A R%";

Only this command returns anything:

mysql> select * from textfile where id like "%W%";

Can anyone guess what might be happening? I feel like it must be an encoding problem, but I can't work it out.

------ UPDATE --------

OK, I've checked the database and connection encoding.

mysql> show variables like "character_set_%";

+--------------------------+----------------------------------------+
| Variable_name            | Value                                  |
+--------------------------+----------------------------------------+
| character_set_client     | latin1                                 |
| character_set_connection | latin1                                 |
| character_set_database   | latin1                                 |
| character_set_filesystem | binary                                 |
| character_set_results    | latin1                                 |
| character_set_server     | latin1                                 |
| character_set_system     | utf8                                   |
| character_sets_dir       | /usr/local/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set (0.01 sec)

And show table status says the table is latin1_swedish_ci.

I have re-saved the text file in "Western (Windows Latin 1)" (using TextEdit on Snow Leopard) and tried to import it using the same command as above. However I still have the same encoding problem.

I also tried, again with no luck:

creating a new table with UTF-8 and importing the existing file

copying & pasting the text into another text file that I've previously imported fine, and trying to import that.

Still totally baffled :(((

Solution

As noted in your duplicate question, make sure that your table and connection are both using UTF-8. That is one of the common sources of this issue.
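
Before changing the table or connection settings, it may also be worth confirming what encoding the file itself is in. This is a supplementary check, not part of the answer above: output like "W A R - 1 6" is what text often looks like when a UTF-16 file is read as a single-byte character set, because every ASCII character then carries an extra NUL byte. The sketch below assumes that possibility and is only a heuristic. The function names guess_encoding, to_utf8 and convert_file are hypothetical helpers invented for illustration, and UTF-32 is deliberately not handled.

```python
import codecs


def guess_encoding(raw: bytes) -> str:
    """Guess a text encoding from a BOM or from the pattern of NUL bytes.

    Heuristic only: it distinguishes UTF-8 from UTF-16 and ignores UTF-32.
    """
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"  # this codec strips the BOM while decoding
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"  # this codec reads the BOM to pick the byte order

    sample = raw[:4096]
    if sample and sample.count(b"\x00") > len(sample) // 4:
        # Mostly-ASCII UTF-16 has a NUL in every second byte.
        # NULs at even offsets mean big-endian, at odd offsets little-endian.
        even_nuls = sample[0::2].count(b"\x00")
        odd_nuls = sample[1::2].count(b"\x00")
        return "utf-16-be" if even_nuls > odd_nuls else "utf-16-le"

    # Fallback assumption: no BOM and few NULs, so treat the data as UTF-8.
    return "utf-8"


def to_utf8(raw: bytes) -> bytes:
    """Decode with the guessed encoding and re-encode as UTF-8 without a BOM."""
    return raw.decode(guess_encoding(raw)).encode("utf-8")


def convert_file(src: str, dst: str) -> str:
    """Write a UTF-8 copy of src to dst and return the encoding that was guessed."""
    with open(src, "rb") as f:
        raw = f.read()
    with open(dst, "wb") as f:
        f.write(to_utf8(raw))
    return guess_encoding(raw)
```

For example, convert_file("textfile.txt", "textfile_utf8.txt") would write a UTF-8 copy of the file and report what it guessed the original to be. If the guess comes back as one of the UTF-16 variants, importing the converted copy with --default-character-set=utf8 into a UTF-8 table is a reasonable next thing to try.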
