mysql 导入 特殊字符_在MySQL导出/导入中丢失特殊字符

本文记录了从MySQL 3.23.58版本迁移到5.5.19版本过程中遇到的特殊字符丢失问题及解决方法。作者尝试了多种手段如mysqldump导出、编辑文件转换类型以及终端导入等,但仍然出现了特殊字符被替换为问号的情况。最终通过特定字符替换的方法解决了这一问题。
摘要由CSDN通过智能技术生成

bd96500e110b49cbb3cd949968f18be7.png

I'm trying to move a MySQL 3.23.58 database to a different server running 5.5.19.

The old one has latin1 encoding specified, and as far as I can tell the underlying data is indeed honestly latin1. I have tried many things, chiefly:

exporting from terminal with mysqldump and the latin1 encoding flag.

editing in vim to change "TYPE=InnoDB" to "ENGINE=InnoDB" for MySQL 5 compatibility.

importing to the new server from terminal.

Browsing the old server (in Sequel Pro for Mac, or MySQL Query Browser on PC), special characters don't always show properly, but they're there (looking at the binary in hex). (And in any case it works with the PHP web app.)

Browsing the new server, all special characters appear to have been replaced by question marks. I know that sometimes special characters can display as a question mark (or �) if the wrong encoding is specified. But these appear to be genuine straight-up encoded ASCII question marks on a binary level. The special characters (chiefly curly quotation marks and dashes) appear to have been lost, or destroyed, in the export/import.

Any idea why?

I know there are many things that can go wrong with encoding, with many different things at fault. I have been reading about this for several days (here and elsewhere) and tried setting all the right character encodings, tried UTF-8, tried casting and converting, tried Sequel Pro's export/import (as opposed to the terminal), etc. But I am stumped.

解决方案

Good, it looks like we've narrowed down your problem. I found this post

If your text editor is vim, then most likely the "<92>" is the

hexadecimal code of an extended ASCII character. In this case, it is

Hex(92) or Oct(222) or Dec(146) , which is "right single quotation

mark"; not to confused with "single quote" which is ASCII Dec code 39.

One way to remove all non-ASCII characters from your file could be -

perl -plne 's/[^[:ascii:]]//g'

Otherwise just search and replace "<92>" and "<97>" in your exported file with an appropriate character.

[Edit]

I'm not a VIM user but this post addresses the issue of replacing the <92> smart quote characters

For each value that you see in your file, just do a string

substitution, like so:

:%s/<93>/\’/g

of course, you can’t just type that <93> in there, so to get it in

there you use

CTRL-V x 93

which inserts hex 93 in place.

In recently exported CSV’s from excel, I’ve seen hex 91-97.

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值