Damned encodings, damned MySQL

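    For the past two days I kept getting the error [Illegal mix of collations (gb2312_chinese_ci,IMPLICIT) and (gbk_chinese_ci,COERCIBLE) for operation '=']. I fought with it for a long time: reinstalling MySQL, changing MySQL's default startup parameters, editing the my.ini file, changing the default character set of every table, and so on. After all that thrashing around I have finally come to appreciate the "charm" of MySQL, but I also finally understand a few things.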
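    The main cause of this error is the character set: two values in different character sets cannot be compared directly, because a comparison first converts the characters (symbols) into codes (encodings) and only then decides order and equality. The same character (symbol) has a different code (encoding) in different character sets, so a simple comparison does not work. Still, hasn't this wretched thing been made far too complicated? :(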
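    To make the "same symbol, different encoding" point concrete, here is a small sketch. It uses Python's built-in codecs as a stand-in for MySQL character sets, which is my own assumption for illustration; it does not reproduce how MySQL itself stores or compares strings.

```python
# Illustration only: Python's built-in codecs stand in for MySQL character sets.
symbol = "中"

# Encode the same symbol under three different character sets.
encodings = {name: symbol.encode(name) for name in ("gb2312", "big5", "utf-8")}

for name, raw in encodings.items():
    print(f"{name:>6}: {raw.hex()}")

# Comparing the raw bytes across character sets says nothing about whether
# the symbols are equal: the symbol is identical, yet every encoding differs.
all_different = len(set(encodings.values())) == len(encodings)
print("all encodings differ:", all_different)
```

    The symbol is the same in all three cases, yet a byte-level comparison would call the three values unequal, which is why both sides of a comparison have to agree on one character set and one collation first.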
   
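   PS: What I changed: in my.ini I set every character set option to gb2312; I changed the database, the tables, and the columns to gb2312 as well; and I changed the JDBC connection URL (jdbc:mysql://localhost/bugreport?useUnicode=true&characterEncoding=gb2312).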
A character set is a set of symbols and encodings. A collation is a set of rules for comparing characters in a character set. Let's make the distinction clear with an example of an imaginary character set.
Suppose that we have an alphabet with four letters: 'A', 'B', 'a', 'b'. We give each letter a number: 'A' = 0, 'B' = 1, 'a' = 2, 'b' = 3. The letter 'A' is a symbol, the number 0 is the encoding for 'A', and the combination of all four letters and their encodings is a character set.
Now, suppose that we want to compare two string values, 'A' and 'B'. The simplest way to do this is to look at the encodings: 0 for 'A' and 1 for 'B'. Because 0 is less than 1, we say 'A' is less than 'B'. Now, what we've just done is apply a collation to our character set. The collation is a set of rules (only one rule in this case): "compare the encodings." We call this simplest of all possible collations a binary collation.
But what if we want to say that the lowercase and uppercase letters are equivalent? Then we would have at least two rules: (1) treat the lowercase letters 'a' and 'b' as equivalent to 'A' and 'B'; (2) then compare the encodings. We call this a case-insensitive collation. It's a little more complex than a binary collation.
In real life, most character sets have many characters: not just 'A' and 'B' but whole alphabets, sometimes multiple alphabets or eastern writing systems with thousands of characters, along with many special symbols and punctuation marks. Also in real life, most collations have many rules: not just case insensitivity but also accent insensitivity (an "accent" is a mark attached to a character as in German 'Ö') and multiple-character mappings (such as the rule that 'Ö' = 'OE' in one of the two German collations).
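The imaginary four-letter character set and its two collations can be sketched in a few lines of Python. The names `CHARSET`, `binary_key`, `case_insensitive_key`, and `compare` are hypothetical helpers made up for this illustration; they are not part of MySQL or of any library.

```python
# The imaginary character set: each symbol mapped to its encoding.
CHARSET = {"A": 0, "B": 1, "a": 2, "b": 3}


def binary_key(s):
    """Binary collation: one rule, compare the encodings."""
    return [CHARSET[ch] for ch in s]


def case_insensitive_key(s):
    """Case-insensitive collation: (1) treat lowercase letters as their
    uppercase equivalents, (2) then compare the encodings."""
    return [CHARSET[ch.upper()] for ch in s]


def compare(x, y, key):
    """Return -1, 0, or 1 depending on how x sorts relative to y."""
    kx, ky = key(x), key(y)
    return (kx > ky) - (kx < ky)


print(compare("A", "B", binary_key))            # 'A' sorts before 'B'
print(compare("a", "A", binary_key))            # encodings 2 and 0 differ
print(compare("a", "A", case_insensitive_key))  # treated as equivalent
```

Under the binary collation 'a' and 'A' are different values because their encodings differ, while the case-insensitive collation maps both to the same key before comparing. The character set is identical in both cases; only the comparison rules change.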
 