Using the Japanese Encodings SHIFT_JIS and MS932

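This article compares the Shift_JIS and Windows-31J character sets, looks in detail at where their mappings to Unicode differ, and includes excerpts from the JDK source code that define the two charsets.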


Summary: in most cases, using MS932 instead of SHIFT_JIS reduces garbled text (mojibake).

-----------------------------------------------------------------------------


Reference: http://www.asteria.com/tutorial/asbook320_application_read.html



(6) Differences Between Shift_JIS and Windows-31J

 

 

As we mentioned earlier, Shift_JIS and Windows-31J employ different character sets and codes. This means that you must use different mapping converters when converting between them and Unicode.

The table below gives you the differences between Shift_JIS and Windows-31J at a glance:

・Mapping from Shift_JIS/Windows-31J to Unicode

JIS X 0208 character              Shift_JIS/Windows-31J code   Shift_JIS→Unicode   Windows-31J→Unicode
〜 (1-33, WAVE DASH)               8160                         U+301C              U+FF5E
‖ (1-34, DOUBLE VERTICAL LINE)    8161                         U+2016              U+2225
− (1-61, MINUS SIGN)              817C                         U+2212              U+FF0D
¢ (1-81, CENT SIGN)               8191                         U+00A2              U+FFE0
£ (1-82, POUND SIGN)              8192                         U+00A3              U+FFE1
¬ (2-44, NOT SIGN)                81CA                         U+00AC              U+FFE2
IBM extensions                    -                            No                  Yes
NEC extensions                    -                            No                  Yes
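
The mapping differences in the table above can be observed directly from Java. The following is a minimal sketch, assuming a JDK that ships the `Shift_JIS` and `windows-31j` charsets (they live in the extended charset set); the class name `DecodeDemo` and the helper `decode` are hypothetical names used only for illustration.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: decodes the same two bytes with both charsets.
final class DecodeDemo {

    /** Decodes a two-byte code with the named charset and returns the first code point. */
    static int decode(String charsetName, int code) {
        byte[] bytes = { (byte) (code >> 8), (byte) code };
        String s = new String(bytes, Charset.forName(charsetName));
        return s.codePointAt(0);
    }

    public static void main(String[] args) {
        // The six codes listed in the table above.
        int[] codes = { 0x8160, 0x8161, 0x817C, 0x8191, 0x8192, 0x81CA };
        for (int code : codes) {
            System.out.printf("%04X  Shift_JIS -> U+%04X  windows-31j -> U+%04X%n",
                    code, decode("Shift_JIS", code), decode("windows-31j", code));
        }
    }
}
```

If a file was written by a Windows application, decoding it as `Shift_JIS` yields the left-hand code points, which is one common source of symbols that later fail to round-trip.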

User-defined characters are mapped into the Unicode Private Use Area as shown in the table below:

Converter     Shift_JIS range   Unicode range
Windows-31J   F040 to F9FC      U+E000 to U+E757
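
The two ranges above are the same size: ten lead bytes (F0 to F9) times 188 valid trail bytes (40 to 7E and 80 to FC) gives 1880 codes, which is exactly U+E000 to U+E757. The sketch below works through that arithmetic, assuming the mapping is linear in row order, which is consistent with the endpoints in the table. The class `UserDefinedArea` and method `toPrivateUse` are hypothetical names, not part of the JDK.

```java
// Hypothetical helper illustrating the arithmetic behind the user-defined area.
final class UserDefinedArea {

    /** Maps a user-defined double-byte code (F040 to F9FC) to a Private Use Area code point. */
    static int toPrivateUse(int code) {
        int lead = code >> 8;
        int trail = code & 0xFF;
        boolean leadOk = lead >= 0xF0 && lead <= 0xF9;
        boolean trailOk = (trail >= 0x40 && trail <= 0x7E) || (trail >= 0x80 && trail <= 0xFC);
        if (!leadOk || !trailOk) {
            throw new IllegalArgumentException(
                    String.format("not a user-defined code: %04X", code));
        }
        // 0x7F is never a trail byte, so trail bytes above it shift down by one.
        int cell = trail - 0x40 - (trail >= 0x80 ? 1 : 0);
        // Assumption: 188 cells per lead byte, assigned consecutively from U+E000.
        return 0xE000 + (lead - 0xF0) * 188 + cell;
    }

    public static void main(String[] args) {
        System.out.printf("F040 -> U+%04X%n", toPrivateUse(0xF040));
        System.out.printf("F9FC -> U+%04X%n", toPrivateUse(0xF9FC));
    }
}
```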

・Mapping from Unicode to Shift_JIS/Windows-31J

Unicode character              Unicode code   Shift_JIS   Windows-31J
‖ (DOUBLE VERTICAL LINE)       U+2016         8161        ×
− (MINUS SIGN)                 U+2212         817C        ×
〜 (WAVE DASH)                  U+301C         8160        ×
- (FULLWIDTH HYPHEN-MINUS)    U+FF0D         ×           817C
~ (FULLWIDTH TILDE)           U+FF5E         ×           8160
¢ (FULLWIDTH CENT SIGN)       U+FFE0         ×           8191
£ (FULLWIDTH POUND SIGN)      U+FFE1         ×           8192
¬ (FULLWIDTH NOT SIGN)        U+FFE2         ×           81CA
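
The reverse direction can be probed with `CharsetEncoder.canEncode`, which reports whether a charset has a mapping for a given character. This is a small sketch under the assumption that the JDK's `Shift_JIS` and `windows-31j` encoders follow the table above; `EncodeDemo` and its `canEncode` wrapper are hypothetical names.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: checks which charset can encode which code point.
final class EncodeDemo {

    /** Returns true if the named charset has a mapping for the given code point. */
    static boolean canEncode(String charsetName, int codePoint) {
        String s = new String(Character.toChars(codePoint));
        // A fresh encoder per call keeps the helper stateless.
        return Charset.forName(charsetName).newEncoder().canEncode(s);
    }

    public static void main(String[] args) {
        // WAVE DASH, FULLWIDTH TILDE, MINUS SIGN, FULLWIDTH HYPHEN-MINUS
        int[] codePoints = { 0x301C, 0xFF5E, 0x2212, 0xFF0D };
        for (int cp : codePoints) {
            System.out.printf("U+%04X  Shift_JIS: %b  windows-31j: %b%n",
                    cp, canEncode("Shift_JIS", cp), canEncode("windows-31j", cp));
        }
    }
}
```

A character that cannot be encoded is silently replaced (typically with `?`) by `String.getBytes`, so a check like this is useful before writing data in either charset.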

To sum up, Shift_JIS and Windows-31J differ in the following ways:

 

  • Windows-31J can handle the additional characters from IBM and NEC.
  • Code points differ for some symbols when they are converted into Unicode.
  • In general, if you stick with Windows-31J, which is the larger character set, you shouldn't have any problems.
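
To illustrate the first point, the sketch below round-trips a string containing an NEC extension character (① U+2460) and an IBM extension character (髙 U+9AD9) through each charset. The class `RoundTripDemo` and method `survives` are hypothetical names chosen for this example, and the behavior shown assumes the JDK's standard `Shift_JIS` and `windows-31j` implementations.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: checks whether text survives an encode/decode round trip.
final class RoundTripDemo {

    /** Encodes text with the named charset, decodes it again, and compares with the original. */
    static boolean survives(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        // getBytes replaces unmappable characters, so a lossy charset fails the comparison.
        return text.equals(new String(text.getBytes(cs), cs));
    }

    public static void main(String[] args) {
        String text = "\u2460\u9AD9"; // CIRCLED DIGIT ONE (NEC), U+9AD9 (IBM)
        System.out.println("windows-31j: " + survives(text, "windows-31j"));
        System.out.println("Shift_JIS:   " + survives(text, "Shift_JIS"));
    }
}
```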

     

------------------------------ JDK source code excerpt --------------------------------------
        charset("Shift_JIS", "SJIS",
                new String[] {
                    // IANA aliases
                    "sjis", // historical
                    "shift_jis",
                    "shift-jis",
                    "ms_kanji",
                    "x-sjis",
                    "csShiftJIS"
                });

        // The definition of this charset may be overridden by the init method,
        // below, if the sun.nio.cs.map property is defined.
        //
        charset("windows-31j", "MS932",
                new String[] {
                    "MS932", // JDK historical
                    "windows-932",
                    "csWindows31J"
                });

        charset("JIS_X0201", "JIS_X_0201",
                new String[] {
                    "JIS0201", // JDK historical
                    // IANA aliases
                    "JIS_X0201",
                    "X0201",
                    "csHalfWidthKatakana"
                });
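
The registrations above mean that names such as `MS932` and `sjis` are aliases that resolve to the canonical charsets `windows-31j` and `Shift_JIS`. A quick way to confirm this on a given JDK is the public `Charset.forName` API; the class `AliasDemo` and its helper `canonicalName` below are hypothetical names for this sketch.

```java
import java.nio.charset.Charset;

// Hypothetical demo class: resolves charset aliases to their canonical names.
final class AliasDemo {

    /** Looks up a charset by any registered alias and returns its canonical name. */
    static String canonicalName(String alias) {
        return Charset.forName(alias).name();
    }

    public static void main(String[] args) {
        String[] aliases = { "MS932", "windows-932", "sjis", "shift-jis", "ms_kanji" };
        for (String alias : aliases) {
            System.out.println(alias + " -> " + canonicalName(alias));
        }
        // The full alias set is also available at runtime.
        System.out.println(Charset.forName("windows-31j").aliases());
    }
}
```

Note that `ms_kanji` is an alias of `Shift_JIS`, not of `windows-31j`, so code that wants the Windows mapping should ask for `MS932` or `windows-31j` explicitly.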

------------------------------

public class CharToByteMS932 extends CharToByteMS932DB {
     CharToByteJIS0201 cbJIS0201 = new CharToByteJIS0201();
    ... ...
------------------------------
public class CharToByteSJIS extends CharToByteJIS0208 {
     CharToByteJIS0201 cbJIS0201 = new CharToByteJIS0201();
    ... ...
------------------------------
