Java used UCS-2 before transitioning to UTF-16 in 2004/2005. The reason for the original choice of UCS-2 is mainly historical:
Unicode was originally designed as a fixed-width 16-bit character encoding. The primitive data type char in the Java programming language was intended to take advantage of this design by providing a simple data type that could hold any character.
Originally, Unicode was designed as a pure 16-bit encoding, aimed at representing all modern scripts. (Ancient scripts were to be represented with private-use characters.) Over time, and especially after the addition of over 14,500 composite characters for compatibility with legacy sets, it became clear that 16-bits were not sufficient for the user community. Out of this arose UTF-16.
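As @wero has already mentioned, random access cannot be done efficiently with UTF-8, because characters occupy a variable number of bytes. All things weighed up, UCS-2 seemed to be the best choice at the time, particularly since no supplementary characters had been allocated by that stage. That then left UTF-16 as the easiest natural progression.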
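To make the trade-off concrete, here is a minimal sketch that compares how many 16-bit code units, code points, and UTF-8 bytes a string occupies. The class name `CodeUnitDemo` and its helper methods are made up for illustration; only `String.length`, `String.codePointCount`, and `String.getBytes` are standard library calls.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names here are illustrative, not from any library.
class CodeUnitDemo {
    // Number of 16-bit code units (Java chars) in the string.
    static int utf16Units(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // Number of bytes the string needs when encoded as UTF-8.
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Three characters from the Basic Multilingual Plane:
        // Latin A, e with acute accent, and a CJK ideograph.
        String bmp = "A\u00E9\u4E2D";
        // One supplementary character (code point 1F600), written as a surrogate pair.
        String supplementary = "\uD83D\uDE00";

        for (String s : new String[] { bmp, supplementary }) {
            System.out.println(utf16Units(s) + " code units, "
                    + codePoints(s) + " code points, "
                    + utf8Bytes(s) + " UTF-8 bytes");
        }
    }
}
```

For the BMP-only string, every character is exactly one `char`, so `charAt(i)` is a true constant-time character lookup. That was the property UCS-2 offered. The same three characters take 1, 2, and 3 bytes in UTF-8, so finding the i-th character there means scanning from the start. The supplementary character shows where the fixed-width assumption breaks in UTF-16: it is one code point but two `char` values.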