慕丝7291255
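A string is a sequence of characters (conceptually, code points; in Java, `length()` counts UTF-16 `char` units). The number of bytes needed to represent it depends entirely on the encoding used to turn it into bytes. That said, you can convert the string to a byte array and look at its size, like this:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// The input string for this test
final String string = "Hello World";

// Check length, in characters
System.out.println(string.length()); // prints "11"

// Check encoded sizes
final byte[] utf8Bytes = string.getBytes(StandardCharsets.UTF_8);
System.out.println(utf8Bytes.length); // prints "11"

final byte[] utf16Bytes = string.getBytes(StandardCharsets.UTF_16);
System.out.println(utf16Bytes.length); // prints "24" (2-byte BOM + 2 bytes per character)

final byte[] utf32Bytes = string.getBytes(Charset.forName("UTF-32"));
System.out.println(utf32Bytes.length); // prints "44"

final byte[] isoBytes = string.getBytes(StandardCharsets.ISO_8859_1);
System.out.println(isoBytes.length); // prints "11"

final byte[] winBytes = string.getBytes(Charset.forName("CP1252"));
System.out.println(winBytes.length); // prints "11"
```

So you see, even a simple "ASCII" string can take a different number of bytes depending on which encoding is used. Pass whichever character set matters for your case to `getBytes()`. And don't fall into the trap of assuming that UTF-8 represents every character as a single byte, because that isn't true either:

```java
final String interesting = "\uF93D\uF936\uF949\uF942"; // Chinese ideograms

// Check length, in characters
System.out.println(interesting.length()); // prints "4"

// Check encoded sizes
final byte[] utf8Bytes = interesting.getBytes(StandardCharsets.UTF_8);
System.out.println(utf8Bytes.length); // prints "12"

final byte[] utf16Bytes = interesting.getBytes(StandardCharsets.UTF_16);
System.out.println(utf16Bytes.length); // prints "10"

final byte[] utf32Bytes = interesting.getBytes(Charset.forName("UTF-32"));
System.out.println(utf32Bytes.length); // prints "16"

final byte[] isoBytes = interesting.getBytes(StandardCharsets.ISO_8859_1);
System.out.println(isoBytes.length); // prints "4" (probably encoded "????")

final byte[] winBytes = interesting.getBytes(Charset.forName("CP1252"));
System.out.println(winBytes.length); // prints "4" (probably encoded "????")
```

(Note that if you don't supply a character set argument, the platform's default character set is used. That can be useful in some contexts, but in general you should avoid relying on the default and always use an explicit character set whenever encoding or decoding is required.)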
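One more case may be worth checking: characters outside the Basic Multilingual Plane. The following is only a minimal sketch, and the class name `EncodedSize` and helper `encodedSize` are hypothetical names introduced for illustration. It assumes a standard JDK and shows that `length()` counts UTF-16 code units rather than code points, and that the byte count again depends on the charset chosen (including whether a byte-order mark is written).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedSize {
    // Hypothetical helper: number of bytes `s` occupies when encoded with `cs`.
    public static int encodedSize(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        // U+1F600 is outside the BMP, so Java stores it as a surrogate pair.
        String emoji = "\uD83D\uDE00";

        System.out.println(emoji.length());                          // prints "2" (UTF-16 code units)
        System.out.println(emoji.codePointCount(0, emoji.length())); // prints "1" (code points)

        System.out.println(encodedSize(emoji, StandardCharsets.UTF_8));    // prints "4"
        System.out.println(encodedSize(emoji, StandardCharsets.UTF_16BE)); // prints "4" (no BOM)
        System.out.println(encodedSize(emoji, StandardCharsets.UTF_16));   // prints "6" (2-byte BOM + 4)

        // The charset used when none is passed explicitly; varies by platform and JDK version.
        System.out.println(Charset.defaultCharset());
    }
}
```

Passing a `Charset` object instead of a charset name also avoids the checked `UnsupportedEncodingException` that `getBytes(String)` declares.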