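This is a problem I ran into while converting the character encoding of a large text file. When a MappedByteBuffer (MBB) is decoded with CharsetDecoder.decode(), a poorly chosen mapping length (presumably one that ends partway through a multi-byte character) can cause the error "java.nio.charset.MalformedInputException:Malformed input length is 2.". Calling Charset.decode() directly does not produce this error. The two code snippets are as follows:
1. Using CharsetDecoder.decode():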
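import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;

// ...
File infile = new File(inFilename);
RandomAccessFile raf = new RandomAccessFile(infile, "r");
// Map only the first 6000 bytes of the file.
MappedByteBuffer mbb = raf.getChannel().map(FileChannel.MapMode.READ_ONLY, 0, 6000);
Charset inCharset = Charset.forName("GBK");
Charset outCharset = Charset.forName("UTF-8");

CharsetDecoder inDecoder = inCharset.newDecoder();
CharsetEncoder outEncoder = outCharset.newEncoder();

// May throw MalformedInputException, depending on where the mapped region ends.
CharBuffer cb = inDecoder.decode(mbb);

ByteBuffer outbb = outEncoder.encode(cb);

// Use only the encoded bytes (the backing array may be larger) and read them as UTF-8.
CharSequence str = new String(outbb.array(), outbb.arrayOffset() + outbb.position(), outbb.remaining(), outCharset);
System.out.println("str is :" + str);
// ...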
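A likely explanation for the exception: a decoder obtained from newDecoder() reports malformed input by default (CodingErrorAction.REPORT), and a fixed-length mapping can end in the middle of a two-byte GBK character. The sketch below reproduces that situation with an in-memory byte array instead of a mapped file. The class and method names (TruncatedGbkDemo, gbkPrefix, strictDecodeSucceeds) are made up for illustration and are not part of the original code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.util.Arrays;

final class TruncatedGbkDemo {
    private static final Charset GBK = Charset.forName("GBK");

    // Returns the first `length` bytes of the GBK encoding of `text`.
    static byte[] gbkPrefix(String text, int length) {
        return Arrays.copyOf(text.getBytes(GBK), length);
    }

    // Decodes with a fresh decoder in its default (reporting) configuration.
    // Returns false if the decoder rejects the input.
    static boolean strictDecodeSucceeds(byte[] bytes) {
        CharsetDecoder decoder = GBK.newDecoder();
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass of CharacterCodingException.
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters, 4 bytes in GBK
        System.out.println(strictDecodeSucceeds(gbkPrefix(text, 4))); // true
        System.out.println(strictDecodeSucceeds(gbkPrefix(text, 3))); // false
    }
}
```

The 3-byte prefix stands in for a mapped region whose last byte is the first half of a character, which is the case the strict decoder refuses to accept.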
2. Using Charset.decode() directly:
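import java.io.File;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.Charset;

// ...
File infile = new File(inFilename);
RandomAccessFile raf = new RandomAccessFile(infile, "r");
MappedByteBuffer mbb = raf.getChannel().map(FileChannel.MapMode.READ_ONLY, 0, 6000);
Charset inCharset = Charset.forName("GBK");
Charset outCharset = Charset.forName("UTF-8");
//CharsetDecoder inDecoder = inCharset.newDecoder();
//CharsetEncoder outEncoder = outCharset.newEncoder();
// Charset.decode() does not throw on malformed input.
CharBuffer cb = inCharset.decode(mbb);
ByteBuffer outbb = outCharset.encode(cb);
// Use only the encoded bytes (the backing array may be larger) and read them as UTF-8.
CharSequence str = new String(outbb.array(), outbb.arrayOffset() + outbb.position(), outbb.remaining(), outCharset);
System.out.println("str is :" + str);
// ...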
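Charset.decode() is documented to always replace malformed-input and unmappable-character sequences with the charset's replacement string, which would explain why no exception appears here: a character cut off at the end of the mapping silently becomes U+FFFD. If the file is processed in several chunks, one way to avoid both the exception and the lost character is to keep a single CharsetDecoder, pass endOfInput=false for every chunk except the last, and carry undecoded bytes over to the next chunk. What follows is only a minimal sketch under those assumptions; it uses byte-array slices in place of MappedByteBuffer regions, and ChunkedGbkDecode, lenientDecode, and decodeInChunks are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;

final class ChunkedGbkDecode {
    private static final Charset GBK = Charset.forName("GBK");

    // Charset.decode() replaces malformed input instead of throwing.
    static String lenientDecode(byte[] bytes) {
        return GBK.decode(ByteBuffer.wrap(bytes)).toString();
    }

    // Decodes `data` in slices of `chunkSize` bytes with one strict decoder,
    // so a character split across two slices is still decoded correctly.
    static String decodeInChunks(byte[] data, int chunkSize) throws CharacterCodingException {
        if (chunkSize < 1) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        CharsetDecoder decoder = GBK.newDecoder();
        StringBuilder result = new StringBuilder();
        // Extra room for the few bytes carried over from the previous slice
        // (assumption: at most one pending byte for GBK).
        ByteBuffer in = ByteBuffer.allocate(chunkSize + 8);
        CharBuffer out = CharBuffer.allocate(chunkSize + 8);
        int offset = 0;
        do {
            int n = Math.min(chunkSize, data.length - offset);
            in.put(data, offset, n);
            offset += n;
            in.flip();
            boolean last = offset == data.length;
            CoderResult cr;
            do {
                cr = decoder.decode(in, out, last);
                if (cr.isError()) {
                    cr.throwException(); // genuinely malformed or truncated input
                }
                drain(out, result);
            } while (cr.isOverflow());
            in.compact(); // keep the bytes of an incomplete character for the next slice
        } while (offset < data.length);
        decoder.flush(out);
        drain(out, result);
        return result.toString();
    }

    // Moves the decoded characters from `out` into `sink` and empties `out`.
    private static void drain(CharBuffer out, StringBuilder sink) {
        out.flip();
        sink.append(out);
        out.clear();
    }

    public static void main(String[] args) throws CharacterCodingException {
        String text = "\u4e2d\u6587abc\u7f16\u7801";
        byte[] data = text.getBytes(GBK);
        System.out.println(decodeInChunks(data, 3).equals(text)); // true
    }
}
```

With a slice size of 3 the first slice ends in the middle of the second character, yet the result matches the original text, because the pending byte is kept by compact() and completed by the next slice. Input that is still incomplete when the last slice has been fed is reported as an error rather than replaced.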