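Judging from the code, detection is based on frequency analysis, and the results are fairly accurate.
The test code is as follows: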
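To make "frequency analysis" concrete, here is a crude sketch of the general idea. It is not the logic of BytesEncodingDetect, whose source is far larger. It assumes a tiny hand-picked table of common Chinese characters (`COMMON`) and a fixed candidate list (`CANDIDATES`); both names, and the class `FrequencyGuess`, are made up for illustration. Each candidate encoding is used to decode the bytes strictly, and the one whose decoded text contains the largest share of common characters wins.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class FrequencyGuess {
    // Hypothetical mini frequency table: thirteen very common Chinese characters.
    // Written as Unicode escapes so the source compiles identically whatever
    // encoding the compiler assumes for the file.
    static final String COMMON =
        "\u7684\u4e00\u662f\u4e0d\u4e86\u4eba\u6211\u5728\u6709\u8fd9\u4eec\u4e2d\u6587";

    // Candidate encodings, tried in order; on a tie the earlier one is kept.
    static final String[] CANDIDATES = {"GB2312", "Big5", "UTF-8"};

    // Share of decoded characters found in COMMON, or -1 if the bytes are not valid in cs.
    static double score(byte[] data, Charset cs) {
        CharsetDecoder decoder = cs.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            String text = decoder.decode(ByteBuffer.wrap(data)).toString();
            if (text.isEmpty()) {
                return 0.0;
            }
            int hits = 0;
            for (int i = 0; i < text.length(); i++) {
                if (COMMON.indexOf(text.charAt(i)) >= 0) {
                    hits++;
                }
            }
            return (double) hits / text.length();
        } catch (CharacterCodingException e) {
            return -1.0; // the byte sequence is impossible in this encoding
        }
    }

    // Returns the best-scoring candidate, or null if no candidate can decode the bytes.
    static Charset guess(byte[] data) {
        Charset best = null;
        double bestScore = -1.0;
        for (String name : CANDIDATES) {
            Charset cs = Charset.forName(name);
            double s = score(data, cs);
            if (s > bestScore) {
                best = cs;
                bestScore = s;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Seven-character sample sentence built only from characters in COMMON.
        String sample = "\u8fd9\u662f\u6211\u4eec\u7684\u4e2d\u6587";
        System.out.println(guess(sample.getBytes(Charset.forName("GB2312"))));
        System.out.println(guess(sample.getBytes(StandardCharsets.UTF_8)));
    }
}
```

A real detector would need per-encoding frequency tables covering thousands of characters and would have to cope with short or mixed input; thirteen characters are only enough to show the mechanism.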
- import java.io.File;
- import java.io.UnsupportedEncodingException;
- import java.net.MalformedURLException;
- import java.net.URL;
- public class Test {
-     public static void main(String[] args) throws UnsupportedEncodingException, MalformedURLException {
-         BytesEncodingDetect s = new BytesEncodingDetect();
-         // Big5 bytes that were wrongly decoded as ISO-8859-1
-         String str = "??¤¤¤å";
-         // ISO-8859-1 maps each char back to its original byte, so the detector sees the raw bytes
-         System.out.println(BytesEncodingDetect.nicename[s.detectEncoding(str.getBytes("ISO-8859-1"))]);
-         // Decode the same bytes as Big5 to recover the text
-         System.out.println(new String(str.getBytes("ISO-8859-1"), "BIG5"));
-         // getBytes() without an argument uses the platform default charset, so this result varies by machine
-         System.out.println(BytesEncodingDetect.nicename[s.detectEncoding("Java世界".getBytes())]);
-         // Detect the encoding of a web page
-         System.out.println(BytesEncodingDetect.nicename[s.detectEncoding(new URL("http://www.iteye.com"))]);
-         // Detect the encoding of a local file
-         System.out.println(BytesEncodingDetect.nicename[s.detectEncoding(new File("src/Test.java"))]);
-     }
- }
Output:
- Big5
- ??中文
- GB-2312
- UTF-8
- UTF-8
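The second output line works because ISO-8859-1 maps the characters 0 to 255 one-to-one onto bytes, so text that was mis-decoded as ISO-8859-1 still carries its original bytes. The sketch below reproduces that round trip with the standard library only (the class and method names are made up for illustration), and also shows why the result for "Java世界" depends on which charset produced the bytes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class MojibakeRoundTrip {
    static final Charset BIG5 = Charset.forName("Big5");

    // Simulates the damage: Big5 bytes read as if they were ISO-8859-1 text.
    static String garble(String original) {
        return new String(original.getBytes(BIG5), StandardCharsets.ISO_8859_1);
    }

    // Reverses it: encoding as ISO-8859-1 returns the raw bytes, which are then decoded as Big5.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), BIG5);
    }

    public static void main(String[] args) {
        // U+4E2D U+6587 is the two-character word for "Chinese (language)".
        // Unicode escapes keep this source independent of the compiler's file encoding.
        String original = "\u4e2d\u6587";
        String garbled = garble(original);

        // The garbled form is the four Latin-1 characters U+00A4 U+00A4 U+00A4 U+00E5.
        System.out.println(garbled.equals("\u00a4\u00a4\u00a4\u00e5")); // true
        System.out.println(repair(garbled).equals(original));           // true

        // "Java" followed by U+4E16 U+754C: the byte count depends on the charset,
        // so calling getBytes() with no argument gives platform-dependent input.
        String mixed = "Java\u4e16\u754c";
        System.out.println(mixed.getBytes(Charset.forName("GBK")).length);   // 8
        System.out.println(mixed.getBytes(StandardCharsets.UTF_8).length);   // 10
    }
}
```

Passing an explicit charset to `getBytes` makes a detection test reproducible across machines. The GB-2312 result above suggests the original test ran on a platform whose default charset was GBK or GB2312.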
BytesEncodingDetect.java (153.24 K): the source code of the class is too large to paste here, so please download the attachment yourself.
Original article: http://www.java2000.net/p1679