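Judging from its source code, BytesEncodingDetect identifies encodings by character-frequency analysis, and in practice it is fairly accurate.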
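To give a feel for what "frequency analysis" means here, below is a minimal sketch. It is not the BytesEncodingDetect implementation, and the class FrequencyGuess, its methods, and the tiny character table are all made up for illustration. The idea: decode the bytes with each candidate charset and count how many of the resulting characters are very common Chinese characters (here 的一是不了人我在有他中大国上, written as Unicode escapes so the source stays ASCII). The wrong charset usually produces rare characters or replacement characters, so it scores low. The sketch assumes the JDK ships the GBK and Big5 charsets, which standard full JDK builds do.

```java
import java.nio.charset.Charset;

// Hypothetical sketch of frequency-based guessing; a real detector uses
// much larger per-encoding frequency tables.
final class FrequencyGuess {
    // A handful of very common Chinese characters (illustrative only).
    static final String COMMON =
        "\u7684\u4E00\u662F\u4E0D\u4E86\u4EBA\u6211\u5728\u6709\u4ED6\u4E2D\u5927\u56FD\u4E0A";

    // Decode the bytes with one charset and count the common characters.
    // Malformed input is silently replaced, so it simply scores nothing.
    static int score(byte[] data, String charsetName) {
        String decoded = new String(data, Charset.forName(charsetName));
        int hits = 0;
        for (int i = 0; i < decoded.length(); i++) {
            if (COMMON.indexOf(decoded.charAt(i)) >= 0) {
                hits++;
            }
        }
        return hits;
    }

    // Return the candidate with the highest score; on a tie the earlier
    // candidate wins. At least one candidate must be supplied.
    static String guess(byte[] data, String... candidates) {
        String best = candidates[0];
        int bestScore = -1;
        for (String name : candidates) {
            int s = score(data, name);
            if (s > bestScore) {
                bestScore = s;
                best = name;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Five common characters, encoded as GBK and then guessed back.
        String text = "\u6211\u662F\u4E2D\u56FD\u4EBA";
        byte[] gbk = text.getBytes(Charset.forName("GBK"));
        System.out.println(guess(gbk, "UTF-8", "GBK", "Big5"));
    }
}
```

With input this short a table of fourteen characters is enough to separate the candidates; on real text a detector needs far more statistics, which is presumably why the original class carries large frequency tables.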
The test code for BytesEncodingDetect is as follows:
import java.io.File;
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URL;

public class Test {
    public static void main(String[] args) throws UnsupportedEncodingException, MalformedURLException {
        BytesEncodingDetect s = new BytesEncodingDetect();

        // Big5 bytes that were wrongly decoded as ISO-8859-1 (mojibake).
        String str = "??¤¤¤å";
        // Re-encode as ISO-8859-1 to get the raw bytes back, then detect their encoding.
        System.out.println(BytesEncodingDetect.nicename[s.detectEncoding(str.getBytes("ISO-8859-1"))]);
        // Decode the same raw bytes as Big5 to recover the text.
        System.out.println(new String(str.getBytes("ISO-8859-1"), "BIG5"));

        // getBytes() with no argument uses the platform default charset.
        System.out.println(BytesEncodingDetect.nicename[s.detectEncoding("Java世界".getBytes())]);
        // Detect the encoding of a web page.
        System.out.println(BytesEncodingDetect.nicename[s.detectEncoding(new URL("http://www.iteye.com"))]);
        // Detect the encoding of a local file.
        System.out.println(BytesEncodingDetect.nicename[s.detectEncoding(new File("src/Test.java"))]);
    }
}
Output:
Big5
??中文
GB-2312
UTF-8
UTF-8
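The first two output lines rely on the ISO-8859-1 round trip, which may not be obvious. ISO-8859-1 maps every byte 0x00 to 0xFF to the character with the same code point, so re-encoding a misread string as ISO-8859-1 gives back the original bytes unchanged. The sketch below shows this for the four characters ¤¤¤å (bytes A4 A4 A4 E5, which are 中文 in Big5). The class name RoundTripDemo and the method recover are hypothetical, and the sketch assumes the JDK provides the Big5 and GBK charsets. The leading ?? in the test string were presumably characters already lost before the string was pasted, so they stay as question marks.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating the ISO-8859-1 round trip.
final class RoundTripDemo {
    // Recover text whose bytes were wrongly decoded as ISO-8859-1:
    // re-encode to get the raw bytes, then decode with the real charset.
    static String recover(String misread, String actualCharset) {
        byte[] raw = misread.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, Charset.forName(actualCharset));
    }

    public static void main(String[] args) {
        // U+00A4 U+00A4 U+00A4 U+00E5 are the Big5 bytes A4 A4 A4 E5.
        String fixed = recover("\u00A4\u00A4\u00A4\u00E5", "Big5");
        System.out.println(fixed.equals("\u4E2D\u6587")); // prints true

        // The same text has different byte lengths in different encodings,
        // so it is safer to name the charset than to rely on the default.
        String s = "Java\u4E16\u754C";
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);    // prints 10
        System.out.println(s.getBytes(Charset.forName("GBK")).length);    // prints 8
    }
}
```

This also suggests why the third output line is GB-2312: getBytes() without an argument uses the platform default charset, so the result depends on the machine the test ran on, which was most likely configured for GBK or GB2312.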
Attachment: