I tested the following function myself and it worked.
The function that detects a txt file's encoding:
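// Needs java.io.BufferedInputStream, java.io.FileInputStream and java.io.IOException.
private String getCharset(String fileName) throws IOException {
    // try-with-resources closes the stream even if read() throws
    try (BufferedInputStream bin = new BufferedInputStream(new FileInputStream(fileName))) {
        // Pack the first two bytes into one value and compare it with known BOM prefixes
        int p = (bin.read() << 8) + bin.read();
        String code;
        switch (p) {
        case 0xefbb: // EF BB: first two bytes of the UTF-8 BOM (EF BB BF)
            code = "UTF-8";
            break;
        case 0xfffe: // FF FE: UTF-16 little-endian BOM
            code = "Unicode";
            break;
        case 0xfeff: // FE FF: UTF-16 big-endian BOM
            code = "UTF-16BE";
            break;
        default: // no BOM recognized: assume GBK
            code = "GBK";
        }
        return code;
    }
}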
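getCharset compares only the first two bytes, so any file that starts with EF BB is reported as UTF-8 even if the third byte is not BF. The following is a minimal sketch of a stricter check, not the original author's method: the class name BomSniffer and both detect methods are hypothetical, and the GBK fallback is simply carried over from the function above as an assumption about BOM-less files.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; not part of the original post.
class BomSniffer {

    // Maps the first `len` valid bytes of `head` to a charset name.
    static String detect(byte[] head, int len) {
        // Java bytes are signed, so mask with 0xFF before comparing with hex values
        if (len >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (len >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (len >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        // Assumption carried over from getCharset: files without a BOM are GBK
        return "GBK";
    }

    // Reads up to three leading bytes of the file and classifies them.
    static String detect(String fileName) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(fileName))) {
            byte[] head = new byte[3];
            int len = 0;
            // read() may return fewer bytes than requested, so loop until full or EOF
            while (len < head.length) {
                int n = in.read(head, len, head.length - len);
                if (n < 0) {
                    break;
                }
                len += n;
            }
            return detect(head, len);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(detect(args[0]));
    }
}
```

A file without a BOM still falls through to the GBK guess, so this sketch only tightens the BOM checks; it does not detect BOM-less UTF-8.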
Test: reading the file:
// Needs java.io.BufferedReader, java.io.FileInputStream, java.io.IOException and java.io.InputStreamReader.
public String getTextFromText(String filePath) {
    // try-with-resources closes the reader (and the underlying stream) even on failure
    try (BufferedReader br = new BufferedReader(
            new InputStreamReader(new FileInputStream(filePath), getCharset(filePath)))) {
        StringBuilder sb = new StringBuilder();
        String temp;
        // readLine() strips line terminators, so the lines are concatenated without separators
        while ((temp = br.readLine()) != null) {
            sb.append(temp);
        }
        return sb.toString();
    } catch (IOException e) {
        // FileNotFoundException is a subclass of IOException, so one handler covers both
        e.printStackTrace();
    }
    // Reached only when reading failed
    return null;
}
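getTextFromText appends each line returned by readLine(), which drops the line terminators, and Java's UTF-8, UTF-16BE and UTF-16LE decoders keep a leading BOM as the character U+FEFF. If either matters, one possible variant is sketched below. The class TextLoader and its methods are hypothetical names, and the charset name is assumed to come from a detector such as getCharset.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the original post.
class TextLoader {

    // Decodes the bytes with the given charset and drops a leading BOM character, if any.
    static String decode(byte[] bytes, String charsetName) {
        String text = new String(bytes, Charset.forName(charsetName));
        // A BOM that the decoder did not consume shows up as U+FEFF at index 0
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    // Reads the whole file at once, so line breaks are preserved as they appear in the file.
    static String load(String filePath, String charsetName) throws IOException {
        return decode(Files.readAllBytes(Paths.get(filePath)), charsetName);
    }
}
```

Reading the whole file into memory is fine for small text files; for large ones a streaming reader, as in getTextFromText, is the safer choice.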
Original article: http://blog.sina.com.cn/s/blog_68ed2a9b0100vqrn.html
Note:
In my tests, the method given in this article (http://tinyking.blog.51cto.com/3338571/667453) did not work.
InputStream inputStream = new FileInputStream("E:/1.txt");
byte[] head = new byte[3];
inputStream.read(head);
String code = "";
code = "gb2312";
if (head[0] == -1 && head[1] == -2 )
code = "UTF-16";
if (head[0] == -2 && head[1] == -1 )
code = "Unicode";
if(head[0]==-17 && head[1]==-69 && head[2] ==-65)
code = "UTF-8";
System.out.println(code);
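For readers comparing the two approaches: the snippet above tests against negative numbers because Java's byte type is signed, so -1, -2, -17, -69 and -65 are the bytes 0xFF, 0xFE, 0xEF, 0xBB and 0xBF. The short sketch below only illustrates that mapping; the class SignedByteDemo and the method unsigned are hypothetical names, and it makes no claim about why the snippet failed in the test.

```java
// Hypothetical demo for illustration; not part of either article.
class SignedByteDemo {

    // Converts a signed Java byte to its unsigned value in the range 0..255.
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte[] values = {-1, -2, -17, -69, -65};
        for (byte b : values) {
            // Prints each signed value next to its two-digit hex form
            System.out.printf("%d -> 0x%02X%n", b, unsigned(b));
        }
    }
}
```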
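This post shows how to read TXT files in different encodings with Java. It provides a function named `getCharset` that inspects the file's leading bytes to determine the encoding (UTF-8, Unicode, UTF-16BE, or GBK), and a `getTextFromText` method that reads the whole file content. It also notes that some other methods may be unreliable.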