Java and txt file encoding formats


lili1985516



Article link: https://blog.csdn.net/lili1985516/article/details/7443119


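I first looked into how encodings are named in Java and found the following.

Correspondence between Java encoding names and txt encoding names (the txt names are the ones shown in the Windows Notepad save dialog)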
java	txt
unicode	unicode big endian
utf-8	utf-8
utf-16	unicode
gb2312	ANSI
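
To make the two UTF-16 rows concrete: Notepad's "Unicode" is UTF-16 little endian with the BOM FF FE, and "Unicode big endian" is UTF-16 big endian with the BOM FE FF. The sketch below is only an illustration (the class and method names are made up for this example); it shows that the JDK's "UTF-16" decoder reads the BOM and picks the byte order from it.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original article.
class Utf16BomDemo {
    /** Decodes bytes with Java's "UTF-16" charset, which uses the BOM to choose the byte order. */
    static String decodeUtf16(byte[] data) {
        return new String(data, StandardCharsets.UTF_16);
    }

    public static void main(String[] args) {
        // The letter "A" (U+0041) as saved by Notepad in each format.
        byte[] littleEndian = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00}; // Notepad "Unicode"
        byte[] bigEndian = {(byte) 0xFE, (byte) 0xFF, 0x00, 0x41};    // Notepad "Unicode big endian"
        System.out.println(decodeUtf16(littleEndian));
        System.out.println(decodeUtf16(bigEndian));
    }
}
```

Both calls should decode to the single letter A, so for reading purposes either BOM is handled by the same Java charset.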

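When Java reads a txt file with a charset that does not match the file's actual encoding, the text comes out garbled, so the charset has to be set when reading. For the Unicode formats, the encoding is marked at the very start of the file by a byte order mark (BOM): FF FE for Unicode (UTF-16 little endian), FE FF for Unicode big endian, and EF BB BF for UTF-8. ANSI files carry no such marker. The program therefore reads the first bytes of the file to work out the encoding and then reads the file with that encoding, which avoids garbled text. Files saved without a BOM, such as BOM-less UTF-8, cannot be identified this way.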

 
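import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class DetectEncoding {
    public static void main(String[] args) throws IOException {
        byte[] head = new byte[3];
        int n;
        // try-with-resources closes the stream even if the read fails
        try (InputStream inputStream = new FileInputStream("E:/1.txt")) {
            n = inputStream.read(head);  // number of bytes actually read, -1 for an empty file
        }

        // No BOM found: fall back to gb2312 (ANSI)
        String code = "gb2312";
        if (n >= 2 && head[0] == -1 && head[1] == -2)    // FF FE
            code = "UTF-16";
        if (n >= 2 && head[0] == -2 && head[1] == -1)    // FE FF
            code = "Unicode";
        if (n >= 3 && head[0] == -17 && head[1] == -69 && head[2] == -65)    // EF BB BF
            code = "UTF-8";

        System.out.println(code);
    }
}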

This gives the encoding of the txt file.
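
The snippet above only detects the encoding. As a possible next step, here is a minimal sketch of actually reading the text with the detected charset. The class `BomAwareReader` and its methods are hypothetical names introduced for this example, and the fallback charset is an assumption (the original falls back to gb2312). The BOM bytes are skipped explicitly, because Java's UTF-8 decoder does not remove them and would otherwise leave a U+FEFF character at the start of the string.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original article.
class BomAwareReader {
    /** Decodes raw file bytes, choosing the charset from the BOM and dropping the BOM itself. */
    static String decode(byte[] data, Charset fallback) {
        Charset charset = fallback;
        int skip = 0; // number of BOM bytes to drop before decoding
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) {
            charset = StandardCharsets.UTF_8;
            skip = 3;
        } else if (startsWith(data, 0xFF, 0xFE)) {
            charset = StandardCharsets.UTF_16LE;
            skip = 2;
        } else if (startsWith(data, 0xFE, 0xFF)) {
            charset = StandardCharsets.UTF_16BE;
            skip = 2;
        }
        return new String(data, skip, data.length - skip, charset);
    }

    /** Returns true if data begins with the given unsigned byte values. */
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    /** Reads the whole file into memory and decodes it; suitable for small text files only. */
    static String readFile(Path path, Charset fallback) throws IOException {
        return decode(Files.readAllBytes(path), fallback);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the platform default charset stands in for "ANSI";
        // the original article uses gb2312 for this case.
        Charset fallback = Charset.defaultCharset();
        if (args.length > 0) {
            System.out.println(readFile(Paths.get(args[0]), fallback));
        } else {
            byte[] sample = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
            System.out.println(decode(sample, fallback));
        }
    }
}
```

Run with a file path as the first argument to print that file; without arguments it decodes a small in-memory UTF-8 sample. Files without a BOM are decoded with the fallback charset, which may or may not match their real encoding.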

The same check written in C#:

using System.IO;

public class EncodingType
{
  public static System.Text.Encoding GetType(string FILE_NAME)
  {
   FileStream fs = new FileStream(FILE_NAME, FileMode.Open, FileAccess.Read);
   System.Text.Encoding r= GetType(fs);
   fs.Close();
   return r;
  }
  public static System.Text.Encoding GetType(FileStream fs)
  {
   /*byte[] Unicode=new byte[]{0xFF,0xFE};
   byte[] UnicodeBIG=new byte[]{0xFE,0xFF};
   byte[] UTF8=new byte[]{0xEF,0xBB,0xBF};*/

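   BinaryReader r = new BinaryReader(fs,System.Text.Encoding.Default);
   byte[] ss=r.ReadBytes(3);
   r.Close();
   // ReadBytes returns fewer than 3 bytes for very short files;
   // pad with zeros so the index checks below cannot go out of range.
   if(ss.Length<3)
   {
    System.Array.Resize(ref ss,3);
   }
   if(ss[0]>=0xEF)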
   {
    if(ss[0]==0xEF && ss[1]==0xBB && ss[2]==0xBF)
    {
     return System.Text.Encoding.UTF8;
    }
    else if(ss[0]==0xFE && ss[1]==0xFF)
    {
     return System.Text.Encoding.BigEndianUnicode;
    }
    else if(ss[0]==0xFF && ss[1]==0xFE)
    {
     return System.Text.Encoding.Unicode;
    }
    else
    {
     return System.Text.Encoding.Default;
    }
   }
   else
   {
    return System.Text.Encoding.Default;
   }
  }
}