Detecting a file's encoding in Java with the UniversalDetector class

Latest recommended article published on 2024-04-30 17:28:01

守夜码农

Tags: java, programming languages

Article link: https://blog.csdn.net/qq_51081923/article/details/132433508

This post uses the UniversalDetector class. It is not part of the JDK, so it has to be added to the project as a Maven dependency:

        <!-- File encoding detection library -->
        <dependency>
            <groupId>com.github.albfernandez</groupId>
            <artifactId>juniversalchardet</artifactId>
            <version>2.3.0</version>
        </dependency>

Once the dependency is in place, the UniversalDetector class can be used to detect a file's encoding:

package org.example;

import org.mozilla.universalchardet.UniversalDetector;

import java.io.*;
import java.nio.charset.Charset;

public class TestCharSet {
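    /**
     * Detects the charset of the file at filePath.
     * Falls back to the JVM default charset when detection fails.
     */
    public static Charset detectFileEncoding(String filePath) throws IOException {
        // Fallback used when the detector cannot identify the encoding
        Charset charset = Charset.defaultCharset();
        byte[] buffer = new byte[4096];
        UniversalDetector detector = new UniversalDetector(null);

        // try-with-resources closes the stream even if reading fails
        try (InputStream in = new BufferedInputStream(new FileInputStream(filePath))) {
            int bytesRead;
            // Feed the detector until it is confident or the file ends
            while (!detector.isDone() && (bytesRead = in.read(buffer)) != -1) {
                detector.handleData(buffer, 0, bytesRead);
            }
        }

        // Signal end of input so the detector can make its final decision
        detector.dataEnd();
        String encoding = detector.getDetectedCharset();
        // Only convert names this JVM supports; otherwise keep the fallback
        if (encoding != null && Charset.isSupported(encoding)) {
            charset = Charset.forName(encoding);
        }

        // Reset the detector so the instance could be reused
        detector.reset();

        return charset;
    }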

    public static void main(String[] args) {
        String filePath = "E:/123.txt";

        try {
            Charset fileEncoding = detectFileEncoding(filePath);
            System.out.println("File Encoding: " + fileEncoding.name());
        } catch (IOException e) {
            System.out.println("Error occurred while detecting file encoding: " + e.getMessage());
        }
    }
}
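
As a follow-up to detectFileEncoding above, the returned Charset is typically used to decode the file's contents. The snippet below is a minimal sketch of that step using only the JDK. The class name DecodedFileReader and the method readText are hypothetical names chosen for illustration; they are not part of juniversalchardet or of the code above. The sample in main hard-codes UTF-16LE, on the assumption that in real use the charset would come from detectFileEncoding.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of juniversalchardet.
class DecodedFileReader {

    /** Reads the whole file and decodes it with the given charset. */
    static String readText(Path file, Charset charset) throws IOException {
        String text = new String(Files.readAllBytes(file), charset);
        // Some decoders (for example UTF-8 and UTF-16LE) keep a leading
        // byte order mark as U+FEFF, so drop it if present.
        if (!text.isEmpty() && text.charAt(0) == '\uFEFF') {
            return text.substring(1);
        }
        return text;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("charset-demo", ".txt");
        try {
            // Write a small UTF-16LE sample (two Chinese characters plus
            // ASCII); in real use the charset would be the value returned
            // by detectFileEncoding(tmp.toString()).
            Files.write(tmp, "\u7f16\u7801 test".getBytes(StandardCharsets.UTF_16LE));
            System.out.println(readText(tmp, StandardCharsets.UTF_16LE));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Reading the whole file into memory keeps the sketch short. For large files, a reader created with Files.newBufferedReader(path, charset) would likely be a better fit.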

The result of running the TestCharSet test is: