File and I/O Stream Utilities

Article link: https://blog.csdn.net/zhangxin09/article/details/55805849

This article covers common file and I/O stream operations, based on the Files and Path APIs introduced in Java 7.

The class is called FileHelper; its full source lives here: https://gitee.com/sp42_admin/ajaxjs/tree/master/aj-base/src/main/java/com/ajaxjs/util/io

Copying and Moving Files

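/**
 * Copies a file.
 * 
 * @param target    source file
 * @param dest      destination path; Files.copy treats it as the target file itself, so include the file name
 * @param isReplace whether to replace an existing file, true = overwrite
 * @return true if the copy succeeded
 */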
public static boolean copy(String target, String dest, boolean isReplace) {
	try {
		if (isReplace)
			Files.copy(Paths.get(target), Paths.get(dest), StandardCopyOption.REPLACE_EXISTING);
		else
			Files.copy(Paths.get(target), Paths.get(dest));
	} catch (IOException e) {
		LOGGER.warning(e);
		return false;
	}

	return true;
}

/**
 * Moves a file.
 * 
 * @param target source file
 * @param dest   destination path; include the new file name if the file should be renamed
 * @return true if the operation succeeded
 */
public static boolean move(String target, String dest) {
	try {
		Files.move(Paths.get(target), Paths.get(dest));
	} catch (IOException e) {
		LOGGER.warning(e);
		return false;
	}

	return true;
}

Moving a file is essentially the same as renaming it.
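Files.copy and Files.move take the final target path, not a directory to place the file in. If callers may pass an existing directory as dest, the target can be resolved first. A minimal sketch, assuming a hypothetical helper named resolveDest (the class and method names here are illustrative and not part of FileHelper):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class CopyMoveSketch {
	/** Hypothetical helper: if dest is an existing directory, keep the source file name inside it. */
	static Path resolveDest(Path source, Path dest) {
		return Files.isDirectory(dest) ? dest.resolve(source.getFileName()) : dest;
	}

	/** Copies source to dest (file path or existing directory), replacing an existing file. */
	static Path copyInto(Path source, Path dest) throws IOException {
		return Files.copy(source, resolveDest(source, dest), StandardCopyOption.REPLACE_EXISTING);
	}

	/** Moves source to dest (file path or existing directory), replacing an existing file. */
	static Path moveInto(Path source, Path dest) throws IOException {
		return Files.move(source, resolveDest(source, dest), StandardCopyOption.REPLACE_EXISTING);
	}
}
```

Passing a directory keeps the original file name; passing a full file path renames the file on the way.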

Deleting Files

First, the new approach based on Path and Files.

/**
 * Deletes a file or directory.
 * 
 * @param filePath full path of the file
 */
public static void delete(String filePath) {
	Path path = Paths.get(filePath);

	try {
		if (path.toFile().isDirectory()) {
			Files.walkFileTree(path, new SimpleFileVisitor<Path>() {
				@Override
				public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
					Files.delete(file);
					return FileVisitResult.CONTINUE;
				}

				@Override
				public FileVisitResult postVisitDirectory(Path dir, IOException e) throws IOException {
					if (e == null) {
						Files.delete(dir);
						return FileVisitResult.CONTINUE;
					} else
						throw e;
				}
			});
		} else
			Files.delete(path);
	} catch (IOException e) {
		LOGGER.warning(e);
	}
}

That feels rather verbose compared with the old approach.

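/**
 * Deletes a file or directory.
 * 
 * @param file file object
 */
public static void delete(File file) {
	if (file.isDirectory()) {
		File[] files = file.listFiles();

		if (files != null) // listFiles() returns null when the directory cannot be read
			for (File f : files)
				delete(f);
	}

	if (!file.delete())
		LOGGER.warning("Failed to delete file {0}", file.toString());
}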

/**
 * Deletes a file or directory.
 * 
 * @param filePath full path of the file
 */
public static void delete(String filePath) {
	delete(new File(filePath));
}
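On Java 8 and later there is a shorter Path-based alternative to the visitor: walk the tree and delete entries in reverse order, so that children are removed before their parent directories. A minimal sketch, with illustrative names that are not part of FileHelper:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

final class DeleteSketch {
	/** Deletes a file, or a directory together with everything below it. */
	static void deleteRecursively(Path root) throws IOException {
		if (Files.notExists(root))
			return;

		// Files.walk() must be closed, hence the try-with-resources
		try (Stream<Path> paths = Files.walk(root)) {
			// A child path sorts after its parent, so reverse order visits children first
			paths.sorted(Comparator.reverseOrder()).forEach(p -> {
				try {
					Files.delete(p);
				} catch (IOException e) {
					throw new UncheckedIOException(e);
				}
			});
		}
	}
}
```

Unlike the File-based version, a failed deletion surfaces as an exception instead of a log line, which is a design choice of this sketch rather than a requirement.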

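Opening Text Files

Opens a file and returns its text content. Callers should state the file's character encoding explicitly to avoid exceptions or decoding errors.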

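/**
 * Opens a file and returns its text content, using the given encoding.
 * 
 * @param filePath path of the file on disk
 * @param encode   file encoding
 * @return file content, or null on failure
 */
public static String openAsText(String filePath, Charset encode) {
	LOGGER.info("Reading file [{0}]", filePath);

	Path path = Paths.get(filePath);

	try {
		if (Files.isDirectory(path))
			throw new IOException("Parameter filePath: " + filePath + " must not be a directory; please specify a file");
	} catch (IOException e) {
		LOGGER.warning(e);
		return null;
	}

	// Files.lines() keeps the file open until the stream is closed
	try (Stream<String> lines = Files.lines(path, encode)) {
		StringBuilder sb = new StringBuilder();
		lines.forEach(sb::append); // note: line terminators are dropped

		return sb.toString();
	} catch (IOException | UncheckedIOException e) { // decoding errors surface as UncheckedIOException
		LOGGER.warning(e);
	}

	return null;
}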

The overload defaults to UTF-8. If the file is actually in an ANSI encoding (a legacy code page such as GBK), reading fails with java.nio.charset.MalformedInputException.

/**
 * Opens a file and returns its text content.
 * 
 * @param filePath full path of the file
 * @return file content
 */
public static String openAsText(String filePath) {
	return openAsText(filePath, StandardCharsets.UTF_8);
}

This method is not suitable for very large files, because it may run out of memory. In that case, try the old approach.

/**
 * Opens a file the old way.
 * 
 * @param filePath full path of the file
 * @return file content
 */
public static String openAsTextOld(String filePath) {
	LOGGER.info("Reading file [{0}]", filePath);

	try {
		return byteStream2string(new FileInputStream(new File(filePath)));
	} catch (FileNotFoundException e) {
		LOGGER.warning(e);
		return null;
	}
}
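Any method that returns the whole file as one String still holds all of it in memory. For genuinely large files it is usually safer to process the content line by line. A minimal sketch, assuming the task is to count lines containing a keyword (the class and method names are illustrative, not from FileHelper):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

final class LargeFileSketch {
	/** Counts the lines that contain keyword, holding only one line in memory at a time. */
	static long countMatchingLines(Path file, Charset charset, String keyword) throws IOException {
		long count = 0;

		try (BufferedReader reader = Files.newBufferedReader(file, charset)) {
			String line;

			while ((line = reader.readLine()) != null) {
				if (line.contains(keyword))
					count++;
			}
		}

		return count;
	}
}
```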

There is also a method that returns the file content as a byte[].

/**
 * Returns the content of the given file as a byte array.
 * 
 * @param file file object
 * @return file bytes
 */
public static byte[] openAsByte(File file) {
	try {
		return inputStream2Byte(new FileInputStream(file));
	} catch (FileNotFoundException e) {
		LOGGER.warning(e);
		return null;
	}
}
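With the Java 7 API the same result is available through Files.readAllBytes(), which opens, reads, and closes the file in one call. A minimal sketch with illustrative names; like the method above, it loads the whole file into memory:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

final class ReadBytesSketch {
	/** Reads the whole file into a byte array; the file is closed before the method returns. */
	static byte[] openAsByte(String filePath) throws IOException {
		return Files.readAllBytes(Paths.get(filePath));
	}
}
```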

Writing Files

The Old Way

First, writing raw byte[] data rather than just text: use FileOutputStream directly.

/**
 * Saves file data.
 * 
 * @param file file object
 * @param data file data
 */
public static void save(File file, byte[] data) {
	try (OutputStream out = new FileOutputStream(file)) {
		out.write(data);
		out.flush();
	} catch (IOException e) {
		LOGGER.warning(e);
	}
}

Text works too, of course (convert it to byte[] with getBytes()), as in this example.

save(new File("c://temp/abc.txt"), "dfdf 你好".getBytes(StandardCharsets.UTF_8));

Writing through a wrapper method has the advantage of buffering.

/**
 * Saves a text file.
 * 
 * @param file file object
 * @param text text content
 */
public static void saveTextOld(File file, String text) {
	LOGGER.info("Saving file {0}, content:\n{1}", file.toString(), text);

	try (OutputStream out = new FileOutputStream(file);) {
		bytes2output(out, text.getBytes(StandardCharsets.UTF_8));
	} catch (IOException e) {
		LOGGER.warning(e);
	}
}

In fact, OutputStreamWriter already buffers internally when it encodes characters into bytes, so it is better to call it directly.

/**
 * Saves a text file.
 * 
 * @param file file object
 * @param text text content
 */
public static void saveTextOld(File file, String text) {
	LOGGER.info("Saving file {0}, content:\n{1}", file.toString(), text);

	try (OutputStream out = new FileOutputStream(file); OutputStreamWriter writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);) {
		writer.write(text);
	} catch (IOException e) {
		LOGGER.warning(e);
	}
}

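Another approach often mentioned in English-language posts is FileWriter, but it does not work well for Chinese: before Java 11 it always uses the platform default encoding, so the text comes out garbled whenever that default is not the encoding you expect.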
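To write text with an explicit encoding without going through FileWriter, Files.newBufferedWriter() accepts a Charset directly. A minimal sketch, with class and method names that are illustrative and not from FileHelper:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class WriterSketch {
	/** Writes text as UTF-8, creating the file or truncating an existing one. */
	static void saveText(Path file, String text) throws IOException {
		try (BufferedWriter writer = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
			writer.write(text);
		}
	}
}
```

Because the charset is fixed in code, the output no longer depends on the platform default.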

The New Way

Again, start with saving raw data. Files.write() wraps everything up, so a single call does the job.

/**
 * Saves file data.
 * 
 * @param file        file object
 * @param data        file data
 * @param isOverwrite whether to overwrite an existing file
 */
public static void save(File file, byte[] data, boolean isOverwrite) {
	LOGGER.info("Saving file " + file);

	try {
		if (!isOverwrite && file.exists())
			throw new IOException(file + " already exists and overwriting is disabled");

		if (file.isDirectory())
			throw new IOException(file + " must not be a directory; please specify a file");

		if (!file.exists())
			file.createNewFile();

		Files.write(file.toPath(), data);
	} catch (IOException e) {
		LOGGER.warning(e);
	}
}

With that in place, saving text is trivial.

/**
 * Saves text content.
 * 
 * @param file file object
 * @param text text content
 */
public static void saveText(File file, String text) {
	if (Version.isDebug) {
		String _text = text.length() > 200 ? text.substring(0, 200) + "..." : text;
		LOGGER.info("Saving file {0}, content:\n{1}", file.toString(), _text);
	} else
		LOGGER.info("Saving file {0}", file.toString());

	save(file, text.getBytes(StandardCharsets.UTF_8), true);
}

/**
 * Saves text content.
 * 
 * @param filePath file path
 * @param text     text content
 */
public static void saveText(String filePath, String text) {
	saveText(new File(filePath), text);
}
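The exists() check followed by a write leaves a small window in which another process could create the file. Files.write() can enforce the no-overwrite rule itself through StandardOpenOption.CREATE_NEW. A minimal sketch, with illustrative names that are not part of FileHelper:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class SaveSketch {
	/**
	 * Saves data to file. With overwrite = false, an existing file causes a
	 * FileAlreadyExistsException instead of being replaced.
	 */
	static void save(Path file, byte[] data, boolean overwrite) throws IOException {
		if (overwrite)
			Files.write(file, data); // default options: CREATE, TRUNCATE_EXISTING, WRITE
		else
			Files.write(file, data, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
	}
}
```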

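Creating Directories and Empty Files

Creating Directories

Note the difference between mkdir() and mkdirs(): mkdir() creates only the last directory in the path and fails if its parent is missing, while mkdirs() creates any missing parent directories as well.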

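/**
	 * Creates a directory, including any missing parent directories.
	 * 
	 * @param folder directory path
	 */
	public static void mkDir(String folder) {
		File _folder = new File(folder);
		if (!_folder.exists()) // create the directory only if it does not exist yet
			_folder.mkdirs(); // mkdirs() also creates missing parents, so no extra mkdir() call is needed
	}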

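	/**
	 * Creates the directory for a file path. The file name is stripped and the rest is treated as the directory, which is created if it does not exist.
	 * 
	 * @param filePath full path; the last element is the file name
	 */
	public static void mkDirByFileName(String filePath) {
		String[] arr = filePath.split("\\/|\\\\");
		arr[arr.length - 1] = "";// drop the file name by making the last element an empty string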
		String folder = String.join(SEPARATOR, arr);

		mkDir(folder);
	}

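	/**
	 * Checks whether the directory containing the file exists and creates it if not. Missing intermediate directories are created as well.
	 * 
	 * @param file must be a file, not a directory
	 */
	public static void initFolder(File file) {
		if (file.isDirectory())
			throw new IllegalArgumentException("The argument must be a file, not a directory");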

		mkDir(file.getParent());
	}

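	/**
	 * Checks whether the directory containing the file exists and creates it if not. Missing intermediate directories are created as well.
	 * 
	 * @param file must be a file, not a directory
	 */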
	public static void initFolder(String file) {
		initFolder(new File(file));
	}
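The Path-based counterpart is Files.createDirectories(), which creates all missing parents and does nothing if the directory already exists. A minimal sketch, with illustrative names that are not part of FileHelper:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class MkDirSketch {
	/** Makes sure the directory that will contain file exists, and returns that directory. */
	static Path initFolder(Path file) throws IOException {
		Path parent = file.toAbsolutePath().getParent();

		if (parent != null) // null only when file is a file system root
			Files.createDirectories(parent); // no error if it already exists

		return parent;
	}
}
```

Unlike mkdirs(), which only returns false, a failure here is reported as an IOException that says what went wrong.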

Creating Empty Files

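/**
	 * Returns a File object for a new, empty file.
	 * 
	 * @param folder   folder path; created automatically if it does not exist
	 * @param fileName name of the file to save
	 * @return File object for the new file
	 */
	public static File createFile(String folder, String fileName) {
		LOGGER.info("Creating file {0}", folder + SEPARATOR + fileName);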

		mkDir(folder);
		return new File(folder + SEPARATOR + fileName);
	}

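	/**
	 * Returns a File object for a new, empty file. Creates the parent directory if it does not exist and checks whether overwriting is allowed.
	 * 
	 * @param filePath    full file path; the last element is the file name
	 * @param isOverwrite whether to overwrite an existing file
	 * @return file object
	 * @throws IOException if the file already exists and overwriting is disabled
	 */
	public static File createFile(String filePath, boolean isOverwrite) throws IOException {
		LOGGER.info("Creating file {0}", filePath);

		mkDirByFileName(filePath);

		File file = new File(filePath);
		if (!isOverwrite && file.exists())
			throw new IOException("File already exists and overwriting is disabled");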

		return file;
	}
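Note that new File(...) only builds a path object; nothing appears on disk until the file is written. If a zero-length file should exist right away, Files.createFile() creates it. A minimal sketch, with illustrative names that are not part of FileHelper:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class CreateFileSketch {
	/**
	 * Creates an empty file on disk, creating missing parent directories first.
	 * With overwrite = false, an existing file causes a FileAlreadyExistsException.
	 */
	static Path createEmptyFile(Path file, boolean overwrite) throws IOException {
		Path parent = file.toAbsolutePath().getParent();

		if (parent != null)
			Files.createDirectories(parent);

		// Writing zero bytes creates the file or truncates an existing one
		return overwrite ? Files.write(file, new byte[0]) : Files.createFile(file);
	}
}
```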

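APIs Not Yet Included

These are collected here for now and will be added to the source repository when they are needed.

Getting a File's MIME Type

There are two ways to get a file's MIME (Multipurpose Internet Mail Extensions) type. The first is Files.probeContentType():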

/**
 * Returns the MIME type for a file name.
 * 
 * @param filename file name
 * @return MIME type
 */
public static String getMime(String filename) {
	Path path = Paths.get(filename);

	try {
		return Files.probeContentType(path);
	} catch (IOException e) {
		LOGGER.warning(e);
		return null;
	}
}

// test
assertEquals("text/html", getMime("C:\\foo\\bar.htm"));

The second is MimetypesFileTypeMap from the javax.activation package.

/**
 * Returns the MIME type for a file.
 * 
 * @param file file object
 * @return MIME type
 */
public static String getMime(File file) {
	String contentType = new MimetypesFileTypeMap().getContentType(file);

	if (file.getName().endsWith(".png"))
		contentType = "image/png"; // TODO needs?

	if (contentType == null)
		contentType = "application/octet-stream";

	return contentType;
}

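Neither approach is used here to inspect the file's content. MimetypesFileTypeMap works purely from the file name extension, and the default probeContentType() implementation usually does the same, although its detection strategy is platform-specific.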
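A third name-based option that needs no extra dependency is URLConnection.guessContentTypeFromName(), which looks the extension up in the JDK's built-in table. A minimal sketch; the class and method names and the fallback value are assumptions made for illustration:

```java
import java.net.URLConnection;

final class MimeSketch {
	/** Guesses the MIME type from the file name alone; the file is never opened. */
	static String guessMime(String fileName) {
		String type = URLConnection.guessContentTypeFromName(fileName);

		// The lookup returns null for extensions it does not know
		return type != null ? type : "application/octet-stream";
	}
}
```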

Walking a Directory Tree

The design is functional in style: the caller supplies the callback as a Predicate<Path> method.

/**
 * Walks an entire directory tree, recursively.
 * 
 * @param _dir   the directory to walk
 * @param method search function
 * @return search results
 * @throws IOException if the argument is not a directory
 */
public static List<Path> walkFileTree(String _dir, Predicate<Path> method) throws IOException {
	Path dir = Paths.get(_dir);

	if (!Files.isDirectory(dir))
		throw new IOException("Parameter [" + _dir + "] is not a directory; please specify a directory");

	List<Path> result = new LinkedList<>();
	Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
		@Override
		public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
			if (method.test(file))
				result.add(file);

			return FileVisitResult.CONTINUE;
		}
	});

	return result;
}

/**
 * Lists a directory without recursing into subdirectories.
 * 
 * @param _dir   the directory to list
 * @param method search function
 * @return search results
 * @throws IOException if the argument is not a directory
 */
public static List<Path> walkFile(String _dir, Predicate<Path> method) throws IOException {
	Path dir = Paths.get(_dir);

	if (!Files.isDirectory(dir))
		throw new IOException("Parameter [" + _dir + "] is not a directory; please specify a directory");

	List<Path> result = new LinkedList<>();
	try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
		for (Path e : stream) {
			if (method.test(e))
				result.add(e);
		}
	}

	return result;
}
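On Java 8 and later, both variants can be expressed with Files.walk() and a depth limit. A minimal sketch, with illustrative names that are not part of FileHelper; it returns regular files only, whereas the non-recursive method above also passes directories to the predicate:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class WalkSketch {
	/**
	 * Collects the regular files under dir that satisfy method.
	 * maxDepth = 1 looks at direct children only; Integer.MAX_VALUE walks the whole tree.
	 */
	static List<Path> find(Path dir, int maxDepth, Predicate<Path> method) throws IOException {
		// The stream holds open directory handles, so it must be closed
		try (Stream<Path> paths = Files.walk(dir, maxDepth)) {
			return paths.filter(Files::isRegularFile).filter(method).collect(Collectors.toList());
		}
	}
}
```

A call such as find(dir, Integer.MAX_VALUE, p -> p.toString().endsWith(".txt")) would collect every .txt file in the tree.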

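Detecting UTF-8

Files usually come in just two encodings, UTF-8 and GBK. UTF-8 files often carry a BOM, in which case the first three bytes are always the same (EF BB BF), so detecting the encoding from that pattern would seem to solve the problem. Unfortunately, our files have no BOM, so we have to scan the whole byte stream and count how many byte sequences fit UTF-8 and how many do not. If the former outnumber the latter, the file is very likely UTF-8. In our experience this heuristic is fairly accurate.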

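/**
 * Checks whether the data looks like UTF-8.
 * 
 * @param data file bytes
 * @return true if the data looks like UTF-8; pure ASCII data yields false because no multi-byte sequence is counted
 */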
public static boolean isUTF8(byte[] data) {
	int countGoodUtf = 0, countBadUtf = 0;
	byte currentByte = 0x00, previousByte = 0x00;

	for (int i = 1; i < data.length; i++) {
		currentByte = data[i];
		previousByte = data[i - 1];

		if ((currentByte & 0xC0) == 0x80) {
			if ((previousByte & 0xC0) == 0xC0)
				countGoodUtf++;
			else if ((previousByte & 0x80) == 0x00)
				countBadUtf++;
		} else if ((previousByte & 0xC0) == 0xC0)
			countBadUtf++;
	}

	return countGoodUtf > countBadUtf;
}
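When a strict yes/no answer is preferable to a statistical guess, the JDK's own decoder can validate the bytes. This is a different technique from the counting heuristic above: a minimal sketch that reports whether the whole array is well-formed UTF-8, with illustrative names that are not part of FileHelper:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class Utf8Sketch {
	/** Returns true only if every byte sequence in data is well-formed UTF-8. */
	static boolean isValidUtf8(byte[] data) {
		CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
				.onMalformedInput(CodingErrorAction.REPORT) // fail instead of substituting U+FFFD
				.onUnmappableCharacter(CodingErrorAction.REPORT);

		try {
			decoder.decode(ByteBuffer.wrap(data));
			return true;
		} catch (CharacterCodingException e) {
			return false;
		}
	}
}
```

Pure ASCII input is valid UTF-8 and therefore returns true here, which differs from isUTF8() above.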