Huffman compression in Java: a Java implementation of compression based on Huffman coding

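In a given file, some characters occur many times. Coding-based compression counts how often each character occurs and assigns each character a code according to its frequency. The total code length for the whole message is then generally shorter than the length of the original symbols, which is how compression is achieved. This article implements one such coding-based technique, Huffman coding, in Java.

1 Principle of Huffman coding

Huffman coding was introduced in 1952 for text files. It is a statistical coding method and a lossless compression code. Its code lengths vary: information that occurs frequently gets a short code, and information that occurs rarely gets a longer code. To construct a Huffman code, first build a weighted Huffman tree from the original data. The steps are:

(1) Sort the symbols of the source in decreasing order of probability.
(2) Add the two smallest probabilities together and treat the sum as the probability of a new symbol.
(3) Repeat steps 1 and 2 until the sum of the probabilities equals 1.
(4) At each merge, label the symbol with the larger probability 0 and the symbol with the smaller probability 1.
(5) Record the sequence of 0s and 1s on the path from the node with probability 1 to each source symbol. That sequence is the symbol's code.

Take the message "ABACCDA", which contains 4 distinct characters. A fixed two-bit code can represent A, B, C, D as 00, 01, 10, 11, and the message is then encoded as "00010010101100". This binary string is 14 bits long, far shorter than the original string at 7 × 8 = 56 bits. With Huffman coding we get A: 1; B: 000; C: 01; D: 001. The encoded string is 1000101010011, which is 13 bits long and shorter again than 14. In real documents many words recur, so the gap between the two encodings grows with the file size and with how often words repeat.

2 Program flow of the Huffman compression algorithm in Java

Figure 1: Program flow chart

3 Implementation

Based on the flow above and the principle of the Huffman algorithm, the program's classes are organized as follows.

3.1 Classes

Frame class: the framework class that implements the user interface. It also contains functions for reading the file, counting character frequencies, and sorting characters by frequency.
HuffmanNode class: defines the Huffman data structure, based on the properties of the Huffman algorithm (see Figure 2).
BuildHuffman class: initializes the Huffman tree and derives the codes. It contains the function that builds the tree (see the attached program for the implementation), the encoding function, the function that writes the binary file, the decoding function, and so on. See Figure 3 for its structure.

3.1.1 Problems that may arise in the Frame class

Step 1 of Figure 1, counting character frequencies, is straightforward for plain English letters and digits. Counting Chinese characters, or other characters made up of 2 bytes, can cause problems, because the stride used when scanning the file has to be controlled: a single-byte letter needs a stride of 1, while a two-byte character needs a stride of 2.

In the Unicode code charts, the range 4E00 to 9FFF holds Chinese, Japanese, and Korean ideographs, and other double-byte characters also have fixed ranges. Given these ranges, the problem can be solved by checking, as each character is read, whether it falls in a double-byte range. In addition, Java 1.5 and later provide Character.isHighSurrogate. This function reports whether a char is the leading half of a surrogate pair, that is, the first of the two 16-bit units Java uses to store one character outside the Basic Multilingual Plane. Checking it gives the correct stride for such characters, so the frequencies of all kinds of characters can be counted correctly (the code is in the appendix).

3.1.2 Constructing the HuffmanNode class

HuffmanNode is a typical Huffman-style data structure; its form is shown in Figure 2, which depicts the Huffman structure about to be built. Next, the characters, already sorted by frequency, need to be initialized as instances of this class (…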
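The stride problem described in 3.1.1 can be sketched as follows. This is a minimal illustration rather than the article's appendix code: the class name `CodePointFrequency` and the method `count` are hypothetical, the input is assumed to be already decoded into a Java `String`, and `Map.merge` requires Java 8 or later. The loop advances by 2 when it meets a high surrogate followed by a low surrogate, and by 1 otherwise.

```java
import java.util.Map;
import java.util.TreeMap;

class CodePointFrequency {
    /** Counts how often each Unicode code point occurs in the text. */
    static Map<Integer, Integer> count(String text) {
        Map<Integer, Integer> freq = new TreeMap<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            int codePoint = c;
            int step = 1;
            // A high surrogate followed by a low surrogate is one character
            // stored in two chars, so the stride becomes 2.
            if (Character.isHighSurrogate(c) && i + 1 < text.length()
                    && Character.isLowSurrogate(text.charAt(i + 1))) {
                codePoint = Character.toCodePoint(c, text.charAt(i + 1));
                step = 2;
            }
            freq.merge(codePoint, 1, Integer::sum);
            i += step;
        }
        return freq;
    }

    public static void main(String[] args) {
        // Sample: 'A', U+4E2D, 'A', U+1F600 (the last one is a surrogate pair).
        String sample = "A\u4E2DA\uD83D\uDE00";
        count(sample).forEach((cp, n) ->
                System.out.println(String.format("U+%04X", cp) + " -> " + n));
    }
}
```

Note that a character in the 4E00 to 9FFF range fits in a single Java `char`, so it is counted with a stride of 1 here; only characters stored as surrogate pairs need a stride of 2.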

The following Java code implements Huffman compression and decompression of files. The steps are:

## Huffman compression

1. Read the file and count how often each byte occurs
2. Build the Huffman tree from the frequencies
3. Traverse the Huffman tree and generate the code for each byte
4. Write the frequency table to the output file as a header, followed by the encoded data
5. Close the input and output streams

```java
import java.io.*;
import java.util.PriorityQueue;

public class HuffmanCompression {
    private static final int BYTE_SIZE = 8;

    public static void compress(String inputFile, String outputFile) throws IOException {
        // First pass: count how often each byte value occurs.
        int[] frequencies = new int[256];
        try (InputStream in = new BufferedInputStream(new FileInputStream(inputFile))) {
            int nextByte;
            while ((nextByte = in.read()) != -1) {
                frequencies[nextByte]++;
            }
        }
        HuffmanNode root = buildHuffmanTree(frequencies);
        String[] codes = buildCodes(root);
        // Second pass: reopen the input, because the first stream is already consumed.
        try (InputStream in = new BufferedInputStream(new FileInputStream(inputFile));
             DataOutputStream out = new DataOutputStream(
                     new BufferedOutputStream(new FileOutputStream(outputFile)))) {
            writeHeader(out, frequencies);
            writeCompressedData(in, out, codes);
        }
    }

    // Returns null when every frequency is zero (empty input).
    private static HuffmanNode buildHuffmanTree(int[] frequencies) {
        PriorityQueue<HuffmanNode> pq = new PriorityQueue<>();
        for (int i = 0; i < frequencies.length; i++) {
            if (frequencies[i] > 0) {
                pq.add(new HuffmanNode((byte) i, frequencies[i]));
            }
        }
        // Repeatedly merge the two least frequent nodes.
        while (pq.size() > 1) {
            HuffmanNode left = pq.poll();
            HuffmanNode right = pq.poll();
            pq.add(new HuffmanNode(left, right));
        }
        return pq.poll();
    }

    private static String[] buildCodes(HuffmanNode root) {
        String[] codes = new String[256];
        if (root == null) {
            return codes; // empty input: nothing to encode
        }
        if (root.isLeaf()) {
            codes[root.getByte()] = "0"; // a single distinct byte still needs a one-bit code
        } else {
            buildCodesHelper(codes, root, "");
        }
        return codes;
    }

    private static void buildCodesHelper(String[] codes, HuffmanNode node, String code) {
        if (node.isLeaf()) {
            codes[node.getByte()] = code;
        } else {
            buildCodesHelper(codes, node.getLeft(), code + "0");
            buildCodesHelper(codes, node.getRight(), code + "1");
        }
    }

    // Writes each frequency as a 4-byte int so counts above 255 are not truncated.
    private static void writeHeader(DataOutputStream out, int[] frequencies) throws IOException {
        for (int frequency : frequencies) {
            out.writeInt(frequency);
        }
    }

    private static void writeCompressedData(InputStream in, OutputStream out, String[] codes)
            throws IOException {
        int currentByte = 0;
        int numBits = 0;
        int nextByte;
        while ((nextByte = in.read()) != -1) {
            String code = codes[nextByte];
            for (int i = 0; i < code.length(); i++) {
                currentByte = currentByte << 1;
                if (code.charAt(i) == '1') {
                    currentByte |= 1;
                }
                numBits++;
                if (numBits == BYTE_SIZE) {
                    out.write(currentByte);
                    currentByte = 0;
                    numBits = 0;
                }
            }
        }
        // Pad the final partial byte with zero bits on the right.
        if (numBits > 0) {
            currentByte = currentByte << (BYTE_SIZE - numBits);
            out.write(currentByte);
        }
    }

    private static class HuffmanNode implements Comparable<HuffmanNode> {
        private byte b;
        private int frequency;
        private HuffmanNode left;
        private HuffmanNode right;

        public HuffmanNode(byte b, int frequency) {
            this.b = b;
            this.frequency = frequency;
        }

        public HuffmanNode(HuffmanNode left, HuffmanNode right) {
            this.frequency = left.frequency + right.frequency;
            this.left = left;
            this.right = right;
        }

        public int getByte() { return b & 0xff; }
        public int getFrequency() { return frequency; }
        public boolean isLeaf() { return left == null && right == null; }
        public HuffmanNode getLeft() { return left; }
        public HuffmanNode getRight() { return right; }

        @Override
        public int compareTo(HuffmanNode o) {
            return Integer.compare(this.frequency, o.frequency);
        }
    }
}
```

## Huffman decompression

1. Read the file header and rebuild the Huffman tree
2. Read the compressed data and decode it with the Huffman tree
3. Write the decoded data to the output file
4. Close the input and output streams

```java
import java.io.*;
import java.util.PriorityQueue;

public class HuffmanDecompression {
    private static final int BYTE_SIZE = 8;

    public static void decompress(String inputFile, String outputFile) throws IOException {
        try (DataInputStream in = new DataInputStream(
                     new BufferedInputStream(new FileInputStream(inputFile)));
             OutputStream out = new BufferedOutputStream(new FileOutputStream(outputFile))) {
            // Header: 256 four-byte frequency counts, as written by the compressor.
            int[] frequencies = new int[256];
            for (int i = 0; i < frequencies.length; i++) {
                frequencies[i] = in.readInt();
            }
            HuffmanNode root = buildHuffmanTree(frequencies);
            if (root == null) {
                return; // the original file was empty
            }
            // The root frequency equals the number of bytes in the original file.
            int remaining = root.getFrequency();
            if (root.isLeaf()) {
                // Only one distinct byte value: repeat it.
                while (remaining-- > 0) {
                    out.write(root.getByte());
                }
                return;
            }
            HuffmanNode node = root;
            int nextByte;
            // Stop once every original byte is restored, so padding bits are ignored.
            while (remaining > 0 && (nextByte = in.read()) != -1) {
                for (int i = 0; i < BYTE_SIZE && remaining > 0; i++) {
                    int bit = (nextByte >> (BYTE_SIZE - 1 - i)) & 1;
                    node = (bit == 0) ? node.getLeft() : node.getRight();
                    if (node.isLeaf()) {
                        out.write(node.getByte());
                        node = root;
                        remaining--;
                    }
                }
            }
        }
    }

    // Must build the tree exactly as the compressor does so the codes match.
    private static HuffmanNode buildHuffmanTree(int[] frequencies) {
        PriorityQueue<HuffmanNode> pq = new PriorityQueue<>();
        for (int i = 0; i < frequencies.length; i++) {
            if (frequencies[i] > 0) {
                pq.add(new HuffmanNode((byte) i, frequencies[i]));
            }
        }
        while (pq.size() > 1) {
            HuffmanNode left = pq.poll();
            HuffmanNode right = pq.poll();
            pq.add(new HuffmanNode(left, right));
        }
        return pq.poll();
    }

    private static class HuffmanNode implements Comparable<HuffmanNode> {
        private byte b;
        private int frequency;
        private HuffmanNode left;
        private HuffmanNode right;

        public HuffmanNode(byte b, int frequency) {
            this.b = b;
            this.frequency = frequency;
        }

        public HuffmanNode(HuffmanNode left, HuffmanNode right) {
            this.frequency = left.frequency + right.frequency;
            this.left = left;
            this.right = right;
        }

        public int getByte() { return b & 0xff; }
        public int getFrequency() { return frequency; }
        public boolean isLeaf() { return left == null && right == null; }
        public HuffmanNode getLeft() { return left; }
        public HuffmanNode getRight() { return right; }

        @Override
        public int compareTo(HuffmanNode o) {
            return Integer.compare(this.frequency, o.frequency);
        }
    }
}
```

That is the Java code for Huffman file compression and decompression. You can modify and optimize it to suit your needs.
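As a quick sanity check on the "ABACCDA" example from the article, the sketch below computes the total encoded length from the character counts alone (A = 3, B = 1, C = 2, D = 1). It relies on the fact that each merge of two nodes adds one bit to every symbol beneath the merged node, so the total length is the sum of all merged weights. The class name `HuffmanLengthCheck` and the method `encodedBits` are hypothetical names introduced for illustration, and the one-bit code for a lone symbol is an assumption of this sketch.

```java
import java.util.PriorityQueue;

class HuffmanLengthCheck {
    /** Returns the total number of bits a Huffman code needs for the given symbol counts. */
    static int encodedBits(int[] counts) {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        for (int c : counts) {
            if (c > 0) {
                pq.add(c);
            }
        }
        if (pq.size() == 1) {
            return pq.peek(); // assumption: a lone symbol gets a one-bit code
        }
        int total = 0;
        // Each merge lengthens the code of every symbol below it by one bit,
        // so the total length is the sum of the merged weights.
        while (pq.size() > 1) {
            int merged = pq.poll() + pq.poll();
            total += merged;
            pq.add(merged);
        }
        return total;
    }

    public static void main(String[] args) {
        // "ABACCDA": A = 3, B = 1, C = 2, D = 1
        System.out.println(encodedBits(new int[] {3, 1, 2, 1})); // prints 13
    }
}
```

The result of 13 bits matches the length of the encoded string 1000101010011 given in the article.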