I. Learning Content
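Huffman coding: a variable-length coding scheme in which more frequent characters receive shorter codes. In a Huffman tree, a leaf with a smaller weight is never closer to the root than a leaf with a larger weight. The Huffman tree is an optimal binary tree: among all binary trees with the same leaf weights, it has the minimum weighted path length.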
Related links: "Huffman编码" (Huffman Coding), =-=-='s blog, CSDN
"Huffman编码算法详解" (Huffman Coding Algorithm Explained), qinglongzhan's blog, CSDN
Example: given the weight set {2,3,4,7,8,9}, construct a Huffman tree.
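The example above can be worked through in code. The following is a minimal sketch, not the implementation used later in this post: it repeatedly merges the two lightest nodes with a `PriorityQueue` and then computes the weighted path length. The class name `HuffmanWeightDemo` and its methods are hypothetical names introduced only for this illustration.

```java
import java.util.PriorityQueue;

/** Hypothetical demo class; all names here are illustrative only. */
class HuffmanWeightDemo {
    /** A minimal tree node holding a weight and two children. */
    static class Node {
        final int weight;
        final Node left;
        final Node right;

        Node(int weight, Node left, Node right) {
            this.weight = weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    /** Repeatedly merges the two lightest nodes until a single root remains. */
    static Node build(int[] weights) {
        PriorityQueue<Node> queue =
                new PriorityQueue<>((a, b) -> Integer.compare(a.weight, b.weight));
        for (int w : weights) {
            queue.add(new Node(w, null, null));
        }
        while (queue.size() > 1) {
            Node first = queue.poll();
            Node second = queue.poll();
            queue.add(new Node(first.weight + second.weight, first, second));
        }
        return queue.poll(); // null if no weights were given
    }

    /** Weighted path length: the sum of (leaf weight * leaf depth). */
    static int weightedPathLength(Node node, int depth) {
        if (node == null) {
            return 0;
        }
        if (node.isLeaf()) {
            return node.weight * depth;
        }
        return weightedPathLength(node.left, depth + 1)
                + weightedPathLength(node.right, depth + 1);
    }

    public static void main(String[] args) {
        Node root = build(new int[] {2, 3, 4, 7, 8, 9});
        System.out.println("Root weight: " + root.weight);         // 33
        System.out.println("WPL: " + weightedPathLength(root, 0)); // 80
    }
}
```

The merges are 2+3=5, 4+5=9, 7+8=15, 9+9=18 and 15+18=33. The two nodes of weight 9 may be ordered either way, so the tree shape is not unique, but the weighted path length is 80 in every case.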
The following study points come from Teacher Min (闵老师):
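1. An inner class is defined. In a real project I would put it in its own file; it is nested here purely for convenience.
2. Each node holds: a character (valid only for leaf nodes), a weight (an integer, the number of occurrences of that character), and references to its child nodes and its parent node. The reference to the parent is required here, because it is used to construct the Huffman code of each character.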
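3. NUM_CHARS is the number of characters in the supported character set, 256 in the code (the full 8-bit range; 7-bit ASCII proper defines 128). For convenience, only ASCII is supported.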
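4. inputText is introduced only to split the program into modules that are as independent as possible, which makes it easier to study and debug.
5. alphabet stores only the characters that actually occur in inputText.
6. alphabetLength could be replaced entirely by alphabet.length, but I prefer to keep it as a separate variable.
7. charCounts is responsible for all nodes, leaves and internal nodes alike; its elements correspond to weight in HuffmanNode. To save space, one of the two could be dropped.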
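8. charMapping maps a character's position in ASCII to its position in alphabet. This is also why I use only the ASCII character set (just 256 characters): the mapping fits in a small array.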
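9. huffmanCodes maps each character to a string. Strictly speaking it should be a binary bit string, but I took the lazy route here.
10. nodes: all nodes are first stored in an array and only then linked together. This is a common trick.
11. The constructor only initializes charMapping and reads in the file.
12. readText takes the simplest, most brute-force approach. Other ways of reading line by line are also possible.
13. You need to prepare a text file yourself, containing a string such as abcdedgsgs or a few lines of English text.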
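Points 5 to 8 describe how alphabet, charCounts and charMapping relate to each other. The sketch below shows one way these three arrays could be filled from inputText; it is an assumption about how the later steps might look, not code from the original post. The class name `AlphabetSketch` and the method `constructAlphabet` are hypothetical, and using -1 to mark "character not in the alphabet" is a choice made here.

```java
import java.util.Arrays;

/** Hypothetical sketch; class and method names are illustrative only. */
class AlphabetSketch {
    static final int NUM_CHARS = 256;

    char[] alphabet;
    int alphabetLength;
    int[] charCounts;
    int[] charMapping = new int[NUM_CHARS];

    /** Counts every character once, then packs the ones that occur. */
    void constructAlphabet(String inputText) {
        Arrays.fill(charMapping, -1); // -1 means "not in the alphabet"

        // Count occurrences, indexed by character code.
        int[] rawCounts = new int[NUM_CHARS];
        for (int i = 0; i < inputText.length(); i++) {
            char c = inputText.charAt(i);
            if (c >= NUM_CHARS) {
                throw new IllegalArgumentException("Unsupported character: " + c);
            }
            rawCounts[c]++;
        }

        // The alphabet holds only characters that actually occur.
        alphabetLength = 0;
        for (int count : rawCounts) {
            if (count > 0) {
                alphabetLength++;
            }
        }

        alphabet = new char[alphabetLength];
        // Room for the leaves plus the internal nodes added later.
        charCounts = new int[Math.max(2 * alphabetLength - 1, 0)];
        int next = 0;
        for (int c = 0; c < NUM_CHARS; c++) {
            if (rawCounts[c] > 0) {
                alphabet[next] = (char) c;
                charCounts[next] = rawCounts[c];
                charMapping[c] = next;
                next++;
            }
        }
    }

    public static void main(String[] args) {
        AlphabetSketch sketch = new AlphabetSketch();
        sketch.constructAlphabet("abcdedgsgs");
        System.out.println(new String(sketch.alphabet));        // abcdegs
        System.out.println(Arrays.toString(sketch.charCounts)); // [1, 1, 1, 2, 1, 2, 2, 0, 0, 0, 0, 0, 0]
        System.out.println(sketch.charMapping['g']);            // 5
    }
}
```

For the sample string abcdedgsgs, seven distinct characters occur, so charCounts has 2 * 7 - 1 = 13 slots; the last six stay at zero until internal nodes are created.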
————————————————
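Copyright notice: This is an original article by CSDN blogger "minfanphd", licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.
Original link: https://blog.csdn.net/minfanphd/article/details/116975721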
package datastructure.tree;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.stream.Collectors;

/**
 * Huffman tree, encoding, and decoding. For simplicity, only ASCII characters
 * are supported.
 *
 * @author Fan Min minfanphd@163.com.
 */
public class Huffman {

    /**
     * An inner class for Huffman nodes.
     */
    class HuffmanNode {
        /**
         * The char. Only valid for leaf nodes.
         */
        char character;

        /**
         * Weight. It can also be double.
         */
        int weight;

        /**
         * The left child.
         */
        HuffmanNode leftChild;

        /**
         * The right child.
         */
        HuffmanNode rightChild;

        /**
         * The parent. It helps constructing the Huffman code of each character.
         */
        HuffmanNode parent;

        /**
         *******************
         * The first constructor
         *******************
         */
        public HuffmanNode(char paraCharacter, int paraWeight, HuffmanNode paraLeftChild,
                HuffmanNode paraRightChild, HuffmanNode paraParent) {
            character = paraCharacter;
            weight = paraWeight;
            leftChild = paraLeftChild;
            rightChild = paraRightChild;
            parent = paraParent;
        }// Of HuffmanNode

        /**
         *******************
         * To string.
         *******************
         */
        public String toString() {
            String resultString = "(" + character + ", " + weight + ")";
            return resultString;
        }// Of toString
    }// Of class HuffmanNode

    /**
     * The number of characters. 256 for ASCII.
     */
    public static final int NUM_CHARS = 256;

    /**
     * The input text. It is stored in a string for simplicity.
     */
    String inputText;

    /**
     * The length of the alphabet, also the number of leaves.
     */
    int alphabetLength;

    /**
     * The alphabet.
     */
    char[] alphabet;

    /**
     * The count of chars. The length is 2 * alphabetLength - 1 to include
     * non-leaf nodes.
     */
    int[] charCounts;

    /**
     * The mapping of chars to the indices in the alphabet.
     */
    int[] charMapping;

    /**
     * Codes for each char in the alphabet. It should have the same length as
     * alphabet.
     */
    String[] huffmanCodes;

    /**
     * All nodes. The last node is the root.
     */
    HuffmanNode[] nodes;

    /**
     *********************
     * The first constructor.
     *
     * @param paraFilename
     *            The text filename.
     *********************
     */
    public Huffman(String paraFilename) {
        charMapping = new int[NUM_CHARS];

        readText(paraFilename);
    }// Of the first constructor

    /**
     *********************
     * Read the text file into inputText.
     *
     * @param paraFilename
     *            The text filename.
     *********************
     */
    public void readText(String paraFilename) {
        try {
            inputText = Files.newBufferedReader(Paths.get(paraFilename), StandardCharsets.UTF_8)
                    .lines().collect(Collectors.joining("\n"));
        } catch (Exception ee) {
            System.out.println(ee);
            // Exit with a non-zero status to signal that reading failed.
            System.exit(1);
        } // Of try

        System.out.println("The text is:\r\n" + inputText);
    }// Of readText
}
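
The class above stops after reading the text. As a rough sketch of where the nodes array and the parent reference come in, the standalone example below stores all 2n - 1 nodes in one array, links them so that the last node is the root, and then builds each code by walking from a leaf up through parent. This is an assumption about how the remaining steps could look, not the author's code: `HuffmanLinkSketch`, `constructTree` and `generateCodes` are hypothetical names, the '*' placeholder for internal nodes is arbitrary, and the sketch assumes a non-empty alphabet.

```java
/** Hypothetical sketch; names are illustrative and not from the original post. */
class HuffmanLinkSketch {
    static class Node {
        char character;
        int weight;
        Node leftChild, rightChild, parent;

        Node(char character, int weight) {
            this.character = character;
            this.weight = weight;
        }
    }

    /** Builds all 2n - 1 nodes in one array; the last one is the root. Assumes n >= 1. */
    static Node[] constructTree(char[] alphabet, int[] counts) {
        int n = alphabet.length;
        Node[] nodes = new Node[2 * n - 1];
        for (int i = 0; i < n; i++) {
            nodes[i] = new Node(alphabet[i], counts[i]);
        }

        for (int i = n; i < nodes.length; i++) {
            // Find the two lightest nodes that do not have a parent yet.
            int first = -1, second = -1;
            for (int j = 0; j < i; j++) {
                if (nodes[j].parent != null) {
                    continue;
                }
                if (first == -1 || nodes[j].weight < nodes[first].weight) {
                    second = first;
                    first = j;
                } else if (second == -1 || nodes[j].weight < nodes[second].weight) {
                    second = j;
                }
            }
            // Link them under a new internal node ('*' is an arbitrary placeholder).
            nodes[i] = new Node('*', nodes[first].weight + nodes[second].weight);
            nodes[i].leftChild = nodes[first];
            nodes[i].rightChild = nodes[second];
            nodes[first].parent = nodes[i];
            nodes[second].parent = nodes[i];
        }
        return nodes;
    }

    /** Walks from each leaf to the root: left edge gives '0', right edge gives '1'. */
    static String[] generateCodes(Node[] nodes, int n) {
        String[] codes = new String[n];
        for (int i = 0; i < n; i++) {
            StringBuilder code = new StringBuilder();
            Node current = nodes[i];
            while (current.parent != null) {
                code.append(current == current.parent.leftChild ? '0' : '1');
                current = current.parent;
            }
            // Bits were collected leaf-to-root, so reverse them.
            codes[i] = code.reverse().toString();
        }
        return codes;
    }

    public static void main(String[] args) {
        char[] alphabet = "abcdegs".toCharArray();
        int[] counts = {1, 1, 1, 2, 1, 2, 2};
        Node[] nodes = constructTree(alphabet, counts);
        String[] codes = generateCodes(nodes, alphabet.length);
        for (int i = 0; i < alphabet.length; i++) {
            System.out.println(alphabet[i] + ": " + codes[i]);
        }
    }
}
```

The linear scan for the two lightest parentless nodes keeps the sketch close to the array-based layout described in the notes; a priority queue would do the same job faster for large alphabets. With a single-character alphabet the loop never runs and that character's code is the empty string, which a real implementation would need to handle.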