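Day 28: Huffman Coding (Node Definition and File Reading)
From here on things started to feel genuinely tricky. A lot of this was new to me, especially the file-handling part, so I had to look up quite a bit.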
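What is a Huffman tree?
To understand what a Huffman tree is, a few concepts need to be clear first:
Path: the sequence of branches leading from one node of the tree to another.
Path length: the number of branches on a path.
Path length of a tree: the sum of the path lengths from the root to every node. (Among binary trees with the same number of nodes, a complete binary tree has the shortest path length.)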
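To make the remark about complete binary trees concrete, here is a small sketch that computes the path length of two 7-node trees: a complete one and a degenerate chain. The names `PathLengthDemo`, `pathLength`, `perfectTree`, and `chain` are made up for this illustration and are not part of the `Huffman` class in this post.

```java
class PathLengthDemo {
    /** A minimal binary tree node, used only in this sketch. */
    static class Node {
        final Node left;
        final Node right;

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Sums the depths of all nodes in the subtree whose root sits at the given depth. */
    static int pathLength(Node node, int depth) {
        if (node == null) {
            return 0;
        }
        return depth + pathLength(node.left, depth + 1) + pathLength(node.right, depth + 1);
    }

    /** Builds a binary tree in which every one of the given levels is full. */
    static Node perfectTree(int levels) {
        if (levels == 0) {
            return null;
        }
        return new Node(perfectTree(levels - 1), perfectTree(levels - 1));
    }

    /** Builds a degenerate tree in which every node has only a left child. */
    static Node chain(int size) {
        Node node = null;
        for (int i = 0; i < size; i++) {
            node = new Node(node, null);
        }
        return node;
    }

    public static void main(String[] args) {
        // 7 nodes, three full levels: 0 + 2 * 1 + 4 * 2 = 10
        System.out.println(pathLength(perfectTree(3), 0));
        // 7 nodes in a chain: 0 + 1 + 2 + 3 + 4 + 5 + 6 = 21
        System.out.println(pathLength(chain(7), 0));
    }
}
```

Both trees have the same number of nodes, but the chain pushes nodes deeper, so its path length is more than twice as large.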
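Now consider nodes that carry weights.
Weighted path length of a node: the path length from that node to the root multiplied by the node's weight.
Weighted path length of a tree (WPL): the sum of the weighted path lengths of all leaf nodes in the tree.
Given n weights {w1, w2, ..., wn}, build a binary tree with n leaf nodes in which the i-th leaf carries weight wi. Among all such trees, the one with the smallest WPL is called the optimal binary tree, or Huffman tree.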
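As a rough illustration of the definition, the sketch below builds a tree with the usual greedy approach (repeatedly merge the two lightest subtrees) and then computes its WPL. It uses `java.util.PriorityQueue` for brevity, which is an assumption of this sketch and not necessarily how the `Huffman` class in this post builds its tree. The class `HuffmanWplDemo` and the weights {5, 2, 4, 7} are hypothetical.

```java
import java.util.PriorityQueue;

class HuffmanWplDemo {
    /** Node of the sketch tree. Leaves carry the original weights. */
    static class Node {
        final int weight;
        final Node left;
        final Node right;

        Node(int weight, Node left, Node right) {
            this.weight = weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    /** Greedy construction: repeatedly merge the two lightest subtrees. */
    static Node build(int[] weights) {
        PriorityQueue<Node> queue = new PriorityQueue<>((a, b) -> Integer.compare(a.weight, b.weight));
        for (int weight : weights) {
            queue.add(new Node(weight, null, null));
        }
        while (queue.size() > 1) {
            Node first = queue.poll();
            Node second = queue.poll();
            queue.add(new Node(first.weight + second.weight, first, second));
        }
        return queue.poll(); // null when no weights were given
    }

    /** WPL: the sum over all leaves of weight times depth. */
    static int wpl(Node node, int depth) {
        if (node == null) {
            return 0;
        }
        if (node.isLeaf()) {
            return node.weight * depth;
        }
        return wpl(node.left, depth + 1) + wpl(node.right, depth + 1);
    }

    public static void main(String[] args) {
        int[] weights = {5, 2, 4, 7}; // hypothetical weights
        // Leaves end up at depths 7 -> 1, 5 -> 2, 2 -> 3, 4 -> 3.
        System.out.println(wpl(build(weights), 0)); // 7 + 10 + 6 + 12 = 35
    }
}
```

For comparison, placing all four leaves at depth 2 gives a WPL of 2 * (5 + 2 + 4 + 7) = 36, so moving the heaviest leaf closer to the root pays off.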
package day17;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.stream.Collectors;
public class Huffman {

    /**
     * An inner class for Huffman nodes.
     */
    class HuffmanNode {
        /**
         * The char. Only valid for leaf nodes.
         */
        char character;

        /**
         * Weight. It can also be double.
         */
        int weight;

        /**
         * The left child.
         */
        HuffmanNode leftChild;

        /**
         * The right child.
         */
        HuffmanNode rightChild;

        /**
         * The parent. It helps constructing the Huffman code of each character.
         */
        HuffmanNode parent;

        /**
         *******************
         * The first constructor
         *******************
         */
        public HuffmanNode(char paraCharacter, int paraWeight, HuffmanNode paraLeftChild,
                HuffmanNode paraRightChild, HuffmanNode paraParent) {
            character = paraCharacter;
            weight = paraWeight;
            leftChild = paraLeftChild;
            rightChild = paraRightChild;
            parent = paraParent;
        }// Of HuffmanNode

        /**
         *******************
         * To string.
         *******************
         */
        @Override
        public String toString() {
            String resultString = "(" + character + ", " + weight + ")";
            return resultString;
        }// Of toString
    }// Of class HuffmanNode

    /**
     * The number of characters. 256 covers the 8-bit (extended ASCII) range;
     * plain ASCII only defines 128.
     */
    public static final int NUM_CHARS = 256;

    /**
     * The input text. It is stored in a string for simplicity.
     */
    String inputText;

    /**
     * The length of the alphabet, also the number of leaves.
     */
    int alphabetLength;

    /**
     * The alphabet.
     */
    char[] alphabet;

    /**
     * The count of chars. The length is 2 * alphabetLength - 1 to include non-leaf
     * nodes.
     */
    int[] charCounts;

    /**
     * The mapping of chars to the indices in the alphabet.
     */
    int[] charMapping;

    /**
     * Codes for each char in the alphabet. It should have the same length as
     * alphabet.
     */
    String[] huffmanCodes;

    /**
     * All nodes. The last node is the root.
     */
    HuffmanNode[] nodes;

    /**
     *********************
     * The first constructor.
     *
     * @param paraFilename The text filename.
     *********************
     */
    public Huffman(String paraFilename) {
        charMapping = new int[NUM_CHARS];

        readText(paraFilename);
    }// Of the first constructor

    /**
     *********************
     * Read text.
     *
     * @param paraFilename The text filename.
     *********************
     */
    public void readText(String paraFilename) {
        try {
            // readAllLines opens and closes the file itself, so no reader is left open.
            inputText = Files.readAllLines(Paths.get(paraFilename), StandardCharsets.UTF_8).stream()
                    .collect(Collectors.joining("\n"));
        } catch (Exception ee) {
            System.out.println(ee);
            // Exit with a non-zero status to signal that reading failed.
            System.exit(1);
        } // Of try

        System.out.println("The text is:\r\n" + inputText);
    }// Of readText

    public static void main(String[] args) {
        Huffman tempHuffman = new Huffman("E:/Jialuofei/huffmantext-small.txt");
    }// Of main
}// Of class Huffman
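Since the file-reading part was the unfamiliar bit, here is a standalone sketch of one common way to read a whole text file into a string: try-with-resources closes the reader automatically, even when an exception is thrown. `ReadTextDemo`, `readAll`, and `roundTrip` are names invented for this illustration. The demo writes a temporary file so that it does not depend on any particular path.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;

class ReadTextDemo {
    /** Reads a UTF-8 text file and joins its lines with "\n". */
    static String readAll(Path path) {
        // The reader declared in the parentheses is closed when the block exits.
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            // Rethrow as an unchecked exception so callers need no throws clause.
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the text to a temporary file, reads it back, then deletes the file. */
    static String roundTrip(String text) {
        try {
            Path temp = Files.createTempFile("huffman-demo", ".txt");
            try {
                Files.write(temp, text.getBytes(StandardCharsets.UTF_8));
                return readAll(temp);
            } finally {
                Files.deleteIfExists(temp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Prints "abracadabra" and "huffman" on two lines.
        System.out.println(roundTrip("abracadabra\r\nhuffman\n"));
    }
}
```

Note that reading line by line normalizes line endings: "\r\n" becomes "\n" and a trailing newline is dropped. For Huffman coding this means the encoded text may differ slightly from the bytes in the file.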
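This isn't the complete program, only today's portion of the code. The readText method confused me at first; it only made sense after I went through it piece by piece, including how try/catch handles exceptions. I then wrote an arbitrary text file, passed in its path, and got the following result: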