Three Hundred Lines of Java a Day (日撸Java三百行), day 29: Huffman coding, building the tree

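This article walks through the construction of a Huffman tree: the bottom-up idea behind it, initializing the alphabet by scanning the input string, and building the binary tree step by step. The program first counts character occurrences, then creates a leaf node for each character from its frequency, repeatedly selects the minimum-weight nodes to build the tree, and finally returns the root.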

1. Building the Huffman tree

1.1 The idea behind Huffman tree construction

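A Huffman tree minimizes the weighted path length of the tree, which is why it is built bottom-up, so that nodes with larger weights end up closer to the root. Taking a binary Huffman tree as the example: select the two nodes with the smallest weights among those that have no parent yet, create a parent node and link both to it (by convention the smaller one becomes the left child and the larger one the right child), set the parent's weight to the sum of its children's weights, and repeat until all nodes are linked into a single tree.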
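
As a minimal sketch of the greedy rule described above, the following uses java.util.PriorityQueue instead of the array scan used later in this post. The class name HuffmanMergeSketch, the method weightedPathLength, and the sample weights are illustrative assumptions. It repeatedly merges the two smallest weights and adds up the merged weights, which equals the weighted path length of the resulting tree.

```java
import java.util.PriorityQueue;

class HuffmanMergeSketch {
	// Returns the weighted path length of a Huffman tree over the given leaf weights.
	// Hypothetical helper for illustration, not part of the Huffman class in this post.
	static long weightedPathLength(int[] weights) {
		PriorityQueue<Long> queue = new PriorityQueue<>();
		for (int weight : weights) {
			queue.add((long) weight);
		}

		long total = 0;
		while (queue.size() > 1) {
			// Take the two smallest weights that have no parent yet.
			long left = queue.poll();
			long right = queue.poll();
			// The parent's weight is the sum of its children's weights.
			long parent = left + right;
			total += parent;
			queue.add(parent);
		}
		return total;
	}

	public static void main(String[] args) {
		int[] weights = {5, 9, 12, 13, 16, 45};
		System.out.println(weightedPathLength(weights)); // prints 224
	}
}
```

With these weights the merges produce 14, 25, 30, 55, and 100 in that order, so the heaviest leaf (45) is merged last and sits directly under the root.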

1.2 Building the alphabet

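  • Initialize charMapping and create tempCharCounts; scan the input string and, for each character, increment the entry of tempCharCounts at the index given by its ASCII code;
  • Scan tempCharCounts and count its nonzero entries to set alphabetLength;
  • Allocate alphabet and charCounts based on alphabetLength (charCounts gets 2 * alphabetLength - 1 entries so that it can also hold the weights of the internal nodes created later). For each nonzero entry of tempCharCounts, append the character with that ASCII code to alphabet, record its count in charCounts, and store its index in alphabet at the position of charMapping given by its ASCII code.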
public void constructAlphabet() {
	// Initialize.
	Arrays.fill(charMapping, -1);

	// The count for each char. At most NUM_CHARS chars.
	int[] tempCharCounts = new int[NUM_CHARS];

	// The index of the char in the ASCII charset.
	int tempCharIndex;

	// Step 1. Scan the string to obtain the counts.
	char tempChar;
	for (int i = 0; i < inputText.length(); i++) {
		tempChar = inputText.charAt(i);
		tempCharIndex = (int) tempChar;

		System.out.print("" + tempCharIndex + " ");

		tempCharCounts[tempCharIndex]++;
	} // Of for i

	// Step 2. Scan to determine the size of the alphabet.
	alphabetLength = 0;
	for (int i = 0; i < NUM_CHARS; i++) {
		if (tempCharCounts[i] > 0) {
			alphabetLength++;
		} // Of if
	} // Of for i

	// Step 3. Compress to the alphabet
	alphabet = new char[alphabetLength];
	charCounts = new int[2 * alphabetLength - 1];

	int tempCounter = 0;
	for (int i = 0; i < NUM_CHARS; i++) {
		if (tempCharCounts[i] > 0) {
			alphabet[tempCounter] = (char) i;
			charCounts[tempCounter] = tempCharCounts[i];
			charMapping[i] = tempCounter;
			tempCounter++;
		} // Of if
	} // Of for i

	System.out.println("The alphabet is: " + Arrays.toString(alphabet));
	System.out.println("Their counts are: " + Arrays.toString(charCounts));
	System.out.println("The char mappings are: " + Arrays.toString(charMapping));
}// Of constructAlphabet

Arrays.fill(): fills an array, or a specified index range of it, with a given value.
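
To illustrate the two forms of Arrays.fill() mentioned above, here is a small sketch. The class name ArraysFillSketch, the method buildMapping, and the array size are assumptions chosen for the example; only the two java.util.Arrays.fill overloads come from the standard library.

```java
import java.util.Arrays;

class ArraysFillSketch {
	// Hypothetical helper: builds a small array to show both overloads.
	static int[] buildMapping() {
		int[] mapping = new int[8];
		// Fill the whole array, as constructAlphabet() does with charMapping.
		Arrays.fill(mapping, -1);
		// Fill only indices 2 (inclusive) to 5 (exclusive).
		Arrays.fill(mapping, 2, 5, 0);
		return mapping;
	}

	public static void main(String[] args) {
		System.out.println(Arrays.toString(buildMapping()));
		// prints [-1, -1, 0, 0, 0, -1, -1, -1]
	}
}
```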

1.3 Building the Huffman tree

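  • Allocate space for the nodes and create a matching tempProcessed array that records whether each node has already been selected;
  • Create all leaf nodes, initializing their characters and weights from alphabet and charCounts;
  • Find the index of the unprocessed node with the smallest weight, make it the left child, and set tempProcessed at that index to true. Then find the smallest-weight node among those still unprocessed as the right child, create a parent node, and link the two children to it. Repeat until all alphabetLength - 1 internal nodes have been created.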
/**
*********************
* Construct the tree.
*********************
*/
public void constructTree() {
	// Step 1. Allocate space.
	nodes = new HuffmanNode[alphabetLength * 2 - 1];
	boolean[] tempProcessed = new boolean[alphabetLength * 2 - 1];

	// Step 2. Initialize leaves.
	for (int i = 0; i < alphabetLength; i++) {
		nodes[i] = new HuffmanNode(alphabet[i], charCounts[i], null, null, null);
	} // Of for i

	// Step 3. Construct the tree.
	int tempLeft, tempRight, tempMinimal;
	for (int i = alphabetLength; i < 2 * alphabetLength - 1; i++) {
		// Step 3.1 Select the first minimal as the left child.
		tempLeft = -1;
		tempMinimal = Integer.MAX_VALUE;
		for (int j = 0; j < i; j++) {
			if (tempProcessed[j]) {
				continue;
			} // Of if

			if (tempMinimal > charCounts[j]) {
				tempMinimal = charCounts[j];
				tempLeft = j;
			} // Of if
		} // Of for j
		tempProcessed[tempLeft] = true;

		// Step 3.2 Select the second minimal as the right child.
		tempRight = -1;
		tempMinimal = Integer.MAX_VALUE;
		for (int j = 0; j < i; j++) {
			if (tempProcessed[j]) {
				continue;
			} // Of if

			if (tempMinimal > charCounts[j]) {
				tempMinimal = charCounts[j];
				tempRight = j;
			} // Of if
		} // Of for j
		tempProcessed[tempRight] = true;
		System.out.println("Selecting " + tempLeft + " and " + tempRight);

		// Step 3.3 Construct the new node.
		charCounts[i] = charCounts[tempLeft] + charCounts[tempRight];
		nodes[i] = new HuffmanNode('*', charCounts[i], nodes[tempLeft], nodes[tempRight], null);

		// Step 3.4 Link with children.
		nodes[tempLeft].parent = nodes[i];
		nodes[tempRight].parent = nodes[i];
		System.out.println("The children of " + i + " are " + tempLeft + " and " + tempRight);
	} // Of for i
}// Of constructTree
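
constructTree() relies on a HuffmanNode class that is not shown in this post. A minimal sketch consistent with the constructor call and the parent field used above might look like the following. The field names other than parent, the toString() format, and the merge helper are assumptions for illustration, not the author's definition.

```java
class HuffmanNode {
	char character;         // Assumed name: the symbol, or '*' for an internal node.
	int weight;             // Assumed name: the count (leaf) or sum of children (internal).
	HuffmanNode leftChild;  // Assumed name.
	HuffmanNode rightChild; // Assumed name.
	HuffmanNode parent;     // Name taken from constructTree() above.

	// Parameter order follows the calls in constructTree():
	// character, weight, left child, right child, parent.
	HuffmanNode(char paraCharacter, int paraWeight, HuffmanNode paraLeftChild,
			HuffmanNode paraRightChild, HuffmanNode paraParent) {
		character = paraCharacter;
		weight = paraWeight;
		leftChild = paraLeftChild;
		rightChild = paraRightChild;
		parent = paraParent;
	}

	// Hypothetical helper mirroring Steps 3.3 and 3.4: create the parent, then link back.
	static HuffmanNode merge(HuffmanNode paraLeft, HuffmanNode paraRight) {
		HuffmanNode result = new HuffmanNode('*', paraLeft.weight + paraRight.weight,
				paraLeft, paraRight, null);
		paraLeft.parent = result;
		paraRight.parent = result;
		return result;
	}

	@Override
	public String toString() {
		return "(" + character + ", " + weight + ")";
	}

	public static void main(String[] args) {
		HuffmanNode tempLeft = new HuffmanNode('a', 2, null, null, null);
		HuffmanNode tempRight = new HuffmanNode('b', 3, null, null, null);
		System.out.println(merge(tempLeft, tempRight)); // prints (*, 5)
	}
}
```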

1.4 Getting the root

/**
*********************
* Get the root of the binary tree.
* 
* @return The root.
*********************
*/
public HuffmanNode getRoot() {
	return nodes[nodes.length - 1];
}// Of getRoot

2. Test program

public static void main(String[] args) {
	Huffman tempHuffman = new Huffman("E:/postgraduate/csdn/temp/huffmantext-small.txt");
	tempHuffman.constructAlphabet();
		
	tempHuffman.constructTree();
		
	HuffmanNode tempRoot = tempHuffman.getRoot();
	System.out.println("The root is: " + tempRoot);
}// Of main

Output:

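Note: charMapping is too long for the screenshot to capture all of it. For each character that occurs, the position given by its ASCII code stores that character's index in alphabet; unused positions hold -1.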
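
To make the contents of alphabet, the counts, and charMapping concrete, the sketch below repeats the three steps of constructAlphabet() on a short literal string instead of a file. The class name AlphabetSketch, the value NUM_CHARS = 256, and the input "abcab" are assumptions for illustration. Unlike charCounts in the post, the counts array here has only one entry per character.

```java
import java.util.Arrays;

class AlphabetSketch {
	static final int NUM_CHARS = 256; // Assumed value; the post does not show it.

	final char[] alphabet;
	final int[] counts;
	final int[] charMapping = new int[NUM_CHARS];

	// Assumes every character of the text has a code below NUM_CHARS.
	AlphabetSketch(String text) {
		Arrays.fill(charMapping, -1);

		// Step 1. Count the occurrences of each character.
		int[] tempCounts = new int[NUM_CHARS];
		for (int i = 0; i < text.length(); i++) {
			tempCounts[text.charAt(i)]++;
		}

		// Step 2. Count the distinct characters.
		int length = 0;
		for (int count : tempCounts) {
			if (count > 0) {
				length++;
			}
		}

		// Step 3. Compress to the alphabet and record the mapping.
		alphabet = new char[length];
		counts = new int[length];
		int next = 0;
		for (int i = 0; i < NUM_CHARS; i++) {
			if (tempCounts[i] > 0) {
				alphabet[next] = (char) i;
				counts[next] = tempCounts[i];
				charMapping[i] = next;
				next++;
			}
		}
	}

	public static void main(String[] args) {
		AlphabetSketch sketch = new AlphabetSketch("abcab");
		System.out.println(Arrays.toString(sketch.alphabet)); // prints [a, b, c]
		System.out.println(Arrays.toString(sketch.counts)); // prints [2, 2, 1]
		// 'a' and 'c' occur, 'z' does not.
		System.out.println(sketch.charMapping['a'] + " " + sketch.charMapping['c']
				+ " " + sketch.charMapping['z']); // prints 0 2 -1
	}
}
```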
