[Data Structures] Preparation Work for the Huffman Tree (Part 2)


繁星伴晚安

Category: Data Structures Homework

Article link: https://blog.csdn.net/weixin_48180029/article/details/117536376

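Note: all of this preparation work only serves to break the assignment into many small problems. The Huffman compression and decompression code I write in the end will differ considerably from these preparatory pieces.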

Generating the codes from the Huffman tree

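Generating the codes requires a recursive function.
For a recursive function, first decide its parameters and return value, and be clear about what the function does.
Next, work out how one recursive call relates to the next.
Finally, determine the termination condition.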
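The three steps above can be sketched on a small hand-built tree. This is only an illustration: the `Node` type, `buildCodes`, and `demoCodes` are hypothetical names for this sketch and are not the `huffmanNode` class used in the full listing.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical minimal node type, used only in this sketch.
struct Node {
    char symbol;
    Node* left;
    Node* right;
};

// Step 1 (purpose): store the code of every leaf below `t` in `codes`.
// `prefix` is the path of 0s and 1s from the root to `t`; nothing is returned.
void buildCodes(const Node* t, const std::string& prefix,
                std::unordered_map<char, std::string>& codes) {
    if (t == nullptr) return;              // Step 3: an empty subtree ends the recursion
    if (!t->left && !t->right) {           // Step 3: a leaf ends the recursion
        codes[t->symbol] = prefix;
        return;
    }
    // Step 2 (relation): a child's code is the parent's prefix plus one more bit.
    buildCodes(t->left, prefix + "0", codes);
    buildCodes(t->right, prefix + "1", codes);
}

// Builds a fixed three-leaf tree and returns the codes found for it.
std::unordered_map<char, std::string> demoCodes() {
    Node a{'a', nullptr, nullptr};
    Node b{'b', nullptr, nullptr};
    Node c{'c', nullptr, nullptr};
    Node inner{'\0', &b, &c};
    Node root{'\0', &a, &inner};
    std::unordered_map<char, std::string> codes;
    buildCodes(&root, "", codes);
    return codes;
}
```

For this tree, `demoCodes()` gives 'a' the code 0, 'b' the code 10, and 'c' the code 11.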

Writing the compressed file

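How do we determine the ASCII code that a string of eight binary digits represents?
In memory, every ASCII character occupies one byte, and one byte is eight binary digits.
For example, the ASCII code of 'a' is 97, so 'a' is stored in memory as the binary form of 97, 01100001 (padded to a full eight bits).
It also helps to understand two's complement and sign-magnitude representation, which I will not go over here.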

Now, given the string str = "01100001", how do we find the ASCII code it represents?

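#include <iostream>
#include <string>
using namespace std;
int main() {
	string str = "01100001";
	unsigned char c = 0;
	for (int i = 0; i < 8; i++) {
		c = 2 * c + (str[i] - '0');// shift left by one bit, then append the next bit
	}
	// Print the character and its numeric ASCII code: a 97
	cout << c << " " << static_cast<int>(c);
	return 0;
}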

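Now I need to read the contents of data.txt again, one byte at a time. The earlier work already gives the Huffman code of every character in the file. These codes are generally not eight bits long, however, so one output byte may be assembled from the codes of several characters. In addition, after the last character has been read, the bits collected so far may not fill a whole byte, so zeros must be appended to pad them to eight bits.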
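The packing step can be tried in isolation, without any file I/O. The sketch below assumes the concatenated Huffman codes are already available as a string of '0' and '1' characters; `packBits` is a hypothetical helper name, not a function from the full listing.

```cpp
#include <string>
#include <vector>

// Packs a string of '0'/'1' characters into bytes, most significant bit first.
// If the final byte is only partly filled, it is padded with zeros on the right.
std::vector<unsigned char> packBits(const std::string& bits) {
    std::vector<unsigned char> out;
    unsigned char value = 0;  // byte being assembled
    int filled = 0;           // number of bits already placed in `value`
    for (char b : bits) {
        value = static_cast<unsigned char>((value << 1) | (b - '0'));
        if (++filled == 8) {  // a full byte is ready
            out.push_back(value);
            value = 0;
            filled = 0;
        }
    }
    if (filled > 0) {         // pad the last partial byte
        value = static_cast<unsigned char>(value << (8 - filled));
        out.push_back(value);
    }
    return out;
}
```

For example, the ten bits 0110000111 become two bytes: 01100001 and 11000000, where the last six zeros are padding.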

Verifying correctness

Print the data in binary form
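One way to produce this binary view is `std::bitset<8>`, which the full listing also uses. A minimal sketch, where `toBinaryString` is a hypothetical helper that works on bytes held in memory rather than on a file:

```cpp
#include <bitset>
#include <string>
#include <vector>

// Returns the bytes as one string of '0'/'1' characters, eight per byte.
std::string toBinaryString(const std::vector<unsigned char>& bytes) {
    std::string out;
    for (unsigned char byte : bytes) {
        out += std::bitset<8>(byte).to_string();
    }
    return out;
}
```

Comparing this string with the concatenated Huffman codes (plus the zero padding at the end) shows whether the compressed bytes were written correctly.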

Decompressing the file

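Now we decompress the compressed file. We can read its contents, and we must find the character that corresponds to each Huffman code we read. The difficulty is that a file is read one character (eight bits) at a time, while a Huffman code is not necessarily eight bits long.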

Another issue is that map_code stores character-to-code pairs, but now we need to look up the character for a given code, so a second map, map_change, is needed to store code-to-character pairs.
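The reverse table can also be derived from the forward one after the fact. The sketch below assumes a table shaped like map_code; `invertCodes` is a hypothetical helper name. The full listing fills both maps during the tree traversal instead.

```cpp
#include <string>
#include <unordered_map>

// Builds a code-to-character table from a character-to-code table.
// Huffman codes are distinct, so no two characters collide on the same key.
std::unordered_map<std::string, char> invertCodes(
        const std::unordered_map<char, std::string>& codes) {
    std::unordered_map<std::string, char> reversed;
    for (const auto& entry : codes) {
        reversed[entry.second] = entry.first;
    }
    return reversed;
}
```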

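While writing this I ran into another problem. The last byte was padded with zeros because it was not full, but a run of 0 bits can itself be a valid Huffman code, so the decoder would turn the padding into extra characters. For that reason, the number of bytes in data.txt has to be recorded when the file is first read, so that decoding can stop after that many characters.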
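The effect of the padding is easy to reproduce on a bit string. In the sketch below the table (a = 0, b = 10, c = 11) and the names `decodeBits` and `demoTable` are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Decodes `bits` with a code-to-character table, stopping once
// `symbolCount` characters have been produced.
std::string decodeBits(const std::string& bits,
                       const std::unordered_map<std::string, char>& table,
                       std::size_t symbolCount) {
    std::string out;
    std::string current;  // bits collected since the last complete code
    for (char bit : bits) {
        if (out.size() == symbolCount) break;  // the rest is padding
        current += bit;
        auto it = table.find(current);
        if (it != table.end()) {               // `current` is a complete code
            out += it->second;
            current.clear();
        }
    }
    return out;
}

// Hypothetical code table for the example: a = 0, b = 10, c = 11.
std::unordered_map<std::string, char> demoTable() {
    return {{"0", 'a'}, {"10", 'b'}, {"11", 'c'}};
}
```

The text abc encodes to the five bits 01011, which are padded to 01011000. Decoding with a limit of 3 returns abc, while decoding without an effective limit (for example 100) returns abcaaa, because each padding zero is read as the code for 'a'.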

// huffmanTree.h
#pragma once
#include <bitset>
#include <fstream>
#include <iostream>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;
struct huffmanNode
{
	char element;
	int weight;
	huffmanNode* leftChild, * rightChild;
	huffmanNode(char element = '\0', int weight = 0, huffmanNode* leftChild = NULL,
		huffmanNode* rightChild = NULL) {
		this->element = element;
		this->weight = weight;
		this->leftChild = leftChild;
		this->rightChild = rightChild;
	}

};
struct cmp
{
	bool operator()(huffmanNode* x, huffmanNode* y) {
		return x->weight > y->weight;
	}

};
class huffmanTree
{
private:
	unordered_map<char, int> map_weight;
	unordered_map<char, string> map_code;
	unordered_map<string, char>map_change;
	huffmanNode* root;
	int fileSize;// number of bytes in data.txt
public:
	huffmanTree();
	~huffmanTree();

	void createTree();
	void compressedFile(const char* path = "D:\\data.txt");
	void outputFile(const char* path = "D:\\newData.txt");
	void dispose(huffmanNode* t);
	void print();
	void print(huffmanNode* t);
	void output();
	void print_binary();
	void getCode();// generate the Huffman codes
	void Code(huffmanNode* t, string code);

	void uncompress(const char* path = "D:\\test.txt");
};

huffmanTree::huffmanTree()
{
	root = NULL;
	fileSize = 0;// nothing has been read yet
}

huffmanTree::~huffmanTree()
{
	dispose(root);
	root = NULL;
}

void huffmanTree::createTree()
{
	if (map_weight.empty()) {
		cout << "No file has been read yet; read the file first!\n";
		return;
	}
	priority_queue<huffmanNode*, vector<huffmanNode*>, cmp> q;
	unordered_map<char, int>::iterator it = map_weight.begin();
	for (; it != map_weight.end(); it++) {
		huffmanNode* node = new huffmanNode(it->first, it->second);

		q.push(node);
	}

	huffmanNode* w, * x, * y;
	while (q.size() > 1) {
		x = q.top();
		q.pop();
		y = q.top();
		q.pop();
		int weight = x->weight + y->weight;
		w = new huffmanNode('?', weight, x, y);
		q.push(w);
	}

	root = q.top();


}

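void huffmanTree::compressedFile(const char* path)
{
	// Count how often each byte occurs and record the total number of bytes.
	ifstream ifs;
	ifs.open(path, ios::in | ios::binary);
	if (!ifs.is_open()) {
		cout << "Cannot open " << path << "\n";
		return;
	}
	int ch;// int rather than char, so byte 0xFF is not mistaken for EOF
	fileSize = 0;
	while ((ch = ifs.get()) != EOF) {
		map_weight[static_cast<char>(ch)]++;
		fileSize++;
	}
	ifs.close();
}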

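inline void huffmanTree::outputFile(const char* path)
{
	unsigned char value = 0;// byte currently being assembled for output
	int wSize = 0;// number of bits already stored in value
	ifstream ifs;
	ifs.open("D:\\data.txt", ios::in | ios::binary);
	int ch;// byte read from data.txt (int, so 0xFF is not mistaken for EOF)
	string str;// Huffman code of the byte just read

	ofstream ofs;
	ofs.open(path, ios::out | ios::binary);// binary mode keeps the bytes unchanged
	while ((ch = ifs.get()) != EOF) {
		str = map_code[static_cast<char>(ch)];
		for (size_t i = 0; i < str.length(); i++) {
			value = value * 2 + (str[i] - '0');
			wSize++;
			if (wSize == 8) {// a full byte is ready
				ofs.put(static_cast<char>(value));
				wSize = 0;
				value = 0;
			}
		}
	}
	if (wSize > 0) {// pad the last partial byte with zeros on the right
		value <<= (8 - wSize);
		ofs.put(static_cast<char>(value));
	}
	ifs.close();
	ofs.close();
}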

void huffmanTree::dispose(huffmanNode* t)
{
	if (t != NULL) {
		dispose(t->leftChild);
		dispose(t->rightChild);
		delete t;
	}
}

void huffmanTree::print()
{
	print(root);
}

void huffmanTree::print(huffmanNode* t)
{
	if (t != NULL) {
		print(t->leftChild);
		cout << t->element;
		print(t->rightChild);

	}
}

void huffmanTree::output()
{
	unordered_map<char, string>::iterator it = map_code.begin();
	for (; it != map_code.end(); it++) {
		cout << it->first << " " << it->second << "\n";
	}
}

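inline void huffmanTree::print_binary()
{
	// Print every byte of the compressed file as eight binary digits.
	ifstream ifs;
	ifs.open("D:\\newData.txt", ios::in | ios::binary);
	int ch;// int rather than char, so byte 0xFF is not mistaken for EOF
	while ((ch = ifs.get()) != EOF) {
		cout << bitset<8>(ch);
	}
	cout << "\n";
	ifs.close();
}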

inline void huffmanTree::getCode()
{
	Code(root,"");
}

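inline void huffmanTree::Code(huffmanNode* t, string code)
{
	if (t != NULL) {
		if (t->leftChild == NULL && t->rightChild == NULL) {
			// Leaf: the path from the root is this character's code.
			// If the tree is a single leaf the path is empty, so use "0".
			if (code.empty())
				code = "0";
			map_code[t->element] = code;
			map_change[code] = t->element;
		}
		else {
			// Internal node: append 0 for the left branch, 1 for the right.
			Code(t->leftChild, code + "0");
			Code(t->rightChild, code + "1");
		}
	}
}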

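inline void huffmanTree::uncompress(const char* path)
{
	ifstream ifs;
	ifs.open("D:\\newData.txt", ios::in | ios::binary);
	ofstream ofs;
	ofs.open(path, ios::out | ios::binary);
	int ch;// int rather than char, so byte 0xFF is not mistaken for EOF
	string coding = "";// bits collected since the last complete code
	int count = 0;// number of characters decoded so far
	// Stop after fileSize characters so the zero padding is not decoded.
	while (count < fileSize && (ch = ifs.get()) != EOF) {
		string str = bitset<8>(ch).to_string();
		for (size_t i = 0; i < str.length() && count < fileSize; i++) {
			coding += str[i];
			if (map_change.count(coding)) {// this bit sequence is a complete code
				ofs.put(map_change[coding]);
				count++;
				coding = "";
			}
		}
	}

	ofs.close();
	ifs.close();
}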

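// Test driver
#include<iostream>
using namespace std;
#include"huffmanTree.h"
int main() {
	huffmanTree h;
	h.compressedFile();// read data.txt and count the byte frequencies
	h.createTree();// build the Huffman tree
	h.getCode();// derive the code of every byte
	h.outputFile();// write the compressed file newData.txt
	h.print_binary();// show the compressed bytes in binary
	h.uncompress();// restore the original content to test.txt

	return 0;
}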


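What I have written is quite rough and will need revising later. I implemented each module's functionality step by step, and since I worked out almost all of it myself, it was very rewarding.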
