Huffman Coding: POJ 1521

Most recent recommended article published on 2023-02-04 11:15:00

jaune_arc

Category: ACM practice. Tags: algorithms, ACM competition

Article link: https://blog.csdn.net/jaune_arc/article/details/107424240

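Huffman Coding Exercise: POJ 1521

Huffman coding principle
Understanding the problem
Implementation
What you can learn
Possible extensions

This is my first article, so there are no formatting requirements.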

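Huffman coding principle

Huffman coding is a variable-length binary code in which each symbol's code length depends on its frequency: the more often a symbol occurs, the shorter its code. It is mostly used in compression algorithms (rarely on its own these days; it is optimal among prefix codes that encode one symbol at a time, but it is an old technique). As a fairly basic mathematical idea, it still shows up often in algorithm programming practice.
See: Implementation of Huffman Coding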

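Understanding the problem

Input: strings made of uppercase letters and underscores, where an underscore stands for a space.
Output: for each string, the number of bits needed with 8-bit ASCII encoding, the number of bits after Huffman encoding, and the compression ratio (the former divided by the latter) to one decimal place. Input ends when the line END is read.
Approach: the key is implementing the Huffman tree.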

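Implementation

Honestly, the Huffman part here is quite simple. I originally meant to write a full Huffman tree but ended up simplifying it: only the total encoded length is needed, and that equals the sum of the weights produced by every merge.
PS: some of the code is not optimized and could be improved further.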

#include <cstdio>
#include <iostream>
#include <map>
#include <queue>
#include <string>
#include <vector>
using namespace std;

// Only the weights matter for this problem, so a node carries just
// its frequency and its index in the node array.
struct Node{
	int freq;
	int num;
};

// Comparator for a min-heap on freq:
// returning true means a has lower priority than b.
struct cmp{
	bool operator() (const Node* a, const Node* b) const {
		return a->freq > b->freq;
	}
};

int main(){
	string s;
	while (cin >> s && s != "END"){
		// Count how often each character occurs.
		map<char, int> tmpMap;
		for (size_t i=0; i<s.size(); i++){
			tmpMap[s[i]]++;
		}

		// Create one node per distinct character, numbered 0 to allSize-1.
		// The vector is sized up front so the pointers pushed below stay valid.
		vector<Node> node(tmpMap.size());
		priority_queue<Node*, vector<Node*>, cmp> pq;
		int allSize = 0;
		for (map<char, int>::iterator iter = tmpMap.begin(); iter != tmpMap.end(); iter++){
			node[allSize].freq = iter->second;
			node[allSize].num = allSize;
			pq.push(&node[allSize]);
			allSize++;
		}

		// Merge the two lightest nodes until one remains.
		// The sum of all merged weights is the Huffman-encoded length in bits.
		int result = 0;
		while (pq.size() > 1){
			Node* tmp1 = pq.top();
			pq.pop();
			Node* tmp2 = pq.top();
			pq.pop();
			tmp1->freq += tmp2->freq;   // tmp1 now stands for the merged node
			result += tmp1->freq;
			pq.push(tmp1);
		}
		// A string with a single distinct character still needs 1 bit per character.
		if (result == 0){
			result = static_cast<int>(s.size());
		}
		cout << s.size()*8 << " " << result << " ";
		printf("%.1lf\n", s.size()*8.0/result);
	}

	return 0;
}

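What you can learn

The principle of Huffman coding: it is an optimal prefix code.
STL usage: <map>
map initialization: map<char, int> tmpMap;
map lookup: map<char, int>::iterator iter = tmpMap.find('A');
If find returns tmpMap.end(), the key is absent; otherwise it returns an iterator to the matching element, and iter->first and iter->second give the key and the value.
map iteration: for (; iter != tmpMap.end(); iter++);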
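STL usage: <queue>
priority_queue initialization: priority_queue<int, vector<int>, cmp>
Here int is the element type, vector<int> is the underlying container, and cmp is the comparison. You can use the standard less<type> (the default; the largest element is on top, so elements pop in descending order) or greater<type> (the smallest is on top, ascending order), or define your own.
There are two ways to define your own. One is to overload operator<:

bool operator< (const Node& a, const Node& b) {
	if (a.x == b.x) return a.y > b.y;
	return a.x > b.x;
}
// Returning true means a has lower priority than b and comes out after b; in this example elements pop in ascending order.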

The other is to write a functor:

struct cmp{
	bool operator() (const Node& a, const Node& b) const {
		if (a.x == b.x) return a.y > b.y;
		return a.x > b.x;
	}
};
// Returning true means a has lower priority than b, so a comes out after b.

priority_queue insertion: queue.push();
priority_queue top element: queue.top();
priority_queue removal of the top element: queue.pop();

Note: priority_queue is implemented with a heap.

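Precision control
Use printf("%.2lf\n", result) from <cstdio> to print 2 digits after the decimal point.
With cout from <iostream> (plus <iomanip> for the manipulators), use cout << setiosflags(ios::fixed) << setprecision(2) << result; to keep 2 decimal places.
Use cout << setprecision(3) << result; to keep 3 significant digits, as long as the fixed flag has not been set on the stream.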

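Possible extensions

Huffman coding and self-information in information theory; proving optimality
A graph-theoretic view of Huffman coding: the tree with minimum weighted path length (WPL)
Practical applications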
