Compressing ASCII Text Files with the Huffman Algorithm
EmilMatthew(EmilMatthew@126.com)
Abstract:
This article puts the Huffman algorithm into practice. A Huffman tree is built from the frequency with which each ASCII character occurs in an ASCII file, and the Huffman code of each character is then written to the compressed file. Conversely, the program can use the corresponding Huffman tree to decompress a Huffman-encoded file back into a character file.
Key Words: Huffman Algorithm, Compression, Decompression
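1 Introduction:
The Huffman algorithm is a simple and efficient greedy algorithm, used mainly to construct optimal binary trees. In communication, it lets characters that occur more frequently be transmitted with fewer bits, which reduces the load on the communication line.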
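To make the saving concrete, here is a minimal sketch in C. It is not the program described in this article; the names huffmanTotalBits and MAX_SYMBOLS are hypothetical and chosen only for illustration. It repeats the greedy step (merge the two lightest weights) and sums the merged weights, which equals the total number of bits a Huffman code would need for the whole input.

```c
#include <assert.h>

#define MAX_SYMBOLS 256   /* hypothetical upper bound for this sketch */

/* Returns the total number of bits needed to encode an input in which
   symbol i occurs freq[i] times, using a Huffman code.
   Every merge of the two lightest trees adds one bit to each symbol
   below the new node, so the total is the sum of all merged weights.
   With a single symbol (m == 1) no merge happens and the result is 0. */
long huffmanTotalBits(const long *freq, int m)
{
    long w[MAX_SYMBOLS];
    long total = 0;
    int n, i;

    assert(m >= 1 && m <= MAX_SYMBOLS);
    for (i = 0; i < m; i++)
        w[i] = freq[i];

    for (n = m; n > 1; n--) {
        int a = 0, b = 1, t;   /* a: lightest weight, b: second lightest */
        long merged;

        if (w[b] < w[a]) { t = a; a = b; b = t; }
        for (i = 2; i < n; i++) {
            if (w[i] < w[a]) { b = a; a = i; }
            else if (w[i] < w[b]) { b = i; }
        }

        /* Greedy step: replace the two lightest weights by their sum. */
        merged = w[a] + w[b];
        total += merged;
        w[a] = merged;
        w[b] = w[n - 1];       /* move the last weight into the freed slot */
    }
    return total;
}
```

For the frequencies {50, 25, 15, 10}, the merges produce weights 25, 50 and 100, so the function returns 175 bits, compared with 200 bits for a fixed 2-bit code over the same 100 characters.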
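The algorithm is covered in books on data structures, algorithms, combinatorics, discrete mathematics, graph theory and related subjects, so it is not described again here. This article concentrates on using the Huffman algorithm to implement compression and decompression. The implementation language is C, and the build environment is VC++ 6.0.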
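The Huffman tree structure and construction algorithm implemented in [1] are given below, with two remarks:
a) The Huffman tree here is an array-based binary tree in which each node stores the array indices of its left child, right child and parent. The extra space this costs makes the algorithm more convenient to implement.
b) In the finished Huffman tree, every leaf is obtained by extending an internal tree outward, so a tree with m external (leaf) nodes has m-1 internal nodes: each merge of two subtrees creates exactly one internal node, and m-1 merges are needed. The whole Huffman tree therefore needs 2m-1 nodes.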
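The convenience mentioned in remark a) can be illustrated with a minimal sketch in C. It is an illustration under stated assumptions, not the code from [1]: the names DemoNode, demoTree and leafCode are hypothetical, -1 is assumed to mean "no such node", and a left branch is assumed to encode '0' and a right branch '1'. With parent indices stored in each node, the code of a leaf is found by climbing to the root. The hand-built tree has m = 3 leaves and therefore 2m-1 = 5 nodes.

```c
#include <assert.h>

/* Hypothetical node layout for this sketch: array indices only,
   with -1 assumed to mean "no such node". */
struct DemoNode {
    int parentIndex;
    int llinkIndex;
    int rlinkIndex;
};

/* Hand-built tree for leaf weights 5, 2, 3 (leaves are nodes 0..2).
   Node 3 merges leaves 1 and 2; node 4 is the root, merging 0 and 3. */
static const struct DemoNode demoTree[5] = {
    {  4, -1, -1 },   /* leaf 0 */
    {  3, -1, -1 },   /* leaf 1 */
    {  3, -1, -1 },   /* leaf 2 */
    {  4,  1,  2 },   /* internal node */
    { -1,  0,  3 }    /* root */
};

/* Writes the code of node `leaf` into `out` as a string of '0'/'1'
   characters and returns its length. Assumes left = '0', right = '1'. */
int leafCode(const struct DemoNode *ht, int leaf, char *out, int outSize)
{
    int len = 0, i, child = leaf, parent;

    /* Climb from the leaf to the root, collecting bits in reverse order. */
    while ((parent = ht[child].parentIndex) != -1) {
        assert(len < outSize - 1);
        out[len++] = (ht[parent].llinkIndex == child) ? '0' : '1';
        child = parent;
    }
    out[len] = '\0';

    /* Reverse so that the bit nearest the root comes first. */
    for (i = 0; i < len / 2; i++) {
        char t = out[i];
        out[i] = out[len - 1 - i];
        out[len - 1 - i] = t;
    }
    return len;
}
```

For demoTree, leafCode yields "0" for leaf 0, "10" for leaf 1 and "11" for leaf 2, so the heaviest leaf receives the shortest code.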
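/*Code1: Haffman Algorithm*/
#include <stdlib.h>   /* malloc */

#define MAXCHAR 30000
#define MAXNODE 300   /* capacity of the node array; must be at least 2*m-1 */
#define MAXNUM 150
#define InfoType char

struct HtNode
{
    EBTreeType ww;      /* weight of the node */
    char info;          /* character stored in the node */
    int parentIndex;    /* index of the parent node in ht[] */
    int llinkIndex;     /* index of the left child in ht[] */
    int rlinkIndex;     /* index of the right child in ht[] */
};

struct HtTree
{
    struct HtNode ht[MAXNODE];   /* all nodes of the tree, stored in one array */
    int rootIndex;               /* index of the root node in ht[] */
};

typedef struct HtTree* PHtTree;

/* Builds a Haffman tree for m leaves whose weights are given in w. */
PHtTree haffmanAlgorithm(int m,EBTreeType* w)
{
    PHtTree pht;
    int i,j;
    int firstMinIndex,secondMinIndex;
    int firstMinW,secondMinW;

    pht=(PHtTree)malloc(sizeof(struct HtTree));
    assertF(pht!=NULL,"in haffman algorithm,mem apply failure\n");

    /*Initialize the tree array*/
    for(i=0;i<2*m-1;i++)
    {
        pht->ht[i].llinkIndex=-1;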