Abstract
In computer information processing, Huffman coding is an entropy coding method used for lossless data compression. In most data, some symbols occur frequently and others rarely. We use the Huffman algorithm to build a Huffman tree (an optimal binary tree), assigning each symbol's frequency as the weight of its leaf node. Codes are then read off the finished tree: starting from the root, each step into a left subtree is marked 0 and each step into a right subtree is marked 1, until the target leaf is reached. The sequence of 0s and 1s recorded along that path is stored in an array as the symbol's code. In this way, frequently used characters receive the shortest possible codes, which reduces the total length of the encoded data. Decoding with the same Huffman tree restores the original data from its compressed form.
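The procedure summarized above (build the tree from frequencies, label left branches 0 and right branches 1, then decode by walking the tree) can be illustrated with a minimal Python sketch. It is not the implementation described in this document: the function names `build_tree`, `build_codes`, `encode`, and `decode` are hypothetical, the tree is represented with nested tuples rather than an array of nodes, codes are kept as strings of '0' and '1' for readability, and the input is assumed to be non-empty.

```python
import heapq
from collections import Counter


def build_tree(text):
    """Build a Huffman tree for a non-empty string.

    Leaves are single characters; internal nodes are (left, right) tuples.
    """
    freq = Counter(text)
    # Heap entries are (weight, tie_breaker, node). The unique tie_breaker
    # guarantees that nodes themselves are never compared.
    heap = [(w, i, ch) for i, (ch, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two lowest-weight subtrees under a new parent.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (left, right)))
        next_id += 1
    return heap[0][2]


def build_codes(node, prefix="", table=None):
    """Walk the tree: a left branch appends '0', a right branch appends '1'."""
    if table is None:
        table = {}
    if isinstance(node, tuple):
        build_codes(node[0], prefix + "0", table)
        build_codes(node[1], prefix + "1", table)
    else:
        # A one-symbol input has no branches, so give it the code "0".
        table[node] = prefix or "0"
    return table


def encode(text, table):
    """Concatenate the code of every character."""
    return "".join(table[ch] for ch in text)


def decode(bits, tree):
    """Follow the bits from the root; emit a character at each leaf."""
    if not isinstance(tree, tuple):  # tree with a single symbol
        return tree * len(bits)
    out, node = [], tree
    for b in bits:
        node = node[0] if b == "0" else node[1]
        if not isinstance(node, tuple):
            out.append(node)
            node = tree  # restart from the root for the next symbol
    return "".join(out)


if __name__ == "__main__":
    text = "abracadabra"
    tree = build_tree(text)
    table = build_codes(tree)
    bits = encode(text, table)
    print(table)
    print(bits, len(bits))
    print(decode(bits, tree) == text)
```

For the sample string "abracadabra", the most frequent character 'a' receives a 1-bit code and the whole string encodes to 23 bits, compared with 88 bits at 8 bits per character.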
Contents