[Knowledge Discovery] huffman, an open-source Python library for Huffman coding

1. Huffman tree:

   Install: pip install huffman

   GitHub: https://github.com/nicktimko/huffman
   PyPI: https://pypi.python.org/pypi/huffman
   The source code is well worth reading.


2. Example:

# -*- coding: utf-8 -*-
'''
Created on September 26, 2017

@author: Administrator
'''
import huffman
import collections

# Build a codebook from explicit (symbol, weight) pairs
t1 = huffman.codebook([('A', 2), ('B', 4), ('C', 1), ('D', 1)])
print(t1)
# Build a codebook from the character counts of a string
t2 = huffman.codebook(collections.Counter('man the stand banana man').items())
print(t2)

Note: provided an iterable of 2-tuples in (symbol, weight) format, huffman.codebook generates a Huffman codebook, returned as a dictionary in {symbol: code, ...} format.
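
To make the {symbol: code} format concrete, the sketch below checks two properties one would expect of such a codebook: that no code is a prefix of another, and what the weighted average code length comes to. The `codebook` literal is a hand-written assumption in the described format, not captured output from the library, and `is_prefix_free` and `average_code_length` are hypothetical helper names.

```python
from itertools import combinations

# Weights taken from the first example; the codebook is an illustrative,
# hand-written dictionary in {symbol: code} format (an assumption).
weights = {'A': 2, 'B': 4, 'C': 1, 'D': 1}
codebook = {'A': '10', 'B': '0', 'C': '110', 'D': '111'}


def is_prefix_free(book):
    """Return True if no code is a prefix of another code."""
    return not any(a.startswith(b) or b.startswith(a)
                   for a, b in combinations(book.values(), 2))


def average_code_length(book, weights):
    """Return the weighted mean number of bits per symbol."""
    total = sum(weights.values())
    return sum(weights[s] * len(book[s]) for s in book) / total


print(is_prefix_free(codebook))
print(average_code_length(codebook, weights))
```

Heavier symbols getting shorter codes is what pulls the average below the 2 bits per symbol that a fixed-length code for four symbols would need.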

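3. Reference code for building a Huffman tree:

# Build a Huffman tree
import heapq
import itertools

# Assumes each transposed row of huff_df is [frequency, node],
# since element 0 is used as the frequency below
rows = huff_df.values.T.tolist()  # convert the DataFrame to a list
order = itertools.count()  # unique tie-breaker so nodes are never compared directly
trees = [(freq, next(order), node) for freq, node in rows]
heapq.heapify(trees)
while len(trees) > 1:
    # Pop the two lowest-frequency subtrees and merge them under a new parent
    rightChild, leftChild = heapq.heappop(trees), heapq.heappop(trees)
    parentNode = (leftChild[0] + rightChild[0], next(order), leftChild, rightChild)
    heapq.heappush(trees, parentNode)
print(trees)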

huff_df is a DataFrame that holds the nodes and their frequencies; it is converted to a list before being turned into a heap.
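
The loop above stops with a single nested tuple on the heap. As a rough sketch of how codes could be read off such a tree, the following uses a plain list of (frequency, symbol) pairs in place of huff_df and adds a counter as a tie-breaker so that equal frequencies never force Python to compare symbols with subtrees. `build_tree` and `assign_codes` are hypothetical names, and the leaf and internal node layouts are assumptions of this sketch.

```python
import heapq
import itertools


def build_tree(pairs):
    """Build a Huffman tree from (frequency, symbol) pairs.

    Leaves are (freq, order, symbol); internal nodes are
    (freq, order, left, right). `order` is a unique tie-breaker.
    """
    order = itertools.count()
    heap = [(freq, next(order), sym) for freq, sym in pairs]
    heapq.heapify(heap)
    while len(heap) > 1:
        right, left = heapq.heappop(heap), heapq.heappop(heap)
        heapq.heappush(heap, (left[0] + right[0], next(order), left, right))
    return heap[0]


def assign_codes(node, prefix="", codes=None):
    """Walk the tree, appending "0" for left branches and "1" for right."""
    codes = {} if codes is None else codes
    if len(node) == 3:  # leaf: (freq, order, symbol)
        codes[node[2]] = prefix
    else:  # internal node: (freq, order, left, right)
        assign_codes(node[2], prefix + "0", codes)
        assign_codes(node[3], prefix + "1", codes)
    return codes


pairs = [(2, 'A'), (4, 'B'), (1, 'C'), (1, 'D')]
tree = build_tree(pairs)
codes = assign_codes(tree)
print(tree[0])  # total weight stored at the root
print(codes)
```

The exact bit patterns depend on how ties are broken, so only the code lengths should be expected to match those of another Huffman implementation, and even those can differ when several optimal trees exist.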

The following is example Python code that implements Huffman coding:

```python
import heapq
from collections import defaultdict


# Node class
class Node:
    def __init__(self, char, freq):
        self.char = char
        self.freq = freq
        self.left = None
        self.right = None

    # Less-than operator, used for heap ordering
    def __lt__(self, other):
        return self.freq < other.freq


# Build the Huffman tree
def build_huffman_tree(freq_dict):
    heap = []
    for char, freq in freq_dict.items():
        heapq.heappush(heap, Node(char, freq))
    while len(heap) > 1:
        node1 = heapq.heappop(heap)
        node2 = heapq.heappop(heap)
        merged_node = Node(None, node1.freq + node2.freq)
        merged_node.left = node1
        merged_node.right = node2
        heapq.heappush(heap, merged_node)
    return heap[0]


# Recursively traverse the Huffman tree to build the encoding dictionary
def traverse_huffman_tree(node, current_code, encoding_dict):
    if node is None:
        return
    if node.char is not None:
        encoding_dict[node.char] = current_code
        return
    traverse_huffman_tree(node.left, current_code + "0", encoding_dict)
    traverse_huffman_tree(node.right, current_code + "1", encoding_dict)


# Encode the text
def encode_text(text, encoding_dict):
    encoded_text = ""
    for char in text:
        encoded_text += encoding_dict[char]
    return encoded_text


# Decode the encoded text
def decode_text(encoded_text, decoding_dict):
    current_code = ""
    decoded_text = ""
    for bit in encoded_text:
        current_code += bit
        if current_code in decoding_dict:
            decoded_text += decoding_dict[current_code]
            current_code = ""
    return decoded_text


# Count character frequencies
def count_freq(text):
    freq_dict = defaultdict(int)
    for char in text:
        freq_dict[char] += 1
    return freq_dict


# Example
text = "hello world"
freq_dict = count_freq(text)
huffman_tree = build_huffman_tree(freq_dict)
encoding_dict = {}
traverse_huffman_tree(huffman_tree, "", encoding_dict)
encoded_text = encode_text(text, encoding_dict)
decoding_dict = {v: k for k, v in encoding_dict.items()}
decoded_text = decode_text(encoded_text, decoding_dict)
print("Encoding dictionary:", encoding_dict)
print("Encoded text:", encoded_text)
print("Decoded text:", decoded_text)
```
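
One edge case is worth noting for the example above: when the text contains only one distinct character, the tree is a single leaf, so that character receives the empty string as its code and the encoded text comes out empty. Below is a minimal sketch of one way to handle this, assuming a lone symbol is simply given the code "0". It merges partial codebooks instead of building Node objects, and `huffman_codes` and `decode` are hypothetical names.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Return {char: code} for text; a lone symbol gets the code "0"."""
    order = count()  # unique tie-breaker so the dicts are never compared
    heap = [(freq, next(order), {char: ""})
            for char, freq in Counter(text).items()]
    if not heap:
        return {}
    if len(heap) == 1:
        # Single distinct character: assign one bit so the output is not empty
        return {char: "0" for char in heap[0][2]}
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, book1 = heapq.heappop(heap)
        f2, _, book2 = heapq.heappop(heap)
        # Prefix "0" onto one side's codes and "1" onto the other's
        merged = {c: "0" + code for c, code in book1.items()}
        merged.update({c: "1" + code for c, code in book2.items()})
        heapq.heappush(heap, (f1 + f2, next(order), merged))
    return heap[0][2]


def decode(bits, codes):
    """Decode a bit string by matching accumulated bits against the codes."""
    reverse = {code: char for char, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in reverse:
            out.append(reverse[current])
            current = ""
    return "".join(out)


for sample in ("aaaa", "hello world"):
    codes = huffman_codes(sample)
    bits = "".join(codes[ch] for ch in sample)
    print(repr(sample), len(bits), decode(bits, codes) == sample)
```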