Huffman Decoding in C#

csdn_aspnet

Last modified: 2024-06-19 09:07:02

First published: 2024-06-19 09:04:13

Article link: https://blog.csdn.net/hefeng_aspnet/article/details/139590544

Figure: example of decoding with a Huffman tree

Related articles on Huffman coding (CSDN blog):

C: Huffman Coding | Greedy Algo in C

C++: Huffman Coding | Greedy Algo in C++

C#: Huffman Coding | Greedy Algo in C#

C++ STL: Huffman Coding | Greedy Algo with the C++ STL

Java: Huffman Coding | Greedy Algo in Java

Python: Huffman Coding | Greedy Algo in Python

JavaScript: Huffman Coding | Greedy Algo in JavaScript

We discussed Huffman coding in the previous articles. This article covers decoding.

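Example:

Input data: AAAAAABCCCCCCDDEEEEE
Frequencies: A: 6, B: 1, C: 6, D: 2, E: 5

Encoded data: 0000000000001100101010101011111111010101010

Huffman tree: "#" is a special character used for internal nodes, since internal nodes do not need a character field.

                 #(20)
               /       \
          #(12)         #(8)
         /     \       /    \
      A(6)    C(6)  E(5)    #(3)
                           /    \
                        B(1)    D(2)

The code for 'A' is "00", the code for 'C' is "01", and so on: 'E' is "10", 'B' is "110", and 'D' is "111".

Decoded data: AAAAAABCCCCCCDDEEEEE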
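The encoded bit string for this example can be checked by reading the code table off the tree (left edge = 0, right edge = 1) and concatenating the codes. The following is a minimal sketch; the class name `FirstExampleCheck` and the method `Encode` are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class FirstExampleCheck
{
    // Code table read off the example tree (assumed: left = 0, right = 1).
    static readonly Dictionary<char, string> Codes = new Dictionary<char, string>
    {
        ['A'] = "00", ['C'] = "01", ['E'] = "10", ['B'] = "110", ['D'] = "111"
    };

    // Concatenates the code of every input character.
    public static string Encode(string input)
    {
        var bits = new StringBuilder();
        foreach (char c in input)
        {
            bits.Append(Codes[c]);
        }
        return bits.ToString();
    }

    static void Main()
    {
        string encoded = Encode("AAAAAABCCCCCCDDEEEEE");
        Console.WriteLine(encoded);
        Console.WriteLine(encoded.Length); // 43 = 6*2 + 1*3 + 6*2 + 2*3 + 5*2
    }
}
```

The length follows directly from the frequencies and code lengths, which makes it a quick sanity check for any encoded string.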

Input data: geeksforgeeks

Character codes:
e 10, f 1100, g 011, k 00, o 010, r 1101, s 111

Encoded Huffman data: 01110100011111000101101011101000111
Decoded Huffman data: geeksforgeeks

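Follow the steps below to solve the problem:

Note: decoding the encoded data requires the Huffman tree. We iterate over the binary encoded data. To find the character that corresponds to the current bits, we use these simple steps:

        1. Start at the root and keep going until a leaf is found.
        2. If the current bit is 0, move to the left child.
        3. If the current bit is 1, move to the right child.
        4. When a leaf node is reached, output the character stored in that leaf, then return to step 1 and continue with the remaining encoded data.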
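The four steps can be sketched in isolation on the tree from the first example. This is only an illustration under stated assumptions: the tree is built by hand rather than from frequencies, and `Node`, `DecodeSketch`, `BuildExampleTree`, and `Decode` are hypothetical names, not part of the article's program.

```csharp
using System;
using System.Text;

class Node
{
    public char Symbol;
    public Node Left, Right;
    public bool IsLeaf => Left == null && Right == null;
}

static class DecodeSketch
{
    static Node Leaf(char c) => new Node { Symbol = c };
    static Node Join(Node l, Node r) => new Node { Symbol = '#', Left = l, Right = r };

    // Hand-built copy of the first example's tree:
    // A = 00, C = 01, E = 10, B = 110, D = 111.
    public static Node BuildExampleTree() =>
        Join(Join(Leaf('A'), Leaf('C')),
             Join(Leaf('E'), Join(Leaf('B'), Leaf('D'))));

    public static string Decode(Node root, string bits)
    {
        var output = new StringBuilder();
        Node current = root;                                      // step 1: start at the root
        foreach (char bit in bits)
        {
            current = bit == '0' ? current.Left : current.Right;  // steps 2 and 3
            if (current.IsLeaf)
            {
                output.Append(current.Symbol);                    // step 4: emit the character
                current = root;                                   // and restart from the root
            }
        }
        return output.ToString();
    }

    static void Main()
    {
        // "00" + "01" + "110" + "111" + "10"
        Console.WriteLine(Decode(BuildExampleTree(), "000111011110")); // prints ACBDE
    }
}
```

Because no code is a prefix of another, the walk never needs to look ahead: reaching a leaf always marks the end of exactly one character.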

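The full program below takes a string as input, encodes it, and stores the result in the variable encodedString. It then decodes that string and prints the original.

Below is the implementation of the above approach: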

using System;
using System.Collections.Generic;
using System.Linq;

namespace HuffmanEncoding
{
    // Stores the frequency of each character of the input data
    class FrequencyTable
    {
        private readonly Dictionary<char, int> _freq = new Dictionary<char, int>();

        public void Add(char c)
        {
            if (_freq.ContainsKey(c))
            {
                _freq[c]++;
            }
            else
            {
                _freq[c] = 1;
            }
        }

        public Dictionary<char, int> ToDictionary()
        {
            return _freq;
        }
    }

    // A Huffman tree node
    class HuffmanNode : IComparable<HuffmanNode>
    {
        public HuffmanNode Left { get; set; }
        public HuffmanNode Right { get; set; }
        public char Data { get; set; }
        public int Frequency { get; set; }

        public HuffmanNode(char data, int freq)
        {
            Data = data;
            Frequency = freq;
        }

        // Orders nodes by ascending frequency when the node list is sorted
        public int CompareTo(HuffmanNode other)
        {
            return Frequency - other.Frequency;
        }
    }

    // Utility class for building the Huffman tree and the code table
    class HuffmanEncoder
    {
        // Maps each character to its Huffman code
        private readonly Dictionary<char, string> _codes = new Dictionary<char, string>();

        // Sorted list used as a simple priority queue (lowest frequency first)
        private readonly List<HuffmanNode> _minHeap = new List<HuffmanNode>();

        // Builds the Huffman tree; afterwards _minHeap holds only the root
        private void BuildHuffmanTree(Dictionary<char, int> freq)
        {
            foreach (var kvp in freq)
            {
                _minHeap.Add(new HuffmanNode(kvp.Key, kvp.Value));
            }
            // Sort ascending by frequency so the two smallest nodes come first
            _minHeap.Sort();
            while (_minHeap.Count > 1)
            {
                var left = _minHeap.First();
                _minHeap.RemoveAt(0);
                var right = _minHeap.First();
                _minHeap.RemoveAt(0);
                // '$' is a placeholder for internal nodes
                var top = new HuffmanNode('$', left.Frequency + right.Frequency);
                top.Left = left;
                top.Right = right;
                _minHeap.Add(top);
                // Re-sort so the list stays ordered by frequency
                _minHeap.Sort();
            }
        }

        // Stores each character with its Huffman code in the hash table.
        // Leaves are detected structurally, so '$' may also appear in the input.
        private void StoreCodes(HuffmanNode root, string str)
        {
            if (root == null)
            {
                return;
            }
            if (root.Left == null && root.Right == null)
            {
                _codes[root.Data] = str;
                return;
            }
            StoreCodes(root.Left, str + "0");
            StoreCodes(root.Right, str + "1");
        }

        // Prints each character with its Huffman code
        public void PrintCodes(HuffmanNode root, string str)
        {
            if (root == null)
            {
                return;
            }
            if (root.Left == null && root.Right == null)
            {
                Console.WriteLine(root.Data + " : " + str);
                return;
            }
            PrintCodes(root.Left, str + "0");
            PrintCodes(root.Right, str + "1");
        }

        // Iterates through the encoded string s:
        // if s[i] == '1', move to the right child;
        // if s[i] == '0', move to the left child;
        // on reaching a leaf, append its character to the output and restart at the root
        public string DecodeFile(HuffmanNode root, string s)
        {
            var ans = new System.Text.StringBuilder();
            HuffmanNode curr = root;
            int n = s.Length;
            for (int i = 0; i < n; i++)
            {
                if (s[i] == '0')
                {
                    curr = curr.Left;
                }
                else
                {
                    curr = curr.Right;
                }

                // Reached a leaf node
                if (curr.Left == null && curr.Right == null)
                {
                    ans.Append(curr.Data);
                    curr = root;
                }
            }
            // No trailing "\0": C# strings are not null-terminated
            return ans.ToString();
        }

        // Builds the Huffman tree, then derives the code table from it
        public void BuildCodes(Dictionary<char, int> freq)
        {
            BuildHuffmanTree(freq);
            StoreCodes(_minHeap.First(), "");
        }

        public Dictionary<char, string> GetCodes()
        {
            return _codes;
        }

        public HuffmanNode GetRoot()
        {
            return _minHeap.First();
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            // Driver code
            string str = "geeksforgeeks";
            string encodedString = "";
            string decodedString;
            var freqTable = new FrequencyTable();
            foreach (char c in str)
            {
                freqTable.Add(c);
            }
            var huffmanEncoder = new HuffmanEncoder();
            huffmanEncoder.BuildCodes(freqTable.ToDictionary());
            Console.WriteLine("Characters with their Huffman codes:");
            foreach (var kvp in huffmanEncoder.GetCodes())
            {
                Console.WriteLine($"{kvp.Key} : {kvp.Value}");
            }

            // Look up the code of each character and append it
            foreach (char c in str)
            {
                encodedString += huffmanEncoder.GetCodes()[c];
            }

            Console.WriteLine("\nEncoded Huffman data:");
            Console.WriteLine(encodedString);

            // Function call
            decodedString = huffmanEncoder.DecodeFile(huffmanEncoder.GetRoot(), encodedString);
            Console.WriteLine("\nDecoded Huffman Data:");
            Console.WriteLine(decodedString);
        }
    }
}

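Output (the exact codes and their order may vary with how ties between equal frequencies are broken; the encoded length and the decoded string do not):

Characters with their Huffman codes:

e : 10
f : 1100
g : 011
k : 00
o : 010
r : 1101
s : 111

Encoded Huffman data:
01110100011111000101101011101000111

Decoded Huffman Data:
geeksforgeeks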

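Time complexity:

Building a Huffman tree with a priority queue (binary heap) takes O(k log k) time, where k is the number of distinct characters in the input. Counting frequencies adds O(n) for an input of n characters, and decoding visits each encoded bit once. Auxiliary space is O(k) for the tree, the frequency table, and the code table.

The implementation above does not use a heap: it keeps the nodes in a List and re-sorts it after every merge, so tree construction costs more than O(k log k), roughly O(k² log k) in the worst case. The recursive functions that store and print the codes also use stack space proportional to the height of the tree.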
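The O(log k) cost per heap operation comes from using a real priority queue. The sketch below shows one way to build the tree with `PriorityQueue<TElement, TPriority>`, assuming .NET 6 or later (the type does not exist in earlier versions) and a non-empty frequency table. `PqNode`, `HeapBuildSketch`, `Build`, and `EncodedBits` are hypothetical names for this illustration and are not part of the program above.

```csharp
using System;
using System.Collections.Generic;

class PqNode
{
    public char Symbol;
    public int Frequency;
    public PqNode Left, Right;
}

static class HeapBuildSketch
{
    // Builds a Huffman tree; each Enqueue/Dequeue is O(log k).
    // Assumes freq contains at least one entry.
    public static PqNode Build(Dictionary<char, int> freq)
    {
        var heap = new PriorityQueue<PqNode, int>();
        foreach (var kvp in freq)
        {
            heap.Enqueue(new PqNode { Symbol = kvp.Key, Frequency = kvp.Value }, kvp.Value);
        }
        while (heap.Count > 1)
        {
            PqNode left = heap.Dequeue();   // smallest frequency
            PqNode right = heap.Dequeue();  // second smallest
            var parent = new PqNode
            {
                Symbol = '$',               // placeholder for internal nodes
                Frequency = left.Frequency + right.Frequency,
                Left = left,
                Right = right
            };
            heap.Enqueue(parent, parent.Frequency);
        }
        return heap.Dequeue();
    }

    // Encoded length in bits: sum of frequency * depth over all leaves.
    public static int EncodedBits(PqNode node, int depth = 0)
    {
        if (node.Left == null && node.Right == null)
        {
            return node.Frequency * depth;
        }
        return EncodedBits(node.Left, depth + 1) + EncodedBits(node.Right, depth + 1);
    }

    static void Main()
    {
        var freq = new Dictionary<char, int>();
        foreach (char c in "geeksforgeeks")
        {
            freq[c] = freq.TryGetValue(c, out int n) ? n + 1 : 1;
        }
        PqNode root = Build(freq);
        Console.WriteLine(root.Frequency);    // 13, the input length
        Console.WriteLine(EncodedBits(root)); // 35
    }
}
```

The individual codes depend on how the queue orders equal priorities, but every Huffman tree for the same frequencies gives the same total encoded length.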

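Comparing input file size and output file size:
Compare the size of the input file with the size of the Huffman-encoded output. The output size can be computed in a simple way. Suppose the input is the string "geeksforgeeks", stored in the file input.txt.

Input file size:

Input: "geeksforgeeks"
Total number of characters (input length): 13
Size: 13 characters * 8 bits = 104 bits, or 13 bytes.

Output file size:

Input: "geeksforgeeks"

Character | Frequency | Binary Huffman code
----------+-----------+--------------------
e         | 4         | 10
f         | 1         | 1100
g         | 2         | 011
k         | 2         | 00
o         | 1         | 010
r         | 1         | 1101
s         | 2         | 111

To compute the output size:

e: 4 occurrences * 2 bits = 8 bits
f: 1 occurrence * 4 bits = 4 bits
g: 2 occurrences * 3 bits = 6 bits
k: 2 occurrences * 2 bits = 4 bits
o: 1 occurrence * 3 bits = 3 bits
r: 1 occurrence * 4 bits = 4 bits
s: 2 occurrences * 3 bits = 6 bits

Total: 35 bits, about 5 bytes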
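The same arithmetic can be expressed as a short calculation over the frequency and code tables. This is a minimal sketch; `SizeCheck` and `EncodedBits` are hypothetical names, and the tables are copied by hand from the listing above.

```csharp
using System;
using System.Collections.Generic;

static class SizeCheck
{
    // Sum over all characters of (occurrences * code length in bits).
    public static int EncodedBits(Dictionary<char, int> freq, Dictionary<char, string> codes)
    {
        int bits = 0;
        foreach (var kvp in freq)
        {
            bits += kvp.Value * codes[kvp.Key].Length;
        }
        return bits;
    }

    static void Main()
    {
        var freq = new Dictionary<char, int>
        {
            ['e'] = 4, ['f'] = 1, ['g'] = 2, ['k'] = 2, ['o'] = 1, ['r'] = 1, ['s'] = 2
        };
        var codes = new Dictionary<char, string>
        {
            ['e'] = "10", ['f'] = "1100", ['g'] = "011", ['k'] = "00",
            ['o'] = "010", ['r'] = "1101", ['s'] = "111"
        };

        int inputBits = "geeksforgeeks".Length * 8;   // 8 bits per character
        int outputBits = EncodedBits(freq, codes);
        int outputBytes = (outputBits + 7) / 8;       // round up to whole bytes

        Console.WriteLine($"{inputBits} -> {outputBits} bits ({outputBytes} bytes)");
        // prints: 104 -> 35 bits (5 bytes)
    }
}
```

This counts only the encoded payload; a real file would also need to store the tree or the code table so that the decoder can rebuild it.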

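Encoding therefore saves a considerable amount of space: 35 bits instead of 104. The same method also gives the value of N, the length of the encoded data in bits.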
