Data Structures and Algorithms Basics - Study - 36 - Huffman Optimization

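Contents

I. Review

1. Huffman Tree

2. Huffman Encoding

3. Huffman Decoding

II. Environment

III. Huffman Encoding and Decoding Workflow

IV. Optimizations

1. Optimizing Character Counting

2. Optimizing the Conversion Step

3. Optimizing Ciphertext Storage

V. Testing on a Virtual Machine

1. Test Data

2. Test Results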


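I. Review

1. Huffman Tree

Link: Data Structures and Algorithms Basics - Study - 17 - Binary Trees: Huffman Tree

2. Huffman Encoding

Link: Data Structures and Algorithms Basics - Study - 18 - Huffman Encoding

3. Huffman Decoding

Link: Data Structures and Algorithms Basics - Study - 19 - Huffman Decoding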

II. Environment

Item                Value
CPU                 Intel(R) Core(TM) i5-1035G1 CPU @ 1.00GHz
Operating system    CentOS Linux release 7.9.2009 (Core)
Memory              4 GB
Logical cores       4
gcc version         4.8.5 20150623 (Red Hat 4.8.5-44) (GCC)
cmake version       2.8.12.2

III. Huffman Encoding and Decoding Workflow

IV. Optimizations

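1. Optimizing Character Counting

Original implementation: for every character of the plaintext, scan the array from the start. If the character is found, increment its count; otherwise append the character. The array is scanned once per plaintext character, so the time complexity is O(n^2) in the worst case.

New implementation: build a hash table with the character as the KEY and its count as the VALUE. If the lookup fails, insert the KEY with a count of one; if it succeeds, increment the count. With constant-time lookups the time complexity is O(n).

For the hash table implementation, see the earlier post:

Data Structures and Algorithms Basics - Study - 20 - Searching: Hash Table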
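
The single-pass counting described above can be sketched as follows. This is a minimal illustration, not the code from the post: because the key is a single byte, it uses a 256-slot array indexed directly by the byte value (the simplest possible hash table) instead of the general hash table from the earlier post. The names count_chars and BYTE_VALUES are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BYTE_VALUES 256 /* one slot per possible unsigned char value */

/*
 * Count how often each byte occurs in text[0..len-1] in a single pass.
 * counts must have BYTE_VALUES entries and is zeroed first.
 * Returns the number of distinct bytes seen.
 */
static size_t count_chars(const unsigned char *text, size_t len,
                          size_t counts[BYTE_VALUES])
{
    size_t i;
    size_t distinct = 0;

    assert(text != NULL || len == 0);
    memset(counts, 0, BYTE_VALUES * sizeof counts[0]);

    for (i = 0; i < len; i++) {
        if (counts[text[i]] == 0) {
            distinct++; /* lookup "failed": first occurrence of this key */
        }
        counts[text[i]]++; /* new key becomes 1, known key is incremented */
    }
    return distinct;
}
```

For the string "read-write lock", count_chars reports 13 distinct bytes, with a count of 2 for both 'r' and 'e'. Each plaintext byte costs one array access, so the whole pass is linear in the plaintext length.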

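2. Optimizing the Conversion Step

When the Huffman tree nodes are generated, each step has to find the two nodes with the smallest weights among all nodes and merge them into a new node.

Original implementation: one array holds all nodes. Every search for the two smallest nodes scans the array from head to tail, including nodes that have already been merged. With n leaf nodes there are n - 1 merges and 2n - 1 nodes in total, and the array is scanned once for every node.

New implementation: a linked list that keeps itself sorted in ascending order. The two nodes at the head of the list are always the two smallest, so each step removes them and inserts the newly created node back into the list. Merged nodes are no longer in the list, which shortens it and makes each sorted insertion cheaper.

For the linked list implementation, see the earlier post:

Data Structures and Algorithms Basics - Study - 13 - Linear Lists: Linked Queue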
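
The sorted-list approach can be sketched roughly as below. This is an illustration under assumptions, not the post's implementation: the Node type and the helpers sorted_insert, pop_min, build_huffman, wpl and free_tree are hypothetical names, a singly linked list stands in for the linked queue from the earlier post, and allocation failure simply terminates the program to keep the sketch short.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node: a list element (next) and a tree node (left/right). */
typedef struct Node {
    unsigned long weight;
    struct Node *left, *right; /* children in the Huffman tree */
    struct Node *next;         /* link in the ascending list */
} Node;

static Node *new_node(unsigned long weight, Node *left, Node *right)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL) {
        exit(EXIT_FAILURE); /* sketch only: no recovery on allocation failure */
    }
    n->weight = weight;
    n->left = left;
    n->right = right;
    n->next = NULL;
    return n;
}

/* Insert n so that the list stays sorted by ascending weight. */
static void sorted_insert(Node **head, Node *n)
{
    while (*head != NULL && (*head)->weight <= n->weight) {
        head = &(*head)->next;
    }
    n->next = *head;
    *head = n;
}

/* Detach and return the list head, which is the smallest node. */
static Node *pop_min(Node **head)
{
    Node *n = *head;
    assert(n != NULL);
    *head = n->next;
    n->next = NULL;
    return n;
}

/* Build a Huffman tree from weights[0..n-1]; returns NULL when n == 0. */
static Node *build_huffman(const unsigned long *weights, size_t n)
{
    Node *list = NULL;
    size_t i;

    for (i = 0; i < n; i++) {
        sorted_insert(&list, new_node(weights[i], NULL, NULL));
    }
    while (list != NULL && list->next != NULL) {
        Node *a = pop_min(&list); /* smallest */
        Node *b = pop_min(&list); /* second smallest */
        sorted_insert(&list, new_node(a->weight + b->weight, a, b));
    }
    return list; /* the single remaining node is the root */
}

/* Weighted path length: sum of weight * depth over all leaves. */
static unsigned long wpl(const Node *t, unsigned long depth)
{
    if (t == NULL) {
        return 0;
    }
    if (t->left == NULL && t->right == NULL) {
        return t->weight * depth;
    }
    return wpl(t->left, depth + 1) + wpl(t->right, depth + 1);
}

static void free_tree(Node *t)
{
    if (t != NULL) {
        free_tree(t->left);
        free_tree(t->right);
        free(t);
    }
}
```

For the weights 5, 9, 12, 13, 16 and 45, the root ends up with weight 100 and the weighted path length is 224. The list never contains merged nodes, so it shrinks by one element per merge, which matches the reasoning in the text.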

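3. Optimizing Ciphertext Storage

Suppose the plaintext occupies 1016 bytes. The generated ciphertext is a string of the characters 0 and 1, which takes 4396 bytes when each bit is stored as one character. Packing those bits into binary data, eight per byte, cuts the storage to 4396 / 8 = 549 full bytes plus 4 leftover bits (550 bytes in total), about one eighth of the ciphertext string. The trade-off is the extra time spent converting between the ciphertext string and its binary form.

For the binary-to-ciphertext conversion, see the earlier post:

C Language Study - 23 - Decimal to Binary (Multiple Implementations)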
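
A minimal sketch of the packing and unpacking step is shown below. The function names pack_bits and unpack_bits, the most-significant-bit-first order, and the choice to pass the bit count separately are all assumptions made for this example; the post's own binary format may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Pack nbits characters ('0' or '1') into bytes, most significant bit first.
 * out must hold at least (nbits + 7) / 8 bytes.
 * Returns the number of bytes written.
 */
static size_t pack_bits(const char *bits, size_t nbits, unsigned char *out)
{
    size_t i;
    size_t nbytes = (nbits + 7) / 8; /* round up to whole bytes */

    memset(out, 0, nbytes);
    for (i = 0; i < nbits; i++) {
        assert(bits[i] == '0' || bits[i] == '1');
        if (bits[i] == '1') {
            out[i / 8] |= (unsigned char)(0x80u >> (i % 8));
        }
    }
    return nbytes;
}

/*
 * Inverse of pack_bits: expand nbits bits from in into '0'/'1' characters.
 * bits must hold nbits + 1 characters; the result is NUL-terminated.
 */
static void unpack_bits(const unsigned char *in, size_t nbits, char *bits)
{
    size_t i;

    for (i = 0; i < nbits; i++) {
        bits[i] = (in[i / 8] & (0x80u >> (i % 8))) ? '1' : '0';
    }
    bits[nbits] = '\0';
}
```

Packing the 10-bit string "1001110000" yields the two bytes 0x9C and 0x00, and unpacking those 10 bits restores the original string. The exact bit count has to be stored next to the packed bytes, because the last byte may be only partly used: for 4396 bits, (4396 + 7) / 8 gives 550 bytes, of which the last one carries 4 bits.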

V. Testing on a Virtual Machine

1. Test Data

[gbase@czg2 Exec]$ cat /home/gbase/TestData/E.txt
The pthread_rwlock_trywrlock() function shall apply a write lock like the pthread_rwlock_wrlock() function, with the exception that the function shall fail if any thread currently holds rwlock (for reading or writing).The pthread_rwlock_wrlock()function shall apply a write lock to the read-write lock referenced by rwlock. The calling thread acquires the write lock if no other thread (reader or writer) holds the read-write lock rwlock. Otherwise, the thread shall block until it can acquire the lock. The calling thread may deadlock if at the time the  call is made it holds the read-write lock (whether a read or write lock). Implementations may favor writers over readers to avoid writer starvation. Results are undefined if any of these functions are called with an uninitialized read-write lock. If  a  signal  is  delivered to a thread waiting for a read-write lock for writing, upon return from the signal handler the thread resumes waiting for the read-write lock for writing as if it was not interrupted.

2. Test Results

[gbase@czg2 Exec]$ perf stat -e page-faults ./TestHfm < /home/gbase/TestData/E.txt
ExecCopyStr Function, The Memory Was Successfully Expanded.
2024-07-04 14:58:21-P[9456]-T[9456]-[Info ]-HfmTreePrint       :
HfmTabLen          : 74
HfmCodeTabLen      : 38
[ Index | Chr  | Code         ]
[ 1     | '\n' | 000111000    ]
[ 2     | ' '  | 110          ]
[ 3     | '('  | 0110100      ]
[ 4     | ')'  | 0001101      ]
[ 5     | ','  | 00011101     ]
[ 6     | '-'  | 0001100      ]
[ 7     | '.'  | 0110111      ]
[ 8     | 'I'  | 011010101    ]
[ 9     | 'O'  | 0001110011   ]
[ 10    | 'R'  | 0001110010   ]
[ 11    | 'T'  | 10011100     ]
[ 12    | '_'  | 0001111      ]
[ 13    | 'a'  | 1000         ]
[ 14    | 'b'  | 011010100    ]
[ 15    | 'c'  | 10101        ]
[ 16    | 'd'  | 10100        ]
[ 17    | 'e'  | 001          ]
[ 18    | 'f'  | 111111       ]
[ 19    | 'g'  | 1111100      ]
[ 20    | 'h'  | 0000         ]
[ 21    | 'i'  | 0111         ]
[ 22    | 'k'  | 00010        ]
[ 23    | 'l'  | 0101         ]
[ 24    | 'm'  | 0110110      ]
[ 25    | 'n'  | 11110        ]
[ 26    | 'o'  | 0100         ]
[ 27    | 'p'  | 1111101      ]
[ 28    | 'q'  | 011010111    ]
[ 29    | 'r'  | 1110         ]
[ 30    | 's'  | 01100        ]
[ 31    | 't'  | 1011         ]
[ 32    | 'u'  | 100110       ]
[ 33    | 'v'  | 10011101     ]
[ 34    | 'w'  | 10010        ]
[ 35    | 'x'  | 0110101101   ]
[ 36    | 'y'  | 1001111      ]
[ 37    | 'z'  | 0110101100   ]
[ Index | Weight | ParentIndex | ChildIndexL | ChildIndexR | Ptr        ]
[ 1     | 1      | 40          | 0           | 0           | 0xc59204   ]
[ 2     | 163    | 70          | 0           | 0           | 0xc59218   ]
[ 3     | 6      | 48          | 0           | 0           | 0xc5922c   ]
[ 4     | 6      | 47          | 0           | 0           | 0xc59240   ]
[ 5     | 3      | 43          | 0           | 0           | 0xc59254   ]
[ 6     | 6      | 47          | 0           | 0           | 0xc59268   ]
[ 7     | 8      | 49          | 0           | 0           | 0xc5927c   ]
[ 8     | 2      | 42          | 0           | 0           | 0xc59290   ]
[ 9     | 1      | 39          | 0           | 0           | 0xc592a4   ]
[ 10    | 1      | 39          | 0           | 0           | 0xc592b8   ]
[ 11    | 4      | 45          | 0           | 0           | 0xc592cc   ]
[ 12    | 6      | 46          | 0           | 0           | 0xc592e0   ]
[ 13    | 64     | 64          | 0           | 0           | 0xc592f4   ]
[ 14    | 2      | 42          | 0           | 0           | 0xc59308   ]
[ 15    | 37     | 59          | 0           | 0           | 0xc5931c   ]
[ 16    | 36     | 59          | 0           | 0           | 0xc59330   ]
[ 17    | 92     | 67          | 0           | 0           | 0xc59344   ]
[ 18    | 22     | 55          | 0           | 0           | 0xc59358   ]
[ 19    | 10     | 51          | 0           | 0           | 0xc5936c   ]
[ 20    | 44     | 61          | 0           | 0           | 0xc59380   ]
[ 21    | 63     | 63          | 0           | 0           | 0xc59394   ]
[ 22    | 23     | 56          | 0           | 0           | 0xc593a8   ]
[ 23    | 54     | 62          | 0           | 0           | 0xc593bc   ]
[ 24    | 8      | 49          | 0           | 0           | 0xc593d0   ]
[ 25    | 41     | 60          | 0           | 0           | 0xc593e4   ]
[ 26    | 53     | 62          | 0           | 0           | 0xc593f8   ]
[ 27    | 11     | 51          | 0           | 0           | 0xc5940c   ]
[ 28    | 2      | 41          | 0           | 0           | 0xc59420   ]
[ 29    | 81     | 66          | 0           | 0           | 0xc59434   ]
[ 30    | 25     | 57          | 0           | 0           | 0xc59448   ]
[ 31    | 77     | 65          | 0           | 0           | 0xc5945c   ]
[ 32    | 16     | 54          | 0           | 0           | 0xc59470   ]
[ 33    | 5      | 45          | 0           | 0           | 0xc59484   ]
[ 34    | 32     | 58          | 0           | 0           | 0xc59498   ]
[ 35    | 1      | 38          | 0           | 0           | 0xc594ac   ]
[ 36    | 9      | 50          | 0           | 0           | 0xc594c0   ]
[ 37    | 1      | 38          | 0           | 0           | 0xc594d4   ]
[ 38    | 2      | 41          | 37          | 35          | 0xc594e8   ]
[ 39    | 2      | 40          | 10          | 9           | 0xc594fc   ]
[ 40    | 3      | 43          | 1           | 39          | 0xc59510   ]
[ 41    | 4      | 44          | 38          | 28          | 0xc59524   ]
[ 42    | 4      | 44          | 14          | 8           | 0xc59538   ]
[ 43    | 6      | 46          | 40          | 5           | 0xc5954c   ]
[ 44    | 8      | 48          | 42          | 41          | 0xc59560   ]
[ 45    | 9      | 50          | 11          | 33          | 0xc59574   ]
[ 46    | 12     | 52          | 43          | 12          | 0xc59588   ]
[ 47    | 12     | 52          | 6           | 4           | 0xc5959c   ]
[ 48    | 14     | 53          | 3           | 44          | 0xc595b0   ]
[ 49    | 16     | 53          | 24          | 7           | 0xc595c4   ]
[ 50    | 18     | 54          | 45          | 36          | 0xc595d8   ]
[ 51    | 21     | 55          | 19          | 27          | 0xc595ec   ]
[ 52    | 24     | 56          | 47          | 46          | 0xc59600   ]
[ 53    | 30     | 57          | 48          | 49          | 0xc59614   ]
[ 54    | 34     | 58          | 32          | 50          | 0xc59628   ]
[ 55    | 43     | 60          | 51          | 18          | 0xc5963c   ]
[ 56    | 47     | 61          | 22          | 52          | 0xc59650   ]
[ 57    | 55     | 63          | 30          | 53          | 0xc59664   ]
[ 58    | 66     | 64          | 34          | 54          | 0xc59678   ]
[ 59    | 73     | 65          | 16          | 15          | 0xc5968c   ]
[ 60    | 84     | 66          | 25          | 55          | 0xc596a0   ]
[ 61    | 91     | 67          | 20          | 56          | 0xc596b4   ]
[ 62    | 107    | 68          | 26          | 23          | 0xc596c8   ]
[ 63    | 118    | 68          | 57          | 21          | 0xc596dc   ]
[ 64    | 130    | 69          | 13          | 58          | 0xc596f0   ]
[ 65    | 150    | 69          | 59          | 31          | 0xc59704   ]
[ 66    | 165    | 70          | 29          | 60          | 0xc59718   ]
[ 67    | 183    | 71          | 61          | 17          | 0xc5972c   ]
[ 68    | 225    | 71          | 62          | 63          | 0xc59740   ]
[ 69    | 280    | 72          | 64          | 65          | 0xc59754   ]
[ 70    | 328    | 72          | 2           | 66          | 0xc59768   ]
[ 71    | 408    | 73          | 67          | 68          | 0xc5977c   ]
[ 72    | 608    | 73          | 69          | 70          | 0xc59790   ]
[ 73    | 1016   | 0           | 71          | 72          | 0xc597a4   ]
2024-07-04 14:58:21-P[9456]-T[9456]-[Info ]-DllPrint           :
HeadNode           : 0xc59f20
NodeCnt            : 1
InitF              : [NULL,(nil)]
CopyF              : [NULL,(nil)]
DestroyF           : [NULL,(nil)]
PrintF             : [NULL,(nil)]
Mode               : Queue
PreNode    CurNode    NextNode   Data
0xc59f20   0xc59f20   0xc59f20   [1016    ,0xc597a4]
2024-07-04 14:58:21-P[9456]-T[9456]-[Info ]-Printf HashTable   :
Index : 10    , DataNum : 1     , List :
(Key : '\n', Func : 'Empty', FuncPtr : (nil), Val : '1', HitCnt : 2)
Index : 32    , DataNum : 1     , List :
(Key : ' ', Func : 'Empty', FuncPtr : (nil), Val : '2', HitCnt : 326)
Index : 40    , DataNum : 1     , List :
(Key : '(', Func : 'Empty', FuncPtr : (nil), Val : '3', HitCnt : 12)
Index : 41    , DataNum : 1     , List :
(Key : ')', Func : 'Empty', FuncPtr : (nil), Val : '4', HitCnt : 12)
Index : 44    , DataNum : 1     , List :
(Key : ',', Func : 'Empty', FuncPtr : (nil), Val : '5', HitCnt : 6)
Index : 45    , DataNum : 1     , List :
(Key : '-', Func : 'Empty', FuncPtr : (nil), Val : '6', HitCnt : 12)
Index : 46    , DataNum : 1     , List :
(Key : '.', Func : 'Empty', FuncPtr : (nil), Val : '7', HitCnt : 16)
Index : 73    , DataNum : 1     , List :
(Key : 'I', Func : 'Empty', FuncPtr : (nil), Val : '8', HitCnt : 4)
Index : 79    , DataNum : 1     , List :
(Key : 'O', Func : 'Empty', FuncPtr : (nil), Val : '9', HitCnt : 2)
Index : 82    , DataNum : 1     , List :
(Key : 'R', Func : 'Empty', FuncPtr : (nil), Val : '10', HitCnt : 2)
Index : 84    , DataNum : 1     , List :
(Key : 'T', Func : 'Empty', FuncPtr : (nil), Val : '11', HitCnt : 8)
Index : 95    , DataNum : 1     , List :
(Key : '_', Func : 'Empty', FuncPtr : (nil), Val : '12', HitCnt : 12)
Index : 97    , DataNum : 1     , List :
(Key : 'a', Func : 'Empty', FuncPtr : (nil), Val : '13', HitCnt : 128)
Index : 98    , DataNum : 1     , List :
(Key : 'b', Func : 'Empty', FuncPtr : (nil), Val : '14', HitCnt : 4)
Index : 99    , DataNum : 1     , List :
(Key : 'c', Func : 'Empty', FuncPtr : (nil), Val : '15', HitCnt : 74)
Index : 100   , DataNum : 1     , List :
(Key : 'd', Func : 'Empty', FuncPtr : (nil), Val : '16', HitCnt : 72)
Index : 101   , DataNum : 1     , List :
(Key : 'e', Func : 'Empty', FuncPtr : (nil), Val : '17', HitCnt : 184)
Index : 102   , DataNum : 1     , List :
(Key : 'f', Func : 'Empty', FuncPtr : (nil), Val : '18', HitCnt : 44)
Index : 103   , DataNum : 1     , List :
(Key : 'g', Func : 'Empty', FuncPtr : (nil), Val : '19', HitCnt : 20)
Index : 104   , DataNum : 1     , List :
(Key : 'h', Func : 'Empty', FuncPtr : (nil), Val : '20', HitCnt : 88)
Index : 105   , DataNum : 1     , List :
(Key : 'i', Func : 'Empty', FuncPtr : (nil), Val : '21', HitCnt : 126)
Index : 107   , DataNum : 1     , List :
(Key : 'k', Func : 'Empty', FuncPtr : (nil), Val : '22', HitCnt : 46)
Index : 108   , DataNum : 1     , List :
(Key : 'l', Func : 'Empty', FuncPtr : (nil), Val : '23', HitCnt : 108)
Index : 109   , DataNum : 1     , List :
(Key : 'm', Func : 'Empty', FuncPtr : (nil), Val : '24', HitCnt : 16)
Index : 110   , DataNum : 1     , List :
(Key : 'n', Func : 'Empty', FuncPtr : (nil), Val : '25', HitCnt : 82)
Index : 111   , DataNum : 1     , List :
(Key : 'o', Func : 'Empty', FuncPtr : (nil), Val : '26', HitCnt : 106)
Index : 112   , DataNum : 1     , List :
(Key : 'p', Func : 'Empty', FuncPtr : (nil), Val : '27', HitCnt : 22)
Index : 113   , DataNum : 1     , List :
(Key : 'q', Func : 'Empty', FuncPtr : (nil), Val : '28', HitCnt : 4)
Index : 114   , DataNum : 1     , List :
(Key : 'r', Func : 'Empty', FuncPtr : (nil), Val : '29', HitCnt : 162)
Index : 115   , DataNum : 1     , List :
(Key : 's', Func : 'Empty', FuncPtr : (nil), Val : '30', HitCnt : 50)
Index : 116   , DataNum : 1     , List :
(Key : 't', Func : 'Empty', FuncPtr : (nil), Val : '31', HitCnt : 154)
Index : 117   , DataNum : 1     , List :
(Key : 'u', Func : 'Empty', FuncPtr : (nil), Val : '32', HitCnt : 32)
Index : 118   , DataNum : 1     , List :
(Key : 'v', Func : 'Empty', FuncPtr : (nil), Val : '33', HitCnt : 10)
Index : 119   , DataNum : 1     , List :
(Key : 'w', Func : 'Empty', FuncPtr : (nil), Val : '34', HitCnt : 64)
Index : 120   , DataNum : 1     , List :
(Key : 'x', Func : 'Empty', FuncPtr : (nil), Val : '35', HitCnt : 2)
Index : 121   , DataNum : 1     , List :
(Key : 'y', Func : 'Empty', FuncPtr : (nil), Val : '36', HitCnt : 18)
Index : 122   , DataNum : 1     , List :
(Key : 'z', Func : 'Empty', FuncPtr : (nil), Val : '37', HitCnt : 2)
TotalDataNum : 37    
ArrayMaxLen  : 257   
[1001110000000011101111101101100001110001100010100000111111101001001010100101010001000011111011111010011111001011100101010010101000100110100000110111011111110011011110101011011011101001111011001100000010000101010111010001111101111110101011001111110100011010010111001111011001110010101001010100010110010101110001000111010110000001110111110110110000111000110001010000011111110100100101010010101000100001111100101110010101001010100010011010000011011101111111001101111010101101101110100111100001110111010010011110110000110101100000011100010110101101101010011111101101101110100111101101011000010001011110101100000011101111111001101111010101101101110100111101100110000001000010101011101111111000011101011100111111111110100011110100111111010110000111000110001010011010101100110111011100011111010110101100111111000000100010110100011001101110100100101010010101000101100110100111111010011101101110001100010100011111110111110011001001110110100101110011110110111111101111100000110101101111001110000000011101111101101100001110001100010100000111111101001001010100101010001000011111001011100101010010101000100110100000110111111110011011110101011011011101001111011001100000010000101010111010001111101111110101011001111110100011010010111001111011001110010101001010100010110101101001101011000000111011100011000101000001100100101110011110110011100101010010101000101101110001111111001111000111110101010011010011001101010010011111101110100100101010010101000100110111110100111000000001110101011000010101010111111101111100110101100001110001100010100110100010101011010111100110011111100010110011010110000001110100101110011110110011100101010010101000101100111111111110111100100110010010110000001111011010110000111000110001010011001101001110001100010100001111011001001110110100101110011110110011110000110111000000100010110100011001101011000000111011100011000101000001100100101110011110110011100101010010101000101101110100100101010010101000100110111110000111001110110000001111010010011101100001000111011101011000000111010110000
11100011000101001100110000001000010101011100110101000101010010101000101101001101111010110111010111001111011110101011000111101101000101010110101111001100111111000111010110000001110010101001010100010011011111010011100000000111010101100001010101011111110111110011010110000111000110001010011001101101000100111111010100001100010100010101001010100010110011111111111010001011110101100000011101011011101101100011101011000000111011010101100001010101110011101100110011011010001010000111001111011110000001000101101000110011010110000001110111000110001010000011001001011100111101100111001010100101010001011001101001001000000011011000000111101101000110111000110001010011001001110110100101110011110110011100101010010101000100001101011011111001101010101101101111101010100101101100011111010111000101101110100111100110011001101101000100111111011111110001001110101001110110100101110011110110011110011001100100100111010011110110111000110001010000111100110011010110100110100010011101010001111010011010010111001111011001111011001100101110001110100111011000101101110100111100110111110000111001000101100100110010110110110011010001110001110100110111101010000111111101111111000110100110011111111111010001111010011111100100111111110101100000010110000111011111110011011110101011011011101001111001100110100011100011101010110000101010100110100110100100111101100001101000111101101001101111001111111001111011011110000101011101101011000011010011011100011000101000001100100101110011110110011100101010010101000100110111110011010101111111110110100011011001100011111111001111010000101110110011101100110110101000010101011110011101001111000110100110101101001101000110101100001110001100010100110100101000011110110111111101111100110111111010011101101000110111000110001010000011001001011100111101100111001010100101010001011011111101001110110100101110011110110111111101111100000111011101001101111101010011110110111000110111001101110111101101111111110010001101101101011000000111001100011111111001111010000101110000010001111010100010100111101101
0110000001110101100001110001100010100110111000101100100110011011000101100110100101000011110110111111101111100110111111010011101101011000000111011100011000101000001100100101110011110110011100101010010101000101101111110100111011010010111001111011011111110111110011010000110011001111111111100111101111010010100001100110111100100101111001111111010110011110111010011011111011011001101000110111000111000],[4396]
2024-07-04 14:58:21-P[9456]-T[9456]-[Info ]-BinaryPrint        :
BitUseLen          : 4
DataUseLen         : 549
DataMaxLen         : 551
[The pthread_rwlock_trywrlock() function shall apply a write lock like the pthread_rwlock_wrlock() function, with the exception that the function shall fail if any thread currently holds rwlock (for reading or writing).The pthread_rwlock_wrlock()function shall apply a write lock to the read-write lock referenced by rwlock. The calling thread acquires the write lock if no other thread (reader or writer) holds the read-write lock rwlock. Otherwise, the thread shall block until it can acquire the lock. The calling thread may deadlock if at the time the  call is made it holds the read-write lock (whether a read or write lock). Implementations may favor writers over readers to avoid writer starvation. Results are undefined if any of these functions are called with an uninitialized read-write lock. If  a  signal  is  delivered to a thread waiting for a read-write lock for writing, upon return from the signal handler the thread resumes waiting for the read-write lock for writing as if it was not interrupted.
],[1016]

 Performance counter stats for './TestHfm':

               230      page-faults:u                                               

       0.002016665 seconds time elapsed

       0.001643000 seconds user
       0.000000000 seconds sys
