Research Project: Huffman Codes (30)
In 1953, David A. Huffman published his paper “A Method for the Construction of Minimum-Redundancy Codes”, and hence printed his name in the history of computer science. As a professor who gives the final exam problem on Huffman codes, I am encountering a big problem: the Huffman codes are NOT unique. For example, given a string “aaaxuaxz”, we can observe that the frequencies of the characters 'a', 'x', 'u' and 'z' are 4, 2, 1 and 1, respectively. We may either encode the symbols as {'a'=0, 'x'=10, 'u'=110, 'z'=111}, or in another way as {'a'=1, 'x'=01, 'u'=001, 'z'=000}, both compress the string into 14 bits. Another set of code can be given as {'a'=0, 'x'=11, 'u'=100, 'z'=101}, but {'a'=0, 'x'=01, 'u'=011, 'z'=001} is NOT correct since “aaaxuaxz” and “aazuaxax” can both be decoded from the code 00001011001001. The students are submitting all kinds of codes, and I need a computer program to help me determine which ones are correct and which ones are not.
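The example above can be checked mechanically. The following is a minimal sketch, not part of the original solution: the helper names isPrefixFree and encode are made up for illustration. It tests whether any code word is a prefix of another and builds the encoded bit string, so the 14-bit length and the ambiguity of the last code set can both be observed.

```cpp
#include <map>
#include <string>

// Hypothetical helper: true when no code word is a prefix of (or equal to)
// the code word of a different character.
bool isPrefixFree(const std::map<char, std::string>& codes) {
    for (const auto& a : codes) {
        for (const auto& b : codes) {
            if (a.first == b.first) continue;
            // b starts with a's code word when its first |a| characters match.
            if (b.second.compare(0, a.second.size(), a.second) == 0)
                return false;
        }
    }
    return true;
}

// Hypothetical helper: concatenates the code word of every character in text.
// Assumes every character of text has an entry in codes.
std::string encode(const std::string& text,
                   const std::map<char, std::string>& codes) {
    std::string bits;
    for (char ch : text) bits += codes.at(ch);
    return bits;
}
```

With the last code set from the paragraph, encode gives the same 14-bit string for both “aaaxuaxz” and “aazuaxax”, and isPrefixFree returns false because the code for 'a' is a prefix of the code for 'x'.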
Input Specification:
Each input file contains one test case. For each case, the first line gives an integer N (2 ≤ N ≤ 63), then followed by a line that contains all the N distinct characters and their frequencies in the following format:
c[1] f[1] c[2] f[2] ... c[N] f[N]
where c[i] is a character chosen from {'0' - '9', 'a' - 'z', 'A' - 'Z', '_'}, and f[i] is the frequency of c[i] and is an integer no more than 1000. The next line gives a positive integer M (≤1000), then followed by M student submissions. Each student submission consists of N lines, each in the format:
c[i] code[i]
where c[i] is the i-th character and code[i] is a string of '0's and '1's.
Output Specification:
For each test case, print in each line either “Yes” if the student’s submission is correct, or “No” if not.
Sample Input:
7
A 1 B 1 C 1 D 3 E 3 F 6 G 6
4
A 00000
B 00001
C 0001
D 001
E 01
F 10
G 11
A 01010
B 01011
C 0100
D 011
E 10
F 11
G 00
A 000
B 001
C 010
D 011
E 100
F 101
G 110
A 00000
B 00001
C 0001
D 001
E 00
F 10
G 11
Sample Output:
Yes
Yes
No
No
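A common way to judge a submission, which is different from what the solution below does, is to compare its total encoded length with the minimum achievable length and to check separately that the code is prefix-free. The sketch below covers only the length part. The function names optimalCost and codeCost are assumptions made for this illustration.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Minimum total encoded length over all prefix codes for the given
// frequencies, found by repeatedly merging the two smallest weights.
// Each merge adds one bit to every character below the merged node.
long long optimalCost(const std::vector<int>& freqs) {
    std::priority_queue<long long, std::vector<long long>,
                        std::greater<long long>> heap;
    for (int f : freqs) heap.push(f);
    long long cost = 0;
    while (heap.size() > 1) {
        long long a = heap.top(); heap.pop();
        long long b = heap.top(); heap.pop();
        cost += a + b;
        heap.push(a + b);
    }
    return cost;
}

// Total encoded length of a submission: sum of frequency * code length.
// Assumes freqs and lengths have the same size and order.
long long codeCost(const std::vector<int>& freqs,
                   const std::vector<int>& lengths) {
    long long cost = 0;
    for (std::size_t i = 0; i < freqs.size(); i++)
        cost += static_cast<long long>(freqs[i]) * lengths[i];
    return cost;
}
```

For the sample frequencies 1 1 1 3 3 6 6 the minimum is 53. The first two submissions use code lengths 5 5 4 3 2 2 2 and reach 53, while the third uses 3 bits for every character and needs 63. The fourth also costs 53 under the rule above but fails the prefix check, since the code for E is a prefix of the code for A.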
#include <iostream>
#include <climits>  // INT_MAX
#include <cstring>
#include <cstdio>
#include <algorithm>
#include <cmath>
using namespace std;
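// Frequency of each character, indexed by value(c).
struct {
    int freq;
} code[100];
// Code tree stored heap-style: node i has children 2 * i and 2 * i + 1.
struct {
    int freq;             // total frequency of the characters below this node
    bool lchild, rchild;  // whether some code continues with '0' / '1' here
    bool leaf;            // whether some code ends at this node
} tree[400];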
// Maps a character to an index in 0..62, or -1 if it is not allowed.
int value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    else if (c >= 'a' && c <= 'z') return c - 'a' + 10;
    else if (c >= 'A' && c <= 'Z') return c - 'A' + 36;
    else if (c == '_') return 62;
    else return -1;
}
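// Returns false when the tree breaks a condition that every optimal prefix
// code must satisfy.
bool check(int max_idx) {
    int min_freq[15];  // smallest node frequency seen at each depth
    for (int i = 0; i < 15; i++)
        min_freq[i] = INT_MAX;
    min_freq[0] = tree[1].freq;
    // The root must branch both ways too, otherwise every code wastes a bit.
    if (tree[1].lchild != tree[1].rchild)
        return false;
    for (int i = 2; i <= max_idx; i++) {
        // A code that ends here is a prefix of any code that continues below.
        if (tree[i].leaf && (tree[i].lchild || tree[i].rchild))
            return false;
        // An internal node with a single child wastes one bit.
        if (tree[i].lchild != tree[i].rchild)
            return false;
        // depth = floor(log2(i)), computed in integers to avoid rounding errors.
        int depth = 0;
        for (int t = i; t > 1; t /= 2)
            depth++;
        // A deeper node must not outweigh any node one level up.
        if (tree[i].freq > min_freq[depth - 1])
            return false;
        if (tree[i].freq != 0)
            min_freq[depth] = min(min_freq[depth], tree[i].freq);
    }
    return true;
}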
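int main() {
    int n;
    scanf("%d", &n);
    for (int i = 0; i < n; i++) {
        char string[2], c;
        scanf("%1s", string);
        c = string[0];
        scanf("%d", &code[value(c)].freq);
    }
    int times = 0;
    scanf("%d", &times);
    const int max_node = sizeof(tree) / sizeof(tree[0]);
    while (times--) {
        memset(tree, 0, sizeof(tree));
        int max_idx = 1;
        bool valid = true;
        for (int i = 0; i < n; i++) {
            char c, str[5], tmp[128];
            scanf("%4s", str);
            scanf("%127s", tmp);
            c = str[0];
            int idx = 1;
            // Walk down from the root, adding this character's frequency to
            // every node on the path of its code.
            for (int k = 0; valid && tmp[k] != '\0'; k++) {
                tree[idx].freq += code[value(c)].freq;
                if (tmp[k] == '0') {
                    tree[idx].lchild = true;
                    idx = idx * 2;
                } else {
                    tree[idx].rchild = true;
                    idx = idx * 2 + 1;
                }
                // The fixed-size array cannot hold deeper nodes. The submission
                // is rejected here instead of writing out of bounds; this is a
                // limit of the array layout, not a rule of the problem.
                if (idx >= max_node)
                    valid = false;
            }
            if (!valid)
                continue;  // still read the remaining lines of this submission
            // Two characters must not share the same code.
            if (tree[idx].leaf)
                valid = false;
            tree[idx].leaf = true;
            tree[idx].freq = code[value(c)].freq;
            max_idx = idx > max_idx ? idx : max_idx;
        }
        if (valid && check(max_idx))
            printf("Yes\n");
        else
            printf("No\n");
    }
    return 0;
}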
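The heap-style array above can only index nodes close to the root, so long codes do not fit into it. One possible alternative for the prefix check alone is a trie whose nodes are allocated on demand. This is a sketch under that assumption and not the method used above; TrieNode and isPrefixFreeTrie are names invented for this example.

```cpp
#include <string>
#include <vector>

// One trie node; children are indices into the node vector, -1 when absent.
struct TrieNode {
    int child[2] = {-1, -1};
    bool terminal = false;  // some code word ends exactly here
};

// Returns true when no word is a prefix of (or equal to) another word.
// Words that are empty or contain anything other than '0'/'1' are rejected.
bool isPrefixFreeTrie(const std::vector<std::string>& words) {
    std::vector<TrieNode> nodes(1);  // node 0 is the root
    for (const std::string& w : words) {
        int cur = 0;
        bool createdNode = false;
        for (char bit : w) {
            if (bit != '0' && bit != '1') return false;
            // An earlier word ends on this path, so it is a prefix of w.
            if (nodes[cur].terminal) return false;
            int b = bit - '0';
            if (nodes[cur].child[b] == -1) {
                nodes[cur].child[b] = static_cast<int>(nodes.size());
                nodes.emplace_back();
                createdNode = true;
            }
            cur = nodes[cur].child[b];
        }
        // No new node means w lies entirely on an existing path: it is a
        // prefix of an earlier word, a duplicate, or empty.
        if (!createdNode) return false;
        nodes[cur].terminal = true;
    }
    return true;
}
```

The number of nodes grows with the total length of the code words rather than with two to the power of the longest one, so a 63-bit code costs 63 nodes. The frequency-based checks of the solution above are not covered by this sketch.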