Day two of working through my school's data structures lab manual.
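In the Huffman tree lab, the manual gives us a string and asks us to build the tree from it.
The first step in building the tree is to count how often each character occurs in the string, but some characters turned out to have values outside the ASCII range.
The string is as follows: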
String str="The Chinese official said he viewed the Trump Presidency not as an aberration but as the product of a failing political system. This jibes with other accounts. The Chinese leadership believes that the United States, and Western democracies in general, haven’t risen to the challenge of a globalized economy, which necessitates big changes in production patterns, as well as major upgrades in education and public infrastructure. In Trump and Trumpism, the Chinese see an inevitable backlash to this failure.";
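The counting step described above can be sketched as follows. This is only a minimal sketch, not the lab manual's code: the class name CharFrequency and the sample text are hypothetical, and it assumes a map keyed by character is acceptable in place of a fixed-size table indexed by ASCII code. A map has no upper bound on its keys, so a character with a value above 127 is counted like any other.

```java
import java.util.Map;
import java.util.TreeMap;

class CharFrequency {
    // Count how often each char occurs in the text.
    // A TreeMap keeps the entries sorted by character value.
    static Map<Character, Integer> count(String text) {
        Map<Character, Integer> freq = new TreeMap<>();
        for (int i = 0; i < text.length(); i++) {
            // merge() inserts 1 for a new key, otherwise adds 1 to the old count.
            freq.merge(text.charAt(i), 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        // Hypothetical sample containing U+2019 and U+00A0, both above 127.
        String sample = "haven\u2019t\u00A0risen";
        for (Map.Entry<Character, Integer> e : count(sample).entrySet()) {
            System.out.println((int) e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The resulting counts can then be fed into whatever node type the Huffman tree uses, one leaf per map entry.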
I copied the string directly from the docx document, then ran the following code to inspect it:
String str="The Chinese official said he viewed the Trump Presidency not as an aberration but as the product of a failing political system. This jibes with other accounts. The Chinese leadership believes that the United States, and Western democracies in general, haven’t risen to the challenge of a globalized economy, which necessitates big changes in production patterns, as well as major upgrades in education and public infrastructure. In Trump and Trumpism, the Chinese see an inevitable backlash to this failure.";
for (int i = 0; i < str.length(); i++) {
    char c = str.charAt(i);
    // Print every character whose value lies outside the 7-bit ASCII range.
    if (c > 127) {
        System.out.println(c + " " + (int) c);
    }
}
The output was:
I then checked whether character 160 is equal to the ordinary space (code 32) in the 0-127 ASCII range:
String str="The Chinese official said he viewed the Trump Presidency not as an aberration but as the product of a failing political system. This jibes with other accounts. The Chinese leadership believes that the United States, and Western democracies in general, haven’t risen to the challenge of a globalized economy, which necessitates big changes in production patterns, as well as major upgrades in education and public infrastructure. In Trump and Trumpism, the Chinese see an inevitable backlash to this failure.";
for (int i = 0; i < str.length(); i++) {
    char c = str.charAt(i);
    if (c > 127) {
        // Compare each non-ASCII character with the ordinary space.
        if (c == ' ') {
            System.out.println("√");
        } else {
            System.out.println("X");
        }
    }
}
The result was:
That is, it is not equal to the ordinary space in the 0-127 range.
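The comparison can be made a little more informative with the standard Character methods. The sketch below is only illustrative (the class and method names SpaceCheck and describe are hypothetical): it shows that U+00A0 fails the == ' ' test and is not treated as whitespace by Character.isWhitespace, yet Character.isSpaceChar still classifies it as a space character.

```java
class SpaceCheck {
    // Summarize how Java classifies a single char.
    static String describe(char c) {
        return String.format(
                "U+%04X equalsSpace=%b isWhitespace=%b isSpaceChar=%b",
                (int) c, c == ' ', Character.isWhitespace(c), Character.isSpaceChar(c));
    }

    public static void main(String[] args) {
        System.out.println(describe(' '));
        // prints: U+0020 equalsSpace=true isWhitespace=true isSpaceChar=true
        System.out.println(describe('\u00A0'));
        // prints: U+00A0 equalsSpace=false isWhitespace=false isSpaceChar=true
    }
}
```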
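Next I looked up the characters for codes 128-255 (the so-called extended ASCII characters), but they did not match what I was seeing. A search then turned up the following:
My guess at this point is that the string is not encoded as ASCII, which is why the values differ. That fits how Java works: a char is a UTF-16 code unit, so the numbers printed are Unicode values, and 160 is U+00A0 (NO-BREAK SPACE), a different character from the ordinary space U+0020.
My fix was brute force: I manually replaced each character that caused the error with the corresponding ASCII character.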
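The same replacement could also be done in code instead of by hand. The following is a rough sketch under stated assumptions: it handles only the two non-ASCII characters that show up here (the no-break space U+00A0, and the curly apostrophe U+2019 visible in "haven’t"), and the names AsciiNormalizer, normalize, and isAscii are made up for illustration.

```java
class AsciiNormalizer {
    // Replace the two known non-ASCII characters with ASCII look-alikes.
    static String normalize(String text) {
        return text
                .replace('\u00A0', ' ')    // no-break space -> ordinary space
                .replace('\u2019', '\'');  // right single quotation mark -> apostrophe
    }

    // True if every char in the text fits in 7-bit ASCII.
    static boolean isAscii(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) > 127) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String raw = "haven\u2019t\u00A0risen";
        String clean = normalize(raw);
        System.out.println(clean + " " + isAscii(clean));
    }
}
```

Running isAscii on the normalized string is a quick way to confirm that nothing else above 127 slipped through from the document.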