Ciphertext:
“ryf utxy?” “ryf utxy,” orqf jhqihu. “nx nqww urwz! ax bhgo roogix,
trouqyvo - utxix qo yhutqyv oh fryvxihgo chi rylhyx nth tro ohdxutqyv
uh tqfx ro ehybxioruqhy! ojxxet, oh r nqox hwf cixyetdry orqf uh dx
hyex, qo ry qybxyuqhy hc dry’o uh jixbxyu tqd cihd utqyzqyv. qu qo
rwoh ry qycrwwqswx dxryo hc fqoehbxiqyv utru ntqet tx nqotxo uh tqfx.
r tgdry sxqyv, trouqyvo, eryyhu ixoqou utx hjjhiugyqul uh ixbxrw
tqdoxwc ryf xpjixoo tqo jxiohyrwqul ntqet ehybxioruqhy vqbxo tqd.
xbxil uqdx tx nqww vqbx tqdoxwc rnrl.” “ntru fh lhg xpjxeu egou uh
uxww lhg?” txiegwx jhqihu odqwxf. “r wqx,” tx orqf. “ryf sl qu, q
otrww zyhn utx uigut!”
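In his classic introduction to cryptography, Codes and Secret Writing, Herbert S. Zim gives the frequency order of English letters as ETAONR ISHDLF CMUGYP WBVKJX QZ. The most common letter pairs are TH HE AN RE ER IN ON AT ND ST ES EN OF TE ED OR TI HI AS TO, and the most common doubled letters are LL EE SS OO TT FF RR NN PP CC.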
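The pair statistics above become useful once the same pairs are counted in the ciphertext. The following is a minimal sketch of such a count, not part of the original program; the function names `countBigrams` and `doubledOnly` are made up for illustration, and only ASCII letters are treated as letters.

```cpp
#include <map>
#include <string>

// Counts adjacent letter pairs inside words, ignoring case.
// Pairs never span spaces, punctuation, or any other non-letter byte.
std::map<std::string, int> countBigrams(const std::string &text) {
    std::map<std::string, int> counts;
    char prev = 0;  // previous letter, or 0 when the previous char was not a letter
    for (char ch : text) {
        char cur = 0;
        if (ch >= 'a' && ch <= 'z') cur = ch;
        else if (ch >= 'A' && ch <= 'Z') cur = static_cast<char>(ch - 'A' + 'a');
        if (cur != 0 && prev != 0) {
            counts[std::string{prev, cur}]++;
        }
        prev = cur;
    }
    return counts;
}

// Keeps only doubled letters (such as "ww" or "xx") from a bigram table.
std::map<std::string, int> doubledOnly(const std::map<std::string, int> &bigrams) {
    std::map<std::string, int> doubled;
    for (const auto &entry : bigrams) {
        if (entry.first[0] == entry.first[1]) {
            doubled.insert(entry);
        }
    }
    return doubled;
}
```

Run on the ciphertext, the doubled pairs that show up (for example ww, oo, xx) are candidates for LL, SS, EE and the other common doubles listed above.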
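The 12 most used letters account for about 80% of all letter usage, and the 8 most used account for about 65%. Several rank functions fit letter frequency well as a function of rank, and the two-parameter Cocho/Beta rank function is the best of them. Another rank function with no adjustable parameters also fits the letter frequency distribution reasonably well; the same function also fits amino acid frequencies in protein sequences.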
Letter frequencies in English
Letter | Frequency |
---|---|
a | 8.167% |
b | 1.492% |
c | 2.782% |
d | 4.253% |
e | 12.702% |
f | 2.228% |
g | 2.015% |
h | 6.094% |
i | 6.966% |
j | 0.153% |
k | 0.772% |
l | 4.025% |
m | 2.406% |
n | 6.749% |
o | 7.507% |
p | 1.929% |
q | 0.095% |
r | 5.987% |
s | 6.327% |
t | 9.056% |
u | 2.758% |
v | 0.978% |
w | 2.360% |
x | 0.150% |
y | 1.974% |
z | 0.074% |
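As a quick check of the 65% and 80% figures quoted earlier, the percentages in this table can be sorted and summed. This is only a sketch; the names `kEnglishFreq` and `topShare` are made up for illustration.

```cpp
#include <algorithm>
#include <array>
#include <functional>
#include <numeric>

// Percentages for a..z, copied from the table above.
const std::array<double, 26> kEnglishFreq = {
    8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
    0.153, 0.772, 4.025, 2.406, 6.749,  7.507, 1.929, 0.095, 5.987,
    6.327, 9.056, 2.758, 0.978, 2.360,  0.150, 1.974, 0.074};

// Returns the combined share (in percent) of the k most frequent letters.
// k is clamped to the range 0..26.
double topShare(int k) {
    std::array<double, 26> sorted = kEnglishFreq;
    std::sort(sorted.begin(), sorted.end(), std::greater<double>());
    k = std::max(0, std::min(k, 26));
    return std::accumulate(sorted.begin(), sorted.begin() + k, 0.0);
}
```

With the values in this table, `topShare(8)` comes to about 63.6 and `topShare(12)` to about 80.6, close to the rounded figures quoted earlier.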
First-letter frequencies: how often each letter appears as the first letter of a word:
First letter | Frequency |
---|---|
a | 11.602% |
b | 4.702% |
c | 3.511% |
d | 2.670% |
e | 2.007% |
f | 3.779% |
g | 1.950% |
h | 7.232% |
i | 6.286% |
j | 0.590% |
k | 0.597% |
l | 2.705% |
m | 4.374% |
n | 2.365% |
o | 6.264% |
p | 2.545% |
q | 0.173% |
r | 1.653% |
s | 16.671% |
t | 7.755% |
u | 1.487% |
v | 0.649% |
w | 6.753% |
x | 0.034% |
y | 1.620% |
z | 0.037% |
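To compare the ciphertext against this first-letter table, the first letter of each ciphertext word can be tallied. The sketch below assumes that a word is any maximal run of ASCII letters, so an apostrophe splits a word in two; the function name `firstLetterCounts` is made up for illustration.

```cpp
#include <array>
#include <string>

// Counts how often each letter a..z starts a word, ignoring case.
// Assumption: a "word" is a maximal run of ASCII letters.
std::array<int, 26> firstLetterCounts(const std::string &text) {
    std::array<int, 26> counts{};  // zero-initialized
    bool inWord = false;
    for (char ch : text) {
        int idx = -1;
        if (ch >= 'a' && ch <= 'z') idx = ch - 'a';
        else if (ch >= 'A' && ch <= 'Z') idx = ch - 'A';
        if (idx >= 0) {
            if (!inWord) counts[idx]++;  // first letter of a new word
            inWord = true;
        } else {
            inWord = false;
        }
    }
    return counts;
}
```

Cipher letters that often start words are candidates for s, a, t and the other frequent first letters in the table.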
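Decryption method: first use frequency analysis to build an initial letter correspondence table. The text this produces is still scrambled, so guess the original content from the features of particular words and sentences (for example, the one-letter ciphertext words r and q can only be a and i), adjust the correspondence table, and repeat. Each round recovers more letter correspondences.
Source code: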
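#include <algorithm>
#include <cstdio>
#include <cstring>
using namespace std;

struct node {
    char c;      // letter index: 0 for 'a' through 25 for 'z'
    double val;  // relative frequency of that letter
} p[26];

// Orders nodes by ascending frequency.
bool cmp(const node &a, const node &b)
{
    return a.val < b.val;
}

char m[1000]; // one whitespace-delimited token (word) read from the input
int n[26];    // n[k] counts occurrences of the k-th letter, ignoring case

int main()
{
    int sum = 0, len, i;
    // Read the ciphertext from 2.txt instead of the keyboard.
    if (freopen("2.txt", "r", stdin) == NULL) {
        perror("2.txt");
        return 1;
    }
    // freopen("su.txt", "w", stdout); // optionally write the results to a file
    memset(n, 0, sizeof(n));
    while (scanf("%999s", m) == 1) { // the width keeps long tokens inside m
        len = strlen(m);
        for (i = 0; i < len; i++) {
            if (m[i] >= 'a' && m[i] <= 'z') {
                n[m[i] - 'a']++;
                sum += 1;
            } else if (m[i] >= 'A' && m[i] <= 'Z') {
                n[m[i] - 'A']++;
                sum += 1;
            }
        }
    }
    if (sum == 0) { // no letters were read, so avoid dividing by zero
        return 1;
    }
    // Print each letter's count and relative frequency.
    for (i = 0; i < 26; i++) {
        printf("%c:%d %f\n", 'a' + i, n[i], 1.0 * n[i] / sum);
    }
    for (i = 0; i < 26; i++) {
        p[i].c = i;
        p[i].val = 1.0 * n[i] / sum;
    }
    sort(p, p + 26, cmp);
    // Print the letters from most to least frequent.
    for (i = 25; i >= 0; i--) {
        printf("%c ", p[i].c + 'a');
    }
    printf("\n");
    // Unfinished next step, disabled in the original: pair the i-th most
    // frequent ciphertext letter with the i-th most frequent English letter,
    // then re-read the text (from 1.txt) and print it with that substitution.
    return 0;
}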
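The program above stops after printing the frequency ranking. The guess-and-adjust loop described in the decryption method could be sketched as follows. This is an illustration under stated assumptions, not the author's code: `kEnglishOrder` is the frequency order quoted from Codes and Secret Writing, and `guessKey`, `swapPlain` and `applyKey` are hypothetical helper names.

```cpp
#include <algorithm>
#include <array>
#include <string>

// English letters from most to least frequent (order quoted earlier).
const std::string kEnglishOrder = "etaonrishdlfcmugypwbvkjxqz";

// First guess: the i-th most frequent cipher letter maps to the i-th most
// frequent English letter. key[c - 'a'] is the plaintext guess for cipher
// letter c. Ties keep alphabetical order because the sort is stable.
std::string guessKey(const std::string &cipherText) {
    std::array<int, 26> counts{};
    for (char ch : cipherText) {
        if (ch >= 'a' && ch <= 'z') counts[ch - 'a']++;
        else if (ch >= 'A' && ch <= 'Z') counts[ch - 'A']++;
    }
    std::array<int, 26> order{};
    for (int i = 0; i < 26; i++) order[i] = i;
    std::stable_sort(order.begin(), order.end(),
                     [&counts](int a, int b) { return counts[a] > counts[b]; });
    std::string key(26, '?');
    for (int rank = 0; rank < 26; rank++) {
        key[order[rank]] = kEnglishOrder[rank];
    }
    return key;
}

// One manual correction: exchange plaintext letters a and b in the key.
void swapPlain(std::string &key, char a, char b) {
    for (char &k : key) {
        if (k == a) k = b;
        else if (k == b) k = a;
    }
}

// Substitutes every letter through the key; other characters pass through.
std::string applyKey(const std::string &text, const std::string &key) {
    std::string out = text;
    for (char &ch : out) {
        if (ch >= 'a' && ch <= 'z') ch = key[ch - 'a'];
        else if (ch >= 'A' && ch <= 'Z') ch = static_cast<char>(key[ch - 'A'] - 'a' + 'A');
    }
    return out;
}
```

A typical round calls `applyKey` with the current key, reads the partly decoded text, and calls `swapPlain` for each pair of letters that looks exchanged. Swapping rather than overwriting keeps the key a one-to-one mapping.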
Experiment screenshot:
The final decrypted text is:
“and then?”
“and then,” said poirot. “we will talk! je vous assure, hastings - there is nothing so dangerous for anyone who has something to hide as conversation! speech, so a wise old frenchman said to me once, is an invention of man’s to prevent him from thinking. it is also an infallible means of discovering that which he wishes to hide. a human being, hastings, cannot resist the opportunity to reveal himself and express his personality which conversation gives him. every time he will give himself away.”
“what do you expect cust to tell you?”
hercule poirot smiled.
“a lie,” he said. “and by it, i shall know the truth!”
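The finished correspondence table can be written down and checked mechanically. The key below is not given in the original write-up; it was reconstructed here by lining up the ciphertext with the decrypted text, so treat it as an assumption. The cipher letters k and m never occur in the ciphertext, so their plaintext letters are unknown and marked '?'. The names `kFinalKey` and `decrypt` are made up for illustration.

```cpp
#include <string>

// kFinalKey[c - 'a'] is the plaintext letter for cipher letter c.
// Cipher alphabet: abcdefghijklmnopqrstuvwxyz
// '?' marks cipher letters (k and m) that do not occur in this ciphertext.
const std::string kFinalKey = "jvfmcduorp?y?wsxiabhtglenk";

// Decrypts text with kFinalKey. Letters of either case come out lowercase;
// spaces, punctuation and other bytes are left unchanged.
std::string decrypt(const std::string &cipherText) {
    std::string out = cipherText;
    for (char &ch : out) {
        if (ch >= 'a' && ch <= 'z') ch = kFinalKey[ch - 'a'];
        else if (ch >= 'A' && ch <= 'Z') ch = kFinalKey[ch - 'A'];
    }
    return out;
}
```

For instance, `decrypt("txiegwx jhqihu")` returns "hercule poirot".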