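In the previous post we discussed how to use the index of coincidence to find the key length. Here we look at how to recover the key itself once its length is known.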
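Quasi index of coincidence:
Let qi be the probability of letter i in the distribution of the Vigenère ciphertext, and pi the probability of letter i in the distribution of normal English text. The quasi index of coincidence can then be defined as
R = ∑ pi * qi ('a' <= i <= 'z')
When the two frequency distributions are similar, R is relatively high (about 0.065 when English text is compared against the English table itself).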
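To see why R peaks when the two distributions line up, here is a minimal sketch. It compares the English frequency table used in this post against rotated copies of itself. The names `quasi_ic` and `rotate` are made up for this illustration; only the formula R = ∑ pi * qi comes from the text above.

```python
# English letter frequencies for a..z (the same table used later in this post).
ENGLISH = [0.08167, 0.01492, 0.02782, 0.04253, 0.12705, 0.02228, 0.02015,
           0.06094, 0.06996, 0.00153, 0.00772, 0.04025, 0.02406, 0.06749,
           0.07507, 0.01929, 0.0009, 0.05987, 0.06327, 0.09056, 0.02758,
           0.00978, 0.02360, 0.0015, 0.01974, 0.00074]


def quasi_ic(p, q):
    """R = sum of p_i * q_i over the 26 letters."""
    return sum(pi * qi for pi, qi in zip(p, q))


def rotate(freqs, shift):
    """Rotate a frequency list left by `shift` positions (a Caesar shift)."""
    return freqs[shift:] + freqs[:shift]


# Score the table against every rotation of itself.
scores = [quasi_ic(ENGLISH, rotate(ENGLISH, s)) for s in range(26)]
best_shift = scores.index(max(scores))
print(best_shift, round(scores[0], 4))
```

The unrotated comparison (shift 0) scores highest, and every other rotation scores lower. That gap is what the key search relies on.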
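Solving:
In other words, when a ciphertext column decrypted under some letter gives the largest inner product with the English frequencies, that letter is very likely the key letter for that column.
Split the ciphertext into columns by key length. Each column is effectively a Caesar cipher, because a single key character encrypts the whole column. Since the key character is in the alphabet, try all 26 shifts for each column, compute the inner product R for each shift, and pick the letter with the largest R as that column's key character.
The characters collected from all columns form the probable key. (The result depends on which table of English letter frequencies is used, so it is not guaranteed to be fully correct. The key can be adjusted based on the rough content of the plaintext.)
If that really doesn't work, just use a website to decrypt it:
vigenere-solver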
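The column split described above can be sketched with slicing. The helper name `split_columns` and the toy string are assumptions for illustration; the slice `cipher[i::key_len]` picks the same characters as a loop over `range(i, cipher_len, key_len)`.

```python
def split_columns(cipher, key_len):
    """Return key_len strings; column i holds cipher[i], cipher[i + key_len], ..."""
    return [cipher[i::key_len] for i in range(key_len)]


# Toy input: 10 characters split for a hypothetical key length of 3.
columns = split_columns("abcdefghij", 3)
print(columns)  # ['adgj', 'beh', 'cfi']
```

Every character in a column was encrypted with the same key character, so each column can be attacked as an independent Caesar cipher.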
Rough code:
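key_len = 23  # key length found beforehand

# cipher and cipher_len are assumed to be defined already (see the example below)
def getkey(c, cnt):
    # letter frequencies of normal English text, a to z
    alp_rate1 = [0.08167,0.01492,0.02782,0.04253,0.12705,0.02228,0.02015,0.06094,0.06996,0.00153,0.00772,0.04025,0.02406,0.06749,0.07507,0.01929,0.0009,0.05987,0.06327,0.09056,0.02758,0.00978,0.02360,0.0015,0.01974,0.00074]
    len_c = len(c)
    temp = 'abcdefghijklmnopqrstuvwxyz'
    alp_rate2 = []
    for i in temp:  # frequency of each letter in this column of ciphertext
        alp_rate2.append(c.count(i) / len_c)
    inner = []  # inner product for each shift
    for i in range(26):  # try each of the 26 letters as this column's key letter
        sum_inner = 0  # inner product at shift i
        for j in range(26):
            sum_inner += alp_rate1[j] * alp_rate2[j]
        inner.append(sum_inner)
        # shifting the ciphertext and recounting is equivalent to rotating the frequency list left
        alp_rate2 = alp_rate2[1:] + alp_rate2[:1]
    # pick the letter with the largest inner product
    return temp[inner.index(max(inner))]

# recover the key
key = ''
for i in range(key_len):
    c = ''
    for j in range(i, cipher_len, key_len):
        c += cipher[j]
    key += getkey(c, i)
print(key)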
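Example: [NCTF2019]Sore
From the given code, this is a Vigenère-like cipher, except that the key characters are chosen from ascii_letters. Neither the key length nor the key content is known.
Step 1: find the key length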
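cipher = open(r'ciphertext.txt', 'r').read()
cipher_len = len(cipher)
# find the key length
for n in range(1, 38):  # candidate key lengths
    CI = 0  # sum of the index of coincidence over all columns
    for i in range(0, n):  # index of coincidence of each column
        c = ''
        for j in range(i, cipher_len, n):  # build the column string
            c += cipher[j]
        L = len(c)
        c_elem = set(c)  # distinct characters in the column
        CIi = 0  # index of coincidence of this column
        for letter in c_elem:
            num = c.count(letter)  # occurrences of this character
            CIi += num/L*(num-1)/(L-1)
        CI += CIi
    CI /= n  # average index of coincidence per column
    if 0.06 < CI < 0.07:
        print('key length:', n, 'index of coincidence:', CI)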
This gives a key length of 23. Next, we recover the key itself.
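For reference, the per-column index of coincidence from step 1 can also be written as a small standalone function. This is only a sketch: the names `index_of_coincidence` and `average_ic` are made up here, and it uses `collections.Counter` in place of repeated `count` calls.

```python
from collections import Counter


def index_of_coincidence(text):
    """Probability that two characters drawn without replacement are equal."""
    n = len(text)
    if n < 2:
        return 0.0  # not enough characters to form a pair
    counts = Counter(text)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))


def average_ic(cipher, key_len):
    """Average the index of coincidence over the key_len columns."""
    cols = [cipher[i::key_len] for i in range(key_len)]
    return sum(index_of_coincidence(col) for col in cols) / key_len


print(index_of_coincidence("aabb"))   # 4 / 12
print(average_ic("abababab", 2))      # columns "aaaa" and "bbbb"
```

For real English-like columns the average lands near 0.065, which is why the search above keeps lengths whose average falls between 0.06 and 0.07.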
Step 2: recover the key
key_len = 23
def getkey(c, cnt):
    # letter frequencies of normal English text, a to z
    alp_rate1 = [0.08167,0.01492,0.02782,0.04253,0.12705,0.02228,0.02015,0.06094,0.06996,0.00153,0.00772,0.04025,0.02406,0.06749,0.07507,0.01929,0.0009,0.05987,0.06327,0.09056,0.02758,0.00978,0.02360,0.0015,0.01974,0.00074]
    # alternative frequency table:
    #alp_rate1 = [0.0856,0.0139,0.0297,0.0378,0.1304,0.0289,0.0199,0.0528,0.0627,0.0013,0.0042,0.0339,0.0249,0.0707,0.0797,0.0199,0.0012,0.0677,0.0607,0.1045,0.0249,0.0092,0.0149,0.0017,0.0199,0.0008]
    len_c = len(c)
    temp = 'abcdefghijklmnopqrstuvwxyz'
    alp_rate2 = []
    for i in temp:  # frequency of each letter in this column of ciphertext
        alp_rate2.append(c.count(i) / len_c)
    inner = []  # inner product for each shift
    for i in range(26):  # try each of the 26 letters as this column's key letter
        sum_inner = 0  # inner product at shift i
        for j in range(26):
            sum_inner += alp_rate1[j] * alp_rate2[j]
        inner.append(sum_inner)
        # shifting the ciphertext and recounting is equivalent to rotating the frequency list left
        alp_rate2 = alp_rate2[1:] + alp_rate2[:1]
    # pick a suitable letter based on the inner products
    cha = 100
    alp = ''
    inner_i = 0
    # disabled alternative: pick the shift whose inner product is closest to 0.065546
    '''
    for i in range(26):
        if abs(inner[i]-0.065546) < cha:
            alp = temp[i]
            inner_i = inner[i]
            cha = abs(inner[i]-0.065546)
    return alp
    '''
    return temp[inner.index(max(inner))]

# recover the key
key = ''
for i in range(key_len):
    c = ''
    for j in range(i, cipher_len, key_len):
        c += cipher[j]
    key += getkey(c, i)
print(key)
The recovered key is:
vlbeunuovbpucklsjxlfpaq
Step 3: recover the plaintext
The decryption code is derived from the encryption code:
# recover the plaintext (key, key_len and cipher come from the previous steps)
from string import ascii_letters
ctoi = lambda x: ascii_letters.index(x)
itoc = lambda x: ascii_letters[x]
# encryption, for reference:
#cipher = ''.join( itoc( ( ctoi(p) + ctoi( key[i % len_key] ) ) % 52 ) for i,p in enumerate(plain) )
m = ''.join( itoc( ( ctoi(p) - ctoi( key[i % key_len] ) ) % 52 ) for i,p in enumerate(cipher) )
print(m)
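To check that the subtraction really inverts the addition, here is a small round-trip sketch over the 52-character ascii_letters alphabet. The function names and the sample key and plaintext are hypothetical; the arithmetic matches the encryption and decryption lines above.

```python
from string import ascii_letters  # 'a'..'z' followed by 'A'..'Z', 52 characters


def ctoi(ch):
    return ascii_letters.index(ch)


def itoc(n):
    return ascii_letters[n]


def encrypt(plain, key):
    """cipher[i] = (plain[i] + key[i mod len(key)]) mod 52"""
    return ''.join(itoc((ctoi(p) + ctoi(key[i % len(key)])) % 52)
                   for i, p in enumerate(plain))


def decrypt(cipher, key):
    """plain[i] = (cipher[i] - key[i mod len(key)]) mod 52"""
    return ''.join(itoc((ctoi(c) - ctoi(key[i % len(key)])) % 52)
                   for i, c in enumerate(cipher))


sample_key = "Key"           # hypothetical key
sample_plain = "HelloWorld"  # hypothetical plaintext
sample_cipher = encrypt(sample_plain, sample_key)
print(sample_cipher, decrypt(sample_cipher, sample_key))
print(encrypt("z", "b"))  # 25 + 1 = 26, which wraps into the uppercase half
```

Note that this sketch only handles letters; any other character makes `ascii_letters.index` raise a ValueError.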
Here is an excerpt of the plaintext:
Key:
vlbeunuovbpucklsjxlfpaq
Plaintext:
ShewouldrtwelkrigHtnext
tomewhenAelifttheSealio
nsbutshehidrtwalkToofar
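Step 4: fix the key based on the plaintext
Clearly, some positions in this key are wrong. So we guess the correct plaintext from the general meaning of the text, then compute the key from it.
The third line of the plaintext is butshehidrtwalkToofar. Drawing on many years of studying English, we can guess that the correct plaintext is butshedidntwalkToofar.
So
Plaintext: h --> d, shifted 4 too far; r --> n, shifted 4 too far.
By the formula
(ciphertext - key) % 52 = plaintext
the ciphertext cannot change, so the corresponding key letters must each be shifted 4 further.
This gives the corrected key:
vlbeunuozbpycklsjxlfpaq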
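The +4 adjustment can be checked with a short sketch. The helper `shift_key` is made up for illustration. The positions 8 and 11 (0-based) are where the wrong letters h and r sit in the row nsbutshehidrtwalkToofar.

```python
from string import ascii_letters


def shift_key(key, positions, offset):
    """Shift the key characters at the given 0-based positions by offset, mod 52."""
    chars = list(key)
    for pos in positions:
        chars[pos] = ascii_letters[(ascii_letters.index(chars[pos]) + offset) % 52]
    return ''.join(chars)


guessed = "vlbeunuovbpucklsjxlfpaq"   # key from the frequency analysis
fixed = shift_key(guessed, [8, 11], 4)
print(fixed)  # vlbeunuozbpycklsjxlfpaq
```

Raising a key letter by 4 lowers the decrypted letter in that column by 4, which turns h into d and r into n.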
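from string import ascii_letters
# corrected plaintext row and the ciphertext at the same 23 positions
m = 'nsbutshedidntwalkToofar'
c = 'IDcyNFBsCjsLvGlDtqztuaH'
ctoi = lambda x: ascii_letters.index(x)
itoc = lambda x: ascii_letters[x]
# key = (ciphertext - plaintext) % 52, position by position
key = ''.join(itoc( ( ctoi(c[i]) - ctoi( m[i] ) ) % 52 ) for i in range(len(c)))
print(key)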
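Want to know a slicker way?
Use the website vigenere-solver, of course. It gives the key length, the key, and the plaintext directly.
I submitted the key and it was wrong. Damn! Only after reading someone else's writeup again did I find that the x has to be uppercase.
Looking at the corrected plaintext: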
ShewouldntwalkrigHtnext
tomewhenwelefttheSealio
nsbutshedidntwalkToofar
There is something out of place: the recovered plaintext is capitalized where it should not be: rigHt, Sea, Too.
Correct the key once more:
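from string import ascii_letters
# first plaintext row with the case fixed, and the ciphertext at the same 23 positions
m = 'Shewouldntwalkrightnext'
c = 'nsfAIHFrMuLynuCApeEstxJ'
ctoi = lambda x: ascii_letters.index(x)
itoc = lambda x: ascii_letters[x]
# key = (ciphertext - plaintext) % 52, position by position
key = ''.join(itoc( ( ctoi(c[i]) - ctoi( m[i] ) ) % 52 ) for i in range(len(c)))
print(key)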
The real, final key: vlbeunuozbpycklsjXlfpaq
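Why does a wrong-case key letter only flip the case of the plaintext? In ascii_letters, x and X are exactly 26 positions apart, and subtracting an extra 26 modulo 52 moves a letter between the lowercase and uppercase halves. Here is a minimal sketch with made-up helper names. The ciphertext character e is the one at position 17 of the row used above.

```python
from string import ascii_letters


def decrypt_char(c, k):
    """Decrypt one character: (cipher - key) mod 52."""
    return ascii_letters[(ascii_letters.index(c) - ascii_letters.index(k)) % 52]


def swap_case_at(key, pos):
    """Flip the case of the key character at a 0-based position."""
    return key[:pos] + key[pos].swapcase() + key[pos + 1:]


# Same ciphertext character, key letter in lowercase versus uppercase.
print(decrypt_char("e", "x"), decrypt_char("e", "X"))  # H h
print(swap_case_at("vlbeunuozbpycklsjxlfpaq", 17))     # vlbeunuozbpycklsjXlfpaq
```

This is also why the frequency analysis cannot tell x from X: it only counts lowercase letters against a 26-letter table, so it recovers the key letter but not its case.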