Breaking a Simple Substitution Cipher with Hill Climbing

Rytter

Last modified: 2022-09-13 23:33:08

Category: 西电实验 (Xidian University labs). Tags: network security, python

First published: 2022-09-13 22:31:21

Article link: https://blog.csdn.net/qq_52380836/article/details/126843146

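I was not assigned this homework myself, but someone asked about it in a group chat, so I went ahead and wrote a solution. I learned quite a lot from it, so I am writing it up as a blog post to share and discuss.

Feel free to visit my site: www.xuanworld.top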

1. Problem Description

Given the ciphertext below, recover the plaintext.

Sy l nlx sr pyyacao l ylwj eiswi upar lulsxrj isr sxrjsxwjr, ia esmm rwctjsxsza sj wmpramh, lxo txmarr jia aqsoaxwa sr pqaceiamnsxu, ia esmm caytra jp famsaqa sj. Sy, px jia pjiac ilxo, ia sr pyyacao rpnajisxu eiswi lyypcor l calrpx ypc lwjsxu sx lwwpcolxwa jp isr sxrjsxwjr, ia esmm lwwabj sj aqax px jia rmsuijarj aqsoaxwa. Jia pcsusx py nhjir sr agbmlsxao sx jisr elh. -Facjclxo Ctrramm

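2. Introduction to Hill Climbing

Hill climbing is somewhat similar to gradient descent in AI: both start from a candidate solution and improve it step by step through small local changes. If you have some AI background you will understand it at a glance.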

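Fitness (similarity)

First, the notion of fitness. English is an alphabetic language: meaning comes from how letters are combined. When the letters are arranged wrongly, the text looks very different from normal English and its fitness is low; the closer the text is to normal English, the higher its fitness. Put simply, fitness is a measure of how "normal" a piece of text looks. In the code in this post it is the sum of the log probabilities of the text's quadgrams (four-letter sequences), so the score is negative and a larger value means more English-like text.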
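
To make the idea concrete, here is a minimal sketch of a quadgram fitness function. The table `TOY_COUNTS` and the helper `make_scorer` are hypothetical names invented for this illustration, and the counts are made up; a real scorer would load a full table such as english_quadgrams.txt.

```python
from math import log10

# Hypothetical toy counts, NOT real corpus statistics: a real table
# (such as english_quadgrams.txt) holds far more entries.
TOY_COUNTS = {"TION": 50, "THER": 40, "THAT": 30, "NTHE": 30, "OTHE": 20}


def make_scorer(counts):
    """Return a function that sums log10 probabilities of a text's quadgrams."""
    total = sum(counts.values())
    log_prob = {gram: log10(n / total) for gram, n in counts.items()}
    floor = log10(0.01 / total)  # penalty for quadgrams missing from the table

    def score(text):
        grams = (text[i:i + 4] for i in range(len(text) - 3))
        return sum(log_prob.get(g, floor) for g in grams)

    return score


score = make_scorer(TOY_COUNTS)
english_like = score("ONTHEOTHER")  # contains NTHE, OTHE and THER
scrambled = score("XQZKVJWPYG")     # every quadgram falls back to the floor
print(english_like, scrambled)
```

Both scores are negative; the English-like string scores higher because three of its quadgrams appear in the table, while every quadgram of the scrambled string receives the floor penalty.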

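Basic principle

The core idea is still trial and error, close to brute force. The difference is that it does not enumerate all 26! (about 4 × 10^26) possible keys; it makes small random changes to one key and keeps only the changes that raise the fitness.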

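1. Start from an initial mapping ABC…Z -> ABC…Z.
2. Pick two letters in the current mapping and swap them. Call the new mapping f_1 and use it to substitute the letters of the ciphertext.
3. Compute the fitness of the substituted text; call the result score_1.
4. Repeat step 2 to obtain f_2 and score_2, compare the two scores, and keep whichever mapping scores higher.
5. Keep repeating the four steps above, always swapping from the best mapping found so far. The code in section 3 stops once 1000 consecutive swaps bring no improvement and then outputs the mapping. This is a local optimum and is not guaranteed to be the correct key, which is why the code restarts from a random key several times and keeps the best result.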
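
The steps above can be sketched as a small generic loop. This is an illustration under stated assumptions, not the code used for the result in this post: the names `decrypt`, `hill_climb`, `max_stale` and `toy_score` are invented here, and the fitness is a stand-in that compares against a known plaintext so the sketch runs without a data file. A real attack has no such target and would pass a quadgram scorer instead.

```python
import random
import string

ALPHABET = string.ascii_uppercase


def decrypt(text, key):
    """Replace cipher letter ALPHABET[i] with key[i]; other characters pass through."""
    return text.translate(str.maketrans(ALPHABET, "".join(key)))


def hill_climb(ciphertext, score_fn, max_stale=1000, rng=random):
    """Swap two key letters at random and keep the swap only if the score improves."""
    key = list(ALPHABET)                 # start from the identity mapping
    best = score_fn(decrypt(ciphertext, key))
    stale = 0                            # consecutive swaps without improvement
    while stale < max_stale:
        a, b = rng.sample(range(26), 2)  # two distinct positions
        child = key[:]
        child[a], child[b] = child[b], child[a]
        s = score_fn(decrypt(ciphertext, child))
        if s > best:                     # keep the better mapping
            key, best, stale = child, s, 0
        else:
            stale += 1
    return "".join(key), best


# Stand-in fitness (assumption for this demo only): count the positions
# that match a known plaintext.
TARGET = "THEORIGINOFMYTHSISEXPLAINEDINTHISWAY"
SECRET = list(ALPHABET)
random.Random(1).shuffle(SECRET)
CIPHER = TARGET.translate(str.maketrans(ALPHABET, "".join(SECRET)))


def toy_score(candidate):
    return sum(c == t for c, t in zip(candidate, TARGET))


start = toy_score(decrypt(CIPHER, ALPHABET))
key, final = hill_climb(CIPHER, toy_score, rng=random.Random(0))
print(start, final, decrypt(CIPHER, key))
```

Because a swap is accepted only when the score strictly improves, the score never decreases during a run. Stopping after `max_stale` failed swaps in a row, rather than after a fixed number of iterations, mirrors the counter used in the source code in section 3.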

3. Source Code

import random
from ngram_score import ngram_score


def get_keydict(key_dict, current_key, original_alphabet):
    for i in range(len(current_key)):
        key_dict[current_key[i]] = original_alphabet[i]  # map cipher letter current_key[i] to plaintext letter original_alphabet[i]
    return key_dict


def exchange(mydict, message):
    # Substitute every character found in mydict; leave all others unchanged
    return "".join(mydict.get(ch, ch) for ch in message)


S = "sy l nlx sr pyyacao l ylwj eiswi upar lulsxrj isr sxrjsxwjr, ia esmm rwctjsxsza sj wmpramh, lxo txmarr jia aqsoaxwa sr pqaceiamnsxu, ia esmm caytra jp famsaqa sj. Sy, px jia pjiac ilxo, ia sr pyyacao rpnajisxu eiswi lyypcor l calrpx ypc lwjsxu sx lwwpcolxwa jp isr sxrjsxwjr, ia esmm lwwabj sj aqax px jia rmsuijarj aqsoaxwa. Jia pcsusx py nhjir sr agbmlsxao sx jisr elh. -Facjclxo Ctrramm"
S_new = S.replace(" ", "")
S_new = S_new.replace(",", "")
S_new = S_new.replace("-", "")
S_new = S_new.replace(".", "")
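# Parameter initialization
m_message = S_new.upper()  # ciphertext, letters only, in upper case (used for scoring)
current_key = list('ABCDEFGHIJKLMNOPQRSTUVWXYZ')  # current candidate key (a permutation of the alphabet)
original_alphabet = list('ABCDEFGHIJKLMNOPQRSTUVWXYZ')
key_dict = dict()  # dictionary mapping each cipher letter onto the alphabet above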
fitness = ngram_score('english_quadgrams.txt')

last_score = -2 ** 31
current_max_score = -2 ** 31
generation = 0  # number of random restarts performed so far

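while generation < 10:
    # At most 10 random restarts; a readable result usually appears within 10
    generation = generation + 1

    # Shuffle the key to get a random starting point for this restart
    random.shuffle(current_key)
    key_dict = get_keydict(key_dict, current_key, original_alphabet)  # build the cipher-to-plain mapping
    last_score = fitness.score(exchange(key_dict, m_message))  # compute the fitness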

    count = 0  # consecutive swaps that did not improve the score
    while count < 1000:
        # Swap two distinct positions of the key and compare the scores
        a, b = random.sample(range(26), 2)
        child_current_key = current_key[:]
        child_current_key[a], child_current_key[b] = child_current_key[b], child_current_key[a]

        child_key_dict = get_keydict(dict(), child_current_key, original_alphabet)
        score = fitness.score(exchange(child_key_dict, m_message))
        if score > last_score:
            # The new key scores higher: keep it and reset the counter
            last_score = score
            current_key = child_current_key
            count = 0
        else:
            count = count + 1

    # Print the result whenever this restart beats the best score so far
    if last_score > current_max_score:
        current_max_score = last_score
        maxkey = current_key
        key_dict = get_keydict(key_dict, current_key, original_alphabet)
        print("Generation", generation, "result:")
        print(exchange(key_dict, S.upper()).lower())

4. Output

if a man is offered a fact which goes against his instincts, he will scrutinike it closely, and unless the evidence is overwhelming, he will refuse to believe it. if, on the other hand, he is offered something which affords a reason for acting in accordance to his instincts, he will accept it even on the slightest evidence. the origin of myths is explained in this way. -bertrand russell

Note that "scrutinike" in this output should read "scrutinize". The cipher letter z occurs only once in the ciphertext and the plaintext contains no k, so the quadgram score has very little evidence for choosing between the two, and this run settled on k.

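5. Notes

This post uses open-source code from an external site, namely the code that computes the fitness score. The site is: http://practicalcryptography.com/cryptanalysis/text-characterisation/quadgrams/

Download the required files and put them in the same folder as the script; then it can be run directly. The original scoring code was written in Python 2 and has bugs when run under Python 3, so I ported it to Python 3 and include it below.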

from math import log10


class ngram_score(object):
    def __init__(self, ngramfile, sep=' '):
        ''' load a file containing ngrams and counts, calculate log probabilities '''
        self.ngrams = {}
        with open(ngramfile) as f:  # close the file once loading is done
            for line in f:
                key, count = line.split(sep)
                self.ngrams[key] = int(count)
        self.L = len(key)
        self.N = sum(self.ngrams.values())
        # calculate log probabilities
        for key in self.ngrams.keys():
            self.ngrams[key] = log10(float(self.ngrams[key]) / self.N)
        self.floor = log10(0.01 / self.N)

    def score(self, text):
        ''' compute the score of text '''
        score = 0
        ngrams = self.ngrams.__getitem__
        for i in range(len(text) - self.L + 1):
            if text[i:i + self.L] in self.ngrams:
                score += ngrams(text[i:i + self.L])
            else:
                score += self.floor
        return score