python - Finding words in a non-spaced paragraph?

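I am using Python to build a Caesar cipher decrypter. It works and decrypts the already encrypted word. However, it shows all of its brute-force decryption attempts. For example, "HELLO" encrypted with a key of 3 is KHOOR, and the decrypter's output is "KHOORJGNNQIFMMPHELLOGDKKNFCJJMEBIILDAHHKCZGGJBYFFIAXEEHZWDDGYVCCFXUBBEWTAADVSZZCURYYBTQXXASPWWZROVVYQNUUXPMTTWOLSSVNKRRUMJQQTLIPPS". I'm wondering if there is a way to use a dictionary with Python to search for an English word in this output, or whether I can improve my code to only print out known English words. Apologies if this has been asked before; I searched around and couldn't seem to find the right thing.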

Solution:

englishWords = ['HELLO', 'ME', 'AXE', 'FOO', 'BAR', 'BAZ'] #and many more

cypher = 'KHOORJGNNQIFMMPHELLOGDKKNFCJJMEBIILDAHHKCZGGJBYFFIAXEEHZWDDGYVCCFXUBBEWTAADVSZZCURYYBTQXXASPWWZROVVYQNUUXPMTTWOLSSVNKRRUMJQQTLIPPS'

# Report every known word that occurs anywhere in the brute-force output.
for word in englishWords:
    if word not in cypher: continue
    print('Found "{}"'.format(word))

This produces:

Found "HELLO"
Found "ME"
Found "AXE"
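Note that "ME" and "AXE" are chance hits inside wrong candidates. One way to cut these down, sketched here under the assumption that the ciphertext is a single word or a space-separated phrase, is to test each of the 26 candidates separately instead of searching the concatenated output. The names `ENGLISH_WORDS`, `shift` and `known_words` are illustrative, and the word list is a tiny stand-in for a real dictionary file.

```python
# Hypothetical word list; a real one would be loaded from a dictionary file.
ENGLISH_WORDS = {'HELLO', 'WORLD', 'PYTHON'}

def shift(text, key):
    """Shift every A-Z letter forward by key positions, wrapping around."""
    return ''.join(
        chr((ord(c) - ord('A') + key) % 26 + ord('A')) if 'A' <= c <= 'Z' else c
        for c in text
    )

def known_words(ciphertext, words=ENGLISH_WORDS):
    """Return (key, candidate) pairs whose space-separated tokens are all known words."""
    hits = []
    for key in range(26):
        candidate = shift(ciphertext, key)
        if all(token in words for token in candidate.split()):
            hits.append((key, candidate))
    return hits

print(known_words('KHOOR'))  # [(23, 'HELLO')]
```

The key reported is the shift applied to the ciphertext, so 23 here undoes an encryption key of 3.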

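If this is about checking whether the key used to decipher a text is the correct one, i.e. whether the result could be English words, I wouldn't look for words at all. Instead, I would try to find clusters of letters in the result that do not fit the patterns of English syllables.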

Here is a very naive implementation of a letter-frequency scan:

#! /usr/bin/python3

# Text to crack (it is ciphertext; the name "plain" is kept from the original post).
plain = 'Z RD NFEUVIZEX ZW KYVIV ZJ R NRP KF LJV R UZTKZFERIP NZKY GPKYFE KF JVRITY WFI RE VEXCZJY NFIU ZE KYZJ FLKGLK FI TRE Z ZDGIFMV DP TFUV KF FECP GIZEK FLK BEFNE VEXCZJY NFIUJ. RGFCFXZVJ ZW KYZJ YRJ SVVE RJBVU SVWFIV, Z JVRITYVU RIFLEU REU TFLCUE\'K JVVD KF WZEU KYV IZXYK KYZEX.'.upper()

# Approximate frequencies, in percent, of the five most common English letters.
freqs = {'E': 12.7, 'T': 9.1, 'A': 8.2, 'O': 7.5, 'I': 7.0}

def cypher(text, key):
    # Shift every A-Z letter by key positions; leave all other characters unchanged.
    return ''.join(chr((ord(c) - ord('A') + key) % 26 + ord('A')) if 'A' <= c <= 'Z' else c for c in text)

def crack(text):
    length = len(text)
    best = 100000
    bestMatch = ''
    for key in range(26):
        cand = cypher(text, key)
        quality = 0
        for l, c in {letter: sum(1 for c in cand if c == letter) for letter in 'ETAOI'}.items():
            # Convert the count to a percentage so it is on the same scale as freqs.
            quality += (100 * c / length - freqs[l]) ** 2
        # A lower score means the candidate is closer to English letter frequencies.
        if quality < best:
            best = quality
            bestMatch = cand
    return bestMatch

print(crack(plain))
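The scan above only looks at five letters and returns just the text. A variant that scores all 26 letters with a chi-squared statistic and also reports the key might look like the sketch below. It is not part of the original answer: the frequency table holds commonly quoted approximate values, and `ENGLISH_FREQ`, `chi_squared` and `crack_with_key` are names made up for illustration.

```python
from collections import Counter

# Approximate English letter frequencies in percent (commonly quoted values).
ENGLISH_FREQ = {
    'E': 12.70, 'T': 9.06, 'A': 8.17, 'O': 7.51, 'I': 6.97, 'N': 6.75,
    'S': 6.33, 'H': 6.09, 'R': 5.99, 'D': 4.25, 'L': 4.03, 'C': 2.78,
    'U': 2.76, 'M': 2.41, 'W': 2.36, 'F': 2.23, 'G': 2.02, 'Y': 1.97,
    'P': 1.93, 'B': 1.49, 'V': 0.98, 'K': 0.77, 'J': 0.15, 'X': 0.15,
    'Q': 0.10, 'Z': 0.07,
}

def shift(text, key):
    """Shift every A-Z letter forward by key positions, wrapping around."""
    return ''.join(
        chr((ord(c) - ord('A') + key) % 26 + ord('A')) if 'A' <= c <= 'Z' else c
        for c in text
    )

def chi_squared(text):
    """Distance between the letter counts in text and the English expectation."""
    letters = [c for c in text if 'A' <= c <= 'Z']
    if not letters:
        return float('inf')
    counts = Counter(letters)
    total = len(letters)
    score = 0.0
    for letter, percent in ENGLISH_FREQ.items():
        expected = total * percent / 100
        score += (counts[letter] - expected) ** 2 / expected
    return score

def crack_with_key(text):
    """Return (key, candidate) for the shift whose result looks most like English."""
    key = min(range(26), key=lambda k: chi_squared(shift(text, k)))
    return key, shift(text, key)

secret = shift('SEARCHING FOR ENGLISH WORDS IN A BLOCK OF UNDIFFERENTIATED TEXT', 3)
print(crack_with_key(secret))
```

Only letters are counted here, so spaces and punctuation do not dilute the percentages. Like any frequency test, it becomes unreliable on very short inputs.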

Here are three examples:

Input: TQ ESTD TD LMZFE DPPTYR, TQ ESP VPJ QZC OPNJASPCTYR L EPIE TD ESP NZCCPNE ZYP, T.P. TQ ESP CPDFWE XTRSE MP PYRWTDS HZCOD, T HZFWOY'E WZZV QZC HZCOD, MFE ECJ EZ QTYO NWFDEPCD TYDTOP ESP CPDFWE HSTNS OZ YZE NZXAWJ HTES ESP PYRWTDS DJWWLMWP LALCEFD.

Output: IF THIS IS ABOUT SEEING, IF THE KEY FOR DECYPHERING A TEXT IS THE CORRECT ONE, I.E. IF THE RESULT MIGHT BE ENGLISH WORDS, I WOULDN'T LOOK FOR WORDS, BUT TRY TO FIND CLUSTERS INSIDE THE RESULT WHICH DO NOT COMPLY WITH THE ENGLISH SYLLABLE APARTUS.

Input: KWSJUZAFY XGJ WFYDAKZ OGJVK AF S TDGUC GX MFVAXXWJWFLASLWV LWPL DACW LZSL AK UWJLSAFDQ HGKKATDW, SFV VGAFY AL WXXAUAWFLDQ AK S YWFMAFWDQ AFLWJWKLAFY HJGTDWE. TML AL'K HJGTDWESLAU XGJ DGLK GX JWSKGFK, BMKL GFW GX OZAUZ AK LZSL QGMJ WFUJQHLWV LWPL ESQ AFUDMVW LWPL LZSL JSFVGEDQ ZSHHWFK LG XGJE SF WFYDAKZ OGJV UGEHDWLWDQ TQ UZSFUW.

Output: SEARCHING FOR ENGLISH WORDS IN A BLOCK OF UNDIFFERENTIATED TEXT LIKE THAT IS CERTAINLY POSSIBLE, AND DOING IT EFFICIENTLY IS A GENUINELY INTERESTING PROBLEM. BUT IT'S PROBLEMATIC FOR LOTS OF REASONS, JUST ONE OF WHICH IS THAT YOUR ENCRYPTED TEXT MAY INCLUDE TEXT THAT RANDOMLY HAPPENS TO FORM AN ENGLISH WORD COMPLETELY BY CHANCE.

Input: QZC PILXAWP, UFDE ESP EPIE JZF'GP AZDEPO SPCP TYNWFOPD SPWW, TQ, WTA, WZR, LDA LYO ACZMLMWJ ZESPCD. JZF NZFWO ECTX OZHY ESP LWEPCYLETGPD MJ ZYWJ DPLCNSTYR QZC HZCOD ESP DLXP WPYRES LD JZFC ELCRPE HZCO, LYO ZYWJ QZC HZCOD HTES ESP DLXP WPEEPC ALEEPCY. MFE ESLE'D CPLWWJ BFTEP L WZE ZQ HZCV EZ RPE LCZFYO ESP QLNE ESLE JZFC TYTETLW ZFEAFE SLD L WZE ZQ FDPWPDD OLEL TY TE.

Output: FOR EXAMPLE, JUST THE TEXT YOU'VE POSTED HERE INCLUDES HELL, IF, LIP, LOG, ASP AND PROBABLY OTHERS. YOU COULD TRIM DOWN THE ALTERNATIVES BY ONLY SEARCHING FOR WORDS THE SAME LENGTH AS YOUR TARGET WORD, AND ONLY FOR WORDS WITH THE SAME LETTER PATTERN. BUT THAT'S REALLY QUITE A LOT OF WORK TO GET AROUND THE FACT THAT YOUR INITIAL OUTPUT HAS A LOT OF USELESS DATA IN IT.
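The letter-pattern idea mentioned in that last example can be sketched as follows. A Caesar shift keeps repeated letters repeated, so a dictionary word can only match a cipher word if both have the same pattern of repeats. The helper names and the small word list are made up for illustration.

```python
def letter_pattern(word):
    """Map each letter to the index of its first appearance: HELLO -> (0, 1, 2, 2, 3)."""
    first_seen = {}
    return tuple(first_seen.setdefault(c, len(first_seen)) for c in word)

def candidate_words(cipher_word, words):
    """Keep only the words with the same length and letter pattern as cipher_word."""
    pattern = letter_pattern(cipher_word)
    return [w for w in words if letter_pattern(w) == pattern]

def caesar_key(cipher_word, word):
    """Return the single shift that turns word into cipher_word, or None if there is none."""
    if len(cipher_word) != len(word):
        return None
    shifts = {(ord(c) - ord(p)) % 26 for c, p in zip(cipher_word, word)}
    return shifts.pop() if len(shifts) == 1 else None

WORDS = ['HELLO', 'WORLD', 'JELLY', 'AXE', 'SUNNY']  # hypothetical word list

matches = candidate_words('KHOOR', WORDS)
print(matches)                                       # ['HELLO', 'JELLY', 'SUNNY']
print([(w, caesar_key('KHOOR', w)) for w in matches])
```

The pattern filter alone still leaves JELLY and SUNNY; checking for one consistent shift narrows it to HELLO with a key of 3.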

Here is a final example without spacing or punctuation:

Input: ZRDLJZEXGPKYFEKFSLZCURTRVJRITZGYVIUVTIPGKVIZKNFIBJREUUVTIPGKJKYVRCIVRUPVETIPGKVUNFIUYFNVMVIZKJYFNJRCCZKJ SILKVWFITVUVTIPGKZFERKKVDGKJWFIVORDGCVYVCCFVETIPGKVUNZKYRBVPFW3ZJBYFFI

Output: IAMUSINGPYTHONTOBUILDACAESARCIPHERDECRYPTERITWORKSANDDECRYPTSTHEALREADYENCRYPTEDWORDHOWEVERITSHOWSALLITS BRUTEFORCEDECRYPTIONATTEMPTSFOREXAMPLEHELLOENCRYPTEDWITHAKEYOF3ISKHOOR
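Once the right shift is found, an unspaced result like this one can be split back into words with a small dynamic-programming pass over a word list. This is only a sketch: the `segment` function and the tiny `WORDS` set are assumptions for illustration, and it returns the first split it finds, which is not necessarily the most natural one.

```python
def segment(text, words):
    """Split text into words from the given set, or return None if no full split exists."""
    max_len = max(map(len, words), default=0)
    # best[i] holds one segmentation of text[:i], or None if none was found.
    best = [None] * (len(text) + 1)
    best[0] = []
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if best[start] is not None and piece in words:
                best[end] = best[start] + [piece]
                break
    return best[len(text)]

# Hypothetical word list; a real one would be loaded from a dictionary file.
WORDS = {'I', 'AM', 'USING', 'PYTHON', 'TO', 'BUILD', 'A', 'CAESAR', 'CIPHER'}

print(segment('IAMUSINGPYTHONTOBUILDACAESARCIPHER', WORDS))
# ['I', 'AM', 'USING', 'PYTHON', 'TO', 'BUILD', 'A', 'CAESAR', 'CIPHER']
```

A candidate that cannot be segmented at all is a strong hint that the key is wrong, which gives another way to filter the brute-force output.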

Tags: python, dictionary, encryption

Source: https://codeday.me/bug/20190612/1227021.html
