Python Challenge Walkthrough: Levels 0-4

Skiery

Article link: https://blog.csdn.net/weixin_39228490/article/details/87101458

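PythonChallenge series: solution steps @Skiery

I recently started working on deep-learning image processing and wanted a few small Python exercises to warm up. A teacher and some friends happened to recommend this one, so I am working through it bit by bit for fun.
Python Challenge source link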

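Python Challenge 0

challenge 0

[Image: challenge 0]
The page gives the hint: try to change the URL address.
The address ends in 0.html, and going by the picture the task is presumably to compute $2^{38}$ and replace the 0 with the result.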

solution 0

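The routine approach: import the math library and let pow() do the work. Simple and clear:

import math
print(int(math.pow(2, 38)))   # math.pow returns a float, so convert it to int

Running this prints 274877906944. Replace the 0 in the URL with it to reach the next level.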
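As a quick cross-check, the same value can be computed with Python's integer exponent operator, which never goes through a float. The sketch below also builds the next address; the base path is an assumption taken from the page URLs quoted later in this post.

```python
# Integer exponentiation keeps the result exact (math.pow returns a float).
answer = 2 ** 38

# Assumption: level pages live under this path, as in the URLs quoted later in this post.
base = "http://www.pythonchallenge.com/pc/def/"
next_url = f"{base}{answer}.html"

print(answer)
print(next_url)
```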

Python Challenge 1

challenge 1

[Image: challenge 1]
Next come a hint and a string:
everybody thinks twice before solving this.

g fmnc wms bgblr rpylqjyrc gr zw fylb. rfyrq ufyr amknsrcpq ypc dmp. bmgle gr gl zw fylb gq glcddgagclr ylb rfyr'q ufw rfgq rcvr gq qm jmle. sqgle qrpgle.kyicrpylq() gq pcamkkclbcb. lmu ynnjw ml rfc spj.

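The picture shows three letters, each mapped to a later letter in alphabetical order (the code below shifts by two), so replacing every letter of the string the same way should do it. What remains is how to do that in Python.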

Solution 1

Method 1:

Start by entering the string:

code = "g fmnc wms bgblr rpylqjyrc gr zw fylb. rfyrq ufyr amknsrcpq ypc dmp. bmgle gr gl zw fylb gq glcddgagclr ylb rfyr'q ufw rfgq rcvr gq qm jmle. sqgle qrpgle.kyicrpylq() gq pcamkkclbcb. lmu ynnjw ml rfc spj."

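Next, shift every letter forward along the alphabet, rebuild the string, and print the result.
Tip: two handy functions are worth introducing. ord() converts a character to its decimal ASCII code, and chr() does the reverse.

print(''.join([chr((ord(s) + 2 - ord('a')) % 26 + ord('a')) if 'a' <= s <= 'z' else s for s in code]))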

Running it gives:

i hope you didnt translate it by hand. thats what computers are for. doing it in by hand is inefficient and that's why this text is so long. using string.maketrans() is recommended. now apply on the url.

Method 2:

Build a translation table with str.maketrans():

table = str.maketrans(
    "abcdefghijklmnopqrstuvwxyz", "cdefghijklmnopqrstuvwxyzab"
)

result = code.translate(table)

Method 3:

Try this one yourself: use zip(a, b) to build your own dictionary in place of maketrans.
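One possible way to do method 3 is sketched below. The names letters, shifted, mapping and decode are made up for this illustration; the idea is only that zip pairs each letter with its replacement and dict turns those pairs into a lookup table.

```python
import string

# Shifted alphabet: every lowercase letter moved two places forward, wrapping at 'z'.
letters = string.ascii_lowercase
shifted = letters[2:] + letters[:2]

# zip pairs each letter with its replacement; dict turns the pairs into a lookup table.
mapping = dict(zip(letters, shifted))

def decode(text):
    # Characters missing from the mapping (spaces, punctuation) pass through unchanged.
    return "".join(mapping.get(ch, ch) for ch in text)

print(decode("g fmnc wms bgblr rpylqjyrc gr zw fylb."))
print(decode("map"))
```

Unlike str.translate, this version looks characters up one by one in a plain dictionary, which makes the mapping easy to inspect or print while debugging.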

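Taking method 2 as an example, feed map (the page name in the current URL) through the same table to get a new result:

result = "map".translate(table)

This gives the next URL clue, ocr. Enter it to move on to the next level.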

Python Challenge 2

challenge 2

The level gives this hint: recognize the characters. maybe they are in the book,
but MAYBE they are in the page source.

Solution 2

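The emphasis says it all, so open the page source (I use Chrome; press F12 for the developer tools, or Ctrl+U to view the HTML source). Ta-da: sure enough, the clue sits in a comment at the end of the page: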

<!--
find rare characters in the mess below:
-->
<!--
%%$@_$^__#)^)&!_+]!*@&^}@.........      # truncated for space; open the page to see it, it is a very long run of symbols
-->

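The hint says to find the rare characters in the mess below. Without much of a plan, I first pasted it into Python and printed it (it contains line breaks, so assign it with triple double quotes) to get it as a string. Note, though, that the hint and the symbol sequence sit in separate comment blocks, so the sequence can be scraped straight from the page with urllib.request and re, as follows: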

import urllib.request
html = urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/ocr.html").read().decode()

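Then pull the string out of the page:

import re
# findall returns every comment block; the last one holds the symbol mess
text = re.findall("<!--(.*?)-->", html, re.DOTALL)[-1]

With the target string in text, count the characters with a dictionary:

count = {}
for ch in text:
    count[ch] = count.get(ch, 0) + 1
print(count)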

The resulting count:

{'\n': 1221, '%': 6104, '$': 6046, '@': 6157, '_': 6112, '^': 6030, '#': 6115,
 ')': 6186, '&': 6043, '!': 6079, '+': 6066, ']': 6152, '*': 6034, '}': 6105,
 '[': 6108, '(': 6154, '{': 6046, 'e': 1, 'q': 1, 'u': 1, 'a': 1, 'l': 1, 'i': 1,
 't': 1, 'y': 1}

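Looking it over, the rarely occurring characters are the letters 'e', 'q', 'u', 'a', 'l', 'i', 't', 'y'. Hmm, that spells exactly the word equality. Swap it in for ocr to see if it works. Haha, that gets us into level 3.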
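The counting step can also be written with collections.Counter, which additionally makes it easy to print the rare characters in their original order instead of reading them off the dictionary. This is only a sketch: rare_characters and its max_count parameter are hypothetical names, and sample is a small made-up stand-in for the real page data.

```python
from collections import Counter

def rare_characters(text, max_count=1):
    # Count every character, then keep those seen at most max_count times,
    # preserving the order in which they appear in the text.
    counts = Counter(text)
    return "".join(ch for ch in text if counts[ch] <= max_count)

# Toy stand-in for the page data (hypothetical sample, not the real mess).
sample = "%%$@e$^%q@@u$$a^^l%i$t@y^"
print(rare_characters(sample))
```

Keeping the original order matters here because the answer is a word, not just a set of letters.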

Python Challenge 3

challenge 3

The hint is this sentence: One small letter, surrounded by EXACTLY three big bodyguards on each of its sides.

Solution 3

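Hmm, candles big and small: this looks like regular expressions, which I am a little afraid of, haha. Three uppercase letters, one lowercase, three more uppercase, and nothing else. Most likely something needs scraping from the page source again, and sure enough, a block of gibberish is wrapped in the same format as the previous level. Scrape it first:

import urllib.request
import re
html = urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/equality.html").read().decode()
text = re.findall("<!--(.*?)-->", html, re.DOTALL)[-1]   # the last comment block holds the data

OK, now use a regular expression to pick out each lowercase letter that has three uppercase letters before it and three after it. After a quick search on Baidu I settled on the following regular expression:

result = "".join(re.findall("[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+", text))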

Printing result gives: linkedlist
Change the URL and move on to the next level.
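A variant worth knowing, not the pattern used above, replaces the [^A-Z]+ guards with lookarounds. The pattern above needs at least one non-uppercase character on both sides and consumes it, so it cannot match at the very start or end of the text. The sketch below uses made-up sample strings; PATTERN and guarded_letters are hypothetical names.

```python
import re

# Lookarounds assert "no uppercase letter here" without consuming a character,
# so a match at the very start or end of the text is still found, and a single
# separator character can sit between two neighbouring matches.
PATTERN = re.compile(r"(?<![A-Z])[A-Z]{3}([a-z])[A-Z]{3}(?![A-Z])")

def guarded_letters(text):
    return "".join(PATTERN.findall(text))

# Toy sample (hypothetical): 'a' and 'b' are guarded by exactly three capitals,
# 'c' has four on its left so it must be skipped.
sample = "XYZaQRS--TUVbWXY-ABCDcEFG"
print(guarded_letters(sample))
```

Like the original pattern, findall here returns non-overlapping matches, so two guarded letters that share the same three capitals would not both be reported.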

Python Challenge 4

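challenge 4 preliminary

Once in, the page shows nothing but a page name, linkedlist.php. The pages have switched to PHP, so visit it as told:
[Image: challenge 4]
Looking around, there is no hint at all, but the picture is a hyperlink. Click through and it turns out to be a puzzle about editing the URL: the page says "and the next nothing is 44827", and the link is http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345. Change the value and the next one pops up... Clearly typing these in by hand to the end is out of the question. As in the earlier levels, keep scraping the number from the page, update the nothing parameter in the URL, and loop.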

Solution 4

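Approach 1

As in the earlier levels, scrape the page first:

import urllib.request
import re
add_temp = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing="
add_s = '12345'
while add_s:
    html = urllib.request.urlopen(add_temp + add_s).read().decode()
    add_s = re.findall("[ ]+([0-9]*)", html)[-1]   # digits right after the last run of spaces ('' if there are none)
    print(add_s)      # keeps printing the next value until the page no longer ends in space-plus-digits; just wait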

Once it stops, print(html) reveals how cheeky the author is...
Yes. Divide by two and keep going. So divide the last number printed, '16044', by 2 and start looping again...

add_s = str(int(16044/2))    # convert the quotient to int before turning it into a string, or use integer division (//)
while add_s:
    html = urllib.request.urlopen(add_temp + add_s).read().decode()
    add_s = re.findall("[ ]+([0-9]*)", html)[-1]
    print(add_s)

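Hmm, this run ends with an error... So take a look at the last page before the error (nothing=66831). Bingo: the flag is peak.html. Happily change the URL, and we are in the next level.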

Approach 2

The above carried on with the approach from the earlier levels. After reading around online, matching with re.compile seemed a better fit, so I looked into it. The code:

import urllib.request
import re
add_temp = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing="
add_s = '12345'
pattern = re.compile("the next nothing is ([0-9]+)")   # set the matching rule
while add_s:
    html = urllib.request.urlopen(add_temp + add_s).read().decode()
    print(html)
    match = pattern.search(html)    # search the scraped html with this rule
    try:
        add_s = match.group(1)      # group(0) is the whole match; group(1) is the parenthesised part we care about
    except AttributeError:          # search() returned None: this page names no next value
        print("Check the final html")
        break
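Both approaches stop at the "Divide by two" page and need a manual restart. The sketch below folds that step into the loop and takes the fetch function as a parameter, so the logic can be tried offline. The names follow_chain, fetch_page and max_steps are made up for this illustration, and the pages in fake_pages are invented, not the real chain.

```python
import re
import urllib.request

BASE = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing="
NEXT = re.compile(r"the next nothing is ([0-9]+)")

def fetch_page(nothing):
    # Real network fetch; swapped out for a dict lookup in the offline demo below.
    with urllib.request.urlopen(BASE + nothing) as resp:
        return resp.read().decode()

def follow_chain(start, fetch=fetch_page, max_steps=1000):
    # Returns the first page text that neither names a next value nor asks to halve.
    current = start
    for _ in range(max_steps):
        html = fetch(current)
        match = NEXT.search(html)
        if match:
            current = match.group(1)
        elif "Divide by two" in html:
            current = str(int(current) // 2)
        else:
            return html
    raise RuntimeError("chain did not end within max_steps")

# Offline demo with made-up pages (hypothetical values, not the real chain).
fake_pages = {
    "12345": "and the next nothing is 100",
    "100": "Yes. Divide by two and keep going.",
    "50": "and the next nothing is 7",
    "7": "peak.html",
}
print(follow_chain("12345", fetch=fake_pages.__getitem__))
```

Calling follow_chain("12345") with the default fetch would walk the live site instead; the max_steps cap is there only so that a loop in the chain cannot run forever.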