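Python Challenge is a web-based puzzle game. In each level you use the hints provided and write a program to find the answer, which gives you the URL of the next level.
Level 0
URL:
http://www.pythonchallenge.com/pc/def/0.html
Analysis:
The page title shown in the browser tab is "warming up", so this level is a warm-up.
The hint given is:
Hint: try to change the URL address.
The value of the expression shown in the image (2 to the power of 38) gives the URL of the next level.
Solution: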
>>> print 2**38
274877906944
>>>
So the URL of the next level is: http://www.pythonchallenge.com/pc/def/274877906944.html
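The session above is Python 2. As a small sketch in Python 3 (where print is a function and integers have arbitrary precision), the same calculation can also build the URL directly. The names answer and next_url are only illustrative.

```python
# Python 3: integers are arbitrary precision, so 2 ** 38 is exact
answer = 2 ** 38

# Substitute the number for "0" in the level-0 URL
next_url = "http://www.pythonchallenge.com/pc/def/%d.html" % answer
print(next_url)
```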
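Level 1
URL:
http://www.pythonchallenge.com/pc/def/map.html
Analysis:
The page title in the browser tab is "what about making trans".
The hint given is:
Hint1: everybody thinks twice before solving this.
The image shows K->M, O->Q, E->G: each letter maps to the letter two positions after it, which is clearly a Caesar cipher with a shift of 2. Below the hint there is a block of garbled text. Decode it using the mapping shown in the image.
Solution: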
import string
original = "g fmnc wms bgblr rpylqjyrc gr zw fylb. rfyrq ufyr amknsrcpq ypc " \
           "dmp. bmgle gr gl zw fylb gq glcddgagclr ylb rfyr'q ufw rfgq " \
           "rcvr gq qm jmle. sqgle qrpgle.kyicrpylq() gq pcamkkclbcb. lmu " \
           "ynnjw ml rfc spj."
table = string.maketrans(
    "abcdefghijklmnopqrstuvwxyz", "cdefghijklmnopqrstuvwxyzab"
)
print original.translate(table)
>>>
i hope you didnt translate it by hand. thats what computers are for. doing it in by hand is inefficient and that's why this text is so long. using string.maketrans() is recommended. now apply on the url.
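As the page title suggests, the author clearly wants solvers to use string.maketrans(). The official solutions include this approach and are very detailed. The main thing to study here is Python's string module.
Applying the same translation to the "map" part of the URL gives "ocr", so the next level is at http://www.pythonchallenge.com/pc/def/ocr.html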
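string.maketrans() exists only in Python 2. As a hedged sketch of the same shift-by-2 translation in Python 3, str.maketrans() can build the table; the helper name decode is hypothetical and not part of the original solution.

```python
import string

# Build a shift-by-2 table: a->c, b->d, ..., y->a, z->b
lower = string.ascii_lowercase
table = str.maketrans(lower, lower[2:] + lower[:2])

def decode(text):
    """Shift each lowercase letter two places forward; leave other characters alone."""
    return text.translate(table)

print(decode("g fmnc wms bgblr rpylqjyrc gr zw fylb."))
print(decode("map"))  # the part of the URL the decoded message says to translate
```

Characters that are not in the table (spaces, punctuation, parentheses) pass through unchanged, which is why the decoded text keeps its punctuation.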
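Level 2
URL:
http://www.pythonchallenge.com/pc/def/ocr.html
Analysis:
The page title in the browser tab is "ocr", i.e. optical character recognition.
The hint given is:
recognize the characters. maybe they are in the book,
but MAYBE they are in the page source.
View the page source to find a second hint.
A comment in the page source says: find rare characters in the mess below
Below it is a large block of characters; the task is clearly to find the characters that occur least often in that block. The code below relies on the rare characters being the only letters in the block, so extracting the letters is enough.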
import re
data = "" # Copy-and-paste or extract otherwise from the challenge page's HTML-source
print "".join(re.findall("[A-Za-z]", data))
>>>
equality
So the next level is at http://www.pythonchallenge.com/pc/def/equality.html
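The regular expression above works because the rare characters happen to be letters. A more literal reading of "find rare characters" is to count every character and keep the infrequent ones. The following is a Python 3 sketch under that reading; rare_characters, max_count and the sample string are all made up for illustration, and the threshold would need adjusting for the real data.

```python
from collections import Counter

def rare_characters(data, max_count=1):
    """Return, in order of appearance, the characters occurring at most max_count times."""
    counts = Counter(data)
    return "".join(ch for ch in data if counts[ch] <= max_count)

# Hypothetical stand-in for the block in the page source
sample = "%%$@e_^#q$%@u^^_#a@$l%#_i^@t$#%y_^"
print(rare_characters(sample))
```

Counting first and filtering second keeps the rare characters in their original order, which matters because the order spells the answer.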
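Level 3
URL:
http://www.pythonchallenge.com/pc/def/equality.html
Analysis:
The page title in the browser tab is "re", i.e. regular expressions.
The hint given is:
One small letter, surrounded by EXACTLY three big bodyguards on each of its sides.
View the page source; it again contains a large block of characters.
Use a regular expression to find every lowercase letter that has exactly three uppercase letters on each side.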
import re
# level3code.txt holds the character block copied from the page source
with open("level3code.txt") as f:
    all_str = f.read()
chars = re.findall(r'[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]', all_str)
print "".join(chars)
>>>
linkedlist
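Here the character class [^A-Z] matches any single character that is not an uppercase letter A-Z; a ^ at the start of a character class negates it. Placing it on both ends ensures there are exactly three uppercase letters on each side, not four or more.
So the next level is at http://www.pythonchallenge.com/pc/def/linkedlist.html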
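To see why the [^A-Z] guards matter, here is a small Python 3 sketch that runs the same pattern on three made-up strings: one with exactly three uppercase letters on each side, and two with a fourth uppercase letter on one side. The sample strings are invented for illustration.

```python
import re

# Same pattern as above: one non-uppercase character, three uppercase letters,
# one captured lowercase letter, three uppercase letters, one non-uppercase character.
pattern = re.compile(r"[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]")

exactly_three = pattern.findall("aXYZqABCd")  # three on each side
four_on_left = pattern.findall("WXYZrABCd")   # four uppercase letters on the left
four_on_right = pattern.findall("aABCsDEFG")  # four uppercase letters on the right

print(exactly_three, four_on_left, four_on_right)
```

One limitation worth knowing: because each guard must consume a character, a qualifying letter at the very start or end of the text would not be matched by this pattern.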
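Level 4
URL:
http://www.pythonchallenge.com/pc/def/linkedlist.html
Analysis:
Opening this page shows the hint linkedlist.php, so open http://www.pythonchallenge.com/pc/def/linkedlist.php
The page title in the browser tab is "follow the chain".
The image itself gives no obvious clue, so view the page source:
urllib may help. DON’T TRY ALL NOTHINGS, since it will never end. 400 times is more than enough.
What does DON’T TRY ALL NOTHINGS mean?
Clicking the image on the page leads to http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=12345, which displays: and the next nothing is 92512
Now it is clear: the number after nothing in the URL has to be replaced. After doing this by hand three times, the page says: Your hands are getting tired and the next nothing is 50010. This confirms the hint: it will never end. 400 times is more than enough.
So the task is to use urllib to keep opening new pages, each time replacing the number after nothing in the URL with the number given on the current page.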
import urllib, re
uri = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=%s"
nothing_rep = r"and the next nothing is (\d+)"
nothing = "12345"  # Change this to "8022" (16044 / 2) for the
                   # second run, as explained below.
while True:
    try:
        source = urllib.urlopen(uri % nothing).read()
        nothing = re.search(nothing_rep, source).group(1)
    except AttributeError:
        # re.search returned None: this page has no next number
        break
# Last number found before the chain stopped
print nothing
16044
>>>
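Open http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=16044, which displays:
Yes. Divide by two and keep going.
So set nothing to 8022 (16044 / 2) and run the script again. This time the chain ends on a page that displays:
peak.html
So the next level is at http://www.pythonchallenge.com/pc/def/peak.html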
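The same loop can be sketched in Python 3, where urllib.request.urlopen replaces urllib.urlopen. This version also returns the text of the page that ended the chain, so messages such as "Yes. Divide by two and keep going." are not missed. The names fetch, follow_chain, get_page and max_steps are hypothetical; the cap of 400 comes from the hint in the page source. The demonstration at the bottom uses made-up pages, not the real chain, so it runs without network access.

```python
import re
import urllib.request

BASE = "http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=%s"
NEXT = re.compile(r"and the next nothing is (\d+)")

def fetch(nothing):
    """Download one page of the chain and return its text."""
    with urllib.request.urlopen(BASE % nothing) as response:
        return response.read().decode("utf-8", errors="replace")

def follow_chain(start, get_page=fetch, max_steps=400):
    """Follow 'next nothing' links.

    Returns (last number extracted or start, text of the last page fetched).
    """
    nothing = start
    page = ""
    for _ in range(max_steps):
        page = get_page(nothing)
        match = NEXT.search(page)
        if match is None:
            break  # no next number here: this page needs a human look
        nothing = match.group(1)
    return nothing, page

# Offline demonstration with made-up pages (not the real chain)
fake_pages = {
    "1": "and the next nothing is 2",
    "2": "and the next nothing is 3",
    "3": "Yes. Divide by two and keep going.",
}
result = follow_chain("1", get_page=fake_pages.__getitem__)
print(result)
```

Against the real site, calling follow_chain("12345") and then follow_chain with the halved number would mirror the two runs described above.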
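Level 5
URL:
http://www.pythonchallenge.com/pc/def/peak.html
Analysis:
The page title in the browser tab is "peak hell".
Below the image is the text "pronounce it".
The page source also contains:
peak hell sounds familiar ?
Looking up earlier solutions online shows that the answer is pickle: "peak hell" sounds a lot like "pickle".
pickle is Python's serialization module; it serializes and deserializes Python objects.
The page source also references "banner.p". Opening banner.p shows another mess of characters, which should be deserialized with pickle.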
import pickle, urllib
handle = urllib.urlopen("http://www.pythonchallenge.com/pc/def/banner.p")
data = pickle.load(handle)
handle.close()
# Each row is a list of pairs; multiplying the two items repeats the character
for elt in data:
    print "".join([e[1] * e[0] for e in elt])
This prints the word channel drawn with # characters; channel is the answer to this level.
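The decoding step is a run-length expansion. Below is a minimal Python 3 sketch of the pickle round trip and the expansion, assuming each row is a list of (character, count) pairs, which is what e[1] * e[0] in the code above suggests. The rows are made up and much smaller than the real banner, and render is a hypothetical helper name.

```python
import pickle

def render(rows):
    """Expand rows of (character, count) pairs into lines of text."""
    return "\n".join("".join(ch * count for ch, count in row) for row in rows)

# Made-up data in the same shape as the unpickled banner (not the real file)
rows = [
    [("#", 3), (" ", 1), ("#", 1)],
    [("#", 1), (" ", 3), ("#", 1)],
]

blob = pickle.dumps(rows)      # serialize to bytes
restored = pickle.loads(blob)  # deserialize back to Python objects
print(render(restored))
```

Unpickling can execute arbitrary code, so pickle.load should only be used on data from a source you trust.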