Python Challenge Walkthrough (Updated Through Level 5)

Puncsky

Article link: https://blog.csdn.net/Puncsky/article/details/7166541

This post records a Python beginner's progress through http://www.pythonchallenge.com/. Go for it, kid.

0. Warming Up http://www.pythonchallenge.com/pc/def/0.html

#!/usr/bin/env python
print 2**38

Yep, that really is convenient: no need to worry about memory and such.

1. Caesar Cipher http://www.pythonchallenge.com/pc/def/map.html

My solution:

#!/usr/bin/env python

text = "g fmnc wms bgblr rpylqjyrc gr zw fylb. rfyrq ufyr amknsrcpq ypc dmp. bmgle gr gl zw fylb gq glcddgagclr ylb rfyr'q ufw rfgq rcvr gq qm jmle. sqgle qrpgle.kyicrpylq() gq pcamkkclbcb. lmu ynnjw ml rfc spj."
text2 = "map"

def char_shift(char, shift):
  if char.islower():
    return chr((ord(char)-97+shift)%26+97)
  elif char.isupper():
    return chr((ord(char)-65+shift)%26+65)
  else:
    return char

def caesar_cipher(ciphertext, shift):
  plaintext=""
  for char in ciphertext:
    plaintext+=char_shift(char, shift)
  return plaintext
print caesar_cipher(text, 2)
print caesar_cipher(text2, 2)

Since "using string.maketrans() is recommended", here is the code for that approach:

#!/usr/bin/env python

from string import maketrans

alphabet="abcdefghijklmnopqrstuvwxyz"
alphabet_maped="cdefghijklmnopqrstuvwxyzab"
map=maketrans(alphabet, alphabet_maped)

text="g fmnc wms bgblr rpylqjyrc gr zw fylb. rfyrq ufyr amknsrcpq ypc dmp. bmgle gr gl zw fylb gq glcddgagclr ylb rfyr'q ufw rfgq rcvr gq qm jmle. sqgle qrpgle.kyicrpylq() gq pcamkkclbcb. lmu ynnjw ml rfc spj."
text2="map"

print text.translate(map)
print text2.translate(map)
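As a variation, the shifted alphabet does not have to be typed out by hand. The sketch below is written in Python 3 syntax, and the names shift_table and decode are made up for illustration. It builds the mapping by rotating string.ascii_lowercase and applies it with a plain dictionary lookup, so it does not rely on maketrans at all.

```python
import string

def shift_table(shift):
    """Map each lowercase letter to the letter `shift` places later, wrapping around."""
    letters = string.ascii_lowercase
    rotated = letters[shift:] + letters[:shift]
    return dict(zip(letters, rotated))

def decode(text, shift=2):
    """Shift lowercase letters; leave every other character unchanged."""
    table = shift_table(shift)
    return "".join(table.get(ch, ch) for ch in text)

print(decode("map"))  # prints: ocr
```

Because characters missing from the table fall through unchanged, spaces and punctuation survive the translation just as they do with translate().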

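Official answers: http://wiki.pythonchallenge.com/index.php?title=Level1:Main_Page

2. Character Count http://www.pythonchallenge.com/pc/def/ocr.html

Save the block of characters from the comment in the page source to data.txt, then read it with the code below.

Note: text=open("data.txt") on its own does not work. It returns a file object, not a string, so you still have to call read().

#!/usr/bin/env python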
charCount={}
file=open("data.txt")
text=file.read()
file.close()
for c in text:
  charCount[c]=charCount.get(c,0)+1

#sort for convenience
sortedCharTuples = sorted(charCount.items())

for charTuple in sortedCharTuples:
  print "%s = %d" % (charTuple[0], charTuple[1])
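The same tally can be sketched with collections.Counter, which also makes it easy to pull out the rare characters in their original order. This is a Python 3 sketch under assumptions: the sample string is a made-up stand-in for data.txt, and the "occurs exactly once" threshold is chosen for this sample rather than taken from the real data.

```python
from collections import Counter

# Hypothetical stand-in for the contents of data.txt.
sample = "$%^$%^e$%^q$%^u$%^"

counts = Counter(sample)

# Keep characters that occur exactly once, in their original order.
# The threshold of 1 is an assumption chosen for this sample.
rare = "".join(ch for ch in sample if counts[ch] == 1)
print(rare)  # prints: equ
```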

The rare characters all turn out to be letters, so extract them with a regular expression:

import re
print "".join(re.findall("[A-Za-z]+", open("data.txt").read())) # join concatenates the matches; without it the output is a list such as ['rMt', 'rxt', 'rVt'].

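Official solution: http://wiki.pythonchallenge.com/index.php?title=Level2:Main_Page

3. Regular Expressions (re) http://www.pythonchallenge.com/pc/def/equality.html

Note: something like XXXXxXXXX does not qualify, because the lowercase letter must have exactly three uppercase letters on each side. In other words, the pattern [A-Z]{3}[a-z][A-Z]{3} alone is not enough; it also needs a non-uppercase character at each end. The jumble of characters from the page is saved in data2.txt.

#!/usr/bin/env python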
import re
print "".join(c[4] for c in re.findall("[^A-Z][A-Z]{3}[a-z][A-Z]{3}[^A-Z]", open("data2.txt").read()))

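The c[4] for c in ... generator expression is a neat trick: each match is 9 characters long, so c[4] picks out the lowercase letter in the middle.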
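An alternative to indexing each match is a capture group around the lowercase letter, in which case findall returns only the captured letters. The Python 3 sketch below uses a made-up sample string rather than the real data2.txt.

```python
import re

# Same pattern as above, with a capture group around the lowercase letter.
pattern = re.compile(r"[^A-Z][A-Z]{3}([a-z])[A-Z]{3}[^A-Z]")

# Hypothetical sample: the first and third chunks qualify; the middle one
# has four uppercase letters on each side, so it must be rejected.
sample = "aBCDeFGHi ABCDxFGHI jKLMnOPQr"

print("".join(pattern.findall(sample)))  # prints: en
```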

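4. Web Crawler http://www.pythonchallenge.com/pc/def/linkedlist.php

After clicking the image, each page shows the next "nothing" value in turn. Typing them in by hand would be tedious, so this calls for something like a crawler, which is really exciting. Since the job is finding text within text, put the previous lesson to use and reach for regular expressions.

It falls into place naturally: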

import urllib
import re
urlItem = urllib.urlopen("http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing=[replace with the value you need]")
htmSource = urlItem.read()
urlItem.close()
nothing="".join(re.findall("[0-9]*",htmSource))
print nothing
while nothing:
  urlItem1 = urllib.urlopen("http://www.pythonchallenge.com/pc/def/linkedlist.php?nothing="+nothing)
  htmSource1 = urlItem1.read()
  urlItem1.close()
  print htmSource1
  nothing="".join(re.findall("[0-9]*",htmSource1))
  print nothing

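Note: some pages are there to throw you off, so the code prints the full content of every page. When a special case shows up, adjust accordingly and keep crawling until peak.html appears.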
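One way to make the loop less sensitive to distracting pages is to match only the digits that follow a fixed phrase, and to cap the number of requests. The Python 3 sketch below rests on assumptions: the phrase "next nothing is" is the assumed page wording, follow_chain is a hypothetical helper, and fake_pages is an offline stand-in for the real site so the example runs without a network connection.

```python
import re

# Assumed page wording; adjust the pattern if the real pages differ.
NEXT_RE = re.compile(r"next nothing is (\d+)")

def follow_chain(fetch, start, max_steps=500):
    """Follow 'nothing' values until a page has no next number.

    fetch is any callable mapping a nothing value to page text.
    Returns (last_nothing, last_page_text).
    """
    nothing = start
    page = ""
    for _ in range(max_steps):
        page = fetch(nothing)
        match = NEXT_RE.search(page)
        if match is None:
            break
        nothing = match.group(1)
    return nothing, page

# Hypothetical stand-in pages so the sketch runs offline.
fake_pages = {
    "1": "and the next nothing is 22",
    "22": "Page 7 of 9. and the next nothing is 333",
    "333": "peak.html",
}
last, text = follow_chain(fake_pages.__getitem__, "1")
print(last)
print(text)
```

Passing the fetch function in as an argument keeps the chain-following logic separate from the HTTP call, so the real urlopen-based reader can be swapped in later. Pages that need manual handling still stop the loop, which matches the adjust-and-continue approach described above.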

5. Peak Hell = Pickle http://www.pythonchallenge.com/pc/def/peak.html

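Since pickle is involved, note that the page source references banner.p, which can be downloaded. Opening it shows... gibberish.

First, learn a bit about pickle.

Pickle

Python provides a standard module called pickle. With it you can store almost any Python object in a file and later get it back intact. This is called storing the object persistently.

There is another module called cPickle, which works just like the pickle module except that it is written in C and is therefore much faster (reportedly up to 1000 times faster than pickle). You can use either one, but we will use cPickle here. Remember that we refer to both modules simply as the pickle module.

Pickling and Unpickling

Example 12.2 Pickling and Unpickling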

#!/usr/bin/python
# Filename: pickling.py

import cPickle as p
#import pickle as p

shoplistfile = 'shoplist.data'
# the name of the file where we will store the object

shoplist = ['apple', 'mango', 'carrot']

# Write to the file
f = file(shoplistfile, 'w')
p.dump(shoplist, f) # dump the object to a file
f.close()

del shoplist # remove the shoplist

# Read back from the storage
f = file(shoplistfile)
storedlist = p.load(f)
print storedlist

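(Source file: code/pickling.py)

Output

$ python pickling.py
['apple', 'mango', 'carrot']

How It Works

First, notice that we use the import..as syntax. This is handy because it lets us use a shorter name for the module. In this example it even lets us switch to the other module (cPickle or pickle) by changing a single line! In the rest of the program we simply refer to this module as p.

To store an object in a file, first open a file object in write mode, then call the module's dump function to store the object in the open file. This process is called pickling. Reading the object back is the reverse: call load on the opened file, which is called unpickling.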
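The same round trip can be tried without touching the disk. The Python 3 sketch below uses pickle.dumps and pickle.loads, the in-memory counterparts of dump and load; cPickle is not used because that module name exists only in Python 2.

```python
import pickle

shoplist = ['apple', 'mango', 'carrot']

# dumps/loads work on bytes in memory instead of a file on disk.
blob = pickle.dumps(shoplist)
restored = pickle.loads(blob)

print(restored == shoplist)   # prints: True
print(restored is shoplist)   # prints: False (it is a copy)
```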

Then load the file and take a look (using pprint for readable output):

import pickle
import pprint

data = pickle.load(open('banner.p'))
pprint.pprint(data)

The result is:

[[(' ', 95)],
[(' ', 14), ('#', 5), (' ', 70), ('#', 5), (' ', 1)],
[(' ', 15), ('#', 4), (' ', 71), ('#', 4), (' ', 1)],
[(' ', 15), ('#', 4), (' ', 71), ('#', 4), (' ', 1)],
[(' ', 15), ('#', 4), (' ', 71), ('#', 4), (' ', 1)],
[(' ', 15), ('#', 4), (' ', 71), ('#', 4), (' ', 1)],
[(' ', 15), ('#', 4), (' ', 71), ('#', 4), (' ', 1)],
[(' ', 15), ('#', 4), (' ', 71), ('#', 4), (' ', 1)],
[(' ', 15), ('#', 4), (' ', 71), ('#', 4), (' ', 1)],
[(' ', 6),
('#', 3),
(' ', 6),

...

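Looking at this sequence, each inner list is one line of output, made up of (character, repeat count) pairs.

Expanding each pair and writing the lines out gives: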

import pickle
import pprint

data = pickle.load(open('banner.p'))
pprint.pprint(data)

output = open('output.txt', 'w')
for line in data:
    print >> output, ''.join([c[0]*c[1] for c in line])
output.close()

The output spells out channel.
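The expansion step can be tried on a tiny hand-made banner before running it on the real file. In the Python 3 sketch below, rows is a hypothetical two-line example in the same shape as the data in banner.p, and expand is an illustrative helper name.

```python
# Hypothetical two-row banner in the same shape as the data in banner.p.
rows = [
    [(' ', 1), ('#', 3)],
    [('#', 1), (' ', 2), ('#', 1)],
]

def expand(rows):
    """Turn rows of (character, count) pairs into printable lines."""
    return ["".join(ch * count for ch, count in row) for row in rows]

for line in expand(rows):
    print(line)
```

Multiplying a string by an integer repeats it, which is all the decoding needs.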

One last confession: I cheated :P