IDF Lab: First Steps (初探乾坤) -- Simple Programming -- Character Count

Shinukami

Category: IDF Lab. Tags: IDF Lab, crawler, python

Article link: https://blog.csdn.net/Shinukami/article/details/46319363

Address:

ctf.idf.cn/index.php?g=game&m=article&a=index&id=37

Challenge:

Here → http://ctf.idf.cn/game/pro/37

Writeup:

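(The second script is adapted from someone else's Sina blog: blog.sina.com.cn/s/blog_e53f38130102vjlz.html)

The task is clear: write code that counts how many times each of the five letters w, o, l, d, y appears, then submit the result. The catch is that the answer must be submitted within 2 seconds, so it takes a crawler. With Python!!!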
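The counting step itself is small. A minimal sketch, assuming the answer is the five counts concatenated in the order w, o, l, d, y (which is what both scripts in this post do); `woldy_answer` is a hypothetical helper name, not part of the challenge:

```python
def woldy_answer(text, letters="woldy"):
    """Return the count of each letter in `letters`, concatenated as one string."""
    # str.count is case-sensitive, so only lowercase letters are counted here
    return "".join(str(text.count(ch)) for ch in letters)


print(woldy_answer("hello world, you lovely day"))  # prints 14523
```

Note that the counts are joined as text, not added together: 1 w, 4 o, 5 l, 2 d and 3 y give "14523".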

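First attempt: my own code, written in Python 3.4 (source below):

I even spoofed the Cookies and the Headers...

But it kept replying, "Did your elementary-school PE teacher teach you math?"

I was speechless!!!!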

import urllib.request
import urllib.parse

url = "http://ctf.idf.cn/game/pro/37/index.php"

# Fetch the challenge page (note: this GET is sent without the headers below)
response = urllib.request.urlopen(url)
html = response.read().decode('utf-8')

# The string to analyze sits between the first two <hr /> tags
A = html.find('<hr />') + 6
B = html.find('<hr />', A)
f = html[A:B]

# Count each letter and concatenate the counts in the order w, o, l, d, y
w = f.count('w')
o = f.count('o')
l = f.count('l')
d = f.count('d')
y = f.count('y')
ans1 = '%d%d%d%d%d' % (w, o, l, d, y)

# Spoofed browser headers, including a session cookie copied from the browser
head = {}
head['Host'] = "ctf.idf.cn"
head['User-Agent'] = "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0"
head['Accept-Language'] = "zh-CN,en-US;q=0.7,en;q=0.3"
# Accept-Encoding is not set: urllib does not decompress gzip responses itself
head['Referer'] = "http://ctf.idf.cn/game/pro/37/"
head['Cookie'] = "Hm_lvt_184d7dcce9f76d1f5ab23d66e447d9a8=1432209840,1432307014,1433080747,1433157423; PHPSESSID=23ro01bddb6a7604ovumie8nr7; Hm_lpvt_184d7dcce9f76d1f5ab23d66e447d9a8=1433157799"
head['Connection'] = "keep-alive"
head['Cache-Control'] = "max-age=0"
head['Content-Type'] = "application/x-www-form-urlencoded"

# Form body; Content-Length is computed by urllib from the encoded bytes
xdata = {'anwser': ans1}
xdata = urllib.parse.urlencode(xdata).encode('utf-8')

# Submit the answer as a POST with the spoofed headers
req = urllib.request.Request(url, data=xdata, headers=head)
response = urllib.request.urlopen(req)
html = response.read().decode('utf-8')

# Print the server's reply, which sits between <body> and the first <hr />
A = html.find('<body>') + 6
B = html.find('<hr />', A)
f = html[A:B]

print(f)
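One plausible explanation for the wrong-answer reply, offered here as an assumption rather than something the challenge page confirms, is that the string is generated per session, so the GET that fetches it and the POST that submits the counts must carry the same PHPSESSID. Below is a minimal Python 3 sketch that lets `http.cookiejar` carry the cookie between the two requests. The helper names are hypothetical; the `<hr />` delimiters and the `anwser` field name are taken from the code above.

```python
import http.cookiejar
import urllib.parse
import urllib.request

URL = "http://ctf.idf.cn/game/pro/37/index.php"


def make_opener():
    """Build an opener that stores and resends cookies (e.g. PHPSESSID)."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def text_between_rules(html, rule="<hr />"):
    """Return the text between the first two occurrences of `rule`."""
    start = html.find(rule) + len(rule)
    end = html.find(rule, start)
    return html[start:end]


def solve_with_urllib(url=URL):
    """Fetch the page, count the letters, and post the answer in one session."""
    opener = make_opener()
    html = opener.open(url).read().decode("utf-8")
    body = text_between_rules(html)
    answer = "".join(str(body.count(ch)) for ch in "woldy")
    data = urllib.parse.urlencode({"anwser": answer}).encode("utf-8")
    # Passing data makes this a POST; the opener resends the session cookie
    return opener.open(url, data).read().decode("utf-8")
```

Calling `print(solve_with_urllib())` would run the whole exchange, provided the challenge server is reachable.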

Second attempt: written in Python 2.7 (source below):

First, install BeautifulSoup.

The code is based on a Sina blog post: blog.sina.com.cn/s/blog_e53f38130102vjlz.html

#!/usr/bin/env python
# coding: utf-8
import requests
from bs4 import BeautifulSoup  # imported by the referenced script, not used below

url = "http://ctf.idf.cn/game/pro/37/"  # page URL
s = requests.session()  # one session, so the GET and the POST share cookies
content = s.get(url).text  # fetch the page content
test = content.split('<hr />')  # split the page into three parts at <hr />
print test[1]
w = 0
o = 0
l = 0
d = 0
y = 0
for ch in test[1]:  # walk through the second part of the split page
    if ch == "w":
        w = w + 1
    elif ch == "o":
        o = o + 1
    elif ch == "l":
        l = l + 1
    elif ch == "d":
        d = d + 1
    elif ch == "y":
        y = y + 1
tem = '%d%d%d%d%d' % (w, o, l, d, y)  # concatenate the counts into one string
print tem
values = {'anwser': tem}  # fill in the form
result = s.post(url, data=values)  # submit the form
print result.text
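For readers on Python 3, the same flow might look like the sketch below. It assumes a `requests`-style session object whose `get` and `post` share cookies; `solve_with_requests` is a hypothetical name, and the session is passed in as a parameter so the logic can be exercised with a stand-in object if the challenge server is unreachable.

```python
URL = "http://ctf.idf.cn/game/pro/37/"


def solve_with_requests(session, url=URL):
    """Fetch the page, count w/o/l/d/y in the middle part, and post the answer."""
    content = session.get(url).text
    # The page is assumed to hold the target string between two <hr /> tags
    target = content.split("<hr />")[1]
    answer = "".join(str(target.count(ch)) for ch in "woldy")
    # 'anwser' (sic) is the form field name used in the code above
    return session.post(url, data={"anwser": answer}).text


# Typical use (needs `pip install requests` and a reachable server):
#   import requests
#   print(solve_with_requests(requests.Session()))
```

Using `str.count` replaces the character-by-character loop, and reusing one session for both requests is what keeps the submitted counts tied to the string that was actually served.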