cobf_[No.003-0] Scraping NetEase odds data and importing it into a MySQL database

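First, I need to import odds data, and I need all of it. I want to remove certain "dirty" odds (odds for outcomes that are extremely unlikely to occur and that mislead bettors), and I need to know which scores those odds belong to. With those odds removed, I can work backwards to the likely correct score and the range its odds fall in.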

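So what I have to do is import each day's odds data into my own database, filter out the plausible outcomes according to a set of rules, and then make further judgments on what remains.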
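The import step itself could be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's code: it is written for Python 3 (the snippets in this post are Python 2), it uses the standard-library sqlite3 module as a stand-in for MySQL so that it runs without a server, and the table name odds, its columns, and the sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (match number, game type, selection, odds).
sample_rows = [
    ("001", "zjq", "0", 14.00),
    ("001", "zjq", "7+", 18.00),
    ("001", "bf", "负其他", 120.00),
]

def store_odds(conn, rows):
    """Create the (assumed) odds table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS odds ("
        "screening TEXT, gametype TEXT, selection TEXT, odds REAL)"
    )
    # Parameter placeholders keep the scraped text out of the SQL string.
    conn.executemany("INSERT INTO odds VALUES (?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database, nothing touches disk
store_odds(conn, sample_rows)
stored = conn.execute("SELECT COUNT(*) FROM odds").fetchone()[0]
print(stored)
```

With a real MySQL server the same structure should carry over to a MySQL driver, although the common drivers use %s rather than ? as the placeholder.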

# encoding: utf-8
import urllib2

from bs4 import BeautifulSoup

website = "http://caipiao.163.com/order/jczq-hunhe/#from=leftnav"

# Download the page and parse it
page = urllib2.urlopen(website)
soup = BeautifulSoup(page, "html.parser")

# Print every table cell on the page
for incident in soup('td'):
    print incident

This yields a result set similar to the following:

负其他
120.00
0
14.00
1
5.20
2
3.55
3
3.50
4
4.70
5
7.50
6
13.00
7+
18.00

……

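The result here contains only the odds themselves. The gametype content still has to be filtered out to recover what is missing, that is, a result of the form: total goals, 7 goals, odds 18.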

Next:

Extract the contents of the td cells, using re regular expressions.

Use re directly inside the for loop, to avoid using a file as a cache.
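A regular-expression version of this step might look like the sketch below. It is written for Python 3 and runs on a hard-coded fragment rather than the live page; the markup in sample_html is an assumption modelled on the gametype attribute and inner div used in this post, and the names cell_pattern and extract_odds are hypothetical.

```python
import re

# Hypothetical markup modelled on the attributes used in this post;
# the real page may lay out its cells differently.
sample_html = (
    '<td gametype="zjq"><div>18.00</div></td>'
    '<td gametype="bqc"><div>3.55</div></td>'
    '<td gametype="bf"><div>120.00</div></td>'
)

# Capture the gametype attribute and the text of the inner <div>.
cell_pattern = re.compile(
    r'<td[^>]*\bgametype="(\w+)"[^>]*>\s*<div[^>]*>([^<]*)</div>'
)

def extract_odds(html):
    """Return (gametype, odds) pairs found in the given HTML fragment."""
    return [(kind, float(value)) for kind, value in cell_pattern.findall(html)]

pairs = extract_odds(sample_html)
for kind, value in pairs:
    print(kind, value)
```

Everything stays in memory, so no intermediate file is needed. Regular expressions are fragile against markup changes, which is one reason the snippets that follow filter on the gametype attribute with BeautifulSoup instead.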

# Query the half-time/full-time (bqc) odds: select the matching cells
# in the table and print their contents.
# The odds come in this order: win/win, win/draw, win/loss, draw/win,
# draw/draw, draw/loss, loss/win, loss/draw, loss/loss
for item in soup.findAll("td", {"gametype": "bqc"}):
    print item.find("div").string

# Next, query the correct-score (bf) odds.
# Home wins come first (1:0 through "胜其他"), then draws (0:0 through 3:3
# and "平其他"), then away wins; the odds are collected in the list bfpl.
temp = ["1:0","2:0","2:1","3:0","3:1","3:2","4:0","4:1","4:2","5:0","5:1","5:2","胜其他","0:0","1:1","2:2","3:3","平其他","0:1","0:2","1:2","0:3","1:3","2:3","0:4","1:4","2:4","0:5","1:5","2:5","负其他"]

bfpl = []
for item in soup.findAll("td", {"gametype": "bf"}):
    bfpl.append(item.find("div").string)

#---------------------
# Build the score-odds dictionary.
# Assumption: the first len(temp) entries of bfpl belong to the first match
# on the page; bfplDict is an illustrative name.
bfplDict = dict(zip(temp, bfpl[:len(temp)]))
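The dictionary idea can be extended to every match on the page. The sketch below is written for Python 3 and assumes that the scraped list is flat, with one run of 31 odds per match in the same order as the score labels; the helper name odds_per_match and the made-up odds are hypothetical.

```python
# Score labels in page order, as listed in this post.
temp = ["1:0", "2:0", "2:1", "3:0", "3:1", "3:2", "4:0", "4:1", "4:2",
        "5:0", "5:1", "5:2", "胜其他", "0:0", "1:1", "2:2", "3:3", "平其他",
        "0:1", "0:2", "1:2", "0:3", "1:3", "2:3", "0:4", "1:4", "2:4",
        "0:5", "1:5", "2:5", "负其他"]

def odds_per_match(labels, flat_odds):
    """Split a flat odds list into one {label: odds} dict per match.

    A trailing run shorter than len(labels) is ignored.
    """
    size = len(labels)
    return [
        dict(zip(labels, flat_odds[start:start + size]))
        for start in range(0, len(flat_odds) - size + 1, size)
    ]

# Made-up odds for two matches, 31 values each.
fake_bfpl = [str(1.0 + n) for n in range(2 * len(temp))]
matches = odds_per_match(temp, fake_bfpl)
print(len(matches), matches[0]["1:0"], matches[1]["负其他"])
```

Keeping one dictionary per match makes it straightforward to look up the odds for a given score later, for example when deciding which entries count as dirty.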

# Next, query the total-goals (zjq) odds
for item in soup.findAll("td", {"gametype": "zjq"}):
    print item.find("div").string

#----------------------------------------
# Query all home team names, away team names, and match numbers

# Home team: hostTeam
hostTeam = []
for item in soup.findAll("em", {"class": "hostTeam"}):
    hostTeam.append(item.b.string)

# Each item is already a team name, so print it directly
for item in hostTeam:
    print item

# Away team: guestTeam
guestTeam = []
for item in soup.findAll("em", {"class": "guestTeam"}):
    guestTeam.append(item.b.string)

# Each item is already a team name, so print it directly
for item in guestTeam:
    print item

#------------------
# Match numbers (jtip), then match number with home and away team
#------------------
# Collect the match numbers first; the combined listing below needs them
screening = []
for item in soup.findAll("span", {"class": "co1"}):
    screening.append(item.i.string)

# Iterate over the match numbers
for item in screening:
    print item

# Print each match number together with its home and away team
for number, host, guest in zip(screening, hostTeam, guestTeam):
    print '---------'
    print number, host, guest
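The three parallel lists can also be combined into one record per match, which tends to be easier to pass on to a database insert. The sketch below is written for Python 3; the Fixture type, the build_fixtures helper, and the sample values are hypothetical stand-ins for the scraped data.

```python
from collections import namedtuple

# One record per match: match number, home team, away team.
Fixture = namedtuple("Fixture", ["screening", "host", "guest"])

def build_fixtures(screening, host_teams, guest_teams):
    """Pair up the three parallel lists; zip stops at the shortest one."""
    return [
        Fixture(number, host, guest)
        for number, host, guest in zip(screening, host_teams, guest_teams)
    ]

# Made-up values standing in for the scraped lists.
fixtures = build_fixtures(
    ["001", "002"], ["Team A", "Team C"], ["Team B", "Team D"]
)
for fixture in fixtures:
    print(fixture.screening, fixture.host, "vs", fixture.guest)
```

Because zip stops at the shortest list, a mismatch in list lengths silently drops matches; comparing the three lengths beforehand would catch a scrape that went wrong.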

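#------------------
# Build the "match number + score" list --> scbf[]
scbf = []
for item in screening:
    for score in temp:
        # temp holds UTF-8 byte strings, screening holds unicode strings
        scbf.append(item + score.decode("utf-8"))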
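Once every "match number + score" key exists, the keys can be paired with the scraped odds and the dirty entries dropped. The sketch below is written for Python 3 and rests on several assumptions: the odds list is flat and in page order, the helper names label_odds and drop_dirty are hypothetical, and a simple upper bound (max_odds) stands in for the filtering rule, which this post does not spell out.

```python
def label_odds(screening, labels, flat_odds):
    """Map 'match number + score' keys to odds, assuming page order."""
    keys = [match + label for match in screening for label in labels]
    return {key: float(value) for key, value in zip(keys, flat_odds)}

def drop_dirty(odds_by_key, max_odds):
    """Keep only the entries whose odds do not exceed max_odds."""
    return {key: value for key, value in odds_by_key.items() if value <= max_odds}

# Made-up data: two matches, three score labels each.
labelled = label_odds(
    ["001", "002"],
    ["1:0", "0:0", "0:1"],
    ["6.50", "9.00", "120.00", "7.00", "8.50", "60.00"],
)
kept = drop_dirty(labelled, max_odds=50.0)
print(sorted(kept))
```

The threshold of 50.0 is arbitrary and only there to make the example run; any real rule for what counts as a dirty odd would replace the comparison inside drop_dirty.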

#=====================
