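Rules of the China Welfare Lottery 双色球 (Double Color Ball) game: the betting area is divided into a red-ball area and a blue-ball area. The red-ball area has 33 numbers, 1-33, and the blue-ball area has 16 numbers, 1-16. A single bet is made up of 6 red numbers and 1 blue number and costs RMB 2. Draws are held every Tuesday, Thursday, and Sunday. Results can be looked up at https://www.zhcw.com/kjxx/ssq/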
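The total number of combinations is C(33,6)*16 = 17,721,088, a bit over 17.7 million. A while ago I saw someone online who wanted to use past draw results to count, for every combination, how many times it would have won each prize tier. The requirement sounds simple, and the draw results can be scraped, but the algorithm has a few pitfalls, mainly around time and space usage.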
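As a quick check of the figure above, the count can be reproduced with the standard library. This is only a sanity-check sketch; the constant names are mine, not from the original code.

```python
import math

# Names below are illustrative; the values come from the game rules above.
RED_COUNT, RED_PICK, BLUE_COUNT = 33, 6, 16

red_combos = math.comb(RED_COUNT, RED_PICK)   # ways to pick 6 reds out of 33
total_tickets = red_combos * BLUE_COUNT       # times 16 choices of blue ball

print(red_combos)      # 1107568
print(total_tickets)   # 17721088
```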
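Space first. If the results go into a text file, assume each combination takes 14 characters and each tier's win count takes 4 characters (1683 draws had been held as of 2024-3-13). Six tiers then take 24 characters, so each record takes 38 characters, and without newlines the total comes to over 600 MB. For a more readable layout, a line such as 红球 [11,14,18,26,31,33] 蓝球 [3] 中奖 [0,0,0,0,1,0] (red balls, blue ball, win counts for tiers 1 to 6) is more comfortable to look at, but it takes even more space. One option is to save the results as a pickle file and do the statistics and display in Python.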
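The estimate above is plain arithmetic, sketched here so the numbers can be adjusted. The variable names are my own, and the record layout is the 14 + 6*4 character assumption from the paragraph, not a measured file.

```python
# Assumed record layout from the text: 14 digits for the ticket,
# 4 characters for each of the 6 prize-tier counters, no newline.
TOTAL_TICKETS = 17_721_088
CHARS_PER_TICKET = 14
CHARS_PER_COUNT = 4
TIERS = 6

chars_per_record = CHARS_PER_TICKET + CHARS_PER_COUNT * TIERS
total_bytes = TOTAL_TICKETS * chars_per_record  # one byte per ASCII character

print(chars_per_record)          # 38
print(total_bytes)               # 673401344
print(round(total_bytes / 1e6))  # 673 (MB)
```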
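The algorithm can run forward or in reverse. Forward means enumerating every combination and checking it against all past draws to see what it would have won. Reverse means taking each draw and generating every combination that wins something in it. When there are few draws the reverse approach is faster, and its cost grows linearly with the number of draws. The forward approach has to deal with all combinations from the start, although in practice, after enough draws, every combination will have won something at least once.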
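For comparison, here is a minimal sketch of the forward direction for a single ticket. The tier rules mirror the cases enumerated by the reverse code in this post; `prize_tier` and `tally` are hypothetical helper names, and the draws are made up for illustration.

```python
def prize_tier(ticket_reds, ticket_blue, draw_reds, draw_blue):
    """Return the prize tier 1-6 for one ticket in one draw, or 0 for no prize."""
    red_hits = len(set(ticket_reds) & set(draw_reds))
    blue_hit = ticket_blue == draw_blue
    if red_hits == 6:
        return 1 if blue_hit else 2
    if red_hits == 5:
        return 3 if blue_hit else 4
    if red_hits == 4:
        return 4 if blue_hit else 5
    if red_hits == 3 and blue_hit:
        return 5
    if blue_hit:          # 0, 1 or 2 reds plus the blue ball
        return 6
    return 0


def tally(ticket_reds, ticket_blue, draws):
    """Count how often one ticket wins each tier over a list of (reds, blue) draws."""
    counts = [0] * 6
    for draw_reds, draw_blue in draws:
        tier = prize_tier(ticket_reds, ticket_blue, draw_reds, draw_blue)
        if tier:
            counts[tier - 1] += 1
    return counts


# Hypothetical draws, invented for this example (not real results).
draws = [
    ([11, 14, 18, 26, 31, 33], 3),  # 6 reds + blue
    ([11, 14, 18, 1, 2, 4], 3),     # 3 reds + blue
    ([1, 2, 4, 5, 6, 7], 3),        # blue only
    ([11, 14, 18, 26, 2, 4], 9),    # 4 reds, no blue
    ([11, 14, 18, 1, 2, 4], 9),     # 3 reds, no blue: no prize
]
result = tally([11, 14, 18, 26, 31, 33], 3, draws)
print(result)  # [1, 0, 0, 0, 2, 1]
```

Running `tally` for all 17,721,088 tickets against every draw is what makes the forward direction expensive from the very first draw.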
Here is the reverse algorithm:
import itertools
import time
import pickle
# Maps a packed ticket key (see calci) to its packed win counters.
rbkv = {}

def dictincr(key, amount):
    # Add amount to the packed counters stored under key, creating the entry if needed.
    rbkv[key] = rbkv.get(key, 0) + amount
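
def calci(l1, l2, b):
    # Pack a ticket into one integer: a 33-bit red-ball mask (bit i-1 set for red ball i)
    # shifted left by 4, with the blue ball stored as 0-15 in the low 4 bits.
    x = 0
    for i in l1:
        x += (0b1 << (i - 1))
    for i in l2:
        x += (0b1 << (i - 1))
    return (x << 4) + b - 1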

def getall(s):
    # Credit every ticket that wins a prize in draw s (14 digits: 6 reds, then the blue ball).
    # Each tier has its own 11-bit counter inside the value: tier 1 at bit 55, tier 6 at bit 0.
    nums = [int(s[i:i + 2]) for i in range(0, 14, 2)]
    goalred, goalblue = nums[:6], nums[6]
    restred = [n for n in range(1, 34) if n not in goalred]
    restblue = [n for n in range(1, 17) if n != goalblue]
    # Tier 1: 6 reds + blue
    index = calci(goalred, [], goalblue)
    dictincr(index, amount=(0b1 << 55))
    # Tier 2: 6 reds, wrong blue
    for i in restblue:
        index = calci(goalred, [], i)
        dictincr(index, amount=(0b1 << 44))
    # Tier 3: 5 reds + blue
    for redhit in itertools.combinations(goalred, 5):
        for j in restred:
            index = calci(redhit, [j], goalblue)
            dictincr(index, amount=(0b1 << 33))
    # Tier 4: 5 reds with wrong blue, or 4 reds + blue
    for redhit in itertools.combinations(goalred, 5):
        for j in restred:
            for i in restblue:
                index = calci(redhit, [j], i)
                dictincr(index, amount=(0b1 << 22))
    for redhit in itertools.combinations(goalred, 4):
        for redmis in itertools.combinations(restred, 2):
            index = calci(redhit, redmis, goalblue)
            dictincr(index, amount=(0b1 << 22))
    # Tier 5: 4 reds with wrong blue, or 3 reds + blue
    for redhit in itertools.combinations(goalred, 4):
        for redmis in itertools.combinations(restred, 2):
            for i in restblue:
                index = calci(redhit, redmis, i)
                dictincr(index, amount=(0b1 << 11))
    for redhit in itertools.combinations(goalred, 3):
        for redmis in itertools.combinations(restred, 3):
            index = calci(redhit, redmis, goalblue)
            dictincr(index, amount=(0b1 << 11))
    # Tier 6: blue ball with 2, 1 or 0 reds
    for redhit in itertools.combinations(goalred, 2):
        for redmis in itertools.combinations(restred, 4):
            index = calci(redhit, redmis, goalblue)
            dictincr(index, amount=1)
    for j in goalred:
        for redmis in itertools.combinations(restred, 5):
            index = calci([j], redmis, goalblue)
            dictincr(index, amount=1)
    for redmis in itertools.combinations(restred, 6):
        index = calci([], redmis, goalblue)
        dictincr(index, amount=1)

if __name__ == '__main__':
    # Each line of the file is one draw as 14 digits: the first 12 are the red balls and
    # the last 2 the blue ball, two digits per ball (reds 01-33, blue 01-16).
    filename = "D:/Downloads/ssq.txt"
    with open(filename) as file_object:
        nums = file_object.readlines()
    start = time.time()
    for s in nums[:5]:  # process 5 draws; change as needed, or remove [:5] to process all draws
        getall(s)
    end = time.time()
    print("Running time %s seconds" % (end - start))
    with open("d:/downloads/filep.pickle", "wb") as f:
        pickle.dump(rbkv, f, protocol=pickle.HIGHEST_PROTOCOL)
    # The block above saves a pickle file and the block below saves a text file;
    # keep one and comment out the other.
    with open("d:/downloads/file6.txt", "w", encoding='utf-8') as f:
        ref = {(1 << (i - 1)): i for i in range(1, 34)}  # single-bit mask -> red ball number
        for key, value in rbkv.items():
            blue = (int(key) & 15) + 1
            reds = int(key) >> 4
            redl = []
            scs = []
            for i in range(0, 6):
                # Peel off the lowest set bit of the red mask to get the next red ball.
                newreds = reds & (reds - 1)
                redl.append(str(ref[reds - newreds]))
                reds = newreds
                # Extract the counter for tier i+1, highest field first.
                tj = value >> (55 - i * 11)
                scs.append(str(tj))
                value -= (tj << (55 - i * 11))
            f.write('红球 [' + ','.join(redl) + '] 蓝球 [' + str(blue) + '] 中奖 [' + ','.join(scs) + ']\n')
    end = time.time()
    print("file time %s seconds" % (end - start))
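To read the pickle back later, the packed keys and values have to be decoded again. The sketch below assumes the same layout as `calci` and `getall` above (red mask shifted left by 4, blue in the low 4 bits, six 11-bit counters with tier 1 at bit 55). `pack_key` and `unpack` are hypothetical helper names, and the table is a one-entry stand-in round-tripped in memory rather than the real file.

```python
import pickle


def pack_key(reds, blue):
    """Same packing as calci: 33-bit red mask << 4, blue stored as 0-15."""
    mask = 0
    for r in reds:
        mask |= 1 << (r - 1)
    return (mask << 4) + blue - 1


def unpack(key, value):
    """Decode one table entry into (reds, blue, [tier1..tier6 counts])."""
    blue = (key & 0b1111) + 1
    mask = key >> 4
    reds = [n for n in range(1, 34) if (mask >> (n - 1)) & 1]
    # 11 bits per tier hold counts up to 2047, enough for the 1683 draws mentioned above.
    counts = [(value >> (55 - 11 * i)) & 0x7FF for i in range(6)]
    return reds, blue, counts


# Stand-in table with one ticket that has a single tier-5 win (illustrative data).
table = {pack_key([11, 14, 18, 26, 31, 33], 3): 1 << 11}

# Round trip through pickle in memory; with the real file this would be
# pickle.load() on the file opened in "rb" mode.
restored = pickle.loads(pickle.dumps(table, protocol=pickle.HIGHEST_PROTOCOL))

decoded = [unpack(k, v) for k, v in restored.items()]
print(decoded)  # [([11, 14, 18, 26, 31, 33], 3, [0, 0, 0, 0, 1, 0])]
```

Note that the table only contains tickets that have won at least once in the processed draws, so a ticket missing from it has all-zero counts.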