接到一个小任务,打算10天之内完成,记录下自己遇到的每个问题和解决的办法。
需求:数据库1的记录所在的表名是按照主键key 求hash(4)所得,将它转移到数据库2中,数据库2的表名是按照主键key求hash(5)所得。
难点分析:
1、界面实现
2、导入参数
3、hash的运用和计算
4、记录的转移和操作;
day1
安装好MySQL-python
Python操作MySQL基础
#coding=utf-8
import MySQLdb
conn= MySQLdb.connect(
host='192.168.1.90',
port = 3306,
user='QSuserWrite',
passwd='qs123456',
db ='shiji',
)
cur = conn.cursor()
#创建数据表
#cur.execute("create table student(id int ,name varchar(20),class varchar(30),age varchar(10))")
#插入一条数据
#cur.execute("insert into student values('2','Tom','3 year 2 class','9')")
#cur.execute("create table student(id int,name varchar(20),class varchar(20),age int)")
#cur.execute("Alter tudent add primary key(id)")
sqlinsert="insert into student values(%s,%s,%s,%s)"
cur.executemany(sqlinsert,[
('8','john','1 year 2 class','10'),
('9','john','1 year 2 class','10'),
('10','john','1 year 2 class','10'),
('11','john','1 year 2 class','10'),
])
#修改查询条件的数据
#cur.execute("update student set class='3 year 1 class' where name = 'Tom'")
#删除查询条件的数据
#cur.execute("delete from student where age='9'")
cur.close()
conn.commit()
conn.close()
1、Python操作是先建立一个conn的连接;
2、cur = conn.cursor()
通过获取到的数据库连接conn下的cursor()方法来创建游标。
3、cur.execute("create table student(id int ,name varchar(20),class varchar(30),age varchar(10))")
通过游标cur 操作execute()方法可以写入纯sql语句。通过execute()方法中写如sql语句来对数据进行操作。
4、可以设置sqlinsert="insert into student values(%s,%s,%s,%s)",从而节省代码量;
5、插入多条记录时,
cur.executemany(sqlinsert,[
('8','john','1 year 2 class','10'),
('9','john','1 year 2 class','10'),
('10','john','1 year 2 class','10'),
('11','john','1 year 2 class','10'),
])
方法用executemany,注意sqlinsert后要加中括号【】,每条记录要用逗号隔开;
day2
记录的转移操作的问题已经解决;
cur.execute("insert into student1(id, name, class, age) select id, name, class, age from student where 1=1")
今天争取解决掉hash的运用计算;
hash的算法来分表的问题:我处理的办法是将指定字段求sha1值,sha1值有40位之多,为了节约空间,选择最后四位,再转化为10进制,最后取摸l,l为要分的表的个数;
hash生成的函数为:
import hashlib
import string
def func_hash(m):
sha=hashlib.sha1(m)
s=sha.hexdigest()
return s
#这个函数返回字符串m的sha1加密值s
def function_hash(m,l):
a=func_hash(m)
b=a[36:40]
c=int(b,16)
n=c%l
print n
return n
这样用function_hash(Huhu,12)就可以算出的值为5,即该条记录应该放在student5中;
补充说一下数据库查询语句中,表名是常量,如果要对多个表格做循环遍历操作(m为1到12的整数),可以使用student"+str(m)+"来表示表名,这一点在批量修改数据库,以及本次任务的最终目的都是必要的。如下例:
m=function_hash('Huhu',12)
name='student'
#cur.execute("insert into student10(id, name, class, age) select id, name, class, age from student where name='john' ")
print("insert into student10(id, name, class, age) select id, name, class, age from student where name='john' ") #表名为常量
print("insert into student"+str(m)+"(id, name, class, age) select id, name, class, age from student where name='Huhu' ") #正确方式
print("insert into student+str(m)(id, name, class, age) select id, name, class, age from student where name='Huhu' ") #错误写法
cur.execute("insert into student"+str(m)+"(id, name, class, age) select id, name, class, age from student where name='Huhu' ")
接下来新建13个表格,格式都一样,表名是student、student1一直到student12.并且清空里面的所有数据:
for m in range(1,12 +1):
cur.execute("create table student”+str(m)+" (id int ,name varchar(20),class varchar(30),age varchar(10))")
cur.execute("delete from student"+str(m)+" where 1=1")
接下来想用类似的办法在student中多生成一些数据,但是遭遇了bug,这是师傅给的代码,自己明天好好看看:
for i in range(14,100 +1):
sql="insert into student(id, name, class, age) values ("
print(sql)
sql1=""+str(i)+""+','+"\'"+'name'+str(i)+"\'"+','+"\'"+'3 years'+"\'"+','+'10'+')'
print(sql+sql1)
cur.execute(sql+sql1)
这种方法的i的值是传不进去的:
cur.execute("insert into student(id, name, class, age) values(int(i),'name'+str(i),'3years 1class',10)")
day3
今天尝试使用真实得 数据库来操作一次。由于数据库中没有ID,所以遍历的问题,以及每次遍历后的数据记录的转移是个问题;
以下是代码:其中捕获的主键tradetailid是一条记录,不是字符串:
#coding=utf-8
import MySQLdb
import hashlib
import string
def func_hash(m):
sha=hashlib.sha1(m)
s=sha.hexdigest()
return s
#这个函数返回字符串m的sha1加密值s
def function_hash(m,l):
a=func_hash(m)
b=a[36:40]
c=int(b,16)
n=c%l
print n+1
return n+1
conn= MySQLdb.connect(
host='192.168.1.90',
port = 3306,
user='QSuserWrite',
passwd='qs123456',
db ='shijito',
)
cur = conn.cursor()
for i in range(1,50 +1):
print i
sql1="select TradeAccount from shijifrom.TransactionDetails20150101 limit "
sql2=""+str(i)+""+','"1"
# print(sql2)
print(sql1+sql2)
aa=cur.execute(sql1+sql2)
# print aa
info=cur.fetchone()
cur.scroll(0,'absolute')
print info
sql3="select TraDetailID from shijifrom.TransactionDetails20150101 limit "
sql4=""+str(i)+""+','"1"
print(sql3+sql4)
aa=cur.execute(sql3+sql4)
info_id=cur.fetchone()
cur.scroll(0,'absolute')
print info_id
m=function_hash("+info+",12)
print m
cur.execute("insert into TransactionDetails"+str(m)+"( TraDetailID,TradeNo,UserID,TradeAccount,OrderID,OutOrderID,TradeTime,TradeType,OtherSideName,INMoney,OutMoney,AccountBalance,OtherSideAccount,Remark,Platform,PNR,ClearTime,Source,AddTime,BalanceAccountId,AccountType,AccountTypeName,Subject,BillType,A1,A2,A3,A4,A5,A6,A7,A8,A9,A10,A11,A12,KeyWord1,KeyWord2,Passenger,PayForAccount,AllAccountBalance) select * from shijifrom.TransactionDetails20150101 where TraDetailID="+info_id+"")
'''
print("delete from student where name = 'name"+str(i)+"' ")
cur.execute("insert into student"+str(m)+"(id,name,class,age) select id,name,class,age from shijifrom.student where name = 'name"+str(i)+"' ")
cur.execute("delete from student where name = 'name"+str(i)+"' ")
'''
cur.close()
conn.commit()
conn.close()
day4 今天终于实现了基本的功能,只是在hash的具体算法上还需要和公司内定的算法想沟通修改,不过总体的功能已经实现。把一些小的功能分到不同的Python程序中实现。
求总行数
#coding=utf-8
import MySQLdb
import hashlib
import string
conn= MySQLdb.connect(
host='192.168.1.90',
port = 3306,
user='QSuserWrite',
passwd='qs123456',
db ='shijito',
)
cur = conn.cursor()
date='shijifrom.TransactionDetails20150101'
sql="select count(TraDetailID) from "+date+" where 1=1"
aa=cur.execute(sql)
info_id=cur.fetchone()
print info_id[0]
#print("+str(info_id[0])+")
for i in range(1,info_id[0] +1):
print i
cur.close()
conn.commit()
conn.close()
表格数据的新建和删除:
#coding=utf-8
import MySQLdb
import hashlib
import string
conn= MySQLdb.connect(
host='192.168.1.90',
port = 3306,
user='QSuserWrite',
passwd='qs123456',
db ='shijito',
)
cur = conn.cursor()
'''
#新建数据库和表格
for i in range(13,14 +1):
cur.execute("create table student"+str(i)+" (id int,name varchar(20),class varchar(30),age varchar(10));")
cur.execute("Alter table student"+str(i)+" ADD primary key(id) ")
for p in range(1,100 +1):
sql="insert into student"+str(i)+" (id, name, class, age) values ("
print(sql)
sql1=""+str(p)+""+','+"\'"+'name'+str(p)+"\'"+','+"\'"+'3 years'+"\'"+','+'10'+')'
print(sql+sql1)
cur.execute(sql+sql1)
#新建数据库和表格
'''
#新建某个表的表格
'''
for p in range(1,1000 +1):
sql="insert into student (id, name, class, age) values ("
sql1=""+str(p)+""+','+"\'"+'name'+str(p)+"\'"+','+"\'"+'3 years 1class'+"\'"+','+'10'+')'
print(sql+sql1)
cur.execute(sql+sql1)
'''
for p in range(0,29 +1):
sql="delete from TransactionDetails"+str(p)+" where 1=1;"
print(sql)
cur.execute(sql)
#新建某个表的表格
'''
#规范化原表的求hash的值(name)
for n in range(1,12 +1):
print("update student set name='name"+str(n)+"' where id='"+str(n)+"'")
cur.execute("update student set name='name"+str(n)+"' where id='"+str(n)+"'")
#规范化原表的求hash的值(name)
'''
cur.close()
conn.commit()
conn.close()
自己模拟的hash函数:
# -*- coding: cp936 -*-
import string
import hashlib
def func_hash(m):
sha=hashlib.sha1(m)
s=sha.hexdigest()
return s
#这个函数返回字符串m的sha1加密值s
def function_hash(m,l):
a=func_hash(m)
b=a[36:40]
c=int(b,16)
n=c%l
print n
return n
function_hash('mingrixiang', 12)
最终的实现代码,实现了跨数据库筛选数据,期间解决了遍历的问题:
#coding=utf-8
import MySQLdb
import hashlib
import string
def func_hash(m):
sha=hashlib.sha1(m)
s=sha.hexdigest()
return s
#这个函数返回字符串m的sha1加密值s
def function_hash(m,l):
a=func_hash(m)
b=a[36:40]
c=int(b,16)
n=c%l
print n
return n
conn= MySQLdb.connect(
host='192.168.1.90',
port = 3306,
user='QSuserWrite',
passwd='qs123456',
db ='shijito',
)
cur = conn.cursor()
date='QUserTransactionDetailsDB2015.TransactionDetails20151128'
#实时交易日期!!!!
sql_num="select count(TraDetailID) from "+date+" where 1=1"
aa=cur.execute(sql_num)
info_num=cur.fetchone()
print("select count(TraDetailID) from "+date+" where 1=1")
print info_num[0]
#获取表格的总行数
for i in range(0,info_num[0]-1 +1):
print i
sql1="select TradeAccount from "+date+" limit "
sql3=""+str(i)+""+','"1"
print(sql1+sql3)
aa=cur.execute(sql1+sql3)
info=cur.fetchone()
print info[0]
#获取TradeAccount
sql2="select TraDetailID from "+date+" limit "
print(sql2+sql3)
bb=cur.execute(sql2+sql3)
info_id=cur.fetchone()
print info_id[0]
#获取主键TraDetailID
m=function_hash(info[0],30)
print(info[0],30)
print m
# print ("TransactionDetails"+str(m)+"")
print("insert into TransactionDetails"+str(m)+"")
strSql1="insert into TransactionDetails"+str(m)+"(TraDetailID,TradeNo,UserID,TradeAccount,OrderID,\
OutOrderID,TradeTime,TradeType,OtherSideName,INMoney,OutMoney,AccountBalance,OtherSideAccount,Remark,Platform,\
PNR,ClearTime,Source,AddTime,BalanceAccountId,AccountType,AccountTypeName,Subject,BillType,A1,A2,A3,A4,A5,A6,A7,\
A8,A9,A10,A11,A12,KeyWord1,KeyWord2,Passenger,PayForAccount) select TraDetailID,TradeNo,UserID,TradeAccount,OrderID,\
OutOrderID,TradeTime,TradeType,OtherSideName,INMoney,OutMoney,AccountBalance,OtherSideAccount,Remark,Platform,\
PNR,ClearTime,Source,AddTime,BalanceAccountId,AccountType,AccountTypeName,Subject,BillType,A1,A2,A3,A4,A5,A6,A7,\
A8,A9,A10,A11,A12,KeyWord1,KeyWord2,Passenger,PayForAccount from "+date+" where TraDetailID="
strSql2="\'"+str(info_id[0])+"\'"
print(strSql1+strSql2)
cur.execute(strSql1+strSql2)
#插入数据的操作
cur.close()
conn.commit()
conn.close()
接下来还可以优化一下遍历表格并且转移相同的tradeaccount的记录,可以一次查询转移多条记录,还需要慢慢改进;
另外的问题是hash的规则;