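A Beginner's Learning Journey
Given a URL, fetch its data, assemble the data into a dictionary, and then compare it against a second data source.
Approach
1) Fetch the data from the URL
2) Extract the dateLeft and pe fields from the response
3) Assemble them into a dictionary
4) Obtain login authorization (a requirement added later)
5) Capture network traffic to find the login interface and the interface used by the PC data browser
6) Compare the data (compare)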
Industry-chain interface:
url1 = 'http://ft.10jqka.com.cn/thsft/iFindService/Chain/new-analyse/getzbkbxhyzs?type=bdt'
Login interface:
url2 = 'http://ft.10jqka.com.cn/thsft/jgbservice?reqtype=verify&account=V3O%2BEomC5Ak3igK60gSGUYqnFzYuTJdEVZCBHS85FM3PjCznPoIBgWtSxBMI8hRK6eO4sUif1WcksFwgBBTGQLM8nDMFfR2Se%2BH6tBpZK7QQmUmyvWaMxELRZjIZsPR2xBUvnMpjUwkcb8emY5%2FDTItUldZMqg1uQCYLzkP6fuw%3D&passwd=TLNGOHH0GbDjvakxY%2FkQwoCJRpdlMuC5cd78dB9U3ul67s3AZxRwrbvKlM4Uro%2BJKcVM74ZRRl1ngDsXT2iaOl7YXqzqpadCtu%2B6kVLuX7cq6wFEWDEEzQvYfiVnVIP3evFblaOYspEWmnZ5mo4tpiVdwHgimV7Nd3YPG55Tbto%3D&sid=91&qsid=800&product=E02&version=8.60.50&imei=263714F8DFC0701D1DAA25C40528F8A1&securities=同花顺统一版iFinD&nohqlist=0&passwdencoded=1&jgbversion=1.10.12.339&check=1&Encrypt=1'
Data-browser interface:
url3 = 'http://ft.10jqka.com.cn/thsft/sectorservice_new'
First attempt at the code:
1) Search Baidu and blogs for the syntax needed to get the result I wanted
Import the required modules: requests and json
#!/usr/bin/python
# -*- coding: UTF-8 -*-
import requests
import json
url = 'http://172.19.80.62:81/thsft/iFindService/Chain/new-analyse/getzbkbxhyzs?type=bdt'
r = requests.post(url)
result = json.loads(r.content)
print(result)
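This returns the data behind the URL.
Next, pull out the dateLeft and pe values you need.
The code below uses the most basic approach, walking down one level at a time: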
dateLeft = result['data']['left']['dateLeft']
print('dateLeft: ',dateLeft)
pe = result['data']['left']['option1'][u'\u534a\u5bfc\u4f53']['pe']
print('pe: ',pe)
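At this point you have the data you wanted. One thing worth pointing out is this line:
pe = result['data']['left']['option1'][u'\u534a\u5bfc\u4f53']['pe']
I had not been paying attention to encoding, so for a long time I could not get the pe data. The key u'\u534a\u5bfc\u4f53' is the Unicode-escaped form of 半导体 (semiconductors). After a lot of searching I finally noticed the encoding issue, which turned out to be rather fun.
It could even be a fun way to write a love confession in code, lol.
For example: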
print (u'\u9738\u738b\u522b\u59ec')
Output:
霸王别姬
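To make the encoding point concrete, here is a small sketch, written for Python 3, showing that the escaped key and the literal Chinese key are the same string. The payload is hypothetical: it only mimics the nesting used in this post, and the dates and pe numbers are invented.

```python
import json

# Hypothetical payload: it only mimics the nesting used in this post,
# and the values are invented for illustration.
payload = {
    "data": {
        "left": {
            "dateLeft": ["2019-01-02", "2019-01-03"],
            "option1": {"半导体": {"pe": [35.2, 35.8]}},
        }
    }
}

# json.dumps escapes non-ASCII characters by default,
# so the key is written out as \uXXXX sequences.
wire_text = json.dumps(payload)
result = json.loads(wire_text)

escaped_key = "\u534a\u5bfc\u4f53"
literal_key = "半导体"

# Both spellings denote the same three characters, so either works as the key.
same_key = escaped_key == literal_key
pe = result["data"]["left"]["option1"][escaped_key]["pe"]

# Turn any text into its escape sequences (handy for the "confession" trick).
as_escapes = literal_key.encode("unicode_escape").decode("ascii")
print(same_key, pe, as_escapes)
```

So a key that looks unreadable in escaped form can still be looked up with the plain Chinese text, as long as the source file's encoding is declared correctly.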
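With both values in hand, the next step is to assemble the two lists into a dictionary.
The thing to watch for is that the two lists must have the same length, so check that first.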
if len(dateLeft) != len(pe):
    print("The two lengths are inconsistent and cannot form a dictionary!")
else:
    dictionary = dict(zip(dateLeft, pe))
    print(dictionary)
I did the check with an if…else statement, which is about my level right now (sob).
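The length check matters because zip() stops at the shorter input without complaining. A minimal sketch for Python 3 follows; build_dict is a hypothetical helper name, the sample data is invented, and raising an exception instead of printing is my own choice rather than what the original code does.

```python
def build_dict(dates, values):
    """Pair each date with its value, refusing mismatched lengths."""
    if len(dates) != len(values):
        raise ValueError(
            "length mismatch: %d dates vs %d values" % (len(dates), len(values))
        )
    return dict(zip(dates, values))


# Made-up sample data for illustration.
sample_dates = ["2019-01-02", "2019-01-03"]
sample_pe = [35.2, 35.8]
paired = build_dict(sample_dates, sample_pe)

# Without the check, zip() silently drops the unmatched date.
truncated = dict(zip(sample_dates, [35.2]))
print(paired, truncated)
```

Raising makes the caller decide what to do with bad data, whereas a print lets the program carry on with nothing to return.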
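From here the code needs to be wrapped in a method so it is easy to reuse.
I spent ages reading a book about this part; forgive me, I really am a beginner and could not follow it, so I asked one of our experts to walk me through it.
In the end I arrived at the code below: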
#!/usr/bin/python
# -*- coding: UTF-8 -*-
# Note: this script uses Python 2 syntax (print statements).
import requests
import json


class Get_data:
    ''' # This constructor is commented out...
    def __init__(self,dateLeft,pe):
        self.dateLeft = dateLeft
        self.pe = pe
    '''

    def get_data(self, url):
        # POST to the interface and parse the JSON body
        r = requests.post(url)
        result = json.loads(r.content)
        return result

    def get_jbsessionid(self, url):
        # Call the login interface and take the value of the first cookie in Set-Cookie
        headers = requests.get(url).headers
        print 'headers:', headers
        jgbsessid = headers['Set-Cookie'].split(';')[0].split('=')[1]
        #print jgbsessid
        return jgbsessid

    def get_response(self, url1, url2):
        jgbsessid1 = self.get_jbsessionid(url2)
        print 'jgbsessid:', jgbsessid1
        # Append the session id so the data request is authorized
        url1 = url1 + '&jgbsessid=' + jgbsessid1
        print 'url1', url1
        result = self.get_data(url1)
        #print(result)
        dateLeft = result['data']['left']['dateLeft']
        print 'dateLeft:', dateLeft
        pe = result['data']['left']['option1'][u'\u534a\u5bfc\u4f53']['pe']
        #print 'pe:', pe
        if len(dateLeft) != len(pe):
            print "The two lengths are inconsistent and cannot form a dictionary!"
            # Nothing to return when the lengths differ
            return None
        dictionary = dict(zip(dateLeft, pe))
        #print dictionary
        return dictionary


if __name__ == "__main__":
    url1 = 'http://ft.10jqka.com.cn/thsft/iFindService/Chain/new-analyse/getzbkbxhyzs?type=bdt'
    url2 = 'http://ft.10jqka.com.cn/thsft/jgbservice?reqtype=verify&account=V3O%2BEomC5Ak3igK60gSGUYqnFzYuTJdEVZCBHS85FM3PjCznPoIBgWtSxBMI8hRK6eO4sUif1WcksFwgBBTGQLM8nDMFfR2Se%2BH6tBpZK7QQmUmyvWaMxELRZjIZsPR2xBUvnMpjUwkcb8emY5%2FDTItUldZMqg1uQCYLzkP6fuw%3D&passwd=TLNGOHH0GbDjvakxY%2FkQwoCJRpdlMuC5cd78dB9U3ul67s3AZxRwrbvKlM4Uro%2BJKcVM74ZRRl1ngDsXT2iaOl7YXqzqpadCtu%2B6kVLuX7cq6wFEWDEEzQvYfiVnVIP3evFblaOYspEWmnZ5mo4tpiVdwHgimV7Nd3YPG55Tbto%3D&sid=91&qsid=800&product=E02&version=8.60.50&imei=263714F8DFC0701D1DAA25C40528F8A1&securities=同花顺统一版iFinD&nohqlist=0&passwdencoded=1&jgbversion=1.10.12.339&check=1&Encrypt=1'
    myDj = Get_data()
    #a = myDj.get_response(url2)
    # get_response needs both the data URL and the login URL
    b = myDj.get_response(url1, url2)
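A jgbsessid shows up in this code. It is what grants login authorization. One of the developers explained it to me and I wrote this part under his guidance. I understand every line now, but I still could not have written it on my own (sniff~~~).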
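For readers who want to see the session-id step in isolation, here is a hedged sketch for Python 3 using only the standard library. It assumes the cookie is named jgbsessid, whereas the code in this post simply takes the value of the first cookie. The header string and the example.com URL are hypothetical stand-ins, and session_id_from_header and with_query_param are illustrative names of my own.

```python
from http.cookies import SimpleCookie
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def session_id_from_header(set_cookie, name="jgbsessid"):
    """Return the value of the named cookie, or None if it is absent."""
    cookie = SimpleCookie()
    cookie.load(set_cookie)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None


def with_query_param(url, key, value):
    """Return url with key=value appended to its query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical header value; the real one comes from the login response.
header_value = "jgbsessid=8a76c1e39131c06c670842d152118c9c; Path=/"
sid = session_id_from_header(header_value)

# Hypothetical host; only the query-string handling is the point here.
authorized = with_query_param(
    "http://example.com/getzbkbxhyzs?type=bdt", "jgbsessid", sid
)
print(sid, authorized)
```

Parsing the header with SimpleCookie avoids one weakness of split('=')[1], which cuts the value short if it happens to contain an equals sign.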
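On with my learning journey~
The program above does the following:
1) Fetches the industry-chain interface data, assembles it into a dictionary, and wraps that in a method
2) Logs in first, because every data request needs login authorization.
The next step is to fetch the data-browser data, assemble it into a dictionary as well, and compare the two.
1) Define a function that calls the data browser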
def get_browser(self, url, data):
    # Call the data browser
    browser = requests.post(url=url, data=data)
    #print 'browser:', browser
    content1 = browser.content
    #print 'content1:', content1
    return content1
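2) Fetch the pe data from the data browser:
The 半导体 (semiconductors) module in the data browser holds a lot of data, so the pe values are fetched in a loop, one date at a time, and the results are put into the same format as the industry-chain data they will be compared against.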
dictionary1 = {}
for i in range(len(dateLeft)):
    date = dateLeft[i]
    date = date.replace('-', '')
    #print 'date:', date
    pe_name1 = '''<?xml version="1.0" encoding="gbk"?><request><items>
<item name="654069_000_00_0_1">
<params><param name="F0" value="'''+date+'''" system="false"/>
<param name="F2" value="100" system="true"/>
<param name="FN" value="100" system="true"/>
</params></item></items><sectors history="false">
<sector name="001030008001" system="true"/>
</sectors></request>'''
    ################
    data1 = header  # header: {'jgbsessid': '8a76c1e39131c06c670842d152118c9c'}
    data1['xml_request'] = pe_name1
    # Ask for the response in JSON format
    data1['return'] = 'json'
    # The request itself (the try...except step described next) goes here
print 'dictionary1:', dictionary1
self.compare(dict2, dictionary1)
return data_content
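The request-building part of that loop can be pulled into small helpers. This is only a sketch for Python 3: build_request and build_form are hypothetical names, and the XML fields are copied from the snippet above without any knowledge of what each one means to the server.

```python
from xml.sax.saxutils import quoteattr

# Same fields as the XML in this post; {date} is the only part that changes.
REQUEST_TEMPLATE = '''<?xml version="1.0" encoding="gbk"?><request><items>
<item name="654069_000_00_0_1">
<params><param name="F0" value={date} system="false"/>
<param name="F2" value="100" system="true"/>
<param name="FN" value="100" system="true"/>
</params></item></items><sectors history="false">
<sector name="001030008001" system="true"/>
</sectors></request>'''


def build_request(date):
    """Build the XML request for one date such as '2019-01-02'."""
    compact = date.replace("-", "")
    # quoteattr adds the surrounding quotes and escapes special characters.
    return REQUEST_TEMPLATE.format(date=quoteattr(compact))


def build_form(session, date):
    """Return the POST form for one date without mutating `session`."""
    form = dict(session)  # copy, so the caller's dict stays unchanged
    form["xml_request"] = build_request(date)
    form["return"] = "json"
    return form


session = {"jgbsessid": "8a76c1e39131c06c670842d152118c9c"}
forms = [build_form(session, d) for d in ["2019-01-02", "2019-01-03"]]
print(forms[0]["xml_request"])
```

Copying the session dict means each request gets its own form, whereas data1 = header in the original makes both names point at the same dictionary.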
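Exceptions or errors can occur while this runs, and they would terminate the program.
So try…except is used for exception handling, which keeps the program from stopping on an exception: put the code that might fail inside the try block and handle the exception in except.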
try:
    data_content = self.get_browser(url3, data1)
    data_content = json.loads(data_content)
    #print 'type(data_content):',data_content['row']
    data_date = data_content['row'][0][1]
    #print 'data_date:',data_date
    data_value = data_content['row'][0][2]
except Exception:
    # Fall back to placeholder values when the request or the parsing fails
    data_date = '0000-00-00'
    data_value = '000'
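Catching Exception hides every kind of failure, including plain typos. A narrower variant might look like the sketch below, written for Python 3. The row layout (date at index 1, value at index 2) is taken from the indexing in this post, the sample strings are invented, and extract_row and PLACEHOLDER are hypothetical names.

```python
import json

# Returned when the response cannot be used; None marks "no value".
PLACEHOLDER = ("0000-00-00", None)


def extract_row(raw):
    """Return (date, value) from the first row, or PLACEHOLDER on bad input."""
    try:
        row = json.loads(raw)["row"][0]
        return row[1], row[2]
    except (ValueError, KeyError, IndexError, TypeError):
        # ValueError: not valid JSON; KeyError: no "row" field;
        # IndexError: empty or short row; TypeError: unexpected shape.
        return PLACEHOLDER


good = extract_row('{"row": [["x", "20190102", 35.21]]}')
not_json = extract_row("<html>login required</html>")
no_rows = extract_row('{"row": []}')
print(good, not_json, no_rows)
```

Using None as the fallback value keeps later numeric steps such as round() from receiving a string like '000'.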
3) Define a comparison function, compare
Compare the two sets of data: print True for keys whose values match and False for keys whose values do not.
def compare(self, dictionary, dictionary1):
    for key in dictionary:
        # .get() returns None for a missing key instead of raising KeyError
        if dictionary[key] == dictionary1.get(key):
            print 'True', key
        else:
            print 'False', key
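One thing to watch with ==: if one dictionary holds numbers and the other holds strings, the comparison is always False even when the figures agree. The sketch below, for Python 3, converts both sides to float and allows a small tolerance. The tolerance of 0.05, the name compare_dicts, and the sample values are all my own assumptions, and it presumes every value is a number or a numeric string.

```python
import math


def compare_dicts(expected, actual, tol=0.05):
    """Return {key: (expected, actual)} for every key that does not match."""
    mismatches = {}
    for key, want in expected.items():
        got = actual.get(key)  # None when the key is missing
        if got is None or not math.isclose(float(want), float(got), abs_tol=tol):
            mismatches[key] = (want, got)
    return mismatches


# Invented sample data: mixed types on purpose.
chain = {"20190102": 35.2, "20190103": "35.8", "20190104": 36.0}
browser = {"20190102": "35.2", "20190103": 36.5}

diff = compare_dicts(chain, browser)
print(diff)
```

Returning the mismatches instead of printing True or False per key makes the result easy to assert on in a test.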
Complete program:
#!/usr/bin/python
# -*- coding: UTF-8 -*-
# Note: this script uses Python 2 syntax (print statements, iteritems, unicode).
import requests
import json


class Get_data:
    '''
    def __init__(self,dateLeft,pe):
        self.dateLeft = dateLeft
        self.pe = pe
    '''

    def get_data(self, url):
        r = requests.post(url)
        result = json.loads(r.content)
        return result

    def get_jbsessionid(self, url):
        headers = requests.get(url).headers
        print 'headers:', headers
        jgbsessid = headers['Set-Cookie'].split(';')[0].split('=')[1]
        header = {'jgbsessid': jgbsessid}
        # Return two values
        return jgbsessid, header

    def get_browser(self, url, data):
        # Call the data browser
        browser = requests.post(url=url, data=data)
        #print 'browser:', browser
        content1 = browser.content
        #print 'content1:', content1
        return content1

    def convert(self, input):
        # Recursively encode unicode strings as UTF-8 byte strings
        if isinstance(input, dict):
            return {self.convert(key): self.convert(value) for key, value in input.iteritems()}
        elif isinstance(input, list):
            return [self.convert(element) for element in input]
        elif isinstance(input, unicode):
            return input.encode('utf-8')
        else:
            return input

    def get_response(self, url1, url2, url3):
        # Log in; the call returns two values, so unpack two
        jgbsessid1, header = self.get_jbsessionid(url2)
        print 'jgbsessid:', jgbsessid1
        print 'header:', header
        # By convention the session id is appended to the URL
        url1 = url1 + '&jgbsessid=' + jgbsessid1
        print 'url1', url1
        ########## Fetch the PE data from the interface ##########
        result = self.get_data(url1)
        #print(result)
        dateLeft = result['data']['left']['dateLeft']
        print 'dateLeft:', dateLeft
        # PE data from the interface
        pe = result['data']['left']['option1'][u'\u534a\u5bfc\u4f53']['pe']
        #print 'pe:', pe
        if len(dateLeft) != len(pe):
            print "The two lengths are inconsistent and cannot form a dictionary!"
            # Nothing to compare when the lengths differ
            return None
        dictionary = dict(zip(dateLeft, pe))
        dictionary = self.convert(dictionary)
        # Strip the dashes from the date keys, as is done for the data-browser request
        dict2 = {}
        for key in dictionary:
            key2 = key.replace('-', '')
            value = dictionary[key]
            dict2[key2] = value
        print 'dict2:', dict2
        ########## Fetch the PE data from the data browser ##########
        dictionary1 = {}
        data_content = None
        for i in range(len(dateLeft)):
            date = dateLeft[i]
            date = date.replace('-', '')
            #print 'date:', date
            pe_name1 = '''<?xml version="1.0" encoding="gbk"?><request><items>
<item name="654069_000_00_0_1">
<params><param name="F0" value="'''+date+'''" system="false"/>
<param name="F2" value="100" system="true"/>
<param name="FN" value="100" system="true"/>
</params></item></items><sectors history="false">
<sector name="001030008001" system="true"/>
</sectors></request>'''
            ################
            data1 = header  # header: {'jgbsessid': '8a76c1e39131c06c670842d152118c9c'}
            data1['xml_request'] = pe_name1
            # Ask for the response in JSON format
            data1['return'] = 'json'
            # Call the data browser to get the indicator value
            try:
                data_content = self.get_browser(url3, data1)
                data_content = json.loads(data_content)
                #print 'type(data_content):',data_content['row']
                data_date = data_content['row'][0][1]
                #print 'data_date:',data_date
                data_value = data_content['row'][0][2]
            except Exception:
                # Store the placeholders directly: round() below would fail on the string '000'
                dictionary1['0000-00-00'] = '000'
                continue
            #print 'data_value:', data_value
            dictionary1[str(data_date)] = str(round(data_value, 1))
        print 'dictionary1:', dictionary1
        self.compare(dict2, dictionary1)
        return data_content

    def compare(self, dictionary, dictionary1):
        for key in dictionary:
            # .get() returns None for a missing key instead of raising KeyError
            if dictionary[key] == dictionary1.get(key):
                print 'True', key
            else:
                print 'False', key


if __name__ == "__main__":
    # Industry-chain interface
    url1 = 'http://ft.10jqka.com.cn/thsft/iFindService/Chain/new-analyse/getzbkbxhyzs?type=bdt'
    # Login interface
    url2 = 'http://ft.10jqka.com.cn/thsft/jgbservice?reqtype=verify&account=V3O%2BEomC5Ak3igK60gSGUYqnFzYuTJdEVZCBHS85FM3PjCznPoIBgWtSxBMI8hRK6eO4sUif1WcksFwgBBTGQLM8nDMFfR2Se%2BH6tBpZK7QQmUmyvWaMxELRZjIZsPR2xBUvnMpjUwkcb8emY5%2FDTItUldZMqg1uQCYLzkP6fuw%3D&passwd=TLNGOHH0GbDjvakxY%2FkQwoCJRpdlMuC5cd78dB9U3ul67s3AZxRwrbvKlM4Uro%2BJKcVM74ZRRl1ngDsXT2iaOl7YXqzqpadCtu%2B6kVLuX7cq6wFEWDEEzQvYfiVnVIP3evFblaOYspEWmnZ5mo4tpiVdwHgimV7Nd3YPG55Tbto%3D&sid=91&qsid=800&product=E02&version=8.60.50&imei=263714F8DFC0701D1DAA25C40528F8A1&securities=同花顺统一版iFinD&nohqlist=0&passwdencoded=1&jgbversion=1.10.12.339&check=1&Encrypt=1'
    # Data-browser interface
    url3 = 'http://ft.10jqka.com.cn/thsft/sectorservice_new'
    # Create an instance
    myDj = Get_data()
    #a = myDj.get_response(url2)
    # Call the method on the instance
    b = myDj.get_response(url1, url2, url3)
    #print 'a:', a
    #print 'b', b
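That was my first time learning through a real piece of work. Some of it I wrote myself from my own beginner's understanding plus Baidu searches, and some with guidance from a developer colleague. I still need to work through it slowly on my own, and I hope to be able to complete a test like this independently.
Guidance from all of you is very welcome, hehe~~~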