Day 18: Downloading Data and Web APIs
-
Summary of commonly used Python modules
-
Accessing and analyzing CSV data files
- Using the csv module
```python
import csv

filename = 'sitka_weather_07-2014.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
```
- The enumerate() function: enumerate() takes an iterable (such as a list, tuple, or string) and wraps it into an indexed sequence, yielding each item together with its index. It is typically used in a for loop.
Signature: enumerate(sequence, start=0)
```python
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    for index, column_header in enumerate(header_row):
        print(index, column_header)
```
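As a minimal standalone illustration of enumerate() and its optional start argument (the seasons list here is made up for the example):

```python
seasons = ['Spring', 'Summer', 'Fall', 'Winter']

# Default: indices start at 0.
for index, season in enumerate(seasons):
    print(index, season)

# The optional start argument shifts the first index.
print(list(enumerate(seasons, start=1))[0])   # (1, 'Spring')
```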
- Iterating through the CSV file and extracting data: for + append
```python
import csv
from datetime import datetime

filename = 'sitka_weather_07-2014.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)

    dates, highs, lows = [], [], []
    for row in reader:
        current_date = datetime.strptime(row[0], "%Y-%m-%d")
        high = int(row[1])
        low = int(row[3])
        dates.append(current_date)
        highs.append(high)
        lows.append(low)
```
- Error handling
```python
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)

    dates, highs, lows = [], [], []
    for row in reader:
        try:
            current_date = datetime.strptime(row[0], "%Y-%m-%d")
            high = int(row[1])
            low = int(row[3])
        except ValueError:
            print(current_date, 'missing data')
        else:
            dates.append(current_date)
            highs.append(high)
            lows.append(low)
```
-
JSON format
-
pygal.i18n no longer exists: No module named 'pygal.i18n' error:
- Use pygal_maps_world.i18n instead:
- OS X
$ pip install pygal_maps_world
- Windows
> python -m pip install pygal_maps_world
- Then change from pygal.i18n import COUNTRIES to:

```python
from pygal_maps_world.i18n import COUNTRIES
```
-
module 'pygal' has no attribute 'Worldmap' error
- Use pygal_maps_world.maps instead:
```python
import pygal_maps_world.maps

wm = pygal_maps_world.maps.World()
```
-
Web API
-
Web APIs are used to interact with websites and request data (returned as JSON or CSV).
-
The requests package lets Python request information from a website and inspect the response it gets back.
- Installing the requests package
- OS X
$ pip install --user requests
- Windows
> python -m pip install --user requests
-
Processing the API response dictionary
```python
import requests

# Make an API call and store the response.
url = "https://api.github.com/search/repositories?q=language:python&sort=stars"
r = requests.get(url)
print("Status code:", r.status_code)

# Store the API response in a dictionary.
response_dict = r.json()
print("Total repositories:", response_dict['total_count'])

# Explore information about the repositories.
repo_dicts = response_dict['items']
print("Repositories returned:", len(repo_dicts))

# Examine the first repository.
repo_dict = repo_dicts[0]
print("\nKeys:", len(repo_dict))
for key in repo_dict.keys():
    print(key)
```
-
Digging deeper into the repositories
```python
# Examine each repository.
for repo_dict in repo_dicts:
    print("\nSelected information about repository:")
    print('Name:', repo_dict['name'])
    print('Owner:', repo_dict['owner']['login'])
    print('Stars:', repo_dict['stargazers_count'])
    print('Repository:', repo_dict['html_url'])
    print('Created:', repo_dict['created_at'])
    print('Updated:', repo_dict['updated_at'])
    print('Description:', repo_dict['description'])
```
-
'NoneType' object has no attribute 'decode' error: this error appears when running the following code:
```python
import pygal
from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

names, plot_dicts = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict['name'])
    plot_dict = {
        'value': repo_dict['stargazers_count'],
        'label': repo_dict['description'],
    }
    plot_dicts.append(plot_dict)

# Visualize the results.
my_style = LS('#333366', base_style=LCS)

my_config = pygal.Config()
my_config.x_label_rotation = 45
my_config.show_legend = False
my_config.title_font_size = 24
my_config.label_font_size = 14
my_config.major_label_font_size = 18
my_config.truncate_label = 15
my_config.show_y_guides = False
my_config.width = 1000

chart = pygal.Bar(my_config, style=my_style)
chart.title = 'Most-starred Python Projects on GitHub'
chart.x_labels = names
chart.add('', plot_dicts)
chart.render_to_file('python_repos.svg')
```
Two solutions are possible. The first method, i.e. changing

'label': repo_dict['description'],

to

'label': str(repo_dict['description']),

is simple and convenient.
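The cast works because some GitHub repositories have their description set to null, which requests deserializes as None, and pygal then fails trying to decode a None label. A minimal sketch (the repo_dict literal here is hypothetical):

```python
# Hypothetical entry for a repository with no description.
repo_dict = {'name': 'example', 'stargazers_count': 10, 'description': None}

label = repo_dict['description']
print(label is None)   # True: pygal cannot decode this as a label
print(str(label))      # 'None': a plain string pygal can render
```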
-
The Hacker News API covers the following three points:
- Using the list returned by one Web API call to dynamically build URLs for further Web API calls, then calling the API again to fetch each item's data;
- The dict.get() method: when you are not sure whether a key exists in a dictionary, use dict.get(), which returns the value associated with the key if it exists, and otherwise returns the default supplied as its second argument;
- The itemgetter() function from the operator module, used together with sorted(): passing the key 'comments' to itemgetter() extracts the value associated with 'comments' from each dictionary in the list, and sorted() then orders the list by those values.
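The dict.get() and itemgetter() behaviors described above can be sketched in isolation (the d and posts data below are made up for illustration):

```python
from operator import itemgetter

# dict.get() returns a default instead of raising KeyError.
d = {'title': 'Example post'}
print(d.get('descendants', 0))   # key missing -> default 0 is returned
print(d.get('title'))            # key present -> its value is returned

# itemgetter('comments') pulls the 'comments' value from each dict;
# sorted() then orders the list by that value.
posts = [{'name': 'a', 'comments': 5},
         {'name': 'b', 'comments': 12},
         {'name': 'c', 'comments': 1}]
ranked = sorted(posts, key=itemgetter('comments'), reverse=True)
print([p['name'] for p in ranked])   # most-commented first
```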
```python
import requests
from operator import itemgetter

# Make an API call and store the response.
url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
r = requests.get(url)
print('Status code:', r.status_code)

# Process information about each submission.
submission_ids = r.json()

# Empty list to hold the dictionaries for the top submissions.
submission_dicts = []

# Take the IDs of the top 30 submissions.
for submission_id in submission_ids[:30]:
    # Make a separate API call for each submission, building the URL
    # from the ID stored in the submission_ids list.
    url = ('https://hacker-news.firebaseio.com/v0/item/' +
           str(submission_id) + '.json')
    submission_r = requests.get(url)
    print(submission_r.status_code)
    response_dict = submission_r.json()

    # Build a dictionary for the current submission.
    submission_dict = {
        'title': response_dict['title'],
        'link': 'http://news.ycombinator.com/item?id=' + str(submission_id),
        'comments': response_dict.get('descendants', 0),
    }
    submission_dicts.append(submission_dict)

submission_dicts = sorted(submission_dicts, key=itemgetter('comments'),
                          reverse=True)

for submission_dict in submission_dicts:
    print('\nTitle:', submission_dict['title'])
    print('Discussion link:', submission_dict['link'])
    print('Comments:', submission_dict['comments'])
```
A sample of the data returned by the code above:
```json
[{"title": "Glitter bomb tricks parcel thieves", "link": "http://news.ycombinator.com/item?id=18706193", "comments": 304},
 {"title": "Stop Learning Frameworks", "link": "http://news.ycombinator.com/item?id=18706785", "comments": 175},
 {"title": "Reasons Python Sucks", "link": "http://news.ycombinator.com/item?id=18706174", "comments": 175},
 {"title": "I need to copy 2000+ DVDs in 3 days. What are my options?", "link": "http://news.ycombinator.com/item?id=18690587", "comments": 167},
 {"title": "SpaceX Is Raising $500M at a $30.5B Valuation", "link": "http://news.ycombinator.com/item?id=18706506", "comments": 139},
 ......... ]
```
-