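Python Programming: Data Visualization 2 -- Downloading Data
Downloading Data
After working through this article, you will be able to handle data sets of various types and formats, and you will have a deeper understanding of how to build complex charts.
1. The CSV File Format
A CSV file stores each record as one line of text, with the values separated by commas (,). For example: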
2019-1-6,63,52,96,85,85
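As a minimal sketch of how such a line is split into fields, the snippet below feeds the sample record to `csv.reader` through an in-memory buffer. The `io.StringIO` wrapper and the variable names `sample` and `row` are assumptions made for illustration; they stand in for a real file.

```python
import csv
import io

# A one-line in-memory "file" holding the sample record shown above.
sample = io.StringIO("2019-1-6,63,52,96,85,85\n")

# csv.reader splits each line on commas and yields a list of strings.
row = next(csv.reader(sample))
print(row)
```

Every field comes back as a string, including the numeric ones, so any arithmetic requires an explicit conversion first.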
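The CSV file used in this article contains weather data, which can be downloaded from http://www.wunderground.com/history.
I have also uploaded it to Baidu Netdisk, where readers can download it themselves.
Link: https://pan.baidu.com/s/1MZgZghOADlKBdVUV9u7Qow
Extraction code: ssnp
1.1 Parsing the CSV Header
Next we use the csv module to parse the rows of the CSV file, which makes it quick to extract the values we are interested in. We start with the first row, which holds a series of descriptions of the data in each column: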
highs_lows.py
import csv

filename = "sitka_weather_07-2014.csv"
with open(filename) as f:
    reader = csv.reader(f)
    # next() returns the first line of the file: the header row.
    header_row = next(reader)
    print(header_row)
Output:
['AKDT', 'Max TemperatureF', 'Mean TemperatureF', 'Min TemperatureF', 'Max Dew PointF', 'MeanDew PointF', 'Min DewpointF', 'Max Humidity', ' Mean Humidity', ' Min Humidity', ' Max Sea Level PressureIn', ' Mean Sea Level PressureIn', ' Min Sea Level PressureIn', ' Max VisibilityMiles', ' Mean VisibilityMiles', ' Min VisibilityMiles', ' Max Wind SpeedMPH', ' Mean Wind SpeedMPH', ' Max Gust SpeedMPH', 'PrecipitationIn', ' CloudCover', ' Events', ' WindDirDegrees']
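It may help to see that `next(reader)` consumes only the first line, so later iteration over the same reader starts at the first data row. The sketch below uses a made-up three-column excerpt held in memory; the contents of `text` are hypothetical and are not taken from the real Sitka file.

```python
import csv
import io

# Hypothetical three-column excerpt, NOT the real Sitka data.
text = (
    "AKDT,Max TemperatureF,Min TemperatureF\n"
    "2014-7-1,64,50\n"
    "2014-7-2,71,55\n"
)

reader = csv.reader(io.StringIO(text))
header_row = next(reader)  # consumes the first line only
data_rows = list(reader)   # iteration continues from the second line

print(header_row)
print(data_rows)
```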
1.2 Printing the Headers and Their Positions
highs_lows.py
import csv

filename = "sitka_weather_07-2014.csv"
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    # print(header_row)

    # enumerate() pairs each header with its column index.
    for index, column_header in enumerate(header_row):
        print(index, column_header)
Output:
0 AKDT
1 Max TemperatureF
2 Mean TemperatureF
3 Min TemperatureF
4 Max Dew PointF
5 MeanDew PointF
6 Min DewpointF
7 Max Humidity
8 Mean Humidity
9 Min Humidity
10 Max Sea Level PressureIn
11 Mean Sea Level PressureIn
12 Min Sea Level PressureIn
13 Max VisibilityMiles
14 Mean VisibilityMiles
15 Min VisibilityMiles
16 Max Wind SpeedMPH
17 Mean Wind SpeedMPH
18 Max Gust SpeedMPH
19 PrecipitationIn
20 CloudCover
21 Events
22 WindDirDegrees
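Rather than reading the indexes off this listing by hand, one option is to build a name-to-index lookup. The sketch below is an assumption-laden illustration: `header_row` is hard-coded with the first ten headers from the output above so the snippet runs without the data file, and `column_index` is a hypothetical helper name. Because several headers carry a leading space, each name is stripped before it is used as a key.

```python
# First ten headers copied from the output above; some start with a space.
header_row = [
    'AKDT', 'Max TemperatureF', 'Mean TemperatureF', 'Min TemperatureF',
    'Max Dew PointF', 'MeanDew PointF', 'Min DewpointF', 'Max Humidity',
    ' Mean Humidity', ' Min Humidity',
]

# Map each cleaned header name to its position in the row.
column_index = {name.strip(): i for i, name in enumerate(header_row)}

print(column_index['Max TemperatureF'])
print(column_index['Mean Humidity'])
```

With such a lookup, `row[column_index['Max TemperatureF']]` reads the same column as `row[1]` while making the intent explicit.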
1.3 Extracting and Reading Data
Read the daily high temperature from each row:
highs_lows.py
import csv

filename = "sitka_weather_07-2014.csv"
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    # print(header_row)

    # Column 1 is 'Max TemperatureF'; csv returns each value as a string.
    highs = []
    for row in reader:
        highs.append(row[1])

    print(highs)
Output:
['64', '71', '64', '59', '69',
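The values above are strings, which is why they are printed in quotes. Before they can be compared or plotted as numbers they need to be converted. The following is a minimal sketch under stated assumptions: `highs_text` is a hypothetical name, and it holds only the first five values visible in the output so that the snippet runs without the data file.

```python
# The first five values from the output above, still as strings.
highs_text = ['64', '71', '64', '59', '69']

# Convert each string to an integer so it can be used numerically.
highs = [int(value) for value in highs_text]

print(highs)
print(max(highs))
```

Note that `int('')` raises `ValueError`, so a file with missing readings would need extra handling before this conversion.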