13_Downloading_Data

CSV

Reading CSV files

The script below reads the CSV file, extracts the dates and the daily high and low temperatures, and plots them as a line chart:

import csv
from datetime import datetime
from matplotlib import pyplot as plt

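"""
csv.reader() reads the CSV file and returns an iterator over its rows.
Calling enumerate() while looping over a list such as header_row yields two values per item: its index and its value.
dates, highs and lows store the dates and the daily high and low temperatures.
The try-except-else block handles rows with missing data.
"""
filename = 'sitka_weather_2018_simple.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)  # consume the header line
    dates, highs, lows = [], [], []
    for row in reader:
        # Parse the date before the try block so it is always defined in the except branch
        current_date = datetime.strptime(row[2], "%Y-%m-%d")
        try:
            high = int(row[5])
            low = int(row[6])
        except ValueError:
            print(current_date, 'missing data')
        else:
            dates.append(current_date)
            highs.append(high)
            lows.append(low)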
            
"""
plt.style.use() selects the chart theme ('seaborn' was renamed 'seaborn-v0_8' in Matplotlib 3.6).
alpha sets the opacity.
fill_between() fills the area between the two curves with color.
"""
plt.style.use('seaborn-v0_8')
fig, ax = plt.subplots()
ax.plot(dates, highs, c='red', alpha=0.5)
ax.plot(dates, lows, c='blue', alpha=0.5)
ax.fill_between(dates, highs, lows, facecolor='blue', alpha=0.1)

ax.set_title("Daily high and low temperatures - 2018", fontsize=24)
ax.set_xlabel('', fontsize=16)
fig.autofmt_xdate()    # rotate the x-axis tick labels so they do not overlap
ax.set_ylabel("Temperature (F)", fontsize=16)
ax.tick_params(axis='both', which='major', labelsize=16)

plt.show()
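
The comment above mentions enumerate(), but the script itself hard-codes the column positions 2, 5 and 6. As a minimal sketch of how those positions could be found, the example below runs enumerate() over a header row. The two-line sample held in an in-memory io.StringIO is an assumption made for illustration and only mimics the layout of the weather file.

```python
import csv
import io

# Hypothetical two-line sample that mimics the layout of the weather file.
sample = io.StringIO(
    '"STATION","NAME","DATE","PRCP","TAVG","TMAX","TMIN"\n'
    '"USW00025333","SITKA AIRPORT, AK US","2018-01-01","0.45","","48","38"\n'
)
reader = csv.reader(sample)
header_row = next(reader)

# enumerate() yields (index, value) pairs for each column name.
for index, column_header in enumerate(header_row):
    print(index, column_header)

# Build a name -> index lookup so columns can be addressed by name.
column_index = {name: i for i, name in enumerate(header_row)}
first_row = next(reader)
print(first_row[column_index["TMAX"]])
```

Looking columns up by name keeps the script working if the column order of the file changes.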

The Datetime Module

Using the datetime module to parse and format dates:

from datetime import datetime
first_date = datetime.strptime('2018-7-1', '%Y-%m-%d') 
print(type(first_date))
print(first_date.strftime('%B %d %Y'))
print(first_date)
""" 
<class 'datetime.datetime'>
July 01 2018
2018-07-01 00:00:00
"""

Date and time format codes:

%A Weekday name, such as Monday
%B Month name, such as January
%m Month, as a number (01 to 12)
%d Day of the month, as a number (01 to 31) 
%Y Four-digit year, such as 2015
%y Two-digit year, such as 15
%H Hour, in 24-hour format (00 to 23)
%I Hour, in 12-hour format (01 to 12)
%p AM or PM
%M Minutes (00 to 59)
%S Seconds (00 to 59)
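
As a rough illustration of how these codes combine, the sketch below formats one moment with strftime() and parses the result back with strptime(). The timestamp and the variable names are chosen only for this example. The names produced by %A, %B and %p depend on the locale, so the long form is printed without a fixed expected value.

```python
from datetime import datetime

# An arbitrary moment chosen for illustration.
moment = datetime(2018, 7, 1, 14, 5, 9)

# Weekday name, month name, day, year, 12-hour clock and AM/PM marker.
pattern = "%A, %B %d, %Y %I:%M:%S %p"
formatted = moment.strftime(pattern)
print(formatted)

# The same pattern parses the string back into an equal datetime object.
parsed = datetime.strptime(formatted, pattern)
print(parsed == moment)

# Purely numeric codes: two-digit year and 24-hour clock.
short = moment.strftime("%y-%m-%d %H:%M")
print(short)  # 18-07-01 14:05
```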

JSON

Read Bitcoin closing-price data from a JSON file and plot it:

import json
import matplotlib.pyplot as plt

filename = 'btc_close_2017.json'
with open(filename) as f:
    btc_data = json.load(f)  

date, close, months = [], [], []
for btc_dict in btc_data:
    date.append(btc_dict['date'])
    months.append(int(btc_dict['month']))
    close.append(int(float(btc_dict['close'])))  # drop the fractional part

# Plot the closing prices as a line chart
plt.style.use('ggplot')

# Combine a line plot with a scatter plot; the points are drawn on top of the line
plt.plot(date, close, linewidth=0.5)
plt.scatter(date, close, s=5)
# date[::20]: label every 20th day
plt.xticks(date[::20], rotation=45, fontsize=6)
plt.title('Closing price', fontsize=10)
plt.tight_layout()  # adjust the padding so the labels fit inside the figure
plt.savefig('closing_price.jpg', dpi=300)
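
The int(float(...)) conversion suggests that the prices are stored as strings with a decimal part. The sketch below illustrates why int() alone would not be enough in that case. The two records are hypothetical and only assume the date, month and close keys used above.

```python
import json

# Hypothetical records; only the keys mirror the ones used in the script.
raw = """
[
  {"date": "2017-01-01", "month": "01", "close": "6928.6492"},
  {"date": "2017-01-02", "month": "01", "close": "7070.2554"}
]
"""
records = json.loads(raw)  # json.loads() parses a string, json.load() a file

# int() rejects a string that contains a decimal point.
try:
    int(records[0]["close"])
    direct_conversion_ok = True
except ValueError:
    direct_conversion_ok = False

# Converting to float first and then to int truncates the fraction.
close = [int(float(r["close"])) for r in records]
months = [int(r["month"]) for r in records]  # "01" becomes 1
print(direct_conversion_ok, close, months)
```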

Taking the base-10 logarithm of the closing price:

import matplotlib.pyplot as plt
import math

plt.style.use('ggplot')
plt.figure()  # start a new figure so the previous plot is not drawn over
# Build the list of closing prices after taking the base-10 logarithm
close_log = [math.log10(xx) for xx in close]
plt.plot(date, close_log, linewidth=0.5)
plt.scatter(date, close_log, s=5)
plt.xticks(date[::20], rotation=45, fontsize=6)
plt.title('Closing price (log10)', fontsize=10)
plt.tight_layout()
plt.savefig('closing_price_log10.jpg', dpi=300)
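
One common reason for a log transform is that multiplicative growth becomes evenly spaced. The sketch below shows this with made-up prices that grow tenfold at each step; the numbers are not taken from the data set.

```python
import math

# Made-up prices that grow tenfold at every step.
prices = [10, 100, 1000, 10000]

# On the raw scale the steps grow larger and larger.
raw_steps = [b - a for a, b in zip(prices, prices[1:])]
print(raw_steps)  # [90, 900, 9000]

# After log10 each tenfold step has (approximately) the same size of 1.
logs = [math.log10(p) for p in prices]
log_steps = [b - a for a, b in zip(logs, logs[1:])]
print(log_steps)
```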

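The groupby() Function

groupby() groups consecutive elements of the data that share the same key. The first argument is the data to process; the second is a function that computes the key used as the grouping criterion.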

from itertools import groupby
data = ['a', 'bb', 'ccc', 'dd', 'eee', 'f']
for key, value_iter in groupby(data, len):  
    print(key, ':', list(value_iter))
"""
1 : ['a']
2 : ['bb']
3 : ['ccc']
2 : ['dd']
3 : ['eee']
1 : ['f']
"""

Changing data so that strings of equal length sit next to each other:

from itertools import groupby
data = ['a', 'bb', 'cc', 'ddd', 'eee', 'f']
for key, value_iter in groupby(data, len):  
    print(key, ':', list(value_iter))
"""
1 : ['a']
2 : ['bb', 'cc']
3 : ['ddd', 'eee']
1 : ['f']
"""
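
Because groupby() only merges neighbouring elements, a common pattern is to sort the data by the same key first. The sketch below shows that pattern and then applies groupby() to hypothetical (month, close) pairs to compute a mean per month; the numbers are invented for illustration.

```python
from itertools import groupby

data = ['a', 'bb', 'ccc', 'dd', 'eee', 'f']

# Sorting by the grouping key makes equal keys adjacent.
grouped = {
    key: list(items)
    for key, items in groupby(sorted(data, key=len), key=len)
}
print(grouped)  # {1: ['a', 'f'], 2: ['bb', 'dd'], 3: ['ccc', 'eee']}

# Hypothetical (month, close) pairs, already ordered by month.
records = [(1, 100), (1, 110), (2, 120), (2, 140), (2, 130)]
monthly_mean = {}
for month, items in groupby(records, key=lambda record: record[0]):
    closes = [close for _, close in items]
    monthly_mean[month] = sum(closes) / len(closes)
print(monthly_mean)  # {1: 105.0, 2: 130.0}
```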

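The zip() Function

zip() takes iterables as arguments, packs their corresponding elements into tuples, and returns an iterator over those tuples.

The * operator unpacks an iterable into separate arguments, so zip(*zipped) reverses the zipping.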

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> c = [4,5,6,7,8]
>>> zipped = zip(a,b)
>>> list(zip(*zipped)) 
[(1, 2, 3), (4, 5, 6)]
>>> zipped = zip(a,b)
>>> x,y = zip(*zipped)
>>> print(x,y)
(1, 2, 3) (4, 5, 6)
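
Two details of the session above are worth a short sketch: zip() stops at the shortest input, and the iterator it returns can be consumed only once, which is why zipped is created a second time. itertools.zip_longest is shown as one way to keep the extra elements; the variable names are chosen only for this example.

```python
from itertools import zip_longest

a = [1, 2, 3]
b = [4, 5, 6]
c = [4, 5, 6, 7, 8]

# zip() stops as soon as the shortest input is exhausted.
truncated = list(zip(a, c))
print(truncated)  # [(1, 4), (2, 5), (3, 6)]

# zip_longest() pads the shorter input with fillvalue instead.
padded = list(zip_longest(a, c, fillvalue=None))
print(padded)  # [(1, 4), (2, 5), (3, 6), (None, 7), (None, 8)]

# A zip object is an iterator: a second pass over it yields nothing.
zipped = zip(a, b)
first_pass = list(zipped)
second_pass = list(zipped)
print(first_pass, second_pass)  # [(1, 4), (2, 5), (3, 6)] []
```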
