[Data Analysis] PM2.5 Data of Five Chinese Cities

Latest recommended article published on 2022-07-28 19:22:03

叶柖

Category: Data Analysis. Tags: data analysis, python

Article link: https://blog.csdn.net/qq_38929220/article/details/120105666

Measurements for Shenyang, Chengdu, Beijing, Guangzhou, and Shanghai
Data source: https://www.kaggle.com/uciml/pm25-data-for-five-chinese-cities

How PM2.5 in Beijing changes over time

Data columns
The data cover the period from Jan 1st, 2010 to Dec 31st, 2015. Missing data are denoted as NA.

No: row number
year: year of data in this row
month: month of data in this row
day: day of data in this row
hour: hour of data in this row
season: season of data in this row
PM: PM2.5 concentration (ug/m^3)
DEWP: Dew Point (Celsius Degree)
TEMP: Temperature (Celsius Degree)
HUMI: Humidity (%)
PRES: Pressure (hPa)
cbwd: Combined wind direction
Iws: Cumulated wind speed (m/s)
precipitation: hourly precipitation (mm)
Iprec: Cumulated precipitation (mm)
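
Because missing values are written as NA, it helps to check how they load before doing any statistics. The following is a minimal sketch on a made-up three-row sample; the values are illustrative only, and the single column name PM follows the list above (the real file may name its PM columns differently, so check df.columns first).

```python
import io

import pandas as pd

# Tiny made-up sample in the layout described above (values are illustrative only).
sample = io.StringIO(
    "No,year,month,day,hour,season,PM,TEMP,cbwd\n"
    "1,2010,1,1,0,4,NA,-11,NW\n"
    "2,2010,1,1,1,4,129,-12,NW\n"
    "3,2010,1,1,2,4,NA,-11,cv\n"
)

# read_csv treats the string "NA" as a missing value by default,
# so no extra na_values argument is needed here.
df = pd.read_csv(sample)

# Count the missing entries in every column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```

A column with NA entries is loaded as floating point, so later calls such as mean() simply skip the gaps.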

Recombine the separate time fields in the data into a time series

period = pd.PeriodIndex(year=df['year'], month=df['month'], day=df['day'], hour=df['hour'], freq='H')
df['datetime'] = period
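
As an alternative sketch (not the approach used in this post), the same four fields can be combined with pd.to_datetime, which accepts a DataFrame whose columns are named year, month, day and hour. This yields timestamps rather than periods, and it may be worth trying if a newer pandas version warns about the keyword form of PeriodIndex or about the 'H' alias. The small frame below is made up for illustration.

```python
import pandas as pd

# Made-up rows that only contain the separate time fields.
df = pd.DataFrame({
    "year": [2010, 2010, 2015],
    "month": [1, 1, 12],
    "day": [1, 1, 31],
    "hour": [0, 1, 23],
})

# to_datetime assembles one timestamp per row from the named columns.
df["datetime"] = pd.to_datetime(df[["year", "month", "day", "hour"]])
print(df["datetime"])
```
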

Time frequency (freq): 'H' means hourly.
Set datetime as the index

inplace: True modifies the original DataFrame in place; the default, False, returns a new object.

df.set_index('datetime', inplace=True)
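
The difference between the two inplace settings is easy to see on a small example. This is a sketch with made-up values; the column name PM is only a stand-in.

```python
import pandas as pd

df = pd.DataFrame({
    "datetime": pd.to_datetime(["2010-01-01 00:00", "2010-01-01 01:00"]),
    "PM": [129.0, 148.0],
})

# Default inplace=False: a new DataFrame is returned and df keeps its columns.
indexed = df.set_index("datetime")
columns_before = list(df.columns)

# inplace=True: df itself is modified and the call returns None.
result = df.set_index("datetime", inplace=True)

print(columns_before)    # df still had both columns after the first call
print(result)            # None
print(list(df.columns))  # only the PM column is left; datetime is now the index
```

Since the in-place call returns None, writing df = df.set_index('datetime', inplace=True) would overwrite df with None.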

There is a lot of data, so take the monthly mean

# numeric_only=True skips text columns such as cbwd, which cannot be averaged
df = df.resample('M').mean(numeric_only=True)
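
To see what the monthly mean does, here is a small sketch on made-up values. Note that it groups by monthly periods with groupby instead of calling resample; the result is the same kind of per-month average, and this form avoids the frequency-alias differences between pandas versions. The column names PM and cbwd follow the column list above, and the numbers are invented.

```python
import pandas as pd

index = pd.to_datetime([
    "2010-01-01 00:00",
    "2010-01-31 23:00",
    "2010-02-01 00:00",
    "2010-02-15 12:00",
])
df = pd.DataFrame(
    {
        "PM": [100.0, 200.0, 50.0, float("nan")],  # one missing reading
        "cbwd": ["NW", "NW", "cv", "SE"],          # text column, cannot be averaged
    },
    index=index,
)

# Group the hourly rows by calendar month and average the numeric columns.
# Missing values are skipped, and numeric_only=True leaves out cbwd.
monthly = df.groupby(df.index.to_period("M")).mean(numeric_only=True)
print(monthly)
```
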

Code

import pandas as pd
from matplotlib import pyplot as plt

file_path = './data/BeijingPM20100101_20151231.csv'
df = pd.read_csv(file_path)

# Recombine the separate time fields in the data into a time series
period = pd.PeriodIndex(year=df['year'], month=df['month'], day=df['day'], hour=df['hour'], freq='H')
df['datetime'] = period

# Set datetime as the index
df.set_index('datetime', inplace=True)
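
The listing stops at the indexing step. The remaining steps described earlier (monthly mean, then a plot of PM2.5 over time) could be sketched roughly as follows. This is a hedged illustration, not the original code: the helper names monthly_pm and plot_monthly are hypothetical, the demo values are made up, it uses to_datetime and groupby rather than PeriodIndex and resample, and the PM column name must be taken from the real file.

```python
import pandas as pd
from matplotlib import pyplot as plt


def monthly_pm(df, pm_column):
    """Return the monthly mean of one PM2.5 column, indexed by month (hypothetical helper)."""
    # Build one timestamp per row from the separate time fields.
    stamps = pd.to_datetime(df[["year", "month", "day", "hour"]])
    # Average the chosen column within each calendar month; NaN values are skipped.
    return df[pm_column].groupby(stamps.dt.to_period("M")).mean()


def plot_monthly(monthly, title):
    """Draw the monthly means as a line chart and return the Axes (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(12, 5))
    ax.plot(monthly.index.to_timestamp(), monthly.to_numpy())
    ax.set_xlabel("Month")
    ax.set_ylabel("PM2.5 (ug/m^3)")
    ax.set_title(title)
    fig.autofmt_xdate()  # tilt the date labels so they do not overlap
    return ax


# Made-up demo rows; with the real data, pass the DataFrame read from the CSV instead.
demo = pd.DataFrame({
    "year": [2010, 2010, 2010],
    "month": [1, 1, 2],
    "day": [1, 2, 1],
    "hour": [0, 0, 0],
    "PM": [100.0, 200.0, 50.0],
})
monthly = monthly_pm(demo, "PM")
print(monthly)
```

Calling plot_monthly(monthly, "Beijing PM2.5 monthly mean") followed by plt.show() would display the chart.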
