Python for Data Analysis: Notes 3

Counting time zones with pandas

DataFrame is the most important data structure in pandas; it represents data as a table. Creating a DataFrame from a list of raw records is straightforward:

from pandas import DataFrame, Series
import pandas as pd
import numpy as np

# records is assumed to be the list of dicts parsed earlier from the raw data
frame = DataFrame(records)
print(frame)

 

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3560 entries, 0 to 3559
Data columns:
_heartbeat_    120  non-null values
a              3440  non-null values
al             3094  non-null values
c              2919  non-null values
cy             2919  non-null values
g              3440  non-null values
gr             2919  non-null values
h              3440  non-null values
hc             3440  non-null values
hh             3440  non-null values
kw             93  non-null values
l              3440  non-null values
ll             2919  non-null values
nk             3440  non-null values
r              3440  non-null values
t              3440  non-null values
tz             3440  non-null values
u              3440  non-null values
dtypes: float64(4), object(14)
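
The "non-null values" figures in the summary follow from how DataFrame handles a list of dicts: every distinct key becomes a column, and a record that lacks a key gets NaN in that column. The sketch below illustrates this with three made-up records (the values are hypothetical, not taken from the data set); count() then reports the non-null values per column.

```python
import pandas as pd

# Hypothetical records, invented for illustration only.
records = [
    {"tz": "America/New_York", "a": "Mozilla/5.0"},
    {"tz": "", "a": "GoogleMaps/RochesterNY"},
    {"_heartbeat_": 1331923261},  # a record without 'tz' or 'a'
]

frame = pd.DataFrame(records)

# Each distinct key becomes a column; keys absent from a record become NaN.
print(sorted(frame.columns))

# count() gives the number of non-null values in each column.
non_null = frame.count()
print(non_null)
```

Note that the empty string in the second record still counts as a non-null value; only a truly absent key produces NaN.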

 

Next, look at the first ten values of the tz column:

print(frame['tz'][:10])

 

0     America/New_York
1       America/Denver
2     America/New_York
3    America/Sao_Paulo
4     America/New_York
5     America/New_York
6        Europe/Warsaw
7
8
9
Name: tz

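The output shown for frame above is a summary view, which is used mainly for large DataFrame objects. The Series returned by frame['tz'] has a value_counts method that gives us the information we need: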

tz_counts = frame['tz'].value_counts()
print(tz_counts)

America/New_York                  1251
                                   521
America/Chicago                    400
America/Los_Angeles                382
America/Denver                     191
Europe/London                       74
Asia/Tokyo                          37
Pacific/Honolulu                    36
Europe/Madrid                       35
America/Sao_Paulo                   33
Europe/Berlin                       28
Europe/Rome                         27
America/Rainy_River                 25
Europe/Amsterdam                    22
America/Phoenix                     20
America/Indianapolis                20
Europe/Warsaw                       16
America/Mexico_City                 15
Europe/Stockholm                    14
Europe/Paris                        14
America/Vancouver                   12
Pacific/Auckland                    11
Europe/Prague                       10
Europe/Oslo                         10
Europe/Moscow                       10
Europe/Helsinki                     10
Asia/Hong_Kong                      10
America/Puerto_Rico                 10
Asia/Istanbul                        9
Asia/Calcutta                        9
America/Montreal                     9
Europe/Lisbon                        8
Europe/Vienna                        6
Europe/Athens                        6
Chile/Continental                    6
Australia/NSW                        6
Asia/Bangkok                         6
America/Edmonton                     6
Europe/Copenhagen                    5
Europe/Budapest                      5
Asia/Seoul                           5
America/Anchorage                    5
Europe/Zurich                        4
Europe/Bucharest                     4
Europe/Brussels                      4
Asia/Dubai                           4
Asia/Beirut                          4
America/Winnipeg                     4
America/Halifax                      4
Europe/Dublin                        3
Europe/Bratislava                    3
Asia/Kuala_Lumpur                    3
Asia/Karachi                         3
Asia/Jerusalem                       3
Asia/Jakarta                         3
Asia/Harbin                          3
America/Managua                      3
America/Bogota                       3
Africa/Cairo                         3
Europe/Vilnius                       2
Europe/Riga                          2
Europe/Malta                         2
Europe/Belgrade                      2
Asia/Amman                           2
America/Recife                       2
America/Guayaquil                    2
America/Chihuahua                    2
Africa/Ceuta                         2
Europe/Volgograd                     1
Europe/Uzhgorod                      1
Europe/Sofia                         1
Europe/Skopje                        1
Europe/Ljubljana                     1
Australia/Queensland                 1
Asia/Yekaterinburg                   1
Asia/Riyadh                          1
Asia/Pontianak                       1
Asia/Novosibirsk                     1
Asia/Nicosia                         1
Asia/Manila                          1
Asia/Kuching                         1
America/Tegucigalpa                  1
America/St_Kitts                     1
America/Santo_Domingo                1
America/Montevideo                   1
America/Monterrey                    1
America/Mazatlan                     1
America/Lima                         1
America/La_Paz                       1
America/Costa_Rica                   1
America/Caracas                      1
America/Argentina/Mendoza            1
America/Argentina/Cordoba            1
America/Argentina/Buenos_Aires       1
Africa/Lusaka                        1
Africa/Johannesburg                  1
Africa/Casablanca                    1
Length: 97
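
The row with a blank label (521) in the output above is the count of empty strings, which value_counts treats as an ordinary value. Missing values (NaN), by contrast, are left out by default. A minimal sketch of that behaviour, using a small made-up Series rather than the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical values: two real zones, two empty strings, one missing value.
tz = pd.Series(
    ["America/New_York", "", "America/New_York", np.nan, "Europe/Warsaw", ""]
)

# Default behaviour: NaN is excluded, the empty string is counted.
counts = tz.value_counts()
print(counts)

# dropna=False gives NaN its own row as well.
counts_with_na = tz.value_counts(dropna=False)
print(counts_with_na)
```

This is why the missing records do not show up in tz_counts until they are given a substitute value.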

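Next, we use a plotting library to draw a chart of this data. To prepare, we first fill in a substitute value for the unknown or missing time zones in the records. The fillna method replaces missing (NA) values, while unknown values (empty strings) can be replaced through boolean array indexing: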

clean_tz = frame['tz'].fillna('Missing')   # NA values -> 'Missing'
clean_tz[clean_tz == ''] = 'Unknown'       # empty strings -> 'Unknown'
tz_counts = clean_tz.value_counts()
print(tz_counts[:10])
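
To see the two cleaning steps in isolation, here is a small sketch on a made-up Series (the values are hypothetical). It also shows Series.replace as one possible alternative to the boolean mask; that variant is not part of the original notes.

```python
import numpy as np
import pandas as pd

# Hypothetical values: one missing entry (NaN) and two empty strings.
tz = pd.Series(["America/New_York", "", np.nan, "Europe/Warsaw", ""])

# Step 1: fillna returns a new Series with NaN replaced.
clean_tz = tz.fillna("Missing")

# Step 2: the comparison builds a boolean mask; assigning through it
# overwrites only the positions where the mask is True.
clean_tz[clean_tz == ""] = "Unknown"
print(clean_tz)

# Alternative (an assumption of this sketch): chain replace after fillna.
clean_alt = tz.fillna("Missing").replace("", "Unknown")
```

Both routes leave the original tz Series unchanged, since fillna returns a new object.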

Calling the plot method on the counts object produces a horizontal bar chart:

import matplotlib.pyplot as plt  # Note: when running from PyCharm, this import and plt.show() are required, otherwise the figure is not displayed
tz_counts[:10].plot(kind='barh', rot=0)
plt.show()
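
When the script runs somewhere no window can open (a remote machine, for example), saving the chart to a file is one option. The sketch below assumes the non-interactive Agg backend and a hypothetical output file name; the five counts are taken from the value_counts output above, with the empty-string row relabelled 'Unknown'.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: renders to files, opens no window
import matplotlib.pyplot as plt
import pandas as pd

# Top five counts from the output above ('' shown here as 'Unknown').
tz_counts = pd.Series({
    "America/New_York": 1251,
    "Unknown": 521,
    "America/Chicago": 400,
    "America/Los_Angeles": 382,
    "America/Denver": 191,
})

# plot returns the matplotlib Axes, which can be customised further.
ax = tz_counts.plot(kind="barh", rot=0)
ax.set_xlabel("Number of records")

# Hypothetical output location, chosen only for this sketch.
out_path = os.path.join(tempfile.gettempdir(), "tz_counts.png")
ax.figure.savefig(out_path, bbox_inches="tight")
plt.close(ax.figure)
```

With an interactive backend, plt.show() as in the notes above remains the simpler choice.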
