Python Data Analysis with pandas: Series

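pandas is a tool built on NumPy that provides a large set of data structures and functions for working with structured data quickly and conveniently. The most commonly used pandas objects are Series (a sequence of values with associated index labels) and DataFrame (a two-dimensional table).

A Series is a one-dimensional labeled array that can hold data of any type (integers, strings, floats, Python objects, and so on). Its axis labels are collectively called the index.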
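To make the "any type" point concrete, here is a small sketch showing how pandas chooses a dtype from the values it receives. The variable names and values are illustrative and do not come from the original post.

```python
import pandas as pd

# Illustrative names and values, made up for this sketch.
ints = pd.Series([1, 2, 3])          # integer dtype
floats = pd.Series([1.0, 2.5])       # float dtype
mixed = pd.Series([1, "two", 3.0])   # falls back to the generic object dtype

print(ints.dtype.kind, floats.dtype.kind, mixed.dtype)
```

Whatever the element type, each of these is still a one-dimensional array with an index attached.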

Creating from a list

import pandas as pd

import numpy as np

ser1 = pd.Series([10,20,30,40,50])

ser1

*********************

0 10

1 20

2 30

3 40

4 50

dtype: int64

*********************

Creating from a scalar value

ser3 = pd.Series(100,index=['A','B','C','D','E'])

ser3

***************

A 100

B 100

C 100

D 100

E 100

dtype: int64

***************
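A scalar is repeated once per index label, so the length of the result is set by the index. The following sketch (with hypothetical variable names) checks that, and also shows what happens when no index is given:

```python
import pandas as pd

labels = ["A", "B", "C", "D", "E"]

# The scalar 100 is repeated for every label in the index.
filled = pd.Series(100, index=labels)

# Without an index, a scalar produces a Series of length 1.
single = pd.Series(100)

print(len(filled), len(single))
```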

Creating from random numbers

ser6 = pd.Series(np.random.randn(5),index=['a','b','c','d','e'])

print(ser6)

***************

a -0.329401

b -0.435921

c -0.232267

d -0.846713

e -0.406585

dtype: float64

***************
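Because np.random.randn draws fresh values on every run, the numbers above will differ each time. If reproducible output is wanted, one option is NumPy's seeded Generator, sketched below; the seed value 42 and the variable names are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

index = ["a", "b", "c", "d", "e"]

# Two generators created with the same seed produce the same draws.
rng = np.random.default_rng(seed=42)
ser = pd.Series(rng.standard_normal(5), index=index)

rng_again = np.random.default_rng(seed=42)
ser_again = pd.Series(rng_again.standard_normal(5), index=index)

print(ser.equals(ser_again))
```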

Creating from a dict

ser4 = pd.Series({'咖啡':30,'可乐':10,'奶茶':20},name = "price")

ser4

***************

咖啡 30

可乐 10

奶茶 20

Name: price, dtype: int64

***************
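When a dict is combined with an explicit index, pandas looks each label up in the dict: the result follows the index order, and labels missing from the dict become NaN. A minimal sketch, using made-up English keys rather than the ones above:

```python
import pandas as pd

# Hypothetical price table for illustration.
prices = {"coffee": 30, "cola": 10, "milk tea": 20}

# "juice" is not a key in the dict, so its value is NaN,
# which also forces the dtype to float64.
menu = pd.Series(prices, index=["milk tea", "coffee", "juice"], name="price")

print(menu)
```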

Creating from an ndarray

ser5 = pd.Series(np.arange(5))

ser5

***************

0 0

1 1

2 2

3 3

4 4

dtype: int32

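***************

The int32 shown here reflects the platform's default integer size (typical of Windows with NumPy 1.x); on most other setups np.arange(5) yields int64.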

Series attributes

ser4

***************

咖啡 30

可乐 10

奶茶 20

Name: price, dtype: int64

***************

ser4.index

# Index(['咖啡', '可乐', '奶茶'], dtype='object')

ser4.values

# array([30, 10, 20], dtype=int64)

ser4.iloc[2]

# 20

ser4['奶茶']

#20
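Access by position and access by label are easy to confuse, especially when the labels are themselves integers. The sketch below (hypothetical data) uses the explicit .iloc and .loc accessors, which avoid the ambiguity of plain square brackets:

```python
import pandas as pd

s = pd.Series([30, 10, 20], index=["coffee", "cola", "milk tea"])

by_position = s.iloc[2]          # third element, regardless of labels
by_label = s.loc["milk tea"]     # element whose label is "milk tea"

# With integer labels the two accessors can return different elements.
t = pd.Series(["a", "b", "c"], index=[2, 1, 0])
label_zero = t.loc[0]            # label 0 is the last element
position_zero = t.iloc[0]        # position 0 is the first element

print(by_position, by_label, label_zero, position_zero)
```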

ser2 = pd.Series([10,20,30,40,50],index=['A','B','C','D','E'])

ser2

**********

A 10

B 20

C 30

D 40

E 50

dtype: int64

**********

ser2[ser2>ser2.median()]

*********

D 40

E 50

dtype: int64

*********

'D' in ser2

# True
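Two details of the examples above are worth spelling out: a comparison yields a boolean Series that can be combined with & and |, and the in operator tests index labels, not values. A small sketch with the same numbers (variable names are illustrative):

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=list("ABCDE"))

# A comparison produces a boolean Series aligned with s.
mask = s > s.median()

# Conditions are combined element-wise; the parentheses are required.
middle = s[(s > 10) & (s < 50)]

label_present = "D" in s           # checks the index
value_as_label = 40 in s           # 40 is a value, not a label
value_present = 40 in s.values     # checks the underlying values

print(mask.tolist(), list(middle.index))
```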

Basic Series operations

import pandas as pd

import numpy as np

cities={'Beijing':55000,'Shanghai':60000,'shenzhen':50000,'Hangzhou':20000,'Guangzhou':45000,'Suzhou':None}

apts=pd.Series(cities,name='income')

apts

***********************

Beijing 55000.0

Shanghai 60000.0

shenzhen 50000.0

Hangzhou 20000.0

Guangzhou 45000.0

Suzhou NaN

Name: income, dtype: float64

***********************

apts.iloc[3]

# 20000.0

apts.iloc[[3,4,1]]

***************************

Hangzhou 20000.0

Guangzhou 45000.0

Shanghai 60000.0

Name: income, dtype: float64

***************************

apts[1:]

*****************************

Shanghai 60000.0

shenzhen 50000.0

Hangzhou 20000.0

Guangzhou 45000.0

Suzhou NaN

Name: income, dtype: float64

*****************************

apts[:-2]

*****************************

Beijing 55000.0

Shanghai 60000.0

shenzhen 50000.0

Hangzhou 20000.0

Name: income, dtype: float64

*****************************

apts[1:]+apts[:-1]

*****************************

Beijing NaN

Guangzhou 90000.0

Hangzhou 40000.0

Shanghai 120000.0

Suzhou NaN

shenzhen 100000.0

Name: income, dtype: float64

*****************************
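The NaN entries above appear because arithmetic between two Series matches elements by label, not by position: Beijing exists only in one slice and Suzhou only in the other (and Suzhou is NaN anyway). A minimal sketch with made-up data contrasts label alignment with a purely positional sum on the underlying arrays:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4], index=list("abcd"))

# Label alignment: "a" and "d" each appear in only one operand, so they become NaN.
by_label = s.iloc[1:] + s.iloc[:-1]

# Positional sum of neighbours: drop the labels first.
by_position = s.iloc[1:].to_numpy() + s.iloc[:-1].to_numpy()

print(by_label)
print(by_position)
```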

apts['Shanghai']

# 60000.0

'Hangzhou' in apts

# True

'Chongqing' in apts

# False

at_most_50000=(apts<=50000)

apts[at_most_50000]

*****************************

shenzhen 50000.0

Hangzhou 20000.0

Guangzhou 45000.0

Name: income, dtype: float64

*****************************

apts.mean()

# 46000.0
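The mean of 46000.0 is computed over the five non-missing values only: reductions such as mean() skip NaN by default. The sketch below, using a shortened hypothetical Series, shows the default and the skipna=False alternative:

```python
import numpy as np
import pandas as pd

s = pd.Series([55000.0, 60000.0, np.nan], index=["Beijing", "Shanghai", "Suzhou"])

default_mean = s.mean()               # NaN is ignored
strict_mean = s.mean(skipna=False)    # NaN propagates into the result
non_missing = s.count()               # number of non-NaN entries

print(default_mean, strict_mean, non_missing, len(s))
```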

'Old income of shenzhen:{}'.format(apts['shenzhen'])

# 'Old income of shenzhen:50000.0'

apts['shenzhen']=70000

apts

*****************************

Beijing 55000.0

Shanghai 60000.0

shenzhen 70000.0

Hangzhou 20000.0

Guangzhou 45000.0

Suzhou NaN

Name: income, dtype: float64

*****************************

# Set every value below 50000 to 40000

less_than_50000=(apts<50000)

apts[less_than_50000]=40000

apts

*****************************

Beijing 55000.0

Shanghai 60000.0

shenzhen 70000.0

Hangzhou 40000.0

Guangzhou 40000.0

Suzhou NaN

Name: income, dtype: float64

*****************************
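Assigning through a boolean mask changes the Series in place. If the original should stay untouched, Series.mask offers a non-mutating way to do the same replacement. This is an alternative sketch with hypothetical data, not the approach used in the original:

```python
import pandas as pd

income = pd.Series(
    [55000.0, 20000.0, 45000.0],
    index=["Beijing", "Hangzhou", "Guangzhou"],
)

# mask() returns a new Series with values replaced where the condition is True.
adjusted = income.mask(income < 50000, 40000)

print(adjusted.tolist(), income.tolist())
```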

apts/2

*****************************

Beijing 27500.0

Shanghai 30000.0

shenzhen 35000.0

Hangzhou 20000.0

Guangzhou 20000.0

Suzhou NaN

Name: income, dtype: float64

*****************************

apts**1.5

*****************************

Beijing 1.289864e+07

Shanghai 1.469694e+07

shenzhen 1.852026e+07

Hangzhou 8.000000e+06

Guangzhou 8.000000e+06

Suzhou NaN

Name: income, dtype: float64

*****************************

np.log(apts)

*****************************

Beijing 10.915088

Shanghai 11.002100

shenzhen 11.156251

Hangzhou 10.596635

Guangzhou 10.596635

Suzhou NaN

Name: income, dtype: float64

*****************************
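Arithmetic operators and NumPy functions such as np.log work element by element and hand back a Series with the same index. A short sketch with values chosen so the results are exact (names are illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 4.0, 9.0], index=["a", "b", "c"])

roots = np.sqrt(s)    # NumPy ufunc applied element-wise, index preserved
halved = s / 2        # scalar arithmetic is also element-wise

print(roots.tolist(), halved.tolist())
```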

apts.notnull()

*****************************

Beijing True

Shanghai True

shenzhen True

Hangzhou True

Guangzhou True

Suzhou False

Name: income, dtype: bool

*****************************

apts.isnull()

*****************************

Beijing False

Shanghai False

shenzhen False

Hangzhou False

Guangzhou False

Suzhou True

Name: income, dtype: bool

*****************************

apts[apts.isnull()]

*****************************

Suzhou NaN

Name: income, dtype: float64

*****************************
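The same boolean masks can be used to keep only the valid entries. A minimal sketch with hypothetical data, showing that filtering with notna() matches the built-in dropna():

```python
import numpy as np
import pandas as pd

s = pd.Series({"a": 1.0, "b": np.nan, "c": 3.0})

kept_by_mask = s[s.notna()]   # boolean filtering on non-missing entries
kept_by_dropna = s.dropna()   # equivalent convenience method
missing_count = s.isna().sum()

print(list(kept_by_dropna.index), missing_count)
```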

apts2=pd.Series({'Beijing':10000,'Shanghai':8000,'shenzhen':6000,'Tianjin':40000,'Guangzhou':7000,'Chongqing':30000})

apts2

*****************************

Beijing 10000

Shanghai 8000

shenzhen 6000

Tianjin 40000

Guangzhou 7000

Chongqing 30000

dtype: int64

*****************************

# Adding two Series whose indexes only partly overlap

apts3 = apts+apts2

apts3

*****************************

Beijing 65000.0

Chongqing NaN

Guangzhou 47000.0

Hangzhou NaN

Shanghai 68000.0

Suzhou NaN

Tianjin NaN

shenzhen 76000.0

dtype: float64

*****************************
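With the + operator, any label present in only one operand yields NaN. When a missing entry should instead count as zero, Series.add accepts a fill_value. The sketch below uses a reduced, hypothetical version of the two Series:

```python
import pandas as pd

a = pd.Series({"Beijing": 55000.0, "Hangzhou": 40000.0})
b = pd.Series({"Beijing": 10000, "Tianjin": 40000})

plain = a + b                      # labels in only one Series become NaN
filled = a.add(b, fill_value=0)    # a missing side is treated as 0

print(plain)
print(filled)
```

Note that fill_value only helps when a label is missing from one side; if a value is NaN in both operands, the result is still NaN.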

apts3[apts3.isnull()]=apts3.mean() # fill the missing positions with the mean

apts3

*****************************

Beijing 65000.0

Chongqing 64000.0

Guangzhou 47000.0

Hangzhou 64000.0

Shanghai 68000.0

Suzhou 64000.0

Tianjin 64000.0

shenzhen 76000.0

dtype: float64

*****************************
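Filling missing values with the mean can also be written with fillna, which returns a new Series and leaves the original unchanged. A minimal sketch with made-up numbers:

```python
import numpy as np
import pandas as pd

total = pd.Series({"a": 10.0, "b": np.nan, "c": 30.0})

# mean() ignores NaN, so the fill value is (10 + 30) / 2 = 20.
filled = total.fillna(total.mean())

print(filled.tolist())
```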
