Python data analysis: the Series in pandas

1 Series


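A Series is a linear data structure: a one-dimensional array in which every value carries an index label.

By default pandas uses 0 to n-1 (where n is the number of values) as the index of a Series, but you can also specify the index yourself (you can think of the index as the keys of a dict).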
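
To make the default index and the dict analogy concrete, here is a minimal sketch. The names `default_idx` and `custom_idx` are illustrative and do not come from the original examples.

```python
import pandas as pd

# Without an explicit index, pandas labels the values 0 .. n-1.
default_idx = pd.Series([10, 20, 30])
print(list(default_idx.index))   # [0, 1, 2]

# With an explicit index, the labels behave much like dict keys.
custom_idx = pd.Series([10, 20, 30], index=["x", "y", "z"])
print(custom_idx["y"])           # 20
print(list(custom_idx.keys()))   # ['x', 'y', 'z']
```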

1.1 Creating a Series

import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128])

print(s)
  • Output

0          9
1      zheng
2    beijing
3        128
dtype: object

  • Accessing part of the data

print(s[1:2])   # positional slice: starts at position 1 and stops before position 2

# Output

1    zheng
dtype: object
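
The slice above returns a Series, not a bare value. As a small sketch of the difference, the snippet below uses `.iloc` for explicit positional access; the names `one_value` and `one_row` are illustrative.

```python
import pandas as pd

s = pd.Series([9, 'zheng', 'beijing', 128])

one_value = s.iloc[1]    # a single position: the stored Python object itself
one_row = s.iloc[1:2]    # a slice: still a Series, here of length 1

print(one_value)                              # zheng
print(type(one_row).__name__, len(one_row))   # Series 1
```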

1.2 Specifying the index

import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

print(s)
  • Output
1          9
2      zheng
3    beijing
e        128
f        usa
g        990
dtype: object
  • Looking up a value by its index label

print(s['f'])   # usa
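
With a custom index, it can help to say explicitly whether a key is a label or a position. The sketch below uses `.loc` for labels and `.iloc` for positions on the same data; the variable names are illustrative.

```python
import pandas as pd

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990],
              index=[1, 2, 3, 'e', 'f', 'g'])

by_label = s.loc['f']      # look up the label 'f'
by_position = s.iloc[4]    # look up the fifth element (positions start at 0)
first_label = s.loc[1]     # the label 1, which sits at position 0

print(by_label, by_position, first_label)   # usa usa 9
```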

1.3 Building a Series from a dictionary

import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "car": None}

sa = pd.Series(s, name="age")

print(sa)
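  • Output (produced by an older pandas version that sorts the keys; recent versions keep the insertion order of the dict)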
car      NaN
jack    19.0
mary    18.0
ton     20.0
Name: age, dtype: float64
  • Checking the type

print(type(sa))   # <class 'pandas.core.series.Series'>
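
The `None` in the dictionary becomes `NaN`, which is why the dtype is float64. The following sketch shows a few ways to inspect and handle such missing values; the names `ages`, `known` and `subset`, and the key "bob", are illustrative.

```python
import pandas as pd

ages = pd.Series({"ton": 20, "mary": 18, "jack": 19, "car": None}, name="age")

print(ages.dtype)            # float64, because NaN is a float
print(ages.isna().sum())     # 1 missing value
known = ages.dropna()        # a new Series without the missing entry
print(sorted(known.index))   # ['jack', 'mary', 'ton']

# Passing index= selects and orders the keys; keys not in the dict become NaN.
subset = pd.Series({"ton": 20, "mary": 18}, index=["mary", "ton", "bob"])
print(subset.isna().tolist())   # [False, False, True]
```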

1.4 Building a Series from a numpy ndarray

  • Generating random numbers
import pandas as pd
import numpy as np

num_abc = pd.Series(np.random.randn(5), index=list('abcde'))
num = pd.Series(np.random.randn(5))

print(num)
print(num_abc)

# Output (the values are random, so they differ on every run)
0   -0.102860
1   -1.138242
2    1.408063
3   -0.893559
4    1.378845
dtype: float64
a   -0.658398
b    1.568236
c    0.535451
d    0.103117
e   -1.556231
dtype: float64
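
Because the values above change on every run, a seeded generator is useful when results need to be reproducible. This is a sketch using numpy's `default_rng`; the seed 42 and the names `first` and `second` are arbitrary choices.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)   # the seed 42 is an arbitrary choice
first = pd.Series(rng.standard_normal(5), index=list('abcde'))

rng = np.random.default_rng(42)   # same seed, so the same draws
second = pd.Series(rng.standard_normal(5), index=list('abcde'))

print(first.equals(second))       # True
print(first.dtype)                # float64
```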

1.5 Selecting data

import pandas as pd
import numpy as np

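s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e', 'f', 'g'])

print(s[1:3])   # positions 1 and 2 (start included, stop excluded): zheng beijing
# A plain list of integers may be read as labels on this index in recent pandas, so use .iloc for positions.
print(s.iloc[[1,3]])  # positions 1 and 3: zheng 128
print(s[:-1]) # everything except the last element: 9 zheng beijing 128 usa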
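
Positional slices exclude the stop, but label slices with `.loc` include it. The sketch below contrasts the two on a small, made-up Series (`letters` is not part of the original examples).

```python
import pandas as pd

letters = pd.Series([10, 20, 30, 40, 50], index=list('abcde'))

by_position = letters.iloc[1:3]   # positions 1 and 2: the stop is excluded
by_label = letters.loc['b':'d']   # labels b, c and d: the stop is included

print(by_position.tolist())       # [20, 30]
print(by_label.tolist())          # [20, 30, 40]
```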

1.6 Operating on data

import pandas as pd
import numpy as np

s = pd.Series([9, 'zheng', 'beijing', 128, 'usa', 990], index=[1,2,3,'e','f','g'])

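# The slices are positional; the addition then aligns both operands by index label.
sum0 = s[1:3] + s[1:3]   # named sum0 to avoid shadowing the built-in sum()
sum1 = s[1:4] + s[1:4]
sum2 = s[1:3] + s[1:4]   # label 'e' exists on one side only, so the result is NaN
sum3 = s[:3] + s[1:]     # only labels 2 and 3 exist on both sides

print(sum0)
print(sum1)
print(sum2)
print(sum3)
  • Output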
2        zhengzheng
3    beijingbeijing
dtype: object
2        zhengzheng
3    beijingbeijing
e               256
dtype: object
2        zhengzheng
3    beijingbeijing
e               NaN
dtype: object
1               NaN
2        zhengzheng
3    beijingbeijing
e               NaN
f               NaN
g               NaN
dtype: object
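
The NaN entries above come from index alignment: a label that appears on only one side has nothing to be added to. A minimal numeric sketch, with made-up Series `left` and `right`, shows this and one way to supply a default through `Series.add` and its `fill_value` parameter.

```python
import pandas as pd

left = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

plain = left + right                     # unmatched labels become NaN
filled = left.add(right, fill_value=0)   # treat a missing side as 0

print(plain.isna().tolist())   # [True, False, False, True]
print(filled.tolist())         # [1.0, 12.0, 23.0, 30.0]
```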

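1.7 Lookups

  • Checking for existence

print('f' in s)            # True: `in` tests the index labels of the Series
print('usa' in s.values)   # True: test the values through .values

  • Filtering by a condition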
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(sa[sa>19])
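
Several conditions can be combined into one filter. The sketch below assumes the same age data (renamed `ages` here) and uses `&` and `Series.between`.

```python
import pandas as pd

ages = pd.Series({"ton": 20, "mary": 18, "jack": 19,
                  "jim": 22, "lj": 24, "car": None}, name="age")

# Combine masks with & (and) or | (or); each condition needs its own parentheses.
in_range = ages[(ages >= 19) & (ages <= 22)]
print(sorted(in_range.index))   # ['jack', 'jim', 'ton']

# between() includes both ends by default and produces the same mask.
same = ages[ages.between(19, 22)]
print(in_range.equals(same))    # True
```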

  • Median
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(sa.median()) # 20.0 (the NaN entry is skipped)
  • Checking which values are greater than the median
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(sa>sa.median())

  • Selecting the values greater than the median

import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(sa[sa > sa.median()])

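  • Storing the boolean mask in a variable
import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

more_than_median = sa>sa.median()   # boolean Series, True where the age exceeds the median

print(more_than_median)

print('---------------------')

print(sa[more_than_median])   # keep only the rows where the mask is True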
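
One detail worth checking in the mask is the missing value: a comparison with NaN evaluates to False, so the "car" entry never passes the filter. A short sketch with the same data, renamed `ages`:

```python
import pandas as pd

ages = pd.Series({"ton": 20, "mary": 18, "jack": 19,
                  "jim": 22, "lj": 24, "car": None}, name="age")

print(ages.median())               # 20.0 (NaN is skipped by default)
above = ages > ages.median()
print(bool(above["car"]))          # False: NaN compares as False
print(sorted(ages[above].index))   # ['jim', 'lj']
```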

1.8 Assigning values in a Series

import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

print(s)   # print the original dict

print('-----------------')

sa['ton'] = 99

print(sa)
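
Assignment by label also works for labels that do not exist yet, in which case the Series grows by one entry. A minimal sketch; the label "bob" and the name `ages` are made up for illustration.

```python
import pandas as pd

ages = pd.Series({"ton": 20, "mary": 18}, name="age")

ages["ton"] = 99   # existing label: the value is overwritten
ages["bob"] = 30   # new label: an entry is appended

print(list(ages.index))   # ['ton', 'mary', 'bob']
print(len(ages))          # 3
```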

1.9 Assigning to all values that satisfy a condition

import pandas as pd
import numpy as np

s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}

sa = pd.Series(s, name="age")

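print(s)  # print the original dict

print('---------------') # separator

sa[sa>19] = 88 # set every value greater than 19 to 88

print(sa)  # print the modified data

print('-------------------') # separator

print(sa / 2) # divide every value by 2 (returns a new Series; sa is unchanged)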
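
The assignment above changes `sa` in place. When the original values should be kept, `Series.mask` offers a non-mutating alternative; the sketch below uses the same data under the illustrative names `ages` and `capped`.

```python
import pandas as pd

ages = pd.Series({"ton": 20, "mary": 18, "jack": 19,
                  "jim": 22, "lj": 24, "car": None}, name="age")

# mask() replaces values where the condition is True and returns a new Series.
capped = ages.mask(ages > 19, 88)

print(capped["jim"], ages["jim"])   # 88.0 22.0
print(capped["mary"])               # 18.0
```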

 
