pandas Series basics

MrRenLG


Article link: https://blog.csdn.net/MrRenLG/article/details/90516221


'''
[Lesson 2.2]  The pandas Series data structure: basic concepts and creation

The "one-dimensional array": Series

'''

import numpy as np
import pandas as pd

# Basic concepts: a Series is a one-dimensional array of values with an index
s=pd.Series(np.random.rand(5))
print(s)
s=pd.Series(np.array([1,2,3,5,4]))
print(s)

0    0.597451
1    0.760871
2    0.992611
3    0.861297
4    0.525024
dtype: float64
0    1
1    2
2    3
3    5
4    4
dtype: int32
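
A Series pairs an array of values with an index of labels. The following is a minimal sketch of how to inspect the two parts, reusing the integer data from above; the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.array([1, 2, 3, 5, 4]))

# The index holds the labels; when none is given it defaults to 0..n-1
print(s.index)

# The values are exposed as a NumPy array
print(s.values)
print(type(s.values))
```

The left-hand column in the printed output of a Series is the index, and the right-hand column is the values.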

# Create a Series from a dict
s=pd.Series({
        'one':'a',
        'two':'b',
        'three':'c'
    })
print(s)
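# The dict keys become the index and the dict values become the values.
# The output below comes from an older pandas that sorted the keys;
# pandas 0.23+ on Python 3.6+ keeps the insertion order instead.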

one      a
three    c
two      b
dtype: object
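
Because the key order of the result depends on the pandas version, it can help to state the order explicitly. A small sketch, assuming the same dict as above; the `index=` list and the extra label `'four'` are illustrative additions, not part of the original lesson.

```python
import pandas as pd

d = {'one': 'a', 'two': 'b', 'three': 'c'}

# Passing index= selects and orders the labels explicitly;
# a label that is missing from the dict ('four') gets NaN
s = pd.Series(d, index=['three', 'two', 'one', 'four'])
print(s)
```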

s=pd.Series({
        'one':{'a','b'},
        2:'b',
        'three':{'c','d','f'}
    })
print(s)

2                b
one         {a, b}
three    {c, d, f}
dtype: object

# Create from a one-dimensional array (here, a list)
s=pd.Series(['a','b','c','d',])
print(s)
# pd.Series(arr)

0    a
1    b
2    c
3    d
dtype: object

s=pd.Series(10,index=list('abcdef'))
print(s)
# Create from a scalar with a specified index; the scalar is repeated for every label

a    10
b    10
c    10
d    10
e    10
f    10
dtype: int64

s=pd.Series({
        'one':'a',
        'two':'b',
        'three':'c'
    })
print(s)
# Create from a dict

one      a
three    c
two      b
dtype: object

# Some Series attributes
# name: sets the name of the Series at creation ('数字' means "numbers")
s=pd.Series([1,2,3,5],name='数字')
print(s)

0    1
1    2
2    3
3    5
Name: 数字, dtype: int64

# Rename the Series ('重命名' means "renamed")
s=s.rename('重命名')
print(s)

0    1
1    2
2    3
3    5
Name: 重命名, dtype: int64
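
One detail worth noting is that `rename` returns a new Series rather than changing the existing one, which is why the result is assigned back to `s` above. A short sketch with illustrative English names:

```python
import pandas as pd

s = pd.Series([1, 2, 3, 5], name='numbers')
renamed = s.rename('renamed')

# rename returns a new Series; the original keeps its name
print(s.name)
print(renamed.name)
```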

'''
[Lesson 2.3]  The pandas Series data structure: indexing

Positional indexing / label indexing / slicing / boolean indexing

'''

s=pd.Series(np.random.rand(5)*100)
print(s)

0    29.070191
1    22.470569
2    98.036061
3    70.841708
4    19.292308
dtype: float64

# Positional index
# The element at index 2
print(s[2])

98.0360611027

# Convert the value directly to a Python float
print(type(s[3]))
a=float(s[3])
print(type(a))

<class 'numpy.float64'>
<class 'float'>

# Output in reverse order
print(s[::-1])

4    19.292308
3    70.841708
2    98.036061
1    22.470569
0    29.070191
dtype: float64

# Label indexing
s=pd.Series(np.random.rand(5),
           index=list('abcde'))
print(s)

a    0.362709
b    0.092404
c    0.965135
d    0.207001
e    0.819382
dtype: float64

# Index the label 'a' directly
print(s['a'])

0.3627090717

# Index several labels at once by passing a list: [[...]]
print(s[['a','c','e']])

a    0.362709
c    0.965135
e    0.819382
dtype: float64

# Slicing
# Positions 1 up to but not including 4: left-closed, right-open
print(s[1:4])

b    0.092404
c    0.965135
d    0.207001
dtype: float64

# Labels from 'a' through 'd': both endpoints are included
print(s['a':'d'])

a    0.362709
b    0.092404
c    0.965135
d    0.207001
dtype: float64

# From position 2 onward, including 2
print(s[2:])

c    0.965135
d    0.207001
e    0.819382
dtype: float64

# Before position 3, not including 3
print(s[:3])

a    0.362709
b    0.092404
c    0.965135
dtype: float64

# From start to end with a step of 2
print(s[::2])

a    0.362709
c    0.965135
e    0.819382
dtype: float64
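
The slices above rely on `[]` guessing whether a slice is positional or label-based. A minimal sketch of the explicit forms, `iloc` for positions and `loc` for labels, using fixed values instead of random ones so the result is easy to check:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50], index=list('abcde'))

# iloc slices by position: positions 1, 2, 3 (the end is excluded)
by_position = s.iloc[1:4]

# loc slices by label: labels a, b, c, d (the end is included)
by_label = s.loc['a':'d']

print(by_position)
print(by_label)
```

Writing `iloc` or `loc` makes the intent unambiguous, which matters most when the index itself consists of integers.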

# Boolean indexing
s=pd.Series(np.random.rand(5)*100)
print(s)

0    44.252999
1    44.853123
2    52.063017
3    88.140812
4     7.004036
dtype: float64

# Print the boolean mask
print(s>50)

0    False
1    False
2     True
3     True
4    False
dtype: bool

# Boolean indexing: keep the elements where the mask is True
print(s[s>50])

2    52.063017
3    88.140812
dtype: float64
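
Boolean masks can also be combined. The sketch below is an illustrative extension of the example above, with fixed values and thresholds chosen only for the demonstration:

```python
import pandas as pd

s = pd.Series([44.25, 44.85, 52.06, 88.14, 7.00])

# Wrap each condition in parentheses and combine with & (and), | (or), ~ (not);
# the Python keywords `and` / `or` do not work element-wise on a Series
middle = s[(s > 40) & (s < 60)]
outside = s[~((s > 40) & (s < 60))]

print(middle)
print(outside)
```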

'''
[Lesson 2.4]  The pandas Series data structure: basic techniques

Viewing data / reindexing / alignment / adding, modifying, and deleting values

'''

s=pd.Series(np.random.rand(10))
print(s)

0    0.476257
1    0.663716
2    0.475116
3    0.962667
4    0.788937
5    0.766607
6    0.434719
7    0.952197
8    0.880334
9    0.204090
dtype: float64

# Show the first rows; the default is 5
print(s.head())

0    0.476257
1    0.663716
2    0.475116
3    0.962667
4    0.788937
dtype: float64

# Show the last rows; the default is 5, but the count can be changed (here 3)
print(s.tail(3))

7    0.952197
8    0.880334
9    0.204090
dtype: float64

# Reindex: labels that do not exist get NaN, existing ones are reordered
s=pd.Series(np.random.rand(3),index=list('abd'))
print(s)

a    0.169277
b    0.371673
d    0.794740
dtype: float64

s1=s.reindex(['c','d','e','f','a'])
print(s1)

c         NaN
d    0.794740
e         NaN
f         NaN
a    0.169277
dtype: float64

# Missing values can be filled with fill_value
s2=s.reindex(['c','b','f','a','e'],fill_value=5)
print(s2)

c    5.000000
b    0.371673
f    5.000000
a    0.169277
e    5.000000
dtype: float64
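
A compact sketch of the reindexing behaviour described above, with fixed values so the result can be checked. It also shows that `reindex` returns a new Series and leaves the original untouched.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=list('abd'))

# 'c' does not exist in s, so it becomes NaN
with_nan = s.reindex(list('abcd'))

# fill_value replaces the NaN that reindexing would introduce
filled = s.reindex(list('abcd'), fill_value=0)

print(with_nan)
print(filled)
print(s)  # the original still has only a, b, d
```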

# Series alignment
s1=pd.Series(np.random.rand(3),index=list('zbc'))
s2=pd.Series(np.random.rand(3),index=list('bca'))
print(s1)
print(s2)

z    0.775781
b    0.120130
c    0.027019
dtype: float64
b    0.853151
c    0.688326
a    0.708527
dtype: float64

print(s1+s2)
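# Alignment matches values by index label and sorts the resulting index:
# matching labels are added, labels missing from either side give NaN.
# Any value plus NaN is NaN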

a         NaN
b    0.973281
c    0.715345
z         NaN
dtype: float64
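
If NaN is not wanted for the labels that appear on only one side, `Series.add` accepts a `fill_value`. This is a sketch with fixed values; treating a missing label as 0 is an assumption that may or may not suit a given dataset.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=list('zbc'))
s2 = pd.Series([10.0, 20.0, 30.0], index=list('bca'))

# Plain addition: labels present on only one side become NaN
plain = s1 + s2

# add() with fill_value=0 treats a missing label as 0 before adding
filled = s1.add(s2, fill_value=0)

print(plain)
print(filled)
```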

# Deletion: .drop returns a new Series without the given labels
s=pd.Series(np.random.rand(5),index=list('abcde'))
ss=pd.Series(np.random.rand(5))
print(s)
s1=s.drop('a')
print(s1)
s2=ss.drop(3)
print(s2)
# When no index was specified, drop by the default integer label
s3=s.drop(['a','c','e'])
print(s3)
# Several labels can be dropped at once by passing a list

a    0.090122
b    0.941888
c    0.636887
d    0.960461
e    0.312361
dtype: float64
b    0.941888
c    0.636887
d    0.960461
e    0.312361
dtype: float64
0    0.644486
1    0.348466
2    0.810254
4    0.624436
dtype: float64
b    0.941888
d    0.960461
dtype: float64
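
The following sketch restates the `drop` behaviour with fixed values. The `errors='ignore'` argument and the non-existent label `'x'` are illustrative additions that show how to avoid an exception when a label may be absent.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 5], index=list('abcde'))

# drop returns a new Series; s itself is not modified
dropped = s.drop(['a', 'c'])

# By default a missing label raises KeyError; errors='ignore' skips it
safe = s.drop(['a', 'x'], errors='ignore')

print(dropped)
print(safe)
print(len(s))
```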

# Adding values
s1=pd.Series(np.random.rand(3))
s2=pd.Series(np.random.rand(3),index=list('abc'))
print(s1,s2)

0    0.009789
1    0.629592
2    0.147525
dtype: float64 a    0.018171
b    0.340888
c    0.998620
dtype: float64

s1[5]=100
print(s1)
# Add a value by assigning to a new integer label

0      0.009789
1      0.629592
2      0.147525
5    100.000000
dtype: float64

s2['e']=100
print(s2)
# Add a value by assigning to a new index label

a      0.018171
b      0.340888
c      0.998620
e    100.000000
dtype: float64

s3=pd.concat([s1,s2])
print(s3)
# Series.append was removed in pandas 2.0; pd.concat joins the two and returns a new Series

0      0.009789
1      0.629592
2      0.147525
5    100.000000
a      0.018171
b      0.340888
c      0.998620
e    100.000000
dtype: float64
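
In pandas 2.0 and later `Series.append` is no longer available, and `pd.concat` covers the same use case. A minimal sketch with fixed values; `ignore_index=True` is an optional extra shown here for the case where the old labels should be replaced by a fresh 0..n-1 index.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0])
s2 = pd.Series([4.0, 5.0], index=['a', 'b'])

# concat returns a new Series and keeps the original labels
combined = pd.concat([s1, s2])

# ignore_index=True discards the labels and renumbers from 0
renumbered = pd.concat([s1, s2], ignore_index=True)

print(combined)
print(renumbered)
```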

# Modifying values
s1=pd.Series(np.random.rand(3))
s2=pd.Series(np.random.rand(3),index=list('abc'))
print(s1,s2)

0    0.609883
1    0.970671
2    0.465691
dtype: float64 a    0.212456
b    0.123941
c    0.829763
dtype: float64

s1[1]=100
print(s1)
s1[[1,2]]=200
print(s1)

0      0.609883
1    100.000000
2      0.465691
dtype: float64
0      0.609883
1    200.000000
2    200.000000
dtype: float64

s2[['a','b']]=300
print(s2)
# Values can be modified directly by position or by index label; pass a list [[...]] to modify several at once

a    300.000000
b    300.000000
c      0.829763
dtype: float64
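
The same modifications can be written with the explicit `loc` and `iloc` accessors, and a boolean mask can be used on the left-hand side as well. A sketch with fixed values; the cap of 100 is only an illustrative choice.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=list('abc'))

# Modify by label
s.loc[['a', 'b']] = 300

# Modify by position
s.iloc[2] = 0

# Modify every element that satisfies a condition
s[s > 100] = 100

print(s)
```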
