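Pandas Data Structures: Series (Part 3): Common Operations [viewing data (head), sorting (sort_values), reindexing (reindex), alignment (automatic label alignment in arithmetic), adding, modifying, and deleting elements]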

u013250861

Last modified 2022-04-11 21:06:28

Category: Pandas. Tags: Pandas, Series

First published 2022-04-06 23:37:17

Article link: https://blog.csdn.net/u013250861/article/details/124003140

1. Viewing data

.head() shows the first rows
.tail() shows the last rows
Both return 5 rows by default

import numpy as np
import pandas as pd

s = pd.Series(np.random.rand(50))

print("s.head() = \n", s.head())
print("-" * 100)
print("s.head(10) = \n", s.head(10))
print("-" * 100)
print("s.tail() = \n", s.tail())

Output:

s.head() = 
0    0.891778
1    0.575982
2    0.138742
3    0.101361
4    0.247216
dtype: float64
----------------------------------------------------------------------------------------------------
s.head(10) = 
0    0.891778
1    0.575982
2    0.138742
3    0.101361
4    0.247216
5    0.376180
6    0.117379
7    0.001082
8    0.769211
9    0.204997
dtype: float64
----------------------------------------------------------------------------------------------------
s.tail() = 
45    0.020636
46    0.062189
47    0.110146
48    0.958667
49    0.788788
dtype: float64
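
Because the example above uses random values, its output differs on every run. The following is a minimal deterministic sketch of the same calls; the Series contents and variable names are illustrative and not from the original post. It also shows that a negative n is accepted: head(-2) returns everything except the last two rows.

```python
import pandas as pd

# Illustrative data: ten consecutive integers, so the results are predictable
s = pd.Series(range(10))

first_default = s.head()       # first 5 rows (the default n is 5)
first_three = s.head(3)        # first 3 rows
last_three = s.tail(3)         # last 3 rows
all_but_last_two = s.head(-2)  # negative n: everything except the last 2 rows

print(first_three.tolist())
print(last_three.tolist())
print(len(all_but_last_two))
```

Both methods return a new Series and leave s untouched, so they are safe to use for a quick look at large data.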


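2. Sorting

Use series.sort_values(ascending=True) to sort by value

A Series has only one column of values, so no by argument is needed (unlike DataFrame.sort_values). In the examples below, data is a DataFrame with a date index and a p_change column; it is not defined in this post.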

data['p_change'].sort_values(ascending=True).head()

2015-09-01   -10.03
2015-09-14   -10.02
2016-01-11   -10.02
2015-07-15   -10.02
2015-08-26   -10.01
Name: p_change, dtype: float64
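
Since data is not defined in this post, here is a self-contained sketch of value sorting. The small p_change Series is a hypothetical stand-in for data['p_change'], with made-up dates and values chosen only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for data['p_change']: a few daily percentage changes
p_change = pd.Series(
    [2.62, -10.03, 1.44, 8.51],
    index=pd.to_datetime(["2015-03-02", "2015-09-01", "2015-03-03", "2015-03-06"]),
    name="p_change",
)

smallest_first = p_change.sort_values(ascending=True)
largest_first = p_change.sort_values(ascending=False)

print(smallest_first.head(2))
print(largest_first.head(2))
# sort_values returns a new Series; p_change keeps its original order
print(p_change.iloc[0])
```

Each value keeps its index label after sorting, which is why the dates in the output above appear out of chronological order.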

Use series.sort_index() to sort by index

This behaves the same as DataFrame.sort_index()

# Sort by index
data['p_change'].sort_index().head()

2015-03-02    2.62
2015-03-03    1.44
2015-03-04    1.57
2015-03-05    2.02
2015-03-06    8.51
Name: p_change, dtype: float64
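
A runnable sketch of index sorting, again on a hypothetical Series with made-up dates and values (data itself is not available here). It assumes a DatetimeIndex, as the output above suggests.

```python
import pandas as pd

# Hypothetical series whose dates are deliberately out of order
s = pd.Series(
    [8.51, 2.62, 1.57],
    index=pd.to_datetime(["2015-03-06", "2015-03-02", "2015-03-04"]),
    name="p_change",
)

oldest_first = s.sort_index()                 # ascending index order
newest_first = s.sort_index(ascending=False)  # descending index order

print(oldest_first)
print(newest_first)
```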

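3. Reindexing

.reindex returns a new Series arranged according to the given index; any label that does not exist in the current index is introduced with a missing value

.reindex() also takes a list of labels
Here the label 'd' does not exist, so its value is NaN
fill_value parameter: the value used to fill the missing entries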

import numpy as np
import pandas as pd

# Reindexing with reindex
# .reindex rearranges the data to match the given index; labels missing from the current index get NaN

s = pd.Series(np.random.rand(3), index=['a', 'b', 'c'])
print("s = \n", s)
print("-" * 100)

# .reindex() also takes a list of labels
# The label 'd' does not exist here, so its value is NaN
s1 = s.reindex(['c', 'b', 'a', 'd'])
print("s1 = \n", s1)
print("-" * 100)

# fill_value parameter: the value used to fill the missing entries
s2 = s.reindex(['c', 'b', 'a', 'd'], fill_value=0)
print("s2 = \n", s2)

Output:

s = 
a    0.496666
b    0.828771
c    0.363888
dtype: float64
----------------------------------------------------------------------------------------------------
s1 = 
c    0.363888
b    0.828771
a    0.496666
d         NaN
dtype: float64
----------------------------------------------------------------------------------------------------
s2 = 
c    0.363888
b    0.828771
a    0.496666
d    0.000000
dtype: float64
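
Besides a constant fill_value, reindex also accepts a method argument that fills the gaps from neighbouring entries. The sketch below is not part of the original example; it assumes a sorted integer index, which method='ffill' requires, and uses illustrative names.

```python
import pandas as pd

# Illustrative series with gaps in its (sorted) integer index
sparse = pd.Series([10, 20, 30], index=[0, 2, 4])

with_nan = sparse.reindex(range(5))                      # labels 1 and 3 become NaN
forward_filled = sparse.reindex(range(5), method="ffill")  # gaps take the previous value

print(with_nan)
print(forward_filled)
```

With method="ffill", label 1 takes the value at label 0 and label 3 takes the value at label 2, so no NaN is introduced.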


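4. Series alignment (operations align automatically on labels)

The main difference between a Series and an ndarray is that operations between Series automatically align on labels

The order of the index does not affect the calculation; values are matched by label
A label present in only one Series yields NaN, because a missing value combined with any value is still missing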

import numpy as np
import pandas as pd

# Series alignment

s1 = pd.Series(np.random.rand(3), index = ['Jack','Marry','Tom'])
s2 = pd.Series(np.random.rand(3), index = ['Wang','Jack','Marry'])
print("s1 = \n", s1)
print("s2 = \n", s2)
print("-" * 100)
print("s1+s2 = \n", s1+s2)

Output:

s1 = 
Jack     0.965087
Marry    0.088279
Tom      0.369567
dtype: float64
s2 = 
Wang     0.398997
Jack     0.082579
Marry    0.856640
dtype: float64
----------------------------------------------------------------------------------------------------
s1+s2 = 
Jack     1.047665
Marry    0.944919
Tom           NaN
Wang          NaN
dtype: float64
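
If the NaN results for unmatched labels are not wanted, Series.add with fill_value treats a label missing from one side as that value instead. This is a small sketch with fixed numbers (not the random data above) so the arithmetic can be checked by hand.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=["Jack", "Marry", "Tom"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["Wang", "Jack", "Marry"])

# Plain + aligns on labels: Tom and Wang exist on one side only, so they are NaN
plain = s1 + s2

# add(..., fill_value=0) substitutes 0 for the missing side before adding
filled = s1.add(s2, fill_value=0)

print(plain)
print(filled)
```

Here Jack is 1 + 20 and Marry is 2 + 30 regardless of where those labels sit in each Series; with fill_value=0, Tom and Wang keep their single known value.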


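5. Adding elements / another Series

Add a value directly by assigning to a new position or index label

Use the .append method to add a whole Series
.append returns a new Series and leaves the original unchanged
Note: Series.append was deprecated in pandas 1.4 and removed in pandas 2.0; on current versions, pd.concat([s1, s2]) gives the same result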

import numpy as np
import pandas as pd

# Adding

s1 = pd.Series(np.random.rand(5))
s2 = pd.Series(np.random.rand(5), index=list('ngjur'))
print("s1 = \n", s1)
print("s2 = \n", s2)
print("-" * 100)

# Add a value directly by assigning to a new position / index label
s1[5] = 100
s2['a'] = 100
print("s1 = \n", s1)
print("s2 = \n", s2)
print("-" * 100)

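# Series.append was removed in pandas 2.0; pd.concat is the equivalent
# (on pandas < 2.0 this line was: s3 = s1.append(s2))
s3 = pd.concat([s1, s2])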
print("s1 = \n", s1)
print("s3 = \n", s3)

Output:

s1 = 
0    0.418343
1    0.611628
2    0.793579
3    0.643884
4    0.062399
dtype: float64
s2 = 
n    0.178642
g    0.360007
j    0.287545
u    0.016724
r    0.126153
dtype: float64
----------------------------------------------------------------------------------------------------
s1 = 
0      0.418343
1      0.611628
2      0.793579
3      0.643884
4      0.062399
5    100.000000
dtype: float64
s2 = 
n      0.178642
g      0.360007
j      0.287545
u      0.016724
r      0.126153
a    100.000000
dtype: float64
----------------------------------------------------------------------------------------------------
s1 = 
0      0.418343
1      0.611628
2      0.793579
3      0.643884
4      0.062399
5    100.000000
dtype: float64
s3 = 
0      0.418343
1      0.611628
2      0.793579
3      0.643884
4      0.062399
5    100.000000
n      0.178642
g      0.360007
j      0.287545
u      0.016724
r      0.126153
a    100.000000
dtype: float64
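
The following sketch restates the two ways of adding data with fixed values and pd.concat, which works on both old and current pandas versions. The variable names and the ignore_index variant are additions for illustration, not part of the original example.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0])
s2 = pd.Series([4.0, 5.0], index=["a", "b"])

# Assigning to a label that does not exist yet enlarges the Series in place
s1.loc[3] = 100.0
s2.loc["c"] = 100.0

# pd.concat builds a new Series and leaves s1 and s2 unchanged
combined = pd.concat([s1, s2])

# ignore_index=True discards the old labels and renumbers from 0
renumbered = pd.concat([s1, s2], ignore_index=True)

print(combined)
print(renumbered.index.tolist())
```

Using .loc for the assignment makes it explicit that the key is a label; whether to keep the original labels or renumber depends on whether they carry meaning.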


6. Modifying elements

Modify values directly through the index, much as with a Python list

import numpy as np
import pandas as pd

# Modifying

s = pd.Series(np.random.rand(3), index=['a', 'b', 'c'])
print("s = \n", s)
s['a'] = 100
s[['b', 'c']] = 200
print("-" * 100)
print("s = \n", s)

Output:

s = 
a    0.383475
b    0.123369
c    0.911300
dtype: float64
----------------------------------------------------------------------------------------------------
s = 
a    100.0
b    200.0
c    200.0
dtype: float64
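
Beyond plain label assignment, values can also be modified by position or by condition. A brief sketch with fixed values follows; .loc, .iloc, and boolean masks are standard pandas indexers, while the specific numbers and threshold are illustrative.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

s.loc["a"] = 100.0  # modify by label
s.iloc[1] = 200.0   # modify by position (the second element, label 'b')
s[s > 150] = 0.0    # modify every element matching a condition

print(s)
```

All three assignments change s in place, just like the label-based assignments in the example above.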


7. Deleting values

.drop removes the given labels and returns a new object; the original Series is unchanged

import numpy as np
import pandas as pd

# Deleting: .drop

s = pd.Series(np.random.rand(5), index=list('ngjur'))
print("s = \n", s)
print("-" * 100)
s1 = s.drop('n')
s2 = s.drop(['g', 'j'])
print("s1 = \n", s1)
print("-" * 50)
print("s2 = \n", s2)
print("-" * 50)
print("s = \n", s)

Output:

s = 
n    0.744795
g    0.345820
j    0.001573
u    0.275530
r    0.046669
dtype: float64
----------------------------------------------------------------------------------------------------
s1 = 
g    0.345820
j    0.001573
u    0.275530
r    0.046669
dtype: float64
--------------------------------------------------
s2 = 
n    0.744795
u    0.275530
r    0.046669
dtype: float64
--------------------------------------------------
s = 
n    0.744795
g    0.345820
j    0.001573
u    0.275530
r    0.046669
dtype: float64
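
One detail the example above does not cover is what happens when a label is missing. By default .drop raises KeyError; passing errors='ignore' skips labels that are not present. The sketch below uses fixed values and illustrative variable names.

```python
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=list("ngj"))

# Dropping a label that does not exist raises KeyError by default
missing_raised = False
try:
    s.drop("x")
except KeyError:
    missing_raised = True

# errors='ignore' drops the labels that exist and silently skips the rest
safe = s.drop(["n", "x"], errors="ignore")

print(missing_raised)
print(safe)
print(s)  # the original Series still has all three labels
```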

