pandas Data Types: Series

梦幻精灵_cq

Last modified 2022-05-08 10:36:51


First published 2022-04-05 22:26:40

Article link: https://blog.csdn.net/m0_57158496/article/details/123978774


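Summary (generated automatically by CSDN): This article introduces the Series data type in Python's pandas library in detail: creating a Series, list-style operations, retrieving data by index, replacing the index with meaningful labels, arithmetic, and boolean operations. Worked examples show how a Series can be used like a dict and explore the traits it shares with list and dict. It also covers arithmetic on a Series and operations that combine it with other Series objects.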

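Official Python website: https://www.python.org/

This is where the cutting edge of Python lives. Unfortunately it is all in the original English, so I need to practice my English reading. 🧐🧐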

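Self-study is nothing mysterious. Over a lifetime, a person always spends more time studying alone than studying in school, and more time without a teacher than with one.

-- Hua Luogeng (华罗庚)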

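These study notes assume a working knowledge of the list and dict data types. If you have not learned them yet, please do so first; otherwise the rest may be hard going.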

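Notes: pandas Data Types: Series

pandas is a data-processing module for Python, so it works with all of Python's own data types, and you can still define your own classes. On top of that, pandas has two data types of its own, Series and DataFrame, which make handling data in pandas far more convenient. To process data with Python, you need a thorough grasp of both.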

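With the COVID-19 pandemic still severe and a green health code required to get around, I print the program's output in green, in the hope that the pandemic eases and ends soon.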

Here is a data set to use as an example:

The scores of five students on one exam,
67, 45, 98, 100, 87,
belong to Jhon, Tom, Grace, Anna, and Rose respectively.

#!/usr/bin/env python
# coding: utf-8
print('\n'*3)
print('Loading dependencies...'.center(31))
import mypythontools as pyt # the author's own helper module (colored output, wait)
import pandas as pd


score = 67, 45, 98, 100, 87
name = 'Jhon', 'Tom', 'Grace', 'Anna', 'Rose'

Creating the Series score

score = pd.Series(score) # Create the Series.
print(f'\n\nType of score:\n\n{type(score)}\n\nscore looks like:\n\n{score}')

As you can see, a Series looks like a list stood on its end. Its default index also starts at 0, and it can be operated on much like a list.

list-style operations

Code

print(f'\n\n{pyt.color(1, "f_green")}list-style operations:\n') # start colored output
print(f'\nCopy with a full slice:\n{score[:]}\
\n\nSlice from index 3 onward:\n{score[3:]}\
\n\nValue at index 4: {score[4]}')
print('\n\nUnpack with *:', *score)
print('\n\nSum with sum():', sum(score), '\n'*2)
# Modify a value
score[2] = 99
print(f'After changing the value at index 2:\n{score}{pyt.color(0)}') # end colored output

These experiments show that a Series really does support the basic list-style operations; it behaves like a list stood on its end.
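To make the list comparison concrete, here is a small sketch of my own (it is not part of the original run). It assumes a freshly built score Series; the named copy and every variable name are made up for illustration. It also shows where the analogy stops: score[4] looks a value up by index label, which only coincides with the position under the default 0-based index, so a list-style score[-1] fails.

```python
import pandas as pd

score = pd.Series([67, 45, 98, 100, 87])

# With the default 0-based index, label 4 and position 4 are the same element.
by_label = score[4]          # lookup by index label
by_position = score.iloc[4]  # lookup by position, stated explicitly

# A negative number is treated as a label here, not as "count from the end".
try:
    score[-1]
    negative_label_ok = True
except KeyError:
    negative_label_ok = False

# After relabeling, .iloc still counts positions while .loc uses the labels.
named = score.copy()
named.index = ['Jhon', 'Tom', 'Grace', 'Anna', 'Rose']
last_by_position = named.iloc[-1]
rose_by_label = named.loc['Rose']

# Slicing and the built-in sum() behave as they do for a list.
tail = score[3:].tolist()
total = sum(score)
print(by_label, by_position, negative_label_ok, last_by_position, rose_by_label, tail, total)
```

Spelling out .loc (labels) or .iloc (positions) avoids having to remember which rule a bare score[...] follows.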

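Although a Series resembles a list, it offers more operations, including dict-style access by index label. For example, score.get(2) returns 99, the score stored at index 2.

Code

print(f'\n\n{pyt.color(1,"f_green")}dict-style get for the value at index 2: {score.get(2)}{pyt.color(0)}')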

This shows that a Series also has the traits of a dict: its index plays the role of the dict's keys.
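A few more dict-like behaviors, as a hedged sketch rather than anything from the original post. It assumes the score Series after the value at index 2 was changed to 99; the variable names are mine.

```python
import pandas as pd

score = pd.Series([67, 45, 99, 100, 87])

found = score.get(2)          # value stored under label 2
missing = score.get(10)       # no such label: get returns None instead of raising
fallback = score.get(10, -1)  # or a default of your choice

# Like a dict, `in` tests the keys (index labels), not the values.
has_label = 2 in score
has_value_as_label = 99 in score

keys = list(score.keys())     # keys() returns the index
pairs = list(score.items())   # (label, value) pairs, like dict.items()
print(found, missing, fallback, has_label, has_value_as_label, keys, pairs)
```

To test whether a value is present, check against score.values instead, for example 99 in score.values.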

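Values can be fetched directly by index, but a numeric index carries no meaning and is awkward to use. A Series lets you replace its index, which solves the problem neatly. In the example above, the index can be replaced with the names that go with the scores, so that each score can be looked up by name. So easy 😋.

Code

# Replace the index
score.index = name
print(f'\n\n{pyt.color(1,"f_green")}Index replaced with name:\n\n{score}{pyt.color(0)}')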

Code

# Values and index
print(f'\n\n{pyt.color(1, "f_green")}Values:\n{score.values}\
\n\nIndex:\n{score.index}{pyt.color(0)}')

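The values and index attributes of a Series return the whole data column and the whole index. Both are list-like, but strictly speaking values is a NumPy array and index is a pandas Index object.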
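Strictly speaking, values is a NumPy array and index is a pandas Index rather than plain Python lists. The sketch below is my own check, using the scores and names from this post; it confirms the types and converts both to real lists with tolist().

```python
import numpy as np
import pandas as pd

score = pd.Series([67, 45, 99, 100, 87],
                  index=['Jhon', 'Tom', 'Grace', 'Anna', 'Rose'])

# Both are attributes (no parentheses), not methods.
values_is_ndarray = isinstance(score.values, np.ndarray)
index_is_pd_index = isinstance(score.index, pd.Index)

# tolist() turns either one into an ordinary Python list.
values_list = score.values.tolist()
labels_list = score.index.tolist()
print(values_is_ndarray, index_is_pd_index, values_list, labels_list)
```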

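When creating a Series, the index can be set with the index keyword.

Code

# When creating a Series, the index keyword sets the index.
# score is already a Series here, so pass its raw values; passing the Series
# itself would align on its existing labels rather than attach new ones.
score = pd.Series(score.values, index=name)
print(f"\n\n{pyt.color(1,'f_green')}Setting the index with the index keyword when creating a Series:\n\n{score}{pyt.color(0)}")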

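One thing worth knowing about the index keyword, shown as a sketch under my own variable names: with plain data such as a list, the labels are attached position by position, but when the data is already a Series, pandas looks the requested labels up in that Series' existing index instead.

```python
import pandas as pd

name = ['Jhon', 'Tom', 'Grace', 'Anna', 'Rose']
raw = [67, 45, 99, 100, 87]

# Plain list: labels are attached position by position.
labeled = pd.Series(raw, index=name)

# Existing Series: index= selects by label. The numbered Series has labels
# 0..4, none of the names are found, and every value becomes NaN.
numbered = pd.Series(raw)
realigned = pd.Series(numbered, index=name)

# With labels that do exist, the same mechanism selects and reorders.
subset = pd.Series(labeled, index=['Rose', 'Tom'])
print(labeled, realigned, subset, sep='\n\n')
```

This is why passing the raw values (or assigning to .index) is the safer way to relabel an existing Series.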

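When a Series is created from a dict, the keys become the index by default, with the same effect as setting the index with the index keyword.

Code

# Creating a Series from a dict: the keys become the default index, same effect as the index keyword.
dict1 = dict(zip(name, score)) # Build a dict from the names and the scores.
score = pd.Series(dict1)
print(f"\n\n{pyt.color(1,'f_green')}Using a dict such as:\n{dict1}\
\n\nto create a Series, the keys become the default index:\n{score}{pyt.color(0)}")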

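A short round-trip sketch of the dict route, written by me with a literal dict standing in for dict1. The 'Nobody' key is a made-up label, used only to show what happens when index= asks for a key the dict does not have.

```python
import pandas as pd

dict1 = {'Jhon': 67, 'Tom': 45, 'Grace': 99, 'Anna': 100, 'Rose': 87}

score = pd.Series(dict1)
labels = score.index.tolist()   # the dict keys, in insertion order
round_trip = score.to_dict()    # and back to a plain dict

# Combining a dict with index= selects by key; a key that is absent gives NaN.
picked = pd.Series(dict1, index=['Anna', 'Nobody'])
print(score, round_trip, picked, sep='\n\n')
```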

Now the scores can be looked up directly by name.

Code

# Look up scores by name
print(pyt.color(1, 'f_green')) # start green output
names = name[::-2]
print(f'\n\nLooking up the scores of {", ".join(names)}:\n')
for i in names: # every other name in name, in reverse order
    print(f'{i:>12} scored: {score.get(i)}')
print(pyt.color(0)) # end green output


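Note: a replacement index should be meaningful, so that the labels themselves tell you something, like the name index in the example.

An index like the one below is no better than the default numeric index.

Code

# Note: a replacement index should be meaningful, like the name index in the example.
# An index like the one below is no better than the default numeric index.
tem = score.copy() # work on a copy so that score keeps its name index
tem.index = 'S1', 'S2', 'S3', 'S4', 'S5'
print(f"\n\n{pyt.color(1,'f_green')}Index replaced with meaningless labels:\n{tem}{pyt.color(0)}")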


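A Series also supports arithmetic and other operations.

1. Arithmetic on the data (every example operates on the original scores)

Code

# 1. Arithmetic on the data (every example operates on the original scores).

print(f'\n\nScores counted at 60%: score * 0.6\n{score * 0.6}')
print(f'\n\nAdd 10 impression points for everyone: score + 10\n{score + 10}')
other_score = [3, 5, 8, 4, 9] # points from the bonus question
print(f'\n\nBase score + bonus score:\
\n\nBase score\n{score}\nBonus score\n{other_score}\
\n\n{score + other_score}')
handwriting_score = [0.5, 0.9, 0.8, 0, 0.2] # deductions for messy handwriting
print(f'\n\nDeductions for messy handwriting\n{handwriting_score}\
\nScores after deductions: score - handwriting_score\n{score - handwriting_score}')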

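The arithmetic above is element-wise and returns a new Series each time. The sketch below is mine, reusing this post's scores, to make that visible: a scalar applies to every element, a same-length list is paired by position, and score itself is left untouched.

```python
import pandas as pd

names = ['Jhon', 'Tom', 'Grace', 'Anna', 'Rose']
score = pd.Series([67, 45, 99, 100, 87], index=names)

scaled = score * 0.6                          # scalar applied to every element
bonus = score + [3, 5, 8, 4, 9]               # same-length list, paired by position
after = score - [0.5, 0.9, 0.8, 0, 0.2]       # result becomes float

# Each operation builds a new Series and keeps the labels; score is unchanged.
print(scaled, bonus, after, score, sep='\n\n')
```

To keep a result, assign it to a name (or back to score), since nothing is modified in place.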

2. Boolean operations (filtering)

Code

# 2. Boolean operations (filtering)
print(f'\n\nScores below 60: score[score < 60]\n{score[score < 60]}\
\n\nScores above 90: score[score > 90]\n{score[score > 90]}\
\n\nScores of exactly 100: score[score == 100]\n{score[score == 100]}')

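What makes the filtering work is that a comparison such as score < 60 is itself a Series of True/False values, which then selects the matching rows. A sketch of my own, with this post's scores, that also combines conditions; note the parentheses and the & / | operators in place of and / or.

```python
import pandas as pd

score = pd.Series([67, 45, 99, 100, 87],
                  index=['Jhon', 'Tom', 'Grace', 'Anna', 'Rose'])

mask = score < 60          # a boolean Series with the same index
failed = score[mask]       # keeps only the rows where mask is True

# Combine conditions element-wise with & (and) and | (or).
middle = score[(score >= 60) & (score < 90)]
extremes = score[(score < 60) | (score == 100)]
print(mask, failed, middle, extremes, sep='\n\n')
```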

Arithmetic also works between two Series objects. The result keeps every index label that appears in either Series. Where a label is present in both, the two values are added together; where a label is present in only one of them, the result is empty (NaN), whatever value that label held before.

Code

# 3. Operations between Series objects.

# Create two Series objects, s and s2.
s = pd.Series(range(2, 10), index=list('aycdxfgh'))
s2 = pd.Series(range(5, 13), index=list('dxfyhijl'))
print(f'\n\nCreate two Series objects, s and s2:\
\n\nSeries s:\n{s}\n\nSeries s2:\n{s2}\
\n\nAdding the two Series, s + s2:\n{s + s2}')

print(pyt.color(0)) # end green output

pyt.wait()

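If the NaN results are not what you want, Series.add takes a fill_value that stands in for a label missing on one side. The sketch below is my own addition, reusing s and s2 from the code above, and compares the two behaviors.

```python
import pandas as pd

s = pd.Series(range(2, 10), index=list('aycdxfgh'))
s2 = pd.Series(range(5, 13), index=list('dxfyhijl'))

# The + operator aligns on labels; a label found in only one Series gives NaN.
total = s + s2

# add() with fill_value=0 treats the missing side as 0 instead.
filled = s.add(s2, fill_value=0)

shared = total.dropna()    # only the labels present in both inputs survive
print(total, filled, shared, sep='\n\n')
```

Because NaN is a floating-point value, the result is a float Series even though both inputs hold integers.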

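Complete Python code

My reasoning is already written into the code comments, so I will not repeat it in the post.

(If a comment does not make a statement's purpose clear, please leave a note in the comments section so we can discuss it. 🤝)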

#!/usr/bin/env python
# coding: utf-8

'''

filename: /sdcard/qpython/tem.py

梦幻精灵_cq's code practice ground


'''

print('\n'*3)
print('Loading dependencies...'.center(31))
import mypythontools as pyt # load the author's own helper module
import pandas as pd


score = 67, 45, 98, 100, 87
name = 'Jhon', 'Tom', 'Grace', 'Anna', 'Rose'

score = pd.Series(score) # Create the Series.
print(f'\n\n{pyt.color(1, "f_green")}Type of score:\n{type(score)}\
\n\nscore looks like:\n{score}{pyt.color(0)}')

print(f'\n\n{pyt.color(1, "f_green")}list-style operations:\n')
print(f'\nCopy with a full slice:\n{score[:]}\
\n\nSlice from index 3 onward:\n{score[3:]}\
\n\nValue at index 4: {score[4]}')
print('\n\nUnpack with *:', *score)
print('\n\nSum with sum():', sum(score), '\n'*2)
# Modify a value
score[2] = 99
print(f'After changing the value at index 2:\n{score}{pyt.color(0)}')

print(f'\n\n{pyt.color(1,"f_green")}dict-style get for the value at index 2: {score.get(2)}{pyt.color(0)}')

# Replace the index
score.index = name
print(f'\n\n{pyt.color(1,"f_green")}Index replaced with name:\n\n{score}{pyt.color(0)}')

# Values and index
print(f'\n\n{pyt.color(1, "f_green")}Values:\n{score.values}\n\nIndex:\n{score.index}{pyt.color(0)}')

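# When creating a Series, the index keyword sets the index.
# score is already a Series here, so pass its raw values; passing the Series
# itself would align on its existing labels rather than attach new ones.
score = pd.Series(score.values, index=name)
print(f"\n\n{pyt.color(1,'f_green')}Setting the index with the index keyword when creating a Series:\n\n{score}{pyt.color(0)}")

# Creating a Series from a dict: the keys become the default index, same effect as the index keyword.
dict1 = dict(zip(name, score)) # Build a dict from the names and the scores.
score = pd.Series(dict1)
print(f"\n\n{pyt.color(1,'f_green')}Using a dict such as:\n{dict1}\
\n\nto create a Series, the keys become the default index:\n{score}{pyt.color(0)}")

# Look up scores by name
print(pyt.color(1, 'f_green')) # start green output
names = name[::-2]
print(f'\n\nLooking up the scores of {", ".join(names)}:\n')
for i in names: # every other name in name, in reverse order
    print(f'{i:>12} scored: {score.get(i)}')
print(pyt.color(0)) # end green output

# Note: a replacement index should be meaningful, like the name index in the example.
# An index like the one below is no better than the default numeric index.
tem = score.copy() # work on a copy so that score keeps its name index
tem.index = 'S1', 'S2', 'S3', 'S4', 'S5'
print(f"\n\n{pyt.color(1,'f_green')}Index replaced with meaningless labels:\n{tem}{pyt.color(0)}")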

# Operations on a Series
print(pyt.color(1, 'f_green')) # start green output

# 1. Arithmetic on the data (every example operates on the original scores).

print(f'\n\nScores counted at 60%: score * 0.6\n{score * 0.6}')
print(f'\n\nAdd 10 impression points for everyone: score + 10\n{score + 10}')
other_score = [3, 5, 8, 4, 9] # points from the bonus question
print(f'\n\nBase score + bonus score:\
\n\nBase score:\n{score}\n\nBonus score:\n{other_score}\
\n\n{score + other_score}')
handwriting_score = [0.5, 0.9, 0.8, 0, 0.2] # deductions for messy handwriting
print(f'\n\nDeductions for messy handwriting\n{handwriting_score}\
\nScores after deductions: score - handwriting_score\n{score - handwriting_score}')

# 2. Boolean operations (filtering)
print(f'\n\nScores below 60: score[score < 60]\n{score[score < 60]}\
\n\nScores above 90: score[score > 90]\n{score[score > 90]}\
\n\nScores of exactly 100: score[score == 100]\n{score[score == 100]}')

# 3. Operations between Series objects.

# Create two Series objects, s and s2.
s = pd.Series(range(2, 10), index=list('aycdxfgh'))
s2 = pd.Series(range(5, 13), index=list('dxfyhijl'))
print(f'\n\nCreate two Series objects, s and s2:\
\n\nSeries s:\n{s}\n\nSeries s2:\n{s2}\
\n\nAdding the two Series, s + s2:\n{s + s2}')

print(pyt.color(0)) # end green output

pyt.wait()