Python Basics Tutorial: Common Pandas Methods Explained with Examples

风尘浪子

Article link: https://blog.csdn.net/Leslies2/article/details/117075505

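Contents

1. pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

2. pandas.DataFrame([data], [index]): building data by rows

3. pandas.DataFrame({dict}): building data by columns

4. pandas.DataFrame([list]): building data from a list

5. Filtering data with loc and iloc

6. Multi-level row index

7. Creating a multi-level row index explicitly with pandas.MultiIndex

8. Raising and lowering the dimensionality of a multi-level row index

9. Adding a column to a DataFrame with insert

10. Sorting with sort

11. Aggregating data by multi-level index

12. Simple concatenation with pandas.concat

13. Merging and joining with merge

14. Column statistics with describe

15. Grouped operations with groupby

16. Pivot tables with pivot_table

17. High-performance column operations with eval and query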

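1. pandas.Series(data=None, index=None, dtype=None, name=None, copy=False, fastpath=False)

data: the values; many input types are supported (lists, dicts, arrays, scalars)

index: optional; the labels for the data. If omitted, the labels are consecutive integers starting at 0. An Index object is immutable: individual labels can be read but not modified in place, although the whole index can be replaced.

dtype: the data type; optional

name: the name of the Series, used as the column name when the Series is placed in a DataFrame; optional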

import pandas as pd

# When index is omitted, labels default to consecutive integers starting at 0
letters = pd.Series(['a', 'b', 'c'])
print(letters)
--------------------------------------------------------
out:
0    a
1    b
2    c
dtype: object
========================================================

# Passing an explicit index
names = pd.Series(['Leslie', 'Jack', 'Mike'], [2, 1, 3])
print(names)
--------------------------------------------------------
out:
2    Leslie
1      Jack
3      Mike
dtype: object
========================================================

# Passing the data as a dict: the keys become the index
names = pd.Series({2: 'Leslie', 1: 'Jack', 3: 'Mike'})
print(names)
--------------------------------------------------------
out:
2    Leslie
1      Jack
3      Mike
dtype: object
========================================================

# A dict plus an index keeps only the requested labels
names = pd.Series({2: 'Leslie', 1: 'Jack', 3: 'Mike'}, [2, 3])
print(names)
--------------------------------------------------------
out:
2    Leslie
3      Mike
dtype: object
========================================================

# Setting the Series name
price = pd.Series(['68', '90'], name='price', index=['JAVA IN ACTION', 'Python Data Science Handbook'])
print(price)
--------------------------------------------------------
out:
JAVA IN ACTION                  68
Python Data Science Handbook    90
Name: price, dtype: object

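Note: if name is not given, the Series has no name, and when it is converted to a DataFrame its column label defaults to the integer 0.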
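The points above about the default index, dict input, index immutability, and the default column label can be checked with a short sketch. The variable names are illustrative and not from the original article.

```python
import pandas as pd

# Default index: consecutive integers starting at 0
letters = pd.Series(['a', 'b', 'c'])
default_labels = list(letters.index)

# A dict plus an explicit index keeps only the requested labels
people = pd.Series({2: 'Leslie', 1: 'Jack', 3: 'Mike'}, index=[2, 3])
selected_labels = list(people.index)

# Index objects are immutable: changing one label in place raises TypeError
try:
    letters.index[0] = 10
    index_is_mutable = True
except TypeError:
    index_is_mutable = False

# Replacing the whole index is allowed
letters.index = ['x', 'y', 'z']

# An unnamed Series becomes column 0 when converted to a DataFrame
unnamed_column = pd.Series(['68', '90']).to_frame().columns[0]

print(default_labels, selected_labels, index_is_mutable, unnamed_column)
```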

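2. pandas.DataFrame([data], [index]): building data by rows

A DataFrame can be thought of as a set of pandas Series that share a row index. The most basic way to create one is from existing Series objects.

data: the collection of rows, here a list of pandas Series

index: the collection of row labels; if omitted, the rows are labeled with consecutive integers starting at 0

java = pd.Series({'price': 68, 'count': 1})
python = pd.Series({'price': 90, 'count': 1})
# Each Series becomes one row; the Series labels become the columns
frame = pd.DataFrame(data=[java, python], index=['JAVA IN ACTION', 'Python Data Science Handbook'])
print(frame)

Output: a table with one row per book and the columns price and count.

Note: data and index should be passed as collections (for example lists); otherwise an error is raised.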
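As a small illustration of how row-wise construction lines up the Series labels, the sketch below drops the count label from the second row. This is a hypothetical variation on the example, not something from the original article. The missing cell is filled with NaN.

```python
import pandas as pd

java = pd.Series({'price': 68, 'count': 1})
# Hypothetical variation: this row has no 'count' label
python = pd.Series({'price': 90})

frame = pd.DataFrame(data=[java, python],
                     index=['JAVA IN ACTION', 'Python Data Science Handbook'])
print(frame)

# The columns are the union of the labels found in the rows
column_labels = set(frame.columns)
# The cell that had no source value is NaN
missing_count = frame.loc['Python Data Science Handbook', 'count']
```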

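3. pandas.DataFrame({dict}): building data by columns

A dict can be used to build the data column by column: each key becomes a column name and each value supplies that column's data.

# Price of each book
price = pd.Series({'JAVA IN ACTION': 68, 'Python Data Science Handbook': 90})
# Count of each book
count = pd.Series({'JAVA IN ACTION': 1, 'Python Data Science Handbook': 1})
# Build the DataFrame from a dict of columns
frame = pd.DataFrame({'price': price, 'count': count})
print(frame)

The result is the same as in the previous section: pandas aligns the column data by row index.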
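To make the alignment by row index concrete, here is a minimal sketch that builds the same table both ways. The count of 2 and the reversed ordering of the count Series are assumptions chosen so that a positional match would give a different answer from a label match.

```python
import pandas as pd

titles = ['JAVA IN ACTION', 'Python Data Science Handbook']

# Row-wise construction: one Series per book
by_rows = pd.DataFrame([pd.Series({'price': 68, 'count': 1}),
                        pd.Series({'price': 90, 'count': 2})],
                       index=titles)

# Column-wise construction: one Series per column
price = pd.Series({'JAVA IN ACTION': 68, 'Python Data Science Handbook': 90})
# Deliberately listed in the opposite order (illustrative assumption)
count = pd.Series({'Python Data Science Handbook': 2, 'JAVA IN ACTION': 1})
by_cols = pd.DataFrame({'price': price, 'count': count})

# Values are matched by label, not by position
same_values = by_rows.to_dict() == by_cols.to_dict()
print(same_values)
```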

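4. pandas.DataFrame([list]): building data from a list

Note: the main difference between passing a list and passing a dict is that with a dict each Series becomes a column whose rows are aligned on the Series index, whereas with a list each Series becomes a row.

price1 = pd.Series(['68', '90'], name='price1', index=['JAVA IN ACTION', 'Python Data Science Handbook'])
count1 = pd.Series(['1', '1'], name='count1', index=['JAVA IN ACTION', 'Python Data Science Handbook'])
# Each Series in the list becomes a row labeled with the Series name
frame1 = pd.DataFrame([price1, count1])
print(frame1)

Compared with the previous example, when a DataFrame is built from a list, each item in the list is treated as a row by default.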
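The list and dict forms are transposes of each other, which the following sketch checks. It reuses the two Series from the example; the names as_rows and as_cols are illustrative.

```python
import pandas as pd

titles = ['JAVA IN ACTION', 'Python Data Science Handbook']
price1 = pd.Series(['68', '90'], name='price1', index=titles)
count1 = pd.Series(['1', '1'], name='count1', index=titles)

# List input: each Series is a row, labeled with the Series name
as_rows = pd.DataFrame([price1, count1])
# Dict input: each Series is a column
as_cols = pd.DataFrame({'price1': price1, 'count1': count1})

# Transposing the row-wise frame gives the column-wise layout
transposed_matches = as_rows.T.to_dict() == as_cols.to_dict()
print(as_rows)
print(as_cols)
```

If the column layout is what is needed, transposing with .T or passing a dict avoids rebuilding the Series.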

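5. Filtering data with loc and iloc

data = pd.Series(['Leslie', 'Rose', 'Jack', 'Mike'], index=['Leslie', 'Rose', 'Jack', 'Mike'])

(The names are also used as index labels here so that slicing by label works.)

Slicing by explicit index (labels), as in data['Leslie':'Jack'], includes the end label, Jack.

Slicing by implicit index (positions), as in data[0:2], excludes the end position.

To avoid confusion, use loc (explicit, label-based) and iloc (implicit, position-based).

data['Leslie':'Jack'] is equivalent to data.loc['Leslie':'Jack']

data[0:2] is equivalent to data.iloc[0:2]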
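The difference between the two slicing modes matters most when the labels are themselves integers. The sketch below shows both cases; the ages and the integer-labeled Series are hypothetical sample data.

```python
import pandas as pd

# Hypothetical ages, labeled by name
ages = pd.Series([28, 18, 32, 25], index=['Leslie', 'Rose', 'Jack', 'Mike'])

by_label = ages.loc['Leslie':'Jack']   # end label is included
by_position = ages.iloc[0:2]           # end position is excluded

# Integer labels that do not start at 0 show why loc/iloc are clearer
codes = pd.Series(['a', 'b', 'c'], index=[1, 2, 3])
loc_slice = codes.loc[1:2]     # labels 1 and 2
iloc_slice = codes.iloc[1:2]   # position 1 only

print(list(by_label.index), list(by_position.index))
print(list(loc_slice), list(iloc_slice))
```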

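loc can also filter rows by a boolean condition.

age = pd.Series({'Leslie': 28, 'Jack': 32, 'Rose': 18})
address = pd.Series({'Jack': 'Beijing', 'Rose': 'Shanghai', 'Leslie': 'Guangzhou'})
person = pd.DataFrame({'address': address, 'age': age})
# Keep only the rows where age is below 30
print(person.loc[person['age'] < 30])

Result: the rows for Leslie (28) and Rose (18).

Filtering on multiple conditions: wrap each condition in parentheses and combine them with &.

age = pd.Series({'Leslie': 28, 'Jack': 32, 'Rose': 18})
address = pd.Series({'Jack': 'Beijing', 'Rose': 'Shanghai', 'Leslie': 'Guangzhou'})
person = pd.DataFrame({'address': address, 'age': age})
# Keep only the rows where age is between 20 and 30 (exclusive)
print(person.loc[(person['age'] < 30) & (person['age'] > 20)])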
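Beyond &, boolean masks can be combined with | (or) and negated with ~, and loc accepts a column selector as a second argument. The sketch below reuses the person data from the example; the specific conditions are illustrative.

```python
import pandas as pd

age = pd.Series({'Leslie': 28, 'Jack': 32, 'Rose': 18})
address = pd.Series({'Jack': 'Beijing', 'Rose': 'Shanghai', 'Leslie': 'Guangzhou'})
person = pd.DataFrame({'address': address, 'age': age})

# and: both conditions must hold (parentheses are required around each one)
between_20_and_30 = person.loc[(person['age'] < 30) & (person['age'] > 20)]

# or: either condition may hold
young_or_beijing = person.loc[(person['age'] < 20) | (person['address'] == 'Beijing')]

# not: invert a mask
not_under_30 = person.loc[~(person['age'] < 30)]

# Row filter plus column selection in a single loc call
addresses_under_30 = person.loc[person['age'] < 30, 'address']

print(between_20_and_30)
print(young_or_beijing)
print(not_under_30)
print(addresses_under_30)
```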

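6. Multi-level row index

The row index can be split into multiple levels.

import numpy as np
import pandas as pd

# Two parallel lists define the outer and inner index levels
test = pd.DataFrame(data=np.random.rand(4, 2),
                    index=[['index0', 'index0', 'index1', 'index1'], [0, 1, 0, 1]],
                    columns=['column0', 'column1'])
print(test)

Result: four rows of random values, labeled by the pairs (index0, 0), (index0, 1), (index1, 0) and (index1, 1).

Naming the index levels makes them easier to manage.

test1 = pd.DataFrame(data=np.random.rand(4, 2),
                     index=[['index0', 'index0', 'index1', 'index1'], [0, 1, 0, 1]],
                     columns=['column0', 'column1'])
test1.index.names = ['indexName0', 'indexName1']
print(test1)

Result: the same layout, with the two index levels labeled indexName0 and indexName1.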
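Once the row index has several levels, loc and xs can select by one level or by a full key. The sketch below replaces the random values with np.arange, an assumption made only so the selected values are predictable.

```python
import numpy as np
import pandas as pd

# Deterministic values 0..7 instead of random numbers (illustrative assumption)
test1 = pd.DataFrame(data=np.arange(8).reshape(4, 2),
                     index=[['index0', 'index0', 'index1', 'index1'], [0, 1, 0, 1]],
                     columns=['column0', 'column1'])
test1.index.names = ['indexName0', 'indexName1']

# Select every row under one outer label; the outer level is dropped
outer = test1.loc['index0']

# Select a single row with a full (outer, inner) key
single_row = test1.loc[('index1', 0)]

# Select one cell: full row key plus a column label
cell = test1.loc[('index1', 1), 'column1']

# Select by the inner level only, using the level name
inner = test1.xs(1, level='indexName1')

print(outer)
print(single_row)
print(cell)
print(inner)
```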

7. Creating a multi-level row index explicitly with pandas.MultiIndex

Using arrays: MultiIndex.from_arrays()

data = [['Python Learning from Scratch', '1', '68'], ['Pro Apache Hadoop', '1', '105'],
        ['Python Crash Course', '2', '89'], ['Beginning Python From Novice', '1', '76'],
        ['Python Applications', '2', '120'], ['Deep Learning with TensorFlow', '1', '58']]
# One list per index level, all of the same length
index = pd.MultiIndex.from_arrays([['Leslie', 'Leslie', 'Jack', 'Jack', 'Mike', 'Mike'],
                                   [2020, 2021, 2020, 2021, 2020, 2021]])
column = ['Book', 'Count', 'Price']
book = pd.DataFrame(data=data, index=index, columns=column)

Using tuples of index values: MultiIndex.from_tuples()

data = [['Python Learning from Scratch', '1', '68'], ['Pro Apache Hadoop', '1', '105'],
        ['Python Crash Course', '2', '89'], ['Beginning Python From Novice', '1', '76'],
        ['Python Applications', '2', '120'], ['Deep Learning with TensorFlow', '1', '58']]
# One tuple per row, holding that row's label at every level
index = pd.MultiIndex.from_tuples([('Leslie', 2020), ('Leslie', 2021), ('Jack', 2020),
                                   ('Jack', 2021), ('Mike', 2020), ('Mike', 2021)])
column = ['Book', 'Count', 'Price']
book = pd.DataFrame(data=data, index=index, columns=column)

Using a Cartesian product: MultiIndex.from_product()

data = [['Python Learning from Scratch', '1', '68'], ['Pro Apache Hadoop', '1', '105'],
        ['Python Crash Course', '2', '89'], ['Beginning Python From Novice', '1', '76'],
        ['Python Applications', '2', '120'], ['Deep Learning with TensorFlow', '1', '58']]
# Every combination of the first list with the second list
index = pd.MultiIndex.from_product([['Leslie', 'Jack', 'Mike'], [2020, 2021]])
column = ['Book', 'Count', 'Price']
book = pd.DataFrame(data=data, index=index, columns=column)

All three methods produce the same result here, but each suits a different situation: from_arrays when there is one list per level, from_tuples when there is one tuple per row, and from_product when every combination of the level values is needed.
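That the three constructors yield the same index can be verified directly with MultiIndex.equals. This is a minimal check; the variable names are illustrative.

```python
import pandas as pd

owners = ['Leslie', 'Jack', 'Mike']
years = [2020, 2021]

from_arrays = pd.MultiIndex.from_arrays(
    [['Leslie', 'Leslie', 'Jack', 'Jack', 'Mike', 'Mike'],
     [2020, 2021, 2020, 2021, 2020, 2021]])
from_tuples = pd.MultiIndex.from_tuples(
    [('Leslie', 2020), ('Leslie', 2021), ('Jack', 2020),
     ('Jack', 2021), ('Mike', 2020), ('Mike', 2021)])
# The first list varies slowest, so the order matches the two indexes above
from_product = pd.MultiIndex.from_product([owners, years])

all_equal = from_arrays.equals(from_tuples) and from_tuples.equals(from_product)
print(all_equal)
```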

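8. Raising and lowering the dimensionality of a multi-level row index

Continuing with the example above, stack(level) raises the dimensionality of a DataFrame, and unstack(level…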
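As a rough sketch of the two operations, based on their documented pandas behavior rather than on the original article's own example: unstack moves a row-index level into the columns, and stack moves column labels into a new innermost row-index level. The level names owner and year, and the numeric Count and Price values, are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical level names; numeric columns used instead of strings
index = pd.MultiIndex.from_product([['Leslie', 'Jack', 'Mike'], [2020, 2021]],
                                   names=['owner', 'year'])
book = pd.DataFrame({'Count': [1, 1, 2, 1, 2, 1],
                     'Price': [68, 105, 89, 76, 120, 58]}, index=index)

# unstack: the 'year' row level becomes a column level (one row per owner)
wide = book.unstack(level='year')

# stack: the column labels become a third row-index level (result is a Series)
long_form = book.stack()

# Stacking the 'year' column level again restores the two-level row index
round_trip = wide.stack(level='year')

print(wide)
print(long_form)
```

Note that unstack sorts the resulting row labels, so wide lists the owners alphabetically rather than in their original order.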
