Sorting and ranking in pandas

Latest recommended article published on 2024-05-24 20:39:56

weixin_43139613

Category: Notes. Tags: python

Article link: https://blog.csdn.net/weixin_43139613/article/details/82855856

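Sorting data by some criterion is an important built-in operation. To sort by the row or column labels (in lexicographic order), use the sort_index method, which returns a new, sorted object and leaves the original untouched.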

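import numpy as np
from pandas import DataFrame

# columns=list('dabc') gives four plain column labels;
# wrapping it in another list would build a MultiIndex instead
frame = DataFrame(np.arange(8).reshape(2, 4),
                  index=['two', 'one'], columns=list('dabc'))
frame
Out[5]: 
     d  a  b  c
two  0  1  2  3
one  4  5  6  7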

frame.sort_index()
Out[9]: 
     d  a  b  c
one  4  5  6  7
two  0  1  2  3

frame.sort_index(axis=1)
Out[11]: 
     a  b  c  d
two  1  2  3  0
one  5  6  7  4

frame.sort_index(axis=1, ascending=False)  # descending order
Out[12]: 
     d  c  b  a
two  0  3  2  1
one  4  7  6  5
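
As a quick check that sort_index returns a new object rather than sorting in place, here is a minimal self-contained sketch. It rebuilds a small frame similar to the one above; the names sorted_rows and sorted_cols are introduced only for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(8).reshape(2, 4),
                     index=['two', 'one'], columns=list('dabc'))

sorted_rows = frame.sort_index()        # sort by row labels
sorted_cols = frame.sort_index(axis=1)  # sort by column labels

print(list(sorted_rows.index))    # row labels after sorting
print(list(sorted_cols.columns))  # column labels after sorting
print(list(frame.index))          # the original frame keeps its order
```

If the result should replace the original, assigning it back (frame = frame.sort_index()) is the usual pattern.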

Pass the by argument to sort by the values in one or more columns (or rows).

f2 = DataFrame({'b':[4,7,-3,2],'a':[0,1,0,1]})
f2
Out[55]: 
   b  a
0  4  0
1  7  1
2 -3  0
3  2  1


f2.sort_index(by='b')
__main__:1: FutureWarning: by argument to sort_index is deprecated,
 please use .sort_values(by=...)
Out[59]: 
   b  a
2 -3  0
3  2  1
0  4  0
1  7  1

# As the warning says, use .sort_values(by=...) instead; recent pandas versions no longer accept by in sort_index
f2.sort_values(by='b')
Out[60]: 
   b  a
2 -3  0
3  2  1
0  4  0
1  7  1

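To sort by multiple columns, pass a list of names. Columns listed earlier take priority, and later ones only break ties.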

f2.sort_values(by=['a','b'])
Out[61]: 
   b  a
2 -3  0
0  4  0
3  2  1
1  7  1

f2.sort_values(by=['b','a'])
Out[62]: 
   b  a
2 -3  0
3  2  1
0  4  0
1  7  1
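
The sort direction can also differ per column. The sketch below reuses the f2 data from above and passes a list to ascending, one flag per entry in by; the name mixed is illustrative only.

```python
import pandas as pd

f2 = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 1, 0, 1]})

# Sort by 'a' ascending, then by 'b' descending within each group of equal 'a'
mixed = f2.sort_values(by=['a', 'b'], ascending=[True, False])
print(mixed)
```

The ascending list must have the same length as the by list.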

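Sorting by rows: with axis=1, by takes row labels, and the columns are reordered according to the values in those rows.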

f2.sort_values(by=[1],axis=1)
Out[64]: 
   a  b
0  0  4
1  1  7
2  0 -3
3  1  2

f2.sort_values(by=[1,2],axis=1)
Out[65]: 
   a  b
0  0  4
1  1  7
2  0 -3
3  1  2
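
In both calls above the column order comes out as a, b, which can hide what axis=1 actually does. A small sketch with the same f2 data shows a row whose values put the columns in the other order; by_row1 and by_row2 are names chosen here for illustration.

```python
import pandas as pd

f2 = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 1, 0, 1]})

# Row 1 holds b=7, a=1, so 'a' sorts before 'b'
by_row1 = f2.sort_values(by=[1], axis=1)

# Row 2 holds b=-3, a=0, so 'b' sorts before 'a'
by_row2 = f2.sort_values(by=[2], axis=1)

print(list(by_row1.columns))
print(list(by_row2.columns))
```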

Ranking with rank()

from pandas import Series

obj = Series([7, -5, 7, 4, 2, 0, 4])
obj
Out[66]: 
0    7
1   -5
2    7
3    4
4    2
5    0
6    4
dtype: int64

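rank assigns each value its position in sorted order, starting from 1. Ties are broken according to the method argument:

method	Description
'average'	Default: assign each value in a group of equal values the average of the group's ranks
'min'	Use the lowest rank in the group for every member
'max'	Use the highest rank in the group for every member
'first'	Assign ranks in the order the values appear in the original data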

obj.rank()
Out[67]: 
0    6.5
1    1.0
2    6.5
3    4.5
4    3.0
5    2.0
6    4.5
dtype: float64

obj.rank(method='min')
Out[69]: 
0    6.0
1    1.0
2    6.0
3    4.0
4    3.0
5    2.0
6    4.0
dtype: float64

obj.rank(method='max')
Out[70]: 
0    7.0
1    1.0
2    7.0
3    5.0
4    3.0
5    2.0
6    5.0
dtype: float64

obj.rank(method='first',ascending=False)
Out[71]: 
0    1.0
1    7.0
2    2.0
3    3.0
4    5.0
5    6.0
6    4.0
dtype: float64
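
To compare the tie-breaking methods side by side, the following sketch collects the rank for each method into one DataFrame. It uses the same values as obj above; the name ranks is illustrative.

```python
import pandas as pd

obj = pd.Series([7, -5, 7, 4, 2, 0, 4])

# One column per tie-breaking method, all ranked in ascending order
methods = ['average', 'min', 'max', 'first']
ranks = pd.DataFrame({m: obj.rank(method=m) for m in methods})
print(ranks)
```

Only the tied values (the two 7s and the two 4s) differ between columns; values that are unique get the same rank under every method.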

Similarly, a DataFrame can be ranked along either its rows or its columns.
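
A minimal sketch of ranking a DataFrame along each axis follows. The data and the names col_ranks and row_ranks are made up for illustration and do not come from the session above.

```python
import pandas as pd

frame = pd.DataFrame({'b': [4.3, 7, -3, 2],
                      'a': [0, 1, 0, 1],
                      'c': [-2, 5, 8, -2.5]})

col_ranks = frame.rank()        # default axis=0: rank values within each column
row_ranks = frame.rank(axis=1)  # axis=1: rank values within each row

print(col_ranks)
print(row_ranks)
```

The method and ascending arguments work the same way as for a Series.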