Reference link: Python Data Analysis and Presentation (Python数据分析与展示)
Reference link: pandas official website
Reference link: User Guide
Reference link: Getting started tutorials
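Sorting by index:
The .sort_index() method sorts along the specified axis by index labels, in ascending order by default. It returns a new object and leaves the original unchanged.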
.sort_index(axis=0, ascending=True)
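As a compact illustration of the signature above, here is a minimal sketch. The frame `df` and the result names are hypothetical and chosen only for this example. It sorts by row labels and by column labels, and checks that the original frame is left untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration: unordered row labels, integer column labels.
df = pd.DataFrame(np.arange(6).reshape(3, 2), index=["c", "a", "b"])

# axis=0 (default): sort by row labels, ascending.
rows_sorted = df.sort_index()
# axis=1: sort by column labels, here in descending order.
cols_reversed = df.sort_index(axis=1, ascending=False)

print(rows_sorted.index.tolist())      # ['a', 'b', 'c']
print(cols_reversed.columns.tolist())  # [1, 0]
print(df.index.tolist())               # ['c', 'a', 'b'] (df itself is unchanged)
```

Because a new object is returned, the result has to be assigned to a name if it is needed later.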
Demo 1:
Microsoft Windows [Version 10.0.18363.1198]
(c) 2019 Microsoft Corporation. All rights reserved.
C:\Users\chenxuqi>python
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul 8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> b = pd.DataFrame(np.arange(20).reshape(4,5),index=["c","a","d","b"])
>>> b
    0   1   2   3   4
c   0   1   2   3   4
a   5   6   7   8   9
d  10  11  12  13  14
b  15  16  17  18  19
>>> # Sort the row index in ascending order, ignoring the data itself
... b.sort_index()
    0   1   2   3   4
a   5   6   7   8   9
b  15  16  17  18  19
c   0   1   2   3   4
d  10  11  12  13  14
>>> # Sort the row index in descending order, ignoring the data itself
... b.sort_index(ascending=False)
    0   1   2   3   4
d  10  11  12  13  14
c   0   1   2   3   4
b  15  16  17  18  19
a   5   6   7   8   9
>>> b
    0   1   2   3   4
c   0   1   2   3   4
a   5   6   7   8   9
d  10  11  12  13  14
b  15  16  17  18  19
>>> c = b.sort_index(axis=1,ascending=False)
>>> c
    4   3   2   1   0
c   4   3   2   1   0
a   9   8   7   6   5
d  14  13  12  11  10
b  19  18  17  16  15
>>> c = c.sort_index()
>>> c
    4   3   2   1   0
a   9   8   7   6   5
b  19  18  17  16  15
c   4   3   2   1   0
d  14  13  12  11  10
>>>
>>>
Sorting by the data itself:
The .sort_values() method sorts along the specified axis by value, in ascending order by default.
Series.sort_values(axis=0, ascending=True)
DataFrame.sort_values(by, axis=0, ascending=True)
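by : a label or list of labels to sort by; these are column labels when axis=0 (the default) and row index labels when axis=1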
Demo 2, sorting a Series:
>>> # Sort by the data itself
... bSeries = pd.Series([99,44,203,34,56],index=['a','b','c','d','e'])
>>> bSeries
a     99
b     44
c    203
d     34
e     56
dtype: int64
>>> bSeries.sort_values(axis=0,ascending=True)
d     34
b     44
e     56
a     99
c    203
dtype: int64
>>> bSeries.sort_values(axis=0,ascending=False)
c    203
a     99
e     56
b     44
d     34
dtype: int64
>>>
>>>
Demo 3, sorting a DataFrame:
>>> import pandas as pd
>>> import numpy as np
>>> b = pd.DataFrame(np.arange(20).reshape(4,5),index=['c','a','d','b'])
>>> b
    0   1   2   3   4
c   0   1   2   3   4
a   5   6   7   8   9
d  10  11  12  13  14
b  15  16  17  18  19
>>> c = b.sort_values(2,ascending=False)
>>> c
    0   1   2   3   4
b  15  16  17  18  19
d  10  11  12  13  14
a   5   6   7   8   9
c   0   1   2   3   4
>>> c = c.sort_values("a",axis=1,ascending=False)
>>> c
    4   3   2   1   0
b  19  18  17  16  15
d  14  13  12  11  10
a   9   8   7   6   5
c   4   3   2   1   0
>>>
>>>
>>> np.random.seed(20200910)
>>> b = pd.DataFrame(np.random.randint(0,100,(4,5)),index=['c','a','d','b'])
>>> b
    0   1   2   3   4
c  48  10  29  84  20
a  48   9  22  12   6
d  11  35  24   7  85
b  99  88  84  42  42
>>> b.sort_values(2,ascending=True)
    0   1   2   3   4
a  48   9  22  12   6
d  11  35  24   7  85
c  48  10  29  84  20
b  99  88  84  42  42
>>> c = b.sort_values(2,ascending=False)
>>> c
    0   1   2   3   4
b  99  88  84  42  42
c  48  10  29  84  20
d  11  35  24   7  85
a  48   9  22  12   6
>>> c = c.sort_values("a",axis=1,ascending=False)
>>> c
    0   2   3   1   4
b  99  84  42  88  42
c  48  29  84  10  20
d  11  24   7  35  85
a  48  22  12   9   6
>>> c.sort_values("a",axis=1,ascending=True)
    4   1   3   2   0
b  42  88  42  84  99
c  20  10  84  29  48
d  85  35   7  24  11
a   6   9  12  22  48
>>>
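By default, NaN values are placed at the end when sorting, in both ascending and descending order. In the demo below, the NaN values arise from adding two DataFrames whose row and column labels only partly overlap:
>>>
>>> # NaN values are always placed last when sorting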
... np.random.seed(20200910)
>>> a = pd.DataFrame(np.random.randint(0,100,(3,4)),index=['a','b','c'])
>>> a
    0   1   2   3
a  48  10  29  84
b  20  48   9  22
c  12   6  11  35
>>> b = pd.DataFrame(np.random.randint(0,100,(4,5)),index=['c','a','d','b'])
>>> b
    0   1   2   3   4
c  24   7  85  99  88
a  84  42  42  48  12
d  58  15  42  36  53
b   8  83  19  48  28
>>> c = a + b
>>> c
       0      1     2      3   4
a  132.0   52.0  71.0  132.0 NaN
b   28.0  131.0  28.0   70.0 NaN
c   36.0   13.0  96.0  134.0 NaN
d    NaN    NaN   NaN    NaN NaN
>>> c.sort_values(2,ascending=False)
       0      1     2      3   4
c   36.0   13.0  96.0  134.0 NaN
a  132.0   52.0  71.0  132.0 NaN
b   28.0  131.0  28.0   70.0 NaN
d    NaN    NaN   NaN    NaN NaN
>>> c.sort_values(2,ascending=True)
       0      1     2      3   4
b   28.0  131.0  28.0   70.0 NaN
a  132.0   52.0  71.0  132.0 NaN
c   36.0   13.0  96.0  134.0 NaN
d    NaN    NaN   NaN    NaN NaN
>>>
>>>
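The placement of NaN at the end is the default behaviour of sort_values(), controlled by its `na_position` parameter, which defaults to 'last'. The following is a minimal sketch on a hypothetical Series `s` (not taken from the demos above) showing the default in both sort directions and the effect of na_position='first'.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
s = pd.Series([3.0, np.nan, 1.0, 2.0], index=["a", "b", "c", "d"])

last_asc = s.sort_values()                      # default: NaN placed last
last_desc = s.sort_values(ascending=False)      # descending: NaN is still last
first_asc = s.sort_values(na_position="first")  # NaN moved to the front

print(last_asc.index.tolist())   # ['c', 'd', 'a', 'b']
print(last_desc.index.tolist())  # ['a', 'd', 'c', 'b']
print(first_asc.index.tolist())  # ['b', 'c', 'd', 'a']
```

DataFrame.sort_values() accepts the same `na_position` argument.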