5.2.6 Function Application and Mapping (Series & DataFrame)

Latest recommended article published on 2024-06-16 18:20:37

zimi_zzzz



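NumPy universal functions (ufuncs) also work on pandas objects and can be called on them directly. To apply a function to each column or row, use the DataFrame apply method; to apply an ordinary Python function to every element, use applymap.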

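import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])
print(frame)
print(np.abs(frame))  # a NumPy ufunc works element-wise on the DataFrame directly

# apply passes each column (a Series) to the function
f = lambda x: x.max() - x.min()
f2 = frame.apply(f)  # called once per column
print(f2)
f3 = frame.apply(f, axis='columns')  # called once per row
print(f3)

# The applied function may also return a Series holding several values
def min_max(x):
    return pd.Series([x.min(), x.max()], index=['min', 'max'])
f4 = frame.apply(min_max)
print(f4)

# Element-wise Python function
# (applymap is deprecated since pandas 2.1 in favor of DataFrame.map)
square = lambda x: x**2
format1 = frame.applymap(square)
print(format1)

# A single column is a Series, which has no applymap; use Series.map instead
format3 = frame['e'].map(square)
print(format3)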

Output:

              b         d         e
Utah    2.275905 -0.464549 -0.313687
Ohio    0.506466 -0.211827  2.278658
Texas   0.541429  0.181918  1.170932
Oregon  0.890301  0.834472  0.075096
               b         d         e
Utah    2.275905  0.464549  0.313687
Ohio    0.506466  0.211827  2.278658
Texas   0.541429  0.181918  1.170932
Oregon  0.890301  0.834472  0.075096
b    1.769439
d    1.299020
e    2.592345
dtype: float64
Utah      2.740453
Ohio      2.490485
Texas     0.989014
Oregon    0.815205
dtype: float64
            b         d         e
min  0.506466 -0.464549 -0.313687
max  2.275905  0.834472  2.278658
               b         d         e
Utah    5.179742  0.215806  0.098400
Ohio    0.256507  0.044871  5.192281
Texas   0.293145  0.033094  1.371081
Oregon  0.792636  0.696343  0.005639
Utah      0.098400
Ohio      5.192281
Texas     1.371081
Oregon    0.005639
Name: e, dtype: float64
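
Because the frame above is random, its numbers change on every run. The following is a minimal sketch of the same three techniques on a small fixed frame, so the results can be checked by hand. The frame `demo`, its values, and the helper name `spread` are illustrative assumptions, not part of the original example. The `hasattr` check is one possible way to stay compatible with both older pandas (only `applymap`) and pandas 2.1 or later (where `DataFrame.map` is the preferred name).

```python
import numpy as np
import pandas as pd

# Small fixed frame so the results are reproducible (values are illustrative).
demo = pd.DataFrame(
    {"b": [1.0, -2.0, 3.0], "d": [0.5, 4.0, -1.5]},
    index=["Utah", "Ohio", "Texas"],
)


def spread(x):
    """Return max - min of a one-dimensional Series."""
    return x.max() - x.min()


col_spread = demo.apply(spread)                  # one call per column
row_spread = demo.apply(spread, axis="columns")  # one call per row

# NumPy ufuncs are already element-wise, so no apply is needed here.
abs_demo = np.abs(demo)

# Element-wise Python function: pick DataFrame.map when the installed
# pandas provides it, otherwise fall back to DataFrame.applymap.
if hasattr(pd.DataFrame, "map"):
    squared = demo.map(lambda v: v ** 2)
else:
    squared = demo.applymap(lambda v: v ** 2)

print(col_spread)
print(row_spread)
print(abs_demo)
print(squared)
```

Here `col_spread` has one entry per column and `row_spread` one entry per row label, which mirrors the difference between the default axis and `axis="columns"` in the original code.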

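To sort by row or column labels, use the sort_index() method. frame.sort_index() sorts by the row index; pass axis=1 to sort by the column labels instead.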
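
As a rough illustration of sorting by labels, the sketch below sorts a small frame by its row index, by its column labels, and by its column labels in descending order. The frame `labeled` and its values are assumptions made up for this example.

```python
import pandas as pd

# Illustrative frame whose row and column labels are deliberately unsorted.
labeled = pd.DataFrame(
    [[0, 1, 2, 3], [4, 5, 6, 7]],
    index=["three", "one"],
    columns=["d", "a", "b", "c"],
)

by_rows = labeled.sort_index()                               # sort row labels
by_cols = labeled.sort_index(axis=1)                         # sort column labels
by_cols_desc = labeled.sort_index(axis=1, ascending=False)   # descending order

print(by_rows)
print(by_cols)
print(by_cols_desc)
```

In each case the data moves together with its labels; only the order of rows or columns changes.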

To sort by value, use the sort_values() method:

import pandas as pd

obj = pd.Series(range(4), index=['d', 'a', 'c', 'b'])
print(obj)
print(obj.sort_index())
print(obj.sort_values())
# When sorting a DataFrame by the values of a column, pass the column name with by=...
frame = pd.DataFrame({'b': [4, 7, -3, 2], 'a': [0, 1, 0, 1]})
print(frame)
print(frame.sort_values(by='b'))
print(frame.sort_values(by=['b', 'a']))
print(frame.sort_values(by=['a', 'b']))
# With several columns, the first column listed takes priority and later
# columns only break ties. The default order is ascending.

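Output (from the author's environment; the int32 dtype is platform-dependent, and recent pandas versions keep the dict's key order, so the columns would print as b, a):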

d    0
a    1
c    2
b    3
dtype: int32
a    1
b    3
c    2
d    0
dtype: int32
d    0
a    1
c    2
b    3
dtype: int32
   a  b
0  0  4
1  1  7
2  0 -3
3  1  2
   a  b
2  0 -3
3  1  2
0  0  4
1  1  7
   a  b
2  0 -3
3  1  2
0  0  4
1  1  7
   a  b
2  0 -3
0  0  4
3  1  2
1  1  7
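
The examples above all sort in ascending order. As a hedged sketch of two further options of sort_values, the code below sorts one column ascending and another descending, and shows where missing values end up. The Series `with_nan` and its values are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({"b": [4, 7, -3, 2], "a": [0, 1, 0, 1]})

# Sort by a ascending, then by b descending within each value of a.
mixed = frame.sort_values(by=["a", "b"], ascending=[True, False])
print(mixed)

# Illustrative Series containing missing values.
with_nan = pd.Series([4, np.nan, 7, np.nan, -3, 2])

nan_last = with_nan.sort_values()                      # NaN goes to the end by default
nan_first = with_nan.sort_values(na_position="first")  # NaN placed at the start
print(nan_last)
print(nan_first)
```

Passing a list to `ascending` gives one direction per column named in `by`, so the two lists need the same length.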