Python pandas: using map, apply and applymap to operate on a single column/row, multiple columns/rows, or every element of a DataFrame

每天都想躺平的大喵

Last modified on 2022-08-08 11:52:31

First published on 2021-12-11 23:57:08

Article link: https://blog.csdn.net/weixin_39925939/article/details/121881550


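map: operates only on a single column or row of a DataFrame
- Example 1: replacing values in a column or row with a dictionary
- Example 2: replacing the grouped column with the result of groupby
apply: operates on multiple columns or rows of a DataFrame
- The handy result_type="expand" option
applymap: operates on every element of a DataFrame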

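While reading online posts about using map, apply and applymap with pandas DataFrames, I found that many of them fail to spell out the key points. This post summarizes what each function can be used on and when it applies.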

map: operates only on a single column or row of a DataFrame

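First, to be clear: map here is the Series method. A single column or row of a DataFrame is a Series, so map works on one column or one row at a time. (When this post was first written, DataFrame had no map method at all. pandas 2.1 later added DataFrame.map as the element-wise replacement for applymap, which is a different operation from Series.map.)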
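Series.map accepts either a dictionary or a function. The sketch below uses a made-up Series `s` and a made-up `lookup` dictionary (neither is from the examples in this post) to show one detail worth knowing: with a dictionary, any value that has no matching key becomes NaN.

```python
import pandas as pd

# Hypothetical data for illustration only
s = pd.Series([0, 4, 8, 12], name="a")
lookup = {0: 0, 4: 10, 8: 4}

# Dictionary lookup: 12 has no entry, so it becomes NaN
by_dict = s.map(lookup)

# Function form: fall back to the original value when there is no entry
by_func = s.map(lambda v: lookup.get(v, v))

print(by_dict)
print(by_func)
```

If every value is known to have a key, the dictionary form is enough. The function form is one way to keep unmapped values unchanged.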

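Example 1: replacing values in a column or row with a dictionary

The example below replaces the values in column a using a dictionary, then applies a conditional function to row 2.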

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3,4), columns=list('abcd'))
df
>>> a	b	c	d
0	0	1	2	3
1	4	5	6	7
2	8	9	10	11

# Replace the values in column `a` using a dictionary
df['a'] = df['a'].map({0:0, 4:10, 8:4})
df
>>> a	b	c	d
0	0	1	2	3
1	10	5	6	7
2	4	9	10	11

# Apply a conditional function to every value in row `2`
df.loc[2] = df.loc[2].map(lambda x: x if x>9 else 9-x)
df
>>>	a	b	c	d
0	0	1	2	3
1	10	5	6	7
2	5	0	10	11

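Example 2: replacing the grouped column with the result of groupby

A groupby aggregation returns a Series indexed by the group keys. Passing that Series to map works like a dictionary lookup: each key in column c1 is replaced by its group's value.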

df = pd.DataFrame({'c1': ['a', 'a', 'a', 'b', 'b', 'b'],
                   'c2': ['c', 'd', 'd', 'e', 'f', 'f'],
                   'c3': [1, 2, 3, 4, 5, 6],
                   'c4': np.random.randn(6)})

# Mean of c3 within each c1 group; the result is a float Series indexed by c1
df.groupby('c1')['c3'].mean()
>>> c1
a    2.0
b    5.0
Name: c3, dtype: float64

# Replace each key in c1 with the mean of its group
df['c1'].map(df.groupby('c1')['c3'].mean())
>>>0    2.0
1    2.0
2    2.0
3    5.0
4    5.0
5    5.0
Name: c1, dtype: float64
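For comparison, the same per-row group means can also be obtained with groupby(...).transform. This is an alternative to the map approach above, not something the original example uses. The sketch rebuilds a smaller version of the DataFrame (without c2 and c4) and checks that both routes give the same values.

```python
import pandas as pd

df = pd.DataFrame({"c1": ["a", "a", "a", "b", "b", "b"],
                   "c3": [1, 2, 3, 4, 5, 6]})

# Route 1: aggregate first, then look up each key with Series.map
group_means = df.groupby("c1")["c3"].mean()
via_map = df["c1"].map(group_means)

# Route 2: transform broadcasts the group mean back to the original rows
via_transform = df.groupby("c1")["c3"].transform("mean")

print(via_map.tolist())
print(via_transform.tolist())
```

The values agree. The only visible difference is the name of the resulting Series: c1 for the map route and c3 for the transform route.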

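apply: operates on multiple columns or rows of a DataFrame

DataFrame.apply can work with several columns or several rows at once. The key is the axis parameter. With axis=1 the function receives one row at a time, as a Series indexed by the column labels, so it can combine values from multiple columns. With axis=0 the function receives one column at a time, as a Series indexed by the row labels, so it can combine values from multiple rows. The default is axis=0. Both cases are illustrated in this section.

Consider the following DataFrame: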
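To make the axis convention concrete, here is a minimal sketch on a small made-up DataFrame (the values are chosen only so the sums are easy to check by hand).

```python
import pandas as pd

# Hypothetical data for illustration only
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# axis=0 (default): the function is called once per column,
# so the result is indexed by the column labels A and B
col_sums = df.apply(lambda col: col.sum())

# axis=1: the function is called once per row,
# so the result is indexed by the row labels 0, 1, 2
row_sums = df.apply(lambda row: row.sum(), axis=1)

print(col_sums)
print(row_sums)
```

The index of the result is a quick way to check which direction apply went: column labels mean the function saw columns, row labels mean it saw rows.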

np.random.seed(1)
df = pd.DataFrame(np.random.randn(4,2), columns=['A', 'B'])
df
>>>       A	    B
0	1.624345	-0.611756
1	-0.528172	-1.072969
2	0.865408	-2.301539
3	1.744812	-0.761207

Operating on multiple columns
Combine columns A and B into a new column C, where C is a string of the form A±B:

df['C'] = df.apply(lambda x: '{:.2f}±{:.2f}'.format(x['A'], x['B']), axis=1)
df
>>> 	A	B	C
0	1.624345	-0.611756	1.62±-0.61
1	-0.528172	-1.072969	-0.53±-1.07
2	0.865408	-2.301539	0.87±-2.30
3	1.744812	-0.761207	1.74±-0.76
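Because the formatting here treats each column independently, the same string column can also be built from two Series.map calls, tying this back to the first section. This is only an alternative sketch; the helper name `fmt` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

np.random.seed(1)
df = pd.DataFrame(np.random.randn(4, 2), columns=["A", "B"])

# Row-wise apply, as in the example above
via_apply = df.apply(lambda x: "{:.2f}±{:.2f}".format(x["A"], x["B"]), axis=1)

# Column-wise alternative: format each column with Series.map, then concatenate
fmt = "{:.2f}".format  # hypothetical helper name
via_map = df["A"].map(fmt) + "±" + df["B"].map(fmt)

print(via_map.tolist())
```

Both produce the same strings. apply with axis=1 is the more general tool when the function needs several columns at once in a way that cannot be split per column.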

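The handy result_type="expand" option

apply takes a result_type parameter. Setting it to "expand" turns a list-like return value into multiple columns, while "reduce" keeps each return value as a single object. According to the pandas documentation, result_type only takes effect when axis=1. See the examples below: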

import numpy as np
import pandas as pd
df = pd.DataFrame(np.arange(12).reshape(3,4), columns=list('abcd'))

# expand: each row returns a list, and the lists are spread into a DataFrame
df.apply(lambda x:[x['a']+1, x['c']+2], axis=1, result_type="expand")
>>> 0	1
0	1	4
1	5	8
2	9	12

# reduce: each row's list is kept as one object, so the result is a Series
df.apply(lambda x:[x['a']+1, x['c']+2], axis=1, result_type="reduce")
>>>
0     [1, 4]
1     [5, 8]
2    [9, 12]
dtype: object

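# axis=0: each column returns a tuple, and the tuples are assembled into a DataFrame
# (result_type is documented as acting only when axis=1)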
df.apply(lambda x:(x[0]+1, x[1]+2), axis=0, result_type="expand")
>>> a	b	c	d
0	1	2	3	4
1	6	7	8	9

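Operating on multiple rows

Here df is again the 4×2 random DataFrame with columns A and B (it was reassigned in the result_type examples, so it is recreated first). Rows 2 and 3 of each column are combined into a new row labelled 10:

np.random.seed(1)
df = pd.DataFrame(np.random.randn(4,2), columns=['A', 'B'])
# axis=0 (default): x is one column, so x[2] and x[3] are its values in rows 2 and 3
df.loc[10] = df.apply(lambda x: '{:.2f}±{:.2f}'.format(x[2], x[3]))
df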
>>>        A	B
0	1.624345	-0.611756
1	-0.528172	-1.072969
2	0.865408	-2.301539
3	1.744812	-0.761207
10	0.87±1.74	-2.30±-0.76

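applymap: operates on every element of a DataFrame

To apply one function to every element of a DataFrame, use applymap. Since pandas 2.1 this method has been renamed to DataFrame.map, and applymap is deprecated.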

df = pd.DataFrame(np.arange(12).reshape(3,4), columns=list('abcd'))
df = df.applymap(lambda x: x if x>6 else 6-x)
df
>>> a	b	c	d
0	6	5	4	3
1	2	1	0	7
2	8	9	10	11
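Since applymap is deprecated in newer pandas releases, the sketch below picks DataFrame.map when it exists and falls back to applymap otherwise. It also shows a vectorized equivalent of this particular lambda using DataFrame.where. Both are offered as possible alternatives rather than as part of the original example; the variable name `elementwise` is made up here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4), columns=list("abcd"))

# Use DataFrame.map where available (pandas >= 2.1), else the older applymap
elementwise = df.map if hasattr(df, "map") else df.applymap
result = elementwise(lambda x: x if x > 6 else 6 - x)

# Vectorized equivalent: keep values where df > 6, otherwise use 6 - df
vectorized = df.where(df > 6, 6 - df)

print(result)
print(vectorized)
```

For simple arithmetic conditions like this one, the where form avoids calling a Python function once per element. An element-wise function remains the general option when the logic cannot be expressed with array operations.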