I have a reference DataFrame that looks like this:
    Variables  Key  Values
0   GRTYPE     40   Total exclusions 4-year schools
1   GRTYPE     2    4-year institutions, Adjusted cohort
2   GRTYPE     3    4-year institutions, Completers
41  CHRTSTAT   2    Revised cohort
42  CHRTSTAT   3    Exclusions
43  CHRTSTAT   4    Adjusted cohort
57  SECTION    12   Bachelors/ equiv .
58  SECTION    23   Bachelors or equiv 2009 .
I want to use the reference DataFrame to replace the values in the main DataFrame below:
   GRTYPE  CHRTSTAT  SECTION
0  40      2         12
1  2       3         12
2  2       4         23
3  3       2         12
4  3       3         23
The final result would be:
   GRTYPE                                CHRTSTAT         SECTION
0  Total exclusions 4-year schools       Revised cohort   Bachelors/ equiv .
1  4-year institutions, Adjusted cohort  Exclusions       Bachelors/ equiv .
2  4-year institutions, Adjusted cohort  Adjusted cohort  Bachelors or equiv 2009 .
3  4-year institutions, Completers       Revised cohort   Bachelors/ equiv .
4  4-year institutions, Completers       Exclusions       Bachelors or equiv 2009 .
What is the best way to do this in pandas or Python? I tried joining and extracting the variables from the first DataFrame and looping over the second, but got nowhere.
Solution:
Use map
You need to set Variables and Key as the index of the mapping DataFrame, then call map on each column of the main DataFrame.
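# Index the reference table by (Variables, Key) so each column's lookup can be selected by name
mapping_df = mapping_df.set_index(['Variables', 'Key'])
# x.name is the column label; map each column through its own Key -> Values Series
df = df.apply(lambda x: x.map(mapping_df.loc[x.name]['Values']))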
This is equivalent to mapping each column explicitly:
mapping_df = mapping_df.set_index(['Variables', 'Key'])
df['GRTYPE'] = df.GRTYPE.map(mapping_df.loc['GRTYPE']['Values'])
df['CHRTSTAT'] = df.CHRTSTAT.map(mapping_df.loc['CHRTSTAT']['Values'])
df['SECTION'] = df.SECTION.map(mapping_df.loc['SECTION']['Values'])
Output:
   GRTYPE                                CHRTSTAT         SECTION
0  Total exclusions 4-year schools       Revised cohort   Bachelors/ equiv .
1  4-year institutions, Adjusted cohort  Exclusions       Bachelors/ equiv .
2  4-year institutions, Adjusted cohort  Adjusted cohort  Bachelors or equiv 2009 .
3  4-year institutions, Completers       Revised cohort   Bachelors/ equiv .
4  4-year institutions, Completers       Exclusions       Bachelors or equiv 2009 .
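For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same map approach. The two DataFrames are rebuilt by hand from the sample rows in the question, and the names lookup and result are my own, not from the original answer. It also assumes the Key column and the main DataFrame's columns share the same dtype (integers here).

```python
import pandas as pd

# Reference table, rebuilt from the sample rows shown in the question.
mapping_df = pd.DataFrame({
    "Variables": ["GRTYPE", "GRTYPE", "GRTYPE",
                  "CHRTSTAT", "CHRTSTAT", "CHRTSTAT",
                  "SECTION", "SECTION"],
    "Key": [40, 2, 3, 2, 3, 4, 12, 23],
    "Values": [
        "Total exclusions 4-year schools",
        "4-year institutions, Adjusted cohort",
        "4-year institutions, Completers",
        "Revised cohort",
        "Exclusions",
        "Adjusted cohort",
        "Bachelors/ equiv .",
        "Bachelors or equiv 2009 .",
    ],
})

# Main DataFrame holding the coded values.
df = pd.DataFrame({
    "GRTYPE": [40, 2, 2, 3, 3],
    "CHRTSTAT": [2, 3, 4, 2, 3],
    "SECTION": [12, 12, 23, 12, 23],
})

# Series indexed by (Variables, Key); "lookup" is a hypothetical name.
lookup = mapping_df.set_index(["Variables", "Key"])["Values"]

# For each column, select that variable's Key -> Values Series and map through it.
result = df.apply(lambda col: col.map(lookup.loc[col.name]))
print(result)
```

One thing to watch for: map returns NaN for any code that has no matching row in the reference table, so a quick result.isna().any() check after the replacement can catch missing or mistyped keys.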
Tags: python, pandas
Source: https://codeday.me/bug/20190727/1549477.html