DataFrame.eval()
Today I came across a really neat pandas trick, DataFrame.eval(), and had to write it down:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.random((6,4)), columns=list('abcd'))
df
          a         b         c         d
0  0.352100  0.660768  0.259112  0.190435
1  0.438345  0.147769  0.702476  0.503706
2  0.214064  0.440153  0.700988  0.029637
3  0.646761  0.539095  0.980113  0.921489
4  0.747330  0.260352  0.191178  0.002823
5  0.969599  0.163768  0.018234  0.458367
What I want is to add a new column, df['e'] = df['a']*df['c'] + df['b']*df['d']. With eval, it looks like this:
df.eval('e=a*c+b*d', inplace=True)
print(df)
[Out]:
          a         b         c         d         e
0  0.352100  0.660768  0.259112  0.190435  0.217067
1  0.438345  0.147769  0.702476  0.503706  0.382359
2  0.214064  0.440153  0.700988  0.029637  0.163101
3  0.646761  0.539095  0.980113  0.921489  1.130668
4  0.747330  0.260352  0.191178  0.002823  0.143608
5  0.969599  0.163768  0.018234  0.458367  0.092746
Awesome!
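As a quick sanity check, here is a minimal sketch that compares the eval result with ordinary column arithmetic. The seeded generator is my own assumption, added only so the numbers are reproducible; it is not part of the original snippet. The sketch also leaves out inplace=True, in which case eval returns a new DataFrame and the original stays unchanged.

```python
import numpy as np
import pandas as pd

# Seeded generator: an assumption made here for reproducibility
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((6, 4)), columns=list('abcd'))

# Without inplace=True, eval returns a new DataFrame with column e added
result = df.eval('e = a*c + b*d')

# The same computation written as ordinary column arithmetic
expected = df['a'] * df['c'] + df['b'] * df['d']

print(np.allclose(result['e'], expected))  # the two approaches agree
print('e' in df.columns)                   # the original df has no column e
```

The first print should show True and the second False, since only result carries the new column.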
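The problem I actually need to solve is to pick specific columns out of df.columns, colA=['a', 'b'] and colB=['c', 'd'], and apply the operation above to them. With a bit of Python string handling, it goes like this: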
colA, colB = ['a', 'b'], ['c', 'd']
evalStr = zip(colA, '*'*len(colA), colB)  # yields ('a', '*', 'c'), ('b', '*', 'd')
evalStrList = [''.join(x) for x in evalStr]
print(evalStrList)  # prints ['a*c', 'b*d']
eval_expression = '+'.join(evalStrList)  # i.e. 'a*c+b*d'
df.eval('e=' + eval_expression, inplace=True)
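The string building above could be wrapped in a small helper. The sketch below is one possible way to do it, not the only one: the function name pairwise_product_expr and the integer-valued test frame are hypothetical, introduced here for illustration. It assumes the column names are plain identifiers such as a or b, and it cross-checks the result against a NumPy computation that does not use eval at all.

```python
import numpy as np
import pandas as pd

def pairwise_product_expr(cols_a, cols_b):
    """Build an 'a*c+b*d'-style expression from two equal-length column lists.

    Hypothetical helper; assumes column names are plain identifiers.
    """
    if len(cols_a) != len(cols_b):
        raise ValueError('column lists must have the same length')
    return '+'.join(f'{x}*{y}' for x, y in zip(cols_a, cols_b))

# Small deterministic frame, used here instead of random data
df = pd.DataFrame(np.arange(24, dtype=float).reshape(6, 4), columns=list('abcd'))
colA, colB = ['a', 'b'], ['c', 'd']

expr = pairwise_product_expr(colA, colB)
df = df.eval('e = ' + expr)

# Cross-check without eval: multiply the two column blocks element-wise,
# then sum across each row
check = (df[colA].to_numpy() * df[colB].to_numpy()).sum(axis=1)
print(expr)
print(np.allclose(df['e'], check))
```

The length check is there because zip silently truncates to the shorter list, which would otherwise drop terms from the expression without any warning.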