Three ways to compare two columns in pandas

沙瑞山

Published 2020-06-21 12:04:19


Article link: https://blog.csdn.net/ad576882002/article/details/106883779


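import pandas as pd
import numpy as np

# Two columns of 999,999 standard-normal random values each
df1 = pd.DataFrame({'data1': np.random.randn(999999), 'data2': np.random.randn(999999)}, index=np.arange(0, 999999))

# Method 1: boolean indexing with []
print(df1[df1['data1'] > df1['data2']].head())

# Method 2: query() with an expression string
print(df1.query('data1 > data2').head())

# Method 3: .loc with a lambda that returns the boolean mask
print(df1.loc[lambda x: x['data1'] > x['data2']].head())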

The output is as follows:

Method 1: [] indexing
       data1     data2
3   1.458585 -1.598776
7   0.120763 -0.396222
8   0.955182  0.151948
10 -0.248750 -1.658321
11  0.030225 -0.383628
Elapsed time:  0:00:00.040964 s


Method 2: query()
       data1     data2
3   1.458585 -1.598776
7   0.120763 -0.396222
8   0.955182  0.151948
10 -0.248750 -1.658321
11  0.030225 -0.383628
Elapsed time:  0:00:00.054603 s


Method 3: lambda
       data1     data2
3   1.458585 -1.598776
7   0.120763 -0.396222
8   0.955182  0.151948
10 -0.248750 -1.658321
11  0.030225 -0.383628
Elapsed time:  0:00:00.023384 s

All three methods return the same rows.
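
The printed output only shows the first five rows, so one way to confirm that the full results match is to compare the filtered frames directly. The following is a minimal sketch, not the article's own code: the fixed seed, the smaller row count, and the variable names are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Assumptions: fixed seed and 1,000 rows (the article uses 999,999 unseeded rows)
rng = np.random.default_rng(0)
df = pd.DataFrame({'data1': rng.standard_normal(1000),
                   'data2': rng.standard_normal(1000)})

by_mask = df[df['data1'] > df['data2']]                   # Method 1
by_query = df.query('data1 > data2')                      # Method 2
by_lambda = df.loc[lambda x: x['data1'] > x['data2']]     # Method 3

# DataFrame.equals compares values, dtypes, and index labels
all_equal = by_mask.equals(by_query) and by_mask.equals(by_lambda)
print(all_equal)
```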

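In this run, the lambda approach was the fastest (about 0.023 s) and query() the slowest (about 0.055 s), with [] indexing in between (about 0.041 s). These are single-run timings. Since the [] and lambda forms evaluate the same boolean comparison, the gap between them is probably measurement noise.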
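
The timing code itself is not shown in the listing above. A more repeatable comparison could use the standard library's `timeit` module and keep the best of several runs. This is only a sketch under stated assumptions: the row count, the seed, the repeat counts, and the names `candidates` and `best_times` are all illustrative, and the ranking may differ across machines and pandas versions.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_rows = 100_000  # assumption: fewer rows than the article's 999,999 to keep the run short
df = pd.DataFrame({'data1': rng.standard_normal(n_rows),
                   'data2': rng.standard_normal(n_rows)})

# Each candidate is a zero-argument callable so that timeit can call it repeatedly
candidates = {
    '[] indexing': lambda: df[df['data1'] > df['data2']],
    'query()': lambda: df.query('data1 > data2'),
    'loc + lambda': lambda: df.loc[lambda x: x['data1'] > x['data2']],
}

best_times = {}
for name, func in candidates.items():
    # 5 runs of 10 calls each; keep the fastest run and convert to seconds per call
    runs = timeit.repeat(func, repeat=5, number=10)
    best_times[name] = min(runs) / 10
    print(f'{name:<14} {best_times[name] * 1000:.3f} ms per call')
```

Taking the minimum over several runs reduces the influence of one-off effects such as memory allocation or other processes, which a single timestamp difference cannot separate from the cost of the method itself.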