About the as_index parameter of pandas groupby (True vs. False): whether the group keys become the index, and the difference between an index and a column

weixin_45271076

Article link: https://blog.csdn.net/weixin_45271076/article/details/100180295

groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)
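import pandas as pd

df = pd.DataFrame(data={'books':['bk1','bk1','bk1','bk2','bk2','bk3'], 'price': [12,12,12,15,15,17]})
print(df)
print()
# as_index=True (the default): the group key 'books' becomes the index
print(df.groupby('books', as_index=True).sum())
print()
# as_index=False: 'books' stays an ordinary column with a default integer index
print(df.groupby('books', as_index=False).sum())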
  books  price
0   bk1     12
1   bk1     12
2   bk1     12
3   bk2     15
4   bk2     15
5   bk3     17
 
       price
books       
bk1       36
bk2       30
bk3       17
 
  books  price
0   bk1     36
1   bk2     30
2   bk3     17

When as_index=True, the key(s) you pass to groupby become the index of the resulting DataFrame.
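As a minimal sketch of this difference, the snippet below rebuilds the same example frame and inspects where the 'books' key ends up in each case. The variable names indexed and flat are only illustrative and do not appear in the original post.

```python
import pandas as pd

df = pd.DataFrame({'books': ['bk1', 'bk1', 'bk1', 'bk2', 'bk2', 'bk3'],
                   'price': [12, 12, 12, 15, 15, 17]})

# Hypothetical names for the two grouped results
indexed = df.groupby('books', as_index=True).sum()
flat = df.groupby('books', as_index=False).sum()

# With as_index=True the key lives in the index, not in the columns
print(indexed.index.name)        # books
print(list(indexed.columns))     # ['price']

# With as_index=False the key is a regular column
print(list(flat.columns))        # ['books', 'price']

# reset_index() moves the index back into a column
print(list(indexed.reset_index().columns))  # ['books', 'price']
```

If you already have an indexed result and need the key as a column, reset_index() is usually enough, so the choice of as_index is not irreversible.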

The benefit of as_index=True is that you can pull out the rows you want by key name. For example, with the grouped result stored in a variable named result, you can get 'bk1' with result.loc['bk1'], whereas with as_index=False you have to filter on the column: result.loc[result.books == 'bk1'].
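The two lookup styles can be compared side by side. This is a small sketch on the example data; by_key and by_col are illustrative names, not from the original post.

```python
import pandas as pd

df = pd.DataFrame({'books': ['bk1', 'bk1', 'bk1', 'bk2', 'bk2', 'bk3'],
                   'price': [12, 12, 12, 15, 15, 17]})

# as_index=True: select by label directly on the index
by_key = df.groupby('books', as_index=True).sum()
row = by_key.loc['bk1']          # a Series holding the 'bk1' row
print(row['price'])              # 36

# as_index=False: build a boolean mask over the 'books' column
by_col = df.groupby('books', as_index=False).sum()
rows = by_col.loc[by_col.books == 'bk1']   # a one-row DataFrame
print(rows['price'].tolist())    # [36]
```

Note that the label lookup returns a Series for a single key, while the mask-based filter returns a DataFrame.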

The other main benefit of as_index=True, raised by @ayhan in the comments: the .loc['bk1'] lookup is faster because, when 'bk1' is in the index, pandas does not have to scan the entire books column to find it. It just computes the hash of 'bk1' and finds the row in one step.
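To make the contrast concrete, the sketch below shows the two operations that sit behind each lookup: a position lookup on the index versus an element-wise comparison over the whole column. It illustrates the mechanism only and does not measure speed; on a three-row frame any difference would be negligible. The names grouped and flat are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'books': ['bk1', 'bk1', 'bk1', 'bk2', 'bk2', 'bk3'],
                   'price': [12, 12, 12, 15, 15, 17]})

grouped = df.groupby('books', as_index=True).sum()
flat = df.groupby('books', as_index=False).sum()

# Index lookup: the index resolves the label to a position
pos = grouped.index.get_loc('bk1')
print(pos)                       # 0

# Column filter: every value in 'books' is compared against 'bk1'
mask = flat['books'] == 'bk1'
print(mask.tolist())             # [True, False, False]
```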
