Pandas Data Manipulation Study Notes

花花呼呼

Category: Python

Article link: https://blog.csdn.net/lglfa/article/details/80672622

1. df.country.unique()
Returns the distinct values in the country column of the DataFrame, i.e. which countries appear in the data.
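
A minimal sketch of unique(), assuming a small made-up DataFrame in place of the wine data used in these notes (the rows are hypothetical):

```python
import pandas as pd

# Hypothetical toy data standing in for the wine-review DataFrame.
df = pd.DataFrame({"country": ["Italy", "France", "Italy", "US", "France"]})

# Distinct values, in order of first appearance (not sorted).
countries = df.country.unique()
print(list(countries))
```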

2. df.country.value_counts()
Counts how many times each country appears in the country column.
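
As a rough illustration on hypothetical data, value_counts() can return either raw counts or, with normalize=True, each value's share of the total:

```python
import pandas as pd

# Hypothetical toy data; the real notes use a wine-review DataFrame.
df = pd.DataFrame(
    {"country": ["Italy", "France", "Italy", "US", "Italy", "France"]}
)

counts = df.country.value_counts()                # occurrences per country
shares = df.country.value_counts(normalize=True)  # fraction of all rows

print(counts)
print(shares)
```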

3. Checks whether each entry in the description column contains "tropical" and counts the True/False results. Note the use of map.

tropical_wine = df.description.map(lambda r: "tropical" in r).value_counts()
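
One caveat with the line above: if description contains a missing value, `"tropical" in r` raises a TypeError. The sketch below, on hypothetical data, guards the lambda with an isinstance check and also shows str.contains as a vectorized alternative. Treating missing descriptions as False is an assumption made here, not something the notes specify.

```python
import pandas as pd

# Hypothetical descriptions, including one missing value.
df = pd.DataFrame(
    {"description": ["tropical fruit aromas", "dry and oaky", None,
                     "ripe tropical notes"]}
)

# map calls the function once per element; the isinstance guard keeps
# missing values from reaching the `in` test.
has_tropical = df.description.map(
    lambda r: isinstance(r, str) and "tropical" in r
)
tropical_wine = has_tropical.value_counts()

# Vectorized alternative; na=False treats missing text as "no match".
has_tropical_alt = df.description.str.contains("tropical", na=False)

print(tropical_wine)
```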

4. df.loc[(df.country.notnull()) & (df.variety.notnull())]
Selects the rows of df where neither country nor variety is NaN.
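
A small sketch of the same filter on made-up rows. The two conditions are combined with the element-wise `&` operator; the variable names are chosen here for illustration.

```python
import pandas as pd

# Hypothetical rows with gaps in both columns.
df = pd.DataFrame({
    "country": ["Italy", None, "US", "France"],
    "variety": ["Nebbiolo", "Merlot", None, "Pinot Noir"],
})

# Each notnull() call yields a boolean Series; & combines them row by row.
# Python's `and` would raise a ValueError here.
mask = df.country.notnull() & df.variety.notnull()
both_known = df.loc[mask]

print(both_known)
```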

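5. df = df.dropna(subset=['country', 'variety'])
Drops the rows that have NaN in country or variety. The result must be assigned back, because dropna returns a new DataFrame and leaves the original df unchanged.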
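
To make the "must assign" point concrete, here is a sketch on hypothetical data. The result is bound to a new name so the untouched original can be compared against it; note that a NaN in a column outside subset does not cause a row to be dropped.

```python
import pandas as pd

# Hypothetical rows; price is deliberately missing in row 0.
df = pd.DataFrame({
    "country": ["Italy", None, "US", "France"],
    "variety": ["Nebbiolo", "Merlot", None, "Pinot Noir"],
    "price": [None, 10.0, 15.0, 30.0],
})

# Only country and variety are checked for missing values.
cleaned = df.dropna(subset=["country", "variety"])

print(len(df), len(cleaned))  # the original keeps all of its rows
```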

6. The following two expressions give the same result. Note the use of agg.

pd.concat([df.groupby('variety').price.min().rename('min'), df.groupby('variety').price.max().rename('max')], axis=1)

df.groupby('variety').price.agg(['min', 'max'])
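
The equivalence can be checked on a toy DataFrame (the varieties and prices below are made up). This sketch passes the aggregation names to agg as strings.

```python
import pandas as pd

# Hypothetical prices for two varieties.
df = pd.DataFrame({
    "variety": ["Merlot", "Merlot", "Riesling", "Riesling", "Riesling"],
    "price": [12.0, 30.0, 8.0, 15.0, 22.0],
})

# Long form: compute each statistic separately, then glue the columns.
by_concat = pd.concat(
    [
        df.groupby("variety").price.min().rename("min"),
        df.groupby("variety").price.max().rename("max"),
    ],
    axis=1,
)

# Short form: one groupby, several aggregations at once.
by_agg = df.groupby("variety").price.agg(["min", "max"])

print(by_agg)
```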

7. df.groupby('price').points.max().sort_index()
Groups the rows by price, takes the maximum points within each group, and sorts the result by price in ascending order.
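
A quick sketch with invented prices and scores. groupby already sorts the group keys by default, so sort_index() mainly makes the ordering explicit here.

```python
import pandas as pd

# Hypothetical price/points pairs, intentionally out of order.
df = pd.DataFrame({
    "price": [20, 10, 20, 15, 10],
    "points": [88, 85, 91, 87, 83],
})

# Best score seen at each price, ordered by price ascending.
best_per_price = df.groupby("price").points.max().sort_index()

print(best_per_price)
```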

8. df.rename_axis("wines", axis="rows")
Sets the name of the index to 'wines'.
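
For illustration, on a hypothetical two-row frame: rename_axis changes the name of an axis, not its labels. The "fields" name for the columns axis is an extra example that does not appear in the notes.

```python
import pandas as pd

# Hypothetical frame with an unnamed index.
df = pd.DataFrame({"points": [88, 91]}, index=["a", "b"])

named_rows = df.rename_axis("wines", axis="rows")      # names the index
named_cols = df.rename_axis("fields", axis="columns")  # names the columns axis

print(named_rows)
```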

9. df.rename(columns={'region_1': 'region', 'region_2': 'locale'})
Renames columns: region_1 becomes region and region_2 becomes locale.
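
A short sketch on a made-up row. Columns not listed in the mapping keep their names, and, as with dropna, the original DataFrame is left as it was.

```python
import pandas as pd

# Hypothetical single-row frame with the two region columns.
df = pd.DataFrame({
    "region_1": ["Piedmont"],
    "region_2": ["Langhe"],
    "price": [20],
})

renamed = df.rename(columns={"region_1": "region", "region_2": "locale"})

print(list(renamed.columns))
```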

10. df.assign(n=0)
Adds a column named n whose values are all 0.
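
Besides a constant, assign also accepts a callable that receives the DataFrame, which is handy for derived columns. The points_per_price column below is a hypothetical example, not something from the notes.

```python
import pandas as pd

# Hypothetical prices and scores.
df = pd.DataFrame({"price": [20, 10], "points": [88, 85]})

with_n = df.assign(n=0)  # scalar is broadcast to every row

# Callable form: the new column is computed from existing ones.
with_ratio = df.assign(points_per_price=lambda d: d.points / d.price)

print(with_n)
print(with_ratio)
```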

11. df.reset_index()
Resets the index: the old index becomes a column, and the new index is 0, 1, 2, 3, ...
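
A sketch on a hypothetical frame whose index is named "wines". It also shows drop=True, which discards the old index instead of keeping it as a column.

```python
import pandas as pd

# Hypothetical frame with a named, non-numeric index.
df = pd.DataFrame(
    {"points": [88, 91]},
    index=pd.Index(["a", "b"], name="wines"),
)

flat = df.reset_index()              # old index becomes the "wines" column
dropped = df.reset_index(drop=True)  # old index is thrown away

print(flat)
```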

12. df.col.value_counts()
Lists the values in column col with their frequencies, ordered from most to least frequent.
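
The ordering can be seen, and reversed with ascending=True, in a small sketch. The column name col and its values are placeholders.

```python
import pandas as pd

# Placeholder column with three distinct values of different frequency.
df = pd.DataFrame({"col": ["b", "a", "b", "c", "b", "a"]})

most_first = df.col.value_counts()                  # default: descending
least_first = df.col.value_counts(ascending=True)   # reversed order

print(most_first)
print(least_first)
```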

These methods are often chained together, and some combinations produce surprisingly useful results.
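
As one possible illustration of chaining, the sketch below strings several of the methods above into a single expression. The data and the "nation" axis name are invented for the example.

```python
import pandas as pd

# Hypothetical wine rows with a few missing values.
df = pd.DataFrame({
    "country": ["Italy", "Italy", None, "US", "US"],
    "variety": ["Nebbiolo", "Barbera", "Merlot", None, "Zinfandel"],
    "price": [20, 35, 10, 15, 25],
})

summary = (
    df.dropna(subset=["country", "variety"])  # drop incomplete rows
      .groupby("country").price
      .agg(["min", "max"])                    # price range per country
      .rename_axis("nation", axis="rows")     # rename the index
      .reset_index()                          # index back to a column
)

print(summary)
```

Each step returns a new object, so the original df is not modified along the way.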
