Common pandas methods

1. Loading data

import pandas as pd

marketing = pd.read_csv("DirectMarketing.csv")

2. Inspecting data

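marketing.head(8)      # first 8 rows
marketing.describe()   # summary statistics for the numeric columns
marketing.info()       # column dtypes and non-null counts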
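As a minimal sketch of what these three calls return, the snippet below uses a small in-memory DataFrame instead of the CSV file. The name `sample` and all values in it are made up for illustration; only the column names come from the examples in these notes.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of DirectMarketing.csv
sample = pd.DataFrame({
    "Age": ["Young", "Middle", "Old", "Young"],
    "AmountSpent": [100, 200, 300, 400],
})

preview = sample.head(2)      # DataFrame holding the first 2 rows
summary = sample.describe()   # count, mean, std, min, quartiles, max per numeric column
sample.info()                 # prints dtypes and non-null counts, returns None

print(preview)
print(summary)
```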

3. Selecting columns

Index by column name:
print(marketing[['Age', 'Married', 'AmountSpent']])

4. Selecting rows

print(marketing.iloc[[3, 5, 8]])
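`iloc` selects rows by integer position, while `loc` selects them by index label. The sketch below contrasts the two on a tiny hypothetical DataFrame; the row labels and values are invented for illustration.

```python
import pandas as pd

sample = pd.DataFrame(
    {"AmountSpent": [755, 1318, 296, 2436]},
    index=["a", "b", "c", "d"],  # hypothetical row labels
)

by_position = sample.iloc[[0, 2]]   # first and third rows, whatever their labels are
by_label = sample.loc[["a", "c"]]   # rows whose labels are "a" and "c"

print(by_position)
print(by_label)
```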

5. Handling missing data

Use isnull() to detect missing values and dropna() to drop them.
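A minimal sketch of both methods, assuming a small hand-made DataFrame with a few gaps. The data is hypothetical and does not come from DirectMarketing.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical rows with some missing entries
sample = pd.DataFrame({
    "Age": ["Young", "Middle", None, "Old"],
    "Salary": [47500.0, np.nan, 63600.0, 85600.0],
    "AmountSpent": [755, 1318, 296, 2436],
})

# isnull() returns a boolean frame; sum() counts the missing values per column
missing_per_column = sample.isnull().sum()
print(missing_per_column)

# dropna() drops every row that contains at least one missing value
complete_rows = sample.dropna()
print(complete_rows)
```

By default `dropna()` returns a new DataFrame and leaves the original untouched.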

6. Adding a new column

import pandas as pd
marketing = pd.read_csv('/course/data/DirectMarketing.csv')

ratio = marketing['Salary'] / marketing['AmountSpent']

marketing['SalarySpendRatio'] = ratio
print(marketing.head())

7. Counting

Count the number of distinct values:

print(marketing['Age'].nunique())

Count how many times each value occurs:

value_counts()
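The difference between the two counting methods can be sketched on a short Series. The values below are hypothetical stand-ins for the Age column.

```python
import pandas as pd

# Hypothetical values for the Age column
ages = pd.Series(["Young", "Old", "Middle", "Young", "Middle", "Young"], name="Age")

distinct = ages.nunique()          # a single number: how many different values exist
age_counts = ages.value_counts()   # a Series: occurrences per value, most frequent first

print(distinct)
print(age_counts)
```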

8. Sorting in ascending order

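.sort_values() sorts by value, in ascending order by default. The example below uses .sort_index() instead, which sorts by index label: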

import pandas as pd

marketing = pd.read_csv('/course/data/DirectMarketing.csv')
marketing_vc = marketing['Age'].value_counts().sort_index()
print(marketing_vc)
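For comparison, here is a minimal sketch of `sort_values()` on a DataFrame, assuming a small hypothetical table rather than the real CSV.

```python
import pandas as pd

# Hypothetical rows; only the column names follow the examples in these notes
sample = pd.DataFrame({
    "Age": ["Old", "Young", "Middle"],
    "AmountSpent": [2436, 755, 1318],
})

ascending = sample.sort_values("AmountSpent")                     # smallest first (default)
descending = sample.sort_values("AmountSpent", ascending=False)   # largest first

print(ascending)
print(descending)
```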

9. Querying

import pandas as pd
marketing = pd.read_csv('/course/data/DirectMarketing.csv')

youth = marketing.query('Age == "Young"')
print(youth.head())

Setting the filter condition from a variable (prefix the variable name with @):

import pandas as pd
marketing = pd.read_csv('/course/data/DirectMarketing.csv')

age_group = 'Young'
youth = marketing.query('Age == @age_group')
print(youth.head())

Combining filter conditions with and and or:

import pandas as pd

marketing = pd.read_csv('/course/data/DirectMarketing.csv')
print(marketing.query('AmountSpent > 1000 and Gender == "Female"'))

Querying with indexing (bracket) notation

Syntax:

DataFrame[query]


query = marketing['Salary'] > 90000

import pandas as pd
marketing = pd.read_csv('/course/data/DirectMarketing.csv')

x = 1000
big_earners = marketing[marketing['AmountSpent'] > x]
print(big_earners.head())

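As in the example above, the condition can be written directly inside the brackets, so that building the mask and filtering happen in one line.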

Two conditions can also be combined in a single filter.
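With bracket notation, conditions are combined using `&` (and) and `|` (or) rather than the words `and` and `or`, and each comparison needs its own parentheses. A minimal sketch, assuming hypothetical data:

```python
import pandas as pd

# Hypothetical rows; the thresholds mirror the ones used earlier in these notes
sample = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Male"],
    "Salary": [95000, 42000, 61000, 99000],
    "AmountSpent": [1500, 300, 1200, 800],
})

# Rows where both conditions hold
both = sample[(sample["AmountSpent"] > 1000) & (sample["Gender"] == "Female")]

# Rows where at least one condition holds
either = sample[(sample["AmountSpent"] > 1000) | (sample["Salary"] > 90000)]

print(both)
print(either)
```

The parentheses matter because `&` and `|` bind more tightly than comparison operators in Python.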

10. Groupby

This is roughly equivalent to a pivot table: rows are grouped by a key and each group is aggregated.
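The notes give no example here, so the following is only a sketch of the comparison, assuming hypothetical data and the mean as the aggregation. It computes the same per-group average once with `groupby` and once with `pivot_table`.

```python
import pandas as pd

# Hypothetical rows for illustration
sample = pd.DataFrame({
    "Age": ["Young", "Old", "Young", "Old"],
    "Gender": ["Female", "Male", "Male", "Female"],
    "AmountSpent": [500, 2000, 700, 1000],
})

# Average AmountSpent per Age group, as a Series indexed by Age
mean_by_age = sample.groupby("Age")["AmountSpent"].mean()

# The same aggregation expressed as a pivot table (a one-column DataFrame)
pivot = sample.pivot_table(values="AmountSpent", index="Age", aggfunc="mean")

print(mean_by_age)
print(pivot)
```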

 

 

11. Combining two columns
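The notes do not say how the columns are meant to be combined. One common reading is element-wise string concatenation, sketched below on hypothetical data. The new column name `AgeGender` and the `_` separator are assumptions made for this example.

```python
import pandas as pd

# Hypothetical rows for illustration
sample = pd.DataFrame({
    "Age": ["Young", "Old"],
    "Gender": ["Female", "Male"],
})

# Concatenate two string columns element-wise, with a separator in between
sample["AgeGender"] = sample["Age"] + "_" + sample["Gender"]

print(sample)
```

If either column is not a string column, it would need to be converted first, for example with `.astype(str)`.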
