pandas exercises

Exercise source

https://nbviewer.jupyter.org/github/schmit/cme193-ipython-notebooks-lecture/blob/master/Exercises.ipynb

import random

import numpy as np
import scipy as sp
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# import statsmodels.api as sm
import statsmodels.formula.api as smf

sns.set_context("talk")

anascombe = pd.read_csv('Anscombe.csv')

x = anascombe.groupby('dataset')['x']
y = anascombe.groupby('dataset')['y']
print("x mean:", x.mean())
print("x variance:", x.var())
print("y mean:", y.mean())
print("y variance:", y.var())

print()
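# Select only the numeric columns: recent pandas versions raise an error
# when corr() is called on a frame that still holds the string column 'dataset'.
print(anascombe[anascombe['dataset'] == 'I'][['x', 'y']].corr())
print(anascombe[anascombe['dataset'] == 'II'][['x', 'y']].corr())
print(anascombe[anascombe['dataset'] == 'III'][['x', 'y']].corr())
print(anascombe[anascombe['dataset'] == 'IV'][['x', 'y']].corr())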
print()

lin_model = smf.ols('y ~ x', anascombe[anascombe['dataset'] == 'I']).fit()
print(lin_model.summary())

lin_model = smf.ols('y ~ x', anascombe[anascombe['dataset'] == 'II']).fit()
print(lin_model.summary())

lin_model = smf.ols('y ~ x', anascombe[anascombe['dataset'] == 'III']).fit()
print(lin_model.summary())

lin_model = smf.ols('y ~ x', anascombe[anascombe['dataset'] == 'IV']).fit()
print(lin_model.summary())


g = sns.FacetGrid(anascombe, col="dataset")
g.map(plt.scatter, "x", "y")
plt.show()
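
The per-dataset statistics above are computed with four near-identical lines each. As a rough sketch of a more compact variant, the same numbers can be collected with a single groupby pass. This is only an illustration under some assumptions: the names `quartet`, `describe_fit`, `summary` and `fits` are made up here, only datasets I and II are typed in by hand so that the snippet runs without Anscombe.csv, and `np.polyfit` stands in for the `smf.ols` fit used above (it gives the slope and intercept but none of the other summary output).

```python
import numpy as np
import pandas as pd

# Datasets I and II of Anscombe's quartet, typed in by hand so the sketch
# does not depend on Anscombe.csv being present.
x_vals = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_one = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_two = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

quartet = pd.DataFrame({
    "dataset": ["I"] * 11 + ["II"] * 11,
    "x": x_vals * 2,
    "y": y_one + y_two,
})

# One groupby pass gives the mean and variance of x and y per dataset.
summary = quartet.groupby("dataset")[["x", "y"]].agg(["mean", "var"])


def describe_fit(group):
    """Return correlation, slope and intercept of y ~ x for one dataset."""
    slope, intercept = np.polyfit(group["x"], group["y"], deg=1)
    return pd.Series({
        "corr": group["x"].corr(group["y"]),
        "slope": slope,
        "intercept": intercept,
    })


# Build one row per dataset by iterating over the groups explicitly.
fits = pd.DataFrame(
    {name: describe_fit(group) for name, group in quartet.groupby("dataset")}
).T

print(summary.round(2))
print(fits.round(3))
```

With the real file, replacing `quartet` by the `anascombe` frame loaded above should cover all four datasets in the same way.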

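Because these are exercises, there is reference material to consult directly. I think this is a good way to teach pandas: the library has far too many methods to learn one by one.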
