Week 14: Jupyter Exercises

cchsblog

Article link: https://blog.csdn.net/cchsblog/article/details/80638490

Anscombe's quartet

Anscombe's quartet comprises four datasets, and is rather famous. Why? You'll find out in this exercise.

%matplotlib inline
import random
import numpy as np
import scipy as sp
import scipy.stats  # makes sp.stats available on SciPy versions that do not load it automatically
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf
sns.set_context("talk")

anascombe = pd.read_csv('C:/Users/lenovo/Desktop/data/anscombe.csv')
anascombe.head()

Part 1

For each of the four datasets...

Compute the mean and variance of both x and y

Code

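# Select both columns with a list; the tuple form ['x', 'y'] is rejected by recent pandas versions
anascombe.groupby('dataset')[['x', 'y']].mean()
anascombe.groupby('dataset')[['x', 'y']].var()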

Output
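As a self-contained sketch of the same computation, the block below embeds the standard published Anscombe values instead of reading the CSV. The names `quartet`, `df`, and `summary`, and the dataset labels I to IV, are assumptions made for illustration and are not taken from the original notebook.

```python
import pandas as pd

# Standard published Anscombe values, embedded so the sketch runs without the CSV.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared x values for datasets I-III
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Long-format frame with one row per point (hypothetical stand-in for the CSV).
df = pd.concat(
    [pd.DataFrame({"dataset": name, "x": x, "y": y}) for name, (x, y) in quartet.items()],
    ignore_index=True,
)

# Mean and sample variance of x and y for each dataset, in one table.
summary = df.groupby("dataset")[["x", "y"]].agg(["mean", "var"])
print(summary.round(3))
```

With these values, every dataset reports an x mean of 9 and an x variance of 11, while the y means sit near 7.50 and the y variances between roughly 4.12 and 4.13.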

Compute the correlation coefficient between x and y

Code

# Each dataset has 11 rows; this assumes the CSV lists datasets I to IV in order.
X1 = anascombe.x[0:11].values
X2 = anascombe.x[11:22].values
X3 = anascombe.x[22:33].values
X4 = anascombe.x[33:44].values
Y1 = anascombe.y[0:11].values
Y2 = anascombe.y[11:22].values
Y3 = anascombe.y[22:33].values
Y4 = anascombe.y[33:44].values
cof = [0, 0, 0, 0]
cof[0] = sp.stats.pearsonr(X1, Y1)[0]  # the first returned value is the correlation coefficient
cof[1] = sp.stats.pearsonr(X2, Y2)[0]
cof[2] = sp.stats.pearsonr(X3, Y3)[0]
cof[3] = sp.stats.pearsonr(X4, Y4)[0]
print("I    " + str(cof[0]))
print("II   " + str(cof[1]))
print("III  " + str(cof[2]))
print("IV   " + str(cof[3]))

Output
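Slicing by row position depends on how the file happens to be ordered. A possible alternative, sketched here under the assumption that the standard published Anscombe values are embedded inline, is to group by the `dataset` column and let pandas compute the Pearson coefficient per group. The names `quartet`, `df`, and `corr` are illustrative only.

```python
import pandas as pd

# Standard published Anscombe values, embedded so the sketch runs without the CSV.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared x values for datasets I-III
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}
df = pd.concat(
    [pd.DataFrame({"dataset": name, "x": x, "y": y}) for name, (x, y) in quartet.items()],
    ignore_index=True,
)

# Series.corr defaults to the Pearson coefficient; one value per dataset label.
corr = {name: group["x"].corr(group["y"]) for name, group in df.groupby("dataset")}
for name, r in corr.items():
    print(f"{name:<4} {r:.4f}")
```

All four coefficients come out at about 0.816, which is part of what makes the quartet notable.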

Compute the linear regression line: y=β0+β1x+ϵ (hint: use statsmodels and look at the Statsmodels notebook)

Code

lin_model = smf.ols('y ~ x', anascombe).fit() 
lin_model.summary()

Output
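The call above fits a single line to all rows pooled together. Since the exercise asks for a regression line for each dataset, a per-dataset fit could be sketched as follows. This assumes the standard published Anscombe values embedded inline; `quartet`, `df`, and `fits` are illustrative names, not part of the original notebook.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Standard published Anscombe values, embedded so the sketch runs without the CSV.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared x values for datasets I-III
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}
df = pd.concat(
    [pd.DataFrame({"dataset": name, "x": x, "y": y}) for name, (x, y) in quartet.items()],
    ignore_index=True,
)

# Fit y = b0 + b1 * x separately on each dataset's rows.
fits = {name: smf.ols("y ~ x", data=group).fit() for name, group in df.groupby("dataset")}
for name, fit in fits.items():
    b0, b1 = fit.params["Intercept"], fit.params["x"]
    print(f"{name:<4} intercept={b0:.3f}  slope={b1:.3f}  R^2={fit.rsquared:.3f}")
```

Each fit lands close to y = 3.00 + 0.50x, so the four regression lines are nearly identical even though the underlying point clouds differ. Calling `fits["I"].summary()` gives the same full report as `lin_model.summary()`, restricted to one dataset.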

Part 2

Using Seaborn, visualize all four datasets.

hint: use sns.FacetGrid combined with plt.scatter

Code

m = sns.FacetGrid(anascombe, col="dataset")    
m.map(plt.scatter, "x","y")

Output
