Introduction to Data Science, Week 4

alisonxPandas

Article link: https://blog.csdn.net/alisonxPandas/article/details/80469280

NumPy provides a function for simulating draws from a binomial distribution:

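np.random.binomial(n, p)  # n is the number of trials per experiment, p is the probability of success on each trial
np.random.binomial(n, p, size)  # size is the number of experiments to simulate
# For example, np.random.binomial(20, 0.5, 10000) simulates 10000 experiments of 20 coin flips each. The output is an array in which each element is the number of successes (heads) in one experiment.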

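import numpy as np

x = np.random.binomial(20, .5, 10000)  # 10000 experiments of 20 fair coin flips each

print((x >= 15).mean())  # fraction of experiments with at least 15 heads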

Output: a value close to 0.02 (it varies from run to run because the simulation is random).
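As a sanity check on the simulation, the same tail probability can be computed exactly from the binomial formula. This is a minimal sketch; the names exact and simulated and the fixed seed are my own choices and do not come from the course material.

```python
from math import comb

import numpy as np

n, p, threshold = 20, 0.5, 15

# Exact P(X >= 15) for X ~ Binomial(20, 0.5): sum the binomial pmf over k = 15..20.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(threshold, n + 1))

# Simulated estimate, seeded so the run is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
x = rng.binomial(n, p, 10000)
simulated = (x >= threshold).mean()

print(f"exact: {exact:.4f}, simulated: {simulated:.4f}")
```

The exact value is about 0.0207, so a simulated result near 0.02 is what we should expect.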

Q: Estimate how often tornadoes occur on two consecutive days

chance_of_tornado = 0.01
# One Bernoulli draw per day: 1 means a tornado occurred on that day
tornado_events = np.random.binomial(1, chance_of_tornado, 1000000)
two_days_in_a_row = 0
# Start at 1 so that j-1 is a valid index; run to the end so the last day is included
for j in range(1, len(tornado_events)):
    if tornado_events[j]==1 and tornado_events[j-1]==1:
        two_days_in_a_row+=1
print('{} tornadoes back to back in {} years'.format(two_days_in_a_row, 1000000/365))
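The same count can be obtained without an explicit Python loop by comparing the array with a copy of itself shifted by one day. The sketch below also prints the count we would expect if days are independent; the variable names n_days and expected and the seed are assumptions added for illustration.

```python
import numpy as np

chance_of_tornado = 0.01
n_days = 1_000_000

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility
tornado_events = rng.binomial(1, chance_of_tornado, n_days)

# Day j counts when both day j and day j-1 had a tornado:
# tornado_events[1:] is "today", tornado_events[:-1] is "yesterday".
today = tornado_events[1:] == 1
yesterday = tornado_events[:-1] == 1
two_days_in_a_row = int(np.sum(today & yesterday))

# With independent days, each adjacent pair is a double tornado with probability p**2.
expected = (n_days - 1) * chance_of_tornado**2

print(f"{two_days_in_a_row} back-to-back pairs observed, about {expected:.0f} expected")
```

On arrays of this size the vectorized comparison is much faster than the loop, and the expected count of roughly 100 gives a quick check on the simulated one.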

np.std(distribution) gives the standard deviation of a sample.

stats.skew(distribution) gives the skewness of a sample.

chi_squared_df5 = np.random.chisquare(5, size=10000)

stats.skew(chi_squared_df5)
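To see how skewness changes with the degrees of freedom, the sketch below repeats the measurement for a few values and compares each with the theoretical skewness of a chi-squared distribution, sqrt(8 / df). The chosen df values, the seed, and the skews dictionary are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

skews = {}
for df in (2, 5, 50):
    sample = rng.chisquare(df, size=10000)
    skews[df] = stats.skew(sample)
    theoretical = np.sqrt(8 / df)  # skewness of a chi-squared distribution with df degrees of freedom
    print(f"df={df}: sample skew {skews[df]:.2f}, theoretical {theoretical:.2f}")
```

The skew shrinks as the degrees of freedom grow, which matches the distribution looking more symmetric for larger df.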

Recommended book: Think Stats (O'Reilly). A PDF version is available at greenteapress.com/thinkstats2/index.html

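hypothesis: a statement you can test

alternative hypothesis: there is a difference between the groups

null hypothesis: there is no difference between the groups

critical value (alpha): the threshold for how much chance you are willing to accept. If the probability of seeing the observed difference by chance alone falls below this threshold, you reject the null hypothesis in favor of the alternative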

To test whether two samples differ in their means, use a t-test, which SciPy provides:

from scipy import stats

stats.ttest_ind?  # in IPython/Jupyter, a trailing ? displays the function's documentation
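A small usage sketch of stats.ttest_ind follows. The two samples are made-up data drawn from normal distributions with different means, and the names group_a, group_b, and alpha are assumptions for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # arbitrary seed for reproducibility

# Two synthetic samples: the true mean of group_b is shifted by 0.5.
group_a = rng.normal(loc=0.0, scale=1.0, size=500)
group_b = rng.normal(loc=0.5, scale=1.0, size=500)

alpha = 0.05  # critical value chosen before looking at the data

# Independent two-sample t-test; returns the t statistic and the p-value.
t_stat, p_value = stats.ttest_ind(group_a, group_b)

print(f"t = {t_stat:.2f}, p = {p_value:.4g}")
if p_value < alpha:
    print("Reject the null hypothesis: the sample means differ.")
else:
    print("Fail to reject the null hypothesis.")
```

By default ttest_ind assumes both groups have equal variance; passing equal_var=False switches to Welch's t-test when that assumption is doubtful.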