Introduction to Statistics and R

These are mainly notes from the Coursera Basic Statistics course.

Week 1: Exploring Data

Descriptive Statistics

Different Levels of Measurement:
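Nominal, Ordinal, Interval, Ratio. The difference between Interval and Ratio is that zero on an interval scale does not mean "none": for example, a temperature of 0 does not mean there is no temperature.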

Central Tendency and Dispersion:
Measures of central tendency: Mode, Median, Mean (the "3 Ms")
Measures of dispersion: Range, Interquartile Range (IQR), Variance, Standard Deviation

Another measure is the z-score, which specifies whether an observation is common or exceptional: z = (value - mean) / standard deviation.

The corresponding functions in R:

Measurement           Function
Mode                  N/A
Mean                  mean()
Median                median()
Range                 range()
IQR                   IQR()
Variance              var()
Standard Deviation    sd()
Z-Scores              N/A
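
The table marks the mode and z-scores as N/A because base R has no dedicated function for them (`mode()` returns the storage type of an object, not the statistical mode). The sketch below, on made-up data, calls the built-in functions and shows one possible way to fill the two gaps. The names `stat_mode`, `z` and `z_scaled` are illustrative and not part of R.

```r
# Hypothetical sample data, only for illustration
x <- c(2, 4, 4, 5, 7, 9, 11)

mean(x)      # arithmetic mean
median(x)    # middle value
range(x)     # minimum and maximum (a vector of length 2)
IQR(x)       # third quartile minus first quartile
var(x)       # sample variance (divides by n - 1)
sd(x)        # sample standard deviation

# No built-in statistical mode: take the most frequent value(s) from table()
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[counts == max(counts)])
}
stat_mode(x)

# Z-scores by the formula (value - mean) / standard deviation
z <- (x - mean(x)) / sd(x)

# scale() performs the same standardization and returns a one-column matrix
z_scaled <- as.vector(scale(x))
```

Note that `stat_mode` returns every tied value when the data has more than one mode.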

Week 2: Correlation and Regression

Frequency table: one variable
Contingency table: two variables
When the two variables are quantitative, we use a scatterplot.

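Correlation: Pearson's r ranges over [-1, 1]. A positive value means positive correlation, a negative value means negative correlation, and the magnitude indicates the strength:
0.8-1.0 very strong correlation
0.6-0.8 strong correlation
0.4-0.6 moderate correlation
0.2-0.4 weak correlation
0.0-0.2 very weak or no correlation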
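
As a rough aid, the strength bands above can be turned into a lookup with `cut()`. The function name `describe_r` is made up for this sketch, and the bands are only the rule of thumb from these notes, applied to the absolute value of r. A value exactly on a boundary (such as 0.2) is assigned to the lower band here, which is an arbitrary choice.

```r
# Map |r| to the rule-of-thumb labels listed in these notes
describe_r <- function(r) {
  stopifnot(all(abs(r) <= 1))
  strength <- cut(abs(r),
                  breaks = c(0, 0.2, 0.4, 0.6, 0.8, 1),
                  labels = c("very weak or none", "weak", "moderate",
                             "strong", "very strong"),
                  include.lowest = TRUE)
  direction <- ifelse(r > 0, "positive", ifelse(r < 0, "negative", "none"))
  data.frame(r = r, direction = direction, strength = as.character(strength))
}

res <- describe_r(c(-0.85, 0.3, 0.05))
res
```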
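Regression: $\hat{y} = a + bx$, where $b = r \frac{s_y}{s_x}$ and $a = \bar{y} - b\bar{x}$. Here $r = \frac{\sum z_x z_y}{n - 1}$ is Pearson's r, $z$ is the z-score, $s$ is the standard deviation, and $n$ is the sample size.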
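
The standard least-squares relationships (slope equals r times s_y / s_x, intercept equals mean of y minus slope times mean of x, and r equals the sum of z-score products divided by n - 1) can be checked numerically against `cor()` and `lm()`. The data and variable names below are assumptions made for this sketch.

```r
# Hypothetical data, only for illustration
x <- c(1, 2, 3, 4, 5, 6)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)
n <- length(x)

# Pearson's r from z-scores: sum(z_x * z_y) / (n - 1)
zx <- (x - mean(x)) / sd(x)
zy <- (y - mean(y)) / sd(y)
r_manual <- sum(zx * zy) / (n - 1)

# Slope and intercept of the least-squares line
b <- r_manual * sd(y) / sd(x)
a <- mean(y) - b * mean(x)

# Compare with the built-in functions (both comparisons should be TRUE)
fit <- lm(y ~ x)
all.equal(r_manual, cor(x, y))
all.equal(unname(coef(fit)), c(a, b))
```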

Explained variance: The percentage of the variance in the dependent variable that can be explained using the formula of the regression line. You can measure this with r-squared.
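
To make the idea of explained variance concrete, the following sketch computes r-squared three ways on made-up data: from `summary(lm(...))`, as the squared Pearson correlation, and as one minus the ratio of residual to total variation. The variable names and numbers are invented for illustration.

```r
# Hypothetical data, only for illustration
hours <- c(1, 2, 3, 4, 5, 6, 7, 8)
score <- c(52, 55, 61, 60, 68, 71, 75, 79)

fit <- lm(score ~ hours)

# r-squared as reported by lm()
r2_lm <- summary(fit)$r.squared

# The same value as the squared Pearson correlation
r2_cor <- cor(hours, score)^2

# And as 1 - (residual variation / total variation)
ss_res <- sum(residuals(fit)^2)
ss_tot <- sum((score - mean(score))^2)
r2_ss <- 1 - ss_res / ss_tot
```

For a simple regression with one predictor, all three values agree.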

The corresponding functions in R:

Name                                    Function
Frequency Table or Contingency Table    table()
Pearson's r / Correlation               cor()
Linear Regression                       lm()
Scatter Plot                            plot()
Regression Line                         abline()
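
A short usage sketch of the functions in this table, on invented data. When run as a script rather than interactively, `plot()` typically writes to a file such as Rplots.pdf instead of opening a window.

```r
# Hypothetical survey data, only for illustration
gender <- c("F", "M", "F", "F", "M", "M", "F", "M")
smoker <- c("no", "yes", "no", "yes", "no", "no", "no", "yes")

freq <- table(gender)             # frequency table: one variable
cont <- table(gender, smoker)     # contingency table: two variables

# Two quantitative variables: scatterplot with a regression line
height <- c(150, 160, 165, 170, 175, 180, 185, 190)
weight <- c(52, 58, 63, 66, 72, 77, 80, 88)

r <- cor(height, weight)          # Pearson's r
fit <- lm(weight ~ height)        # linear regression
plot(height, weight)              # scatterplot
abline(fit)                       # add the fitted line to the plot
```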

Week 3: Probability

Experiment
Trial
Outcome
Event
Random Variable
Marginal Probability

Two methods to calculate probability:

  • Tree Diagram
  • Contingency Table

The complement of $X$ is $X^c$.
Independent intersecting events are two events that do not influence each other and can occur simultaneously. An example might be the outcome of rolling two dice.
Disjoint exhaustive events are mutually exclusive, so only one of the events can happen at a time; being exhaustive, together they cover every possible outcome.

Intersection: $P(A \cap B)$
Union: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
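
The addition rule for a union can be checked by counting outcomes. The sketch below uses one roll of a fair die; the events A and B and the helper `p` are chosen only for illustration.

```r
# Sample space: one roll of a fair six-sided die
omega <- 1:6
A <- c(2, 4, 6)      # event A: even number
B <- c(4, 5, 6)      # event B: greater than 3

# Probability of an event when all outcomes are equally likely
p <- function(event) length(event) / length(omega)

p_intersection <- p(intersect(A, B))            # outcomes 4 and 6
p_union_direct <- p(union(A, B))                # outcomes 2, 4, 5 and 6
p_union_rule   <- p(A) + p(B) - p_intersection  # addition rule
```

Subtracting the intersection avoids counting the outcomes 4 and 6 twice.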

Joint Probability: $P(A \cap B)$, i.e. P(A and B)
Conditional Probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$, i.e. P(A given B), reduced sample space
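
The joint, marginal and conditional probabilities can all be read off a contingency table. The counts below are invented for this sketch; `prop.table()` with `margin = 2` is one way to condition on the columns directly.

```r
# Hypothetical counts, only for illustration
counts <- matrix(c(30, 10,
                   20, 40),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(A = c("A", "not A"),
                                 B = c("B", "not B")))

joint <- counts / sum(counts)         # joint probabilities for each cell
p_B <- sum(joint[, "B"])              # marginal probability P(B)
p_A_and_B <- joint["A", "B"]          # joint probability P(A and B)
p_A_given_B <- p_A_and_B / p_B        # conditional probability P(A | B)

# Dividing each column by its total gives P(row | column)
p_A_given_B_alt <- prop.table(counts, margin = 2)["A", "B"]
```

Conditioning on B restricts attention to the first column, which is the "reduced sample space" mentioned above.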

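A bag contains 6 red balls and 4 green balls, and two balls are drawn at random:
Without replacement: dependent events
With replacement: independent events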
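
For the bag with 6 red and 4 green balls, the probability of drawing two red balls differs between the two cases. A minimal sketch (the variable names are illustrative):

```r
red <- 6
green <- 4
total <- red + green

# Without replacement: the second draw depends on the first,
# because one red ball is already gone
p_two_red_without <- (red / total) * ((red - 1) / (total - 1))

# With replacement: the two draws are independent,
# so the same probability applies twice
p_two_red_with <- (red / total) * (red / total)
```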

If event A and event B are independent:
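$P(A \cap B) = P(A) P(B)$
$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(A) P(B)}{P(B)} = P(A)$
(For example, when tossing two coins, the result of one coin does not affect the result of the other.)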
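
The two-coin example can be verified by enumerating all outcomes. This is a sketch assuming fair coins; `expand.grid()` lists every combination, and the event names are illustrative.

```r
# All equally likely outcomes of tossing two fair coins
outcomes <- expand.grid(coin1 = c("H", "T"), coin2 = c("H", "T"))

A <- outcomes$coin1 == "H"    # event A: first coin shows heads
B <- outcomes$coin2 == "H"    # event B: second coin shows heads

p_A <- mean(A)                # share of outcomes where A holds
p_B <- mean(B)
p_A_and_B <- mean(A & B)
p_A_given_B <- p_A_and_B / p_B

p_A_and_B == p_A * p_B        # product rule for independent events
p_A_given_B == p_A            # knowing B does not change the probability of A
```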

And how to calculate $P(A \cap B)$ when events are dependent?
See this course

Bayes’ Law:
$P(A \cap B) = P(A \mid B) P(B) = P(B \mid A) P(A)$
$P(A \mid B) = \frac{P(B \mid A) P(A)}{P(B)}$
where $P(A)$ is called the prior probability, and $P(A \mid B)$ is called the posterior probability.
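
A worked example of Bayes' law with invented numbers (a test for a rare condition). The probabilities are assumptions chosen for illustration, and P(B) is obtained with the law of total probability, which these notes do not state explicitly.

```r
# Hypothetical numbers, only for illustration
p_A <- 0.01              # prior P(A): having the condition
p_B_given_A <- 0.95      # P(B | A): positive test given the condition
p_B_given_notA <- 0.05   # P(B | not A): positive test without the condition

# Total probability of a positive test, P(B)
p_B <- p_B_given_A * p_A + p_B_given_notA * (1 - p_A)

# Bayes' law: posterior P(A | B)
p_A_given_B <- p_B_given_A * p_A / p_B
```

With these numbers the posterior is much larger than the prior, yet still well below the 0.95 that P(B | A) might suggest.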

Fallible:
Union
Independence
Bayesian Probability II: It is 0.5 because there are two sides, left and right.

Digression:
Do you know the result of 0.1+0.2 in Python?
Why don’t my numbers add up?
Basic Answers
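
The same binary floating-point behaviour can be observed in R, the language used in these notes. A small sketch:

```r
x <- 0.1 + 0.2

exact_match <- (x == 0.3)                 # exact comparison of the stored values
print(x, digits = 17)                     # shows the stored value in more detail
close_enough <- isTRUE(all.equal(x, 0.3)) # comparison with a small tolerance
```

The exact comparison fails because 0.1, 0.2 and 0.3 have no exact binary representation, which is why `all.equal()` is the usual choice for comparing floating-point results in R.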

Week 4: Probability distributions

Probability distributions:

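  • probability mass function (for a discrete random variable; the function value equals the probability)
  • probability density function (for a continuous random variable; the area under the curve equals the probability)
  • cumulative probability distribution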
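
The three kinds of function can be illustrated with R's built-in distribution functions. The binomial parameters (10 trials, success probability 0.5) and the use of the standard normal are arbitrary choices for this sketch.

```r
# Probability mass function: P(X = 3) for X ~ Binomial(size = 10, prob = 0.5)
pmf_3 <- dbinom(3, size = 10, prob = 0.5)

# The PMF values over all possible outcomes sum to 1
pmf_total <- sum(dbinom(0:10, size = 10, prob = 0.5))

# Cumulative probability: P(X <= 3) is the running sum of the PMF
cdf_3 <- pbinom(3, size = 10, prob = 0.5)
cdf_3_by_sum <- sum(dbinom(0:3, size = 10, prob = 0.5))

# Probability density function: height of the standard normal curve at 0
# (a density, not a probability)
density_at_0 <- dnorm(0)

# For a continuous variable, probability is area under the curve:
# P(-1 <= Z <= 1) from the cumulative distribution and by numerical integration
area_cdf <- pnorm(1) - pnorm(-1)
area_int <- integrate(dnorm, lower = -1, upper = 1)$value
```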

Mean and variance of a random variable:

The normal distribution:

The binomial distribution:
