Khan Academy - Statistics and Probability - Unit 6 STUDY DESIGN

STUDY DESIGN

PART 1 Statistical questions

PART 2 Sampling and observational studies

PART 3 Types of studies (experimental vs. observational)

 


PART 1 Statistical questions

1. Statistics: involves collecting, presenting, and analyzing data.

(1) Variability: the degree to which the data points are different from each other / the degree to which the data points vary

(2) Statistical question: a question that can only be answered by collecting data with variability. A question that needs just one data point to answer is therefore not a statistical question.

 


PART 2 Sampling and observational studies

1. It is important to identify potential sources of bias when planning a sample survey. When we say there’s potential bias, we should also be able to argue if the results will probably be an overestimate or an underestimate.

2. Sources of bias in surveys

(1) Convenience sample / Undercoverage bias: the researcher chooses a sample that is readily available in some non-random way, so part of the population is left out or underrepresented.

  • [Eg] A researcher polls people as they walk by on the street.
  • [Eg] A senator wanted to know about how people in her state felt about internet privacy issues. She conducted a poll by calling 100 people whose names were randomly sampled from the phone book (note that mobile phones and unlisted numbers aren't in phone books). The senator's office called those numbers until they got a response from all 100 people chosen.

(2) Voluntary response sample: the researcher puts out a request for members of a population to join the sample, and people decide whether or not to be in the sample.

  • [Eg] A TV show host asks his viewers to visit his website and respond to an online poll.

(3) Response bias: the tendency of a person to answer questions on a survey untruthfully or misleadingly.

  • [Eg] A high school wanted to know what percent of its students smoke cigarettes. During the week when students visited the counselors to schedule classes, they asked every student in person if they smoked cigarettes or not.

(4) Nonresponse bias: arises when the people who respond differ in meaningful ways from those who do not respond.

  • [Eg] A survey about the best brand of alcoholic drink that is sent to older religious people will likely get very few responses.

3. Sampling methods

(1) Simple random sample: Every member and set of members has an equal chance of being included in the sample. Technology, random number generators, or some other sort of chance process is needed to get a simple random sample.

  • [Eg] A teacher puts students’ names in a hat and chooses without looking to get a sample of students. 
  • Benefit: random samples are usually fairly representative since they don’t favor certain members.
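The names-in-a-hat idea can be sketched in Python with the standard library. This is only a minimal illustration: the class list, the sample size, and the helper name `simple_random_sample` are assumptions made for the example, not part of the original notes.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Draw k distinct members so that every set of k members is equally likely."""
    rng = random.Random(seed)          # seed is optional; it only makes runs repeatable
    return rng.sample(population, k)   # sampling without replacement

# Hypothetical class of 30 students (the "names in the hat").
students = [f"student_{i:02d}" for i in range(1, 31)]
chosen = simple_random_sample(students, 5, seed=1)
print(chosen)
```

Because `random.sample` draws without replacement, no student can be picked twice.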

(2) Stratified random sample: The population is first split into groups. The overall sample consists of some members from every group. The members from each group are chosen randomly.

  • [Eg] A student council surveys 100 students by getting samples of 25 freshmen, 25 sophomores, 25 juniors, and 25 seniors.
  • Benefit: it guarantees that members from each group will be represented in the sample, so this sampling method is good when we want some members from every group.
  • Stratified random sampling vs. Simple random sampling: stratified random sampling can produce a weighted mean that has less variability than the arithmetic mean of a simple random sample of the population.
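A rough sketch of the student council example, assuming a hypothetical school with 200 students per grade. The function name `stratified_sample` and the data are made up for illustration.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Take a simple random sample of size per_stratum from every group."""
    rng = random.Random(seed)
    return {
        name: rng.sample(members, per_stratum)
        for name, members in strata.items()
    }

# Hypothetical population: 200 students in each of four grades.
grades = {
    grade: [f"{grade}_{i}" for i in range(200)]
    for grade in ("freshman", "sophomore", "junior", "senior")
}
picked = stratified_sample(grades, 25, seed=0)
total = sum(len(members) for members in picked.values())
print(total)  # 100 students, 25 from each grade
```

Every grade is guaranteed to appear in the result, which a simple random sample of 100 students would not guarantee.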

(3) Cluster random sample: The population is first split into groups. The overall sample consists of every member from some of the groups. The groups are selected at random.

  • [Eg] An airline company wants to survey its customers one day, so it randomly selects 5 flights that day and surveys every passenger on those flights.
  • Benefit: it gets every member from some of the groups, so it’s good when each group reflects the population as a whole.
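The airline example might look like this in Python. The number of flights, the passengers per flight, and the helper name `cluster_sample` are assumptions chosen only to make the sketch runnable.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole groups at random, then keep every member of the chosen groups."""
    rng = random.Random(seed)
    chosen_ids = rng.sample(sorted(clusters), n_clusters)
    members = [m for cid in chosen_ids for m in clusters[cid]]
    return chosen_ids, members

# Hypothetical day with 40 flights of 100 passengers each.
flights = {
    f"flight_{i}": [f"flight_{i}_pax_{j}" for j in range(100)]
    for i in range(40)
}
chosen_flights, passengers = cluster_sample(flights, 5, seed=0)
print(chosen_flights, len(passengers))
```

Note the contrast with stratified sampling: here the randomness is in which groups are chosen, not in which members are chosen within a group.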

(4) Systematic random sample: members of the population are put in some order. A starting point is selected at random, and every nth member is selected to be in the sample.

  • [Eg] Assume that in a population of 10,000 people, a statistician selects every 100th person for sampling. 
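A minimal sketch of the every-100th-person example. The population is represented by the numbers 1 to 10,000, and the helper name `systematic_sample` is an assumption for illustration.

```python
import random

def systematic_sample(ordered_population, step, seed=None):
    """Pick a random start among the first `step` members, then take every step-th member."""
    rng = random.Random(seed)
    start = rng.randrange(step)              # random starting index in 0..step-1
    return ordered_population[start::step]

people = list(range(1, 10001))               # population of 10,000 in a fixed order
sample = systematic_sample(people, 100, seed=0)
print(len(sample))  # 100
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined by the ordering.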

 

[EXERCISE] There are 90 students in a lunch period, and 5 of them will be selected at random for cleaning duty every week. Each student receives a number 01-90, and the school uses a random digit table to select a simple random sample of 5 students. Which 5 students should be assigned cleaning duty?

96565 05007 16605 81194 14873

A. 96, 56, 50, 50, 07

B. 96, 56, 50, 07, 16

C. 56, 50, 50, 07, 16

D. 56, 50, 07, 16, 60
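One way to work this exercise is sketched below. It assumes the usual convention for random digit tables: read the digits two at a time from left to right, skip labels outside 01-90, and skip repeats. The function name `read_digit_table` is made up for this example.

```python
def read_digit_table(digits, k, low, high, width=2):
    """Read fixed-width labels from a row of random digits, skipping invalid and repeated ones."""
    stream = "".join(digits.split())         # drop the spaces between digit groups
    chosen = []
    for i in range(0, len(stream) - width + 1, width):
        label = int(stream[i:i + width])
        if low <= label <= high and label not in chosen:
            chosen.append(label)
        if len(chosen) == k:
            break
    return chosen

table_row = "96565 05007 16605 81194 14873"
selected = read_digit_table(table_row, k=5, low=1, high=90)
print(selected)  # [56, 50, 7, 16, 60]
```

Under that convention, 96 is skipped because it is out of range and the second 50 is skipped as a repeat, which corresponds to choice D.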

 


PART 3 Types of studies (experimental vs. observational)

1. Why do we do studies?

We do studies to gather information and draw conclusions.

2. Types of statistical studies

(1) Sample study: trying to estimate the value of a parameter for a population from a random sample

(2) Observational study: trying to understand how two parameters in a population might move together or not / trying to find whether there is a correlation between two variables in a population.

  • We measure or survey members of a sample without trying to affect them.

(3) Experimental study: trying to establish causality. Researchers introduce an intervention and study its effects, using a control group and a treatment group.

  • We assign people or things to groups and apply some treatment to one of the groups, while the other group does not receive the treatment.
  • To establish a causal relationship, the study must have both a treatment group and a control group.

3. Explanatory variable & Response variable

(1) explanatory variable: also called independent variable / predictor variable. It explains the variations in the response variable; in an experimental study, it is manipulated by the researcher.

(2) response variable: also called dependent variable, whose variation depends on other variables. The response variable is the subject of change within an experiment, often as a result of differences in the explanatory variables.

4. Blind experiment & Double-blind experiment

(1) In a single-blind study, the participants in the clinical trial do not know if they are receiving the placebo or the real treatment. 

(2) In a double-blind study, neither the participants nor the experimenters know which group got the placebo and which got the experimental treatment.

5. Block design: With a randomized block design, the experimenter divides subjects into subgroups called blocks, such that the variability within blocks is less than the variability between blocks. Then, subjects within each block are randomly assigned to treatment conditions.
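A randomized block assignment can be sketched as follows. The blocks ("men" and "women"), the subject labels, and the function name `randomized_block_assignment` are hypothetical; the only point is that randomization happens separately inside each block.

```python
import random

def randomized_block_assignment(blocks, treatments=("treatment", "control"), seed=None):
    """Shuffle subjects within each block, then deal them out to the treatment conditions."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, subjects in blocks.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)                # random order within this block only
        for i, subject in enumerate(shuffled):
            assignment[subject] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks of 10 subjects each.
blocks = {
    "men": [f"m{i}" for i in range(10)],
    "women": [f"w{i}" for i in range(10)],
}
assignment = randomized_block_assignment(blocks, seed=0)
for subject, (block, condition) in sorted(assignment.items())[:4]:
    print(subject, block, condition)
```

Dealing the shuffled subjects out in turn keeps the conditions balanced within every block, so the blocking variable cannot be confounded with the treatment.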

6. Replication: other researchers should be able to repeat the experiment and hopefully get consistent results.

7. Matched pairs design: first, randomly assign people to either the control group or the treatment group; then run a second round in which the groups switch, so people from the treatment group move to the control group and people from the control group move to the treatment group.

8. Placebo effect: People in an experiment will often respond to any treatment, even a treatment that has no real therapeutic value.

 
