Random Numbers and Probability
What are the chances?
Measuring chance
What's the probability of an event?
Example: a coin flip
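The coin-flip example can be sketched in a few lines of R, assuming the usual definition of probability as the number of ways an event can happen divided by the total number of possible outcomes. The `outcomes` vector is an illustrative name, not something from the original material.

```r
# Hypothetical sample space for one flip of a fair coin
outcomes <- c("heads", "tails")

# P(heads) = number of ways to get heads / total number of possible outcomes
p_heads <- sum(outcomes == "heads") / length(outcomes)
p_heads
# 0.5
```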
Assigning salespeople
Sampling from a data frame
sales_counts
sales_counts %>%
  sample_n(1)
sales_counts %>%
  sample_n(1)
Setting a random seed
set.seed(5)
sales_counts %>%
  sample_n(1)
set.seed(5)
sales_counts %>%
  sample_n(1)
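A minimal base-R sketch of why the seed matters: resetting the seed to the same value before each draw reproduces the same pick. The names in `sales_team` are hypothetical placeholders, and `sample()` stands in for `sample_n()` so the example runs without extra packages.

```r
# Hypothetical sales team (placeholder names)
sales_team <- c("Amir", "Brian", "Claire", "Damian")

# Same seed before each draw
set.seed(5)
first_pick <- sample(sales_team, 1)

set.seed(5)
second_pick <- sample(sales_team, 1)

# Both draws return the same person
identical(first_pick, second_pick)
# TRUE
```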
Sampling with replacement in R
sales_counts %>%
  sample_n(2, replace = TRUE)
sample(sales_team, 5, replace = TRUE)
Independent events
Two events are independent if the probability of the second event isn't affected by the outcome of the first event.
Sampling with replacement = each pick is independent
Dependent events
Two events are dependent if the probability of the second event is affected by the outcome of the first event.
Sampling without replacement = each pick is dependent
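The difference between independent and dependent picks can be worked through with simple arithmetic. This is a sketch under the assumption of a four-person team with hypothetical names, where each remaining person is equally likely to be picked.

```r
# Hypothetical sales team (placeholder names)
sales_team <- c("Amir", "Brian", "Claire", "Damian")

# With replacement: the pool is unchanged after the first pick,
# so P(second pick is Claire) is the same whatever happened first.
p_with <- 1 / length(sales_team)

# Without replacement, first pick was Brian: Claire is still in the pool.
remaining <- setdiff(sales_team, "Brian")
p_without_brian_first <- sum(remaining == "Claire") / length(remaining)

# Without replacement, first pick was Claire: she cannot be picked again.
remaining <- setdiff(sales_team, "Claire")
p_without_claire_first <- sum(remaining == "Claire") / length(remaining)

c(p_with, p_without_brian_first, p_without_claire_first)
# 0.2500000 0.3333333 0.0000000
```

Without replacement, the probability of the second pick changes with the first outcome, which is exactly what makes the picks dependent.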
Discrete distributions
Rolling the dice
Choosing salespeople
Probability distribution
Describes the probability of each possible outcome in a scenario
Visualizing a probability distribution
Probability = area
Uneven die
Visualizing uneven probabilities
Adding areas
Discrete probability distributions
Describe probabilities for discrete outcomes
Sampling from discrete distributions
die
mean(die$n)
rolls_10 <- die %>%
  sample_n(10, replace = TRUE)
rolls_10
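The snippets above assume a `die` data frame that is not shown. A minimal sketch, assuming `die` has one row per face in a column `n` and that the die is fair (the `prob` column is a hypothetical addition), shows how the expected value relates to `mean(die$n)`. Base-R indexing stands in for `sample_n()` so the example is self-contained.

```r
# Assumed structure: one row per face, stored in column n
die <- data.frame(n = 1:6)
die$prob <- 1 / nrow(die)  # hypothetical column: fair die, each face 1/6

# Expected value = sum of each outcome times its probability
expected_value <- sum(die$n * die$prob)
expected_value
# 3.5

# For a fair die this matches the plain mean of the faces
mean(die$n)
# 3.5

# Base-R stand-in for sample_n(10, replace = TRUE)
rolls_10 <- die[sample(nrow(die), 10, replace = TRUE), ]
mean(rolls_10$n)  # varies from run to run unless a seed is set
```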
Visualizing a sample
ggplot(rolls_10, aes(n)) +
  geom_histogram(bins = 6)
Sample distribution vs. theoretical distribution
A bigger sample
Law of large numbers
As the size of your sample increases, the sample mean will approach the expected value
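The law of large numbers can be illustrated with a short simulation. This sketch assumes a fair six-sided die (expected value 3.5); the seed and the sample sizes are arbitrary choices.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# Increasingly large samples of fair die rolls
sample_sizes <- c(10, 100, 1000, 100000)

# Mean of each sample
sample_means <- sapply(sample_sizes, function(size) {
  mean(sample(1:6, size, replace = TRUE))
})

# The sample mean should drift toward 3.5 as the sample size grows
data.frame(sample_size = sample_sizes, sample_mean = sample_means)
```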
Continuous distribution
Waiting for the bus
Continuous uniform distribution
Probability still = area
Uniform distribution in R
punif(7, min = 0, max = 12)
# 0.5833333
lower.tail
punif(7, min = 0, max = 12, lower.tail = FALSE)
# 0.4166667
punif(7, min = 0, max = 12) - punif(4, min = 0, max = 12)
# 0.25
Total area = 1
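Because probability is area, the `punif()` results above can be checked by hand as the area of a rectangle. This is a sketch assuming the bus wait time is uniform between 0 and 12 minutes, as in the calls above; the variable names are illustrative.

```r
min_wait <- 0
max_wait <- 12

# P(wait <= 7) = width of the interval [0, 7] times height 1 / (max - min)
p_at_most_7 <- (7 - min_wait) / (max_wait - min_wait)
all.equal(p_at_most_7, punif(7, min = min_wait, max = max_wait))
# TRUE

# P(4 <= wait <= 7) = width 3 times height 1/12
p_between_4_and_7 <- (7 - 4) / (max_wait - min_wait)
p_between_4_and_7
# 0.25

# Total area under the distribution
punif(max_wait, min = min_wait, max = max_wait)
# 1
```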
Other continuous distributions
Other special types of distributions
The binomial distribution
Coin flipping
Binary outcomes
A single flip
# rbinom(number of trials, number of coins, probability of heads/success)
1 = heads, 0 = tails
rbinom(1, 1, 0.5)
# 1
rbinom(1, 1, 0.5)
# 0
One flip many times
rbinom(8, 1, 0.5)
# 1 0 0 1 0 0 1 0
Many flips one time
rbinom(1, 8, 0.5)
# 3
Many flips many times
rbinom(10, 3, 0.5)
# 2 0 1 0 1 1 3 3 3 1
Other probabilities
rbinom(10, 3, 0.25)
# 1 1 0 0 1 1 1 1 2 1
Binomial distribution
Probability distribution of the number of successes in a sequence of independent trials
E.g. Number of heads in a sequence of coin flips
Described by n and p
- n: total number of trials
- p: probability of success
What's the probability of 7 heads?
P(heads = 7)
# dbinom(number of heads, number of trials, probability of heads)
dbinom(7, 10, 0.5)
# 0.1171875
What's the probability of 7 or fewer heads?
P(heads <= 7)
pbinom(7, 10, 0.5)
# 0.9453125
What's the probability of more than 7 heads?
P(heads > 7)
pbinom(7, 10, 0.5, lower.tail = FALSE)
# 0.0546875
1 - pbinom(7, 10, 0.5)
# 0.0546875
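The `dbinom()` and `pbinom()` results above can be cross-checked against the binomial formula, P(k successes) = choose(n, k) × p^k × (1 − p)^(n − k). This is a sketch of that check for 10 flips of a fair coin; the variable name `p_7` is illustrative.

```r
# P(heads = 7) from the binomial formula
p_7 <- choose(10, 7) * 0.5^7 * (1 - 0.5)^3
p_7
# 0.1171875

all.equal(p_7, dbinom(7, 10, 0.5))
# TRUE

# P(heads <= 7) is the sum of the probabilities of 0 through 7 heads
all.equal(sum(dbinom(0:7, 10, 0.5)), pbinom(7, 10, 0.5))
# TRUE

# P(heads > 7) is the sum of the probabilities of 8, 9 and 10 heads
all.equal(sum(dbinom(8:10, 10, 0.5)), pbinom(7, 10, 0.5, lower.tail = FALSE))
# TRUE
```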
Expected value
Expected value = n x p
Expected number of heads out of 10 flips = 10 x 0.5 = 5
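The expected value can also be approximated by simulation. This sketch repeats the 10-flip experiment many times with `rbinom()` and averages the number of heads; the seed and the number of repetitions are arbitrary choices.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

n_flips <- 10
p_heads <- 0.5

# Theoretical expected value: n x p
expected_heads <- n_flips * p_heads
expected_heads
# 5

# Simulate 100,000 experiments of 10 flips each
heads <- rbinom(100000, n_flips, p_heads)

# The simulated average should be close to 5
mean(heads)
```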
Independence
The binomial distribution is a probability distribution of the number of successes in a sequence of independent trials
Counterexample: when the outcome of the first trial alters the probabilities of the second (as in sampling without replacement), the trials are dependent
If trials are not independent, the binomial distribution does not apply!