Probability Distribution
Week 11
INSTITUT TEKNOLOGI SAINS BANDUNG
Name : Dhefio Alim Muzakki
Student ID : 52250014
Major : Data Science
Lecturer : Mr. Bakti Siregar, M.Sc., CDS.
library(tidyverse)
library(readr)
library(ggplot2)
library(dplyr)
library(ggridges)
library(knitr)
library(DT)
Introduction
These additional topics extend the foundational ideas of probability into more applied tools that help describe and model real-world uncertainty. Together, they shift the perspective from basic event relationships toward structured random processes and the behavior of discrete probability distributions.
This chapter introduces the logic behind discrete probability models, visual representations of random variables, and the use of probability mass functions to describe how outcomes are distributed. By understanding how probability is allocated across possible values, learners begin to see how more complex statistical tools are built from simple counting principles, event structures, and conditional reasoning.
These videos collectively reinforce how probability can be expressed numerically, graphically, and conceptually — forming a bridge between elementary definitions and more advanced inferential techniques. The discussions and examples also demonstrate how probability models guide real-world interpretation, allowing learners to connect mathematical expressions with meaningful outcomes.
The following set of videos expands the understanding of probability by moving from foundational definitions toward more applied and visual concepts. Each video contributes a specific piece of intuition, helping you connect theory with real-world reasoning. Together, they create a unified learning path that strengthens both conceptual clarity and problem-solving skills.
- Learn how random variables work and why they are essential tools in probability.
- Explore variance and standard deviation to understand how outcomes spread around the mean.
- See how expected value predicts long-term behavior in uncertain situations.
- Get visual intuition through graphs and demonstrations.
- Reinforce the ideas through guided examples and real-life interpretations.
1 Continuous Random Variables
This chapter focuses on continuous random variables and the probability density functions that describe the likelihood of values over a continuous range.
Video Reference
Probability Distribution of Continuous Variables
To understand continuous random variables, it is essential to know how probability is represented using a Probability Density Function (PDF). Unlike discrete random variables, a continuous random variable does not assign probability to individual points. Instead, probability is obtained from the area under the PDF curve.
1.0.1 Random Variable
A random variable is continuous if it can take any value within an interval on the real number line. Examples include: height, time, temperature, age, pressure, and velocity.
Key characteristics:
- The variable takes values in an interval such as \((a, b)\) or even \((-\infty, +\infty)\).
- The probability of any single point is always zero: \[P(X = x) = 0\]
- Probabilities are meaningful only over intervals:
\[ P(a \le X \le b) = \int_{a}^{b} f(x)\, dx \]
1.1 Probability Density Function
A function \(f(x)\) is a valid Probability Density Function (PDF) if it satisfies:
1. Non-negativity:
\[ f(x) \ge 0 \quad \forall x \]
2. Total area equals 1:
\[ \int_{-\infty}^{\infty} f(x)\, dx = 1 \]
Interpretation
- Larger values of \(f(x)\) indicate higher probability density around that value.
- However, \(f(x)\) is not a probability; probabilities come from the area under the curve.
1.1.1 Example PDF: \(f(x) = 3x^2\) on \([0, 1]\)
Consider the probability density function:
\[ f(x) = 3x^2, \quad 0 \le x \le 1 \]
Validation:
\[ \int_{0}^{1} 3x^2 \, dx = 1 \]
1.1.2 Probability on an Interval
To compute probability within an interval:
\[ P(a \le X \le b) = \int_{a}^{b} 3x^2 \, dx \]
Example:
\[ P(0.5 \le X \le 1) = \int_{0.5}^{1} 3x^2 \, dx = 1^3 - 0.5^3 = 0.875 \]
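A quick numerical check of the validation integral and this interval probability, a minimal sketch using base R's integrate():
# Example PDF f(x) = 3x^2 on [0, 1]
f <- function(x) 3 * x^2
# Total area under the PDF (should be 1)
integrate(f, lower = 0, upper = 1)
# P(0.5 <= X <= 1), which matches 1 - 0.5^3 = 0.875
integrate(f, lower = 0.5, upper = 1)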
library(ggplot2)
library(dplyr)
set.seed(123)
# PARAMETERS
p <- 0.3
n <- 30
nsim <- 10000
# SIMULATION
successes <- rbinom(nsim, n, p)
prop <- successes / n
df <- data.frame(prop)
# THEORETICAL VALUES
mu <- p
se <- sqrt(p * (1 - p) / n)
# COOLER PROFESSIONAL PLOT
ggplot(df, aes(x = prop)) +
geom_histogram(aes(y = after_stat(density)),
bins = 35,
fill = "#4F81BD",
color = "white",
alpha = 0.85) +
geom_density(color = "#C0504D", size = 1.2) +
geom_vline(xintercept = mu, color = "#9BBB59", size = 1.3, linetype = "solid") +
geom_vline(xintercept = mu + se, color = "#8064A2", size = 1, linetype = "dashed") +
geom_vline(xintercept = mu - se, color = "#8064A2", size = 1, linetype = "dashed") +
annotate("text",
x = mu, y = 12,
label = "Mean = p",
color = "#9BBB59", size = 4, fontface = "bold", vjust = -1) +
annotate("text",
x = mu + se, y = 10,
label = "+1 SE",
color = "#8064A2", size = 4) +
annotate("text",
x = mu - se, y = 10,
label = "-1 SE",
color = "#8064A2", size = 4) +
labs(
title = "Sampling Distribution of the Sample Proportion",
subtitle = paste0("True Proportion p = ", p, ", Sample Size n = ", n,
", Simulations = ", nsim),
x = "Sample Proportion (\\hat{p})",
y = "Density",
caption = "Visualization of the sampling distribution based on repeated sampling"
) +
theme_minimal(base_size = 14) +
theme(
plot.title = element_text(face = "bold", size = 18),
plot.subtitle = element_text(color = "gray40"),
panel.grid.minor = element_blank()
)
Interpretation
The graph displays the empirical sampling distribution of \(\hat{p}\) produced by 10,000 repeated samples of size \(n = 30\) from a population with true proportion \(p = 0.3\).
Key Interpretation Points
- Center at the true proportion: the solid green vertical line marks the theoretical mean \(p = 0.3\), and the density curve peaks around this value, confirming \(E(\hat{p}) = p\). This indicates that the estimator \(\hat{p}\) is unbiased.
- Spread represented by the standard error: the two dashed purple lines mark \(p \pm SE = p \pm \sqrt{\frac{p(1-p)}{n}}\). Under approximate normality this region contains about 68% of the sample proportions, and the histogram aligns well with these bounds, illustrating the expected variability.
- Shape approaching normality: even with the moderate sample size \(n = 30\), the histogram approximates a bell-shaped curve, confirming the Central Limit Theorem (CLT) for proportions: \(\hat{p}\) is approximately normal for moderate \(n\).
- Simulation validates theory: the red density curve matches the theoretical expectations of an unbiased center, variability predicted by the standard error, and approximate normality.
Thus, the simulation reinforces the theoretical sampling distribution.
1.1.3 Cumulative Distribution Function (CDF)
The Cumulative Distribution Function (CDF) of a continuous random variable \(X\) is defined as \(F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\, dt\). For the example PDF \(f(x) = 3x^2\) on \([0, 1]\):
\[ F(x) = \int_{0}^{x} 3t^{2}\, dt = x^{3}, \quad 0 \le x \le 1 \]
1.1.4 Relationship Between PDF and CDF
The Probability Density Function (PDF) is the derivative of the CDF:
\[ f(x) = F'(x) \]
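Using the CDF of the example gives the same interval probability computed earlier; a minimal check in R:
# CDF of the example PDF on [0, 1]: F(x) = x^3
F_cdf <- function(x) x^3
# P(0.5 <= X <= 1) via the CDF: F(1) - F(0.5) = 1 - 0.125 = 0.875
F_cdf(1) - F_cdf(0.5)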
2 Sampling Distributions
Sampling distributions form a core idea in inferential statistics. While populations contain all possible individuals and samples represent only a slice of them, the sampling distribution connects these two worlds by describing how a statistic behaves across many repeated samples. This chapter builds a clear and intuitive picture of how sample means vary, why larger samples produce more stable estimates, and how this leads to statistical inference.
Video Reference
The video introduces the goal of understanding what a sampling distribution is, how it differs from a sample distribution, and why the sample mean when repeated over many samples follows a predictable pattern. The fundamental objective is to explain how uncertainty in estimates arises and how it decreases as sample size increases.
Introduction
In statistics, we rarely observe an entire population. Instead, we collect samples. However, each sample is different, which means sample statistics (like the sample mean) vary.
2.1 Learning Objectives
By the end of this chapter, you will be able to:
- Understand what a sample is and how it differs from a population.
- Distinguish sample distributions from sampling distributions.
- Explain the sampling distribution of the sample mean.
- Identify differences between population and sampling distributions.
- Apply basic reasoning about sampling distributions to real data.
- Solve related practice problems.
Review of Samples
A sample is a subset of individuals selected from a population.
If the population is:
- All students at a university
- All manufactured items in a factory
- All possible rolls of a fair die
then a sample is a smaller collection drawn to represent the whole.
A sample statistic such as the sample mean is:
\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \]
But different samples produce different values of \(\bar{X}\).
This variability leads to sampling distributions.
2.2 Sample Distribution vs Sampling Distribution
- Sample Distribution Describes the values within one sample.
Example:
If your sample of test scores is:
\[ \{70, 75, 80, 90, 85\} \]
That list itself forms the sample distribution.
- Sampling Distribution Describes the distribution of a statistic across many repeated samples of the same size.
If we repeatedly draw samples of size \(n = 5\) and compute each sample mean:
\[ \bar{X}_1, \bar{X}_2, \bar{X}_3, \dots \]
The distribution of those means is the sampling distribution of the sample mean.
Sampling Distribution of the Sample Mean
The sample mean has predictable behavior:
- Mean of sampling distribution:
\[ \mu_{\bar{X}} = \mu \]
- Standard Error (SE) of the mean:
\[ \text{SE}_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
This formula shows an important concept:
- Larger sample sizes → smaller standard error
- Larger samples produce more stable, less variable sample means
Even if the population is skewed, the sampling distribution of the mean becomes more normal when \(n\) increases (Central Limit Theorem).
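A small simulation sketch showing that the mean of the sampling distribution matches \(\mu\) and its spread matches \(\sigma/\sqrt{n}\), even for a skewed population. The population, sample size, and number of repetitions below are illustrative assumptions:
set.seed(42)
# Skewed population: exponential with mean 2 (so sigma = 2 as well)
mu    <- 2
sigma <- 2
n     <- 50      # assumed sample size
nsim  <- 10000   # assumed number of repeated samples
# Sample mean for each simulated sample
xbar <- replicate(nsim, mean(rexp(n, rate = 1 / mu)))
mean(xbar)        # close to the population mean mu = 2
sd(xbar)          # close to the theoretical standard error
sigma / sqrt(n)   # sigma / sqrt(n), about 0.283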
2.3 Population Distribution vs Sampling Distribution
| Concept | Population Distribution | Sampling Distribution |
|---|---|---|
| What it describes | Individuals | Sample statistics |
| Measured values | \(X\) | \(\bar{X}\) |
| Shape | Any shape | Approaches normal as \(n\) increases |
| Variability | Population SD \(\sigma\) | Standard Error \(\sigma / \sqrt{n}\) |
The key difference: sampling distributions describe how statistics vary across repeated samples, not how individuals vary.
2.4 Uses of Sampling Distributions
Sampling distributions allow us to:
- Construct confidence intervals
- Perform hypothesis tests
- Evaluate probability statements involving sample statistics
- Understand sampling variability
- Predict how accurate a sample is for estimating a population
This is why sampling distributions are central to statistical inference.
Practice Question
Question
A population has mean \(\mu = 50\) and standard deviation \(\sigma = 12\). If samples of size \(n = 36\) are repeatedly drawn, what is the standard error?
\[ \text{SE}_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{36}} = \frac{12}{6} = 2 \]
Answer:
The standard error of the sample mean is 2.
Practice Question
Question
If the population distribution is strongly skewed, what happens to the sampling distribution of the sample mean as \(n\) increases?
Answer:
By the Central Limit Theorem:
- The sampling distribution becomes more normal
- The standard error becomes smaller
- Estimates become more stable
In this chapter, you learned the foundational ideas behind sampling distributions:
- Samples vary, so statistics vary.
- The sampling distribution describes that variation.
- The mean of the sampling distribution equals the population mean.
- The standard error shrinks with larger sample size.
- The Central Limit Theorem ensures normality for large \(n\).
Sampling distributions form the backbone of inferential statistics and prepare you for deeper topics such as confidence intervals and hypothesis testing.
2.5 Summary
- A sample is one collection from the population.
- A sample distribution is the distribution of values in that one sample.
- A sampling distribution is the distribution of a statistic across repeated samples.
- The mean of the sampling distribution equals the population mean.
- The standard error quantifies variability of sample means.
\[ \text{SE}_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
3 Confidence Intervals for a Population Mean (Unknown σ)
The Central Limit Theorem (CLT) is one of the most important ideas in all of statistics. It explains why sample means behave predictably, even when the population they come from is irregular, skewed, or unknown. When we take many samples of the same size and compute their means, those sample means form their own distribution — the sampling distribution.
What makes the CLT powerful is that this sampling distribution becomes approximately normal as the sample size increases. This allows us to make probability statements, build confidence intervals, and run hypothesis tests, even when we know very little about the underlying population. In short, the CLT connects real-world data to statistical inference.
Video Reference
3.1 Learning Objectives
- Understand when the t-distribution is used instead of the normal distribution.
- Construct a confidence interval for a population mean when population standard deviation is unknown.
- Interpret confidence intervals correctly in context.
- Identify how sample size affects uncertainty.
3.2 Review: When σ is Unknown
In many real situations, the population standard deviation \(\sigma\) is not known. When this happens, we replace it with the sample standard deviation \(s\). This adds extra uncertainty, so instead of the normal z-distribution we use the t-distribution, which is wider and has heavier tails.
The t-Distribution
The t-distribution depends on the degrees of freedom:
\[ df = n - 1 \]
As the sample size increases, the t-distribution approaches the normal distribution. This is why large samples often allow us to use the z-based CI.
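A quick check of this convergence, comparing t critical values for a 95% interval against the normal critical value:
# 97.5th percentile of the t-distribution for increasing degrees of freedom
qt(0.975, df = c(5, 15, 30, 100, 1000))
# approaches the normal critical value of about 1.96
qnorm(0.975)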
Confidence Interval Formula
When σ is unknown, the CI for the population mean \(\mu\) is:
\[ \bar{x} \pm t_{\alpha/2,\,n-1}\left( \frac{s}{\sqrt{n}} \right) \]
Where:
- \(\bar{x}\) = sample mean
- \(s\) = sample standard deviation
- \(n\) = sample size
- \(t_{\alpha/2,\,n-1}\) = critical value from the t-distribution
- \(\frac{s}{\sqrt{n}}\) = standard error
Example
A sample of 16 students has:
- Mean sleep time: \(\bar{x} = 6.4\) hours
- Standard deviation: \(s = 0.8\) hours
Construct a 95% CI for the population mean.
Step 1: Degrees of freedom \[ df = 16 - 1 = 15 \]
Step 2: Critical value For 95% CI and 15 df:
\[ t_{0.025,15} \approx 2.131 \]
Step 3: Standard error \[ SE = \frac{s}{\sqrt{n}} = \frac{0.8}{\sqrt{16}} = 0.2 \]
Step 4: CI \[ 6.4 \pm 2.131(0.2) \]
Margin of error:
\[ ME = 0.4262 \]
Final CI: \[ (5.97,\; 6.83) \]
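The same calculation carried out in R, a minimal sketch of the worked example above:
xbar <- 6.4                      # sample mean (hours of sleep)
s    <- 0.8                      # sample standard deviation
n    <- 16                       # sample size
t_crit <- qt(0.975, df = n - 1)  # about 2.131 for 15 df
se     <- s / sqrt(n)            # 0.2
me     <- t_crit * se            # margin of error, about 0.426
c(lower = xbar - me, upper = xbar + me)  # approximately (5.97, 6.83)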
Interpreting the Interval
A 95% CI means:
If we repeatedly took samples of size 16 and constructed confidence intervals, about 95% of them would contain the true mean.
It does not mean there is a 95% probability the true mean lies in this specific interval.
3.3 Effect of Sample Size
- Larger \(n\) → smaller standard error
- Smaller standard error → narrower CI
- Wider CIs occur with small \(n\) or noisy data (large \(s\))
The t-distribution becomes closer to the normal distribution as sample size grows.
Summary
- Use t-distribution when σ is unknown.
- Confidence interval formula uses sample standard deviation.
- Degrees of freedom = \(n - 1\).
- Larger sample sizes produce more precise estimates.
3.4 Practice Problem
A random sample of 10 adults has:
- \(\bar{x} = 72.3\) bpm
- \(s = 5.6\) bpm
Construct a 90% CI for the population mean heart rate.
Practice Problem
A researcher measures cortisol levels in 25 patients:
- \(\bar{x} = 14.5\) μg/dL
- \(s = 3.1\) μg/dL
Compute a 99% confidence interval for the population mean.
Chapter Summary
When we repeatedly collect samples of size \(n\) and compute their means, those means form a new distribution. The Central Limit Theorem states that as \(n\) becomes large, this sampling distribution approaches a normal shape, regardless of the population's original distribution. The center of this distribution equals the population mean, and its spread, the standard error, equals \(\sigma / \sqrt{n}\). This result allows statisticians to use normal-based methods to estimate and test population parameters. Even when the population is skewed or irregular, the sampling distribution of the mean becomes predictable and well-behaved. The larger the sample, the more tightly the sample means cluster around the true population mean.
4 Sampling Distribution & Sample Proportions
Many statistical analyses require understanding not just the behavior of individual observations, but how sample-based statistics (like proportions or means) behave when we repeatedly draw samples from a population. This chapter explores the concept of sampling distributions, especially for proportions and the sample mean, and shows how the Central Limit Theorem helps us approximate their behavior under suitable conditions.
Video Reference
4.1 Learning Objective
By the end of this chapter you should be able to:
- Describe what a sampling distribution is.
- Distinguish between the population distribution, a single sample distribution, and a sampling distribution.
- Understand the sampling distribution of a sample proportion and of the sample mean.
- Use formulas for the mean and standard error of sample proportions and sample means.
- Appreciate how larger sample size stabilizes estimates and allows normal approximation.
4.2 Sampling Distribution
When you take a random sample from a population, that sample has its own distribution (values of individuals). But if you imagine repeating the sampling process many times (with the same sample size), and computing a statistic (e.g. proportion or mean) for each sample, then the collection of those sample-statistics forms a sampling distribution.
This distribution models the variation of the statistic due to random sampling, not variation within one sample.
4.3 Proportions
Consider a population in which a certain characteristic occurs with probability \(p\). When we draw a random sample of size \(n\) and compute the sample proportion \(\hat p\) (the fraction of successes in the sample), \(\hat p\) becomes a random variable. Its distribution — across many hypothetical samples — is the sampling distribution of the proportion.
Sample Proportion vs Population Proportion
Key facts about \(\hat p\):
The expected value of \(\hat p\) equals the true population proportion \(p\):
\[ E[\hat p] = p \]
The standard error (standard deviation) of \(\hat p\) is:
\[ \sigma_{\hat p} = \sqrt{ \frac{p(1-p)}{n} } \]
Thus \(\hat p\) is an unbiased estimator of \(p\), and larger \(n\) means smaller variability (more precise estimate).
4.4 Sampling Distribution of the Sample Proportion
Under conditions of sufficiently large sample size (commonly, \(n p \ge 5\) and \(n (1-p) \ge 5\)), the sampling distribution of \(\hat p\) can be approximated by a Normal distribution:
\[ \hat p \approx N\Bigl(p,\; \frac{p(1-p)}{n}\Bigr) \]
This allows us to apply the tools of normal-based probability to proportions (e.g. confidence intervals, hypothesis tests).
set.seed(123)
# Population proportion
p <- 0.6
n <- 50 # sample size
Nsim <- 5000 # number of simulations
# Simulate sampling distribution
phat <- replicate(Nsim, mean(rbinom(n, 1, p)))
# Plot histogram
hist(phat,
breaks = 30,
main = "Sampling Distribution of Sample Proportion (p-hat)",
xlab = "Sample Proportion (p-hat)",
probability = TRUE,
col = "skyblue",
border = "white",
cex.main = 2.5, # bigger title
cex.lab = 2, # bigger axis labels
cex.axis = 1.8) # bigger axis numbers
# Add theoretical Normal curve
x_vals <- seq(min(phat), max(phat), length = 200)
theoretical <- dnorm(x_vals, mean = p, sd = sqrt(p*(1-p)/n))
lines(x_vals, theoretical, lwd = 4, col = "red")
4.5 Central Limit Theorem
The same reasoning extends to other statistics like the sample mean — under repeated sampling, the distribution of sample means tends toward normality as sample size increases, often regardless of the population’s shape.
In the case of proportions (which can be viewed as means of Bernoulli variables), CLT justifies the normal approximation for \(\hat p\) when \(n\) is large.
Chapter Summary
In this chapter we have seen that:
A sampling distribution describes the variability of a statistic (mean, proportion, etc.) across repeated samples.
For the sample proportion \(\hat p\):
\[ E[\hat p] = p,\qquad \sigma_{\hat p} = \sqrt{\frac{p(1-p)}{n}} \]
When \(n\) is large enough, \(\hat p\) is approximately normal.
The same logic — via the CLT — applies to the sample mean: larger \(n\) yields more stable estimates, and normal approximation becomes valid under mild conditions.
Sampling distributions (of proportions or means) are the theoretical foundation for inferential statistics — confidence intervals, hypothesis testing, margin of error — because they quantify the expected variability due to sampling.
5 Review: Sampling Distribution of the Sample Proportion, Binomial Distribution, Probability
Probability plays a foundational role in understanding uncertainty in real-world situations. Before exploring more advanced inferential tools, we must understand how probability rules connect with binomial outcomes and how sample proportions behave across repeated sampling.
This chapter provides a structured review of:
- Simple probability and sample spaces
- The binomial distribution and its formula
- Sampling distribution of the sample proportion
- Visual intuition supported by simulation
These concepts form the bridge toward hypothesis testing and confidence intervals for proportions.
5.1 Learning Objectives
By the end of this chapter, you will be able to:
- Define and interpret sample spaces.
- Apply simple probability rules.
- Use the binomial formula to compute event probabilities.
- Distinguish between population proportion \(p\) and sample proportion \(\hat{p}\).
- Understand the sampling distribution of a sample proportion.
- Apply the Central Limit Theorem for proportions when conditions are met.
5.2 Simple Probability and Sample Spaces
A sample space \(S\) is the set of all possible outcomes of a random experiment.
If all outcomes are equally likely, the probability of an event \(A\) is:
\[ P(A) = \frac{\text{number of outcomes in } A}{\text{number of outcomes in } S} \]
Example:
A die is rolled.
\[ S = \{1,2,3,4,5,6\}, \quad A = \{2,4,6\} \] \[ P(A) = \frac{3}{6} = 0.5 \]
Simple probability describes the likelihood of one event without considering combinations, intersections, or conditional structures.
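This counting rule is easy to verify by enumerating the sample space; a minimal sketch in R:
# Sample space for one roll of a die and the event "an even number is rolled"
S <- 1:6
A <- c(2, 4, 6)
# P(A) = |A| / |S|
length(A) / length(S)   # 0.5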
5.3 Review: The Binomial Distribution Formula
A binomial experiment consists of:
1. A fixed number of trials \(n\)
2. Only two outcomes (success/failure)
3. Constant probability of success \(p\)
4. Independent trials
The probability of getting exactly \(k\) successes:
\[ P(X=k) = {n \choose k} p^{k}(1-p)^{n-k} \]
where:
\[ {n \choose k} = \frac{n!}{k!(n-k)!} \]
The binomial distribution models count-based, discrete outcomes.
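A short sketch verifying the formula against R's built-in dbinom(); the values n = 10, k = 3, and p = 0.3 are illustrative assumptions:
n <- 10
k <- 3
p <- 0.3
# Binomial formula computed directly
choose(n, k) * p^k * (1 - p)^(n - k)
# Same result from the built-in binomial PMF (about 0.267)
dbinom(k, size = n, prob = p)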
5.4 Review : Sampling Distribution of the Sample Proportion
For a population with true proportion \(p\):
\[ \hat{p} = \frac{X}{n} \] where \(X \sim \text{Binomial}(n,p)\).
Thus, \[ E(\hat{p}) = p \]
and the standard error: \[ SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]
When \(np \ge 10\) and \(n(1-p) \ge 10\), the sampling distribution of \(\hat{p}\) is approximately normal:
\[ \hat{p} \sim N\left( p,\; \sqrt{\frac{p(1-p)}{n}} \right) \]
Visualization: Sampling Distribution of the Sample Proportion
library(ggplot2)
set.seed(123)
# Parameters for simulation
p <- 0.3
n <- 40
nsim <- 10000
# Simulate binomial outcomes
successes <- rbinom(nsim, n, p)
prop <- successes / n
df <- data.frame(prop = prop)
ggplot(df, aes(x = prop)) +
geom_histogram(aes(y = after_stat(density)), bins = 30,
color = "black", fill = "skyblue", alpha = 0.6) +
geom_density(color = "red", size = 1) +
labs(
title = "Sampling Distribution of Sample Proportion",
x = "Sample Proportion (\\hat{p})",
y = "Density"
) +
theme_minimal()
Interpretation
The histogram represents the sampling distribution of the sample proportion \(\hat{p} = \frac{X}{n}\) for a binomial process with:
- true population proportion \(p = 0.3\)
- sample size \(n = 40\)
- number of simulations \(10{,}000\)
Key points from the visualization:
- The histogram approximates the probability distribution of \(\hat{p}\).
- The red curve is a kernel density estimate, showing the smooth shape of the distribution.
- The distribution is approximately normal, which reflects the Central Limit Theorem (CLT) since \(np = 12 > 10\) and \(n(1-p) = 28 > 10\).
- The center of the distribution is close to the true value \(p = 0.3\), showing that \(\hat{p}\) is an unbiased estimator of \(p\).
- The spread of the distribution reflects the standard error of \(\hat{p}\):
\[ SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.3 \cdot 0.7}{40}} \approx 0.072 \]
This means most sample proportions fall within:
\[ p \pm 2SE_{\hat{p}} \approx 0.3 \pm 0.14 \]
or roughly between 0.16 and 0.44, which matches the graph.
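A quick numerical check of this standard error and the approximate 2-SE bounds:
p <- 0.3
n <- 40
se_phat <- sqrt(p * (1 - p) / n)                       # about 0.072
c(lower = p - 2 * se_phat, upper = p + 2 * se_phat)    # roughly 0.16 to 0.44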
Combined Summary
All those videos explain the core ideas of probability and sampling in a way that builds from the basics to more advanced concepts. Everything starts with the idea that uncertainty can be measured, and probability gives us a way to quantify how likely an event is. We learn that all possible outcomes form a sample space, and events are just subsets of that space. This foundation helps us understand simple probability, the complement rule, and how multiple events can combine using unions and intersections.
The videos then explain the difference between mutually exclusive events (cannot happen together) and exhaustive events (cover all possible outcomes). Some events overlap, some don’t, and understanding this helps avoid double-counting when calculating probabilities. These ideas feed into the bigger picture of how uncertainty behaves.
We also explore independent vs dependent events. Independent events do not influence each other, like rolling two dice. Dependent events do affect each other, like drawing cards without replacement. This distinction is important when multiplying probabilities, because independent events allow the rule:
\[ P(A \cap B) = P(A)P(B) \]
while dependent events require adjusting probabilities after the first event.
From there, the topics shift toward modeling repeated random processes, especially with the binomial distribution. The binomial model works when each trial has only two outcomes (success/failure), the probability of success is constant, and each trial is independent. The formula:
\[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \]
tells us the probability of getting exactly \(k\) successes out of \(n\) trials.
Next, the videos move into the idea of sampling distributions, which is the backbone of statistical inference. A sample distribution describes data inside one sample, but a sampling distribution describes how a statistic — like a sample mean or sample proportion — behaves across many repeated samples.
For the sample mean, the sampling distribution becomes more normal as sample size increases because of the Central Limit Theorem (CLT). This means:
- the mean of the sampling distribution equals the population mean
- the spread of the sampling distribution (standard error) shrinks as sample size increases
\[ SE_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]
\[ SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]
The CLT explains why histograms of sample means and sample proportions look bell-shaped even if the population itself is not normal — as long as the sample size is reasonably large.
Altogether, the combined lessons show how probability helps describe uncertainty, how repeated random processes can be modeled mathematically, and how sampling distributions allow us to make conclusions about whole populations using just small samples. These ideas form the essential foundation for confidence intervals, hypothesis testing, and almost all of inferential statistics.
Reference
[1] Simple Learning Pro. Introduction to the Probability of Continuous Variables [Video]. YouTube. https://youtu.be/ZyUzRVa6hCM
[2] Simple Learning Pro. Sampling Distribution [Video]. YouTube. https://youtu.be/7S7j75d3GM4
[3] Simple Learning Pro. Central Limit Theorem [Video]. YouTube. https://youtu.be/ivd8wEHnMCg
[4] Simple Learning Pro. Sample Proportion [Video]. YouTube. https://youtu.be/q2e4mK0FTbw
[5] Simple Learning Pro. Review: Sampling Distribution [Video]. YouTube. https://youtu.be/c0mFEL_SWzE
[6] Diez, D. M., Barr, C. D., & Çetinkaya-Rundel, M. (2024). OpenIntro Statistics (5th ed.). OpenIntro. https://www.openintro.org/book/os/
[7] Blitzstein, J., & Hwang, J. (2024). Introduction to Probability (2nd ed.). CRC Press. Free draft: https://projects.iq.harvard.edu/stat110/home
[8] VanderPlas, J. (2022). A Whirlwind Tour of Data Science. O’Reilly Media. https://github.com/jakevdp/WhirlwindTourOfDataScience
[9] Severance, C. (n.d.). Python for everybody. https://www.py4e.com/book
[10] Downey, A. B. (2023). Think Stats: Exploratory Data Analysis in Python (2nd ed.). Green Tea Press. https://greenteapress.com/wp/think-stats-2e/
[11] Shafer, D. S., & Zhang, Z. (2012). Introductory statistics. Saylor Foundation.