Probability Distribution

Week 11

INSTITUT TEKNOLOGI SAINS BANDUNG

IDENTITY CARD

Name : Dhefio Alim Muzakki

Student ID : 52250014

Major : Data Science

Lecturer : Mr. Bakti Siregar, M.Sc., CDS.


library(tidyverse)
library(readr)
library(ggplot2)
library(dplyr)
library(ggridges)
library(knitr)
library(DT)

Introduction

These additional topics extend the foundational ideas of probability into more applied tools that help describe and model real-world uncertainty. Together, they shift the perspective from basic event relationships toward structured random processes and the behavior of discrete probability distributions.

This chapter introduces the logic behind discrete probability models, visual representations of random variables, and the use of probability mass functions to describe how outcomes are distributed. By understanding how probability is allocated across possible values, learners begin to see how more complex statistical tools are built from simple counting principles, event structures, and conditional reasoning.

These videos collectively reinforce how probability can be expressed numerically, graphically, and conceptually — forming a bridge between elementary definitions and more advanced inferential techniques. The discussions and examples also demonstrate how probability models guide real-world interpretation, allowing learners to connect mathematical expressions with meaningful outcomes.

The following set of videos expands the understanding of probability by moving from foundational definitions toward more applied and visual concepts. Each video contributes a specific piece of intuition, helping you connect theory with real-world reasoning. Together, they create a unified learning path that strengthens both conceptual clarity and problem-solving skills.

  • Learn how random variables work and why they are essential tools in probability.
  • Explore variance and standard deviation to understand how outcomes spread around the mean.
  • See how expected value predicts long-term behavior in uncertain situations.
  • Get visual intuition through graphs and demonstrations.
  • Reinforce the ideas through guided examples and real-life interpretations.

1 Continuous Random Variables

This section introduces probability distributions for continuous random variables, which describe the likelihood of values over a continuous range.

Video Reference

Probability Distribution of Continuous Variables

To understand continuous random variables, it is essential to know how probability is represented using a Probability Density Function (PDF). Unlike discrete random variables, a continuous random variable does not assign probability to individual points. Instead, probability is obtained from the area under the PDF curve.

1.0.1 Random Variable

A random variable is continuous if it can take any value within an interval on the real number line.
Examples include: height, time, temperature, age, pressure, and velocity.


Key characteristics:

  • The variable takes values in an interval such as \((a, b)\) or even \((-\infty, +\infty)\).

  • The probability of any single point is always zero:

    \[P(X = x) = 0\]

  • Probabilities are meaningful only over intervals:

    \[ P(a \le X \le b) = \int_{a}^{b} f(x)\, dx \]


1.1 Probability Density Function

A function \(f(x)\) is a valid Probability Density Function (PDF) if it satisfies:

Non-negativity

\[ f(x) \ge 0 \quad \forall x \]

Total Area Equals One

\[ \int_{-\infty}^{\infty} f(x)\, dx = 1 \]

Interpretation

  • Larger values of \(f(x)\) indicate higher probability density around that value.
  • However, \(f(x)\) is not a probability; probabilities come from the area under the curve.

1.1.1 Example PDF:

\(f(x) = 3x^2\) on \([0, 1]\)

Consider the probability density function:

\[ f(x) = 3x^2, \quad 0 \le x \le 1 \]

Validation:

\[ \int_{0}^{1} 3x^2 \, dx = \left[x^3\right]_0^1 = 1 \]


1.1.2 Probability on an Interval

To compute probability within an interval:

\[ P(a \le X \le b) = \int_{a}^{b} 3x^2 \, dx \]

Example:

\[ P(0.5 \le X \le 1) = \int_{0.5}^{1} 3x^2 \, dx = \left[x^3\right]_{0.5}^{1} = 1 - 0.125 = 0.875 \]
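These integrals can be checked numerically with base R's integrate() (a minimal sketch; the function f below simply encodes the example PDF):

# Example PDF: f(x) = 3x^2 on [0, 1]
f <- function(x) 3 * x^2

integrate(f, lower = 0, upper = 1)    # total area = 1, so f is a valid PDF
integrate(f, lower = 0.5, upper = 1)  # P(0.5 <= X <= 1) = 0.875

The simulation and plot below, by contrast, preview the sampling distribution of a sample proportion, a topic developed in detail in the later chapters.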

library(ggplot2)
library(dplyr)

set.seed(123)

# PARAMETERS
p <- 0.3
n <- 30
nsim <- 10000

# SIMULATION
successes <- rbinom(nsim, n, p)
prop <- successes / n

df <- data.frame(prop)

# THEORETICAL VALUES
mu <- p
se <- sqrt(p * (1 - p) / n)

# Plot: histogram of simulated proportions with density, mean, and ±1 SE lines
ggplot(df, aes(x = prop)) +
  geom_histogram(aes(y = after_stat(density)),
                 bins = 35,
                 fill = "#4F81BD",
                 color = "white",
                 alpha = 0.85) +
  geom_density(color = "#C0504D", linewidth = 1.2) +
  geom_vline(xintercept = mu, color = "#9BBB59", linewidth = 1.3, linetype = "solid") +
  geom_vline(xintercept = mu + se, color = "#8064A2", linewidth = 1, linetype = "dashed") +
  geom_vline(xintercept = mu - se, color = "#8064A2", linewidth = 1, linetype = "dashed") +
  annotate("text",
           x = mu, y = 12,
           label = "Mean = p",
           color = "#9BBB59", size = 4, fontface = "bold", vjust = -1) +
  annotate("text",
           x = mu + se, y = 10,
           label = "+1 SE",
           color = "#8064A2", size = 4) +
  annotate("text",
           x = mu - se, y = 10,
           label = "-1 SE",
           color = "#8064A2", size = 4) +
  labs(
    title = "Sampling Distribution of the Sample Proportion",
    subtitle = paste0("True Proportion p = ", p, ", Sample Size n = ", n,
                      ", Simulations = ", nsim),
    x = "Sample Proportion  (\\hat{p})",
    y = "Density",
    caption = "Visualization of the sampling distribution based on repeated sampling"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", size = 18),
    plot.subtitle = element_text(color = "gray40"),
    panel.grid.minor = element_blank()
  )

Interpretation

The graph displays the empirical sampling distribution of \(\hat{p}\) produced by 10,000 repeated samples of size \(n = 30\) from a population with true proportion \(p = 0.3\).

Key Interpretation Points

Center at the True Proportion

The solid green vertical line shows the theoretical mean \(p = 0.3\).
The density curve peaks around this value, confirming:

\[ E(\hat{p}) = p \]

This indicates that the estimator \(\hat{p}\) is unbiased.


Spread Represented by Standard Error

The two dashed purple lines show:

\[ p \pm SE = p \pm \sqrt{\frac{p(1-p)}{n}} \]

This region contains about 68% of sample proportions under normality.
The histogram aligns well with these bounds, illustrating correct variability.


Shape: Approaching Normality

Even with a moderate sample size \(n = 30\), the histogram approximates a bell-shaped curve.
This confirms the Central Limit Theorem (CLT) for proportions:

\[ \hat{p} \text{ is approximately normal for moderate } n \]


Simulation Validates Theory

The density curve (red) aligns with theoretical expectations:

  • Unbiased center
  • Variability predicted by standard error
  • Approximate normality

Thus, the simulation reinforces the theoretical sampling distribution.


1.1.3 Cumulative Distribution Function (CDF)

The Cumulative Distribution Function (CDF) of a continuous random variable \(X\) is defined as:

\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\, dt \]

For the example PDF \(f(x) = 3x^2\) on \([0, 1]\):

\[ F(x) = \int_{0}^{x} 3t^{2}\, dt = x^{3}, \quad 0 \le x \le 1 \]

1.1.4 Relationship Between PDF and CDF

The Probability Density Function (PDF) is the derivative of the CDF:

\[ f(x) = F'(x) \]
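For the running example, \(F(x) = x^{3}\) on \([0, 1]\), and a simple finite-difference check (an illustrative sketch, not from the video) recovers the PDF \(f(x) = 3x^{2}\):

# CDF and PDF of the example distribution
Fx <- function(x) x^3
fx <- function(x) 3 * x^2

x0 <- c(0.2, 0.5, 0.9)
h  <- 1e-6
(Fx(x0 + h) - Fx(x0 - h)) / (2 * h)  # numerical derivative of the CDF
fx(x0)                               # matches the exact PDF values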


2 Sampling Distributions

Sampling distributions form a core idea in inferential statistics. While populations contain all possible individuals and samples represent only a slice of them, the sampling distribution connects these two worlds by describing how a statistic behaves across many repeated samples. This chapter builds a clear and intuitive picture of how sample means vary, why larger samples produce more stable estimates, and how this leads to statistical inference.


Video Reference

The video introduces the goal of understanding what a sampling distribution is, how it differs from a sample distribution, and why the sample mean when repeated over many samples follows a predictable pattern. The fundamental objective is to explain how uncertainty in estimates arises and how it decreases as sample size increases.


Introduction

In statistics, we rarely observe an entire population. Instead, we collect samples.
However, each sample is different — which means sample statistics (like the sample mean) vary.

This chapter explains how sample statistics behave across repeated samples, forming what we call a sampling distribution. Understanding this concept is essential for confidence intervals, hypothesis testing, and nearly all inferential methods.

2.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Understand what a sample is and how it differs from a population.
  • Distinguish sample distributions from sampling distributions.
  • Explain the sampling distribution of the sample mean.
  • Identify differences between population and sampling distributions.
  • Apply basic reasoning about sampling distributions to real data.
  • Solve related practice problems.

Review of Samples

A sample is a subset of individuals selected from a population.

If the population is:

  • All students at a university
  • All manufactured items in a factory
  • All possible rolls of a fair die

then a sample is a smaller collection drawn to represent the whole.

A sample statistic such as the sample mean is:

\[ \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \]

But different samples produce different values of \(\bar{X}\).
This variability leads to sampling distributions.

2.2 Sample Distribution vs Sampling Distribution

  • Sample Distribution: describes the values within one sample.

Example:
If your sample of test scores is:

\[ \{70, 75, 80, 90, 85\} \]

That list itself forms the sample distribution.

  • Sampling Distribution: describes the distribution of a statistic across many repeated samples of the same size.

If we repeatedly draw samples of size \(n = 5\) and compute each sample mean:

\[ \bar{X}_1, \bar{X}_2, \bar{X}_3, \dots \]

The distribution of those means is the sampling distribution of the sample mean.

Sampling Distribution of the Sample Mean

The sample mean has predictable behavior:

  • Mean of sampling distribution:

\[ \mu_{\bar{X}} = \mu \]

  • Standard Error (SE) of the mean:

\[ \text{SE}_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]

This formula shows an important concept:

  • Larger sample sizes → smaller standard error
  • Larger samples produce more stable, less variable sample means

Even if the population is skewed, the sampling distribution of the mean becomes more normal when \(n\) increases (Central Limit Theorem).
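The short simulation below (a sketch added for illustration, using an exponential population chosen so that \(\mu = \sigma = 1\)) shows both effects: the sample means center on the population mean, and their spread tracks \(\sigma/\sqrt{n}\).

set.seed(42)

# Sample means from a skewed population: Exponential(rate = 1), so mu = sigma = 1
sim_means <- function(n, nsim = 5000) {
  replicate(nsim, mean(rexp(n, rate = 1)))
}

for (n in c(5, 30, 100)) {
  m <- sim_means(n)
  cat("n =", n,
      "| mean of sample means:", round(mean(m), 3),
      "| empirical SE:", round(sd(m), 3),
      "| theoretical sigma/sqrt(n):", round(1 / sqrt(n), 3), "\n")
}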

2.3 Population Distribution vs Sampling Distribution

Concept | Population Distribution | Sampling Distribution
--------|-------------------------|----------------------
What it describes | Individuals | Sample statistics
Measured values | \(X\) | \(\bar{X}\)
Shape | Any shape | Approaches normal as \(n\) increases
Variability | Population SD \(\sigma\) | Standard Error \(\sigma / \sqrt{n}\)

The key difference:
Sampling distributions describe how statistics vary across repeated samples, not how individuals vary.

2.4 Uses of Sampling Distributions

Sampling distributions allow us to:

  • Construct confidence intervals
  • Perform hypothesis tests
  • Evaluate probability statements involving sample statistics
  • Understand sampling variability
  • Predict how accurate a sample is for estimating a population

This is why sampling distributions are central to statistical inference.


Practice Question

Question
A population has mean \(\mu = 50\) and standard deviation \(\sigma = 12\).
If samples of size \(n = 36\) are repeatedly drawn, what is the standard error?

\[ \text{SE}_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{12}{\sqrt{36}} = \frac{12}{6} = 2 \]

Answer:
The standard error of the sample mean is 2.

Practice Question

Question
If the population distribution is strongly skewed, what happens to the sampling distribution of the sample mean as \(n\) increases?

Answer:
By the Central Limit Theorem:

  • The sampling distribution becomes more normal
  • The standard error becomes smaller
  • Estimates become more stable

In this chapter, you learned the foundational ideas behind sampling distributions:

  • Samples vary, so statistics vary.
  • The sampling distribution describes that variation.
  • The mean of the sampling distribution equals the population mean.
  • The standard error shrinks with larger sample size.
  • The Central Limit Theorem ensures normality for large \(n\).

Sampling distributions form the backbone of inferential statistics and prepare you for deeper topics such as confidence intervals and hypothesis testing.

2.5 Summary

  • A sample is one collection from the population.
  • A sample distribution is the distribution of values in that one sample.
  • A sampling distribution is the distribution of a statistic across repeated samples.
  • The mean of the sampling distribution equals the population mean.
  • The standard error quantifies variability of sample means.

\[ \text{SE}_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]


3 Confidence Intervals for a Population Mean (Unknown σ)

The Central Limit Theorem (CLT) is one of the most important ideas in all of statistics. It explains why sample means behave predictably, even when the population they come from is irregular, skewed, or unknown. When we take many samples of the same size and compute their means, those sample means form their own distribution — the sampling distribution.

What makes the CLT powerful is that this sampling distribution becomes approximately normal as the sample size increases. This allows us to make probability statements, build confidence intervals, and run hypothesis tests, even when we know very little about the underlying population. In short, the CLT connects real-world data to statistical inference.

Video Reference

3.1 Learning Objectives

  • Understand when the t-distribution is used instead of the normal distribution.
  • Construct a confidence interval for a population mean when population standard deviation is unknown.
  • Interpret confidence intervals correctly in context.
  • Identify how sample size affects uncertainty.

3.2 Review: When σ is Unknown

In many real situations, the population standard deviation \(\sigma\) is not known.
When this happens, we replace it with the sample standard deviation \(s\).

This adds extra uncertainty, so instead of the normal z-distribution we use the t-distribution, which is wider and has heavier tails.

The t-Distribution

The t-distribution depends on the degrees of freedom:

\[ df = n - 1 \]

As the sample size increases, the t-distribution approaches the normal distribution.
This is why large samples often allow us to use the z-based CI.
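A quick check in R (a small sketch, not from the source video) makes this convergence concrete:

# t critical values shrink toward the z critical value as df grows
qt(0.975, df = c(5, 15, 30, 100))   # 2.571 2.131 2.042 1.984
qnorm(0.975)                        # 1.960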

Confidence Interval Formula

When σ is unknown, the CI for the population mean \(\mu\) is:

\[ \bar{x} \pm t_{\alpha/2,\,n-1}\left( \frac{s}{\sqrt{n}} \right) \]

Where:

  • \(\bar{x}\) = sample mean
  • \(s\) = sample standard deviation
  • \(n\) = sample size
  • \(t_{\alpha/2, n-1}\) = critical value from t-distribution
  • \(\frac{s}{\sqrt{n}}\) = standard error

Example

A sample of 16 students has:

  • Mean sleep time: \(\bar{x} = 6.4\) hours
  • Standard deviation: \(s = 0.8\) hours

Construct a 95% CI for the population mean.

Step 1: Degrees of freedom \[ df = 16 - 1 = 15 \]

Step 2: Critical value. For a 95% CI with 15 df:

\[ t_{0.025,15} \approx 2.131 \]

Step 3: Standard error \[ SE = \frac{s}{\sqrt{n}} = \frac{0.8}{\sqrt{16}} = 0.2 \]

Step 4: CI \[ 6.4 \pm 2.131(0.2) \]

Margin of error:

\[ ME = 0.4262 \]

Final CI: \[ (5.97,\; 6.83) \]
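The same computation can be reproduced in R with qt() (a minimal sketch of the steps above):

# Verify the worked example: 95% t-based CI for mean sleep time
xbar <- 6.4; s <- 0.8; n <- 16
t_crit <- qt(0.975, df = n - 1)   # ~ 2.131
se <- s / sqrt(n)                 # 0.2
xbar + c(-1, 1) * t_crit * se     # approximately (5.97, 6.83)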


Interpreting the Interval

A 95% CI means:

If we repeatedly took samples of size 16 and constructed confidence intervals, about 95% of them would contain the true mean.

It does not mean there is a 95% probability the true mean lies in this specific interval.

3.3 Effect of Sample Size

  • Larger \(n\) → smaller standard error
  • Smaller standard error → narrower CI
  • Wider CIs occur with small \(n\) or noisy data (large \(s\))

The t-distribution becomes closer to the normal distribution as sample size grows.

Summary

  • Use t-distribution when σ is unknown.
  • Confidence interval formula uses sample standard deviation.
  • Degrees of freedom = \(n - 1\).
  • Larger sample sizes produce more precise estimates.

3.4 Practice Problem

A random sample of 10 adults has:

  • \(\bar{x} = 72.3\) bpm
  • \(s = 5.6\) bpm

Construct a 90% CI for the population mean heart rate.


Practice Problem

A researcher measures cortisol levels in 25 patients:

  • \(\bar{x} = 14.5\) μg/dL
  • \(s = 3.1\) μg/dL

Compute a 99% confidence interval for the population mean.
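Both practice problems can be checked with a small helper function (a sketch; the name t_ci is introduced here for illustration and is not from the source):

t_ci <- function(xbar, s, n, conf = 0.95) {
  # t-based confidence interval for a population mean with unknown sigma
  se <- s / sqrt(n)
  t_crit <- qt(1 - (1 - conf) / 2, df = n - 1)
  xbar + c(-1, 1) * t_crit * se
}

t_ci(72.3, 5.6, 10, conf = 0.90)   # 90% CI for mean heart rate
t_ci(14.5, 3.1, 25, conf = 0.99)   # 99% CI for mean cortisol level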

Chapter Summary

When we repeatedly collect samples of size \(n\) and compute their means, those means form a new distribution. The Central Limit Theorem states that as \(n\) becomes large, this sampling distribution approaches a normal shape, regardless of the population's original distribution. The center of this distribution equals the population mean, and its spread (the standard error) equals \(\sigma / \sqrt{n}\). This result allows statisticians to use normal-based methods to estimate and test population parameters. Even when the population is skewed or irregular, the sampling distribution of the mean becomes predictable and well-behaved. The larger the sample, the more tightly the sample means cluster around the true population mean.


4 Sampling Distribution & Sample Proportions

Many statistical analyses require understanding not just the behavior of individual observations, but how sample-based statistics (like proportions or means) behave when we repeatedly draw samples from a population. This chapter explores the concept of sampling distributions, especially for proportions and the sample mean, and shows how the Central Limit Theorem helps us approximate their behavior under suitable conditions.


Video Reference


4.1 Learning Objectives

By the end of this chapter you should be able to:

  • Describe what a sampling distribution is.
  • Distinguish between population distribution, a single sample distribution, and a sampling distribution.
  • Understand the sampling distributions of the sample proportion and the sample mean.
  • Use formulas for the mean and standard error of sample proportions and sample means.
  • Appreciate how larger sample size stabilizes estimates and allows normal approximation.

4.2 Sampling Distribution

When you take a random sample from a population, that sample has its own distribution (values of individuals). But if you imagine repeating the sampling process many times (with the same sample size), and computing a statistic (e.g. proportion or mean) for each sample, then the collection of those sample-statistics forms a sampling distribution.

This distribution models the variation of the statistic due to random sampling, not variation within one sample.


4.3 Proportions

Consider a population in which a certain characteristic occurs with probability \(p\). When we draw a random sample of size \(n\) and compute the sample proportion \(\hat p\) (the fraction of successes in the sample), \(\hat p\) becomes a random variable. Its distribution — across many hypothetical samples — is the sampling distribution of the proportion.

Sample Proportion vs Population Proportion

Key facts about \(\hat p\):

  • The expected value of \(\hat p\) equals the true population proportion \(p\):
    \[ E[\hat p] = p \]

  • The standard error (standard deviation) of \(\hat p\) is:
    \[ \sigma_{\hat p} = \sqrt{ \frac{p(1-p)}{n} } \]

Thus \(\hat p\) is an unbiased estimator of \(p\), and larger \(n\) means smaller variability (more precise estimate).


4.4 Sampling Distribution of the Sample Proportion

Under conditions of sufficiently large sample size (commonly, \(n p \ge 5\) and \(n (1-p) \ge 5\)), the sampling distribution of \(\hat p\) can be approximated by a Normal distribution:

\[ \hat p \approx N\Bigl(p,\; \frac{p(1-p)}{n}\Bigr) \]

This allows us to apply the tools of normal-based probability to proportions (e.g. confidence intervals, hypothesis tests).

set.seed(123)

# Population proportion
p <- 0.6
n <- 50             # sample size
Nsim <- 5000        # number of simulations

# Simulate sampling distribution
phat <- replicate(Nsim, mean(rbinom(n, 1, p)))

# Plot histogram
hist(phat,
     breaks = 30,
     main = "Sampling Distribution of Sample Proportion (p-hat)",
     xlab = "Sample Proportion (p-hat)",
     probability = TRUE,
     col = "skyblue",
     border = "white",
     cex.main = 2.5,   # bigger title
     cex.lab = 2,      # bigger axis labels
     cex.axis = 1.8)   # bigger axis numbers

# Add theoretical Normal curve
x_vals <- seq(min(phat), max(phat), length = 200)
theoretical <- dnorm(x_vals, mean = p, sd = sqrt(p*(1-p)/n))
lines(x_vals, theoretical, lwd = 4, col = "red")

4.5 Central Limit Theorem

The same reasoning extends to other statistics like the sample mean — under repeated sampling, the distribution of sample means tends toward normality as sample size increases, often regardless of the population’s shape.

In the case of proportions (which can be viewed as means of Bernoulli variables), CLT justifies the normal approximation for \(\hat p\) when \(n\) is large.


Chapter Summary

In this chapter we have seen that:

  • A sampling distribution describes the variability of a statistic (mean, proportion, etc.) across repeated samples.

  • For the sample proportion \(\hat p\):
    \[ E[\hat p] = p,\qquad \sigma_{\hat p} = \sqrt{\frac{p(1-p)}{n}} \]

    When \(n\) is large enough, \(\hat p\) is approximately normal.

  • The same logic — via the CLT — applies to the sample mean: larger \(n\) yields more stable estimates, and normal approximation becomes valid under mild conditions.

  • Sampling distributions (of proportions or means) are the theoretical foundation for inferential statistics — confidence intervals, hypothesis testing, margin of error — because they quantify the expected variability due to sampling.


5 Review: Sampling Distribution of the Sample Proportion, Binomial Distribution, Probability

Probability plays a foundational role in understanding uncertainty in real-world situations. Before exploring more advanced inferential tools, we must understand how probability rules connect with binomial outcomes and how sample proportions behave across repeated sampling.

Video Reference

This chapter provides a structured review of:

  • Simple probability and sample spaces

  • The binomial distribution and its formula

  • Sampling distribution of the sample proportion

  • Visual intuition supported by simulation

These concepts form the bridge toward hypothesis testing and confidence intervals for proportions.


5.1 Learning Objectives

By the end of this chapter, you will be able to:

  • Define and interpret sample spaces.
  • Apply simple probability rules.
  • Use the binomial formula to compute event probabilities.
  • Distinguish between population proportion \(p\) and sample proportion \(\hat{p}\).
  • Understand the sampling distribution of a sample proportion.
  • Apply the Central Limit Theorem for proportions when conditions are met.

5.2 Simple Probability and Sample Spaces

A sample space \(S\) is the set of all possible outcomes of a random experiment.
If all outcomes are equally likely, the probability of an event \(A\) is:

\[ P(A) = \frac{\text{number of outcomes in } A}{\text{number of outcomes in } S} \]

Example:
A die is rolled.

\[ S = \{1,2,3,4,5,6\}, \quad A = \{2,4,6\} \] \[ P(A) = \frac{3}{6} = 0.5 \]

Simple probability describes the likelihood of one event without considering combinations, intersections, or conditional structures.
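A quick simulation (an illustrative sketch, not part of the video) agrees with the counting argument:

set.seed(1)

# Estimate P(even) for a fair die by simulation; the estimate should approach 3/6 = 0.5
rolls <- sample(1:6, size = 100000, replace = TRUE)
mean(rolls %% 2 == 0)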


5.3 Review: The Binomial Distribution Formula

A binomial experiment consists of:

1. A fixed number of trials \(n\)
2. Only two outcomes per trial (success/failure)
3. A constant probability of success \(p\)
4. Independent trials

The probability of getting exactly \(k\) successes:

\[ P(X=k) = {n \choose k} p^{k}(1-p)^{n-k} \]

where:

\[ {n \choose k} = \frac{n!}{k!(n-k)!} \]

The binomial distribution models count-based, discrete outcomes.
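In R, the formula can be compared directly with the built-in dbinom() (a short sketch using illustrative values n = 10, p = 0.3, k = 4):

# Manual binomial formula vs. R's dbinom()
n <- 10; p <- 0.3; k <- 4
choose(n, k) * p^k * (1 - p)^(n - k)   # ~0.2001
dbinom(k, size = n, prob = p)          # same value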


5.4 Review: Sampling Distribution of the Sample Proportion

For a population with true proportion \(p\):

\[ \hat{p} = \frac{X}{n} \] where \(X \sim \text{Binomial}(n,p)\).

Thus, \[ E(\hat{p}) = p \]

and the standard error: \[ SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]

When \(np \ge 10\) and \(n(1-p) \ge 10\),
the sampling distribution of \(\hat{p}\) is approximately normal: \[ \hat{p} \sim N\left( p,\; \sqrt{\frac{p(1-p)}{n}} \right) \]


Visualization: Sampling Distribution of the Sample Proportion

library(ggplot2)

set.seed(123)

# Parameters for simulation
p <- 0.3
n <- 40
nsim <- 10000

# Simulate binomial outcomes
successes <- rbinom(nsim, n, p)
prop <- successes / n

df <- data.frame(prop = prop)

ggplot(df, aes(x = prop)) +
  geom_histogram(aes(y = after_stat(density)), bins = 30,
                 color = "black", fill = "skyblue", alpha = 0.6) +
  geom_density(color = "red", linewidth = 1) +
  labs(
    title = "Sampling Distribution of Sample Proportion",
    x = expression("Sample Proportion " * hat(p)),
    y = "Density"
  ) +
  theme_minimal()

Interpretation

The histogram represents the sampling distribution of the sample proportion
\(\hat{p} = \frac{X}{n}\) for a binomial process with:

  • true population proportion \(p = 0.3\)
  • sample size \(n = 40\)
  • number of simulations \(10{,}000\)

Key points from the visualization:

  • The histogram approximates the probability distribution of \(\hat{p}\).
  • The red curve is a kernel density estimate, showing the smooth shape of the distribution.
  • The distribution is approximately normal, which reflects the Central Limit Theorem (CLT) since \(np = 12 > 10\) and \(n(1-p) = 28 > 10\).
  • The center of the distribution is close to the true value \(p = 0.3\), showing that \(\hat{p}\) is an unbiased estimator of \(p\).
  • The spread of the distribution reflects the standard error of \(\hat{p}\):

\[ SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.3 \cdot 0.7}{40}} \approx 0.072 \]

This means most sample proportions fall within:

\[ p \pm 2SE_{\hat{p}} \approx 0.3 \pm 0.14 \]


or roughly between 0.16 and 0.44, which matches the graph.
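Under the normal approximation, the probability mass inside this \(\pm 2\,SE\) band can be computed directly (a quick check added here for illustration):

# Normal-approximation probability that p-hat falls within p ± 2 SE
p <- 0.3; n <- 40
se <- sqrt(p * (1 - p) / n)
pnorm(p + 2 * se, mean = p, sd = se) - pnorm(p - 2 * se, mean = p, sd = se)  # ~0.954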

Combined Summary

Taken together, these videos explain the core ideas of probability and sampling in a way that builds from the basics to more advanced concepts. Everything starts with the idea that uncertainty can be measured, and probability gives us a way to quantify how likely an event is. We learn that all possible outcomes form a sample space, and events are just subsets of that space. This foundation helps us understand simple probability, the complement rule, and how multiple events can combine using unions and intersections.

The videos then explain the difference between mutually exclusive events (cannot happen together) and exhaustive events (cover all possible outcomes). Some events overlap, some don’t, and understanding this helps avoid double-counting when calculating probabilities. These ideas feed into the bigger picture of how uncertainty behaves.

We also explore independent vs dependent events. Independent events do not influence each other — like rolling two dice. Dependent events do affect each other — like drawing cards without replacement. This distinction is important when multiplying probabilities, because independent events allow the rule:

\[ P(A \cap B) = P(A)P(B) \]

while dependent events require adjusting probabilities after the first event.

From there, the topics shift toward modeling repeated random processes, especially with the binomial distribution. The binomial model works when each trial has only two outcomes (success/failure), the probability of success is constant, and each trial is independent. The formula:

\[ P(X = k) = \binom{n}{k} p^k (1-p)^{n-k} \]

tells us the probability of getting exactly \(k\) successes out of \(n\) trials.

Next, the videos move into the idea of sampling distributions, which is the backbone of statistical inference. A sample distribution describes data inside one sample, but a sampling distribution describes how a statistic — like a sample mean or sample proportion — behaves across many repeated samples.

For the sample mean, the sampling distribution becomes more normal as sample size increases because of the Central Limit Theorem (CLT). This means:

  • the mean of the sampling distribution equals the population mean
  • the spread of the sampling distribution (standard error) shrinks as sample size increases

The standard error of the sample mean is:

\[ SE_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \]

For the sample proportion, the same logic applies. Proportions also have a sampling distribution with mean \(p\) and standard error:

\[ SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} \]

The CLT explains why histograms of sample means and sample proportions look bell-shaped even if the population itself is not normal — as long as the sample size is reasonably large.

Altogether, the combined lessons show how probability helps describe uncertainty, how repeated random processes can be modeled mathematically, and how sampling distributions allow us to make conclusions about whole populations using just small samples. These ideas form the essential foundation for confidence intervals, hypothesis testing, and almost all of inferential statistics.


References

[1] Simple Learning Pro. Introduction to the Probability of Continuous Variables [Video]. YouTube. https://youtu.be/ZyUzRVa6hCM

[2] Simple Learning Pro. Sampling Distribution [Video]. YouTube. https://youtu.be/7S7j75d3GM4

[3] Simple Learning Pro. Central Limit Theorem [Video]. YouTube. https://youtu.be/ivd8wEHnMCg

[4] Simple Learning Pro. Sample Proportion [Video]. YouTube. https://youtu.be/q2e4mK0FTbw

[5] Simple Learning Pro. Review: Sampling Distribution [Video]. YouTube. https://youtu.be/c0mFEL_SWzE

[6] Diez, D. M., Barr, C. D., & Çetinkaya-Rundel, M. (2024). OpenIntro Statistics (5th ed.). OpenIntro. https://www.openintro.org/book/os/

[7] Blitzstein, J., & Hwang, J. (2024). Introduction to Probability (2nd ed.). CRC Press. Free draft: https://projects.iq.harvard.edu/stat110/home

[8] VanderPlas, J. (2022). A Whirlwind Tour of Data Science. O’Reilly Media. https://github.com/jakevdp/WhirlwindTourOfDataScience

[9] Severance, C. (n.d.). Python for everybody. https://www.py4e.com/book

[10] Downey, A. B. (2023). Think Stats: Exploratory Data Analysis in Python (2nd ed.). Green Tea Press. https://greenteapress.com/wp/think-stats-2e/

[11] Shafer, D. S., & Zhang, Z. (2012). Introductory statistics. Saylor Foundation.