Week 11 Assignment ~ Probability Distribution

1 . Introduction

This introduction presents one of the most foundational concepts in statistical analysis: Probability Distributions.

Statistics is fundamentally the science of making confident decisions amid uncertainty. When an outcome varies, such as the number of “Heads” we get after tossing a coin \(N\) times, we cannot rely on a single, isolated value. This is where the Probability Distribution steps in as the primary analytical tool. A Probability Distribution is the “master map” that describes the likelihood of every possible outcome.

The Coin Toss Example: If you toss a coin 10 times, the possible outcomes (Random Variable \(X\) = Number of Heads) range from 0 to 10. The Probability Distribution tells you:

• What is the chance of getting exactly 5 Heads?

• What is the chance of getting at least 8 Heads? (\(P(X \ge 8)\))

Its function goes far beyond simple prediction. Crucially, it forms the core analytical basis for virtually all inferential statistical methods we use to:

  1. Quantify Risk and Uncertainty: It measures the expected variation or spread of results around the most likely value (e.g., the expected average of 5 Heads).

  2. Drive Inference: It allows us to draw valid, responsible conclusions from sample data, such as determining whether a deviation (like getting 9 Heads) is unusual enough to conclude the coin is biased.

The central concept is the Random Variable (\(X\)), which numerically represents the experimental outcome. The Distribution then assigns a probability to each possible value.

The Critical Insight: Understanding Distribution Properties

To make accurate and robust predictions, we must understand the characteristics of the distribution we are dealing with:

• Shape: Is the distribution symmetric (like a fair coin’s outcome), skewed, or uniform?

• Spread (Variability): How far are the values likely to scatter from the mean?

• Center (Location): What is the most expected or average value?

Without this understanding of a distribution’s properties, our probability calculations will be flawed, and our predictive models will be unreliable. In essence, Probability Distributions are the language that translates raw data into quantifiable, actionable probabilities.

The Four Pillars of Statistical Inference

Having established the role of Probability Distributions, this report covers four essential topics that constitute the building blocks of inferential statistics, the field that allows us to move from simply describing data to drawing general conclusions:

  1. Continuous Random Variables

We shift our focus to outcomes that can take any value within a given range (e.g., the time it takes to get a Head, weight, or temperature). This necessitates moving from calculating the probability of a specific point to calculating the probability as the area under a curve, with special emphasis on the paramount Normal Distribution.

  2. Sampling Distributions

This is the hinge concept, explaining how sample statistics behave (e.g., how the proportion of Heads \(\hat{p}\) varies across many different sets of 10 coin tosses). This distribution is the critical link that connects the small sample we observe to the large population we seek to understand.

  3. The Central Limit Theorem (CLT)

The CLT is often described as one of the most important results in statistics. It states that, regardless of the shape of the original population distribution, the distribution of sample averages (the Sampling Distribution) tends toward a Normal Distribution, provided the sample size (\(n\)) is large enough. This allows us to apply Normal probability calculations to a wide range of real-world analyses.

  4. Sample Proportion Distributions

This topic specifically addresses categorical or binary outcomes (Heads/Tails, Yes/No, Success/Failure). We will study the probability distribution of the proportion of successes (\(\hat{p}\)) found in a sample, a vital skill for survey data analysis and percentage-based hypothesis testing.

The Introductory Takeaway

Mastering these four concepts, from continuous variables to the CLT, will equip you to analyze data deeply, construct sound statistical models, and draw robust conclusions grounded in probabilistic principles. This foundation supports the move from simply consuming data to making data-driven decisions.

2 . Continuous Random Variables

2.1 . Summary of the video:

2.1.1 . Discrete Variables: Data Obtained by Counting

A discrete variable represents data whose possible values are countable, typically a finite set of possibilities. The key characteristic of this type of data is that it is generated through the process of counting. Consequently, discrete variables can only take on certain, isolated values, leaving no possibility for any value in between the defined steps.

For example, when considering the number of children in a family, the outcomes are restricted to integers like 0, 1, 2, or 3. The video emphasizes that asking for a fraction of a child is nonsensical in this context. Similarly, the score on a test is discrete because the possible outcomes are limited to a finite set of numbers. Even in cases involving currency, where decimals (cents) are present, the variable remains discrete because the precision is limited to a countable number of decimal places, making the set of values finite.

In terms of visualization, discrete data is represented by a Bar Chart, where the gaps between the bars clearly signify the separation between the countable values.

2.1.2 . Continuous Variables: Data Obtained by Measuring

In contrast, a continuous variable is data that can take on any numerical value within a given range. The fundamental source of continuous data is measurement (measuring), not counting.

This reliance on measurement means the number of possible outcomes within any range is infinite and uncountable. The defining trait of continuous data is that you can always achieve greater precision by adding more decimal places.

The video uses age as a prime example. While one might state an age of 23 years, precise measurement can extend this value to 23 years, 6 months, 2 days, 3 seconds, and down to milliseconds or nanoseconds, theoretically without end. The same concept applies to weight. Since measurement can be refined infinitely, all values within the range are theoretically possible.

Continuous variables are typically visualized using a Histogram or a Density Curve. The absence of gaps between the bars in a histogram illustrates the continuity of the data, reflecting the infinite possibilities of measurement within the observed range.

2.1.3 . Probability Formulas and Concepts

The difference between counting and measuring fundamentally changes how we calculate probability:

For Discrete Variables

Focus: Finding the probability of a single, exact value (\(P(X = x)\)).

Calculation: The probability of an event is found by summation, that is, by adding up the probabilities of all relevant outcomes.

For Continuous Variables

Focus: Probability is calculated for a range of values (\(P(a < X < b)\)), not a single point.

Calculation: Probability is equal to the area under the density curve, which is computed by integration.

Key Fact: The probability of a single, exact value in a continuous distribution (e.g., \(P(X = 170.000...)\)) is always zero.

Major Example: The Normal Distribution is the most common example of a density curve used for continuous variables.

2.1.4 . Visualization

A. Discrete Data Visualization (Bar Chart with Gaps)

This code demonstrates a discrete variable (e.g., the probability of getting a certain number of heads from four coin flips), emphasizing the gaps between bars.

B. Continuous Data Visualization (Density Curve/Histogram)

This code demonstrates a continuous variable (like height, following a Normal Distribution), showing the smooth curve and connected histogram bars.

C. Conceptual Formula in R (Area Under the Curve)

The pnorm() function in R calculates the area under the Normal Distribution curve to the left of a given value (the cumulative probability), which is the method used for continuous variables.

Scenario:

If height (\(X\)) is normally distributed with a mean (\(\mu\)) of \(170\) cm and a standard deviation (\(\sigma\)) of \(10\) cm, what is the probability that a person’s height is less than \(180\) cm? (\(P(X < 180)\))
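The scenario above can be evaluated with R's Normal cumulative distribution function, `pnorm()`. This is a minimal sketch using the mean and standard deviation stated in the scenario; the variable names are illustrative.

```r
# Scenario values from the text: height X ~ Normal(mean = 170, sd = 10)
mu    <- 170
sigma <- 10

# P(X < 180): area under the Normal curve to the left of 180
p_less_180 <- pnorm(180, mean = mu, sd = sigma)

# Same result via the standardized value Z = (180 - 170) / 10 = 1
z <- (180 - mu) / sigma
p_from_z <- pnorm(z)

print(p_less_180)  # about 0.8413
print(p_from_z)    # identical to the line above
```

Because 180 cm lies exactly one standard deviation above the mean, the result is the familiar value of about 84%.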

The relationship between the Continuous Random Variable (RV), the Probability Density Function (PDF), and the Cumulative Distribution Function (CDF) is the fundamental bridge connecting the field of statistics with the powerful tools of integral and differential calculus.

  1. The Continuous Random Variable

A random variable (\(X\)) is classified as continuous if it can assume any value within a given interval on the real number line, such as \([a, b]\) or \((-\infty, \infty)\). Examples include measured quantities like height, time, temperature, and velocity.

Zero Point Probability: Since there is an infinite number of possible values within any interval, the probability of the variable landing exactly on any single point is always zero: \[P(X = x) = 0\]

Interval-Based Probability: Probabilities are meaningful only when measured over an interval: \[P(a \le X \le b) = \int_{a}^{b} f(x) dx\]

  2. The Probability Density Function (PDF), \(f(x)\)

The PDF, denoted \(f(x)\), is the function that describes the density of probability around each value \(x\). The PDF is the operational heart of any continuous distribution.

Interpretation Caveat: A crucial distinction is that the value of \(f(x)\) itself is not a probability.

Density Indicator: Larger values of \(f(x)\) merely indicate a higher probability density around that specific value.

Validation Requirements: A function \(f(x)\) is a valid PDF only if it satisfies two essential rules:

  1. Non-negativity: \(f(x) \ge 0\) for all \(x\).

  2. Total Area Equals 1: The total area under the PDF curve must integrate to 1: \[\int_{-\infty}^{\infty} f(x) dx = 1\]

Probability Calculation: The probability for an interval (\(P(a \le X \le b)\)) is computed as the area under the PDF curve between \(a\) and \(b\): \[P(a \le X \le b) = \int_{a}^{b} f(x) dx\]

Example PDF: \(f(x) = 3x^2\) on \([0, 1]\).
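The two validity rules and the interval calculation can be checked numerically for the example PDF \(f(x) = 3x^2\) on \([0, 1]\). This is a minimal sketch using base R's `integrate()`; the interval \([0.2, 0.5]\) is an illustrative choice, not a value from the source.

```r
# Example PDF from the text, defined on [0, 1]
f <- function(x) 3 * x^2

# Rule 1: non-negativity, checked on a grid of points in [0, 1]
x_grid <- seq(0, 1, by = 0.01)
is_nonnegative <- all(f(x_grid) >= 0)

# Rule 2: the total area under the curve must equal 1
total_area <- integrate(f, lower = 0, upper = 1)$value

# Interval probability P(0.2 <= X <= 0.5) as an area under the PDF
# (the endpoints 0.2 and 0.5 are illustrative)
p_interval <- integrate(f, lower = 0.2, upper = 0.5)$value

print(is_nonnegative)  # TRUE
print(total_area)      # 1, up to numerical error
print(p_interval)      # 0.5^3 - 0.2^3 = 0.117, up to numerical error
```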

  3. The Cumulative Distribution Function (CDF), \(F(x)\)

The CDF, denoted \(F(x)\), is defined as the probability that the Random Variable \(X\) takes a value less than or equal to \(x\) (\(P(X \le x)\)). The CDF is an inherently cumulative measure.

Integral Relationship: The CDF is the integral of the PDF up to the point \(x\): \[F(x) = P(X \le x) = \int_{-\infty}^{x} f(t) dt\] In practice the integral starts at the lower bound of the distribution, because \(f(t) = 0\) below it.

Example CDF: For the PDF \(f(x) = 3x^2\) on \([0, 1]\), the CDF is calculated as \(F(x) = \int_{0}^{x} 3t^2 dt = x^3\) for \(0 \le x \le 1\).

The Reciprocal Relationship: PDF vs. CDF

The relationship between \(f(x)\) and \(F(x)\) follows from the Fundamental Theorem of Calculus. They are an integral/derivative pair: \[\text{PDF } f(x) = \frac{d}{dx} F(x) = F'(x)\] This means the PDF is the rate of change of the CDF. You can obtain the PDF by differentiating the CDF, and you can obtain the CDF by integrating the PDF.
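The derivative/integral pairing can be illustrated numerically with the example \(f(x) = 3x^2\) and \(F(x) = x^3\). This is a rough sketch: the evaluation point `x0` and step size `h` are arbitrary illustrative choices, and the derivative is only approximated by a central difference.

```r
f     <- function(x) 3 * x^2  # example PDF on [0, 1]
F_cdf <- function(x) x^3      # corresponding CDF on [0, 1]

x0 <- 0.6   # illustrative evaluation point inside [0, 1]
h  <- 1e-6  # small step for the numerical derivative

# Differentiating the CDF (central difference) should recover the PDF
numeric_derivative <- (F_cdf(x0 + h) - F_cdf(x0 - h)) / (2 * h)

# Integrating the PDF from the lower bound should recover the CDF
integrated_pdf <- integrate(f, lower = 0, upper = x0)$value

print(c(numeric_derivative, f(x0)))   # both close to 1.08
print(c(integrated_pdf, F_cdf(x0)))   # both close to 0.216
```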

3 . Sampling Distribution

This summary, based on the video “Sampling Distributions (7.2)”, unlocks one of the most vital concepts in inferential statistics: understanding not just single data points, but the behavior of averages drawn repeatedly from a population.

3.1 . Summary of the video:

The video starts by resolving a major source of confusion: differentiating between three types of statistical distributions:

  1. The Population Distribution: this is the complete data set from every individual in the entire population, characterized by its mean (\(\mu\)) and standard deviation (\(\sigma\)).

  2. The Sample Distribution: this is merely the data collected from one single random sample. It’s highly variable and often fails to perfectly mirror the population.

  3. The Sampling Distribution: this is our central focus. It is a distribution built not from individual scores, but from the statistic (like the mean, \(\bar{x}\)) calculated from hundreds or thousands of random samples taken from the same population. It’s essentially a “stack of sample means” that tells us the probability of observing a specific sample average.

The power of the Sampling Distribution lies in its predictability, which allows us to generalize results from a small group to a large population:

Stable Mean: The mean of the Sampling Distribution (\(\mu_{\bar{x}}\)) will always equal the population mean (\(\mu\)).

Reduced Variability (The Standard Error): Crucially, sample means are less variable than individual observations. Therefore, the Sampling Distribution curve is narrower and taller than the original population curve. Its spread is called the Standard Error (\(\sigma_{\bar{x}}\)).

A Move Toward Normality: Even if the original Population Distribution is strangely shaped, the Sampling Distribution will tend toward a Normal (bell-shaped) distribution as the sample size increases. This principle is the cornerstone of the Central Limit Theorem.
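The first two properties can be seen in a small simulation. This is a minimal sketch under assumed values: a Normal population with mean 160 and standard deviation 7, samples of size 10, and 10,000 repeated samples. All of these numbers and names are illustrative.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

# Assumed (illustrative) population parameters and sample size
mu    <- 160
sigma <- 7
n     <- 10

# Draw many samples and keep the mean of each one:
# this "stack of sample means" approximates the Sampling Distribution.
sample_means <- replicate(10000, mean(rnorm(n, mean = mu, sd = sigma)))

mean_of_means <- mean(sample_means)  # should be close to mu
sd_of_means   <- sd(sample_means)    # should be close to sigma / sqrt(n)

print(mean_of_means)
print(sd_of_means)
print(sigma / sqrt(n))  # spread of the sample means expected in theory
```

The simulated spread of the sample means is much smaller than the population standard deviation of 7, which is the "reduced variability" described above.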

The relationship between the population and the sampling distribution is governed by these two fundamental formulas:

Mean of the Sampling Distribution: \[\mu_{\bar{x}} = \mu\]

Standard Error (Standard Deviation of the Sampling Distribution): This formula shows that as the sample size (\(n\)) increases, the Standard Error decreases, making the distribution tighter. \[\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\]

These are used to standardize a sample mean (\(\bar{x}\)) into a Z-score: \[Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}\]

For example, using these formulas, the video calculates that the probability that the average height (\(\bar{x}\)) of a random sample of 10 Canadians is less than 157 cm (given \(\mu = 160 \text{ cm}, \sigma = 7 \text{ cm}\)) is only 8.69%. Here \(\sigma_{\bar{x}} = 7/\sqrt{10} \approx 2.21\) cm and \(Z = (157 - 160)/2.21 \approx -1.36\), and a Z-table gives \(P(Z < -1.36) \approx 0.0869\).
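The height example can be reproduced in R with the values quoted above. This is a minimal sketch; it also shows that the 8.69% figure corresponds to a Z-score rounded to two decimals, as in a printed Z-table, while the unrounded calculation comes out slightly higher (roughly 8.8%).

```r
# Values quoted in the example
mu    <- 160  # population mean height (cm)
sigma <- 7    # population standard deviation (cm)
n     <- 10   # sample size
x_bar <- 157  # sample mean of interest (cm)

# Standard Error and Z-score of the sample mean
standard_error <- sigma / sqrt(n)
z <- (x_bar - mu) / standard_error

# P(x_bar < 157) without rounding
p_exact <- pnorm(z)

# Same probability with Z rounded to two decimals, as in a Z-table lookup
p_table <- pnorm(round(z, 2))

print(round(z, 2))  # -1.36
print(p_exact)
print(p_table)      # about 0.0869
```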

4 . Central Limit Theorem

The video “The Central Limit Theorem (7.3)” dives into what is arguably the most fundamental and revolutionary principle in all of modern statistics. The CLT isn’t just a formula; it’s the mathematical magic that allows data scientists to trust their inferential conclusions.

4.1 . Summary of the video:

Before the CLT, statisticians struggled with population data that came in all shapes and sizes: skewed, flat, or wildly irregular. The CLT offered a powerful solution, essentially providing a predictive guarantee:

  1. The CLT Statement: Take simple random samples of size \(n\) that are large enough (typically \(n \ge 30\)) from any population with a finite standard deviation, regardless of the shape of its original distribution. Then the following hold.

  2. Approximate Normality: The Sampling Distribution of the sample means (\(\bar{x}\)) will tend toward a Normal Distribution (the bell curve).

  3. Centralized Mean: The mean of the Sampling Distribution (\(\mu_{\bar{x}}\)) will be equal to the true population mean (\(\mu\)).

  4. Standard Error: The spread of this new, approximately normal distribution (the Standard Error, \(\sigma_{\bar{x}}\)) is given by the formula \(\sigma / \sqrt{n}\).

  5. The Simple Power: The CLT lets us use the simple, powerful tools of the Normal Distribution (like Z-scores) to calculate probabilities and test hypotheses about sample means, even when the original population data is far from Normal.

If the Population is Normal: If the original population is already Normal, the Sampling Distribution is always Normal, even if the sample size (\(n\)) is small.

If the Population is NOT Normal: We must rely on the large sample size condition. A common rule of thumb is that \(n\) should be at least 30 (\(n \ge 30\)) before treating the Sampling Distribution as approximately Normal.

Using the CLT guarantee of normality, we rely on the Z-score formula to find the probability of observing a specific sample mean (\(\bar{x}\)).
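The CLT's claim for a non-normal population can be illustrated by simulation. This is a minimal sketch, assuming an Exponential(rate = 1) population as a stand-in for a skewed distribution (it has mean 1 and standard deviation 1); the population choice, the number of repetitions, and the names are all illustrative.

```r
set.seed(2025)  # arbitrary seed, only for reproducibility

# Assumption: a right-skewed Exponential(rate = 1) population
pop_mean  <- 1
pop_sd    <- 1
n         <- 30    # sample size at the "n >= 30" rule of thumb
n_samples <- 5000  # number of repeated samples (illustrative)

# Sampling Distribution of the mean, built by repeated sampling
sample_means <- replicate(n_samples, mean(rexp(n, rate = 1)))

# Standardize each sample mean with the Z-score formula
z_scores <- (sample_means - pop_mean) / (pop_sd / sqrt(n))

# If the Normal approximation is reasonable, roughly 95% of the
# Z-scores should fall between -1.96 and 1.96.
share_within_196 <- mean(abs(z_scores) <= 1.96)

print(mean(sample_means))  # close to the population mean
print(sd(sample_means))    # close to pop_sd / sqrt(n)
print(share_within_196)
```

Plotting `hist(sample_means)` should show a roughly bell-shaped histogram even though the individual exponential observations are strongly skewed.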

5 . Sample Proportion

These two videos, “The Central Limit Theorem (7.3)” and “Sampling Distribution of the Sample Proportion (7.4)”, mark the transition from descriptive analysis to inferential statistics. Their essential message is that, under the right conditions, the variable results of sampling follow a predictable Normal Distribution.

5.1 . Summary of the video:

5.1.1 . The Central Limit Theorem (CLT) for the Mean (\(\bar{x}\))

The CLT is often called the “magic of statistics” because it provides a remarkable guarantee: for large enough samples, the sampling distribution of the mean will be approximately normal, regardless of the original population’s shape.

5.1.2 . The CLT Mechanism

When we take multiple samples, the average of those sample means (\(\mu_{\bar{x}}\)) equals the true population mean (\(\mu\)). More importantly, the spread of this new distribution, known as the Standard Error (\(\sigma_{\bar{x}}\)), is smaller than the original population’s standard deviation (\(\sigma\)) by a factor of \(\sqrt{n}\). This relationship is defined as \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}\).

5.1.3 . The Normal Condition (The Rule of 30)

For the CLT to kick in and ensure normality when the population is not normal, the sample size (\(n\)) must be large enough, with the common threshold being \(n \ge 30\).

5.1.4 . The Sampling Distribution of the Proportion (\(\hat{p}\))

When dealing with categorical data (like “yes/no,” “success/failure”), we use the proportion (\(\hat{p}\)) instead of the mean (\(\bar{x}\)). Just like the mean, if we repeatedly sample and plot the resulting proportions, they also form a Sampling Distribution.

5.1.5 . The Proportion Rules

The mean of the sample proportions (\(\mu_{\hat{p}}\)) will equal the true population proportion (\(p\)). The Standard Error for the proportion is calculated using the formula \(\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\).
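Both proportion rules can be checked by simulation. This is a minimal sketch, assuming a hypothetical population proportion of 0.3 and samples of size 100; neither value comes from the source.

```r
set.seed(7)  # arbitrary seed, only for reproducibility

# Assumed (hypothetical) population proportion and sample size
p <- 0.3
n <- 100

# Each draw is the number of successes in one sample of size n;
# dividing by n gives that sample's proportion p-hat.
p_hat <- rbinom(10000, size = n, prob = p) / n

mean_p_hat <- mean(p_hat)              # should be close to p
sd_p_hat   <- sd(p_hat)                # should be close to the formula below
se_formula <- sqrt(p * (1 - p) / n)    # Standard Error from the text

print(mean_p_hat)
print(sd_p_hat)
print(se_formula)
```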

5.1.6 . The Normal Condition (The Rule of Successes and Failures)

For the sampling distribution of the proportion to be treated as approximately normal, the condition is stated differently. Instead of relying only on \(n \ge 30\), we must verify that there are enough expected successes and enough expected failures: both \(n \cdot p\) and \(n \cdot (1-p)\) must be greater than or equal to 10. When both conditions are met, the distribution is symmetric enough to use the Z-score table.
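The success/failure condition is easy to wrap in a small helper. This is a sketch only: `check_normal_condition()` is a hypothetical function name, and the example inputs are illustrative.

```r
# Hypothetical helper: checks the rule n*p >= 10 and n*(1-p) >= 10
check_normal_condition <- function(n, p, threshold = 10) {
  expected_successes <- n * p
  expected_failures  <- n * (1 - p)
  list(
    expected_successes = expected_successes,
    expected_failures  = expected_failures,
    ok = expected_successes >= threshold && expected_failures >= threshold
  )
}

# n = 100, p = 0.3: 30 expected successes and 70 expected failures
large_sample <- check_normal_condition(n = 100, p = 0.3)

# n = 50, p = 0.1: only 5 expected successes, so the condition fails
small_sample <- check_normal_condition(n = 50, p = 0.1)

print(large_sample$ok)  # TRUE
print(small_sample$ok)  # FALSE
```

Note that the second case fails even though \(n \ge 30\), which is why the rule for proportions is checked separately.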

6 . Review: Sampling Distribution

These three videos, covering the Central Limit Theorem (CLT), the Sampling Distribution of the Proportion, and Binomial Approximation, mark the transition from analyzing messy sample data to making confident predictions about a large population. Their essential teaching is how we use the laws of probability to describe variable sample results with a reliable Normal Distribution.

6.1 . Summary of the video:

6.1.1 . The Central Promise

The core idea shifts our focus. Instead of worrying about a single sample’s outcome, we consider the distribution of a statistic (like the mean \(\bar{x}\) or the proportion \(\hat{p}\)) if we were to draw thousands of random samples from the same population. This distribution is called the Sampling Distribution. This shift is crucial because while individual data points can be highly erratic, the averages and proportions of multiple samples follow highly predictable rules dictated by the CLT.

6.1.2 . The Central Limit Theorem (CLT) for the Sample Mean (\(\bar{x}\))

The CLT is statistics’ “magic spell”: for sufficiently large samples, the sampling distribution of the mean takes on the familiar Normal (bell-shaped) Distribution, at least approximately, regardless of how weirdly shaped the original population data is.

Key Principles for \(\bar{x}\):

  1. Stable Center: The average of all sample means (\(\mu_{\bar{x}}\)) will equal the true population mean (\(\mu\)).

  2. Reduced Spread: The variability of this new distribution is much smaller than the population’s. This controlled spread is the Standard Error (\(\sigma_{\bar{x}}\)), calculated as \(\sigma / \sqrt{n}\). The bigger the sample size (\(n\)), the narrower and taller the resulting curve is.

  3. The Rule of 30: To ensure this normality takes effect when the population is non-normal, the sample size (\(n\)) must be at least 30 (\(n \ge 30\)).

6.1.3 . The Sampling Distribution of the Sample Proportion (\(\hat{p}\))

When analyzing categorical or binary data (e.g., success/failure, yes/no), we use the sample proportion (\(\hat{p}\)). Similar to the mean, if we repeatedly calculate \(\hat{p}\), its distribution will also approach normality.

Key Principles for \(\hat{p}\):

  1. Stable Center: The mean of the sample proportions (\(\mu_{\hat{p}}\)) equals the true population proportion (\(p\)).

  2. Standard Error: The variability is calculated as \(\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\).

  3. The Rule of Successes and Failures: the condition for normality here is stricter: we must ensure that the number of expected successes (\(n \cdot p\)) and expected failures (\(n \cdot (1-p)\)) are both greater than or equal to 10. Meeting these two conditions guarantees the distribution is symmetric enough for normal approximation.

6.1.4 . Why CLT is Essential (The Binomial Bridge)

One of the most practical applications of the CLT for proportions is solving complicated problems involving the Binomial Distribution. Calculating the exact probability of, say, at least 35 successes out of 100 trials using the traditional Binomial formula is tedious, requiring the calculation and summing of many probabilities. Thanks to the CLT, once the \(n p \ge 10\) and \(n (1-p) \ge 10\) conditions are met, we can approximate that complex probability easily using a single Z-score calculation, saving immense time and effort.
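The "at least 35 successes out of 100 trials" example can be sketched in R by comparing the exact Binomial sum with the Normal approximation. The text does not give a success probability, so `p = 0.3` below is a hypothetical value; the continuity correction (using 34.5 instead of 35) is a common refinement that the source does not mention.

```r
# Assumption: success probability p = 0.3 (hypothetical, not from the source)
n <- 100
p <- 0.3

# Conditions for the Normal approximation
conditions_met <- (n * p >= 10) && (n * (1 - p) >= 10)

# Exact Binomial probability: P(X >= 35) = P(X > 34)
p_exact <- pbinom(34, size = n, prob = p, lower.tail = FALSE)

# Normal approximation with mean n*p and standard deviation sqrt(n*p*(1-p)),
# using a continuity correction (34.5) as an added refinement
mu_x <- n * p
sd_x <- sqrt(n * p * (1 - p))
p_approx <- pnorm(34.5, mean = mu_x, sd = sd_x, lower.tail = FALSE)

print(conditions_met)  # TRUE
print(p_exact)
print(p_approx)
```

With software such as R the exact value is just as easy to obtain; the approximation matters mainly for hand calculation with a Z-table.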


---
title: "Week 11 Assignment ~ Probability Distribution"
author: "Chricyesia Winnerlady Frexisovara Uvas"
date: "2025-12-03"
output:
  rmdformats::readthedown:
    self_contained: true
    thumbnails: true
    lightbox: true
    gallery: true
    number_sections: true
    lib_dir: libs
    df_print: "paged"
    code_folding: "show"
    code_download: yes
    css: "style.css"     
---

```{r, echo=FALSE, warning=FALSE, message=FALSE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}
library(magick)
gambar <- image_read("~/Tugas estatistika winer/tugas week 11 ~ probability distribuition/foto_1_jpg.jpg")
gambar
```


# . Introduction
This introduction presents one of the most foundational concepts in statistical analysis: Probability Distributions.

Statistics is fundamentally the science of making confident decisions amid uncertainty. When an outcome varies, such as the number of "Heads" we get after tossing a coin $N$ times, we cannot rely on a single, isolated value. This is where the Probability Distribution steps in as the primary analytical tool.
A Probability Distribution is the "master map" that describes the likelihood of every possible outcome.

The Coin Toss Example:
If you toss a coin 10 times, the possible outcomes (Random Variable $X$ = Number of Heads) range from 0 to 10. The Probability Distribution tells you:

•	What is the chance of getting exactly 5 Heads? 
 

•	What is the chance of getting at least 8 Heads? 
($P(X \ge 8)$)
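
Both questions can be answered with R's binomial functions. This is a minimal sketch, assuming the coin is fair ($p = 0.5$), which the example implies but does not state; the variable names are illustrative.

```r
# Assumption: a fair coin, so P(Heads) = 0.5 on each of the 10 tosses
n_tosses <- 10
p_heads  <- 0.5

# P(X = 5): probability of exactly 5 Heads
p_exactly_5 <- dbinom(5, size = n_tosses, prob = p_heads)

# P(X >= 8) = P(X > 7): probability of at least 8 Heads
p_at_least_8 <- pbinom(7, size = n_tosses, prob = p_heads, lower.tail = FALSE)

print(p_exactly_5)   # 252/1024, about 0.246
print(p_at_least_8)  # 56/1024, about 0.055
```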

Its function goes far beyond simple prediction. Crucially, it forms the core analytical basis for virtually all inferential statistical methods we use to:

1.	Quantify Risk and Uncertainty: It measures the expected variation or spread of results around the most likely value (e.g., the expected average of 5 Heads).

2. Drive Inference: It allows us to draw valid, responsible conclusions from sample data, such as determining whether a deviation (like getting 9 Heads) is unusual enough to conclude the coin is biased.

The central concept is the Random Variable ($X$), which numerically represents the experimental outcome. The Distribution then assigns a probability to each possible value.

The Critical Insight: Understanding Distribution Properties

To make accurate and robust predictions, we must understand the characteristics of the distribution we are dealing with:

•	Shape: Is the distribution symmetric (like a fair coin's outcome), skewed, or uniform?

•	Spread (Variability): How far are the values likely to scatter from the mean?

•	Center (Location): What is the most expected or average value?

Without this understanding of a distribution’s properties, our probability calculations will be flawed, and our predictive models will be unreliable. In essence, Probability Distributions are the language that translates raw data into quantifiable, actionable probabilities.

The Four Pillars of Statistical Inference

Having established the role of Probability Distributions, this report covers four essential topics that constitute the building blocks of inferential statistics, the field that allows us to move from simply describing data to drawing general conclusions:

1. Continuous Random Variables

We shift our focus to outcomes that can take any value within a given range (e.g., the time it takes to get a Head, weight, or temperature). This necessitates moving from calculating the probability of a specific point to calculating the probability as the area under a curve, with special emphasis on the paramount Normal Distribution.

2. Sampling Distributions

This is the hinge concept, explaining how sample statistics behave (e.g., how the proportion of Heads $\hat{p}$ varies across many different sets of 10 coin tosses). This distribution is the critical link that connects the small sample we observe to the large population we seek to understand.

3. The Central Limit Theorem (CLT)

The CLT is often described as one of the most important results in statistics. It states that, regardless of the shape of the original population distribution, the distribution of sample averages (the Sampling Distribution) tends toward a Normal Distribution, provided the sample size ($n$) is large enough. This allows us to apply Normal probability calculations to a wide range of real-world analyses.

4. Sample Proportion Distributions

This topic specifically addresses categorical or binary outcomes (Heads/Tails, Yes/No, Success/Failure). We will study the probability distribution of the proportion of successes ($\hat{p}$) found in a sample, a vital skill for survey data analysis and percentage-based hypothesis testing.

The Introductory Takeaway

Mastering these four concepts, from continuous variables to the CLT, will equip you to analyze data deeply, construct sound statistical models, and draw robust conclusions grounded in probabilistic principles. This foundation supports the move from simply consuming data to making data-driven decisions.


# . Continuous Random Variables

<center>
<iframe src="https://www.youtube.com/embed/ZyUzRVa6hCM" width="680" height="400" data-external="1" frameborder="0" allowfullscreen> </iframe>
</center>

## . Summary of the video:
### . Discrete Variables: Data Obtained by Counting 
A discrete variable represents data whose possible values are countable, typically a finite set of possibilities. The key characteristic of this type of data is that it is generated through the process of counting. Consequently, discrete variables can only take on certain, isolated values, leaving no possibility for any value in between the defined steps.

For example, when considering the number of children in a family, the outcomes are restricted to integers like 0, 1, 2, or 3. The video emphasizes that asking for a fraction of a child is nonsensical in this context. Similarly, the score on a test is discrete because the possible outcomes are limited to a finite set of numbers. Even in cases involving currency, where decimals (cents) are present, the variable remains discrete because the precision is limited to a countable number of decimal places, making the set of values finite.

In terms of visualization, discrete data is represented by a Bar Chart, where the gaps between the bars clearly signify the separation between the countable values.

### . Continuous Variables: Data Obtained by Measuring 
In contrast, a continuous variable is data that can take on any numerical value within a given range. The fundamental source of continuous data is measurement (measuring), not counting.

This reliance on measurement means the number of possible outcomes within any range is infinite and uncountable. The defining trait of continuous data is that you can always achieve greater precision by adding more decimal places.

The video uses age as a prime example. While one might state an age of 23 years, precise measurement can extend this value to 23 years, 6 months, 2 days, 3 seconds, and down to milliseconds or nanoseconds, theoretically without end. The same concept applies to weight. Since measurement can be refined infinitely, all values within the range are theoretically possible.

Continuous variables are typically visualized using a Histogram or a Density Curve. The absence of gaps between the bars in a histogram illustrates the continuity of the data, reflecting the infinite possibilities of measurement within the observed range.

### . Probability Formulas and Concepts 
The difference between counting and measuring fundamentally changes how we calculate probability:

For Discrete Variables

Focus: Finding the probability of a single, exact value ($P(X = x)$).

Calculation: The probability of an event is found by summation, that is, by adding up the probabilities of all relevant outcomes.

For Continuous Variables

Focus: Probability is calculated for a range of values ($P(a < X < b)$), not a single point.

Calculation: Probability is equal to the area under the density curve, which is computed by integration.

Key Fact: The probability of a single, exact value in a continuous distribution (e.g., $P(X = 170.000...)$) is always zero.

Major Example: The Normal Distribution is the most common example of a density curve used for continuous variables.
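
The contrast between summation and area can be sketched in a few lines of R. The example values are illustrative assumptions: four fair coin flips for the discrete case, and a Normal distribution with mean 170 and standard deviation 10 for the continuous case.

```r
# Discrete: P(2 <= X <= 3) for the number of Heads in 4 fair coin flips
# is the SUM of the individual probabilities P(X = 2) + P(X = 3).
p_discrete <- sum(dbinom(2:3, size = 4, prob = 0.5))

# Continuous: P(160 < X < 180) for X ~ Normal(170, 10) is an AREA
# under the density curve, computed here by numerical integration.
area_numeric <- integrate(dnorm, lower = 160, upper = 180,
                          mean = 170, sd = 10)$value

# The same area from the cumulative distribution function
area_cdf <- pnorm(180, mean = 170, sd = 10) - pnorm(160, mean = 170, sd = 10)

print(p_discrete)    # 6/16 + 4/16 = 0.625
print(area_numeric)  # about 0.683
print(area_cdf)      # matches the integral
```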

### .  Visualization  
A. Discrete Data Visualization (Bar Chart with Gaps)

This code demonstrates a discrete variable (e.g., the probability of getting a certain number of heads from four coin flips), emphasizing the gaps between bars.
```{r, echo=FALSE, warning=FALSE, message=FALSE, big =TRUE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}

library(ggplot2)

# --- Data frame for the discrete data ---
data_discrete_heads <- data.frame(
  Heads = factor(0:4), # Convert to 'factor' so values are treated as discrete categories
  Probability = c(1/16, 4/16, 6/16, 4/16, 1/16)
)

# --- Plot the bar chart ---
ggplot(data_discrete_heads, aes(x=Heads, y=Probability)) +
  # Bar chart: stat="identity" uses the y values directly
  geom_bar(stat="identity", 
           fill="#1f78b4", # Blue fill
           color="black", 
           width=0.8) + # Bar width below 1 leaves gaps between the bars
  
  # Add the probability label above each bar
  geom_text(aes(label = round(Probability, 3)), 
            vjust = -0.1, 
            size = 4,
            fontface = "bold") +
  
  # Title and axis labels
  labs(
    title = "Discrete Variable Probabilities: Number of Heads in 4 Coin Flips",
    subtitle = "Showing Gaps Between Bars (Data Exist Only at Specific Values)",
    x = "Number of Heads (x)",
    y = "Probability P(X=x)"
  ) +
  
  # Theme and display settings
  theme_minimal() + 
  theme(
    plot.title = element_text(size = 10, face = "bold", hjust = 0.4),
    plot.subtitle = element_text(size = 10, hjust = 0.4),
    axis.title = element_text(size = 14),
    axis.text = element_text(size = 10)
  )
```


B. Continuous Data Visualization (Density Curve/Histogram)

This code demonstrates a continuous variable (like height, following a Normal Distribution), showing the smooth curve and connected histogram bars.

```{r,echo=FALSE, warning=FALSE, message=FALSE, big =TRUE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}

library(ggplot2)

# Generate 1000 random data points following a Normal Distribution (Continuous Variable)
# Example: Height with a mean of 170 and standard deviation of 10
set.seed(42)
data_continuous <- data.frame(
  Height = rnorm(1000, mean = 170, sd = 10)
)

# Plotting the Density Curve and Histogram
ggplot(data_continuous, aes(x=Height)) +
  # 1. Histogram (bars are connected/no gaps)
  geom_histogram(aes(y=after_stat(density)), binwidth = 3, fill="lightcoral", color="black", alpha=0.7) +
  # 2. Density Curve (smooth line)
  geom_density(color="blue", linewidth=1.5) +
  labs(
    title = "Visualization of Continuous Variable (Density Curve)",
    x = "Height (cm)",
    y = "Density"
  ) +
  theme_minimal()
```


C. Conceptual Formula in R (Area Under the Curve)

The `pnorm()` function in R calculates the area under the Normal Distribution curve to the left of a given value (the cumulative probability), which is how probabilities are obtained for continuous variables.

Scenario: If height ($X$) is normally distributed with a mean ($\mu$) of $170$ cm and a standard deviation ($\sigma$) of $10$ cm, what is the probability that a person's height is less than $180$ cm? ($P(X < 180)$)

```{r,echo=FALSE, warning=FALSE, message=FALSE, big =TRUE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}

# Normal distribution parameters
mean_val <- 170  # Mean (µ)
stdev <- 10      # Standard deviation (σ)
x_value <- 180   # Upper bound of interest

# 1. Set the X-axis range (4 standard deviations on each side of the mean)
x_range <- seq(mean_val - 4 * stdev, mean_val + 4 * stdev, length.out = 1000)

# 2. Compute the density (curve height) for each X
y_range <- dnorm(x_range, mean = mean_val, sd = stdev)

# 3. Draw the base bell curve
plot(x_range, y_range, 
     type = "l", 
     main = "Visualizing the Probability P(X < 180)",
     xlab = "Value of X", 
     ylab = "Probability Density",
     col = "darkblue", # Curve color: dark blue
     lwd = 3) # Curve line width: 3

# 4. Set the bounds of the shaded region
x_shade <- seq(min(x_range), x_value, length.out = 1000)
y_shade <- dnorm(x_shade, mean = mean_val, sd = stdev)

# 5. Shade the area P(X < 180) in sky blue
polygon(c(x_shade, rev(x_shade)), c(y_shade, rep(0, length(y_shade))), 
        col = "skyblue", # Sky-blue fill
        border = NA) 

# 6. Add vertical lines at the mean (µ) and at X
abline(v = mean_val, col = "red", lty = 2) # Dashed line for the mean
abline(v = x_value, col = "darkgreen", lty = 1, lwd = 2) # Thick line for X = 180

# 7. Add the explanatory text
probability <- pnorm(q = x_value, mean = mean_val, sd = stdev)
prob_text <- paste("P(X < 180) =", round(probability * 100, 2), "%")
text(mean_val - stdev, max(y_range) * 0.5, 
     prob_text, 
     col = "black", 
     font = 2, 
     cex = 1.2) # Enlarge the text

```
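
The shaded area in the plot can also be obtained as a plain number. The sketch below, which assumes the same scenario values (mean 170, standard deviation 10, cutoff 180) and uses illustrative variable names, computes $P(X < 180)$ directly with `pnorm()` and again after standardizing to a Z-score.

```{r}
height_mean <- 170   # mean from the scenario (cm)
height_sd   <- 10    # standard deviation from the scenario (cm)

# P(X < 180): area under the Normal curve to the left of 180
p_below_180 <- pnorm(180, mean = height_mean, sd = height_sd)

# Same probability via the standard Normal: Z = (x - mean) / sd
z_180   <- (180 - height_mean) / height_sd
p_via_z <- pnorm(z_180)

# Complement: P(X > 180) is the area to the right of 180
p_above_180 <- pnorm(180, mean = height_mean, sd = height_sd, lower.tail = FALSE)

round(c(below = p_below_180, via_z = p_via_z, above = p_above_180), 4)
```

Both routes give the same answer because standardizing only relabels the horizontal axis; the area is unchanged.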

The relationship between the Continuous Random Variable (RV), the Probability Density Function (PDF), and the Cumulative Distribution Function (CDF) is the fundamental bridge connecting the field of statistics with the powerful tools of integral and differential calculus.

1. The Continuous Random Variable

A random variable ($X$) is classified as continuous if it can assume any value within a given interval on the real number line, such as $[a, b]$ or $(-\infty, \infty)$. Examples include measured quantities like height, time, temperature, and velocity.

Zero Point Probability: Since there is an infinite number of possible values within any interval, the probability of the variable landing exactly on any single point is always zero:

$$P(X = x) = 0$$

Interval-Based Probability: Probabilities are meaningful only when measured over an interval:

$$P(a \le X \le b) = \int_{a}^{b} f(x) dx$$

2. The Probability Density Function (PDF), $f(x)$

The PDF, denoted $f(x)$, is the function that describes the density of probability around each value $x$. The PDF is the operational heart of any continuous distribution.

Interpretation Caveat: A crucial distinction is that the value of $f(x)$ itself is not a probability. It can even exceed 1, as the example $f(x) = 3x^2$ does near $x = 1$.

Density Indicator: Larger values of $f(x)$ merely indicate a higher probability density around that specific value.

Validation Requirements: A function $f(x)$ is a valid PDF only if it satisfies two essential rules:

1. Non-negativity: $f(x) \ge 0$ for all $x$.

2. Total Area Equals 1: The total area under the PDF curve must integrate to 1:

$$\int_{-\infty}^{\infty} f(x) dx = 1$$

Probability Calculation: The probability for an interval ($P(a \le X \le b)$) is computed as the area under the PDF curve between $a$ and $b$:

$$P(a \le X \le b) = \int_{a}^{b} f(x) dx$$

Example PDF: $f(x) = 3x^2$ on $[0, 1]$.

3. The Cumulative Distribution Function (CDF), $F(x)$

The CDF, denoted $F(x)$, is defined as the probability that the Random Variable $X$ takes a value less than or equal to $x$ ($P(X \le x)$). The CDF is an inherently cumulative measure.

Integral Relationship: The CDF is the integral of the PDF from the lower bound of the distribution up to the point $x$:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t) dt$$

Example CDF: For the PDF $f(x) = 3x^2$ on $[0, 1]$, the density is zero below 0, so the CDF is $F(x) = \int_{0}^{x} 3t^2 dt = x^3$ for $0 \le x \le 1$.

The Reciprocal Relationship: PDF vs. CDF

The relationship between $f(x)$ and $F(x)$ follows directly from the Fundamental Theorem of Calculus. They are an integral/derivative pair:

$$f(x) = \frac{d}{dx} F(x) = F'(x)$$

This means the PDF is simply the rate of change of the CDF. You can obtain the PDF by differentiating the CDF, and you can obtain the CDF by integrating the PDF.

```{r,echo=FALSE, warning=FALSE, message=FALSE, big =TRUE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}

# 1. Define the PDF and CDF Functions
# PDF: f(x) = 3x^2, for 0 <= x <= 1
f_pdf <- function(x) {
  # The ifelse statement correctly handles the range [0, 1]
  ifelse(x >= 0 & x <= 1, 3 * x^2, 0)
}

# CDF: F(x) = x^3, for 0 <= x <= 1
f_cdf <- function(x) {
  # This sets the CDF to 0 before 0, x^3 on [0, 1], and 1 after 1.
  ifelse(x < 0, 0, ifelse(x > 1, 1, x^3))
}

# 2. Validation Check (Total Area must equal 1)
# Use the integrate function to numerically check the total area
validation_check <- integrate(f_pdf, lower = 0, upper = 1)


# 3. Probability Calculation: P(0.5 <= X <= 1)
# Using the Integral (Area under the PDF)
prob_integral <- integrate(f_pdf, lower = 0.5, upper = 1)


# Using the CDF: F(1) - F(0.5)
prob_cdf <- f_cdf(1) - f_cdf(0.5)


# 4. Compelling Visualizations

# Plot the PDF
curve(f_pdf, from = -0.5, to = 1.5, n = 200, col = "#FF5733", lwd = 2,
      ylab = "f(x) - Probability Density Function", xlab = "X",
      main = "The Shape of Probability Density: f(x) = 3x^2")

# Plot the CDF
plot(f_cdf, from = -0.5, to = 1.5, n = 200, type = "l", col = "#3371FF", lwd = 2,
     ylab = "F(x) - Cumulative Distribution Function", xlab = "X",
     main = "The Accumulation of Probability: F(x) = x^3")
```
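
The PDF/CDF relationship can also be checked numerically. The sketch below restates the example pair $f(x) = 3x^2$ and $F(x) = x^3$ under illustrative names, then confirms three things: the total area is 1, the interval probability matches $F(1) - F(0.5)$, and the slope of $F$ matches $f$. The step size `h` and the check points are arbitrary choices for the illustration.

```{r}
# Example pair from the text: f(x) = 3x^2 and F(x) = x^3 on [0, 1]
pdf_example <- function(x) ifelse(x >= 0 & x <= 1, 3 * x^2, 0)
cdf_example <- function(x) ifelse(x < 0, 0, ifelse(x > 1, 1, x^3))

# Rule 2 of a valid PDF: the total area must be 1
total_area <- integrate(pdf_example, lower = 0, upper = 1)$value

# P(0.5 <= X <= 1) two ways: area under the PDF, and F(1) - F(0.5)
prob_by_area <- integrate(pdf_example, lower = 0.5, upper = 1)$value
prob_by_cdf  <- cdf_example(1) - cdf_example(0.5)

# Central-difference slope of the CDF at a few interior points
x_check   <- c(0.2, 0.5, 0.8)
h         <- 1e-5
cdf_slope <- (cdf_example(x_check + h) - cdf_example(x_check - h)) / (2 * h)

c(total_area = total_area, by_area = prob_by_area, by_cdf = prob_by_cdf)
rbind(slope_of_F = cdf_slope, f = pdf_example(x_check))
```

The two rows of the final table should agree closely, which is the derivative half of the relationship; the matching interval probabilities are the integral half.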

# . Sampling Distribution

This summary, based on the video "Sampling Distributions (7.2)", unlocks one of the most vital concepts in inferential statistics: understanding not just single data points, but the behavior of averages drawn repeatedly from a population.

<center>
<iframe src="https://www.youtube.com/embed/7S7j75d3GM4" width="680" height="400" data-external="1" frameborder="0" allowfullscreen> </iframe>
</center>

## . Summary of the video:

The video starts by resolving a major source of confusion: differentiating between three types of statistical distributions:

1. The Population Distribution: this is the complete data set from every individual in the entire population, characterized by its mean ($\mu$) and standard deviation ($\sigma$).

2. The Sample Distribution: this is merely the data collected from one single random sample. It’s highly variable and often fails to perfectly mirror the population.

3. The Sampling Distribution: this is our central focus. It is a distribution built not from individual scores, but from the statistic (like the mean, $\bar{x}$) calculated from hundreds or thousands of random samples taken from the same population. It’s essentially a "stack of sample means" that tells us the probability of observing a specific sample average.

The power of the Sampling Distribution lies in its predictability, which allows us to generalize results from a small group to a large population:

Stable Mean: The mean of the Sampling Distribution ($\mu_{\bar{x}}$) always equals the population mean ($\mu$).

Reduced Variability (The Standard Error): Crucially, sample means are less variable than individual observations. The Sampling Distribution is therefore narrower and taller than the original population curve. Its standard deviation is called the Standard Error ($\sigma_{\bar{x}}$).

A Move Toward Normality: Even if the original Population Distribution is strangely shaped, the Sampling Distribution will tend toward a Normal (bell-shaped) distribution as the sample size increases. This principle is the Central Limit Theorem.

The relationship between the population and the sampling distribution is governed by these two fundamental formulas:

Mean of the Sampling Distribution:

$$\mu_{\bar{x}} = \mu$$

Standard Error (Standard Deviation of the Sampling Distribution): This formula shows that as the sample size ($n$) increases, the Standard Error decreases, making the distribution tighter.

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

These are used to standardize a sample mean ($\bar{x}$) into a Z-score:

$$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

For example, using these formulas, the video calculates the probability that the average height ($\bar{x}$) of a random sample of 10 Canadians is less than 157 cm (given $\mu = 160 \text{ cm}$, $\sigma = 7 \text{ cm}$). Here $\sigma_{\bar{x}} = 7/\sqrt{10} \approx 2.21$ and $Z = (157 - 160)/2.21 \approx -1.36$, so the probability is only 8.69%.
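
The arithmetic of this example can be reproduced in a few lines of R. The numbers come from the video example quoted above; the variable names are illustrative. The sketch reports the probability both with the unrounded Z-score and with Z rounded to two decimals, as a printed Z-table would require.

```{r}
pop_mean_height <- 160   # population mean (cm)
pop_sd_height   <- 7     # population standard deviation (cm)
sample_size     <- 10    # sample size n
xbar_observed   <- 157   # sample mean of interest (cm)

# Standard Error: sigma / sqrt(n)
se_height <- pop_sd_height / sqrt(sample_size)

# Z-score of the sample mean
z_height <- (xbar_observed - pop_mean_height) / se_height

# P(xbar < 157) with the unrounded Z, and with Z rounded as in a Z-table
p_unrounded <- pnorm(z_height)
p_table     <- pnorm(round(z_height, 2))

round(c(se = se_height, z = z_height,
        p_unrounded = p_unrounded, p_table = p_table), 4)
```

The table-style value matches the 8.69% quoted above; the unrounded value differs slightly in the third decimal place, which is purely a rounding effect.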

```{r,echo=FALSE, warning=FALSE, message=FALSE, big =TRUE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}

# Required Libraries
library(ggplot2)
library(dplyr)

# Data from the Video Example: Canadian Heights
mu <- 160     # Population Mean (mu)
sigma <- 7    # Population Standard Deviation (sigma)
n <- 10       # Sample Size (n)

# Calculate the Standard Error (Standard Deviation of the Sampling Distribution)
sigma_xbar <- sigma / sqrt(n)

# Create data points for plotting the Normal curves
data_points <- data.frame(x = seq(130, 190, length.out = 500))

# Calculate Density for Population and Sampling
data_points <- data_points %>%
  mutate(
    # Population Distribution: X ~ N(160, 7)
    density_pop = dnorm(x, mean = mu, sd = sigma),
    # Sampling Distribution: Xbar ~ N(160, 2.21)
    density_sampling = dnorm(x, mean = mu, sd = sigma_xbar)
  )

# Reshape data to "long" format for ggplot
data_long <- data_points %>%
  tidyr::pivot_longer(
    cols = starts_with("density"),
    names_to = "Distribution_Type",
    values_to = "Density"
  )

# Plot using ggplot2
ggplot(data_long, aes(x = x, y = Density, color = Distribution_Type)) +
  geom_line(linewidth = 1.2) +
  # Add vertical line at the tested sample mean (xbar = 157)
  geom_vline(xintercept = 157, linetype = "dashed", color = "gray50") +
  # annotate() draws the label once instead of once per data row
  annotate("text", x = 157, y = 0.15, label = "Sample Mean (157cm)",
           color = "gray30", angle = 90, size = 3, hjust = 0) +
  
  # Customization
  scale_color_manual(
    values = c("density_pop" = "#0072B2", "density_sampling" = "#D55E00"),
    labels = c("Population Distribution (σ=7)", "Sampling Distribution (σ/√n ≈ 2.21)")
  ) +
  labs(
    title = "Distribution Comparison: Population vs. Sample Mean",
    subtitle = paste0("Canadian Heights: μ=", mu, "cm, n=", n),
    x = "Height (cm)",
    y = "Probability Density",
    color = "Distribution Type"
  ) +
  theme_minimal(base_size = 14) +
  theme(legend.position = "bottom", 
        plot.title = element_text(face = "bold"))
```


# . Central Limit Theorem
The video "The Central Limit Theorem (7.3)" dives into what is arguably the most fundamental and revolutionary principle in all of modern statistics. The CLT isn't just a formula; it's the mathematical magic that allows data scientists to trust their inferential conclusions.

<center>
<iframe src="https://www.youtube.com/embed/ivd8wEHnMCg" width="680" height="400" data-external="1" frameborder="0" allowfullscreen> </iframe>
</center>

## . Summary of the video:
Before the CLT, statisticians struggled with population data that came in all shapes and sizes: skewed, flat, or wildly irregular. The CLT offers a remarkably general result:

1. The CLT Statement: Take simple random samples of size $n$ that are large enough (typically $n \ge 30$) from a population with finite variance, regardless of the shape of its distribution. The following properties then hold.

2. Approximate Normality: The Sampling Distribution of the sample means ($\bar{x}$) will be approximately Normal (the bell curve), and the approximation improves as $n$ grows.

3. Centralized Mean: The mean of the Sampling Distribution ($\mu_{\bar{x}}$) will be equal to the true population mean ($\mu$).

4. Standard Error: The spread of this new, approximately normal distribution (the Standard Error, $\sigma_{\bar{x}}$) is given by the formula $\sigma / \sqrt{n}$.

5. The Simple Power: The CLT essentially normalizes the chaos of the real world. It lets us use the simple, powerful tools of the Normal Distribution (like Z-scores) to calculate probabilities and test hypotheses on our sample means, even when the original population data is far from Normal.

If the Population is Normal: If the original population is already Normal, the Sampling Distribution is Normal for any sample size ($n$), even a small one.

If the Population is NOT Normal: We must rely on the large sample size condition. A common rule of thumb is that $n$ should be at least 30 ($n \ge 30$) before treating the Sampling Distribution as approximately Normal.

Using the CLT guarantee of normality, we rely on the Z-score formula to find the probability of observing a specific sample mean ($\bar{x}$).

```{r,echo=FALSE, warning=FALSE, message=FALSE, big =TRUE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}

# Required Libraries
library(ggplot2)
library(dplyr)
library(patchwork) # For combining plots into an attractive layout

# Simulation Parameters
num_samples <- 10000 # Number of samples taken for each case
lambda <- 0.5       # Parameter for the Exponential Distribution (Our Skewed Population)

# --- 1. Sample Simulations ---

# n=1: The Original Population Distribution (Highly Skewed)
set.seed(42) 
sample_n1 <- data.frame(
  mean = rexp(num_samples, rate = lambda),
  group = "Sample Size n = 1 (Original Population)"
)

# n=5: The Sampling Distribution (Starting to Look Normal)
sample_n5 <- data.frame(
  mean = replicate(num_samples, mean(rexp(5, rate = lambda))),
  group = "Sample Size n = 5 (Slightly Normal)"
)

# n=30: The Sampling Distribution (Approximately Normal - CLT Applies)
sample_n30 <- data.frame(
  mean = replicate(num_samples, mean(rexp(30, rate = lambda))),
  group = "Sample Size n = 30 (CLT Applies)"
)

# Combine data for plotting
clt_data <- bind_rows(sample_n1, sample_n5, sample_n30)
clt_data$group <- factor(clt_data$group, 
                         levels = c("Sample Size n = 1 (Original Population)", 
                                    "Sample Size n = 5 (Slightly Normal)", 
                                    "Sample Size n = 30 (CLT Applies)"))

# --- 2. Plotting Function ---

# Function to create consistent histogram/density plots
create_plot <- function(data, title, fill_color) {
  ggplot(data, aes(x = mean)) +
    # Histogram showing frequency
    geom_histogram(aes(y = after_stat(density)), bins = 50, fill = fill_color, color = "white", alpha = 0.7) +
    # Density line showing the curve shape
    geom_density(color = "black", lwd = 1) +
    labs(title = title, x = "Sample Mean Value", y = "Probability Density") +
    theme_minimal(base_size = 12) +
    theme(plot.title = element_text(face = "bold", color = "#36454F"))
}

# --- 3. Create Individual Plots with Contrasting Colors ---

p1 <- create_plot(filter(clt_data, group == "Sample Size n = 1 (Original Population)"), 
                  "1. Exponential Distribution (Highly Skewed)", "#D55E00")
p2 <- create_plot(filter(clt_data, group == "Sample Size n = 5 (Slightly Normal)"), 
                  "2. Mean of n=5: Starting the Bell Curve", "#0072B2")
p3 <- create_plot(filter(clt_data, group == "Sample Size n = 30 (CLT Applies)"), 
                  "3. Mean of n=30: Approximately Normal", "#009E73")

# --- 4. Combine Plots into an Attractive Layout ---

# Using patchwork for a 3x1 vertical layout
p1 / p2 / p3 + 
  plot_annotation(
    title = "Central Limit Theorem (CLT) Simulation: The Evolution of Normality",
    subtitle = "Starting Population: Highly Skewed Exponential Distribution",
    caption = "Source: Simulation based on CLT principles (n=30 is the common threshold).",
    theme = theme_minimal(base_size = 16)
  )

```
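
The plots show the shape changing; the center and spread can be checked numerically as well. The following is a minimal sketch that assumes an exponential population with rate 0.5 (so its mean and standard deviation both equal 2) and samples of size 30. The seed, the number of replications, and the variable names are arbitrary choices for the illustration.

```{r}
set.seed(123)
rate_check <- 0.5   # exponential rate; population mean = sd = 1 / rate = 2
n_check    <- 30    # sample size

# 10,000 sample means, each from a fresh sample of size n_check
means_check <- replicate(10000, mean(rexp(n_check, rate = rate_check)))

# Values predicted by the theory
theory_mean <- 1 / rate_check
theory_se   <- (1 / rate_check) / sqrt(n_check)

round(c(sim_mean = mean(means_check), theory_mean = theory_mean,
        sim_se = sd(means_check), theory_se = theory_se), 3)
```

The simulated mean and standard deviation of the sample means should sit close to $\mu$ and $\sigma/\sqrt{n}$, up to simulation noise.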

# . Sample Proportion
These two videos, "The Central Limit Theorem (7.3)" and "Sampling Distribution of the Sample Proportion (7.4)", mark the transition from descriptive analysis to actionable inferential statistics. Their essential message is that, under the right conditions, the variable results of sampling follow a predictable, approximately Normal Distribution.

<center>
<iframe src="https://www.youtube.com/embed/q2e4mK0FTbw" width="680" height="400" data-external="1" frameborder="0" allowfullscreen> </iframe>
</center>

## . Summary of the video:

### . The Central Limit Theorem (CLT) for the Mean ($\bar{x}$)
The CLT is often called the "magic of statistics" because it provides a remarkable guarantee: for a sufficiently large sample size, the sampling distribution of the mean will be approximately normal, regardless of the shape of the original population.

### . The CLT Mechanics
When we take many samples, the average of those sample means ($\mu_{\bar{x}}$) equals the true population mean ($\mu$). More importantly, the spread of this new distribution, known as the Standard Error ($\sigma_{\bar{x}}$), is smaller than the original population's standard deviation ($\sigma$) by a factor of $\sqrt{n}$. This relationship is defined as $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$.

### . The Normal Condition (The Rule of 30)
For the CLT to kick in and ensure normality when the population is not normal, the sample size ($n$) must be large enough, with the common threshold being $n \ge 30$.

### . The Sampling Distribution of the Proportion ($\hat{p}$)
When dealing with categorical data (like "yes/no," "success/failure"), we use the proportion ($\hat{p}$) instead of the mean ($\bar{x}$). Just like the mean, if we repeatedly sample and plot the resulting proportions, they also form a Sampling Distribution.

### . The Proportion Rules
The mean of the sample proportions ($\mu_{\hat{p}}$) will equal the true population proportion ($p$). The Standard Error for the proportion is calculated using the formula $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$.

### . The Normal Condition (The Rule of Successes and Failures)
For the sampling distribution of the proportion to be considered normal, the CLT applies differently. Instead of relying only on $n \ge 30$, we must verify that there are enough expected successes and enough expected failures: both $n \cdot p$ and $n \cdot (1-p)$ must be greater than or equal to 10. Meeting these two simple conditions guarantees that the distribution is sufficiently symmetric to use the Z-score table.
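
The proportion rules can be illustrated with a short simulation. This is a sketch under stated assumptions: the population proportion `p_true = 0.3` and sample size `n_prop = 100` are hypothetical values chosen for the illustration, not figures from the video.

```{r}
set.seed(2024)
p_true <- 0.3   # hypothetical population proportion (assumption)
n_prop <- 100   # hypothetical sample size (assumption)

# Normality condition: expected successes and failures must both be >= 10
condition_met <- (n_prop * p_true >= 10) && (n_prop * (1 - p_true) >= 10)

# Simulate 10,000 sample proportions: (number of successes) / n
p_hat_sim <- rbinom(10000, size = n_prop, prob = p_true) / n_prop

# Standard Error predicted by the formula sqrt(p(1 - p) / n)
se_theory <- sqrt(p_true * (1 - p_true) / n_prop)

condition_met
round(c(mean_p_hat = mean(p_hat_sim), sd_p_hat = sd(p_hat_sim),
        se_theory = se_theory), 4)
```

The mean of the simulated proportions should land near $p$, and their standard deviation near $\sqrt{p(1-p)/n}$.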

```{r,echo=FALSE, warning=FALSE, message=FALSE, big =TRUE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}
library(ggplot2)
library(dplyr)
library(tidyr)
library(ggridges)
library(patchwork)

# --- Simulation parameters ---
num_samples <- 10000 # Number of sample means to simulate
lambda <- 0.5       # Parameter of the Exponential Distribution (skewed population)
mu_pop <- 1/lambda  # Population mean (μ = 2)

# --- 2. Simulate data for several sample sizes (n) ---

# Function that simulates sample means
simulate_means <- function(n_size, num_runs, lambda_rate) {
  # Take the mean of n_size observations, repeated num_runs times
  replicate(num_runs, mean(rexp(n_size, rate = lambda_rate)))
}

# Sample sizes to test
n_sizes <- c(1, 2, 5, 10, 15, 20, 30) 

# Run the simulation and combine the results
clt_data_ridges <- data.frame(
  mean = c(
    simulate_means(n_sizes[1], num_samples, lambda),
    simulate_means(n_sizes[2], num_samples, lambda),
    simulate_means(n_sizes[3], num_samples, lambda),
    simulate_means(n_sizes[4], num_samples, lambda),
    simulate_means(n_sizes[5], num_samples, lambda),
    simulate_means(n_sizes[6], num_samples, lambda),
    simulate_means(n_sizes[7], num_samples, lambda)
  ),
  # Grouping column based on the sample size (n)
  n_group = factor(rep(n_sizes, each = num_samples), 
                   levels = rev(n_sizes)) # Reverse the order so the largest n is on top
)

# --- 3. Density ridges visualization (main plot) ---

p_ridges <- ggplot(clt_data_ridges, aes(x = mean, y = n_group, fill = factor(n_group))) +
  # Use geom_density_ridges for the overlapping visual effect
  geom_density_ridges(scale = 3, alpha = 0.8, rel_min_height = 0.01) +
  
  # Add a vertical line at the population mean
  geom_vline(xintercept = mu_pop, linetype = "dashed", color = "black", lwd = 1.2) +
  
  # Customize the theme and labels
  scale_fill_viridis_d(option = "plasma", direction = -1) + # Color palette
  labs(
    title = "Evolution of the Sampling Distribution: Density Ridges Visualization",
    subtitle = paste0("Initial Population Is Skewed (Exponential). Black line = μ =", mu_pop),
    x = "Sample Mean (x̄)",
    y = "Sample Size (n)"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", color = "#36454F", size = 18),
    legend.position = "none", # Drop the legend because n_group is already the Y-axis
    panel.grid.major.y = element_blank(), # Remove the Y grid lines
    panel.grid.minor.y = element_blank()
  ) +
  # Keep the X-axis limits wide enough to cover the data sensibly
  coord_cartesian(xlim = c(0, 10)) 

# Display the plot
p_ridges


```

# . Review Sampling Distribution
These three videos covering the Central Limit Theorem (CLT), the Sampling Distribution of the Proportion, and Binomial Approximation represent the ultimate transition from analyzing messy sample data to making confident predictions about a large population. Their essential teaching is how we leverage the laws of probability to turn chaotic sample results into a reliable Normal Distribution.

<center>
<iframe src="https://www.youtube.com/embed/c0mFEL_SWzE" width="680" height="400" data-external="1" frameborder="0" allowfullscreen> </iframe>
</center>

## . Summary of the video:
### . The Central Promise
The core idea shifts our focus. Instead of worrying about a single sample's outcome, we consider the distribution of a statistic (like the mean $\bar{x}$ or the proportion $\hat{p}$) if we were to draw thousands of random samples from the same population. This distribution is called the Sampling Distribution.

This shift is crucial because while individual data points can be highly erratic, the averages and proportions of multiple samples follow highly predictable rules dictated by the CLT.

### . The Central Limit Theorem (CLT) for the Sample Mean ($\bar{x}$)
The CLT is statistics' "magic spell": for a sufficiently large sample size, the sampling distribution of the mean takes on the familiar Normal (bell-shaped) form, at least approximately, regardless of how oddly shaped the original population data is.

Key Principles for $\bar{x}$:

1. Stable Center: The average of all sample means ($\mu_{\bar{x}}$) will equal the true population mean ($\mu$).

2. Reduced Spread: The variability of this new distribution is smaller than the population's. This controlled spread is the Standard Error ($\sigma_{\bar{x}}$), calculated as $\sigma / \sqrt{n}$. The bigger the sample size ($n$), the narrower and taller the resulting curve is.

3. The Rule of 30: To ensure this normality takes effect when the population is non-normal, the sample size ($n$) must be at least 30 ($n \ge 30$).

### . The Sampling Distribution of the Sample Proportion ($\hat{p}$)
When analyzing categorical or binary data (e.g., success/failure, yes/no), we use the sample proportion ($\hat{p}$). Similar to the mean, if we repeatedly calculate $\hat{p}$, its distribution will also approach normality.

Key Principles for $\hat{p}$:

1. Stable Center: The mean of the sample proportions ($\mu_{\hat{p}}$) equals the true population proportion ($p$).

2. Standard Error: The variability is calculated as $\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$.

3. The Rule of Successes and Failures: The condition for normality here is different: we must ensure that the number of expected successes ($n \cdot p$) and expected failures ($n \cdot (1-p)$) are both greater than or equal to 10. Meeting these two conditions makes the distribution symmetric enough for the normal approximation.

### . Why CLT is Essential (The Binomial Bridge)
One of the most practical applications of the CLT for proportions is solving complicated problems involving the Binomial Distribution.

Calculating the exact probability of, say, at least 35 successes out of 100 trials using the traditional Binomial formula is tedious, requiring the calculation and summing of many probabilities. Thanks to the CLT, once the $n p \ge 10$ and $n (1-p) \ge 10$ conditions are met, we can approximate that complex probability easily using a single Z-score calculation, saving immense time and effort.
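
The "at least 35 successes out of 100 trials" case can be compared against the exact answer in R. The text does not state the success probability, so `p_success = 0.3` below is a hypothetical value chosen only for illustration. The continuity correction in the last step is a standard refinement added here for comparison; it is not taken from the video.

```{r}
n_trials  <- 100
p_success <- 0.3   # hypothetical success probability (assumption)
k_min     <- 35    # "at least 35 successes"

# Exact Binomial: P(X >= 35) = P(X > 34)
p_exact_binom <- pbinom(k_min - 1, size = n_trials, prob = p_success,
                        lower.tail = FALSE)

# Normal approximation on the proportion scale: one Z-score
se_phat  <- sqrt(p_success * (1 - p_success) / n_trials)
z_plain  <- (k_min / n_trials - p_success) / se_phat
p_normal <- pnorm(z_plain, lower.tail = FALSE)

# Same approximation with a continuity correction (start the tail at 34.5)
z_cc        <- ((k_min - 0.5) / n_trials - p_success) / se_phat
p_normal_cc <- pnorm(z_cc, lower.tail = FALSE)

round(c(exact = p_exact_binom, normal = p_normal,
        normal_cc = p_normal_cc), 4)
```

With these assumed values, both conditions hold ($np = 30$ and $n(1-p) = 70$), and the continuity-corrected approximation lands closer to the exact Binomial probability than the plain Z-score does.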

```{r,echo=FALSE, warning=FALSE, message=FALSE, big =TRUE, out.extra='style="display:block; margin-left:auto; margin-right:auto;"'}
library(ggplot2)
library(dplyr)
library(tidyr)
library(ggridges)
library(patchwork)

# --- Simulation Parameters ---
num_samples <- 10000 
lambda <- 0.5       # Parameter for the Exponential Distribution (Our Skewed Population)
mu_pop <- 1/lambda  # Population Mean (μ = 2)

# --- Sample Simulation Function ---
simulate_means <- function(n_size, num_runs, lambda_rate) {
  replicate(num_runs, mean(rexp(n_size, rate = lambda_rate)))
}

# Sample sizes to test (demonstrating the transformation from 1 to 30)
n_sizes <- c(1, 2, 5, 10, 15, 20, 30) 

# Run simulation and combine results
clt_data_ridges <- data.frame(
  mean = c(
    simulate_means(n_sizes[1], num_samples, lambda),
    simulate_means(n_sizes[2], num_samples, lambda),
    simulate_means(n_sizes[3], num_samples, lambda),
    simulate_means(n_sizes[4], num_samples, lambda),
    simulate_means(n_sizes[5], num_samples, lambda),
    simulate_means(n_sizes[6], num_samples, lambda),
    simulate_means(n_sizes[7], num_samples, lambda)
  ),
  # Grouping factor based on sample size (n)
  n_group = factor(rep(n_sizes, each = num_samples), 
                   levels = rev(n_sizes)) # Reverse order for n=30 at the top
)

# --- Density Ridges Visualization ---
p_ridges <- ggplot(clt_data_ridges, aes(x = mean, y = n_group, fill = factor(n_group))) +
  # Use geom_density_ridges for the beautiful overlapping effect
  geom_density_ridges(scale = 3, alpha = 0.8, rel_min_height = 0.01) +
  
  # Add vertical line for the true population mean (the center point)
  geom_vline(xintercept = mu_pop, linetype = "dashed", color = "black", lwd = 1.2) +
  
  # Customization
  scale_fill_viridis_d(option = "plasma", direction = -1) + 
  labs(
    title = "CLT Simulation: The Evolution of Normality using Density Ridges",
    subtitle = paste0("Starting with a Skewed (Exponential) Population. Black line = μ =", mu_pop),
    x = "Sample Mean (x̄)",
    y = "Sample Size (n)"
  ) +
  theme_minimal(base_size = 14) +
  theme(
    plot.title = element_text(face = "bold", color = "#36454F", size = 18),
    legend.position = "none", 
    panel.grid.major.y = element_blank(),
    panel.grid.minor.y = element_blank()
  ) +
  # Restrict X-axis for a focused view on the center
  coord_cartesian(xlim = c(0, 8)) 

# Display Plot
p_ridges
```

# . References


1. "The Basic Practice of Statistics" (First Edition: 1995)

2. "Introductory Statistics with R" (First Edition: 2008)

3. "Foundations of Statistical Inference and Data Science"

4. Elementary Statistics: Picturing the World by Ron Larson and Betsy Farber, 7th Edition, published in 2018.

5. Statistics by James T. McClave, P. George Benson, and Terry Sincich, 13th Edition, published in 2018.
