Week 10 Assignment ~ Essentials of Probability


Chapter 6 ~ Essentials of Probability

Introduction

Probability is not merely numerical calculation—it is a fundamental framework for confronting uncertainty systematically. In a world filled with incomplete data, random variables, and risky decisions, probability provides a logical structure for measuring likelihood, predicting outcomes, and making accountable decisions.

Why Probability is Essential

Most real-world phenomena—from financial market fluctuations to scientific experimental results—cannot be predicted with absolute certainty. Probability fills the gap between total ignorance and complete certainty, enabling us to:

Quantify uncertainty with mathematical precision
Make inferences from limited data to broader populations
Optimize decisions under conditions of risk and imperfect information
Build predictive models that can be validated and refined

6 Fundamental Pillars

This material builds foundations through six interconnected core concepts:

Fundamental Concepts: Architecture of Probability. Sample spaces, events, and complement rules form the basic structure. Without this understanding, you are merely guessing, not calculating. Every probabilistic analysis begins with a clear definition of what can possibly occur and how to measure it.
Independent and Dependent Events: Dependency Structure. This distinguishes scenarios where one event affects another from those where it does not. Misidentifying independence is among the most common sources of error in data analysis, from incorrect model assumptions to flawed causal conclusions.
Union of Events: Logic of Combination. How to calculate probability when multiple scenarios can occur simultaneously or alternatively. This is the foundation for understanding complexity in systems with multiple variables and interactions.
Exclusive and Exhaustive Events: Partition Structure. Clarifies how events divide the sample space without overlap. This concept is crucial for constructing valid probability distributions and ensuring mathematical consistency in calculations.
Binomial Trials and Binomial Distribution: Discrete Outcome Models. The most practical tool for analyzing repeated situations with two possible outcomes. From production quality testing to medical clinical trials, the binomial distribution is the workhorse of applied probability analysis.
Integration and Application: From Theory to Practice. Probability without application is empty abstraction. This section demonstrates how the five preceding concepts work together in real scenarios: risk analysis, predictive modeling, hypothesis testing, and data-driven decision making.

Video and Summary of Chapter 6 Material

6.1 Fundamental Concepts

library(htmltools)

# input YouTube URL
url <- "https://www.youtube.com/watch?v=ynjHKBCiGXY"

# automatically extract video ID
get_video_id <- function(url) {
  pattern <- "(?<=v=)[A-Za-z0-9_-]{11}|(?<=youtu.be/)[A-Za-z0-9_-]{11}"
  id <- regmatches(url, regexpr(pattern, url, perl = TRUE))
  if (length(id) == 0) return(NULL)
  id
}

video_id <- get_video_id(url)

thumbnail <- paste0("https://img.youtube.com/vi/", video_id, "/0.jpg")
video_link <- paste0("https://youtu.be/", video_id)

# create thumbnail
browsable(
  tags$a(
    href = video_link,
    target = "_blank",
    style = "
      position: relative;
      display: block;
      max-width: 525px;
      width: 100%;
      margin: auto;",
    
    # video background
    tags$img(
      src = thumbnail,
      style = "
        width: 100%;
        border-radius: 10px;
        border: 1.5px solid #444;
        filter: brightness(0.87);"),
    
    # play button
    tags$div(
      style = "
        position:absolute;
        top:50%; left:50%;
        transform: translate(-50%,-50%);
        width:70px; height:70px;
        background: rgba(255,255,255,0.8);
        border-radius:50%;
        display:flex;
        justify-content:center;
        align-items:center;",
      tags$div(
        style = "
          width: 0; height: 0;
          border-left: 24px solid #e53935;
          border-top: 14px solid transparent;
          border-bottom: 14px solid transparent;
          margin-left: 6px;"))))

The core concept of the video is the foundation of classical probability, where probability measures how likely an outcome of a random experiment is. Key interpretations:

Probability as a Function: The probability $P(A)$ of event $A$ is the ratio of favorable outcomes to the total number of possible outcomes, assuming the outcomes are equiprobable. This is a simple discrete model that does not carry over directly to continuous distributions, so it risks oversimplification if that limitation is not stated.
Sample Space ($S$): The complete set of all possible outcomes of an experiment. Interpretation: $S$ bounds the probability analysis; failing to define $S$ precisely often leads to estimation bias (e.g., ignoring rare outcomes). The video treats $S$ as a basic “map” that makes it easier to decompose complex events into subsets.
Complement Rule: The complement of event $A$, written $A^c$, is “not $A$”, with $P(A^c) = 1 - P(A)$. Analytical interpretation: The rule turns a hard count into a subtraction when one of $A$ or $A^c$ is easy to enumerate, but it does not help when both are difficult to compute; the video presents it as a shortcut, not a universal solution.

Overall, this material builds a classical (equally likely outcomes) framework for probability, but underexplores the Bayesian (subjective) interpretation, which could enrich the discussion beyond the frequentist context.

Formula

Basic Probability Definition

\[P(A) = \frac{\text{Number of favorable outcomes for } A}{\text{Total outcomes in } S}\] Design: Explicit numerator/denominator, assuming $|S|$ is finite. Usage: Directly applied to sample space diagrams for visualization.

Sample Space

\[S = \{ \text{all possible outcomes} \}, \quad |S| = n \quad (\text{cardinal size})\] Clean design: The video diagram displays $S$ as a box with branches (a tree diagram), making enumeration easy. Usage: Splitting event $A \subseteq S$.

Complement Rule

\[P(A^c) = 1 - P(A), \quad \text{where } A^c = S \setminus A\] Design: Simple, and it follows from the probability axioms: $A$ and $A^c$ are disjoint with $A \cup A^c = S$, so $P(A) + P(A^c) = P(S) = 1$. Strategic use: Compute the complement when $P(A)$ is hard to count directly but $A^c$ contains only a few outcomes.

Example

Sample Space Example: Toss a fair coin twice.

$S = \{ HH, HT, TH, TT \}$, $|S| = 4$.

Event $A$: “At least one head” → $A = \{ HH, HT, TH \}$, $P(A) = 3/4$.

Explanation: Complete enumeration of $S$ avoids undercounting; relevant because it resembles a basic lab experiment.

Complement Rule Example: From the example above,

$A^c$: “No heads (two tails)” → $P(A^c) = 1 - 3/4 = 1/4$,

Verification: $|A^c| = 1$, $P(A^c) = 1/4$.

Explanation: The complement (1 outcome) is easier to count than $A$ (3 outcomes), so only one outcome has to be enumerated instead of three. Relevant for quick decision-making, e.g., the risk of “failing” in quality control.
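The coin example above can be checked by brute-force enumeration. The following is a minimal sketch, assuming a fair coin so that all four outcomes are equally likely; the variable names (`tosses`, `in_A`, `p_A`, `p_A_comp`) are illustrative and not taken from the video.

```r
# Enumerate the sample space S for two tosses of a fair coin
tosses <- expand.grid(first = c("H", "T"), second = c("H", "T"),
                      stringsAsFactors = FALSE)
n_total <- nrow(tosses)  # |S|

# Event A: at least one head
in_A <- tosses$first == "H" | tosses$second == "H"
p_A <- sum(in_A) / n_total

# Complement A^c: no heads at all, counted directly
p_A_comp <- sum(!in_A) / n_total

# The direct count of A^c should agree with the complement rule 1 - P(A)
c(p_A = p_A, p_A_comp = p_A_comp, one_minus_p_A = 1 - p_A)
```

Counting `!in_A` directly and computing `1 - p_A` give the same number, which is the complement rule stated in the Formula section.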

6.2 Independent and Dependent

library(htmltools)

# input YouTube URL
url <- "https://www.youtube.com/watch?v=LS-_ihDKr2M"

# automatically extract video ID
get_video_id <- function(url) {
  pattern <- "(?<=v=)[A-Za-z0-9_-]{11}|(?<=youtu.be/)[A-Za-z0-9_-]{11}"
  id <- regmatches(url, regexpr(pattern, url, perl = TRUE))
  if (length(id) == 0) return(NULL)
  id
}

video_id <- get_video_id(url)

thumbnail <- paste0("https://img.youtube.com/vi/", video_id, "/0.jpg")
video_link <- paste0("https://youtu.be/", video_id)

# create thumbnail
browsable(
  tags$a(
    href = video_link,
    target = "_blank",
    style = "
      position: relative;
      display: block;
      max-width: 525px;
      width: 100%;
      margin: auto;",
    
    # video background
    tags$img(
      src = thumbnail,
      style = "
        width: 100%;
        border-radius: 10px;
        border: 1.5px solid #444;
        filter: brightness(0.87);"),
    
    # play button
    tags$div(
      style = "
        position:absolute;
        top:50%; left:50%;
        transform: translate(-50%,-50%);
        width:70px; height:70px;
        background: rgba(255,255,255,0.8);
        border-radius:50%;
        display:flex;
        justify-content:center;
        align-items:center;",
      tags$div(
        style = "
          width: 0; height: 0;
          border-left: 24px solid #e53935;
          border-top: 14px solid transparent;
          border-bottom: 14px solid transparent;
          margin-left: 6px;"))))

The video interprets probability as a measure of the likelihood of an event, based on a simple calculation: probability = number of favorable outcomes divided by the total number of possible outcomes. Independent events are events where the outcome of one has no effect on the outcome of the other, such as rolling a die and then flipping a coin; each probability stays the same. In contrast, dependent events influence one another, typically when sampling without replacement, so the probability changes after each draw.

The explanation is structured logically: starting with the independent principle, then transitioning to dependent events, highlighting common mistakes such as applying the independent formula to the dependent case. This interpretation is accurate and insightful, connecting the concept to real-world scenarios.

Formula

The formulas are presented verbally and mathematically in the video, with a simple yet effective design for clarity. Here are the main formulas used, simplified using standard mathematical notation:

Basic Probability:

$P(\text{event}) = \frac{\text{number of favorable outcomes}}{\text{total possible outcomes}}$

Probability of Independent Events (A and B):

$P(A \cap B) = P(A) \times P(B)$ (Because outcome A does not affect B, the probabilities are multiplied directly.)

Probability of Dependent Events (A and B):

$P(A \cap B) = P(A) \times P(B \mid A)$ (The probability of B is adjusted after A occurs, reflecting the changed conditions.)

Example

Independent: Roll a 6-sided die to get 5, then flip a coin to get heads.

$P(\text{dice 5}) = \frac{1}{6}$, $P(\text{coin heads}) = \frac{1}{2}$, so

$P(\text{both}) = \frac{1}{6} \times \frac{1}{2} = \frac{1}{12} \approx 0.0833$.

Relevance: Shows that each probability stays fixed, in contrast to the dependent case.

Dependent (Example 1) : A box with 10 marbles (7 green, 3 blue). Draw the first green, then the second blue without replacement.

$P(\text{first green}) = \frac{7}{10}$, $P(\text{second blue} \mid \text{first green}) = \frac{3}{9} = \frac{1}{3}$, so

$P(\text{both}) = \frac{7}{10} \times \frac{3}{9} = \frac{7}{30} \approx 0.233$.

Relevance: Illustrates how the probability changes between draws. Applying the independent formula instead would give $\frac{7}{10} \times \frac{3}{10} = 0.21$, which is incorrect here.

Dependent (Example 2) : From the same box, draw two greens without replacement.

$P(\text{first green}) = \frac{7}{10}$, $P(\text{second green} \mid \text{first green}) = \frac{6}{9} = \frac{2}{3}$, so that

$P(\text{both}) = \frac{7}{10} \times \frac{6}{9} = \frac{42}{90} = \frac{7}{15} \approx 0.467$.
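The two marble examples can be verified by listing every ordered pair of distinct marbles. This is a small sketch under the assumption that each of the 10 marbles is equally likely to be drawn; the helper names (`marbles`, `draws`, `p_green_blue`, `p_green_green`) are hypothetical.

```r
# Box contents: 7 green and 3 blue marbles
marbles <- c(rep("green", 7), rep("blue", 3))

# All ordered pairs of positions, then drop pairs that reuse the same marble
draws <- expand.grid(first = seq_along(marbles), second = seq_along(marbles))
draws <- draws[draws$first != draws$second, ]  # sampling without replacement

first_colour  <- marbles[draws$first]
second_colour <- marbles[draws$second]

# Probabilities by counting equally likely ordered pairs
p_green_blue  <- mean(first_colour == "green" & second_colour == "blue")
p_green_green <- mean(first_colour == "green" & second_colour == "green")

# Same values from the multiplication rule P(A) * P(B | A)
rule_green_blue  <- (7 / 10) * (3 / 9)
rule_green_green <- (7 / 10) * (6 / 9)

c(p_green_blue = p_green_blue, p_green_green = p_green_green)
```

The enumeration and the conditional-probability formula agree, while the independent formula $\frac{7}{10} \times \frac{3}{10}$ does not match the first value.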

6.3 Union of Events

library(htmltools)

# input YouTube URL
url <- "https://www.youtube.com/watch?v=vqKAbhCqSTc"

# automatically extract video ID
get_video_id <- function(url) {
  pattern <- "(?<=v=)[A-Za-z0-9_-]{11}|(?<=youtu.be/)[A-Za-z0-9_-]{11}"
  id <- regmatches(url, regexpr(pattern, url, perl = TRUE))
  if (length(id) == 0) return(NULL)
  id
}

video_id <- get_video_id(url)

thumbnail <- paste0("https://img.youtube.com/vi/", video_id, "/0.jpg")
video_link <- paste0("https://youtu.be/", video_id)

# create thumbnail
browsable(
  tags$a(
    href = video_link,
    target = "_blank",
    style = "
      position: relative;
      display: block;
      max-width: 525px;
      width: 100%;
      margin: auto;",
    
    # video background
    tags$img(
      src = thumbnail,
      style = "
        width: 100%;
        border-radius: 10px;
        border: 1.5px solid #444;
        filter: brightness(0.87);"),
    
    # play button
    tags$div(
      style = "
        position:absolute;
        top:50%; left:50%;
        transform: translate(-50%,-50%);
        width:70px; height:70px;
        background: rgba(255,255,255,0.8);
        border-radius:50%;
        display:flex;
        justify-content:center;
        align-items:center;",
      tags$div(
        style = "
          width: 0; height: 0;
          border-left: 24px solid #e53935;
          border-top: 14px solid transparent;
          border-bottom: 14px solid transparent;
          margin-left: 6px;"))))

The sample space is the full set of possible outcomes of a statistical experiment, and it forms the basis for calculating probability. Simple probability is the ratio of favorable outcomes to the total number of possible outcomes, assuming each outcome is equally likely; this assumption holds for fair dice but fails if the distribution is not uniform. The union of events (A ∪ B) is the event that at least one of A or B occurs, signaled by the word “or.” The intersection (A ∩ B) is the overlap, whose probability must be subtracted once to avoid counting it twice.

Formula

The formula is presented neatly and sequentially, with emphasis on the logic behind each component. Here are the main formulas:

Simple Probability:

$P(E) = \frac{\text{Number of favorable outcomes}}{\text{Total outcomes in sample space}}$

Probability of Independent Events (for “and”):

$P(A \cap B) = P(A) \times P(B)$

Probability of Union of Events (for “or”):

$P(A \cup B) = P(A) + P(B) - P(A \cap B)$

Example

All examples use two 6-sided dice (sample space: 36 outcomes).

Example 1: Probability of two even numbers.

Favorable: 9 outcomes (e.g., (2,2), (2,4), …, (6,6)).

$P = \frac{9}{36} = 0.25$

Example 2: Probability of at least one 2.

Favorable: 11 outcomes (e.g., (2,1) to (2,6), (1,2) to (6,2), minus the double-count (2,2)).

$P = \frac{11}{36} \approx 0.3056$

Example 3: Probability of two even numbers AND at least one 2 (intersection).

Overlap: 5 outcomes.

$P = \frac{5}{36} \approx 0.1389$

Example 4: Probability of two even numbers OR at least one 2 (union).

$P = \frac{9}{36} + \frac{11}{36} - \frac{5}{36} = \frac{15}{36} \approx 0.4167$
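The four dice examples can be reproduced by enumerating all 36 outcomes. The sketch below assumes two fair dice; the object names (`dice`, `A`, `B`, `p_union`, `p_addition`) are illustrative.

```r
# Sample space: all 36 ordered outcomes of two 6-sided dice
dice <- expand.grid(d1 = 1:6, d2 = 1:6)

# Event A: both numbers even; Event B: at least one 2
A <- dice$d1 %% 2 == 0 & dice$d2 %% 2 == 0
B <- dice$d1 == 2 | dice$d2 == 2

p_A     <- mean(A)       # two even numbers
p_B     <- mean(B)       # at least one 2
p_AB    <- mean(A & B)   # intersection
p_union <- mean(A | B)   # union, counted directly

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_addition <- p_A + p_B - p_AB

c(p_A = p_A, p_B = p_B, p_AB = p_AB, p_union = p_union, p_addition = p_addition)
```

Note that `p_AB` differs from `p_A * p_B` here, so these two events are not independent and the intersection has to be counted rather than obtained from the product rule.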

Visualization

A Venn diagram is used for visualization: Outer box = sample space (1 or 100%). Blue circle = two even numbers (25%), green = at least one 2 (31%), overlap = 14%. Union = combined area without duplication (42%).

The R code is as follows:

library(VennDiagram)

# Define the sets (dummy element IDs; the labels carry the probabilities)
venn_plot <- venn.diagram(
  x = list(
    "Two Even" = 1:9,  # Dummy representation of the 9 outcomes
    "At least One 2" = 5:15  # Dummy set for the 11 outcomes, overlap of 5
  ),
  category.names = c("Two Even (25%)", "At least One 2 (31%)"),
  filename = NULL,  # Return a grid object instead of writing a file
  output = TRUE,
  fill = c("blue", "green"),
  alpha = 0.5,
  label.col = "black",
  cex = 1.5,
  cat.cex = 1.2,
  cat.pos = c(-20, 20),
  cat.dist = c(0.05, 0.05),
  margin = 0.1,
  main = "Venn Diagram: Union of Events (Dice Example)",
  sub = "Overlap: 14%, Union: 42%"
)

# With filename = NULL the diagram is only returned, so draw it explicitly
grid::grid.newpage()
grid::grid.draw(venn_plot)

# Proportional version with eulerr (more accurate for probability)
library(eulerr)

# Define proportional combination (based on probability: A=25%-14%=11%, B=31%-14%=17%, A&B=14%)
fit <- euler(c(A = 0.11, B = 0.17, "A&B" = 0.14))  # The total sample space is assumed to be 1

plot(fit,
     fills = list(fill = c("blue", "green"), alpha = 0.5),
     labels = list(labels = c("Two Even\n(25%)", "At least\n one 2 (31%)"), cex = 1.2),
     quantities = TRUE,  # Display area values
     main = "Proportional Venn Diagram: Union ~42%"
)

6.4 Exclusive and Exhaustive

library(htmltools)

# input YouTube URL
url <- "https://www.youtube.com/watch?v=f7agTv9nA5k"

# automatically extract video ID
get_video_id <- function(url) {
  pattern <- "(?<=v=)[A-Za-z0-9_-]{11}|(?<=youtu.be/)[A-Za-z0-9_-]{11}"
  id <- regmatches(url, regexpr(pattern, url, perl = TRUE))
  if (length(id) == 0) return(NULL)
  id
}

video_id <- get_video_id(url)

thumbnail <- paste0("https://img.youtube.com/vi/", video_id, "/0.jpg")
video_link <- paste0("https://youtu.be/", video_id)

# create thumbnail
browsable(
  tags$a(
    href = video_link,
    target = "_blank",
    style = "
      position: relative;
      display: block;
      max-width: 525px;
      width: 100%;
      margin: auto;",
    
    # video background
    tags$img(
      src = thumbnail,
      style = "
        width: 100%;
        border-radius: 10px;
        border: 1.5px solid #444;
        filter: brightness(0.87);"),
    
    # play button
    tags$div(
      style = "
        position:absolute;
        top:50%; left:50%;
        transform: translate(-50%,-50%);
        width:70px; height:70px;
        background: rgba(255,255,255,0.8);
        border-radius:50%;
        display:flex;
        justify-content:center;
        align-items:center;",
      tags$div(
        style = "
          width: 0; height: 0;
          border-left: 24px solid #e53935;
          border-top: 14px solid transparent;
          border-bottom: 14px solid transparent;
          margin-left: 6px;"))))

The video begins with a brief overview of basic probability, emphasizing the importance of understanding the relationships between events in a sample space. Main objectives:

Define and distinguish mutually exclusive events from exhaustive events.
Prove both visually (Venn diagram) and mathematically (probability formula).
Demonstrate applications in statistical analysis to avoid probability calculation errors.

Mutually Exclusive Events

Definition: Two or more events are mutually exclusive if they cannot occur simultaneously. Visually, a Venn diagram shows non-overlapping circles.

Formula:

Mathematical: For events A and B, $P(A \cap B) = 0$.
Addition rule: $P(A \cup B) = P(A) + P(B)$ (because there is no overlap).
Visual proof: The video shows a Venn diagram where the intersection of A and B is empty, proving there are no shared elements.
Mathematical proof: If A and B are disjoint, the probabilities add directly without subtracting the intersection (the general addition rule is $P(A \cup B) = P(A) + P(B) - P(A \cap B)$, and here $P(A \cap B) = 0$).

# Install packages if needed
# install.packages("ggVennDiagram")
# install.packages("ggplot2")  # If not already installed

library(ggVennDiagram)

library(ggplot2)

# Venn Diagram for Mutually Exclusive Events (non-overlapping)
# Data sets without overlap
sets_mutual <- list(
  "Event A" = 1:3,
  "Event B" = 4:6
)

# Create and display diagrams as ggplot
ggVennDiagram(sets_mutual) +
  scale_fill_gradient(low = "blue", high = "red") +  # Color customization
  labs(title = "Mutually Exclusive Events") +
  theme_minimal()  # Simple theme

Example:

Rolling a die: The events “rolling an even number” (2,4,6) and “rolling an odd number” (1,3,5) are mutually exclusive because they do not overlap.

$P(\text{even} \cup \text{odd}) = 0.5 + 0.5 = 1$.

Playing cards: A single card drawn from a deck cannot be both a “heart” and a “spade”.

Exhaustive Events

Definition: A set of events is exhaustive if their combination covers the entire sample space, meaning at least one event is certain to occur.

Formula:

Mathematically: For events A1, A2, …, An, $\bigcup_{i=1}^n A_i = S$ (S = sample space),

so $P(\bigcup_{i=1}^n A_i) = 1$.

Visual proof: The Venn diagram shows circles that together fill the entire sample space, with no empty space outside.
Mathematical proof: The total probability must be 1; otherwise, an outcome is missed, violating the axioms of probability.

# Venn Diagram for Exhaustive Events (full coverage, can overlap)
# Data sets with overlap
universe <- 1:10
A <- 1:6
B <- 5:10  # Overlap

sets_exhaustive <- list(
  "Event A" = A,
  "Event B" = B
)

# Create and display diagrams as ggplot
ggVennDiagram(sets_exhaustive) +
  scale_fill_gradient(low = "green", high = "orange") +  # Color customization
  labs(title = "Exhaustive Events (with Overlap)") +
  theme_minimal()

Example

Coin toss: “Heads” or “Tails” – exhaustive because it covers all possibilities, and also mutually exclusive.
Weather: “Rainy”, “Cloudy”, “Sunny” – exhaustive if these are taken to cover all conditions, but the events can overlap (e.g., rainy and cloudy at the same time).

The Union of Exhaustive Events

Explanation: The union of exhaustive events always yields probability 1. The video discusses how to calculate union when there is overlap.

Formula

For exhaustive non-mutually exclusive: $P(\bigcup A_i) = 1$, but the union calculation uses inclusion-exclusion if overlap occurs.
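The two properties can also be checked with plain set operations. The sketch below reuses the small universe of ten outcomes from the Venn diagram code and assumes every outcome is equally likely; `prob()` and the flag names are hypothetical helpers.

```r
# Universe and two overlapping events
S <- 1:10
A <- 1:6
B <- 5:10

# Mutually exclusive: the intersection is empty
is_exclusive <- length(intersect(A, B)) == 0

# Exhaustive: the union covers the whole sample space
is_exhaustive <- setequal(union(A, B), S)

# Probability of an event under equally likely outcomes (assumption)
prob <- function(event) length(event) / length(S)

# Inclusion-exclusion for the union of overlapping events
p_union <- prob(A) + prob(B) - prob(intersect(A, B))

c(is_exclusive = is_exclusive, is_exhaustive = is_exhaustive, p_union = p_union)
```

Here the events are exhaustive but not mutually exclusive, so adding $P(A)$ and $P(B)$ alone would exceed 1; subtracting the overlap brings the union back to exactly 1.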

Overlapping and Non-overlapping Exhaustive Events

Exhaustive events can be:

Non-overlapping (mutually exclusive): The events form a partition of the sample space, so their probabilities simply add up to 1.
Overlapping: Still exhaustive, but the union must be computed with inclusion-exclusion.

Visual evidence: Venn diagrams with and without overlap

# Install if necessary: install.packages("VennDiagram")
library(VennDiagram)
library(grid) # For grid.draw() if you want to display directly

# Define an explicit universe for exhaustive (e.g., 1:10)
universe <- 1:10

# Non-overlapping Exhaustive Events (partition: no overlap, union = universe)
A_non <- 1:5
B_non <- 6:10  # No overlap, union = 1:10
vd_non <- venn.diagram(
  x = list(A = A_non, B = B_non),
  category.names = c("Event A", "Event B"),
  filename = NULL,  # For direct display; or "non_overlapping_exhaustive.png" for save
  lwd = 2,
  col = c("purple", "cyan"),
  fill = c(adjustcolor("purple", alpha.f = 0.3), adjustcolor("cyan", alpha.f = 0.3)),
  cex = 0.6,
  fontfamily = "sans",
  cat.cex = 0.6,
  cat.default.pos = "outer",
  cat.pos = c(-27, 27),
  cat.dist = c(0.055, 0.055),
  cat.fontfamily = "sans",
  main = "Non-overlapping Exhaustive Events"
)
grid.newpage()
grid.draw(vd_non)  # Display directly in the plot window

# Overlapping Exhaustive Events (overlap exists, union = universe)
A_over <- 1:6
B_over <- 5:10  # Overlap at 5-6, union = 1:10
vd_over <- venn.diagram(
  x = list(A = A_over, B = B_over),
  category.names = c("Event A", "Event B"),
  filename = NULL,  # For direct display
  lwd = 2,
  col = c("purple", "cyan"),
  fill = c(adjustcolor("purple", alpha.f = 0.3), adjustcolor("cyan", alpha.f = 0.3)),
  cex = 0.6,
  fontfamily = "sans",
  cat.cex = 0.6,
  cat.default.pos = "outer",
  cat.pos = c(-27, 27),
  cat.dist = c(0.055, 0.055),
  cat.fontfamily = "sans",
  main = "Overlapping Exhaustive Events"
)
grid.newpage()
grid.draw(vd_over)  # Live display

6.5 Binomial Experiment and the Binomial Formula

library(htmltools)

# input YouTube URL
url <- "https://www.youtube.com/watch?v=nRuQAtajJYk"

# automatically extract video ID
get_video_id <- function(url) {
  pattern <- "(?<=v=)[A-Za-z0-9_-]{11}|(?<=youtu.be/)[A-Za-z0-9_-]{11}"
  id <- regmatches(url, regexpr(pattern, url, perl = TRUE))
  if (length(id) == 0) return(NULL)
  id
}

video_id <- get_video_id(url)

thumbnail <- paste0("https://img.youtube.com/vi/", video_id, "/0.jpg")
video_link <- paste0("https://youtu.be/", video_id)

# create thumbnail
browsable(
  tags$a(
    href = video_link,
    target = "_blank",
    style = "
      position: relative;
      display: block;
      max-width: 525px;
      width: 100%;
      margin: auto;",
    
    # video background
    tags$img(
      src = thumbnail,
      style = "
        width: 100%;
        border-radius: 10px;
        border: 1.5px solid #444;
        filter: brightness(0.87);"),
    
    # play button
    tags$div(
      style = "
        position:absolute;
        top:50%; left:50%;
        transform: translate(-50%,-50%);
        width:70px; height:70px;
        background: rgba(255,255,255,0.8);
        border-radius:50%;
        display:flex;
        justify-content:center;
        align-items:center;",
      tags$div(
        style = "
          width: 0; height: 0;
          border-left: 24px solid #e53935;
          border-top: 14px solid transparent;
          border-bottom: 14px solid transparent;
          margin-left: 6px;"))))

A binomial experiment is interpreted as a statistical process with fixed trials that produce a binary outcome (success/failure), useful for modeling random events such as the probability of repeated successes. Key explanation: The video defines a binomial experiment as an experiment that meets four strict conditions—a fixed number of trials (n), only two outcomes (success with probability p, failure with 1-p), a constant probability per trial, and independent trials (one outcome does not affect the other). Critical interpretation: It is deductive, starting from probability theory to test hypotheses, but is weak if the assumption of independence is violated (e.g., carryover effects in consecutive trials).

Formula

Binomial experimental design involves planning fixed trials, defining outcomes, estimating p, and verifying independence. The video neatly presents the binomial formula as the probability of exactly k successes in n trials:

$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ where $\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is the binomial coefficient (the number of ways to choose k successes from n).

Usage: Calculate manually for small n, or use tables/software for large n.

Design example: Determine n (e.g., 10) and p (0.5 for a fair coin), then apply the formula to obtain the probability distribution.

# Install packages if needed: install.packages("ggplot2")
library(ggplot2)

# Parameters: n = number of trials, p = probability of success
n <- 10 # Number of trials
p <- 0.5 # Probability of success (e.g., coin flip)

# Calculate PMF for k=0 to n
x <- 0:n # Vector from 0 to n
pmf <- dbinom(x, size = n, prob = p) # Calculate PMF using the binomial distribution

# Create a data frame for plotting
data <- data.frame(x = x, pmf = pmf)

# Plot using ggplot2
ggplot(data, aes(x = factor(x), y = pmf)) +
  geom_bar(stat = "identity", fill = "skyblue", alpha = 0.7) +
  labs(
    title = paste("Binomial Distribution (n =", n, ", p =", p, ")"),
    x = "Number of Successes (k)",
    y = "Probability"
  ) +
  scale_x_discrete(breaks = x) +  # Make sure all x's appear on the x-axis.
  theme_minimal() +
  theme(
    # element_line() has no alpha argument, so only the line type is set here
    panel.grid.major.y = element_line(linetype = "dashed"),  # Dashed grid on the y-axis
    axis.text.x = element_text(angle = 0)  # Rotate the x-axis text if needed
  )

Example

Main example: Flip a coin 10 times to count the number of heads (successes). Conditions: n=10 fixed, outcome heads/tails, p=0.5 constant, independent.

Application of the formula: P(exactly 5 heads) = $\binom{10}{5} (0.5)^5 (0.5)^5 = 252 \times 0.0009765625 \approx 0.246$.

Relevance: Illustrates the symmetric distribution for p=0.5.

Another example: Roll a die 20 times to count the number of times a 6 appears (success, p=1/6).

P(exactly 3 successes) = $\binom{20}{3} (1/6)^3 (5/6)^{17}$.

Relevance: Shows skewness for p≠0.5.

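Non-binomial example: Rolling a die until you get a 6. Explanation: The number of trials is not fixed in advance, so the fixed-n condition fails and the number of rolls follows a geometric rather than a binomial model. The binomial examples above also depend on the trials being independent and on the die being fair.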
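The two binomial examples can be evaluated by writing the formula out by hand and comparing it with R's built-in `dbinom()`. This is a minimal sketch; `binom_pmf()` is a hypothetical helper defined only for this comparison.

```r
# Binomial PMF written out from the formula: C(n, k) * p^k * (1 - p)^(n - k)
binom_pmf <- function(k, n, p) {
  choose(n, k) * p^k * (1 - p)^(n - k)
}

# Coin example: exactly 5 heads in 10 flips of a fair coin
p_coin <- binom_pmf(5, n = 10, p = 0.5)

# Die example: exactly 3 sixes in 20 rolls of a fair die
p_die <- binom_pmf(3, n = 20, p = 1 / 6)

# Compare the hand-written formula with the built-in density
data.frame(
  example = c("5 heads in 10 flips", "3 sixes in 20 rolls"),
  manual  = c(p_coin, p_die),
  dbinom  = c(dbinom(5, size = 10, prob = 0.5),
              dbinom(3, size = 20, prob = 1 / 6))
)
```

Both columns should agree, which is a quick way to confirm that the binomial coefficient is part of the formula and not an optional factor.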

6.6 Visualizing the Binomial Distribution

library(htmltools)

# input YouTube URL
url <- "https://www.youtube.com/watch?v=Y2-vSWFmgyI"

# automatically extract video ID
get_video_id <- function(url) {
  pattern <- "(?<=v=)[A-Za-z0-9_-]{11}|(?<=youtu.be/)[A-Za-z0-9_-]{11}"
  id <- regmatches(url, regexpr(pattern, url, perl = TRUE))
  if (length(id) == 0) return(NULL)
  id
}

video_id <- get_video_id(url)

thumbnail <- paste0("https://img.youtube.com/vi/", video_id, "/0.jpg")
video_link <- paste0("https://youtu.be/", video_id)

# create thumbnail
browsable(
  tags$a(
    href = video_link,
    target = "_blank",
    style = "
      position: relative;
      display: block;
      max-width: 525px;
      width: 100%;
      margin: auto;",
    
    # video background
    tags$img(
      src = thumbnail,
      style = "
        width: 100%;
        border-radius: 10px;
        border: 1.5px solid #444;
        filter: brightness(0.87);"),
    
    # play button
    tags$div(
      style = "
        position:absolute;
        top:50%; left:50%;
        transform: translate(-50%,-50%);
        width:70px; height:70px;
        background: rgba(255,255,255,0.8);
        border-radius:50%;
        display:flex;
        justify-content:center;
        align-items:center;",
      tags$div(
        style = "
          width: 0; height: 0;
          border-left: 24px solid #e53935;
          border-top: 14px solid transparent;
          border-bottom: 14px solid transparent;
          margin-left: 6px;"))))

The binomial distribution is interpreted as a discrete model for the number of successes in n independent trials with a constant p, which is visualized to reveal patterns such as symmetry (p=0.5) or skewness (p approaching 0 or 1). Explanation: The video explains the PMF as a function that maps probabilities to each k (0 to n), with visualizations helping identify the mode, variance, and normal approximation for large n.

Formula

Directly: The main formula for the binomial distribution is the PMF, which calculates the probability of exactly k successes in n independent trials with a constant probability of success p:

$P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$

Where:

$\binom{n}{k} = \frac{n!}{k!(n-k)!}$ is the binomial coefficient (the number of ways to choose k successes from n trials).

Assumptions: Independent trials, binary outcome (success/failure), fixed n, constant p.

Example

Example: The distribution for n=10, p=0.5 (coin flips), is visualized as a symmetrical bar plot with a peak at k=5. Probability: P(k=5) ≈ 0.246.

Relevance: Illustrates a balanced (symmetric) distribution.

library(ggplot2)

# Example parameters from the video
n <- 10
p <- 0.5

# Compute the PMF
k <- 0:n
pmf <- dbinom(k, size = n, prob = p)

# Data frame
df <- data.frame(k = k, pmf = pmf)

# Plot
ggplot(df, aes(x = k, y = pmf)) +
  geom_bar(stat = "identity", fill = "skyblue", alpha = 0.7) +
  labs(title = paste("Binomial Distribution (n =", n, ", p =", p, ")"),
       x = "Number of Successes (k)",
       y = "Probability") +
  theme_minimal() +
  scale_x_continuous(breaks = k)

Another example: n=20, p=0.3 (probability of rain), skewed right with mode at k≈6.

Relevance: Shows how p affects shape, useful for predictions such as business models.

library(ggplot2)

# Function for binomial plots
plot_binom <- function(n, p) {
  k <- 0:n
  pmf <- dbinom(k, size = n, prob = p)
  df <- data.frame(k = k, pmf = pmf)
  ggplot(df, aes(x = k, y = pmf)) +
    geom_bar(stat = "identity", fill = "skyblue", alpha = 0.7) +
    labs(title = paste("Binomial Distribution (n =", n, ", p =", p, ")"),
         x = "Number of Successes (k)", y = "Probability") +
    theme_minimal() +
    scale_x_continuous(breaks = k)
}

# Second example from the text: n = 20, p = 0.3
plot_binom(20, 0.3)  # Skewed right
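The shape described for n=20, p=0.3 can also be summarised numerically instead of being read off the plot. The sketch below computes the mode, mean, and variance from the PMF and uses the standard binomial skewness formula $(1-2p)/\sqrt{np(1-p)}$; the variable names are illustrative.

```r
# Parameters of the second example
n <- 20
p <- 0.3

k   <- 0:n
pmf <- dbinom(k, size = n, prob = p)

# Mode: the k with the largest probability
mode_k <- k[which.max(pmf)]

# Mean and variance computed from the PMF (should match n*p and n*p*(1-p))
mean_k <- sum(k * pmf)
var_k  <- sum((k - mean_k)^2 * pmf)

# Skewness of the binomial distribution; positive means a longer right tail
skewness <- (1 - 2 * p) / sqrt(n * p * (1 - p))

c(mode = mode_k, mean = mean_k, variance = var_k, skewness = skewness)
```

A positive skewness corresponds to the right-skewed shape, and setting `p <- 0.5` makes it exactly zero, matching the symmetric coin-flip plot.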

References

[1] Ross, S. M. (2014). A First Course in Probability (9th ed.). Pearson. (Ch. 2: Sample Spaces & Events)

[2] Professor Leonard. (2019). Probability, Sample Spaces, and the Complement Rule

[3] LibreTexts Statistics. (2023). Complement Rule

[4] Grinstead, C. M., & Snell, J. L. (n.d.). Introduction to Probability. Dartmouth College.

[5] Evans, M. J., & Rosenthal, J. S. (n.d.). Probability and Statistics: The Science of Uncertainty.

[6] Jaynes, E. T. (2003). Probability Theory: The Logic of Science. Cambridge University Press.

[7] Grimmett & Welsh. Probability: An Introduction. Focus: Ch. 1-3 (basic probability space, union axioms)

[8] Bertsekas & Tsitsiklis. Introduction to Probability. Focus: Ch. 1 (union formula), Ch. 2 (conditional probability)

[9] Ross, S.M. (2019). Introduction to Probability Models (12th ed.).

[10] Blitzstein, J. K., & Hwang, J. (2019). Introduction to probability (2nd ed.). CRC Press.

[11] Gnedenko, B. V. (1962). The theory of probability (translated)

[12] Mendenhall, W., Beaver, R. J., & Beaver, B. M. (2020). Introduction to probability and statistics (15th ed.)

[13] Wickham, H., Çetinkaya-Rundel, M., & Grolemund, G. (2023). R for data science (2nd ed.). O’Reilly Media.