Probability is a field of mathematics that studies the likelihood of
an event occurring. Understanding this concept is crucial as it forms
the basis for data analysis, statistics, and decision-making under
uncertainty.
This report is based on six videos that systematically discuss the
flow of probability concepts. The first video introduces the
sample space and events as the
foundation for probability calculation. The second video explains the
difference between independent and
dependent events. The third video discusses the rules
for the union and intersection of
events. The fourth video distinguishes mutually exclusive events from exhaustive events. The
fifth video introduces the binomial experiment and its
formula, while the sixth video visualizes the binomial
distribution for easier comprehension.
Overall, these six videos complement each other and provide a concise,
comprehensive overview of fundamental concepts up to the initial
applications of probability.
6.1 Fundamental Concept
Explanatory Video
This video explains the basic definitions in probability theory:
sample space, event, and how to
determine the probability of an event. It then introduces the basic
rules of probability: every probability value is between 0 and 1; the
total probability of all outcomes in the sample space equals 1.
✨ 1. Probability and Value
Probability (\(P(A)\)) is a measure of how likely an event is to occur.
Probability Value: The value of \(P(A)\) must always be in the range of 0 to 1.
\[0 \le P(A) \le 1\]
- \(0\): The event is impossible.
- \(1\): The event is certain to occur.
✨ 2. Sample Space (\(S\))
The sample space (\(S\)) is the set of all possible outcomes of an experiment.
- Event (\(A\)): An event is a single outcome or a collection of outcomes (a subset) of the sample space.
- Basic Formula: If every outcome is equally likely (fair), then:
\[ P(A) = \frac{\text{Number of outcomes in } A \text{ (favorable outcomes)}}{\text{Number of outcomes in sample space (total possible outcomes)}}\]
✨ 3. Basic Rules of Probability
Every probability assignment must satisfy these two essential conditions:
a. Probability Range: The probability of an event must always be between 0 and 1: \[0 \le P(A) \le 1\]
b. Total Probability: The total probability of all outcomes in the sample space must always equal 1: \[P(S) = 1\]
- Example: \(P(\text{Heads}) + P(\text{Tails}) = 0.5 + 0.5 = 1.0\).
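The counting formula and the two conditions can be checked on a small sample space. The R sketch below assumes a fair six-sided die and the event "even number"; both are illustrative choices, not taken from the video.

```r
# Sample space of one fair six-sided die (illustrative assumption)
sample_space <- 1:6

# Event A: the roll shows an even number
event_a <- sample_space[sample_space %% 2 == 0]

# Classical formula: favorable outcomes divided by total outcomes
p_a <- length(event_a) / length(sample_space)

# Every outcome is equally likely, so each has probability 1/6
p_each <- rep(1 / length(sample_space), length(sample_space))

# Condition a: the probability lies between 0 and 1
in_range <- p_a >= 0 && p_a <= 1

# Condition b: the outcome probabilities add up to 1 (up to rounding error)
sums_to_one <- isTRUE(all.equal(sum(p_each), 1))

print(p_a)
print(in_range)
print(sums_to_one)
```

The comparison uses `all.equal` rather than `==` because sums of fractions such as 1/6 are stored with floating-point rounding.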
✨ 4. Complement Rule
The Complement Rule is used to calculate the probability that an event does not occur.
- Complement (\(A^c\)): The complement of event \(A\) is the event that \(A\) does not occur.
- Rule: \[ P(A^c) = 1 - P(A)\]
- This rule holds because \(A\) and \(A^c\) together cover the entire sample space (\(P(A) + P(A^c) = 1\)).
Usefulness: This rule is very useful when it is easier to calculate the probability of the opposite (complement) than to calculate the probability of the original event directly.
✨ 5. Example Application of the Complement Rule
Situation: Tossing two coins. Determine the probability of NOT getting two tails (\(A^c\)).
1. Calculate the probability of event \(A\) (getting two tails). It is one of the four equally likely outcomes HH, HT, TH, TT, so \(P(TT) = 0.25\).
2. Use the Complement Rule: \[P(\text{Not } TT) = 1 - P(TT) = 1 - 0.25 = 0.75\]
The probability of not getting two tails is \(0.75\) or \(75\%\).
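The two-coin calculation can be reproduced by listing the sample space. This is a minimal R sketch; the variable names are hypothetical and both coins are assumed to be fair.

```r
# All equally likely outcomes of tossing two fair coins
tosses <- expand.grid(first = c("H", "T"), second = c("H", "T"),
                      stringsAsFactors = FALSE)

# Event A: both coins show tails
is_tt <- tosses$first == "T" & tosses$second == "T"

p_tt <- mean(is_tt)              # share of outcomes that belong to A
p_not_tt_rule <- 1 - p_tt        # complement rule
p_not_tt_count <- mean(!is_tt)   # direct count of the remaining outcomes

print(c(p_tt, p_not_tt_rule, p_not_tt_count))
```

Both routes to the complement give the same value, which is the point of the rule: counting the single outcome TT is easier than counting the three outcomes that are not TT.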
6.2 Independent and Dependent
Explanatory Video
This video explains the difference between independent
events and dependent events in probability. It
also explains how to calculate the probability of two events occurring
simultaneously (\(A\) and \(B\)), called the
Intersection, the formula for which depends on whether
the events are Independent or Dependent.
☀️ 1. Basic Definitions
```r
# Load library
library(knitr)

# Create the data frame
tabel_konsep <- data.frame(
  Concept = c("Event", "Intersection (A∩B)"),
  Brief_Explanation = c(
    "An outcome or a collection of outcomes that can occur in a random experiment.",
    "The probability that event A and event B occur together."
  )
)

# Display the table
kable(tabel_konsep, caption = "Basic Probability Concepts")
```
Basic Probability Concepts
| Concept | Brief_Explanation |
|---|---|
| Event | An outcome or a collection of outcomes that can occur in a random experiment. |
| Intersection (A∩B) | The probability that event A and event B occur together. |
☀️ 2. Independent Events
Two events \(A\) and \(B\) are called independent if the
occurrence of \(A\) does not affect the
probability of \(B\) occurring.
Main Formula: The probability of the intersection is calculated
by multiplying the probabilities of the individual events. \[P(A \cap B) = P(A) \times P(B)\]
Characteristics & Examples:
- Constant Probability: The probability of the next event remains the same, as nothing changes in the sample space.
- Sampling Type: Usually occurs with sampling with replacement.
- Example: Tossing a coin twice, or drawing a marble and then replacing it before the second draw.
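The multiplication rule for independent events can be verified by enumeration. The R sketch below uses the coin-tossed-twice example; the event definitions (A = first toss is heads, B = second toss is heads) are assumptions made for illustration.

```r
# Outcomes of tossing a fair coin twice (all four are equally likely)
tosses <- expand.grid(first = c("H", "T"), second = c("H", "T"),
                      stringsAsFactors = FALSE)

a <- tosses$first == "H"    # event A: first toss is heads
b <- tosses$second == "H"   # event B: second toss is heads

p_a <- mean(a)
p_b <- mean(b)
p_a_and_b <- mean(a & b)

# For independent events the intersection equals the product
independent <- isTRUE(all.equal(p_a_and_b, p_a * p_b))

print(c(p_a, p_b, p_a_and_b))
print(independent)
```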
☀️ 3. Dependent Events
Two events \(A\) and \(B\) are called dependent if the occurrence
of \(A\) changes the probability of
\(B\) occurring.
Key Concept: Conditional Probability. For dependent events, we use the concept of Conditional Probability, symbolized as \(P(B \mid A)\).
- \(P(B \mid A)\) is the probability that event \(B\) occurs, given that event \(A\) has already occurred.
Main Formula: The probability of the intersection is calculated by multiplying the probability of the first event by the probability of the second event after the first one has occurred. \[P(A \cap B) = P(A) \times P(B \mid A)\]
Characteristics & Examples:
- Changing Probability: The probability of \(B\) changes because the composition of the sample space has changed after \(A\) occurred.
- Sampling Type: Usually occurs with sampling without replacement.
- Example: Drawing a card from a deck without replacing it, or drawing a marble and not replacing it.
☀️ 4. Example Application
Situation: A jar contains 4 Red and 6 Blue marbles (Total 10). You
want to draw 2 Red marbles consecutively.
```r
library(knitr)

# Create the data frame
tabel_kejadian <- data.frame(
  Event_Type = c("Independent", "Dependent"),
  Condition = c("Sampling with replacement", "Sampling without replacement"),
  Calculation = c(
    "$$P(R1 \\cap R2) = P(R1) \\times P(R2) = \\frac{4}{10} \\times \\frac{4}{10} = \\frac{16}{100} = 0.16$$",
    "$$P(R1 \\cap R2) = P(R1) \\times P(R2 \\mid R1) = \\frac{4}{10} \\times \\frac{3}{9} = \\frac{12}{90} \\approx 0.133$$"
  )
)

# Display the table
kable(tabel_kejadian, caption = "Probability Calculation for Independent and Dependent Events", escape = FALSE)
```
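Probability Calculation for Independent and Dependent Events
| Event_Type | Condition | Calculation |
|---|---|---|
| Independent | Sampling with replacement | \(P(R1 \cap R2) = P(R1) \times P(R2) = \frac{4}{10} \times \frac{4}{10} = \frac{16}{100} = 0.16\) |
| Dependent | Sampling without replacement | \(P(R1 \cap R2) = P(R1) \times P(R2 \mid R1) = \frac{4}{10} \times \frac{3}{9} = \frac{12}{90} \approx 0.133\) |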
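The two marble calculations differ only in the second factor. A minimal R sketch of the arithmetic, with hypothetical variable names, is:

```r
red <- 4      # red marbles in the jar
total <- 10   # all marbles in the jar

# With replacement: the jar is unchanged, so the two draws are independent
p_with_replacement <- (red / total) * (red / total)

# Without replacement: one red marble is gone, so P(R2 | R1) = 3/9
p_r2_given_r1 <- (red - 1) / (total - 1)
p_without_replacement <- (red / total) * p_r2_given_r1

print(round(c(p_with_replacement, p_without_replacement), 3))
```

Removing a red marble lowers the chance of a second red, which is why the dependent case gives the smaller probability.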
6.3 Union of Events
This video explains how to calculate the probability of “A or B” or “A and B” when dealing with two (or more) events. Key terms:
Union: means “A or B (or both)”. Written as: \(A \cup B\)
Intersection: means “A and B occur together”. Written as: \(A \cap B\)
Because there can be overlap (the area where two events occur
simultaneously / intersection) between A and B, we must avoid
double-counting the intersection when calculating the union probability.
🚀 1. Basic Definitions in Probability (Review)
Sample Space (\(S\)):
The Sample Space is the set of all possible outcomes of a random
experiment.
Example from the Video: Rolling two 6-sided dice; the total sample
space is \(6 \times 6 = 36\) possible
outcomes.
Event:
An event is a subset of the sample space, which is a collection of
outcomes that satisfy a specific condition.
Intersection (\(A \cap
B\)):
\(A \cap B\) is the set of outcomes
included in both \(A\) and \(B\) — meaning both events occur
simultaneously.
This is called the overlap area or duplicate outcomes.
Union (\(A \cup B\)):
\(A \cup B\) is the event that at least one of the events occurs (\(A\) or \(B\) occurs, or both).
🚀 2. Formula for Union Probability — General Addition Rule
We use this formula when the question contains the keyword “or”.
General Addition Rule: \[P(A \cup B) = P(A) + P(B) - P(A \cap B)\]
- \(P(A \cup B)\): The probability that \(A\) or \(B\) occurs.
- \(P(A \cap B)\): The probability of the intersection (\(A\) and \(B\) occur simultaneously).
For Disjoint (Mutually Exclusive) Events: If \(A\) and \(B\) are disjoint, their intersection probability is zero: \[P(A \cap B) = 0\]
Thus, the formula simplifies to: \[P(A \cup B) = P(A) + P(B)\]
🚀 3. Why Must We Subtract the Intersection (\(P(A \cap B)\))?
Objective: The union formula requires the term \(- P(A \cap B)\) because we want to
eliminate duplicate outcomes.
Visual Explanation: When you directly add \(P(A) + P(B)\), the outcomes in the
intersection area (\(A \cap B\)) are
counted twice.
To ensure every outcome in the sample space is counted exactly once,
the intersection probability must be subtracted once.
🚀 4. Example Application
Using the example of rolling two dice:
- A: Event “Getting two even numbers” \(\rightarrow P(A) = \frac{9}{36}\)
- B: Event “Getting at least one 2” \(\rightarrow P(B) = \frac{11}{36}\)
Then:
- Intersection (\(A \cap B\)): Event “Getting two even numbers AND at least one 2”. \[ P(A \cap B) = \frac{5}{36} \quad \text{(taken from the sample space overlap)}\]
- Union (\(A \cup B\)): Probability of getting \(A\) OR \(B\). \[\begin{aligned}P(A \cup B) &= P(A) + P(B) - P(A \cap B) \\ &= \frac{9}{36} + \frac{11}{36} - \frac{5}{36} \\ &= \frac{15}{36} \approx 0.4167\end{aligned}\]
Meaning: the probability of getting “two even or at least one 2” is approximately 41.67%.
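The counts 9, 11 and 5 used in this example can be recovered by enumerating all 36 outcomes. The R sketch below does that and compares the General Addition Rule with a direct count of the union; the variable names are hypothetical.

```r
# All 36 equally likely outcomes of rolling two six-sided dice
dice <- expand.grid(d1 = 1:6, d2 = 1:6)

a <- dice$d1 %% 2 == 0 & dice$d2 %% 2 == 0   # A: two even numbers
b <- dice$d1 == 2 | dice$d2 == 2             # B: at least one 2

p_a <- mean(a)
p_b <- mean(b)
p_a_and_b <- mean(a & b)

# General Addition Rule versus counting the union directly
p_union_rule <- p_a + p_b - p_a_and_b
p_union_count <- mean(a | b)

print(c(sum(a), sum(b), sum(a & b), sum(a | b)))
print(round(c(p_union_rule, p_union_count), 4))
```

Adding `sum(a)` and `sum(b)` gives 20, but only 15 distinct outcomes lie in the union, because the 5 outcomes in the overlap were counted twice.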
6.4 Exclusive and Exhaustive
Explanatory Video
This video explains the difference and definitions of two types of
events in probability theory:
Mutually Exclusive Events — two events that cannot
occur simultaneously.
Exhaustive Events — a collection of events that covers
all possible outcomes of an experiment.
⭐ 1. Exclusive Events (Mutually Exclusive)
Definition: Two events are said to be mutually exclusive if they cannot occur at the same time. This means no outcome appears in both events.
Mathematically: \[A \cap B =
\emptyset\] There is no “overlap”.
Example: On a single die roll:
A = rolling an even number = {2,4,6}
B = rolling an odd number = {1,3,5}
A and B have no intersection → mutually exclusive.
Formula Consequence: If A and B are mutually exclusive: \[P(A \cup B) = P(A) + P(B)\] No need to subtract the overlap because there is no intersection.
⭐ 2. Exhaustive Events
Definition: Events are called exhaustive if together they cover the entire sample space. \[A \cup B = S\] All possible outcomes are included in A or B or both, so \(P(A \cup B) = 1\).
Example: Die roll:
A = even number = {2,4,6}
B = odd number = {1,3,5}
Union of A and B = S = {1,2,3,4,5,6}
→ A & B are exhaustive. Note: They are both exhaustive and
mutually exclusive.
Another example that is exhaustive but not mutually exclusive:
A = number \(\ge\) 3 → {3,4,5,6}
B = number \(\le\) 4 → {1,2,3,4}
The union is the entire sample space, while the outcomes 3 and 4 belong to both events. Exhaustive events are not required to be mutually exclusive.
⭐ 3. Key Differences
Differences Between Mutually Exclusive and Exhaustive Events
| Concept | Mutually_Exclusive | Exhaustive |
|---|---|---|
| Can A and B occur together? | No | Possibly (overlap is allowed but not required) |
| Intersection A ∩ B | Always empty | Can be empty / can be non-empty |
| Union A ∪ B | Not necessarily = S | Must equal the entire sample space |
| Example | Even vs Odd | Even vs Odd, or >3 vs ≤3 |
⭐ 4. Relationship Between the Two
- Two events can be both mutually exclusive and exhaustive (example: even vs. odd).
- They can be exhaustive but not exclusive (example: A = number \(\ge 3\), B = number \(\le 4\)).
- They can be exclusive but not exhaustive (example: A = 1, B = 6 on a die → they don’t cover the whole 1–6).
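The three combinations can be checked with set operations. In the R sketch below, `is_mutually_exclusive` and `is_exhaustive` are hypothetical helper names introduced only for illustration; they rely on base R's `intersect`, `union` and `setequal`.

```r
sample_space <- 1:6

# Mutually exclusive: the two events share no outcome
is_mutually_exclusive <- function(a, b) length(intersect(a, b)) == 0

# Exhaustive: the two events together cover the whole sample space
is_exhaustive <- function(a, b, s) setequal(union(a, b), s)

# The three example pairs from the text
cases <- list(
  even_vs_odd = list(a = c(2, 4, 6), b = c(1, 3, 5)),
  ge3_vs_le4  = list(a = 3:6, b = 1:4),
  one_vs_six  = list(a = 1, b = 6)
)

for (name in names(cases)) {
  a <- cases[[name]]$a
  b <- cases[[name]]$b
  cat(name, ": exclusive =", is_mutually_exclusive(a, b),
      ", exhaustive =", is_exhaustive(a, b, sample_space), "\n")
}
```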
6.5 Binomial Experiment
This video explains the basic concept of a Binomial Experiment and how
to calculate probabilities using the Binomial Formula. The core concept
of the binomial probability distribution is that it is a probability
distribution for an experiment that is repeated multiple times and
yields only two possible outcomes (as indicated by the prefix “bi-”),
which are: Success or Failure.
🪐 1. Binomial Experiment
A Binomial Experiment is a type of statistical trial that satisfies four key characteristics:
- There is a fixed number of trials (\(n\)).
- Each trial has only two outcomes: success or failure.
- The probability of success (\(p\)) is constant for every trial.
- The trials are independent (they do not influence each other).
Case examples:
- Coin Toss: Calculating the probability of getting exactly one head (success) in three coin tosses. Since all 4 conditions are met (fixed number of trials \(n=3\), only two outcomes, constant probability \(p=0.5\), and independent trials), this is a binomial experiment.
- Marble Draw: Calculating the probability of getting exactly two green marbles (success) in five draws with replacement. Replacement ensures that the probability of success remains constant in every trial, thus satisfying the binomial condition.
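For the coin-toss case, the probability of exactly one head in three tosses can be found by listing every sequence and, independently, with R's built-in `dbinom`. This is a sketch under the assumption of a fair coin (\(p = 0.5\)), as stated in the example.

```r
# Enumerate all 2^3 equally likely sequences of three fair coin tosses
tosses <- expand.grid(t1 = c("H", "T"), t2 = c("H", "T"), t3 = c("H", "T"),
                      stringsAsFactors = FALSE)

# Count the heads in each sequence
heads <- rowSums(tosses == "H")

# Probability of exactly one head by direct counting
p_count <- mean(heads == 1)

# The same probability from the binomial probability function
p_dbinom <- dbinom(1, size = 3, prob = 0.5)

print(c(p_count, p_dbinom))
```

Three of the eight sequences (HTT, THT, TTH) contain exactly one head, which is where the value 3/8 comes from.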
🪐 2. Conditions for a Binomial Experiment
A trial is called binomial if it meets four conditions:
✔ Fixed number of trials: The number of trials \(n\) is predetermined.
✔ Two possible outcomes: Each trial yields only success or failure.
✔ Constant probability: The probability of success \(p\) is the same for every trial.
✔ Independent trials: One trial does not affect the others.
🪐 3. Important Notation in Binomial Distribution
\(n\) = number of trials
\(x\) = number of successes
\(p\) = probability of success
\(q = 1 - p\) = probability of
failure
\(P(X = x)\) = probability of
getting exactly \(x\) successes
🪐 4. Binomial Distribution Formula
The binomial probability formula: \[P(k) = {n \choose k} \cdot p^k \cdot (1-p)^{\,n-k}\]
Here \(k\) plays the role of \(x\) in the notation above, so \(P(k) = P(X = k)\).
Where:
- \(P(k)\) : the probability of getting exactly \(k\) successes.
- \({n \choose k}\) : the combination, which is the number of ways to get \(k\) successes from \(n\) trials.
- \(n\) : total number of trials.
- \(k\) : desired number of successes.
- \(p\) : probability of success on a single trial.
- \((1-p)\) : probability of failure on a single trial.
🪐 5. Binomial Example Problem
Using the marble draw case to demonstrate the formula:
Case: If 5 marbles are drawn with replacement, what is the probability of getting exactly 2 green marbles?
- \(n = 5\) (Trials)
- \(k = 2\) (Successes)
- \(p = 0.2\) (Probability of green, from \(\frac{2}{10}\))
- \(1-p = 0.8\) (Probability of non-green)
Thus:
\[P(k=2) = {5 \choose 2}(0.2)^2(1-0.2)^{5-2}\]
\[P(k=2) = 10 \times (0.04) \times (0.8)^3\]
\[P(k=2) = 10 \times 0.04 \times 0.512\]
\[P(k=2) = 0.2048\]
So, the probability is 0.2048.
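The hand calculation can be reproduced in R, either term by term with `choose` or in one call to `dbinom`. The sketch below uses the same \(n\), \(k\) and \(p\) as the example; the variable names are hypothetical.

```r
n <- 5     # number of draws
k <- 2     # desired number of green marbles
p <- 0.2   # probability of green on one draw

# Binomial formula written out term by term
p_formula <- choose(n, k) * p^k * (1 - p)^(n - k)

# The same value from R's binomial probability function
p_builtin <- dbinom(k, size = n, prob = p)

print(c(choose(n, k), p_formula, p_builtin))
```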
🪐 6. Why is it called “Binomial”?
The name Binomial comes from the fact that this distribution always
has two possible outcomes in every trial, indicated by the prefix bi
(meaning two).
6.6 Binomial Distribution
Explanatory Video
This video continues the discussion on the Binomial Distribution,
focusing on how to visualize the data and understand its distributional
properties, and visually explaining how the shape of the Binomial
Distribution graph is influenced by changes in the number of trials
(\(n\)) and the probability of success (\(p\)).
🌙 1. Formula and Basic Visualization
Objective: The Binomial Distribution is used to calculate the
probability of getting a specific number of successes (\(k\)) from a fixed number of trials (\(n\)).
Example: Tossing a coin 2 times (\(n=2\)) with a probability of Head (\(p=0.5\)). The probabilities for getting 0,
1, or 2 Heads are calculated using the Binomial Formula.
Graph: These probabilities are visualized using a bar chart
(histogram), where the X-axis is the number of successes (\(k\)) and the Y-axis is the probability
(\(P(k)\)).
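A bar chart of this kind can be drawn with base R. The sketch below computes the three probabilities for \(n=2\), \(p=0.5\) with `dbinom` and passes them to `barplot`; the axis labels and title are illustrative choices, not taken from the video.

```r
n <- 2     # number of coin tosses
p <- 0.5   # probability of Head
k <- 0:n   # possible numbers of Heads

# Probability of each number of successes
probs <- dbinom(k, size = n, prob = p)
names(probs) <- k
print(probs)

# One bar per value of k, bar height equal to P(k)
barplot(probs,
        xlab = "Number of successes (k)",
        ylab = "P(k)",
        main = "Binomial distribution, n = 2, p = 0.5")
```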
🌙 2. Important Parameters
Formulas for parameters to measure the Binomial Distribution:
```r
# Load library for kable
library(knitr)

# Create the table data frame
tabel_binomial <- data.frame(
  Parameter = c("Mean (μ)", "Variance (σ²)", "Standard Deviation (σ)"),
  Formula = c("μ = n · p", "σ² = n · p · (1 - p)", "σ = √(n · p · (1 - p))"),
  Description = c(
    "The expected average number of successes.",
    "A measure of the spread of the data from the mean.",
    "The square root of the variance."
  )
)

# Display the table
kable(tabel_binomial, caption = "Table of Binomial Distribution Parameters")
```
Table of Binomial Distribution Parameters
| Parameter | Formula | Description |
|---|---|---|
| Mean (μ) | μ = n · p | The expected average number of successes. |
| Variance (σ²) | σ² = n · p · (1 - p) | A measure of the spread of the data from the mean. |
| Standard Deviation (σ) | σ = √(n · p · (1 - p)) | The square root of the variance. |
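The three formulas can be cross-checked against the definitions of mean and variance applied to the full probability table. In the R sketch below, \(n = 10\) and \(p = 0.3\) are assumed values chosen only for illustration.

```r
n <- 10    # assumed number of trials
p <- 0.3   # assumed probability of success

# Closed-form parameters from the table
mu <- n * p
sigma2 <- n * p * (1 - p)
sigma <- sqrt(sigma2)

# Cross-check: apply the definitions to every possible value of k
k <- 0:n
pk <- dbinom(k, size = n, prob = p)
mu_check <- sum(k * pk)
sigma2_check <- sum((k - mu_check)^2 * pk)

print(c(mu = mu, sigma2 = sigma2, sigma = sigma))
print(c(mu_check = mu_check, sigma2_check = sigma2_check))
```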
🌙 3. Influence of Probability of Success (\(p\)) on Distribution Shape
The probability of success (\(p\))
controls the shape of the distribution curve. It determines whether the
graph will be symmetrical or skewed:
✨ Shape of the Binomial Distribution ✨
| Value_p | Distribution_Shape | Description |
|---|---|---|
| p = 0.5 (50%) | Symmetrical | Perfectly symmetrical distribution; its peak is in the middle, resembling the Normal Distribution. |
| p < 0.5 | Skewed Right | Low probability of success, so most outcomes pile up at a small number of successes (near 0). |
| p > 0.5 | Skewed Left | High probability of success, so most outcomes pile up at a large number of successes (near n). |
Case \(p < 0.5\) (Skewed Right)
Visual: The graph on the left.
Observation: The histogram peak is near the value 0 (small number of
successes). The tail of the curve extends to the right towards larger
number of successes. This happens because the probability of failure
(\(1-p\)) is greater than the
probability of success (\(p\)), making
the most likely outcome a large number of failures.
Case \(p = 0.5\) (Symmetrical)
Visual: The graph in the center.
Observation: The histogram is perfectly symmetrical, resembling a
bell shape. The peak is exactly in the middle (at \(\mu = n \cdot p\)). This happens because
the probability of success and failure are equal (\(50\%\)), balancing the probability
distribution.
Case \(p > 0.5\) (Skewed Left)
Visual: The graph on the right.
Observation: The histogram peak shifts to the right, piling up near
the value \(n\) (large number of
successes). The tail of the curve extends to the left towards the value
0. This happens because the probability of success (\(p\)) is greater than the probability of
failure, making the most likely outcome a high number of successes.
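The three shapes can also be described numerically. The R sketch below assumes \(n = 10\) and the values \(p = 0.2, 0.5, 0.8\) (illustrative choices). For each it reports where the peak sits and the standardized third central moment, a common skewness measure that is positive for a right-skewed distribution and negative for a left-skewed one. `describe_shape` is a hypothetical helper name.

```r
n <- 10   # assumed number of trials
k <- 0:n

describe_shape <- function(p) {
  pk <- dbinom(k, size = n, prob = p)
  mu <- sum(k * pk)
  sigma <- sqrt(sum((k - mu)^2 * pk))
  # Standardized third central moment: > 0 right-skewed, < 0 left-skewed
  skew <- sum((k - mu)^3 * pk) / sigma^3
  c(p = p, peak_at = k[which.max(pk)], skewness = round(skew, 3))
}

# One row per value of p
shapes <- t(sapply(c(0.2, 0.5, 0.8), describe_shape))
print(shapes)
```

The peak moves from small \(k\) toward \(n\) as \(p\) grows, and the sign of the skewness flips at \(p = 0.5\).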
🌙 4. Normal Approximation to Binomial
As the number of trials (\(n\)) increases, the shape of the Binomial Distribution becomes increasingly similar to the bell curve of the Normal Distribution.
- Approximation Condition: The Normal Approximation to Binomial can be used (to simplify calculations) if both of the following conditions are met:
1. \(n \cdot p \ge 10\)
2. \(n \cdot (1-p) \ge 10\)
Small \(n\) (e.g., \(n=5, p=0.5\))
Representation: The top graph.
Observation: The histogram (blue bars) looks discrete and the
difference in height between bars is significant. The Normal curve (red
line) looks very wide and is not suitable for modeling this Binomial
histogram.
Validity: \(n \cdot p = 5 \cdot 0.5 =
2.5\). Since \(2.5 < 10\),
the approximation is invalid.
Medium \(n\) (e.g., \(n=20, p=0.5\))
Representation: The middle graph.
Observation: The histogram is starting to become symmetrical and
shows a bell shape. The Normal curve is starting to encompass the
histogram bars better.
Validity: \(n \cdot p = 20 \cdot 0.5 =
10\). Since \(10 \ge 10\), the
approximation condition is beginning to be met. This approximation can
start to be used, although accuracy may need improvement (usually
requiring continuity correction).
Large \(n\) (e.g., \(n=50, p=0.5\))
Representation: The bottom graph.
Observation: The histogram becomes very dense and smooth. The
midpoint of each histogram bar almost falls perfectly on top of the
Normal Distribution bell curve.
Validity: \(n \cdot p = 50 \cdot 0.5 =
25\). Since \(25 \ge 10\), the
approximation is highly valid and yields very accurate results. This is
the perfect visualization of the Normal Approximation to Binomial.
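The validity checks for \(n = 5\), \(20\) and \(50\) can be scripted, together with a rough measure of how far the Normal curve is from the exact Binomial probabilities. In the R sketch below, `normal_approx_ok` and `max_cdf_error` are hypothetical helper names. The error measure (largest gap between the two cumulative distributions, with a continuity correction of 0.5) is one reasonable choice and not necessarily the comparison shown in the video.

```r
# Rule of thumb from the text: both n*p and n*(1-p) must be at least 10
normal_approx_ok <- function(n, p) n * p >= 10 && n * (1 - p) >= 10

# Largest gap between the exact binomial CDF and the Normal CDF,
# using a continuity correction of 0.5
max_cdf_error <- function(n, p) {
  k <- 0:n
  exact <- pbinom(k, size = n, prob = p)
  approx <- pnorm(k + 0.5, mean = n * p, sd = sqrt(n * p * (1 - p)))
  max(abs(exact - approx))
}

for (n in c(5, 20, 50)) {
  cat("n =", n,
      " condition met:", normal_approx_ok(n, 0.5),
      " max CDF error:", signif(max_cdf_error(n, 0.5), 3), "\n")
}
```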
Conclusion
The material in these six videos is the backbone that strengthens our
understanding of uncertainty. In Data Science, inference and prediction
rest on probability. By mastering Probability and the Binomial Distribution, we
gain the mathematical foundation needed to build, test, and trust the
prediction models and conclusions we make.
📋 1. Summary of 6 Key Topics
1. Fundamental Concepts: Introduction to the basics of probability (sample space, events, chance/likelihood).
2. Independent & Dependent Events: Understanding whether the outcome of one event affects the probability of another.
3. Union of Events: Rules for calculating the probability of one or more events occurring (\(P(A \text{ or } B)\)).
4. Exclusive & Exhaustive Events: Classification of relationships between events. For example, mutually exclusive events mean both cannot occur simultaneously.
5. Binomial Experiment: Defining the conditions for a trial to have its probability calculated using the Binomial formula (e.g., only two outcomes: success/failure, and trials are repeated independently).
6. Binomial Distribution: Application tool for calculating the probability of obtaining a specific number of successes in a fixed number of trials that satisfy the Binomial conditions.
📋 2. Interconnection (Logical Progression)
The interrelation among these six videos is a progression from the
most basic rules to the use of a specific model tool:
1. Logical Foundation (V1-V4): Videos 1 to 4 form the philosophical and mathematical basis of probability. We must understand what probability is, how events influence each other (V2 - Independent and Dependent), and how to calculate the probability of a union (V3 - Union) before we can go further. The concepts in V4 (Exclusive and Exhaustive) help us classify the relationship between these events.
2. Moving Towards Application (V5): The concepts from V1-V4 are used to establish the conditions of a trial. Video 5 (Binomial Experiment) teaches us how to identify real-world situations that satisfy all those basic probability rules (e.g., the outcome of each coin toss is independent, consistent with V2). This is the stage of determining whether the Binomial model can be used.
3. Using the Tool (V6): Once we are confident that the situation meets the conditions (V5), we then use the Binomial Distribution (V6) to calculate specific probabilities (e.g., what is the probability of getting 7 successes out of 10 trials). This calculation depends on the validity of the conditions learned in V1-V5. (Basic Probability Rules (V1-V4) validate the Trial Conditions (V5), which then allows us to use the Model (V6) for prediction.)
📋 3. Impact of This Material on Statistics Learning in Data Science
A deep understanding of Probability Theory, especially the Binomial
Distribution, is crucial because it is the foundation for Statistical
Inference and Binary Data Modeling (success/failure outcomes) in Data
Science.
✨ Positive Impact of the Binomial Distribution ✨
| Positive_Impact | Easy_Explanation |
|---|---|
| Understanding A/B Testing | The Binomial Distribution is the primary tool in A/B Testing… |
| Foundation for Classification Models | Many Data Science problems are binary classification… |
| Validation and P-Value | This concept is closely related to hypothesis testing and p-value… |
| Basis for Sampling | Independent and Dependent concepts are crucial for valid samples… |
Spiegel, Murray R., and Larry J. Stephens. Schaum’s Outline of
Theory and Problems of Statistics. [https://anyflip.com/ljjmh/cgnr/basic] (Added as a
supplementary reference for problems and theory.)