Undergraduate Student in Data Science at Institut Teknologi Sains Bandung
A digital learning platform claims that the average daily study time of its users is 120 minutes. Based on historical records, the population standard deviation is known to be 15 minutes.
A random sample of 64 users shows an average study time of 116 minutes.
\[ \begin{eqnarray*} \mu_0 &=& 120 \\ \sigma &=& 15 \\ n &=& 64 \\ \bar{x} &=& 116 \end{eqnarray*} \]
1. Formulate the Null Hypothesis (H₀) and Alternative Hypothesis (H₁).
Null Hypothesis (H₀): \[H_0:\mu=120\] This hypothesis represents the platform’s claim that the average daily study time of users is exactly 120 minutes.
Alternative Hypothesis (H₁): \[H_1:\mu<120\] This hypothesis states that the true average daily study time of users is less than 120 minutes.
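2. Identify the appropriate statistical test and justify your choice.
The appropriate statistical test for this problem is a One-Sample Z-Test because the population standard deviation is known (\(\sigma = 15\)), the sample is large (\(n = 64\)), and a single sample mean is compared against a claimed population mean.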
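3. Compute the statistic and p-value using \(\alpha = 0.05\).
\[Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}\]
Where \(\bar{x}\) is the sample mean, \(\mu_0\) is the claimed population mean, \(\sigma\) is the population standard deviation, and \(n\) is the sample size.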
Standard Error (SE):
\[SE = \frac{\sigma}{\sqrt{n}} = \frac{15}{\sqrt{64}} = \frac{15}{8} = 1.875\]
Z-statistic:
\[Z = \frac{116 - 120}{1.875} = \frac{-4}{1.875} = -2.13\]
Since \(H_1: \mu < 120\) is a left-tailed alternative, the p-value is the probability in the lower tail only.
\[P(Z \leq -2.13) \approx 0.0166\]
\[\text{P-value} \approx 0.0166\]
4. State the Statistical Decision
Decision Rule:
If p-value \(\leq \alpha\), reject \(H_0\); if p-value \(> \alpha\), fail to reject \(H_0\).
Decision:
Given:
P-value = 0.0166
Significance level \(\alpha = 0.05\)
Since p-value (0.0166) \(< \alpha\) (0.05), REJECT the null hypothesis (\(H_0\)).
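The calculation can be checked with a short Python sketch. It assumes the left-tailed alternative \(H_1: \mu < 120\) stated in step 1 and uses only the standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the case description
mu_0 = 120    # claimed mean study time (minutes)
sigma = 15    # known population standard deviation
n = 64        # sample size
x_bar = 116   # sample mean

standard_error = sigma / sqrt(n)
z = (x_bar - mu_0) / standard_error

# H1: mu < 120 is left-tailed, so the p-value is P(Z <= z)
p_value = NormalDist().cdf(z)

alpha = 0.05
decision = "reject H0" if p_value <= alpha else "fail to reject H0"
print(f"SE = {standard_error:.3f}, z = {z:.2f}, p = {p_value:.4f} -> {decision}")
```

Because the code keeps the unrounded value of \(Z\), its p-value can differ from the table lookup at \(-2.13\) in the fourth decimal place.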
5. Interpret the Result in a Business Analytics Context
At the 5% significance level, there is sufficient statistical evidence to reject the platform’s claim that the average daily study time is 120 minutes. The sample mean (116 minutes) is statistically significantly lower than the claimed 120 minutes, which suggests that the true average study time is below the claim.
Business Implications:
1. Platform Performance: The platform is not meeting its claimed engagement target. Users are studying approximately 4 minutes less per day than advertised.
2. Marketing Concerns: If the 120-minute claim is used in marketing materials, it may be misleading to potential customers and could lead to reputation issues.
3. User Engagement: The lower study time could indicate:
The platform should address the gap between claimed and actual study time by improving user engagement and adjusting its communication strategy to maintain credibility with users and stakeholders.
A UX Research Team investigates whether the average task completion time of a new application differs from 10 minutes.
The following data are collected from 10 users:
\[ 9.2,\; 10.5,\; 9.8,\; 10.1,\; 9.6,\; 10.3,\; 9.9,\; 9.7,\; 10.0,\; 9.5 \]
Question 1
1. Define H₀ and H₁ (two-tailed).
Null Hypothesis (H₀):
\[ \begin{array}{l} H_0: \mu = 10 \text{ minutes} \end{array} \] The null hypothesis states that the average task completion time is exactly 10 minutes.
Alternative Hypothesis (H₁):
\[ \begin{array}{l} H_1: \mu \neq 10 \text{ minutes} \end{array} \] The alternative hypothesis states that the true mean is different from 10 minutes.
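Question 2
2. Determine the Appropriate Hypothesis Test
The One-Sample T-Test is the most appropriate method for this analysis because the population standard deviation is unknown and must be estimated from the sample, the sample is small (\(n = 10\)), and a single sample mean is compared against a hypothesized value of 10 minutes.
3. Calculate the t-statistic and p-value at \(\alpha = 0.05\)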
Sample data:
\[9.2,\; 10.5,\; 9.8,\; 10.1,\; 9.6,\; 10.3,\; 9.9,\; 9.7,\; 10.0,\; 9.5\]
Sample Mean (\(\bar{x}\)):
\[ \begin{array}{rcl} \bar{x} &=& \frac{\sum x_i}{n} = \frac{9.2 + 10.5 + 9.8 + 10.1 + 9.6 + 10.3 + 9.9 + 9.7 + 10.0 + 9.5}{10} \\ \bar{x} &=& \frac{98.6}{10} = 9.86 \text{ minutes} \end{array} \]
Standard Deviation (\(s\)):
\[ \begin{array}{rcl} \sum (x_i - \bar{x})^2 &=& (9.2-9.86)^2 + (10.5-9.86)^2 + \ldots + (9.5-9.86)^2 \\ &=& (-0.66)^2 + (0.64)^2 + (-0.06)^2 + (0.24)^2 + (-0.26)^2 \\ & & + (0.44)^2 + (0.04)^2 + (-0.16)^2 + (0.14)^2 + (-0.36)^2 \\ &=& 0.4356 + 0.4096 + 0.0036 + 0.0576 + 0.0676 \\ & & + 0.1936 + 0.0016 + 0.0256 + 0.0196 + 0.1296 \\ \sum (x_i - \bar{x})^2 &=& 1.344 \end{array} \]
Sample Variance:
\[ \begin{array}{rcl} s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} = \frac{1.344}{9} = 0.1493 \end{array} \]
Sample Standard Deviation:
\[ \begin{array}{rcl} s = \sqrt{0.1493} = 0.3864 \text{ minutes} \end{array} \]
Standard Error (SE):
\[ \begin{array}{rcl} SE = \frac{s}{\sqrt{n}} = \frac{0.3864}{\sqrt{10}} = \frac{0.3864}{3.162} = 0.1222 \end{array} \]
T-statistic:
\[ \begin{array}{rcl} t = \frac{9.86 - 10}{0.1222} = \frac{-0.14}{0.1222} = -1.146 \end{array} \]
Degrees of Freedom: \[df = n - 1 = 10 - 1 = 9\]
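The following Python sketch reproduces the t-statistic and obtains the two-tailed p-value. The standard library has no Student's t distribution, so the helper `two_tailed_p` integrates the t density numerically with Simpson's rule; that helper and all variable names are illustrative assumptions, not part of the original solution.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

times = [9.2, 10.5, 9.8, 10.1, 9.6, 10.3, 9.9, 9.7, 10.0, 9.5]
mu_0 = 10.0  # hypothesized mean completion time (minutes)

n = len(times)
x_bar = mean(times)
s = stdev(times)                      # sample standard deviation (n - 1 denominator)
t_stat = (x_bar - mu_0) / (s / sqrt(n))
df = n - 1


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3           # P(0 <= T <= |t|)
    return 1 - 2 * central


p_value = two_tailed_p(t_stat, df)
print(f"mean = {x_bar:.2f}, s = {s:.4f}, t = {t_stat:.3f}, df = {df}, p = {p_value:.3f}")
```

A statistics package would normally supply the t distribution directly; the manual integration is used here only to keep the sketch dependency-free.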
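4. Make a Statistical Decision
Decision Rule:
If p-value \(\leq \alpha\), reject \(H_0\); if p-value \(> \alpha\), fail to reject \(H_0\).
Decision:
Given:
P-value = 0.281 (two-tailed, for \(t = -1.146\) with \(df = 9\))
Significance level \(\alpha = 0.05\)
Since p-value (0.281) \(> \alpha\) (0.05), FAIL TO REJECT the null hypothesis (\(H_0\)).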
Question 5
5. Explain How Sample Size Affects Inferential Reliability
Sample size plays a crucial role in the reliability of statistical inference.
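When the sample size is small, the standard error \(s/\sqrt{n}\) is large, so the estimate of the mean is imprecise and the test has little power to detect a real difference.
As sample size increases, the standard error shrinks, the estimate becomes more precise, and the test gains power.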
In this case, the small sample size (\(n = 10\)) limits the strength of the conclusion. Although the sample mean is slightly below 10 minutes, the data do not provide strong enough evidence to conclude a real difference.
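To make the effect of \(n\) concrete, the sketch below recomputes the standard error for the observed sample and for two hypothetical larger samples. It assumes the sample standard deviation stays at the observed \(s = 0.3864\); the larger sample sizes are invented purely for illustration.

```python
from math import sqrt


def standard_error(s, n):
    """Standard error of the mean for sample standard deviation s and size n."""
    return s / sqrt(n)


s = 0.3864  # sample standard deviation observed for n = 10

# n = 10 is the actual sample; 40 and 160 are hypothetical sizes
for n in (10, 40, 160):
    print(f"n = {n:>3}: SE = {standard_error(s, n):.4f}")
```

Each fourfold increase in \(n\) halves the standard error, which is why small samples support only weak conclusions.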
A product analytics team conducts an A/B test to compare the average session duration (minutes) between two versions of a landing page.
| Version | Sample Size (n) | Mean | Standard Deviation |
|---|---|---|---|
| A | 25 | 4.8 | 1.2 |
| B | 25 | 5.4 | 1.4 |
Question 1
1. Formulate the null and alternative hypotheses.
Null Hypothesis (H₀): \[H_0:\mu_A=\mu_B\]
There is no difference in the average session duration between Version A and Version B.
Alternative Hypothesis (H₁): \[H_1:\mu_A\neq\mu_B\]
There is a difference in the average session duration between the two landing page versions.
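Question 2
2. Identify the type of t-test required.
The appropriate statistical test for this scenario is an Independent Two-Sample t-Test because the two landing page versions are shown to separate, independent groups of users, the comparison is between two sample means, and the population standard deviations are unknown.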
Question 3
3. Compute the test statistic and p-value.
The test statistic for a two-sample t-test (Welch’s version) is given by: \[t = \frac{\bar{x}_A - \bar{x}_B} {\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}}\]
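Where \(\bar{x}_A, \bar{x}_B\) are the sample means, \(s_A, s_B\) are the sample standard deviations, and \(n_A, n_B\) are the sample sizes of Versions A and B.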
Standard Error (SE):
\[\begin{array}{rcl} SE &=& \sqrt{\dfrac{s_A^2}{n_A} + \dfrac{s_B^2}{n_B}} \\[6pt] &=& \sqrt{\dfrac{1.2^2}{25} + \dfrac{1.4^2}{25}} \\[6pt] &=& \sqrt{\dfrac{1.44}{25} + \dfrac{1.96}{25}} \\[6pt] &=& \sqrt{0.0576 + 0.0784} \\[6pt] &=& \sqrt{0.136} \\[6pt] &\approx& 0.369 \end{array}\]
t-statistic: \[\begin{array}{rcl} t &=& \dfrac{4.8 - 5.4}{0.369} \\[6pt] &=& \dfrac{-0.6}{0.369} \\[6pt] &\approx& -1.63 \end{array}\]
Degrees of freedom (df) using Welch’s approximation:
\[ \begin{array}{rcl} df = \frac{\left(\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}\right)^2}{\frac{\left(\frac{s_A^2}{n_A}\right)^2}{n_A - 1} + \frac{\left(\frac{s_B^2}{n_B}\right)^2}{n_B - 1}} \end{array} \]
Calculate numerator:
\[ \begin{array}{rcl} \left(\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}\right)^2 = (0.0576 + 0.0784)^2 = (0.136)^2 = 0.0185 \end{array} \]
Calculate denominator:
\[ \begin{array}{rcl} \frac{(0.0576)^2}{24} + \frac{(0.0784)^2}{24} &=& \frac{0.00332}{24} + \frac{0.00615}{24} \\ &=& 0.000138 + 0.000256 \\ &=& 0.000394 \end{array} \]
Calculate df:
\[ \begin{array}{rcl} df = \frac{0.0185}{0.000394} = 46.95 \approx 47 \end{array} \]
Calculate P-Value
For a two-tailed test with \(|t| = 1.627\) and \(df = 47\), the p-value is approximately 0.111.
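The Welch calculation can be reproduced from the summary table with the Python sketch below. As an assumption made for this sketch, the p-value comes from a small helper that integrates the t density with Simpson's rule, since the standard library has no t distribution; the helper and variable names are illustrative.

```python
from math import gamma, pi, sqrt

# Summary statistics from the A/B test table
n_a, mean_a, sd_a = 25, 4.8, 1.2
n_b, mean_b, sd_b = 25, 5.4, 1.4

var_a = sd_a**2 / n_a   # s_A^2 / n_A
var_b = sd_b**2 / n_b   # s_B^2 / n_B
se = sqrt(var_a + var_b)
t_stat = (mean_a - mean_b) / se

# Welch-Satterthwaite degrees of freedom (not rounded)
df = (var_a + var_b) ** 2 / (var_a**2 / (n_a - 1) + var_b**2 / (n_b - 1))


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on [0, |t|] (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


p_value = two_tailed_p(t_stat, df)
print(f"SE = {se:.3f}, t = {t_stat:.3f}, df = {df:.1f}, p = {p_value:.3f}")
```

The code keeps the unrounded degrees of freedom (about 46.9), so its p-value may differ slightly in the third decimal from a table lookup at \(df = 47\).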
Question 4
4. Draw a statistical conclusion at \(\alpha = 0.05\).
Decision Rule:
If p-value \(\leq \alpha\), reject \(H_0\)
If p-value \(>\alpha\), fail to reject \(H_0\)
Decision:
Given:
P-value = 0.111
Significance level \(\alpha = 0.05\)
Since p-value (0.111) \(>\alpha\) (0.05), FAIL TO REJECT the null hypothesis (\(H_0\)).
Question 5
5. Interpret the result for product decision-making.
From a product analytics perspective, the results indicate that although Version B shows a higher average session duration (5.4 minutes) compared to Version A (4.8 minutes), this observed difference is not statistically significant.
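This implies that the observed 0.6-minute difference could plausibly be due to sampling variation, so the data do not yet show that Version B outperforms Version A.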
Overall, while Version B appears promising descriptively, the statistical evidence does not yet justify a definitive product change.
An e-commerce company examines whether device type is associated with payment method preference.
| Device / Payment | E-Wallet | Credit Card | Cash on Delivery |
|---|---|---|---|
| Mobile | 120 | 80 | 50 |
| Desktop | 60 | 90 | 40 |
Question 1
1. State the Null Hypothesis (H₀) and Alternative Hypothesis (H₁).
Null Hypothesis (H₀):
\[ \begin{array}{l} H_0: \text{Device type and payment method are independent} \end{array} \]
In other words, the choice of payment method does not depend on the device type used.
Alternative Hypothesis (H₁):
\[ \begin{array}{l} H_1: \text{Device type and payment method are not independent} \end{array} \]
This means there is a relationship between device type and payment method choice.
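Question 2
2. Identify the Appropriate Statistical Test
The Chi-Square Test of Independence is the most appropriate method for this analysis because both variables (device type and payment method) are categorical, the data are frequency counts in a contingency table, and the question is whether the two variables are associated.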
Question 3
3. Compute the Chi-Square statistic (χ²).
The expected frequency for each cell is calculated using:
\[ \begin{array}{rcl} E_{ij} = \frac{(\text{Row Total}_i) \times (\text{Column Total}_j)}{\text{Grand Total}} \end{array} \]
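Expected frequency calculations (row totals: Mobile = 250, Desktop = 190; column totals: E-Wallet = 180, Credit Card = 170, Cash on Delivery = 90; grand total = 440):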
\[ \begin{array}{rcl} E_{\text{Mobile, E-Wallet}} &=& \frac{250 \times 180}{440} = \frac{45000}{440} = 102.27 \\ E_{\text{Mobile, Credit Card}} &=& \frac{250 \times 170}{440} = \frac{42500}{440} = 96.59 \\ E_{\text{Mobile, COD}} &=& \frac{250 \times 90}{440} = \frac{22500}{440} = 51.14 \end{array} \]
\[ \begin{array}{rcl} E_{\text{Desktop, E-Wallet}} &=& \frac{190 \times 180}{440} = \frac{34200}{440} = 77.73 \\ E_{\text{Desktop, Credit Card}} &=& \frac{190 \times 170}{440} = \frac{32300}{440} = 73.41 \\ E_{\text{Desktop, COD}} &=& \frac{190 \times 90}{440} = \frac{17100}{440} = 38.86 \end{array} \]
Expected Frequencies Table:
| Device | E-Wallet | Credit Card | Cash on Delivery |
|---|---|---|---|
| Mobile | 102.27 | 96.59 | 51.14 |
| Desktop | 77.73 | 73.41 | 38.86 |
Verification: All expected frequencies are ≥ 5, so the Chi-Square test is appropriate.
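The expected frequencies and the minimum-count check can be reproduced with a few lines of Python. This is a minimal sketch using plain dictionaries; the variable names are illustrative.

```python
# Observed counts from the device / payment method table
observed = {
    "Mobile": {"E-Wallet": 120, "Credit Card": 80, "Cash on Delivery": 50},
    "Desktop": {"E-Wallet": 60, "Credit Card": 90, "Cash on Delivery": 40},
}

methods = list(next(iter(observed.values())))
row_totals = {device: sum(counts.values()) for device, counts in observed.items()}
col_totals = {m: sum(observed[d][m] for d in observed) for m in methods}
grand_total = sum(row_totals.values())

# E_ij = (row total_i * column total_j) / grand total
expected = {
    d: {m: row_totals[d] * col_totals[m] / grand_total for m in methods}
    for d in observed
}

for device, row in expected.items():
    print(device, {m: round(v, 2) for m, v in row.items()})

# Rule of thumb used in the text: every expected count should be at least 5
min_expected = min(v for row in expected.values() for v in row.values())
print("smallest expected count:", round(min_expected, 2))
```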
The Chi-Square test statistic is calculated using:
\[ \begin{array}{rcl} \chi^2 = \sum \frac{(O_{ij} - E_{ij})^2}{E_{ij}} \end{array} \]
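Where \(O_{ij}\) is the observed frequency and \(E_{ij}\) is the expected frequency in row \(i\), column \(j\). The contribution of each cell is: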
\[ \begin{array}{rcl} \text{Mobile, E-Wallet:} && \frac{(120 - 102.27)^2}{102.27} = \frac{(17.73)^2}{102.27} = \frac{314.35}{102.27} = 3.074 \\ \text{Mobile, Credit Card:} && \frac{(80 - 96.59)^2}{96.59} = \frac{(-16.59)^2}{96.59} = \frac{275.23}{96.59} = 2.850 \\ \text{Mobile, COD:} && \frac{(50 - 51.14)^2}{51.14} = \frac{(-1.14)^2}{51.14} = \frac{1.30}{51.14} = 0.025 \end{array} \]
\[ \begin{array}{rcl} \text{Desktop, E-Wallet:} && \frac{(60 - 77.73)^2}{77.73} = \frac{(-17.73)^2}{77.73} = \frac{314.35}{77.73} = 4.044 \\ \text{Desktop, Credit Card:} && \frac{(90 - 73.41)^2}{73.41} = \frac{(16.59)^2}{73.41} = \frac{275.23}{73.41} = 3.750 \\ \text{Desktop, COD:} && \frac{(40 - 38.86)^2}{38.86} = \frac{(1.14)^2}{38.86} = \frac{1.30}{38.86} = 0.033 \end{array} \]
\[ \begin{array}{rcl} \chi^2 &=& 3.074 + 2.850 + 0.025 + 4.044 + 3.750 + 0.033 \\ \chi^2 &=& 13.776 \end{array} \]
\[ \begin{array}{rcl} df = (r - 1) \times (c - 1) \end{array} \]
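Where \(r = 2\) is the number of rows (device types) and \(c = 3\) is the number of columns (payment methods):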
\[ \begin{array}{rcl} df = (2 - 1) \times (3 - 1) = 1 \times 2 = 2 \end{array} \]
Question 4
4. Determine the p-value at \(\alpha = 0.05\).
Using Chi-Square distribution table or calculator with \(\chi^2 = 13.776\) and \(df = 2\):
\[ \begin{array}{rcl} P(\chi^2 \geq 13.776) \approx 0.001 \end{array} \]
At \(\alpha = 0.05\) and \(df = 2\), the critical value from Chi-Square table:
\[ \begin{array}{rcl} \chi^2_{\text{critical}} = 5.991 \end{array} \]
Since \(\chi^2 = 13.776 > 5.991\), we reject \(H_0\).
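The statistic, p-value, and critical value can be verified with the sketch below. It relies on a property specific to \(df = 2\): the chi-square upper-tail probability has the closed form \(e^{-x/2}\), so no statistics library is needed. For other degrees of freedom this shortcut does not apply.

```python
from math import exp, log

observed = [
    [120, 80, 50],   # Mobile: E-Wallet, Credit Card, Cash on Delivery
    [60, 90, 40],    # Desktop
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total   # expected count
        chi_square += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)
if df != 2:
    raise ValueError("the closed-form tail probability below holds only for df = 2")

alpha = 0.05
p_value = exp(-chi_square / 2)       # P(chi-square >= observed statistic), df = 2
critical_value = -2 * log(alpha)     # value whose upper-tail probability is alpha

print(f"chi2 = {chi_square:.3f}, df = {df}, p = {p_value:.4f}, critical = {critical_value:.3f}")
```

The unrounded statistic is about 13.77; the hand calculation gives 13.776 because each expected frequency was rounded to two decimals first.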
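Decision Rule:
If p-value \(\leq \alpha\), reject \(H_0\); if p-value \(> \alpha\), fail to reject \(H_0\).
Decision:
Given:
P-value \(\approx 0.001\)
Significance level \(\alpha = 0.05\)
Since p-value (0.001) \(< \alpha\) (0.05), REJECT the null hypothesis (\(H_0\)).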
Question 5
5. Interpret the results in terms of digital payment strategy.
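From a digital payment and business strategy perspective, device type and payment method preference are associated: mobile users chose E-Wallet more often than expected under independence (120 observed vs. 102.27 expected), while desktop users chose Credit Card more often than expected (90 vs. 73.41). Cash on Delivery usage was close to expectation on both devices.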
Strategic Implications:
Overall, this analysis provides actionable insights that can help the company tailor its payment infrastructure and user experience to better match customer behavior.
A fintech startup tests whether a new fraud detection algorithm reduces fraudulent transactions.
Question 1
1. Explain a Type I Error (α) in this context.
A Type I Error occurs when the null hypothesis is rejected even though it is true.
In this case, a Type I Error means:
“The fintech startup concludes that the new fraud detection algorithm reduces fraud, when in reality it does not.”
Practical Implications:
Thus, a Type I Error represents a false positive, where the algorithm is believed to work when it actually does not.
Question 2
2. Explain a Type II Error (β) in this context.
A Type II Error occurs when the null hypothesis is not rejected even though the alternative hypothesis is true.
In this context, a Type II Error means:
“The fintech startup concludes that the new fraud detection algorithm does not reduce fraud, when in reality it does reduce fraud.”
Practical Implications:
A Type II Error represents a false negative, where a useful improvement is overlooked.
Question 3
3. Identify which error is more costly from a business perspective.
From a business standpoint, Type I Error is generally more costly in this scenario.
Reasoning:
While Type II Errors also have costs (missed improvements), the direct and immediate financial risks associated with Type I Errors in fraud detection are typically more severe.
Question 4
4. Discuss how sample size affects Type II Error.
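Sample size plays a crucial role in determining the probability of a Type II Error. A larger sample reduces the standard error, which makes a true reduction in fraud easier to distinguish from random variation and lowers \(\beta\).
Therefore, larger sample sizes improve the ability to detect true fraud reduction.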
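As a rough illustration, the sketch below computes power and \(\beta\) for a one-sided z-test at several sample sizes. It is a simplification: it assumes a standardized effect size of 0.2 and a normal test statistic, both chosen only for illustration, whereas a real fraud study would typically compare proportions.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided(effect_size, n, alpha=0.05):
    """Power of a one-sided z-test for a standardized effect (effect / sigma)."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return NormalDist().cdf(effect_size * sqrt(n) - z_crit)


# Hypothetical standardized effect of 0.2 at three hypothetical sample sizes
for n in (25, 100, 400):
    power = power_one_sided(0.2, n)
    print(f"n = {n:>3}: power = {power:.3f}, beta = {1 - power:.3f}")
```

Under these assumptions, \(\beta\) falls steadily as \(n\) grows, which matches the qualitative argument above.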
Question 5
5. Explain the relationship between α, β, and statistical power.
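The key relationships are: \(\alpha\) is the probability of a Type I Error, \(\beta\) is the probability of a Type II Error, and statistical power is the probability of correctly rejecting a false null hypothesis. For a fixed sample size, lowering \(\alpha\) raises \(\beta\) and therefore lowers power.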
Mathematically:
\[ \text{Power} = 1 - \beta \]
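The trade-off between \(\alpha\) and \(\beta\) at a fixed sample size can be sketched numerically. The example assumes a one-sided z-test with a hypothetical standardized effect of 0.2 and \(n = 100\); these numbers are not from the case and serve only to show the direction of the relationship.

```python
from math import sqrt
from statistics import NormalDist


def beta_one_sided(alpha, effect_size=0.2, n=100):
    """Type II error probability of a one-sided z-test; power is 1 - beta."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    power = NormalDist().cdf(effect_size * sqrt(n) - z_crit)
    return 1 - power


# Stricter alpha (fewer false positives) means larger beta (more missed effects)
for alpha in (0.10, 0.05, 0.01):
    beta = beta_one_sided(alpha)
    print(f"alpha = {alpha:.2f}: beta = {beta:.3f}, power = {1 - beta:.3f}")
```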
A churn prediction model evaluation yields the following results: test statistic = 2.31, p-value = 0.021, and significance level \(\alpha = 0.05\).
Question 1
1. Explain the meaning of the p-value.
The p-value represents the probability of observing a test statistic at least as extreme as the one obtained, assuming that the null hypothesis (H₀) is true.
In this context, a p-value of 0.021 means:
“If the churn model actually provides no real improvement, there is a 2.1% chance of observing a test statistic at least as extreme as 2.31 purely due to random variation.”
A smaller p-value indicates that the observed result is less likely to be caused by random chance, providing stronger evidence against the null hypothesis.
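As a consistency check, the reported p-value can be recovered from the test statistic. The sketch assumes the statistic follows a standard normal distribution and that the test is two-tailed; the case does not state either, so treat both as assumptions.

```python
from statistics import NormalDist

test_statistic = 2.31

# Two-tailed p-value under the assumed standard normal reference distribution
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(test_statistic)))

print(f"two-tailed p-value = {p_two_tailed:.3f}")
```

Under these assumptions the result rounds to the reported 0.021.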
Question 2
2. Make a statistical decision.
The standard decision rule for hypothesis testing:
\[ \begin{array}{ll} \text{If } p\text{-value} \leq \alpha: & \text{Reject } H_0 \\ \text{If } p\text{-value} > \alpha: & \text{Fail to reject } H_0 \end{array} \]
Given a p-value of 0.021 and a significance level of \(\alpha = 0.05\), the comparison is:
\[ \begin{array}{rcl} \text{p-value} &=& 0.021 \\ \alpha &=& 0.05 \\ 0.021 &<& 0.05 \end{array} \]
Statistical Decision:
Since p-value (0.021) < α (0.05), we REJECT the null hypothesis (\(H_0\)).
This indicates that the churn prediction model’s performance is statistically significant at the 5% level.
Question 3
3. Translate the decision into non-technical language for management.
In plain, non-technical language:
“The results suggest that the improvement we see in the churn prediction model is unlikely to be due to random chance. We can be reasonably confident that the model is genuinely performing better than a baseline approach.”
This supports moving forward with further validation, controlled deployment, or business integration of the model.
Question 4
4. Discuss the risk if the sample is not representative.
Statistical conclusions rely heavily on the assumption that the sample data accurately represent the broader customer population.
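If the sample is not representative, the estimated model performance may not generalize to the full customer base, and the p-value no longer reflects the true uncertainty about how the model will perform in production.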
In short, a statistically significant result does not guarantee real-world success if the underlying data are biased.
Question 5
5. Explain why the p-value does not measure effect size.
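The p-value measures how strong the evidence against \(H_0\) is, not how large or practically meaningful the effect is.
Key points: with a very large sample, even a tiny effect can produce a small p-value, while a large effect can produce a large p-value when the sample is small.
To assess effect size, additional metrics are needed, such as a standardized mean difference (Cohen's d) or a confidence interval for the estimated improvement.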
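The distinction can be illustrated with two hypothetical scenarios evaluated under a two-tailed z-test. The effect sizes and sample sizes below are invented for illustration and are not taken from the churn model evaluation.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p(effect_size, n):
    """Two-tailed z-test p-value for a standardized effect observed with n samples."""
    z = effect_size * sqrt(n)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical scenarios: a tiny effect with a huge sample vs. a large effect with a tiny sample
small_effect_large_n = two_tailed_p(0.05, 10_000)
large_effect_small_n = two_tailed_p(0.50, 10)

print(f"effect 0.05, n = 10000: p = {small_effect_large_n:.2e}")
print(f"effect 0.50, n = 10:    p = {large_effect_small_n:.3f}")
```

In this sketch the tiny effect is "significant" and the effect ten times larger is not, so the p-value alone says nothing reliable about magnitude.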