Confidence Interval for Mean, \(\sigma\) Known: An e-commerce platform wants to estimate the average number of daily transactions per user after launching a new feature. Based on large-scale historical data, the population standard deviation is known.
\[ \begin{eqnarray*} \sigma &=& 3.2 \quad \text{(population standard deviation)} \\ n &=& 100 \quad \text{(sample size)} \\ \bar{x} &=& 12.6 \quad \text{(sample mean)} \end{eqnarray*} \]
Tasks
Confidence Interval Formula for Mean (σ Known): \[ CI = \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \] where \(\bar{x}\) is the sample mean, \(z_{\alpha/2}\) is the two-sided critical value of the standard normal distribution, \(\sigma\) is the known population standard deviation, and \(n\) is the sample size.
Solution:
Given: \(\sigma = 3.2,\quad n = 100,\quad \bar{x} = 12.6\)
Calculate Standard Error (SE): \[ SE = \frac{\sigma}{\sqrt{n}} = \frac{3.2}{\sqrt{100}} = \frac{3.2}{10} = 0.32 \]
Determine critical value \(z_{\alpha/2}\): for 90% confidence, \(z_{0.05} = 1.645\); for 95%, \(z_{0.025} = 1.960\); for 99%, \(z_{0.005} = 2.576\).
Calculate Margin of Error (ME) and the resulting CI for each confidence level: \[ \begin{aligned} \text{90\%:}\quad ME &= 1.645 \times 0.32 = 0.526, & CI &= 12.6 \pm 0.526 = (12.074,\; 13.126) \\ \text{95\%:}\quad ME &= 1.960 \times 0.32 = 0.627, & CI &= 12.6 \pm 0.627 = (11.973,\; 13.227) \\ \text{99\%:}\quad ME &= 2.576 \times 0.32 = 0.824, & CI &= 12.6 \pm 0.824 = (11.776,\; 13.424) \end{aligned} \]
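The same calculation can be reproduced in R. The sketch below is a minimal check assuming only the summary values given above (\(\bar{x} = 12.6\), \(\sigma = 3.2\), \(n = 100\)); qnorm() supplies the two-sided critical values.
# Sketch: z-based CI for Case 1 from the given summary statistics
x_bar <- 12.6; sigma <- 3.2; n <- 100
se <- sigma / sqrt(n)                      # standard error = 0.32
for (conf in c(0.90, 0.95, 0.99)) {
  z  <- qnorm(1 - (1 - conf) / 2)          # two-sided critical value
  me <- z * se                             # margin of error
  cat(sprintf("%.0f%% CI: [%.3f, %.3f]\n", conf * 100, x_bar - me, x_bar + me))
}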
# Visualization for Case 1
library(ggplot2)
ci_data1 <- data.frame(
Level = factor(c("90%", "95%", "99%"), levels = c("90%", "95%", "99%")),
Mean = 12.6,
Lower = c(12.074, 11.973, 11.776),
Upper = c(13.126, 13.227, 13.424)
)
ggplot(ci_data1, aes(x = Level, y = Mean)) +
geom_point(size = 3, color = "blue") +
geom_errorbar(aes(ymin = Lower, ymax = Upper), width = 0.1, size = 1) +
geom_hline(yintercept = 12.6, linetype = "dashed", alpha = 0.3) +
labs(title = "Comparison of Confidence Intervals (σ Known)",
subtitle = "Higher confidence levels result in wider intervals.",
x = "Confidence Level", y = "Average Daily Transactions per User") +
theme_minimal()
Business Interpretation:
With 95% confidence, the team can conclude that the true average number of daily transactions for the entire user population lies between 11.97 and 13.23. This range provides a solid foundation for business decision-making regarding the success of the new feature. If the business target is 12 transactions/day, that target falls within the interval: the observed performance is consistent with the target, although the interval does not rule out values slightly below it.
Confidence Interval for Mean, \(\sigma\) Unknown: A UX Research team analyzes task completion time (in minutes) for a new mobile application. The data are collected from 12 users:
\[ 8.4,\; 7.9,\; 9.1,\; 8.7,\; 8.2,\; 9.0,\; 7.8,\; 8.5,\; 8.9,\; 8.1,\; 8.6,\; 8.3 \]
Tasks:
Confidence Interval Formula for Mean (σ Unknown): \[ CI = \bar{x} \pm t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}} \]
where \(\bar{x}\) is the sample mean, \(t_{\alpha/2, df}\) is the critical value of the t-distribution with \(df = n - 1\) degrees of freedom, \(s\) is the sample standard deviation, and \(n\) is the sample size.
Solution:
Data: 8.4, 7.9, 9.1, 8.7, 8.2, 9.0, 7.8, 8.5, 8.9, 8.1, 8.6, 8.3
1. Calculate Descriptive Statistics.
Sample size: \(n = 12\)
Sample mean (\(\bar{x}\)):
\[ \bar{x} = \frac{8.4 + 7.9 + \ldots + 8.3}{12} = \frac{101.9}{12} = 8.4917 \quad (8.49) \]
Sample standard deviation (\(s\)):
\[ \begin{aligned} s &= \sqrt{\frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}} \\ &= \sqrt{\frac{(8.4-8.4917)^2 + (7.9-8.4917)^2 + \ldots + (8.3-8.4917)^2}{11}} \\ &= \sqrt{\frac{1.8292}{11}} = \sqrt{0.1663} = 0.4078 \quad (0.41) \end{aligned} \]
Degrees of freedom: \(df = n - 1 = 12 - 1 = 11\)
2. Calculate Standard Error (SE). \[ SE = \frac{s}{\sqrt{n}} = \frac{0.4078}{\sqrt{12}} = \frac{0.4078}{3.4641} = 0.1177 \]
3. Determine critical value \(t_{\alpha/2, df}\) for \(df = 11\): \(t_{0.05, 11} = 1.796\) (90%), \(t_{0.025, 11} = 2.201\) (95%), \(t_{0.005, 11} = 3.106\) (99%).
4. Calculate Margin of Error (ME). \[ \begin{aligned} \text{90\%:}\quad ME &= 1.796 \times 0.1177 = 0.2114 \\ \text{95\%:}\quad ME &= 2.201 \times 0.1177 = 0.2591 \\ \text{99\%:}\quad ME &= 3.106 \times 0.1177 = 0.3656 \end{aligned} \]
5. Calculate Upper and Lower CI bounds. \[ \begin{aligned} \text{90\%:}\quad CI &= 8.4917 \pm 0.2114 = (8.2803,\; 8.7031) \\ \text{95\%:}\quad CI &= 8.4917 \pm 0.2591 = (8.2326,\; 8.7508) \\ \text{99\%:}\quad CI &= 8.4917 \pm 0.3656 = (8.1261,\; 8.8573) \end{aligned} \]
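As a cross-check, the interval can also be computed in R. This is a minimal sketch assuming the rounded summary values above (\(\bar{x} = 8.4917\), \(s = 0.4078\), \(n = 12\)); qt() supplies the t critical values.
# Sketch: t-based CI for Case 2 from the summary statistics
x_bar <- 8.4917; s <- 0.4078; n <- 12
se <- s / sqrt(n)                              # standard error ≈ 0.1177
for (conf in c(0.90, 0.95, 0.99)) {
  t_crit <- qt(1 - (1 - conf) / 2, df = n - 1) # critical value with df = 11
  me <- t_crit * se
  cat(sprintf("%.0f%% CI: [%.4f, %.4f]\n", conf * 100, x_bar - me, x_bar + me))
}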
# Visualization for Case 2
ci_data2 <- data.frame(
Level = factor(c("90%", "95%", "99%"), levels = c("90%", "95%", "99%")),
Mean = 8.4917,
Lower = c(8.2803, 8.2326, 8.1261),
Upper = c(8.7031, 8.7508, 8.8573)
)
ggplot(ci_data2, aes(x = Level, y = Mean)) +
geom_point(size = 3, color = "darkgreen") +
geom_errorbar(aes(ymin = Lower, ymax = Upper), width = 0.1, size = 1, color = "darkgreen") +
labs(title = "Case 2: CI for Task Completion Time (σ Unknown)",
subtitle = "n=12: Small sample size and higher confidence widen the interval.",
x = "Confidence Level", y = "Time (minutes)") +
theme_minimal() +
ylim(8.0, 9.0)
Influence of Sample Size and Confidence Level:
Confidence Level: To increase certainty that our interval contains the true population mean, we must widen the interval (e.g., 99% CI is wider than 95% CI).
Sample Size: With a small sample (n = 12), our estimate of the population variability (s) is less precise. The t-distribution, used when σ is unknown, has "heavier tails" than the normal distribution, resulting in larger critical values and a larger margin of error than if σ were known (z-based interval). Larger samples reduce the SE (\(s/\sqrt{n}\)), narrowing the interval, as illustrated in the sketch below.
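A quick numerical illustration of the sample-size effect: holding the dispersion fixed at the value estimated above (s = 0.4078, an assumption made purely for illustration), the 95% margin of error shrinks roughly with \(1/\sqrt{n}\), and the t critical value also falls toward the z value as the sample grows.
# Sketch: 95% t-based margin of error for several hypothetical sample sizes (s held fixed)
s <- 0.4078
for (n in c(12L, 30L, 100L)) {
  me <- qt(0.975, df = n - 1) * s / sqrt(n)   # margin of error at 95% confidence
  cat(sprintf("n = %3d: ME = %.3f\n", n, me))
}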
Confidence Interval for a Proportion, A/B Testing: A data science team runs an A/B test on a new Call-To-Action (CTA) button design. The experiment yields:
\[ \begin{eqnarray*} n &=& 400 \quad \text{(total users)} \\ x &=& 156 \quad \text{(users who clicked the CTA)} \end{eqnarray*} \]
Tasks:
Confidence Interval Formula for a Proportion: \[ CI = \hat{p} \pm z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \] where \(\hat{p}\) is the sample proportion, \(z_{\alpha/2}\) is the two-sided critical value of the standard normal distribution, and \(n\) is the sample size.
Solution:
Given: \(n = 400\), \(x = 156\)
Calculate Sample Proportion (\(\hat{p}\)): \[
\hat{p} = \frac{x}{n} = \frac{156}{400} = 0.39
\]
Calculate Standard Error (SE): \[
SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.39 \times (1 -
0.39)}{400}} = \sqrt{\frac{0.39 \times 0.61}{400}} =
\sqrt{\frac{0.2379}{400}} = \sqrt{0.00059475} = 0.02439
\]
Determine critical value \(z_{\alpha/2}\): 1.645 (90%), 1.960 (95%), 2.576 (99%).
Calculate Margin of Error (ME): \[ \begin{aligned} \text{90\%:}\quad ME &= 1.645 \times 0.02439 = 0.0401 \\ \text{95\%:}\quad ME &= 1.960 \times 0.02439 = 0.0478 \\ \text{99\%:}\quad ME &= 2.576 \times 0.02439 = 0.0628 \end{aligned} \]
Calculate Upper and Lower CI bounds: \[ \begin{aligned} \text{90\%:}\quad CI &= 0.39 \pm 0.0401 = (0.350,\; 0.430) \\ \text{95\%:}\quad CI &= 0.39 \pm 0.0478 = (0.342,\; 0.438) \\ \text{99\%:}\quad CI &= 0.39 \pm 0.0628 = (0.327,\; 0.453) \end{aligned} \]
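The interval above is the Wald interval implied by the stated formula. Below is a minimal R sketch using only the given counts (x = 156, n = 400); note that R's built-in prop.test() and binom.test() use Wilson-score and exact methods, respectively, and therefore return slightly different bounds.
# Sketch: Wald CI for the click proportion (matches the formula above)
x <- 156; n <- 400
p_hat <- x / n                                 # 0.39
se <- sqrt(p_hat * (1 - p_hat) / n)            # ≈ 0.02439
for (conf in c(0.90, 0.95, 0.99)) {
  z <- qnorm(1 - (1 - conf) / 2)
  cat(sprintf("%.0f%% CI: [%.3f, %.3f]\n", conf * 100, p_hat - z * se, p_hat + z * se))
}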
# Visualization for Case 3
ci_data3 <- data.frame(
Level = factor(c("90%", "95%", "99%"), levels = c("90%", "95%", "99%")),
Proportion = 0.39,
Lower = c(0.350, 0.342, 0.327),
Upper = c(0.430, 0.438, 0.453)
)
ggplot(ci_data3, aes(x = Level, y = Proportion)) +
geom_point(size = 3) +
geom_errorbar(aes(ymin = Lower, ymax = Upper), width = 0.1, size = 1) +
scale_y_continuous(labels = scales::percent, limits = c(0.30, 0.47)) +
labs(title = "Case 3: CI for Click Proportion (A/B Test)",
x = "Confidence Level", y = "Click Proportion (CTR)") +
theme_minimal()
Impact of Confidence Level on Product Decision-Making:
The confidence level reflects the team’s risk tolerance for drawing incorrect conclusions. In A/B testing, a 95% CI (industry standard) means we accept a 5% risk of concluding a difference exists (e.g., new design is better) when it actually doesn’t (Type I Error). If the 95% CI for the difference in proportions between two variants does not include zero, we declare a statistically significant winner. Choosing a 99% CI reduces the chance of false positives but makes detecting real differences harder (increases false negatives). The decision should consider the cost of each error type.
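To make the decision rule concrete, the sketch below compares the observed variant against a hypothetical control; the control counts (132 clicks out of 400 users) are invented purely for illustration and are not part of the experiment above. prop.test() returns a CI for the difference in proportions; if that interval excludes zero at the chosen confidence level, the variant is declared the winner.
# Sketch: CI for the difference in click proportions (control counts are hypothetical)
clicks <- c(156, 132)   # variant (given), control (hypothetical)
users  <- c(400, 400)
result <- prop.test(clicks, users, conf.level = 0.95)
result$conf.int         # decision rule: does this interval contain 0?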
Precision Comparison (Z-Test vs t-Test): Two data teams measure API latency (in milliseconds) under different conditions.
\[\begin{eqnarray*} \text{Team A:} \\ n &=& 36 \quad \text{(sample size)} \\ \bar{x} &=& 210 \quad \text{(sample mean)} \\ \sigma &=& 24 \quad \text{(known population standard deviation)} \\[6pt] \text{Team B:} \\ n &=& 36 \quad \text{(sample size)} \\ \bar{x} &=& 210 \quad \text{(sample mean)} \\ s &=& 24 \quad \text{(sample standard deviation)} \end{eqnarray*}\]
Tasks
Team A (σ known) uses z-test: \[ CI = \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \] Team B (σ unknown) uses t-test: \[ CI = \bar{x} \pm t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}}, \quad \text{with } df = n-1 \]
Solution:
Given for both teams: \(n = 36\), \(\bar{x} = 210\)
1. Calculate Standard Error (SE) for both teams (same). \[ SE = \frac{24}{\sqrt{36}} = \frac{24}{6} = 4 \]
2. Determine critical values. Team A uses the standard normal distribution: 1.645 (90%), 1.960 (95%), 2.576 (99%). Team B uses the t-distribution with \(df = 35\): 1.690 (90%), 2.030 (95%), 2.724 (99%).
3. Calculate Margin of Error (ME). \[ \begin{aligned} \text{Team A (z):}\quad & 1.645 \times 4 = 6.58, \quad 1.960 \times 4 = 7.84, \quad 2.576 \times 4 = 10.30 \\ \text{Team B (t):}\quad & 1.690 \times 4 = 6.76, \quad 2.030 \times 4 = 8.12, \quad 2.724 \times 4 = 10.90 \end{aligned} \]
4. Calculate Confidence Interval (\(210 \pm ME\)), for 90%, 95%, and 99% respectively. \[ \begin{aligned} \text{Team A:}\quad & (203.42,\; 216.58), \quad (202.16,\; 217.84), \quad (199.70,\; 220.30) \\ \text{Team B:}\quad & (203.24,\; 216.76), \quad (201.88,\; 218.12), \quad (199.10,\; 220.90) \end{aligned} \]
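The width difference can be reproduced directly in R. A minimal sketch using the shared SE of 4 and n = 36; qnorm() gives Team A's critical values and qt() gives Team B's.
# Sketch: z- vs t-based margins of error for Case 4 (SE = 4, df = 35)
n <- 36; se <- 4
for (conf in c(0.90, 0.95, 0.99)) {
  alpha <- 1 - conf
  me_z <- qnorm(1 - alpha / 2) * se            # Team A: σ known
  me_t <- qt(1 - alpha / 2, df = n - 1) * se   # Team B: σ estimated by s
  cat(sprintf("%.0f%%: ME(z) = %.2f  ME(t) = %.2f\n", conf * 100, me_z, me_t))
}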
# Visualization for Case 4
ci_data4 <- data.frame(
Team = rep(c("Team A (z, σ known)", "Team B (t, σ unknown)"), each=3),
Level = rep(c("90%", "95%", "99%"), 2),
Mean = 210,
Lower = c(203.42, 202.16, 199.70, 203.24, 201.88, 199.10),
Upper = c(216.58, 217.84, 220.30, 216.76, 218.12, 220.90)
)
ci_data4$Level <- factor(ci_data4$Level, levels = c("90%", "95%", "99%"))
ggplot(ci_data4, aes(x = Level, y = Mean, color = Team)) +
geom_point(position = position_dodge(width=0.5)) +
geom_errorbar(aes(ymin = Lower, ymax = Upper), width=0.2,
position = position_dodge(width=0.5)) +
labs(title = "Case 4: CI Precision Comparison - Z-Test vs. T-Test",
subtitle = "Intervals for t-test are wider due to uncertainty from estimating σ.",
x = "Confidence Level", y = "API Latency (ms)", color = "Method / Team") +
theme_minimal()
Why Do Interval Widths Differ? Although the mean, sample size, and estimated spread (a standard deviation of 24) are the same, the CI for Team B (t-test) is always wider than for Team A (z-test) at the same confidence level. This difference occurs because:
Source of Uncertainty: Team B must estimate the population standard deviation (σ) using the sample statistic (s). This estimation introduces additional uncertainty.
Sampling Distribution: To accommodate this extra uncertainty, the t-distribution used by Team B has heavier tails than the standard normal distribution used by Team A. This results in larger critical values at the same confidence level (e.g., 2.030 vs. 1.960 for a 95% CI with df = 35), and therefore a larger margin of error.
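The "heavier tails" claim can be verified numerically: as the degrees of freedom grow, the t critical value shrinks toward the corresponding z value, so the extra width of the t-based interval matters most for small samples. A minimal sketch:
# Sketch: two-sided 95% t critical values converge to z = 1.960 as df increases
for (df in c(5L, 11L, 35L, 100L, 1000L)) {
  cat(sprintf("df = %4d: t = %.3f\n", df, qt(0.975, df)))
}
cat(sprintf("      z   : %.3f\n", qnorm(0.975)))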
One-Sided Confidence Interval: A Software as a Service (SaaS) company wants to ensure that at least 70% of weekly active users utilize a premium feature.
From the experiment:
\[ \begin{eqnarray*} n &=& 250 \quad \text{(total users)} \\ x &=& 185 \quad \text{(active premium users)} \end{eqnarray*} \]
Management is only interested in the lower bound of the estimate.
Tasks:
One-Sided Lower Bound Formula for a Proportion: \[ LB = \hat{p} - z_{\alpha} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \] where \(\hat{p}\) is the sample proportion, \(z_{\alpha}\) is the one-sided critical value of the standard normal distribution, and \(n\) is the sample size.
Solution: Given: \(n = 250\), \(x = 185\)
Calculate Sample Proportion (\(\hat{p}\)): \[ \hat{p} = \frac{x}{n} = \frac{185}{250} = 0.74 \]
Calculate Standard Error (SE): \[ SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.74 \times 0.26}{250}} = \sqrt{\frac{0.1924}{250}} = \sqrt{0.0007696} = 0.02774 \]
Determine critical value \(z_{\alpha}\) for the one-sided CI (1.282 for 90%, 1.645 for 95%, 2.326 for 99%) and compute the lower bound: \[ \begin{aligned} \text{90\%:}\quad LB &= 0.74 - 1.282 \times 0.02774 = 0.704 \\ \text{95\%:}\quad LB &= 0.74 - 1.645 \times 0.02774 = 0.694 \\ \text{99\%:}\quad LB &= 0.74 - 2.326 \times 0.02774 = 0.675 \end{aligned} \]
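The same lower bounds can be reproduced in R. A minimal sketch using only the given counts (x = 185, n = 250); qnorm(conf) supplies the one-sided critical value.
# Sketch: one-sided lower confidence bounds for the premium-feature proportion
x <- 185; n <- 250
p_hat <- x / n                         # 0.74
se <- sqrt(p_hat * (1 - p_hat) / n)    # ≈ 0.02774
for (conf in c(0.90, 0.95, 0.99)) {
  z_one <- qnorm(conf)                 # one-sided critical value
  cat(sprintf("%.0f%% lower bound: %.3f\n", conf * 100, p_hat - z_one * se))
}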
# Visualization for Case 5
ci_data5 <- data.frame(
Level = factor(c("90%", "95%", "99%"), levels = c("90%", "95%", "99%")),
Proportion = 0.74,
Lower_Bound = c(0.704, 0.694, 0.675)
)
ggplot(ci_data5, aes(x = Level, y = Proportion)) +
geom_point(size=3, color="purple") +
geom_segment(aes(xend=Level, y=Lower_Bound, yend=Proportion),
arrow = arrow(length = unit(0.2, "cm")), size=1, color="purple") +
geom_hline(yintercept=0.70, linetype="dashed", color="red") +
geom_text(aes(label=paste0(round(Lower_Bound*100,1),"%")),
y=ci_data5$Lower_Bound - 0.01, size=3.5) +
scale_y_continuous(labels=scales::percent, limits=c(0.66, 0.75)) +
labs(title="Case 5: One-Sided Lower Bound for Premium User Proportion",
subtitle="Arrow points from lower bound to sample proportion (0.74).\nRed line: business target 70%.",
x="Confidence Level", y="Proportion of Active Premium Users") +
theme_minimal()
Is the 70% Target Statistically Met? The conclusion depends on the confidence level (or risk tolerance) set by management:
At 90% confidence, the lower bound is 70.4%, which is higher than the 70% target. This means we can be 90% confident that the true proportion of users utilizing the premium feature is at least 70.4%, thus meeting the target.
At 95% confidence, the lower bound is 69.4%, which is lower than the 70% target. Therefore, we cannot state with 95% confidence that the target has been met.
At 99% confidence, the lower bound drops further to 67.5%, which is also below the 70% target.
The company can claim the target is met with 90% confidence. However, if a more stringent standard (95% or 99%) is required, the current data does not provide sufficient evidence to support that claim. The final decision should consider the risk of an erroneous claim (e.g., if the true proportion is actually below 70%).