Study Cases

Confidence Interval - Week 13

December 21, 2025

KAYLA APRILIA

Data Science Student at ITSB

NIM: 52250057

Email: kaylaaprilia2142@gmail.com

R Programming Data Science Statistics


1 Case Study 1

Confidence Interval for Mean, σ Known: An e-commerce platform wants to estimate the average number of daily transactions per user after launching a new feature. Based on large-scale historical data, the population standard deviation is known.

  • σ = 3.2 (population standard deviation),
  • n = 100 (sample size),
  • x̄ = 12.6 (sample mean)

Tasks

  1. Identify the appropriate statistical test and justify your choice.
  2. Compute the Confidence Intervals for: a). 90% b). 95% c). 99%
  3. Create a comparison visualization of the three confidence intervals.
  4. Interpret the results in a business analytics context.
knitr::opts_chunk$set(
  echo = TRUE,
  warning = FALSE,
  message = FALSE
)
# =========================================================
# Load libraries
# =========================================================
library(knitr)
library(kableExtra)
library(htmltools)
library(ggplot2)

1.1 Identify and Compute

  • Population standard deviation (σ) = 3.2
  • Sample size (n) = 100
  • Sample mean (x̄) = 12.6
Identify the Appropriate Statistical Test
The appropriate statistical method for this problem is the Z-Confidence Interval for the Population Mean.
Justification:

  • The population standard deviation (σ) is known
  • The sample size is large (n ≥ 30)
  • The objective is to estimate the population mean

Therefore, the Z-distribution is suitable for constructing the confidence interval.

1. Given Data

  • Population standard deviation: \(\sigma = 3.2\)
  • Sample size: \(n = 100\)
  • Sample mean: \(\bar{x} = 12.6\)

Population standard deviation is known, so we will use a Z-Test.


2. Appropriate Statistical Test

  • Z-Test is used because the population standard deviation is known and the sampling distribution of the mean follows a normal distribution.
  • Confidence Interval formula:

\[ CI = \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]


3. Standard Error

\[ SE = \frac{\sigma}{\sqrt{n}} = \frac{3.2}{\sqrt{100}} = 0.32 \]


4. Z-Values for Confidence Levels

| Confidence Level | α    | z-value (two-sided) |
|------------------|------|---------------------|
| 90%              | 0.10 | 1.645               |
| 95%              | 0.05 | 1.960               |
| 99%              | 0.01 | 2.576               |

5. Compute Confidence Intervals

\[ CI = \bar{x} \pm z \cdot SE \]

  1. 90% CI \[ CI = 12.6 \pm 1.645 \cdot 0.32 = 12.6 \pm 0.526 = [12.074, 13.126] \]

  2. 95% CI \[ CI = 12.6 \pm 1.960 \cdot 0.32 = 12.6 \pm 0.627 = [11.973, 13.227] \]

  3. 99% CI \[ CI = 12.6 \pm 2.576 \cdot 0.32 = 12.6 \pm 0.824 = [11.776, 13.424] \]

# Compute the Confidence Intervals (90%, 95%, 99%)
# Input values
xbar <- 12.6
sigma <- 3.2
n <- 100

# Standard Error
SE <- sigma / sqrt(n)

# Confidence levels
confidence_levels <- c(0.90, 0.95, 0.99)

# Z critical values
z_values <- qnorm((1 + confidence_levels) / 2)

# Margin of Error
ME <- z_values * SE

# Confidence Intervals
lower_CI <- xbar - ME
upper_CI <- xbar + ME

1.2 Summary Table

library(knitr)
library(kableExtra)

ci_table <- data.frame(
Confidence_Level = c("90%", "95%", "99%"),
Z_Value = round(z_values, 3),
Lower_CI = round(lower_CI, 3),
Upper_CI = round(upper_CI, 3)
)

kable(
ci_table,
caption = "Confidence Intervals for Mean Daily Transactions"
) %>%
kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))
Confidence Intervals for Mean Daily Transactions
Confidence_Level Z_Value Lower_CI Upper_CI
90% 1.645 12.074 13.126
95% 1.960 11.973 13.227
99% 2.576 11.776 13.424

1.3 Visualization

The plot compares the 90%, 95%, and 99% confidence intervals for mean daily transactions.

library(ggplot2)

ci_plot_data <- data.frame(
Confidence_Level = factor(
c("90%", "95%", "99%"),
levels = c("90%", "95%", "99%")
),
Lower = lower_CI,
Upper = upper_CI,
Mean = rep(xbar, 3)
)

ggplot(ci_plot_data, aes(y = Confidence_Level)) +
geom_errorbarh(
aes(xmin = Lower, xmax = Upper),
height = 0.25,
linewidth = 1.2,
color = "#2E86AB"
) +
geom_point(
aes(x = Mean),
size = 3,
color = "#E74C3C"
) +
labs(
title = "Comparison of Confidence Intervals for Mean Daily Transactions",
subtitle = "Z-Confidence Interval (σ Known)",
x = "Average Daily Transactions per User",
y = "Confidence Level"
) +
theme_minimal(base_size = 12)


1.4 Interpretation

With 90% confidence, the average daily transactions per user are estimated to lie between 12.074 and 13.126, providing a more precise but less conservative estimate.

With 95% confidence, the interval widens slightly to 11.973 – 13.227, offering a balanced trade-off between precision and reliability.

With 99% confidence, the interval is the widest (11.776 to 13.424), indicating a higher level of certainty but reduced precision.
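To put a number on how much each interval widens, the short sketch below recomputes the full interval widths from the Case Study 1 inputs. The variable names (`ci_width`, `width_table`) are illustrative choices and not part of the original analysis.

```r
# Inputs taken from Case Study 1
sigma <- 3.2
n <- 100
conf_levels <- c(0.90, 0.95, 0.99)

# Standard error and two-sided critical values
se <- sigma / sqrt(n)
z_crit <- qnorm((1 + conf_levels) / 2)

# Full interval width = 2 * margin of error
ci_width <- 2 * z_crit * se

width_table <- data.frame(
  Confidence_Level = paste0(conf_levels * 100, "%"),
  Width = round(ci_width, 3),
  Relative_to_90 = round(ci_width / ci_width[1], 2)
)
print(width_table)
```

Under these inputs the 99% interval comes out a bit more than one and a half times as wide as the 90% interval, which is the cost of the extra confidence.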


Business Insight

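All three intervals fall between roughly 11.8 and 13.4 transactions per user per day, so the post-launch average is estimated to within less than one transaction even at 99% confidence. Management can use the 95% confidence interval (11.973 to 13.227) as a standard reference for evaluating feature performance and forecasting user activity. Judging whether the feature actually changed engagement would additionally require a pre-launch baseline for comparison.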


2 Case Study 2

Confidence Interval for Mean, σ Unknown: A UX Research team analyzes task completion time (in minutes) for a new mobile application. The data are collected from 12 users:

8.4,7.9,9.1,8.7,8.2,9.0,7.8,8.5,8.9,8.1,8.6,8.3

Tasks:

  1. Identify the appropriate statistical test and explain why.
  2. Compute the Confidence Intervals for: a). 90% b). 95% c). 99%
  3. Visualize the three intervals on a single plot.
  4. Explain how sample size and confidence level influence the interval width.

2.1 Identify and Compute

The appropriate statistical method is the t-Confidence Interval for the Population Mean.

Reasoning:

  • Population standard deviation (σ) is unknown
  • Sample size is small (n < 30)
  • Data are assumed to come from a normally distributed population

Therefore, the t-distribution is used instead of the Z-distribution.
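The normality assumption can be given a rough check before relying on the t-interval. The sketch below applies the Shapiro-Wilk test from base R to the twelve completion times. The name `task_times` is an illustrative choice, and with only 12 observations the test has limited power, so a non-significant result should be read as "no strong evidence against normality" and not as proof of it.

```r
# Task completion times (minutes) from Case Study 2
task_times <- c(8.4, 7.9, 9.1, 8.7, 8.2, 9.0, 7.8,
                8.5, 8.9, 8.1, 8.6, 8.3)

# Shapiro-Wilk test: H0 = data come from a normal distribution
sw <- shapiro.test(task_times)
print(sw)

# A small p-value (for example below 0.05) would cast doubt on normality
cat("W =", round(unname(sw$statistic), 3),
    " p-value =", round(sw$p.value, 3), "\n")
```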

1. Given Data

Task completion times (in minutes) for 12 users:

\[ 8.4, 7.9, 9.1, 8.7, 8.2, 9.0, 7.8, 8.5, 8.9, 8.1, 8.6, 8.3 \]

  • Sample size: \(n = 12\)
  • Population standard deviation unknown → use t-Test

2. Appropriate Statistical Test

  • t-Test is used because the population standard deviation is unknown and the sample size is small (\(n < 30\)).
  • The confidence interval formula is:

\[ CI = \bar{x} \pm t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}} \]

Where:
- \(\bar{x}\) = sample mean
- \(s\) = sample standard deviation
- \(n\) = sample size
- \(df = n-1\)


3. Sample Mean and Standard Deviation

  1. Sample Mean (\(\bar{x}\))
    \[ \bar{x} = \frac{\sum x_i}{n} = \frac{101.5}{12} \approx 8.458 \]

  2. Sample Standard Deviation (\(s\))
    \[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} = \sqrt{\frac{1.949}{11}} \approx 0.421 \]


4. Standard Error

\[ SE = \frac{s}{\sqrt{n}} = \frac{0.421}{\sqrt{12}} \approx 0.122 \]


5. T-Values for Confidence Levels (df = 11)

| Confidence Level | α    | t-value |
|------------------|------|---------|
| 90%              | 0.10 | 1.796   |
| 95%              | 0.05 | 2.201   |
| 99%              | 0.01 | 3.106   |

6. Compute Confidence Intervals

\[ CI = \bar{x} \pm t \cdot SE \]

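To avoid rounding drift, the calculations carry four decimals (\(\bar{x} \approx 8.4583\), \(SE \approx 0.1215\)); the endpoints then agree with the R output.

  1. 90% CI \[ CI = 8.4583 \pm 1.796 \cdot 0.1215 \approx 8.4583 \pm 0.2182 = [8.240, 8.677] \]

  2. 95% CI \[ CI = 8.4583 \pm 2.201 \cdot 0.1215 \approx 8.4583 \pm 0.2674 = [8.191, 8.726] \]

  3. 99% CI \[ CI = 8.4583 \pm 3.106 \cdot 0.1215 \approx 8.4583 \pm 0.3774 = [8.081, 8.836] \]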

# Input data
data <- c(8.4, 7.9, 9.1, 8.7, 8.2, 9.0, 7.8, 
          8.5, 8.9, 8.1, 8.6, 8.3)

# Sample statistics
n <- length(data)
xbar <- mean(data)
s <- sd(data)
SE <- s / sqrt(n)

# Confidence levels
conf_levels <- c(0.90, 0.95, 0.99)

# t critical values
t_values <- qt((1 + conf_levels)/2, df = n - 1)

# Margin of Error
ME <- t_values * SE

# Confidence Intervals
lower_CI <- xbar - ME
upper_CI <- xbar + ME
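As a cross-check on the manual calculation, base R's `t.test()` returns the same t-based interval through its `conf.int` component. The sketch below loops over the three confidence levels; `task_times` and `builtin_ci` are illustrative names.

```r
# Task completion times (minutes) from Case Study 2
task_times <- c(8.4, 7.9, 9.1, 8.7, 8.2, 9.0, 7.8,
                8.5, 8.9, 8.1, 8.6, 8.3)
conf_levels <- c(0.90, 0.95, 0.99)

# One row per confidence level: lower and upper bound from t.test()
builtin_ci <- t(sapply(conf_levels, function(cl) {
  t.test(task_times, conf.level = cl)$conf.int
}))
colnames(builtin_ci) <- c("Lower_CI", "Upper_CI")
rownames(builtin_ci) <- paste0(conf_levels * 100, "%")

print(round(builtin_ci, 3))
```

If the manual steps are right, these bounds should match the summary table to three decimals.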

2.2 Summary Table

library(knitr)
library(kableExtra)

ci_table <- data.frame(
Confidence_Level = c("90%", "95%", "99%"),
t_value = round(t_values, 3),
Lower_CI = round(lower_CI, 3),
Upper_CI = round(upper_CI, 3)
)

kable(
ci_table,
caption = "Confidence Intervals for Mean Task Completion Time (minutes)"
) %>%
kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))
Confidence Intervals for Mean Task Completion Time (minutes)
Confidence_Level t_value Lower_CI Upper_CI
90% 1.796 8.240 8.677
95% 2.201 8.191 8.726
99% 3.106 8.081 8.836

2.3 Visualization

The plot compares the 90%, 95%, and 99% confidence intervals for mean task completion time.

library(ggplot2)

ci_plot_data <- data.frame(
Confidence_Level = factor(
c("90%", "95%", "99%"),
levels = c("90%", "95%", "99%")
),
Lower = lower_CI,
Upper = upper_CI,
Mean = rep(xbar, 3)
)

ggplot(ci_plot_data, aes(y = Confidence_Level)) +
geom_errorbarh(
aes(xmin = Lower, xmax = Upper),
height = 0.25,
linewidth = 1.2,
color = "#2C3E50"
) +
geom_point(
aes(x = Mean),
size = 3,
color = "#E74C3C"
) +
labs(
title = "Comparison of Confidence Intervals for Task Completion Time",
subtitle = "t-Confidence Interval (σ Unknown)",
x = "Mean Task Completion Time (minutes)",
y = "Confidence Level"
) +
theme_minimal(base_size = 12)


2.4 Conclusion

Effect of Sample Size:

A smaller sample size results in a larger standard error, leading to wider confidence intervals. Increasing the sample size would make the interval narrower and more precise.
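The effect of sample size can be illustrated with a small what-if calculation. The sketch below keeps the observed standard deviation fixed and recomputes the 95% margin of error for larger, purely hypothetical sample sizes (30 and 100 are assumptions chosen for illustration, not collected data).

```r
# Observed data from Case Study 2
task_times <- c(8.4, 7.9, 9.1, 8.7, 8.2, 9.0, 7.8,
                8.5, 8.9, 8.1, 8.6, 8.3)
s <- sd(task_times)

# Actual n = 12 plus two hypothetical larger samples
sample_sizes <- c(12, 30, 100)
conf_level <- 0.95

# Both the critical value and the standard error shrink as n grows
t_crit <- qt((1 + conf_level) / 2, df = sample_sizes - 1)
margin <- t_crit * s / sqrt(sample_sizes)

print(data.frame(
  n = sample_sizes,
  t_crit = round(t_crit, 3),
  Margin_of_Error = round(margin, 3)
))
```

Two things shrink at once here: the standard error falls with \(\sqrt{n}\), and the t critical value moves closer to the normal value as the degrees of freedom increase.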

Effect of Confidence Level:

Higher confidence levels (e.g., 99%) require more certainty. This increases the critical t-value, producing wider intervals.

Lower confidence levels (e.g., 90%) produce narrower intervals, but with less certainty.


Interpretation in UX Research Context

The estimated average task completion time is approximately 8.46 minutes.

  • The 90% confidence interval provides a relatively precise estimate.
  • The 95% confidence interval offers a balance between reliability and precision.
  • The 99% confidence interval is the most conservative estimate.

From a UX perspective, even the 99% interval spans less than one minute (8.081 to 8.836), so the average completion time is estimated fairly precisely despite the small sample. This points to a predictable task completion time, which is consistent with good usability of the new mobile application.


3 Case Study 3

Confidence Interval for a Proportion, A/B Testing: A data science team runs an A/B test on a new Call-To-Action (CTA) button design. The experiment yields:

  • n = 400 (total users),
  • x = 156 (users who clicked the CTA)

Tasks:

  1. Compute the sample proportion \(\hat{p}\)
  2. Compute Confidence Intervals for the proportion at: a). 90% b). 95% c). 99%
  3. Visualize and compare the three intervals.
  4. Explain how confidence level affects decision-making in product experiments.

3.1 Compute

  • Total users (n) = 400
  • Users who clicked CTA (x) = 156

1. Given Data

  • Total users: \(n = 400\)
  • Users who clicked the CTA: \(x = 156\)

Sample proportion:

\[ \hat{p} = \frac{x}{n} = \frac{156}{400} = 0.39 \]


2. Confidence Interval Formula

For a proportion, the standard error (SE) is:

\[ SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

The two-sided confidence interval is:

\[ CI = \hat{p} \pm z_{\alpha/2} \cdot SE \]


3. Calculate Standard Error

\[ SE = \sqrt{\frac{0.39(1-0.39)}{400}} = \sqrt{\frac{0.2379}{400}} = \sqrt{0.00059475} \approx 0.02439 \]


4. Z-Values for Confidence Levels

| Confidence Level | α    | z-value (two-sided) |
|------------------|------|---------------------|
| 90%              | 0.10 | 1.645               |
| 95%              | 0.05 | 1.960               |
| 99%              | 0.01 | 2.576               |

5. Compute Confidence Intervals

Formula: \(CI = \hat{p} \pm z \cdot SE\)

  1. 90% CI \[ CI = 0.39 \pm 1.645 \cdot 0.02439 \approx 0.39 \pm 0.0401 = [0.350, 0.430] \]

  2. 95% CI \[ CI = 0.39 \pm 1.960 \cdot 0.02439 \approx 0.39 \pm 0.0478 = [0.342, 0.438] \]

  3. 99% CI \[ CI = 0.39 \pm 2.576 \cdot 0.02439 \approx 0.39 \pm 0.0628 = [0.327, 0.453] \]

# 1. Compute the Sample Proportion (p̂)
n <- 400
x <- 156

# Sample proportion
p_hat <- x / n

# 2. Compute Confidence Intervals (90%, 95%, 99%)

# Standard Error for proportion
SE <- sqrt((p_hat * (1 - p_hat)) / n)

# Confidence levels
conf_levels <- c(0.90, 0.95, 0.99)

# Z critical values
z_values <- qnorm((1 + conf_levels) / 2)

# Margin of Error
ME <- z_values * SE

# Confidence Intervals
lower_CI <- p_hat - ME
upper_CI <- p_hat + ME
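The interval above is the normal-approximation (Wald) interval. As a sanity check, it can be compared with the Wilson score interval that base R's `prop.test()` returns when `correct = FALSE`. This comparison is an addition for illustration and not part of the original task; the names `wald` and `wilson` are assumptions.

```r
# Inputs from Case Study 3
n_users <- 400
clicks <- 156
conf_levels <- c(0.90, 0.95, 0.99)

p_hat <- clicks / n_users
se <- sqrt(p_hat * (1 - p_hat) / n_users)
z_crit <- qnorm((1 + conf_levels) / 2)

# Wald interval: p_hat +/- z * SE
wald <- cbind(p_hat - z_crit * se, p_hat + z_crit * se)

# Wilson score interval via prop.test() without continuity correction
wilson <- t(sapply(conf_levels, function(cl) {
  prop.test(clicks, n_users, conf.level = cl, correct = FALSE)$conf.int
}))

comparison <- data.frame(
  Confidence_Level = paste0(conf_levels * 100, "%"),
  Wald_Lower = round(wald[, 1], 3),
  Wald_Upper = round(wald[, 2], 3),
  Wilson_Lower = round(wilson[, 1], 3),
  Wilson_Upper = round(wilson[, 2], 3)
)
print(comparison)
```

With n = 400 and a proportion near 0.4, the two methods should differ only in the third decimal, which supports using the simpler Wald formula here.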

3.2 Summary Table

library(knitr)
library(kableExtra)

ci_table <- data.frame(
Confidence_Level = c("90%", "95%", "99%"),
Z_Value = round(z_values, 3),
Lower_CI = round(lower_CI, 3),
Upper_CI = round(upper_CI, 3)
)

kable(
ci_table,
caption = "Confidence Intervals for CTA Click-Through Rate"
) %>%
kable_styling(
full_width = FALSE,
bootstrap_options = c("striped", "hover")
)
Confidence Intervals for CTA Click-Through Rate
Confidence_Level Z_Value Lower_CI Upper_CI
90% 1.645 0.350 0.430
95% 1.960 0.342 0.438
99% 2.576 0.327 0.453

3.3 Visualization

The plot compares the 90%, 95%, and 99% confidence intervals for the CTA click-through rate.

library(ggplot2)

ci_plot_data <- data.frame(
Confidence_Level = factor(
c("90%", "95%", "99%"),
levels = c("90%", "95%", "99%")
),
Lower = lower_CI,
Upper = upper_CI,
Proportion = rep(p_hat, 3)
)

ggplot(ci_plot_data, aes(y = Confidence_Level)) +
geom_errorbarh(
aes(xmin = Lower, xmax = Upper),
height = 0.25,
linewidth = 1.2,
color = "#2E86AB"
) +
geom_point(
aes(x = Proportion),
size = 3,
color = "#E74C3C"
) +
scale_x_continuous(labels = scales::percent_format(accuracy = 1)) +
labs(
title = "Confidence Intervals for CTA Click-Through Rate",
subtitle = "A/B Testing – Proportion Confidence Intervals",
x = "Click-Through Rate (CTR)",
y = "Confidence Level"
) +
theme_minimal(base_size = 12)


3.4 Interpretation

Effect of Confidence Level on Decision-Making:

Higher confidence levels (99%) produce wider intervals: the estimate is less precise, but the interval is more likely to contain the true click-through rate.

Lower confidence levels (90%) result in narrower intervals, offering more precision but a higher risk that the interval misses the true rate.

In product experiments, teams often rely on the 95% confidence interval as a balance between reliability and actionable insight.


Interpretation in Product Experiment Context

The estimated click-through rate for the new CTA button is approximately 39%. Every interval is centered on this estimate by construction; what changes with the confidence level is the width, from about ±4 percentage points at 90% to about ±6 percentage points at 99%.

If the interval exceeds a predefined business threshold (e.g., minimum acceptable CTR), the product team may confidently roll out the new CTA.
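The threshold rule can be sketched as a simple comparison of each lower bound against the minimum acceptable CTR. The threshold of 0.33 used below is a hypothetical value chosen only to illustrate the logic; the case study does not specify one.

```r
# Inputs from Case Study 3
n_users <- 400
clicks <- 156
conf_levels <- c(0.90, 0.95, 0.99)

# Hypothetical business threshold (assumption, not given in the case study)
min_ctr <- 0.33

p_hat <- clicks / n_users
se <- sqrt(p_hat * (1 - p_hat) / n_users)
lower <- p_hat - qnorm((1 + conf_levels) / 2) * se

# The whole interval exceeds the threshold when its lower bound does
exceeds <- lower > min_ctr

decision <- data.frame(
  Confidence_Level = paste0(conf_levels * 100, "%"),
  Lower_CI = round(lower, 3),
  Exceeds_Threshold = exceeds
)
print(decision)
```

With this illustrative threshold, the 90% and 95% intervals clear the bar while the 99% interval does not, which shows how the chosen confidence level can change a rollout decision.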


4 Case Study 4

Precision Comparison (Z-Test vs t-Test): Two data teams measure API latency (in milliseconds) under different conditions.

Team A:

  • n = 36 (sample size)
  • x̄ = 210 (sample mean)
  • σ = 24 (known population standard deviation)

Team B:

  • n = 36 (sample size)
  • x̄ = 210 (sample mean)
  • s = 24 (sample standard deviation)

Tasks:

  1. Identify the statistical test used by each team.
  2. Compute Confidence Intervals for 90%, 95%, and 99%.
  3. Create a visualization comparing all intervals.
  4. Explain why the interval widths differ, even with similar data.

4.1 Identify and Compute

  • Team A: Z-Confidence Interval (σ known)
  • Team B: t-Confidence Interval (σ unknown)

Justification:

  • Team A can use Z because population SD is known
  • Team B must use t because SD is estimated from the sample

1. Given Data

Team A (σ known)

  • Sample size: \(n = 36\)
  • Sample mean: \(\bar{x} = 210\)
  • Population standard deviation: \(\sigma = 24\)

Team B (σ unknown)

  • Sample size: \(n = 36\)
  • Sample mean: \(\bar{x} = 210\)
  • Sample standard deviation: \(s = 24\)

2. Type of Test

| Team | SD        | Test Type                                   |
|------|-----------|---------------------------------------------|
| A    | σ known   | Z-Test (normal distribution)                |
| B    | σ unknown | t-Test (Student’s t distribution, df = n-1) |

Explanation: Use Z-Test if population SD is known; use t-Test if population SD is unknown.


3. Confidence Interval Formula

Team A (Z-Test, σ known):

\[ CI = \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \]

Team B (t-Test, σ unknown):

\[ CI = \bar{x} \pm t_{\alpha/2, df} \cdot \frac{s}{\sqrt{n}}, \quad df = n-1 \]


4. Standard Error

\[ SE = \frac{\sigma \text{ or } s}{\sqrt{n}} = \frac{24}{\sqrt{36}} = \frac{24}{6} = 4 \]


5. Z and T Values for Confidence Levels

Team A (Z-Test)

| Confidence Level | α    | z-value (two-sided) |
|------------------|------|---------------------|
| 90%              | 0.10 | 1.645               |
| 95%              | 0.05 | 1.960               |
| 99%              | 0.01 | 2.576               |

Team B (t-Test, df = 35)

| Confidence Level | α    | t-value (two-sided) |
|------------------|------|---------------------|
| 90%              | 0.10 | 1.690               |
| 95%              | 0.05 | 2.030               |
| 99%              | 0.01 | 2.724               |

6. Compute Confidence Intervals

Team A (Z-Test)

  • 90% CI: \(210 \pm 1.645 \cdot 4 = 210 \pm 6.58 = [203.42, 216.58]\)
  • 95% CI: \(210 \pm 1.960 \cdot 4 = 210 \pm 7.84 = [202.16, 217.84]\)
  • 99% CI: \(210 \pm 2.576 \cdot 4 = 210 \pm 10.30 = [199.70, 220.30]\)

Team B (t-Test)

  • 90% CI: \(210 \pm 1.690 \cdot 4 = 210 \pm 6.76 = [203.24, 216.76]\)
  • 95% CI: \(210 \pm 2.030 \cdot 4 = 210 \pm 8.12 = [201.88, 218.12]\)
  • 99% CI: \(210 \pm 2.724 \cdot 4 = 210 \pm 10.90 = [199.10, 220.90]\)
## Compute Confidence Intervals (90%, 95%, 99%)
# Team A (Z-test)
xbar_A <- 210
sigma_A <- 24
n_A <- 36

SE_A <- sigma_A / sqrt(n_A)
conf_levels <- c(0.90, 0.95, 0.99)
z_values <- qnorm((1 + conf_levels)/2)
ME_A <- z_values * SE_A
lower_CI_A <- xbar_A - ME_A
upper_CI_A <- xbar_A + ME_A

# Team B (t-test)
xbar_B <- 210
s_B <- 24
n_B <- 36
df_B <- n_B - 1

SE_B <- s_B / sqrt(n_B)
t_values <- qt((1 + conf_levels)/2, df = df_B)
ME_B <- t_values * SE_B
lower_CI_B <- xbar_B - ME_B
upper_CI_B <- xbar_B + ME_B

4.2 Summary Table

library(knitr)
library(kableExtra)

ci_table <- data.frame(
Confidence_Level = rep(c("90%", "95%", "99%"), 2),
Team = rep(c("Team A (Z)", "Team B (t)"), each = 3),
Lower_CI = round(c(lower_CI_A, lower_CI_B),3),
Upper_CI = round(c(upper_CI_A, upper_CI_B),3)
)

kable(ci_table, caption = "Confidence Intervals for API Latency") %>%
kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))
Confidence Intervals for API Latency
Confidence_Level Team Lower_CI Upper_CI
90% Team A (Z) 203.421 216.579
95% Team A (Z) 202.160 217.840
99% Team A (Z) 199.697 220.303
90% Team B (t) 203.242 216.758
95% Team B (t) 201.880 218.120
99% Team B (t) 199.105 220.895

4.3 Visualization

The plot compares the Z-based (Team A) and t-based (Team B) confidence intervals for API latency.

library(ggplot2)
library(dplyr)

ci_plot_data <- data.frame(
Confidence_Level = factor(rep(c("90%", "95%", "99%"), 2), levels = c("90%", "95%", "99%")),
Team = rep(c("Team A (Z)", "Team B (t)"), each = 3),
Lower = c(lower_CI_A, lower_CI_B),
Upper = c(upper_CI_A, upper_CI_B),
Mean = rep(c(xbar_A, xbar_B), each = 3)
)

ggplot(ci_plot_data, aes(y = Confidence_Level, color = Team)) +
geom_errorbarh(aes(xmin = Lower, xmax = Upper), height = 0.25, linewidth = 1.2, position = position_dodge(width = 0.5)) +
geom_point(aes(x = Mean), size = 3, position = position_dodge(width = 0.5)) +

#Annotation for interval width difference
geom_text(
data = ci_plot_data %>% filter(Team == "Team B (t)"),
aes(x = Upper + 2, label = "Wider interval due to t-test"),
color = "#E74C3C",
hjust = 0,
size = 3,
position = position_dodge(width = 0.5)
) +
labs(
title = "Comparison of Z vs T Confidence Intervals for API Latency",
subtitle = "Team A (σ known) vs Team B (σ unknown)",
x = "API Latency (ms)",
y = "Confidence Level"
) +
theme_minimal(base_size = 12) +
scale_color_manual(values = c("Team A (Z)" = "#2E86AB", "Team B (t)" = "#E74C3C"))


4.4 Explanation

Team B (t-test) intervals are wider than Team A (Z-test) due to uncertainty from estimating SD.

Confidence Level increases → intervals widen for both teams.

Even with identical sample means and standard deviations, knowing the population σ removes one source of uncertainty, so Team A's interval is narrower.

Knowing σ (Z-test) → narrower, more precise interval.

Estimating σ from sample (t-test) → slightly wider interval for same data.
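The size of the gap depends on the degrees of freedom. The sketch below compares the 95% t critical value with the normal critical value for several degrees of freedom; df = 35 corresponds to Team B, and the other values are illustrative.

```r
conf_level <- 0.95

# df = 35 is Team B's case; the others are for illustration only
df_values <- c(5, 11, 35, 100, 1000)

z_crit <- qnorm((1 + conf_level) / 2)
t_crit <- qt((1 + conf_level) / 2, df = df_values)

critical_values <- data.frame(
  df = df_values,
  t_crit = round(t_crit, 3),
  z_crit = round(z_crit, 3),
  Width_Ratio = round(t_crit / z_crit, 3)  # t-interval width / z-interval width
)
print(critical_values)
```

Because both teams share the same standard error, the width ratio equals the ratio of critical values. At df = 35 the t-interval is only a few percent wider, and the difference fades as the sample grows.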


5 Case Study 5

One-Sided Confidence Interval: A Software as a Service (SaaS) company wants to ensure that at least 70% of weekly active users utilize a premium feature.

From the experiment:

  • n = 250 (total users)
  • x = 185 (active premium users)

Management is only interested in the lower bound of the estimate.

Tasks:

  1. Identify the type of Confidence Interval and the appropriate test.
  2. Compute the one-sided lower Confidence Interval at: a). 90% b). 95% c). 99%
  3. Visualize the lower bounds for all confidence levels.
  4. Determine whether the 70% target is statistically satisfied.

5.1 Identify and Compute

  • CI type: One-sided (lower bound only)
  • Test: Z-test for proportion (population proportion unknown but n large)

Reasoning:

  • Sample size is large → normal approximation valid
  • Interest is only in lower bound → one-sided CI

1. Data

Given data:

  • Total users: \(n = 250\)
  • Active premium users: \(x = 185\)
  • Sample proportion:
    \[ \hat{p} = \frac{x}{n} = \frac{185}{250} = 0.74 \]
  • Minimum target: 70%

2. Type of CI and Test

  • CI type: One-sided (lower) confidence interval for a proportion.
  • Appropriate test: Z-test for proportion, since \(n\hat{p}\) and \(n(1-\hat{p})\) are sufficiently large.
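The "sufficiently large" condition can be checked directly. The sketch below uses the common rule of thumb that both expected counts should be at least 10; that cutoff is an assumption, since the case study does not state which rule it applies.

```r
# Inputs from Case Study 5
n_users <- 250
premium_users <- 185

p_hat <- premium_users / n_users

# Counts of "successes" and "failures" implied by the sample
successes <- n_users * p_hat
failures <- n_users * (1 - p_hat)

# Rule-of-thumb cutoff (assumption; some texts use 5 instead)
min_count <- 10
approx_ok <- successes >= min_count && failures >= min_count

cat("n * p_hat =", successes,
    " n * (1 - p_hat) =", failures,
    " normal approximation OK:", approx_ok, "\n")
```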

The formula for a one-sided lower confidence interval:

\[ \text{Lower CI} = \hat{p} - z_\alpha \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]

Note: \(z_\alpha\) is the Z-value with upper-tail area \(\alpha\), i.e. the \(1 - \alpha\) quantile of the standard normal distribution.
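The difference between one-sided and two-sided critical values is easy to see side by side. In the sketch below, `z_one_sided` puts all of \(\alpha\) in one tail while `z_two_sided` splits it across both; the variable names are illustrative.

```r
conf_levels <- c(0.90, 0.95, 0.99)
alpha <- 1 - conf_levels

# One-sided: all of alpha in a single tail
z_one_sided <- qnorm(1 - alpha)

# Two-sided: alpha split equally between the tails
z_two_sided <- qnorm(1 - alpha / 2)

print(data.frame(
  Confidence_Level = paste0(conf_levels * 100, "%"),
  z_one_sided = round(z_one_sided, 3),
  z_two_sided = round(z_two_sided, 3)
))
```

The one-sided value is always smaller at the same confidence level, so a one-sided lower bound sits closer to \(\hat{p}\) than the lower end of a two-sided interval.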


3. Standard Error

\[ SE = \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\frac{0.74(1-0.74)}{250}} = \sqrt{0.0007696} \approx 0.02774 \]


4. Z-Values for One-Sided CI

Confidence Level α z (one-sided)
90% 0.10 1.282
95% 0.05 1.645
99% 0.01 2.326

5. Calculate Lower Bound CI

\[ \text{Lower CI} = \hat{p} - z_\alpha \cdot SE \]

  • 90% CI: \(0.74 - 1.282 \cdot 0.02774 \approx 0.704\)
  • 95% CI: \(0.74 - 1.645 \cdot 0.02774 \approx 0.694\)
  • 99% CI: \(0.74 - 2.326 \cdot 0.02774 \approx 0.675\)
# Compute Sample Proportion and One-Sided Lower CI
n <- 250
x <- 185

# Sample proportion
p_hat <- x / n

# Standard error
SE <- sqrt(p_hat * (1 - p_hat) / n)

# One-sided confidence levels
conf_levels <- c(0.90, 0.95, 0.99)

# Z critical values (one-sided)
z_values <- qnorm(conf_levels)

# Margin of Error
ME <- z_values * SE

# Lower bounds
lower_CI <- p_hat - ME

5.2 Summary Table

library(knitr)
library(kableExtra)

ci_table <- data.frame(
Confidence_Level = c("90%", "95%", "99%"),
Z_Value = round(z_values, 3),
Lower_CI = round(lower_CI, 3)
)

kable(ci_table, caption = "One-Sided Lower Confidence Intervals for Premium Feature Usage") %>%
kable_styling(full_width = FALSE, bootstrap_options = c("striped", "hover"))
One-Sided Lower Confidence Intervals for Premium Feature Usage
Confidence_Level Z_Value Lower_CI
90% 1.282 0.704
95% 1.645 0.694
99% 2.326 0.675

5.3 Visualization

The plot shows the one-sided lower confidence bounds for premium feature usage against the 70% target.

library(ggplot2)

ci_plot_data <- data.frame(
Confidence_Level = factor(c("90%", "95%", "99%"), levels = c("90%", "95%", "99%")),
Lower_CI = lower_CI
)

ggplot(ci_plot_data, aes(x = Confidence_Level, y = Lower_CI)) +
geom_col(fill = "#2E86AB", width = 0.5) +
geom_hline(yintercept = 0.70, linetype = "dashed", color = "red", linewidth = 1) +
geom_text(aes(label = round(Lower_CI,3)), vjust = -0.5, size = 4) +
labs(
title = "One-Sided Lower Confidence Interval for Premium Feature Usage",
subtitle = "Red dashed line = 70% target",
x = "Confidence Level",
y = "Lower Bound of Usage Proportion"
) +
ylim(0,1) +
theme_minimal(base_size = 12)

# Check Target Achievement
target <- 0.70
ci_table$Target_Achieved <- lower_CI >= target
ci_table 

5.4 Interpretation

  • 90% confidence → lower bound 0.704 ≥ 0.70 → target met.

  • 95% confidence → lower bound 0.694 < 0.70 → target not guaranteed.

  • 99% confidence → lower bound 0.675 < 0.70 → target not guaranteed.


Conclusion:

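At 90% confidence the lower bound (0.704) is above 0.70, so the SaaS company can conclude that at least 70% of users use the premium feature. At 95% and 99% confidence the lower bounds (0.694 and 0.675) fall below 0.70, so the target is not statistically confirmed at those stricter levels.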
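A natural follow-up is to ask at which confidence level the lower bound would sit exactly on the 70% target. The sketch below solves \(\hat{p} - z \cdot SE = 0.70\) for \(z\) and converts it to a confidence level with `pnorm()`. This is an illustrative extension under the same normal approximation; `z_needed` and `max_conf` are assumed names.

```r
# Inputs from Case Study 5
n_users <- 250
premium_users <- 185
target <- 0.70

p_hat <- premium_users / n_users
se <- sqrt(p_hat * (1 - p_hat) / n_users)

# Critical value at which the lower bound equals the target
z_needed <- (p_hat - target) / se

# Highest one-sided confidence level at which the target is still met
max_conf <- pnorm(z_needed)

cat("z =", round(z_needed, 3),
    " maximum confidence level =", round(max_conf, 3), "\n")
```

Under these inputs the break-even point falls between 90% and 95%, which matches the pattern in the summary table.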