Statistical Inference
Assignment Week 14
1 Introduction
Statistical inference enables analysts to draw conclusions about populations based on sample data through hypothesis testing and probability reasoning. These methods are essential for evaluating claims and supporting data-driven decisions.
This report explores key inferential techniques through several business-related case studies, including Z-tests, t-tests, two-sample comparisons, Chi-Square tests, and conceptual discussions on statistical errors and p-values. Together, these case studies demonstrate the role of statistical inference in practical decision-making contexts.
2 Case Study 1
This case study evaluates a company’s claim regarding the average study time of its users using a one-sample Z-test. Since the population standard deviation is known and the sample size is sufficiently large, a Z-test is appropriate for statistical inference.
2.1 Task 1
The hypotheses for this study are formulated as follows:
\[ \begin{aligned} H_0 &: \mu = 120 \\ H_1 &: \mu \neq 120 \end{aligned} \]
where \(\mu\) represents the population mean study time.
2.2 Task 2
To evaluate the hypotheses, the Z-test statistic is calculated using the formula:
\[ Z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \]
where:
\(Z\) : Z-test statistic
\(\bar{x}\) : sample mean
\(\mu_0\) : hypothesized population mean
\(\sigma\) : population standard deviation
\(n\) : sample size.
Substituting the observed sample values (\(\bar{x} = 116\), \(\mu_0 = 120\), \(\sigma = 15\), \(n = 64\)) into the formula:
\[ Z = \frac{116 - 120}{15 / \sqrt{64}} \]
\[ Z = \frac{-4}{15/8} \]
\[ Z = \frac{-4}{1.875} \]
\[ Z \approx -2.13 \]
2.3 Task 3
At a significance level of \(\alpha = 0.05\) for a two-tailed test, the critical Z-values are:
\[ Z_{\alpha/2} = \pm 1.96 \]
The null hypothesis is rejected if the absolute value of the test statistic exceeds the critical value.
2.4 Task 4
Since
\[ |Z| = 2.13 > 1.96 \]
the null hypothesis is rejected.
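The calculation and decision above can be reproduced with Python's standard library. This is a minimal sketch using the sample values from the substitution step; the variable names are illustrative, and the two-tailed p-value is an extra quantity that the report does not state for this case study.

```python
from math import sqrt
from statistics import NormalDist

# Sample values taken from the substitution step above
x_bar, mu0, sigma, n = 116, 120, 15, 64
alpha = 0.05

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

z = (x_bar - mu0) / (sigma / sqrt(n))        # test statistic
z_crit = std_normal.inv_cdf(1 - alpha / 2)   # two-tailed critical value
p_value = 2 * std_normal.cdf(-abs(z))        # two-tailed p-value (not reported in the text)
reject_h0 = abs(z) > z_crit

print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, p = {p_value:.3f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Comparing \(|Z|\) with the critical value and comparing the p-value with \(\alpha\) always lead to the same decision, so either rule can be used.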
2.5 Task 5
The results indicate that there is sufficient statistical evidence at the 5% significance level to conclude that the average study time of users differs from 120 minutes. The sample mean of 116 minutes suggests that the true average is lower than claimed.
3 Case Study 2
This case study examines whether the average task completion time of a new application differs from 10 minutes. Since the population standard deviation is unknown and the sample size is small, a one-sample t-test is used for statistical inference.
3.1 Task 1
The hypotheses for this study are defined as follows:
\[ \begin{aligned} H_0 &: \mu = 10 \\ H_1 &: \mu \neq 10 \end{aligned} \]
where \(\mu\) represents the population mean task completion time in minutes.
The following notation is used in the calculations for this case study:
\(\bar{x}\) : sample mean
\(x_i\) : individual observation
\(n\) : sample size.
3.2 Task 2
Because the population standard deviation is unknown and the sample size is limited to ten observations, a one-sample t-test is the appropriate statistical method for this analysis.
3.3 Task 3
The observed task completion times (in minutes) are:
\[ 9.2,\; 10.5,\; 9.8,\; 10.1,\; 9.6,\; 10.3,\; 9.9,\; 9.7,\; 10.0,\; 9.5 \]
- Sample Mean
The sample mean is calculated as:
\[ \bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i \]
\[ \bar{x} = \frac{98.6}{10} = 9.86 \]
- Sample Standard Deviation
The sample standard deviation is calculated using:
\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}} \]
where:
\(s\) : sample standard deviation
\(x_i\) : individual observation
\(\bar{x}\) : sample mean
\(n - 1\) : degrees of freedom.
The squared deviations from the mean sum to 1.344, so:
\[ s = \sqrt{\frac{1.344}{9}} \approx 0.386 \]
- t-Statistic
The t-statistic is computed as:
\[ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \]
where:
\(t\) : test statistic
\(\bar{x}\) : sample mean
\(\mu_0\) : hypothesized population mean
\(s\) : sample standard deviation
\(n\) : sample size.
\[ t = \frac{9.86 - 10}{0.386 / \sqrt{10}} \]
\[ t = \frac{-0.14}{0.122} \]
\[ t \approx -1.15 \]
- p-Value
With \(n - 1 = 9\) degrees of freedom, the two-tailed p-value corresponding to \(t = -1.15\) is approximately:
\[ p \approx 0.28 \]
The p-value represents the probability of observing a test statistic at least as extreme as the calculated value, assuming that the null hypothesis is true.
3.4 Task 4
At a significance level of \(\alpha = 0.05\), the p-value is greater than the significance level:
\[ p = 0.28 > 0.05 \]
Therefore, the null hypothesis is not rejected.
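The one-sample t-test can be recomputed directly from the listed observations. The sketch below uses only the standard library, so the p-value is obtained by numerically integrating the t density with Simpson's rule. That integration is an illustrative choice rather than the method used in the report, and the helper names `t_pdf` and `two_tailed_p` are hypothetical. A statistics package would normally handle this step.

```python
import math
from statistics import mean, stdev

# Observed task completion times (minutes) from Task 3
times = [9.2, 10.5, 9.8, 10.1, 9.6, 10.3, 9.9, 9.7, 10.0, 9.5]
mu0 = 10.0

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), integrating the density over [0, |t|] with Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)

n = len(times)
x_bar = mean(times)
s = stdev(times)  # sample standard deviation (n - 1 denominator)
t_stat = (x_bar - mu0) / (s / math.sqrt(n))
p_value = two_tailed_p(t_stat, n - 1)

print(f"mean = {x_bar:.2f}, s = {s:.3f}, t = {t_stat:.2f}, p = {p_value:.2f}")
```

Working from the raw data rather than from rounded intermediate values avoids carrying rounding error into the test statistic.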
3.5 Task 5 - Explanation
Because the sample size is small, the statistical power of the test is limited. Small samples increase sampling variability and reduce the ability of hypothesis tests to detect true differences. As a result, conclusions drawn from this analysis should be interpreted with caution.
4 Case Study 3
This case study compares the average session duration between two independent user groups exposed to Version A and Version B of an application. A two-sample t-test is conducted to determine whether the difference in mean session duration is statistically significant.
4.1 Task 1
The hypotheses for this A/B testing scenario are formulated as follows:
\[ \begin{aligned} H_0 &: \mu_A = \mu_B \\ H_1 &: \mu_A \neq \mu_B \end{aligned} \]
where \(\mu_A\) and \(\mu_B\) represent the population mean session durations for Version A and Version B, respectively.
4.2 Task 2
Because the two samples are independent and the population variances are unknown, a two-sample t-test is appropriate. Welch’s t-test is used as it does not assume equal variances between the two groups.
4.3 Task 3
The test statistic for the two-sample t-test is calculated as:
\[ t = \frac{\bar{x}_A - \bar{x}_B} {\sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}} \]
where:
\(t\) : test statistic
\(\bar{x}_A\) and \(\bar{x}_B\) : sample means of Version A and Version B
\(s_A\) and \(s_B\) : sample standard deviations
\(n_A\) and \(n_B\) : sample sizes of the two independent groups.
Substituting the sample statistics:
\[ t \approx -1.63 \]
With approximately 47 degrees of freedom, the corresponding two-tailed p-value is:
\[ p \approx 0.11 \]
The p-value represents the probability of observing a test statistic at least as extreme as the calculated value, assuming that the null hypothesis is true.
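Welch's statistic and its approximate degrees of freedom can be computed from summary statistics alone. The sketch below assumes the Welch-Satterthwaite approximation for the degrees of freedom, which the report does not spell out. The function name `welch_t` and its parameters are hypothetical.

```python
from math import sqrt

def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, df) for Welch's two-sample t-test from summary statistics."""
    var_a = sd_a ** 2 / n_a  # squared standard error of group A's mean
    var_b = sd_b ** 2 / n_b  # squared standard error of group B's mean
    t = (mean_a - mean_b) / sqrt(var_a + var_b)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (var_a + var_b) ** 2 / (var_a ** 2 / (n_a - 1) + var_b ** 2 / (n_b - 1))
    return t, df
```

The sample means, standard deviations, and group sizes are not reproduced in this section. Passing them to `welch_t` should give values close to the reported \(t \approx -1.63\) with about 47 degrees of freedom.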
4.4 Task 4
At a significance level of \(\alpha = 0.05\), the p-value is compared to the significance level:
\[ p = 0.11 > 0.05 \]
Therefore, the null hypothesis is not rejected.
4.5 Task 5 - Interpretation
The results indicate that there is insufficient statistical evidence at the 5% significance level to conclude that the average session duration differs significantly between Version A and Version B.
5 Case Study 4
This case study examines whether the choice of payment method is associated with the type of device used by customers. A Chi-Square Test of Independence is applied to evaluate the relationship between two categorical variables.
5.1 Task 1
The hypotheses for this study are defined as follows:
\[ \begin{aligned} H_0 &: \text{Device type and payment method preference are independent} \\ H_1 &: \text{Device type and payment method preference are not independent} \end{aligned} \]
The null hypothesis assumes that the choice of payment method does not depend on the type of device used, while the alternative hypothesis assumes that an association exists between the two variables.
5.2 Task 2
Since both device type and payment method preference are categorical variables, a Chi-Square Test of Independence is the appropriate statistical test to assess whether there is an association between them.
5.3 Task 3
- Chi-Square Test Statistic
The Chi-Square test statistic is computed using the following formula:
\[ \chi^2 = \sum \frac{(O - E)^2}{E} \]
where \(O\) denotes the observed frequency and \(E\) denotes the expected frequency under the assumption of independence.
Based on the observed data, the Chi-Square test statistic is:
\[ \chi^2 \approx 3.71 \]
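The formula can be applied to any contingency table of observed counts. In this minimal sketch, the expected count for each cell is the row total times the column total divided by the grand total. The function name `chi_square_statistic` is hypothetical, and the report's observed table is not reproduced here.

```python
def chi_square_statistic(observed):
    """Return (chi2, df) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence
            e = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - e) ** 2 / e

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df
```

The degrees of freedom returned alongside the statistic, (rows - 1) × (columns - 1), are needed to convert the statistic into a p-value.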
5.4 Task 4
With a Chi-Square statistic of \(\chi^2 \approx 3.71\), evaluated against the Chi-Square distribution with the degrees of freedom of the contingency table, the corresponding p-value is approximately:
\[ p \approx 0.16 \]
The p-value represents the probability of observing a Chi-Square statistic at least as extreme as the calculated value, assuming that the null hypothesis is true.
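The reported p-value can be sanity-checked in one line. The report does not state the degrees of freedom; the sketch below assumes 2 (as for a 2×3 table), which is consistent with the reported \(p \approx 0.16\). For exactly 2 degrees of freedom, the upper-tail probability of the Chi-Square distribution has the closed form \(e^{-x/2}\).

```python
from math import exp

chi2 = 3.71      # statistic reported in the text
df_assumed = 2   # assumption: the report does not state the degrees of freedom

# For exactly 2 degrees of freedom, P(X >= x) = exp(-x / 2)
p_value = exp(-chi2 / 2)
print(f"df = {df_assumed}, p = {p_value:.2f}")
```

For other degrees of freedom no such shortcut applies, and the p-value would need a Chi-Square table or a statistics library.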
5.5 Task 5 - Interpretation
At the 5% significance level (\(\alpha = 0.05\)), the p-value is greater than the significance level:
\[ p = 0.16 > 0.05 \]
Therefore, the null hypothesis is not rejected.
This result indicates that there is insufficient statistical evidence to conclude that payment method preference is associated with device type.
6 Case Study 5
This case study discusses the concepts of Type I and Type II errors in the context of statistical hypothesis testing and decision-making.
6.1 Task 1
A Type I error occurs when the null hypothesis is rejected even though it is actually true. In this context, a Type I error would mean concluding that a change or effect exists when, in reality, there is no true effect.
6.2 Task 2
A Type II error occurs when the null hypothesis is not rejected even though it is false. In this context, a Type II error would mean failing to detect a real effect or improvement that actually exists.
6.3 Task 3
From a business perspective, a Type II error is often more costly because it results in missed opportunities, such as failing to implement a change that could improve performance or revenue. However, the relative cost depends on the specific business context and risk tolerance.
6.4 Task 4
An increase in sample size reduces the probability of committing a Type II error. Larger samples provide more information and increase the ability of a statistical test to detect a true effect when it exists.
6.5 Task 5 - Explanation
The significance level \(\alpha\) represents the probability of a Type I error, while \(\beta\) represents the probability of a Type II error. Statistical power is defined as \(1 - \beta\) and represents the probability of correctly rejecting a false null hypothesis. Reducing \(\alpha\) generally increases \(\beta\), while increasing sample size can reduce \(\beta\) without increasing \(\alpha\), thereby improving statistical power.
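The trade-off between \(\alpha\), \(\beta\), and sample size can be illustrated with the power of a two-sided one-sample Z-test. This is a sketch under stated assumptions: \(\sigma = 15\) is borrowed from Case Study 1, the true difference of 4 minutes is hypothetical, and the function name `z_test_power` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample Z-test when the true mean is mu0 + delta."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * sqrt(n) / sigma  # mean of the Z statistic under the alternative
    # Probability that Z falls in either rejection region
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

# Hypothetical true difference of 4 with sigma = 15, for increasing sample sizes
for n in (16, 64, 256):
    power = z_test_power(delta=4, sigma=15, n=n)
    print(f"n = {n:3d}: power = {power:.2f}, beta = {1 - power:.2f}")
```

Calling the function with a smaller `alpha` (for example 0.01) at a fixed `n` returns a lower power, which is the \(\alpha\) versus \(\beta\) trade-off described above.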
7 Case Study 6
This case study evaluates the performance of a churn prediction model using hypothesis testing and p-value interpretation.
7.1 Task 1
The p-value represents the probability of observing a test statistic at least as extreme as the calculated value, assuming that the null hypothesis is true. In this case, a p-value of 0.021 means that, if the churn prediction model had no real effect, results at least as extreme as those observed would occur with a probability of 2.1%. It is not the probability that the null hypothesis is true.
7.2 Task 2
At a significance level of \(\alpha = 0.05\), the p-value is compared to the significance level:
\[ p = 0.021 < 0.05 \]
Therefore, the null hypothesis is rejected.
7.3 Task 3
The results suggest that the churn prediction model shows a statistically significant improvement, and that its observed performance is unlikely to be due to random chance alone. This supports using the model to identify customer churn, although statistical significance by itself does not show how large the practical benefit is.
7.4 Task 4
If the sample used to evaluate the model is not representative of the overall customer population, the results may be biased. This could lead to overestimating the effectiveness of the churn prediction model and making decisions that do not generalize well to real-world customers.
7.5 Task 5 - Explanation
The p-value indicates statistical significance but does not provide information about the magnitude of the effect. A small p-value can occur even when the actual improvement is small, especially with large sample sizes. Effect size metrics are required to assess how meaningful or impactful the model improvement truly is.
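The dependence of the p-value on sample size can be shown with a fixed, very small effect. The sketch below assumes a two-tailed Z-test on a standardized mean difference. The effect size of 0.02 standard deviations and the sample sizes are hypothetical and are not taken from the churn model evaluation.

```python
from math import sqrt
from statistics import NormalDist

def p_for_effect(effect_size, n):
    """Two-tailed Z-test p-value for a standardized mean difference observed with n samples."""
    z = effect_size * sqrt(n)
    return 2 * NormalDist().cdf(-abs(z))

effect_size = 0.02  # hypothetical: a difference of 0.02 standard deviations
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: p = {p_for_effect(effect_size, n):.4f}")
```

The effect size is identical in every row, yet the p-value drops below 0.05 once the sample is large enough, which is why an effect size measure should be reported alongside the p-value.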