library(BSDA)
library(tidyverse)
library(ggpubr)
library(rstatix)
library(pander)
\(~\)
Problem 1: Seven manufacturing companies agreed to implement a time management program in hopes of improving productivity. The average times, in minutes, that the companies took to produce the same quantity and kinds of parts are listed in the table below. Does this information indicate that the program decreased the production time? Use a 0.05 level of significance and assume normal population distributions. (10 pts.)
\(~\)
Solution:
Statement of the Problem: Is the average time the companies took to produce the same quantity and kinds of parts before the time management program greater than the average time after the program?
Let \(\mu_1\) be the average time the companies took to produce the same quantity and kinds of parts before the program.
Let \(\mu_2\) be the average time the companies took to produce the same quantity and kinds of parts after the program.
Ho: The average time the companies took to produce the same quantity and kinds of parts before the time management program is less than or equal to the average time after the program (\(\mu_1 \le \mu_2\)).
Ha: The average time the companies took to produce the same quantity and kinds of parts before the time management program is greater than the average time after the program (\(\mu_1 > \mu_2\)).
Level of Significance: \(0.05\)
Test Statistic: t statistic for paired samples
Computation/Analysis using RStudio:
before <- c(75, 112, 89, 95, 80, 105, 110)
after <- c(70, 110, 88, 100, 80, 100, 99)
pander(t.test(before, after, paired = TRUE, mu=0, alternative = "greater"))
Test statistic | df | P value | Alternative hypothesis | mean of the differences |
---|---|---|---|---|
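1.439 | 6 | 0.1001 | greater | 2.714 |
Decision: Fail to reject Ho since \(p > 0.05\).
Conclusion: There is insufficient evidence that the time management program decreased the production time.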
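As a cross-check on the paired t-test result above, the statistic can be rebuilt by hand from the per-company differences using \(t = \bar{d} / (s_d / \sqrt{n})\) with \(n - 1\) degrees of freedom. The sketch below uses base R only; the names `d`, `t_stat`, and `p_value` are illustrative and do not come from the original solution.

```r
# Paired t statistic computed by hand (base R only).
before <- c(75, 112, 89, 95, 80, 105, 110)
after  <- c(70, 110, 88, 100, 80, 100, 99)

d  <- before - after                 # per-company differences (before - after)
n  <- length(d)
df <- n - 1                          # degrees of freedom for a paired test

# t = mean difference divided by its standard error
t_stat <- mean(d) / (sd(d) / sqrt(n))

# Upper-tail p-value, matching alternative = "greater"
p_value <- pt(t_stat, df, lower.tail = FALSE)

round(c(t = t_stat, df = df, p = p_value), 4)
```

The rounded values should agree with the test statistic, df, and P value reported by `t.test()`.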
\(~\)
Problem 2: The monthly returns in percentage of pesos of two investment portfolios were recorded for one year. Perform a hypothesis test at the 0.05 significance level to determine whether there is a significant difference in the mean monthly percentage returns between the two investment portfolios. (10 pts.)
\(~\)
Solution:
Statement of the Problem: Is there a significant difference in the mean monthly returns between Portfolio I and Portfolio II?
Let \(\mu_1\) be the mean monthly returns for Portfolio I.
Let \(\mu_2\) be the mean monthly returns for Portfolio II.
Ho: There is no significant difference in the mean monthly returns between Portfolio I and Portfolio II (\(\mu_1 = \mu_2\)).
Ha: There is a significant difference in the mean monthly returns between Portfolio I and Portfolio II (\(\mu_1 \ne \mu_2\)).
Level of Significance: \(0.05\)
Test Statistic: t-statistic for independent samples
Computation/Analysis:
port1 <- c(2.1, 1.2, -1.5, 1.9, 0.7, 2.5, 3.0, -2.2, 1.8, 0.5, 2.0, 1.5)
port2 <- c(2.9, 3.5, -2.8, 1.0, -3.0, 2.6, -3.5, 4.5, 1.5, 2.3, -1.0, 0.8)
var.test(port1, port2)
F test to compare two variances
data: port1 and port2
F = 0.3335, num df = 11, denom df = 11, p-value = 0.082
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.09600787 1.15848712
sample estimates:
ratio of variances
0.3335024
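The F test gives no significant evidence that the population variances differ \((p = 0.082 > 0.05)\), so equal variances are assumed for the t-test.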
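The F statistic reported by `var.test()` is the ratio of the two sample variances, and its two-sided p-value is twice the smaller tail probability of the F distribution. A minimal base-R sketch of that calculation follows; the names `f_ratio`, `p_lower`, and `p_value` are illustrative only.

```r
# Variance-ratio F test computed by hand (base R only).
port1 <- c(2.1, 1.2, -1.5, 1.9, 0.7, 2.5, 3.0, -2.2, 1.8, 0.5, 2.0, 1.5)
port2 <- c(2.9, 3.5, -2.8, 1.0, -3.0, 2.6, -3.5, 4.5, 1.5, 2.3, -1.0, 0.8)

f_ratio <- var(port1) / var(port2)   # ratio of sample variances
df1 <- length(port1) - 1             # numerator degrees of freedom
df2 <- length(port2) - 1             # denominator degrees of freedom

# Two-sided p-value: twice the smaller of the two tail probabilities
p_lower <- pf(f_ratio, df1, df2)
p_value <- 2 * min(p_lower, 1 - p_lower)

round(c(F = f_ratio, df1 = df1, df2 = df2, p = p_value), 4)
```

The rounded values should agree with the `var.test()` output shown above.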
pander(t.test(port1, port2, mu = 0, alternative="two.sided", var.equal = TRUE))
Test statistic | df | P value | Alternative hypothesis | mean of x | mean of y |
---|---|---|---|---|---|
0.4344 | 22 | 0.6683 | two.sided | 1.125 | 0.7333 |
Decision: Fail to reject Ho since \(p > 0.05\).
Conclusion: There is insufficient evidence to reject Ho. Hence, the data do not show a significant difference in the mean monthly returns between Portfolio I and Portfolio II.
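The equal-variance (pooled) t statistic used in this problem can also be reproduced step by step: pool the two sample variances, form the standard error of the difference in means, and divide. The following is a sketch in base R; `sp2`, `se`, `t_stat`, and `p_value` are illustrative names, not part of the original solution.

```r
# Pooled two-sample t statistic computed by hand (base R only).
port1 <- c(2.1, 1.2, -1.5, 1.9, 0.7, 2.5, 3.0, -2.2, 1.8, 0.5, 2.0, 1.5)
port2 <- c(2.9, 3.5, -2.8, 1.0, -3.0, 2.6, -3.5, 4.5, 1.5, 2.3, -1.0, 0.8)

n1 <- length(port1)
n2 <- length(port2)
df <- n1 + n2 - 2                    # degrees of freedom for the pooled test

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(port1) + (n2 - 1) * var(port2)) / df

# Standard error of the difference in means
se <- sqrt(sp2 * (1 / n1 + 1 / n2))

t_stat  <- (mean(port1) - mean(port2)) / se
p_value <- 2 * pt(abs(t_stat), df, lower.tail = FALSE)   # two-sided p-value

round(c(t = t_stat, df = df, p = p_value), 4)
```

The rounded values should agree with the `t.test(..., var.equal = TRUE)` table above.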
\(~\)
Problem 3: The following data represent the semi-monthly salaries (in thousands) of faculty members at four state universities. Faculty members were randomly selected from each school. At the 5% level of significance, is there a significant difference among the mean faculty salaries of the four state universities? (10 pts.)
\(~\)
Solution:
Statement of the Problem: Is there a significant difference among the mean faculty salaries of the four state universities?
Ho: There is no significant difference among the mean faculty salaries of the four state universities.
Ha: At least two state universities differ significantly in mean faculty salary.
Level of Significance: \(0.05\)
Test Statistic: F-statistic
Computations/Analysis:
salary <- c(15, 20, 16, 13, 17, 12, 19, 18, 10, 20, 23, 18, 16, 30, 15, 17, 16, 12)
univ <- c("A", "A", "A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C", "C", "D", "D", "D", "D")
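# factor() keeps univ a factor in R >= 4.0, where data.frame() no longer converts strings
data <- data.frame(univ = factor(univ), salary)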
data
univ salary
1 A 15
2 A 20
3 A 16
4 A 13
5 A 17
6 B 12
7 B 19
8 B 18
9 B 10
10 C 20
11 C 23
12 C 18
13 C 16
14 C 30
15 D 15
16 D 17
17 D 16
18 D 12
class(data$univ)
[1] "factor"
class(data$salary)
[1] "numeric"
ggboxplot(data, x="univ", y="salary")
data %>% group_by(univ) %>% identify_outliers(salary)
[1] univ salary is.outlier is.extreme
<0 rows> (or 0-length row.names)
# Homogeneity of variances:
levene_test(data, salary~univ)
# A tibble: 1 x 4
df1 df2 statistic p
<int> <int> <dbl> <dbl>
1 3 14 1.24 0.333
# Normality of residuals
anovamodel <- lm(salary ~ univ, data)
ggqqplot(residuals(anovamodel))
shapiro.test(residuals(anovamodel))
Shapiro-Wilk normality test
data: residuals(anovamodel)
W = 0.96518, p-value = 0.7037
\(~\)
There are no outliers; Levene's test gives no evidence against homogeneity of the population variances \((p = 0.333 > 0.05)\); the Shapiro-Wilk test gives no evidence against normality of the residuals \((p = 0.7037 > 0.05)\).
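The "no outliers" finding can be checked without `rstatix`. The sketch below assumes the usual boxplot rule, which flags values below \(Q_1 - 1.5 \times IQR\) or above \(Q_3 + 1.5 \times IQR\) within each group. This is believed to be the rule `identify_outliers()` applies, but treat that as an assumption. The helper `is_outlier()` is hypothetical and written only for this illustration.

```r
# Per-group outlier check with the 1.5 * IQR boxplot rule (base R only).
salary <- c(15, 20, 16, 13, 17, 12, 19, 18, 10, 20, 23, 18, 16, 30, 15, 17, 16, 12)
univ   <- factor(c("A", "A", "A", "A", "A", "B", "B", "B", "B",
                   "C", "C", "C", "C", "C", "D", "D", "D", "D"))

# Hypothetical helper: TRUE for values outside the 1.5 * IQR fences
is_outlier <- function(x) {
  q   <- unname(quantile(x, c(0.25, 0.75)))   # first and third quartiles
  iqr <- q[2] - q[1]
  x < q[1] - 1.5 * iqr | x > q[2] + 1.5 * iqr
}

# Apply the rule within each university and count flagged values per group
flags      <- tapply(salary, univ, is_outlier)
n_outliers <- sapply(flags, sum)
n_outliers
```

A count of zero in every group is consistent with the empty result returned by `identify_outliers()` above.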
\(~\)
pander(anova(anovamodel))
Source | Df | Sum Sq | Mean Sq | F value | Pr(>F) |
---|---|---|---|---|---|
univ | 3 | 136.2 | 45.4 | 2.905 | 0.07184 |
Residuals | 14 | 218.8 | 15.63 | NA | NA |
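The entries of the ANOVA table can be reproduced from the between-group and within-group sums of squares, with \(F = MS_{between} / MS_{within}\). The following base-R sketch shows that arithmetic; the names `ss_between`, `ss_within`, `f_stat`, and `p_value` are illustrative only.

```r
# One-way ANOVA F statistic computed by hand (base R only).
salary <- c(15, 20, 16, 13, 17, 12, 19, 18, 10, 20, 23, 18, 16, 30, 15, 17, 16, 12)
univ   <- factor(c("A", "A", "A", "A", "A", "B", "B", "B", "B",
                   "C", "C", "C", "C", "C", "D", "D", "D", "D"))

k <- nlevels(univ)                   # number of groups
N <- length(salary)                  # total number of observations

group_mean <- tapply(salary, univ, mean)
group_n    <- tapply(salary, univ, length)

# Between-group SS: group sizes times squared deviations of group means
ss_between <- sum(group_n * (group_mean - mean(salary))^2)

# Within-group SS: squared deviations of each value from its own group mean
ss_within <- sum((salary - ave(salary, univ))^2)

df_between <- k - 1
df_within  <- N - k

f_stat  <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)

round(c(SSB = ss_between, SSW = ss_within, F = f_stat, p = p_value), 4)
```

The rounded values should agree with the Sum Sq, F value, and Pr(>F) columns of the table above.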