#PROBLEM 1:
A School District Supervisor wants to verify the claim that female school heads/principals have better managerial skills than their male counterparts. He obtained data on the management skills of random samples of 13 male and 12 female school heads from four school districts using an adapted management skills inventory.

  1. Suppose these samples were drawn randomly from independent populations of female and male school heads and that these populations are N(\(\mu_{f},\sigma^2_{f}\)) and N(\(\mu_{m},\sigma^2_{m}\)), where \(\sigma^2_{f} \neq \sigma^2_{m}\) and both are unknown. Use N(70,49) and N(60,64) as the prior distributions of \(\mu_{f}\) and \(\mu_{m}\), respectively. Find the posterior distributions of \(\mu_{f}\) and \(\mu_{m}\).

SOLUTION:

Because the population variances are unequal and unknown, each is estimated by its sample variance.

F <- c(75,65,55,80,67,65,67,71,79,59,63,69)
M <- c(66,61,60,56,78,63,59,68,77,62,50,49,57)

print(cbind(m1=mean(F),v1=var(F),m2=mean(M), v2=var(M)))
##            m1       v1 m2       v2
## [1,] 67.91667 56.26515 62 76.83333

We use the priors N(70,49) and N(60,64) for \(\mu_{f}\) and \(\mu_{m}\), respectively. The posterior distribution of \(\mu_{f}\) is N(\(\mu'_{f}\),\(\sigma'^2_{f}\)). On the right-hand sides below, \(\mu_{f}=70\) and \(\sigma^2_{f}=49\) denote the prior mean and prior variance, and \(S^2_{f}\) is the sample variance:

\[\mu'_{f} = \frac{S^2_{f}\mu_{f}+ n_{1} \sigma^2_{f}\bar{y}_{1}}{S^2_{f}+ n_{1} \sigma^2_{f}}= \frac{56.26515(70)+12(49)(67.91667)}{56.26515+12(49)} \approx 68.09861198\]

\[\sigma'^2_{f} = \frac{S^2_{f}\sigma^2_{f}}{S^2_{f}+n_{1}\sigma^2_{f}} = \frac{56.26515(49)}{56.26515 + 12(49)} \approx 4.27928214\]

Similarly, with prior mean \(\mu_{m}=60\), prior variance \(\sigma^2_{m}=64\), and sample variance \(S^2_{m}\), the posterior distribution of \(\mu_{m}\) is N(\(\mu'_{m}\),\(\sigma'^2_{m}\)), where

\[\mu'_{m} = \frac{S^2_{m}\mu_{m}+ n_{2} \sigma^2_{m}\bar{y}_{2}}{S^2_{m}+ n_{2} \sigma^2_{m}} = \frac{76.83333(60)+13(64)(62)}{76.83333+13(64)}= 61.83091877\approx 61.831\]

\[\sigma'^2_{m} = \frac{S^2_{m}\sigma^2_{m}}{S^2_{m}+n_{2}\sigma^2_{m}} = \frac{76.83333(64)}{76.83333 + 13(64)} = 5.410599455\approx 5.4106\]
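The two hand computations above follow the same update rule, so they can be checked with a small helper. This is only a sketch: the function name `normal_posterior` and the variable names `female`, `male`, `post_f`, and `post_m` are hypothetical and do not come from the original solution. It assumes, as the solution does, that the sample variance stands in for the unknown population variance.

```r
# Hypothetical helper: normal prior + normal data, with the sample variance
# used in place of the unknown population variance (an approximation).
normal_posterior <- function(y, prior_mean, prior_var) {
  n  <- length(y)
  s2 <- var(y)                      # sample variance of the data
  denom <- s2 + n * prior_var
  c(mean = (s2 * prior_mean + n * prior_var * mean(y)) / denom,
    var  = (s2 * prior_var) / denom)
}

female <- c(75,65,55,80,67,65,67,71,79,59,63,69)
male   <- c(66,61,60,56,78,63,59,68,77,62,50,49,57)

post_f <- normal_posterior(female, prior_mean = 70, prior_var = 49)
post_m <- normal_posterior(male,   prior_mean = 60, prior_var = 64)

# One row per group: posterior mean and posterior variance
print(rbind(female = post_f, male = post_m))
```

The printed values should agree with the hand computations up to rounding (about 68.0986 and 4.2793 for females, 61.8309 and 5.4106 for males).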

  2. Find the posterior distribution of \(\mu_{d}=\mu_{f}-\mu_{m}\).

SOLUTION:

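Because the two posteriors are independent normal distributions, the posterior distribution of \(\mu_{d}\) is also normal, with variance \(\sigma'^2_{f}+\sigma'^2_{m}\) and mean

\(\mu'_{d} = \mu'_{f}-\mu'_{m} = 68.09861198 - 61.83091877 = 6.267693\)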

mu_d = 68.09861198 - 61.83091877

print(cbind(mu_d))
##          mu_d
## [1,] 6.267693
  3. Construct a 95% credible interval for \(\mu_{d}\).

SOLUTION:

F <- c(75,65,55,80,67,65,67,71,79,59,63,69)
M <- c(66,61,60,56,78,63,59,68,77,62,50,49,57)

v1=var(F)
n1 = length(F)
v2=var(M)
n2 = length(M)

print(cbind(v1,n1,v2,n2))
##            v1 n1       v2 n2
## [1,] 56.26515 12 76.83333 13
# Using Satterthwaite’s approximation of degrees of freedom.
df = (((v1/n1) + (v2/n2))^2)/ (((v1/n1)^2/(n1-1))+((v2/n2)^2/(n2-1)))
df
## [1] 22.88192

The posterior variance of \(\mu_{d}\) is the sum of the two posterior variances: \[\sigma'^2_{d} = \sigma'^2_{f} + \sigma'^2_{m} = 4.27928214 + 5.410599455 = 9.689882\]

sigma_d = 4.27928214 + 5.410599455
Tvalue = qt(0.025,22.882,lower.tail = FALSE)
Tvalue 
## [1] 2.069248
Lower_CI = mu_d - (Tvalue)*sqrt(sigma_d)
Upper_CI = mu_d + (Tvalue)*sqrt(sigma_d)

print(cbind(Lower_CI, Upper_CI))
##        Lower_CI Upper_CI
## [1,] -0.1735811 12.70897

Therefore, the 95% credible interval is given by

\[\mu'_{d} \pm t_{\alpha/2,\,df} \sqrt{\sigma'^2_{d}} = 6.267693 \pm 2.069248 \sqrt{9.689882} = (-0.1735811,\ 12.70897)\]
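The interval above was assembled from rounded intermediate values. As a cross-check, the sketch below recomputes it from the raw data in one pass. The helper names `posterior` and `satterthwaite_df` and the variables `female`, `male`, `nu`, and `ci` are hypothetical, introduced only for illustration, and the block assumes the same sample-variance approximation and Satterthwaite degrees of freedom used in the solution.

```r
# Hypothetical helper: posterior mean and variance for one group
posterior <- function(y, prior_mean, prior_var) {
  n <- length(y); s2 <- var(y)
  denom <- s2 + n * prior_var
  c(mean = (s2 * prior_mean + n * prior_var * mean(y)) / denom,
    var  = (s2 * prior_var) / denom)
}

# Hypothetical helper: Satterthwaite approximation to the degrees of freedom
satterthwaite_df <- function(y1, y2) {
  a <- var(y1) / length(y1)
  b <- var(y2) / length(y2)
  (a + b)^2 / (a^2 / (length(y1) - 1) + b^2 / (length(y2) - 1))
}

female <- c(75,65,55,80,67,65,67,71,79,59,63,69)
male   <- c(66,61,60,56,78,63,59,68,77,62,50,49,57)

pf <- posterior(female, 70, 49)
pm <- posterior(male,   60, 64)

mean_d <- unname(pf["mean"] - pm["mean"])  # posterior mean of mu_d
var_d  <- unname(pf["var"]  + pm["var"])   # posterior variance of mu_d
nu     <- satterthwaite_df(female, male)

# 95% credible interval: mean +/- t quantile * posterior standard deviation
ci <- mean_d + c(-1, 1) * qt(0.975, df = nu) * sqrt(var_d)
print(ci)
```

Working from unrounded values can change the last printed digits slightly, but the endpoints should match the interval above to about three decimal places.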

  4. Test the hypothesis that females have managerial skills superior to those of males. Use a 5% level of significance.

\(H_{0}:\mu_{f} \leq \mu_{m}\) vs.  \(H_{1}:\mu_{f} > \mu_{m}\)

\(\alpha = 0.05\)

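The posterior probability of \(H_{0}\) is \(P(\mu_{d} \leq 0 \mid \text{data}) = P\left(t_{df} \geq \mu'_{d}/\sqrt{\sigma'^2_{d}}\right)\), with \(df = 22.88192\):

# Standardized distance of the posterior mean from 0
t_stat = mu_d/sqrt(sigma_d)
# Posterior probability of H0: P(mu_d <= 0 | data)
pt(t_stat, df = 22.88192, lower.tail = FALSE)

The statistic is \(6.267693/\sqrt{9.689882} \approx 2.01\), which exceeds the one-sided 5% critical value \(t_{0.05,\,22.88} \approx 1.71\). The posterior probability of \(H_{0}\) is therefore less than the 5% level of significance, so we reject \(H_{0}\).

Therefore, the data are sufficient to conclude that female school heads/principals have better managerial skills than their male counterparts.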

#PROBLEM 2:

With the COVID-19 pandemic, public school teachers in the country were busy preparing and printing course modules for distribution to their pupils. One of the important parts of the module is a pretest and a posttest. The pretest and posttest scores of a class of 15 pupils in Mathematics are obtained from the class record of a teacher.

#Compute the posttest-minus-pretest differences, their sample variance (stored in Sigma_d), and their mean.

Pre_Test <- c(18,21,16,22,19,24,17,21,23,18,14,16,16,19,18)
Post_Test <- c(22,25,17,24,16,29,20,23,19,20,15,15,18,26,18)

d <- Post_Test-Pre_Test

Sigma_d <- var(d)
ybar_d <- mean(d)

Sigma_d
## [1] 8.380952
ybar_d
## [1] 1.666667
  1. Assume that the pairwise differences are distributed as N(\(\mu_{d},\sigma^2_{d}\)), with \(\sigma^2_{d}\) unknown. Using a N(-3,4) prior for \(\mu_{d}\), find the posterior distribution for \(\mu_{d}\).
mu_prime <- ((Sigma_d)*(-3) + 15*(4)*(ybar_d ))/ ((Sigma_d) + (15*4))

sigma_prime <-  ((Sigma_d)*4)/((Sigma_d)+ (15*4))

#The posterior distribution is:
print(cbind(mu_prime, sigma_prime))
##      mu_prime sigma_prime
## [1,] 1.094708   0.4902507

With prior mean \(\mu_{p}=-3\), prior variance \(\sigma^2_{p}=4\), sample variance \(S^2_{d}\), and \(n=15\):

\[\mu'_{d} = \frac{S^2_{d}\mu_{p}+ n \sigma^2_{p}\bar{y}_{d}}{S^2_{d}+ n \sigma^2_{p}}= \frac{8.380952(-3)+15(4)(1.666667)}{8.380952+15(4)} \approx 1.094708\]

\[\sigma'^2_{d} = \frac{S^2_{d}\sigma^2_{p}}{S^2_{d}+n\sigma^2_{p}} = \frac{8.380952(4)}{8.380952 + 15(4)} \approx 0.4902507\]

Thus, the posterior distribution of \(\mu_{d}\) is N(1.09,0.49).
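
The same update can also be read as a precision-weighted average, which shows how much weight the prior receives. The lines below are a sketch of that equivalent form, using the quantities already computed and assuming, as the solution does, that \(S^2_{d}/n\) stands in for the unknown sampling variance of \(\bar{y}_{d}\):

\[\frac{1}{\sigma'^2_{d}} = \frac{1}{\sigma^2_{p}} + \frac{n}{S^2_{d}} = \frac{1}{4} + \frac{15}{8.380952} \approx 0.25 + 1.789773 = 2.039773\]

\[\mu'_{d} = \sigma'^2_{d}\left(\frac{\mu_{p}}{\sigma^2_{p}} + \frac{n\,\bar{y}_{d}}{S^2_{d}}\right) \approx \frac{-0.75 + 2.982955}{2.039773} \approx 1.0947\]

The data supply about \(1.79/2.04 \approx 88\%\) of the total precision, so the posterior mean lies much closer to \(\bar{y}_{d} = 1.67\) than to the prior mean of \(-3\).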

  2. At the 5% level of significance, test the hypothesis that the module is effective, that is, that the mean posttest score is greater than the mean pretest score.

HYPOTHESIS TESTING:

\(H_{0}: \mu_{d} \leq 0\) vs. \(H_{1}: \mu_{d} > 0\)

\(\alpha = 0.05\)

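The posterior probability of \(H_{0}\) is \(P(\mu_{d} \leq 0 \mid \text{data}) = P\left(t_{14} \geq \mu'_{d}/\sqrt{\sigma'^2_{d}}\right)\), computed from the posterior N(1.094708, 0.4902507) with \(df = n-1 = 14\):

# Standardized distance of the posterior mean from 0
t_stat <- mu_prime/sqrt(sigma_prime)
# Posterior probability of H0: P(mu_d <= 0 | data), df = n-1
pt(t_stat, df = 14, lower.tail = FALSE)

The statistic is \(1.094708/\sqrt{0.4902507} \approx 1.56\), which is below the one-sided 5% critical value \(t_{0.05,14} \approx 1.76\). The posterior probability of \(H_{0}\) is therefore greater than the 5% level of significance, so we do not reject \(H_{0}\).

Therefore, under the N(-3,4) prior, the data are not sufficient to conclude that the mean posttest score is greater than the mean pretest score.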