Population parameters: \(\mu = 400\) and \(\sigma = 24\). The sample size (\(n = 144\)) is relatively large, so by the central limit theorem the sample mean \(\bar{X}\) is approximately normally distributed with a mean of 400 and a standard deviation (the standard error) of \(\frac{\sigma}{\sqrt{n}} = \frac{24}{\sqrt{144}} = \frac{24}{12} = 2\)
\[P \left(395.5 \leq \bar{X} \leq 404.5 \right)= P \left(\frac{395.5-400}{2} \leq \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \leq \frac{404.5 - 400}{2} \right)\]
We need to find the probability that a standard normal variable lies between -2.25 and +2.25.
\[P(-2.25 \leq z \leq 2.25)\] Since pnorm(2.25) gives the probability that a standard normal variable is below 2.25 (that is, 2.25 standard deviations above the mean), the probability of lying between the mean and 2.25 is pnorm(2.25) - pnorm(0). By the symmetry of the distribution, multiplying this by two gives the probability of lying between -2.25 and +2.25.
round(2 * (pnorm(2.25) - pnorm(0)), 2)
## [1] 0.98
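As a cross-check, the same probability can be obtained without standardising by hand, by passing the mean and standard error of the sample mean directly to pnorm(). This is a minimal sketch: the names mu, sigma, n, se, p_direct and p_symmetry are illustrative and not part of the original working, and n = 144 is taken from the standard-error calculation above.

```r
# Illustrative names; values taken from the problem statement above
mu    <- 400
sigma <- 24
n     <- 144
se    <- sigma / sqrt(n)   # standard error of the sample mean

# Probability computed directly on the scale of the sample mean
p_direct <- pnorm(404.5, mean = mu, sd = se) - pnorm(395.5, mean = mu, sd = se)

# Same probability from the symmetry argument on the z scale
p_symmetry <- 2 * pnorm(2.25) - 1

round(c(p_direct, p_symmetry), 2)
```

Both routes should agree, since 2 * (pnorm(2.25) - pnorm(0)) is algebraically the same as 2 * pnorm(2.25) - 1.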
This is the light blue area under the standard normal density curve. It represents 98 percent.
Below you are given ages that were obtained by taking a random sample of 6 MBA students. Assume the population has a normal distribution.
40 42 43 39 37 39
popdata <- c(40, 42, 43, 39, 37, 39)
# Sample mean
popmean <- sum(popdata) / length(popdata)
# Sample variance, using the n - 1 divisor
popvar <- sum((popdata - popmean)^2) / (length(popdata) - 1)
# Sample standard deviation
popsd <- sqrt(popvar)
The sample mean is 40 and the sample standard deviation is 2.19.
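The hand calculation can be checked against R's built-in mean(), var() and sd(), which use the same n - 1 divisor. A short sketch follows; ages is an illustrative name for the same six observations.

```r
# Same six observations as above (illustrative variable name)
ages <- c(40, 42, 43, 39, 37, 39)

mean(ages)  # sample mean
var(ages)   # sample variance (divisor n - 1)
sd(ages)    # sample standard deviation
```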
This is a small sample from a normally distributed population with unknown variance. For the \(90\%\) confidence interval we use the t-distribution and find the 5% cutoff in each tail (half of 1 - 90%). There are n - 1 = 5 degrees of freedom.
t_stat_90 <- qt(0.05, 5)
myconfint_90 <- t_stat_90 * popsd/sqrt(length(popdata))
popmean + myconfint_90
## [1] 38.19769
popmean - myconfint_90
## [1] 41.80231
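The 5% quantile of a t-distribution with 5 degrees of freedom is -2.015. Because this quantile is negative, popmean + myconfint_90 is the lower limit and popmean - myconfint_90 is the upper limit, so the \(90\%\) confidence interval is 38.1977 to 41.8023.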
=======================
For the \(99\%\) confidence intervals, you would use half of one percent for each tail.
t_stat_99 <- qt(0.005, 5)
myconfint_99 <- t_stat_99 * popsd/sqrt(length(popdata))
popmean + myconfint_99
## [1] 36.39354
popmean - myconfint_99
## [1] 43.60646
Therefore the \(99\%\) confidence interval is 36.3935 to 43.6065.
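Both intervals can be cross-checked with t.test() from base R's stats package, which returns a t-based confidence interval for the mean in its conf.int component. This is only a sketch; the names ages, ci_90 and ci_99 are illustrative and not part of the original working.

```r
# Same six observations as above (illustrative variable name)
ages <- c(40, 42, 43, 39, 37, 39)

# t-based confidence intervals for the mean at the two levels used above
ci_90 <- t.test(ages, conf.level = 0.90)$conf.int
ci_99 <- t.test(ages, conf.level = 0.99)$conf.int

round(as.numeric(ci_90), 4)
round(as.numeric(ci_99), 4)
```

The limits should match the manual qt() calculation, with the wider interval belonging to the higher confidence level.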
Find the confidence interval for the difference in means, where \(n_m = 100\) and \(n_w = 200\), \(\bar{W_m} = 8000\) and \(\bar{W_w} = 6000\), \(s_m = 1500\) and \(s_w = 2500\).
\[\text{Standard Error of difference} = \sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}} = \sqrt{\frac{1500^2}{100} + \frac{2500^2}{200}} = 231.8\] Given the large sample sizes, the critical values for a 95% confidence interval are \(z = \pm 1.96\).
Therefore, the confidence limits are given by
\[\bar{W_m} - \bar{W_w} \pm z_{\alpha/2} * \sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}}\]
WageDiff <- 8000 - 6000
Z_score_95 <- qnorm(0.975)
StandErr <- sqrt(1500^2/100 + 2500^2/200)
UCI <- WageDiff + Z_score_95 * StandErr
LCI <- WageDiff - Z_score_95 * StandErr
Therefore the 95% confidence interval for the wage difference between men and women is £1546 to £2454.
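The same calculation can be wrapped in a small function so that it can be reused with other summary statistics. This is a sketch under the large-sample (z) assumption used above; diff_means_ci and its argument names are hypothetical helpers, not part of the original working.

```r
# Hypothetical helper: large-sample z-interval for the difference between
# two independent sample means, given summary statistics only.
diff_means_ci <- function(mean1, mean2, s1, s2, n1, n2, conf_level = 0.95) {
  alpha <- 1 - conf_level
  z     <- qnorm(1 - alpha / 2)            # two-sided critical value
  se    <- sqrt(s1^2 / n1 + s2^2 / n2)     # standard error of the difference
  d     <- mean1 - mean2
  c(lower = d - z * se, upper = d + z * se)
}

# Wage example from above: men 8000 (s = 1500, n = 100), women 6000 (s = 2500, n = 200)
ci_wage <- diff_means_ci(8000, 6000, 1500, 2500, 100, 200)
round(ci_wage)
```

Passing the summary statistics as arguments keeps the formula in one place, so a different confidence level only requires changing conf_level.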
The wage difference is \(\bar{W_m} -\bar{W_w} = 7600 - 7200 = 400\)
The standard error of the difference is \(\sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}} = \sqrt{\frac{750^2}{50} + \frac{1225^2}{75}} = 176.8\)
Given the large samples, we use the z-score for the critical values.
# 2.5 percent in each tail
qnorm(0.975)
## [1] 1.959964
The confidence limits are
\[\bar{W_m} - \bar{W_w} \pm z_{\alpha/2} * \sqrt{\frac{s_m^2}{n_m} + \frac{s_w^2}{n_w}} = 400 \pm 1.96 * 176.8\]
# Information
Nmen <- 50
Nwomen <- 75
Mmean <- 7600
Wmean <- 7200
s_m <- 750
s_w <- 1225
#===================
Mmean - Wmean - qnorm(0.975) * sqrt(s_m^2/Nmen + s_w^2/Nwomen)
## [1] 53.47785
Mmean - Wmean + qnorm(0.975) * sqrt(s_m^2/Nmen + s_w^2/Nwomen)
## [1] 746.5221
The information that we have is
\(\mu = 75; \bar{x} = 71.4; s = 31.9; n = 60\)
The null and the alternative hypotheses are:
\[H_0: \mu = 75 \qquad H_1: \mu \neq 75\]
As the sample size is more than 30 we can assume a normal distribution for the mean and use a two-tailed z-test.
\[z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{71.4 - 75}{31.9 / \sqrt{60}} = -0.87\]
The critical values for z are
qnorm(p = c(0.025, 0.975))
## [1] -1.959964 1.959964
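That is, -1.96 and +1.96. These are the critical values for a two-tailed test on the standard normal distribution at the 5% significance level (95% confidence).
The test statistic of -0.87 lies between -1.96 and +1.96. Therefore, there is insufficient evidence to reject the null hypothesis that the mean number of hits is 75.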
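The same decision can be reached through a p-value rather than by comparing against the critical values. The sketch below recomputes the test statistic and its two-tailed p-value; the names mu0, xbar, s, n, z_stat, p_value and reject_null are illustrative, and the 5% significance level is the one assumed above.

```r
# Values from the problem statement (illustrative variable names)
mu0  <- 75     # mean under the null hypothesis
xbar <- 71.4   # sample mean
s    <- 31.9   # sample standard deviation
n    <- 60     # sample size

z_stat  <- (xbar - mu0) / (s / sqrt(n))
p_value <- 2 * pnorm(-abs(z_stat))   # two-tailed p-value

reject_null <- p_value < 0.05        # decision at the 5% level
c(z = round(z_stat, 2), p = round(p_value, 3))
```

A p-value above 0.05 corresponds to a test statistic inside the range -1.96 to +1.96, so the two approaches give the same conclusion.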
The information is \(\mu = £28,000; \bar{x} = £32,200; s = £8,000; n = 50\)
The sample size is greater than 30, so we use a z-test. Because we are testing whether the mean salary has increased, the test is one-tailed.
\[z = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{32200 - 28000}{8000/\sqrt{50}} = 3.71\]
The critical value for a one-sided z-test at the 5% significance level is
qnorm(0.95)
## [1] 1.644854
That is 1.64. Since the test statistic of 3.71 exceeds 1.64, we reject the null hypothesis and conclude that there is sufficient evidence that the mean salary for economists has increased.
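The one-tailed test can also be sketched in R with an upper-tail p-value. The variable names below are illustrative and not part of the original working; the 5% significance level matches the qnorm(0.95) cutoff used above.

```r
# Values from the problem statement (illustrative variable names)
mu0  <- 28000   # mean salary under the null hypothesis
xbar <- 32200   # sample mean salary
s    <- 8000    # sample standard deviation
n    <- 50      # sample size

z_stat  <- (xbar - mu0) / (s / sqrt(n))
p_value <- pnorm(z_stat, lower.tail = FALSE)   # upper-tail p-value

reject_null <- z_stat > qnorm(0.95)            # one-sided test at the 5% level
round(z_stat, 2)
p_value
```

Using lower.tail = FALSE gives the upper-tail area directly, which is the relevant tail when the alternative is that the mean has increased.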