First homework for DACSS 603.
##Question 1
| Surgical Procedure | Sample Size | Mean Wait Time | Standard Deviation |
|---|---|---|---|
| Bypass | 539 | 19 | 10 |
| Angiography | 847 | 18 | 9 |
Construct the 90% confidence interval to estimate the actual mean wait time for each of the two procedures. Is the confidence interval narrower for angiography or bypass surgery?
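# 90% confidence limits: mean -/+ z * sd / sqrt(n), with z = 1.65
bypass_l <- 19 - (1.65 * (10 / sqrt(539)))
bypass_h <- 19 + (1.65 * (10 / sqrt(539)))
# Width of the bypass interval (upper limit minus lower limit)
bypass_diff <- bypass_h - bypass_l
angio_l <- 18 - (1.65 * (9 / sqrt(847)))
angio_h <- 18 + (1.65 * (9 / sqrt(847)))
# Width of the angiography interval
angio_diff <- angio_h - angio_l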
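CI for Bypass \(19 \pm 1.65 \left( 10/\sqrt{539} \right)\) = (18.29, 19.71) | Width = 1.4214106
CI for Angiography \(18 \pm 1.65 \left( 9/\sqrt{847} \right)\) = (17.49, 18.51) | Width = 1.0205041
The confidence interval is narrower for angiography, which has both a larger sample size and a smaller standard deviation than bypass surgery.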
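The two intervals use the same formula, so the arithmetic can be wrapped in one small function. This is only a sketch: `ci_from_summary` is a hypothetical helper name, and it assumes the same rounded critical value of 1.65 used above (the unrounded value, `qnorm(0.95)`, is about 1.645).

```r
# Hypothetical helper: z-based confidence interval from summary statistics.
# Assumes the critical value z = 1.65 used in the calculations above.
ci_from_summary <- function(mean, sd, n, z = 1.65) {
  margin <- z * sd / sqrt(n)  # z times the standard error of the mean
  c(lower = mean - margin, upper = mean + margin, width = 2 * margin)
}

bypass_ci <- ci_from_summary(mean = 19, sd = 10, n = 539)
angio_ci  <- ci_from_summary(mean = 18, sd = 9,  n = 847)

# Show both intervals in one table for comparison
print(round(rbind(Bypass = bypass_ci, Angiography = angio_ci), 4))
```

Comparing the `width` column shows directly which interval is narrower.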
##Question 2
p2 = 567 / 1031
n2 = 1031
z2 = 1.96
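# Standard error of a proportion is sqrt(p * (1 - p) / n): n belongs inside the square root
p2_l <- p2 - z2 * sqrt(p2 * (1 - p2) / n2)
p2_h <- p2 + z2 * sqrt(p2 * (1 - p2) / n2)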
CI = \(0.55 \pm 1.96 \sqrt{0.55 (1 - 0.55) / 1031}\) ≈ (0.52, 0.58)
If this survey were repeated many times, about 95% of the confidence intervals calculated this way would be expected to contain the true proportion of adult Americans who believe that a college education is essential for success. The estimate from this sample is almost 55%.
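As a rough cross-check, the normal-approximation (Wald) interval can be computed by hand and compared with the score interval from base R's `prop.test`. The variable names here are illustrative, and `correct = FALSE` is chosen only so that no continuity correction is applied; the two methods are not identical, but with a sample this large they should agree closely.

```r
x <- 567   # respondents who said a college education is essential
n <- 1031  # sample size
z <- 1.96  # critical value for 95% confidence

p_hat <- x / n
se <- sqrt(p_hat * (1 - p_hat) / n)  # standard error of the sample proportion

# Wald interval: estimate -/+ z * standard error
wald_ci <- c(lower = p_hat - z * se, upper = p_hat + z * se)

# Score (Wilson) interval from base R, without continuity correction
score_ci <- prop.test(x, n, conf.level = 0.95, correct = FALSE)$conf.int

print(round(wald_ci, 4))
print(round(as.numeric(score_ci), 4))
```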
##Question 3
z = 1.96
SD = $170 * .25 = $42.5 (a quarter of the range: .25 * ($200 - $30))
Mean = Assuming most textbook costs fall between $30 and $200, the mean can be approximated by the midpoint, ($200 + $30) / 2 = $115
A sample size of 278 textbooks is needed to estimate the mean cost of textbooks per quarter with a margin of error of $5 (an interval length of $10). Solving n = (z * SD / E)^2 with z = 1.96, SD = $42.5, and E = $5 gives about 277.56, which must be rounded up to 278. With only 277, the confidence interval would have a length of $10.01, slightly more than $10. The mean of $115 does not enter the sample size calculation; it only locates the center of the interval.
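The sample size calculation can be sketched in a few lines. The $5 margin of error is an assumption here, inferred from the interval length of about $10 reported above; the other inputs are the values already stated.

```r
z <- 1.96                  # critical value for 95% confidence
sd_est <- 0.25 * (200 - 30)  # a quarter of the range: 42.5
margin <- 5                # ASSUMED target margin of error (half of a $10 interval)

n_exact <- (z * sd_est / margin)^2  # sample size before rounding
n_needed <- ceiling(n_exact)        # always round up so the margin is not exceeded

# Interval length achieved with a given sample size
ci_length <- function(n) 2 * z * sd_est / sqrt(n)

cat("Unrounded n:", n_exact, "\n")
cat("Required n:", n_needed, "\n")
cat("Length with n = 277:", ci_length(277), "\n")
cat("Length with n = 278:", ci_length(278), "\n")
```

Rounding up rather than to the nearest integer is what keeps the interval length at or below the target.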
##Question 4
Test whether the mean income of female employees differs from $500 per week. Include assumptions, hypotheses, test statistic, and P-value. Interpret the result. Report the P-value for Ha: μ < 500. Interpret. Report and interpret the P-value for Ha: μ > 500. (Hint: The P-values for the two possible one-sided tests must sum to 1.)
bar = 410
s = 90
n = 9
mu = 500
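tscore <- (bar - mu) / (s / sqrt(n))
# One-sided P-values: lower tail for Ha: mu < 500, upper tail for Ha: mu > 500
p_value_l <- pt(tscore, df = n - 1, lower.tail = TRUE)
p_value_u <- pt(tscore, df = n - 1, lower.tail = FALSE)
cat("P-value is:", p_value_l, "\n")
cat("P-value is:", p_value_u, "\n")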
P-value is: 0.008535841
P-value is: 0.9914642
Hypothesis 1:
Ho: μ = 500
Ha: μ < 500
Test Statistic: -3
P-value: 0.0085358
Because the P-value is well below 0.05, we reject the null hypothesis and conclude that the mean salary of female senior employees is statistically significantly less than $500 per week.
Hypothesis 2:
Ho: μ = 500
Ha: μ > 500
Test Statistic: -3
P-value: 0.9914642
Because the P-value is far above 0.05, we fail to reject the null hypothesis: there is no evidence that the mean salary of female senior employees is higher than $500 per week.
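The question also asks whether the mean differs from $500, which is a two-sided test (Ho: μ = 500, Ha: μ ≠ 500). A minimal sketch of that P-value follows, using the same summary statistics; it relies on the usual one-sample t-test assumptions (a random sample from an approximately normal population), which are stated here as assumptions rather than taken from the data.

```r
bar <- 410  # sample mean
s   <- 90   # sample standard deviation
n   <- 9    # sample size
mu  <- 500  # hypothesized mean under Ho

tscore <- (bar - mu) / (s / sqrt(n))

# One-sided P-values, as in the two hypotheses above
p_lower <- pt(tscore, df = n - 1, lower.tail = TRUE)
p_upper <- pt(tscore, df = n - 1, lower.tail = FALSE)

# Two-sided P-value: probability of a t statistic at least this extreme in either tail
p_two_sided <- 2 * pt(-abs(tscore), df = n - 1)

cat("t statistic:", tscore, "\n")
cat("Two-sided P-value:", p_two_sided, "\n")
cat("One-sided P-values sum to:", p_lower + p_upper, "\n")
```

The two-sided P-value is twice the smaller one-sided P-value, so it stays below 0.05 and the conclusion that the mean differs from $500 still holds.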
##Question 5
T Statistic for Jones = (519.5 - 500) / 10 = 1.95, p-value = 0.051
T Statistic for Smith = (519.7 - 500) / 10 = 1.97, p-value = 0.049
Using α = 0.05:
The Jones study is not considered statistically significant, given that the p-value for the test statistic is .051, which is barely above the .05 threshold.
The Smith study is considered statistically significant, given that the p-value for the test statistic is .049, which is barely below the .05 threshold.
For this example, if a result were reported only as “P ≤ 0.05”, the p-value for Smith’s study could lie anywhere between almost 0 and .05, when the actual p-value is .049. If a result were reported only as “P > 0.05”, the p-value for Jones’ study could lie anywhere between .05 and 1, when the actual p-value is .051. Reporting only these ranges hides the fact that both studies sit right at the significance threshold and reached nearly identical results, which could discourage readers from looking further into the studies and the nuances that produced each result.
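The quoted p-values can be reproduced approximately from the test statistics. This sketch assumes a two-sided test and a sample large enough that the t distribution is close to the standard normal; the sample size is not stated above, so `pnorm` is used in place of `pt`.

```r
se <- 10  # standard error used for both studies

# Test statistics for Ho: mu = 500
t_jones <- (519.5 - 500) / se
t_smith <- (519.7 - 500) / se

# Two-sided P-values under the ASSUMED large-sample normal approximation
p_jones <- 2 * pnorm(-abs(t_jones))
p_smith <- 2 * pnorm(-abs(t_smith))

cat("Jones: t =", t_jones, " P =", round(p_jones, 3), "\n")
cat("Smith: t =", t_smith, " P =", round(p_smith, 3), "\n")
```

A difference of 0.2 in the sample means moves the result across the .05 line, which is why reporting the actual p-value is more informative than reporting only which side of the threshold it falls on.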
##Question 6
gas_taxes <- c(51.27, 47.43, 38.89, 41.95, 28.61, 41.29, 52.19, 49.48, 35.02, 48.13, 39.28, 54.41, 41.66, 30.28, 18.49, 38.72, 33.41, 45.02)
Is there enough evidence to conclude at a 95% confidence level that the average tax per gallon of gas in the US in 2005 was less than 45 cents? Explain.
gas_m <- mean(gas_taxes)
gas_sd <- sd(gas_taxes)
# 95% confidence limits using the normal critical value z = 1.96
gas_l <- gas_m - (1.96 * (gas_sd / sqrt(length(gas_taxes))))
gas_g <- gas_m + (1.96 * (gas_sd / sqrt(length(gas_taxes))))
Mean = 40.8627778, Standard Deviation = 9.3083168
\(40.86 \pm 1.96 \left( 9.31/\sqrt{18} \right)\) = (36.5625548, 45.1630008)
At the 95% confidence level, there is not enough evidence to conclude that the average tax per gallon of gas was less than 45 cents ($0.45). The upper end of the confidence interval, about 45.16 cents, is just over 45 cents, so 45 cents remains a plausible value for the mean.
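With only 18 observations, a critical value from the t distribution is usually preferred over 1.96. The sketch below recomputes the interval that way as a sensitivity check; it is a variation on the calculation above rather than the method used there, and it assumes the taxes are a random sample from an approximately normal population.

```r
gas_taxes <- c(51.27, 47.43, 38.89, 41.95, 28.61, 41.29, 52.19, 49.48, 35.02,
               48.13, 39.28, 54.41, 41.66, 30.28, 18.49, 38.72, 33.41, 45.02)

n      <- length(gas_taxes)
gas_m  <- mean(gas_taxes)
gas_se <- sd(gas_taxes) / sqrt(n)  # standard error of the mean

# Critical values: normal (as used above) versus t with n - 1 degrees of freedom
z_crit <- 1.96
t_crit <- qt(0.975, df = n - 1)

ci_z <- gas_m + c(-1, 1) * z_crit * gas_se
ci_t <- gas_m + c(-1, 1) * t_crit * gas_se

cat("z-based 95% CI:", round(ci_z, 2), "\n")
cat("t-based 95% CI:", round(ci_t, 2), "\n")
```

The t-based interval is somewhat wider than the z-based one, so 45 cents still falls inside it.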