1. This is a randomized controlled trial. A Wilcoxon rank-sum test is the better choice here, because the density plot shows that the distribution of survival time in each group is skewed rather than symmetrical. With that test, the difference in survival time between the groups is not statistically significant (W = 337, p = 0.265). A t-test would instead give p = 0.008, a result that looks significant but is not trustworthy given the skewed distributions.

##
## Wilcoxon rank sum test with continuity correction
##
## data: Survival by Group
## W = 337, p-value = 0.265
## alternative hypothesis: true location shift is not equal to 0
##
## Welch Two Sample t-test
##
## data: Survival by Group
## t = -2.7851, df = 41.031, p-value = 0.008062
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -33.589088 -5.352089
## sample estimates:
## mean in group Control mean in group Therapy
## 20.00000 39.47059
2.
a.) Look at boxplot below
b.) Based on the boxplot, the guinea pigs in the treatment group do not live as long. The median lifetime in the treatment group is 214.5 compared to 316.5 in the control group. The treatment group has a smaller spread with an IQR of 145 compared to 429 in the control group.
c.) My conclusion based on the box and whisker plots is that the treatment decreases the guinea pigs’ lifetime.
d.) I do not think a t-test would be appropriate. The favstats output shows that the mean and median differ by a large amount in each group (242.5 vs. 214.5 in the treatment group, 345.2 vs. 316.5 in the control group). A gap that large between mean and median suggests the distributions are not symmetrical, so a t-test would not be appropriate.
e.) I think the Wilcoxon rank-sum test would be a better test. With that test, the result is not statistically significant at the 0.05 significance level (W = 1478.5, p = 0.053). Thus there is no statistically significant difference in lifetime between the groups.

## Group min Q1 median Q3 max mean sd n missing
## 1 bacilli 76 161.00 214.5 306.00 598 242.5345 117.9309 58 0
## 2 control 18 141.75 316.5 570.75 735 345.2344 222.1974 64 0
##
## Wilcoxon rank sum test with continuity correction
##
## data: Lifetime by Group
## W = 1478.5, p-value = 0.05326
## alternative hypothesis: true location shift is not equal to 0
3.
a.) Boxplot is below. The measure of central tendency shown in a box-and-whisker plot is the median. Looking at the boxplot, the states that increased their speed limit have a higher median percent change in traffic fatalities (12.1 compared to -9.7). I don't think it is valid to conclude that the speed limit increases caused this difference, because other factors could have led to increased traffic fatalities in those states.
b.) I think a t-test can be justified, because both distributions look roughly symmetrical. The Welch t-test gives p = 0.0097, which is less than 0.05, so the difference in mean percent change between the two groups is statistically significant.
c.) My inference is that speed limit increases may lead to increased traffic fatalities.
d.) In the chi-squared analysis, states that increased their speed limit were more likely to have an increase in traffic fatalities. The result was statistically significant (X-squared = 7.57, df = 1, p = 0.0059).
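
As a rough cross-check on part b, the Welch statistic and its degrees of freedom can be recomputed from the summary statistics alone (the means, standard deviations, and group sizes in the favstats output for this question). The Python sketch below is not part of the original R analysis, and the function name `welch_from_summary` is hypothetical.

```python
import math

def welch_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's two-sample t-test from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error contribution of group 1
    v2 = sd2 ** 2 / n2  # squared standard error contribution of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Group "No" first, then group "Yes", as in the favstats output
t_stat, df = welch_from_summary(-8.563158, 31.00085, 19,
                                13.753125, 21.33285, 32)
print(f"t = {t_stat:.4f}, df = {df:.3f}")
```

The printed values should agree with the t statistic and degrees of freedom in the Welch output for this question. The p-value is left out because it needs the t distribution, which is not in the Python standard library.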

## Increase min Q1 median Q3 max mean sd n missing
## 1 No -80.0 -22.5 -9.7 7.100 44.1 -8.563158 31.00085 19 0
## 2 Yes -31.5 1.2 12.1 30.525 62.5 13.753125 21.33285 32 0
##
## Welch Two Sample t-test
##
## data: FatalitiesChange by Increase
## t = -2.7722, df = 28.248, p-value = 0.009747
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -38.799538 -5.833028
## sample estimates:
## mean in group No mean in group Yes
## -8.563158 13.753125
## as.numeric(ex0223$Increase)
## Increase 1 2
## No 19 0
## Yes 0 32
## Increase.Speed
## Increase TRUE FALSE
## No 0 19
## Yes 32 0
## Fatalities.Change
## FatalitiesChange TRUE FALSE
## -80 0 1
## -50 0 1
## -41.2 0 1
## -31.5 0 1
## -29 0 1
## -25 0 1
## -20 0 1
## -19.1 0 1
## -17.9 0 1
## -16.4 0 1
## -14.3 0 1
## -13.3 0 1
## -13.2 0 1
## -9.7 0 1
## -9.1 0 1
## -7.9 0 1
## -7 0 1
## -5.4 0 1
## -4.4 0 1
## -1.8 0 1
## 0 0 1
## 1.6 1 0
## 3.4 2 0
## 4 1 0
## 4.4 1 0
## 5.4 1 0
## 8.2 1 0
## 9.4 2 0
## 10.8 1 0
## 14.8 1 0
## 17.6 1 0
## 17.9 2 0
## 18.2 1 0
## 22.2 1 0
## 23.2 1 0
## 24.5 1 0
## 30 1 0
## 32.1 2 0
## 33.3 1 0
## 34.1 1 0
## 34.3 1 0
## 41.3 1 0
## 41.4 2 0
## 44.1 1 0
## 50.7 1 0
## 62.5 1 0
## Fatalities.Change
## Increase.Speed TRUE FALSE
## TRUE 80.00000 38.09524
## FALSE 20.00000 61.90476
##
## Pearson's Chi-squared test with Yates' continuity correction
##
## data: tally(Increase.Speed ~ Fatalities.Change, data = ex0223, format = "count")
## X-squared = 7.5736, df = 1, p-value = 0.005923
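
The chi-squared statistic above can be checked by hand in a few lines. The Python sketch below is not part of the original R analysis. The cell counts are an assumption: they are reconstructed from the column percentages and the tallies in the output (30 states with an increase in fatalities, 21 without), and the function names are hypothetical.

```python
import math

def yates_chi_squared(a, b, c, d):
    """Chi-squared statistic with Yates' correction for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    cross = abs(a * d - b * c)
    # The correction is capped so it never exceeds |ad - bc|
    correction = min(n / 2, cross)
    numerator = n * (cross - correction) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

def p_value_df1(x2):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(x2 / 2))

# Reconstructed counts (assumption): rows = speed limit increased (yes, no),
# columns = fatalities increased (yes, no)
x2 = yates_chi_squared(24, 8, 6, 13)
p = p_value_df1(x2)
print(f"X-squared = {x2:.4f}, p-value = {p:.6f}")
```

If the reconstructed counts are right, the statistic and p-value should match the Pearson output above.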
4. 2x2 tables are below
a.) Sensitivity = P(T+|D+) = 35/50 = 70%
Specificity = P(T-|D-) = 60/100 = 60%
b.) With a prevalence of 5% and assuming a population of 10,000 people:
P(D+|T+) = 350/4150 = 0.0843 -> 8.43% of those who tested positive for CTS actually have CTS.
P(D-|T-) = 5700/5850 = 0.9744 -> 97.44% of those who tested negative for CTS do not have CTS.
c.)
RR = P(D+|T+)/P(D+|T-) = 0.0843/0.0256 = 3.29
Someone who tested positive for CTS is about 3.3 times as likely to have CTS as someone who tested negative.
## CTS No CTS Total
## + 35 40 75
## - 15 60 75
## Total 50 100 150
## CTS No CTS Total
## + 350 3800 4150
## - 150 5700 5850
## Total 500 9500 10000
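
The arithmetic in question 4 can be reproduced with a short script. This Python sketch is only an illustration and is not part of the original analysis; the function name `diagnostic_summary` and its dictionary keys are hypothetical, and the population of 10,000 is the same assumption made in part b.

```python
def diagnostic_summary(sensitivity, specificity, prevalence, population=10_000):
    """Build the expected 2x2 table for a test and derive PPV, NPV, and relative risk."""
    diseased = prevalence * population
    healthy = population - diseased
    true_pos = sensitivity * diseased   # D+ and T+
    false_neg = diseased - true_pos     # D+ and T-
    true_neg = specificity * healthy    # D- and T-
    false_pos = healthy - true_neg      # D- and T+
    ppv = true_pos / (true_pos + false_pos)                   # P(D+|T+)
    npv = true_neg / (true_neg + false_neg)                   # P(D-|T-)
    risk_if_negative = false_neg / (true_neg + false_neg)     # P(D+|T-)
    return {"ppv": ppv, "npv": npv, "rr": ppv / risk_if_negative}

# Sensitivity and specificity from part a, prevalence from part b
result = diagnostic_summary(35 / 50, 60 / 100, 0.05)
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

Carrying the unrounded probabilities through to the ratio avoids the small rounding drift that appears when P(D+|T-) is rounded before dividing.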