Exam 2

Please sign here to acknowledge that you are following the College honor code with respect to instructions that you may use any notes from the course including pre-loaded data, but you may not get assistance from another person.

-Carolina Fuentes

Number 1

A)

mean <- 114
sd <- 24
n <- 100
qt(.025,99)
## [1] -1.984217
qt(.975,99)
## [1] 1.984217
lower_vec <- mean - 1.98* sd/sqrt(n)
upper_vec <- mean + 1.98* sd/sqrt(n)
c(lower_vec[1],upper_vec[1])
## [1] 109.248 118.752

We are 95% confident that the mean birthweight at this hospital is between 109.25 and 118.75 oz.
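The same interval can be computed with the exact critical value from `qt()` instead of the rounded 1.98. This is a minimal sketch using the summary statistics above; the names `xbar`, `s`, `t_crit`, and `margin` are chosen here for illustration and do not mask the base functions `mean()` and `sd()`.

```r
# Summary statistics from part A (oz)
xbar <- 114   # sample mean
s    <- 24    # sample standard deviation
n    <- 100   # sample size

# Exact two-sided 95% critical value with n - 1 degrees of freedom
t_crit <- qt(0.975, df = n - 1)

# Margin of error and the resulting interval
margin <- t_crit * s / sqrt(n)
ci <- c(lower = xbar - margin, upper = xbar + margin)
round(ci, 2)
```

Using the exact critical value moves each endpoint by about 0.01 oz, so 120 oz still lies above the interval.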

B)

Since we are 95% confident that the mean birthweight at this hospital is between 109.25 and 118.75 oz, and the mean birthweight in the US is 120 oz, we can say that the mean birthweight at this hospital is lower than the national average: 120 oz is above our confidence interval.

Number 2

3/126
## [1] 0.02380952

This test does in fact provide evidence that the birthweights differ between infants of smokers and nonsmokers. Only 2.38% of cases have a difference in means at least as large as the difference between the two observed sample means. About 2% is a very low probability, and therefore the test does provide evidence that the birthweights differ.

Number 3

sd <- 40
se <-3
(sd/se)^2
## [1] 177.7778

The researcher needs to take a sample of 178 middle-aged men.
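The calculation above follows from solving SE = sd / sqrt(n) for n, which gives n = (sd / SE)^2, and then rounding up, because rounding down would leave the standard error slightly above the target. A small sketch of that step, with `sigma_guess`, `target_se`, and `n_required` as illustrative names:

```r
sigma_guess <- 40  # standard deviation used in the calculation above
target_se   <- 3   # desired standard error of the mean

# Solve SE = sd / sqrt(n) for n, then round up to a whole subject
n_required <- ceiling((sigma_guess / target_se)^2)
n_required
```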

Number 4

A)

library(BSDA)
## Warning: package 'BSDA' was built under R version 3.6.3
## Loading required package: lattice
## 
## Attaching package: 'BSDA'
## The following object is masked from 'package:datasets':
## 
##     Orange
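# Welch two-sample t-test from summary statistics (mean, sd, n for each group)
tsum.test(mean.x = 107.7, s.x = 9.5, n.x = 195, mean.y = 115.3, s.y = 14.9, n.y = 96)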
## 
##  Welch Modified Two-Sample t-Test
## 
## data:  Summarized x and y
## t = -4.5619, df = 134.2, p-value = 1.131e-05
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -10.894936  -4.305064
## sample estimates:
## mean of x mean of y 
##     107.7     115.3

We are 95% confident that the mean difference in systolic blood pressure between the two groups of women is between 4.305 mmHg and 10.895 mmHg.

B)

Null hypothesis: There is no difference in the mean systolic blood pressure between Caucasian and African American women.

Alternative hypothesis: There is a difference in the mean systolic blood pressure between Caucasian and African American women.

C)

t-test statistic = -4.5619

p-value = 1.13e-05 = .0000113

If α = .05, then α > p-value; therefore we have very strong evidence to reject the null. This means there is a difference in the mean systolic blood pressure between Caucasian and African American women.
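The test statistic and p-value reported above can be checked by hand in base R. This is a sketch of the Welch calculation from the same summary statistics; the variable names are illustrative, and the degrees of freedom use the Welch-Satterthwaite approximation.

```r
# Summary statistics as passed to tsum.test() in part A
mean_x <- 107.7; sd_x <- 9.5;  n_x <- 195
mean_y <- 115.3; sd_y <- 14.9; n_y <- 96

# Variance of each sample mean
v_x <- sd_x^2 / n_x
v_y <- sd_y^2 / n_y

# Welch t statistic: difference in means over its standard error
t_stat <- (mean_x - mean_y) / sqrt(v_x + v_y)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_x + v_y)^2 / (v_x^2 / (n_x - 1) + v_y^2 / (n_y - 1))

# Two-sided p-value
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

c(t = t_stat, df = df_welch, p = p_value)
```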

D)

The 3 main assumptions are as follows:

1. The observations are independent of one another.

2. The subjects were randomly sampled.

3. The populations are normally distributed.

E)

A Type I error would mean that we wrongly reject the null: the null is correct, but we obtain statistically significant findings when there is no real difference. In the context of the study, this would mean that we reject the null and accept that there is a difference in the mean systolic blood pressure between the two groups when in truth there is none.

A consequence of a Type I error could be giving the wrong dosage of medication to patients (African American women) who we think have a different SBP but truly do not.

F)

abs(107.7-115.3)/14.9
## [1] 0.5100671

Effect size = .51

Since the effect size is less than 1, it is not considered an important effect: the two means are within .51 standard deviations of each other. Effect size measures practical importance rather than statistical significance, so this does not change the conclusion in part C. The difference is statistically significant, but it may be of limited practical importance.

Number 5

A)

We would want to use a Wilcoxon-Mann-Whitney test rather than a t-test because of the distributions not being normal.

B)

α=0.05

p-value = .003113

.05>.003

α>p-value

There is very strong evidence to reject the null suggesting that there is a difference in lengths of stay for patients with the same diagnosis in 2 different hospitals.

C)

We are 95% confident that the true difference in average days spent in the hospital between the two hospitals is between 15 and 63.9 days.

Number 6

A)

11/18
## [1] 0.6111111
(11+2)/(18+4)
## [1] 0.5909091

p̂ (p-hat) = .611

p̃ (p-tilde) = .591

B)

se <- sqrt((.241719)/22)
.591 - (1.96)*se
## [1] 0.3855528
.591 + (1.96)*se
## [1] 0.7964472

The 95% confidence interval is (.386,.796).

C)

We are 95% confident that the true proportion of smokers who would smoke again within a year of quitting is between .386 and .796.
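The adjusted ("plus four") interval from parts A and B can be computed in one pass from the counts. This is a minimal sketch, assuming the usual 95% form of the adjustment, in which 2 is added to the count and 4 to the sample size; `x`, `p_tilde`, and `se_tilde` are illustrative names.

```r
# Counts from part A
x <- 11   # smokers who relapsed
n <- 18   # smokers in the sample

# Adjusted proportion and its standard error
p_tilde  <- (x + 2) / (n + 4)
se_tilde <- sqrt(p_tilde * (1 - p_tilde) / (n + 4))

# 95% interval using the normal critical value
z  <- qnorm(0.975)
ci <- c(lower = p_tilde - z * se_tilde, upper = p_tilde + z * se_tilde)
round(ci, 3)
```

Because this version carries the unrounded adjusted proportion and critical value through the calculation, the third decimal of the lower endpoint can differ slightly from a calculation based on the rounded values .591 and 1.96.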

Number 7

A)

Null Hypothesis: There is no difference in the efficacy of the treatments.

Alternative Hypothesis: There is a difference in the efficacy of the treatments.

B)

As the p-value is .036, we have strong evidence to suggest that there is a difference in the efficacy of the treatments. High-dose spectinomycin would likely be the most effective treatment for gonorrhea.

Number 8

A)

Null Hypothesis: There is no difference in the proportion of mice that develop tumors between mice treated with E. coli and mice in a germ-free environment.

Alternative Hypothesis: There is a difference in the proportion of mice that develop tumors between mice treated with E. coli and mice in a germ-free environment.

tum <- c(20,8)
notum <- c(11,32)
test <- data.frame(tum,notum)
chisq.test(test)
## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  test
## X-squared = 12.687, df = 1, p-value = 0.0003683

X^2=12.687

p−value= .0003683

B)

Since the p-value is well below .1 (it is close to 0), we reject the null hypothesis. We have very strong evidence suggesting that mice treated with E. coli are more likely to develop tumors.

C)

(20*32)/(8*11)
## [1] 7.272727

D)

The odds of a mouse treated with E. coli developing a liver tumor are about 7.27 times the odds for a mouse in a germ-free environment.
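The odds ratio and the chi-squared test can both be read off a single labeled 2x2 table, which makes it easier to see which cells enter the cross-product. This is a sketch only: the row labels assume that the first row of the data above is the E. coli group and the second is the germ-free group, as the interpretation in part D implies.

```r
# 2x2 table of counts; matrix() fills column by column
tab <- matrix(c(20, 8, 11, 32), nrow = 2,
              dimnames = list(group   = c("E. coli", "germ-free"),
                              outcome = c("tumor", "no tumor")))

# Odds of a tumor within each group
odds_ecoli    <- tab["E. coli", "tumor"]   / tab["E. coli", "no tumor"]
odds_germfree <- tab["germ-free", "tumor"] / tab["germ-free", "no tumor"]

# Odds ratio, equivalent to the cross-product (20 * 32) / (8 * 11)
odds_ratio <- odds_ecoli / odds_germfree

# Same test as in part A (Yates' correction is the default for 2x2 tables)
chi <- chisq.test(tab)

odds_ratio
chi$statistic
```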