##7.1.2.a

lcdiet<-c(42.3, 51.5, 53.7)
ndiet<-c(53.1,50.7)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] -2.733333
lcdiet<-c(42.3, 53.1, 53.7)
ndiet<-c(51.5,50.7)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] -1.4
lcdiet<-c(42.3, 51.5, 53.1)
ndiet<-c(53.7,50.7)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] -3.233333
lcdiet<-c(53.1, 51.5, 53.7)
ndiet<-c(42.3,50.7)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] 6.266667
lcdiet<-c(50.7, 51.5, 53.7)
ndiet<-c(53.1,42.3)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] 4.266667
lcdiet<-c(42.3, 50.7, 53.7)
ndiet<-c(53.1,51.5)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] -3.4
lcdiet<-c(42.3, 51.5, 50.7)
ndiet<-c(53.1,53.7)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] -5.233333
lcdiet<-c(53.1,50.7, 53.7)
ndiet<-c(42.3, 51.5)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] 5.6
lcdiet<-c(42.3, 53.1,50.7)
ndiet<-c(51.5, 53.7)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] -3.9
lcdiet<-c(53.1, 51.5, 50.7)
ndiet<-c(42.3,53.7)
mean_dif<-mean(lcdiet)-mean(ndiet);mean_dif
## [1] 3.766667

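##7.1.2.b Eight of the other nine randomizations give a mean difference at least as large in magnitude as the observed -2.7333, so 9 of the 10 randomizations (counting the original) are at least as extreme. The two-tailed P-value is 9/10 = 0.9.

##7.1.2.c We fail to reject the null: the data provide no evidence of a difference in liver enzyme GITH activity between rats on a low chromium diet and rats on a normal diet.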
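The ten randomizations listed in 7.1.2.a can also be enumerated programmatically. The following is a minimal sketch, assuming the five GITH values are pooled into one vector and that "at least as extreme" is judged two-tailed, by absolute value. The names `gith`, `observed`, `diffs`, and `p_two_tailed` are illustrative and not from the original.

```r
# Pooled observations; the first three are the original low-chromium group
gith <- c(42.3, 51.5, 53.7, 53.1, 50.7)

# Observed difference in means (low chromium minus normal)
observed <- mean(gith[1:3]) - mean(gith[4:5])

# Every way to assign 3 of the 5 values to the low-chromium group
idx <- combn(5, 3)
diffs <- apply(idx, 2, function(i) mean(gith[i]) - mean(gith[-i]))

# Two-tailed P-value: share of randomizations at least as extreme as observed
# (the small tolerance guards against floating-point noise)
p_two_tailed <- mean(abs(diffs) >= abs(observed) - 1e-9)

round(sort(diffs), 4)
p_two_tailed
```

A one-tailed version would replace the absolute-value comparison with `diffs <= observed`, which counts only differences at or below the observed value.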


Does the randomization test look at all values greater than or less than -2.7333, or is it two-tailed? ***

##7.2.5.a

P-value

##7.2.5.b

Ho: mu(HH)=mu(SH) Ha: mu(HH)!=mu(SH)

##7.2.5.c

If the P-value for the t test is 0.29 and alpha = 0.10, then p > 0.10 and we fail to reject the null. There is insufficient evidence at the 0.10 level to conclude that the two population means differ.

##7.2.9.a

Ho: mu(Hypoxia)= mu(Normoxia)

##7.2.9.b

The dot plot would suggest that we reject the null hypothesis.

##7.2.9.c

A p-value of 0.000007307 is statistically significant. We reject the null hypothesis: MBF (myocardial blood flow) differs significantly between the group given normal oxygen and the group given reduced levels of oxygen to simulate hypoxia.

##7.2.9.d

A p-value of 0.000007307 < alpha = 0.05, which supports a statistically significant difference between the groups. We reject the null hypothesis: MBF (myocardial blood flow) differs significantly between the group given normal oxygen and the group given reduced levels of oxygen to simulate hypoxia.

##7.3.6 The 95% confidence interval for mu(1)-mu(2) is (1.4, 6.7). Because this interval does not contain zero, we reject the null at alpha = 0.05 and conclude that the two population means are significantly different from each other.

##7.3.8.a Ho: mu(culture time old method) = mu(culture time new method) Ha: mu(culture time old method) != mu(culture time new method)

##7.3.8.b A type 1 error (Reject Ho when actually true) would result in a large revenue loss in the long run (financial mistake) because you would spend a lot of money to retrofit new machinery, just to find out that cheese culturing wasn’t actually accelerated over time thus not increasing profits to make up for investment.

##7.3.8.c A Type II error (fail to reject Ho when it is actually false) would result in operations continuing the same way as before. The company would miss the profits it could have earned by retrofitting, but this is not as big a mistake unless all of the company's competitors retrofit, or the company was already unprofitable when it considered the new technology.

##7.3.8.d As stated above, a Type I error would be more serious because of the large investment loss (millions of dollars) with no ability to recoup it. A Type II error would mean a loss of potential profit over time, but it would leave the company free to test newer equipment later, since it would not be indebted for new machinery.

##7.4.2 This is an example of correlation not necessarily being causation, and of spurious and confounding variables. Just because women who get implants appear to get sick more often doesn't mean the illness is due to the implants. This study shows that women who get implants are likely a different group from the general population, in that they appear to be living a "risky"/unhealthy lifestyle in the first place. The effect of breast implants on getting sick is confounded with the effects of these risky behaviors on getting sick. The relationship could also be spurious, in that unhealthy or risky behaviors separately affect the likelihood of getting implants and the likelihood of getting ill. If a spurious variable were controlled for, then no relationship would be evident between implants and illness.

##7.4.3.a X= got breast implants/ no implants

##7.4.3.b Y= Illness

##7.4.3.c Observational units = the individual women in the study. Illness (Y) could be measured by doctor visits, fever, self-reported illness, etc.

##7.5.5 (a) 95% CI (alpha = 0.05): (-69.40126, 326.80126). The interval contains 0, so we fail to reject the null, Ho: mu(infected) = mu(noninfected).

(b) 90% CI (alpha = 0.10): (-35.51732, 292.91732). The interval contains 0, so we fail to reject the null, Ho: mu(infected) = mu(noninfected).


How do I get the P-value? ***

'95% CI'
## [1] "95% CI"
df<-24
alpha<-.05
CV<-qt(((1-alpha)+.5*alpha), df)
y1<-972.1
y2<-843.4
s1<-245.1
s2<-251.2
n1<-12
n2<-15

y_dif<-y1-y2
SE_Y1Y2<- sqrt(((s1^2)/n1)+((s2^2)/n2))

upper_vector<-y_dif+CV* SE_Y1Y2
lower_vector<-y_dif-CV* SE_Y1Y2

CI<-c(lower_vector, upper_vector); CI
## [1] -69.40126 326.80126
'90% CI'
## [1] "90% CI"
df<-24
alpha<-.10
CV<-qt(((1-alpha)+.5*alpha), df)
y1<-972.1
y2<-843.4
s1<-245.1
s2<-251.2
n1<-12
n2<-15

y_dif<-y1-y2
SE_Y1Y2<- sqrt(((s1^2)/n1)+((s2^2)/n2))

upper_vector<-y_dif+CV* SE_Y1Y2
lower_vector<-y_dif-CV* SE_Y1Y2

CI<-c(lower_vector, upper_vector); CI
## [1] -35.51732 292.91732
'Double check 95% CI and p-value'
## [1] "Double check 95% CI and p-value"
library(BSDA)
## Loading required package: lattice
## 
## Attaching package: 'BSDA'
## The following object is masked from 'package:datasets':
## 
##     Orange
tsum.test(972.1,245.1,12,843.4,251.2,15)
## 
##  Welch Modified Two-Sample t-Test
## 
## data:  Summarized x and y
## t = 1.3408, df = 23.961, p-value = 0.1925
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -69.41849 326.81849
## sample estimates:
## mean of x mean of y 
##     972.1     843.4
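
The P-value reported by `tsum.test` can also be recovered by hand from the summary statistics. This is a minimal sketch, assuming the same df = 24 used for the intervals in 7.5.5; the names `t_stat` and `p_value` are illustrative and not from the original.

```r
# Summary statistics from 7.5.5
y1 <- 972.1; s1 <- 245.1; n1 <- 12
y2 <- 843.4; s2 <- 251.2; n2 <- 15
df <- 24  # assumed, as in the CI code; tsum.test reports the Welch df 23.961

# t statistic: difference in sample means divided by its standard error
SE_Y1Y2 <- sqrt((s1^2) / n1 + (s2^2) / n2)
t_stat <- (y1 - y2) / SE_Y1Y2

# Two-tailed P-value: area in both tails of the t distribution beyond |t_stat|
p_value <- 2 * pt(-abs(t_stat), df)

c(t = t_stat, p = p_value)
```

The result should be close to the t = 1.3408 and p-value = 0.1925 printed by `tsum.test`; any small discrepancy comes from using 24 rather than 23.961 degrees of freedom.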

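##7.5.10.a A directional (one-tailed) test with alpha=0.05 finds significance and rejects the null hypothesis that the two population means are equal. The code below prints the point estimate (6.64) and y_dif + CV*SE_Y1Y2 (12.78); by symmetry, the one-sided 95% lower confidence bound is 6.64 - (12.78 - 6.64) = 0.50. This bound is above 0, so the one-sided CI for the difference between population means does not contain 0.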

df<-47.2
alpha<-.05
CV<-qt(1-alpha, df)
y1<-31.96
y2<-25.32
s1<-12.05
s2<-13.78
n1<-25
n2<-25

y_dif<-y1-y2
SE_Y1Y2<- sqrt(((s1^2)/n1)+((s2^2)/n2))

upper_vector<-y_dif+ CV*SE_Y1Y2


CI<-c(y_dif, upper_vector); CI
## [1]  6.64000 12.78253

##7.5.10.b A two-tailed t-test with alpha=0.05 finds no significant difference between the populations and fails to reject the null hypothesis that the two population means are equal. The 95% CI for the difference between population means contains 0.

df<-47.2
alpha<-.05
CV<-qt(((1-alpha)+.5*alpha), df)
y1<-31.96
y2<-25.32
s1<-12.05
s2<-13.78
n1<-25
n2<-25

y_dif<-y1-y2
SE_Y1Y2<- sqrt(((s1^2)/n1)+((s2^2)/n2))

upper_vector<-y_dif+CV* SE_Y1Y2
lower_vector<-y_dif-CV* SE_Y1Y2


CI<-c(lower_vector, upper_vector); CI
## [1] -0.7243547 14.0043547
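
The two conclusions in 7.5.10 can be cross-checked with P-values computed from the same summary statistics. This is a sketch only: it assumes the directional alternative is mu(1) > mu(2), which the original does not state explicitly, and the names `t_stat`, `p_one_tailed`, `p_two_tailed`, and `lower_bound` are illustrative.

```r
# Summary statistics and df from 7.5.10
y1 <- 31.96; s1 <- 12.05; n1 <- 25
y2 <- 25.32; s2 <- 13.78; n2 <- 25
df <- 47.2
alpha <- 0.05

y_dif <- y1 - y2
SE_Y1Y2 <- sqrt((s1^2) / n1 + (s2^2) / n2)
t_stat <- y_dif / SE_Y1Y2

# Directional test (assumed Ha: mu1 > mu2): upper-tail area only
p_one_tailed <- pt(t_stat, df, lower.tail = FALSE)

# Nondirectional test: both tails
p_two_tailed <- 2 * p_one_tailed

# One-sided 95% lower confidence bound for mu1 - mu2
lower_bound <- y_dif - qt(1 - alpha, df) * SE_Y1Y2

c(t = t_stat, one_tailed = p_one_tailed,
  two_tailed = p_two_tailed, lower = lower_bound)
```

The one-tailed P-value falls below 0.05 while the two-tailed P-value does not, which is why the directional test rejects and the nondirectional test fails to reject.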

##7.6.7 We are 95% confident that the true difference in population mean serum uric acid (mmol/l) between men and women is between 0.08405 and 0.09795. We reject the null (Ho: mu(men) = mu(women)): there is strong evidence of a difference in population mean uric acid concentration between men and women (p < 2.2e-16).

The entire CI also lies above 0.08 mmol/l, so the difference is clinically important as well.


Book reports 934 df versus test below at 937.77.

y1<-.354
y2<-.263
s1<-.058
s2<-.051
n1<-530
n2<-420

tsum.test(y1,s1,n1,y2,s2,n2)
## 
##  Welch Modified Two-Sample t-Test
## 
## data:  Summarized x and y
## t = 25.698, df = 937.77, p-value < 2.2e-16
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  0.08405043 0.09794957
## sample estimates:
## mean of x mean of y 
##     0.354     0.263
'CI using books calc. degree of freedom'
## [1] "CI using books calc. degree of freedom"
df<- 934
alpha<-.05
CV<-qt(((1-alpha)+.5*alpha), df)
y1<-.354
y2<-.263
s1<-.058
s2<-.051
n1<-530
n2<-420
y_dif<-y1-y2
SE_Y1Y2<- sqrt(((s1^2)/n1)+((s2^2)/n2))

upper_vector<-y_dif+CV* SE_Y1Y2
lower_vector<-y_dif-CV* SE_Y1Y2


CI<-c(lower_vector, upper_vector); CI
## [1] 0.08405039 0.09794961
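
The fractional df printed by `tsum.test` comes from the Welch-Satterthwaite approximation. The sketch below computes it directly from the summary statistics; `welch_df` is a hypothetical helper written for illustration, not a function from BSDA or base R. Why the book reports 934 instead is not stated in these notes.

```r
# Welch-Satterthwaite degrees of freedom from sample SDs and sample sizes
# (hypothetical helper for illustration)
welch_df <- function(s1, n1, s2, n2) {
  v1 <- s1^2 / n1  # squared standard error of the first sample mean
  v2 <- s2^2 / n2  # squared standard error of the second sample mean
  (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
}

df_767 <- welch_df(0.058, 530, 0.051, 420)  # summary statistics from 7.6.7
df_755 <- welch_df(245.1, 12, 251.2, 15)    # summary statistics from 7.5.5

c(df_767, df_755)
```

These values should agree with the df printed by `tsum.test` for the two exercises (937.77 and 23.961). As the two intervals for 7.6.7 show, the choice between 934 and 937.77 df changes the CI only in the eighth decimal place.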

##7.8.3 After plotting the data, I see that parametric statistics will not be able to evaluate these data meaningfully: the distributions have outliers and long, straggly tails. A test that remains valid even if the population distributions are not normal would be best, such as a nonparametric, rank-based test that does not depend on the sample means.

baseline<-c(8.3,5.7,3.3,4.6,5.6,2.3,11.7,33.7,3.3,1.3,5.3,32.3,2,0.8,2.7,5.7,3,0,3.7,4.7)
weeks_later_6<-c(19.3,10.7,8.3,9,13.6,9.3,16.6,47.3,9,18,12,43,10.3,7,9,10,7.7,23.7,10.3,15)

hist(baseline)

hist(weeks_later_6)

wilcox.test(baseline, weeks_later_6)
## Warning in wilcox.test.default(baseline, weeks_later_6): cannot compute exact p-
## value with ties
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  baseline and weeks_later_6
## W = 49.5, p-value = 4.928e-05
## alternative hypothesis: true location shift is not equal to 0

##7.10.3.a Ho: The populations from which the samples are drawn have the same dopamine concentrations in the striatum region of the brain. Ha: The concentration of dopamine in the striatum region of the brain tends to be higher in one of the populations.

The p-value of 0.02597 suggests that at the alpha = 0.05 level, we would reject the null and accept the alternative that dopamine concentrations differ between the population exposed to toluene and the control population.

toluene<-c(3420,2314,1911, 2464, 2781, 2803)
control<-c(1820,1843,1397, 1803,2539, 1990)

wilcox.test(toluene, control, alternative = "two.sided", conf.int = TRUE, conf.level = 0.95)
## 
##  Wilcoxon rank sum test
## 
## data:  toluene and control
## W = 32, p-value = 0.02597
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##    91 1406
## sample estimates:
## difference in location 
##                    726

##7.10.3.b Ho: The populations from which the samples are drawn have the same dopamine concentrations in the striatum region of the brain. Ha: The concentration of dopamine in the striatum region of the brain tends to be higher in the population exposed to toluene than in the control population.

The p-value of 0.01299 suggests that at the alpha = 0.05 level, we would reject the null and accept the alternative that dopamine concentrations tend to be greater in the population exposed to toluene than in the control population.

wilcox.test(toluene, control, alternative = "greater", conf.int = TRUE, conf.level = 0.95)
## 
##  Wilcoxon rank sum test
## 
## data:  toluene and control
## W = 32, p-value = 0.01299
## alternative hypothesis: true location shift is greater than 0
## 95 percent confidence interval:
##  242 Inf
## sample estimates:
## difference in location 
##                    726
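
Because there are no ties and only 12 observations, the exact P-values in 7.10.3 can be reproduced by enumerating every way of assigning 6 of the 12 ranks to the toluene group. This is a sketch for illustration, not how `wilcox.test` is implemented internally; the names `r`, `obs_sum`, `all_sums`, `p_greater`, and `p_two_sided` are assumptions.

```r
toluene <- c(3420, 2314, 1911, 2464, 2781, 2803)
control <- c(1820, 1843, 1397, 1803, 2539, 1990)

# Rank the pooled data; the first six ranks belong to the toluene group
r <- rank(c(toluene, control))
obs_sum <- sum(r[1:6])

# Rank sum of the "toluene" group under every possible relabeling
all_sums <- combn(12, 6, function(i) sum(r[i]))

# Directional P-value: share of relabelings with a rank sum at least as large
p_greater <- mean(all_sums >= obs_sum)

# Nondirectional P-value: the null distribution is symmetric, so double it
p_two_sided <- min(1, 2 * p_greater)

c(greater = p_greater, two_sided = p_two_sided)
```

The two values should match the 0.01299 and 0.02597 printed by `wilcox.test`.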

##7.10.7.a K2 = 25; p-value = 0.01141

singly_housed<-c(3.3,2.4,2.5,3.3,2.4)
group_housed<-c(3.9,4.1,4.8,3.9,3.4)
wilcox.test(singly_housed,group_housed)
## Warning in wilcox.test.default(singly_housed, group_housed): cannot compute
## exact p-value with ties
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  singly_housed and group_housed
## W = 0, p-value = 0.01141
## alternative hypothesis: true location shift is not equal to 0

##7.10.7.b Ho: Benzo(a)pyrene concentrations are no different between group-housed mice and singly housed mice. Ha: Benzo(a)pyrene concentrations tend to be higher in group-housed mice than in singly housed mice.

The p-value is approximately 0.004 to 0.006 (R's normal approximation below gives 0.005706), which is less than alpha = 0.01, so we reject the null and accept the alternative that benzo(a)pyrene concentrations tend to be greater in group-housed mice than in singly housed mice.

wilcox.test(singly_housed,group_housed, alternative = "less")
## Warning in wilcox.test.default(singly_housed, group_housed, alternative =
## "less"): cannot compute exact p-value with ties
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  singly_housed and group_housed
## W = 0, p-value = 0.005706
## alternative hypothesis: true location shift is less than 0
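
The counts behind the test statistic, and an exact directional P-value, can be checked by hand. The sketch below counts, for each pair of mice, which group has the larger value (cross-group ties would count one half). Because every group-housed value exceeds every singly housed value, only 1 of the choose(10, 5) equally likely splits is this extreme in the observed direction. The names `k_group`, `k_singly`, and `p_exact_one_tailed` are illustrative and not from the original.

```r
singly_housed <- c(3.3, 2.4, 2.5, 3.3, 2.4)
group_housed <- c(3.9, 4.1, 4.8, 3.9, 3.4)

# Number of (group, singly) pairs in which the group-housed value is larger;
# cross-group ties, if any, count one half
k_group <- sum(outer(group_housed, singly_housed, ">")) +
  0.5 * sum(outer(group_housed, singly_housed, "=="))
k_singly <- length(group_housed) * length(singly_housed) - k_group

# Exact directional P-value under complete separation of the two samples
p_exact_one_tailed <- 1 / choose(10, 5)

c(k_group = k_group, k_singly = k_singly, p = p_exact_one_tailed)
```

The exact value, about 0.004, is smaller than the 0.005706 that R reports, because R falls back to a normal approximation with continuity correction when ties are present.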

##7.10.7.c The directional alternative is valid because it is apparent that every value is greater in the group-housed mice, the p-value is less than the 0.01 alpha level, and this is likely what would occur if mice were licking and biting their cage mates, which they evidently do for these populations to be significantly different.