Problem 1

Create 100 independent binomial random numbers with parameters n = 10, p = 0.4, and then graph them.

rbinom(100 , 10, 0.4)
##   [1] 4 4 1 4 6 4 3 6 6 4 3 3 3 5 4 3 1 4 3 3 2 4 4 2 6 3 3 4 6 5 3 2 5 3 4
##  [36] 2 4 3 1 3 3 2 1 6 6 6 3 4 4 4 7 5 7 7 4 3 4 3 5 6 5 2 6 4 3 7 4 5 5 5
##  [71] 5 6 5 4 4 4 4 5 3 5 4 2 6 2 3 4 7 5 5 6 3 5 4 3 3 3 7 4 6 1
v <- rbinom(100,10 ,0.4)
barplot(table(v))
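As a rough check on the simulated sample, the sketch below compares the observed relative frequencies with the theoretical Binomial(10, 0.4) probabilities from `dbinom`. The seed value is an assumption added only to make the run reproducible; it is not part of the original solution.

```r
set.seed(42)  # hypothetical seed, used only so the sample can be reproduced
v <- rbinom(100, size = 10, prob = 0.4)

# Observed relative frequency of each possible value 0..10
observed <- as.numeric(table(factor(v, levels = 0:10))) / length(v)

# Theoretical probabilities of a Binomial(10, 0.4) variable
theoretical <- dbinom(0:10, size = 10, prob = 0.4)

# Side-by-side bars: observed vs theoretical
barplot(rbind(observed, theoretical), beside = TRUE, names.arg = 0:10,
        legend.text = c("observed", "theoretical"),
        xlab = "Number of successes", ylab = "Relative frequency")
```

With only 100 draws the two sets of bars should be close but not identical.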

Problem 2

A fish survey is done to see if the proportion of fish types is consistent with previous years. Suppose the 3 types of fish recorded (parrotfish, grouper, tang) are historically in a 5:3:4 proportion, and a survey finds the following counts: parrotfish 53, grouper 22, tang 49. Do a test of hypothesis to see if this survey of fish has the same proportions as historically.

H0: fish proportions are consistent with history (5:3:4)
Ha: fish proportions are not consistent with history

history <- c(5,3,4)
now <- c(53,22,49)
# Goodness-of-fit test: compare the observed counts with the historical proportions
chisq.test(now, p = history / sum(history))
## 
##  Chi-squared test for given probabilities
## 
## data:  now
## X-squared = 4.0694, df = 2, p-value = 0.1307

Since the p-value is greater than 0.05, we fail to reject the null hypothesis: the survey gives no significant evidence that the fish proportions differ from the historical 5:3:4 ratio.
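The goodness-of-fit statistic can also be worked out by hand, which makes clear what is being compared. The sketch below assumes the counts are in the order parrotfish, grouper, tang; the variable names `p0`, `expected`, `stat` and `p_value` are introduced here for illustration.

```r
history <- c(5, 3, 4)
now <- c(53, 22, 49)

# Historical proportions and the counts they predict for this sample size
p0 <- history / sum(history)
expected <- sum(now) * p0

# Pearson statistic: sum of (observed - expected)^2 / expected
stat <- sum((now - expected)^2 / expected)

# Upper-tail probability with (number of categories - 1) degrees of freedom
p_value <- pchisq(stat, df = length(now) - 1, lower.tail = FALSE)

# The built-in test with given probabilities should agree
fit <- chisq.test(now, p = p0)
c(by_hand = stat, built_in = unname(fit$statistic), p_value = p_value)
```

All expected counts are well above 5, so the chi-squared approximation is reasonable here.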

Problem 3

It is well known that the more beer you drink, the more your blood alcohol level rises. Suppose we have the following data on student beer consumption. Make a scatterplot and fit the data with a regression line. Test the hypothesis that another beer raises your BAL by 0.02 percent against the alternative that it is less.

beer <- c(5,2,9,8,3,7,3,5,3,5)
bal <- c(0.1 , 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06, 0.02, 0.05)
data <- data.frame(beer, bal)
# scatterplot of blood alcohol level against beers consumed
plot(beer, bal, main="Beer vs blood alcohol level", xlab="Beers consumed", ylab="Blood alcohol level")
# add the fitted regression line
abline(lm(bal~beer), col="red")

Test the hypothesis that another beer raises the BAL by 0.02 percent against the alternative that it is less.

H0: each additional beer raises the blood alcohol level by 0.02 (slope = 0.02)
Ha: each additional beer raises the blood alcohol level by less than 0.02 (slope < 0.02)
Alpha = 0.05

m1<- lm(bal~ beer, data=data)
summary(m1)
## 
## Call:
## lm(formula = bal ~ beer, data = data)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.0275 -0.0187 -0.0071  0.0194  0.0357 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.018500   0.019230  -0.962 0.364200    
## beer         0.019200   0.003511   5.469 0.000595 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.02483 on 8 degrees of freedom
## Multiple R-squared:  0.789,  Adjusted R-squared:  0.7626 
## F-statistic: 29.91 on 1 and 8 DF,  p-value: 0.0005953
m1$coefficients
## (Intercept)        beer 
##     -0.0185      0.0192

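The p-value of 0.000595 in the summary tests whether the slope is 0, not whether it is 0.02, so it does not answer this question. For our hypothesis the test statistic is t = (0.0192 - 0.02) / 0.003511 ≈ -0.23 on 8 degrees of freedom, which gives a one-sided p-value of about 0.41. Since this is greater than 0.05, we fail to reject the null hypothesis: the data are consistent with each additional beer raising the BAL by 0.02.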
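The one-sided test of slope = 0.02 against slope < 0.02 can be sketched directly from the fitted model. The names `est`, `se`, `t_stat` and `p_value` are introduced here for illustration.

```r
beer <- c(5, 2, 9, 8, 3, 7, 3, 5, 3, 5)
bal <- c(0.1, 0.03, 0.19, 0.12, 0.04, 0.095, 0.07, 0.06, 0.02, 0.05)
m1 <- lm(bal ~ beer)

# Slope estimate and its standard error from the coefficient table
est <- coef(summary(m1))["beer", "Estimate"]
se <- coef(summary(m1))["beer", "Std. Error"]

# t statistic for H0: slope = 0.02
t_stat <- (est - 0.02) / se

# Lower-tail p-value, because the alternative is slope < 0.02
p_value <- pt(t_stat, df = df.residual(m1))
c(t = t_stat, p = p_value)
```

A small p-value here would favour the alternative that the slope is below 0.02.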

Problem 4

What are the max and min of the F value (F statistic) for which we accept the null hypothesis, with 7 df for the numerator and 12 df for the denominator? (alpha = 0.05)

qf(0.95, df1=7,df2=12)
## [1] 2.913358

To fail to reject the null hypothesis we need F < 2.9134. An F value cannot be negative, so the null hypothesis is retained when 0 <= F < 2.9134.
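As a quick sanity check, the upper-tail area beyond the critical value should equal alpha. The names `alpha`, `f_crit` and `upper_tail` are introduced here for illustration.

```r
alpha <- 0.05

# Critical value: the 95th percentile of F(7, 12)
f_crit <- qf(1 - alpha, df1 = 7, df2 = 12)

# Area to the right of the critical value; should be alpha
upper_tail <- pf(f_crit, df1 = 7, df2 = 12, lower.tail = FALSE)
c(critical_value = f_crit, upper_tail = upper_tail)
```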

Problem 5

Please perform a step-by-step ANOVA analysis for the dataset below, and discuss the results at each step. Finally, answer the question: do all three drugs have the same impact on patients and, if not, how do they differ? (The 3 steps are “graphical comparison”, “fitting ANOVA model” and “Why and how the means are different”.)

Drug A 3,5,6,1,2,4,5,7,8,9,0,10

Drug B 6,2,3,2,1,6,8,1,5,5,3,9

Drug C 4,7,3,7,3,8,5,4,6,5,1,8

impact <-c (3,5,6,1,2,4,5,7,8,9,0,10,6,2,3,2,1,6,8,1,5,5,3,9,4,7,3,7,3,8,5,4,6,5,1,8)
# store the group labels as a factor so plot() and aov() treat them as categories
drug <- factor(c(rep("a",12), rep("b",12), rep("c",12)))
result <- data.frame(impact, drug)
plot(impact~drug, data=result)

H0: Mean(a) = Mean(b) = Mean(c)
Ha: not all the means of a, b, c are equal

The differences between the groups in the graph are small, so we cannot tell from the graph alone whether the means differ. We fit an ANOVA model to test this.

outcome <- aov(impact~drug, data=result)
summary(outcome)
##             Df Sum Sq Mean Sq F value Pr(>F)
## drug         2   5.06   2.528   0.346   0.71
## Residuals   33 241.17   7.308

Since the p-value from the ANOVA is 0.71, which is greater than 0.05, we fail to reject the null hypothesis: there is no significant evidence that the means of a, b and c differ.

Now we compare the means of each pair of drugs.
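To see where the F value comes from, the between-group and within-group sums of squares can be computed by hand. This is a minimal sketch; the names `ss_between`, `ss_within`, `f_stat` and `p_value` are introduced here for illustration.

```r
impact <- c(3,5,6,1,2,4,5,7,8,9,0,10, 6,2,3,2,1,6,8,1,5,5,3,9, 4,7,3,7,3,8,5,4,6,5,1,8)
drug <- factor(rep(c("a", "b", "c"), each = 12))

group_means <- tapply(impact, drug, mean)
group_sizes <- tapply(impact, drug, length)

# Between-group variation: group means around the grand mean
ss_between <- sum(group_sizes * (group_means - mean(impact))^2)

# Within-group variation: observations around their own group mean
ss_within <- sum((impact - ave(impact, drug))^2)

df_between <- nlevels(drug) - 1
df_within <- length(impact) - nlevels(drug)

# F statistic is the ratio of the two mean squares
f_stat <- (ss_between / df_between) / (ss_within / df_within)
p_value <- pf(f_stat, df_between, df_within, lower.tail = FALSE)
c(F = f_stat, p = p_value)
```

The group means are close to each other relative to the spread inside each group, which is why the F value is small.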

pairwise.t.test(impact, drug, p.adjust.method="bonferroni")
## 
##  Pairwise comparisons using t tests with pooled SD 
## 
## data:  impact and drug 
## 
##   a b
## b 1 -
## c 1 1
## 
## P value adjustment method: bonferroni

The pairwise tests show no significant difference in means between any pair of drugs a, b and c (every adjusted p-value is 1). We therefore conclude that there is no evidence that the drugs differ in their impact.
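Tukey's honest significant difference is another common way to compare group means after an ANOVA. It is shown here as an alternative to the Bonferroni-adjusted t tests, not as the method the assignment requires; the name `tukey` is introduced for illustration.

```r
impact <- c(3,5,6,1,2,4,5,7,8,9,0,10, 6,2,3,2,1,6,8,1,5,5,3,9, 4,7,3,7,3,8,5,4,6,5,1,8)
drug <- factor(rep(c("a", "b", "c"), each = 12))

fit <- aov(impact ~ drug)

# Pairwise differences in means with 95% family-wise confidence intervals
tukey <- TukeyHSD(fit, conf.level = 0.95)
tukey$drug
```

Each row gives the difference between two drugs, its confidence interval, and an adjusted p-value; an interval that contains 0 indicates no significant difference for that pair.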