PHST 501: Introduction to Biostatistics Health Sciences II Midterm Project R Results Document

Scatterplot

## Scatterplot of riskscore against phq, coloured by academic year (ggvis package)
library(ggvis)
library(dplyr)  # loaded for the %>% pipe
risk <- read.csv("RiskBehavior.csv")
risk %>%
  ggvis(~riskscore, ~phq, fill = ~factor(year)) %>%
  layer_points()

Table of Regression Coefficients

[Fill in the table of coefficients below. The table has the same elements as the table of coefficients in R, and is structured in the same way. Round the Estimates, Standard Errors (SE), and t-statistics to 1 decimal place. Round p-values to 2 places.]

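## Treat academic year as a categorical predictor
risk$year <- as.factor(risk$year)
## Linear model with a phq-by-year interaction
lin <- lm(riskscore ~ phq * year, data = risk)
summary(lin)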
## 
## Call:
## lm(formula = riskscore ~ phq * year, data = risk)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -18.7166  -4.6007  -0.4105   3.9805  21.0292 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  23.9063     2.9255   8.172 1.52e-13 ***
## phq           2.1150     0.3104   6.814 2.50e-10 ***
## year2         4.9431     4.3205   1.144 0.254500    
## year3         5.2785     4.1389   1.275 0.204269    
## year4       -11.2210     4.8956  -2.292 0.023372 *  
## phq:year2    -1.8907     0.4892  -3.865 0.000168 ***
## phq:year3    -0.1736     0.4715  -0.368 0.713361    
## phq:year4     1.9286     0.5237   3.683 0.000327 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.425 on 142 degrees of freedom
## Multiple R-squared:  0.6729, Adjusted R-squared:  0.6568 
## F-statistic: 41.74 on 7 and 142 DF,  p-value: < 2.2e-16
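
The rounding requested for the table, and the year-specific regression lines implied by the interaction terms, can be sketched as follows. The coefficient matrix is transcribed by hand from the summary output above, and the names `coefs`, `rounded`, and `lines_by_year` are illustrative only and not part of the original analysis. The sketch assumes R's default treatment contrasts, with year 1 as the reference level.

```r
# Coefficient table transcribed from the summary(lin) output above.
coefs <- matrix(
  c( 23.9063, 2.9255,  8.172, 1.52e-13,
      2.1150, 0.3104,  6.814, 2.50e-10,
      4.9431, 4.3205,  1.144, 0.254500,
      5.2785, 4.1389,  1.275, 0.204269,
    -11.2210, 4.8956, -2.292, 0.023372,
     -1.8907, 0.4892, -3.865, 0.000168,
     -0.1736, 0.4715, -0.368, 0.713361,
      1.9286, 0.5237,  3.683, 0.000327),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("(Intercept)", "phq", "year2", "year3", "year4",
      "phq:year2", "phq:year3", "phq:year4"),
    c("Estimate", "SE", "t", "p")
  )
)

# Round Estimate, SE, and t to 1 decimal place and p-values to 2 places.
rounded <- cbind(round(coefs[, c("Estimate", "SE", "t")], 1),
                 p = round(coefs[, "p"], 2))
print(rounded)

# Year-specific intercepts and slopes: the reference-level (year 1) value
# plus the matching year or phq:year adjustment.
est <- coefs[, "Estimate"]
intercepts <- est["(Intercept)"] + c(0, est[c("year2", "year3", "year4")])
slopes     <- est["phq"] + c(0, est[c("phq:year2", "phq:year3", "phq:year4")])
lines_by_year <- data.frame(year      = 1:4,
                            intercept = unname(intercepts),
                            slope     = unname(slopes))
print(lines_by_year)
```

With the fitted model in memory, `coef(summary(lin))` returns the same matrix, so the transcription step would not be needed.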

QQ plot of residuals

## which = 2 selects the normal Q-Q plot from the lm diagnostic plots
plot(lin, which = 2)

ANOVA Table from One-Way Comparison of PHQ Scores among Academic Years

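## Calculating the ANOVA table values and p-value from the sums of squares
ss_year  <- 6241
ss_resid <- 7829
total <- ss_year + ss_resid                 # total SS = 14070
msy <- signif(ss_year / 3, digits = 6)      # mean square for year = 2080.33
msr <- signif(ss_resid / 145, digits = 4)   # residual mean square = 53.99
f <- signif(msy / msr, digits = 4)          # F statistic = 38.53
## Upper-tail p-value from the central F distribution with 3 and 145 df
p <- pf(f, 3, 145, lower.tail = FALSE)

Source      SS      df    MS        F       p-value
Year        6241    3     2080.33   38.53   < 0.001
Residual    7829    145   53.99
Total       14070   149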
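
A hand-built ANOVA table of this kind can be cross-checked against `aov()`. The following is a minimal sketch on a small made-up data set: the values in `toy` are invented for illustration and are not the RiskBehavior data, and the names `toy`, `fit`, and `tab` are hypothetical. It recomputes the mean squares, F statistic, and upper-tail p-value from the sums of squares and degrees of freedom, then compares them with what `aov()` reports.

```r
# Toy data, invented for illustration only (NOT the RiskBehavior data).
toy <- data.frame(
  phq  = c(4, 6, 5, 9, 11, 10, 7, 8, 9, 14, 15, 13),
  year = factor(rep(1:4, each = 3))
)

# Fit the one-way model and pull out the ANOVA table.
fit <- aov(phq ~ year, data = toy)
tab <- summary(fit)[[1]]
print(tab)

# Recompute MS, F, and the p-value by hand from SS and df.
ss <- tab[["Sum Sq"]]   # c(SS for year, SS for residuals)
df <- tab[["Df"]]       # c(df for year, df for residuals)
ms <- ss / df
f_stat <- ms[1] / ms[2]
p_val  <- pf(f_stat, df[1], df[2], lower.tail = FALSE)

# The hand-computed values should agree with the aov() table.
all.equal(f_stat, tab[["F value"]][1])
all.equal(p_val,  tab[["Pr(>F)"]][1])
```

Applied to the project data, the same check would use `aov(phq ~ year, data = risk)`, assuming `year` has already been converted to a factor.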