PHST 501: Introduction to Biostatistics Health Sciences II Midterm Project R Results Document
##code for scatterplot using ggvis package
library(ggvis)
library(dplyr)
risk<-read.csv('RiskBehavior.csv')
risk %>%
  ggvis(~riskscore, ~phq, fill = ~factor(year)) %>%
  layer_points()
[Fill in the table of coefficients below. The table has the same elements as the table of coefficients in R, and is structured in the same way. Round the Estimates, Standard Errors (SE), and t-statistics to 1 decimal place. Round p-values to 2 places.]
risk$year <-as.factor(risk$year)
lin<-lm(riskscore ~ phq * year,data = risk)
summary(lin)
##
## Call:
## lm(formula = riskscore ~ phq * year, data = risk)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -18.7166  -4.6007  -0.4105   3.9805  21.0292
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  23.9063     2.9255   8.172 1.52e-13 ***
## phq           2.1150     0.3104   6.814 2.50e-10 ***
## year2         4.9431     4.3205   1.144 0.254500
## year3         5.2785     4.1389   1.275 0.204269
## year4       -11.2210     4.8956  -2.292 0.023372 *
## phq:year2    -1.8907     0.4892  -3.865 0.000168 ***
## phq:year3    -0.1736     0.4715  -0.368 0.713361
## phq:year4     1.9286     0.5237   3.683 0.000327 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.425 on 142 degrees of freedom
## Multiple R-squared: 0.6729, Adjusted R-squared: 0.6568
## F-statistic: 41.74 on 7 and 142 DF, p-value: < 2.2e-16
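The rounding asked for in the table instructions (1 decimal place for estimates, standard errors, and t-statistics; 2 places for p-values) can be sketched as follows. This is only an illustration: the numbers are copied by hand from the `summary(lin)` output above so the snippet runs without `RiskBehavior.csv`, and the names `coefs` and `rounded` are made up for this example.

```r
## coefficient matrix copied from the summary(lin) output above
coefs <- matrix(
  c( 23.9063, 2.9255,  8.172, 1.52e-13,
      2.1150, 0.3104,  6.814, 2.50e-10,
      4.9431, 4.3205,  1.144, 0.254500,
      5.2785, 4.1389,  1.275, 0.204269,
    -11.2210, 4.8956, -2.292, 0.023372,
     -1.8907, 0.4892, -3.865, 0.000168,
     -0.1736, 0.4715, -0.368, 0.713361,
      1.9286, 0.5237,  3.683, 0.000327),
  ncol = 4, byrow = TRUE,
  dimnames = list(
    c("(Intercept)", "phq", "year2", "year3", "year4",
      "phq:year2", "phq:year3", "phq:year4"),
    c("Estimate", "SE", "t", "p")
  )
)

## estimates, SEs and t-statistics to 1 decimal place; p-values to 2 places
rounded <- cbind(
  round(coefs[, c("Estimate", "SE", "t")], 1),
  p = round(coefs[, "p"], 2)
)
print(rounded)
```

With the fitted model in the session, `coef(summary(lin))` should return the same matrix without retyping. P-values that round to 0.00 are usually better reported as "< 0.01" than as zero.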
plot(lin)
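Because the model includes a `phq * year` interaction, the `phq` coefficient is the slope for the reference year only, and each `phq:year` term is the change in that slope for the corresponding year. A small sketch of the implied per-year slopes, using estimates copied from the `summary(lin)` output (the variable names are illustrative, and year 1 is assumed to be the reference level because it has no coefficient of its own):

```r
## estimates copied from the summary(lin) output
phq_slope_ref <- 2.1150                      # slope of phq in the reference year
interaction <- c(year2 = -1.8907,            # change in slope for each later year
                 year3 = -0.1736,
                 year4 =  1.9286)

## slope of riskscore on phq within each year
slopes <- c(year1 = phq_slope_ref, phq_slope_ref + interaction)
print(round(slopes, 3))
```

These are point estimates only. The standard errors in the summary apply to the differences from the reference year, not to the per-year slopes themselves.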
## calculating anova table values and p value
total <- 6241 + 7829
msy <- 6241/3
msy <- signif(msy,digits=6)
msr <- 7829/145
msr <- signif(msr,digits=4)
f <- msy/msr
f <- signif(f,digits=4)
## upper-tail p-value of the observed F statistic on 3 and 145 df
pvalue <- pf(f, 3, 145, lower.tail = FALSE)
| | SS | df | MS | F | p-value |
|---|---|---|---|---|---|
| Year | 6241 | 3 | 2080.33 | 38.53 | < 0.001 |
| Residual | 7829 | 145 | 53.99 | | |
| Total | 14070 | 149 | | | |
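As a check on the table, the arithmetic behind the mean squares and the F statistic can be written out using only the sums of squares and degrees of freedom reported above (the subscripts are notation chosen for this sketch):

$$
\begin{aligned}
SS_{\text{Total}} &= SS_{\text{Year}} + SS_{\text{Residual}} = 6241 + 7829 = 14070 \\
MS_{\text{Year}} &= \frac{SS_{\text{Year}}}{df_{\text{Year}}} = \frac{6241}{3} \approx 2080.33 \\
MS_{\text{Residual}} &= \frac{SS_{\text{Residual}}}{df_{\text{Residual}}} = \frac{7829}{145} \approx 53.99 \\
F &= \frac{MS_{\text{Year}}}{MS_{\text{Residual}}} \approx \frac{2080.33}{53.99} \approx 38.53
\end{aligned}
$$

The p-value is the upper-tail area of the F distribution with 3 and 145 degrees of freedom beyond 38.53.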