Title HW 2 2/18/2014

Question 1) A: There is an interaction because the three lines are not parallel, indicating that the relationship between “species rank in abundance” and “log abundance” depends on whether there had been a fire.
B: Species rank and abundance.
C: According to the graph, relative abundance is higher after the burn than before. The steep slope of the curve indicates low species evenness, so controlled burns appear to be associated with low evenness: abundance is concentrated in the higher-ranking species rather than spread evenly across the lower-ranking ones.

Question 2) A) The graph shows that private colleges have higher graduation rates with lower admission rates, and that public colleges have lower graduation rates with higher admission rates. For private colleges the graduation rate goes down as the admission rate increases, and for public colleges the graduation rate likewise increases as the admission rate decreases.
B) GradRate = 0.85 - 0.31 AdmisRate - 0.22 TypePublic - 0.12 AdmisRate*TypePublic + scatter
C) Public: GradRate = 0.85 - 0.31 AdmisRate - 0.22(1) - 0.12 AdmisRate(1)
Public: GradRate = 0.63 - 0.43 AdmisRate
Private: GradRate = 0.85 - 0.31 AdmisRate - 0.22(0) - 0.12 AdmisRate(0)
Private: GradRate = 0.85 - 0.31 AdmisRate
D) The graduation rate for public colleges decreases faster with admission rate than it does for private colleges (slopes of -0.43 versus -0.31), and the intercept for private colleges is higher (0.85 versus 0.63), meaning that the private graduation rate starts at a higher value than the public one.
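The substitution in part C can be checked numerically. This is a minimal R sketch that assumes the rounded coefficients quoted in part B; the vector `b` and the helper `line_for` are illustrative names, not part of the original analysis.

```r
# Rounded coefficients from part B (assumed here, not re-estimated from the data)
b <- c(intercept = 0.85, admis = -0.31, public = -0.22, admis_public = -0.12)

# Intercept and slope of the GradRate line for a given TypePublic value (0 or 1)
line_for <- function(type_public) {
  c(intercept = unname(b["intercept"] + b["public"] * type_public),
    slope     = unname(b["admis"] + b["admis_public"] * type_public))
}

private_line <- line_for(0)  # TypePublic = 0
public_line  <- line_for(1)  # TypePublic = 1
print(rbind(private = private_line, public = public_line))
```

Setting the indicator to 0 or 1 is all that separates the two lines, which is why the interaction coefficient shows up only in the public slope.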

Question 3) A) Taking the intercept out of the model is a bad idea. In the previous example the slope for AdmisRate is negative, but once the intercept is removed the line is forced through the origin, its slope becomes positive (0.79), and it no longer fits the data.
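The sign flip can be reproduced on simulated data. The sketch below uses hypothetical values (not the College data) chosen so that y stays positive while trending downward in x, which is roughly the situation described above.

```r
set.seed(1)
# Simulated data (hypothetical): positive response with a negative trend in x
x <- runif(200, 0.2, 1)
y <- 0.9 - 0.7 * x + rnorm(200, sd = 0.05)

with_intercept    <- lm(y ~ x)      # slope estimated freely
without_intercept <- lm(y ~ x - 1)  # line forced through the origin

slope_with    <- unname(coef(with_intercept)["x"])     # negative, close to -0.7
slope_without <- unname(coef(without_intercept)["x"])  # positive
print(c(with_intercept = slope_with, without_intercept = slope_without))
```

Because both x and y are positive, a line through the origin has to slope upward to pass near the points, regardless of the true trend.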

Question 4) A) The Metropolitan variable enters an interaction term because the percentage of a state's population that resides in metropolitan areas affects the relationship between the percentage of the state that lives below the poverty line (poverty) and the number of serious crimes per 100,000 people (crime). This makes intuitive sense because a state will have a higher percentage of people living in poverty if it has more metropolitan areas.
B) CrimeRate = -33.03 + 7.74 poverty + 1.85 Metropolitan + 0.31 poverty*Metropolitan
Metropolitan = 30: CrimeRate = -33.03 + 7.74 poverty + 1.85(30) + 0.31(30) poverty = 22.47 + 17.04 poverty
Metropolitan = 70: CrimeRate = -33.03 + 7.74 poverty + 1.85(70) + 0.31(70) poverty = 96.47 + 29.44 poverty
Metropolitan = 100: CrimeRate = -33.03 + 7.74 poverty + 1.85(100) + 0.31(100) poverty = 151.97 + 38.74 poverty
C) States with a metropolitan rate of 100 have the largest intercept (151.97) and see crime rate increase fastest with poverty (slope 38.74). This makes sense because crime and poverty are concentrated in metropolitan areas, so we would expect crime rates and poverty to increase with a state's metropolitan rate.
D) The relationship between the poverty rate and the crime rate depends on the metropolitan rate: the positive interaction coefficient (0.31) means the poverty slope grows as the metropolitan rate rises. The summary output, however, gives this coefficient a p-value of 0.41, so the sample does not establish the interaction conclusively.
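The three lines in part B follow the same pattern, so they can be tabulated in one step. This is a small R sketch assuming the rounded coefficients used above; the variable names are illustrative.

```r
# Rounded coefficients from the interaction model (as used in part B)
b0 <- -33.03; b_pov <- 7.74; b_met <- 1.85; b_int <- 0.31

metro <- c(30, 70, 100)

# For a fixed Metropolitan value m:
#   intercept = b0 + b_met * m,  poverty slope = b_pov + b_int * m
lines_by_metro <- data.frame(
  Metropolitan = metro,
  intercept    = b0 + b_met * metro,
  slope        = b_pov + b_int * metro
)
print(lines_by_metro)
```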

Question 5) A) As years of experience increase, salary also increases; however, beyond a certain point (around 30) it increases in smaller increments and then tapers off or decreases.
B) mod6 = lm(salary ~ exper, dat1)
fitted(mod6)
xyplot(salary + fitted(mod6) ~ exper, dat1)
summary(mod6)
C) Salary = 46.14 + (1.14*30) = 80.34
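The prediction in part C can also be obtained with `predict()`. The sketch below uses a hypothetical stand-in data frame `toy` built from the rounded coefficients, since it is only meant to show the call; with the real data the model object would be `mod6`.

```r
# Hypothetical stand-in data; the real analysis uses dat1 from SalarySim.csv
toy <- data.frame(exper = c(0, 10, 20, 30, 40))
toy$salary <- 46.14 + 1.14 * toy$exper  # exactly linear, so lm() recovers these coefficients

toy_mod <- lm(salary ~ exper, toy)

# Prediction at 30 years of experience, two ways
by_hand    <- 46.14 + 1.14 * 30
by_predict <- unname(predict(toy_mod, newdata = data.frame(exper = 30)))
print(c(by_hand = by_hand, by_predict = by_predict))
```

With the actual fit, `predict(mod6, newdata = data.frame(exper = 30))` would use the unrounded coefficients, so the result may differ slightly from the hand calculation.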

Question 6) A) BDI = 25.79 + 6.98 - 1.75 psisat
B) BDI = 25.78 + 0.70 + 3.9 - 1.75(5) = 21.63 (single person supported by their parents)
C) There is no interaction term, so the intercept indicates which group has the highest or lowest depression level. The single, government-assistance group will have the highest depression level because its intercept is the largest (25.79 + 10.71 + 0.70 = 37.20).
D) Among the groups with estimated coefficients, "other" employment with "other" marital status has the smallest intercept (25.79 + 2.76 - 3.52 = 25.03). However, every employment coefficient is positive, so the reference employment category combined with marital status "other" is lower still (25.79 - 3.52 = 22.27).
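Since the model is additive, the intercept for every employment and marital combination is just a sum of shifts. The sketch below tabulates them from the coefficients in the summary output; the reference levels are labelled "(reference)" because their names do not appear in the output, and all object names are illustrative.

```r
# Coefficients copied from summary(mod); reference levels have a shift of 0
b0 <- 25.785
employment <- c("(reference)" = 0, "employed part-time" = 6.979,
                "govt assistance" = 10.714, "other" = 2.763,
                "parental support" = 3.923)
marital <- c("(reference)" = 0, "other" = -3.522, "single" = 0.696)

# Group intercept = baseline + employment shift + marital shift (no interaction)
intercepts <- b0 + outer(employment, marital, "+")
print(round(intercepts, 2))

# Locate the largest and smallest group intercepts
highest <- which(intercepts == max(intercepts), arr.ind = TRUE)
lowest  <- which(intercepts == min(intercepts), arr.ind = TRUE)
```

The psisat term shifts every group by the same amount, so the ordering of the groups does not depend on its value.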

#1 and 2

hw2 = read.csv("http://www.macalester.edu/~ajohns24/data/College.csv")
library(mosaic)
## Loading required package: grid Loading required package: lattice
## 
## Attaching package: 'mosaic'
## 
## The following objects are masked from 'package:stats':
## 
## D, IQR, binom.test, cor, cov, fivenum, median, prop.test, sd, t.test, var
## 
## The following objects are masked from 'package:base':
## 
## max, mean, min, print, prod, range, sample, sum
mod2 = lm(GradRate ~ AdmisRate, hw2)
summary(mod2)
## 
## Call:
## lm(formula = GradRate ~ AdmisRate, data = hw2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.7028 -0.1043  0.0271  0.1306  0.3650 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.9343     0.0362    25.8   <2e-16 ***
## AdmisRate    -0.7233     0.0629   -11.5   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.183 on 193 degrees of freedom
## Multiple R-squared:  0.407,  Adjusted R-squared:  0.404 
## F-statistic:  132 on 1 and 193 DF,  p-value: <2e-16
xyplot(GradRate ~ AdmisRate, groups = Type, data = hw2, auto.key = TRUE)

plot of chunk unnamed-chunk-1

mod3 = lm(GradRate ~ AdmisRate + Type, hw2)
summary(mod3)
## 
## Call:
## lm(formula = GradRate ~ AdmisRate + Type, data = hw2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.7593 -0.0614  0.0238  0.0812  0.3572 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.8714     0.0282   30.91  < 2e-16 ***
## AdmisRate    -0.3504     0.0576   -6.09  6.2e-09 ***
## TypePublic   -0.2820     0.0240  -11.75  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.14 on 192 degrees of freedom
## Multiple R-squared:  0.655,  Adjusted R-squared:  0.651 
## F-statistic:  182 on 2 and 192 DF,  p-value: <2e-16
mod4 = lm(GradRate ~ AdmisRate - 1, hw2)
summary(mod4)
## 
## Call:
## lm(formula = GradRate ~ AdmisRate - 1, data = hw2)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -0.626 -0.217  0.105  0.458  0.823 
## 
## Coefficients:
##           Estimate Std. Error t value Pr(>|t|)    
## AdmisRate   0.7890     0.0478    16.5   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.385 on 194 degrees of freedom
## Multiple R-squared:  0.584,  Adjusted R-squared:  0.582 
## F-statistic:  272 on 1 and 194 DF,  p-value: <2e-16
plot(GradRate ~ AdmisRate, hw2)
abline(mod4)

plot of chunk unnamed-chunk-1

#Question 4

crime = read.csv("http://www.macalester.edu/~ajohns24/data/USCrime.csv")
library(mosaic)
crimesub = subset(crime, State != "DC")
crimemod = lm(CrimeRate ~ poverty * Metropolitan, crimesub)
summary(crimemod)
## 
## Call:
## lm(formula = CrimeRate ~ poverty * Metropolitan, data = crimesub)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -239.8  -88.6  -43.8   43.7  379.4 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)
## (Intercept)           -33.035    311.325   -0.11     0.92
## poverty                 7.736     25.889    0.30     0.77
## Metropolitan            1.852      4.423    0.42     0.68
## poverty:Metropolitan    0.314      0.379    0.83     0.41
## 
## Residual standard error: 146 on 46 degrees of freedom
## Multiple R-squared:  0.439,  Adjusted R-squared:  0.403 
## F-statistic:   12 on 3 and 46 DF,  p-value: 6.19e-06

#Question 5

dat1 = read.csv("http://dl.dropbox.com/u/7315092/Data/SalarySim.csv")
library(mosaic)
xyplot(salary ~ exper, dat1)

plot of chunk unnamed-chunk-3

# The fitted values come from a model object, so fit the model before plotting them.
mod6 = lm(salary ~ exper, dat1)
fitted(mod6)
##     1     2     3     4     5     6     7     8     9    10    11    12 
## 46.14 46.71 47.28 47.85 48.42 49.00 49.57 50.14 50.71 51.28 51.86 52.43 
##    13    14    15    16    17    18    19    20    21    22    23    24 
## 53.00 53.57 54.14 54.71 55.29 55.86 56.43 57.00 57.57 58.15 58.72 59.29 
##    25    26    27    28    29    30    31    32    33    34    35    36 
## 59.86 60.43 61.00 61.58 62.15 62.72 63.29 63.86 64.44 65.01 65.58 66.15 
##    37    38    39    40    41    42    43    44    45    46    47    48 
## 66.72 67.30 67.87 68.44 69.01 69.58 70.15 70.73 71.30 71.87 72.44 73.01 
##    49    50    51    52    53    54    55    56    57    58    59    60 
## 73.59 74.16 74.73 75.30 75.87 76.44 77.02 77.59 78.16 78.73 79.30 79.88 
##    61    62    63    64    65    66    67    68    69    70    71    72 
## 80.45 81.02 81.59 82.16 82.74 83.31 83.88 84.45 85.02 85.59 86.17 86.74 
##    73    74    75 
## 87.31 87.88 88.45
xyplot(salary + fitted(mod6) ~ exper, dat1)
summary(mod6)
## 
## Call:
## lm(formula = salary ~ exper, data = dat1)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -15.85  -5.43   1.12   6.48  11.11 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  46.1367     1.6832    27.4   <2e-16 ***
## exper         1.1437     0.0785    14.6   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7.36 on 73 degrees of freedom
## Multiple R-squared:  0.744,  Adjusted R-squared:  0.74 
## F-statistic:  212 on 1 and 73 DF,  p-value: <2e-16

#Question 6

dat6 = read.csv("http://dl.dropbox.com/u/7315092/Data/socsupport.csv")
library(mosaic)
mod = lm(BDI ~ employment + marital + psisat, dat6)
summary(mod)
## 
## Call:
## lm(formula = BDI ~ employment + marital + psisat, data = dat6)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -17.92  -5.15  -0.68   3.91  33.58 
## 
## Coefficients:
##                              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                    25.785      7.690    3.35   0.0012 ** 
## employmentemployed part-time    6.979      6.259    1.12   0.2679    
## employmentgovt assistance      10.714      6.570    1.63   0.1066    
## employmentother                 2.763      6.436    0.43   0.6688    
## employmentparental support      3.923      6.580    0.60   0.5526    
## maritalother                   -3.522      3.834   -0.92   0.3608    
## maritalsingle                   0.696      3.097    0.22   0.8227    
## psisat                         -1.752      0.372   -4.71  9.1e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.28 on 87 degrees of freedom
## Multiple R-squared:  0.304,  Adjusted R-squared:  0.248 
## F-statistic: 5.44 on 7 and 87 DF,  p-value: 3.36e-05