1. An experiment was conducted as part of an investigation to combat the effects of certain toxic agents. The survival time of rats depended on the type of poison used and the treatment applied. The data is found in rats.
##   time poison treat
## 1 0.31      I     A
## 2 0.82      I     B
## 3 0.43      I     C
## 4 0.45      I     D
## 5 0.45      I     A
## 6 1.10      I     B
## [1] 48  3
## [1] "time"   "poison" "treat"
## [1] "I"   "II"  "III"
## [1] "A" "B" "C" "D"

Variable Description

Time: Survival time of rats exposed to different poison and treatments

Poison: One of three poisons used (I, II, III)

Treat: One of four treatments administered (A, B, C, D)

Questions

  1. Construct a linear model for the data bearing in mind that some transformation of the response may be necessary and that the possibility of interactions needs to be considered. Interpret the effects of the poisons.

Simple Model

At first glance, survival times overlap across all categories of poison and treatment. In general, Poison III appears to have the most detrimental impact on survival, while treatments A and C appear the least effective in mitigating the effects of the poisons. The colored graph also shows that treatments B and D tend to have the longest survival times, while A has the shortest and C offers only a marginal improvement over A.

## 
## Call:
## lm(formula = time ~ ., data = rats)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.25167 -0.09625 -0.01490  0.06177  0.49833 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.45229    0.05592   8.088 4.22e-10 ***
## poisonII    -0.07313    0.05592  -1.308  0.19813    
## poisonIII   -0.34125    0.05592  -6.102 2.83e-07 ***
## treatB       0.36250    0.06458   5.614 1.43e-06 ***
## treatC       0.07833    0.06458   1.213  0.23189    
## treatD       0.22000    0.06458   3.407  0.00146 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1582 on 42 degrees of freedom
## Multiple R-squared:  0.6503, Adjusted R-squared:  0.6087 
## F-statistic: 15.62 on 5 and 42 DF,  p-value: 1.123e-08

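The additive model indicates that Poison III and treatment B are significant, as well as treatment D. The initial adjusted R^2 is approximately 0.61. These coefficients are the easiest to interpret because they are shifts on the original time scale: Poison III lowers expected survival time by about 0.34 units relative to Poison I, a significant effect. At the baseline treatment A this takes the fitted time from 0.45 to 0.11, about 25% of the Poison I value, or roughly a 75% reduction. Poison II is not significantly different from Poison I.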
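The relative size of the Poison III effect in the additive model can be checked directly from the printed coefficients. The following is a minimal sketch that hard-codes the rounded estimates from the summary above (the variable names are illustrative) and evaluates the fitted times at the baseline treatment A.

```r
# Rounded estimates copied from the additive lm() summary above
b_intercept <- 0.45229   # fitted time for Poison I, treatment A
b_poisonIII <- -0.34125  # additive shift for Poison III

fit_I_A   <- b_intercept
fit_III_A <- b_intercept + b_poisonIII

# Fraction of the baseline survival time that remains under Poison III
ratio_remaining <- fit_III_A / fit_I_A
# Relative reduction is 1 minus the fraction remaining
rel_reduction <- 1 - ratio_remaining

round(c(remaining = ratio_remaining, reduction = rel_reduction), 3)
```

Because the model is additive on the time scale, this percentage applies only at treatment A. The absolute shift of 0.34 is the same for every treatment, so the relative reduction is smaller for treatments with longer baseline survival.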

## 
## Call:
## lm(formula = log(time) ~ ., data = rats)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.41827 -0.13020 -0.01135  0.10405  0.58670 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.88731    0.08349 -10.627 1.77e-13 ***
## poisonII    -0.18666    0.08349  -2.236   0.0307 *  
## poisonIII   -0.77515    0.08349  -9.284 9.82e-12 ***
## treatB       0.70465    0.09641   7.309 5.27e-09 ***
## treatC       0.19671    0.09641   2.040   0.0476 *  
## treatD       0.50707    0.09641   5.260 4.57e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2362 on 42 degrees of freedom
## Multiple R-squared:  0.7897, Adjusted R-squared:  0.7646 
## F-statistic: 31.54 on 5 and 42 DF,  p-value: 3.401e-13

After log-transforming the response, both Poisons II and III show significantly reduced survival time. The residual plot, however, suggests some remaining heteroscedasticity. The adjusted R^2 increases to over 0.75. The coefficients must be exponentiated before interpretation: Poison II corresponds to a 17% and Poison III to a 54% reduction in survival time relative to Poison I.
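The exponentiation step can be sketched as follows, again using the rounded coefficients from the printed summary rather than a fitted model object (the vector names are illustrative).

```r
# Rounded estimates copied from the lm(log(time) ~ .) summary above
log_coefs <- c(poisonII = -0.18666, poisonIII = -0.77515)

# On the log scale a coefficient b multiplies the fitted survival time
# (strictly, its geometric mean) by exp(b)
multiplier <- exp(log_coefs)

# Percentage reduction relative to Poison I
pct_reduction <- 100 * (1 - multiplier)
round(pct_reduction, 1)
```

Unlike the additive model on the raw scale, these percentages hold for every treatment, since a log-scale shift is a constant multiplicative effect.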

## 
## Call:
## lm(formula = log(time) ~ .^2, data = rats)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.50006 -0.11846  0.01995  0.12202  0.48733 
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      -0.89755    0.11626  -7.720 3.82e-09 ***
## poisonII         -0.26383    0.16442  -1.605 0.117330    
## poisonIII        -0.66727    0.16442  -4.058 0.000254 ***
## treatB            0.75768    0.16442   4.608 4.94e-05 ***
## treatC            0.30281    0.16442   1.842 0.073777 .  
## treatD            0.38891    0.16442   2.365 0.023523 *  
## poisonII:treatB   0.13472    0.23253   0.579 0.565960    
## poisonIII:treatB -0.29379    0.23253  -1.263 0.214553    
## poisonII:treatC  -0.13101    0.23253  -0.563 0.576659    
## poisonIII:treatC -0.18730    0.23253  -0.805 0.425823    
## poisonII:treatD   0.30494    0.23253   1.311 0.198023    
## poisonIII:treatD  0.04953    0.23253   0.213 0.832508    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2325 on 36 degrees of freedom
## Multiple R-squared:  0.8252, Adjusted R-squared:  0.7718 
## F-statistic: 15.45 on 11 and 36 DF,  p-value: 1.643e-10

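When interactions between poison and treatment are included, Poison II is no longer significantly different from Poison I (p = 0.12). Poison III remains significantly different, with a 49% reduction in survival time. Because the model now contains interactions, these main effects describe the poisons under the baseline treatment A only; none of the individual interaction terms is significant.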
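Whether the six interaction terms are needed as a group can be assessed with a partial F test. The sketch below reconstructs the residual sums of squares from the rounded residual standard errors printed in the two log(time) summaries, so the result is approximate; the variable names are illustrative.

```r
# Residual standard errors and degrees of freedom from the two log(time) summaries
sigma_add <- 0.2362; df_add <- 42   # additive model
sigma_int <- 0.2325; df_int <- 36   # model with poison:treat interactions

# Residual sum of squares is sigma^2 times the residual degrees of freedom
rss_add <- sigma_add^2 * df_add
rss_int <- sigma_int^2 * df_int

# Partial F statistic for the 6 interaction terms
df_diff <- df_add - df_int
f_stat  <- ((rss_add - rss_int) / df_diff) / (rss_int / df_int)
p_value <- pf(f_stat, df_diff, df_int, lower.tail = FALSE)

c(F = f_stat, p = p_value)
```

With the fitted model objects available, calling anova() on the two fits would give the same comparison without the rounding error.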

  2. Build an inverse Gaussian GLM for this data. Select an appropriate link function and perform diagnostics to verify your choices. Interpret the effects of the poisons.
## 
## Call:
## glm(formula = time ~ ., family = inverse.gaussian(link = "identity"), 
##     data = rats)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -0.82006  -0.25671  -0.05081   0.14507   0.89770  
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.48364    0.04626  10.454 2.93e-13 ***
## poisonII    -0.10455    0.05411  -1.932 0.060115 .  
## poisonIII   -0.28673    0.04610  -6.219 1.92e-07 ***
## treatB       0.24102    0.05103   4.723 2.60e-05 ***
## treatC       0.03546    0.02646   1.340 0.187474    
## treatD       0.15972    0.04034   3.959 0.000285 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for inverse.gaussian family taken to be 0.1722363)
## 
##     Null deviance: 25.7437  on 47  degrees of freedom
## Residual deviance:  6.2095  on 42  degrees of freedom
## AIC: -71.183
## 
## Number of Fisher Scoring iterations: 9

The initial inverse Gaussian model (using the identity link function) generally agrees with the findings of the linear model. Relative to Poison I, Poison III has significantly worse survival: the fitted mean at the baseline treatment A drops from 0.48 to 0.20, about 41% of the Poison I value, or roughly a 59% reduction. This is a somewhat smaller difference than in the original linear model, where the corresponding fitted time fell to about 25% of the Poison I value. However, there appears to be a pattern in the residuals, as well as two potential outliers in the highest quartile.
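With an identity link the coefficients are shifts on the time scale, so the linear model and the GLM can be compared on the same footing. The following sketch uses the rounded estimates from the two additive summaries; the helper function `remaining` is hypothetical and only wraps the arithmetic.

```r
# Rounded estimates from the two additive summaries above
lm_coef  <- c(intercept = 0.45229, poisonIII = -0.34125)  # lm(time ~ .)
glm_coef <- c(intercept = 0.48364, poisonIII = -0.28673)  # inverse Gaussian, identity link

# Fraction of the Poison I fitted mean that remains under Poison III (treatment A)
remaining <- function(b) unname((b["intercept"] + b["poisonIII"]) / b["intercept"])

frac <- c(lm = remaining(lm_coef), glm = remaining(glm_coef))
round(100 * frac, 1)        # percent of baseline remaining
round(100 * (1 - frac), 1)  # percent reduction
```

Both the absolute shift and the relative reduction are smaller in the GLM than in the linear model, although Poison III is strongly significant in each.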

## 
## Call:
## glm(formula = time ~ .^2, family = inverse.gaussian(link = "identity"), 
##     data = rats)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.6987  -0.2145   0.0238   0.1703   0.5229  
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)       0.41250    0.04062  10.155 4.12e-12 ***
## poisonII         -0.09250    0.04920  -1.880   0.0682 .  
## poisonIII        -0.20250    0.04322  -4.685 3.91e-05 ***
## treatB            0.46750    0.13293   3.517   0.0012 ** 
## treatC            0.15500    0.07712   2.010   0.0520 .  
## treatD            0.19750    0.08359   2.363   0.0237 *  
## poisonII:treatB   0.02750    0.17655   0.156   0.8771    
## poisonIII:treatB -0.34250    0.13701  -2.500   0.0171 *  
## poisonII:treatC  -0.10000    0.08920  -1.121   0.2697    
## poisonIII:treatC -0.13000    0.08044  -1.616   0.1148    
## poisonII:treatD   0.15000    0.12145   1.235   0.2248    
## poisonIII:treatD -0.08250    0.08951  -0.922   0.3628    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for inverse.gaussian family taken to be 0.09403976)
## 
##     Null deviance: 25.7437  on 47  degrees of freedom
## Residual deviance:  3.6418  on 36  degrees of freedom
## AIC: -84.797
## 
## Number of Fisher Scoring iterations: 3

Adding interaction effects between treatment and poison removes the pattern from the residual plot, and the outliers no longer stand out. The only significant interaction is between Poison III and treatment B, and its coefficient is negative (-0.34), which offsets most of the 0.47 benefit of treatment B: Poison III appears to blunt the treatment that is otherwise most effective. The linear model with interactions gives the same negative sign for this term, although there it was not significant.
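The meaning of the Poison III by treatment B interaction is easier to see as fitted cell means. This sketch adds up the rounded identity-link coefficients from the summary above (the object names are illustrative) to compare the gain from treatment B under Poison I and under Poison III.

```r
# Rounded estimates from the identity-link interaction summary above
b0 <- 0.41250; pIII <- -0.20250; tB <- 0.46750; pIII_tB <- -0.34250

# Fitted cell means on the time scale
mean_I_A   <- b0
mean_I_B   <- b0 + tB
mean_III_A <- b0 + pIII
mean_III_B <- b0 + pIII + tB + pIII_tB

# Gain from treatment B over treatment A, by poison
gain_B <- c(poisonI = mean_I_B - mean_I_A, poisonIII = mean_III_B - mean_III_A)
gain_B
```

Treatment B still helps under Poison III, but the gain is roughly a quarter of what it is under Poison I.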

## 
## Call:
## glm(formula = time ~ .^2, family = inverse.gaussian(link = "log"), 
##     data = rats)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.6987  -0.2145   0.0238   0.1703   0.5229  
## 
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      -0.88552    0.09848  -8.992 9.80e-11 ***
## poisonII         -0.25392    0.13123  -1.935 0.060890 .  
## poisonIII        -0.67513    0.12097  -5.581 2.53e-06 ***
## treatB            0.75769    0.17432   4.347 0.000108 ***
## treatC            0.31900    0.15179   2.102 0.042647 *  
## treatD            0.39122    0.15504   2.523 0.016183 *  
## poisonII:treatB   0.17718    0.23889   0.742 0.463097    
## poisonIII:treatB -0.29066    0.20784  -1.398 0.170531    
## poisonII:treatC  -0.16040    0.19844  -0.808 0.424230    
## poisonIII:treatC -0.20653    0.18303  -1.128 0.266637    
## poisonII:treatD   0.34400    0.21738   1.582 0.122295    
## poisonIII:treatD  0.04549    0.19135   0.238 0.813422    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for inverse.gaussian family taken to be 0.09403976)
## 
##     Null deviance: 25.7437  on 47  degrees of freedom
## Residual deviance:  3.6418  on 36  degrees of freedom
## AIC: -84.797
## 
## Number of Fisher Scoring iterations: 5

Finally, once a log link function is used, this interaction effect is no longer significant, and all three treatments offer a significant improvement over treatment A under Poison I. Poison III remains significantly different from Poison I, with a 49% reduction in mean survival time under treatment A. This is in agreement with the linear model findings. Poison III also exhibits the least variance in its Pearson residuals.
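The two inverse Gaussian interaction fits report identical deviance residuals, dispersion, residual deviance and AIC. A likely explanation is that the full interaction model has one parameter per poison-by-treatment cell, so both links reproduce the same fitted cell means and differ only in how those means are parameterised. The sketch below checks this for two cells using the rounded coefficients from the summaries (names are illustrative).

```r
# Rounded estimates from the two inverse Gaussian interaction summaries above
id_coef  <- c(intercept = 0.41250,  poisonIII = -0.20250)   # identity link
log_coef <- c(intercept = -0.88552, poisonIII = -0.67513)   # log link

# Fitted means for treatment A under each link
id_means  <- c(I = id_coef[["intercept"]], III = sum(id_coef))
log_means <- c(I = exp(log_coef[["intercept"]]), III = exp(sum(log_coef)))

round(rbind(identity = id_means, log = log_means), 4)

# Relative reduction for Poison III under treatment A, from the log-link fit
1 - exp(log_coef[["poisonIII"]])
```

Under this reading, the change in significance of the Poison III by treatment B term reflects the scale on which the interaction is measured (additive versus multiplicative), not a difference in fit.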