Stat 700 Homework 7

The concentrations (in nanograms per milliliter) of plasma epinephrine were measured for ten dogs under: (1) isofluorane, (2) halothane, and (3) cyclopropane anesthesia. (Ref: Perry et al, 1974).

dog.dat=read.table("http://www-rohan.sdsu.edu/~babailey/stat700/dog.dat",header=T)

Plot the data using strip charts. Describe any differences that you see.

stripchart(PEconc ~ Anesthesia, vertical=T,method="stack",xlab="Anesthesia",ylab="PEconc",data=dog.dat,main="Dog Data Stripchart")

plot of chunk unnamed-chunk-2 It appears that Anesthesia 3 has a different variance than the first two anesthesias. Anesthesia 1 and 2 don’t differ too much, we could believe that they have the same variance and mean.

(B)Fit a linear mixed effects model with lme. Give summary and diagnostics plots of the residuals. What do you conclude? Make sure that a factor is a factor! You can use the as.factor funcion inside of lme.

require(nlme)

## Loading required package: nlme

dog.dat$Anesthesia = as.factor(dog.dat$Anesthesia) #anesthesia is a factor so don't use numerical values
fit1 = lme(PEconc ~1,data=dog.dat,random=~1| Anesthesia)
summary(fit1)

## Linear mixed-effects model fit by REML
##  Data: dog.dat 
##     AIC   BIC logLik
##   28.55 32.65 -11.27
## 
## Random effects:
##  Formula: ~1 | Anesthesia
##         (Intercept) Residual
## StdDev:      0.2096   0.3177
## 
## Fixed effects: PEconc ~ 1 
##              Value Std.Error DF t-value p-value
## (Intercept) 0.5853    0.1342 27   4.361   2e-04
## 
## Standardized Within-Group Residuals:
##     Min      Q1     Med      Q3     Max 
## -1.6464 -0.5659 -0.3195  0.6612  2.2885 
## 
## Number of Observations: 30
## Number of Groups: 3

plot(fit1)

plot of chunk unnamed-chunk-3

qqnorm(fit1)

plot of chunk unnamed-chunk-3 After looking at the residual of the lme fit. It doesnt appear that there is any specific patter. The data falls pretty evenly on both sides across the x-axis. As for the QQ plot the data does have a slight deiviation from the a linear shape. There is a small bump towards the center and the tails aren’t straight. Overall is it still believable that the data is normal.

(c)Test the hypothesis H0 : σ2α = 0 vs H1 : σ2α6= 0, using a LRT. What do you conclude?

fit2 = gls(PEconc ~1,data=dog.dat)
anova(fit1,fit2)

##      Model df   AIC   BIC logLik   Test L.Ratio p-value
## fit1     1  3 28.55 32.65 -11.27                       
## fit2     2  2 30.80 33.54 -13.40 1 vs 2    4.26   0.039

Because we get a p-value of 0.039 we reject the null hypothesis and have reason to assume the std deviation is significant.

(d) The default method with lme is REML. Repeat (c) using ML for BOTH fits. How do the p-values of the LRT using REML and ML compare?

fitb1 = lme(PEconc ~1,data=dog.dat,random=~1| Anesthesia,method="ML")
fitb2 = gls(PEconc ~1,data=dog.dat, method="ML")
anova(fitb1,fitb2)

##       Model df   AIC   BIC logLik   Test L.Ratio p-value
## fitb1     1  3 26.15 30.35 -10.07                       
## fitb2     2  2 27.19 30.00 -11.60 1 vs 2   3.044   0.081

The p-value is no longer significant (p-value=0.081 > 0.05), therefore we fail to reject the null hpothesis, unlike the previous question. REML and ML result in different results.

PROBLEM 2 Return to Nonlinear Models: Create 100 observations where the underlying signal is a sine function with amplitude of 4 and a horizontal phase shift of π. Noise is added in the form of normal (Gaussian) random numbers with mean equal to zero and standard deviation equal to 0.5.

x = seq(0,2*pi,length=100) #sequence from 0 to 2pi, 100 terms 
set.seed(1)
e = rnorm(100,0,.5)
y = 4*sin(x-pi)+e
plot(x,y,main=expression('y'[i]*' = 4sin(x'[i]*'-pi)+e'[i]))

plot of chunk unnamed-chunk-6

Use non linear least squares to create a model of the data.

Start with the approximation: amp=1, horshft=0

y1=nls(y~amp*sin(x-horshft),start=list(amp=1, horshft=0))
summary(y1)

## 
## Formula: y ~ amp * sin(x - horshft)
## 
## Parameters:
##         Estimate Std. Error t value Pr(>|t|)    
## amp     -4.01269    0.06461   -62.1   <2e-16 ***
## horshft -0.00311    0.01594    -0.2     0.85    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.455 on 98 degrees of freedom
## 
## Number of iterations to convergence: 3 
## Achieved convergence tolerance: 4.69e-07

The model converges with an amp=-4.012691 and horshft=-0.003115 and Residual Standard Errorr=.4545

Repeate the process with approximation: amp=2, horshft=2

y2=y1=nls(y~amp*sin(x-horshft),start=list(amp=2, horshft=2))
summary(y2)

## 
## Formula: y ~ amp * sin(x - horshft)
## 
## Parameters:
##         Estimate Std. Error t value Pr(>|t|)    
## amp       4.0127     0.0646    62.1   <2e-16 ***
## horshft   3.1385     0.0159   196.9   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.455 on 98 degrees of freedom
## 
## Number of iterations to convergence: 5 
## Achieved convergence tolerance: 6.99e-07

The model converges with an amp=4.0169, horshft=3.13848 and Residual Standard Error=.4545

Show that if you start with initial estimates of amp and horshft closer to their true value, nls gives the estimates you want.For illustration, a print out of the summary of the fit is acceptable. What properties of the sine function might cause this behavior?

We notice that the amplitude and horizontal shifts of both NLS models are different and yet the Residual Standard Errors are the same. The reason this occurs is the sine function is periodic. As long as the period is the same you can add 2*PI to any initial guess and it would yield the same result. The same concept occurs with amplitude except the sign would be opposite and the other parameters would have to be adjusted accordingly. We also plot the two results below and notice that they are exactly the same model except the parameters are different, proving what we expect would happen with the sine function.

par(mfrow=c(2,1)) 
plot(x,y,main=expression('y'[i]*' = 4sin(x'[i]*'-pi)+e'[i]))
lines(x,predict(y1),col="red")
legend(4,0,"Prediction 1",col="red",pch=NA,lty=1,cex=.5)
plot(x,y,main=expression('y'[i]*' = 4sin(x'[i]*'-pi)+e'[i]))
lines(x,predict(y2),col="green")
legend(4,0,"Prediction 2 (correct)",col="green",pch=NA,lty=1,cex=.5)

plot of chunk unnamed-chunk-9

Stat 700 Homework 7

Francis Alvarez , Gonzalo Urrutia

November 3, 2014