##Question One: A study investigated characteristics associated with y = whether a cancer patient achieved remission (1 = yes, 0 = no). An important explanatory variable was a labeling index (LI = percentage of “labeled” cells) that measures proliferative activity of cells after a patient receives an injection of tritiated thymidine. Input the data yourself using

LI <- c( 8, 8,10,10,12,12,12,14,14,14,16,16,16,18,20,20,20,22,22,24,26,28,32,34,38,38,38)
y  <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,0,1,1,0,1,1,1,0)
df <- data.frame(LI, y)
fit <- glm(y ~ LI, family = binomial, data = df)
summary(fit)
## 
## Call:
## glm(formula = y ~ LI, family = binomial, data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.9448  -0.6465  -0.4947   0.6571   1.6971  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)   
## (Intercept) -3.77714    1.37862  -2.740  0.00615 **
## LI           0.14486    0.05934   2.441  0.01464 * 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 34.372  on 26  degrees of freedom
## Residual deviance: 26.073  on 25  degrees of freedom
## AIC: 30.073
## 
## Number of Fisher Scoring iterations: 4

###Question 1(a) Estimate the probability of remission for a patient with a labeling index value of 12.

predict(fit, data.frame(LI = 12), type = "resp")
##         1 
## 0.1151908
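
As a cross-check on `predict()`, the fitted probability can be computed by hand with the inverse logit. This is a minimal sketch that assumes the rounded coefficients printed in the summary above; the names `b0`, `b1`, and `p12` are illustrative.

```r
# Coefficients copied (rounded) from summary(fit)
b0 <- -3.77714
b1 <-  0.14486

# Inverse logit: pi(x) = exp(b0 + b1 * x) / (1 + exp(b0 + b1 * x))
p12 <- plogis(b0 + b1 * 12)
p12   # about 0.115, in line with predict(..., type = "response")
```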

###Question 1(b) Estimate the percentage of labeled cells at which the probability of remission is 0.50.

-as.numeric(coefficients(fit)[1]/coefficients(fit)[2])
## [1] 26.07384

###Question 1(c) Estimate the effect of a one-unit increase in LI on the odds of remission.

exp(coefficients(fit)[2])
##       LI 
## 1.155881

##When LI increases by 1, the estimated odds of remission are multiplied by 1.16.

###Question 1(d) How does the estimated probability of remission change from the lower to upper quartile values of labeling index, that is, from LI = 13 to LI = 25?

quantile(LI)
##   0%  25%  50%  75% 100% 
##    8   13   18   25   38
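predict(fit, data.frame(LI = 13), type = "resp")
##         1 
## 0.1307983
predict(fit, data.frame(LI = 25), type = "resp")
##         1 
## 0.4611882
0.4611882-0.1307983 
## [1] 0.3303899

##The estimated probability of remission increases by about 0.33, from 0.13 at LI = 13 to 0.46 at LI = 25.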

##Question Two: Continue with the cancer remission data from the previous exercise.

###Question 2(a) What is the rate of change in $\hat{\pi}(x) = \hat{P}(Y = 1 \mid X = x)$ at $x = 12$?

0.14486* 0.1151908*(1- 0.1151908)
## [1] 0.0147644
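
The product computed above is the derivative of the logistic curve, $d\hat{\pi}/dx = \hat{\beta}\hat{\pi}(1 - \hat{\pi})$. The sketch below assumes the rounded coefficients from the summary and compares that formula with a numerical derivative; all object names are illustrative.

```r
b0 <- -3.77714
b1 <-  0.14486
pi_hat <- function(x) plogis(b0 + b1 * x)

# Analytic slope of the fitted curve at x = 12
slope_formula <- b1 * pi_hat(12) * (1 - pi_hat(12))

# Central-difference approximation as a check
h <- 1e-5
slope_numeric <- (pi_hat(12 + h) - pi_hat(12 - h)) / (2 * h)

c(formula = slope_formula, numeric = slope_numeric)  # both about 0.0148
```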

###Question 2(b) Conduct a Wald test for the LI effect, and construct a 95% Wald interval for the odds ratio corresponding to a one-unit increase in LI. Interpret.

##H0: β = 0

0.14486 / 0.05934   
## [1] 2.441186

##Wald statistic: $z = \hat{\beta}/SE = 0.1449/0.0593 = 2.44$, so $z^2 = 5.96$ is approximately chi-squared with df = 1, giving P-value = 0.0146. Thus H0: β = 0 is rejected: there is evidence that LI has an effect on the probability of remission.

##CI: $\exp[\hat{\beta} \pm z_{\alpha/2}(SE)] = \exp[0.14486 \pm 1.96(0.05934)] = (e^{0.0286}, e^{0.2612})$, thus the CI is (1.03, 1.30). So the odds of remission at LI = x + 1 are estimated to be between 1.03 and 1.30 times the odds of remission at LI = x.
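
A Wald interval for an odds ratio is normally built on the log-odds scale, $\hat{\beta} \pm z \cdot SE$, and then exponentiated at both endpoints. The sketch below assumes the estimate and standard error printed in the summary; variable names are illustrative.

```r
beta_hat <- 0.14486   # LI coefficient from summary(fit)
se       <- 0.05934   # its standard error

z     <- beta_hat / se                          # Wald z statistic
p_val <- 2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided P-value

# 95% Wald interval on the log-odds scale, then exponentiate
ci_log <- beta_hat + c(-1, 1) * qnorm(0.975) * se
ci_or  <- exp(ci_log)

round(c(z = z, p = p_val), 4)
round(ci_or, 3)   # roughly 1.03 to 1.30
```

With the fitted object available, `exp(confint.default(fit, "LI"))` should give the same Wald interval.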

###Question 2(c) Conduct a likelihood-ratio test for the LI effect, and construct a 95% profile likelihood interval for the odds ratio. Interpret.

##H0: β = 0

m0 <- glm(y ~ 1, family = binomial, data = df)
summary(m0)
## 
## Call:
## glm(formula = y ~ 1, family = binomial, data = df)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -0.9005  -0.9005  -0.9005   1.4823   1.4823  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)  
## (Intercept)  -0.6931     0.4082  -1.698   0.0895 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 34.372  on 26  degrees of freedom
## Residual deviance: 34.372  on 26  degrees of freedom
## AIC: 36.372
## 
## Number of Fisher Scoring iterations: 4
anova(fit, m0, test="Chisq")
## Analysis of Deviance Table
## 
## Model 1: y ~ LI
## Model 2: y ~ 1
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)   
## 1        25     26.073                        
## 2        26     34.372 -1  -8.2988 0.003967 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
exp(confint(fit, "LI"))
## Waiting for profiling to be done...
##    2.5 %   97.5 % 
## 1.043440 1.329319

##There is evidence of an LI effect on remission: the likelihood-ratio statistic is 34.37 - 26.07 = 8.30 with df = 1, P-value = 0.004.

##The lower and upper profile likelihood limits for β are (0.0425, 0.2846).

##Exponentiating, $e^{0.0425}$ and $e^{0.2846}$ give (1.04, 1.33). Thus the odds of remission at LI = x + 1 are estimated to fall between 1.04 and 1.33 times the odds of remission at LI = x.
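
The likelihood-ratio statistic is the drop in deviance between the two nested models, referred to a chi-squared distribution. This is a small sketch using the deviances printed above; names are illustrative.

```r
null_dev  <- 34.372   # deviance of y ~ 1
resid_dev <- 26.073   # deviance of y ~ LI

lr_stat <- null_dev - resid_dev                         # about 8.30
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)  # upper-tail probability

c(LR = lr_stat, p = p_value)                            # p is about 0.004
```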

###Question 2(d) The data are organized as 14 observations in grouped-data format, in the data file http://users.stat.ufl.edu/~aa/cat/data/Remission.dat. Fit the logistic regression model to the grouped data. Are the parameter estimates the same as with ungrouped data? Their standard errors? What about the deviance? Does the likelihood ratio test for LI effect change? Explain.

filename <- "http://users.stat.ufl.edu/~aa/cat/data/Remission.dat"
df2 <- read.table(file = filename, header = TRUE)
# Grouped-data fit: sample proportion as the response, number of cases as weights
fit2 <- glm(remissions/cases ~ LI, weights = cases, family = binomial, data = df2)
m2 <- glm(remissions/cases ~ 1, weights = cases, family = binomial, data = df2)
summary(fit2)
anova(m2, fit2, test = "Chisq")

##Fitted with LI as the predictor and the number of cases as weights, the grouped-data model gives the same parameter estimates (-3.7771 and 0.1449) and the same standard errors (1.3786 and 0.0593) as the ungrouped fit, because the two likelihoods are the same function of the parameters.

##The deviances differ: for the grouped data the null deviance is 23.961 (df = 13) and the residual deviance is 15.662 (df = 12), compared with 34.372 (df = 26) and 26.073 (df = 25) for the ungrouped data, because the saturated model has 14 parameters rather than 27.

##The likelihood-ratio test for the LI effect does not change: 23.961 - 15.662 = 8.30 = 34.372 - 26.073, with df = 1 and P-value = 0.004.
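
Grouping does not change the binomial likelihood as a function of the parameters. A grouped fit that supplies the number of cases as weights should therefore reproduce the ungrouped estimates and LR statistic, with only the deviances themselves changing. The sketch below checks this by aggregating the ungrouped data from Question One rather than downloading the file; the column names `cases` and `remissions` mirror those used above, and the other names are illustrative.

```r
LI <- c(8, 8,10,10,12,12,12,14,14,14,16,16,16,18,20,20,20,22,22,24,26,28,32,34,38,38,38)
y  <- c(0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,1,0,0,1,1,0,1,1,1,0)

# Ungrouped fit: one row per patient
fit_u <- glm(y ~ LI, family = binomial)

# Grouped data: one row per distinct LI value
cases      <- tapply(y, LI, length)
remissions <- tapply(y, LI, sum)
g <- data.frame(LI = as.numeric(names(cases)),
                cases = as.vector(cases),
                remissions = as.vector(remissions))

# Grouped fit: sample proportion as response, number of cases as weights
fit_g <- glm(remissions / cases ~ LI, weights = cases,
             family = binomial, data = g)

rbind(ungrouped = coef(fit_u), grouped = coef(fit_g))
lr_u <- fit_u$null.deviance - fit_u$deviance
lr_g <- fit_g$null.deviance - fit_g$deviance
c(lr_u = lr_u, lr_g = lr_g)                               # same LR statistic
c(ungrouped = fit_u$deviance, grouped = fit_g$deviance)   # different deviances
```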

##Question 3. For the Crabs data file at www.stat.ufl.edu/~aa/cat/data, fit the logistic regression model for the probability of a satellite (y = 1) using x = weight as the sole explanatory variable.

filename1 <- "http://users.stat.ufl.edu/~aa/cat/data/Crabs.dat"
Crabs <- read.table(file=filename1, header=T);
fit3 <- glm(y ~ weight, family = binomial, data = Crabs)
summary(fit3)
## 
## Call:
## glm(formula = y ~ weight, family = binomial, data = Crabs)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.1108  -1.0749   0.5426   0.9122   1.6285  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -3.6947     0.8802  -4.198 2.70e-05 ***
## weight        1.8151     0.3767   4.819 1.45e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 225.76  on 172  degrees of freedom
## Residual deviance: 195.74  on 171  degrees of freedom
## AIC: 199.74
## 
## Number of Fisher Scoring iterations: 4

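###3(a) $\text{logit}[\hat{\pi}(x)] = -3.6947264 + 1.8151446x$, so $\hat{\pi}(x) = \dfrac{e^{-3.6947264 + 1.8151446x}}{1 + e^{-3.6947264 + 1.8151446x}}$

###3(b) Conduct the Wald and likelihood ratio tests of the hypothesis that weight has no effect. Report the P-value, and interpret.

##Wald test: z = 1.8151/0.3767 = 4.819, P-value = 1.45e-06

##LR test: statistic = 225.76 - 195.74 = 30.02 with df = 1, P-value = 4.273e-08

##Both tests give very strong evidence that weight has a positive effect on the probability of a satellite.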

m3 <- glm(y ~ 1, family = binomial, data = Crabs)
summary(m3)
## 
## Call:
## glm(formula = y ~ 1, family = binomial, data = Crabs)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.4326  -1.4326   0.9421   0.9421   0.9421  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   0.5824     0.1585   3.673 0.000239 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 225.76  on 172  degrees of freedom
## Residual deviance: 225.76  on 172  degrees of freedom
## AIC: 227.76
## 
## Number of Fisher Scoring iterations: 4
anova(fit3, m3, test="Chisq")
## Analysis of Deviance Table
## 
## Model 1: y ~ weight
## Model 2: y ~ 1
##   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
## 1       171     195.74                          
## 2       172     225.76 -1  -30.021 4.273e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

###3(c) Find $\hat{\pi}(x)$ at the weight values 1.20, 2.44, and 5.20 kg, which are the sample minimum, mean, and maximum.

MinMeanMax <- data.frame(weight=c(1.20,2.44,5.20))
predict(fit3,MinMeanMax,type="response")
##         1         2         3 
## 0.1799697 0.6757320 0.9968084

###3(d) Find the weight at which $\hat{\pi}(x) = 0.50$.

fiftyfifty <- -unname(fit3$coefficients[1]/fit3$coefficients[2])
fiftyfifty
## [1] 2.0355

###3(e) At the weight found in part (d), give a linear approximation for the estimated effect of (i) a 1-kg increase in weight. This represents a relatively large increase, so convert this to the effect of (ii) a 0.10-kg increase, and (iii) a standard deviation increase in weight (0.58 kg).

onekg <- data.frame(weight=fiftyfifty+1)
predict(fit3,onekg,type="response")
##         1 
## 0.8599825

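#The linear approximation uses the slope of the fitted curve at $\hat{\pi} = 0.50$, which is $\hat{\beta}/4 = 1.8151/4 = 0.454$ per kg, so a 1-kg increase in weight raises the estimated probability of a satellite by about 0.45. The exact fitted probability after a 1-kg increase is 0.860, an increase of 0.36, so the approximation is rough for so large a change.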

pointonekg <- data.frame(weight=fiftyfifty+0.1)
predict(fit3,pointonekg,type="response")
##         1 
## 0.5452544

#For a 0.10-kg increase in weight, the linear approximation ($\hat{\beta}/4 = 0.454$ per kg) gives an increase of 0.454(0.10) = 0.045; the fitted probability rises from 0.50 to 0.545.

fiftyeight <- data.frame(weight=fiftyfifty+0.58)
predict(fit3,fiftyeight,type="response")
##         1 
## 0.7413091

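#For a 0.58-kg (one standard deviation) increase in weight, the linear approximation ($\hat{\beta}/4 = 0.454$ per kg) gives an increase of 0.454(0.58) = 0.26; the fitted probability rises from 0.50 to 0.741, an increase of 0.24.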
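
The linear approximation asked for in 3(e) uses the slope of the fitted curve, $\hat{\beta}\hat{\pi}(1 - \hat{\pi})$, which equals $\hat{\beta}/4$ at $\hat{\pi} = 0.50$. The sketch below assumes the coefficients printed in 3(a) and puts the approximate and exact changes side by side; names are illustrative.

```r
b0 <- -3.6947264
b1 <-  1.8151446
x50 <- -b0 / b1                    # weight at which pi-hat = 0.50

slope <- b1 * 0.5 * (1 - 0.5)      # rate of change per kg at x50
delta <- c(one_kg = 1, tenth_kg = 0.10, one_sd = 0.58)

approx_change <- slope * delta                          # linear approximation
exact_change  <- plogis(b0 + b1 * (x50 + delta)) - 0.5  # change on the fitted curve

round(rbind(approx_change, exact_change), 3)
```

The two rows agree closely for the 0.10-kg change and drift apart as the increment grows, which is the reason the exercise asks for smaller increments.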

###3(f) Construct a 95% confidence interval to describe the effect of weight on the odds of a satellite. Interpret.

exp(confint(fit3)[2,])
## Waiting for profiling to be done...
##    2.5 %   97.5 % 
##  3.04588 13.42750

###CI: (3.05, 13.43). So the odds of a satellite at weight = x + 1 kg are estimated to be between 3.05 and 13.43 times the odds of a satellite at weight = x.

##Question 4. For the Crabs data file, fit a logistic regression model for the probability of a satellite, using color alone as the predictor.

glm(y~color, family=binomial, data=Crabs)
## 
## Call:  glm(formula = y ~ color, family = binomial, data = Crabs)
## 
## Coefficients:
## (Intercept)        color  
##      2.3635      -0.7147  
## 
## Degrees of Freedom: 172 Total (i.e. Null);  171 Residual
## Null Deviance:       225.8 
## Residual Deviance: 213.3     AIC: 217.3

###4(a)Treat color as a nominal scale (qualitative). Report the prediction equation and explain how to use it to compare the first and fourth colors.

glm(y~factor(color), family=binomial, data=Crabs)
## 
## Call:  glm(formula = y ~ factor(color), family = binomial, data = Crabs)
## 
## Coefficients:
##    (Intercept)  factor(color)2  factor(color)3  factor(color)4  
##         1.0986         -0.1226         -0.7309         -1.8608  
## 
## Degrees of Freedom: 172 Total (i.e. Null);  169 Residual
## Null Deviance:       225.8 
## Residual Deviance: 212.1     AIC: 220.1

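###$\text{logit}(\hat{\pi}) = 1.0986 - 0.1226c_2 - 0.7309c_3 - 1.8608c_4$, where $c_j = 1$ for color $j$ and 0 otherwise, so the first color (medium light) is the baseline.

###To compare the first and fourth colors, exponentiate the coefficient of $c_4$: the estimated odds that a dark crab (color 4) has a satellite are $e^{-1.8608} = 0.16$ times the estimated odds for a medium-light crab (color 1).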
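
With indicator coding, each coefficient is a log odds ratio against the baseline color, so comparing the fourth color with the first only needs the last coefficient. The sketch below assumes the coefficients printed above; names are illustrative.

```r
# Coefficients from glm(y ~ factor(color)); color 1 is the baseline
b <- c(intercept = 1.0986, color2 = -0.1226, color3 = -0.7309, color4 = -1.8608)

# Odds ratio, color 4 versus color 1
or_4_vs_1 <- exp(b[["color4"]])

# Estimated probability of a satellite for each color
logits <- b[["intercept"]] + c(color1 = 0, b[c("color2", "color3", "color4")])
probs  <- plogis(logits)

round(or_4_vs_1, 3)
round(probs, 3)
```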

###4(b)For the model in part (a), conduct a likelihood ratio test of the hypothesis that color has no effect. Interpret.

m4 <- glm(y~factor(color), family=binomial, data=Crabs)
m5 <- glm(y~1, family=binomial, data=Crabs)
anova(m5, m4, test="LRT")
## Analysis of Deviance Table
## 
## Model 1: y ~ 1
## Model 2: y ~ factor(color)
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)   
## 1       172     225.76                        
## 2       169     212.06  3   13.698 0.003347 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

###Likelihood-ratio statistic = 13.7, df = 3, P-value = 0.0033: strong evidence that the probability of having a satellite differs between at least two of the colors.

###4(c) Treating color in a quantitative manner (scores 1,2,3,4), obtain a prediction equation. Interpret the coefficient of color.

m6 <- glm(y~color, family=binomial, data=Crabs)

###$\text{logit}(\hat{\pi}) = 2.36 - 0.71c$, where $c$ is the color score (1 to 4).

###For each one-category increase in color darkness, the estimated odds of a satellite are multiplied by $e^{-0.71} = 0.49$.

###4(d)For the model in part (c), test the hypothesis that color has no effect. Interpret.

m6 <- glm(y~color, family=binomial, data=Crabs)
m7 <- glm(y~1, family=binomial, data=Crabs)
anova(m6, m7, test="LRT")
## Analysis of Deviance Table
## 
## Model 1: y ~ color
## Model 2: y ~ 1
##   Resid. Df Resid. Dev Df Deviance  Pr(>Chi)    
## 1       171     213.30                          
## 2       172     225.76 -1  -12.461 0.0004156 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

###Likelihood-ratio statistic = 12.5, df = 1, P-value = 0.0004: very strong evidence of a color effect.

###4(e) When we treat color as quantitative instead of qualitative, state a potential advantage relating to power, and a potential disadvantage relating to model fit.

###Advantage: The model is simpler and easier to interpret, and tests of the effect of an ordinal predictor are generally more powerful when it has a single parameter rather than several parameters.

###Disadvantage: The color effect may not be linear on the logit scale, in which case the model fits poorly. The qualitative model avoids this because it does not impose a pattern on the way the probability of a satellite changes as color changes.
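
One way to weigh the fit disadvantage is to test the quantitative-color model against the nominal one, since the linear-scores model is a special case of the model with three indicators. The sketch below uses the LR statistics reported in 4(b) and 4(d); names are illustrative.

```r
lr_nominal <- 13.698   # color as a factor, df = 3
lr_linear  <- 12.461   # color as scores 1 to 4, df = 1

# Extra deviance explained by dropping the linearity restriction
lr_diff <- lr_nominal - lr_linear
df_diff <- 3 - 1
p_value <- pchisq(lr_diff, df = df_diff, lower.tail = FALSE)

c(LR = lr_diff, df = df_diff, p = round(p_value, 3))
```

A large P-value here would suggest that the simpler linear-trend model fits about as well as the nominal one.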

##Question 5. The data file http://users.stat.ufl.edu/~aa/intro-cda/data/MBTI.dat cross-classifies a sample of people from the MBTI Step II National Sample on whether they report drinking alcohol frequently and on the four binary scales of the Myers-Briggs personality test: Extroversion/Introversion (EI), Sensing/iNtuitive (SN), Thinking/Feeling (TF), and Judging/Perceiving (JP).

filename2 <- "http://users.stat.ufl.edu/~aa/intro-cda/data/MBTI.dat"
df3 <- read.table(file=filename2, header=T);
dim(df3)
## [1] 16  7
df3
##    EI SN TF JP smoke drink   n
## 1   e  s  t  j    13    10  77
## 2   e  s  t  p    11     8  42
## 3   e  s  f  j    16     5 106
## 4   e  s  f  p    19     7  79
## 5   e  n  t  j     6     3  23
## 6   e  n  t  p     4     2  18
## 7   e  n  f  j     6     4  31
## 8   e  n  f  p    23    15  80
## 9   i  s  t  j    32    17 140
## 10  i  s  t  p     9     3  52
## 11  i  s  f  j    34     6 138
## 12  i  s  f  p    29     4 106
## 13  i  n  t  j     4     1  13
## 14  i  n  t  p     9     5  35
## 15  i  n  f  j     4     1  31
## 16  i  n  f  p    22     6  79

##5(a)Which of the 16 personality types has the highest percentage that report drinking alcohol frequently?

((df3$drink)/(df3$n)*100)
##  [1] 12.987013 19.047619  4.716981  8.860759 13.043478 11.111111 12.903226
##  [8] 18.750000 12.142857  5.769231  4.347826  3.773585  7.692308 14.285714
## [15]  3.225806  7.594937

###Personality type ESTP has the highest percentage reporting drinking alcohol frequently (19.0%).
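
Rather than reading the largest percentage off the printout, the type can be picked out programmatically. This sketch types in the drink counts and sample sizes from the table above; the `type` labels and other names are illustrative.

```r
mbti <- data.frame(
  EI = rep(c("e", "i"), each = 8),
  SN = rep(rep(c("s", "n"), each = 4), times = 2),
  TF = rep(rep(c("t", "f"), each = 2), times = 4),
  JP = rep(c("j", "p"), times = 8),
  drink = c(10, 8, 5, 7, 3, 2, 4, 15, 17, 3, 6, 4, 1, 5, 1, 6),
  n     = c(77, 42, 106, 79, 23, 18, 31, 80, 140, 52, 138, 106, 13, 35, 31, 79)
)

# Four-letter type label and percentage drinking frequently
mbti$type <- toupper(paste0(mbti$EI, mbti$SN, mbti$TF, mbti$JP))
mbti$pct  <- 100 * mbti$drink / mbti$n

# Row with the largest sample percentage
mbti[which.max(mbti$pct), c("type", "drink", "n", "pct")]
```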

##5(b)Fit a model using the four scales as predictors of the probability of drinking alcohol frequently. Report the prediction equation, specifying how you set up the indicator variables. Test the null hypothesis that all four coefficients are equal to zero.

df3 <- read.table(file = "http://users.stat.ufl.edu/~aa/intro-cda/data/MBTI.dat", header = TRUE)
no <- df3$n - df3$drink; yes <- df3$drink
fit4 <- glm(yes/(yes+no) ~ EI + SN + TF + JP, weights = yes + no, family = binomial(link = "logit"), data = df3)
# The null model needs the same weights so that the two deviances are comparable
m8 <- glm(yes/(yes+no) ~ 1, weights = yes + no, family = binomial(link = "logit"), data = df3)
(Dev1 <- fit4$deviance); fit4$df.residual
## [1] 11.14907
## [1] 11
LR.stat <- m8$deviance - Dev1
1 - pchisq(LR.stat, df = 4)
anova(m8, fit4, test = "Chisq")

###Prediction equation: $\text{logit}[\hat{\pi}(x)] = -2.114047 - 0.5550115\,I - 0.4291508\,S + 0.6873349\,T + 0.2022295\,P$, where I = 1 for introversion, S = 1 for sensing, T = 1 for thinking, and P = 1 for perceiving, and each indicator is 0 otherwise (R's default baseline categories are e, n, f, and j).

###The LR statistic is the null deviance minus the residual deviance, about 30.49 - 11.15 = 19.34 with df = 4, so the P-value is below 0.001. There is strong evidence that at least one of the four coefficients is not equal to zero.

##5(c)Test for lack of fit by comparing your model of part (b) to the saturated model. What do you conclude?

fit4$deviance; fit4$df.residual
## [1] 11.14907
## [1] 11
1-pchisq(fit4$deviance, fit4$df.residual)
## [1] 0.4308605

###Residual deviance: 11.14907 on 11 df; P-value for the deviance goodness-of-fit test = 0.4309. There is no evidence of lack of fit, so the model of part (b) fits adequately compared with the saturated model.

##5(d) According to your model of part (b), which personality type has the highest estimated probability of drinking alcohol frequently? Is it the same as your answer to part (a)? Explain.

###ENTP has the highest estimated probability, rather than ESTP, which had the highest sample percentage in part (a). By default, R builds E, N, F and J into the intercept term, leaving estimated regression coefficients for I = 1, S = 1, T = 1 and P = 1. The coefficients for I = 1 and S = 1 are negative, while the coefficients for T = 1 and P = 1 are positive. When I = S = 0 and T = P = 1, the negative coefficients have no impact while both positive coefficients raise the predicted probability.
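
The answer to 5(d) can be checked directly by ranking the fitted probabilities. This self-contained sketch types in the table by hand and fits the same main-effects model with the number of respondents as weights; the `type` labels and object names are illustrative.

```r
mbti <- data.frame(
  EI = rep(c("e", "i"), each = 8),
  SN = rep(rep(c("s", "n"), each = 4), times = 2),
  TF = rep(rep(c("t", "f"), each = 2), times = 4),
  JP = rep(c("j", "p"), times = 8),
  drink = c(10, 8, 5, 7, 3, 2, 4, 15, 17, 3, 6, 4, 1, 5, 1, 6),
  n     = c(77, 42, 106, 79, 23, 18, 31, 80, 140, 52, 138, 106, 13, 35, 31, 79)
)
mbti$type <- toupper(paste0(mbti$EI, mbti$SN, mbti$TF, mbti$JP))

# Main-effects logistic model on the grouped counts
fit_mbti <- glm(drink / n ~ EI + SN + TF + JP, weights = n,
                family = binomial, data = mbti)

# Fitted probability for each of the 16 types, largest first
mbti$fitted <- fitted(fit_mbti)
head(mbti[order(-mbti$fitted), c("type", "fitted")], 3)
```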

##Question 6. The data file SoreThroat shows results of a study about Y = whether a patient having surgery with general anesthesia experienced a sore throat on waking (1 = yes, 0 = no) as a function of D = duration of the surgery (in minutes) and T = type of device used to secure the airway (0 = laryngeal mask airway, 1 = tracheal tube).

##Question 6(a) Fit a main effects model using these predictors. Interpret parameter estimates.

##Question 6(b) Conduct inference about the D effect in your model of part (a).

##Question 6(c) Fit a model permitting interaction. Report the prediction equation for the effect of D when (i) T = 1, and (ii) T = 0. Interpret.

##Question 6(d) Conduct inference about whether you need the interaction term.

##Question 6(e) Construct side-by-side scatterplots of Y versus D (be sure to jitter the Y-values), using different plotting symbols for T = 0 versus T = 1. Overlay the fitted logistic curves corresponding to the main effects model in the left-hand plot and the interaction model in the right-hand plot, using different line types for T = 0 versus T = 1. Include legends on your plots; there is some R code in the Homework hints folder on Courseworks that you may find helpful. Comment on the plots.

filename3 <- "http://users.stat.ufl.edu/~aa/cat/data/SoreThroat.dat"
SoreThroat <- read.table(file=filename3, header=T);
m9 <- glm(Y~D+T, data=SoreThroat, family=binomial)
m10 <- glm(Y~D+T+D:T, data=SoreThroat, family=binomial)
anova(m9, m10, test="Chisq")
## Analysis of Deviance Table
## 
## Model 1: Y ~ D + T
## Model 2: Y ~ D + T + D:T
##   Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1        32     30.138                     
## 2        31     28.321  1   1.8169   0.1777
# curve() supplies its own grid of x values across the current plot range
par(mfrow = c(1, 2)); set.seed(406)
# Left panel: main effects model (pch 1 / solid = tracheal tube, pch 2 / dashed = laryngeal mask)
plot(jitter(Y, .2) ~ D, pch = 2 - T, data = SoreThroat, ylab = "P(SoreThroat)", xlab = "Duration", main = "Main effects model")
curve(predict(m9, data.frame(D = x, T = 1), type = "response"), lty = 1, add = TRUE)
curve(predict(m9, data.frame(D = x, T = 0), type = "response"), lty = 2, add = TRUE)
legend("bottomright", inset = .05, pch = 1:2, lty = 1:2, legend = c("Tracheal tube", "Laryngeal mask"))
# Right panel: interaction model, same symbols and line types
plot(jitter(Y, .2) ~ D, pch = 2 - T, data = SoreThroat, ylab = "P(SoreThroat)", xlab = "Duration", main = "Interaction model")
curve(predict(m10, data.frame(D = x, T = 1), type = "response"), lty = 1, add = TRUE)
curve(predict(m10, data.frame(D = x, T = 0), type = "response"), lty = 2, add = TRUE)
legend("bottomright", inset = .05, pch = 1:2, lty = 1:2, legend = c("Tracheal tube", "Laryngeal mask"))

summary(m9)
## 
## Call:
## glm(formula = Y ~ D + T, family = binomial, data = SoreThroat)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.3802  -0.5358   0.3047   0.7308   1.7821  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)   
## (Intercept) -1.41734    1.09457  -1.295  0.19536   
## D            0.06868    0.02641   2.600  0.00931 **
## T           -1.65895    0.92285  -1.798  0.07224 . 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 46.180  on 34  degrees of freedom
## Residual deviance: 30.138  on 32  degrees of freedom
## AIC: 36.138
## 
## Number of Fisher Scoring iterations: 5

###Answer 6(a). Estimates of the main effects model: for each additional minute of surgery, the estimated odds of a sore throat on waking are multiplied by exp(0.069) = 1.07, about a 7% increase, for a given device type.

###For a given duration of surgery, the estimated odds of a sore throat on waking with a tracheal tube are exp(-1.65895) = 0.19 times the odds with a laryngeal mask airway, about an 81% decrease.

###Answer 6(b) Test H0: βD = 0. The Wald statistic is z = 0.06868/0.02641 = 2.600 with P-value = 0.00931, so D is an important predictor: adjusting for device type, there is strong evidence that the probability of a sore throat on waking increases with the duration of surgery.
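
Inference about the D effect can also be summarised with a confidence interval for the odds ratio. This is a sketch of the Wald version, assuming the estimate and standard error printed in `summary(m9)`; names are illustrative.

```r
beta_D <- 0.06868   # D coefficient from summary(m9)
se_D   <- 0.02641   # its standard error

z <- beta_D / se_D
p <- 2 * pnorm(abs(z), lower.tail = FALSE)

# 95% Wald interval for the odds ratio per extra minute of surgery
ci <- exp(beta_D + c(-1, 1) * qnorm(0.975) * se_D)

# Odds multiplier for a 10-minute increase, often easier to interpret
or_10min <- exp(10 * beta_D)

round(c(z = z, p = p), 4)
round(ci, 3)        # roughly 1.02 to 1.13
round(or_10min, 2)  # roughly 2
```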

###Answer 6(c) $\text{logit}[\hat{P}(Y = 1)] = 0.04979 + 0.02848D - 4.47224T + 0.07460(D \times T)$

###For tracheal tube (T = 1): $\text{logit}[\hat{P}(Y = 1)] = -4.42245 + 0.10308D$

###For laryngeal mask airway (T = 0): $\text{logit}[\hat{P}(Y = 1)] = 0.04979 + 0.02848D$

###So for each additional minute of surgery, the estimated odds of a sore throat on waking are multiplied by exp(0.10308) = 1.109 (about an 11% increase) with a tracheal tube, and by exp(0.02848) = 1.029 (about a 3% increase) with a laryngeal mask airway.
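
The two device-specific equations follow from plugging T = 0 and T = 1 into the interaction model. The sketch below assumes the interaction-model coefficients quoted in the answer above; the helper `line_for` and the other names are illustrative.

```r
b <- c(intercept = 0.04979, D = 0.02848, T = -4.47224, DT = 0.07460)

# Intercept and slope of the fitted logit line for a given device code (0 or 1)
line_for <- function(device) {
  c(intercept = b[["intercept"]] + b[["T"]] * device,
    slope     = b[["D"]] + b[["DT"]] * device)
}
mask <- line_for(0)
tube <- line_for(1)

rbind(mask, tube)
exp(c(mask = mask[["slope"]], tube = tube[["slope"]]))  # odds multipliers per minute

# Duration at which the two fitted logits are equal
cross_D <- -b[["T"]] / b[["DT"]]
cross_D   # about 60 minutes
```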

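###Answer 6(d) The likelihood-ratio statistic comparing the main effects and interaction models is 1.817 with df = 1, P-value = 0.1777. The data do not indicate a need for an interaction term.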

###Answer 6(e) In the main effects plot of Y versus D the curves are “parallel” in the sense that they have the same shape and never touch. The horizontal distance between the two curves is constant.

###In the interaction model the curves are not parallel. For the laryngeal mask airway the probability of a sore throat rises slowly with the duration of surgery (slope 0.028 on the logit scale), while for the tracheal tube it is lower for short surgeries and rises much more steeply (slope 0.103). The two fitted logits are equal at about D = 4.47/0.0746 = 60 minutes.