[All Students] Using the βLIβ data from pp.113-114, exercise 4.1,
Plot the jittered data (as on p.92) with the GAM and logistic models fits superimposed (and with a legend for the two plots in the top left corner of the plot) β use two distinct colors for the plots
LI = c(8,8,10,10,12,12,12,14,14,14,16,16,16,18,20,20,20,22,22,24,26,28,32,34,38,38,38)
y = c(rep(0,13), rep(1,3), 0,1,0,0,1,1,0,1,1,1,0)
fit <- glm(y ~ LI, family = binomial)
plot(jitter(y, 0.08) ~ LI, pch = 19, col = "purple", xlab = "Percentage of Labeled Cells", ylab = "Remission")
library(gam)
## Loading required package: splines
## Loading required package: foreach
## Loaded gam 1.16
gam.fit <- gam(y ~ s(LI), family = binomial)
curve(predict(gam.fit, data.frame(LI = x), type = "resp"), col = "blue", lty = 4, lwd = 3, add = T)
curve(predict(fit, data.frame(LI = x), type = "resp"), col = "darkgreen", lty = 2, lwd = 3, add = T)
legend("topleft", legend=c("Logistic Fit","GAM Fit"), lty = c(2, 4), lwd = 3, col = c("darkgreen", "blue"))
Showing all needed calculations, find, report and interpret the π³π«ππ β the point on the x axis where the predicted response reaches ππ% β and clearly indicate the units
summary(fit)
##
## Call:
## glm(formula = y ~ LI, family = binomial)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.9448 -0.6465 -0.4947 0.6571 1.6971
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.77714 1.37862 -2.740 0.00615 **
## LI 0.14486 0.05934 2.441 0.01464 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 34.372 on 26 degrees of freedom
## Residual deviance: 26.073 on 25 degrees of freedom
## AIC: 30.073
##
## Number of Fisher Scoring iterations: 4
Using the Logit model:
#log(.75/.25) = -3.77714 + 0.14486*LI
#Solving with algebra
(log(.75/.25)+3.77714)/.14486
## [1] 33.65838
#Checking Answer
predict(fit, data.frame(LI = 33.61875), type = "response")
## 1
## 0.7489423
Holly found a package that is more accurate for some reason.
library(MASS)
dose.p(fit, p = .75)
## Dose SE
## p = 0.75: 33.65763 5.931823
ld75 = 33.65753
predict(fit, data.frame(LI = ld75), type = "response")
## 1
## 0.7499972
Units are βPercentage ofβlabeled" cells.
Showing all needed calculations, find, report and interpret the π³π«ππ β the point on the x axis where the predicted response reaches ππ% β and clearly indicate the units.
(log(.1/.9)+3.7714)/.14486
## [1] 10.86687
dose.p(fit, p = .10)
## Dose SE
## p = 0.1: 10.90626 5.566491
ld10 = 10.90626
predict(fit, data.frame(LI = ld10), type = "response")
## 1
## 0.1
Following the same logic as (b), units are βPercentage ofβlabeled" cells.
[All Students] p.Β 115, exercise 4.8. Show all work in your calculations and attach R programs and output/graph. Be clear and detailed in your interpretation of the AUC in part (d).
crabs <- read.table("http://www.stat.ufl.edu/~aa/cat/data/Crabs.dat", header = T)
fitB <- glm(y ~ weight, family = binomial, data=crabs)
summary(fitB)
##
## Call:
## glm(formula = y ~ weight, family = binomial, data = crabs)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.1108 -1.0749 0.5426 0.9122 1.6285
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -3.6947 0.8802 -4.198 2.70e-05 ***
## weight 1.8151 0.3767 4.819 1.45e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 225.76 on 172 degrees of freedom
## Residual deviance: 195.74 on 171 degrees of freedom
## AIC: 199.74
##
## Number of Fisher Scoring iterations: 4
\(logit(pi) = -3.6947 + 1.8151*weight\)
#i
exp(1.8151)
## [1] 6.14169
This is the estimated multiplicative increase in odds of having a satellite for each 1-kg increase in weight.
#ii
exp(1.8151*.10)
## [1] 1.199027
This is the estimated multiplicative increase in odds of having a satellite for each .10-kg increase in weight.
#iii
exp(1.8151*.58)
## [1] 2.865543
This is the estimated multiplicative increase in odds of having a satellite for each .58-kg increase in weight.
exp(1.8151*.10)
## [1] 1.199027
This is the estimated multiplicative increase in odds of having a satellite for each .10-kg increase in weight.
prop = sum(crabs$y)/nrow(crabs);prop
## [1] 0.6416185
predicted = as.numeric(fitted(fitB) > prop)
t = xtabs(~crabs$y + predicted);addmargins(t)
## predicted
## crabs$y 0 1 Sum
## 0 45 17 62
## 1 43 68 111
## Sum 88 85 173
Sensitivity = 68/111; Sensitivity
## [1] 0.6126126
Specificity = 45/62; Specificity
## [1] 0.7258065
#Correct Total:
(68+45)/173
## [1] 0.6531792
Thus the Model correctly predicts a satellite 0.6126126 and correctly predicts not having a satellite 0.7258065.
#comparing to .5
predicted2 = as.numeric(fitted(fitB) > .5)
t2 = xtabs(~crabs$y + predicted2);addmargins(t2)
## predicted2
## crabs$y 0 1 Sum
## 0 27 35 62
## 1 20 91 111
## Sum 47 126 173
library(pROC)
## Type 'citation("pROC")' for a citation.
##
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
##
## cov, smooth, var
rocplot <- roc(y ~ fitted(fitB), data = crabs)
plot.roc(rocplot, legacy.axes = T, col = "purple", lwd = 3, lty = 1)
auc = auc(rocplot); auc
## Area under the curve: 0.7379
The concordance index 0.7379396 estimates the probability that the model will correctly predict if crabs will have at least one satellite or not. If the model was just guessing, then the probability would be 0.5. Therefore this model seems to be better at predicting than just guessing.
[All Students] pp.Β 116-117, exercise 4.12. Clearly label your answers with these designations:
Fit the binomial logit-link logistic model using R (attach program and output)
df = read.table("http://users.stat.ufl.edu/~aa/cda/data/MBTI.dat", header = TRUE)
fitC = glm(drink/n ~ EI + SN + TF + JP , family = binomial, weights = n, data=df)
summary(fitC)
##
## Call:
## glm(formula = drink/n ~ EI + SN + TF + JP, family = binomial,
## data = df, weights = n)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.2712 -0.8062 -0.1063 0.1124 1.5807
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.1140 0.2715 -7.788 6.82e-15 ***
## EIi -0.5550 0.2170 -2.558 0.01053 *
## SNs -0.4292 0.2340 -1.834 0.06664 .
## TFt 0.6873 0.2206 3.116 0.00184 **
## JPp 0.2022 0.2266 0.893 0.37209
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 30.488 on 15 degrees of freedom
## Residual deviance: 11.149 on 11 degrees of freedom
## AIC: 73.99
##
## Number of Fisher Scoring iterations: 4
Report the fitted or prediction equation and clearly indicate how you (or R) set up the 4 variables (c) Clearly/thoroughly explain why the highest predicted probability is associated with the ENTP run Add the following parts:
\(logit(pi) = -2.114 - .555*EI - .4292* SN + .6873*TF + .2022*JP\) where pi is the probability of reporting drinking alcohol frequenty, 1 for EI is I, 1 for SN is S, 1 for TF is T, and 1 for JP is P. Since the Estimate for the Betas corresponding to EI and SN are negative, with those βoffβ (AKA NOT introverted and NOT Sensing), and with values of 1 for TF (Thinking) and JP (Perceiving) we get the maximum value for our logit response (hence ENTP is the highest value for the logit).
Using the prediction equation, give the predicted probability associated with this ENTP run showing all relevant work
predict(fitC, data.frame(EI = "e", SN = "n", TF = "t", JP = "p"), type = "response")
## 1
## 0.2271486
Using R (and providing your R program and output), provide a table (i.e., the R output) with each of the 16 cases here as rows β include the raw data, the predicted probabilities (also called fitted values), and the 95% confidence intervals for the predicted probabilities. Hint: see p.96.
untransform <- function(x) {
exp(x)/ (1 + exp(x))
}
pred.prob = fitted(fitC)
lp = predict(fitC, se.fit = TRUE)
LB = lp$fit-qnorm(.975)*lp$se.fit
UB = lp$fit+qnorm(.975)*lp$se.fit
LB.p = untransform(LB)
UB.p = untransform(UB)
cbind(as.character(df$EI), as.character(df$SN), as.character(df$TF), as.character(df$JP), pred.prob, LB.p, UB.p)
## pred.prob LB.p
## 1 "e" "s" "t" "j" "0.135185995617697" "0.0931172482722204"
## 2 "e" "s" "t" "p" "0.160618495610535" "0.104683668260431"
## 3 "e" "s" "f" "j" "0.0728847929649104" "0.0475781110856337"
## 4 "e" "s" "f" "p" "0.0877863430939439" "0.056945742765649"
## 5 "e" "n" "t" "j" "0.193611499337586" "0.122230341820276"
## 6 "e" "n" "t" "p" "0.227148569323643" "0.150430654325489"
## 7 "e" "n" "f" "j" "0.107739004955063" "0.0662299916180012"
## 8 "e" "n" "f" "p" "0.128776805961202" "0.087368053391938"
## 9 "i" "s" "t" "j" "0.0823472222532527" "0.0557651787705972"
## 10 "i" "s" "t" "p" "0.0989768623938524" "0.0634001298871951"
## 11 "i" "s" "f" "j" "0.0431811803063737" "0.0272291405566716"
## 12 "i" "s" "f" "p" "0.0523526558573392" "0.032964391018237"
## 13 "i" "n" "t" "j" "0.121135227579985" "0.0727636218369248"
## 14 "i" "n" "t" "p" "0.144365624878002" "0.0911440303870178"
## 15 "i" "n" "f" "j" "0.0648240222100406" "0.0376315312815638"
## 16 "i" "n" "f" "p" "0.0782165584699526" "0.0500686110443195"
## UB.p
## 1 "0.192231981663945"
## 2 "0.238479683006856"
## 3 "0.11009605188184"
## 4 "0.132974420092835"
## 5 "0.292774426605533"
## 6 "0.327891347602674"
## 7 "0.17051280884275"
## 8 "0.185815618486869"
## 9 "0.119990147919009"
## 10 "0.15129251895222"
## 11 "0.0678270189260874"
## 12 "0.0821752633991756"
## 13 "0.194903955094926"
## 14 "0.221104337446394"
## 15 "0.109431440893027"
## 16 "0.120186799360624"
#When I thought I was being clever...:
#dat = expand.grid(EI = c("e","i"), SN = c("s","n"), TF = c("t","f"), JP = c("j", "p"))
#predict(fitC, data.frame(dat), type = "response")
[All Students] pp.Β 117-118, exercise 4.14. Using these data (accessed via the Infection data file on the text website), fit the related logistic model(s) to do the following:
D = read.table("http://users.stat.ufl.edu/~aa/cda/data/Infection.dat", header = TRUE)
fitD = glm(y/n ~ factor(center) + treat, family = binomial, weights = n, data=D)
summary(fitD)
##
## Call:
## glm(formula = y/n ~ factor(center) + treat, family = binomial,
## data = D, weights = n)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.87919 -0.77729 -0.00401 0.47293 0.92934
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.3220 0.3165 -4.177 2.95e-05 ***
## factor(center)2 2.0554 0.4201 4.893 9.94e-07 ***
## factor(center)3 1.1529 0.4246 2.715 0.00662 **
## factor(center)4 -1.4185 0.6636 -2.138 0.03255 *
## factor(center)5 -0.5199 0.5338 -0.974 0.33007
## factor(center)6 -2.1469 1.0614 -2.023 0.04310 *
## factor(center)7 -0.7977 0.8149 -0.979 0.32764
## factor(center)8 2.2079 0.7195 3.069 0.00215 **
## treat 0.7769 0.3067 2.533 0.01130 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 93.5545 on 15 degrees of freedom
## Residual deviance: 9.7463 on 7 degrees of freedom
## AIC: 66.136
##
## Number of Fisher Scoring iterations: 4
Using the likelihood-based test, test whether one can accept that the odds ratios are equal across the 8 centers. Here, βoddsβ refers to the Success as response comparing Drug and Control. Clearly give the hypotheses, test statistic, degrees of freedom, p-value, and your clear conclusion.
\(H_0:\theta_2 = \theta_3 =...= \theta_8\) \(H_a:\) At least one \(\theta_i \neq \theta_j\) Where \(i\neq j\) and \(\theta\) is the odds ratios of success at each center. Note \(\theta_2\) refers to center 2 and \(\alpha\) absorbs center 1 and \(\theta_1\) corresponds to treatment.
fit.red = glm(y/n ~ treat, family = binomial, weights = n, data=D)
an = anova(fitD, fit.red, test = "LRT"); an
## Analysis of Deviance Table
##
## Model 1: y/n ~ factor(center) + treat
## Model 2: y/n ~ treat
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 7 9.746
## 2 14 90.960 -7 -81.214 7.788e-15 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
stat = an$Deviance[2]
df = an$Df[2]
With a difference in Deviance of -81.2139004 and a corresponding p-value of 7.788215110^{-15} < \(\alpha = 0.05\) we reject tjhe null and conclude that at least one odds ratio of success at a center is different.
In the logistic model involving only center (as a factor) and treatment/drug (as a dummy variable), obtain and clearly interpret the estimated odds ratio associated with (i) center 2
OR.Center2 <- exp(coefficients(fitD)[2]); OR.Center2
## factor(center)2
## 7.81023
Controlling for treatment/drug, the estimated odds of success at center 2 is 7.8102302 times the estimated odds for the other centers.
OR.Center4 <- exp(coefficients(fitD)[4]); OR.Center4
## factor(center)4
## 0.2420881
Controlling for treatment/drug, the estimated odds of success at center 4 is 0.2420881 times the estimated odds for the other centers.
OR.treat <- exp(coefficients(fitD)[9]); OR.treat
## treat
## 2.174764
Controlling for center, the estimated odds of success at for treatment is 2.1747643 times the estimated odds of control.
In the logistic model involving only center (as a factor) and treatment/drug (as a dummy variable), use the likelihood-based test to test whether treatment/drug is significant in this model. Give the hypotheses, test statistic, p-value, and your clear conclusion.
\(H_0:\beta_{treat} =0\) \(H_a:\beta_{treat} \neq 0\)
fit.null = glm(y/n ~ factor(center), family = binomial, weights = n, data=D)
an = anova(fitD, fit.null, test = "LRT"); an
## Analysis of Deviance Table
##
## Model 1: y/n ~ factor(center) + treat
## Model 2: y/n ~ factor(center)
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 7 9.7463
## 2 8 16.4151 -1 -6.6688 0.009811 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
stat = an$Deviance[2]
df = an$Df[2]
With a difference in Deviance of -6.6688176 with a corresponding p-value of 0.0098114 < $= 0.05 $ we reject tjhe null and conclude that treatment/drug is significant in this model.
[STAT-410 students only] pp.Β 118-119, exercise 4.16. Perform the likelihood test in part (a), clearly giving the hypotheses, test statistic, p-value and your clear answer. Add the following parts:
Input = ("
D t Y
45 0 0
15 0 0
40 0 1
83 1 1
90 1 1
25 1 1
35 0 1
65 0 1
95 0 1
35 0 1
75 0 1
45 1 1
50 1 0
75 1 1
30 0 0
25 0 1
20 1 0
60 1 1
70 1 1
30 0 1
60 0 1
61 0 0
65 0 1
15 1 0
20 1 0
45 0 1
15 1 0
25 0 1
15 1 0
30 0 1
40 0 1
15 1 0
135 1 1
20 1 0
40 1 0
")
dfE = read.table(textConnection(Input),header=TRUE)
fitE = glm(Y ~ D + t + D*t, family = binomial, data=dfE)
summary(fitE)
##
## Call:
## glm(formula = Y ~ D + t + D * t, family = binomial, data = dfE)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.9707 -0.3779 0.3448 0.7292 1.9961
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 0.04979 1.46940 0.034 0.9730
## D 0.02848 0.03429 0.831 0.4062
## t -4.47224 2.46707 -1.813 0.0699 .
## D:t 0.07460 0.05777 1.291 0.1966
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 46.180 on 34 degrees of freedom
## Residual deviance: 28.321 on 31 degrees of freedom
## AIC: 36.321
##
## Number of Fisher Scoring iterations: 6
When t = 1: \(logit(pi) = 0.04979 + 0.02848*D - 4.47224* (1) + 0.07460*D*(1)\) \(logit(pi) = -4.42245 + 0.10308*D\)
exp(0.10308)
## [1] 1.10858
So when using a tracheal tube, the estimated odds of having a sore throat on waking from surgery increases by a multiplicative factor of 1.10858 for each minute increasing in duration of surgery.
When t = 0: \(logit(pi) = 0.04979 + 0.02848*D - 4.47224* (0) + 0.07460*D*(0)\) \(logit(pi) = 0.04979 + 0.02848*D\)
exp(0.02848)
## [1] 1.028889
So when using a laryngeal mask, the estimated odds of having a sore throat on waking from surgery increases by a multiplicative factor of 1.028889 for each minute increasing in duration of surgery.
\(H_0:\beta_{interaction} =0\) \(H_a:\beta_{interaction} \neq 0\)
fitE2 = glm(Y ~ D + t, family = binomial, data=dfE)
aovE = anova(fitE, fitE2, test = "LRT"); aovE
## Analysis of Deviance Table
##
## Model 1: Y ~ D + t + D * t
## Model 2: Y ~ D + t
## Resid. Df Resid. Dev Df Deviance Pr(>Chi)
## 1 31 28.321
## 2 32 30.138 -1 -1.8169 0.1777
With a difference in deviance of NA, -1.8168856 and since the p-value 0.1776844 > \(\alpha = .05\), we fail-to-reject the null and cannot conclude the interaction term is significant in this model. So we can remove it.
cor(dfE$Y, fitted(fitE))
## [1] 0.6598764
cor(dfE$Y, fitted(fitE2))
## [1] 0.6528899
Since these are very close, the simpler model (without interaction) does essentially as well as the full model.
In the model with no interaction term (between π and π), obtain and clearly interpret the odds ratios for both π (duration) and π (type of device)
exp(fitE2$coefficients)
## (Intercept) D t
## 0.2423575 1.0710910 0.1903389
The estimated odds of having a sore throat after waking up from surgery increases by a multiplicative factor of 1.071091 for each additional minute of surgery (duration), given type of device t stays constant.
The estimated odds of having a sore throat after waking up from surgery increases by a multiplicative factor of 0.1903389 when changing from laryngeal mask to a tracheal tube, given duration of surgery d stays constant.
For the model with the π Γ π interaction term, using R obtain and submit the fitted logistic curves for the π = π and π = π cases juxtaposed on the same plot
#range(dfE$D) Use this to find range of xx
xx <- seq(15, 135, length = 1000)
yy1 = predict(fitE, data.frame(D = xx, t = 0), type = "response")
yy2 = predict(fitE, data.frame(D = xx, t = 1), type = "response")
xrange = range(xx)
yrange = c(0,1)
plot(xx, yy1, xlim = xrange, ylim = yrange, col = "blue", xlab = "Duration of Surgery", ylab = "Percent Sore Throat")
par(new = T)
plot(xx, yy2, xlim = xrange, ylim = yrange, col = "red", xlab = "", ylab = "")
legend("bottomright", lwd = 3, lty = c(1, 1), col = c("blue", "red"), legend = c("t = 0 Laryngeal Mask", "t = 1 Tracheal Tube"))
For the model without the π Γ π interaction term, using R obtain and submit the fitted logistic curves for the π = π and π = π cases juxtaposed on the same plot.
xx <- seq(15, 135, length = 1000)
yy1 = predict(fitE2, data.frame(D = xx, t = 0), type = "response")
yy2 = predict(fitE2, data.frame(D = xx, t = 1), type = "response")
xrange = range(xx)
yrange = c(0,1)
plot(xx, yy1, xlim = xrange, ylim = yrange, col = "blue", xlab = "Duration of Surgery", ylab = "Percent Sore Throat")
par(new = T)
plot(xx, yy2, xlim = xrange, ylim = yrange, col = "red", xlab = "", ylab = "")
legend("bottomright", lwd = 3, lty = c(1, 1), col = c("blue", "red"),
legend = c("t = 0 Laryngeal Mask", "t = 1 Tracheal Tube"))
Figuring it out using curve function instead of plot:
curve(predict(fitE2, data.frame(D = x, t = 0), type = "resp"), col = "blue", lty = 4, lwd = 3, xlim = xrange, ylim = yrange, xlab = "Duration of Surgery", ylab = "Percent Sore Throat")
par(new = T)
curve(predict(fitE2, data.frame(D = x, t = 1), type = "resp"), col = "red", lty = 1, lwd = 3, xlim = xrange, ylim = yrange, xlab = "", ylab = "")
legend("bottomright", lwd = 3, lty = c(4, 1), col = c("blue", "red"), legend = c("t = 0 Laryngeal Mask", "t = 1 Tracheal Tube"))
For the model in part (e), give the estimated πΏπ·50s for π in both the π = π and π = π cases. F.
Using on the plots (since this is estimated) the LD50 for t = 0 (Laryngeal Mask) is about 23 minutes of surgery, while the LD50 for t = 1 (Tracheal Tube) is about 44 minutes of surgery.
Now to calculate them:
summary(fitE2)
##
## Call:
## glm(formula = Y ~ D + t, family = binomial, data = dfE)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.3802 -0.5358 0.3047 0.7308 1.7821
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.41734 1.09457 -1.295 0.19536
## D 0.06868 0.02641 2.600 0.00931 **
## t -1.65895 0.92285 -1.798 0.07224 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 46.180 on 34 degrees of freedom
## Residual deviance: 30.138 on 32 degrees of freedom
## AIC: 36.138
##
## Number of Fisher Scoring iterations: 5
\(logit(pi) = -1.41734 + 0.06868*D - 1.65895*t\)
t = 0: \(logit(pi) = -1.41734 + 0.06868*D\) ld50 for t = 0
(log(.5/.5) + 1.41734)/ 0.06868
## [1] 20.63687
#Note log(1)=0, so ld50 is just alpha/beta
t = 1: \(logit(pi) = -1.41734 + 0.06868*D - 1.65895*1\) \(\iff logit(pi) = -3.07529 + 0.06868*D\) ld50 for t = 1
(log(.5/.5) + 3.07629)/ 0.06868
## [1] 44.79164
These are about the same as my estimations from the graph.
[STAT-410 students only] p.Β 120, exercise 4.22. Using the reported data, obtain and report both logistic model estimated parameters (πΌΜ and π½). Show all relevant calculations.
Model: \(logit(pi) = \alpha + \beta*x\)
With x = 18, pi = 80: \(log(.80/.20) = \alpha + \beta*18\)
and with x = 65, pi = 20: \(log(.20/.80) = \alpha + \beta*65\)
So solving the system using elimination. I multiply the second equation by -1 and add them together: \(log(.80/.20) = \alpha + \beta*18\) \(+ - log(.20/.80) = -\alpha - \beta*65\)
Yields: \(log(.80/.20) - log(.20/.80) = -47\beta\)
Solving for \(\beta\):
(log(.8/.2)-log(.2/.8))/47
## [1] 0.05899125
So \(\beta \approx 0.05899125\)