library(faraway)
The dataset wbca comes from a study of breast cancer in Wisconsin. There are 681 cases of potentially cancerous tumors, of which 238 are actually malignant. Whether a tumor is really malignant is traditionally determined by an invasive surgical procedure. The purpose of this study was to determine whether a new procedure called fine needle aspiration, which draws only a small sample of tissue, could be effective in determining tumor status.
Fit a binomial regression with Class as the response and the other nine variables as predictors. Report the residual deviance and associated degrees of freedom. Can this information be used to determine if this model fits the data?
Use AIC as the criterion to determine the best subset of variables. (Use the step function.)
data(wbca, package="faraway")
head(wbca)
##   Class Adhes BNucl Chrom Epith Mitos NNucl Thick UShap USize
## 1     1     1     1     3     2     1     1     5     1     1
## 2     1     5    10     3     7     1     2     5     4     4
## 3     1     1     2     3     2     1     1     3     1     1
## 4     1     1     4     3     3     1     7     6     8     8
## 5     1     3     1     3     2     1     1     4     1     1
## 6     0     8    10     9     7     1     7     8    10    10
glm_mod <- glm(Class ~ ., family="binomial", data = wbca)
glm_mod_2 <- step(glm_mod, direction="backward")
## Start: AIC=109.46
## Class ~ Adhes + BNucl + Chrom + Epith + Mitos + NNucl + Thick +
## UShap + USize
##
##         Df Deviance    AIC
## - USize  1   89.523 107.52
## - Epith  1   89.613 107.61
## - UShap  1   90.627 108.63
## <none>       89.464 109.46
## - Mitos  1   93.551 111.55
## - NNucl  1   95.204 113.20
## - Adhes  1   98.844 116.84
## - Chrom  1   99.841 117.84
## - BNucl  1  109.000 127.00
## - Thick  1  110.239 128.24
##
## Step: AIC=107.52
## Class ~ Adhes + BNucl + Chrom + Epith + Mitos + NNucl + Thick +
## UShap
##
##         Df Deviance    AIC
## - Epith  1   89.662 105.66
## - UShap  1   91.355 107.36
## <none>       89.523 107.52
## - Mitos  1   93.552 109.55
## - NNucl  1   95.231 111.23
## - Adhes  1   99.042 115.04
## - Chrom  1  100.153 116.15
## - BNucl  1  109.064 125.06
## - Thick  1  110.465 126.47
##
## Step: AIC=105.66
## Class ~ Adhes + BNucl + Chrom + Mitos + NNucl + Thick + UShap
##
##         Df Deviance    AIC
## <none>       89.662 105.66
## - UShap  1   91.884 105.88
## - Mitos  1   93.714 107.71
## - NNucl  1   95.853 109.85
## - Adhes  1  100.126 114.13
## - Chrom  1  100.844 114.84
## - BNucl  1  109.762 123.76
## - Thick  1  110.632 124.63
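The AIC values in the trace can be reproduced by hand. For an ungrouped binary response the saturated model has log-likelihood zero, so the residual deviance equals -2 times the log-likelihood and AIC = deviance + 2p, where p counts the estimated coefficients including the intercept. A small check in base R, using only deviances printed in the trace (the helper name `aic_from_deviance` is illustrative and not part of any package):

```r
# Hypothetical helper: AIC of a binary-response GLM computed from its
# residual deviance and its number of coefficients (intercept included).
aic_from_deviance <- function(deviance, n_coef) deviance + 2 * n_coef

# Deviances copied from the step() trace
aic_full     <- aic_from_deviance(89.464, 10)  # all nine predictors
aic_no_usize <- aic_from_deviance(89.523, 9)   # after dropping USize
aic_final    <- aic_from_deviance(89.662, 8)   # after also dropping Epith

print(round(c(full = aic_full, no_USize = aic_no_usize, final = aic_final), 2))
```

The three values round to 109.46, 107.52 and 105.66, matching the `Start` and `Step` headers. Each single-term deletion lowers the penalty by 2, so a drop is worthwhile only when it raises the deviance by less than 2.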
glm_mod_2
##
## Call: glm(formula = Class ~ Adhes + BNucl + Chrom + Mitos + NNucl +
## Thick + UShap, family = "binomial", data = wbca)
##
## Coefficients:
## (Intercept)        Adhes        BNucl        Chrom        Mitos
##     11.0333      -0.3984      -0.4192      -0.5679      -0.6456
##       NNucl        Thick        UShap
##     -0.2915      -0.6216      -0.2541
##
## Degrees of Freedom: 680 Total (i.e. Null); 673 Residual
## Null Deviance: 881.4
## Residual Deviance: 89.66 AIC: 105.7
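The full model (the `<none>` row of the first step table) has a residual deviance of 89.46 on 681 - 10 = 671 degrees of freedom. The model selected by AIC drops USize and Epith; its residual deviance is 89.66 on 673 degrees of freedom, and its AIC is 105.66. Because the response is binary, the residual deviance does not follow a chi-squared distribution even approximately, so it cannot be used to judge whether either model fits the data.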
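The residual deviance and degrees of freedom requested for the full model can also be read from the fitted objects rather than from the step trace. A minimal sketch, assuming the faraway package is installed and reusing the object names from above; `trace = 0` only suppresses the step printout:

```r
# Assumes the faraway package is installed; refits the two models from above.
data(wbca, package = "faraway")
glm_mod   <- glm(Class ~ ., family = "binomial", data = wbca)
glm_mod_2 <- step(glm_mod, direction = "backward", trace = 0)

# Residual deviance, residual degrees of freedom and AIC for each model
fit_summary <- data.frame(
  model    = c("full", "AIC-selected"),
  deviance = c(deviance(glm_mod), deviance(glm_mod_2)),
  df       = c(df.residual(glm_mod), df.residual(glm_mod_2)),
  AIC      = c(AIC(glm_mod), AIC(glm_mod_2))
)
print(fit_summary)
```

These numbers are useful for comparing the two nested models, but with a 0/1 response they should not be referred to a chi-squared distribution as a goodness-of-fit test.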