Access ‘BinarayLogistic_1.sav’ SPSS data file. This file contains data from a survey of home owners conducted by an electricity company about an offer of roof solar panels with a 50% subsidy from the state government as part of the state’s environmental policy. The variables involve household income measured in units of a thousand dollars, age, monthly mortgage, size of family household, and whether the householder would take or decline the offer. Please follow the instructions below and conduct a logistic regression to determine whether family size and monthly mortgage will predict taking or declining the offer.
# Load the data set frist
library(haven)
BinarayLogistic_1 <- read_sav("Desktop/BinarayLogistic_1.sav")
# Make a new copy
test_data <- BinarayLogistic_1
# Check the data
summary(test_data)
## income age takeoffer Mortgage
## Min. : 29.0 Min. :23.00 Min. :0.0000 Min. :1500
## 1st Qu.: 47.0 1st Qu.:28.00 1st Qu.:0.0000 1st Qu.:1825
## Median : 66.5 Median :36.00 Median :1.0000 Median :2100
## Mean : 72.9 Mean :37.67 Mean :0.5333 Mean :2180
## 3rd Qu.: 95.0 3rd Qu.:46.00 3rd Qu.:1.0000 3rd Qu.:2400
## Max. :132.0 Max. :58.00 Max. :1.0000 Max. :3400
## Famsize
## Min. :1.000
## 1st Qu.:2.000
## Median :3.500
## Mean :3.533
## 3rd Qu.:4.750
## Max. :6.000
# Run a full model
mylogit <- glm(takeoffer ~ Mortgage + Famsize, data = test_data, family = "binomial")
# Check the model
summary(mylogit)
##
## Call:
## glm(formula = takeoffer ~ Mortgage + Famsize, family = "binomial",
## data = test_data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.26944 -0.24356 0.00234 0.39237 1.57783
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -18.627293 8.653725 -2.153 0.0314 *
## Mortgage 0.005013 0.002813 1.782 0.0747 .
## Famsize 2.398534 0.962076 2.493 0.0127 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 41.455 on 29 degrees of freedom
## Residual deviance: 17.359 on 27 degrees of freedom
## AIC: 23.359
##
## Number of Fisher Scoring iterations: 7
# Load the package aod to use the wald.test function
library(aod)
# Run a wald.test
wald.test(b=coef(mylogit),Sigma=vcov(mylogit), Terms=2:3)
## Wald test:
## ----------
##
## Chi-squared test:
## X2 = 6.2, df = 2, P(> X2) = 0.044
# The chi-squared test statistic of 6.2, with three degrees of freedom is associated with a p-value of 0.044 indicating that the data fits the model well (p<.001).
# Calculate the R^2
library(DescTools)
# Calculate Cox and Snell’s R^2
PseudoR2(mylogit, which = "CoxSnell")
## CoxSnell
## 0.552113
# The Cox and Snell’s R-Square indicates that 55.2% of the variation in the dependent variable is explained by the logistic model.
# Calculate Nagelkerke ’s R^2
PseudoR2(mylogit, which = "Nagelkerke")
## Nagelkerke
## 0.7372461
# The Nagelkerke R-Square indicates a strong relationship of 73.7% between the predictors and the prediction.
# Add a predict column
test_data$offerP <- predict(mylogit, newdata = test_data, type = "response")
# Recode the predict possibility using a cut point 50%
test_data$offerP[test_data$offerP <= 0.5] <- 0
test_data$offerP[test_data$offerP > 0.5] <- 1
# To see the crosstable, we need CrossTable function from gmodels package
library(gmodels)
## Registered S3 method overwritten by 'gdata':
## method from
## reorder.factor DescTools
# Build the crosstable
CrossTable(test_data$takeoffer, test_data$offerP)
##
##
## Cell Contents
## |-------------------------|
## | N |
## | Chi-square contribution |
## | N / Row Total |
## | N / Col Total |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 30
##
##
## | test_data$offerP
## test_data$takeoffer | 0 | 1 | Row Total |
## --------------------|-----------|-----------|-----------|
## 0 | 13 | 1 | 14 |
## | 5.143 | 5.143 | |
## | 0.929 | 0.071 | 0.467 |
## | 0.867 | 0.067 | |
## | 0.433 | 0.033 | |
## --------------------|-----------|-----------|-----------|
## 1 | 2 | 14 | 16 |
## | 4.500 | 4.500 | |
## | 0.125 | 0.875 | 0.533 |
## | 0.133 | 0.933 | |
## | 0.067 | 0.467 | |
## --------------------|-----------|-----------|-----------|
## Column Total | 15 | 15 | 30 |
## | 0.500 | 0.500 | |
## --------------------|-----------|-----------|-----------|
##
##
# Interpret the B and exp(B) values and make your conclusion
summary(mylogit)
##
## Call:
## glm(formula = takeoffer ~ Mortgage + Famsize, family = "binomial",
## data = test_data)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.26944 -0.24356 0.00234 0.39237 1.57783
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -18.627293 8.653725 -2.153 0.0314 *
## Mortgage 0.005013 0.002813 1.782 0.0747 .
## Famsize 2.398534 0.962076 2.493 0.0127 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 41.455 on 29 degrees of freedom
## Residual deviance: 17.359 on 27 degrees of freedom
## AIC: 23.359
##
## Number of Fisher Scoring iterations: 7
## odds ratios only
exp(coef(mylogit))
## (Intercept) Mortgage Famsize
## 8.133346e-09 1.005025e+00 1.100703e+01
The number of people in the household is significant (p < 0.05), showing that the variable makes a significant contribution.
The monthly mortage payment is not significant (p > 0.05), showing that the variable does not makes a significant contribution.
For every one unit change in the monthly mortage, the log odds of taking an offer (versus declining an offer) increases by 0.005.
For every one unit change in the number of persons in household, the log odds of taking an offer (versus declining an offer) increases by 2.399.
The Exp(B) value assosiated with monthly mortage is 1.005. Hence when monthly mortage is raised by one unit, the odds ratio is 1.0 times as high and therefore people who got 1 point higher monthly mortage are 1.0 times likely to belong to the take offer group.
The Exp(B) value assosiated with number of persons in household is 11.17. Hence when number of people in the household is raised by one unit, the odds ratio is 11.17 times as high and therefore people who got 1 point higher monthly mortage are 11.17 times likely to belong to the take offer group.
For question 4, please check the summary table to get the target values.
For question 5, The probability that the householder will get the offer is of 72.5%.