Access ‘BinarayLogistic_1.sav’ SPSS data file. This file contains data from a survey of home owners conducted by an electricity company about an offer of roof solar panels with a 50% subsidy from the state government as part of the state’s environmental policy. The variables involve household income measured in units of a thousand dollars, age, monthly mortgage, size of family household, and whether the householder would take or decline the offer. Please follow the instructions below and conduct a logistic regression to determine whether family size and monthly mortgage will predict taking or declining the offer.

  1. Test a full model using a chi-square test and R^2 .
  2. Provide a classification table and interpret the percentages.
  3. Interpret the B and exp(B) values and make your conclusion.
  4. Please provide a predictive equation ( e.g., )
  5. Please answer the below question. • Assume that a householder whose household size including themselves was 4 and paying a monthly mortgage of $2,000. Would they take up the offer?
# Load the data set frist
library(haven)
BinarayLogistic_1 <- read_sav("Desktop/BinarayLogistic_1.sav")
# Make a new copy
test_data <- BinarayLogistic_1
# Check the data
summary(test_data)
##      income           age          takeoffer         Mortgage   
##  Min.   : 29.0   Min.   :23.00   Min.   :0.0000   Min.   :1500  
##  1st Qu.: 47.0   1st Qu.:28.00   1st Qu.:0.0000   1st Qu.:1825  
##  Median : 66.5   Median :36.00   Median :1.0000   Median :2100  
##  Mean   : 72.9   Mean   :37.67   Mean   :0.5333   Mean   :2180  
##  3rd Qu.: 95.0   3rd Qu.:46.00   3rd Qu.:1.0000   3rd Qu.:2400  
##  Max.   :132.0   Max.   :58.00   Max.   :1.0000   Max.   :3400  
##     Famsize     
##  Min.   :1.000  
##  1st Qu.:2.000  
##  Median :3.500  
##  Mean   :3.533  
##  3rd Qu.:4.750  
##  Max.   :6.000
# Run a full model
mylogit <- glm(takeoffer ~ Mortgage + Famsize, data = test_data, family = "binomial")
# Check the model
summary(mylogit)
## 
## Call:
## glm(formula = takeoffer ~ Mortgage + Famsize, family = "binomial", 
##     data = test_data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.26944  -0.24356   0.00234   0.39237   1.57783  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -18.627293   8.653725  -2.153   0.0314 *
## Mortgage      0.005013   0.002813   1.782   0.0747 .
## Famsize       2.398534   0.962076   2.493   0.0127 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 41.455  on 29  degrees of freedom
## Residual deviance: 17.359  on 27  degrees of freedom
## AIC: 23.359
## 
## Number of Fisher Scoring iterations: 7
# Load the package aod to use the wald.test function
library(aod)
# Run a wald.test
wald.test(b=coef(mylogit),Sigma=vcov(mylogit), Terms=2:3)
## Wald test:
## ----------
## 
## Chi-squared test:
## X2 = 6.2, df = 2, P(> X2) = 0.044
# The chi-squared test statistic of 6.2, with three degrees of freedom is associated with a p-value of 0.044 indicating that the data fits the model well (p<.001).
# Calculate the R^2
library(DescTools)
# Calculate Cox and Snell’s R^2 
PseudoR2(mylogit, which = "CoxSnell")
## CoxSnell 
## 0.552113
# The Cox and Snell’s R-Square indicates that 55.2% of the variation in the dependent variable is explained by the logistic model.
# Calculate Nagelkerke ’s R^2 
PseudoR2(mylogit, which = "Nagelkerke")
## Nagelkerke 
##  0.7372461
# The Nagelkerke R-Square indicates a strong relationship of 73.7% between the predictors and the prediction.
# Add a predict column
test_data$offerP <- predict(mylogit, newdata = test_data, type = "response")
# Recode the predict possibility using a cut point 50%
test_data$offerP[test_data$offerP <= 0.5] <- 0
test_data$offerP[test_data$offerP > 0.5] <- 1
# To see the crosstable, we need CrossTable function from gmodels package
library(gmodels)
## Registered S3 method overwritten by 'gdata':
##   method         from     
##   reorder.factor DescTools
# Build the crosstable
CrossTable(test_data$takeoffer, test_data$offerP)
## 
##  
##    Cell Contents
## |-------------------------|
## |                       N |
## | Chi-square contribution |
## |           N / Row Total |
## |           N / Col Total |
## |         N / Table Total |
## |-------------------------|
## 
##  
## Total Observations in Table:  30 
## 
##  
##                     | test_data$offerP 
## test_data$takeoffer |         0 |         1 | Row Total | 
## --------------------|-----------|-----------|-----------|
##                   0 |        13 |         1 |        14 | 
##                     |     5.143 |     5.143 |           | 
##                     |     0.929 |     0.071 |     0.467 | 
##                     |     0.867 |     0.067 |           | 
##                     |     0.433 |     0.033 |           | 
## --------------------|-----------|-----------|-----------|
##                   1 |         2 |        14 |        16 | 
##                     |     4.500 |     4.500 |           | 
##                     |     0.125 |     0.875 |     0.533 | 
##                     |     0.133 |     0.933 |           | 
##                     |     0.067 |     0.467 |           | 
## --------------------|-----------|-----------|-----------|
##        Column Total |        15 |        15 |        30 | 
##                     |     0.500 |     0.500 |           | 
## --------------------|-----------|-----------|-----------|
## 
## 
# Interpret the B and exp(B) values and make your conclusion
summary(mylogit)
## 
## Call:
## glm(formula = takeoffer ~ Mortgage + Famsize, family = "binomial", 
##     data = test_data)
## 
## Deviance Residuals: 
##      Min        1Q    Median        3Q       Max  
## -2.26944  -0.24356   0.00234   0.39237   1.57783  
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)  
## (Intercept) -18.627293   8.653725  -2.153   0.0314 *
## Mortgage      0.005013   0.002813   1.782   0.0747 .
## Famsize       2.398534   0.962076   2.493   0.0127 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 41.455  on 29  degrees of freedom
## Residual deviance: 17.359  on 27  degrees of freedom
## AIC: 23.359
## 
## Number of Fisher Scoring iterations: 7
## odds ratios only
exp(coef(mylogit))
##  (Intercept)     Mortgage      Famsize 
## 8.133346e-09 1.005025e+00 1.100703e+01

For question 4, please check the summary table to get the target values.

For question 5, The probability that the householder will get the offer is of 72.5%.