Promotional Strategy At Simmons

Project Objectives

Make business recommendations for the Simmons store catalog promotion based on a logistic regression model.

Questions 1 & 2: Develop the Model & Assess Predictor Significance

Step 1: Install and load required libraries

#install.packages("readxl")
#install.packages("Hmisc")
#install.packages("pscl")
#if(!require(pROC)) install.packages("pROC")
  
# load the libraries
library(readxl) #allows us to import excel files
library(Hmisc) #allows us to call the correlation function
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:base':
## 
##     format.pval, units
library(pscl) #provides pR2() to compute pseudo R-squared measures for evaluating our model
## Classes and Methods for R originally developed in the
## Political Science Computational Laboratory
## Department of Political Science
## Stanford University (2002-2015),
## by and under the direction of Simon Jackman.
## hurdle and zeroinfl functions by Achim Zeileis.
library(pROC) #provides roc() and auc() to plot the ROC curve and get the area under the curve (AUC) score
## Type 'citation("pROC")' for a citation.
## 
## Attaching package: 'pROC'
## The following objects are masked from 'package:stats':
## 
##     cov, smooth, var
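
Before loading, it can help to check which of the four packages are already installed. The following is a minimal sketch; the names `pkgs` and `installed` are illustrative and not part of the original analysis.

```r
# Packages used in this analysis
pkgs <- c("readxl", "Hmisc", "pscl", "pROC")

# TRUE where the package is already installed, FALSE otherwise
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Install only what is missing (left commented out, like the install lines above)
# install.packages(pkgs[!installed])
```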

Step 2 : Import and explore dataset

# opens a file picker; select the Simmons Excel file
simmons_df <- read_excel(file.choose())
head(simmons_df)
## # A tibble: 6 × 4
##   Customer Spending  Card Coupon
##      <dbl>    <dbl> <dbl>  <dbl>
## 1        1     2.29     1      0
## 2        2     3.22     1      0
## 3        3     2.13     1      0
## 4        4     3.92     0      0
## 5        5     2.53     1      0
## 6        6     2.47     0      1
# Standard deviation of each column using the sd function
sapply(simmons_df, sd)
##   Customer   Spending       Card     Coupon 
## 29.0114920  1.7412979  0.5025189  0.4923660
Standard deviations (rounded): Customer 29.01, Spending 1.74, Card 0.50, Coupon 0.49
# Cross tabulation of coupon and card
xtabs(~Coupon + Card, data = simmons_df) 
##       Card
## Coupon  0  1
##      0 36 24
##      1 14 26
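
The cross tabulation can be read more easily as proportions within each Card group, along with an unadjusted odds ratio. The sketch below re-enters the four printed counts by hand so that it runs without the Excel file; the names `tab`, `coupon_rate`, and `odds_ratio_raw` are illustrative. With the data loaded, the same steps should work on the `xtabs()` result directly.

```r
# Counts copied from the cross tabulation above (rows: Coupon, columns: Card)
tab <- matrix(c(36, 14, 24, 26), nrow = 2,
              dimnames = list(Coupon = c("0", "1"), Card = c("0", "1")))

# Share of customers with Coupon = 1 within each Card group (column proportions)
coupon_rate <- prop.table(tab, margin = 2)["1", ]
print(round(coupon_rate, 2))

# Unadjusted odds ratio of Coupon = 1 for Card = 1 versus Card = 0
odds_ratio_raw <- (tab["1", "1"] / tab["0", "1"]) / (tab["1", "0"] / tab["0", "0"])
print(round(odds_ratio_raw, 2))
```

For these counts, 52% of customers with Card = 1 have Coupon = 1, compared with 28% of customers with Card = 0, and the unadjusted odds ratio is about 2.79. This ratio does not control for Spending.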

Step 3: Building the Model

sim_logit = glm(Coupon ~ Card + Spending, data = simmons_df, family = binomial)
summary(sim_logit)
## 
## Call:
## glm(formula = Coupon ~ Card + Spending, family = binomial, data = simmons_df)
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)  -2.1464     0.5772  -3.718 0.000201 ***
## Card          1.0987     0.4447   2.471 0.013483 *  
## Spending      0.3416     0.1287   2.655 0.007928 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 134.60  on 99  degrees of freedom
## Residual deviance: 120.97  on 97  degrees of freedom
## AIC: 126.97
## 
## Number of Fisher Scoring iterations: 4
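
The null and residual deviances in the summary also allow an overall test of the model. The sketch below copies the printed values by hand and computes a likelihood-ratio (model chi-square) test against the intercept-only model, plus McFadden's pseudo R-squared. It assumes the response is an ungrouped 0/1 variable, in which case the deviance equals -2 times the log-likelihood. All variable names are illustrative.

```r
# Deviances and degrees of freedom copied from the summary output above
null_dev  <- 134.60; null_df  <- 99
resid_dev <- 120.97; resid_df <- 97

# Likelihood-ratio test of the fitted model against the intercept-only model
lr_stat <- null_dev - resid_dev
lr_df   <- null_df - resid_df
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

# McFadden pseudo R-squared (valid here under the 0/1 response assumption)
mcfadden_r2 <- 1 - resid_dev / null_dev

print(c(lr_stat = lr_stat, lr_df = lr_df, lr_p = lr_p, mcfadden_r2 = mcfadden_r2))
```

With these inputs the test statistic is 13.63 on 2 degrees of freedom (p of roughly 0.001), and McFadden's pseudo R-squared is about 0.10. On the fitted object, `pR2(sim_logit)` from pscl should report a comparable McFadden value.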

Step 4: Odds Ratio

exp(coef(sim_logit))
## (Intercept)        Card    Spending 
##   0.1169074   3.0003587   1.4072585
# Interpretation: Both predictors, Card (p = 0.013) and Spending (p = 0.008), are statistically significant at the 0.05 level.
Interpretation: Holding Spending constant, the odds of Coupon = 1 are multiplied by about 3.00 for customers who have a Simmons credit card compared with customers who do not. Having the card increases the odds by about 200%.

Interpretation: Holding Card constant, the odds of Coupon = 1 are multiplied by about 1.41 for each one-unit increase in Spending. Each additional unit of spending increases the odds by about 40.73%.
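
The percent change in odds implied by an odds ratio is (OR - 1) × 100, and an approximate 95% Wald interval for an odds ratio is exp(estimate ± 1.96 × SE). The sketch below applies both to the estimates and standard errors copied by hand from the Step 3 summary, then converts the fitted log-odds to predicted probabilities for two hypothetical customers. The value Spending = 2 is an illustrative assumption, and all variable names are illustrative.

```r
# Estimates and standard errors copied from the summary output in Step 3
est <- c("(Intercept)" = -2.1464, Card = 1.0987, Spending = 0.3416)
se  <- c("(Intercept)" =  0.5772, Card = 0.4447, Spending = 0.1287)

# Odds ratios with approximate 95% Wald intervals: exp(estimate +/- z * SE)
z <- qnorm(0.975)
or_table <- cbind(OR = exp(est), lower = exp(est - z * se), upper = exp(est + z * se))
print(round(or_table, 2))

# Percent change in odds implied by each odds ratio: (OR - 1) * 100
pct_change <- (exp(est[c("Card", "Spending")]) - 1) * 100
print(round(pct_change, 1))

# Predicted probability of Coupon = 1 for two hypothetical customers
# (Spending = 2 is an illustrative value, not taken from the analysis)
p_card    <- plogis(sum(est * c(1, 1, 2)))
p_no_card <- plogis(sum(est * c(1, 0, 2)))
print(round(c(card = p_card, no_card = p_no_card), 3))
```

Both intervals for Card and Spending stay above 1, which agrees with the significance results in the summary. For the hypothetical customer with Spending = 2, the predicted probability is roughly 0.41 with the card and 0.19 without it. On the fitted object, `predict(sim_logit, newdata, type = "response")` should give the same kind of probabilities without rounding the coefficients.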