1 Intro

The goal of this project is to evaluate and understand the performance of a food marketing campaign.

Key performance indicators (KPIs) are used to evaluate the performance of the campaigns and to understand segments of the customer base. Income, amount spent on certain items, education level, relationship status, and the presence of children are some of the key indicators used to gain insight into the campaign.

1.1 Motivating Questions

Analysis: Who is the audience?

Understanding: What is the audience’s knowledge and attitude toward the campaign?

Demographics: What is the audience’s age, gender, education, location, and so on?

Interest: Why is the audience reading, sharing, and interacting with your brand content?

Environment: Where does the audience spend time accepting the campaign?

Needs: What are the audience needs associated with your campaign?

Customization: What specific needs and/or interests should the brand address in order to add value for the audience?

2 Exploratory Data Analysis

2.1 Data Glossary and SAS Utilization

The raw dataset contains 2,205 observations (rows) and 39 variables (columns).

Link to raw dataset here

SAS manipulation and cleaning added 24 more variables, increasing the dimensionality.

Link to SAS manipulation and cleaning here

For this project, 2,205 observations (rows) and 53 variables (columns) are used.

There are no missing values because the data set was curated and cleaned beforehand.

Three variables were removed because they lack relevance to the project.

Data Glossary

People

Income: Customer’s yearly household income
Kidhome: Number of children in customer’s household
Teenhome: Number of teenagers in customer’s household
Recency: Number of days since customer’s last purchase
Age: Customer’s age
Customer_Days: Days since customer’s enrollment with the company
TOT_Children: Total number of children and teens
ChildrenBV: 1 if there are kids/teens, 0 otherwise
Children: “Has Children” or “No Children”

Products

Depending on whether there is enough variability or need, this may be broken down into the last two variables of this list.

MntWines: Amount spent on wine in last 2 years
MntFruits: Amount spent on fruits in last 2 years
MntMeatProducts: Amount spent on meat in last 2 years
MntFishProducts: Amount spent on fish in last 2 years
MntSweetProducts: Amount spent on sweets in last 2 years
MntGoldProds: Amount spent on gold in last 2 years
MntTotal: Total amount spent on all products

Promotion

Some variables in this section will be aggregated into a variable that accounts for all responses/acceptances. When a campaign was accepted matters less than who it worked for.

AcceptedCmp1: 1 if customer accepted the offer in the 1st campaign, 0 otherwise
AcceptedCmp2: 1 if customer accepted the offer in the 2nd campaign, 0 otherwise
AcceptedCmp3: 1 if customer accepted the offer in the 3rd campaign, 0 otherwise
AcceptedCmp4: 1 if customer accepted the offer in the 4th campaign, 0 otherwise
AcceptedCmp5: 1 if customer accepted the offer in the 5th campaign, 0 otherwise
Response: 1 if customer accepted the offer in the last campaign, 0 otherwise
Complain: 1 if the customer complained, 0 otherwise
pattern: Responses to all campaigns in sequence
Accept: Number of campaigns accepted
AcceptBV: 1 if the customer accepted at least one campaign, 0 otherwise
AcceptCBV: “Respond” or “Did Not Respond”
Accept5BV: 1 if they accepted campaigns 1-5, 0 otherwise
Accept5CBV: “New Response” or “Did Not Respond”

Product Attainment Allocation

NumDealsPurchases: Number of purchases made with a discount
NumWebPurchases: Number of purchases made through the company’s web site
NumCatalogPurchases: Number of purchases made using a catalog
NumStorePurchases: Number of purchases made directly in stores
NumWebVisitsMonth: Number of visits to company’s web site in the last month.
TOT_Purchase: Total number of purchases across channels (NumWebPurchases + NumCatalogPurchases + NumStorePurchases)

Relationship status (0 or 1)

These may be consolidated to single and not single if analysis shows it is a more practical approach.

marital_Divorced: Customer is divorced.
marital_Married: Customer is married.
marital_Single: Customer is single.
marital_Together: Customer is in a relationship.
marital_Widow: Customer is widowed.
SingleBV: 1 if Single or Divorced or Widow, 0 otherwise
Single: “Single” or “Not Single”
Marital_Status: All 5 marital options as a factor

Education status (0 or 1)

May also be consolidated to college and no college if analysis allows.

education_2n_Cycle: Did not pass high school.
education_Basic: High School/GED.
education_Graduation: Bachelor’s degree.
education_Master: Master’s degree.
education_PhD: Doctoral degree.
CollegeBV: 1 if graduated from undergrad, master’s, or PhD, 0 otherwise
College: “College” or “Non College Grad”
Education_Level: All five education options
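
The derived indicators in this glossary were created in SAS. Purely as an illustration, the same kind of derivation can be sketched in R. The toy data frame `df` and its values are made up, and the formulas follow the glossary definitions rather than the original SAS code (in particular, counting `Response` in `Accept` is an assumption).

```r
# Toy rows (made-up values) using the raw column names from the glossary
df <- data.frame(
  Kidhome = c(0, 1, 2),
  Teenhome = c(0, 1, 0),
  NumWebPurchases = c(4, 2, 6),
  NumCatalogPurchases = c(1, 0, 3),
  NumStorePurchases = c(5, 3, 8),
  marital_Single = c(1, 0, 0),
  marital_Divorced = c(0, 0, 1),
  marital_Widow = c(0, 0, 0),
  AcceptedCmp1 = c(0, 0, 1), AcceptedCmp2 = c(0, 0, 0),
  AcceptedCmp3 = c(0, 1, 0), AcceptedCmp4 = c(0, 0, 0),
  AcceptedCmp5 = c(0, 0, 1), Response = c(0, 0, 1)
)

# Children: total count and a 0/1 flag
df$TOT_Children <- df$Kidhome + df$Teenhome
df$ChildrenBV <- as.integer(df$TOT_Children > 0)

# Purchases summed over the three channels
df$TOT_Purchase <- df$NumWebPurchases + df$NumCatalogPurchases + df$NumStorePurchases

# Single, divorced, or widowed customers are flagged as single
df$SingleBV <- as.integer(df$marital_Single + df$marital_Divorced + df$marital_Widow > 0)

# Number of campaigns accepted (assumed to include Response) and a 0/1 flag
cmp_cols <- c(paste0("AcceptedCmp", 1:5), "Response")
df$Accept <- rowSums(df[, cmp_cols])
df$AcceptBV <- as.integer(df$Accept > 0)

df[, c("TOT_Children", "ChildrenBV", "TOT_Purchase", "SingleBV", "Accept", "AcceptBV")]
```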

2.2 Campaign Response

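The following table compares customers who responded to at least one campaign with those who responded to none. It reports the mean and standard deviation of each continuous characteristic and the count and percentage of each binary one.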

Characteristic	Did not Respond, N = 1,601¹	Respond, N = 604¹	Test Statistic	p-value²
Income	47,813.08 (19,110.66)	61,718.51 (21,411.17)	-14	<0.001
Recency	50.82 (28.59)	44.20 (29.31)	4.8	<0.001
MntWines	219.01 (259.18)	537.18 (405.53)	-18	<0.001
MntFruits	23.00 (37.25)	35.43 (44.62)	-6.1	<0.001
MntMeatProducts	124.26 (177.01)	274.12 (271.82)	-13	<0.001
MntFishProducts	32.42 (50.23)	51.91 (63.36)	-6.8	<0.001
MntSweetProducts	23.01 (37.61)	38.04 (47.61)	-7.0	<0.001
MntGoldProds	38.01 (47.76)	60.10 (58.11)	-8.3	<0.001
NumDealsPurchases	2.35 (1.87)	2.22 (1.93)	1.5	0.15
NumWebPurchases	3.73 (2.71)	5.08 (2.58)	-11	<0.001
NumCatalogPurchases	2.08 (2.54)	4.13 (2.92)	-15	<0.001
NumStorePurchases	5.53 (3.19)	6.60 (3.25)	-6.9	<0.001
NumWebVisitsMonth	5.44 (2.34)	5.06 (2.58)	3.2	0.001
Age	50.87 (11.49)	51.70 (12.25)	-1.5	0.15
Customer_Days	2,502.17 (198.15)	2,540.67 (211.45)	-3.9	<0.001
MntTotal	421.70 (466.79)	936.68 (664.35)	-17	<0.001
TOT_Purchase	11.35 (6.96)	15.81 (6.73)	-14	<0.001
ChildrenBV	1,254 / 1,601 (78%)	323 / 604 (53%)	133	<0.001
SingleBV	538 / 1,601 (34%)	245 / 604 (41%)	9.3	0.002
CollegeBV	1,401 / 1,601 (88%)	552 / 604 (91%)	6.5	0.011
TOT_Prods	459.71 (493.35)	996.78 (683.79)	-18	<0.001
TOT_Grocery	421.70 (466.79)	936.68 (664.35)	-17	<0.001
¹ Mean (SD); n / N (%)
² Welch Two Sample t-test; Pearson’s Chi-squared test

To check whether each characteristic differs significantly between the two groups, Welch’s two-sample t-test is used for continuous variables and Pearson’s chi-squared test for binary variables.
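
As a rough check, both kinds of test statistic can be recomputed from the summary numbers in the table itself. The sketch below is not the code that produced the table: the helper `welch_t` is a hypothetical name, and using the chi-squared statistic without continuity correction is an assumption.

```r
# Welch t statistic from group means, standard deviations, and sizes
welch_t <- function(m1, s1, n1, m2, s2, n2) {
  (m1 - m2) / sqrt(s1^2 / n1 + s2^2 / n2)
}

# Income row: Did not Respond vs. Respond
t_income <- welch_t(47813.08, 19110.66, 1601, 61718.51, 21411.17, 604)
round(t_income, 1)

# ChildrenBV row as a 2x2 table of counts (with / without children)
children <- matrix(c(1254, 1601 - 1254,
                     323,  604 - 323),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("Did not Respond", "Respond"),
                                   c("Children", "No children")))

# correct = FALSE requests the Pearson statistic without continuity correction
chi_children <- chisq.test(children, correct = FALSE)
round(unname(chi_children$statistic), 1)
```

Both values should land close to the -14 and 133 reported for Income and ChildrenBV in the table.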

Notable observations:

Customers who accepted/responded to campaigns:

Are wealthier, with an average income of about 62k vs. 48k.
Spent more in every product category, most notably on wine.
Roughly half are childless.

Customers who rejected/did not respond to campaigns:

Almost 80% have children.
Made slightly more discount purchases on average (2.35 vs. 2.22), but the difference is not significant (p = 0.15).

Similar Statistics

Education doesn’t seem to be as strong a factor as I thought it’d be.
Age is not significant.
Partner status is similar: 34% vs. 41% single, although the difference is statistically significant (p = 0.002).

For the analysis and predictions, the focus will be on the amount spent on wine, the presence of children, and income, with the other variables kept in mind throughout.

2.2.1 Income

Observations of Income and Response Class Histogram

As income increases, so does the ratio of people who responded to those who didn’t.

Where to go from here

Find the income at which the ratio starts to favor people who respond to the campaign.

Observations from Income vs Cumulative Sum of Responses

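The ratio begins to favor those who respond to the campaign in the 75,000-90,000 income range.

The following table shows how the cumulative sum is calculated from campaign response and income. Customers are sorted by income; FreqAccept is 1 if the customer accepted a campaign (AcceptBV = 1) and -1 otherwise, and CSAccept is the running total of FreqAccept.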

	Income	AcceptBV	FreqAccept	CSAccept
1226	1730	0	-1	-1
21	2447	0	-1	-2
1500	3502	0	-1	-3
1821	4023	0	-1	-4
1950	4428	0	-1	-5
962	4861	0	-1	-6
2182	5305	0	-1	-7
10	5648	1	1	-6
1308	6560	0	-1	-7
755	6835	0	-1	-8
1781	7144	0	-1	-9
11	7500	0	-1	-10
42	7500	0	-1	-11
44	7500	1	1	-10
226	7500	0	-1	-11
423	7500	0	-1	-12
705	7500	1	1	-11
843	7500	0	-1	-12
1133	7500	0	-1	-13
1227	7500	0	-1	-14
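
The code that built these columns is not shown, so the following is only a minimal sketch of one way to produce them. The data frame `toy` is hypothetical and holds the first eight rows of the table in shuffled order; the column names match the table.

```r
# First eight rows of the table above, deliberately out of order
toy <- data.frame(
  Income   = c(5648, 1730, 4428, 2447, 5305, 3502, 4861, 4023),
  AcceptBV = c(1,    0,    0,    0,    0,    0,    0,    0)
)

# Sort by income so the running total moves from low to high income
toy <- toy[order(toy$Income), ]

# +1 for a customer who accepted a campaign, -1 for one who did not
toy$FreqAccept <- ifelse(toy$AcceptBV == 1, 1, -1)

# Running total: falls while non-responders dominate, rises once responders do
toy$CSAccept <- cumsum(toy$FreqAccept)
toy
```

The resulting CSAccept column reproduces the first eight values of the table, ending at -6 for the customer with an income of 5648.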

A LOESS curve (local polynomial regression) is fitted to the data to find the turning point of the cumulative sum, that is, the income at which the smoothed curve changes direction and responses start to lean toward customers who accepted the campaigns.

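# Plot the cumulative sum of responses against income
x <- dfEDA$Income
y <- dfEDA$CSAccept
plot(x, y, type = "l")

# Smooth with LOESS and evaluate the fit on an even grid of incomes
lo <- loess(y ~ x)
xl <- seq(min(x), max(x), length.out = 445)
out <- predict(lo, xl)
lines(xl, out, col = "red", lwd = 2)

# Flag grid points where the slope of the smoothed curve changes sign
infl <- c(FALSE, diff(diff(out) > 0) != 0, FALSE)
points(xl[infl], out[infl], col = "blue")

xl[infl]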

## [1] 83714.91
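
The sign-change check used above flags points where a curve switches between falling and rising. A small sanity check on a curve with a known minimum shows how the indexing lines up; `x_demo`, `y_demo`, and `turn` are illustrative names, not part of the analysis.

```r
# Known curve: y = (x - 3)^2 has its minimum at x = 3
x_demo <- seq(0, 6, by = 0.5)
y_demo <- (x_demo - 3)^2

# diff(y) > 0 is TRUE where the curve is rising; a change in that flag
# marks a turning point. Padding with FALSE on both ends realigns the
# flags with the positions in x_demo.
turn <- c(FALSE, diff(diff(y_demo) > 0) != 0, FALSE)
x_demo[turn]
```

Because the check looks at the sign of the slope, it locates local minima and maxima of the smoothed curve rather than changes in curvature.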

Observations of cumulative sum and income relationship

Above an income of about $84,000, the cumulative sum of responses starts to trend toward customers who accepted the campaign.

Where to go from here

The goal should be to make this shift in trend happen at a lower income.
The campaign needs to appeal more to customers with an income below $84,000.

2.2.2 Children

Why is having children noteworthy for this analysis?

53% of customers who responded to at least one campaign have children.
78% of customers who did not respond have children.

Observations of Children and Campaign Responses

Campaigns 1 and 5 drew a higher share of responses from people without children (77% and 84%).
The Response campaign was well balanced (50% with children).
Campaigns 3 and 4 drew a higher share of responses from people with children (71% and 59%), but to a smaller degree than Campaigns 1 and 5 favored people without children.

In order to gain more insight into these campaigns, campaigns 1 and 5 will be compared to campaigns 3 and 4.

The following table shows the statistical comparison of these two groups of campaigns.

In the table, the “Children” column aggregates responders to Campaigns 3 and 4, and the “No Children” column aggregates responders to Campaigns 1 and 5.

Characteristic	Children, N = 327¹	No Children, N = 303¹	Test Statistic	p-value²
Income	59,760.22 (20,990.13)	80,717.95 (10,200.70)	-16	<0.001
MntWines	565.01 (429.77)	819.92 (338.93)	-8.3	<0.001
MntFruits	27.82 (40.30)	55.76 (52.16)	-7.5	<0.001
MntMeatProducts	210.76 (238.60)	452.31 (261.60)	-12	<0.001
MntFishProducts	39.18 (55.74)	83.52 (67.66)	-8.9	<0.001
MntSweetProducts	29.29 (43.32)	65.20 (54.03)	-9.2	<0.001
MntGoldProds	57.62 (56.81)	77.05 (63.58)	-4.0	<0.001
NumDealsPurchases	2.30 (1.88)	1.21 (1.10)	8.9	<0.001
NumWebPurchases	5.09 (2.76)	5.60 (2.11)	-2.7	0.008
NumCatalogPurchases	3.94 (2.88)	6.08 (2.43)	-10	<0.001
NumStorePurchases	6.43 (3.26)	8.17 (2.74)	-7.2	<0.001
NumWebVisitsMonth	5.46 (2.42)	3.20 (1.95)	13	<0.001
Response	139 / 327 (43%)	170 / 303 (56%)	12	<0.001
Age	51.28 (12.03)	50.91 (13.68)	0.35	0.7
Customer_Days	2,515.51 (208.99)	2,496.43 (199.47)	1.2	0.2
marital_Divorced	38 / 327 (12%)	25 / 303 (8.3%)	2.0	0.2
marital_Married	125 / 327 (38%)	128 / 303 (42%)	1.1	0.3
marital_Single	71 / 327 (22%)	63 / 303 (21%)	0.08	0.8
marital_Together	79 / 327 (24%)	75 / 303 (25%)	0.03	0.9
marital_Widow	14 / 327 (4.3%)	12 / 303 (4.0%)	0.04	0.8
education_2n_Cycle	24 / 327 (7.3%)	24 / 303 (7.9%)	0.08	0.8
education_Basic	6 / 327 (1.8%)	0 / 303 (0%)		0.031
education_Graduation	157 / 327 (48%)	166 / 303 (55%)	2.9	0.089
education_Master	55 / 327 (17%)	45 / 303 (15%)	0.46	0.5
education_PhD	85 / 327 (26%)	68 / 303 (22%)	1.1	0.3
MntTotal	872.06 (659.75)	1,476.70 (462.06)	-13	<0.001
TOT_Purchase	15.46 (6.90)	19.84 (4.40)	-9.6	<0.001
ChildrenBV	212 / 327 (65%)	58 / 303 (19%)	134	<0.001
SingleBV	123 / 327 (38%)	100 / 303 (33%)	1.5	0.2
CollegeBV	297 / 327 (91%)	279 / 303 (92%)	0.32	0.6
TOT_Prods	929.69 (680.43)	1,553.75 (468.22)	-13	<0.001
TOT_Grocery	872.06 (659.75)	1,476.70 (462.06)	-13	<0.001
¹ Mean (SD); n / N (%)
² Welch Two Sample t-test; Pearson’s Chi-squared test; Fisher’s exact test

Observations of Number of Acceptances by Campaign

The latest campaign generated a significant response from new customers (146 responders, 44%, accepted only this campaign) and from customers who had responded only once in the previous five campaigns (100 responders, 30%).
The 2nd campaign drew no first-time responders and very few responses overall (N = 30).

Characteristic	Campaign1, N = 142¹	Campaign2, N = 30¹	Campaign3, N = 163¹	Campaign4, N = 164¹	Campaign5, N = 161¹	Response, N = 333¹
Income	78,872.63 (11,337.02)	71,054.83 (16,069.84)	50,802.58 (22,012.66)	68,663.23 (15,478.93)	82,345.50 (8,800.56)	60,209.68 (23,194.08)
Recency	46.68 (28.47)	48.67 (31.61)	45.70 (28.51)	50.81 (29.28)	49.04 (29.34)	35.26 (27.62)
MntWines	758.03 (335.81)	898.67 (467.49)	378.66 (396.50)	750.23 (379.36)	874.50 (333.23)	502.62 (427.82)
NumWebVisitsMonth	3.51 (2.04)	5.17 (2.29)	5.85 (2.53)	5.07 (2.25)	2.93 (1.83)	5.31 (2.56)
Age	51.62 (13.52)	51.87 (11.09)	48.55 (12.04)	53.98 (11.42)	50.29 (13.84)	50.50 (12.33)
Customer_Days	2,481.69 (201.43)	2,523.10 (205.68)	2,507.02 (211.13)	2,523.95 (207.15)	2,509.43 (197.44)	2,607.08 (196.47)
MntTotal	1,406.70 (501.92)	1,241.27 (547.81)	653.60 (663.95)	1,089.20 (580.74)	1,538.45 (415.67)	924.41 (698.91)
TOT_Purchase	19.90 (4.52)	18.23 (5.83)	13.26 (8.06)	17.63 (4.60)	19.80 (4.30)	15.35 (6.83)
ChildrenBV	33 / 142 (23%)	12 / 30 (40%)	115 / 163 (71%)	97 / 164 (59%)	25 / 161 (16%)	165 / 333 (50%)
SingleBV	48 / 142 (34%)	11 / 30 (37%)	63 / 163 (39%)	60 / 164 (37%)	52 / 161 (32%)	175 / 333 (53%)
Marital_Status
Divorced	12 / 142 (8.5%)	5 / 30 (17%)	20 / 163 (12%)	18 / 164 (11%)	13 / 161 (8.1%)	48 / 333 (14%)
Married	62 / 142 (44%)	7 / 30 (23%)	63 / 163 (39%)	62 / 164 (38%)	66 / 161 (41%)	98 / 333 (29%)
Single	31 / 142 (22%)	5 / 30 (17%)	39 / 163 (24%)	32 / 164 (20%)	32 / 161 (20%)	109 / 333 (33%)
Together	32 / 142 (23%)	12 / 30 (40%)	37 / 163 (23%)	42 / 164 (26%)	43 / 161 (27%)	60 / 333 (18%)
Widow	5 / 142 (3.5%)	1 / 30 (3.3%)	4 / 163 (2.5%)	10 / 164 (6.1%)	7 / 161 (4.3%)	18 / 333 (5.4%)
CollegeBV	128 / 142 (90%)	28 / 30 (93%)	142 / 163 (87%)	155 / 164 (95%)	151 / 161 (94%)	309 / 333 (93%)
Education_Level
Basic	0 / 142 (0%)	0 / 30 (0%)	6 / 163 (3.7%)	0 / 164 (0%)	0 / 161 (0%)	2 / 333 (0.6%)
Masters	18 / 142 (13%)	2 / 30 (6.7%)	24 / 163 (15%)	31 / 164 (19%)	27 / 161 (17%)	56 / 333 (17%)
Phd	30 / 142 (21%)	10 / 30 (33%)	40 / 163 (25%)	45 / 164 (27%)	38 / 161 (24%)	101 / 333 (30%)
Second Cycle	14 / 142 (9.9%)	2 / 30 (6.7%)	15 / 163 (9.2%)	9 / 164 (5.5%)	10 / 161 (6.2%)	22 / 333 (6.6%)
Undergrad	80 / 142 (56%)	16 / 30 (53%)	78 / 163 (48%)	79 / 164 (48%)	86 / 161 (53%)	152 / 333 (46%)
Accept
1 Cmpgn	39 / 142 (27%)	0 / 30 (0%)	74 / 163 (45%)	72 / 164 (44%)	37 / 161 (23%)	146 / 333 (44%)
2 Cmpgn	29 / 142 (20%)	8 / 30 (27%)	64 / 163 (39%)	34 / 164 (21%)	43 / 161 (27%)	100 / 333 (30%)
3 Cmpgn	32 / 142 (23%)	5 / 30 (17%)	10 / 163 (6.1%)	27 / 164 (16%)	37 / 161 (23%)	42 / 333 (13%)
4 Cmpgn	32 / 142 (23%)	7 / 30 (23%)	12 / 163 (7.4%)	24 / 164 (15%)	34 / 161 (21%)	35 / 333 (11%)
5 Cmpgn	10 / 142 (7.0%)	10 / 30 (33%)	3 / 163 (1.8%)	7 / 164 (4.3%)	10 / 161 (6.2%)	10 / 333 (3.0%)
AcceptBV	142 / 142 (100%)	30 / 30 (100%)	163 / 163 (100%)	164 / 164 (100%)	161 / 161 (100%)	333 / 333 (100%)
TOT_Prods	1,484.35 (509.56)	1,307.67 (555.42)	720.54 (696.27)	1,137.56 (597.45)	1,614.96 (420.62)	985.66 (719.39)
TOT_Grocery	1,406.70 (501.92)	1,241.27 (547.81)	653.60 (663.95)	1,089.20 (580.74)	1,538.45 (415.67)	924.41 (698.91)
¹ Mean (SD); n / N (%)

Observations

Campaign 3 and the latest campaign drew similar responders: lower income, lower wine spending, and lower overall purchase totals.
The percentage of people who graduated college is consistently high at 87%-95%.
Campaigns 1 and 5 seem to have targeted similar high-income customers.
Campaign 3 and the Response campaign reached customers with a wider mix of incomes.
Campaigns 2 and 4 also drew an upper-income response, but not as high as Campaigns 1 and 5. Campaign 2 drew less than a third as many responses as any other campaign (N = 30).

2.2.2.1 Analysis of customers with children

Characteristic	Campaign3, N = 115¹	Campaign4, N = 97¹	Response, N = 165¹
Income	43,443.47 (17,374.09)	62,493.96 (12,802.49)	45,585.50 (18,618.66)
MntWines	269.43 (346.82)	653.23 (367.81)	315.12 (392.55)
MntFruits	13.98 (22.30)	13.78 (21.03)	13.16 (20.24)
MntMeatProducts	87.17 (106.18)	119.86 (121.18)	104.76 (120.03)
MntFishProducts	18.16 (34.33)	23.75 (44.80)	19.96 (35.64)
MntSweetProducts	13.85 (26.08)	14.90 (29.40)	14.85 (25.23)
MntGoldProds	58.27 (61.05)	41.43 (39.95)	47.29 (50.07)
NumDealsPurchases	2.67 (1.61)	3.45 (2.23)	3.62 (2.31)
NumWebPurchases	4.27 (3.18)	5.85 (2.42)	4.73 (2.87)
NumCatalogPurchases	2.74 (2.69)	3.11 (2.03)	2.47 (2.46)
NumStorePurchases	4.19 (2.85)	7.34 (2.54)	4.62 (2.46)
NumWebVisitsMonth	6.70 (2.08)	6.24 (1.45)	7.10 (1.36)
Response	47 / 115 (41%)	24 / 97 (25%)	165 / 165 (100%)
Age	48.75 (10.71)	54.92 (8.71)	51.00 (10.38)
Customer_Days	2,517.83 (211.15)	2,534.20 (219.48)	2,640.51 (191.21)
education_Graduation	55 / 115 (48%)	49 / 97 (51%)	70 / 165 (42%)
MntTotal	402.59 (475.69)	825.52 (468.10)	467.85 (514.78)
TOT_Purchase	11.20 (7.74)	16.30 (4.42)	11.82 (6.55)
ChildrenBV	115 / 115 (100%)	97 / 97 (100%)	165 / 165 (100%)
SingleBV	42 / 115 (37%)	28 / 97 (29%)	74 / 165 (45%)
CollegeBV	102 / 115 (89%)	93 / 97 (96%)	155 / 165 (94%)
TOT_Prods	460.86 (509.54)	866.95 (485.97)	515.14 (537.63)
TOT_Grocery	402.59 (475.69)	825.52 (468.10)	467.85 (514.78)
¹ Mean (SD); n / N (%)

To compare income across these three groups, a Shapiro-Wilk normality test is run on the income of each group, followed by a one-way ANOVA. Income does not depart significantly from normality for Campaign 3 (p = 0.063) or Campaign 4 (p = 0.099), but it does for the Response campaign (p = 0.013). The ANOVA shows that mean income differs across the three campaigns (F = 40.38, p < 0.001).

## 
##  Shapiro-Wilk normality test
## 
## data:  df2[df2$Campaign == "Campaign3" & df2$Accepted_C == 1 & df2$ChildrenBV == 1, ]$Income
## W = 0.97862, p-value = 0.06271

## 
##  Shapiro-Wilk normality test
## 
## data:  df2[df2$Campaign == "Campaign4" & df2$Accepted_C == 1 & df2$ChildrenBV == 1, ]$Income
## W = 0.97776, p-value = 0.09855

## 
##  Shapiro-Wilk normality test
## 
## data:  df2[df2$Campaign == "Response" & df2$Accepted_C == 1 & df2$ChildrenBV == 1, ]$Income
## W = 0.97887, p-value = 0.01265

## Anova Table (Type III tests)
## 
## Response: Income
##                 Sum Sq  Df F value    Pr(>F)    
## (Intercept) 2.1704e+11   1 758.652 < 2.2e-16 ***
## Campaign    2.3107e+10   2  40.383 < 2.2e-16 ***
## Residuals   1.0700e+11 374                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
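
The calls behind this output are not shown. A minimal base-R sketch of the same two steps on simulated data could look like the following; the data frame `sim` and its values are invented for illustration. With a single factor, the F test for Campaign from `anova()` is the same test that a Type III table reports for that factor.

```r
set.seed(1)

# Simulated incomes for three campaign groups (illustrative values only)
sim <- data.frame(
  Campaign = factor(rep(c("Campaign3", "Campaign4", "Response"), each = 100)),
  Income   = c(rnorm(100, mean = 43000, sd = 17000),
               rnorm(100, mean = 62000, sd = 13000),
               rnorm(100, mean = 46000, sd = 18000))
)

# Shapiro-Wilk normality test within each group; keep only the p-values
normality_p <- sapply(split(sim$Income, sim$Campaign),
                      function(v) shapiro.test(v)$p.value)
normality_p

# One-way ANOVA: does mean income differ across campaigns?
fit <- lm(Income ~ Campaign, data = sim)
anova_tab <- anova(fit)
anova_tab
```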

The more campaigns a customer accepts, the higher that customer’s income tends to be.

3 Predictive Model

3.1 LASSO

3.1.1 Training Model

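library(caret)   # createDataPartition()
library(glmnet)  # cv.glmnet(), glmnet()
library(dplyr)   # %>% pipe

set.seed(123)

# Hold out 20% of the rows as a test set
training.samples <- dfmodel$Response %>%
  createDataPartition(p = 0.8, list = FALSE)

train.data <- dfmodel[training.samples, ]
test.data  <- dfmodel[-training.samples, ]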

3.1.2 LASSO model with reduced KPIs

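# Build the predictor matrix. model.matrix() excludes the outcome and puts
# "(Intercept)" in column 1, so [, -22] drops the 22nd design column rather
# than Response; use [, -1] to drop the intercept column instead.
x <- model.matrix(Response ~ ., train.data)[, -22]
# Convert the outcome (class) to a numerical variable
y <- ifelse(train.data$Response == 1, 1, 0)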

# Find the best lambda using cross-validation
set.seed(123) 
cv.lasso <- cv.glmnet(x, y, alpha = 1, family = "binomial")
plot(cv.lasso)

3.1.3 LASSO model predictions

# Fit the final model on the training data
model <- glmnet(x, y, alpha = 1, family = "binomial",
                lambda = cv.lasso$lambda.min)
# Display regression coefficients
coef(model)

## 43 x 1 sparse Matrix of class "dgCMatrix"
##                                 s0
## (Intercept)          -1.338483e+01
## (Intercept)           .           
## Income                .           
## Kidhome               2.541566e-01
## Teenhome             -8.944843e-01
## Recency              -3.063529e-02
## MntWines             -7.711868e-04
## MntFruits             2.721773e-03
## MntMeatProducts       2.371779e-03
## MntFishProducts      -2.282393e-03
## MntSweetProducts      .           
## MntGoldProds          1.838066e-03
## NumDealsPurchases     1.644908e-01
## NumWebPurchases       6.105248e-02
## NumCatalogPurchases   1.624755e-01
## NumStorePurchases    -2.042529e-01
## NumWebVisitsMonth     7.581466e-02
## AcceptedCmp3          1.436130e+00
## AcceptedCmp4          9.776652e-01
## AcceptedCmp5          1.494024e+00
## AcceptedCmp1          1.049988e+00
## AcceptedCmp2          1.761545e+00
## Age                   6.702086e-03
## Customer_Days         4.175057e-03
## marital_Divorced      5.122041e-02
## marital_Married       .           
## marital_Single        .           
## marital_Together      .           
## marital_Widow        -1.668337e-01
## education_2n_Cycle    .           
## education_Basic      -9.599181e-01
## education_Graduation  .           
## education_Master      4.117749e-01
## education_PhD         1.187721e+00
## MntTotal              .           
## TOT_Purchase          .           
## TOT_Children          .           
## ChildrenBV           -2.648628e-01
## SingleBV              1.160085e+00
## CollegeBV             5.142897e-02
## Accept5BV             5.759373e-01
## TOT_Prods             .           
## TOT_Grocery           .

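# Make predictions on the test data (same column handling as for x)
x.test <- model.matrix(Response ~ ., test.data)[, -22]
# Note: predict() on a glmnet fit returns the linear predictor (log-odds) by
# default; add type = "response" to obtain probabilities for the 0.5 cutoff
probabilities <- model %>% predict(newx = x.test)
predicted.classes <- ifelse(probabilities > 0.5, 1, 0)
# Model accuracy
observed.classes <- test.data$Response

mean(predicted.classes == observed.classes)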

## [1] 0.8956916
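
For logistic models, `predict()` can return either log-odds (`type = "link"`) or probabilities (`type = "response"`), and a 0.5 cutoff means different things on the two scales. The base-R sketch below uses simulated data and `glm()` rather than the LASSO fit above; all names and values in it are illustrative.

```r
set.seed(42)

# Simulated binary outcome driven by one predictor (illustrative only)
n <- 200
z <- rnorm(n)
resp <- rbinom(n, size = 1, prob = plogis(-1 + 1.5 * z))
fit <- glm(resp ~ z, family = binomial)

log_odds <- predict(fit, type = "link")      # linear predictor
prob     <- predict(fit, type = "response")  # probabilities in [0, 1]

# The inverse logit maps log-odds to probabilities
all.equal(unname(plogis(log_odds)), unname(prob))

# A 0.5 cutoff on probabilities matches a 0 cutoff on log-odds, while a
# 0.5 cutoff on log-odds is a stricter rule (probability above about 0.62)
class_prob <- as.integer(prob > 0.5)
class_link <- as.integer(log_odds > 0.5)
table(class_prob, class_link)
```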

3.1.4 Logistic model with all indicators

# Fit the model
full.model <- glm(Response ~., data = train.data, family = binomial)
# Make predictions
probabilities <- full.model %>% predict(test.data[,-22], type = "response")
predicted.classes <- ifelse(probabilities > 0.5, 1, 0)
# Model accuracy
observed.classes <- test.data$Response
mean(predicted.classes == observed.classes)

## [1] 0.893424

3.1.5 Results of LASSO model

The LASSO model uses 12 fewer indicators than the full logistic model and also achieves a slightly higher accuracy on the test data (89.6% vs. 89.3%).
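
Only 333 of the 2,205 customers (about 15%) accepted the last campaign, so accuracy alone can look high even when few responders are identified. As a hedged illustration with made-up vectors, not the test data from this project, a confusion matrix and a majority-class baseline put an accuracy figure in context.

```r
# Made-up observed and predicted classes for ten test customers
observed  <- c(0, 0, 0, 0, 0, 0, 0, 1, 1, 1)
predicted <- c(0, 0, 0, 0, 0, 0, 1, 1, 0, 0)

# Confusion matrix: rows are predictions, columns are observed classes
cm <- table(Predicted = predicted, Observed = observed)
cm

# Share of customers classified correctly
accuracy <- mean(predicted == observed)
# Share of actual responders that the model identified
sensitivity <- sum(predicted == 1 & observed == 1) / sum(observed == 1)
# Accuracy of always guessing the majority class
baseline <- max(mean(observed == 1), mean(observed == 0))

c(accuracy = accuracy, sensitivity = sensitivity, baseline = baseline)
```

In this toy case the accuracy equals the majority-class baseline even though only one of the three responders is found, which is the kind of gap a confusion matrix makes visible.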

Campaign Marketing Analysis

Alan Morales

2023-09-11