Comprehensive Marketing Analysis

class: center, middle, inverse, title-slide

.title[
# <font size="7" color="Purple">Comprehensive Marketing Analysis</font>
]
.author[
### <font size="5" color="Purple"> Jaiden Neff </font>
]
.institute[
### <font size="6" color="white">West Chester University of Pennsylvania</font><br>
]
.date[
### <font color="white" size="4"> Prepared for<br> </font> <br> <font color="gold" size="6"> </font> <br> <br> <font color="white" size="3"> Slides available at: <a href="https://rpubs.com/jaidenneff" class="uri">https://rpubs.com/jaidenneff</a> AND <a href="https://github.com/jaidenneff" class="uri">https://github.com/jaidenneff</a></font>
]

---

#### Introduction

My data set comes from Kaggle. It contains information on a company's social media ad campaign and includes the following variables.

`1.  ad_id: a unique ID for each ad.`<br>
`2.  xyz_campaign_id: an ID associated with each ad campaign of XYZ company.`<br>
`3.  fb_campaign_id: an ID associated with how Facebook tracks each campaign.`<br>
`4.  gender: gender of the person to whom the ad is shown.`<br>
`5.  age: age of the person to whom the ad is shown.`<br>
`6.  interest: a code specifying the category to which the person’s interest belongs (interests are as mentioned in the person’s Facebook public profile).`<br>
`7.  Impressions: the number of times the ad was shown.`<br>
`8.  Clicks: number of clicks on that ad.`<br>
`9.  Spent: Amount paid by company xyz to Facebook, to show that ad.`<br>
`10.  Total_Conversion: Total number of people who inquired about the product after seeing the ad.`<br>
`11.  Approved_Conversion: Total number of people who bought the product after seeing the ad.`

In my analysis I will use Approved_Conversion as the Y variable and the other variables as X variables, and I plan to weed out unhelpful variables through exploratory analysis.
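
One simple way to start that screening is to rank the numeric variables by their correlation with Approved_Conversion. The sketch below uses base R on a small simulated data frame: the column names follow the variable list above, but every value is made up for illustration, and `ads` would be replaced by the Kaggle file in a real run. It is only one possible screening step, not necessarily the one used in this analysis.

```r
# Simulated stand-in for the Kaggle data (values are NOT the real data).
set.seed(490)
n <- 200
ads <- data.frame(
  interest    = sample(c(15, 16, 20, 27, 28, 29), n, replace = TRUE),
  Impressions = rpois(n, lambda = 5000)
)
ads$Clicks              <- rbinom(n, size = ads$Impressions, prob = 0.002)
ads$Spent               <- round(ads$Clicks * runif(n, 1, 2), 2)
ads$Total_Conversion    <- 1 + rpois(n, lambda = ads$Clicks / 4)
ads$Approved_Conversion <- rbinom(n, size = ads$Total_Conversion, prob = 0.4)

# Correlation of each candidate X variable with the Y variable
predictors <- setdiff(names(ads), "Approved_Conversion")
cors <- sapply(ads[predictors], cor, y = ads$Approved_Conversion)

# Rank by absolute correlation; values near zero are candidates to drop
screen <- sort(abs(cors), decreasing = TRUE)
print(round(screen, 3))
```

Correlation only captures linear, one-variable-at-a-time relationships, so a low value here is a prompt to look closer rather than proof that a variable is unhelpful.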

---

<h1 align="center">ggpairs</h1>

<center><div class='wrap'>
<object data="https://jaidenneff.github.io/sta490/screenshot.pdf" type="application/pdf" width="80%" height="500px">
      <p>Unable to display PDF file. <a href="https://jaidenneff.github.io/sta490/screenshot.pdf">Download</a> instead.</p>
    </object>
</div></center>

---

# R Generated Plot

---

# Text Format Data Table

```
# A tibble: 1,143 × 11
    ad_id xyz_campaign_id fb_campaign_id age   gender interest Impressions
    <int>           <int>          <int> <chr> <chr>     <int>       <int>
 1 708746             916         103916 30-34 M            15        7350
 2 708749             916         103917 30-34 M            16       17861
 3 708771             916         103920 30-34 M            20         693
 4 708815             916         103928 30-34 M            28        4259
 5 708818             916         103928 30-34 M            28        4133
 6 708820             916         103929 30-34 M            29        1915
 7 708889             916         103940 30-34 M            15       15615
 8 708895             916         103941 30-34 M            16       10951
 9 708953             916         103951 30-34 M            27        2355
10 708958             916         103952 30-34 M            28        9502
# ℹ 1,133 more rows
# ℹ 4 more variables: Clicks <int>, Spent <dbl>, Total_Conversion <int>,
#   Approved_Conversion <int>
```

---

<h1 align="center">Marketing Analysis Research </h1>

<center><div class='wrap'>
<object data="https://jaidenneff.github.io/sta490/marketingpdf.pdf" type="application/pdf" width="80%" height="500px">
      <p>Unable to display PDF file. <a href="https://jaidenneff.github.io/sta490/marketingpdf.pdf">Download</a> instead.</p>
    </object>
</div></center>

---

# Conclusion

Through exploratory analysis and supervised and unsupervised modeling, I found that the best predictors of "Approved_Conversion" (the number of completed sales) are the number of inquiries about an ad, the number of times the ad was shown, and the person's calculated interest in the product being advertised. The model I built with these variables had an R squared of 0.755, meaning it explains about 75% of the variation in approved conversions; that is a good sign the model is performing well, with room for improvement. I also found that the last campaign, with the id "1178", had the highest mean sales. The campaign with the id "916" had the highest retention rate, meaning the highest ratio of actual sales to inquiries. Age and gender did not play a key role in whether someone bought something after seeing an ad. At least for the item advertised here, this means we should focus on pushing our ad to the pages Facebook's algorithm calculates as relevant instead of targeting a demographic we think would like the item.
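
As a rough illustration of the two calculations above, the sketch below fits a linear model for Approved_Conversion and computes the sales-to-inquiries ratio per campaign. The exact model specification lives in the embedded report, so the plain `lm()` formula here is an assumption, and the data are simulated, so the printed R squared will not match 0.755.

```r
# Simulated data for illustration only (NOT the Kaggle values).
set.seed(490)
n <- 300
ads <- data.frame(
  xyz_campaign_id  = sample(c(916, 1178), n, replace = TRUE),
  interest         = sample(c(15, 16, 20, 27, 28, 29), n, replace = TRUE),
  Impressions      = rpois(n, lambda = 5000),
  Total_Conversion = 1 + rpois(n, lambda = 3)
)
ads$Approved_Conversion <- rbinom(n, size = ads$Total_Conversion, prob = 0.4)

# Assumed specification: ordinary linear regression on the three predictors
fit <- lm(Approved_Conversion ~ Total_Conversion + Impressions + interest,
          data = ads)
r_squared <- summary(fit)$r.squared
print(round(r_squared, 3))

# Sales-to-inquiries ratio per campaign: approved sales / total inquiries
totals <- aggregate(cbind(Approved_Conversion, Total_Conversion) ~ xyz_campaign_id,
                    data = ads, FUN = sum)
totals$sale_ratio <- totals$Approved_Conversion / totals$Total_Conversion
print(totals)
```

Summing before dividing weights each campaign's ratio by its inquiry volume; averaging per-ad ratios instead would give small ads the same weight as large ones.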

If I were to continue this project, I would first collect much more data. I would then go in with more directed questions to answer, such as "How does our 20-30 year old demographic best interact with advertisements? What types of ads were they? What factors contributed to the sale?" I think having a specific problem to solve with this data would be better than looking at it just to observe it. I would also like to look into diagnostics of the ad itself and customer sentiment around the ad and the product, including internet chatter. These would all be useful factors to add to a predictive model aimed at increasing ad turnover and cutting costs.