R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Notice:

You should not focus on memorizing syntax. Instead, you should focus on how to use R for business reporting purposes (Matt Dancho & David Curry, 2019).

Segmentation

Objective: divide the target market or customer base by significant features, so that a company can sell more products with lower marketing expenses.

A potentially interesting question is whether some products (or customers) are more alike than others.

The goal of this tutorial is to show you how market segmentation works using a two-variable dataset. For a more complex example that involves more than two factors, see the following tutorial: https://rpubs.com/utjimmyx/pcacluster.

Market segmentation

Market segmentation is a strategy that divides a broad target market of customers into smaller, more similar groups, and then designs a marketing strategy specifically for each group. Clustering is a common technique for market segmentation since it automatically finds similar groups given a data set.
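
To make the idea of "automatically finding similar groups" concrete, here is a minimal sketch on made-up data. The data frame `toy` and its columns `product_a` and `product_b` are hypothetical and are not part of segmentation_analysis.csv; only the base R function `kmeans()` is assumed.

```r
# Hypothetical ratings from six customers for two products (made-up data)
toy <- data.frame(
  product_a = c(1, 2, 1, 8, 9, 8),
  product_b = c(9, 8, 9, 2, 1, 2)
)

set.seed(42)  # k-means starts from random centers, so fix the seed
fit <- kmeans(toy, centers = 2, nstart = 10)

# Attach the segment each customer was assigned to
toy$segment <- fit$cluster
print(toy)
```

Customers 1 to 3 rate the two products very differently from customers 4 to 6, so the two groups should end up in separate segments. Which group is labelled 1 and which is labelled 2 is arbitrary.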

Example

The file segmentation_analysis.csv is automatically generated using a computer process, and contains information on consumers' perceptions toward two sporting events.

Perform a k-means cluster analysis

library(readr)
mydata <- read_csv('segmentation_analysis.csv')
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   Consumer = col_double(),
##   NASCAR = col_double(),
##   NCAA_College_Football = col_double()
## )
# K-means cluster analysis on the two rating columns (drop the Consumer column)
clusterdata <- mydata[, -1]
head(clusterdata)
## # A tibble: 6 x 2
##   NASCAR NCAA_College_Football
##    <dbl>                 <dbl>
## 1      6                     1
## 2      5                     4
## 3      3                     4
## 4      2                     6
## 5      3                     9
## 6      5                     1
# kmeans() picks random starting centers, so cluster labels can differ between runs
clustering <- kmeans(x = clusterdata, centers = 2)
clusterdata$cluster <- as.character(clustering$cluster)
head(clusterdata)
## # A tibble: 6 x 3
##   NASCAR NCAA_College_Football cluster
##    <dbl>                 <dbl> <chr>  
## 1      6                     1 1      
## 2      5                     4 1      
## 3      3                     4 2      
## 4      2                     6 2      
## 5      3                     9 2      
## 6      5                     1 1
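
Because `kmeans()` starts from randomly chosen centers, rerunning the analysis can swap the cluster labels or change the assignments. The sketch below shows one way to make a run repeatable and to inspect the result. It is an illustration under stated assumptions: `sample_ratings` is a hypothetical stand-in that holds only the six rows printed by `head()`, not the full file, and the seed value 123 is arbitrary.

```r
# Stand-in for clusterdata: only the six rows printed by head(),
# because segmentation_analysis.csv itself is not reproduced here
sample_ratings <- data.frame(
  NASCAR                = c(6, 5, 3, 2, 3, 5),
  NCAA_College_Football = c(1, 4, 4, 6, 9, 1)
)

set.seed(123)  # arbitrary seed, chosen only so that reruns give the same labels
fit <- kmeans(sample_ratings, centers = 2, nstart = 25)

fit$size          # number of consumers in each cluster
fit$centers       # mean rating per event within each cluster
fit$tot.withinss  # total within-cluster sum of squares (lower means tighter clusters)
```

The `nstart = 25` argument asks `kmeans()` to try 25 random starts and keep the best one, which makes a poor local solution less likely.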

Plot the clusters using ggplot2

library(ggplot2)
ggplot() +
  geom_point(data = clusterdata, 
             mapping = aes(x = NASCAR, 
                           y = NCAA_College_Football, 
                           colour = cluster))

Add the centroid to each cluster using ggplot2

library(ggplot2)
# Collect the centroids in a data frame so they can be mapped with aes()
# (aes_string() is deprecated in recent ggplot2 releases)
centers <- as.data.frame(clustering$centers)
centers$label <- rownames(centers)
ggplot() +
  geom_point(data = clusterdata, 
             mapping = aes(x = NASCAR, 
                           y = NCAA_College_Football, 
                           colour = cluster)) +
  geom_point(data = centers,
             mapping = aes(x = NASCAR, y = NCAA_College_Football),
             color = "purple", size = 4) +
  geom_text(data = centers,
            mapping = aes(x = NASCAR, y = NCAA_College_Football, label = label),
            color = "black", size = 4) +
  theme_light()
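
The analysis above fixes the number of clusters at 2. One common heuristic for checking that choice is the elbow method: run k-means for several values of k and look for the point where the total within-cluster sum of squares stops dropping sharply. The sketch below is a rough illustration. It assumes a hypothetical stand-in, `sample_ratings`, built from only the six rows printed by `head()`, so the curve for the full file may look different.

```r
# Stand-in built from the six rows printed by head(); not the full dataset
sample_ratings <- data.frame(
  NASCAR                = c(6, 5, 3, 2, 3, 5),
  NCAA_College_Football = c(1, 4, 4, 6, 9, 1)
)

set.seed(123)  # arbitrary seed for repeatable results
ks <- 1:4      # candidate numbers of clusters

# Total within-cluster sum of squares for each candidate k
wss <- sapply(ks, function(k) {
  kmeans(sample_ratings, centers = k, nstart = 25)$tot.withinss
})

# Look for the "elbow" where adding clusters stops helping much
plot(ks, wss, type = "b",
     xlab = "Number of clusters (k)",
     ylab = "Total within-cluster sum of squares")
```

With real data, the same loop can be run on the two rating columns of `mydata` in place of `sample_ratings`.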

References:

Matt Dancho & David Curry (2019). Business Reporting in R with RMarkdown - How did learning RMarkdown accelerate my career? https://www.business-science.io/labs/episode6-business-reporting-rmarkdown/

K-Means Clustering: https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/kmeans

ggplot2: https://www.r-graph-gallery.com/ggplot2-package.html

Determining the number of clusters in a data set: https://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set

http://www.learnbymarketing.com/tutorials/k-means-clustering-in-r-example/