This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.
You should not focus on memorizing syntax. Instead, you should focus on how to use R for business reporting purposes (Matt Dancho & David Curry, 2019).
Objective: divide the target market or customers into groups on the basis of significant features, which can help a company sell more products with lower marketing expenses.
A potentially interesting question is whether some products (or customers) are more alike than others.
The goal of this tutorial is to show you how market segmentation works using a 2-variable dataset. For a more complex example that involves more than 2 factors, you may want to visit the following tutorial: https://rpubs.com/utjimmyx/pcacluster.
Market segmentation is a strategy that divides a broad target market of customers into smaller, more similar groups, and then designs a marketing strategy specifically for each group. Clustering is a common technique for market segmentation since it automatically finds similar groups given a data set.
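To make "finds similar groups" concrete, here is a minimal sketch of a single k-means iteration written by hand. It assumes a toy data frame built from the six rows printed later in this tutorial (not the full file), and the two starting centers are hypothetical values picked only for illustration. The built-in `kmeans()` function repeats these two steps until the assignments stop changing.

```r
# Toy ratings: the six rows shown by head(clusterdata) in this tutorial
toy <- data.frame(
  NASCAR = c(6, 5, 3, 2, 3, 5),
  NCAA_College_Football = c(1, 4, 4, 6, 9, 1)
)

# Two hypothetical starting centers (hand-picked, not from the original)
centers <- rbind(c(5, 2), c(3, 6))

# Squared Euclidean distance from every consumer to every center
sq_dist <- sapply(seq_len(nrow(centers)), function(k) {
  rowSums(sweep(as.matrix(toy), 2, centers[k, ])^2)
})

# Assignment step: each consumer joins the nearest center
assignment <- unname(apply(sq_dist, 1, which.min))
print(assignment)
# 1 1 2 2 2 1

# Update step: each center moves to the mean of its members
new_centers <- aggregate(toy, by = list(cluster = assignment), FUN = mean)
print(new_centers)
```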
The file segmentation_analysis.csv is automatically generated using a computer process, and contains information on consumers' perceptions toward two sporting events.
library(readr)
mydata <- read_csv('segmentation_analysis.csv')
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## Consumer = col_double(),
## NASCAR = col_double(),
## NCAA_College_Football = col_double()
## )
# K-means cluster analysis
clusterdata <- mydata[, -1]
head(clusterdata)
## # A tibble: 6 x 2
## NASCAR NCAA_College_Football
## <dbl> <dbl>
## 1 6 1
## 2 5 4
## 3 3 4
## 4 2 6
## 5 3 9
## 6 5 1
clustering <- kmeans(x = clusterdata, centers = 2)
clusterdata$cluster <- as.character(clustering$cluster)
head(clusterdata)
## # A tibble: 6 x 3
## NASCAR NCAA_College_Football cluster
## <dbl> <dbl> <chr>
## 1 6 1 1
## 2 5 4 1
## 3 3 4 2
## 4 2 6 2
## 5 3 9 2
## 6 5 1 1
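Because `kmeans()` picks its starting centers at random, the cluster numbers (and occasionally the groups themselves) can change from one run to the next. The sketch below shows one common way to make the result repeatable and to inspect the fitted object. It assumes a toy data frame holding only the six rows printed above, and the seed value `123` and `nstart = 25` are arbitrary choices, not settings from the original analysis.

```r
# Toy ratings: the six rows shown by head(clusterdata), not the full file
toy <- data.frame(
  NASCAR = c(6, 5, 3, 2, 3, 5),
  NCAA_College_Football = c(1, 4, 4, 6, 9, 1)
)

# Fix the random seed so the same clusters come back on every run
set.seed(123)

# nstart = 25 tries 25 random starts and keeps the best solution
fit <- kmeans(x = toy, centers = 2, nstart = 25)

fit$centers       # average rating of each cluster on both events
fit$size          # number of consumers in each cluster
fit$tot.withinss  # total within-cluster sum of squares (lower = tighter)
```

Reading `fit$centers` row by row is usually the quickest way to describe each segment, for example which group rates one event higher than the other.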
library(ggplot2)
ggplot() +
  geom_point(data = clusterdata,
             mapping = aes(x = NASCAR,
                           y = NCAA_College_Football,
                           colour = cluster))
# Overlay the cluster centers on the scatter plot.
# clustering$centers is a matrix, so convert it to a data frame for ggplot2
# (this replaces the deprecated aes_string() calls).
centers <- as.data.frame(clustering$centers)
centers$label <- seq_len(nrow(centers))
ggplot() +
  geom_point(data = clusterdata,
             mapping = aes(x = NASCAR,
                           y = NCAA_College_Football,
                           colour = cluster)) +
  geom_point(data = centers,
             mapping = aes(x = NASCAR,
                           y = NCAA_College_Football),
             color = "purple", size = 4) +
  geom_text(data = centers,
            mapping = aes(x = NASCAR,
                          y = NCAA_College_Football,
                          label = label),
            color = "black", size = 4) +
  theme_light()
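The analysis above fixes the number of clusters at 2. One common heuristic for checking that choice, discussed in the "Determining the number of clusters in a data set" reference below, is the elbow method: fit k-means for several values of k and look for the point where the total within-cluster sum of squares stops dropping sharply. The following is a minimal sketch, assuming a toy data frame with only the six rows printed earlier. The seed, `nstart = 25`, and the range `1:4` are arbitrary choices for illustration.

```r
# Toy ratings: the six rows shown by head(clusterdata), not the full file
toy <- data.frame(
  NASCAR = c(6, 5, 3, 2, 3, 5),
  NCAA_College_Football = c(1, 4, 4, 6, 9, 1)
)

set.seed(123)     # arbitrary seed for repeatability
k_values <- 1:4   # candidate numbers of clusters

# Total within-cluster sum of squares for each candidate k
wss <- sapply(k_values, function(k) {
  kmeans(x = toy, centers = k, nstart = 25)$tot.withinss
})

elbow <- data.frame(k = k_values, tot_withinss = wss)
print(elbow)
```

Plotting `tot_withinss` against `k` (for example with `geom_line()` and `geom_point()`) makes the "elbow" easier to spot. With the full data set the curve may look different from what these six rows suggest.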
References:
Matt Dancho & David Curry (2019). Business Reporting in R with RMarkdown - How did learning RMarkdown accelerate my career? https://www.business-science.io/labs/episode6-business-reporting-rmarkdown/
K-Means Clustering: https://www.rdocumentation.org/packages/stats/versions/3.6.2/topics/kmeans
ggplot2: https://www.r-graph-gallery.com/ggplot2-package.html
Determining the number of clusters in a data set: https://en.wikipedia.org/wiki/Determining_the_number_of_clusters_in_a_data_set
K-means clustering in R example: http://www.learnbymarketing.com/tutorials/k-means-clustering-in-r-example/