Introduction to Market Segmentation Analysis using R

Market segmentation is a crucial aspect of marketing analytics as it allows us to understand the diverse customer groups in a market and tailor marketing strategies accordingly. In this analysis, we will perform a cluster analysis on the dataset Kirin_Segmentation.csv to identify distinct market segments. This report will detail the steps taken in R to achieve our segmentation, including preparing the dataset, performing hierarchical clustering, and profiling the resulting clusters.

First, we make sure the required packages are installed and loaded. The “cluster” package provides additional methods for cluster analysis, although the functions used in this report (dist, hclust, cutree, and kmeans) all come from R's built-in stats package. We then clear the R environment to ensure a clean workspace and set the working directory to the location where the dataset is stored. Next, we load the segmentation data, select the relevant variables for the analysis, and normalize the data.

Normalization is key in preparing the data for clustering: rescaling every attribute to mean 0 and standard deviation 1 ensures that each attribute contributes equally to the distances, without bias due to scale differences. After normalizing the data, we compute a Euclidean distance matrix, which serves as the basis for our hierarchical clustering. Using complete linkage (the default method of hclust), we perform hierarchical clustering and determine the cluster membership of each data point.
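
To illustrate why the scale of the attributes matters, the following minimal sketch compares distances before and after standardization. The data frame, its values, and the column names `rating` and `weekly_ml` are hypothetical and are not taken from Kirin_Segmentation.csv.

```r
# Toy data: two attributes on very different scales (hypothetical values)
toy <- data.frame(
  rating    = c(1, 9, 2),          # a 0-9 rating
  weekly_ml = c(3000, 3100, 500)   # weekly consumption in millilitres
)

# Raw Euclidean distances are dominated by the large-scale column
raw_dist <- as.matrix(dist(toy))

# z-score each column (subtract the column mean, divide by the column sd)
toy_z  <- scale(toy)
z_dist <- as.matrix(dist(toy_z))

print(round(raw_dist, 2))
print(round(z_dist, 2))
```

In this toy case, respondent 1 is closest to respondent 2 on the raw data, because only the consumption column matters, but closest to respondent 3 once both columns are standardized.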

Finally, we will characterize the clusters by calculating the mean of each attribute within each cluster, thus profiling them. The output of this process will be used to answer subsequent questions about the number of distinct market segments, their detailed profiles, and which segments to target for marketing campaigns.

Below is the R code that initiates this segmentation analysis:

# Installation and loading of the "cluster" package
# install.packages("cluster") # Uncomment this line if "cluster" is not installed
library(cluster)
# Clearing the R environment
rm(list = ls(all = TRUE))

# Setting the working directory to where the dataset is located
setwd("C:/Users/gambe/Downloads")

# Loading the segmentation data
segdata <- read.csv("Kirin_Segmentation.csv")

# Exploring the dataset structure, viewing the first few rows and the variable names
str(segdata)
## 'data.frame':    317 obs. of  34 variables:
##  $ id                                 : int  6861 4129 4393 445 7393 964 6773 461 7156 5785 ...
##  $ Rich.full.bodied                   : int  0 0 0 8 9 0 8 5 9 8 ...
##  $ Light.beer                         : int  1 1 4 7 7 8 3 8 2 3 ...
##  $ No.aftertaste                      : int  2 7 4 0 7 7 5 8 4 6 ...
##  $ Refreshing                         : int  3 0 8 0 9 0 7 2 6 5 ...
##  $ Goes.down.easily                   : int  6 0 7 0 7 0 5 8 5 5 ...
##  $ Gives.a..buzz.                     : int  5 1 5 2 8 4 8 1 6 7 ...
##  $ Good.taste                         : int  0 0 9 0 9 0 7 8 9 9 ...
##  $ Low.price                          : int  1 1 2 3 5 0 3 5 6 8 ...
##  $ Good.value                         : int  8 1 6 8 0 0 2 2 7 8 ...
##  $ From.country.with.brewing.tradition: int  3 1 3 3 8 2 7 2 5 8 ...
##  $ Attractive.bottle                  : int  2 1 4 5 6 3 6 1 2 6 ...
##  $ Prestigious.brand                  : int  2 1 3 2 5 2 4 2 4 3 ...
##  $ High.quality                       : int  0 9 0 2 8 8 8 0 8 9 ...
##  $ Drink.at.picnics                   : int  8 8 6 5 6 8 6 5 5 5 ...
##  $ Masculine                          : int  1 1 5 1 1 0 5 1 3 5 ...
##  $ For.young.people                   : int  1 1 4 1 8 1 3 2 2 5 ...
##  $ Drink.with.friends                 : int  8 7 7 2 6 8 6 5 8 7 ...
##  $ Drink.at.home                      : int  0 7 0 2 8 9 5 9 8 8 ...
##  $ To.serve.dinner.guests             : int  7 6 6 9 6 2 7 8 6 6 ...
##  $ For.dining.out                     : int  9 6 8 7 5 2 8 5 5 7 ...
##  $ Drink.at.bar                       : int  7 8 7 1 8 2 5 5 4 7 ...
##  $ Weekly.consumption                 : int  2 9 6 24 2 8 5 12 1 8 ...
##  $ Age..1.7.                          : int  5 6 4 5 2 5 4 5 4 2 ...
##  $ Income..1.7.                       : int  4 7 6 7 3 5 4 4 5 6 ...
##  $ Education..1.6.                    : int  5 3 6 5 5 5 5 2 3 3 ...
##  $ Sex..M.1.                          : int  1 1 1 2 2 2 1 1 1 1 ...
##  $ Adapt.to.new.situations            : int  4 4 4 3 3 4 3 3 4 3 ...
##  $ Make.friends.easily                : int  3 4 4 2 3 4 2 4 2 2 ...
##  $ Don.t.like.to.be.tied.to.timetable : int  4 3 4 4 3 3 4 3 4 3 ...
##  $ Like.to.take.chances               : int  3 4 4 3 3 4 3 2 3 3 ...
##  $ Like.to.travel.abroad              : int  4 3 4 4 3 4 2 2 4 4 ...
##  $ Like.ethnic.food                   : int  4 4 4 3 2 4 3 2 3 3 ...
##  $ Knowledgeable.about.beer           : int  3 3 4 4 3 3 4 2 2 4 ...
head(segdata)
##     id Rich.full.bodied Light.beer No.aftertaste Refreshing Goes.down.easily
## 1 6861                0          1             2          3                6
## 2 4129                0          1             7          0                0
## 3 4393                0          4             4          8                7
## 4  445                8          7             0          0                0
## 5 7393                9          7             7          9                7
## 6  964                0          8             7          0                0
##   Gives.a..buzz. Good.taste Low.price Good.value
## 1              5          0         1          8
## 2              1          0         1          1
## 3              5          9         2          6
## 4              2          0         3          8
## 5              8          9         5          0
## 6              4          0         0          0
##   From.country.with.brewing.tradition Attractive.bottle Prestigious.brand
## 1                                   3                 2                 2
## 2                                   1                 1                 1
## 3                                   3                 4                 3
## 4                                   3                 5                 2
## 5                                   8                 6                 5
## 6                                   2                 3                 2
##   High.quality Drink.at.picnics Masculine For.young.people Drink.with.friends
## 1            0                8         1                1                  8
## 2            9                8         1                1                  7
## 3            0                6         5                4                  7
## 4            2                5         1                1                  2
## 5            8                6         1                8                  6
## 6            8                8         0                1                  8
##   Drink.at.home To.serve.dinner.guests For.dining.out Drink.at.bar
## 1             0                      7              9            7
## 2             7                      6              6            8
## 3             0                      6              8            7
## 4             2                      9              7            1
## 5             8                      6              5            8
## 6             9                      2              2            2
##   Weekly.consumption Age..1.7. Income..1.7. Education..1.6. Sex..M.1.
## 1                  2         5            4               5         1
## 2                  9         6            7               3         1
## 3                  6         4            6               6         1
## 4                 24         5            7               5         2
## 5                  2         2            3               5         2
## 6                  8         5            5               5         2
##   Adapt.to.new.situations Make.friends.easily
## 1                       4                   3
## 2                       4                   4
## 3                       4                   4
## 4                       3                   2
## 5                       3                   3
## 6                       4                   4
##   Don.t.like.to.be.tied.to.timetable Like.to.take.chances Like.to.travel.abroad
## 1                                  4                    3                     4
## 2                                  3                    4                     3
## 3                                  4                    4                     4
## 4                                  4                    3                     4
## 5                                  3                    3                     3
## 6                                  3                    4                     4
##   Like.ethnic.food Knowledgeable.about.beer
## 1                4                        3
## 2                4                        3
## 3                4                        4
## 4                3                        4
## 5                2                        3
## 6                4                        3
names(segdata)
##  [1] "id"                                  "Rich.full.bodied"                   
##  [3] "Light.beer"                          "No.aftertaste"                      
##  [5] "Refreshing"                          "Goes.down.easily"                   
##  [7] "Gives.a..buzz."                      "Good.taste"                         
##  [9] "Low.price"                           "Good.value"                         
## [11] "From.country.with.brewing.tradition" "Attractive.bottle"                  
## [13] "Prestigious.brand"                   "High.quality"                       
## [15] "Drink.at.picnics"                    "Masculine"                          
## [17] "For.young.people"                    "Drink.with.friends"                 
## [19] "Drink.at.home"                       "To.serve.dinner.guests"             
## [21] "For.dining.out"                      "Drink.at.bar"                       
## [23] "Weekly.consumption"                  "Age..1.7."                          
## [25] "Income..1.7."                        "Education..1.6."                    
## [27] "Sex..M.1."                           "Adapt.to.new.situations"            
## [29] "Make.friends.easily"                 "Don.t.like.to.be.tied.to.timetable" 
## [31] "Like.to.take.chances"                "Like.to.travel.abroad"              
## [33] "Like.ethnic.food"                    "Knowledgeable.about.beer"
# Selecting columns for cluster analysis and removing the ID variable
segdata1 <- segdata[, 1:22]
z <- segdata1[, -1]
# Normalizing the data
means <- apply(z, 2, mean)
sds <- apply(z, 2, sd)
nor <- scale(z, center = means, scale = sds)
# Calculating the Euclidean distance matrix
distance <- dist(nor)
# Performing hierarchical clustering (complete linkage, the hclust default) and plotting the dendrogram
segdata.hclust <- hclust(distance, method = "complete")
plot(segdata.hclust, hang = -1)

# Determining cluster membership for 5 clusters
member <- cutree(segdata.hclust, 5)

# Adding cluster membership to the dataset and saving it
# (row.names = FALSE avoids writing an extra, unnamed index column)
segdata2 <- cbind(segdata, member)
write.csv(segdata2, 'Kirin_with_member.csv', row.names = FALSE)

# Characterizing clusters by calculating the mean of each attribute per cluster
cluster_means <- aggregate(segdata1[,-1], by = list(member), mean)
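
Before interpreting the cluster means, it can help to check how many respondents fall into each cluster, since a very small cluster may reflect outliers rather than a real segment. The sketch below runs the same steps (Euclidean distance, complete linkage, cut into 5 groups) on simulated data; the objects `sim`, `sim_hclust`, `sim_member`, and `sim_sizes` are hypothetical stand-ins, not part of the original analysis.

```r
# Simulated stand-in for the normalized ratings (hypothetical data, not the Kirin file)
set.seed(42)
sim <- scale(matrix(rnorm(60 * 4), nrow = 60, ncol = 4))

# Same pipeline as above: Euclidean distances, complete linkage, cut into 5 groups
sim_hclust <- hclust(dist(sim), method = "complete")
sim_member <- cutree(sim_hclust, k = 5)

# Cluster sizes as counts and as shares of the sample
sim_sizes <- table(sim_member)
print(sim_sizes)
print(round(prop.table(sim_sizes), 2))
```

For the objects in this report, the equivalent check would be table(member).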

Question 3B. Segmentation Analysis: Identifying Number of Market Segments

We can determine the number of distinct market segments by analyzing the within-group sum of squares (WSS) for different numbers of clusters. Here the WSS is computed from k-means solutions on the normalized data and used as a guide for where to cut the hierarchical tree. The “elbow method” identifies the point at which the decrease in WSS slows down markedly, which indicates a suitable number of clusters. Below is the R code used to generate the scree plot:

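# Scree plot: total within-group sum of squares for 1 to 20 k-means clusters
set.seed(123)  # arbitrary seed, added so the random k-means starts are reproducible
wss <- (nrow(nor) - 1) * sum(apply(nor, 2, var))  # WSS when all points form one cluster
for (i in 2:20) wss[i] <- sum(kmeans(nor, centers = i)$withinss)
plot(1:20, wss, type = "b", xlab = "Number of Clusters", ylab = "Within groups sum of squares")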

Based on the Scree Plot, we look for the “elbow” where the plot begins to flatten out, which indicates that adding more clusters does not significantly improve the fit.

The plot shows that the decrease in WSS is sharp up to around 5 clusters, after which it becomes more gradual. This suggests that the market can be divided into 5 distinct segments. Adding further segments would yield progressively smaller relative improvements in WSS and so would add little value to the segmentation model.

Hence, for the purposes of this segmentation analysis, we will select 5 as the number of distinct market segments.
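
The idea of a shrinking relative improvement can be made concrete by computing the percentage drop in WSS for each additional cluster. The sketch below is an illustration under stated assumptions, not the method used for the scree plot above: it uses simulated data, and it computes WSS for the hierarchical (complete-linkage) solutions themselves rather than for k-means. The helper `wss_for` and the objects `sim`, `wss_h`, and `pct_drop` are hypothetical names introduced here.

```r
# Simulated stand-in for the normalized data (hypothetical: three shifted groups)
set.seed(123)
sim <- scale(rbind(
  matrix(rnorm(40 * 3, mean = 0), ncol = 3),
  matrix(rnorm(40 * 3, mean = 4), ncol = 3),
  matrix(rnorm(40 * 3, mean = 8), ncol = 3)
))

# Total within-cluster sum of squares for a given membership vector
wss_for <- function(x, groups) {
  sum(sapply(split(as.data.frame(x), groups), function(g) {
    sum(scale(g, center = TRUE, scale = FALSE)^2)
  }))
}

# WSS of the complete-linkage tree cut at k = 1..8
sim_hclust <- hclust(dist(sim), method = "complete")
ks <- 1:8
wss_h <- sapply(ks, function(k) wss_for(sim, cutree(sim_hclust, k)))

# Percentage drop in WSS gained by each additional cluster
pct_drop <- 100 * -diff(wss_h) / head(wss_h, -1)
print(round(data.frame(k = ks[-1], wss = wss_h[-1], pct_drop = pct_drop), 1))
```

A candidate elbow is the last value of k before the percentage drops become consistently small. Because the cuts of one tree are nested, this WSS can never increase as k grows, which is not guaranteed for independent k-means runs with random starts.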

Question 3C. Profiling the Clusters

After determining that there are five distinct segments in the market, the next step is to profile each segment. Profiling involves analyzing the mean values of each attribute within the clusters to understand the characteristics that define each segment. The aggregate() function in R can be used to calculate the means for each cluster.

Here is the R code that profiles the clusters:

# Characterizing clusters by calculating means for each cluster
cluster_profiles <- aggregate(segdata1[,-1], by=list(cluster=member), mean)

# View the cluster profiles
print(cluster_profiles)
##   cluster Rich.full.bodied Light.beer No.aftertaste Refreshing Goes.down.easily
## 1       1         5.184615   3.076923      4.223077   5.638462         5.476923
## 2       2         2.302632   3.131579      3.552632   2.197368         2.552632
## 3       3         7.424242   4.060606      5.954545   7.333333         7.242424
## 4       4         2.875000   6.458333      4.750000   4.208333         6.416667
## 5       5         4.904762   5.666667      5.714286   5.095238         4.761905
##   Gives.a..buzz. Good.taste Low.price Good.value
## 1       3.000000  2.3923077  3.330769   4.323077
## 2       2.605263  0.4868421  3.302632   3.197368
## 3       4.681818  7.8939394  5.454545   6.727273
## 4       4.416667  1.5000000  5.500000   4.750000
## 5       3.476190  2.1904762  3.047619   5.285714
##   From.country.with.brewing.tradition Attractive.bottle Prestigious.brand
## 1                            3.515385          2.546154          2.607692
## 2                            2.815789          2.460526          1.697368
## 3                            4.757576          3.696970          4.742424
## 4                            5.375000          2.875000          4.625000
## 5                            4.571429          5.666667          5.809524
##   High.quality Drink.at.picnics Masculine For.young.people Drink.with.friends
## 1     4.746154         4.800000  2.100000         1.700000           5.330769
## 2     1.210526         2.421053  1.921053         1.815789           2.236842
## 3     6.803030         5.727273  4.196970         3.924242           6.348485
## 4     7.666667         6.333333  2.708333         3.375000           6.166667
## 5     3.666667         5.190476  4.047619         4.285714           2.809524
##   Drink.at.home To.serve.dinner.guests For.dining.out Drink.at.bar
## 1      4.538462               5.500000       5.423077     4.523077
## 2      1.973684               2.302632       2.236842     1.552632
## 3      6.424242               6.909091       6.893939     6.348485
## 4      5.541667               6.166667       6.375000     6.916667
## 5      3.714286               3.095238       5.333333     4.142857

The naming of the clusters is based on the distinguishing features that are prominent within each group. These profiles will assist in tailoring marketing campaigns to each segment’s specific preferences and behaviors.
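
One way to find the distinguishing features of a cluster is to express each cluster mean as a deviation from the overall sample mean, in units of the overall standard deviation. The sketch below shows this on a small made-up data set; the values, the membership vector `toy_member`, and the objects `toy_profiles` and `toy_z` are hypothetical and only borrow three attribute names from the data for readability.

```r
# Hypothetical ratings for six respondents in two clusters (not the Kirin data)
toy <- data.frame(
  Good.taste = c(8, 9, 7, 1, 2, 0),
  Low.price  = c(3, 4, 3, 3, 4, 3),
  Masculine  = c(1, 2, 1, 5, 4, 5)
)
toy_member <- c(1, 1, 1, 2, 2, 2)

# Cluster means, as in the profiling step
toy_profiles <- aggregate(toy, by = list(cluster = toy_member), FUN = mean)

# Express each cluster mean as a z-score relative to the full sample
overall_mean <- colMeans(toy)
overall_sd   <- apply(toy, 2, sd)
toy_z <- sweep(as.matrix(toy_profiles[, -1]), 2, overall_mean, FUN = "-")
toy_z <- sweep(toy_z, 2, overall_sd, FUN = "/")
rownames(toy_z) <- paste("cluster", toy_profiles$cluster)

# Large positive or negative values flag attributes that set a cluster apart
print(round(toy_z, 2))
```

In this toy example Low.price does not separate the two clusters at all, while Good.taste and Masculine do, so only the latter two would be useful for naming them.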

Question 3D. Targeting Clusters for Marketing Campaigns

Based on the detailed profiling of each cluster, we can now develop a targeted marketing strategy. Our objective is to maximize the impact of our campaign by focusing on the segments that are most likely to respond to our marketing efforts and that have the highest potential value to our brand.

The strategy for each cluster, whether to target it or avoid it, is based on its attributes and preferences as revealed by the cluster analysis. Aligning our marketing tactics with these insights helps ensure that our campaigns are well received and effective in driving brand growth within each targeted segment.

In conclusion, we will primarily target Clusters 1, 2, and 4, as these groups align with our brand’s value proposition and are likely to respond positively to our marketing efforts. Cluster 3 will receive a more moderate focus that leverages the brand’s tradition and heritage, while Cluster 5 will be engaged less aggressively because of its high price sensitivity.