R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

Segmentation Objective - Dividing the target market or customers on the basis of significant features that could help a company sell more products with lower marketing expenses.

Market segmentation

Market segmentation is a strategy that divides a broad target market of customers into smaller, more similar groups, and then designs a marketing strategy specifically for each group. Clustering is a common technique for market segmentation since it automatically finds similar groups given a data set.

Create a product that evokes the needs and wants of the target market.

Imagine that you are the Director of Customer Relationships at Nike, and you have five managers working for you. You would like to organize all the company’s customers into five groups so that each group can be assigned to a different manager. Strategically, you would like the customers in each group to be as similar as possible.

Examples of Objectives
1. Identify the type of customers who would respond to a particular offer
2. Identify high spenders among customers who will use the e-commerce channel for festive shopping
3. Identify customers who will default on their credit obligation for a loan or credit card

Example

The file Segmentation.csv contains information on consumers’ perceptions toward a brand in the apparel industry. The purpose of the case analysis is to gain a better understanding of the consumer segments for the brand, in the hope that this understanding will allow the brand to develop effective segment- or product-specific advertising campaigns.

Questions
1. Can you perform a 5-cluster analysis? Yes, we can perform a 5-cluster analysis. It shows how the observations are divided among the clusters and how each cluster is segmented.
2. How many observations do you have in each cluster? Across the three clusters we have 22 total observations: 7 in customer service, 6 in professionalism, and 9 in pickup service.
3. List the cluster member IDs in each cluster. From the Excel spreadsheet, there are 14 member IDs across all three clusters. The member IDs: Customer Service is helpful, Recommend, Come again, All Product I need, Professionalism, Limitation, Online grocery delivery, Pick up service, Find items, other shops, Gender, Age, and Education.
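The three questions above map onto a few base R calls. The following is a minimal sketch rather than the analysis of the assignment data: the data frame `toy`, its column names, and its 100 synthetic respondents are assumptions made for illustration, and the seed and `nstart = 25` are arbitrary choices.

```r
# Hypothetical stand-in for the survey data: an id column plus three
# numeric rating columns (all names and values are made up).
set.seed(42)
toy <- data.frame(
  id      = 1:100,
  service = runif(100, min = 1, max = 5),
  pickup  = runif(100, min = 1, max = 5),
  price   = runif(100, min = 1, max = 5)
)

# Question 1: a 5-cluster analysis. The id column is dropped because it
# is a label, not a feature.
fit5 <- kmeans(toy[, -1], centers = 5, nstart = 25)

# Question 2: number of observations in each cluster.
sizes <- table(fit5$cluster)
print(sizes)

# Question 3: the customer IDs that belong to each cluster.
members <- split(toy$id, fit5$cluster)
print(members)
```

With real data, `toy` would be replaced by the data frame read from the CSV file, keeping the ID in the first column.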

Questions to submit

1. Basic descriptive statistics (mean, SD, outliers, etc.) The mean is 2.136363636 and the SD is 0.833549755. The mean is the average value of the data. The outliers are difficult to identify and interpret.

  1. How many clusters do we have? 3 clusters

  2. How many observations do you have in each cluster, respectively? As mentioned before, we have 7 with Customer Service, 6 with professionalism, and 9 in pickup service.

  3. List the cluster membership (the customer IDs) for each cluster. The member IDs: Customer Service is helpful, Recommend, Come again, All Product I need, Professionalism, Limitation, Online grocery delivery, Pick up service, Find items, other shops, Gender, Age, and Education.

  4. What are common characteristics of the customers in each cluster? The most common characteristics are the overlapping ones, which reflect the modes of the responses.

  5. List at least two R functions (mean(), library(help = ggplot2), etc.) you have learned so far and explain it by referring to an online article.

  6. A 50-word reflection of your learning experience (e.g., three things you have learned this week). How could it benefit you as a future manager? Cite online sources if possible. Alternatively, you may create 2 Tweets with the hashtag #csubmktr20 and post the URL of your Tweets here. Overall, this semester has been a combination of haymakers and uppercuts. I have learned many coding techniques using R and RStudio and tying them into RPubs. The semester started off tough, and I was discouraged from doing any assignment until I became more comfortable with the work and it became very smooth. As a future manager, it will help me understand who my customers are and the perks they enjoy.

  7. Two responses to your peers. Coming soon!
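For the descriptive-statistics and cluster-characteristics questions in the list above, a hedged base R sketch may help. Everything in it is an assumption for illustration: the data frame `toy`, the column names `service` and `pickup`, the helper `find_outliers()`, and the choice of the 1.5 × IQR rule as the outlier criterion. None of the numbers come from the assignment data.

```r
# Hypothetical survey: 60 respondents, two rating columns, two groups
# built in on purpose so the clusters are easy to see.
set.seed(1)
toy <- data.frame(
  id      = 1:60,
  service = c(rnorm(30, mean = 2, sd = 0.3), rnorm(30, mean = 4, sd = 0.3)),
  pickup  = c(rnorm(30, mean = 4, sd = 0.3), rnorm(30, mean = 2, sd = 0.3))
)

# Mean and standard deviation of every rating column.
desc <- sapply(toy[, -1], function(v) c(mean = mean(v), sd = sd(v)))
print(desc)

# Outliers by the 1.5 * IQR rule (one common convention, assumed here).
find_outliers <- function(v) {
  q <- quantile(v, c(0.25, 0.75))
  fence <- 1.5 * IQR(v)
  v[v < q[1] - fence | v > q[2] + fence]
}
flagged <- find_outliers(c(2, 3, 1, 2, 3, 2, 1, 3, 2, 2, 9))
print(flagged)

# Common characteristics per cluster: the average of each variable
# within each cluster (this matches the centers kmeans reports).
fit <- kmeans(toy[, -1], centers = 2, nstart = 25)
profile <- aggregate(toy[, -1], by = list(cluster = fit$cluster), FUN = mean)
print(profile)
```

Reading the rows of `profile` side by side shows which variables separate the clusters, which is one way to describe what the customers in each cluster have in common.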

#install.packages('dplyr')
library(dplyr) # sane data manipulation
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(tidyr) # sane data munging
library(ggplot2) # needs no introduction
library(ggfortify) # super-helpful for plotting non-"standard" stats objects

#identifying your working directory
install.packages("readr")
## Installing package into '/home/rstudio-user/R/x86_64-pc-linux-gnu-library/4.0'
## (as 'lib' is unspecified)
library(readr)

mydata <- read_csv('Segmentation.csv')
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   .default = col_double()
## )
## ℹ Use spec() for the full column specifications.
# read csv file
#This allows you to read the data from my Github site.

#Open the data. Note that some students will see an Excel option in "Import Dataset";
#those that do not will need to save the original data as a csv and import that as a text file.
#rm(list = ls()) #used to clean your working environment
fit <- kmeans(mydata[,-1], 3, iter.max=1000)
#exclude the first column since it is "id" instead of a factor
#or variable.
#3 means you want to have 3 clusters
table(fit$cluster)
## 
##  1  2  3 
## 66 90 65
barplot(table(fit$cluster), col="#336699") #plot
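Because kmeans() picks its starting centers at random, the cluster numbering and sizes can differ between runs of the call above. A minimal sketch of one way to make a run repeatable, assuming synthetic data: the matrix `x`, the seed value 2024, and `nstart = 25` are illustrative choices and not part of the original analysis.

```r
# Synthetic two-column data (hypothetical, not the survey file).
set.seed(123)
x <- matrix(rnorm(200), ncol = 2)

# Fixing the seed right before each call makes the random starts identical.
# nstart = 25 tries 25 random starts and keeps the best solution.
set.seed(2024)
fit_a <- kmeans(x, centers = 3, nstart = 25, iter.max = 1000)

set.seed(2024)
fit_b <- kmeans(x, centers = 3, nstart = 25, iter.max = 1000)

# Same seed, same data, same settings: the assignments match exactly.
same_solution <- identical(fit_a$cluster, fit_b$cluster)
print(same_solution)
print(table(fit_a$cluster))
```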

pca <- prcomp(mydata[,-1]) #principal component analysis
pca_data <- mutate(fortify(pca), col=fit$cluster)
#We want to examine the cluster memberships for each
#observation - see last column

ggplot(pca_data) +
  geom_point(aes(x=PC1, y=PC2, fill=factor(col)), size=3, col="#7f7f7f", shape=21) +
  theme_bw(base_family="Helvetica")
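A scatter plot of PC1 against PC2 is only as informative as the share of variance those two components capture. The sketch below shows how that share could be checked; it assumes a synthetic matrix `x` with made-up column names, whereas in the workflow above the `pca` object returned by prcomp() would be inspected instead.

```r
# Synthetic data with three numeric columns (hypothetical names and values).
set.seed(7)
x <- matrix(rnorm(300), ncol = 3,
            dimnames = list(NULL, c("service", "pickup", "price")))

pca <- prcomp(x)

# Each component's variance is its squared standard deviation;
# dividing by the total gives the proportion of variance explained.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Share of the total variance shown in a PC1 vs PC2 plot.
first_two <- sum(var_explained[1:2])
print(first_two)
```

The same proportions appear in the output of `summary(pca)`.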

autoplot(fit, data=mydata[,-1], frame=TRUE, frame.type='norm')
## Warning: select_() is deprecated as of dplyr 0.7.0.
## Please use select() instead.
## This warning is displayed once every 8 hours.
## Call lifecycle::last_warnings() to see where this warning was generated.

write.csv(pca_data, "pca_data.csv")
#save your cluster solutions in the working directory
#We want to examine the cluster memberships for each observation - see last column of pca_data
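To check a saved cluster solution, the file can be read back and the membership column tabulated. This is a sketch under stated assumptions: the data frame `toy` and its values are hypothetical, and a temporary file is used so that nothing in the working directory is overwritten. By default write.csv() also writes the row names as an extra first column; `row.names = FALSE` leaves that column out.

```r
# Hypothetical miniature of pca_data: two components plus the cluster
# membership in the last column, named col as in the code above.
toy <- data.frame(
  PC1 = c(-1.2, 0.4, 0.8),
  PC2 = c(0.3, -0.5, 0.2),
  col = c(1L, 2L, 2L)
)

# Write to a temporary file instead of the working directory.
out_file <- tempfile(fileext = ".csv")
write.csv(toy, out_file, row.names = FALSE)

# Read the file back and count the observations per cluster.
back <- read.csv(out_file)
print(back)
print(table(back$col))
```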