Project Title: Understanding the Store Performance
NAME: Rajat R Shrivastav
EMAIL: rajatabhi605@gmail.com
COLLEGE: Shah and Anchor Kutchhi Engineering College
Let's bring in the data first for the analysis.
library(car)      # some(), scatterplot()
library(lattice)  # barchart()
setwd("C:/Users/Rajat/Desktop/Internship IIM Lucknow/Datasets/Offlinel4")
hhm <- read.csv("CapstoneTATA.csv")
some(hhm)
## Shop.ID Location.Id Region.Id
## 2 2 46132 6175
## 6 6 46261 6189
## 36 36 46086 6184
## 77 77 46107 6180
## 107 107 46128 6158
## 118 118 46109 6192
## 121 121 46109 6192
## 126 126 46110 6221
## 131 131 46104 6155
## 156 156 46108 6161
## Location.Name City
## 2 ACTIVE AUTOMOBILES Ahmedabad
## 6 Adishakti Cars Pvt Ltd 3003160 Shimoga
## 36 CONCORDE MOTORS (INDIA) LIMITED-3005350-BANGALORE Bangalore
## 77 FAIRDEAL MOTORS & WORKSHOP PVT. LTD.-3002901 Jammu
## 107 JASPER AUTOMOBILES PVT. LTD.-3009210 Vijayawada
## 118 KVR DREAM VEHICLES PVT LTD (Kannur) Kannur
## 121 KVR DREAM VEHICLES PVT LTD (Kannur) Kannur
## 126 LEXUS MOTORS LTD-3002150 Kolkata
## 131 MALIK CARS Hyderabad
## 156 PRAGATI TRADING COMPANY LTD Jorhat
## State Tier Brand Overall.Score First.Impression
## 2 Gujarat Tier 1 Tiago 70 78
## 6 Karnataka Tier 3 Hexa 77 89
## 36 Karnataka Tier 1 Tigor 84 40
## 77 Jammu and Kashmir Tier 2 Tigor 28 44
## 107 Andhra Pradesh Tier 3 Tigor 60 40
## 118 Kerala Tier 2 Hexa 87 100
## 121 Kerala Tier 2 Tigor 77 100
## 126 West Bengal Tier 1 Tigor 57 89
## 131 Andhra Pradesh Tier 1 Tigor 66 89
## 156 Assam Tier 3 Tigor 46 67
## Showroom.Ambience Display.Vehicles Showroom.Facility Meet...Greet
## 2 60 100 80 36
## 6 100 100 0 75
## 36 100 100 100 58
## 77 60 33 20 50
## 107 60 100 0 75
## 118 100 100 100 100
## 121 100 100 100 58
## 126 100 100 0 92
## 131 100 100 0 75
## 156 80 100 40 67
## CA.Grooming CA.Selling.Skills Need.Analysis Product.Knowledge
## 2 100 90 78 100
## 6 100 97 89 100
## 36 100 87 100 84
## 77 50 12 0 21
## 107 75 50 78 47
## 118 100 87 100 84
## 121 100 100 100 100
## 126 100 25 44 16
## 131 100 87 89 84
## 156 100 38 33 47
## Test.drive Follow.up
## 2 88 40
## 6 88 100
## 36 100 100
## 77 62 0
## 107 88 88
## 118 88 100
## 121 85 0
## 126 58 88
## 131 96 0
## 156 0 0
Let's have a look at the dimensions of the dataset.
dim(hhm)
## [1] 247 20
Let's have a look at the data types and the summary statistics.
str(hhm)
## 'data.frame': 247 obs. of 20 variables:
## $ Shop.ID : int 1 2 3 4 5 6 7 8 9 10 ...
## $ Location.Id : int 46132 46132 46132 46132 46132 46261 46261 46261 46261 46261 ...
## $ Region.Id : int 6175 6175 6175 6175 6175 6189 6189 6189 6189 6189 ...
## $ Location.Name : Factor w/ 50 levels "ACTIVE AUTOMOBILES",..: 1 1 1 1 1 2 2 2 2 2 ...
## $ City : Factor w/ 34 levels "Ahmedabad","Bangalore",..: 1 1 1 1 1 30 30 30 30 30 ...
## $ State : Factor w/ 16 levels "Andhra Pradesh",..: 5 5 5 5 5 9 9 9 9 9 ...
## $ Tier : Factor w/ 3 levels "Tier 1","Tier 2",..: 1 1 1 1 1 3 3 3 3 3 ...
## $ Brand : Factor w/ 3 levels "Hexa","Tiago",..: 1 2 2 3 3 1 2 2 3 3 ...
## $ Overall.Score : int 74 70 63 78 68 77 72 76 71 69 ...
## $ First.Impression : int 89 78 67 89 78 89 89 89 89 89 ...
## $ Showroom.Ambience: int 80 60 100 100 80 100 100 100 100 100 ...
## $ Display.Vehicles : int 100 100 100 100 100 100 100 100 100 100 ...
## $ Showroom.Facility: int 100 80 80 100 80 0 0 0 0 0 ...
## $ Meet...Greet : int 100 36 75 100 27 75 75 83 75 67 ...
## $ CA.Grooming : int 100 100 100 100 100 100 100 100 100 100 ...
## $ CA.Selling.Skills: int 79 90 81 93 88 97 85 93 78 76 ...
## $ Need.Analysis : int 67 78 78 78 78 89 78 100 78 67 ...
## $ Product.Knowledge: int 100 100 84 100 100 100 95 89 84 79 ...
## $ Test.drive : int 92 88 0 85 92 88 73 88 88 85 ...
## $ Follow.up : int 0 40 0 0 0 100 83 83 83 88 ...
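The Factor columns in the str() output above depend on how read.csv() converts text. Since R 4.0.0 the default is stringsAsFactors = FALSE, so on a recent R the same call would give character columns instead. Here is a minimal sketch of asking for factors explicitly. The CSV snippet is made up for illustration and is not the real CapstoneTATA.csv.

```r
# Hypothetical three-row extract; the values are made up for illustration.
csv_text <- "Shop.ID,Tier,Brand,Overall.Score
1,Tier 1,Hexa,74
2,Tier 1,Tiago,70
3,Tier 3,Tigor,69"

# stringsAsFactors = TRUE reproduces the pre-4.0 behaviour seen in str(hhm).
demo <- read.csv(text = csv_text, stringsAsFactors = TRUE)

str(demo)
levels(demo$Tier)
```

With factors in place, table(), xtabs() and aggregate() group by level exactly as in the rest of this analysis.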
summary(hhm)
## Shop.ID Location.Id Region.Id
## Min. : 1.0 Min. :46086 Min. :6155
## 1st Qu.: 62.5 1st Qu.:46102 1st Qu.:6180
## Median :124.0 Median :46117 Median :6194
## Mean :124.0 Mean :46164 Mean :6192
## 3rd Qu.:185.5 3rd Qu.:46262 3rd Qu.:6210
## Max. :247.0 Max. :46468 Max. :6222
##
## Location.Name City
## ACTIVE AUTOMOBILES : 5 Bangalore: 20
## Adishakti Cars Pvt Ltd 3003160 : 5 Delhi : 15
## AUTOVIKAS SALES & SERVICE PVT. LTD.: 5 Hyderabad: 15
## BASUDEB AUTO LIMITED-3000180 : 5 Kolkata : 15
## Berkerly Tata Motors : 5 Pune : 15
## Bijjargi Motors 3003010 : 5 Bhopal : 10
## (Other) :217 (Other) :157
## State Tier Brand Overall.Score
## Karnataka :40 Tier 1:110 Hexa :50 Min. :28.00
## Maharashtra :37 Tier 2: 80 Tiago:99 1st Qu.:61.00
## Andhra Pradesh:25 Tier 3: 57 Tigor:98 Median :68.00
## Tamil Nadu :25 Mean :66.25
## Uttar Pradesh :20 3rd Qu.:74.00
## West Bengal :20 Max. :89.00
## (Other) :80
## First.Impression Showroom.Ambience Display.Vehicles Showroom.Facility
## Min. : 20.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 67.00 1st Qu.: 80.00 1st Qu.:100.00 1st Qu.: 0.00
## Median : 80.00 Median :100.00 Median :100.00 Median : 40.00
## Mean : 77.19 Mean : 91.09 Mean : 98.38 Mean : 45.43
## 3rd Qu.: 89.00 3rd Qu.:100.00 3rd Qu.:100.00 3rd Qu.:100.00
## Max. :100.00 Max. :100.00 Max. :100.00 Max. :100.00
##
## Meet...Greet CA.Grooming CA.Selling.Skills Need.Analysis
## Min. : 0.00 Min. : 50.00 Min. : 12.00 Min. : 0.00
## 1st Qu.: 58.00 1st Qu.:100.00 1st Qu.: 65.00 1st Qu.: 56.00
## Median : 75.00 Median :100.00 Median : 78.00 Median : 78.00
## Mean : 71.09 Mean : 97.37 Mean : 73.98 Mean : 69.62
## 3rd Qu.: 83.00 3rd Qu.:100.00 3rd Qu.: 89.00 3rd Qu.: 89.00
## Max. :100.00 Max. :100.00 Max. :100.00 Max. :100.00
##
## Product.Knowledge Test.drive Follow.up
## Min. : 16.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 74.00 1st Qu.: 77.00 1st Qu.: 0.00
## Median : 84.00 Median : 85.00 Median : 0.00
## Mean : 80.33 Mean : 75.62 Mean : 25.62
## 3rd Qu.: 95.00 3rd Qu.: 92.00 3rd Qu.: 57.00
## Max. :100.00 Max. :100.00 Max. :100.00
##
Starting off with one-way contingency tables, in the order city-wise, state-wise, tier-wise and brand-wise distribution.
Each count denotes the number of secret audits at that location.
u=table(hhm$City)
u
##
## Ahmedabad Bangalore Bhopal Bijapur Chandigarh Chennai
## 5 20 10 5 5 10
## Coimbatore Delhi Hisar Howrah Hubli Hyderabad
## 10 15 5 5 5 15
## Indore Jammu Jorhat Kannur Kanpur Kolkata
## 5 5 5 5 5 15
## Kozhikode Lucknow Moga Mumbai Mysore Nagpur
## 5 10 5 10 5 5
## Navi Mumbai Noida Pune Ranchi Ratnagiri Shimoga
## 5 5 15 5 2 5
## Thrissur Vijayawada Viluppuram Warangal
## 5 5 5 5
e=table(hhm$State)
e
##
## Andhra Pradesh Assam Chandigarh Delhi
## 25 5 5 15
## Gujarat Haryana Jammu and Kashmir Jharkhand
## 5 5 5 5
## Karnataka Kerala Madhya Pradesh Maharashtra
## 40 15 15 37
## Punjab Tamil Nadu Uttar Pradesh West Bengal
## 5 25 20 20
o=table(hhm$Tier)
o
##
## Tier 1 Tier 2 Tier 3
## 110 80 57
p=table(hhm$Brand)
p
##
## Hexa Tiago Tigor
## 50 99 98
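Raw counts are easier to compare as shares. A minimal sketch, assuming we rebuild the Tier column from the counts printed above (110, 80 and 57) instead of loading the real file:

```r
# Rebuild the Tier column from the counts printed above (110, 80, 57).
tier <- factor(rep(c("Tier 1", "Tier 2", "Tier 3"), times = c(110, 80, 57)))

tier_counts <- table(tier)

# prop.table() turns counts into proportions; scale to percentages.
tier_share <- round(100 * prop.table(tier_counts), 1)
tier_share
```

The same prop.table() call should work on any of the one-way tables above, for example prop.table(table(hhm$Brand)).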
Moving on to two-way contingency tables.
w = xtabs(~Location.Name + Brand, data= hhm)
addmargins(w)
## Brand
## Location.Name Hexa Tiago Tigor Sum
## ACTIVE AUTOMOBILES 1 2 2 5
## Adishakti Cars Pvt Ltd 3003160 1 2 2 5
## AUTOVIKAS SALES & SERVICE PVT. LTD. 1 2 2 5
## Bafna Motors (Ratnagiri) Pvt Ltd 3005150 1 1 0 2
## BASUDEB AUTO LIMITED-3000180 1 2 2 5
## Berkerly Tata Motors 1 2 2 5
## Bijjargi Motors 3003010 1 2 2 5
## CONCORDE MOTORS (INDIA) LIMITED-3005350-BANGALORE 1 2 2 5
## CONCORDE MOTORS (INDIA) LIMITED-3005450-CHENNAI 1 2 2 5
## CONCORDE MOTORS (INDIA) LIMITED-3005550-HYDERABAD 1 2 2 5
## CONCORDE MOTORS (INDIA) LIMITED-3005805-PUNE 1 2 2 5
## CONCORDE MOTORS (INDIA) LTD - 3005800-MUMBAI 1 2 2 5
## CONCORDE MOTORS INDIA LIMITED-DELHI 1 2 2 5
## Dada Motors Private Limited 1 2 2 5
## EBONY AUTOMOBILES PVT LTD/AADYA MOTORS 1 2 2 5
## FAIRDEAL MOTORS & WORKSHOP PVT. LTD.-3002901 1 2 2 5
## FORTUNE CARS PVT. LTD. 1 2 2 5
## GOLDRUSH SALES & SERVICES LTD 1 2 2 5
## HYSON MOTORS (P) LTD 1 2 2 5
## JABALPUR MOTORS LTD 1 2 2 5
## JAIKA MOTORS LIMITED-3002400 1 2 2 5
## JASPER AUTOMOBILES PVT. LTD.-3009210 1 2 2 5
## KB Motors Pvt Ltd 3001630 1 2 2 5
## KHT MOTORS 1 2 2 5
## KVR DREAM VEHICLES PVT LTD (Kannur) 1 2 2 5
## LEXUS MOTORS LTD-3002150 1 2 2 5
## MALIK CARS 1 2 2 5
## MANICKBAG AUTOMOBILES PVT LTD-3002970 1 2 2 5
## MARINA MOTORS(INDIA) PVT LTD 1 2 2 5
## MCTC EXIM PVT LTD 1 2 2 5
## NATIONAL AUTO WHEELS PVT LTD 1 2 2 5
## PRAGATI TRADING COMPANY LTD 1 2 2 5
## PRERANA MOTORS (P) LTD-3002450 1 2 2 5
## RD Motors Pvt Ltd 3008940 1 2 2 5
## S R TRANZCARS PVT. LTD. 1 2 2 5
## Sagar Motors 3006230 1 2 2 5
## Schakralaya Motors 3002310 1 2 2 5
## SELECT MOTORS 1 2 2 5
## Society Motors Ltd 3000530 1 2 2 5
## Sridha Motors Pvt Ltd 3006190 1 2 2 5
## SRM MOTORS 1 2 2 5
## TAFE ACCESS LIMITED 1 2 2 5
## TAFE ACCESS LTD-3006900 1 2 2 5
## TC Motors Pvt Ltd 3000400 1 2 2 5
## Telmos Automobiles Pvt. Ltd. 1 2 2 5
## URS KAR SERVICE CENTRE (P) LTD 1 2 2 5
## Varenyam Motor Car 3006770 1 2 2 5
## VEER MOTOR COMPANY 1 2 2 5
## VENKATARAMANA MOTORS - 3008780 1 2 2 5
## WASAN MOTORS LTD. 1 2 2 5
## Sum 50 99 98 247
w = xtabs(~City + Tier, data= hhm)
addmargins(w)
## Tier
## City Tier 1 Tier 2 Tier 3 Sum
## Ahmedabad 5 0 0 5
## Bangalore 20 0 0 20
## Bhopal 0 10 0 10
## Bijapur 0 0 5 5
## Chandigarh 0 5 0 5
## Chennai 10 0 0 10
## Coimbatore 0 10 0 10
## Delhi 15 0 0 15
## Hisar 0 0 5 5
## Howrah 0 0 5 5
## Hubli 0 0 5 5
## Hyderabad 15 0 0 15
## Indore 0 5 0 5
## Jammu 0 5 0 5
## Jorhat 0 0 5 5
## Kannur 0 5 0 5
## Kanpur 0 5 0 5
## Kolkata 15 0 0 15
## Kozhikode 0 5 0 5
## Lucknow 0 10 0 10
## Moga 0 0 5 5
## Mumbai 10 0 0 10
## Mysore 0 5 0 5
## Nagpur 0 5 0 5
## Navi Mumbai 5 0 0 5
## Noida 0 5 0 5
## Pune 15 0 0 15
## Ranchi 0 0 5 5
## Ratnagiri 0 0 2 2
## Shimoga 0 0 5 5
## Thrissur 0 0 5 5
## Vijayawada 0 0 5 5
## Viluppuram 0 0 5 5
## Warangal 0 5 0 5
## Sum 110 80 57 247
w = xtabs(~City + Brand, data= hhm)
addmargins(w)
## Brand
## City Hexa Tiago Tigor Sum
## Ahmedabad 1 2 2 5
## Bangalore 4 8 8 20
## Bhopal 2 4 4 10
## Bijapur 1 2 2 5
## Chandigarh 1 2 2 5
## Chennai 2 4 4 10
## Coimbatore 2 4 4 10
## Delhi 3 6 6 15
## Hisar 1 2 2 5
## Howrah 1 2 2 5
## Hubli 1 2 2 5
## Hyderabad 3 6 6 15
## Indore 1 2 2 5
## Jammu 1 2 2 5
## Jorhat 1 2 2 5
## Kannur 1 2 2 5
## Kanpur 1 2 2 5
## Kolkata 3 6 6 15
## Kozhikode 1 2 2 5
## Lucknow 2 4 4 10
## Moga 1 2 2 5
## Mumbai 2 4 4 10
## Mysore 1 2 2 5
## Nagpur 1 2 2 5
## Navi Mumbai 1 2 2 5
## Noida 1 2 2 5
## Pune 3 6 6 15
## Ranchi 1 2 2 5
## Ratnagiri 1 1 0 2
## Shimoga 1 2 2 5
## Thrissur 1 2 2 5
## Vijayawada 1 2 2 5
## Viluppuram 1 2 2 5
## Warangal 1 2 2 5
## Sum 50 99 98 247
w = xtabs(~State + Brand, data= hhm)
addmargins(w)
## Brand
## State Hexa Tiago Tigor Sum
## Andhra Pradesh 5 10 10 25
## Assam 1 2 2 5
## Chandigarh 1 2 2 5
## Delhi 3 6 6 15
## Gujarat 1 2 2 5
## Haryana 1 2 2 5
## Jammu and Kashmir 1 2 2 5
## Jharkhand 1 2 2 5
## Karnataka 8 16 16 40
## Kerala 3 6 6 15
## Madhya Pradesh 3 6 6 15
## Maharashtra 8 15 14 37
## Punjab 1 2 2 5
## Tamil Nadu 5 10 10 25
## Uttar Pradesh 4 8 8 20
## West Bengal 4 8 8 20
## Sum 50 99 98 247
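The City by Tier table above has exactly one non-zero cell per row, which means every city belongs to a single tier. A small sketch of checking that property from a cross-tabulation, using made-up toy data (the data frame audits is hypothetical):

```r
# Toy data: hypothetical audits, not the real dataset.
audits <- data.frame(
  City = c("Pune", "Pune", "Bhopal", "Bhopal", "Moga"),
  Tier = c("Tier 1", "Tier 1", "Tier 2", "Tier 2", "Tier 3")
)

city_tier <- xtabs(~ City + Tier, data = audits)

# A city is nested in a tier when exactly one cell in its row is non-zero.
tiers_per_city <- rowSums(city_tier > 0)
all(tiers_per_city == 1)
```

When the check holds, a City effect and a Tier effect cannot be separated in a model, which is worth keeping in mind for later analysis.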
Let's look at different aggregations of the data to gain deeper insights.
FOR OVERALL SCORE
Let's calculate the mean overall score.
mean(hhm$Overall.Score)
## [1] 66.24696
options(digits = 0)
m<- aggregate(hhm$Overall.Score,by=list(City=hhm$City),mean)
names(m)[2] <- "Overall Score"
m
## City Overall Score
## 1 Ahmedabad 71
## 2 Bangalore 70
## 3 Bhopal 58
## 4 Bijapur 67
## 5 Chandigarh 73
## 6 Chennai 68
## 7 Coimbatore 65
## 8 Delhi 68
## 9 Hisar 61
## 10 Howrah 70
## 11 Hubli 62
## 12 Hyderabad 69
## 13 Indore 53
## 14 Jammu 47
## 15 Jorhat 49
## 16 Kannur 81
## 17 Kanpur 68
## 18 Kolkata 63
## 19 Kozhikode 81
## 20 Lucknow 62
## 21 Moga 53
## 22 Mumbai 66
## 23 Mysore 65
## 24 Nagpur 60
## 25 Navi Mumbai 56
## 26 Noida 67
## 27 Pune 72
## 28 Ranchi 76
## 29 Ratnagiri 72
## 30 Shimoga 73
## 31 Thrissur 71
## 32 Vijayawada 60
## 33 Viluppuram 76
## 34 Warangal 72
seg.mean <- aggregate(Overall.Score ~ City,data = hhm,mean)
barchart( Overall.Score ~ City,data = seg.mean,xlab="Different Cities",
main="Bar chart of mean Overall Score Citywise",
col=c("red","blue","yellow","darkorange","seagreen"))
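With 34 cities, the aggregated means are easier to read once they are sorted. A minimal sketch on made-up scores (the data frame scores is hypothetical and stands in for hhm):

```r
# Toy data: hypothetical scores, not the real dataset.
scores <- data.frame(
  City          = c("Kannur", "Kannur", "Jammu", "Jammu", "Pune", "Pune"),
  Overall.Score = c(87, 77, 28, 66, 70, 74)
)

city_mean <- aggregate(Overall.Score ~ City, data = scores, FUN = mean)

# Sort from best to worst so the ranking can be read top-down.
city_mean <- city_mean[order(-city_mean$Overall.Score), ]
city_mean
```

For the chart itself, wrapping the grouping variable in reorder(City, Overall.Score) inside the barchart() formula should order the bars by their mean in a similar way.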
options(digits = 0)
m=aggregate(hhm$Overall.Score,by=list(State=hhm$State),mean)
names(m)[2] <- "Overall Score"
m
## State Overall Score
## 1 Andhra Pradesh 68
## 2 Assam 49
## 3 Chandigarh 73
## 4 Delhi 68
## 5 Gujarat 71
## 6 Haryana 61
## 7 Jammu and Kashmir 47
## 8 Jharkhand 76
## 9 Karnataka 68
## 10 Kerala 78
## 11 Madhya Pradesh 57
## 12 Maharashtra 67
## 13 Punjab 53
## 14 Tamil Nadu 68
## 15 Uttar Pradesh 65
## 16 West Bengal 64
seg.mean <- aggregate(Overall.Score ~ State,data = hhm,mean)
barchart( Overall.Score ~ State,data = seg.mean,xlab="States",
main="Bar chart of mean Overall Score statewise",
col=c("palevioletred","gold","purple","darkorange","navy"))
# State is a factor, so car draws one box plot per state and
# returns the row labels of the outlying audits.
Boxplot(Overall.Score ~ State, data=hhm,
main="Box plots of Overall Score by State",
xlab="States",
ylab="Overall Score")
## [1] "104" "46" "22" "21" "84" "200"
options(digits = 0)
m=aggregate(hhm$Overall.Score,by=list(Tier=hhm$Tier),mean)
names(m)[2] <- "Overall Score"
m
## Tier Overall Score
## 1 Tier 1 68
## 2 Tier 2 65
## 3 Tier 3 65
seg.mean <- aggregate(Overall.Score ~ Tier,data = hhm,mean)
barchart( Overall.Score ~ Tier,data = seg.mean,xlab=" 3 Tiers ",
main="Bar chart of mean Overall Score Tierwise",
col=c("powderblue","olivedrab","red4"))
options(digits = 0)
m=aggregate(hhm$Overall.Score,by=list(Brand=hhm$Brand),mean)
names(m)[2] <- "Overall Score"
m
## Brand Overall Score
## 1 Hexa 68
## 2 Tiago 66
## 3 Tigor 65
seg.mean <- aggregate(Overall.Score ~ Brand,data = hhm,mean)
barchart( Overall.Score ~ Brand,data = seg.mean,xlab="Brands",
main="Bar chart of mean Overall Score Brandwise",
col=c("maroon1","purple4","olivedrab"))
Generating Boxplots
boxplot(hhm$Overall.Score,
xlab="Overall Score in Percentage",col="yellow",
main="Box plot of overall Score",horizontal=TRUE)
boxplot(hhm$First.Impression,
xlab="First impression in Percentage",col="darkorange",
main="Box plot of first impression score",horizontal=TRUE)
boxplot(hhm$Showroom.Ambience,
xlab="Ambience rating in Percentage",col="pink",
main="Box plot of Showroom Ambience Score",horizontal=TRUE)
boxplot(hhm$Display.Vehicles,
xlab="Score of display vehicles in Percentage",col="seagreen",
main="Box plot of Display vehicle Score",horizontal=TRUE)
boxplot(hhm$Showroom.Facility,
xlab="Showroom Facility score in Percentage",col="gold",
main="Box plot of Showroom Facility Score",horizontal=TRUE)
boxplot(hhm$Meet...Greet,
xlab="Meet Greet etiquettes scores in Percentage",col="purple",
main="Box plot of Meet Greet etiquettes scores",horizontal=TRUE)
boxplot(hhm$CA.Grooming,
xlab="CA Grooming Score in Percentage",col="maroon",
main="Box plot of CA Grooming Score",horizontal=TRUE)
boxplot(hhm$CA.Selling.Skills,
xlab="CA selling skills Score in Percentage",col="powderblue",
main="Box plot of CA selling skills Score",horizontal=TRUE)
boxplot(hhm$Need.Analysis,
xlab="Need Analysis in Percentage",col="olivedrab",
main="Box plot of Need Analysis Score",horizontal=TRUE)
boxplot(hhm$Product.Knowledge,
xlab="Product knowledge Score in Percentage",col="salmon",
main="Box plot of Product knowledge Score",horizontal=TRUE)
boxplot(hhm$Test.drive,
xlab="Test Drive Score in Percentage",col="saddlebrown",
main="Box plot of Test Drive Score",horizontal=TRUE)
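The box plots above repeat the same call once per column, which makes it easy for a title and its data to drift apart. A minimal sketch of generating them in a loop instead, on made-up data. The helper name plot_score_boxes and the toy data frame are hypothetical, not part of the original analysis.

```r
# Toy data: hypothetical scores standing in for the real score columns.
set.seed(1)
toy <- data.frame(
  Overall.Score     = sample(28:89, 30, replace = TRUE),
  Showroom.Ambience = sample(c(60, 80, 100), 30, replace = TRUE),
  Test.drive        = sample(0:100, 30, replace = TRUE)
)

# One horizontal box plot per column; the title is derived from the
# column name, so title and data always match.
plot_score_boxes <- function(df, cols = names(df)) {
  for (col in cols) {
    label <- gsub(".", " ", col, fixed = TRUE)
    boxplot(df[[col]], horizontal = TRUE,
            xlab = paste(label, "in Percentage"),
            main = paste("Box plot of", label))
  }
  invisible(cols)
}

plotted <- plot_score_boxes(toy)
```

On the real data the call would be along the lines of plot_score_boxes(hhm, names(hhm)[9:19]), assuming the column order shown by str(hhm).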
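Let's compute a correlation matrix for the score variables. Because options(digits = 0) is still in effect, the coefficients below are printed rounded to whole numbers, so an off-diagonal 1 indicates a strong correlation, not a perfect one.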
dd2 <- subset(hhm,select=c(Overall.Score,First.Impression,Showroom.Ambience,
Display.Vehicles,Showroom.Facility,Meet...Greet,
CA.Grooming,CA.Selling.Skills,Need.Analysis,Product.Knowledge,
Test.drive))
corrs <- cor(dd2, use="pairwise.complete.obs")
corrs
## Overall.Score First.Impression Showroom.Ambience
## Overall.Score 1 0 0
## First.Impression 0 1 0
## Showroom.Ambience 0 0 1
## Display.Vehicles 0 0 0
## Showroom.Facility 0 0 0
## Meet...Greet 0 0 0
## CA.Grooming 0 0 0
## CA.Selling.Skills 1 0 0
## Need.Analysis 1 0 0
## Product.Knowledge 1 0 0
## Test.drive 1 0 0
## Display.Vehicles Showroom.Facility Meet...Greet
## Overall.Score 0 0 0
## First.Impression 0 0 0
## Showroom.Ambience 0 0 0
## Display.Vehicles 1 -0 0
## Showroom.Facility -0 1 -0
## Meet...Greet 0 -0 1
## CA.Grooming 0 0 0
## CA.Selling.Skills 0 -0 0
## Need.Analysis 0 -0 0
## Product.Knowledge 0 0 0
## Test.drive 0 0 0
## CA.Grooming CA.Selling.Skills Need.Analysis
## Overall.Score 0 1 1
## First.Impression 0 0 0
## Showroom.Ambience 0 0 0
## Display.Vehicles 0 0 0
## Showroom.Facility 0 -0 -0
## Meet...Greet 0 0 0
## CA.Grooming 1 0 0
## CA.Selling.Skills 0 1 1
## Need.Analysis 0 1 1
## Product.Knowledge 0 1 1
## Test.drive 0 0 0
## Product.Knowledge Test.drive
## Overall.Score 1 1
## First.Impression 0 0
## Showroom.Ambience 0 0
## Display.Vehicles 0 0
## Showroom.Facility 0 0
## Meet...Greet 0 0
## CA.Grooming 0 0
## CA.Selling.Skills 1 0
## Need.Analysis 1 0
## Product.Knowledge 1 0
## Test.drive 0 1
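A matrix of only 0s and 1s hides the difference between moderate and strong correlations. A minimal sketch of rounding explicitly with round() instead of relying on options(digits = ...), shown on made-up data (the data frame toy and its values are hypothetical):

```r
# Toy data: hypothetical scores, not the real dataset.
set.seed(42)
need     <- sample(0:100, 50, replace = TRUE)
selling  <- pmin(100, pmax(0, need + round(rnorm(50, sd = 15))))
facility <- sample(c(0, 20, 40, 80, 100), 50, replace = TRUE)
toy <- data.frame(Need.Analysis = need, CA.Selling.Skills = selling,
                  Showroom.Facility = facility)

# Round explicitly to two decimals so that moderate and strong
# correlations stay distinguishable in the printout.
corrs <- round(cor(toy, use = "pairwise.complete.obs"), 2)
corrs
```

If the global rounding was set earlier with options(digits = 0), it probably needs to be reset first, for example with options(digits = 7), for the two decimals to show up in the printout.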
Let's plot the correlation matrix of the component scores (columns 10 to 20) to get a visual representation of the correlations between the variables.
library(corrplot)
par(mfrow=c(1,1))
corrplot(corr=cor(hhm[,c(10:20)],use="complete.obs"),
method="ellipse")
Let's see which ratings appear most often.
library(modeest)
mlv(hhm$Overall.Score, method = "mfv")
## Mode (most frequent value): 66 69 70 71
## Bickel's modal skewness: -0
## Call: mlv.integer(x = hhm$Overall.Score, method = "mfv")
mlv(hhm$First.Impression, method = "mfv")
## Mode (most frequent value): 89
## Bickel's modal skewness: -0
## Call: mlv.integer(x = hhm$First.Impression, method = "mfv")
mlv(hhm$Showroom.Ambience, method = "mfv")
## Mode (most frequent value): 100
## Bickel's modal skewness: -0
## Call: mlv.integer(x = hhm$Showroom.Ambience, method = "mfv")
mlv(hhm$Display.Vehicles, method = "mfv")
## Mode (most frequent value): 100
## Bickel's modal skewness: -0
## Call: mlv.integer(x = hhm$Display.Vehicles, method = "mfv")
mlv(hhm$Showroom.Facility, method = "mfv")
## Mode (most frequent value): 0
## Bickel's modal skewness: 1
## Call: mlv.integer(x = hhm$Showroom.Facility, method = "mfv")
mlv(hhm$Meet...Greet, method = "mfv")
## Mode (most frequent value): 75 83
## Bickel's modal skewness: -0
## Call: mlv.integer(x = hhm$Meet...Greet, method = "mfv")
mlv(hhm$CA.Grooming, method = "mfv")
## Mode (most frequent value): 100
## Bickel's modal skewness: -0
## Call: mlv.integer(x = hhm$CA.Grooming, method = "mfv")
mlv(hhm$CA.Selling.Skills, method = "mfv")
## Mode (most frequent value): 71
## Bickel's modal skewness: 0
## Call: mlv.integer(x = hhm$CA.Selling.Skills, method = "mfv")
mlv(hhm$Need.Analysis, method = "mfv")
## Mode (most frequent value): 89
## Bickel's modal skewness: -1
## Call: mlv.integer(x = hhm$Need.Analysis, method = "mfv")
mlv(hhm$Product.Knowledge, method = "mfv")
## Mode (most frequent value): 100
## Bickel's modal skewness: -1
## Call: mlv.integer(x = hhm$Product.Knowledge, method = "mfv")
mlv(hhm$Test.drive, method = "mfv")
## Mode (most frequent value): 88
## Bickel's modal skewness: -0
## Call: mlv.integer(x = hhm$Test.drive, method = "mfv")
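The most frequent value can also be found with base R alone. A minimal sketch, where the helper most_frequent is a hypothetical name and not a function from modeest. It returns every value tied for the highest count, which matches the multi-valued modes reported above for Overall.Score.

```r
# Return every value tied for the highest frequency.
most_frequent <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

# Toy vector: 66, 69, 70 and 71 each appear twice, 50 appears once.
most_frequent(c(66, 69, 70, 71, 50, 66, 69, 70, 71))
```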
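Let's run Pearson's correlation test for selected pairs of variables. The output is rounded by options(digits = 0), so an estimate and confidence interval of 1 mean a strong positive correlation, not an exact one.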
cor.test(hhm$Need.Analysis,hhm$CA.Selling.Skills)
##
## Pearson's product-moment correlation
##
## data: hhm$Need.Analysis and hhm$CA.Selling.Skills
## t = 20, df = 200, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 1 1
## sample estimates:
## cor
## 1
cor.test(hhm$Product.Knowledge,hhm$CA.Selling.Skills)
##
## Pearson's product-moment correlation
##
## data: hhm$Product.Knowledge and hhm$CA.Selling.Skills
## t = 40, df = 200, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 1 1
## sample estimates:
## cor
## 1
cor.test(hhm$Need.Analysis,hhm$Product.Knowledge)
##
## Pearson's product-moment correlation
##
## data: hhm$Need.Analysis and hhm$Product.Knowledge
## t = 10, df = 200, p-value <2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 1 1
## sample estimates:
## cor
## 1
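The rounded printout above hides the actual estimates. A minimal sketch of pulling the numbers out of the cor.test() result and rounding them explicitly, on simulated data (the vectors need and selling are made up and only imitate two correlated scores):

```r
# Toy data: hypothetical scores, not the real dataset.
set.seed(7)
need    <- rnorm(100, mean = 70, sd = 20)
selling <- 0.8 * need + rnorm(100, sd = 10)

ct <- cor.test(need, selling)

# Pull the pieces out of the test object and round them explicitly.
round(unname(ct$estimate), 3)   # Pearson correlation coefficient
round(ct$conf.int, 3)           # 95 percent confidence interval
unname(ct$parameter)            # degrees of freedom, n - 2
```

For the 247 audits the degrees of freedom should be 247 - 2 = 245 if no values are missing, so the df = 200 shown above is presumably the same rounding at work.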
Now a scatterplot matrix for the variables.
# diagonal=list(method="density") is the car >= 3.0 syntax;
# older versions of car take diagonal="density" instead.
scatterplotMatrix(~Overall.Score + First.Impression +
Showroom.Ambience + Display.Vehicles + Showroom.Facility +
Meet...Greet + CA.Grooming + CA.Selling.Skills +
Need.Analysis + Product.Knowledge + Test.drive,cex=0.6,
data=hhm,diagonal=list(method="density"))
Running a regression model to check the factors affecting the overall score.
mode1 <- (Overall.Score ~ First.Impression +
Showroom.Ambience + Display.Vehicles + Showroom.Facility +
Meet...Greet + CA.Grooming + CA.Selling.Skills +
Need.Analysis + Product.Knowledge + Test.drive)
modulus11 <- lm(Overall.Score ~ First.Impression +
Showroom.Ambience + Display.Vehicles + Showroom.Facility +
Meet...Greet + CA.Grooming + CA.Selling.Skills +
Need.Analysis + Product.Knowledge + Test.drive,data=hhm )
summary(modulus11)
##
## Call:
## lm(formula = Overall.Score ~ First.Impression + Showroom.Ambience +
## Display.Vehicles + Showroom.Facility + Meet...Greet + CA.Grooming +
## CA.Selling.Skills + Need.Analysis + Product.Knowledge + Test.drive,
## data = hhm)
##
## Residuals:
## Min 1Q Median 3Q Max
## -4.08 -2.57 -1.77 2.38 7.63
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.14601 3.07340 0.05 0.9621
## First.Impression -0.00216 0.01484 -0.15 0.8845
## Showroom.Ambience 0.09367 0.01733 5.40 1.6e-07 ***
## Display.Vehicles 0.11822 0.02534 4.67 5.2e-06 ***
## Showroom.Facility 0.08586 0.00533 16.11 < 2e-16 ***
## Meet...Greet 0.07225 0.01288 5.61 5.6e-08 ***
## CA.Grooming 0.07761 0.02703 2.87 0.0045 **
## CA.Selling.Skills 0.03281 0.06188 0.53 0.5965
## Need.Analysis 0.13094 0.02147 6.10 4.3e-09 ***
## Product.Knowledge 0.13595 0.04470 3.04 0.0026 **
## Test.drive 0.09320 0.00948 9.83 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0 '**' 0 '*' 0 '.' 0 ' ' 1
##
## Residual standard error: 4 on 236 degrees of freedom
## Multiple R-squared: 0.896, Adjusted R-squared: 0.891
## F-statistic: 203 on 10 and 236 DF, p-value: <2e-16
library(leaps)
leap1 <- regsubsets(mode1, data = hhm, nbest=1)
summary(leap1)
## Subset selection object
## Call: regsubsets.formula(mode1, data = hhm, nbest = 1)
## 10 Variables (and intercept)
## Forced in Forced out
## First.Impression FALSE FALSE
## Showroom.Ambience FALSE FALSE
## Display.Vehicles FALSE FALSE
## Showroom.Facility FALSE FALSE
## Meet...Greet FALSE FALSE
## CA.Grooming FALSE FALSE
## CA.Selling.Skills FALSE FALSE
## Need.Analysis FALSE FALSE
## Product.Knowledge FALSE FALSE
## Test.drive FALSE FALSE
## 1 subsets of each size up to 8
## Selection Algorithm: exhaustive
## First.Impression Showroom.Ambience Display.Vehicles
## 1 ( 1 ) " " " " " "
## 2 ( 1 ) " " " " " "
## 3 ( 1 ) " " " " " "
## 4 ( 1 ) " " " " " "
## 5 ( 1 ) " " " " "*"
## 6 ( 1 ) " " "*" " "
## 7 ( 1 ) " " "*" "*"
## 8 ( 1 ) " " "*" "*"
## Showroom.Facility Meet...Greet CA.Grooming CA.Selling.Skills
## 1 ( 1 ) " " " " " " "*"
## 2 ( 1 ) "*" " " " " "*"
## 3 ( 1 ) "*" " " " " "*"
## 4 ( 1 ) "*" "*" " " "*"
## 5 ( 1 ) "*" "*" " " "*"
## 6 ( 1 ) "*" "*" " " " "
## 7 ( 1 ) "*" "*" " " " "
## 8 ( 1 ) "*" "*" "*" " "
## Need.Analysis Product.Knowledge Test.drive
## 1 ( 1 ) " " " " " "
## 2 ( 1 ) " " " " " "
## 3 ( 1 ) " " " " "*"
## 4 ( 1 ) " " " " "*"
## 5 ( 1 ) " " " " "*"
## 6 ( 1 ) "*" "*" "*"
## 7 ( 1 ) "*" "*" "*"
## 8 ( 1 ) "*" "*" "*"
plot(leap1, scale="adjr2")
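The adjr2 scale ranks the candidate subsets by adjusted R-squared, which penalises R-squared for every extra predictor. A minimal sketch of that formula, where adj_r2 is a hypothetical helper name and the inputs are taken from the regression summary above:

```r
# Adjusted R-squared: penalises R-squared for the number of predictors p.
adj_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# Values taken from the regression summary: R-squared 0.896,
# 247 observations, 10 predictors.
adj_r2(r2 = 0.896, n = 247, p = 10)
```

The result comes out at about 0.89, in line with the reported adjusted R-squared. A small difference in the third decimal is expected because the R-squared fed in here is itself already rounded.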
library(coefplot)
coefplot(modulus11, intercept= FALSE, outerCI=1.96,coefficients=c("Showroom.Ambience","Display.Vehicles", "Showroom.Facility","Meet...Greet","Product.Knowledge",
"Need.Analysis"))
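The R-squared (0.896) and adjusted R-squared (0.891) show that the model explains roughly 89% of the variance in Overall.Score, which is a good fit. First.Impression (p = 0.88) and CA.Selling.Skills (p = 0.60) are statistically insignificant at the 5% level. Since CA.Selling.Skills is strongly correlated with Need.Analysis and Product.Knowledge, its insignificance may reflect multicollinearity rather than a lack of influence.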
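One way to check whether correlated predictors are inflating the standard errors is the variance inflation factor (VIF). A minimal sketch computing it by hand on simulated data. The helper vif_by_hand and the variables x1, x2, x3 are hypothetical and only illustrate the idea; this is not a result for the store data.

```r
# Toy data: two deliberately correlated predictors and one independent one.
set.seed(123)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)   # strongly correlated with x1
x3 <- rnorm(n)                  # unrelated to x1 and x2
toy <- data.frame(x1, x2, x3)

# VIF_j = 1 / (1 - R^2_j), where R^2_j comes from regressing
# predictor j on all the other predictors.
vif_by_hand <- function(df) {
  sapply(names(df), function(v) {
    fit <- lm(reformulate(setdiff(names(df), v), response = v), data = df)
    1 / (1 - summary(fit)$r.squared)
  })
}

round(vif_by_hand(toy), 2)
```

A common rule of thumb treats values above roughly 5 to 10 as a sign of problematic collinearity. For a fitted model such as modulus11, vif() from the car package should give the same quantity directly.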