GROUP 4 MEMBERS
1. Vinod G Moovankara
2. Pahar Singh
3. Nitin Sharma
4. Ambuj Singh
5. Shilpee Srivastava Saxena
6. Kaushal Falwaria
Q. Mitra decides to form homogeneous subgroups of players, which would better help him to express the nuances of T20 cricket. How would you go about implementing this? Apply relevant data analysis technique and generate useful insights.
The Indian Premier League Gauging Player Performance excel.XLS
library(readxl)
#setwd()
ipl <- read_excel("The Indian Premier League Gauging Player Performance excel.xlsx",
sheet = "Sheet1")
## New names:
## • `` -> `...1`
## • `` -> `...2`
View(ipl)
ipl<- (ipl[,-c(2)])
ipl1<- ipl
We generated a Batting Index (BI) by multiplying the batting average (Avg) by the strike rate (SR). T20 needs batsmen who perform consistently (indicated by their average) and who score quickly (indicated by a high strike rate). Because SR is quoted as runs per 100 balls, we divide it by 100 to get runs per ball before multiplying.
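#ipl1$SR <- ipl1$SR/100
# Batting Index: average multiplied by strike rate expressed as runs per ball
ipl1$BI <- ipl1$Avg*ipl1$SR/100
# Drop Avg and SR (columns 5 and 6) now that BI combines them
ipl1<- (ipl1[,-c(5:6)])
# Standardise columns 4 to 9 (Runs, Hundreds, Fifties, Fours, Sixes, Salary); BI is column 10 and is not included
ipls <- scale (ipl1[,c(4:9)])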
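The Batting Index calculation can be checked by hand on a few rows. The sketch below is only an illustration: the data frame name `sample_players` is hypothetical, and the Avg and SR values are copied from the player table printed later in this report.

```r
# Hypothetical three-row sample; Avg and SR are taken from the printed ipl table
sample_players <- data.frame(
  Player = c("AB de Villiers", "Andre Russell", "Bhuvneshwar Kumar"),
  Avg    = c(44.20, 56.66, 4.00),
  SR     = c(154.00, 204.81, 63.15)
)

# BI = average * (strike rate / 100), i.e. average times runs per ball
sample_players$BI <- sample_players$Avg * sample_players$SR / 100

# AB de Villiers: 44.20 * 1.54 = 68.068
print(sample_players)
```

A batsman needs both a good average and a good strike rate to get a high BI, which is why a low-average tail-ender ends up with a BI far below that of a top-order hitter.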
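The silhouette method indicates the best number of clusters k. A point's silhouette value is positive when b, its average distance to the points of the nearest other cluster, is greater than a, its average distance to the points of its own cluster. The more b exceeds a, the better the separation, so the best k is the one with the highest average silhouette width.
We also produced a standard within-cluster sum of squares (WSS) plot.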
#install.packages("factoextra")
library(factoextra)
## Loading required package: ggplot2
## Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
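# Elbow plot of WSS against k (first five scaled columns only; Salary is left out here)
fviz_nbclust(ipls[,1:5], kmeans, method = "wss")
# Average silhouette width against k, on the same five columns
fviz_nbclust(ipls[,1:5], kmeans, method = "silhouette")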
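To make the a and b in the silhouette idea concrete, here is a minimal sketch on toy numbers. The vector `x` and the assignment `cl` are made up for illustration and are not IPL data. The manual value uses the standard definition s = (b - a) / max(a, b) and is compared with `silhouette()` from the cluster package.

```r
library(cluster)  # provides silhouette()

# Toy one-dimensional data and an assumed two-cluster assignment
x  <- c(1, 2, 3, 10, 11)
cl <- c(1L, 1L, 1L, 2L, 2L)

# Manual silhouette for the first point
own   <- cl == 1L & seq_along(x) != 1     # other members of its own cluster
a     <- mean(abs(x[1] - x[own]))         # mean distance within own cluster
b     <- mean(abs(x[1] - x[cl == 2L]))    # mean distance to the other cluster
s_manual <- (b - a) / max(a, b)

# Same quantity from the cluster package
sil   <- silhouette(cl, dist(x))
s_pkg <- sil[1, "sil_width"]

print(c(manual = s_manual, package = as.numeric(s_pkg)))
```

Averaging the `sil_width` column over all points gives the average silhouette width that the silhouette plot reports for each k.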
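# Fix the random seed so the k-means starting centres are reproducible
set.seed(1234)
# k-means with k = 3 on all six scaled variables
cluster_1<-kmeans(ipls,3)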
cluster_1
## K-means clustering with 3 clusters of sizes 6, 38, 26
##
## Cluster means:
## Runs Hundreds Fifties Fours Sixes Salary
## 1 1.4157758 3.2425739 0.9693617 1.3713348 0.6253612 0.7560000
## 2 -0.7970102 -0.3039913 -0.6898410 -0.7720593 -0.6145994 -0.2806928
## 3 0.8381436 -0.3039913 0.7845303 0.8119325 0.7539466 0.2357817
##
## Clustering vector:
## [1] 3 1 2 2 3 2 2 2 3 3 2 2 1 2 2 3 2 3 3 2 2 1 3 2 2 2 3 1 2 2 3 2 3 2 2 3 2 3
## [39] 3 2 3 3 2 2 2 2 3 2 3 3 2 1 2 3 2 3 2 2 3 3 3 2 2 3 3 2 2 1 2 2
##
## Within cluster sum of squares by cluster:
## [1] 31.16488 43.71399 85.54479
## (between_SS / total_SS = 61.3 %)
##
## Available components:
##
## [1] "cluster" "centers" "totss" "withinss" "tot.withinss"
## [6] "betweenss" "size" "iter" "ifault"
cluster_1$totss
## [1] 414
cluster_1$betweenss
## [1] 253.5763
cluster_1$tot.withinss
## [1] 160.4237
cluster_1$betweenss/cluster_1$totss*100
## [1] 61.25032
cluster_1$tot.withinss/cluster_1$totss*100
## [1] 38.74968
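Total sum of squares = 414. Between-cluster sum of squares = 253.5763 (61.25% of the total: the heterogeneity between clusters). Within-cluster sum of squares = 160.4237 (38.75% of the total: the heterogeneity remaining within clusters).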
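The total of 414 follows from the scaling. Each standardised column has variance 1, so its sum of squares is n - 1 = 69, and six columns give 69 × 6 = 414. The sketch below checks this and the split into between and within parts on simulated data. The matrix `toy` is random and only matches the shape (70 rows, 6 columns) of the scaled IPL matrix, so its percentages will differ from those reported here.

```r
set.seed(42)

# Simulated stand-in data: same shape as the scaled IPL matrix, not the real values
toy <- scale(matrix(rnorm(70 * 6), nrow = 70, ncol = 6))
km  <- kmeans(toy, centers = 3, nstart = 10)

total_ss   <- km$totss          # (70 - 1) * 6 for standardised columns
between_ss <- km$betweenss      # variation between cluster centres
within_ss  <- km$tot.withinss   # variation left inside the clusters

pct_between <- 100 * between_ss / total_ss
pct_within  <- 100 * within_ss / total_ss

print(c(total = total_ss, between_pct = pct_between, within_pct = pct_within))
```

The two percentages always add up to 100, so a higher between-cluster share means tighter, better separated groups.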
#Step 1 - check its class
class(ipl)
## [1] "tbl_df" "tbl" "data.frame"
#Step 2 - make it a data frame
ipl <- data.frame(ipl)
#Step 3 - add cluster column
ipl$cluster_1 <- cluster_1$cluster
ipl$cluster_1<-replace(ipl$cluster_1, ipl$cluster_1 ==1, "Extremely Valuable")
ipl$cluster_1<-replace(ipl$cluster_1, ipl$cluster_1 ==3, "Valuable")
ipl$cluster_1<-replace(ipl$cluster_1, ipl$cluster_1 ==2, "Under performer")
ipl
## ...1 Player Team Runs Avg SR
## 1 1 AB de Villiers Royal Challengers Bangalore 442 44.20 154.00
## 2 2 Ajinkya Rahane Rajasthan Royals 393 32.75 137.89
## 3 3 Akshdeep Nath Royal Challengers Bangalore 61 12.20 107.01
## 4 4 Ambati Rayudu Chennai Super Kings 282 23.50 93.06
## 5 5 Andre Russell Kolkata Knight Riders 510 56.66 204.81
## 6 6 Axar Patel Royal Challengers Bangalore 110 18.33 125.00
## 7 7 Ben Stokes Rajasthan Royals 123 20.50 124.24
## 8 8 Bhuvneshwar Kumar Sunrisers Hyderabad 12 4.00 63.15
## 9 9 Chris Gayle Kings XI Punjab 490 40.83 153.60
## 10 10 Chris Lynn Kolkata Knight Riders 405 31.15 139.65
## 11 11 Chris Morris Delhi Capitals 32 5.33 86.48
## 12 12 Colin Ingram Delhi Capitals 184 18.40 119.48
## 13 13 David Warner Sunrisers Hyderabad 692 69.20 143.86
## 14 14 David Miller Kings XI Punjab 213 26.62 129.87
## 15 15 Deepak Hooda Sunrisers Hyderabad 64 10.66 101.58
## 16 16 Dinesh Karthik Kolkata Knight Riders 253 31.62 146.24
## 17 17 Dwayne Bravo Chennai Super Kings 80 16.00 121.21
## 18 18 Faf du Plessis Chennai Super Kings 396 36.00 123.36
## 19 19 Hardik Pandya Mumbai Indians 402 44.66 191.42
## 20 20 Ishan Kishan Mumbai Indians 101 16.83 101.00
## 21 21 Jofra Archer Rajasthan Royals 67 33.50 167.50
## 22 22 Jonny Bairstow Sunrisers Hyderabad 445 55.62 157.24
## 23 23 Jos Buttler Rajasthan Royals 311 38.87 151.70
## 24 24 Kane Williamson Sunrisers Hyderabad 156 22.28 120.00
## 25 25 Kedar Jadhav Chennai Super Kings 162 18.00 95.85
## 26 26 Keemo Paul Rajasthan Royals 18 3.60 75.00
## 27 27 Kieron Pollard Mumbai Indians 279 34.87 156.74
## 28 28 KL Rahul Kings XI Punjab 593 53.90 135.38
## 29 29 Krunal Pandya Mumbai Indians 183 16.63 122.00
## 30 30 Mandeep Singh Kings XI Punjab 165 41.25 137.50
## 31 31 Manish Pandey Sunrisers Hyderabad 344 43.00 130.79
## 32 32 Marcus Stoinis Royal Challengers Bangalore 211 52.75 135.25
## 33 33 Mayank Agarwal Kings XI Punjab 332 25.53 141.88
## 34 34 Moeen Ali Royal Challengers Bangalore 220 27.50 165.41
## 35 35 Mohammad Nabi Sunrisers Hyderabad 115 19.16 151.31
## 36 36 MS Dhoni Chennai Super Kings 416 83.20 134.62
## 37 37 Nicholas Pooran Kings XI Punjab 168 28.00 157.00
## 38 38 Nitish Rana Kolkata Knight Riders 344 34.40 146.38
## 39 39 Parthiv Patel Royal Challengers Bangalore 373 26.64 139.17
## 40 40 Piyush Chawla Kolkata Knight Riders 42 14.00 113.51
## 41 41 Prithvi Shaw Delhi Capitals 353 22.06 133.71
## 42 42 Quinton de Kock Mumbai Indians 529 35.26 132.91
## 43 43 Rahul Tripathi Rajasthan Royals 141 23.50 119.49
## 44 44 Rashid Khan Sunrisers Hyderabad 34 6.80 147.82
## 45 45 Ravichandran Ashwin Kings XI Punjab 42 8.40 150.00
## 46 46 Ravindra Jadeja Chennai Super Kings 106 35.33 120.45
## 47 47 Rishabh Pant Delhi Capitals 488 37.53 162.66
## 48 48 Riyan Parag Rajasthan Royals 160 32.00 126.98
## 49 49 Robin Uthappa Kolkata Knight Riders 282 31.33 115.10
## 50 50 Rohit Sharma Mumbai Indians 405 28.92 128.57
## 51 51 Sam Curran Kings XI Punjab 95 23.75 172.72
## 52 52 Sanju Samson Rajasthan Royals 342 34.20 148.69
## 53 55 Sarfaraz Khan Kings XI Punjab 180 45.00 125.87
## 54 56 Shane Watson Chennai Super Kings 398 23.41 127.56
## 55 57 Sherfane Rutherford Delhi Capitals 73 14.60 135.18
## 56 60 Shikhar Dhawan Delhi Capitals 521 34.73 135.67
## 57 61 Shimron Hetmyer Royal Challengers Bangalore 90 18.00 123.28
## 58 63 Shreyas Gopal Rajasthan Royals 63 15.75 136.95
## 59 64 Shreyas Iyer Delhi Capitals 463 30.86 119.94
## 60 65 Shubman Gill Kolkata Knight Riders 296 32.88 124.36
## 61 67 Steve Smith Rajasthan Royals 319 39.87 116.00
## 62 68 Stuart Binny Rajasthan Royals 70 23.33 175.00
## 63 71 Sunil Narine Kolkata Knight Riders 143 17.87 166.27
## 64 72 Suresh Raina Chennai Super Kings 383 23.93 121.97
## 65 74 Suryakumar Yadav Mumbai Indians 424 32.61 130.86
## 66 76 Umesh Yadav Kolkata Knight Riders 25 12.50 100.00
## 67 77 Vijay Shankar Sunrisers Hyderabad 244 20.33 126.42
## 68 80 Virat Kohli Royal Challengers Bangalore 464 33.14 141.46
## 69 84 Wriddhiman Saha Sunrisers Hyderabad 86 17.20 162.26
## 70 92 Yusuf Pathan Sunrisers Hyderabad 40 13.33 88.88
## Hundreds Fifties Fours Sixes Salary cluster_1
## 1 0 5 31 26 1.71875 Valuable
## 2 1 1 45 9 0.62500 Extremely Valuable
## 3 0 0 5 2 0.51430 Under performer
## 4 0 1 20 7 0.34375 Under performer
## 5 0 4 31 52 1.32813 Valuable
## 6 0 0 10 3 0.71430 Under performer
## 7 0 0 8 4 1.95313 Under performer
## 8 0 0 1 0 1.32813 Under performer
## 9 0 4 45 34 0.31250 Valuable
## 10 0 4 41 22 1.50000 Valuable
## 11 0 0 1 2 1.71875 Under performer
## 12 0 0 20 5 0.91430 Under performer
## 13 1 8 57 21 1.95313 Extremely Valuable
## 14 0 1 19 7 0.46875 Under performer
## 15 0 0 5 1 0.56250 Under performer
## 16 0 2 22 14 1.15625 Valuable
## 17 0 0 6 3 1.00000 Under performer
## 18 0 3 36 15 0.25000 Valuable
## 19 0 1 28 29 1.71875 Valuable
## 20 0 0 8 4 0.96875 Under performer
## 21 0 0 4 4 1.12500 Under performer
## 22 1 2 48 18 0.31430 Extremely Valuable
## 23 0 3 38 14 0.68750 Valuable
## 24 0 1 12 5 0.46875 Under performer
## 25 0 1 19 3 1.21875 Under performer
## 26 0 0 1 1 0.07140 Under performer
## 27 0 1 14 22 0.84375 Valuable
## 28 1 6 49 25 1.71875 Extremely Valuable
## 29 0 0 18 5 1.37500 Under performer
## 30 0 0 10 4 0.21875 Under performer
## 31 0 3 34 6 1.71875 Valuable
## 32 0 0 14 10 0.96875 Under performer
## 33 0 2 26 14 0.15625 Valuable
## 34 0 2 16 17 0.26563 Under performer
## 35 0 0 8 7 0.15625 Under performer
## 36 0 3 22 23 2.34375 Valuable
## 37 0 0 10 14 0.60000 Under performer
## 38 0 3 27 21 0.53125 Valuable
## 39 0 2 48 10 0.26563 Valuable
## 40 0 0 4 2 0.65625 Under performer
## 41 0 2 45 9 0.18750 Valuable
## 42 0 4 45 25 0.43750 Valuable
## 43 0 1 13 2 0.53125 Under performer
## 44 0 0 2 2 1.40625 Under performer
## 45 0 0 3 3 1.18750 Under performer
## 46 0 0 7 4 1.09375 Under performer
## 47 0 3 37 27 2.34375 Valuable
## 48 0 1 17 5 0.02860 Under performer
## 49 0 1 28 10 1.00000 Valuable
## 50 0 2 52 10 2.34375 Valuable
## 51 0 1 13 3 1.02860 Under performer
## 52 1 0 28 13 1.25000 Extremely Valuable
## 53 0 1 19 4 0.03570 Under performer
## 54 0 3 42 20 0.62500 Valuable
## 55 0 0 2 7 0.28570 Under performer
## 56 0 5 64 11 0.81250 Valuable
## 57 0 1 4 7 0.60000 Under performer
## 58 0 0 8 1 0.03125 Under performer
## 59 0 3 41 14 1.09375 Valuable
## 60 0 3 21 10 0.28125 Valuable
## 61 0 3 30 4 1.95313 Valuable
## 62 0 0 5 4 0.07813 Under performer
## 63 0 0 17 9 1.95313 Under performer
## 64 0 3 45 9 1.71875 Valuable
## 65 0 2 45 10 0.50000 Valuable
## 66 0 0 3 1 0.65625 Under performer
## 67 0 0 11 12 0.50000 Under performer
## 68 1 2 46 13 2.65625 Extremely Valuable
## 69 0 0 13 1 0.17140 Under performer
## 70 0 0 1 1 0.29688 Under performer
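The cluster labels above were assigned with three successive `replace()` calls. One alternative, shown here only as a sketch, is to map the numeric cluster IDs to labels in one step with `factor()`. The vector `cluster_ids` holds the first five entries of the clustering vector reported by k-means. The names `cluster_ids`, `labels_map` and `cluster_label` are hypothetical.

```r
# First five entries of the k-means clustering vector shown in the output
cluster_ids <- c(3, 1, 2, 2, 3)

# Labels in the order of cluster numbers 1, 2, 3
labels_map <- c("Extremely Valuable", "Under performer", "Valuable")

# Map each numeric ID to its label in a single step
cluster_label <- factor(cluster_ids, levels = 1:3, labels = labels_map)

print(as.character(cluster_label))
```

This keeps the number-to-label mapping in one place and avoids comparing a column that has already become character against numbers.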
#install.packages("cluster")
library("cluster")
# Plot the clusters on the scaled numeric variables that were actually clustered
clusplot(ipls, cluster_1$cluster,
color = TRUE, shade = TRUE,
labels = 2,lines = 0)
In Cluster 1 (Extremely Valuable), the batsmen have:
- scored the maximum runs
- scored centuries
- hit the maximum 4s
- hit a reasonable number of 6s
- a high BI (Batting Index)
In Cluster 3 (Valuable), the batsmen have:
- scored high runs
- scored a few 50s
- hit the maximum 6s
- hit a reasonable number of 4s
- the second highest BI (Batting Index)
In Cluster 2 (Under performer), the batsmen are, on average, below the overall mean for runs, 50s, 4s, 6s and salary, and none has scored a century.
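Cluster profiles like these can be checked by averaging the original, unscaled statistics within each group. The sketch below uses six rows copied from the player table above (two per group) in a hypothetical data frame `sample_rows`. Running the same `aggregate()` call on the full `ipl` data frame would give the actual profiles.

```r
# Six rows copied from the printed table, two per cluster label
sample_rows <- data.frame(
  Player  = c("David Warner", "KL Rahul", "Andre Russell",
              "Chris Gayle", "Akshdeep Nath", "Ambati Rayudu"),
  Runs    = c(692, 593, 510, 490, 61, 282),
  Sixes   = c(21, 25, 52, 34, 2, 7),
  cluster = c("Extremely Valuable", "Extremely Valuable", "Valuable",
              "Valuable", "Under performer", "Under performer")
)

# Mean runs and sixes per cluster label, in original units
summary_by_cluster <- aggregate(cbind(Runs, Sixes) ~ cluster,
                                data = sample_rows, FUN = mean)

print(summary_by_cluster)
```

Working in original units makes the comparison easier to explain to a non-technical reader than the standardised cluster means.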