Introduction

This should be the last in the series of posts based on my R package cricketr. That is, unless some bright idea comes trotting along and light bulbs go on around my head.

In this post cricketr adapts to the Twenty20 International format. cricketr can now handle stats from ESPN Cricinfo for all 3 formats of the game, namely Test matches, ODIs and Twenty20 Internationals. You should be able to install the package from GitHub and use many of the functions available in the package.

Please be mindful of the ESPN Cricinfo Terms of Use

You can also read this post at Rpubs as twenty20-cricketr. Download this report as a PDF file from twenty20-cricketr.pdf

I have chosen the Top 4 batsmen and top 4 bowlers based on ICC rankings and/or number of matches played.

Batsmen

  1. Virat Kohli (Ind)
  2. Faf du Plessis (SA)
  3. A J Finch (Aus)
  4. Brendon McCullum (NZ)

Bowlers

  1. Samuel Badree (WI)
  2. Sunil Narine (WI)
  3. Ravichandran Ashwin (Ind)
  4. Ajantha Mendis (SL)

I have explained the plots and added my own observations. Please feel free to draw your conclusions!

library(devtools)
install_github("tvganesh/cricketr")
library(cricketr)

The data for a particular player can be obtained with the getPlayerDataTT() function. To do this, go to ESPN CricInfo Player and type in the name of the player, e.g. Virat Kohli, Sunil Narine etc. This will bring up a page which has the profile number for the player. For Virat Kohli this would be http://www.espncricinfo.com/india/content/player/253802.html. Hence, Kohli's profile number is 253802. This can be used to get the data for Virat Kohli as shown below

kohli <- getPlayerDataTT(253802,dir="..",file="kohli.csv",type="batting")
## http://stats.espncricinfo.com/ci/engine/player/253802.html?class=3;home_or_away=1;home_or_away=2;home_or_away=3;result=1;result=2;result=3;result=5;template=results;type=batting;view=innings
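If you have the player URL at hand, the profile number can also be pulled out of it programmatically. The snippet below is a minimal sketch; profileFromUrl is a hypothetical helper written for this post and is not part of cricketr.

```r
# Hypothetical helper (not part of cricketr): extract the numeric
# profile id from an ESPN Cricinfo player URL.
profileFromUrl <- function(url) {
  # Capture the digits between "/player/" and ".html"
  as.integer(sub("^.*/player/([0-9]+)\\.html.*$", "\\1", url))
}

kohliUrl <- "http://www.espncricinfo.com/india/content/player/253802.html"
kohliProfile <- profileFromUrl(kohliUrl)
print(kohliProfile)
```

The resulting number is what goes in as the first argument of getPlayerDataTT().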

The analysis is included below

Analyses of Batsmen

The following plots give the analysis of the 4 Twenty20 batsmen

  1. Virat Kohli (Ind) - Innings-26, Runs-972, Average-46.28, Strike Rate-131.70
  2. Faf du Plessis (SA) - Innings-24, Runs-805, Average-42.36, Strike Rate-135.75
  3. A J Finch (Aus) - Innings-22, Runs-756, Average-39.78, Strike Rate-152.41
  4. Brendon McCullum (NZ) - Innings-70, Runs-2140, Average-35.66, Strike Rate-136.21

Plot of 4s, 6s and the scoring rate in Twenty20s

The 3 charts below plot

  1. 4s vs Runs scored
  2. 6s vs Runs scored
  3. Balls faced vs Runs scored

A regression line is fitted in each of these plots for each of the Twenty20 batsmen.
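To make the idea of a fitted regression line concrete, here is a minimal sketch using base R. The numbers are made up for illustration and are not any player's real innings; the package functions may fit their lines differently.

```r
# Made-up innings for illustration only (not real player data)
bf   <- c(5, 12, 20, 31, 44, 50)   # balls faced
runs <- c(4, 15, 27, 40, 62, 70)   # runs scored

# Fit a straight line: runs as a function of balls faced
fit <- lm(runs ~ bf)
print(coef(fit))

# Scatter plot with the fitted regression line overlaid
plot(bf, runs, xlab = "Balls faced", ylab = "Runs scored")
abline(fit, col = "blue")

# Runs the line predicts for a 50-ball innings
runsAt50 <- predict(fit, newdata = data.frame(bf = 50))
print(runsAt50)
```

The slope of the line is the runs added per extra ball faced, so a steeper line means a faster scorer.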

A. Virat Kohli

  - The 1st plot shows that Kohli hits about 5 4's on his way to a 50.
  - The 2nd plot is a box plot of the number of 6s against runs. It shows the range of runs when Kohli hit 1, 2 or 4 6s. The dark line in each box marks the median runs for that number of 6s. So when he hit one 6, his median score was about 45.
  - The 3rd plot shows the runs scored against the balls faced. It can be seen that when Kohli faced 50 balls he had scored around 70 runs.

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./kohli.csv","Kohli")
batsman6s("./kohli.csv","Kohli")
batsmanScoringRateODTT("./kohli.csv","Kohli")

dev.off()
## null device 
##           1

B. Faf du Plessis

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./plessis.csv","Du Plessis")
batsman6s("./plessis.csv","Du Plessis")
batsmanScoringRateODTT("./plessis.csv","Du Plessis")

dev.off()
## null device 
##           1

C. A J Finch

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./finch.csv","A J Finch")
batsman6s("./finch.csv","A J Finch")
batsmanScoringRateODTT("./finch.csv","A J Finch")

dev.off()
## null device 
##           1

D. Brendon McCullum

par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./mccullum.csv","McCullum")
batsman6s("./mccullum.csv","McCullum")
batsmanScoringRateODTT("./mccullum.csv","McCullum")

dev.off()
## null device 
##           1

Relative Mean Strike Rate

This plot shows the Mean Strike Rate of each batsman in each run range. It can be seen that A J Finch has the best strike rate, followed by B McCullum.

par(mar=c(4,4,2,2))
frames <- list("./kohli.csv","./plessis.csv","finch.csv","mccullum.csv")
names <- list("Kohli","Du Plessis","Finch","McCullum")
relativeBatsmanSRODTT(frames,names)
## Warning in simpleLoess(y, x, w, span, degree, parametric, drop.square,
## normalize, : Chernobyl! trL>n 6
## Warning in simpleLoess(y, x, w, span, degree, parametric, drop.square,
## normalize, : Chernobyl! trL>n 6
## Warning in sqrt(sum.squares/one.delta): NaNs produced
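For readers who want to see what a mean strike rate per run range looks like in code, here is a rough sketch in base R. The data is made up, the 10-run bins are an assumption, and the package may bin and smooth the values differently.

```r
# Made-up innings for illustration only (not real player data)
runs <- c(3, 8, 12, 17, 24, 29, 41, 55)
bf   <- c(5, 7, 10, 12, 18, 20, 27, 35)

# Strike rate of each innings = runs per 100 balls
sr <- 100 * runs / bf

# Group innings into run ranges [0,10), [10,20), ... (assumed bin width)
runRange <- cut(runs, breaks = seq(0, 60, by = 10), right = FALSE)

# Mean strike rate within each run range (NA where a range has no innings)
meanSR <- tapply(sr, runRange, mean)
print(round(meanSR, 1))
```

Ranges with very few innings give noisy means, which is probably also why loess complains in the warnings above.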

Relative Runs Frequency Percentage

The plot below provides the average runs scored in each run range 0-5, 5-10, 10-15 etc. Clearly Kohli has the most runs scored in most of the run ranges. This is also evident in the fact that Kohli has the highest average. He is followed by McCullum.

frames <- list("./kohli.csv","./plessis.csv","finch.csv","mccullum.csv")
names <- list("Kohli","Du Plessis","Finch","McCullum")
relativeRunsFreqPerfODTT(frames,names)
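As a rough illustration of a runs frequency percentage, the sketch below counts what share of innings falls in each 5-run range. The data is made up and this is only my reading of the idea, not the package's actual code.

```r
# Made-up scores for illustration only (not real player data)
runs <- c(0, 4, 7, 12, 13, 22, 26, 29, 38, 54)

# Bin the scores into 5-run ranges [0,5), [5,10), ...
runRange <- cut(runs, breaks = seq(0, 60, by = 5), right = FALSE)

# Percentage of innings that ended in each range
freqPct <- 100 * table(runRange) / length(runs)
print(freqPct)
```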

Percent 4’s,6’s in total runs scored

The plot below shows the percentage of runs scored by way of 4s and 6s for each batsman. Kohli has the highest percentage of 4s and McCullum the highest percentage of 6s. Finch has the highest combined percentage of 4s and 6s: 25.37 + 15.64 = 41.01%.

frames <- list("./kohli.csv","./plessis.csv","finch.csv","mccullum.csv")
names <- list("Kohli","Du Plessis","Finch","McCullum")
runs4s6s <-batsman4s6s(frames,names)

print(runs4s6s)
##                Kohli Du Plessis Finch McCullum
## Runs(1s,2s,3s) 64.29      64.55 58.99    61.45
## 4s             27.78      24.38 25.37    22.87
## 6s              7.94      11.07 15.64    15.69
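The observations on this table can be checked with a few lines of base R. The matrix below simply re-types the printed output, so it runs without the CSV files; boundaryPct is a name I made up for the combined share.

```r
# Percentages re-typed from the printed table (filled column by column)
runs4s6s <- matrix(
  c(64.29, 27.78,  7.94,
    64.55, 24.38, 11.07,
    58.99, 25.37, 15.64,
    61.45, 22.87, 15.69),
  nrow = 3,
  dimnames = list(c("Runs(1s,2s,3s)", "4s", "6s"),
                  c("Kohli", "Du Plessis", "Finch", "McCullum"))
)

# Combined share of runs that came from boundaries (4s + 6s)
boundaryPct <- round(colSums(runs4s6s[c("4s", "6s"), ]), 2)
print(boundaryPct)

# Who leads each category
print(names(which.max(runs4s6s["4s", ])))  # highest share from 4s
print(names(which.max(runs4s6s["6s", ])))  # highest share from 6s
print(names(which.max(boundaryPct)))       # highest combined share
```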

3D plot of Runs vs Balls Faced and Minutes at Crease

The plot is a scatter plot of Runs vs Balls faced and Minutes at Crease. A prediction plane is then fitted based on the Balls Faced and Minutes at Crease to give the runs scored

par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
battingPerf3d("./kohli.csv","Kohli")
battingPerf3d("./plessis.csv","Du Plessis")

dev.off()
## null device 
##           1
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
battingPerf3d("./finch.csv","A J Finch")
battingPerf3d("./mccullum.csv","McCullum")

dev.off()
## null device 
##           1
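The prediction plane can be thought of as a linear model with two predictors. The sketch below uses made-up innings and plain lm(); I am assuming a model of this general form, and the package internals may differ.

```r
# Made-up innings for illustration only (not real player data)
innings <- data.frame(
  BF   = c(6, 10, 15, 22, 30, 38, 45, 52),   # balls faced
  Mins = c(9, 12, 22, 27, 41, 45, 60, 63),   # minutes at crease
  Runs = c(5, 12, 18, 30, 38, 52, 60, 75)    # runs scored
)

# Fit the plane Runs = a + b * BF + c * Mins
plane <- lm(Runs ~ BF + Mins, data = innings)
print(coef(plane))

# Runs the plane predicts for a hypothetical 30-ball, 40-minute innings
predicted <- predict(plane, newdata = data.frame(BF = 30, Mins = 40))
print(predicted)
```

Balls faced and minutes at crease tend to move together, so the two slopes should not be read too literally on their own.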

Predicting Runs given Balls Faced and Minutes at Crease

Hypothetical values of Balls Faced and Minutes at Crease are used to predict the runs scored by each batsman, based on the computed prediction plane

BF <- seq( 5, 70,length=10)
Mins <- seq(5,70,length=10)
newDF <- data.frame(BF,Mins)

kohli <- batsmanRunsPredict("./kohli.csv","Kohli",newdataframe=newDF)
plessis <- batsmanRunsPredict("./plessis.csv","Du Plessis",newdataframe=newDF)
finch <- batsmanRunsPredict("./finch.csv","A J Finch",newdataframe=newDF)
mccullum <- batsmanRunsPredict("./mccullum.csv","McCullum",newdataframe=newDF)

The predicted runs are displayed below. As can be seen, Finch has the best overall strike rate, followed by McCullum.

batsmen <-cbind(round(kohli$Runs),round(plessis$Runs),round(finch$Runs),round(mccullum$Runs))
colnames(batsmen) <- c("Kohli","Du Plessis","Finch","McCullum")
newDF <- data.frame(round(newDF$BF),round(newDF$Mins))
colnames(newDF) <- c("BallsFaced","MinsAtCrease")
predictedRuns <- cbind(newDF,batsmen)
predictedRuns
##    BallsFaced MinsAtCrease Kohli Du Plessis Finch McCullum
## 1           5            5     2          1     5        3
## 2          12           12    12         10    22       16
## 3          19           19    22         19    40       28
## 4          27           27    31         28    57       41
## 5          34           34    41         37    74       54
## 6          41           41    51         47    91       66
## 7          48           48    60         56   108       79
## 8          56           56    70         65   125       91
## 9          63           63    79         74   142      104
## 10         70           70    89         84   159      117
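The strike rate implied by the predictions can be read off the table directly. The snippet below re-types the table so that it runs on its own; impliedSR is a name I made up, and Du Plessis is written as DuPlessis to keep the column name simple.

```r
# Predicted runs re-typed from the table above
predictedRuns <- data.frame(
  BallsFaced = c(5, 12, 19, 27, 34, 41, 48, 56, 63, 70),
  Kohli      = c(2, 12, 22, 31, 41, 51, 60, 70, 79, 89),
  DuPlessis  = c(1, 10, 19, 28, 37, 47, 56, 65, 74, 84),
  Finch      = c(5, 22, 40, 57, 74, 91, 108, 125, 142, 159),
  McCullum   = c(3, 16, 28, 41, 54, 66, 79, 91, 104, 117)
)

# Implied strike rate = predicted runs per 100 balls, row by row
impliedSR <- round(100 * predictedRuns[, -1] / predictedRuns$BallsFaced, 1)

# Implied strike rate for the longest hypothetical innings (70 balls)
print(tail(impliedSR, 1))
```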

Highest runs likelihood

The plots below show the runs likelihood of each batsman. This uses K-Means. Kohli has the highest likelihood of scoring runs, being 34.62% likely to score 66 runs. Du Plessis has a 25% likelihood of scoring 53 runs.

A. Virat Kohli

batsmanRunsLikelihood("./kohli.csv","Kohli")

## Summary of  Kohli 's runs scoring likelihood
## **************************************************
## 
## There is a 23.08 % likelihood that Kohli  will make  10 Runs in  10 balls over 13  Minutes 
## There is a 42.31 % likelihood that Kohli  will make  29 Runs in  23 balls over  30  Minutes 
## There is a 34.62 % likelihood that Kohli  will make  66 Runs in  47 balls over 63  Minutes
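The percentages in the summary are consistent with cluster shares: with 26 innings, 6/26, 11/26 and 9/26 give 23.08%, 42.31% and 34.62%. The sketch below shows that idea with base R's kmeans() on made-up innings. It assumes 3 clusters and that likelihood means the share of innings in a cluster; the package internals may differ.

```r
# Made-up innings for illustration only (not real player data)
innings <- data.frame(
  Runs = c(2, 8, 11, 14, 25, 28, 31, 35, 60, 66, 72, 78),
  BF   = c(4, 9, 10, 12, 20, 22, 25, 27, 44, 47, 50, 55),
  Mins = c(6, 12, 13, 15, 27, 30, 32, 36, 58, 63, 66, 72)
)

set.seed(7)  # k-means starts from random centres
fit <- kmeans(innings, centers = 3, nstart = 20)

# Likelihood of each cluster = share of innings that fall in it
likelihood <- round(100 * fit$size / nrow(innings), 2)

# Cluster centres (typical Runs, BF, Mins) alongside their likelihood
clusterSummary <- data.frame(round(fit$centers), Likelihood = likelihood)
print(clusterSummary)
```

Each row reads like a line of the summary above: a typical score, the balls and minutes it took, and how often innings of that kind occurred.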

B. Faf Du Plessis

batsmanRunsLikelihood("./plessis.csv","Du Plessis")

## Summary of  Du Plessis 's runs scoring likelihood
## **************************************************
## 
## There is a 62.5 % likelihood that Du Plessis  will make  14 Runs in  11 balls over 19  Minutes 
## There is a 25 % likelihood that Du Plessis  will make  53 Runs in  40 balls over  50  Minutes 
## There is a 12.5 % likelihood that Du Plessis  will make  94 Runs in  61 balls over 90  Minutes

C. A J Finch

batsmanRunsLikelihood("./finch.csv","A J Finch")

## Summary of  A J Finch 's runs scoring likelihood
## **************************************************
## 
## There is a 20 % likelihood that A J Finch  will make  95 Runs in  54 balls over 70  Minutes 
## There is a 25 % likelihood that A J Finch  will make  42 Runs in  27 balls over  35  Minutes 
## There is a 55 % likelihood that A J Finch  will make  8 Runs in  8 balls over 12  Minutes

D. Brendon McCullum

batsmanRunsLikelihood("./mccullum.csv","McCullum")