This should be the last in the series of posts based on my R package cricketr. That is, unless some bright idea comes trotting along and light bulbs go on around my head.
In this post cricketr adapts to the Twenty20 International format. cricketr can now handle stats from all 3 formats of the game, namely Test matches, ODIs and Twenty20 Internationals, from ESPN Cricinfo. You should be able to install the package from GitHub and use many of the functions available in the package.
Please be mindful of the ESPN Cricinfo Terms of Use.
You can also read this post at Rpubs as twenty20-cricketr. Download this report as a PDF file from twenty20-cricketr.pdf
I have chosen the Top 4 batsmen and top 4 bowlers based on ICC rankings and/or number of matches played.
Batsmen
Bowlers
I have explained the plots and added my own observations. Please feel free to draw your conclusions!
library(devtools)
install_github("tvganesh/cricketr")
library(cricketr)
The data for a particular player can be obtained with the getPlayerDataTT() function. To do this, go to ESPN CricInfo Player and type in the name of the player, e.g. Virat Kohli, Sunil Narine etc. This will bring up a page which has the profile number for the player; for Virat Kohli this would be http://www.espncricinfo.com/india/content/player/253802.html. Hence, Kohli's profile number is 253802. This can be used to get the data for Virat Kohli as shown below.
kohli <- getPlayerDataTT(253802,dir="..",file="kohli.csv",type="batting")
## http://stats.espncricinfo.com/ci/engine/player/253802.html?class=3;home_or_away=1;home_or_away=2;home_or_away=3;result=1;result=2;result=3;result=5;template=results;type=batting;view=innings
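The URL echoed above hints at how the profile number and the `type` argument end up in the query. As a rough sketch only, the helper below rebuilds that same URL from its parts. `buildStatsUrl` is a hypothetical name and is not part of cricketr, and the query string is simply copied from the output above (`class=3` appears to select Twenty20 Internationals).

```r
# Hypothetical helper (NOT part of cricketr): rebuilds the Statsguru URL
# echoed above from a profile number and a stats type.
buildStatsUrl <- function(profile, type = "batting") {
  paste0("http://stats.espncricinfo.com/ci/engine/player/", profile,
         ".html?class=3;home_or_away=1;home_or_away=2;home_or_away=3;",
         "result=1;result=2;result=3;result=5;template=results;type=", type,
         ";view=innings")
}

# Kohli's profile number from the text above
url <- buildStatsUrl(253802)
print(url)
```

Passing a different `type` only changes the `type=` part of the query; whether other values are accepted by the site is not checked here.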
The analysis is included below
The following plots give the analysis of the 4 Twenty20 batsmen
The 3 charts below give the number of
A. Virat Kohli
- The 1st plot shows that Kohli hits about five 4s on his way to a score in the 50s
- The 2nd plot is a box plot of the number of 6s against runs. It shows the range of runs when Kohli hit 1, 2 or 4 6s. The dark line in each box marks the median runs when he hit that number of 6s. So when he hit one 6, his typical score was about 45
- The 3rd plot shows the number of runs scored against the balls faced. It can be seen that when Kohli faced 50 balls he had scored around 70 runs
par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./kohli.csv","Kohli")
batsman6s("./kohli.csv","Kohli")
batsmanScoringRateODTT("./kohli.csv","Kohli")
dev.off()
## null device
## 1
B. Faf du Plessis
par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./plessis.csv","Du Plessis")
batsman6s("./plessis.csv","Du Plessis")
batsmanScoringRateODTT("./plessis.csv","Du Plessis")
dev.off()
## null device
## 1
C. A J Finch
par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./finch.csv","A J Finch")
batsman6s("./finch.csv","A J Finch")
batsmanScoringRateODTT("./finch.csv","A J Finch")
dev.off()
## null device
## 1
D. Brendon McCullum
par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsman4s("./mccullum.csv","McCullum")
batsman6s("./mccullum.csv","McCullum")
batsmanScoringRateODTT("./mccullum.csv","McCullum")
dev.off()
## null device
## 1
This plot shows the mean strike rate of each batsman in each run range. It can be seen that A J Finch has the best strike rate, followed by B McCullum.
par(mar=c(4,4,2,2))
frames <- list("./kohli.csv","./plessis.csv","finch.csv","mccullum.csv")
names <- list("Kohli","Du Plessis","Finch","McCullum")
relativeBatsmanSRODTT(frames,names)
## Warning in simpleLoess(y, x, w, span, degree, parametric, drop.square,
## normalize, : Chernobyl! trL>n 6
## Warning in simpleLoess(y, x, w, span, degree, parametric, drop.square,
## normalize, : Chernobyl! trL>n 6
## Warning in sqrt(sum.squares/one.delta): NaNs produced
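To make "mean strike rate in each run range" concrete, here is a minimal base R sketch on made-up innings. It is not the code inside relativeBatsmanSRODTT, and the data frame and its column names are assumptions for illustration.

```r
# Made-up innings for illustration only (not real Cricinfo data)
innings <- data.frame(Runs = c(3, 8, 12, 14, 27, 45, 52, 66),
                      BF   = c(5, 8, 10, 12, 20, 30, 40, 47))

# Strike rate = runs per 100 balls faced
innings$SR <- 100 * innings$Runs / innings$BF

# Bin the innings into run ranges of width 10: [0,10), [10,20), ...
innings$Range <- cut(innings$Runs, breaks = seq(0, 70, by = 10), right = FALSE)

# Mean strike rate within each run range (empty ranges give NA)
meanSR <- tapply(innings$SR, innings$Range, mean)
print(round(meanSR, 1))
```

Sparse ranges like the empty one here are a plausible reason a smoother would complain when there are few innings, which may be what the warnings above reflect.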
The plot below provides the average runs scored in each run range: 0-5, 5-10, 10-15 etc. Clearly Kohli has the most runs scored in most of the run ranges. This is also evident in the fact that Kohli has the highest average. He is followed by McCullum.
frames <- list("./kohli.csv","./plessis.csv","finch.csv","mccullum.csv")
names <- list("Kohli","Du Plessis","Finch","McCullum")
relativeRunsFreqPerfODTT(frames,names)
The plot below shows the percentage of runs scored by way of 4s and 6s for each batsman. Going by the table printed below, Kohli has the highest percentage of 4s (27.78%) and McCullum has the highest percentage of 6s (15.69%). Finch has the highest combined percentage of 4s & 6s: 25.37 + 15.64 = 41.01%
frames <- list("./kohli.csv","./plessis.csv","finch.csv","mccullum.csv")
names <- list("Kohli","Du Plessis","Finch","McCullum")
runs4s6s <-batsman4s6s(frames,names)
print(runs4s6s)
##                Kohli Du Plessis Finch McCullum
## Runs(1s,2s,3s) 64.29      64.55 58.99    61.45
## 4s             27.78      24.38 25.37    22.87
## 6s              7.94      11.07 15.64    15.69
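The boundary percentages can be checked directly from the printed table. The sketch below retypes the 4s and 6s rows by hand (it does not call batsman4s6s) and adds them up per batsman.

```r
# Percentages copied from the printed runs4s6s table above
pct <- rbind("4s" = c(27.78, 24.38, 25.37, 22.87),
             "6s" = c(7.94, 11.07, 15.64, 15.69))
colnames(pct) <- c("Kohli", "Du Plessis", "Finch", "McCullum")

# Share of runs scored in boundaries (4s + 6s) for each batsman
boundaryPct <- colSums(pct)
print(boundaryPct)

# Batsman with the largest share in each row, and overall
print(colnames(pct)[apply(pct, 1, which.max)])
print(names(which.max(boundaryPct)))
```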
The plot is a scatter plot of Runs vs Balls faced and Minutes at Crease. A prediction plane is then fitted based on the Balls Faced and Minutes at Crease to give the runs scored
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
battingPerf3d("./kohli.csv","Kohli")
battingPerf3d("./plessis.csv","Du Plessis")
dev.off()
## null device
## 1
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
battingPerf3d("./finch.csv","A J Finch")
battingPerf3d("./mccullum.csv","McCullum")
dev.off()
## null device
## 1
A hypothetical Balls faced and Minutes at Crease is used to predict the runs scored by each batsman based on the computed prediction plane
BF <- seq( 5, 70,length=10)
Mins <- seq(5,70,length=10)
newDF <- data.frame(BF,Mins)
kohli <- batsmanRunsPredict("./kohli.csv","Kohli",newdataframe=newDF)
plessis <- batsmanRunsPredict("./plessis.csv","Du Plessis",newdataframe=newDF)
finch <- batsmanRunsPredict("./finch.csv","A J Finch",newdataframe=newDF)
mccullum <- batsmanRunsPredict("./mccullum.csv","McCullum",newdataframe=newDF)
The predicted runs are displayed below. As can be seen, Finch has the best overall strike rate, followed by McCullum.
batsmen <-cbind(round(kohli$Runs),round(plessis$Runs),round(finch$Runs),round(mccullum$Runs))
colnames(batsmen) <- c("Kohli","Du Plessis","Finch","McCullum")
newDF <- data.frame(round(newDF$BF),round(newDF$Mins))
colnames(newDF) <- c("BallsFaced","MinsAtCrease")
predictedRuns <- cbind(newDF,batsmen)
predictedRuns
##    BallsFaced MinsAtCrease Kohli Du Plessis Finch McCullum
## 1           5            5     2          1     5        3
## 2          12           12    12         10    22       16
## 3          19           19    22         19    40       28
## 4          27           27    31         28    57       41
## 5          34           34    41         37    74       54
## 6          41           41    51         47    91       66
## 7          48           48    60         56   108       79
## 8          56           56    70         65   125       91
## 9          63           63    79         74   142      104
## 10         70           70    89         84   159      117
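For readers curious what a "prediction plane" amounts to, here is a minimal sketch assuming an ordinary least squares fit of runs on balls faced and minutes at crease. Whether batsmanRunsPredict does exactly this internally is an assumption, and the innings below are synthetic.

```r
set.seed(42)
# Synthetic innings for illustration only (not real Cricinfo data)
BF   <- sample(5:60, 30, replace = TRUE)
Mins <- round(BF * 1.3 + rnorm(30, sd = 3))
Runs <- round(1.2 * BF + 0.1 * Mins + rnorm(30, sd = 4))
innings <- data.frame(Runs, BF, Mins)

# Fit the plane Runs = a + b*BF + c*Mins by least squares
fit <- lm(Runs ~ BF + Mins, data = innings)

# Predict runs for hypothetical balls faced / minutes at crease
newDF <- data.frame(BF = seq(5, 70, length = 10), Mins = seq(5, 70, length = 10))
predicted <- predict(fit, newdata = newDF)
print(cbind(round(newDF), Runs = round(predicted)))
```

Note that the hypothetical grid sets minutes equal to balls faced, so the predictions trace a single line across the fitted plane rather than the whole surface.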
The plots below show the runs likelihood of each batsman. This uses K-Means clustering. Kohli has the highest likelihood of a big score: he is 34.62% likely to score 66 runs. Du Plessis has a 25% likelihood of scoring 53 runs.
A. Virat Kohli
batsmanRunsLikelihood("./kohli.csv","Kohli")
## Summary of Kohli 's runs scoring likelihood
## **************************************************
##
## There is a 23.08 % likelihood that Kohli will make 10 Runs in 10 balls over 13 Minutes
## There is a 42.31 % likelihood that Kohli will make 29 Runs in 23 balls over 30 Minutes
## There is a 34.62 % likelihood that Kohli will make 66 Runs in 47 balls over 63 Minutes
B. Faf Du Plessis
batsmanRunsLikelihood("./plessis.csv","Du Plessis")
## Summary of Du Plessis 's runs scoring likelihood
## **************************************************
##
## There is a 62.5 % likelihood that Du Plessis will make 14 Runs in 11 balls over 19 Minutes
## There is a 25 % likelihood that Du Plessis will make 53 Runs in 40 balls over 50 Minutes
## There is a 12.5 % likelihood that Du Plessis will make 94 Runs in 61 balls over 90 Minutes
C. A J Finch
batsmanRunsLikelihood("./finch.csv","A J Finch")
## Summary of A J Finch 's runs scoring likelihood
## **************************************************
##
## There is a 20 % likelihood that A J Finch will make 95 Runs in 54 balls over 70 Minutes
## There is a 25 % likelihood that A J Finch will make 42 Runs in 27 balls over 35 Minutes
## There is a 55 % likelihood that A J Finch will make 8 Runs in 8 balls over 12 Minutes
D. Brendon McCullum
batsmanRunsLikelihood("./mccullum.csv","McCullum")
## Summary of McCullum 's runs scoring likelihood
## **************************************************
##
## There is a 50.72 % likelihood that McCullum will make 11 Runs in 10 balls over 13 Minutes
## There is a 28.99 % likelihood that McCullum will make 36 Runs in 27 balls over 37 Minutes
## There is a 20.29 % likelihood that McCullum will make 74 Runs in 48 balls over 70 Minutes
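One way to read these summaries is that each line is a cluster centre (runs, balls, minutes) and the likelihood is the share of innings falling in that cluster. Kohli's figures, for instance, are consistent with shares of 6, 11 and 9 out of 26 innings. The sketch below follows that reading on synthetic data using base R's `kmeans`; it is an assumption about the approach, not the package's actual code.

```r
set.seed(1)
# Synthetic innings (illustration only): short, medium and long stays
innings <- data.frame(
  Runs = c(rnorm(12, 10, 3), rnorm(8, 35, 5), rnorm(5, 70, 8)),
  BF   = c(rnorm(12, 10, 2), rnorm(8, 25, 4), rnorm(5, 48, 5)),
  Mins = c(rnorm(12, 13, 3), rnorm(8, 32, 5), rnorm(5, 65, 6))
)

# Group the innings into 3 clusters
km <- kmeans(innings, centers = 3, nstart = 10)

# Likelihood of each cluster = share of innings that fall in it
likelihood <- 100 * km$size / sum(km$size)

summaryDF <- data.frame(round(km$centers), Likelihood = round(likelihood, 2))
print(summaryDF)
```

The order of the clusters is arbitrary, which would explain why Finch's summary lists his highest-scoring cluster first.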
The moving averages for the 4 batsmen are shown below. It must be noted that there is not yet sufficient data on Twenty20 Internationals. Kohli, Du Plessis and Finch average only 26 innings while McCullum has close to 70. So the moving average, while an indication, will regress towards the mean over time.
par(mfrow=c(2,2))
par(mar=c(4,4,2,2))
batsmanMovingAverage("./kohli.csv","Kohli")
batsmanMovingAverage("./plessis.csv","Du Plessis")
batsmanMovingAverage("./finch.csv","A J Finch")
batsmanMovingAverage("./mccullum.csv","McCullum")
dev.off()
## null device
## 1
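As an illustration of the idea only, the sketch below computes a plain trailing moving average over made-up scores with `stats::filter`. This is a swapped-in, simpler technique: the smoothing that batsmanMovingAverage applies may well differ.

```r
# Made-up runs per innings in career order (illustration only)
runs <- c(10, 26, 4, 31, 22, 68, 14, 29, 78, 9, 27, 61)

# Simple trailing moving average over a window of 4 innings;
# the first 3 entries are NA because the window is incomplete
window <- 4
movAvg <- stats::filter(runs, rep(1 / window, window), sides = 1)
print(round(as.numeric(movAvg), 2))
```

With only a couple of dozen innings, a single large score moves the average noticeably, which is the caveat raised above about limited Twenty20 data.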
The plots show how often each bowler has taken 1, 2, 3 etc. wickets. The most wickets taken is by Ajantha Mendis (6 wickets).
Each plot gives the percentage of occasions on which the bowler took each number of wickets (1, 2, 3 etc.)
par(mfrow=c(1,4))
par(mar=c(4,4,2,2))
bowlerWktsFreqPercent("./badree.csv","Badree")
bowlerWktsFreqPercent("./mendis.csv","Mendis")
bowlerWktsFreqPercent("./narine.csv","Narine")
bowlerWktsFreqPercent("./ashwin.csv","Ashwin")
dev.off()
## null device
## 1
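A wicket frequency percentage of this kind can be sketched in base R with `table` and `prop.table`. The wickets vector below is made up, and reading the percentages as "share of innings" is an assumption about what bowlerWktsFreqPercent plots.

```r
# Made-up wickets per innings (illustration only)
wkts <- c(0, 1, 1, 2, 3, 1, 0, 2, 4, 1, 2, 0)

# Percentage of innings in which 0, 1, 2, ... wickets were taken
wktsPct <- 100 * prop.table(table(wkts))
print(round(wktsPct, 2))
```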
The plot below gives a boxplot of the range of runs conceded for each number of wickets taken by the bowlers. The ends of the box indicate the 25th and 75th percentiles of runs conceded for the wickets taken, and the dark black line is the median runs conceded.
par(mfrow=c(1,4))
par(mar=c(4,4,2,2))
bowlerWktsRunsPlot("./badree.csv","Badree")
bowlerWktsRunsPlot("./mendis.csv","Mendis")
bowlerWktsRunsPlot("./narine.csv","Narine")
bowlerWktsRunsPlot("./ashwin.csv","Ashwin")
dev.off()
## null device
## 1
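In a standard R box plot the box spans roughly the 25th to 75th percentiles and the dark line is the median. The numbers behind such a plot can be sketched as below on made-up figures. `quantile` is used here for readability, while `boxplot` itself uses hinges that can differ slightly for small groups.

```r
# Made-up runs conceded, grouped by wickets taken (illustration only)
runs <- c(20, 24, 31, 18, 22, 27, 15, 19, 12)
wkts <- c(1, 1, 1, 2, 2, 2, 3, 3, 4)

# 25th percentile, median and 75th percentile of runs for each wicket count
boxStats <- sapply(split(runs, wkts), quantile, probs = c(0.25, 0.5, 0.75))
print(boxStats)
```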
This plot below shows the average number of deliveries needed by the bowler to take the wickets (1,2,3 etc)
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktRateTT("./badree.csv","Badree")
bowlerWktRateTT("./mendis.csv","Mendis")
dev.off()
## null device
## 1
par(mfrow=c(1,2))
par(mar=c(4,4,2,2))
bowlerWktRateTT("./narine.csv","Narine")
bowlerWktRateTT("./ashwin.csv","Ashwin")
dev.off()
## null device
## 1
The plot below shows that Narine has the most wickets in the 2-4 range, followed by Mendis
frames <- list("./badree.csv","./mendis.csv","narine.csv","ashwin.csv")
names <- list("Badree","Mendis","Narine","Ashwin")
relativeBowlingPerf(frames,names)
The economy rate can be deduced as follows from the plot below. Narine has a good economy rate around 1 & 4 wickets, Ashwin around 2 wickets and Badree around 3 wickets.
frames <- list("./badree.csv","./mendis.csv","narine.csv","ashwin.csv")
names <- list("Badree","Mendis","Narine","Ashwin")
relativeBowlingERODTT(frames,names)
The relative wicket rate plots the mean number of deliveries needed to take a given number of wickets (1, 2, 3, 4). For example, Narine needed an average of 22 deliveries to take 1 wicket, and 22.5, 23.2 and 24 deliveries to take 2, 3 & 4 wickets respectively
frames <- list("./badree.csv","./mendis.csv","narine.csv","ashwin.csv")
names <- list("Badree","Mendis","Narine","Ashwin")
relativeWktRateTT(frames,names)
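Both the wicket rate and the economy rate reduce to a grouped mean over the number of wickets taken. The sketch below shows that calculation on made-up bowling figures. The column names are assumptions, and "deliveries needed" is read here as the balls bowled in innings with that many wickets, which may not match the package's definition exactly.

```r
# Made-up bowling figures per innings (illustration only)
bowling <- data.frame(Wkts  = c(1, 2, 1, 3, 2, 1, 4, 2),
                      Balls = c(24, 24, 18, 24, 22, 24, 24, 20),
                      Runs  = c(22, 18, 25, 15, 20, 30, 12, 16))

# Mean deliveries bowled in innings with 1, 2, 3, ... wickets
meanBalls <- tapply(bowling$Balls, bowling$Wkts, mean)

# Economy rate = runs conceded per over (6 balls), averaged per wicket count
bowling$ER <- bowling$Runs / (bowling$Balls / 6)
meanER <- tapply(bowling$ER, bowling$Wkts, mean)

print(round(cbind(meanBalls, meanER), 2))
```

Since a Twenty20 bowler is limited to 4 overs (24 balls), the mean deliveries cluster tightly near 24, much like the 22 to 24 range quoted for Narine.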
## Moving average of wickets over career
par(mfrow=c(2,2))
par(mar=c(4,4,2,2))
bowlerMovingAverage("./badree.csv","Badree")
bowlerMovingAverage("./mendis.csv","Mendis")
bowlerMovingAverage("./narine.csv","Narine")
bowlerMovingAverage("./ashwin.csv","Ashwin")
dev.off()
## null device
## 1
Here are some key conclusions
Twenty 20 batsmen
Twenty20 bowlers
Key takeaways 1. If all the above batsmen and bowlers were in the same team we expect