library(cricketr)
“He felt that his whole life was some kind of dream and he sometimes wondered whose it was and whether they were enjoying it.”
“The ships hung in the sky in much the same way that bricks don’t.”
“We demand rigidly defined areas of doubt and uncertainty!”
“For a moment, nothing happened. Then, after a second or so, nothing continued to happen.”
“‘The Answer to the Great Question… Of Life, the Universe and Everything… Is… Forty-two,’ said Deep Thought, with infinite majesty and calm.”
The Hitchhiker's Guide to the Galaxy - Douglas Adams
In this post, I introduce 2 new functions in my R package ‘cricketr’ (cricketr v0.22; see Re-introducing cricketr! : An R package to analyze performances of cricketers), which enable granular analysis of batsmen and bowlers. They are getPlayerDataHA and getPlayerDataOppnHA, both of which are used in the examples that follow.
Note: All the existing cricketr functions can be used on this smaller, fine-grained data set for a closer analysis of players.
This post has been published in Rpubs and can be accessed at Cricketr learns new tricks
You can download a PDF version of this post at Cricketr learns new tricks
The following functions analyze Sachin Tendulkar during 3 different periods of his illustrious career. a) 1st Jan 2001-1st Jan 2002 b) 1st Jan 2005-1st Jan 2006 c) 1st Jan 2012-1st Jan 2013
# Get the homeOrAway dataset for Tendulkar in Test matches
#df=getPlayerDataHA(35320,tfile="tendulkarTestHA.csv",matchType="Test")
# Get Tendulkar's data for 2001-02 (infile is the file written by getPlayerDataHA above)
df1=getPlayerDataOppnHA(infile="tendulkarTestHA.csv",outfile="tendulkarTest2001.csv",
startDate="2001-01-01",endDate="2002-01-01")
# Get Tendulkar's data for 2005-06
df2=getPlayerDataOppnHA(infile="tendulkarTestHA.csv",outfile="tendulkarTest2005.csv",
startDate="2005-01-01",endDate="2006-01-01")
# Get Tendulkar's data for 2012-13 (this file is used by the plots below)
df3=getPlayerDataOppnHA(infile="tendulkarTestHA.csv",outfile="tendulkarTest2012.csv",
startDate="2012-01-01",endDate="2013-01-01")
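To make the idea of slicing by period concrete, here is a minimal base R sketch of a date-window filter. It is not the cricketr implementation: the data frame, its column names (`Runs`, `StartDate`), the values, and the helper `sliceByDate` are all made up for illustration. The sketch also assumes a half-open window (start date included, end date excluded) so that adjacent yearly windows never count the same innings twice; how cricketr treats the end date is not stated in this post.

```r
# Toy innings table. Column names and values are hypothetical,
# not the schema of the files written by cricketr.
innings <- data.frame(
  Runs = c(126, 17, 69, 10, 88),
  StartDate = as.Date(c("2000-11-18", "2001-03-11", "2001-11-03",
                        "2002-01-01", "2002-04-19"))
)

# Hypothetical helper: keep rows whose start date falls in [startDate, endDate)
sliceByDate <- function(df, startDate, endDate) {
  start <- as.Date(startDate)
  end <- as.Date(endDate)
  df[df$StartDate >= start & df$StartDate < end, , drop = FALSE]
}

# Innings that started during calendar year 2001
y2001 <- sliceByDate(innings, "2001-01-01", "2002-01-01")
print(y2001)
```

Because the window is half-open, the innings dated 2002-01-01 belongs to the 2002 slice and not to the 2001 slice.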
Note: Any of the cricketr R functions can be used on the fine-grained subset of data, as below. Tendulkar's mean strike rate is of the order of 60+ in 2001, decreases to about 50 in 2005, and falls to around 45 in 2012.
# Compute and plot mean strike rate of Tendulkar in the 3 periods
batsmanMeanStrikeRate ("./tendulkarTest2001.csv","Tendulkar-2001")
batsmanMeanStrikeRate ("./tendulkarTest2005.csv","Tendulkar-2005")
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : span too small. fewer data values than degrees of freedom.
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : pseudoinverse used at 22.05
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : neighborhood radius 30.45
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : reciprocal condition number 0
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : There are other near singularities as well. 3654.2
batsmanMeanStrikeRate ("./tendulkarTest2012.csv","Tendulkar-2012")
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : span too small. fewer data values than degrees of freedom.
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : pseudoinverse used at 7.125
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : neighborhood radius 30.375
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : reciprocal condition number 0
## Warning in simpleLoess(y, x, w, span, degree = degree, parametric =
## parametric, : There are other near singularities as well. 3645.1
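For readers unfamiliar with the metric, a batting strike rate is runs scored per 100 balls faced. The sketch below computes it on made-up numbers; the column names `Runs` and `BF` are assumptions for illustration, and how batsmanMeanStrikeRate aggregates and smooths the values internally is not shown in this post.

```r
# Made-up innings: runs scored and balls faced (hypothetical column names)
innings <- data.frame(Runs = c(30, 60, 45), BF = c(60, 100, 90))

# Strike rate of each innings: runs per 100 balls faced
innings$SR <- 100 * innings$Runs / innings$BF

# Two common summaries, which generally differ:
meanSR <- mean(innings$SR)                               # mean of per-innings rates
overallSR <- 100 * sum(innings$Runs) / sum(innings$BF)   # pooled over all balls faced

print(innings)
print(c(meanSR = meanSR, overallSR = overallSR))
```

With only a handful of innings in a one-year slice, any smoothed curve fitted through such points rests on very little data, which is consistent with the loess warnings printed above.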
On average, Tendulkar scores 60+ in 2001 and is really blazing. This performance declines in 2005 and further in 2012.
par(mfrow=c(1,3))
par(mar=c(4,4,2,2))
batsmanAvgRunsGround("tendulkarTest2001.csv","Tendulkar-2001")
batsmanAvgRunsGround("tendulkarTest2005.csv","Tendulkar-2005")
batsmanAvgRunsGround("tendulkarTest2012.csv","Tendulkar-2012")
dev.off()
## null device
## 1
Sachin uniformly scores 50+ against formidable oppositions in 2001. In 2005 this decreases to 40, and in 2012 it is around 25.
batsmanAvgRunsOpposition("tendulkarTest2001.csv","Tendulkar-2001")
batsmanAvgRunsOpposition("tendulkarTest2005.csv","Tendulkar-2005")
batsmanAvgRunsOpposition("tendulkarTest2012.csv","Tendulkar-2012")
The plots below compare Tendulkar’s cumulative average and cumulative strike rate during 3 different stages of his career
frames=list("tendulkarTest2001.csv","tendulkarTest2005.csv","tendulkarTest2012.csv")
names=list("Tendulkar-2001","Tendulkar-2005","Tendulkar-2012")
relativeBatsmanCumulativeAvgRuns(frames,names)
relativeBatsmanCumulativeStrikeRate(frames,names)
The analysis below looks at Kohli’s performance against England in ‘away’ venues (England) in 2014 and 2018
# Get the homeOrAway data for Kohli in Test matches
#df=getPlayerDataHA(253802,tfile="kohliTestHA.csv",type="batting",matchType="Test")
# Get the subset of data of Kohli's performance against England in England in 2014
df=getPlayerDataOppnHA(infile="kohliTestHA.csv",outfile="kohliTestEng2014.csv",
opposition=c("England"),homeOrAway=c("away"),startDate="2014-01-01",endDate="2015-01-01")
# Get the subset of data of Kohli's performance against England in England in 2018
# (same input file as above, written by getPlayerDataHA)
df1=getPlayerDataOppnHA(infile="kohliTestHA.csv",outfile="kohliTestEng2018.csv",
opposition=c("England"),homeOrAway=c("away"),startDate="2018-01-01",endDate="2019-01-01")
Kohli had a miserable outing to England in 2014 with a string of low scores. In 2018 Kohli pulls himself out of the morass
batsmanAvgRunsGround("kohliTestEng2014.csv","Kohli-Eng-2014")
batsmanAvgRunsGround("kohliTestEng2018.csv","Kohli-Eng-2018")
Kohli’s cumulative average runs in 2014 is around 15, while in 2018 it is 70+. Kohli stamps his class again and erases the bad memories of 2014.
batsmanCumulativeAverageRuns("kohliTestEng2014.csv", "Kohli-Eng-2014")
batsmanCumulativeAverageRuns("kohliTestEng2018.csv", "Kohli-Eng-2018")
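As a rough illustration of what a cumulative average means, the sketch below computes the running mean of runs after each innings in base R. The scores are made up, and this is only an assumption about the general idea: cricketr may treat not-outs differently or smooth the curve.

```r
# Made-up scores in chronological order
runs <- c(10, 20, 0, 50)

# Running mean after each innings: total runs so far / innings so far
cumAvg <- cumsum(runs) / seq_along(runs)

print(cumAvg)
```

A single large score late in a short series lifts the cumulative average sharply, so slices with few innings should be read with some care.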
The analysis below compares the performances of Sourav Ganguly, Rahul Dravid and VVS Laxman against Australia, South Africa, and England in ‘away’ venues between 01 Jan 2002 and 01 Jan 2008
#Get the HA data for Ganguly, Dravid and Laxman
#df=getPlayerDataHA(28779,tfile="gangulyTestHA.csv",type="batting",matchType="Test")
#df=getPlayerDataHA(28114,tfile="dravidTestHA.csv",type="batting",matchType="Test")
#df=getPlayerDataHA(30750,tfile="laxmanTestHA.csv",type="batting",matchType="Test")
# Slice the data
df=getPlayerDataOppnHA(infile="gangulyTestHA.csv",outfile="gangulyTestAES2002-08.csv"
,opposition=c("Australia", "England", "South Africa"),
homeOrAway=c("away"),startDate="2002-01-01",endDate="2008-01-01")
df=getPlayerDataOppnHA(infile="dravidTestHA.csv",outfile="dravidTestAES2002-08.csv"
,opposition=c("Australia", "England", "South Africa"),
homeOrAway=c("away"),startDate="2002-01-01",endDate="2008-01-01")
df=getPlayerDataOppnHA(infile="laxmanTestHA.csv",outfile="laxmanTestAES2002-08.csv"
,opposition=c("Australia", "England", "South Africa"),
homeOrAway=c("away"),startDate="2002-01-01",endDate="2008-01-01")
Plot the relative cumulative average runs and relative cumulative strike rate of Ganguly, Dravid and Laxman
- Dravid towers over Laxman and Ganguly with respect to cumulative average runs.
- Ganguly has a superior strike rate, followed by Laxman and then Dravid.
frames=list("gangulyTestAES2002-08.csv","dravidTestAES2002-08.csv","laxmanTestAES2002-08.csv")
names=list("GangulyAusEngSA2002-08","DravidAusEngSA2002-08","LaxmanAusEngSA2002-08")
relativeBatsmanCumulativeAvgRuns(frames,names)
relativeBatsmanCumulativeStrikeRate(frames,names)
Compare the performances of Rohit Sharma, Joe Root and Kane Williamson in away and neutral venues against Australia, West Indies and South Africa
# Get the ODI HA data for Rohit, Root and Williamson
#df=getPlayerDataHA(34102,tfile="rohitODIHA.csv",type="batting",matchType="ODI")
#df=getPlayerDataHA(303669,tfile="joerootODIHA.csv",type="batting",matchType="ODI")
#df=getPlayerDataHA(277906,tfile="williamsonODIHA.csv",type="batting",matchType="ODI")
# Subset the data for specific opposition in away and neutral venues
df=getPlayerDataOppnHA(infile="rohitODIHA.csv",outfile="rohitODIAusWISA.csv"
,opposition=c("Australia", "West Indies", "South Africa"),
homeOrAway=c("away","neutral"))
df=getPlayerDataOppnHA(infile="joerootODIHA.csv",outfile="joerootODIAusWISA.csv"
,opposition=c("Australia", "West Indies", "South Africa"),
homeOrAway=c("away","neutral"))
df=getPlayerDataOppnHA(infile="williamsonODIHA.csv",outfile="williamsonODIAusWiSA.csv"
,opposition=c("Australia", "West Indies", "South Africa"),
homeOrAway=c("away","neutral"))
The relative cumulative strike rates of all 3 are comparable
frames=list("rohitODIAusWISA.csv","joerootODIAusWISA.csv","williamsonODIAusWiSA.csv")
names=list("Rohit-ODI-AusWISA","Joe Root-ODI-AusWISA","Williamson-ODI-AusWISA")
relativeBatsmanCumulativeAvgRuns(frames,names)
relativeBatsmanCumulativeStrikeRate(frames,names)
Plot the performances of Dhoni against Australia, West Indies, South Africa and England
# Get the HA T20 data for Dhoni
#df=getPlayerDataHA(28081,tfile="dhoniT20HA.csv",type="batting",matchType="T20")
#Subset the data
df=getPlayerDataOppnHA(infile="dhoniT20HA.csv",outfile="dhoniT20AusWISAEng.csv"
,opposition=c("Australia", "West Indies", "South Africa","England"),
homeOrAway=c("all"))
Note: You can use any of cricketr’s functions against the fine-grained data
batsmanAvgRunsOpposition("dhoniT20AusWISAEng.csv","Dhoni")
batsmanAvgRunsGround("dhoniT20AusWISAEng.csv","Dhoni")
batsmanCumulativeStrikeRate("dhoniT20AusWISAEng.csv","Dhoni")
batsmanCumulativeAverageRuns("dhoniT20AusWISAEng.csv","Dhoni")
Compute the performances of Kumble, Warne and Muralitharan against New Zealand, West Indies, South Africa and England on pitches that are not ‘home’ pitches
# Get the bowling data for Kumble, Warne and Muralitharan in Test matches
#df=getPlayerDataHA(30176,tfile="kumbleTestHA.csv",type="bowling",matchType="Test")
#df=getPlayerDataHA(8166,tfile="warneTestHA.csv",type="bowling",matchType="Test")
#df=getPlayerDataHA(49636,tfile="muraliTestHA.csv",type="bowling",matchType="Test")
# Subset the data
df=getPlayerDataOppnHA(infile="kumbleTestHA.csv",outfile="kumbleTest-NZWISAEng.csv"
,opposition=c("New Zealand", "West Indies", "South Africa","England"),
homeOrAway=c("away"))
df=getPlayerDataOppnHA(infile="warneTestHA.csv",outfile="warneTest-NZWISAEng.csv"
,opposition=c("New Zealand", "West Indies", "South Africa","England"),
homeOrAway=c("away"))
df=getPlayerDataOppnHA(infile="muraliTestHA.csv",outfile="muraliTest-NZWISAEng.csv"
,opposition=c("New Zealand", "West Indies", "South Africa","England"),
homeOrAway=c("away"))
bowlerAvgWktsOpposition("kumbleTest-NZWISAEng.csv","Kumble-NZWISAEng-AN")
bowlerAvgWktsOpposition("warneTest-NZWISAEng.csv","Warne-NZWISAEng-AN")
bowlerAvgWktsOpposition("muraliTest-NZWISAEng.csv","Murali-NZWISAEng-AN")
bowlerAvgWktsGround("kumbleTest-NZWISAEng.csv","Kumble")
bowlerAvgWktsGround("warneTest-NZWISAEng.csv","Warne")
bowlerAvgWktsGround("muraliTest-NZWISAEng.csv","Murali")
frames=list("kumbleTest-NZWISAEng.csv","warneTest-NZWISAEng.csv","muraliTest-NZWISAEng.csv")
names=list("Kumble","Warne","Murali")
relativeBowlerCumulativeAvgEconRate(frames,names)
relativeBowlerCumulativeAvgWickets(frames,names)
# Get the HA data for Bumrah in ODI in bowling
df=getPlayerDataHA(625383,tfile="bumrahODIHA.csv",type="bowling",matchType="ODI")
## [1] "Working..."
# Slice the data for periods 2016, 2017 and 2018
df=getPlayerDataOppnHA(infile="bumrahODIHA.csv",outfile="bumrahODI2016.csv",
startDate="2016-01-01",endDate="2017-01-01")
df=getPlayerDataOppnHA(infile="bumrahODIHA.csv",outfile="bumrahODI2017.csv",
startDate="2017-01-01",endDate="2018-01-01")
df=getPlayerDataOppnHA(infile="bumrahODIHA.csv",outfile="bumrahODI2018.csv",
startDate="2018-01-01",endDate="2019-01-01")
frames=list("bumrahODI2016.csv","bumrahODI2017.csv","bumrahODI2018.csv")
names=list("Bumrah-2016","Bumrah-2017","Bumrah-2018")
relativeBowlerCumulativeAvgEconRate(frames,names)
relativeBowlerCumulativeAvgWickets(frames,names)
# Get the HA bowling data for Shakib, Bumrah and Jadeja
df=getPlayerDataHA(56143,tfile="shakibT20HA.csv",type="bowling",matchType="T20")
## [1] "Working..."
df=getPlayerDataHA(625383,tfile="bumrahT20HA.csv",type="bowling",matchType="T20")
## [1] "Working..."
df=getPlayerDataHA(234675,tfile="jadejaT20HA.csv",type="bowling",matchType="T20")
## [1] "Working..."
# Slice the data for performances against Sri Lanka, Australia, South Africa and England
df=getPlayerDataOppnHA(infile="shakibT20HA.csv",outfile="shakibT20-SLAusSAEng.csv"
,opposition=c("Sri Lanka","Australia", "South Africa","England"),
homeOrAway=c("all"))
df=getPlayerDataOppnHA(infile="bumrahT20HA.csv",outfile="bumrahT20-SLAusSAEng.csv"
,opposition=c("Sri Lanka","Australia", "South Africa","England"),
homeOrAway=c("all"))
df=getPlayerDataOppnHA(infile="jadejaT20HA.csv",outfile="jadejaT20-SLAusSAEng.csv"
,opposition=c("Sri Lanka","Australia", "South Africa","England"),
homeOrAway=c("all"))
frames=list("shakibT20-SLAusSAEng.csv","bumrahT20-SLAusSAEng.csv","jadejaT20-SLAusSAEng.csv")
names=list("Shakib-SLAusSAEng","Bumrah-SLAusSAEng","Jadeja-SLAusSAEng")
relativeBowlerCumulativeAvgEconRate(frames,names)
relativeBowlerCumulativeAvgWickets(frames,names)
By getting the homeOrAway data for players using the profileNo, you can slice and dice the data based on your choice of opposition and on whether the matches were played at home, away or neutral venues. Finally, by specifying the period for which the data has to be subsetted, you can create fine-grained analyses.
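The three kinds of slicing described above (opposition, venue type and period) can be pictured as one combined row filter. The base R sketch below is a conceptual stand-in only, not how getPlayerDataOppnHA is implemented: the helper `sliceInnings`, the column names (`Opposition`, `Venue`, `StartDate`, `Runs`) and every value in the table are hypothetical.

```r
# Made-up match rows; column names are assumptions, not cricketr's schema
matches <- data.frame(
  Opposition = c("England", "Australia", "England", "Sri Lanka"),
  Venue = c("away", "home", "home", "neutral"),
  StartDate = as.Date(c("2014-07-09", "2014-12-09", "2016-11-09", "2018-09-18")),
  Runs = c(1, 115, 40, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: every filter is optional and the filters combine with AND
sliceInnings <- function(df, opposition = NULL, homeOrAway = "all",
                         startDate = NULL, endDate = NULL) {
  keep <- rep(TRUE, nrow(df))
  if (!is.null(opposition)) keep <- keep & df$Opposition %in% opposition
  if (!identical(homeOrAway, "all")) keep <- keep & df$Venue %in% homeOrAway
  if (!is.null(startDate)) keep <- keep & df$StartDate >= as.Date(startDate)
  if (!is.null(endDate)) keep <- keep & df$StartDate < as.Date(endDate)
  df[keep, , drop = FALSE]
}

# Away matches against England during 2014
engAway2014 <- sliceInnings(matches, opposition = "England", homeOrAway = "away",
                            startDate = "2014-01-01", endDate = "2015-01-01")

# All matches that were not played at home, with no date restriction
notHome <- sliceInnings(matches, homeOrAway = c("away", "neutral"))

print(engAway2014)
print(notHome)
```

Leaving a filter at its default keeps every row for that dimension, which mirrors calls above that pass only dates or only an opposition list.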
Hope you have a great time with cricketr!!!