Sachin Tendulkar (SRT) holds many records in both test matches and One Day Internationals (ODIs). For example, he currently has the most runs and the most centuries in both forms of cricket. He was also the first to score a double hundred in ODIs (that score has since been overtaken). He retired after playing his two-hundredth test match. So his highest ODI score, 200, equals the number of test matches he played, and I knew of this “coincidence” record. For a long time I wondered if it was unique, knowing full well that it was unlikely to be. The question: are there other players whose highest score in ODIs equals the number of test matches they have played? Earlier this week, during some idle moments, I went ahead and checked.
I used the wonderful cricinfo database, which has a very usable API. It does involve a little HTML scraping and fetching tables in multiple small “pages”, but that task, along with joining the pages, is easily handled by simple R functions. So here goes.
We start by defining three functions to help us fetch the test match (class = 1) and ODI (class = 2) batting (type = batting) records.
classpage(class,page) generates the URL for a single page of the given class. tablepart(url) fetches that URL and extracts the relevant table from it. cattables(class,maxpage) calls tablepart for pages 1 to maxpage of that class and concatenates the resulting tables.
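library(XML)
# Build the URL for one results page of the given class (1 = tests, 2 = ODIs).
classpage = function(class,page){
  url1 = "http://stats.espncricinfo.com/ci/engine/stats/index.html?class="
  url2 = ";template=results;type=batting;page="
  return(paste(url1,class,url2,page,sep=""))
}
# Fetch one page and keep only the table named "Overall figures".
tablepart = function(url){
  return(readHTMLTable(url)$"Overall figures")
}
# Fetch pages 1 to maxpage for a class and stack them into one data frame.
cattables = function(class,maxpage){
  btable = tablepart(classpage(class,1))
  # seq_len(maxpage)[-1] is empty when maxpage is 1, unlike 2:maxpage
  for (y in seq_len(maxpage)[-1]){
    ptable = tablepart(classpage(class,y))
    btable = rbind(btable,ptable)
  }
  return(btable)
}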
#tests
test_table = cattables(1,58)
#odis
odi_table = cattables(2,48)
save(test_table,odi_table,file=paste("testnodi_batting_",format(Sys.time(), "%Y%m%d"),".rda",sep=""))
load(file="/users/aam/programs/R/cricket/testnodi_batting_20170209.rda")
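The page counts 58 and 48 are hard-coded and will go stale as new players make their debuts. One possible alternative, sketched below, is to keep fetching until a page comes back empty. The names fetch_all and fake_fetch are hypothetical and not part of the code above. The sketch also assumes that a request past the last page yields NULL or an empty table, which would need checking against the site.

```r
# fetch_all is a hypothetical helper: `fetch` is any function(page) that
# returns a data frame for that page, or NULL / zero rows past the end.
fetch_all = function(fetch, maxpage = 1000){
  pages = list()
  for (p in seq_len(maxpage)){
    part = fetch(p)
    if (is.null(part) || nrow(part) == 0) break
    pages[[length(pages) + 1]] = part
  }
  do.call(rbind, pages)
}

# Offline stand-in for tablepart(classpage(class, p)): three pages, then nothing.
fake_fetch = function(p){
  if (p > 3) return(NULL)
  data.frame(Player = paste0("Player ", p), Mat = p)
}

all_rows = fetch_all(fake_fetch)
nrow(all_rows)
```

With the functions defined earlier, the call would look like fetch_all(function(p) tablepart(classpage(1, p))).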
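Now we merge the two tables on the player name. After the merge, columns from the test table carry the suffix .x and columns from the ODI table carry .y. The highest scores (HS.x and HS.y) are followed by * for not-out innings, so we remove the asterisks before comparing them with match counts.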
output = merge(test_table, odi_table, by.x='Player', by.y='Player')
output$HS.x = gsub('\\*','',output$HS.x)
output$HS.y = gsub('\\*','',output$HS.y)
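To see what the merge and the clean-up do, here is a small sketch on made-up data. The players and numbers are invented for illustration only. Note that merge keeps only names present in both tables by default, so anyone who played just one format drops out.

```r
# Toy stand-ins for test_table and odi_table (invented values).
tests = data.frame(Player = c("A (X)", "B (Y)"), Mat = c("5", "2"),
                   HS = c("40*", "7"), stringsAsFactors = FALSE)
odis  = data.frame(Player = c("A (X)", "C (Z)"), Mat = c("9", "1"),
                   HS = c("5*", "3"), stringsAsFactors = FALSE)

# Inner join on Player: shared column names get the suffixes .x and .y.
toy = merge(tests, odis, by = "Player")

# Strip the not-out marker from both highest-score columns.
toy$HS.x = gsub("\\*", "", toy$HS.x)
toy$HS.y = gsub("\\*", "", toy$HS.y)

toy
```

Only "A (X)" survives the merge, with Mat.x of 5 and HS.y of 5 once the asterisk is gone.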
Now we select rows where the number of test matches played (Mat.x) equals the highest score in ODIs (HS.y). It turns out that SRT is far from unique. There are 21 such players!
L = output$Mat.x == output$HS.y
length(output[L,1])
## [1] 21
output[L,1]
## [1] Abul Hasan (BDESH) AS Joseph (WI)
## [3] B Yograj Singh (INDIA) D Ganesh (INDIA)
## [5] DS Steele (ENG) EA Moseley (WI)
## [7] GD Campbell (AUS) Harvinder Singh (INDIA)
## [9] JM Mennie (AUS) MD Wettimuny (SL)
## [11] MG Johnson (AUS) Nadeem Khan (PAK)
## [13] NV Ojha (INDIA) PIC Thompson (WI)
## [15] RR Ramdass (WI) Saleem Altaf (PAK)
## [17] SR Tendulkar (INDIA) SS Cottrell (WI)
## [19] Subashis Roy (BDESH) TJ Franklin (NZ)
## [21] W Mwayenga (ZIM)
## 2852 Levels: AB de Villiers (SA) AJ Stewart (ENG) ... VN Swamy (INDIA)
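The comparison above works on text: the scraped columns are factors (note the Levels line in the output), and R compares a factor with a character vector by label. If you would rather compare numbers, convert via as.character first, because as.integer on a factor returns the internal level codes. A small sketch on made-up values follows. The helper to_int is a hypothetical name, and the "-" entry assumes that players who never batted have a dash as their highest score.

```r
mat = factor(c("200", "73", "1"))   # match counts as scraped (factor)
hs  = c("200", "73", "-")           # highest scores, "-" assumed for "did not bat"

# Pitfall: this gives level codes, not the numbers printed on screen.
as.integer(mat)

# Hypothetical helper: go through character, turn non-numbers into NA quietly.
to_int = function(x) suppressWarnings(as.integer(as.character(x)))

# which() drops the NA comparison instead of returning an NA row.
same = which(to_int(mat) == to_int(hs))
same
```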
That is clearly a disaster for the record. Just for fun, let us do it in reverse and see how many players have a test match highest score numerically equal to the number of ODIs they have played. We know it will not include SRT, whose highest test score is 248 against 463 ODIs.
M = output$Mat.y == output$HS.x
length(output[M,1])
## [1] 20
output[M,1]
## [1] ACI Lock (ZIM) Akram Khan (BDESH) B Yograj Singh (INDIA)
## [4] DJ Terbrugge (SA) DR Smith (WI) EZ Matambanadzo (ZIM)
## [7] IG Butler (NZ) IS Gallage (SL) JP Duminy (SA)
## [10] Kabir Khan (PAK) MS Panesar (ENG) PN Webb (NZ)
## [13] PP Ojha (INDIA) RGCE Wijesuriya (SL) RSA Jayasekera (SL)
## [16] SG Law (AUS) SR Clark (AUS) Tareq Aziz (BDESH)
## [19] VR Aaron (INDIA) WKM Benjamin (WI)
## 2852 Levels: AB de Villiers (SA) AJ Stewart (ENG) ... VN Swamy (INDIA)
That is 20, an almost symmetric count. Is it possible, could it be, wouldn’t it be wonderful … Let us check!
N = (output$Mat.y == output$HS.x & output$Mat.x == output$HS.y)
output[N,]
## Player Span.x Mat.x Inns.x NO.x Runs.x HS.x Ave.x
## 137 B Yograj Singh (INDIA) 1981-1981 1 2 0 10 6 5.00
## 100.x 50.x 0.x .x Span.y Mat.y Inns.y NO.y Runs.y HS.y Ave.y BF
## 137 0 0 0 1980-1981 6 4 2 1 1 0.50 12
## SR 100.y 50.y 0.y .y
## 137 8.33 0 0 1
Wow! Yes! So there is ONE player who figures in BOTH lists. Most certainly a unique record - if only coincidental.
output[N,c("Player","Span.x","Mat.x","HS.y","Mat.y","HS.x")]
## Player Span.x Mat.x HS.y Mat.y HS.x
## 137 B Yograj Singh (INDIA) 1981-1981 1 1 6 6
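The same overlap can be found by intersecting the two name lists rather than combining the two conditions. Here is a minimal sketch using a few names copied from the lists above; the variable names are only for illustration.

```r
# A few names from each list above (not the complete lists).
first_list  = c("B Yograj Singh (INDIA)", "SR Tendulkar (INDIA)", "MG Johnson (AUS)")
second_list = c("JP Duminy (SA)", "B Yograj Singh (INDIA)", "SG Law (AUS)")

# Players present in both lists.
both = intersect(first_list, second_list)
both
```

On the real data this would be intersect(as.character(output[L,1]), as.character(output[M,1])).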
B Yograj Singh (INDIA) made his highest score of 6 in one of the two test innings he played. He came out to bat four times in his 6 ODIs and scored one run in total.
Here are the full lists for both types (ordered by respective highest scores).
For num(tests) == HS(ODIs)
revorder = sort(strtoi(output[L,"HS.y"]),decreasing = TRUE, index.return = TRUE)$ix
output[L,][revorder,c("Player","Span.x","Mat.x","HS.y","Mat.y","HS.x")]
## Player Span.x Mat.x HS.y Mat.y HS.x
## 1154 SR Tendulkar (INDIA) 1989-2013 200 200 463 248
## 730 MG Johnson (AUS) 2007-2015 73 73 153 123
## 1066 Saleem Altaf (PAK) 1967-1978 21 21 6 53
## 1207 TJ Franklin (NZ) 1983-1991 21 21 3 101
## 339 DS Steele (ENG) 1975-1976 8 8 1 106
## 261 D Ganesh (INDIA) 1997-1997 4 4 1 8
## 404 GD Campbell (AUS) 1989-1990 4 4 12 6
## 27 Abul Hasan (BDESH) 2012-2013 3 3 6 113
## 457 Harvinder Singh (INDIA) 1998-2001 3 3 16 6
## 111 AS Joseph (WI) 2016-2016 2 2 2 6
## 350 EA Moseley (WI) 1990-1990 2 2 9 26
## 715 MD Wettimuny (SL) 1983-1983 2 2 1 17
## 828 Nadeem Khan (PAK) 1993-1999 2 2 2 25
## 898 PIC Thompson (WI) 1996-1997 2 2 2 10
## 1157 SS Cottrell (WI) 2013-2014 2 2 2 5
## 137 B Yograj Singh (INDIA) 1981-1981 1 1 6 6
## 566 JM Mennie (AUS) 2016-2016 1 1 2 10
## 863 NV Ojha (INDIA) 2015-2015 1 1 1 35
## 1021 RR Ramdass (WI) 2005-2005 1 1 1 23
## 1167 Subashis Roy (BDESH) 2017-2017 1 1 1 0
## 1257 W Mwayenga (ZIM) 2005-2005 1 1 3 14
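The scores are converted to integers before sorting because, as text, "3" sorts above "21" and "200". An equivalent way to get the row order, which some may find easier to read, is order(). A small sketch on made-up rows:

```r
# Invented rows; HS is text, as in the scraped tables.
toy = data.frame(Player = c("P1", "P2", "P3"), HS = c("21", "200", "3"),
                 stringsAsFactors = FALSE)

# Text sort puts "3" first, which is not what we want.
sort(toy$HS, decreasing = TRUE)

# Numeric order: highest score first.
sorted = toy[order(as.integer(toy$HS), decreasing = TRUE), ]
sorted
```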
And for num(ODIs) == HS(tests)
revorder = sort(strtoi(output[M,"HS.x"]),decreasing = TRUE, index.return = TRUE)$ix
output[M,][revorder,c("Player","Span.x","Mat.x","HS.y","Mat.y","HS.x")]
## Player Span.x Mat.x HS.y Mat.y HS.x
## 577 JP Duminy (SA) 2008-2017 42 150 166 166
## 333 DR Smith (WI) 2004-2006 10 97 105 105
## 1270 WKM Benjamin (WI) 1987-1995 21 31 85 85
## 1098 SG Law (AUS) 1995-1995 1 110 54 54
## 71 Akram Khan (BDESH) 2000-2003 8 65 44 44
## 1151 SR Clark (AUS) 2006-2009 24 16 39 39
## 488 IG Butler (NZ) 2002-2004 8 25 26 26
## 798 MS Panesar (ENG) 2006-2013 50 13 26 26
## 918 PP Ojha (INDIA) 2009-2013 24 16 18 18
## 600 Kabir Khan (PAK) 1994-1995 4 5 10 10
## 1195 Tareq Aziz (BDESH) 2004-2004 3 11 10 10
## 1251 VR Aaron (INDIA) 2011-2015 9 6 9 9
## 35 ACI Lock (ZIM) 1995-1995 1 5 8 8
## 981 RGCE Wijesuriya (SL) 1982-1985 4 12 8 8
## 364 EZ Matambanadzo (ZIM) 1996-1999 3 5 7 7
## 137 B Yograj Singh (INDIA) 1981-1981 1 1 6 6
## 916 PN Webb (NZ) 1980-1980 2 10 5 5
## 301 DJ Terbrugge (SA) 1998-2004 7 5 4 4
## 508 IS Gallage (SL) 1999-1999 1 14 3 3
## 1030 RSA Jayasekera (SL) 1982-1982 1 17 2 2
You can also ‘grep’ for players from a given country.
Indian players with num(tests) == HS(ODIs)
output[L,][c(grep("INDIA",output[L,1])),]
## Player Span.x Mat.x Inns.x NO.x Runs.x HS.x Ave.x
## 137 B Yograj Singh (INDIA) 1981-1981 1 2 0 10 6 5.00
## 261 D Ganesh (INDIA) 1997-1997 4 7 3 25 8 6.25
## 457 Harvinder Singh (INDIA) 1998-2001 3 4 1 6 6 2.00
## 863 NV Ojha (INDIA) 2015-2015 1 2 0 56 35 28.00
## 1154 SR Tendulkar (INDIA) 1989-2013 200 329 33 15921 248 53.78
## 100.x 50.x 0.x .x Span.y Mat.y Inns.y NO.y Runs.y HS.y Ave.y BF
## 137 0 0 0 1980-1981 6 4 2 1 1 0.50 12
## 261 0 0 0 1997-1997 1 1 0 4 4 4.00 8
## 457 0 0 2 1997-2001 16 5 1 6 3 1.50 19
## 863 0 0 0 2010-2010 1 1 0 1 1 1.00 7
## 1154 51 68 14 1989-2012 463 452 41 18426 200 44.83 21367
## SR 100.y 50.y 0.y .y
## 137 8.33 0 0 1
## 261 50.00 0 0 0
## 457 31.57 0 0 1
## 863 14.28 0 0 0
## 1154 86.23 49 96 20
Indian players with num(ODIs) == HS(tests)
output[M,][c(grep("INDIA",output[M,1])),]
## Player Span.x Mat.x Inns.x NO.x Runs.x HS.x Ave.x
## 137 B Yograj Singh (INDIA) 1981-1981 1 2 0 10 6 5.00
## 918 PP Ojha (INDIA) 2009-2013 24 27 17 89 18 8.90
## 1251 VR Aaron (INDIA) 2011-2015 9 14 5 35 9 3.88
## 100.x 50.x 0.x .x Span.y Mat.y Inns.y NO.y Runs.y HS.y Ave.y BF
## 137 0 0 0 1980-1981 6 4 2 1 1 0.50 12
## 918 0 0 4 2008-2012 18 10 8 46 16 23.00 112
## 1251 0 0 1 2011-2014 9 3 2 8 6 8.00 15
## SR 100.y 50.y 0.y .y
## 137 8.33 0 0 1
## 918 41.07 0 0 0
## 1251 53.33 0 0 1
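A plain grep for "INDIA" matches the text anywhere in the Player field. A slightly stricter sketch pulls the team code out of the trailing parentheses and compares it exactly. The names below are copied from the lists above, and the variable names are only for illustration.

```r
players = c("SR Tendulkar (INDIA)", "MG Johnson (AUS)", "JP Duminy (SA)")

# Capture whatever sits inside the final pair of parentheses.
country = sub(".*\\((.*)\\)$", "\\1", players)
country

# Exact match on the extracted code.
indians = players[country == "INDIA"]
indians
```

Once the code is in its own vector, table(country) gives a count per team.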
JP Duminy (SA) is the only player in the complete second list who is still playing. He is certain to drop off the list soon though. Subashis Roy (BDESH) is in the first list and has a small window to equal Yograj Singh’s unique accomplishment. Will he?
An alternative would be to use test innings (instead of matches) and ODI highest scores. That list too has 20 players, and includes Subashis Roy (BDESH), as he has played just one innings in his solitary test.
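As a sketch of that variant, the condition simply swaps Mat.x for Inns.x. The three rows below are copied from the tables and text in this post, and the data frame name is only for illustration.

```r
# Test matches, test innings and ODI highest score for three players above.
toy = data.frame(
  Player = c("Subashis Roy (BDESH)", "B Yograj Singh (INDIA)", "SR Tendulkar (INDIA)"),
  Mat.x  = c(1, 1, 200),
  Inns.x = c(1, 2, 329),
  HS.y   = c(1, 1, 200),
  stringsAsFactors = FALSE
)

# Test innings equal to ODI highest score.
K = toy$Inns.x == toy$HS.y
toy[K, "Player"]
```

Of these three, only Subashis Roy qualifies on innings, while all three qualify on matches.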
Other things to do with it, such as checking the possible overlap with players for whom Mat.y == HS.x, are left as an exercise.