Background

Sachin Tendulkar (SRT) holds many records in both test matches and One Day Internationals (ODIs). For example, he currently has the most runs and the most centuries in both forms of cricket. He was also the first to reach a double hundred in ODIs (that score has since been overtaken). He retired after playing his two-hundredth test match, so his highest ODI score (200) equals the number of test matches he played (200). I had known of this “coincidence” record for a long time and wondered if it was unique, knowing full well that it was unlikely to be. The question: are there other players whose highest score in ODIs equals the number of test matches they have played? Earlier this week, during some idle moments, I went ahead and checked.

I used the wonderful cricinfo database, which has a very usable API. It does involve a little HTML scraping and fetching the tables in multiple small “pages”, but that task, along with joining the pages, is easily handled by simple R functions. So here goes.

We start by defining three functions to help us fetch the test match (class = 1) and ODI (class = 2) batting (type = batting) records.

classpage(class,page) generates the URL for a single page of the given class. tablepart(url) fetches the URL and extracts the relevant table from it. cattables(class,maxpage) calls tablepart for pages 1 to maxpage of that class and concatenates the resulting tables.

Define the functions

library(XML)

classpage = function(class,page){
    url1 = "http://stats.espncricinfo.com/ci/engine/stats/index.html?class="
    url2 = ";template=results;type=batting;page="
    return(paste(url1,class,url2,page,sep=""))
    }

tablepart = function(url){
    return(readHTMLTable(url)$"Overall figures")
    }

cattables = function(class,maxpage){
    # Fetch pages 1..maxpage and stack them into one data frame.
    # seq_len() avoids the 2:1 countdown that 2:maxpage produces when maxpage is 1.
    pages = lapply(seq_len(maxpage), function(y) tablepart(classpage(class,y)))
    return(do.call(rbind,pages))
    }
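As a quick offline check of the URL builder, here is a small sketch that only constructs the URLs and does not fetch anything. The function is repeated so the snippet runs on its own; the variable name urls is just for illustration.

```r
# Same URL builder as above, repeated so this snippet is self-contained.
classpage <- function(class, page) {
    url1 <- "http://stats.espncricinfo.com/ci/engine/stats/index.html?class="
    url2 <- ";template=results;type=batting;page="
    paste(url1, class, url2, page, sep = "")
}

# paste() is vectorised, so one call builds the URLs for pages 1 to 3
# of the test (class = 1) batting table.
urls <- classpage(1, 1:3)
print(urls)
```

Printing the URLs before running cattables is a cheap way to confirm that the class and page numbers end up where you expect them.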

Downloading data

#tests
test_table = cattables(1,58)
#odis
odi_table = cattables(2,48)
save(test_table,odi_table,file=paste("testnodi_batting_",format(Sys.time(), "%Y%m%d"),".rda",sep=""))

Optional load() step (when you want to skip downloading). Replace with the correct dump file.

load(file="/users/aam/programs/R/cricket/testnodi_batting_20170209.rda")

Merge tables and select rows

Now we merge the two tables on the player name. Columns that appear in both tables get the suffix .x (test) or .y (ODI). The highest scores (HS.x and HS.y) are followed by * for not-out innings, so we remove the asterisks before doing any comparison.

output = merge(test_table, odi_table, by.x='Player', by.y='Player')
output$HS.x = gsub('\\*','',output$HS.x)
output$HS.y = gsub('\\*','',output$HS.y)
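To see what the merge and the asterisk removal do, here is a minimal sketch on two invented toy tables. The names test_toy, odi_toy and toy, and all the values in them, are made up for illustration and are not part of the scraped data.

```r
# Toy stand-ins for the scraped tables (all values are invented).
test_toy <- data.frame(Player = c("A (X)", "B (Y)", "C (Z)"),
                       Mat = c("3", "10", "1"),
                       HS  = c("45*", "101", "7"),
                       stringsAsFactors = FALSE)
odi_toy <- data.frame(Player = c("A (X)", "B (Y)", "D (W)"),
                      Mat = c("12", "101", "4"),
                      HS  = c("3*", "88", "20"),
                      stringsAsFactors = FALSE)

# merge() keeps only players present in both tables and suffixes the
# clashing column names with .x (first table) and .y (second table).
toy <- merge(test_toy, odi_toy, by = "Player")

# Strip the not-out marker, then convert everything to integers.
toy$HS.x  <- as.integer(gsub("*", "", toy$HS.x, fixed = TRUE))
toy$HS.y  <- as.integer(gsub("*", "", toy$HS.y, fixed = TRUE))
toy$Mat.x <- as.integer(toy$Mat.x)
toy$Mat.y <- as.integer(toy$Mat.y)
print(toy)
```

Players C and D drop out because each appears in only one table, which is also why the merged table only covers players who have played both formats.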

Now we select rows where the number of test matches played (Mat.x) equals the highest score in ODIs (HS.y). It turns out that SRT is far from unique. There are 21 such players!

L = output$Mat.x == output$HS.y
length(output[L,1])
## [1] 21
output[L,1]
##  [1] Abul Hasan (BDESH)      AS Joseph (WI)         
##  [3] B Yograj Singh (INDIA)  D Ganesh (INDIA)       
##  [5] DS Steele (ENG)         EA Moseley (WI)        
##  [7] GD Campbell (AUS)       Harvinder Singh (INDIA)
##  [9] JM Mennie (AUS)         MD Wettimuny (SL)      
## [11] MG Johnson (AUS)        Nadeem Khan (PAK)      
## [13] NV Ojha (INDIA)         PIC Thompson (WI)      
## [15] RR Ramdass (WI)         Saleem Altaf (PAK)     
## [17] SR Tendulkar (INDIA)    SS Cottrell (WI)       
## [19] Subashis Roy (BDESH)    TJ Franklin (NZ)       
## [21] W Mwayenga (ZIM)       
## 2852 Levels: AB de Villiers (SA) AJ Stewart (ENG) ... VN Swamy (INDIA)
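The selection above appears to compare the two columns as text, since the scraped values are not numeric. A slightly more defensive variant converts both sides to integers first. The sketch below uses invented vectors, and it assumes that a player who never batted shows up with "-" as the highest score; I have not verified that against the scraped table.

```r
# Invented example values, already stripped of "*".
mat <- c("200", "73", "5", "2")
hs  <- c("200", "73", "-", "2")   # "-" is an assumed placeholder for "did not bat"

# Non-numeric entries become NA; the warning is expected, so silence it.
hs_num  <- suppressWarnings(as.integer(hs))
mat_num <- as.integer(mat)

# Treat NA as "no match" and count the matches with sum().
matches <- !is.na(hs_num) & mat_num == hs_num
print(sum(matches))
```

Counting with sum() on the logical vector gives the same number as taking the length of the selected rows, as long as there are no NA values in the selection.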

Introspection and next steps

That is clearly a disaster. SRT is far from unique. Just for fun, let us do it in reverse and see how many players have their highest test score numerically equal to the number of ODIs they have played. We know it will not include SRT (his highest test score is 248 and he played 463 ODIs).

M = output$Mat.y == output$HS.x
length(output[M,1])
## [1] 20
output[M,1]
##  [1] ACI Lock (ZIM)         Akram Khan (BDESH)     B Yograj Singh (INDIA)
##  [4] DJ Terbrugge (SA)      DR Smith (WI)          EZ Matambanadzo (ZIM) 
##  [7] IG Butler (NZ)         IS Gallage (SL)        JP Duminy (SA)        
## [10] Kabir Khan (PAK)       MS Panesar (ENG)       PN Webb (NZ)          
## [13] PP Ojha (INDIA)        RGCE Wijesuriya (SL)   RSA Jayasekera (SL)   
## [16] SG Law (AUS)           SR Clark (AUS)         Tareq Aziz (BDESH)    
## [19] VR Aaron (INDIA)       WKM Benjamin (WI)     
## 2852 Levels: AB de Villiers (SA) AJ Stewart (ENG) ... VN Swamy (INDIA)

That is 20, an almost symmetric count. Is it possible, could it be, wouldn’t it be wonderful … Let us check!

N = (output$Mat.y == output$HS.x & output$Mat.x == output$HS.y)
output[N,]
##                     Player    Span.x Mat.x Inns.x NO.x Runs.x HS.x Ave.x
## 137 B Yograj Singh (INDIA) 1981-1981     1      2    0     10    6  5.00
##     100.x 50.x 0.x .x    Span.y Mat.y Inns.y NO.y Runs.y HS.y Ave.y BF
## 137     0    0   0    1980-1981     6      4    2      1    1  0.50 12
##       SR 100.y 50.y 0.y .y
## 137 8.33     0    0   1

Wow! Yes! So there is ONE player who figures in BOTH lists. Most certainly a unique record - if only coincidental.

output[N,c("Player","Span.x","Mat.x","HS.y","Mat.y","HS.x")]
##                     Player    Span.x Mat.x HS.y Mat.y HS.x
## 137 B Yograj Singh (INDIA) 1981-1981     1    1     6    6
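The same overlap can also be found by intersecting the player names of the two lists. Here is a small sketch using a handful of names copied from the two lists above; the vectors first and second are illustrative subsets, not the full lists.

```r
# A few names from each list, copied from the output above.
first  <- c("B Yograj Singh (INDIA)", "SR Tendulkar (INDIA)", "MG Johnson (AUS)")
second <- c("B Yograj Singh (INDIA)", "JP Duminy (SA)", "SG Law (AUS)")

# intersect() returns the names that occur in both vectors.
both <- intersect(first, second)
print(both)
```

On the real data, something like intersect(as.character(output[L,1]), as.character(output[M,1])) should give the same single name as the combined logical condition.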

B Yograj Singh (INDIA) played a single test, and his highest score in its two innings was 6. He came out to bat four times in his 6 ODIs and scored one run in total, so his highest ODI score is 1.

Supplement

Here are the full lists for both types (ordered by respective highest scores).

For num(tests) == HS(ODIs)

revorder = sort(strtoi(output[L,"HS.y"]),decreasing = TRUE, index.return = TRUE)$ix
output[L,][revorder,c("Player","Span.x","Mat.x","HS.y","Mat.y","HS.x")]
##                       Player    Span.x Mat.x HS.y Mat.y HS.x
## 1154    SR Tendulkar (INDIA) 1989-2013   200  200   463  248
## 730         MG Johnson (AUS) 2007-2015    73   73   153  123
## 1066      Saleem Altaf (PAK) 1967-1978    21   21     6   53
## 1207        TJ Franklin (NZ) 1983-1991    21   21     3  101
## 339          DS Steele (ENG) 1975-1976     8    8     1  106
## 261         D Ganesh (INDIA) 1997-1997     4    4     1    8
## 404        GD Campbell (AUS) 1989-1990     4    4    12    6
## 27        Abul Hasan (BDESH) 2012-2013     3    3     6  113
## 457  Harvinder Singh (INDIA) 1998-2001     3    3    16    6
## 111           AS Joseph (WI) 2016-2016     2    2     2    6
## 350          EA Moseley (WI) 1990-1990     2    2     9   26
## 715        MD Wettimuny (SL) 1983-1983     2    2     1   17
## 828        Nadeem Khan (PAK) 1993-1999     2    2     2   25
## 898        PIC Thompson (WI) 1996-1997     2    2     2   10
## 1157        SS Cottrell (WI) 2013-2014     2    2     2    5
## 137   B Yograj Singh (INDIA) 1981-1981     1    1     6    6
## 566          JM Mennie (AUS) 2016-2016     1    1     2   10
## 863          NV Ojha (INDIA) 2015-2015     1    1     1   35
## 1021         RR Ramdass (WI) 2005-2005     1    1     1   23
## 1167    Subashis Roy (BDESH) 2017-2017     1    1     1    0
## 1257        W Mwayenga (ZIM) 2005-2005     1    1     3   14
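The ordering above uses sort() with index.return. An equivalent and perhaps more common idiom is order(). Here is a sketch on four rows taken from the table above; the data frame toy is a hand-typed subset for illustration.

```r
# Four rows copied by hand from the list above, in arbitrary order.
toy <- data.frame(Player = c("B Yograj Singh (INDIA)", "MG Johnson (AUS)",
                             "Saleem Altaf (PAK)", "SR Tendulkar (INDIA)"),
                  HS.y = c("1", "73", "21", "200"),
                  stringsAsFactors = FALSE)

# Convert to integer before ordering so the scores sort as numbers, not text.
ranked <- toy[order(as.integer(toy$HS.y), decreasing = TRUE), ]
print(ranked)
```

The integer conversion matters here because the highest scores are stored as text after the asterisks are removed.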

And for num(ODIs) == HS(tests)

revorder = sort(strtoi(output[M,"HS.x"]),decreasing = TRUE, index.return = TRUE)$ix
output[M,][revorder,c("Player","Span.x","Mat.x","HS.y","Mat.y","HS.x")]
##                      Player    Span.x Mat.x HS.y Mat.y HS.x
## 577          JP Duminy (SA) 2008-2017    42  150   166  166
## 333           DR Smith (WI) 2004-2006    10   97   105  105
## 1270      WKM Benjamin (WI) 1987-1995    21   31    85   85
## 1098           SG Law (AUS) 1995-1995     1  110    54   54
## 71       Akram Khan (BDESH) 2000-2003     8   65    44   44
## 1151         SR Clark (AUS) 2006-2009    24   16    39   39
## 488          IG Butler (NZ) 2002-2004     8   25    26   26
## 798        MS Panesar (ENG) 2006-2013    50   13    26   26
## 918         PP Ojha (INDIA) 2009-2013    24   16    18   18
## 600        Kabir Khan (PAK) 1994-1995     4    5    10   10
## 1195     Tareq Aziz (BDESH) 2004-2004     3   11    10   10
## 1251       VR Aaron (INDIA) 2011-2015     9    6     9    9
## 35           ACI Lock (ZIM) 1995-1995     1    5     8    8
## 981    RGCE Wijesuriya (SL) 1982-1985     4   12     8    8
## 364   EZ Matambanadzo (ZIM) 1996-1999     3    5     7    7
## 137  B Yograj Singh (INDIA) 1981-1981     1    1     6    6
## 916            PN Webb (NZ) 1980-1980     2   10     5    5
## 301       DJ Terbrugge (SA) 1998-2004     7    5     4    4
## 508         IS Gallage (SL) 1999-1999     1   14     3    3
## 1030    RSA Jayasekera (SL) 1982-1982     1   17     2    2

You can also ‘grep’ for players from a given country.

Indian players with num(tests) == HS(ODIs)

output[L,][c(grep("INDIA",output[L,1])),]
##                       Player    Span.x Mat.x Inns.x NO.x Runs.x HS.x Ave.x
## 137   B Yograj Singh (INDIA) 1981-1981     1      2    0     10    6  5.00
## 261         D Ganesh (INDIA) 1997-1997     4      7    3     25    8  6.25
## 457  Harvinder Singh (INDIA) 1998-2001     3      4    1      6    6  2.00
## 863          NV Ojha (INDIA) 2015-2015     1      2    0     56   35 28.00
## 1154    SR Tendulkar (INDIA) 1989-2013   200    329   33  15921  248 53.78
##      100.x 50.x 0.x .x    Span.y Mat.y Inns.y NO.y Runs.y HS.y Ave.y    BF
## 137      0    0   0    1980-1981     6      4    2      1    1  0.50    12
## 261      0    0   0    1997-1997     1      1    0      4    4  4.00     8
## 457      0    0   2    1997-2001    16      5    1      6    3  1.50    19
## 863      0    0   0    2010-2010     1      1    0      1    1  1.00     7
## 1154    51   68  14    1989-2012   463    452   41  18426  200 44.83 21367
##         SR 100.y 50.y 0.y .y
## 137   8.33     0    0   1   
## 261  50.00     0    0   0   
## 457  31.57     0    0   1   
## 863  14.28     0    0   0   
## 1154 86.23    49   96  20

Indian players with num(ODIs) == HS(tests)

output[M,][c(grep("INDIA",output[M,1])),]
##                      Player    Span.x Mat.x Inns.x NO.x Runs.x HS.x Ave.x
## 137  B Yograj Singh (INDIA) 1981-1981     1      2    0     10    6  5.00
## 918         PP Ojha (INDIA) 2009-2013    24     27   17     89   18  8.90
## 1251       VR Aaron (INDIA) 2011-2015     9     14    5     35    9  3.88
##      100.x 50.x 0.x .x    Span.y Mat.y Inns.y NO.y Runs.y HS.y Ave.y  BF
## 137      0    0   0    1980-1981     6      4    2      1    1  0.50  12
## 918      0    0   4    2008-2012    18     10    8     46   16 23.00 112
## 1251     0    0   1    2011-2014     9      3    2      8    6  8.00  15
##         SR 100.y 50.y 0.y .y
## 137   8.33     0    0   1   
## 918  41.07     0    0   0   
## 1251 53.33     0    0   1
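One caveat with a plain grep on the country name is that it could, in principle, also match part of a player's name. A slightly stricter sketch matches the parenthesised country code literally. The vector players below is a hand-typed sample from the lists above.

```r
# A hand-typed sample of player names from the lists above.
players <- c("SR Tendulkar (INDIA)", "MG Johnson (AUS)", "NV Ojha (INDIA)")

# fixed = TRUE treats the pattern as a literal string, so the
# parentheses are matched as characters rather than as a regex group.
is_indian <- grepl("(INDIA)", players, fixed = TRUE)
print(players[is_indian])
```

grepl() returns a logical vector, which can be combined with the other conditions using & before subsetting.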

Future

JP Duminy (SA) is the only player in the complete second list who is still playing. He is certain to drop off the list soon, though. Subashis Roy (BDESH) is in the first list and has a small window to equal Yograj Singh’s unique accomplishment. Will he?

Exercise

An alternative would be to compare the number of test innings (instead of matches) with ODI highest scores. That list too has 20 players, and includes Subashis Roy (BDESH), as he has batted just once in his solitary test.
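The innings variant only needs a different column in the comparison. Here is a minimal sketch on three rows whose values are taken from the tables and text above; the data frame toy and the variable P are names chosen for illustration.

```r
# Test innings (Inns.x) and ODI highest score (HS.y) for three players,
# with values copied from the tables and text above.
toy <- data.frame(Player = c("Subashis Roy (BDESH)", "B Yograj Singh (INDIA)",
                             "SR Tendulkar (INDIA)"),
                  Inns.x = c(1L, 2L, 329L),
                  HS.y   = c(1L, 1L, 200L),
                  stringsAsFactors = FALSE)

# Select players whose number of test innings equals their ODI highest score.
P <- toy$Inns.x == toy$HS.y
print(toy[P, "Player"])
```

On the merged table the analogous condition would be output$Inns.x == output$HS.y, subject to the same text-versus-number caveats as before.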

Other things to do with it, like finding the possible overlap of these players with the Mat.y == HS.x list, are left as an exercise.