Akshay Prasannan - S3818611
Last updated: 29 October, 2020
*Cricket is the second most popular sport after football, with over 1 billion fans across the world, and the World Cup is its biggest stage. (1)
*Each cricket team consists of 11 players: batsmen, bowlers, fielders and one wicket-keeper.
*The Indian Premier League (IPL) is a popular professional twenty-over cricket league held in India every year since 2008, attracting about 269 million viewers.
*A batsman’s batting style is either left-handed or right-handed, according to his natural ability.
*Strike rate is one of the key factors considered when assessing the performance of a batsman.
*One study reports that about 10.6% of the world population is left-handed. (2)
*By comparison, research on the top 8 international teams suggests that 30% of the players batting in positions 1 to 6 are left-handed. (3)
*This study investigates data obtained from Kaggle (4), https://www.kaggle.com/ramjidoolla/ipl-data-set, to evaluate the relationship between the strike rates of left-handed and right-handed batsmen.
*The aim is to determine whether the difference between the batting strike rates of left-handed and right-handed batsmen is statistically significant.
*A two-sample t-test will be conducted to test whether the difference in population mean strike rate between left-handed and right-handed batsmen is statistically significant.
*The data for this study were extracted from Kaggle: https://www.kaggle.com/ramjidoolla/ipl-data-set
*The datasets Players and most_runs_average_strikerate were used to conduct this study.
*The two datasets were merged using the merge() function.
*The merged dataset (PlayerSR) consists of 10 variables, of which the following two are used in this study:
1 Batting_Hand: factor describing the player’s batting style (Right_Hand or Left_Hand)
2 strikerate: the player’s batting strike rate
The variable Batting_Hand was converted into a factor.
The levels of the factor are Left_Hand and Right_Hand.
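The merge and factor conversion described above might look roughly like the following sketch. The two small data frames are toy stand-ins built from values in the preview table of this report, and the join key `Player_Name` and the object name `PlayerSR_demo` are assumptions made for illustration, not the original code.

```r
# Toy stand-ins for the two Kaggle tables (values copied from the preview
# table in this report; using Player_Name as the join key is an assumption).
players <- data.frame(
  Player_Name  = c("A Ashish Reddy", "A Chandila", "A Chopra"),
  Batting_Hand = c("Right_Hand", "Right_Hand", "Right_Hand"),
  stringsAsFactors = FALSE
)
strike <- data.frame(
  Player_Name = c("A Ashish Reddy", "A Chandila", "A Chopra"),
  strikerate  = c(146.59686, 57.14286, 74.64789),
  stringsAsFactors = FALSE
)

# Inner join of the two tables on the shared key column
PlayerSR_demo <- merge(strike, players, by = "Player_Name")

# Convert batting hand to a factor, declaring both levels explicitly so that
# Left_Hand exists as a level even if it is absent from the toy data
PlayerSR_demo$Batting_Hand <- factor(PlayerSR_demo$Batting_Hand,
                                     levels = c("Left_Hand", "Right_Hand"))
levels(PlayerSR_demo$Batting_Hand)
```

Declaring the levels explicitly keeps the group order stable in later summaries and plots.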
*Histograms and a boxplot of the strike rates of right-handed and left-handed batsmen were drawn for a general overview.
*The variable names and the first few observations are given below.
| Player_Name | total_runs | out | numberofballs | average | strikerate | DOB | Batting_Hand | Bowling_Skill | Country |
|---|---|---|---|---|---|---|---|---|---|
| A Ashish Reddy | 280 | 15 | 191 | 18.66667 | 146.59686 | 1991-02-24 | Right_Hand | Right-arm medium | India |
| A Chandila | 4 | 1 | 7 | 4.00000 | 57.14286 | 1983-12-05 | Right_Hand | Right-arm offbreak | India |
| A Chopra | 53 | 5 | 71 | 10.60000 | 74.64789 | 1977-09-19 | Right_Hand | Right-arm offbreak | India |
| A Choudhary | 25 | 2 | 20 | 12.50000 | 125.00000 | NA | Right_Hand | Left-arm fast-medium | NA |
| A Dananjaya | 4 | 0 | 5 | NA | 80.00000 | NA | Right_Hand | Right-arm offbreak | NA |
| A Flintoff | 62 | 2 | 53 | 31.00000 | 116.98113 | 1977-12-06 | Right_Hand | Right-arm fast-medium | England |
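The strikerate column in the rows above is consistent with the usual definition of batting strike rate, runs scored per 100 balls faced. A minimal check using only the six rows shown (the vector names are illustrative):

```r
# Values copied from the first six rows of the table above
total_runs    <- c(280, 4, 53, 25, 4, 62)
numberofballs <- c(191, 7, 71, 20, 5, 53)
reported_sr   <- c(146.59686, 57.14286, 74.64789, 125.00000, 80.00000, 116.98113)

# Strike rate = runs scored per 100 balls faced
computed_sr <- 100 * total_runs / numberofballs

# Largest absolute gap between the computed and the reported values
max_gap <- max(abs(computed_sr - reported_sr))
max_gap
```

The gap is at the level of rounding in the printed table, which suggests the column was derived this way.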
library(dplyr)  # provides %>% and filter()
# Split the merged data by batting hand
PlayerSR_right <- PlayerSR %>% filter(Batting_Hand == "Right_Hand")
PlayerSR_Left <- PlayerSR %>% filter(Batting_Hand == "Left_Hand")
# Histogram of strike rate for left-handed batsmen
PlayerSR_Left$strikerate %>% hist(col="blue", xlim=c(0,200),
    xlab="Strikerate - Left handed batsman",
    main="Strikerate - Left handed batsman")
# Histogram of strike rate for right-handed batsmen
PlayerSR_right$strikerate %>% hist(col="blue", xlim=c(0,200),
    xlab="Strikerate - Right handed", main="Strikerate - Right handed batsman")
# Side-by-side boxplot of strike rate by batting hand
PlayerSR %>% boxplot(strikerate ~ Batting_Hand, data = ., ylab = "Strike Rate", col="yellow", main="Strike rate of left and right handed batsman")
*Summary of the variable used in the analysis: strike rate of left-handed and right-handed batsmen
PlayerSR %>% group_by(Batting_Hand) %>% summarise(Min = min(strikerate, na.rm = TRUE),
    Q1 = quantile(strikerate, probs = .25, na.rm = TRUE),
    Median = median(strikerate, na.rm = TRUE),
    Q3 = quantile(strikerate, probs = .75, na.rm = TRUE),
    Max = max(strikerate, na.rm = TRUE),
    Mean = mean(strikerate, na.rm = TRUE),
    SD = sd(strikerate, na.rm = TRUE),
    n = n(),
    Missing = sum(is.na(strikerate))) -> table1
knitr::kable(table1)
| Batting_Hand | Min | Q1 | Median | Q3 | Max | Mean | SD | n | Missing |
|---|---|---|---|---|---|---|---|---|---|
| Left_Hand | 0 | 98.14815 | 119.4805 | 133.2732 | 172.7273 | 112.0160 | 33.27622 | 133 | 0 |
| Right_Hand | 0 | 78.98981 | 109.3220 | 129.4414 | 250.0000 | 103.1476 | 41.41030 | 383 | 0 |
## [1] 2 126
## [1] 58 172
*Homogeneity of variance, i.e. the assumption of equal variances, was then checked using Levene’s test.
*Levene’s test has the following statistical hypotheses:
\[H_0: \sigma_1^2 = \sigma_2^2 \]
\[H_A: \sigma_1^2 \ne \sigma_2^2\]
*Levene’s test returned a p-value of 0.003.
*Levene’s test is therefore statistically significant (p < 0.05), so it is not safe to assume equal variances.
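The report does not show the call used for Levene’s test. One common route is `car::leveneTest(strikerate ~ Batting_Hand, data = PlayerSR)`, which is an assumption here rather than the confirmed method. The sketch below shows the same idea in base R on simulated data (not the IPL data): the median-centred version of the test is a one-way ANOVA on each observation’s absolute deviation from its group median.

```r
# Simulated data for illustration only: two groups with the same mean but
# clearly different spread (these are NOT the IPL strike rates).
set.seed(1)
demo <- data.frame(
  strikerate   = c(rnorm(40, mean = 110, sd = 10),
                   rnorm(40, mean = 110, sd = 40)),
  Batting_Hand = factor(rep(c("Left_Hand", "Right_Hand"), each = 40))
)

# Absolute deviation of each observation from its own group median
group_median <- ave(demo$strikerate, demo$Batting_Hand, FUN = median)
demo$abs_dev <- abs(demo$strikerate - group_median)

# Median-centred Levene's test = one-way ANOVA on the absolute deviations
levene_fit <- anova(lm(abs_dev ~ Batting_Hand, data = demo))
levene_p   <- levene_fit[["Pr(>F)"]][1]
levene_p
```

A small p-value, as in this simulated case, indicates that the group variances differ, which is the situation that motivates an unequal-variance t-test.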
*Because equal variances cannot be assumed, a two-sample t-test assuming unequal variances (Welch’s t-test) is used.
*The two-sample t-test has the following statistical hypotheses:
\[H_0: \mu_1 = \mu_2 \]
\[H_A: \mu_1 \ne \mu_2\]
##
## Welch Two Sample t-test
##
## data: strikerate by Batting_Hand
## t = 2.4785, df = 283.79, p-value = 0.01377
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 1.825448 15.911488
## sample estimates:
## mean in group Left_Hand mean in group Right_Hand
## 112.0160 103.1476
*The p-value is 0.01377, which is less than 0.05. Therefore we reject the null hypothesis.
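As a cross-check, the Welch statistics can be rebuilt from the group means, standard deviations and sample sizes in the summary table. This is a sketch of the standard Welch formulas, not the original analysis code (with the full data, base R’s `t.test()` with `var.equal = FALSE` computes the same quantities), and the variable names are illustrative.

```r
# Summary statistics copied from the summary table in this report
mean_left  <- 112.0160; sd_left  <- 33.27622; n_left  <- 133
mean_right <- 103.1476; sd_right <- 41.41030; n_right <- 383

# Per-group variance of the sample mean
v_left  <- sd_left^2  / n_left
v_right <- sd_right^2 / n_right

# Welch t statistic: difference in means over its standard error
se     <- sqrt(v_left + v_right)
t_stat <- (mean_left - mean_right) / se

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (v_left + v_right)^2 /
  (v_left^2 / (n_left - 1) + v_right^2 / (n_right - 1))

# Two-sided p-value and 95% confidence interval for the difference in means
p_value <- 2 * pt(-abs(t_stat), df = df_welch)
ci <- (mean_left - mean_right) + c(-1, 1) * qt(0.975, df = df_welch) * se

round(c(t = t_stat, df = df_welch, p = p_value), 4)
round(ci, 3)
```

Up to rounding of the inputs, these values agree with the t, df, p-value and confidence interval in the test output above.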
*A two-sample t-test was used to test for a significant difference between the mean strike rates of right-handed and left-handed batsmen.
*The normality of strike rate for both right-handed and left-handed batsmen was checked using Q-Q plots.
*Both distributions displayed non-normality upon inspection of the normal Q-Q plots.
*Since both samples are larger than 30, the central limit theorem means the t-test can still be applied.
*Levene’s test was conducted to check homogeneity of variance. The test indicated that equal variances could not be assumed, as the p-value was less than 0.05.
*The two-sample t-test assuming unequal variances found a statistically significant difference between the mean strike rates of left-handed and right-handed batsmen, t(df = 283.79) = 2.4785, p = 0.01377, 95% CI for the difference in means [1.825448, 15.911488].
*The results of the investigation suggest that left-handed batsmen have a significantly higher average strike rate than right-handed batsmen.
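The central limit theorem argument used above to justify the t-test on non-normal data can be illustrated with a small simulation. This is a sketch on simulated, strongly skewed data, not the IPL strike rates. The sample size of 133 matches the smaller (left-handed) group, while the exponential distribution and the helper `skewness()` are assumptions chosen for illustration.

```r
set.seed(42)
n_per_sample <- 133    # size of the smaller (left-handed) group in this study
n_reps       <- 2000   # number of simulated samples

# Simple moment-based skewness (hypothetical helper, not from a package)
skewness <- function(x) mean((x - mean(x))^3) / sd(x)^3

# Draw strongly right-skewed raw data, one simulated sample per column
raw          <- rexp(n_per_sample * n_reps)
sample_means <- colMeans(matrix(raw, nrow = n_per_sample))

# Compare the skewness of the raw values with that of the sample means
skew_raw   <- skewness(raw)
skew_means <- skewness(sample_means)
c(raw = skew_raw, means = skew_means)
```

The sample means are far less skewed than the raw values, which is why the t-test on means remains reasonable at these sample sizes even when the underlying data are not normal.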
1. ICC [internet] https://www.icc-cricket.com/media-releases/759733
2. Human handedness: A meta-analysis. Psychological Bulletin. DOI: 10.1037/bul0000229
3. ICC [internet] https://www.icc-cricket.com/media-releases/759733
4. Kaggle [internet] https://www.kaggle.com/ramjidoolla/ipl-data-set