2023-10-16

Data Science in Cricket

-Performance Analysis:

Data science supports player performance analysis by tracking statistics such as batting averages, bowling strike rates, and fielding effectiveness. These evaluations help identify players’ strengths and limitations, so they can hone their skills and raise their level of play.

-Statistical Modelling:

Advances in data science make it possible to anticipate player performance, match results, and even tactical choices made during a game. Predictive modelling uses historical data and a variety of statistical methods to support strategic planning and decision-making.

About this Project

In this presentation we look at the Asia Cup, a cricket tournament contested only by Asian countries. The data set records details such as which team chose to bowl or bat first, who scored the most runs, and who took the most wickets. We will try to apply linear regression to this data.

Normal Distribution of Batting Strike Rate

## Warning: Ignoring 2 observations

Analysis of the Normal Distribution

We used this formula to calculate the batting strike rate for each match: \[ \text{Strike Rate} = \left( \frac{\text{Total Runs Scored}}{\text{Total Balls Faced}} \right) \times 100 \] The previous slide showed that the strike rate follows a normal distribution, so we can include it in our simple linear regression formula. We can also find another variable to use alongside it in our regression graph.
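As a small illustration of the strike rate formula, here is a minimal Python sketch. The function name `strike_rate` and the example innings are hypothetical and are not taken from the Asia Cup data set. The original slides appear to have been produced with R, so this is only an illustration of the arithmetic.

```python
def strike_rate(runs_scored: float, balls_faced: float) -> float:
    """Batting strike rate: runs scored per 100 balls faced."""
    if balls_faced <= 0:
        raise ValueError("balls_faced must be positive")
    return runs_scored / balls_faced * 100


# Hypothetical innings (not from the data set): 250 runs off 300 balls.
example_strike_rate = strike_rate(250, 300)
print(round(example_strike_rate, 2))
```

A strike rate of 100 means one run per ball on average; values below 100 mean fewer runs than balls faced.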

Let’s have a look at Run Rate.

Distribution of Run Rate:-

Analysis of Run Rate of matches:-

The run rate of a match is generally calculated with this formula: \[ \text{Run Rate} = \frac{\text{Total Runs Scored}}{\text{Total Overs Bowled}} \] In our data set, Total Overs Bowled is 50. The previous slide showed that the run rate also follows a normal distribution, so we can include it in our simple linear regression formula. The next slide shows the relationship between a team’s strike rate and run rate.
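The run rate formula can be sketched the same way. The name `run_rate`, the constant `OVERS_PER_INNINGS`, and the example total are assumptions made for illustration. Only the figure of 50 overs comes from the slide.

```python
OVERS_PER_INNINGS = 50  # stated on the slide for this data set


def run_rate(runs_scored: float, overs_bowled: float = OVERS_PER_INNINGS) -> float:
    """Run rate: runs scored per over bowled."""
    if overs_bowled <= 0:
        raise ValueError("overs_bowled must be positive")
    return runs_scored / overs_bowled


# Hypothetical team total (not from the data set): 250 runs in 50 overs.
example_run_rate = run_rate(250)
print(example_run_rate)
```

With the number of overs fixed at 50, the run rate is simply the team total divided by 50, so it rises and falls directly with the total score.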

Linear Regression

## `geom_smooth()` using formula = 'y ~ x'

Analysis of Linear Regression

The graph on the previous slide shows that the average batting strike rate increases as the run rate increases. It is also a useful way to look at how many runs can be scored in 50 overs: the higher the run rate, the more runs are scored.
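To make the idea of fitting a line concrete, here is a minimal sketch of simple linear regression by ordinary least squares in plain Python. It is not the code behind the slides (the `geom_smooth()` message suggests those plots came from R's ggplot2), and the run rate and strike rate values below are made up for illustration, not taken from the Asia Cup data set. The helper name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustrative values, NOT from the Asia Cup data set.
run_rates = [4.0, 4.5, 5.0, 5.5, 6.0]
strike_rates = [68.0, 75.0, 84.0, 91.0, 100.0]

slope, intercept = fit_line(run_rates, strike_rates)
print(f"strike_rate = {slope:.2f} * run_rate + {intercept:.2f}")
```

A positive slope corresponds to the pattern described on the slide: as the run rate goes up, the fitted strike rate goes up with it.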

The analysis in these 9 slides shows that data science plays an important role in cricket and, more broadly, in any sport. This is why many sports teams now use statistics to improve their game plans.