Football Analytics

Ruchita Thombare, Ayan Basu, Mitali Warty, Kumar Shivam

12/08/2021

Background

Football is one of the most popular sports on the planet and is followed closely by a large number of people. In recent years, the emergence of new types of match data, such as play-by-play data with information on each shot or pass made in a match, has pushed data science to the forefront of football.

In the race for footballing supremacy, teams are using an increasing amount of data and information to help give them an edge over the competition.

Data

Primary Data -
Secondary Data -
Additional relevant data relating to in-game statistics-

Objective

Look at the play-by-play statistics of a game to evaluate a team’s performance, instead of relying only on the “goals scored” metric from prior matches.

Combine key metrics of a team, such as its half-time scores and its overall offensive and defensive ratings (which are updated after each game), to build a model that predicts the outcome of future matches.

Leverage in-game statistics to provide a deeper and more interesting perspective than goal metrics alone, creating an opportunity to look beyond the match result itself.

Original Plan

Analytics Plan

Peer Comments Summary

Comments received from our peer teams suggested that we explore several aspects of the game, such as -

Ideas incorporated

Data Summary

Data from Primary source:

The primary source data from Competition and Matches:

Data from secondary source:
Team performance data from secondary source: Team game Statistics

Dropping NA values from the team performance data to clean our dataset.
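
As a rough illustration of this cleaning step, the sketch below drops every record that has a missing field. It is written in plain Python rather than the tooling used for this report, and the field names (`team`, `shots`, `possession`) and values are hypothetical placeholders, not the actual dataset.

```python
# Hypothetical rows of team game statistics; None marks a missing (NA) value.
rows = [
    {"team": "A", "shots": 12, "possession": 55.0},
    {"team": "B", "shots": None, "possession": 48.5},
    {"team": "C", "shots": 9, "possession": None},
    {"team": "D", "shots": 15, "possession": 61.2},
]


def drop_na(records):
    """Keep only the records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]


complete_rows = drop_na(rows)
print(f"kept {len(complete_rows)} of {len(rows)} rows")
```

Dropping whole rows is the simplest policy; it shrinks the dataset, so it is worth checking how many rows are lost before modelling.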

Final dataset to be worked upon:

Exploratory Data Analysis

  • Top 10 teams with the most AWAY ground WINS
  • Highest number of wins: HOME/AWAY/DRAW
  • Top 3 winning teams by HOME ground statistics
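
Summaries like these can be computed by labelling each match with its result and counting. The following is a minimal sketch in plain Python; the match tuples and team names are made up for illustration and do not come from the report's dataset.

```python
from collections import Counter

# Hypothetical match records: (home_team, away_team, home_goals, away_goals).
matches = [
    ("A", "B", 1, 2),
    ("B", "C", 0, 0),
    ("C", "A", 0, 3),
    ("A", "C", 2, 1),
    ("C", "B", 1, 4),
]


def result(home_goals, away_goals):
    """Label a match as a HOME win, an AWAY win, or a DRAW."""
    if home_goals > away_goals:
        return "HOME"
    if home_goals < away_goals:
        return "AWAY"
    return "DRAW"


# Number of matches per result type.
outcome_counts = Counter(result(hg, ag) for _, _, hg, ag in matches)

# Away wins per team, then the ten teams with the most away wins.
away_wins = Counter(
    away for _, away, hg, ag in matches if result(hg, ag) == "AWAY"
)
top_away = away_wins.most_common(10)

print(outcome_counts)
print(top_away)
```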

Machine Learning

    Using H2O:
    We have implemented four different models -

    Splitting for Training and Testing
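
A train/test split can be sketched as follows. This is a generic illustration in plain Python, not the split used for the models in this report; the 80/20 ratio, the seed, and the helper name `train_test_split` are assumptions.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and split it into (train, test)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Stand-in for 100 match rows.
match_ids = list(range(100))
train_set, test_set = train_test_split(match_ids)
print(len(train_set), len(test_set))
```

For match prediction, a chronological split (training on earlier matches and testing on later ones) may be a better fit than a random shuffle, since it avoids using future results to predict the past.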

    Data wrangling for Machine Learning

    Recipes and Baking
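
The idea behind a recipe is to estimate preprocessing parameters on the training data only and then apply the same transformation to any other data. The sketch below mimics that two-step pattern in plain Python for a single normalization step. The function names `prep` and `bake` are borrowed for illustration, and the choice of normalization and the sample values are assumptions, not the steps used in this report.

```python
from statistics import mean, stdev


def prep(train_values):
    """Estimate the centring and scaling parameters from training data only."""
    sd = stdev(train_values)
    return {"mean": mean(train_values), "sd": sd if sd > 0 else 1.0}


def bake(params, values):
    """Apply the already estimated parameters to any set of values."""
    return [(v - params["mean"]) / params["sd"] for v in values]


# Hypothetical feature values, e.g. shots per game.
train_shots = [2.0, 4.0, 6.0]
test_shots = [8.0]

params = prep(train_shots)
train_baked = bake(params, train_shots)
test_baked = bake(params, test_shots)  # test data reuses the training parameters
print(train_baked, test_baked)
```

Estimating the parameters on the training set alone keeps information from the test set out of the model inputs.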

    Deep Learning 1

    Deep Learning 2 with tuning

    Hyperparameter Tuning for Deep Learning

    AutoML

    Summary for Model 1

    Deep Learning M1

    Summary for Model 2

    Deep Learning M2

    Hyperparameter Tuning Results

    Summary for AutoML

    Summarizing the AutoML Results

    Conclusion