We used a public-domain dataset from Kaggle that covers international football matches played between 1872 and 2018. It contains 38,759 matches (FIFA World Cup matches, friendlies, etc.).
Our dataset looks like this:
We performed some preprocessing operations on the data:
Some dates in the dataset were written as yyyy-mm-dd and others as yyyy/mm/dd. We converted all of them to the yyyy/mm/dd format using Python and the pandas library.
So the modified dataset looks like this:
We used Excel, Python, and the pandas library to do that.
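The date normalization described above could be sketched roughly as follows. This is a minimal illustration, not the exact script we ran: the column name `date`, the helper name `normalize_dates`, and the two sample rows are assumptions made for the example.

```python
import pandas as pd


def normalize_dates(df: pd.DataFrame, column: str = "date") -> pd.DataFrame:
    """Return a copy of df in which every date in `column` uses yyyy/mm/dd.

    The column name "date" is an assumption for this sketch.
    """
    out = df.copy()
    # Swap the hyphen separator for a slash. Values that already use
    # slashes contain no hyphen, so they pass through unchanged.
    out[column] = out[column].astype(str).str.replace("-", "/", regex=False)
    return out


# Two illustrative rows, one in each of the formats found in the data.
sample = pd.DataFrame({"date": ["1872-11-30", "1873/03/08"]})
normalized = normalize_dates(sample)
print(normalized["date"].tolist())  # ['1872/11/30', '1873/03/08']
```

A plain string replacement is enough here because both formats already share the same year, month, day order; only the separator differs.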
We also derived a new dataset from the original one. It looks like this:
We used Tableau to analyze the data and get insights from it.
We also wrote a Python program that takes the names of two teams as input and outputs a summary of all the matches between those two teams.
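A head-to-head summary of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the program itself: the column names `home_team`, `away_team`, `home_score`, and `away_score`, the function name `head_to_head`, and the placeholder teams and scores are all hypothetical.

```python
import pandas as pd


def head_to_head(df: pd.DataFrame, team_a: str, team_b: str) -> dict:
    """Summarize all matches between team_a and team_b.

    Column names are assumptions for this sketch.
    """
    # Keep matches between the two teams, regardless of who played at home.
    a_hosts = (df["home_team"] == team_a) & (df["away_team"] == team_b)
    b_hosts = (df["home_team"] == team_b) & (df["away_team"] == team_a)
    games = df[a_hosts | b_hosts]

    # Re-express each score from team_a's point of view.
    a_is_home = games["home_team"] == team_a
    goals_a = games["home_score"].where(a_is_home, games["away_score"])
    goals_b = games["away_score"].where(a_is_home, games["home_score"])

    return {
        "matches": int(len(games)),
        f"{team_a}_wins": int((goals_a > goals_b).sum()),
        f"{team_b}_wins": int((goals_b > goals_a).sum()),
        "draws": int((goals_a == goals_b).sum()),
        f"{team_a}_goals": int(goals_a.sum()),
        f"{team_b}_goals": int(goals_b.sum()),
    }


# Placeholder teams and made-up scores, only to exercise the function.
sample = pd.DataFrame({
    "home_team": ["A", "B", "A", "C"],
    "away_team": ["B", "A", "B", "A"],
    "home_score": [2, 1, 0, 3],
    "away_score": [1, 1, 2, 0],
})
summary = head_to_head(sample, "A", "B")
print(summary)
```

With the placeholder data, the summary covers three matches between A and B: one win each and one draw, with 3 goals for A and 4 for B. The match involving team C is ignored.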
