A Brief Project Report

  • I combined my love of football and data science to analyze the football dataset found on Kaggle using Python.

  • It includes extensive exploratory data analysis (EDA), regression analysis, hyperparameter tuning, and a comparison of several regression machine learning models.

  • To see the complete project and notebooks, click here.

  • The dataset on Kaggle is organized in three files:

  • events.csv contains event data about each game. Text commentary was scraped from bbc.com, espn.com and onefootball.com.
  • ginf.csv contains metadata and market odds about each game. Odds were collected from oddsportal.com.
  • dictionary.txt contains the textual description of each categorical variable that is coded with integers.
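As a rough illustration of how the event file and the game metadata file fit together, the sketch below joins event rows to their game rows on a shared game identifier. The column names (id_odsp, date, league, time, is_goal) are assumptions about the files, and the rows are made-up stand-ins kept in memory so the example runs without downloading anything.

```python
import csv
import io

# Tiny in-memory stand-ins for events.csv and ginf.csv (made-up rows).
# The column names below are assumptions, not confirmed by this report.
EVENTS_CSV = """id_odsp,time,is_goal
g1,12,0
g1,90,1
g2,45,1
"""
GINF_CSV = """id_odsp,date,league
g1,2015-08-22,E0
g2,2016-01-10,SP1
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def join_events_with_games(events, games, key="id_odsp"):
    """Attach each game's metadata to its events (an inner join on `key`)."""
    games_by_id = {game[key]: game for game in games}
    return [
        {**games_by_id[event[key]], **event}
        for event in events
        if event[key] in games_by_id
    ]


events = read_rows(EVENTS_CSV)
games = read_rows(GINF_CSV)
joined = join_events_with_games(events, games)
print(joined[1])
```

With the real files, the same functions could be fed the contents of events.csv and ginf.csv, provided the key column carries the same name in both.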

  • I have used this data to:

  • Perform exploratory data analysis of the games played.

[Figure: goal distribution by month in Europe]

[Figure: goal distribution by month in the top 5 European leagues]
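A monthly goal distribution like the ones described here can be computed by summing goals per calendar month, optionally restricted to a set of leagues. This is a minimal sketch under stated assumptions: the rows are made up, and the league codes are hypothetical labels rather than values confirmed by the dataset.

```python
from collections import Counter
from datetime import date

# Made-up (match_date, league_code, goals_in_match) rows, not real results.
# The league codes are hypothetical labels used only for illustration.
matches = [
    (date(2015, 8, 22), "E0", 3),
    (date(2015, 8, 29), "SP1", 1),
    (date(2015, 12, 5), "E0", 2),
    (date(2016, 5, 14), "XX", 4),
]


def goals_by_month(rows, leagues=None):
    """Sum goals per calendar month (1 = January, 12 = December).

    If `leagues` is given, only matches from those leagues are counted.
    """
    totals = Counter()
    for match_date, league, goals in rows:
        if leagues is None or league in leagues:
            totals[match_date.month] += goals
    return dict(sorted(totals.items()))


all_leagues = goals_by_month(matches)
selected = goals_by_month(matches, leagues={"E0", "SP1"})
print(all_leagues)
print(selected)
```

The resulting dictionaries map month numbers to goal totals and can be passed to any plotting library to draw the bar charts.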

The most goals are scored in the 90th minute of a match.

[Figure: goals scored by minute of the match]
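Finding the minute with the most goals amounts to counting goals per minute and taking the largest count. The sketch below does this on made-up minutes; it is not the notebook's code. If the data records stoppage-time goals as minute 90, that could inflate this bin, which is worth checking before drawing conclusions.

```python
from collections import Counter

# Made-up minutes at which goals were scored (not real data).
goal_minutes = [12, 45, 90, 67, 90, 23, 90, 45]

# Count goals per minute, then pick the minute with the highest count.
goals_per_minute = Counter(goal_minutes)
peak_minute, peak_goals = goals_per_minute.most_common(1)[0]
print(peak_minute, peak_goals)
```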

  • Build expected goals (xG) models and compare players on various attributes.
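To make the idea of expected goals concrete, here is a deliberately naive baseline: the xG of a shot is taken to be the historical conversion rate of shots from the same location. This is a swapped-in simplification for illustration, not the model built in the project. The location labels and shot rows are hypothetical.

```python
from collections import defaultdict

# Made-up shots as (location, is_goal); the location labels are hypothetical.
shots = [
    ("penalty_spot", 1), ("penalty_spot", 1),
    ("penalty_spot", 0), ("penalty_spot", 1),
    ("outside_box", 0), ("outside_box", 0),
    ("outside_box", 1), ("outside_box", 0),
    ("six_yard_box", 1), ("six_yard_box", 0),
]


def fit_xg_table(rows):
    """Estimate xG per location as goals divided by shots from that location."""
    counts = defaultdict(lambda: [0, 0])  # location -> [goals, shots]
    for location, is_goal in rows:
        counts[location][0] += is_goal
        counts[location][1] += 1
    return {loc: goals / n for loc, (goals, n) in counts.items()}


def player_xg(shot_locations, table):
    """Sum the per-shot xG for one player's shots; unseen locations count as 0."""
    return sum(table.get(loc, 0.0) for loc in shot_locations)


xg_table = fit_xg_table(shots)
total_xg = player_xg(["penalty_spot", "outside_box", "outside_box"], xg_table)
print(xg_table)
print(total_xg)
```

Comparing a player's actual goals with their summed xG gives a rough sense of finishing above or below expectation. A fitted classifier using more shot attributes would normally replace the lookup table.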

A model was trained to predict each player's overall rating.


  • The model has an RMSE below 0.9452.
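For readers unfamiliar with the metric, RMSE (root mean squared error) is the square root of the average squared difference between predicted and actual values, so it is expressed in the same units as the rating itself. The sketch below computes it on made-up ratings; the numbers are illustrative and unrelated to the project's result.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up ratings: every prediction is off by exactly one point.
true_ratings = [70, 82, 65, 90]
predicted_ratings = [71, 81, 66, 89]
score = rmse(true_ratings, predicted_ratings)
print(score)
```

An RMSE below 1 on a rating scale therefore means the predictions are typically within about one rating point of the true value.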