In 2024 I completed the Google Data Analysis Certificate course through Coursera. This is my first data analysis case study, completed as part of the capstone project for that course. Please see the contents below to find out more.
The final task of my studies for the Google Data Analysis Certificate was to produce a case study for my portfolio. I had found an American Football dataset online a few years earlier, and it seemed like a great basis for a data analysis case study. I am a keen follower of American Football, in no small part because the games often feel very close and the late stages are dramatic more often than in any other sport I follow.
This case study seemed like a great opportunity to dust off the dataset I had downloaded and establish whether my perception of American Football as a drama-filled game was backed up by data, or whether I was suffering from a cognitive bias.
To read more about the specific question I asked, my methodology, and the problem-solving and technical skills that went into producing my analysis, click here.
To view the report containing my findings, click here.
I used Google's data analysis lifecycle as a framework to break the task into manageable stages that could be planned out. As per Google's lifecycle, those stages were:
To avoid any compatibility issues, and to save you the time of downloading and running my modified dataset, I built the interactive spreadsheet below. It features a sheet selector and a formula bar. For brevity, the main "Games" sheet contains the play-by-play for just two games, which should be sufficient to introduce the main concepts. The yellow columns in the "Games" sheet are custom columns I added to derive data. You can click into these columns, and into any column in the validation sheet, to review the formulas I used to extract the data I wanted.
Use the "Info" button to toggle the info pane. Clicking on a custom column will give you a brief description of the purpose of the column.
If you would rather download and run the final version of the modified dataset, it can be downloaded here.
To view the report I produced to summarise the results of my analysis, click here.