Data Introduction

For this project, I scraped data from Basketball Reference’s website. The site covers several leagues, including the NBA, G-League, and WNBA, to name a few. I focused on the NBA because I find it the most interesting of these leagues, and specifically on the 2017-2018 season. I gathered my data from Basketball Reference’s schedule and results tab, which provides data for every game that was played (see the table below for variables).

I wanted to see whether the Golden State Warriors differed significantly in these stats, because they won the NBA Finals that season. My intention was to find these differences by looking at different variables, such as the number of wins they had, average attendance (to see whether fans were a factor), points scored at home and away, and the days and months in which most games were played, first for the NBA as a whole and then specifically for the Warriors.

Glossary of Variables

Below is an explanation of all of the variables that were used in the data.

Variable            Description
game_id             Unique ID for each game
date_game           Date of the game
game_start_time     Time the game started
visitor_team_name   Name of the visiting team
visitor_pts         Number of points for the visiting team
home_team_name      Name of the home team
home_pts            Number of points for the home team
box_score_text      Text from the box score
overtimes           Number of overtimes the game went into
attendance          Number of people in attendance at the game
game_remarks        Game remarks that were made

Addition and Deletion of Variables

With the data that I scraped, I decided to add and remove some variables. I removed box_score_text and game_remarks because both were blank columns that would provide no insight into the data. Essentially, they were just taking up space. I also added some variables that I thought might be useful. To distinguish playoff games from regular season games, I created a variable that flags each game based on the playoff start date of April 14th, 2018. I also created two columns that display the winner and loser of each game so that the outcome is easy to identify.
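The added columns could be derived along the following lines. This is only a minimal sketch, not the exact code used for the project: the new column names (is_playoff, winner, loser) are hypothetical, the date format is an assumption about how the scraped date_game values look, and the sample rows are made up.

```python
from datetime import date, datetime

# Playoff start date given in the text.
PLAYOFF_START = date(2018, 4, 14)


def add_derived_columns(game):
    """Return a copy of one game row with is_playoff, winner, and loser added."""
    row = dict(game)
    # Assumed date format, e.g. "Sat, Apr 14, 2018"; adjust to match the scrape.
    played_on = datetime.strptime(row["date_game"], "%a, %b %d, %Y").date()
    row["is_playoff"] = played_on >= PLAYOFF_START
    # NBA games cannot end tied, so a strict comparison decides the winner.
    home_won = row["home_pts"] > row["visitor_pts"]
    row["winner"] = row["home_team_name"] if home_won else row["visitor_team_name"]
    row["loser"] = row["visitor_team_name"] if home_won else row["home_team_name"]
    return row


# Made-up rows for illustration only; these are not real results.
sample_games = [
    {"date_game": "Tue, Oct 17, 2017", "visitor_team_name": "Team A",
     "visitor_pts": 99, "home_team_name": "Team B", "home_pts": 102},
    {"date_game": "Sat, Apr 14, 2018", "visitor_team_name": "Team B",
     "visitor_pts": 110, "home_team_name": "Team A", "home_pts": 104},
]

for game in map(add_derived_columns, sample_games):
    print(game["date_game"], game["is_playoff"], game["winner"], game["loser"])
```

Returning a copy of each row keeps the scraped data untouched, which makes it easier to re-run the step if the playoff cutoff or date format needs to change.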

Conference and Division Comparison

Because I was looking into the Golden State Warriors, I wanted to see how their wins and losses compared with those of the other teams in their conference and in their division. The tables show each team along with its conference, division, wins, losses, and win percentage. The graph then shows only the number of wins, grouped by conference.

Looking at the Western Conference, we see only two true contenders: the Houston Rockets (76 wins) and the Golden State Warriors (74 wins). The next highest total after these two is 53 wins, so there is roughly a 20-game gap between second and third. This is quite substantial when we compare it to the Eastern Conference. Having only these two contenders in the West probably gave the Warriors an easier road to the Finals than some teams in the Eastern Conference had, because the East is much more packed at the top. There are roughly four or five teams in the East that you could consider contenders, as opposed to the two in the West. If we took the 20-game gap that the West has between second and third and applied it to the East, it would span first through sixth place. This shows how much more competitive the East was than the West.

Now how did the Warriors get so many wins? To answer this question, I took a look at their division. The Warriors are in the Pacific Division with the Los Angeles Clippers, Los Angeles Lakers, Sacramento Kings, and Phoenix Suns. For the 2017-2018 season, none of these other teams were very good, with the highest win total among them being 42. That corresponds to a .512 win percentage, meaning the best of these teams won just over half of its games. Being in this division basically gave the Warriors some “easy” wins and helped them reach their 74 wins.
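The win and loss tables can be rebuilt from the winner and loser columns. The sketch below is one possible way to do it rather than the code behind the tables; the key names winner and loser are hypothetical, and the sample outcomes are made up. The last line checks the .512 figure, assuming the 42 wins came in an 82-game regular season.

```python
from collections import Counter


def standings(games):
    """Build (team, wins, losses, win_pct) rows, best record first."""
    wins = Counter(g["winner"] for g in games)
    losses = Counter(g["loser"] for g in games)
    table = []
    for team in set(wins) | set(losses):
        played = wins[team] + losses[team]
        table.append((team, wins[team], losses[team], round(wins[team] / played, 3)))
    # Sort by win percentage descending, then by name so ties are stable.
    return sorted(table, key=lambda r: (-r[3], r[0]))


# Made-up outcomes for illustration only.
sample_games = [
    {"winner": "Team A", "loser": "Team B"},
    {"winner": "Team A", "loser": "Team C"},
    {"winner": "Team B", "loser": "Team C"},
]
print(standings(sample_games))

# The division figure quoted in the text: 42 wins out of an assumed 82 games.
print(round(42 / 82, 3))  # 0.512
```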

Average Attendance

Since the Warriors were doing so well, how did their fans feel about it? To answer this question, I looked at the average attendance for each team. The Warriors came in seventh, averaging 19,596 fans. Although they are seventh on the list, this average is a sellout crowd for Oracle Arena. So, for the 2017-2018 season they averaged a sellout, whereas other teams may have averaged more simply because their arenas hold more fans. In the future, I would want to look at the percentage of each arena that these average attendances fill and compare those figures to see who has the fullest arenas.
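Both the per-team average and the percent-filled idea can be sketched in a few lines. This is an illustration under assumptions: the rows and arena capacities are made up, games with a missing attendance value are skipped, and percent_filled is a hypothetical helper name. Real capacities would have to come from another source, since the scraped data does not include them.

```python
from collections import defaultdict


def average_home_attendance(games):
    """Average attendance per home team, skipping games with no figure."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for g in games:
        if g["attendance"] is None:
            continue
        totals[g["home_team_name"]] += g["attendance"]
        counts[g["home_team_name"]] += 1
    return {team: totals[team] / counts[team] for team in totals}


def percent_filled(average_attendance, capacity):
    """Share of the arena filled on an average night, as a percentage."""
    return 100 * average_attendance / capacity


# Made-up rows for illustration only.
sample_games = [
    {"home_team_name": "Team A", "attendance": 18000},
    {"home_team_name": "Team A", "attendance": 20000},
    {"home_team_name": "Team B", "attendance": 15000},
    {"home_team_name": "Team B", "attendance": None},
]
averages = average_home_attendance(sample_games)

# Hypothetical capacities; the scraped data does not include them.
capacities = {"Team A": 20000, "Team B": 20000}
for team, avg in sorted(averages.items()):
    print(team, avg, round(percent_filled(avg, capacities[team]), 1))
```

Ranking teams by percent filled instead of raw attendance would put a team that sells out a smaller arena ahead of one that half-fills a larger arena.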

Points per Team

With 74 wins in the season, did the Warriors get lucky or did they just dominate the competition? I looked at the total number of points that each team scored at home and on the road. The tables below show that the Warriors are at the top in both categories. At home the Warriors scored 5,902 points, and on the road they scored 5,721. It makes sense that they scored more points at home, as they would be more comfortable playing there. What surprised me was the number of points on the road. I would have expected it to be a little lower, given the constant travel and potential injuries they had to deal with all season. Being at the top of these two categories helps explain why they were able to gather so many wins, but again, was it because they were good or because they were playing teams that weren’t so good? Looking over their schedule, I believe it is a mix of both. They were challenged at times by their competition but, as happens in all major league sports, they also played teams that were terrible. When they played the better teams, they were still able to score a significant number of points.
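The home and road totals come from summing home_pts by home team and visitor_pts by visiting team. A minimal sketch of that aggregation, using made-up rows and a hypothetical function name, might look like this:

```python
from collections import Counter


def points_by_venue(games):
    """Return (home_points, road_points) Counters keyed by team name."""
    home_points, road_points = Counter(), Counter()
    for g in games:
        home_points[g["home_team_name"]] += g["home_pts"]
        road_points[g["visitor_team_name"]] += g["visitor_pts"]
    return home_points, road_points


# Made-up rows for illustration only.
sample_games = [
    {"visitor_team_name": "Team A", "visitor_pts": 99,
     "home_team_name": "Team B", "home_pts": 102},
    {"visitor_team_name": "Team B", "visitor_pts": 110,
     "home_team_name": "Team A", "home_pts": 104},
    {"visitor_team_name": "Team A", "visitor_pts": 95,
     "home_team_name": "Team B", "home_pts": 100},
]
home, road = points_by_venue(sample_games)
print(home.most_common())  # teams ranked by points scored at home
print(road.most_common())  # teams ranked by points scored on the road
```

Because teams do not necessarily play the same number of games in the data (playoff teams play more), dividing each total by the number of games played would give a fairer comparison than raw totals.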

Scheduling

Every year there is discussion about whether the NBA should change its schedule. Because of this, I wanted to look at how games are distributed by month and by day of the week for the whole NBA, and then look at some of these aspects for the Warriors.

This first graph shows that a majority of NBA games are played on Wednesday, Friday, and Saturday. This makes sense, since teams have to travel and need off days in order to do so. There really is not much to this graph other than showing which days are used for rest and which days games are played on.

This next graph is similar to the first one, except that it shows the month in which each game was played. The NBA season starts in October, and we see that just over 100 games were played in this month. This is because the season does not start until the middle of the month, resulting in a lower number of games. Other months with a low number of games are April, May, and June. The reason for this is the Playoffs, which usually start about the second week of April, resulting in fewer teams playing. May and June are even lower because teams are getting knocked out of the Playoffs, which results in fewer games played. February is an interesting month: it falls among neither the months with the fewest games nor those with the most. I suspect this is partly because February has two to three fewer days in which to play games, but does that account for about 100 fewer games? Probably not, and it would be something interesting to look into. As for the most games played, it does not come as a surprise that the months are November, December, January, and March. This is the middle of the NBA season, when most teams are playing two or three times a week, making the number of games add up.
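The counts behind both graphs, plus a quick check on the February question, can be sketched as follows. The date format is an assumption about the scraped date_game values, the rows are made up, and games_per_day is a hypothetical helper that normalizes a month's total by its number of calendar days.

```python
import calendar
from collections import Counter
from datetime import datetime


def games_by_weekday_and_month(games):
    """Count games per day of the week and per month."""
    by_weekday, by_month = Counter(), Counter()
    for g in games:
        # Assumed date format, e.g. "Tue, Oct 17, 2017".
        played_on = datetime.strptime(g["date_game"], "%a, %b %d, %Y")
        by_weekday[played_on.strftime("%A")] += 1
        by_month[played_on.strftime("%B")] += 1
    return by_weekday, by_month


def games_per_day(game_count, year, month):
    """Normalize a month's game count by the number of days in that month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return game_count / days_in_month


# Made-up rows for illustration only.
sample_games = [
    {"date_game": "Tue, Oct 17, 2017"},
    {"date_game": "Wed, Oct 18, 2017"},
    {"date_game": "Wed, Nov 1, 2017"},
]
by_weekday, by_month = games_by_weekday_and_month(sample_games)
print(by_weekday)
print(by_month)

# Hypothetical count of 140 games, only to show the normalization.
print(games_per_day(140, 2018, 2))
```

If February's games-per-day rate is still well below that of the neighboring months, the shorter month alone does not explain the drop.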

With the schedule broken down by month, how many home and road games do the Warriors have? The NBA most likely tries to distribute home and road games evenly over the course of the year. In the graphs below, we can see that this is more or less the case. The only months really worth noting are December, January, and March. The Warriors played roughly four more home games than road games in December. However, this was balanced out in January by roughly the same difference in the other direction. Some might say that playing at home gave the Warriors an advantage, as they were able to be more comfortable and gain confidence by winning games early in the season. This could be the case because they went 8-10 at home in December. A similar record was achieved on the road in January, as they went 7-10. It seems that the Warriors took advantage of their early home games, winning most of them and carrying that into January as they spent more time on the road. Being able to take advantage of these early opportunities is the sign of a good team. The last month that has a noticeable difference is March, when they seemed to play a handful more home games. This is critical because it is when teams are fighting for spots in the playoffs. Teams are also tired and recovering from injuries at this point, so being able to play more home games and not travel as much is a huge benefit. This, however, did not seem to be a factor for the Warriors, as they went 7-7 in March. That could also be due to resting players to keep them as healthy as possible for the playoffs, since they already knew that they would be in.

Future Analysis

After looking at the schedule and results from Basketball Reference, I think that looking into the box scores or season stats for each team would help validate my findings. These would show more about each team’s scoring and whether they make more 2s or 3s. Another way to see whether the Warriors were the best team would be to look at individual player stats and see if players “got in the way of one another.” This could be done by checking whether their stats are positively correlated, which would help show whether they make each other better or worse. Box plots would be a useful visual for this, breaking down the points each player scores when the other is playing and when the other is not (because of injury, illness, rest, etc.). This would be a simple way of looking into it, or you could take each player’s offensive and defensive ratings for each game. This would allow us to see whether one player’s performance is independent of another’s and what the reasons may be. With this, you could compare different Warriors players with one another, along with players from other teams, to see if they have the same effect on one another.
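As a starting point for the correlation idea, the Pearson correlation coefficient between two players’ game-by-game point totals could be computed as in the sketch below. The point totals are made up, the function name is hypothetical, and a positive value would only suggest that the two players tend to score well in the same games, not that one causes the other to play better.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up points per game for two hypothetical players in the same four games.
player_a = [20, 25, 30, 35]
player_b = [10, 12, 14, 16]
print(pearson(player_a, player_b))
```

A value near 1 means the two series rise and fall together, a value near -1 means one tends to rise when the other falls, and a value near 0 means there is no linear relationship.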