Performance of Sports Teams by State

As a data enthusiast with a deep passion for sports, I’ve always been fascinated by the intersection of analytics and athletic performance. Having followed soccer for the past 15 years, I recently developed a keen interest in U.S. sports. While certain teams—such as the Kansas City Chiefs in the NFL—have dominated their respective leagues in recent years, I wanted to take a broader approach. Rather than focusing on individual franchises, I set out to analyze how states as a whole have performed across the four major professional leagues: the NFL, NBA, MLB, and NHL.

This curiosity led me to embark on a data-driven project to rank U.S. states based on the average win percentage of their professional teams over the past five years.

The interactive map below allows users to explore various layers, offering insights into overall state performance across the four major leagues, sport-specific success, and the geographic distribution of teams.

Use the layers button in the top-right corner to switch between the following views, each covering the last 5 years:

  1. Average Win % of Teams within a state across NHL, NBA, MLB and NFL
  2. Total Number of Conference Titles won by teams within the state
  3. Average Individual League Win % of Teams (e.g., Minnesota’s NBA Win %)
  4. Teams located within each State

Plotting Guide:

This section provides an in-depth look at the process behind creating the final map displayed above. For a detailed explanation of how the raw data was gathered, processed, and refined for this project, you can find more information here.

performance_by_state <- read.csv('sports_performance.csv') #Sports performance of each state over the last 5 years
stadiums <- read.csv('stadiums.csv') #American teams and their stadium locations
head(performance_by_state)
##                  State NHL_WinPCT NHL_Teams NHL_ConferenceTitles NBA_WinPCT
## 1              Alberta      56.80         2                    1         NA
## 2              Arizona      34.76         1                    0      63.93
## 3     British Columbia      50.53         1                    0         NA
## 4           California      40.29         3                    0      53.96
## 5             Colorado      63.68         1                    1      64.03
## 6 District of Columbia      51.98         1                    0      34.43
##   NBA_Teams NBA_ConferenceTitles MLB_WinPCT MLB_Teams MLB_ConferenceTitles
## 1        NA                   NA         NA        NA                   NA
## 2         1                    1      45.76         1                    1
## 3        NA                   NA         NA        NA                   NA
## 4         4                    1      51.61         5                    2
## 5         1                    1      40.74         1                    0
## 6         1                    0      40.68         1                    0
##   NFL_WinPCT NFL_Teams NFL_ConferenceTitles StateWinPCT TotalConferenceTitles
## 1         NA        NA                   NA       56.80                     1
## 2      41.67         1                    0       46.53                     2
## 3         NA        NA                   NA       50.53                     0
## 4      53.97         3                    2       49.96                     5
## 5      41.67         1                    0       52.53                     2
## 6         NA        NA                   NA       42.36                     0
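
The sample rows above are consistent with StateWinPCT being the unweighted mean of the available league averages, skipping leagues in which the state has no team. A quick check in base R, re-typing three rows from the output; how the column was actually computed upstream is an assumption here:

```r
# Three rows copied from the head() output above
sample_rows <- data.frame(
  State      = c("Arizona", "California", "District of Columbia"),
  NHL_WinPCT = c(34.76, 40.29, 51.98),
  NBA_WinPCT = c(63.93, 53.96, 34.43),
  MLB_WinPCT = c(45.76, 51.61, 40.68),
  NFL_WinPCT = c(41.67, 53.97, NA)   # D.C. has no NFL entry in the data
)

league_cols <- c("NHL_WinPCT", "NBA_WinPCT", "MLB_WinPCT", "NFL_WinPCT")

# Unweighted mean of the league averages, ignoring leagues without a team
sample_rows$StateWinPCT_check <- round(
  rowMeans(sample_rows[, league_cols], na.rm = TRUE), 2
)

sample_rows[, c("State", "StateWinPCT_check")]
```

The recomputed values match the StateWinPCT column for these rows (46.53, 49.96 and 42.36). Because the mean is taken over league averages rather than individual teams, a league with five teams in a state counts the same as a league with one.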


Step I: Creating the Base Map
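#Load the packages used throughout this guide
library(sf)        #read_sf()
library(dplyr)     #left_join(), filter(), %>%
library(leaflet)   #leaflet(), addPolygons(), addCircleMarkers()
library(htmltools) #HTML(), htmlEscape()

#Shape file for US States
states <- read_sf('cb_2018_us_state_500k.shp')

#Create the final dataframe which includes the State Boundaries and Sports Performance
states_df <- left_join(states, performance_by_state, by = c('NAME' = 'State'))
#Keep only the states that have performance data
states_df <- states_df[!is.na(states_df$StateWinPCT),]

#Create a base map and set the default view to US
base_map <- states_df %>%
  leaflet() %>%
  setView(lat = 39.8282, lng = -98.5795, zoom = 4) %>%
  addProviderTiles(providers$CartoDB.PositronNoLabels)
base_map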

The base map is created with the leaflet() function, with the view centered on the United States. The state boundaries come from the Census Bureau’s shapefile, which is joined to the performance data by state name.
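
leaflet draws layers in WGS84 longitude/latitude (EPSG:4326) and warns when an sf layer uses a different datum. Census cartographic boundary files are commonly distributed in NAD83 (EPSG:4269), so reprojecting the shapefile before plotting may be worthwhile. A minimal sketch on a single stand-in point (the map center passed to setView()), assuming the sf package; in the real pipeline the same st_transform() call would be applied to states:

```r
library(sf)

# Stand-in geometry in NAD83: the map center used in setView()
pt_nad83 <- st_sfc(st_point(c(-98.5795, 39.8282)), crs = 4269)

# Reproject to WGS84, the coordinate system leaflet expects
pt_wgs84 <- st_transform(pt_nad83, 4326)

# EPSG code of the reprojected geometry
st_crs(pt_wgs84)$epsg
```

Whether the reprojection is needed depends on the CRS reported by st_crs(states) for the file actually in use.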


Step II: Creating the Base Layer

The base layer of the map highlights each state, representing the average win percentage of all teams across the four major leagues, over the last 5 years.

#Create labels that will be displayed on hover
labels_all <- paste0("<b>",states_df$NAME, "</b>", "<br>", "Avg. Win %: ",states_df$StateWinPCT)

#Create a color-palette for values ranging from 20% - 80% (range of Win %)
paletteNum_all <- colorNumeric('YlGn', domain = c(20,80))

#Use the addPolygons() function to fill the state using the win % of teams
map_state_win_pct <- base_map %>% 
  addProviderTiles(providers$CartoDB.PositronNoLabels) %>% 
  addPolygons(data = states_df, weight = 1, smoothFactor = 0.5, color = 'white', fillOpacity = 0.8, fillColor = ~paletteNum_all(states_df$StateWinPCT), label = lapply(labels_all,HTML),group = "StateWinPct") %>% 
  addLegend(pal = paletteNum_all, values = c(20,80), opacity = 0.7, position = "bottomleft",title = "Avg. Win % of Teams (Last 5 years)")

map_state_win_pct

The addPolygons() function draws each state as a polygon and fills it according to the state’s average win % across the Big 4 leagues.
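
The labels are wrapped in HTML() because leaflet escapes plain character labels, so tags such as <b> would otherwise be shown literally on hover. A small sketch of that difference, in the style of labels_all and using the Arizona row from the output above; it assumes the htmltools package:

```r
library(htmltools)

# One hover label built in the style of labels_all (Arizona row from the data)
label_text <- paste0("<b>", "Arizona", "</b>", "<br>", "Avg. Win %: ", 46.53)

# HTML() flags the string as markup so it is rendered rather than escaped
label_html <- HTML(label_text)
class(label_html)

# What the same label looks like once its tags are escaped
htmlEscape(label_text)
```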


Step III: Adding Another Layer

Additional layers showcasing win percentages for individual leagues are created using the addPolygons() function, applying the same methodology for each league.

#Create labels that will be displayed on hover
labels_nhl <- paste0("<b>", states_df$NAME ,"</b>", "<br>",
                     "<b>NHL Win Pct: </b>", round(states_df$NHL_WinPCT, 2), "%", "<br>",
                     "<b>NHL Conference Titles: </b>", states_df$NHL_ConferenceTitles)

#Use the addPolygons() function to fill the state using the NHL win % of teams
map_state_win_pct_nhl <- base_map %>% 
  addProviderTiles(providers$CartoDB.PositronNoLabels) %>% 
  addPolygons(data = states_df, weight = 1, smoothFactor = 0.5, color = 'white', fillOpacity = 0.8, fillColor = ~paletteNum_all(states_df$NHL_WinPCT), label = lapply(labels_nhl,HTML),group = "NHLWinPct")

map_state_win_pct_nhl

The map layers for individual leagues can re-use the existing color-palette and legend created for the base layer.
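
Re-using paletteNum_all works because every league’s win percentage sits on the same scale. States without a team in a given league carry NA for that league and are filled with the palette’s NA color (gray, "#808080", by default). A sketch of that behavior, assuming the leaflet package and using Colorado’s NHL value from the sample output:

```r
library(leaflet)

# Same palette definition as the base layer
paletteNum_all <- colorNumeric("YlGn", domain = c(20, 80))

# Colorado's NHL win % and a missing value (a state with no team in that league)
fill_colors <- paletteNum_all(c(63.68, NA))
fill_colors

# Values outside the 20-80 domain are also treated as NA, with a warning
suppressWarnings(paletteNum_all(85))
```

If gray polygons are not wanted for a league layer, one option is to filter states_df to the rows where that league’s column is not NA before calling addPolygons().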


Step IV: Mapping the Team Locations

The locations of the individual teams can be added as point markers based on the longitude and latitude of the team’s stadium. The stadiums data frame contains the name and location of all team stadiums, along with the respective league names.

#Add Team Stadiums
nhl_teams <- stadiums %>% filter(League == 'NHL')
nfl_teams <- stadiums %>% filter(League == 'NFL')
nba_teams <- stadiums %>% filter(League == 'NBA')
mlb_teams <- stadiums %>% filter(League == 'MLB')

#Create a color-palette for each league
pal <- colorFactor(palette = c("black", "red", "green", "blue"), levels = c('NHL','NFL','NBA','MLB'))

#Add the location of the stadiums on the base map as markers
map_stadiums <- base_map %>% 
  addCircleMarkers(data = nhl_teams, ~Long, ~Lat, color = ~pal(League), label = ~htmlEscape(Team), group = "NHL",radius = 2) %>% 
  addCircleMarkers(data = nfl_teams, ~Long, ~Lat, color = ~pal(League), label = ~htmlEscape(Team), group = "NFL",radius = 2) %>% 
  addCircleMarkers(data = nba_teams, ~Long, ~Lat, color = ~pal(League), label = ~htmlEscape(Team), group = "NBA",radius = 2) %>%
  addCircleMarkers(data = mlb_teams, ~Long, ~Lat, color = ~pal(League), label = ~htmlEscape(Team), group = "MLB",radius = 2)

map_stadiums

The locations of stadiums can be plotted using the addCircleMarkers() or addMarkers() functions. For this map, the teams are divided into four groups by league, and each league is drawn in a different color.


Step V: Combining All the Layers

The addPolygons() step above can be repeated to create a layer for each individual league. The final map is then built by combining the layers from the previous steps into a single map object and using the addLayersControl() function to let the user switch between layers.
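
The combined map refers to labels_nba, labels_mlb, labels_nfl, labels_titles and paletteNum_titles, which are not defined in the earlier steps. A minimal sketch of how they could be built by generalizing the NHL labels. The helper make_league_labels(), the "Blues" palette and the wording of the titles label are illustrative assumptions rather than the original code, and the three-row data frame stands in for states_df:

```r
library(leaflet)

# Stand-in for states_df: three rows copied from the head() output
demo_df <- data.frame(
  NAME = c("Arizona", "California", "Colorado"),
  NBA_WinPCT = c(63.93, 53.96, 64.03),
  NBA_ConferenceTitles = c(1, 1, 1),
  TotalConferenceTitles = c(2, 5, 2)
)

# Hypothetical helper: hover labels for one league, mirroring labels_nhl
make_league_labels <- function(df, league) {
  paste0("<b>", df$NAME, "</b>", "<br>",
         "<b>", league, " Win Pct: </b>",
         round(df[[paste0(league, "_WinPCT")]], 2), "%", "<br>",
         "<b>", league, " Conference Titles: </b>",
         df[[paste0(league, "_ConferenceTitles")]])
}

labels_nba <- make_league_labels(demo_df, "NBA")

# Assumed label text and palette for the conference-titles layer
labels_titles <- paste0("<b>", demo_df$NAME, "</b>", "<br>",
                        "<b>Conference Titles: </b>", demo_df$TotalConferenceTitles)
paletteNum_titles <- colorNumeric("Blues",
                                  domain = c(0, max(demo_df$TotalConferenceTitles)))

labels_nba[1]
```

With states_df in place of demo_df, calling the helper with "MLB" and "NFL" would produce the remaining label vectors.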

#Combine all the layers
final_map <- base_map %>%
  addProviderTiles(providers$CartoDB.PositronNoLabels) %>%
  addPolygons(data = states_df, weight = 1, smoothFactor = 0.5, color = 'white', fillOpacity = 0.8, fillColor = ~paletteNum_all(states_df$StateWinPCT), label = lapply(labels_all,HTML),highlightOptions = highlightOptions(color = "yellow", weight = 3), group = "StateWinPct") %>% 
  addLegend(pal = paletteNum_all, values = c(20,80), opacity = 0.7, position = "bottomleft",title = "Avg. Win % of Teams (Last 5 years)") %>% 
  addPolygons(data = states_df, weight = 1, smoothFactor = 0.5, color = 'white', fillOpacity = 0.8, fillColor = ~paletteNum_titles(states_df$TotalConferenceTitles), label = lapply(labels_titles,HTML), highlightOptions = highlightOptions(color = "yellow", weight = 3), group = "Titles") %>%
  addPolygons(data = states_df, weight = 1, smoothFactor = 0.5, color = 'white', fillOpacity = 0.8, fillColor = ~paletteNum_all(states_df$NHL_WinPCT), label = lapply(labels_nhl,HTML), highlightOptions = highlightOptions(color = "yellow", weight = 3), group = "NHLWinPct") %>%
  addPolygons(data = states_df, weight = 1, smoothFactor = 0.5, color = 'white', fillOpacity = 0.8, fillColor = ~paletteNum_all(states_df$NBA_WinPCT), label = lapply(labels_nba,HTML), highlightOptions = highlightOptions(color = "yellow", weight = 3), group = "NBAWinPct") %>%
  addPolygons(data = states_df, weight = 1, smoothFactor = 0.5, color = 'white', fillOpacity = 0.8, fillColor = ~paletteNum_all(states_df$MLB_WinPCT), label = lapply(labels_mlb,HTML), highlightOptions = highlightOptions(color = "yellow", weight = 3), group = "MLBWinPct") %>%
  addPolygons(data = states_df, weight = 1, smoothFactor = 0.5, color = 'white', fillOpacity = 0.8, fillColor = ~paletteNum_all(states_df$NFL_WinPCT), label = lapply(labels_nfl,HTML), highlightOptions = highlightOptions(color = "yellow", weight = 3), group = "NFLWinPct") %>%
  addCircleMarkers(data = nhl_teams, ~Long, ~Lat, color = ~pal(League), label = ~htmlEscape(Team), group = "NHLStadiums",radius = 2) %>% 
  addCircleMarkers(data = nfl_teams, ~Long, ~Lat, color = ~pal(League), label = ~htmlEscape(Team), group = "NFLStadiums",radius = 2) %>% 
  addCircleMarkers(data = nba_teams, ~Long, ~Lat, color = ~pal(League), label = ~htmlEscape(Team), group = "NBAStadiums",radius = 2) %>%
  addCircleMarkers(data = mlb_teams, ~Long, ~Lat, color = ~pal(League), label = ~htmlEscape(Team), group = "MLBStadiums",radius = 2) %>% 
  addLayersControl(overlayGroups = c("StateWinPct","Titles","NHLWinPct","NBAWinPct","MLBWinPct","NFLWinPct","NHLStadiums","NFLStadiums","NBAStadiums","MLBStadiums")) %>%
  #Hide every layer except the default StateWinPct layer
  hideGroup(c("Titles","NHLWinPct","NBAWinPct","MLBWinPct","NFLWinPct","NHLStadiums","NFLStadiums","NBAStadiums","MLBStadiums"))

final_map

The hideGroup() function sets the Win % layer as the default layer for the map by hiding the other layers until they are selected in the layers control.
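
hideGroup() accepts a character vector of group names, so several layers can be hidden in one call. A minimal sketch of the mechanism, assuming the leaflet package; the three marker groups "A", "B" and "C" are made up for the example and all sit at the map center used in setView():

```r
library(leaflet)

# Three overlay groups; only "A" is visible when the map first loads
demo_map <- leaflet() %>%
  setView(lng = -98.5795, lat = 39.8282, zoom = 4) %>%
  addCircleMarkers(lng = -98.5795, lat = 39.8282, group = "A") %>%
  addCircleMarkers(lng = -98.5795, lat = 39.8282, group = "B") %>%
  addCircleMarkers(lng = -98.5795, lat = 39.8282, group = "C") %>%
  addLayersControl(overlayGroups = c("A", "B", "C")) %>%
  hideGroup(c("B", "C"))

# List the methods recorded on the map widget, in call order
methods_used <- vapply(demo_map$x$calls, function(cl) cl$method, character(1))
methods_used
```

Overlay groups appear as checkboxes, so several choropleth layers can be switched on at once and will stack. If the choropleths should be mutually exclusive, the baseGroups argument of addLayersControl() renders radio buttons instead, which may be worth trying as an alternative.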