Introduction

Clichés are part and parcel of many aspects of life, including sports. Soccer has its own varied collection of phrases trotted out each weekend by commentators and pundits. One of these is the new manager bump: the belief that a change of coach leads to a temporary improvement in results. I decided to see what the data had to say on the matter. I scraped 12 years of Brazilian Serie A data and 22 years of English Premier League data to fuel my analysis. Why use Brazilian data to look at what is, as far as I know, a quintessentially English phrase? The Premier League is far too sane when it comes to changing managers. Brazilian coaches, on the other hand, are fired at the drop of a hat, and this chaos provided an interesting counterbalance for the question at hand.

Imports

There’s not much to say on the scraping here, except perhaps to explain why I scraped the data I did. Currently both the Brazilian and English first divisions have 20 teams that play 38 games a season. This format extends back 22 years for England and 12 for Brazil. Basically it was a matter of convenience.

library(tidyverse)
library(janitor)
source("helper_functions.R")

df <- read_csv("./data/br_results_refined.csv")
df_eng <- read_csv("./data/eng_results_refined.csv")
head(df, 5)
## # A tibble: 5 x 9
##   team_1 team_2 score manager_1 manager_2 season game_week team_1_score
##   <chr>  <chr>  <chr> <chr>     <chr>      <int>     <int>        <int>
## 1 Juven… Paran… 1-0   Hélio do… Caio Jún…   2006         1            1
## 2 Vasco… Inter… 1-1   Renato G… Abel Bra…   2006         1            1
## 3 Grêmi… Corin… 2-0   Mano Men… Ademar B…   2006         1            2
## 4 São P… Flame… 1-0   Muricy R… Waldemar…   2006         1            1
## 5 São C… Cruze… 2-1   Nelsinho… Gusmão      2006         1            2
## # ... with 1 more variable: team_2_score <int>
head(df_eng, 5)
## # A tibble: 5 x 9
##   team_1 team_2 score manager_1 manager_2 season game_week team_1_score
##   <chr>  <chr>  <chr> <chr>     <chr>     <chr>      <int>        <int>
## 1 Aston… Manch… 3-1   Brian Li… Alex Fer… 1995-…         1            3
## 2 Black… Queen… 1-0   Ray Harf… Ray Wilk… 1995-…         1            1
## 3 Chels… Evert… 0-0   Glenn Ho… Joe Royle 1995-…         1            0
## 4 Liver… Sheff… 1-0   Roy Evans David Pl… 1995-…         1            1
## 5 Manch… Totte… 1-1   Alan Ball Gerry Fr… 1995-…         1            1
## # ... with 1 more variable: team_2_score <int>

Methodology

Once I had scraped and cleaned the data, I had to define what would qualify as a bump. I quickly settled on intra-season five-game comparisons, i.e. the average points collected by the old coach in his last five games versus the average gained by the new coach in his first five games. The data didn’t always fit this definition, but I thought it was good enough. (For example, one coach was fired four games into a new campaign, so his average is based on four games rather than five.)
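As a rough illustration of that definition, here is a minimal base-R sketch. The function name `window_avg_points` and the points vector are made up for the example; this is not the `manager_change_comparison` helper from helper_functions.R, and it ignores the case where two changes fall within five games of each other.

```r
# Hypothetical helper (not from helper_functions.R): average points in the
# (up to) `window` games either side of a managerial change.
# `points` holds one club's points per game in match order;
# `change_idx` is the position of the new manager's first game (>= 2).
window_avg_points <- function(points, change_idx, window = 5) {
  n <- length(points)
  prev_idx <- seq(max(1, change_idx - window), change_idx - 1)
  new_idx  <- seq(change_idx, min(n, change_idx + window - 1))
  c(
    manager_prev_avg_pts = mean(points[prev_idx]),
    manager_new_avg_pts  = mean(points[new_idx])
  )
}

# Made-up season: the first manager is sacked after four games,
# so his average is based on four games rather than five.
pts <- c(0, 3, 0, 0, 3, 3, 1, 3, 3, 0)
window_avg_points(pts, change_idx = 5)
# manager_prev_avg_pts = 0.75, manager_new_avg_pts = 2.6
```

Clipping the window with `max()` and `min()` is what lets a short spell at the start or end of a season still produce an average, at the cost of comparing windows of unequal length.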

I did a test for one Brazilian team in one season with two managerial changes.

club <- 'Santos FC'
ssn <- 2017

df_test <- df %>%  
    filter(team_1 == club | team_2 == club, season == ssn) %>%
    mutate(
        manager = ifelse(team_1 == club, manager_1, manager_2),
        manager_change = manager != lag(manager),
        team = club,
        result = case_when(
        (team_1 == club) & (team_1_score > team_2_score) ~ 'W',
        (team_2 == club) & (team_1_score < team_2_score) ~ 'W',
        (team_1 == club) & (team_1_score < team_2_score) ~ 'L',
        (team_2 == club) & (team_1_score > team_2_score) ~ 'L',
        TRUE ~ 'D'
        ),
        points = case_when(result == 'W' ~ 3, result == 'L' ~ 0, TRUE ~ 1)
    )

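# Row positions of each new manager's first game (which() drops the leading NA)
change_indices <- which(df_test$manager_change)
# Label each change; note the order is current manager first, then the lagged (previous) one
mc <- paste(df_test$manager, lag(df_test$manager), sep = ' -> ')[df_test$manager_change == TRUE]
mc <- mc[!is.na(mc)]  # drop the NA produced by the first row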

df_test_report <- map_df(change_indices, manager_change_comparison, df = df_test) %>% 
    mutate(season = ssn, manager_change = mc) %>% 
    select(team, season, manager_change, manager_prev_avg_pts, manager_new_avg_pts)

df_test_report
## # A tibble: 2 x 5
##   team    season manager_change        manager_prev_avg… manager_new_avg_…
##   <chr>    <dbl> <chr>                             <dbl>             <dbl>
## 1 Santos…   2017 Lévir Culpi -> Doriv…              0.75               2.6
## 2 Santos…   2017 Elano -> Lévir Culpi               1.2                1.2
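The change detection in the test above rests on comparing each game's manager with the previous game's. A toy version, with made-up manager names and written in base R so it runs without the tidyverse, might look like this. Here `c(NA, head(x, -1))` stands in for `dplyr::lag()`, and the label is built with the outgoing manager first.

```r
# Made-up manager per game, in match order
managers <- c("A", "A", "B", "B", "B", "C")

# Base-R stand-in for dplyr::lag(): shift the vector down by one game
previous <- c(NA, head(managers, -1))

# NA for the first game, TRUE wherever the name differs from the game before
manager_change <- managers != previous

# which() ignores the NA and returns the first game of each new manager
change_indices <- which(manager_change)
change_indices
# 3 6

# Label each change with the outgoing manager first
paste(previous[change_indices], managers[change_indices], sep = " -> ")
# "A -> B" "B -> C"
```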

Then I expanded my analysis to the entire Brazilian data set. In Brazil there was an average of 28 sackings per year across the 12 years. The most sacked and most hired managers were… unknown: missing data (the ??? rows in the tables) led both categories. Below that, Dorival Júnior and Ney Franco were the most sacked (9 times each), while Ney Franco (10) and Cuca (8) were the most frequent replacements, with Dorival Júnior among those tied for third on 7.

ssn_clubs <- map_df(df$season %>% unique(), ssn_clubs_combinations, df = df)
managerial_change_data <- map2_df(ssn_clubs$Var1, ssn_clubs$Var2, club_ssn_data, df = df)

managerial_change_data %>%
    separate(manager_change, c("sacked", "hired"), sep = ' -> ') %>%
    group_by(sacked) %>%
    summarise(n = n(), pts_avg = mean(manager_prev_avg_pts)) %>%
    arrange(desc(n))
## # A tibble: 122 x 3
##    sacked                     n pts_avg
##    <chr>                  <int>   <dbl>
##  1 ???                       46   0.870
##  2 Dorival Júnior             9   0.8  
##  3 Ney Franco                 9   1.16 
##  4 Jorginho                   8   0.8  
##  5 Cuca                       7   0.714
##  6 Paulo César Carpegiani     7   1.03 
##  7 Renato Gaúcho              7   0.743
##  8 Antônio Lopes              6   1.08 
##  9 Enderson Moreira           6   0.875
## 10 Geninho                    6   0.581
## # ... with 112 more rows
managerial_change_data %>%
    separate(manager_change, c("sacked", "hired"), sep = ' -> ') %>%
    group_by(hired) %>%
    summarise(n = n(), pts_avg = mean(manager_new_avg_pts)) %>%
    arrange(desc(n))
## # A tibble: 116 x 3
##    hired                      n pts_avg
##    <chr>                  <int>   <dbl>
##  1 ???                       36   1.23 
##  2 Ney Franco                10   1.5  
##  3 Cuca                       8   1.75 
##  4 Dorival Júnior             7   1.97 
##  5 Jorginho                   7   1.07 
##  6 Paulo Autuori              7   1.31 
##  7 Paulo César Carpegiani     7   0.843
##  8 Vanderlei Luxemburgo       7   1.49 
##  9 Argél                      6   0.633
## 10 Émerson Leão               6   0.833
## # ... with 106 more rows

Results

I looked at the results in two ways. The first was a raw calculation of how often the successor outperformed his predecessor, which came out at 66% in Brazil. The second was a pair of one-sided t-tests, one paired and the other not (Welch). In Brazil both tests rejected the null hypothesis: new managers averaged 1.30 points per game against 0.93 for the old managers. The 0.37 difference seems small, but extrapolated over 38 games it comes to an extra 14 points.

There’s a tiny disclaimer needed here that these tests are being run on an average of an average. That is, the average of the average number of points across (a max of) 5 games for hired managers in Brazil is 1.3. The sample sizes are more or less the same, though.
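To make the average-of-averages caveat concrete, here is a small sketch with made-up numbers (not taken from the scraped data). When the spells have different lengths, averaging the per-spell averages weights each spell equally, whereas pooling the games weights each game equally, and the two can differ.

```r
# Made-up points for two managerial spells: one of four games, one of five
spell_a <- c(3, 3, 3, 3)
spell_b <- c(0, 0, 0, 0, 0)

# Average of averages: each spell counts once, whatever its length
mean_of_means <- mean(c(mean(spell_a), mean(spell_b)))
mean_of_means
# 1.5

# Pooled average: each game counts once
pooled_mean <- mean(c(spell_a, spell_b))
pooled_mean
# 1.333333
```

With windows capped at five games and most of them complete, the gap in practice should be far smaller than in this deliberately extreme example.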

managerial_change_data %>% 
    mutate(compare = manager_new_avg_pts > manager_prev_avg_pts) %>% 
    pull(compare) %>% 
    mean()
## [1] 0.660767
t.test(
    managerial_change_data$manager_new_avg_pts,
    managerial_change_data$manager_prev_avg_pts,
    alternative = 'greater',
    paired = TRUE
)
## 
##  Paired t-test
## 
## data:  managerial_change_data$manager_new_avg_pts and managerial_change_data$manager_prev_avg_pts
## t = 8.2841, df = 338, p-value = 1.41e-15
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.2969312       Inf
## sample estimates:
## mean of the differences 
##               0.3707473
t.test(
    managerial_change_data$manager_new_avg_pts,
    managerial_change_data$manager_prev_avg_pts,
    alternative = 'greater',
    paired = FALSE
)
## 
##  Welch Two Sample t-test
## 
## data:  managerial_change_data$manager_new_avg_pts and managerial_change_data$manager_prev_avg_pts
## t = 7.8573, df = 661.64, p-value = 7.964e-15
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.2930259       Inf
## sample estimates:
## mean of x mean of y 
## 1.2989184 0.9281711
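For readers wondering how the paired and unpaired tests relate, the sketch below runs both on simulated data (random numbers, not the scraped results). The point estimate of the difference is identical in both. What changes is how the variability is estimated: the paired test works on the per-change differences, while the Welch test treats the two columns as independent samples.

```r
set.seed(1)  # simulated data for illustration only
n <- 40
prev_avg <- runif(n, min = 0, max = 2)
new_avg  <- prev_avg + rnorm(n, mean = 0.3, sd = 0.8)

paired <- t.test(new_avg, prev_avg, alternative = "greater", paired = TRUE)
welch  <- t.test(new_avg, prev_avg, alternative = "greater", paired = FALSE)

# Same estimated difference in means from both tests
unname(paired$estimate)
unname(welch$estimate[1] - welch$estimate[2])

# Different degrees of freedom: n - 1 for the paired test,
# a Welch-Satterthwaite approximation for the unpaired one
paired$parameter
welch$parameter
```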

Now for England. The average was 10 sackings per year across the 22 years of data. The most sacked and hired managers were again unknown. Below that, Harry Redknapp and Roy Hodgson (both 5 times) were the most sacked, while Redknapp and Gary Megson (both 5 times) were the most frequent replacements.

The new manager outperformed the old one 65% of the time.

Both t-tests rejected the null hypothesis: new managers averaged 1.17 points per game against 0.81 for the old managers. The 0.36 difference extrapolated over 38 games is again roughly an extra 14 points.

ssn_clubs_eng <- map_df(df_eng$season %>% unique(), ssn_clubs_combinations, df = df_eng)
managerial_change_data_eng <- map2_df(ssn_clubs_eng$Var1, ssn_clubs_eng$Var2, club_ssn_data, df = df_eng)

managerial_change_data_eng %>%
    separate(manager_change, c("sacked", "hired"), sep = ' -> ') %>%
    group_by(sacked) %>%
    summarise(n = n(), pts_avg = mean(manager_prev_avg_pts)) %>%
    arrange(desc(n))
## # A tibble: 139 x 3
##    sacked              n pts_avg
##    <chr>           <int>   <dbl>
##  1 ???                10   0.61 
##  2 Harry Redknapp      5   0.533
##  3 Roy Hodgson         5   0.68 
##  4 Alan Pardew         4   0.5  
##  5 Dave Watson         4   0.85 
##  6 Gary Megson         4   0.45 
##  7 Glenn Hoddle        4   1.27 
##  8 Gordon Strachan     4   0.75 
##  9 Joe Royle           4   1    
## 10 John Gregory        4   1.15 
## # ... with 129 more rows
managerial_change_data_eng %>%
    separate(manager_change, c("sacked", "hired"), sep = ' -> ') %>%
    group_by(hired) %>%
    summarise(n = n(), pts_avg = mean(manager_new_avg_pts)) %>%
    arrange(desc(n))
## # A tibble: 139 x 3
##    hired              n pts_avg
##    <chr>          <int>   <dbl>
##  1 ???               10    1.08
##  2 Gary Megson        5    0.6 
##  3 Harry Redknapp     5    1   
##  4 Alan Pardew        4    1.05
##  5 Glenn Hoddle       4    1.5 
##  6 Joe Royle          4    0.8 
##  7 Kevin Keegan       4    1.6 
##  8 Mark Hughes        4    1.5 
##  9 Roy Hodgson        4    1.4 
## 10 Stuart Gray        4    1.3 
## # ... with 129 more rows
managerial_change_data_eng %>% 
    mutate(compare = manager_new_avg_pts > manager_prev_avg_pts) %>% 
    pull(compare) %>% 
    mean()
## [1] 0.6491228
t.test(
    managerial_change_data_eng$manager_new_avg_pts,
    managerial_change_data_eng$manager_prev_avg_pts,
    alternative = 'greater',
    paired = TRUE
)
## 
##  Paired t-test
## 
## data:  managerial_change_data_eng$manager_new_avg_pts and managerial_change_data_eng$manager_prev_avg_pts
## t = 6.7818, df = 227, p-value = 5.083e-11
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.2724492       Inf
## sample estimates:
## mean of the differences 
##               0.3601608
t.test(
    managerial_change_data_eng$manager_new_avg_pts,
    managerial_change_data_eng$manager_prev_avg_pts,
    alternative = 'greater',
    paired = FALSE
)
## 
##  Welch Two Sample t-test
## 
## data:  managerial_change_data_eng$manager_new_avg_pts and managerial_change_data_eng$manager_prev_avg_pts
## t = 6.1579, df = 448.74, p-value = 8.195e-10
## alternative hypothesis: true difference in means is greater than 0
## 95 percent confidence interval:
##  0.2637576       Inf
## sample estimates:
## mean of x mean of y 
## 1.1711257 0.8109649
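The 38-game extrapolations quoted for both leagues can be checked with a couple of lines of arithmetic, using the paired t-test estimates printed above. The variable names are mine, and the calculation assumes the five-game rate would hold for a whole season, which is generous given that the bump is supposed to be temporary.

```r
games_per_season <- 38

# Mean per-game differences taken from the paired t-test output above
mean_diff <- c(brazil = 0.3707473, england = 0.3601608)

# Extra points if that per-game difference held for a full season
extra_points <- mean_diff * games_per_season
round(extra_points, 1)
#  brazil england
#    14.1    13.7
```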

Conclusion

So the new manager bump is a thing, or rather, to slightly channel Charles Babbage, a change of manager often leads to an improvement in results. Chalk one more up for the real football men in the battle against the evil number wizards.

You can find everything on GitHub.