All of the 2017 regular season games have been played and we’re finally ready for the real thing.

By now you may have found your best parameters. To use them to make predictions for 2017, you’ll need to read in two new files:

  1. 2017_Final_CompactResults.csv: regular season results for 2017. We’ll need to add these to our regular season results from previous seasons.

  2. SampleSubmission_5050Benchmark.csv: every game that could be played in the 2017 tournament.

Here’s some code to do this:

library(dplyr); library(tidyr)


reg2017 <- read.csv('/home/rstudioshared/shared_files/data/2017_Final_CompactResults.csv')
reg <- read.csv('/home/rstudioshared/shared_files/data/RegularSeasonCompactResults.csv')

reg <- rbind(reg,reg2017)

sub <- read.csv('/home/rstudioshared/shared_files/data/SampleSubmission_5050Benchmark.csv')
colnames(sub) <- tolower(colnames(sub))


reg1 <- reg %>% rename(team = Wteam, opp.team=Lteam, score=Wscore, opp.score=Lscore) %>% mutate(win=1, loc= (Wloc=="H") - (Wloc=="A"))
reg2 <- reg %>% rename(team= Lteam, opp.team=Wteam, score=Lscore, opp.score=Wscore) %>% mutate(win=0, loc= (Wloc=="A") - (Wloc=="H"))
reg <- rbind(reg1, reg2)
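
To see what the last three lines do, here is a small sketch that runs the same reshaping on a made-up two-game table. The scores and the choice of games are invented for illustration; only the column names are taken from the code above. Each game ends up as two rows, one from each team's point of view, and loc is 1 for a home game, -1 for an away game and 0 for a neutral site.

```r
library(dplyr)

# Made-up results table; only the column names mirror the real file.
toy <- data.frame(
  Season = c(2017, 2017),
  Wteam  = c(1268, 1462),
  Wscore = c(70, 65),
  Lteam  = c(1462, 1181),
  Lscore = c(60, 64),
  Wloc   = c("H", "N"),
  stringsAsFactors = FALSE
)

# Winner's point of view: win = 1, loc relative to the winner
toy1 <- toy %>%
  rename(team = Wteam, opp.team = Lteam, score = Wscore, opp.score = Lscore) %>%
  mutate(win = 1, loc = (Wloc == "H") - (Wloc == "A"))

# Loser's point of view: win = 0, loc flipped
toy2 <- toy %>%
  rename(team = Lteam, opp.team = Wteam, score = Lscore, opp.score = Wscore) %>%
  mutate(win = 0, loc = (Wloc == "A") - (Wloc == "H"))

# rbind matches data frame columns by name, so the different column order is fine
toy_long <- rbind(toy1, toy2)
toy_long[, c("team", "opp.team", "score", "opp.score", "win", "loc")]
```

The home win shows up with loc = 1 for the winner and loc = -1 for the loser, while the neutral-site game has loc = 0 in both rows.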

Here’s a function to calculate ELO ratings once you’ve settled on a set of parameters:

calcELO <- function(K=20, Kmargin=20, R=0.1, start.season=2012, end.season=2017, gw.power=0){
  # Rate the first season using updateELO's default starting ratings
  f <- updateELO(reg %>% filter(Season==start.season), K=K, Kmargin=Kmargin, gw.power=gw.power)
  if(length(start.season:end.season)>1){
    for (i in seq_along((start.season:end.season)[-1])){
      # Regress the previous season's ratings, then carry them into the next season
      f <- RegressElos(f, R)
      f <- updateELO(reg %>% filter(Season==start.season+i), K=K, Kmargin=Kmargin, gw.power=gw.power, start.elos=f)
    }
  }
  return(f)
}
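
If the loop is hard to follow, this sketch may help. The first part just shows which seasons the loop visits after the starting season. The second part is a guess at what a between-season regression step could look like: regress_sketch is a hypothetical stand-in, not the RegressElos function used above, and it assumes ratings are pulled a fraction R of the way back toward a mean of 1500.

```r
# Which seasons does the loop visit after the first one?
start.season <- 2009
end.season   <- 2017
later.seasons <- start.season + seq_along((start.season:end.season)[-1])
later.seasons
# 2010 2011 2012 2013 2014 2015 2016 2017

# Hypothetical stand-in for the regression step (an assumption, not RegressElos itself):
# move every rating a fraction R of the way back toward mean.elo.
regress_sketch <- function(elo, R = 0.1, mean.elo = 1500) {
  elo + R * (mean.elo - elo)
}

regress_sketch(c(1700, 1500, 1400), R = 0.1)
# 1680 1500 1410
```

With R = 0.1, a team that finished 200 points above the mean starts the next season 180 points above it.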

And here it is in action. The second line is just a little cleanup so that you can join the result with team names and with the submission file.

final.elos <- calcELO(K=35, Kmargin=15, R=0.1, start.season=2009, end.season=2017, gw.power=0.5)
final.elos <- final.elos %>% ungroup() %>% mutate(team = as.character(team)) %>% rename(elo=elo.end)

Now, let’s take a look at which teams you gave the highest ratings:

teams <- read.csv('/home/rstudioshared/shared_files/data/teams.csv')
colnames(teams) <- c("team", "team_name")
teams <- teams %>% mutate(team = as.character(team))

final.elos <- left_join(final.elos, teams)
## Joining, by = "team"
View(final.elos)
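
View() shows the table in whatever order it happens to be in, so it can help to sort it first. Here is a small sketch on a made-up table. The two ratings for Maryland and Xavier are the ones that appear in the example output later in this post; the third row and its team number 9999 are placeholders I invented.

```r
library(dplyr)

# Toy ratings table; only the column names mirror final.elos.
toy.elos <- data.frame(
  team = c("1462", "9999", "1268"),
  elo  = c(1732.224, 1500, 1754.837),
  stringsAsFactors = FALSE
)
toy.teams <- data.frame(
  team      = c("1268", "1462", "9999"),
  team_name = c("Maryland", "Xavier", "Placeholder U"),
  stringsAsFactors = FALSE
)

# Attach names, then list the highest-rated teams first
top.teams <- toy.elos %>%
  left_join(toy.teams, by = "team") %>%
  arrange(desc(elo)) %>%
  head(10)
top.teams
```

The same arrange(desc(elo)) %>% head(10) should work on the real final.elos, assuming the rating column is called elo as in the cleanup step above.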

Finally, we can use this to make a submission file:

  sub <- sub[,c("id", "pred")]  %>% separate(id, into=c("year", "team", "opp.team"), sep="_", remove=FALSE)
  sub <- left_join(sub, final.elos, by="team")
  sub <- left_join(sub, final.elos, by=c("opp.team" = "team"))
  sub <- sub %>% mutate(pred = Ewins(elo.x, elo.y, loc=0, HFA=100))
  write.csv(sub %>% select(id, pred), 'submission.csv', row.names=FALSE)
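
In case it's useful to see where the predictions come from, here is a sketch of the standard Elo expected-score formula. expected_win_sketch is a hypothetical stand-in: I'm assuming Ewins works roughly like this, with loc equal to 1 at home, -1 away and 0 at a neutral site, and HFA measured in rating points. The ratings in the example are the Maryland and Xavier values from the output further down in this post.

```r
# Hypothetical stand-in for Ewins(): the standard Elo expected-score formula.
# Assumption: loc is 1 (home), -1 (away) or 0 (neutral); HFA is in rating points.
expected_win_sketch <- function(elo, opp.elo, loc = 0, HFA = 100) {
  1 / (1 + 10^(-(elo - opp.elo + HFA * loc) / 400))
}

# Maryland (1754.837) against Xavier (1732.224) at a neutral site
expected_win_sketch(1754.837, 1732.224, loc = 0, HFA = 100)
# approximately 0.5325

# The two sides of the same game always add up to 1
expected_win_sketch(1754.837, 1732.224) + expected_win_sketch(1732.224, 1754.837)
```

Under this formula, setting loc = 0 means the HFA value has no effect on the prediction, which fits tournament games played at neutral sites.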

One More Thing: Prediction-Alterations

1. Picking One Game

Let’s say that I think the Maryland/Xavier first round game is a toss-up (just for example), so I want to pick a different winner in each of my two brackets: a prediction of 1 in one and 0 in the other.

First, I can find the team numbers for Maryland and Xavier:

teams %>% filter(team_name=="Xavier")
teams %>% filter(team_name=="Maryland")

Next, I find this game in my sub file:

sub %>% filter(id == "2017_1462_1268" | id == "2017_1268_1462")
##               id year team opp.team      pred    elo.x team_name.x
## 1 2017_1268_1462 2017 1268     1462 0.5324971 1754.837    Maryland
##      elo.y team_name.y
## 1 1732.224      Xavier

… and then I can change it. In this example I’ll change the prediction to 1 to make Maryland a lock. In my other bracket, I could change it to 0 instead and predict that Xavier is a lock:

sub <- sub %>% mutate(pred = ifelse(id == "2017_1268_1462", 1, pred))

After changing that game prediction, I can create the submission file just as I would have otherwise:

write.csv(sub %>% select(id, pred), 'maryland_beats_xavier.csv', row.names=FALSE)
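
If I were doing this for several games, I might wrap the steps in a small helper so I don't have to think about which team is listed first. This is only a sketch: lock_game is a hypothetical function, and it assumes that ids look like year_team_opp.team with the lower team number first (which is how the Maryland/Xavier game showed up above) and that pred is the chance that the first team listed wins. The toy table and team number 9999 are made up.

```r
# Hypothetical helper: force one game to a certain result.
# Assumption: ids are "year_team_opp.team" with the lower team number first,
# and pred is the probability that the first team listed wins.
lock_game <- function(sub, year, winner, loser) {
  low  <- min(winner, loser)
  high <- max(winner, loser)
  game.id <- paste(year, low, high, sep = "_")
  if (!game.id %in% sub$id) stop("game not found: ", game.id)
  sub$pred[sub$id == game.id] <- if (winner == low) 1 else 0
  sub
}

# Toy submission table; the second row is a placeholder game.
toy.sub <- data.frame(
  id   = c("2017_1268_1462", "2017_1268_9999"),
  pred = c(0.5324971, 0.5),
  stringsAsFactors = FALSE
)

maryland.wins <- lock_game(toy.sub, 2017, winner = 1268, loser = 1462)
xavier.wins   <- lock_game(toy.sub, 2017, winner = 1462, loser = 1268)
maryland.wins
xavier.wins
```

Only the Maryland/Xavier row changes: it becomes 1 in maryland.wins and 0 in xavier.wins, and the other row keeps its original prediction.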

2. Picking One Team to Win it All

Let’s say that I actually want to bank on Maryland winning it all; that is, I’m going to assume that they’re a lock to win all of their games. I can do that by changing my prediction to 1 in every game where Maryland is the first team listed and to 0 in every game where Maryland is the second team listed.

sub <- sub %>% mutate(pred = ifelse(team == "1268", 1, pred))
sub <- sub %>% mutate(pred = ifelse(opp.team == "1268", 0, pred))

And then I create my submission file as always:

write.csv(sub %>% select(id, pred), 'crazy_for_maryland.csv', row.names=FALSE)
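
Before submitting, it's worth checking that only Maryland's games were changed. Here is a sketch of that check on a made-up table; the probabilities are invented and team number 1100 is a placeholder, so treat this as an illustration rather than real output.

```r
library(dplyr)

# Toy submission table with made-up probabilities; 1100 is a placeholder team.
toy.sub <- data.frame(
  team     = c("1268", "1100", "1100"),
  opp.team = c("1462", "1268", "1462"),
  pred     = c(0.53, 0.40, 0.45),
  stringsAsFactors = FALSE
) %>%
  mutate(id = paste("2017", team, opp.team, sep = "_"))

lock.team <- "1268"  # Maryland

# Same two steps as above: 1 when the team is listed first, 0 when listed second
locked <- toy.sub %>%
  mutate(pred = ifelse(team == lock.team, 1, pred)) %>%
  mutate(pred = ifelse(opp.team == lock.team, 0, pred))

# Every game involving the locked team should now be 0 or 1 ...
locked %>% filter(team == lock.team | opp.team == lock.team)
# ... and every other game should be untouched
locked %>% filter(team != lock.team, opp.team != lock.team)
```

In the toy table, the two Maryland games become 1 and 0, and the game between the other two teams keeps its 0.45.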