This document summarises the outcomes for the most recent set of forecasts of English football matches, following on from recent weeks. It covers the forecasts of matches played over the last week.

Two sets of forecasts are constructed. The first uses a simple linear regression (OLS) for a dependent variable taking the values 0 for an away win, 0.5 for a draw, and 1 for a home win. This method yields only a single number per match, which can be interpreted as a probability of an outcome occurring. The second, an ordered logit, treats the outcomes as ordered, from away win, to draw, to home win, and estimates a probability for each outcome.
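
To make the ordered logit's three probabilities concrete, the sketch below shows how one linear index and two cutpoints map to the away-win, draw and home-win probabilities. The function name `ordered_logit_probs`, the index value `xb` and the cutpoints `c1` and `c2` are made-up illustrative assumptions, not estimates from the forecasting model.

```r
# Ordered logit: P(outcome <= k) = plogis(cutpoint_k - xb), with c1 < c2.
# xb, c1 and c2 are hypothetical values chosen for illustration only.
ordered_logit_probs <- function(xb, c1, c2) {
  p_away <- plogis(c1 - xb)              # P(away win)
  p_draw <- plogis(c2 - xb) - p_away     # P(draw)
  p_home <- 1 - plogis(c2 - xb)          # P(home win)
  c(Pa = p_away, Pd = p_draw, Ph = p_home)
}

probs <- ordered_logit_probs(xb = 0.3, c1 = -0.6, c2 = 0.6)
print(round(probs, 3))
```

The three probabilities sum to one by construction, and a larger index shifts probability from the away win towards the home win.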

The purpose of this document is to evaluate these forecasts, and begin to form a longer-term narrative about them.

Outcomes

To consider outcomes, we must load the outcomes file for the matches forecast over the weekend:

# Forecast weeks so far; the most recent one is evaluated in detail below
dates <- c("2015-01-30","2015-02-06","2015-02-13","2015-02-20","2015-02-27","2015-03-06",
           "2015-03-13","2015-03-20")
date.1 <- dates[NROW(dates)]
# Final results and forecasts for the most recent week
recent.forecast.outcomes <- read.csv(paste("forecast_outcomes_",date.1,".csv",sep=""),stringsAsFactors=FALSE)
forecast.matches <- read.csv(paste("forecasts_",date.1,".csv",sep=""))
# Keep only matches that have an OLS forecast
forecast.matches <- forecast.matches[!is.na(forecast.matches$outcome),]
# Join forecasts to results; "outcome" is in both, giving outcome.forc and outcome.final
forecast.outcomes <- merge(forecast.matches[,c("match_id","outcome","Ph","Pd","Pa")],
                           recent.forecast.outcomes,by=c("match_id"),
                           suffixes=c(".forc",".final"))
# Drop matches with no final result recorded
forecast.outcomes <- forecast.outcomes[!is.na(forecast.outcomes$outcome.final),]

# Stack forecasts and results for every forecast week so far
all.forecast.outcomes <- data.frame()
loc <- "/home/readejj/Dropbox/Teaching/Reading/ec313/2015/Football-forecasts/"
for(i in dates) {
  # Results for week i, tagged with the forecast week
  temp.0 <- read.csv(paste(loc,"forecast_outcomes_",i,".csv",sep=""),stringsAsFactors=FALSE)
  temp.0$X <- NULL
  temp.0$forc.week <- i
  # Forecasts for week i, keeping only matches that have an OLS forecast
  temp.1 <- read.csv(paste(loc,"forecasts_",i,".csv",sep=""))
  temp.1$X <- NULL
  temp.1 <- temp.1[!is.na(temp.1$outcome),]
  # Files without ordered logit probabilities or a tier column get NA placeholders
  if(!("Ph" %in% colnames(temp.1))) {
    temp.1$Ph <- NA
    temp.1$Pd <- NA
    temp.1$Pa <- NA
  }
  if(!("tier" %in% colnames(temp.0))) {
    temp.0$tier <- NA
  }
  # Join forecasts to results and keep matches with a final result
  temp.2 <- merge(temp.1[,c("match_id","outcome","Ph","Pd","Pa")],
                  temp.0[,c("match_id","date","division","team1",
                            "goals1","goals2","team2","outcome",
                            "season","tier","forc.week")],
                  by=c("match_id"),suffixes=c(".forc",".final"))
  all.forecast.outcomes <- rbind(temp.2[!is.na(temp.2$outcome.final),],all.forecast.outcomes)
}
# Plot one division's forecasts (OLS and ordered logit) against the actual results
outcomeplot <- function(div) {
  matches <- forecast.outcomes[forecast.outcomes$division==div,]
  matches$id <- seq_len(NROW(matches))
  par(mar=c(9,4,4,5)+.1)  # extra bottom margin for the rotated fixture labels
  # OLS forecast (hollow black circles), then ordered logit probabilities (filled markers)
  plot(matches$id,matches$outcome.forc,xaxt="n",xlab="",ylim=range(0,1),
       main=paste("Forecasts of Weekend ",div," Matches",sep=""),
       ylab="Probability of Outcome")
  lines(matches$id,matches$Ph,col=2,pch=15,type="p")
  lines(matches$id,matches$Pd,col=3,pch=16,type="p")
  lines(matches$id,matches$Pa,col=4,pch=17,type="p")
  legend("topleft",ncol=4,pch=c(1,15,16,17),col=c(1:4),
         legend=c("OLS","OL (home)","OL (draw)","OL (away)"),bty="n")
  # Reference lines at 0.4, 0.5, 0.6 and 0.7
  abline(h=0.5,lty=2)
  abline(h=0.6,lty=3)
  abline(h=0.7,lty=2)
  abline(h=0.4,lty=3)
  # Mark each result with a hollow marker and join it to the probability forecast for that outcome
  for(i in seq_len(NROW(matches))) {
    if(matches$outcome.final[i]==1) {
      lines(matches$id[i],matches$outcome.final[i],col=2,type="p",pch=0)
      lines(rep(i,2),c(matches$Ph[i],matches$outcome.final[i]),type="l",lty=2,col="red")
    } else if (matches$outcome.final[i]==0.5) {
      lines(matches$id[i],matches$outcome.final[i],col=3,type="p",pch=1)
      lines(rep(i,2),c(matches$Pd[i],matches$outcome.final[i]),type="l",lty=2,col="green")
    } else {
      lines(matches$id[i],matches$outcome.final[i],col=4,type="p",pch=2)
      lines(rep(i,2),c(matches$Pa[i],matches$outcome.final[i]),type="l",lty=2,col="blue")
    }
  }
  # Label the x axis with the fixtures
  axis(1,at=matches$id,labels=paste(matches$team1,matches$team2,sep=" v "),las=2,cex.axis=0.65)
}

We also load previous weeks’ forecasts and outcomes so that we can begin to identify trends over time. This week was particularly full of surprise results, but that does not necessarily mean our forecast model needs to be amended; rather, it reflects that football is intrinsically a very uncertain game. This, of course, is not to say that the model cannot be improved upon.

First, our Premier League forecasts:

outcomeplot("English Premier")

Outcomes are drawn as hollow versions of the markers used for their predicted probabilities: red hollow squares are home wins, plotted at 1; green hollow circles are draws, plotted at 0.5; and blue hollow triangles are away wins, plotted at 0. Each is joined by a dashed line to the probability forecast for that outcome. The length of that line does not quite equal the actual forecast error, since each outcome is scored as 1 if it happened, but the plot nonetheless illustrates the weekend’s outcomes.
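
As a small worked example of that last point, the sketch below compares the plotted gap with the indicator-based forecast error for one draw, Wigan v Bolton, using the rounded draw probability from the table of forecasts at the end of this document. The variable names are illustrative.

```r
# Wigan v Bolton ended in a draw; the ordered logit draw probability was 0.275
Pd <- 0.275
outcome.final <- 0.5                               # draws are coded 0.5 on the plot's axis
plotted.gap <- outcome.final - Pd                  # length of the dashed line in the plot
error.d <- as.numeric(outcome.final == 0.5) - Pd   # error when the draw is scored as 1
print(c(plotted.gap = plotted.gap, error.d = error.d))
```

The plotted gap is 0.225, whereas the forecast error for the draw is 0.725, so the plot understates errors for draws.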

Next, our Championship forecasts:

outcomeplot("English Championship")

Next, our League One forecasts:

outcomeplot("English League One")

Next, our League Two forecasts:

outcomeplot("English League Two")

Next, our Football Conference forecasts:

outcomeplot("Football Conference")

Tabular Version

It is also important to evaluate the forecast errors numerically.

For the OLS model:

forecast.outcomes$error <- forecast.outcomes$outcome.final - forecast.outcomes$outcome.forc
forecast.outcomes$error2 <- forecast.outcomes$error^2
forecast.outcomes$aerror <- abs(forecast.outcomes$error)
#summary(forecast.outcomes[forecast.outcomes$tier<=5,c("error","error2","aerror")])
summary(forecast.outcomes[,c("error","error2","aerror")])
##      error              error2             aerror       
##  Min.   :-0.81213   Min.   :0.000176   Min.   :0.01327  
##  1st Qu.:-0.39298   1st Qu.:0.050603   1st Qu.:0.22494  
##  Median : 0.01645   Median :0.122864   Median :0.35052  
##  Mean   :-0.01067   Mean   :0.167176   Mean   :0.36176  
##  3rd Qu.: 0.34147   3rd Qu.:0.239396   3rd Qu.:0.48928  
##  Max.   : 0.55940   Max.   :0.659555   Max.   :0.81213
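
To see what these three measures mean for individual matches, the sketch below recomputes them for two rows taken from the table of forecasts at the end of this document (values rounded to three decimal places). The data frame `ex` is a toy construction for illustration only.

```r
# Two rows from the table of forecasts: one home win, one away win
ex <- data.frame(match = c("Wolves v Derby", "Watford v Ipswich"),
                 outcome.forc = c(0.668, 0.812),
                 outcome.final = c(1, 0))
ex$error  <- ex$outcome.final - ex$outcome.forc   # signed forecast error
ex$error2 <- ex$error^2                           # squared forecast error
ex$aerror <- abs(ex$error)                        # absolute forecast error
print(ex)
```

The Watford row is the week's largest miss: its error of about -0.812 corresponds to the minimum of the error column and the maximum of the aerror column in the summary above.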

For the ordered logit we must consider the three outcomes separately, computing an error for each of the home win, the draw and the away win:

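# Home-win error: outcome.final is coded 1/0.5/0, so a draw enters here as 0.5 - Ph
# (the draw and away-win errors below use 0/1 indicators)
forecast.outcomes$error.h <- forecast.outcomes$outcome.final - forecast.outcomes$Ph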
forecast.outcomes$error.d <- as.numeric(forecast.outcomes$outcome.final==0.5) - forecast.outcomes$Pd
forecast.outcomes$error.a <- as.numeric(forecast.outcomes$outcome.final==0) - forecast.outcomes$Pa
forecast.outcomes$error.h2 <- forecast.outcomes$error.h^2
forecast.outcomes$error.d2 <- forecast.outcomes$error.d^2
forecast.outcomes$error.a2 <- forecast.outcomes$error.a^2
forecast.outcomes$aerror.h <- abs(forecast.outcomes$error.h)
forecast.outcomes$aerror.d <- abs(forecast.outcomes$error.d)
forecast.outcomes$aerror.a <- abs(forecast.outcomes$error.a)
#summary(forecast.outcomes[forecast.outcomes$tier<=5,c("error","error2","aerror")])
summary(forecast.outcomes[,c("error.h","error.d","error.a")])
##     error.h           error.d            error.a        
##  Min.   :-0.7130   Min.   :-0.28876   Min.   :-0.61849  
##  1st Qu.:-0.2442   1st Qu.:-0.28159   1st Qu.:-0.26300  
##  Median : 0.1746   Median :-0.24986   Median :-0.18951  
##  Mean   : 0.1204   Mean   :-0.02964   Mean   : 0.02119  
##  3rd Qu.: 0.4687   3rd Qu.:-0.17142   3rd Qu.: 0.51228  
##  Max.   : 0.7147   Max.   : 0.81751   Max.   : 0.89081

Considering just the mean errors, the home-win forecasts show by far the largest bias: the mean error is 0.120 for the home win, against -0.030 for the draw and 0.021 for the away win. A positive forecast error means that the event occurred more often than the model predicted, so home wins were under-predicted. Until we consider more forecasts, it is difficult to say whether this is simply an artifact of one particular week.
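
Note that the home-win error above subtracts Ph from the 1/0.5/0 outcome coding, so a draw counts as half a home win. An alternative, which is not the calculation used in this document, scores all three outcomes with 0/1 indicators. The sketch below does this for one draw (Wigan v Bolton, rounded probabilities from the table of forecasts at the end of this document); the variable names are illustrative.

```r
# Ordered logit probabilities for Wigan v Bolton: Ph, Pd, Pa
p <- c(home = 0.468, draw = 0.275, away = 0.257)
outcome.final <- 0.5                     # 1 = home win, 0.5 = draw, 0 = away win
# 0/1 indicator for each outcome, in the same order as p
happened <- as.numeric(outcome.final == c(1, 0.5, 0))
errors <- happened - p
print(round(errors, 3))                  # home, draw, away
print(sum(errors))                       # close to zero: the probabilities sum to one
brier <- sum(errors^2)                   # multi-category Brier score for this match
print(brier)
```

With indicator errors the three errors for a match cancel, so the mean errors across outcomes would sum to zero, which makes the relative bias of each outcome easier to compare.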

We can consider also, by division, forecast errors for our linear regression model:

library(knitr)
aggs <- aggregate(forecast.outcomes[,c("error","error2","aerror")],
          by=list(forecast.outcomes$division),FUN=mean,na.rm=T)
kable(aggs[c(4,1,2,3,5),])
|   | Group.1 | error | error2 | aerror |
|:--|:--------|------:|-------:|-------:|
| 4 | English League One | 0.1322228 | 0.1725146 | 0.3790621 |
| 1 | Conference North | 0.2917426 | 0.0892911 | 0.2917426 |
| 2 | Conference South | -0.4457166 | 0.1986633 | 0.4457166 |
| 3 | English Championship | -0.0800856 | 0.2079279 | 0.4119031 |
| 5 | English League Two | -0.0530585 | 0.1516130 | 0.3331959 |

The error column is the mean forecast error, the error2 column is the mean squared forecast error, and the aerror column is the mean absolute forecast error.
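
Some of these means rest on very few matches: the table of forecasts at the end of this document lists only two Conference North fixtures and one Conference South fixture. As a sketch of how the number of matches could be reported next to each mean, the toy example below rebuilds those three rows by hand (rounded errors) and adds a count column `n`, which is an illustrative addition rather than part of the original tables.

```r
# Toy data: the Conference North and Conference South rows from the table of forecasts
toy <- data.frame(division = c("Conference North", "Conference North", "Conference South"),
                  error = c(0.356, 0.227, -0.446))
toy$error2 <- toy$error^2
toy$aerror <- abs(toy$error)
aggs <- aggregate(toy[, c("error", "error2", "aerror")],
                  by = list(division = toy$division), FUN = mean, na.rm = TRUE)
# Number of matches behind each mean
aggs$n <- as.vector(table(toy$division)[as.character(aggs$division)])
print(aggs)
```

With a single match in a division, the mean absolute error is simply the size of that one forecast error, so such rows should be read with caution.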

We do the same for each outcome for the ordered logit model:

library(knitr)
aggs <- aggregate(forecast.outcomes[,c("error.h","error.h2","aerror.h")],
          by=list(forecast.outcomes$division),FUN=mean,na.rm=T)
colnames(aggs) <- gsub("Group.1","Home win",colnames(aggs))
kable(aggs[c(4,1,2,3,5),])
|   | Home win | error.h | error.h2 | aerror.h |
|:--|:---------|--------:|---------:|---------:|
| 4 | English League One | 0.2708243 | 0.2292139 | 0.4302124 |
| 1 | Conference North | 0.4106581 | 0.1743430 | 0.4106581 |
| 2 | Conference South | -0.2865024 | 0.0820836 | 0.2865024 |
| 3 | English Championship | 0.0454024 | 0.2058031 | 0.4169893 |
| 5 | English League Two | 0.0886337 | 0.1584265 | 0.3309699 |
aggs <- aggregate(forecast.outcomes[,c("error.d","error.d2","aerror.d")],
          by=list(forecast.outcomes$division),FUN=mean,na.rm=T)
colnames(aggs) <- gsub("Group.1","Draw",colnames(aggs))
kable(aggs[c(4,1,2,3,5),])
|   | Draw | error.d | error.d2 | aerror.d |
|:--|:-----|--------:|---------:|---------:|
| 4 | English League One | -0.0195968 | 0.1969545 | 0.3932748 |
| 1 | Conference North | -0.2322573 | 0.0548560 | 0.2322573 |
| 2 | Conference South | -0.2821696 | 0.0796197 | 0.2821696 |
| 3 | English Championship | 0.0048305 | 0.1980514 | 0.3817344 |
| 5 | English League Two | 0.0668038 | 0.2246139 | 0.4196360 |
aggs <- aggregate(forecast.outcomes[,c("error.a","error.a2","aerror.a")],
          by=list(forecast.outcomes$division),FUN=mean,na.rm=T)
colnames(aggs) <- gsub("Group.1","Away win",colnames(aggs))
kable(aggs[c(4,1,2,3,5),])
|   | Away win | error.a | error.a2 | aerror.a |
|:--|:---------|--------:|---------:|---------:|
| 4 | English League One | -0.1262275 | 0.1533157 | 0.3521596 |
| 1 | Conference North | -0.1784007 | 0.0338797 | 0.1784007 |
| 2 | Conference South | 0.5686719 | 0.3233878 | 0.5686719 |
| 3 | English Championship | 0.0747670 | 0.2522646 | 0.4251404 |
| 5 | English League Two | 0.0112292 | 0.1917875 | 0.3960793 |

We can also look at errors across weeks:

all.forecast.outcomes$error.h <- all.forecast.outcomes$outcome.final - all.forecast.outcomes$Ph
all.forecast.outcomes$error.d <- as.numeric(all.forecast.outcomes$outcome.final==0.5) - all.forecast.outcomes$Pd
all.forecast.outcomes$error.a <- as.numeric(all.forecast.outcomes$outcome.final==0) - all.forecast.outcomes$Pa
all.forecast.outcomes$error.h2 <- all.forecast.outcomes$error.h^2
all.forecast.outcomes$error.d2 <- all.forecast.outcomes$error.d^2
all.forecast.outcomes$error.a2 <- all.forecast.outcomes$error.a^2
all.forecast.outcomes$aerror.h <- abs(all.forecast.outcomes$error.h)
all.forecast.outcomes$aerror.d <- abs(all.forecast.outcomes$error.d)
all.forecast.outcomes$aerror.a <- abs(all.forecast.outcomes$error.a)

aggs.h <- aggregate(all.forecast.outcomes[,c("error.h","error.h2","aerror.h")],
          by=list(all.forecast.outcomes$forc.week),FUN=mean,na.rm=T)
aggs.d <- aggregate(all.forecast.outcomes[,c("error.d","error.d2","aerror.d")],
          by=list(all.forecast.outcomes$forc.week),FUN=mean,na.rm=T)
aggs.a <- aggregate(all.forecast.outcomes[,c("error.a","error.a2","aerror.a")],
          by=list(all.forecast.outcomes$forc.week),FUN=mean,na.rm=T)
plot(as.Date(aggs.h$Group.1),aggs.h$error.h,type="o",main="Forecast Errors Each Week",
     ylab="Forecast Error",xlab="Date",
     ylim=range(c(aggs.h$error.h,aggs.d$error.d,aggs.a$error.a),na.rm=T),col="red")
lines(as.Date(aggs.d$Group.1),aggs.d$error.d,type="o",col="green")
lines(as.Date(aggs.a$Group.1),aggs.a$error.a,type="o",col="blue")

plot(as.Date(aggs.h$Group.1),aggs.h$error.h2,type="o",main="Squared Forecast Errors Each Week",
     ylab="Squared Forecast Error",xlab="Date",
     ylim=range(c(aggs.h$error.h2,aggs.d$error.d2,aggs.a$error.a2),na.rm=T),col="red")
lines(as.Date(aggs.d$Group.1),aggs.d$error.d2,type="o",col="green")
lines(as.Date(aggs.a$Group.1),aggs.a$error.a2,type="o",col="blue")

plot(as.Date(aggs.h$Group.1),aggs.h$aerror.h,type="o",main="Absolute Forecast Errors Each Week",
     ylab="Absolute Forecast Error",xlab="Date",
     ylim=range(c(aggs.h$aerror.h,aggs.d$aerror.d,aggs.a$aerror.a),na.rm=T),col="red")
lines(as.Date(aggs.d$Group.1),aggs.d$aerror.d,type="o",col="green")
lines(as.Date(aggs.a$Group.1),aggs.a$aerror.a,type="o",col="blue")
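
The ordered logit error columns are built twice in this document, once for the latest week and once for all weeks. One possible way to avoid the repetition is a small helper such as the one sketched below. The name `add_ol_errors` is hypothetical, and the helper keeps the same definitions and column names as the code above, including the home-win error based on the 1/0.5/0 outcome coding.

```r
# Hypothetical helper: adds signed, squared and absolute errors for each outcome.
# Expects columns outcome.final, Ph, Pd and Pa, as in the data frames above.
add_ol_errors <- function(df) {
  df$error.h <- df$outcome.final - df$Ph
  df$error.d <- as.numeric(df$outcome.final == 0.5) - df$Pd
  df$error.a <- as.numeric(df$outcome.final == 0) - df$Pa
  for (k in c("h", "d", "a")) {
    e <- df[[paste0("error.", k)]]
    df[[paste0("error.", k, "2")]] <- e^2     # e.g. error.h2
    df[[paste0("aerror.", k)]] <- abs(e)      # e.g. aerror.h
  }
  df
}

# Toy input: Wolves v Derby (home win) and Watford v Ipswich (away win), rounded
toy <- data.frame(outcome.final = c(1, 0),
                  Ph = c(0.543, 0.713), Pd = c(0.253, 0.178), Pa = c(0.204, 0.109))
toy <- add_ol_errors(toy)
print(round(toy, 3))
```

Under this sketch the same call could be applied to both forecast.outcomes and all.forecast.outcomes.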

Finally, we list all the forecasts again with outcomes:

kable(forecast.outcomes[order(forecast.outcomes$date,forecast.outcomes$division),
                       c("date","division","team1","goals1","goals2","team2",
                         "outcome.forc","Ph","Pd","Pa","outcome.final","error","error2","aerror")],digits=3)
|   | date | division | team1 | goals1 | goals2 | team2 | outcome.forc | Ph | Pd | Pa | outcome.final | error | error2 | aerror |
|:--|:-----|:---------|:------|-------:|-------:|:------|-------------:|---:|---:|---:|--------------:|------:|-------:|-------:|
| 11 | 2015-03-20 | English Championship | Wolves | 2 | 0 | Derby | 0.668 | 0.543 | 0.253 | 0.204 | 1.0 | 0.332 | 0.110 | 0.332 |
| 60 | 2015-03-20 | Football Conference | Bristol R | 3 | 1 | Aldershot | 0.765 | 0.659 | 0.205 | 0.136 | 1.0 | 0.235 | 0.055 | 0.235 |
| 63 | 2015-03-21 | Conference North | Tamworth | 3 | 1 | Hednesford | 0.644 | 0.514 | 0.262 | 0.224 | 1.0 | 0.356 | 0.127 | 0.356 |
| 64 | 2015-03-21 | Conference North | Boston Utd | 2 | 1 | Gainsborough | 0.773 | 0.665 | 0.202 | 0.133 | 1.0 | 0.227 | 0.052 | 0.227 |
| 62 | 2015-03-21 | Conference South | Farnborough | 2 | 7 | Bath City | 0.446 | 0.287 | 0.282 | 0.431 | 0.0 | -0.446 | 0.199 | 0.446 |
| 12 | 2015-03-21 | English Championship | Bournemouth | 3 | 0 | Middlesbro | 0.673 | 0.552 | 0.250 | 0.198 | 1.0 | 0.327 | 0.107 | 0.327 |
| 13 | 2015-03-21 | English Championship | Huddersfield | 0 | 2 | Fulham | 0.506 | 0.352 | 0.289 | 0.359 | 0.0 | -0.506 | 0.256 | 0.506 |
| 14 | 2015-03-21 | English Championship | Rotherham | 2 | 3 | Sheff Wed | 0.490 | 0.333 | 0.288 | 0.379 | 0.0 | -0.490 | 0.240 | 0.490 |
| 15 | 2015-03-21 | English Championship | Blackburn | 0 | 1 | Brighton | 0.743 | 0.628 | 0.219 | 0.153 | 0.0 | -0.743 | 0.552 | 0.743 |
| 16 | 2015-03-21 | English Championship | Charlton | 3 | 2 | Reading | 0.710 | 0.596 | 0.233 | 0.171 | 1.0 | 0.290 | 0.084 | 0.290 |
| 17 | 2015-03-21 | English Championship | Watford | 0 | 1 | Ipswich | 0.812 | 0.713 | 0.178 | 0.109 | 0.0 | -0.812 | 0.660 | 0.812 |
| 18 | 2015-03-21 | English Championship | Cardiff | 2 | 0 | Birmingham | 0.554 | 0.407 | 0.286 | 0.307 | 1.0 | 0.446 | 0.199 | 0.446 |
| 19 | 2015-03-21 | English Championship | Wigan | 1 | 1 | Bolton | 0.606 | 0.468 | 0.275 | 0.257 | 0.5 | -0.106 | 0.011 | 0.106 |
| 20 | 2015-03-21 | English Championship | Norwich | 3 | 1 | Nottm Forest | 0.627 | 0.502 | 0.266 | 0.232 | 1.0 | 0.373 | 0.139 | 0.373 |
| 21 | 2015-03-21 | English Championship | Brentford | 2 | 2 | Millwall | 0.794 | 0.704 | 0.182 | 0.114 | 0.5 | -0.294 | 0.086 | 0.294 |
| 22 | 2015-03-21 | English Championship | Blackpool | 1 | 1 | Leeds | 0.277 | 0.158 | 0.223 | 0.618 | 0.5 | 0.223 | 0.050 | 0.223 |
| 23 | 2015-03-21 | English League One | Barnsley | 1 | 1 | Preston | 0.484 | 0.329 | 0.288 | 0.384 | 0.5 | 0.016 | 0.000 | 0.016 |
| 24 | 2015-03-21 | English League One | Sheff Utd | 1 | 0 | Port Vale | 0.634 | 0.503 | 0.266 | 0.232 | 1.0 | 0.366 | 0.134 | 0.366 |
| 26 | 2015-03-21 | English League One | Crawley | 1 | 0 | Leyton Orient | 0.512 | 0.355 | 0.289 | 0.357 | 1.0 | 0.488 | 0.238 | 0.488 |
| 27 | 2015-03-21 | English League One | Gillingham | 2 | 2 | Colchester | 0.778 | 0.671 | 0.199 | 0.130 | 0.5 | -0.278 | 0.077 | 0.278 |
| 28 | 2015-03-21 | English League One | Bradford | 2 | 2 | Fleetwood | 0.618 | 0.484 | 0.271 | 0.245 | 0.5 | -0.118 | 0.014 | 0.118 |
| 29 | 2015-03-21 | English League One | Crewe | 0 | 1 | Oldham | 0.581 | 0.436 | 0.281 | 0.283 | 0.0 | -0.581 | 0.337 | 0.581 |
| 30 | 2015-03-21 | English League One | Peterborough | 1 | 0 | Chesterfield | 0.649 | 0.518 | 0.261 | 0.221 | 1.0 | 0.351 | 0.123 | 0.351 |
| 31 | 2015-03-21 | English League One | Coventry | 1 | 3 | Doncaster | 0.504 | 0.350 | 0.289 | 0.362 | 0.0 | -0.504 | 0.254 | 0.504 |
| 32 | 2015-03-21 | English League One | Rochdale | 3 | 1 | Scunthorpe | 0.670 | 0.555 | 0.249 | 0.196 | 1.0 | 0.330 | 0.109 | 0.330 |
| 33 | 2015-03-21 | English League One | MK Dons | 4 | 1 | Notts Co | 0.441 | 0.302 | 0.285 | 0.413 | 1.0 | 0.559 | 0.313 | 0.559 |
| 34 | 2015-03-21 | English League Two | Plymouth | 0 | 0 | Newport Co | 0.529 | 0.379 | 0.288 | 0.333 | 0.5 | -0.029 | 0.001 | 0.029 |
| 35 | 2015-03-21 | English League Two | AFC W’bledon | 1 | 0 | Portsmouth | 0.481 | 0.322 | 0.287 | 0.390 | 1.0 | 0.519 | 0.270 | 0.519 |
| 36 | 2015-03-21 | English League Two | Shrewsbury | 2 | 0 | Oxford | 0.737 | 0.630 | 0.218 | 0.152 | 1.0 | 0.263 | 0.069 | 0.263 |
| 37 | 2015-03-21 | English League Two | Tranmere | 1 | 4 | Burton | 0.350 | 0.209 | 0.255 | 0.536 | 0.0 | -0.350 | 0.122 | 0.350 |
| 38 | 2015-03-21 | English League Two | Hartlepool | 1 | 0 | Mansfield | 0.447 | 0.287 | 0.282 | 0.431 | 1.0 | 0.553 | 0.306 | 0.553 |
| 39 | 2015-03-21 | English League Two | Cheltenham | 1 | 2 | Exeter | 0.478 | 0.321 | 0.287 | 0.391 | 0.0 | -0.478 | 0.228 | 0.478 |
| 40 | 2015-03-21 | English League Two | Southend | 0 | 0 | Cambridge U | 0.690 | 0.564 | 0.245 | 0.191 | 0.5 | -0.190 | 0.036 | 0.190 |
| 41 | 2015-03-21 | English League Two | Carlisle | 1 | 1 | Morecambe | 0.651 | 0.521 | 0.260 | 0.219 | 0.5 | -0.151 | 0.023 | 0.151 |
| 42 | 2015-03-21 | English League Two | Bury | 2 | 1 | Northampton | 0.668 | 0.538 | 0.255 | 0.207 | 1.0 | 0.332 | 0.110 | 0.332 |
| 44 | 2015-03-21 | English League Two | Accrington | 2 | 2 | York | 0.487 | 0.325 | 0.288 | 0.387 | 0.5 | 0.013 | 0.000 | 0.013 |
| 45 | 2015-03-21 | English League Two | Stevenage | 0 | 1 | Dag & Red | 0.674 | 0.552 | 0.250 | 0.198 | 0.0 | -0.674 | 0.455 | 0.674 |
| 3 | 2015-03-21 | English Premier | Stoke | 1 | 2 | C Palace | 0.619 | 0.482 | 0.271 | 0.247 | 0.0 | -0.619 | 0.383 | 0.619 |
| 4 | 2015-03-21 | English Premier | Newcastle | 1 | 2 | Arsenal | 0.301 | 0.173 | 0.234 | 0.593 | 0.0 | -0.301 | 0.090 | 0.301 |
| 5 | 2015-03-21 | English Premier | Southampton | 2 | 0 | Burnley | 0.779 | 0.685 | 0.192 | 0.123 | 1.0 | 0.221 | 0.049 | 0.221 |
| 6 | 2015-03-21 | English Premier | Tottenham | 4 | 3 | Leicester | 0.782 | 0.679 | 0.195 | 0.126 | 1.0 | 0.218 | 0.047 | 0.218 |
| 7 | 2015-03-21 | English Premier | Man City | 3 | 0 | West Brom | 0.689 | 0.587 | 0.237 | 0.177 | 1.0 | 0.311 | 0.097 | 0.311 |
| 8 | 2015-03-21 | English Premier | Aston Villa | 0 | 1 | Swansea | 0.578 | 0.434 | 0.282 | 0.285 | 0.0 | -0.578 | 0.334 | 0.578 |
| 10 | 2015-03-21 | English Premier | West Ham | 1 | 0 | Sunderland | 0.647 | 0.524 | 0.259 | 0.217 | 1.0 | 0.353 | 0.125 | 0.353 |
| 48 | 2015-03-21 | Football Conference | Wrexham | 1 | 1 | Lincoln | 0.701 | 0.584 | 0.238 | 0.178 | 0.5 | -0.201 | 0.040 | 0.201 |
| 50 | 2015-03-21 | Football Conference | Torquay | 2 | 1 | Kidderminster | 0.596 | 0.458 | 0.277 | 0.265 | 1.0 | 0.404 | 0.163 | 0.404 |
| 51 | 2015-03-21 | Football Conference | Barnet | 5 | 0 | Welling | 0.849 | 0.762 | 0.151 | 0.087 | 1.0 | 0.151 | 0.023 | 0.151 |
| 53 | 2015-03-21 | Football Conference | Macclesfield | 0 | 1 | Nuneaton | 0.737 | 0.643 | 0.212 | 0.145 | 0.0 | -0.737 | 0.544 | 0.737 |
| 54 | 2015-03-21 | Football Conference | Alfreton | 1 | 1 | Chester | 0.587 | 0.447 | 0.279 | 0.273 | 0.5 | -0.087 | 0.008 | 0.087 |
| 55 | 2015-03-21 | Football Conference | Dover | 1 | 0 | Gateshead | 0.541 | 0.393 | 0.287 | 0.320 | 1.0 | 0.459 | 0.211 | 0.459 |
| 56 | 2015-03-21 | Football Conference | Grimsby | 2 | 1 | Eastleigh | 0.635 | 0.503 | 0.266 | 0.231 | 1.0 | 0.365 | 0.133 | 0.365 |
| 57 | 2015-03-21 | Football Conference | Woking | 1 | 0 | Forest Green | 0.549 | 0.402 | 0.286 | 0.312 | 1.0 | 0.451 | 0.203 | 0.451 |
| 59 | 2015-03-21 | Football Conference | Southport | 2 | 0 | Dartford | 0.685 | 0.562 | 0.246 | 0.192 | 1.0 | 0.315 | 0.099 | 0.315 |
| 61 | 2015-03-21 | Football Conference | Altrincham | 0 | 0 | Halifax | 0.515 | 0.363 | 0.289 | 0.349 | 0.5 | -0.015 | 0.000 | 0.015 |
| 1 | 2015-03-22 | English Premier | QPR | 1 | 2 | Everton | 0.464 | 0.307 | 0.286 | 0.407 | 0.0 | -0.464 | 0.215 | 0.464 |
| 2 | 2015-03-22 | English Premier | Hull | 2 | 3 | Chelsea | 0.274 | 0.156 | 0.222 | 0.622 | 0.0 | -0.274 | 0.075 | 0.274 |
| 9 | 2015-03-22 | English Premier | Liverpool | 1 | 2 | Man Utd | 0.605 | 0.464 | 0.276 | 0.260 | 0.0 | -0.605 | 0.366 | 0.605 |
| 66 | 2015-03-22 | JP Trophy | Bristol C | 2 | 0 | Walsall | 0.828 | 0.737 | 0.165 | 0.098 | 1.0 | 0.172 | 0.030 | 0.172 |
| 25 | 2015-03-24 | English League One | Oldham | 3 | 0 | Rochdale | 0.442 | 0.285 | 0.282 | 0.433 | 1.0 | 0.558 | 0.311 | 0.558 |
| 65 | 2015-03-24 | English League One | Sheff Utd | 4 | 0 | Scunthorpe | 0.601 | 0.463 | 0.276 | 0.261 | 1.0 | 0.399 | 0.159 | 0.399 |
| 43 | 2015-03-24 | English League Two | Luton | 2 | 3 | Wycombe | 0.446 | 0.287 | 0.282 | 0.430 | 0.0 | -0.446 | 0.199 | 0.446 |
| 46 | 2015-03-24 | Football Conference | Braintree | 0 | 2 | Telford | 0.752 | 0.652 | 0.208 | 0.140 | 0.0 | -0.752 | 0.565 | 0.752 |
| 47 | 2015-03-24 | Football Conference | Alfreton | 0 | 0 | Dartford | 0.688 | 0.566 | 0.245 | 0.190 | 0.5 | -0.188 | 0.035 | 0.188 |
| 49 | 2015-03-24 | Football Conference | Dover | 0 | 1 | Grimsby | 0.436 | 0.280 | 0.281 | 0.439 | 0.0 | -0.436 | 0.190 | 0.436 |
| 52 | 2015-03-24 | Football Conference | Woking | 3 | 2 | Torquay | 0.767 | 0.662 | 0.203 | 0.135 | 1.0 | 0.233 | 0.054 | 0.233 |
| 58 | 2015-03-24 | Football Conference | Nuneaton | 2 | 0 | Wrexham | 0.450 | 0.294 | 0.284 | 0.423 | 1.0 | 0.550 | 0.303 | 0.550 |
| 67 | 2015-03-24 | Football Conference | Halifax | 2 | 2 | Gateshead | 0.523 | 0.370 | 0.288 | 0.341 | 0.5 | -0.023 | 0.001 | 0.023 |