Plus/minus what? Let's talk about uncertainty

Sigrid Keydana, Trivadis
2017-11-22

About me & my employer

Trivadis

  • IT consulting and services company based in the DACH region (Germany, Austria, Switzerland), covering everything from traditional technologies to big data, machine learning, and data science

My background

  • from psychology/statistics via software development and database engineering to data science and ML/DL

My passion

  • machine learning and deep learning
  • data science and (Bayesian) statistics
  • explanation/understanding over prediction accuracy

Where to find me

"In this world nothing can be said to be certain, except death and taxes." (Benjamin Franklin)

Welcome to everything else!

  • How many super cool iPhone lookalikes will we sell next quarter?
  • When should we invest in more powerful servers?
  • How many boxes of super healthy energy bars should we keep in stock?
  • How long does it take to run this batch job?
  • How much time do you need for that report?

Our job: sports forecasting

Our task today: forecast men's 400m Olympic winning times

It's 2000, just before the Olympics. This is the data we have:

[Plot: men's 400m Olympic winning times by year, as known before the 2000 Games]

Expected time in 2000: 42.3 seconds? Or 42.1?

Linear regression says 42.33.

Whatever we say, it's pretty likely to be wrong…

How do we deal with this?

We'd better not commit to a point estimate…

Prediction intervals to the rescue!
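
What could such an interval look like for a straight-line fit? Here is a minimal sketch in Python, using only the standard library. Everything named here (`prediction_interval`, `t_quantile`, `years`, `times`) is a made-up placeholder, and the numbers are NOT the real Olympic results or the talk's own code. In practice, `scipy.stats.t.ppf` would usually replace the hand-rolled quantile helper.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(t, df, steps=1000):
    """CDF for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Quantile for 0.5 < p < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < p:  # grow the bracket until it contains the quantile
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def prediction_interval(x, y, x_new, level=0.95):
    """Return (lower, point, upper) for one new observation at x_new,
    based on an ordinary least squares line fitted to x and y."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    # The leading "1 +" covers the noise of the new observation itself;
    # the other two terms cover the uncertainty of the fitted line.
    point = intercept + slope * x_new
    se_pred = s * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    t_crit = t_quantile(1 - (1 - level) / 2, n - 2)
    return point - t_crit * se_pred, point, point + t_crit * se_pred


# Made-up placeholder values, NOT real Olympic winning times.
years = [1980, 1984, 1988, 1992, 1996]
times = [44.9, 44.5, 44.3, 43.8, 43.7]

lower, point, upper = prediction_interval(years, times, 2000)
print(f"point forecast: {point:.2f}s, 95% interval: [{lower:.2f}s, {upper:.2f}s]")
```

The interval widens the further `x_new` lies from the mean of the observed years, which is exactly what forecasting beyond the data implies.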