The Konstanz Game

Dr. Nathaniel Phillips
23 February 2014

Exploration and Exploitation

Many real-world decision tasks require a trade-off between information search (exploration) and acting on known information (exploitation).

  • Mate search
  • Restaurant search
  • Parking

Exploration and Exploitation

  • Trade-off: if you search too long, you miss out on reaping benefits from better options. If you search too little, you may choose poor options.
  • Many paradigms have been used to study the exploration-exploitation trade-off.

The Iowa Gambling Task (IGT)

  • Iowa Gambling Task (Bechara et al., 1994)
  • 4 options (2 good and 2 bad)
  • Fixed number of trials, rewards/punishments on every trial
  • Yechiam et al. (2005): under the expectancy valence model (Busemeyer & Stout, 2002), behavior is explained by three processes: attention to losses/gains, learning, and choice consistency.
  • Findings: Healthy individuals do well; brain-damaged individuals have deficits in specific processes.
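
The three processes named above can be sketched as three small functions. This is a minimal sketch of the expectancy valence model as it is commonly presented, not the original authors' code. The function names and the parameter names w (attention to losses), a (learning rate), and c (choice consistency) are assumptions that do not appear in these slides, and the exact equations should be checked against Busemeyer & Stout (2002).

```python
import math

def valence(win, loss, w):
    """Attention to losses/gains: w in [0, 1] weights the loss.

    `loss` is passed as a non-negative magnitude (assumption).
    """
    return (1.0 - w) * win - w * loss

def update_expectancy(expectancy, v, a):
    """Learning: delta rule that moves the expectancy toward valence v at rate a."""
    return expectancy + a * (v - expectancy)

def choice_probabilities(expectancies, trial, c):
    """Choice consistency: softmax with sensitivity theta = (trial / 10) ** c."""
    theta = (trial / 10.0) ** c
    scaled = [theta * e for e in expectancies]
    top = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scaled]
    total = sum(weights)
    return [x / total for x in weights]
```

With four decks that all start at an expectancy of 0, choice_probabilities([0, 0, 0, 0], trial=1, c=1.0) gives each deck the same probability; the probabilities separate as the expectancies are updated trial by trial.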

"Sampling Paradigm"

  • Sampling paradigm of decisions from experience (Hertwig et al., 2004)
  • 2 options.
  • Unlimited number of 'free' samples from options without cost or reward.
  • When the player is ready, he/she can exploit one option for real rewards.
  • Findings: People do not search very long (around 10 samples per option).

Observe or Bet Task

  • Observe or Bet Task (Navarro & Newell, 2014)
  • 2 binary options.
  • On each trial, one option gives a reward and the other gives a loss.
  • Over a fixed number of trials, the participant can either 'observe' an option and see its outcome, or 'bet' on an option and receive its outcome without seeing it.
  • Finding: In a changing environment, people are more likely to switch between observing and betting (consistent with a Bayesian optimal model).

There is an experimental gap

  • The IGT assumes that people are always being rewarded or punished on every trial, with no possibility for 'free' and inconsequential search. Exploration is not qualitatively different from exploitation.
  • The Sampling paradigm assumes that people conduct all search prior to a single trial of exploitation.
  • The observe or bet task assumes that one cannot learn during exploitation trials, which seems a bit silly. It also has no rare events.

Why do we need *another* experimental task?

  • Many important real-world decisions allow people to alternate between explicit exploration and explicit exploitation while always learning something about options.
    • Observing vs. investing in stocks
    • Hearing about vacations from friends without actually going on one yourself.
  • Many important real-world tasks allow people to 'wait' until the 'iron is hot', then strike on an option at its peak moment.
    • Stocks
    • Turning onto a street (?)
  • None of the current paradigms allow us to study how people manage these tasks.

The Konstanz Game

Introducing The Konstanz Game (KG)!

  • A combination of the IGT, the sampling paradigm, and the observe or bet task
  • 4 (or more) options, each with an unknown underlying distribution.
  • On each trial, a participant selects an option and either observes its outcome or bets on (reaps) it.
    • Observing is like hearing from a friend how a trip/restaurant/concert was without actually going.
  • After all trials (e.g., N = 100) are over, the player is paid the sum of the outcomes of his/her bet trials.
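
The rules above can be sketched as a short simulation. This is only an illustration under stated assumptions, not an implementation of the actual task: the function names, the policy interface, the normally distributed example options, and the 'observe each option 5 times, then bet' strategy are all hypothetical. The sketch also assumes that the player sees the outcome on bet trials as well as on observe trials.

```python
import random

def play_game(options, policy, n_trials=100, seed=0):
    """Play one game and return (payout, history).

    options: list of functions rng -> outcome (the hidden distributions).
    policy:  function (history, trial, n_options) -> (option index, action),
             where action is "observe" or "bet".
    """
    rng = random.Random(seed)
    history = []  # (option index, action, outcome) for every trial
    payout = 0.0
    for trial in range(n_trials):
        index, action = policy(history, trial, len(options))
        outcome = options[index](rng)
        if action == "bet":  # only bet trials count toward the payment
            payout += outcome
        history.append((index, action, outcome))
    return payout, history

def observe_then_bet(n_observe_each=5):
    """Hypothetical policy: observe every option a fixed number of times,
    then always bet on the option with the best mean seen so far."""
    def policy(history, trial, n_options):
        if trial < n_observe_each * n_options:
            return trial % n_options, "observe"
        means = []
        for i in range(n_options):
            # outcomes from observe and bet trials alike
            seen = [o for (j, _, o) in history if j == i]
            means.append(sum(seen) / len(seen) if seen else 0.0)
        return means.index(max(means)), "bet"
    return policy

# Made-up example options (mean, sd); not the gamble sets from the proposal.
options = [
    lambda rng: rng.gauss(1.0, 2.0),
    lambda rng: rng.gauss(0.0, 1.0),
    lambda rng: rng.gauss(-1.0, 2.0),
    lambda rng: rng.gauss(0.5, 4.0),
]
payout, history = play_game(options, observe_then_bet(5), n_trials=100)
```

Because a policy is just a function of the trial history, strategies that alternate between observing and betting can be swapped in without changing play_game.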

Research questions

  • Do people begin with an extended phase of observation and then permanently switch to exploitation? Or do they constantly switch back and forth between observation and exploitation?
    • Does this happen both within and between options?
  • In changing environments, are people more likely to alternate between observation and exploitation?
  • Do people perceive patterns both when they exist and when they do not?
    • e.g., gambler's fallacy
  • If people compete with others in the task, do they change their degree of exploration/exploitation?
  • If new options appear during search, how long do people explore them before (potentially) reverting to exploiting known options?

Young Scholar Fund (YSF) grant proposal

  • 9,500 EUR over 2 years
    • 6,000 EUR to pay participants over four studies
    • 3,500 EUR for conference travel
  • Four studies proposed over two years
    • Study 1: Simple version of the game using established gamble sets from the IGT and the sampling paradigm.
      • What is the overall level of observation and exploitation? How much variability is there between participants?
    • Study 2: The effects of gamble risk on exploration.
      • If gambles have extreme rare events, do people increase their level of observation?
    • Study 3: Non-stationary gambles.
      • If gambles are known to change over time, are people more likely to observe previously discarded options?
    • Study 4: Autocorrelations (patterns) in gambles.
      • If there are detectable patterns in options, will people pick up on them? Or will they be more likely to produce false alarms (perceive patterns that are not there)?
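
One way to build gambles with a detectable pattern for Study 4 is a first-order autoregressive sequence. This is a sketch under assumptions only: the proposal does not say how the patterns would be generated, and the function names and the parameter phi (strength of the dependence on the previous outcome) are hypothetical.

```python
import random

def ar1_outcomes(n, phi, mean=0.0, sd=1.0, seed=0):
    """Generate n outcomes whose deviation from `mean` follows an AR(1) process.

    phi = 0 gives independent outcomes; phi near 1 gives long runs.
    """
    rng = random.Random(seed)
    outcomes = []
    deviation = 0.0
    for _ in range(n):
        deviation = phi * deviation + rng.gauss(0.0, sd)
        outcomes.append(mean + deviation)
    return outcomes

def lag1_autocorrelation(xs):
    """Sample lag-1 autocorrelation: how strongly each outcome predicts the next."""
    n = len(xs)
    m = sum(xs) / n
    num = sum((xs[i] - m) * (xs[i + 1] - m) for i in range(n - 1))
    den = sum((x - m) ** 2 for x in xs)
    return num / den
```

Comparing lag1_autocorrelation on a sequence with phi = 0 and on one with a large phi gives a simple check that a pattern is actually present in the stimuli before asking whether participants detect it.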