Goal Bandit First Analysis

Summary

There were 4 between subjects conditions. Two factors were varied: The environment and whether there was a goal, i.e. a minimum number of points, to be reached or not. The four conditions were

LN - high EV is low variance, no goal
HN - high EV is high variance, no goal
LG - high EV is low variance, goal
HG - high EV is high variance, goal

The high EV is low variance environment had the following means and sds:

Means and sds of the high EV is low variance environment
Mean	SD
3	2
2	2.5
2	5

The high EV is high variance environment had the following means and sds:

Means and sds of the high EV is high variance environment
Mean	SD
3	5
2	2.5
2	2

The goal, if there was one, was always 135 points. Each game consisted of 50 trials.
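To get a feel for how demanding the 135-point goal is, the sketch below computes the probability of reaching it when one option is chosen on all 50 trials. It assumes that outcomes are independent draws from a normal distribution with the stated mean and sd (the outcome distribution is not specified above), so that the point total is itself normal. The function name `prob_reach_goal` and the dictionary layout are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

GOAL = 135      # goal from the text
N_TRIALS = 50   # trials per game from the text

# (mean, sd) per option, taken from the two environment tables
ENVIRONMENTS = {
    "high EV is low variance": [(3, 2), (2, 2.5), (2, 5)],
    "high EV is high variance": [(3, 5), (2, 2.5), (2, 2)],
}


def prob_reach_goal(mean, sd, n_trials=N_TRIALS, goal=GOAL):
    """P(point total >= goal) when the same option is chosen on every trial.

    Assumption: outcomes are i.i.d. normal, so the total is normal with
    mean n_trials * mean and sd sd * sqrt(n_trials).
    """
    total = NormalDist(mu=n_trials * mean, sigma=sd * sqrt(n_trials))
    return 1 - total.cdf(goal)


if __name__ == "__main__":
    for env, options in ENVIRONMENTS.items():
        for mean, sd in options:
            p = prob_reach_goal(mean, sd)
            print(f"{env}: mean={mean}, sd={sd}: P(goal) = {p:.3f}")
```

Under these assumptions, always choosing the high EV option reaches the goal more often when that option has the low variance than when it has the high variance. This says nothing about what participants actually chose; it only describes the environments.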

Findings:

The proportion of risky options chosen over games was higher in the high EV is high variance environment than in the high EV is low variance environment.
The proportion of risky options chosen over games tends to peak at the end of the later games in the high EV is low variance environment. This peak tends to be stronger in the goal condition.
Median response times over the games tend to be higher in the high EV is high variance conditions (HN and HG).
Median response times over the games tend to be slightly U-shaped.
The goal was reached more often in the high EV is low variance environment (condition LG, 34.8%) than in the high EV is high variance environment (condition HG, 26.8%). Further analyses need to specify why this is the case (maybe more hot stove effects in condition HG?).
Switch rates in high EV is low variance environments tend to be higher in the no goal vs. the goal condition (conditions LN and LG). Switch rates in high EV is high variance environments tend to be lower in the no goal vs. the goal condition (conditions HN and HG).
Switch rates in the goal conditions were higher when participants hadn’t yet reached the goal vs. when they were over the goal.
Switch rates decrease from the first to last games.
Switch rates peak in the last ten trials.
In the high EV is low variance goal condition (LG), the proportion of risky options chosen decreased when the goal was reached. In the high EV is high variance goal condition (HG), the proportion of risky options chosen increased when the goal was reached.
The high EV is low variance, goal condition (LG) has the highest proportion of high EV options chosen.
The proportion of high EV options chosen drops in the last ten trials over all conditions in most games.

Non Findings:

The proportion of risky options chosen didn’t differ systematically for no goal vs. goal. The only systematic differences found were between the different environments.
The proportion of risky options chosen over the games most of the time didn’t differ systematically over conditions.
Point earnings don’t systematically differ over conditions.

Summary Table

Note: X = no effect; maybe = there may be an effect (normally judged by eye); Yes = there is pretty surely an effect (mostly judged by statistics).

Data Cleaning

First, only data from participants who did not experience a crash in one of the study parts were used. Then data from two participants who indicated that we may not want to trust their data for scientific research were excluded. Then behavior in the games was checked, i.e. whether the response times and the choice patterns looked reasonable (i.e. whether there was at least some exploration/option switching within games). Five participants were excluded because they had switching rates of 0 (never switched between options) in 3 or more games. Four participants were excluded because they had switching rates of 1 (switched after every choice) in 3 or more games.
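The switch-rate exclusion rule can be sketched as follows. This is a minimal illustration, not the code used for the analysis: it assumes each game is available as a list of chosen option indices, and it defines the switch rate as the proportion of trials (from the second trial on) in which the choice differs from the previous one. The function names are hypothetical.

```python
def switch_rate(choices):
    """Proportion of trials (from the second on) whose choice differs
    from the choice on the previous trial."""
    if len(choices) < 2:
        return 0.0
    switches = sum(a != b for a, b in zip(choices, choices[1:]))
    return switches / (len(choices) - 1)


def should_exclude(games, min_games=3):
    """True if the switch rate is 0 in at least `min_games` games,
    or 1 in at least `min_games` games."""
    rates = [switch_rate(g) for g in games]
    never = sum(r == 0 for r in rates)    # games without a single switch
    always = sum(r == 1 for r in rates)   # games with a switch on every trial
    return never >= min_games or always >= min_games


if __name__ == "__main__":
    # Made-up participant: never switches in three of four games.
    games = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [1, 2, 1, 3]]
    print(should_exclude(games))
```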

Survey Data

Data of a total of 95 participants was analyzed. An overview can be found in the table below.

Descriptive overview
Condition	n	mean age	% female
condition 1	23	33.6	83
condition 2	26	34.0	73
condition 3	23	33.3	83
condition 4	23	31.7	83

Participants had to rate how easy they had found it to earn points in the games (on a scale from 1 = extremely easy to 5 = extremely difficult). A test showed that participants in the no goal conditions (LN and HN) found it easier to earn points than participants in the goal conditions (LG and HG).

Bayesian Test of game difficulty of the no goal vs. goal condition.

Game Data

The pirateplot above shows the sum of the points of all games, separated per condition. We can see that point values for the high EV is low variance (conditions LN and LG) tend to be a bit higher than the ones of the other conditions.

The first pirateplot below shows the proportion of high variance options chosen over all games, separated by condition. The second pirateplot shows the proportion of low variance options chosen. The two are not complementary because there was a third option with a variance between the two.

Tests revealed that there were no significant differences between the no goal and goal conditions.

How do the distributions of the number of games with goal reached look for conditions LG and HG?

Bayesian test of the distributions of proportion of games with goals reached for conditions LG and HG.

The distributions look relatively normal. We can see that condition LG has a higher rate of games in which the goal was reached. Note that the practice game was excluded here because everyone reached the goal there.

Next let’s check the course over the different games, separately for each condition, for the probability of choosing the risky option, the time used to make a decision (response time) and the cumulative points.

Do participants of different conditions have different curves of probability of choosing the high variance option over the game?

To check whether strategies of picking the options changed over time and whether this differs over the conditions, the curves of the proportion of high variance options chosen are shown in the graphs below, separately for each game.

There don’t seem to be very large differences. Interestingly participants in conditions HN and HG (high EV is high variance, no goal vs. goal) have probabilities lower than .5 all the time which might indicate that they have problems learning that the high variance option yields the highest reward over time. Also, it is interesting to see that from game 2 on the probability of choosing the high variance option rises at the end of the game. We will need to figure out if this is only a trend or a reliable pattern.

Do participants of different conditions have different curves of median response times over the game?

Another dimension in which participants of the different conditions might differ is response time. It may well be that in the goal conditions response times go up towards the end of the game. Response times have the inconvenient characteristic that they are not normally distributed and are therefore unsuitable for most statistical analyses. I nevertheless used the median to plot the curves.

The two high EV is high variance conditions (HN and HG) tend to have higher response times. Response times in the goal conditions at least sometimes go up slightly at the end of the game, but the same is true for the no goal condition (LN). So here, too, no big difference between the no goal and goal conditions can be found.

Do participants of different conditions have different point earnings over the game?

If different strategies are used in the different conditions, this might lead to differences in the number of points gained over the games. To find out whether this is true, the plots below show the cumulative points separately for each game.

Point earnings don’t seem to differ systematically over conditions. This was confirmed in an ANOVA. The only difference was, unsurprisingly, between game 1 and the other games, because game 1 had fewer trials. The difference in points earned between the participants in the high EV is low variance conditions (LN and LG) and the high EV is high variance conditions (HN and HG) was not significant.

What was the percentage of goals reached in the goal conditions (LG and HG)?

Goals were reached in 34.8% of the games in condition LG and in 26.8% of the games in condition HG, indicating that the goal was relatively hard to reach but not impossible, i.e. most participants should have reached the goal at least once in the study.

Analyses of option switching

Another important aspect in which behavior might differ over conditions is exploration, i.e. how often the options were switched. The pirateplots below show the switch rates over condition and games for all trials and for only the last 10 trials. Switch rates tend to be higher towards the end of the game, which from a rational point of view doesn’t make much sense. Optimally (at least in the no goal conditions), people should learn about the options and then stay with the one they think yields the highest mean outcome.

Switching rates over all trials between condition LN, M = .39, and LG, M = .25 (high EV is low variance, no goal vs. goal) differed significantly, \(t\)(206.65) = 3.75, p < .001. The same was true when only the last ten trials were considered: condition LN, M = .40, and LG, M = .28, \(t\)(212.3) = 3.24, p < .01.

Switching rates over all trials between condition HN, M = .25, and HG, M = .35 (high EV is high variance, no goal vs. goal) differed significantly, \(t\)(215.09) = -2.62, p < .01. The same was true when only the last ten trials were considered: condition HN, M = .30, and HG, M = .39, \(t\)(227.37) = -2.18, p < .05.

Interestingly, the effect for high EV is low variance (condition LN, no goal, had higher switching rates than condition LG with goal) was in the opposite direction to the effect for high EV is high variance (condition HN, no goal, had lower switching rates than condition HG with goal).
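The fractional degrees of freedom reported above suggest that Welch's unequal-variance t-test was used, although this is an inference and not stated in the text. A minimal sketch of how the t statistic and the Welch-Satterthwaite degrees of freedom are computed is given below; the function name `welch_t` and the example values are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for two independent samples with possibly unequal variances."""
    vx = variance(x) / len(x)   # squared standard error of mean(x)
    vy = variance(y) / len(y)   # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


if __name__ == "__main__":
    # Made-up switch rates, not the study data.
    no_goal = [0.45, 0.38, 0.41, 0.30, 0.52]
    goal = [0.22, 0.31, 0.19, 0.27, 0.25]
    t, df = welch_t(no_goal, goal)
    print(f"t({df:.2f}) = {t:.2f}")
```

The p-value additionally requires the t distribution, which the Python standard library does not provide; `scipy.stats.ttest_ind(x, y, equal_var=False)` returns both the statistic and the p-value.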

Now let’s look only at the last 10 trials in the goal conditions. Switch rates differed depending on whether participants were over the goal or not:

Switch rates for the LG and HG condition in the last 10 trials separated for whether they were over or under the goal.
Condition	Under Goal	Over Goal
LG	.29	.26
HG	.40	.31

In both cases the switch rate is at least slightly higher when participants are under the goal.
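One way to compute such a split for a single game is sketched below. It assumes that a trial counts as "over goal" when the points accumulated before that trial already meet the goal; the exact definition and the way rates were pooled over games and participants are not given in the text. The function name and data layout are hypothetical.

```python
def switch_rates_by_goal_state(choices, outcomes, goal=135, last_n=10):
    """Switch rates in the last `last_n` trials of one game, split by whether
    the points accumulated *before* the trial were below the goal or not.

    Returns (rate_under, rate_over); a rate is None if no trial falls in it.
    """
    counts = {"under": [0, 0], "over": [0, 0]}  # [switches, trials]
    total = 0
    n = len(choices)
    for i in range(n):
        # The first trial has no previous choice, so it can never be a switch.
        if i >= max(1, n - last_n):
            state = "over" if total >= goal else "under"
            counts[state][0] += choices[i] != choices[i - 1]
            counts[state][1] += 1
        total += outcomes[i]
    rates = [s / k if k else None for s, k in counts.values()]
    return rates[0], rates[1]
```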

The plots below show the option switch rate curves over the games. We can see that exploration often peaks near the end of the game.

When we look at the proportion of high variance options chosen (only the last 10 trials are considered), separated by whether participants were over or under the goal, we can see that in condition LG (high EV is low variance) the proportion decreased after the goal was reached, whereas it increased in condition HG (high EV is high variance).

Proportion of high variance option chosen over the last 10 trials for the LG and HG condition separated for whether they were over or under the goal.
Condition	Under Goal	Over Goal
LG	.24	.19
HG	.40	.59

Analyses of proportion of high EV chosen

Finally let’s look at the proportion of high EV chosen over the games separated for conditions.

Overall, participants in the high EV is low variance, goal condition (LG) tend to choose the high EV option most often. The difference in the proportion of high EV options chosen between the high EV is low variance no goal, M = .45, and goal, M = .57, conditions (LN vs. LG) is significant, \(t\)() = , p < .001. The difference between conditions HN and HG is not significant. Interestingly, in the last ten trials there is a drop in the proportion of high EV options chosen in all conditions over most games.

If we look only at the goal conditions and check whether the proportion of high EV options chosen differs depending on whether participants are under or over the goal, we see that they choose the high EV option more often when they are over the goal. The numbers for condition HG are the same as above in the proportion of high variance option chosen table, because in condition HG the high EV option was also the high variance option.

Proportion of high EV chosen over the last 10 trials for the LG and HG condition separated for whether they were over or under the goal.
Condition	Under Goal	Over Goal
LG	.54	.64
HG	.40	.59

Was the proportion of high variance options chosen higher when it was rational to do so vs. when it was not (only LG condition)?

As it seems, if people are under the goal, they tend to choose the high variance option more often. This seems to be more the case when they are closer to the end of a game (there we actually have evidence for it).
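How "rational" was operationalized is not spelled out here. One simple possibility, sketched below as an assumption, is to call the high variance option rational whenever it gives the best chance of still reaching the goal if it were chosen on all remaining trials. The sketch again treats outcomes as i.i.d. normal; the option labels and function names are illustrative, and the fixed-option simplification ignores that a participant could switch again later.

```python
from math import sqrt
from statistics import NormalDist

# (mean, sd) per option in the high EV is low variance environment
LG_OPTIONS = {
    "high EV": (3, 2),
    "medium variance": (2, 2.5),
    "high variance": (2, 5),
}


def prob_reach(points_needed, trials_left, mean, sd):
    """P(sum of the remaining outcomes >= points_needed), assuming i.i.d.
    normal outcomes and the same option on every remaining trial."""
    if trials_left == 0:
        return 1.0 if points_needed <= 0 else 0.0
    total = NormalDist(trials_left * mean, sd * sqrt(trials_left))
    return 1 - total.cdf(points_needed)


def best_option(points_needed, trials_left, options=LG_OPTIONS):
    """Label of the option with the highest chance of still reaching the goal."""
    return max(
        options,
        key=lambda k: prob_reach(points_needed, trials_left, *options[k]),
    )


if __name__ == "__main__":
    print(best_option(points_needed=40, trials_left=10))
    print(best_option(points_needed=25, trials_left=10))
```

With ten trials left, this rule favors the high variance option when 40 points are still needed and the high EV option when only 25 are needed, which matches the intuition that risk taking pays off only when the goal is otherwise out of reach.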

Primary Analysis Goal Bandit

Markus Steiner

6 March 2017