Ranges after averaging by participant

ratioByParticipant <- trust2_all %>% 
  dplyr::select(starts_with("ratio"),ID) %>% #Keep just the ratio columns 
  gather(-(ID), key = "condition", value = "ratio") %>%
  group_by(ID) %>%
  summarise(meanRatio = mean(ratio, na.rm=TRUE))
summary(ratioByParticipant$meanRatio)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.01361 0.18487 0.23601 0.25089 0.30903 0.98669
sd(ratioByParticipant$meanRatio)
## [1] 0.1421434
ggplot(ratioByParticipant, aes(meanRatio)) +
  geom_histogram(binwidth = .01)+
  xlim(0,1)

ggplot(ratioByParticipant, aes(meanRatio)) +
  geom_density()+
  xlim(0,1)

range(ratioByParticipant$meanRatio, na.rm=TRUE)
## [1] 0.01360643 0.98668981
quantile(ratioByParticipant$meanRatio, c(.33, .66), na.rm=TRUE)
##       33%       66% 
## 0.2038655 0.2841788

Therefore, based on this distribution, the reputation ranges would be:

Low Reputation: 0.01 to 0.20 (1-20%)

Moderate Reputation: 0.21 to 0.28 (21-28%)

High Reputation: 0.29 to 0.99 (29-99%)

Capping outliers

However, as Michael mentioned, this creates a High Reputation range that is disproportionally wide (29-99%), so it might make more sense to cap the outliers, fit a normal distribution over this capped data, and then select ranges from this fitted normal distribution.

So we will “cap” the outliers by replacing those observations outside the lower limit with the value of 5th %ile and those that lie above the upper limit, with the value of 95th %ile.

Capped_Data <- ratioByParticipant
qnt <- quantile(Capped_Data$meanRatio, probs=c(.25, .75), na.rm = T)
caps <- quantile(Capped_Data$meanRatio, probs=c(.05, .95), na.rm = T)
H <- 1.5 * IQR(Capped_Data$meanRatio, na.rm = T)
Capped_Data$meanRatio[Capped_Data$meanRatio < (qnt[1] - H)] <- caps[1]
Capped_Data$meanRatio[Capped_Data$meanRatio > (qnt[2] + H)] <- caps[2]

Now let’s see what this new distribution looks like:

ggplot(Capped_Data, aes(meanRatio)) +
  geom_histogram(binwidth = .01)+
  xlim(0,1)

Our new range is 0.0136064 - 0.4930652.

I will now fit a normal distribution over this capped data:

# Fit normal distribution to log-transformed values
library(MASS)
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
fit <- fitdistr(Capped_Data$meanRatio, densfun = "normal")

# Plot density histogram and fitted distribution
ggplot(data = Capped_Data) +
  geom_histogram(binwidth = .01, mapping = aes(x = meanRatio, y = ..density..), col="white") +
  stat_function(fun = dnorm, 
      args = list(mean = fit$estimate[1], sd = fit$estimate[2], log = F), 
      color="red", lwd=1)

Therefore, the mean of our fitted normal distribution is 0.2415925 and the SD is 0.1104805.

From this, we could bin according to tertiles of this new normal distribution:

qnorm(c(.33, .66), fit$estimate[1], fit$estimate[2])
## [1] 0.1929907 0.2871617

Therefore, the updated reputation ranges would be:

Low Reputation: 0.01 to 0.19 (1-19%)

Moderate Reputation: 0.20 to 0.29 (20-29%)

High Reputation: 0.30 to 0.49 (30-49%)

Update (December 14, 2018):

We need to find likelihood Player 2 returned money based on the amount endowed. First, setting up the data so that we can represent the amount Player 2 received from Player 1 ($3, $6, $9, and $12, after tripling) as a condition.

ratioByEndowment <- trust2_all %>% 
  dplyr::select(starts_with("ratio"),ID) %>% #Keep just the ratio columns 
  dplyr::select(matches('3|6|9|12'),ID) %>% #Just look at the 4 endowment conditions
  gather(-(ID), key = "condition", value = "ratio") %>%
  mutate(ReturnedOrNot = as.numeric(ratio>0)) %>% #If ratio is greater than 0, new variable ReturnedOrNot will be 1, if returned 0, new variable is 0.
  mutate(Endowment=parse_number(condition))
## Warning: package 'bindrcpp' was built under R version 3.4.4
ForPlotting  <- ratioByEndowment %>%
  group_by(Endowment) %>%
  summarise(ReturnedAtAll = sum(ReturnedOrNot, na.rm=TRUE)/n())

ForPlotting
## # A tibble: 4 x 2
##   Endowment ReturnedAtAll
##       <dbl>         <dbl>
## 1        3.         0.307
## 2        6.         0.541
## 3        9.         0.550
## 4       12.         0.546

So here I’m plotting the likelihood Player 2 gave back (at all) as a function of the Endowment (3, 6, 9, 12 dollars).

ggplot(ForPlotting, aes(x = Endowment, y = ReturnedAtAll)) +
  geom_bar(stat = "identity")