Couldn’t We All Just Get Along?

Which is more contentious – the battle of the sexes or the battle of the races?

Exploring the divisions or unity within our union, the great United States of America, using data from the game show “Friend or Foe?”

“Friend or Foe?” is an American game show based on knowledge and trust which aired on Game Show Network. Teams of two strangers attempted to persuade their partners into sharing their accumulated winnings rather than stealing them for themselves. The show premiered June 3, 2002, and aired for two seasons totaling 130 episodes.
(Source: https://en.wikipedia.org/wiki/Friend_or_Foe%3F_(TV_series))

Rules of “Friend or Foe”

The main game is played in two rounds. In each round, host Kennedy asks a series of four multiple-choice questions. On each question, the teammates have 15 seconds to agree on an answer and simultaneously lock it in on separate keypads. Correct answers add $500 to the trust fund in round one, and $1,000 in round two.

At the end of each round, the team with the lowest total funds is eliminated and must go to the “Trust Box” to determine the fate of their money.

The Trust Box presents the teammates with a variation of the prisoner’s dilemma. Each contestant attempts to persuade the other to trust him or her, after which they secretly vote “friend” or “foe.”

If both vote “friend,” they split the trust fund evenly.

If one votes “friend” and the other “foe,” the foe collects the entire trust fund and the friend receives nothing.

If both vote “foe,” neither contestant wins any money.
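
The three Trust Box outcomes above can be sketched as a small R function. This is only an illustration of the rules as described here; the function name trust_box_payoff and its argument names are hypothetical and are not part of the original analysis.

```r
# Hypothetical helper: payoff to each teammate in the Trust Box,
# given both votes ("friend" or "foe") and the size of the trust fund.
trust_box_payoff <- function(vote_player, vote_partner, fund) {
  stopifnot(vote_player %in% c("friend", "foe"),
            vote_partner %in% c("friend", "foe"))
  if (vote_player == "friend" && vote_partner == "friend") {
    # Both trust each other: split the fund evenly
    c(player = fund / 2, partner = fund / 2)
  } else if (vote_player == "foe" && vote_partner == "friend") {
    # The player steals the whole fund
    c(player = fund, partner = 0)
  } else if (vote_player == "friend" && vote_partner == "foe") {
    # The partner steals the whole fund
    c(player = 0, partner = fund)
  } else {
    # Both vote foe: nobody wins anything
    c(player = 0, partner = 0)
  }
}

trust_box_payoff("friend", "friend", 1.2)
trust_box_payoff("foe", "friend", 1.2)
trust_box_payoff("foe", "foe", 7.7)
```

The last case is what makes this a prisoner’s dilemma: voting “foe” never pays less than voting “friend” for an individual, yet if both do so the fund is lost.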

Meaningful Question for Analysis

In this project, I aim to analyze the “Friend or Foe?” game show data to answer 2 questions:

Do gender differences and/or race differences between partnered contestants have any correlation with contestants’ decision to choose “foe”? and
Which of gender or race differences had a higher correlation with the decision to choose “foe”?

Here’s where I read in the data from GitHub and display a few rows of data:

TheURL <- "https://raw.githubusercontent.com/vincentarelbundock/Rdatasets/master/csv/Ecdat/FriendFoe.csv"
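# stringsAsFactors = TRUE keeps text columns as factors (not the default in R >= 4.0), so summary() below shows counts per level
ffdata <- read.table(file = TheURL, header = TRUE, sep = ",", stringsAsFactors = TRUE)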
head(ffdata)

##   X    sex white age   play round season cash   sex1 white1 age1  play1
## 1 1   male   yes  20    foe     1      1  1.2   male    yes   32 friend
## 2 2   male   yes  40    foe     3      1  7.7 female    yes   31    foe
## 3 3 female    no  35    foe     2      1  3.2 female     no   24    foe
## 4 4   male   yes  26 friend     1      1  1.2   male    yes   40 friend
## 5 5 female   yes  40 friend     3      1  5.7   male    yes   26    foe
## 6 6 female   yes  28    foe     2      1  3.7 female    yes   23 friend
##   win win1
## 1 1.2  0.0
## 2 0.0  0.0
## 3 0.0  0.0
## 4 0.6  0.6
## 5 0.0  5.7
## 6 3.7  0.0

Note that the data set includes information about the gender (male or female), race (white or non-white), and age for the contestant and for their partner, where each row of data represents 1 contestant and their partner.

Importantly, the data includes “play”, which indicates whether a contestant chose “friend” or “foe”, and “play1”, which indicates whether the partner selected “friend” or “foe”.

(The data also includes which season it was, and the round in which the contestant was eliminated. For the purpose of this analysis, I am going to ignore these 2 pieces of information.)

Finally, the data includes “cash” for the amount of cash that is up for grabs in the trust box.

Section 1: Data Exploration

Here’s where I gathered some summary statistics on the data set:

summary(ffdata)

##        X             sex      white          age            play    
##  Min.   :  1.0   female:119   no : 37   Min.   :18.00   foe   :115  
##  1st Qu.: 57.5   male  :108   yes:190   1st Qu.:23.50   friend:112  
##  Median :114.0                          Median :28.00               
##  Mean   :114.0                          Mean   :29.35               
##  3rd Qu.:170.5                          3rd Qu.:33.00               
##  Max.   :227.0                          Max.   :57.00               
##      round           season           cash            sex1     white1   
##  Min.   :1.000   Min.   :1.000   Min.   : 0.200   female:111   no : 37  
##  1st Qu.:1.000   1st Qu.:1.000   1st Qu.: 1.000   male  :116   yes:190  
##  Median :2.000   Median :2.000   Median : 2.500                         
##  Mean   :1.925   Mean   :1.634   Mean   : 3.335                         
##  3rd Qu.:3.000   3rd Qu.:2.000   3rd Qu.: 5.350                         
##  Max.   :3.000   Max.   :2.000   Max.   :16.400                         
##       age1          play1          win              win1       
##  Min.   :18.00   foe   :132   Min.   : 0.000   Min.   : 0.000  
##  1st Qu.:23.00   friend: 95   1st Qu.: 0.000   1st Qu.: 0.000  
##  Median :27.00                Median : 0.000   Median : 0.000  
##  Mean   :28.75                Mean   : 1.006   Mean   : 1.301  
##  3rd Qu.:32.00                3rd Qu.: 1.350   3rd Qu.: 1.800  
##  Max.   :65.00                Max.   :15.000   Max.   :16.400

One thing I noticed immediately about the data is that there were far more white contestants (190) than non-white contestants (37). This indicates that I might not have a large enough sample of non-white contestants to determine the correlation between race and friend or foe selection.

I was happy to see that age was fairly evenly distributed between the contestants and their partners, with both groups having similar summary statistics (mean age, median age, etc. were similar for the contestant and their partner). This will allow us to ignore age differences for the purpose of this analysis.

Furthermore, a good number of contestants and partners selected “foe” as their play: 115 of the 227 contestants and 132 of the 227 partners. Therefore, we have some contentious contestants who played the game. But did the gender or race differences between them and their partners play a role in their contentiousness? Let’s find out.

Then, I explored the pairings to see if there is a good distribution of same-gender and different-gender pairings:

table(ffdata$sex, ffdata$sex1)

##         
##          female male
##   female     51   68
##   male       60   48

This gives us a reasonable sample to work with for both groups: 99 same-gender pairings (51 female-female and 48 male-male) and 128 different-gender pairings.

I did the same exploration for race in the data set:

table(ffdata$white, ffdata$white1)

##      
##        no yes
##   no   37   0
##   yes   0 190

Unfortunately, this data set doesn’t give us any multi-racial pairings to work with.
Among the 227 pairings captured in the data, all of them were same-race pairings.

So we will NOT be able to answer the 2nd question proposed for the analysis.
We can still proceed with approaching the 1st question proposed, but only as it relates to gender differences.

Section 2: Data Wrangling

Before we can test for correlation, we will need to create a new column to indicate whether the pairing of the contestant and their partner is a same-gender pairing or a different-gender pairing.

This is how I created the genderpair column, with values “same” or “different”:

ffdata$genderpair <- ifelse(ffdata$sex == ffdata$sex1, "same", "different")
summary(ffdata$genderpair)

##    Length     Class      Mode 
##       227 character character
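
Because genderpair is a character column, summary() only reports its length and class. A count per value is more informative, and table() gives that. The sketch below is a toy illustration, not part of the original analysis: the data frame toy is a hypothetical stand-in that copies only the sex and sex1 values of the six rows displayed by head(ffdata).

```r
# Toy data: sex and sex1 for the six rows shown by head(ffdata)
toy <- data.frame(
  sex  = c("male", "male", "female", "male", "female", "female"),
  sex1 = c("male", "female", "female", "male", "male", "female"),
  stringsAsFactors = FALSE
)

# Same rule as in the analysis: compare the contestant's and partner's sex
toy$genderpair <- ifelse(toy$sex == toy$sex1, "same", "different")

# Count how many pairings fall into each category
pair_counts <- table(toy$genderpair)
pair_counts
```

Running table(ffdata$genderpair) on the full data set would give the same kind of count for all 227 pairings.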

head(ffdata)

##   X    sex white age   play round season cash   sex1 white1 age1  play1
## 1 1   male   yes  20    foe     1      1  1.2   male    yes   32 friend
## 2 2   male   yes  40    foe     3      1  7.7 female    yes   31    foe
## 3 3 female    no  35    foe     2      1  3.2 female     no   24    foe
## 4 4   male   yes  26 friend     1      1  1.2   male    yes   40 friend
## 5 5 female   yes  40 friend     3      1  5.7   male    yes   26    foe
## 6 6 female   yes  28    foe     2      1  3.7 female    yes   23 friend
##   win win1 genderpair
## 1 1.2  0.0       same
## 2 0.0  0.0  different
## 3 0.0  0.0       same
## 4 0.6  0.6       same
## 5 0.0  5.7  different
## 6 3.7  0.0       same

As a next step, I needed to create a column that counts how many of the two teammates (0, 1, or 2) played “foe” while in the Trust Box.

ffdata$player1foe <- ifelse(ffdata$play == "foe",1,0)
ffdata$player2foe <- ifelse(ffdata$play1 == "foe",1,0)
ffdata$totalfoe <- ffdata$player1foe + ffdata$player2foe
head(ffdata)

##   X    sex white age   play round season cash   sex1 white1 age1  play1
## 1 1   male   yes  20    foe     1      1  1.2   male    yes   32 friend
## 2 2   male   yes  40    foe     3      1  7.7 female    yes   31    foe
## 3 3 female    no  35    foe     2      1  3.2 female     no   24    foe
## 4 4   male   yes  26 friend     1      1  1.2   male    yes   40 friend
## 5 5 female   yes  40 friend     3      1  5.7   male    yes   26    foe
## 6 6 female   yes  28    foe     2      1  3.7 female    yes   23 friend
##   win win1 genderpair player1foe player2foe totalfoe
## 1 1.2  0.0       same          1          0        1
## 2 0.0  0.0  different          1          1        2
## 3 0.0  0.0       same          1          1        2
## 4 0.6  0.6       same          0          0        0
## 5 0.0  5.7  different          0          1        1
## 6 3.7  0.0       same          1          0        1

Then I created a table cross-tabulating the 2 new fields:

gftable <- table(ffdata$genderpair, ffdata$totalfoe)
gftable

##            
##              0  1  2
##   different 30 53 45
##   same      22 50 27

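Based on the frequency of occurrence, this new table suggests that the different-gender pairs were more likely to select “foe” in playing the game: both teammates voted “foe” in 45 of the 128 different-gender pairings (about 35%), compared with 27 of the 99 same-gender pairings (about 27%).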
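
Since the two groups differ in size (128 versus 99 pairings), row percentages make the comparison easier than raw counts. The sketch below rebuilds the crosstab from the counts printed above rather than from ffdata, so it runs on its own; the object names foe_counts and row_share are illustrative assumptions.

```r
# Crosstab re-entered by hand from the printed gftable output
foe_counts <- matrix(
  c(30, 53, 45,
    22, 50, 27),
  nrow = 2, byrow = TRUE,
  dimnames = list(genderpair = c("different", "same"),
                  totalfoe = c("0", "1", "2"))
)

# Share of each row (gender pairing) falling into each foe count
row_share <- prop.table(foe_counts, margin = 1)
round(100 * row_share, 1)
```

With the real objects, prop.table(gftable, margin = 1) would produce the same percentages.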

But, eyeballing the data frequencies is not enough. We want to do a Chi-Squared test to check if there is an association between the gender pairing and their tendency to select foe, or if the 2 variables are independent.

chisqff <- chisq.test(gftable)
chisqff

## 
##  Pearson's Chi-squared test
## 
## data:  gftable
## X-squared = 2.1484, df = 2, p-value = 0.3416

The test compared the observed counts below:

chisqff$observed

##            
##              0  1  2
##   different 30 53 45
##   same      22 50 27

It compared them to the counts we would expect if gender pairing and “foe” votes were independent:

round(chisqff$expected,2)

##            
##                 0     1    2
##   different 29.32 58.08 40.6
##   same      22.68 44.92 31.4

Then, the Chi-Squared test calculates the Pearson residual for each cell, which is (observed - expected) / sqrt(expected):

round(chisqff$residuals, 3)

##            
##                  0      1      2
##   different  0.125 -0.666  0.691
##   same      -0.142  0.758 -0.785
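
To make the mechanics of the test concrete, the expected counts, Pearson residuals, test statistic, and p-value can be recomputed by hand from the observed counts. This is a sketch that re-enters the observed table manually so it runs without the data set; the variable names are illustrative and are not taken from the original analysis.

```r
# Observed counts, re-entered from the printed table
observed <- matrix(
  c(30, 53, 45,
    22, 50, 27),
  nrow = 2, byrow = TRUE,
  dimnames = list(genderpair = c("different", "same"),
                  totalfoe = c("0", "1", "2"))
)

n <- sum(observed)

# Under independence: expected = row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / n

# Pearson residual for each cell
pearson_resid <- (observed - expected) / sqrt(expected)

# The chi-squared statistic is the sum of squared residuals
chi_stat <- sum(pearson_resid^2)

# Degrees of freedom: (rows - 1) * (columns - 1)
dof <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability of the chi-squared distribution
p_value <- pchisq(chi_stat, df = dof, lower.tail = FALSE)

round(expected, 2)
round(pearson_resid, 3)
chi_stat
p_value
```

These values should agree with what chisq.test(gftable) reports, since no continuity correction is applied to tables larger than 2 x 2.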

The contribution (in %) of a given cell to the total Chi-square score is calculated as follows:

contrib <- 100*chisqff$residuals^2/chisqff$statistic
round(contrib, digits = 3)

##            
##                  0      1      2
##   different  0.731 20.677 22.205
##   same       0.945 26.733 28.710

The p-value of the test is as follows:

chisqff$p.value

## [1] 0.3415768

The p-value (about 0.34) is well above the 0.05 significance level, so we fail to reject the hypothesis of independence: we cannot conclude that gender pairing and “foe” voting are associated.

Section 3: Graphics

Balloon plot

install.packages("gplots")

library(gplots)

## 
## Attaching package: 'gplots'

## The following object is masked from 'package:stats':
## 
##     lowess

gftable <- table(ffdata$genderpair, ffdata$totalfoe)
balloonplot(gftable, main = "Teammates Gender Pairing * Number Selecting Foe", xlab = "Gender Pairing", ylab = "          Teammates Selected Foe", dotsize = 10, label.size = 1, label.lines=TRUE)

Mosaic plot

library(graphics)
mosaicplot(gftable, shade = TRUE, las = 2, main = "Gender Pairing by Teammates Selecting Foe")

Cohen-Friendly Association plot

install.packages("vcd")

library(vcd)

## Loading required package: grid

# Note: assocplot() comes from the base graphics package; the vcd package's own version is assoc()
assocplot(gftable, col = c("black", "red"), space = 0.3, main = "Gender Pairings and Selection of Foe", xlab = "Gender Pair", ylab = "Number of Teammates Selecting Foe")

Visualize Pearson residuals

install.packages("corrplot")

library(corrplot)

## corrplot 0.84 loaded

corrplot(chisqff$residuals, is.cor = FALSE)

Visualize the relative contribution of each cell to the total Chi-square score, which gives some indication of the nature of the dependency between the rows and columns of the contingency table.

corrplot(contrib, is.cor = FALSE)

Scatter plot

plot(cash ~ age, data=ffdata)

Based on this scatter plot, there doesn’t appear to be a relationship between the age of the contestant and the amount of cash earned into their trust fund.

Box plot

boxplot(ffdata$cash)

Histogram

hist(ffdata$cash, main = "Cash Deposits Earned Into Team's Trust Funds", xlab = "Cash, in thousands of dollars")

Conclusion

While the data initially appeared to indicate that different-gender pairings were more likely to select “foe” in the Trust Box than same-gender pairings, a Pearson’s Chi-Squared test found that we cannot conclude that the variables are associated.

We did not have enough data to explore how race relations might impact contestants’ choices to vote “friend” or “foe” because there were no multi-racial pairings in the data set. Parenthetically, I watched only one episode of the game show (https://www.youtube.com/watch?v=035_OxEw9Io), and noted that there was a pairing of a black contestant with a white contestant. I’m not sure why this pairing was not included in the GitHub data set.

It appears that contestants on the game show tend to be slightly more contentious than friendly in their voting (115 of 227 contestants and 132 of 227 partners voted “foe”), regardless of the battle of the sexes. Perhaps this is an indication that contestants are aware (or are made aware by producers) that conflict adds drama to a TV game show and is therefore desirable within the context of a TV program that is looking to improve audience ratings and to keep the audience tuned in.

Further research might uncover an association between gender differences or race differences and the voting preferences of “Friend or Foe?” contestants, but it would probably require far more data, and many more seasons of the TV program. Since the show is currently off the air, we don’t have the option to collect more data from more contestants. But we could use the current data set to explore other questions, such as whether males or females are more likely to select “foe”.

Ayala Cohen’s Final Project for R Bridge Course

Ayala Cohen

January 16, 2018