R Markdown

This is an R Markdown document. I often use Markdown to share R scripts with research partners and students. This helps show how we use data analysis software like R to answer research questions - or more often, to open up more research questions!

# Load required libraries ---------------------------------------------------
library(googlesheets4); library(lubridate); library(reshape)
library(dplyr); library(tidyverse)
library(ggplot2); library(ggpubr); library(plotly)

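#Set the working directory to this file's folder
#(getActiveDocumentContext() only works in an interactive RStudio session)
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))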

Two Tests, Two Results

This analysis draws on two tests: a pre-test that students took online, and a hand test (actually, two) taken in class. The formats were a little different because, in qualitative research, asking the same questions in multiple ways can strengthen your results.

I compiled all of the data together in a Google Sheet, and it is used for all of the analysis here.

Import Data

I have hidden the R code that pulls in data from the student surveys, but I include all of the data analysis that follows once the student data has been anonymized and its order randomized:

After pulling in the data, I called it sData, standing for “survey data.”

#Give each row a sequential person ID (replaced by a randomized ID below)
sData$person <- as.factor(seq.int(nrow(sData)))

#This shows how I initially randomized the anonymous data order:
#randomOrder2 <- sample(sData$person); write.csv(randomOrder2, 'randomOrder2.csv')

personOrder <- read.csv('randomOrder2.csv')
sData$person <- as.factor(personOrder$x)
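The randomization step above can be tried out on made-up data. This is a minimal sketch, assuming a hypothetical five-row data frame called `toy` and a temporary file standing in for randomOrder2.csv; the `set.seed()` call is my addition so that the shuffle is repeatable. Note that this approach relabels the rows with shuffled person IDs rather than physically reordering them.

```r
# Hypothetical stand-in for sData: five respondents in their original order
toy <- data.frame(answer = c("a", "b", "c", "d", "e"))
toy$person <- as.factor(seq.int(nrow(toy)))

# Shuffle the person labels once and save them, so that every later run
# reuses the same anonymized labels
set.seed(42)  # assumption: fixed seed so the example is repeatable
shuffled <- sample(toy$person)
orderFile <- tempfile(fileext = ".csv")  # stands in for 'randomOrder2.csv'
write.csv(shuffled, orderFile)

# Read the saved order back; write.csv() stores an unnamed vector in column "x"
personOrder <- read.csv(orderFile)
toy$person <- as.factor(personOrder$x)
print(toy)
```

Every original ID still appears exactly once; only the pairing between IDs and rows has changed.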

#Map reactions to numerical values
original <- c("Doesn't Sound Like Me", "Sounds a Little Like Me", 
              "Sounds Somewhat Like Me", "Sounds A Lot Like Me")
newNames <- c(1,2,3,4)

####Process the Data####
sData$driving <- plyr::mapvalues(sData$driving, from=original, to=newNames)
sData$expressive <- plyr::mapvalues(sData$expressive, from=original, to=newNames)
sData$amiable <- plyr::mapvalues(sData$amiable, from=original, to=newNames)
sData$analytical <- plyr::mapvalues(sData$analytical, from=original, to=newNames)

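#Convert via character so that factor columns give the scores, not level codes
sData$driving <- as.numeric(as.character(sData$driving))
sData$expressive <- as.numeric(as.character(sData$expressive))
sData$amiable <- as.numeric(as.character(sData$amiable))
sData$analytical <- as.numeric(as.character(sData$analytical))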
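As a cross-check on the mapping above, base R's `match()` should give the same 1 to 4 scores without plyr. Here is a minimal sketch on made-up answers (the `answers` vector is hypothetical, not class data); any response that is not in `original` would come back as `NA`.

```r
# The four survey responses, in order from weakest to strongest agreement
original <- c("Doesn't Sound Like Me", "Sounds a Little Like Me",
              "Sounds Somewhat Like Me", "Sounds A Lot Like Me")

# Hypothetical answers from three respondents (not real class data)
answers <- c("Sounds A Lot Like Me", "Doesn't Sound Like Me",
             "Sounds Somewhat Like Me")

# match() returns the position of each answer in `original`,
# which is exactly the 1-4 score
scores <- match(answers, original)
print(scores)  # 4 1 3
```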

####Calculate x-y coordinates from survey####
sData$assertiveness_survey <- sData$driving + sData$expressive - 
  sData$analytical - sData$amiable
sData$responsiveness_survey <- sData$analytical + sData$driving - 
  sData$amiable - sData$expressive
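To make the coordinate formulas concrete, here is a worked example for one made-up respondent (the four scores are hypothetical). It is only a sketch of the arithmetic above, not part of the class analysis.

```r
# Hypothetical respondent (scores on the 1-4 scale; not real class data)
driving <- 4; expressive <- 3; analytical <- 2; amiable <- 1

# Same formulas as above
assertiveness  <- driving + expressive - analytical - amiable   # 4 + 3 - 2 - 1
responsiveness <- analytical + driving - amiable - expressive   # 2 + 4 - 1 - 3

print(c(assertiveness = assertiveness, responsiveness = responsiveness))
# assertiveness = 4, responsiveness = 2: both positive

# Extreme case: two styles at 4 and the other two at 1
print((4 + 4) - (1 + 1))  # 6, so each coordinate lies between -6 and 6
```

With both coordinates positive, this respondent falls in the quadrant where assertiveness and responsiveness are both high.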

####Calculate x-y coordinates from hand evaluations####
sData$assertiveness_hand <- sData$assertiveness_hand/15
sData$responsiveness_hand <- sData$responsiveness_hand/15

#We need to remap possible values from [1, 4] to [-5, 5]
#Using the equation here, https://rosettacode.org/wiki/Map_range
map_ss <- function(input) {
  a1 <- 1; a2 <- 4; b1 <- -5; b2 <- 5
  output <- b1 + (input-a1)*(b2-b1)/(a2-a1) 
  return(output)
}

sData$assertiveness_hand <- map_ss(sData$assertiveness_hand)
sData$responsiveness_hand <- map_ss(sData$responsiveness_hand)
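A quick sanity check of the range mapping: the endpoints of [1, 4] should land on -5 and 5, and the midpoint 2.5 should land on 0. The sketch below repeats the `map_ss` definition so that it runs on its own.

```r
# Linear map from [a1, a2] = [1, 4] onto [b1, b2] = [-5, 5], as defined above
map_ss <- function(input) {
  a1 <- 1; a2 <- 4; b1 <- -5; b2 <- 5
  b1 + (input - a1) * (b2 - b1) / (a2 - a1)
}

# Endpoints and midpoint of the input range
print(map_ss(c(1, 2.5, 4)))  # -5  0  5
```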

#Nobody has identified as non-binary in our class, so we will
#simplify pronouns to two genders
sData$gender <- as.factor(sData$pronouns) 
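sData$major <- as.character(sData$major) #Add student majors for analysis

#Because we have one Engineering Physics major, we will need to 
#modify to maintain anonymity (recode before converting to a factor,
#so no unused "Engineering Physics" level is left behind):
sData$major[which(sData$major == "Engineering Physics")] <- "Mechanical"
sData$major <- as.factor(sData$major)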

Plotting the Data

Now, we can take a look at the data, from both the initial online survey and the in-class hand worksheet.

####Plot on X-Y####
surveyPlot <- ggplot(data=sData, aes(x=assertiveness_survey, 
                                    y=responsiveness_survey)) + 
  theme_bw() +
  geom_label(label="AMIABLE", x= -2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="EXPRESSIVE", x= 2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="DRIVING", x= 2.5, y= 2.5, col="black",show.legend=F) +
  geom_label(label="ANALYTICAL", x= -2.5, y= 2.5, col="black",show.legend=F) +
  geom_point(aes(fill=person), shape=21,color="black",size=3,
             position=position_jitterdodge(jitter.width=0.25,
                                           jitter.height = 0.25)) +
  geom_hline(yintercept = 0,col="black",linetype="dashed") +
  geom_vline(xintercept = 0,col="black",linetype="dashed") +
  xlim(-5,5) + ylim(-5,5) +
  xlab("ASSERTIVENESS: effort made to influence the actions of others") +
  ylab("RESPONSIVENESS: extent of reaction to the task or the team")

show(surveyPlot)

surveyPlot + geom_point(aes(x=assertiveness_hand,y=responsiveness_hand,
                     fill=person), shape=21, color="black", size=5, alpha=0.75)

Here, the ONLINE survey data is shown as smaller dots, and I’ve added the IN-CLASS survey data as larger dots. It looks like, for the most part, people’s self-evaluations changed quite a bit!

We can look at them next to each other to compare more easily:

Looking for Patterns

One issue that came up in class was that gender can be a driver of Social Styles - or perhaps of perceived Social Styles, or self-imposed Social Styles, or externally-imposed Social Styles. The way that we interact with others is a result of a chaotic mix of individual experiences and large social structures. The way we think about ourselves and form our identities is a complex outcome of inner experiences and perceptions, and shared social beliefs and practices.

Based on the initial survey data and the in-class worksheet data, are there any obvious patterns regarding student gender and self-evaluated Social Style?

#Does it look like gender is a predictor of Social Style?
plot1 <- ggplot(data=sData, aes(x=assertiveness_survey, y=responsiveness_survey)) + 
  theme_bw() + scale_fill_brewer(palette = "Set1") +
  geom_label(label="AMIABLE", x= -2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="EXPRESSIVE", x= 2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="DRIVING", x= 2.5, y= 2.5, col="black",show.legend=F) +
  geom_label(label="ANALYTICAL", x= -2.5, y= 2.5, col="black",show.legend=F) +
  geom_point(aes(fill=gender), shape=21,color="black",size=5,
             position=position_jitterdodge(jitter.width=0.25,
                                           jitter.height = 0.25)) +
  geom_hline(yintercept = 0,col="black",linetype="dashed") +
  geom_vline(xintercept = 0,col="black",linetype="dashed") +
  xlim(-5,5) + ylim(-5,5) +
  xlab("ASSERTIVENESS: effort made to influence the actions of others") +
  ylab("RESPONSIVENESS: extent of reaction to the task or the team") +
  ggtitle("On-line Survey Data")

plot2 <- ggplot(data=sData, aes(x=assertiveness_hand, y=responsiveness_hand)) + 
  theme_bw() + scale_fill_brewer(palette = "Set1") +
  geom_label(label="AMIABLE", x= -2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="EXPRESSIVE", x= 2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="DRIVING", x= 2.5, y= 2.5, col="black",show.legend=F) +
  geom_label(label="ANALYTICAL", x= -2.5, y= 2.5, col="black",show.legend=F) +
  geom_point(aes(fill=gender), shape=21,color="black",size=5,
             position=position_jitterdodge(jitter.width=0.25,
                                           jitter.height = 0.25)) +
  geom_hline(yintercept = 0,col="black",linetype="dashed") +
  geom_vline(xintercept = 0,col="black",linetype="dashed") +
  xlim(-5,5) + ylim(-5,5) +
  xlab("ASSERTIVENESS: effort made to influence the actions of others") +
  ylab("RESPONSIVENESS: extent of reaction to the task or the team") +
  ggtitle("In-class Handout Data")

ggarrange(plot1, plot2, nrow=1, ncol=2)

What other drivers might be involved in shaping our current self-evaluated Social Styles? What about your declared major?

#Does it look like declared major is a predictor of Social Style?
majorPlot <- ggplot(data=sData, aes(x=assertiveness_hand, y=responsiveness_hand)) + 
  theme_bw() + scale_fill_brewer(palette = "Set1") +
  geom_label(label="AMIABLE", x= -2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="EXPRESSIVE", x= 2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="DRIVING", x= 2.5, y= 2.5, col="black",show.legend=F) +
  geom_label(label="ANALYTICAL", x= -2.5, y= 2.5, col="black",show.legend=F) +
  geom_point(aes(fill=major), shape=21,color="black",size=5,
             position=position_jitterdodge(jitter.width=0.25,
                                           jitter.height = 0.25)) +
  geom_hline(yintercept = 0,col="black",linetype="dashed") +
  geom_vline(xintercept = 0,col="black",linetype="dashed") +
  xlim(-5,5) + ylim(-5,5) +
  xlab("ASSERTIVENESS: effort made to influence the actions of others") +
  ylab("RESPONSIVENESS: extent of reaction to the task or the team")

majorPlot

Let’s look at the average profile for each of these primary engineering majors:

aveMajor <- sData %>% group_by(major) %>%
  dplyr::summarise(assertiveness = mean(assertiveness_hand, na.rm=T), 
                   responsiveness = mean(responsiveness_hand, na.rm=T))

aveGender <- sData %>% group_by(gender) %>%
  dplyr::summarise(assertiveness = mean(assertiveness_hand, na.rm=T), 
                   responsiveness = mean(responsiveness_hand, na.rm=T))
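To show what these grouped averages compute, here is a small base R sketch using `aggregate()` on made-up data (the `toy` data frame and its values are hypothetical). It is meant as an illustration of the same per-group mean with missing values dropped, not as a replacement for the dplyr code above.

```r
# Hypothetical data: two majors, hand-evaluation coordinates (not real class data)
toy <- data.frame(
  major = c("Mechanical", "Mechanical", "Civil", "Civil"),
  assertiveness_hand  = c(2, 4, -1, NA),
  responsiveness_hand = c(1, 3, -2, 0)
)

# Mean of each coordinate within each major; na.pass keeps rows with an NA
# so that na.rm = TRUE can drop only the missing value itself
aveToy <- aggregate(cbind(assertiveness_hand, responsiveness_hand) ~ major,
                    data = toy, FUN = mean, na.rm = TRUE, na.action = na.pass)
print(aveToy)
```

The Civil row averages its one available assertiveness value (-1), while its responsiveness mean still uses both rows.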

aveMajorPlot <- ggplot(data=aveMajor, aes(x=assertiveness, y=responsiveness)) + 
  theme_bw() + scale_fill_brewer(palette = "Set1") +
  geom_label(label="AMIABLE", x= -2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="EXPRESSIVE", x= 2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="DRIVING", x= 2.5, y= 2.5, col="black",show.legend=F) +
  geom_label(label="ANALYTICAL", x= -2.5, y= 2.5, col="black",show.legend=F) +
  geom_point(aes(fill=major), shape=23,color="black",size=5,
             position=position_jitterdodge(jitter.width=0.25,
                                           jitter.height = 0.25)) +
  geom_hline(yintercept = 0,col="black",linetype="dashed") +
  geom_vline(xintercept = 0,col="black",linetype="dashed") +
  xlim(-5,5) + ylim(-5,5) +
  xlab("ASSERTIVENESS: effort made to influence the actions of others") +
  ylab("RESPONSIVENESS: extent of reaction to the task or the team")

ggarrange(majorPlot + theme(legend.position="none"), aveMajorPlot, nrow=1, ncol=2)

ggplot(data=sData, aes(x=assertiveness_hand, y=responsiveness_hand)) + 
  theme_bw() + scale_color_brewer(palette = "Set1") +
  geom_label(label="AMIABLE", x= -2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="EXPRESSIVE", x= 2.5, y= -2.5, col="black",show.legend=F) +
  geom_label(label="DRIVING", x= 2.5, y= 2.5, col="black",show.legend=F) +
  geom_label(label="ANALYTICAL", x= -2.5, y= 2.5, col="black",show.legend=F) +
  geom_point(aes(col=major,shape=gender),size=5,
             position=position_jitterdodge(jitter.width=0.25,
                                           jitter.height = 0.25)) +
  geom_hline(yintercept = 0,col="black",linetype="dashed") +
  geom_vline(xintercept = 0,col="black",linetype="dashed") +
  xlim(-5,5) + ylim(-5,5) +
  xlab("ASSERTIVENESS: effort made to influence the actions of others") +
  ylab("RESPONSIVENESS: extent of reaction to the task or the team") +
  ggtitle("In-class Handout Data")

Let’s Get Really out There

I thought it would be interesting to look at a couple of versions of 3D plots. The first one that came to mind uses a metric that represents how strongly someone aligned with the statements in the online survey (for example, “I am typically assertive and more people-oriented in teams. I’m often described by peers as enthusiastic, warm, and communicative.”). A person who has a high score here really resonated with one or more of the Social Style descriptors; a person with a low score didn’t feel very attached to any of them.

What does that look like in 3D? We will also color-code based on the Z value, which makes trends a little bit easier to see.

Adding a Z value we’ll call “Resonance”

sData$resonance <- abs(sData$assertiveness_survey) + abs(sData$responsiveness_survey)
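Here is what the resonance calculation does on a few made-up points (the coordinates are hypothetical): it adds up the absolute values of the two survey coordinates, so someone at the origin scores 0 and someone far from the origin scores high.

```r
# Hypothetical survey coordinates for three respondents (not real class data)
assertiveness_survey  <- c(4, -1, 0)
responsiveness_survey <- c(2, -3, 0)

# Resonance: sum of the absolute coordinates
resonance <- abs(assertiveness_survey) + abs(responsiveness_survey)
print(resonance)  # 6 4 0
```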

####Plot as a 3D scatter plot####
plot_ly(sData, x = ~assertiveness_hand, 
        y = ~responsiveness_hand, 
        z = ~resonance) %>%
  add_markers(color = ~resonance) %>%
  layout(
    scene = list(
      xaxis = list(title = "Assertiveness"),
      yaxis = list(title = "Responsiveness"),
      zaxis = list(title = "Resonance")))

Adding a Z value called “delta”

This is a different 3D scatter plot, where the Z value is the distance between a person’s INITIAL location on the Social Styles plane and their FINAL location. The easiest way to do this is to calculate the length of the vector between the two points (length = sqrt(x^2 + y^2), where x and y are the differences along each axis).

So in this plot, low Z values correspond to people whose self-evaluation didn’t change much between surveys, while high Z values correspond to people whose self-evaluation changed a lot.

sData$delta <- sqrt((sData$assertiveness_survey-sData$assertiveness_hand)^2 +
                      (sData$responsiveness_survey-sData$responsiveness_hand)^2)
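As a worked example of the distance formula, take one made-up person (hypothetical coordinates) who moved from (1, 1) on the online survey to (4, 5) on the in-class worksheet:

```r
# Hypothetical positions for one person (not real class data)
survey <- c(assertiveness = 1, responsiveness = 1)
hand   <- c(assertiveness = 4, responsiveness = 5)

# Straight-line distance between the two positions
delta <- sqrt(sum((survey - hand)^2))  # sqrt(3^2 + 4^2)
print(delta)  # 5
```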

####Plot as a 3D scatter plot####
plot_ly(sData, x = ~assertiveness_hand, 
        y = ~responsiveness_hand, 
        z = ~delta) %>%
  add_markers(color = ~delta) %>%
  layout(
    scene = list(
      xaxis = list(title = "Assertiveness"),
      yaxis = list(title = "Responsiveness"),
      zaxis = list(title = "Change")))

How much did people’s responses change between the two evaluations?

We’ll use the delta value we calculated above to get a broad view of how people’s self-evaluated social styles changed between the online survey and the in-class survey:

ggplot(sData, aes(x=delta)) + theme_bw() +
  geom_histogram(fill="forestgreen", col="black") + 
  xlab("Difference between test results (coordinate distance)")

Let’s visualize this - what do those distances actually mean?

bigChangers <- subset(sData, delta > 5)
smallChangers <- subset(sData, delta < 1)
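One way to picture those distances is to draw an arrow from each person's online-survey position to their in-class position. This is only a sketch on two made-up respondents (the `toy` data frame and the `changePlot` name are hypothetical, not part of the class data), and it assumes ggplot2 is installed:

```r
library(ggplot2)

# Hypothetical respondents (not real class data): one big change, one small
toy <- data.frame(
  assertiveness_survey  = c(-4, 2),
  responsiveness_survey = c(-2, 1),
  assertiveness_hand    = c(2, 2.5),
  responsiveness_hand   = c(2, 1.5)
)
toy$delta <- sqrt((toy$assertiveness_survey - toy$assertiveness_hand)^2 +
                  (toy$responsiveness_survey - toy$responsiveness_hand)^2)

# Draw an arrow from each person's survey position to their hand position,
# colored by how far they moved
changePlot <- ggplot(toy) + theme_bw() +
  geom_segment(aes(x = assertiveness_survey, y = responsiveness_survey,
                   xend = assertiveness_hand, yend = responsiveness_hand,
                   col = delta),
               arrow = arrow(length = unit(0.3, "cm"))) +
  geom_hline(yintercept = 0, col = "black", linetype = "dashed") +
  geom_vline(xintercept = 0, col = "black", linetype = "dashed") +
  xlim(-5, 5) + ylim(-5, 5) +
  xlab("ASSERTIVENESS") + ylab("RESPONSIVENESS")
print(changePlot)
```

In this toy data the first person crosses from one quadrant to the opposite one (delta above 5), while the second barely moves (delta below 1), which matches the two thresholds used for `bigChangers` and `smallChangers`.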

Versatility

In class, we took a versatility self-test. I couldn’t find an existing versatility test, so I made the one we took by collecting some phrases associated with high and low versatility and asking students to rate themselves from “Strongly Agree” to “Strongly Disagree.” Here are the results:

#Let's look at versatility in the class in the form of box plots.
#Box plots are a great way to quickly look at the median, quartiles, and
#range of a dataset.

ggplot(data=sData) + theme_bw() +
  geom_boxplot(aes(y=versatility)) + ylim(0,5) +
  geom_jitter(aes(x=0,y=versatility,col=person),width=0.1,alpha=0.75)+
  scale_x_discrete() + xlab("")
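To see the numbers a box plot is built from, here is a minimal sketch on made-up versatility scores (the `versatility` vector below is hypothetical, not the class data). The thick line in the box is the median, the box edges are the quartiles, and points beyond 1.5 box-lengths from the box are drawn as outliers.

```r
# Hypothetical versatility scores (not real class data)
versatility <- c(1, 3, 3, 3.5, 4, 4, 4.5)

print(median(versatility))                   # 3.5 (the line inside the box)
print(quantile(versatility, c(0.25, 0.75)))  # 3 and 4 (the box edges)
print(IQR(versatility))                      # 1 (the box height)
print(mean(versatility))                     # about 3.29, pulled down by the 1

# Points more than 1.5 * IQR beyond the box are flagged as outliers
print(boxplot.stats(versatility)$out)        # 1
```

The mean and the median differ here, which is why it matters that a box plot shows the median.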

A boxplot is a good way to visualize these data. Let’s go one step deeper by color-coding the box plot based on major:

ggplot(data=sData) + theme_bw() +
  geom_boxplot(aes(y=versatility,x=major,col=major)) + ylim(0,5) +
  geom_jitter(aes(x=major,y=versatility,col=major),width=0.1,alpha=0.75)+
  scale_x_discrete() + xlab("")

A Final 3D Exploration

Okay, so we now have a class where every student has rated themselves along three axes: assertiveness, responsiveness, and versatility. What does this look like in 3 dimensions? In this case, we will color-code by versatility to help visualize the z dimension:

####Plot as a 3D scatter plot####
plot_ly(sData, x = ~assertiveness_hand, 
        y = ~responsiveness_hand, 
        z = ~versatility) %>%
  add_markers(color = ~versatility) %>%
  layout(
    scene = list(
      xaxis = list(title = "Assertiveness",
                   range=list(-5,5)),
      yaxis = list(title = "Responsiveness",
                   range=list(-5,5)),
      zaxis = list(title = "Versatility",
                   range=list(0,5))))