Project Background

In this project, participants were asked to describe a time when they witnessed someone deviate from a widely accepted norm. After reflecting on that instance, participants answered follow-up questions about how witnessing the deviation changed both their own attitudes toward the norm and their perception of the norm in society more generally.

Import data

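The import() call below is presumably rio's, and here() builds the file path; these and the other packages used throughout are assumed to be attached in a setup chunk that is not shown. A minimal sketch of the dependencies:

library(rio)          # import()
library(here)         # here()
library(tidyverse)    # dplyr, ggplot2, tidyr
library(tm)           # VCorpus(), tm_map(), TermDocumentMatrix()
library(ggwordcloud)  # ggwordcloud()
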
data <- import(here("Data", "norm_deviation_GS_SP19.xlsx"))
glimpse(data)
## Rows: 200
## Columns: 27
## $ Duration__in_seconds_ <dbl> 664, 843, 1244, 1534, 3382, 776, 1132, 1467, 986~
## $ RecordedDate          <dttm> 2019-04-01 12:42:36, 2019-04-01 14:50:36, 2019-~
## $ Q1.1                  <chr> "20", "19", "18", "30", "19", "19", "18", "18", ~
## $ Q328                  <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ~
## $ Q328_4_TEXT           <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, ~
## $ Q1.3                  <dbl> 10, 16, 16, 16, 14, 16, 10, 8, 17, 16, 16, 16, 1~
## $ Q1.3_17_TEXT          <chr> NA, NA, NA, NA, NA, NA, NA, NA, "mixed race", NA~
## $ Q346                  <chr> NA, "one of my friends in high school made the d~
## $ Q347.0                <chr> "religion norms", "gender norms- female cleanlin~
## $ Q348                  <chr> "traditional house hold, people around believe i~
## $ Q349.0                <dbl> 5, 5, 1, 4, 6, 7, 5, 4, 6, 5, 5, 6, 4, 7, 5, 1, ~
## $ Q350                  <chr> "School", "school environment", "This was within~
## $ Q351.0                <chr> "25", "when I witnessed the act and explanation,~
## $ Q352                  <chr> "95 percent", "women", "It applies to anyone", "~
## $ Q353.0                <dbl> 4, 4, 4, 4, 6, 6, 6, 4, 4, 4, 6, 4, 4, 4, 4, 6, ~
## $ Q355.0                <dbl> 6, 7, 7, 7, 6, 5, 5, 7, 7, 5, 6, 7, 6, 7, 7, 1, ~
## $ Q356                  <dbl> 7, 5, 7, 6, 5, 5, 4, 7, 5, 5, 6, 6, 5, 6, 4, 1, ~
## $ Q357                  <dbl> 95, 100, 95, 97, 95, 80, 80, 97, 90, 95, 90, 95,~
## $ Q358                  <dbl> 80, 98, 65, 95, 60, 50, 30, 97, 70, 0, 90, 90, 8~
## $ Q359                  <dbl> 4, 7, 4, 7, 7, 8, 7, 1, 8, 5, 4, 7, 6, 7, 8, 1, ~
## $ Q360                  <dbl> 4, 7, 1, 7, 6, 9, 7, 1, 8, 5, 4, 7, 6, 9, 9, 1, ~
## $ Q361                  <dbl> 2, 1, 2, 1, 1, 2, 2, 2, 2, 1, 1, 2, 1, 1, 2, 2, ~
## $ Q363                  <chr> NA, "made me think about why women are held to t~
## $ Q364                  <dbl> 6, 1, 7, 6, 6, 5, 5, 7, 7, 5, 6, 7, 6, 5, 6, 1, ~
## $ Q365                  <dbl> 5, 5, 7, 5, 7, 5, 5, 5, 5, 5, 6, 6, 5, 1, 5, 1, ~
## $ Q366                  <dbl> 95, 99, 95, 97, 100, 60, 90, 100, 90, 95, 90, 90~
## $ Q367                  <dbl> 80, 99, 65, 95, 100, 100, 30, 100, 70, 0, 90, 90~

Cleaning Data

Renaming columns

data_clean <- data %>% 
  rename(Age = Q1.1,
         Gender = Q328,
         Gender_Text = Q328_4_TEXT,
         Ethnicity = Q1.3,
         Ethnicity_Text = Q1.3_17_TEXT,
         Deviation_Text = Q346,
         Norm_Deviated = Q347.0,
         Identity_Norm_Deviant = Q348,
         Trust_Deviant_Worldview = Q349.0,
         Location_Deviation = Q350,
         Num_Ppl_Present = Q351.0,
         Group_Norm_Applies = Q352,
         Whether_Observer_in_Group = Q353.0,
         Prior_Rate_Followed_Norm = Q355.0,
         Prior_Observer_Norm_Endorsement = Q356,
         Prior_Percent_Follow = Q357,
         Prior_Percent_Believed = Q358,
         Shared_Views_1 = Q359,
         Shared_Views_2 = Q360,
         Whether_Opinion_Changed = Q361,
         Whether_Opinion_Changed_Text = Q363,
         Post_Rate_Follow_Norm = Q364,
         Post_Observer_Norm_Endorsement = Q365,
         Post_Percent_Follow = Q366,
         Post_Percent_Believe = Q367)
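
As a quick sanity check (a minimal sketch), we can confirm that every Q-prefixed Qualtrics column received a descriptive name:

# Should return character(0) if every Q column was renamed
grep("^Q", names(data_clean), value = TRUE)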

Descriptions of Variables

  • Deviation_Text: Describe a time when you witnessed someone deviate from a norm by doing something that challenged conventions or by expressing an opinion that went against widely accepted ideas.

  • Norm_Deviated: What norm did this person deviate from?

  • Identity_Norm_Deviant: Describe the identity of the person who you witnessed deviate from a norm.

  • Trust_Deviant_Worldview: How much do you trust this person’s way of seeing the world?

  • Location_Deviation: Where did the instance you described above take place?

  • Num_Ppl_Present: How many people were present?

  • Group_Norm_Applies: What group of people does the norm you described above apply to?

  • Whether_Observer_in_Group: Are you a part of this group?

  • Prior_Rate_Followed_Norm: Prior to witnessing someone deviate from this norm, how often did you follow this norm when you were in a situation where the norm applied?

  • Prior_Observer_Norm_Endorsement: Prior to witnessing someone deviate from this norm, how strongly did you believe that people should follow this norm?

  • Prior_Percent_Follow: Prior to witnessing someone deviate from this norm, what percentage of people in the group that this norm applies to did you think followed this norm? Fill in a percentage between 0 and 100.

  • Prior_Percent_Believed: Prior to witnessing someone deviate from this norm, what percentage of people in your society did you think believed that people should follow this norm? Fill in a percentage between 0 and 100.

  • Shared_Views_1: Please rate your agreement with the following statements about you and the person you described previously at the time that you witnessed them deviating from a norm: “I felt like we saw the world in the same way.”

  • Shared_Views_2: Please rate your agreement with the following statements about you and the person you described previously at the time that you witnessed them deviating from a norm: “I felt like we shared the same thoughts and feelings about things.”

  • Whether_Opinion_Changed: Did witnessing someone deviate from the norm you described above change your opinion about the original norm?

  • Whether_Opinion_Changed_Text: If yes, please explain how your opinion about the original norm changed.

  • Post_Rate_Follow_Norm: How often do you follow this norm when you are in a situation where the norm applies?

  • Post_Observer_Norm_Endorsement: How strongly do you believe that people should follow this norm?

  • Post_Percent_Follow: What percentage of people in the group that this norm applies to do you think follow this norm? Fill in a percentage between 0 and 100.

  • Post_Percent_Believe: What percentage of people in your society do you think believe that people should follow this norm? Fill in a percentage between 0 and 100.

Cleaning variable types

# str(data_clean)

data_clean <- data_clean %>%
  mutate(Age = as.numeric(Age),
         Gender = as.factor(Gender),
         Ethnicity = as.factor(Ethnicity),
         Whether_Observer_in_Group = as.factor(Whether_Observer_in_Group),
         Whether_Opinion_Changed = as.factor(Whether_Opinion_Changed))

nrow(data_clean) # 200
## [1] 200

Recoding numerical variables

data_clean$Trust_Deviant_Worldview <- recode(data_clean$Trust_Deviant_Worldview, `1` = 1, `4` = 2, `5` = 3, `6` = 4, `7` = 5) 

data_clean$Prior_Rate_Followed_Norm <- recode(data_clean$Prior_Rate_Followed_Norm, `1` = 1, `4` = 2, `5` = 3, `6` = 4, `7` = 5)

data_clean$Prior_Observer_Norm_Endorsement <- recode(data_clean$Prior_Observer_Norm_Endorsement, `1` = 1, `4` = 2, `5` = 3, `6` = 4, `7` = 5)

data_clean$Shared_Views_1 <- recode(data_clean$Shared_Views_1, `1` = 1, `4` = 2, `5` = 3, `6` = 4, `7` = 5, `8` = 6, `9` = 7)

data_clean$Shared_Views_2 <- recode(data_clean$Shared_Views_2, `1` = 1, `4` = 2, `5` = 3, `6` = 4, `7` = 5, `8` = 6, `9` = 7)

data_clean$Post_Rate_Follow_Norm <- recode(data_clean$Post_Rate_Follow_Norm, `1` = 1, `4` = 2, `5` = 3, `6` = 4, `7` = 5)

data_clean$Post_Observer_Norm_Endorsement <- recode(data_clean$Post_Observer_Norm_Endorsement, `1` = 1, `4` = 2, `5` = 3, `6` = 4, `7` = 5)

# Relabel the raw 1/2 codes as 0/1 (these levels receive "Yes"/"No" labels below)
levels(data_clean$Whether_Opinion_Changed) <- c(0, 1)
levels(data_clean$Whether_Opinion_Changed)
## [1] "0" "1"

Labeling levels of nominal variables

levels(data_clean$Gender) <- c("Female", "Male", "Non-binary")
levels(data_clean$Ethnicity) <- c("American Indian or Alaska Native", "Asian", "Black or African American", "Hispanic, Latinx or Spanish Origin", "Middle Eastern or North African", "Native Hawaiian or Other Pacific Islander", "White", "Some other ethnicity or origin", "I prefer not to answer")
levels(data_clean$Whether_Observer_in_Group) <- c("Yes", "No")
levels(data_clean$Whether_Opinion_Changed) <- c("Yes", "No")
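
Note that assigning to levels() depends on the order of the existing factor levels. A sketch of a more explicit alternative for Whether_Opinion_Changed (run in place of the two-step relabeling above), assuming the raw codes are 1 and 2:

data_clean <- data_clean %>%
  mutate(Whether_Opinion_Changed = recode_factor(Whether_Opinion_Changed,
                                                 `1` = "Yes", `2` = "No"))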

Choosing specific columns

data_short <- data_clean %>%
  select(Age:Post_Percent_Believe)

data_short <- data_short %>%
  filter(!is.na(Whether_Opinion_Changed)) %>%
  droplevels()

Bar Plots

Whether opinion changed after witnessing norm deviation

text_settings <- theme(text = element_text(size = 16),
                       plot.title = element_text(size = 16),
                       axis.text.x = element_text(size = 16),
                       axis.text.y = element_text(size = 16),
                       axis.ticks = element_blank())

ggplot(data_short, aes(x = Whether_Opinion_Changed)) +
  geom_bar(na.rm = TRUE, fill = "coral") +
  scale_x_discrete(na.translate = FALSE) +
  labs(title = "Whether opinion changed after witnessing the norm deviation", x = "Whether Opinion Changed", y = "Frequency") +
  theme_light() + 
  theme(plot.title = element_text(hjust=0.5)) + 
  text_settings
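
The counts behind this bar plot can be verified directly (a minimal sketch):

data_short %>%
  count(Whether_Opinion_Changed)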

Investigating whether rates of opinion change differ between men and women

# Outcome variable by gender
data_short <- data_short %>%
  filter(!is.na(Gender),
         Gender != "Non-binary") %>%
  droplevels()

Stacked bar graph

library(RColorBrewer)

ggplot(data_short, aes(x = Whether_Opinion_Changed, fill = Gender)) +
  geom_bar(na.rm = TRUE) +
  scale_x_discrete(na.translate = FALSE) +
  labs(title = "Whether opinion changed after witnessing the norm deviation", x = "Whether Opinion Changed", y = "Frequency") +
  scale_fill_brewer(palette = "Accent") +
  theme_light() + 
  theme(plot.title = element_text(hjust=0.5)) + 
  text_settings

Side-by-side bar graph

ggplot(data_short, aes(x = Whether_Opinion_Changed, fill = Gender)) +
  geom_bar(position = "dodge", na.rm = TRUE) +
  scale_x_discrete(na.translate = FALSE) +
  labs(title = "Whether opinion changed after witnessing the norm deviation", x = "Whether Opinion Changed", y = "Frequency") +
  scale_fill_brewer(palette = "Accent") +
  theme_light() + 
  theme(plot.title = element_text(hjust=0.5)) + 
  text_settings
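
Because the sample likely contains unequal numbers of men and women, raw frequencies are hard to compare across genders; a sketch computing the within-gender rate of opinion change:

data_short %>%
  count(Gender, Whether_Opinion_Changed) %>%
  group_by(Gender) %>%
  mutate(proportion = n / sum(n))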

String Manipulation

Text analysis of the Deviation_Text variable

  • Deviation_Text: Describe a time when you witnessed someone deviate from a norm by doing something that challenged conventions or by expressing an opinion that went against widely accepted ideas.

Clean the string variables (i.e., remove punctuation, numbers, common words, and extra whitespace, and convert to lowercase)

docs <- VCorpus(VectorSource(data_short$Deviation_Text)) %>%
  tm_map(removePunctuation) %>% # removes punctuation
  tm_map(removeNumbers) %>% # removes numbers
  tm_map(content_transformer(tolower)) %>% # converts to lowercase
  tm_map(removeWords, stopwords("english")) %>% # removes common words (e.g., 'a', 'the')
  tm_map(stripWhitespace) # removes extra whitespace
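
A quick way to spot-check the cleaning on a single response (a sketch):

as.character(docs[[1]])  # view the first cleaned response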


tdm <- TermDocumentMatrix(docs) %>% # creates a matrix in which each unique word is a row and each document (here, each participant's response) is a column
  as.matrix()
colnames(tdm) <- 1:nrow(data_short)

dtm <- DocumentTermMatrix(docs) %>% # creates a matrix in which each document is a row and each unique word is a column
  as.matrix()

# head(dtm)

Calculate frequency of each word

freq <- colSums(dtm)

Gather columns representing unique words into a single variable column

dtm_gather <- gather(as.data.frame(dtm), key = unique_words, value = frequency, able:zumba)

# Create a new variable containing only the unique words
unique_words <- unique(dtm_gather$unique_words)

Combine unique words and frequencies into a single dataframe

word_set <- cbind.data.frame(unique_words, freq)

Arrange word set in descending order from most to least frequently used words

word_set_ordered <- word_set[order(-freq),]

Wordcloud

ggwordcloud(word_set$unique_words, word_set$freq, colors = c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E")) + labs(title = "Describe a time when you witnessed someone deviate from a norm by doing something that challenged \nconventions or by expressing an opinion that went against widely accepted ideas.")
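
The same clean-count-arrange pipeline repeats for each open-ended variable below; a sketch of a helper that wraps the steps (the name word_freqs is hypothetical), with prompt-restating words removable per variable:

word_freqs <- function(text, extra_stopwords = character(0)) {
  docs <- VCorpus(VectorSource(text)) %>%
    tm_map(removePunctuation) %>%
    tm_map(removeNumbers) %>%
    tm_map(content_transformer(tolower)) %>%
    tm_map(removeWords, c(stopwords("english"), extra_stopwords)) %>%
    tm_map(stripWhitespace)
  freq <- colSums(as.matrix(DocumentTermMatrix(docs)))
  data.frame(unique_words = names(freq), freq = freq) %>%
    arrange(desc(freq))
}

# e.g., word_freqs(data_short$Norm_Deviated, c("norm", "norms", "deviated"))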

Text analysis of the Norm_Deviated variable

  • Norm_Deviated: What norm did this person deviate from?

docs <- VCorpus(VectorSource(data_short$Norm_Deviated)) %>%
  tm_map(removePunctuation) %>% # removes punctuation
  tm_map(removeNumbers) %>% # removes numbers
  tm_map(content_transformer(tolower)) %>% # converts to lowercase
  tm_map(removeWords, c(stopwords("english"), "norm", "norms", "deviated")) %>% # removes common words (e.g., 'a', 'the') and words that restate the prompt (norm, norms, deviated)
  tm_map(stripWhitespace) # removes extra whitespace


tdm <- TermDocumentMatrix(docs) %>% # creates a matrix in which each unique word is a row and each document is a column
  as.matrix()
colnames(tdm) <- 1:nrow(data_short)

dtm <- DocumentTermMatrix(docs) %>% # creates a matrix in which each document is a row and each unique word is a column
  as.matrix()

# head(dtm)

Calculate frequency of each word

freq <- colSums(dtm)

Gather columns representing unique words into a single variable column

dtm_gather <- gather(as.data.frame(dtm), key = unique_words, value = frequency, able:yelling)

# Create a new variable containing only the unique words
unique_words <- unique(dtm_gather$unique_words)

Combine unique words and frequencies into a single dataframe

word_set <- cbind.data.frame(unique_words, freq)

Arrange word set in descending order from most to least frequently used words

word_set_ordered <- word_set[order(-freq),]

Wordcloud

ggwordcloud(word_set$unique_words, word_set$freq, colors = c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E")) + labs(title = "What norm did this person deviate from?")

Text analysis of the Identity_Norm_Deviant variable

  • Identity_Norm_Deviant: Describe the identity of the person who you witnessed deviate from a norm.

docs <- VCorpus(VectorSource(data_short$Identity_Norm_Deviant)) %>%
  tm_map(removePunctuation) %>% # removes punctuation
  tm_map(removeNumbers) %>% # removes numbers
  tm_map(content_transformer(tolower)) %>% # converts to lowercase
  tm_map(removeWords, stopwords("english")) %>% # removes common words (e.g., 'a', 'the')
  tm_map(stripWhitespace) # removes extra whitespace


tdm <- TermDocumentMatrix(docs) %>% # creates a matrix in which each unique word is a row and each document is a column
  as.matrix()
colnames(tdm) <- 1:nrow(data_short)

dtm <- DocumentTermMatrix(docs) %>% # creates a matrix in which each document is a row and each unique word is a column
  as.matrix()

# head(dtm)

Calculate frequency of each word

freq <- colSums(dtm)

Gather columns representing unique words into a single variable column

dtm_gather <- gather(as.data.frame(dtm), key = unique_words, value = frequency, acquaintance:younger)

# Create a new variable containing only the unique words
unique_words <- unique(dtm_gather$unique_words)

Combine unique words and frequencies into a single dataframe

word_set <- cbind.data.frame(unique_words, freq)

Arrange word set in descending order from most to least frequently used words

word_set_ordered <- word_set[order(-freq),]

Wordcloud

ggwordcloud(word_set$unique_words, word_set$freq, colors = c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E")) + labs(title = "Describe the identity of the person who you witnessed deviate from a norm.")

Text analysis of the Whether_Opinion_Changed_Text variable

  • Whether_Opinion_Changed_Text: If yes, please explain how your opinion about the original norm changed.

docs <- VCorpus(VectorSource(data_short$Whether_Opinion_Changed_Text)) %>%
  tm_map(removePunctuation) %>% # removes punctuation
  tm_map(removeNumbers) %>% # removes numbers
  tm_map(content_transformer(tolower)) %>% # converts to lowercase
  tm_map(removeWords, stopwords("english")) %>% # removes common words (e.g., 'a', 'the')
  tm_map(stripWhitespace) # removes extra whitespace


tdm <- TermDocumentMatrix(docs) %>% # creates a matrix in which each unique word is a row and each document is a column
  as.matrix()
colnames(tdm) <- 1:nrow(data_short)

dtm <- DocumentTermMatrix(docs) %>% # creates a matrix in which each document is a row and each unique word is a column
  as.matrix()

# head(dtm)

Calculate frequency of each word

freq <- colSums(dtm)

Gather columns representing unique words into a single variable column

dtm_gather <- gather(as.data.frame(dtm), key = unique_words, value = frequency, everything())

# Create a new variable containing only the unique words
unique_words <- unique(dtm_gather$unique_words)

Combine unique words and frequencies into a single dataframe

word_set <- cbind.data.frame(unique_words, freq)

Arrange word set in descending order from most to least frequently used words

word_set_ordered <- word_set[order(-freq),]

Wordcloud

ggwordcloud(word_set$unique_words, word_set$freq, colors = c("#1B9E77", "#D95F02", "#7570B3", "#E7298A", "#66A61E")) + labs(title = "Explain how your opinion about the original norm changed.")
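
To complement the word clouds with an exact ranking, the ordered word set built above can be printed directly (a minimal sketch):

head(word_set_ordered, 10)  # ten most frequent words and their counts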