With Danielle’s help I was able to quickly adapt to using the pipe function and navigating the group_by, summarise, and ungroup functions.
Looking at a realistic example of how means and standard deviations of data would be plotted definitely helped me better understand the role of each function.
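A minimal sketch of that group_by, summarise, and ungroup pattern, using made-up toy data rather than the class dataset (the names `toy`, `condition`, and `rt` are assumptions for illustration only):

```r
library(dplyr)

# Toy data, invented for illustration: one score per row, two conditions
toy <- tibble(
  condition = c("a", "a", "a", "b", "b", "b"),
  rt        = c(1, 2, 3, 4, 6, 8)
)

toy_summary <- toy %>%
  group_by(condition) %>%   # one group per condition
  summarise(
    mean_rt = mean(rt),     # group mean
    sd_rt   = sd(rt)        # group standard deviation
  ) %>%
  ungroup()                 # make sure no grouping is carried forward

toy_summary
```

With a single grouping variable, summarise() already drops the grouping, so ungroup() here mostly acts as a safety net before later steps such as plotting.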
Data wrangling was definitely more intense, as expected. Learning to filter data and then add another filter on top was super helpful.
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.4 ✓ purrr 0.3.4
## ✓ tibble 3.1.2 ✓ dplyr 1.0.6
## ✓ tidyr 1.1.3 ✓ stringr 1.4.0
## ✓ readr 1.4.0 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
swow <- read_tsv(file = "data_swow.csv.zip")
## ! Multiple files in zip: reading ''swow.csv''
##
## ── Column specification ────────────────────────────────────────────────────────
## cols(
## cue = col_character(),
## response = col_character(),
## R1 = col_double(),
## N = col_double(),
## R1.Strength = col_double()
## )
swow <- swow %>% mutate(id = 1:n())
swow <- swow %>%
  rename(n_response = R1,
         n_total = N,
         strength = R1.Strength)
woman_fwd <- swow %>%
  filter(
    cue == "woman",
    n_response > 1
  )
ggplot(woman_fwd) +
  geom_col(aes(
    x = response,
    y = strength
  )) +
  coord_flip()
Arranging the woman_fwd, woman_bck, man_fwd, and man_bck association data in order of decreasing strength, and selecting only the variables of interest, did wonders for creating a more coherent representation of the data!
woman_fwd <- swow %>%
  filter(cue == "woman", n_response > 1) %>%
  arrange(desc(strength)) %>%
  select(cue, response, strength, id)
woman_bck <- swow %>%
  filter(response == "woman", n_response > 1) %>%
  arrange(desc(strength)) %>%
  select(cue, response, strength, id)
man_fwd <- swow %>%
  filter(cue == "man", n_response > 1) %>%
  arrange(desc(strength)) %>%
  select(cue, response, strength, id)
man_bck <- swow %>%
  filter(response == "man", n_response > 1) %>%
  arrange(desc(strength)) %>%
  select(cue, response, strength, id)
The mutate() stage of wrangling data is where it got interesting for me. Taking time to understand the rationale for the data itself helped me navigate through the various upcoming tasks.
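As a small illustration of the rank ordering used in the mutate() step, here is a sketch with made-up strength values (not taken from the S.W.O.W. data). It shows why the strengths are negated before ranking:

```r
# Made-up strengths, sorted from strongest to weakest
strength <- c(0.30, 0.20, 0.20, 0.05)

# rank() gives rank 1 to the smallest value ...
ranks_asc <- rank(strength)     # 4.0 2.5 2.5 1.0

# ... so negating the values gives rank 1 to the strongest associate
ranks_desc <- rank(-strength)   # 1.0 2.5 2.5 4.0
```

Tied values share the average of their ranks by default, which is why the two 0.20 entries both get 2.5.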
# Mutate - let's replace the raw strengths with a rank ordering
# and create a "type" variable (either forward or backward),
# "word" (either man or woman) and
# "associate" (which is the response or cue depending on fwd or bck)
woman_fwd <- swow %>%
  filter(cue == "woman", n_response > 1) %>%
  arrange(desc(strength)) %>%
  select(cue, response, strength, id) %>%
  mutate(
    rank = rank(-strength),
    type = "forward",
    word = "woman",
    associate = response)
woman_bck <- swow %>%
  filter(response == "woman", n_response > 1) %>%
  arrange(desc(strength)) %>%
  select(cue, response, strength, id) %>%
  mutate(
    rank = rank(-strength),
    type = "backward",
    word = "woman",
    associate = cue)
man_fwd <- swow %>%
  filter(cue == "man", n_response > 1) %>%
  arrange(desc(strength)) %>%
  select(cue, response, strength, id) %>%
  mutate(
    rank = rank(-strength),
    type = "forward",
    word = "man",
    associate = response)
man_bck <- swow %>%
  filter(response == "man", n_response > 1) %>%
  arrange(desc(strength)) %>%
  select(cue, response, strength, id) %>%
  mutate(
    rank = rank(-strength),
    type = "backward",
    word = "man",
    associate = cue)
# Use bind_rows() to stack data sets vertically
# Clean up using select()
# and filter() [we want to keep cases where associate is not equal (!=) to woman (or man)]
# because cases where man and woman were associated with each other are not useful for the analysis we want to run
gender <- bind_rows(woman_fwd, woman_bck, man_fwd, man_bck) %>%
  select(id:associate) %>%
  filter(associate != "man", associate != "woman")
It took me a while to fully understand how to use the S.W.O.W. data to pivot our forward gender associations. I was particularly confused by the code that creates larger values for strong associates and smaller values for weak associates (by taking 1/rank) and replaces all missing values with 0 (e.g. ‘replace_na(1/woman, 0)’), but I got the hang of it eventually!
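To make the 1/rank and replace_na() idea concrete, here is a minimal sketch on a made-up table (the associates and ranks are invented, not real S.W.O.W. values). An associate that appears for only one word gets an NA for the other word after pivoting, and that NA is what becomes 0:

```r
library(dplyr)
library(tidyr)

# Invented ranks: "mother" only appears for "woman",
# "strong" only for "man", and "person" for both
toy <- tibble(
  associate = c("mother", "person", "person", "strong"),
  word      = c("woman",  "woman",  "man",    "man"),
  rank      = c(1,        2,        1,        2)
)

toy_wide <- toy %>%
  pivot_wider(
    id_cols     = associate,  # one row per associate
    names_from  = word,       # one column per word (woman, man)
    values_from = rank        # filled with the rank, NA where missing
  ) %>%
  mutate(
    woman = replace_na(1 / woman, 0),  # rank 1 -> 1, rank 2 -> 0.5, NA -> 0
    man   = replace_na(1 / man, 0),
    diff  = woman - man                # positive = more "woman", negative = more "man"
  )

toy_wide
```

Taking 1/rank means the top-ranked associate scores 1 and lower-ranked ones shrink towards 0, so treating a missing associate as 0 puts it at the weak end of the same scale.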
gender_fwd <- gender %>%
  filter(
    type == "forward") %>%
  pivot_wider(
    id_cols = associate,
    names_from = word,
    values_from = rank) %>%
  mutate(
    woman = (1/woman) %>% replace_na(0),
    man = (1/man) %>% replace_na(0),
    diff = woman - man)
ggplot(data = gender_fwd,
       mapping = aes(
         x = associate %>%
           reorder(diff), # reorder by the value of diff (indicating strength of association)
         y = diff)) +
  geom_col() +
  coord_flip()
Taking the BACKWARD association data and reshaping it was also a difficult task for me. I tried to Google a solution but I had no luck, and while I got it in the end with the help of my peers, I definitely want to work on independently solving the problems I will inevitably encounter in the future of my coding journey. I think this ties nicely into my future goals for this course. I am hoping to work diligently with my group to produce the descriptive stats required for our verification reports, and I wish to make the most out of the resources available to me to become confident with using R!
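For the backward reshaping mentioned above, one possible sketch is below. It assumes the backward case follows the same recipe as the forward one, with only the filter on `type` changed; this is an assumption, not necessarily the solution reached in class. The helper name `reshape_associations` and the toy table `gender_toy` are hypothetical and made up for illustration:

```r
library(dplyr)
library(tidyr)

# Toy stand-in with the same columns as `gender` (values invented)
gender_toy <- tibble(
  id        = 1:6,
  rank      = c(1, 2, 1, 1, 2, 1),
  type      = c("forward", "forward", "forward",
                "backward", "backward", "backward"),
  word      = c("woman", "woman", "man", "woman", "woman", "man"),
  associate = c("lady", "person", "person", "girl", "human", "human")
)

# Hypothetical helper: the same steps for either direction
reshape_associations <- function(data, direction) {
  data %>%
    filter(type == direction) %>%
    pivot_wider(
      id_cols     = associate,
      names_from  = word,
      values_from = rank
    ) %>%
    mutate(
      woman = replace_na(1 / woman, 0),
      man   = replace_na(1 / man, 0),
      diff  = woman - man
    )
}

gender_bck_toy <- reshape_associations(gender_toy, "backward")
gender_bck_toy
```

Wrapping the steps in a function keeps the forward and backward versions from drifting apart. Note that the sketch expects both "woman" and "man" to occur in the chosen direction; otherwise one of the pivoted columns would not exist.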