Homework 1

Fifty male and fifty female students fill out the same questionnaire in weekly intervals starting five weeks before an important examination to measure state anxiety. The research interests are:

whether there are gender difference in state anxiety
individual differences in state anxiety

Explore the answers to both questions with plots involving confidence intervals or error bars for the means.

Source: Von Eye, A., & Schuster C. (1998). Regression Analysis for Social Sciences. San Diego: Academic Press.

dta1 <- read.table("C:/Users/TheorEco Lab/Desktop/108-2/DataManagement/0420/stateAnxiety.txt", header = T)

str(dta1)

## 'data.frame':    50 obs. of  10 variables:
##  $ f1: int  13 26 13 22 18 32 16 18 14 20 ...
##  $ f2: int  17 31 17 24 19 31 16 22 17 19 ...
##  $ f3: int  18 33 24 26 19 30 21 25 23 23 ...
##  $ f4: int  20 38 29 27 22 31 27 29 21 25 ...
##  $ f5: int  24 42 32 29 30 32 30 35 25 28 ...
##  $ m1: int  6 4 17 19 12 11 14 9 12 11 ...
##  $ m2: int  14 11 25 22 21 16 23 18 16 13 ...
##  $ m3: int  22 14 26 26 21 20 26 20 23 17 ...
##  $ m4: int  20 12 29 30 23 19 29 20 26 14 ...
##  $ m5: int  24 23 38 34 24 22 33 24 32 20 ...

Column 1: Anxiety score 5 weeks before exam for female

Column 2: Anxiety score 4 weeks before exam for female

Column 3: Anxiety score 3 weeks before exam for female

Column 4: Anxiety score 2 weeks before exam for female

Column 5: Anxiety score 1 weeks before exam for female

Column 6: Anxiety score 5 weeks before exam for male

Column 7: Anxiety score 4 weeks before exam for male

Column 8: Anxiety score 3 weeks before exam for male

Column 9: Anxiety score 2 weeks before exam for male

Column 10: Anxiety score 1 weeks before exam for male

重整資料

library(dplyr)

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

library(tidyr)
library(tidyverse)

## -- Attaching packages --------------------------------------- tidyverse 1.3.0 --

## √ ggplot2 3.3.0     √ purrr   0.3.3
## √ tibble  3.0.0     √ stringr 1.4.0
## √ readr   1.3.1     √ forcats 0.5.0

## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()

library(reshape2)

## 
## Attaching package: 'reshape2'

## The following object is masked from 'package:tidyr':
## 
##     smiths

dta1f <- subset(dta1[c(1:5)]) %>%
  mutate(ID = 1:50) %>%
  gather(key = sex, value = Score, contains("f")) %>%
  arrange(ID) %>%
  separate(sex, c("prefix", "trial")) %>%
  mutate(Trial = parse_number(prefix)) %>%
  dplyr::select(-c(prefix, trial)) %>%
  mutate(Sex = "F")

## Warning: Expected 2 pieces. Missing pieces filled with `NA` in 250 rows [1, 2,
## 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].

head(dta1f)

##   ID Score Trial Sex
## 1  1    13     1   F
## 2  1    17     2   F
## 3  1    18     3   F
## 4  1    20     4   F
## 5  1    24     5   F
## 6  2    26     1   F

dta1m <- subset(dta1[c(6:10)]) %>%
  mutate(ID = 1:50) %>%
  gather(key = sex, value = Score, contains("m")) %>%
  arrange(ID) %>%
  separate(sex, c("prefix", "trial")) %>%
  mutate(Trial = parse_number(prefix)) %>%
  dplyr::select(-c(prefix, trial)) %>%
  mutate(Sex = "M")

## Warning: Expected 2 pieces. Missing pieces filled with `NA` in 250 rows [1, 2,
## 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, ...].

head(dta1m)

##   ID Score Trial Sex
## 1  1     6     1   M
## 2  1    14     2   M
## 3  1    22     3   M
## 4  1    20     4   M
## 5  1    24     5   M
## 6  2     4     1   M

dta1L <- merge(dta1f, dta1m, by = c("ID", "Score", "Trial", "Sex"), all = T) %>%
  dplyr::select(-ID)

dta1L$Trial <- factor(dta1L$Trial, levels = c(1:5))

levels(dta1L$Trial)[1:5] <- c("5wb", "4wb", "3wb", "2wb", "1wb")

head(dta1L)

##   Score Trial Sex
## 1     6   5wb   M
## 2    13   5wb   F
## 3    14   4wb   M
## 4    17   4wb   F
## 5    18   3wb   F
## 6    20   2wb   F

1. Whether there are gender difference in state anxiety ?

library(ggplot2)

pd <- position_dodge(.3)

p11 <- dta1L %>% group_by(Sex, Trial) %>%
  summarize(m_s=mean(Score), 
            se_s=sd(Score)/sqrt(n())) %>%
  ggplot() + 
  aes(Trial, m_s, 
      group=Sex, 
      shape=Sex) +
  geom_errorbar(aes(ymin=m_s - se_s,
                    ymax=m_s + se_s),
                width=.2, size=.3, 
                position=pd) +
  geom_line(position=pd, 
            linetype='dotted') +
  geom_point(position=pd, 
             size=rel(3)) +
  scale_shape(guide=guide_legend(title="Gender")) +
  labs(x="Anxiety score before exam (xwb : x week(s) before)", y="Mean score") +
  theme_bw() 

p11

從上圖可看到，考試前五次焦慮指數，涵蓋 error bars 的範圍在男女間是沒有重疊的，可推測試前的焦慮指數在性別間是有一定程度差異的。

2. individual differences in state anxiety ?

p12 <- dta1L %>% group_by(Sex, Trial) %>%
  summarize(m_s=mean(Score), 
            se_s=sd(Score)/sqrt(n())) %>%
  ggplot() + 
  aes(Sex, m_s, 
      group=Trial, 
      shape=Trial) +
  geom_errorbar(aes(ymin=m_s - se_s,
                    ymax=m_s + se_s),
                width=.2, size=.3, 
                position=pd) +
  geom_line(position=pd, 
            linetype='dotted') +
  geom_point(position=pd, 
             size=rel(3)) +
  scale_shape(guide=guide_legend(title="Anxiety score before exam\n (xwb : x week(s) before)")) +
  labs(x="Gender", y="Mean score") +
  theme_bw()

p12

從上圖可看到，1不同性別的焦慮指數，涵蓋 error bars 的範圍在考試前 1 至 5 週間，僅 2、3、4 週有接近和部分重疊的現象，可推測試前的焦慮指數在不同測驗時間是有點差異的。

Homework 2

Use the markdown file to replicate the contents of Weissgerber, T.L., Milic, N.M., Winham, S.J., Garovic, V.D. (2015). Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLOS Biology , 13. The two data sets are here provided: journal.pbio.1002128.s002.XLS and journal.pbio.1002128.s003.XLS . You can also download everything in a zip file from this location.

Interpreting the plots

# suppress messages to reduce clutter
library(knitr)
opts_chunk$set(message = FALSE, warning = FALSE)
## install.packages("printr")
library(printr)

## Registered S3 method overwritten by 'printr':
##   method                from     
##   knit_print.data.frame rmarkdown

Independent data, points not jittered (no overlapping points)

library(readxl)
# read in data from PLOS Biology article supplementary materials
independent_data <- read_excel("C:/Users/TheorEco Lab/Desktop/108-2/DataManagement/0420/journal.pbio.1002128.s002.XLSX", sheet = 1)
# subset just groups 1-5 from the 'No overlapping points' sheet
independent_data <- independent_data[15:30,2:6]
# assign column names
names(independent_data) <- independent_data[1, ]
# remove first row with column names
independent_data <- independent_data[-1, ]
kable(independent_data)

Group 1	Group 2	Group 3	Group 4	Group 5
5	7	9	42	2
3	3	7	2	0
6	9	10	5	3
8	10	12	55	5
10	33	14	9	7
13	15	17	12	10
1	18	20	15	13
4	6	40	3	1
18	20	22	NA	15
4	30	35	NA	1
7	NA	42	NA	4
9	NA	13	NA	6
14	NA	NA	NA	11
15	NA	NA	NA	12
17	NA	NA	NA	14

# reshape for plotting
library(tidyr)
independent_data_long <- gather(independent_data, group, value, `Group 1`:`Group 5`, convert = TRUE)
kable(independent_data_long)

group	value
Group 1	5
Group 1	3
Group 1	6
Group 1	8
Group 1	10
Group 1	13
Group 1	1
Group 1	4
Group 1	18
Group 1	4
Group 1	7
Group 1	9
Group 1	14
Group 1	15
Group 1	17
Group 2	7
Group 2	3
Group 2	9
Group 2	10
Group 2	33
Group 2	15
Group 2	18
Group 2	6
Group 2	20
Group 2	30
Group 2	NA
Group 2	NA
Group 2	NA
Group 2	NA
Group 2	NA
Group 3	9
Group 3	7
Group 3	10
Group 3	12
Group 3	14
Group 3	17
Group 3	20
Group 3	40
Group 3	22
Group 3	35
Group 3	42
Group 3	13
Group 3	NA
Group 3	NA
Group 3	NA
Group 4	42
Group 4	2
Group 4	5
Group 4	55
Group 4	9
Group 4	12
Group 4	15
Group 4	3
Group 4	NA
Group 4	NA
Group 4	NA
Group 4	NA
Group 4	NA
Group 4	NA
Group 4	NA
Group 5	2
Group 5	0
Group 5	3
Group 5	5
Group 5	7
Group 5	10
Group 5	13
Group 5	1
Group 5	15
Group 5	1
Group 5	4
Group 5	6
Group 5	11
Group 5	12
Group 5	14

# plot
library(ggplot2)
library(dplyr)
# subset groups 1 & 2
independent_data_long_groups_1_and_2 <- independent_data_long %>% 
  filter(group %in% c("Group 1", "Group 2"))
# plot
ggplot(independent_data_long_groups_1_and_2, aes(group, as.numeric(value))) +
  geom_point(shape = 1, size = 4) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.2, size = 1) +
  xlab("") +
    ylab("Measurement (units)") +
  theme_minimal(base_size = 16)

# subset groups 1, 2 & 3
independent_data_long_groups_1_2_3 <-  independent_data_long %>% 
  filter(group %in% c("Group 1", "Group 2", "Group 3"))
# plot
ggplot(independent_data_long_groups_1_2_3, aes(group, as.numeric(value))) +
  geom_point(shape = 1, size = 4) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.2, size = 1) +
  xlab("") +
  ylab("Measurement (units)") +
  theme_minimal(base_size = 16)

# groups 1, 2, 3, & 4
independent_data_long_groups_1_2_3_4 <-  independent_data_long %>% 
  filter(group %in% c("Group 1", "Group 2", "Group 3", "Group 4"))
# plot
ggplot(independent_data_long_groups_1_2_3_4, aes(group, as.numeric(value))) +
  geom_point(shape = 1, size = 4) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.2, size = 1) +
  xlab("") +
  ylab("Measurement (units)") +
  theme_minimal(base_size = 16)

# all five groups
ggplot(independent_data_long, aes(group, as.numeric(value))) +
  geom_point(shape = 1, size = 4) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.2, size = 1) +
  xlab("") +
  ylab("Measurement (units)") +
  theme_minimal(base_size = 16)

Independent data, points jittered

library(readxl)
# read in data from PLOS Biology article supplementary materials
independent_data_j <- read_excel("journal.pbio.1002128.s002.XLSX", sheet = 2)
# subset data from the 'points jittered' sheet
independent_data_j <- independent_data_j[16:115,2:3]
# group numbers are not given in the spreadsheet, so we'll add them
independent_data_j$Groups <- c(rep(1, 20), rep(2, 20), rep(3, 20), rep(4, 20), rep(5, 20))
# assign column names
names(independent_data_j) <- c("Subject ID", "Measurement", "Group")

# plot
library(ggplot2)
library(dplyr)
# groups 1 & 2
independent_data_j_groups_1_and_2 <- independent_data_j %>% 
  filter(Group %in% 1:2)
# plot
ggplot(independent_data_j_groups_1_and_2, aes(as.factor(Group), as.numeric(Measurement))) +
  geom_jitter(shape = 1, size = 4, position=position_jitter(width = 0.2, height = 0.2)) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.2, size = 1) +
  xlab("") +
  ylab("Measurement (units)") +
  theme_minimal(base_size = 16)

# groups 1, 2 & 3
independent_data_j_groups_1_2_3 <- independent_data_j %>% 
  filter(Group %in% 1:3)
# plot
ggplot(independent_data_j_groups_1_2_3, aes(as.factor(Group), as.numeric(Measurement))) +
  geom_jitter(shape = 1, size = 4, position=position_jitter(width = 0.2, height = 0.2)) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.2, size = 1) +
  xlab("") +
  ylab("Measurement (units)") +
  theme_minimal(base_size = 16)

# groups 1, 2, 3, & 4
independent_data_j_groups_1_2_3_4 <- independent_data_j %>% 
  filter(Group %in% 1:4)
# plot
ggplot(independent_data_j_groups_1_2_3_4, aes(as.factor(Group), as.numeric(Measurement))) +
  geom_jitter(shape = 1, size = 4, position=position_jitter(width = 0.2, height = 0.2)) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.2, size = 1) +
  xlab("") +
  ylab("Measurement (units)") +
  theme_minimal(base_size = 16)

# all five groups
ggplot(independent_data_j, aes(as.factor(Group), as.numeric(Measurement))) +
  geom_jitter(shape = 1, size = 4, position=position_jitter(width = 0.2, height = 0.2)) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.2, size = 1) +
  xlab("") +
  ylab("Measurement (units)") +
  theme_minimal(base_size = 16)

Paired or Non-independent Data: 1 Group, 2 Conditions

library(readxl)
# read in data from PLOS Biology article supplementary materials
One_group_two_conditions <- read_excel("journal.pbio.1002128.s003.XLS", sheet = 1)
# subset data from the 'points jittered' sheet
One_group_two_conditions <- One_group_two_conditions[12:23,1:3]
# assign column names
names(One_group_two_conditions) <- c("Subject ID", "Condition 1 Name",  "Condition 2 Name")
One_group_two_conditions$difference <- as.numeric(One_group_two_conditions$`Condition 2 Name`)  - as.numeric(One_group_two_conditions$`Condition 1 Name`)

# reshape for plotting
library(tidyr)
One_group_two_conditions_long <- gather(One_group_two_conditions, group, value, `Condition 1 Name`:`Condition 2 Name`, -`Subject ID`, -difference, convert = TRUE)

# plot
library(ggplot2)
library(gridExtra)

g1 <- ggplot(One_group_two_conditions_long, aes(group, as.numeric(value), group = `Subject ID`)) + 
  geom_point(shape = 1, size = 4) + 
  geom_line() + 
  xlab("") +
  ylab("Measurement (units)") + 
  theme_minimal(base_size = 16) +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))

# differences 
g2 <- ggplot(One_group_two_conditions_long, aes(x = 1, y = difference)) +
  geom_point(shape = 1, size = 4) + 
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.001, size = 1) +
  xlab("") +
  ylab("Difference in Measurement (units)") +
  theme_minimal(base_size = 16) + 
  scale_x_continuous(breaks = NULL) +
  coord_fixed(ratio = 0.0005)

# combine the two plots
grid.arrange(g1, g2, ncol = 2)

Paired or Non-independent Data: 2 Groups, 2 Conditions

library(readxl)
# read in data from PLOS Biology article supplementary materials
Two_groups_two_conditions <- read_excel("journal.pbio.1002128.s003.XLS", sheet = 2)
# subset data from the 'points jittered' sheet
Two_groups_two_conditions <- Two_groups_two_conditions[12:41,2:5]
# assign group names
Two_groups_two_conditions$group <- c(rep("Group 1 Name", 15), rep("Group 2 Name", 15)) 
names(Two_groups_two_conditions) <- c("Condition 1 Name A", "Condition 2 Name A",   "Condition 1 Name B", "Condition 2 Name B")

# convert to simple long form
Two_groups_two_conditions[,1] <-  unlist(c(Two_groups_two_conditions[1:15,1], Two_groups_two_conditions[16:30,3]))
Two_groups_two_conditions[,2] <-  unlist(c(Two_groups_two_conditions[1:15,2], Two_groups_two_conditions[16:30,4]))
# drop unneeded columns
Two_groups_two_conditions <- Two_groups_two_conditions[,c(1:2, 5)]
# assign column names
names(Two_groups_two_conditions) <- c("Condition 1", "Condition 2", "Group")
Two_groups_two_conditions$`Subject ID` <- 1:30
# compute differences
Two_groups_two_conditions$difference <- as.numeric(Two_groups_two_conditions$`Condition 2`)  - as.numeric(Two_groups_two_conditions$`Condition 1`)

# convert to long again
library(tidyr)
Two_groups_two_conditions_long <- gather(Two_groups_two_conditions, condition, value, c(`Condition 1`, `Condition 2`), convert = TRUE)
kable(Two_groups_two_conditions_long)

Group	Subject ID	difference	condition	value
Group 1 Name	1	8	Condition 1	5
Group 1 Name	2	4	Condition 1	1
Group 1 Name	3	5	Condition 1	7
Group 1 Name	4	2	Condition 1	9
Group 1 Name	5	7	Condition 1	2
Group 1 Name	6	-1	Condition 1	6
Group 1 Name	7	1	Condition 1	4
Group 1 Name	8	3	Condition 1	11
Group 1 Name	9	-2	Condition 1	14
Group 1 Name	10	6	Condition 1	13
Group 1 Name	11	NA	Condition 1	NA
Group 1 Name	12	NA	Condition 1	NA
Group 1 Name	13	NA	Condition 1	NA
Group 1 Name	14	NA	Condition 1	NA
Group 1 Name	15	NA	Condition 1	NA
Group 2 Name	16	-2	Condition 1	20
Group 2 Name	17	-4	Condition 1	13
Group 2 Name	18	1	Condition 1	15
Group 2 Name	19	5	Condition 1	8
Group 2 Name	20	2	Condition 1	3
Group 2 Name	21	1	Condition 1	7
Group 2 Name	22	-7	Condition 1	14
Group 2 Name	23	0	Condition 1	12
Group 2 Name	24	3	Condition 1	11
Group 2 Name	25	1	Condition 1	9
Group 2 Name	26	NA	Condition 1	NA
Group 2 Name	27	NA	Condition 1	NA
Group 2 Name	28	NA	Condition 1	NA
Group 2 Name	29	NA	Condition 1	NA
Group 2 Name	30	NA	Condition 1	NA
Group 1 Name	1	8	Condition 2	13
Group 1 Name	2	4	Condition 2	5
Group 1 Name	3	5	Condition 2	12
Group 1 Name	4	2	Condition 2	11
Group 1 Name	5	7	Condition 2	9
Group 1 Name	6	-1	Condition 2	5
Group 1 Name	7	1	Condition 2	5
Group 1 Name	8	3	Condition 2	14
Group 1 Name	9	-2	Condition 2	12
Group 1 Name	10	6	Condition 2	19
Group 1 Name	11	NA	Condition 2	NA
Group 1 Name	12	NA	Condition 2	NA
Group 1 Name	13	NA	Condition 2	NA
Group 1 Name	14	NA	Condition 2	NA
Group 1 Name	15	NA	Condition 2	NA
Group 2 Name	16	-2	Condition 2	18
Group 2 Name	17	-4	Condition 2	9
Group 2 Name	18	1	Condition 2	16
Group 2 Name	19	5	Condition 2	13
Group 2 Name	20	2	Condition 2	5
Group 2 Name	21	1	Condition 2	8
Group 2 Name	22	-7	Condition 2	7
Group 2 Name	23	0	Condition 2	12
Group 2 Name	24	3	Condition 2	14
Group 2 Name	25	1	Condition 2	10
Group 2 Name	26	NA	Condition 2	NA
Group 2 Name	27	NA	Condition 2	NA
Group 2 Name	28	NA	Condition 2	NA
Group 2 Name	29	NA	Condition 2	NA
Group 2 Name	30	NA	Condition 2	NA

# plot
library(ggplot2)
g1 <- ggplot(Two_groups_two_conditions_long, aes(condition, as.numeric(value), group = `Subject ID`)) +
  geom_point(size = 4, shape = 1) +
  geom_line() + 
  xlab("") +
  ylab("Measurement (units)") +
  theme_minimal(base_size = 16) +
  facet_grid(~Group) + 
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

# difference
g2 <-  ggplot(Two_groups_two_conditions_long, aes(Group, difference)) +
  geom_point(size = 4, shape = 1) +
  stat_summary(fun.y = median, fun.ymin = median, fun.ymax = median, geom = "crossbar", width = 0.3) +
  xlab("") +
  ylab("Difference in Measurement (units)") +
  theme_minimal(base_size = 16)  +
  coord_fixed(ratio = 0.15)

# combine the two plots
grid.arrange(g1, g2, ncol = 2)

Homework 3

The dataset consists of a sample of 14 primary school children between 8 and 12 years old. The children were asked to respond on 8 emotions and coping strategies scales for each of 6 situations: fail to fulfill assingments in class, not allowed to play with other children, forbidden to do something by the teacher, victim of bullying, too much school work, forbidden to do something by the mother.

Plot the data in some meaningful ways. You may have to manipulate data into a different format first.

Column 1: Unpleasant (Annoy) Column 2: Sad Column 3: Afraid Column 4: Angry Column 5: Approach coping Column 6: Avoidant coping Column 7: Social support seeking Column 8: Emotional reaction, especially agression Column 9: Situation ID Column 10: Children ID

Source: Roeder, I., Boekaerts, M., & Kroonenberg, P. M. (2002). The stress and coping questionnaire for children (School version and Asthma version): Construction, factor structure, and psychometric properties. Psychological Reports, 91, 29-36.

dta3 <- read.table("C:/Users/TheorEco Lab/Desktop/108-2/DataManagement/0420/coping.txt", header = T)

str(dta3)

## 'data.frame':    84 obs. of  10 variables:
##  $ annoy    : int  4 4 2 4 4 4 3 3 3 4 ...
##  $ sad      : int  2 4 2 3 2 3 2 1 1 4 ...
##  $ afraid   : int  2 4 2 4 1 1 2 1 1 2 ...
##  $ angry    : int  2 2 2 4 1 4 2 2 2 1 ...
##  $ approach : num  1 4 2.67 4 1 2.33 2 1.33 1 1.67 ...
##  $ avoid    : num  2 3 3 1.5 2.75 2.5 1 4 1 4 ...
##  $ support  : num  1 1.25 1 3.25 1.25 1 1.5 2.75 1.33 3.5 ...
##  $ agressive: num  2.5 1.5 2.33 1 1.5 3.67 1 2 1.67 2.5 ...
##  $ situation: Factor w/ 6 levels "Bully","Fail",..: 2 4 5 1 6 3 2 4 5 1 ...
##  $ sbj      : Factor w/ 14 levels "S135","S137",..: 6 6 6 6 6 6 4 4 4 4 ...

重整資料

library(reshape2)
dta3L <- melt(dta3, id = c("sbj", "situation"))
head(dta3L)

sbj	situation	variable	value
S2	Fail	annoy	4
S2	NoPart	annoy	4
S2	TeacNo	annoy	2
S2	Bully	annoy	4
S2	Work	annoy	4
S2	MomNo	annoy	4

面對六種情境，普遍皆表現較多煩躁 ( annoy )的情緒，較少表現害怕的情緒

pd <- position_dodge(.3)
p31 <- dta3L %>% group_by(situation, variable) %>%
  summarize(m_v=mean(value), 
            se_v=sd(value)/sqrt(n())) %>%
  ggplot() + 
  aes(variable, m_v, 
      group = situation, 
      color = situation) +
  geom_errorbar(aes(ymin=m_v - se_v,
                    ymax=m_v + se_v),
                width=.2, size=.3, 
                position=pd) +
  geom_line(position=pd, 
            linetype='dotted') +
  geom_point(position=pd, 
             size=rel(3)) +
  scale_shape(guide=guide_legend(title="Situation")) +
  labs(x="Reaction", y="Scales") +
  theme_light() +
  theme(legend.position="top")

p31

各情境的情緒與應對反應

p32 <- ggplot(dta3L, 
             aes(situation, value)) +
  geom_boxplot(col='gray') +
  geom_point(size=rel(.5)) +
  facet_grid(variable ~.) +
  geom_hline(aes(yintercept = mean(value)), col = "lightblue") + 
  theme_bw() +
  theme(axis.text.x=element_text(angle=60, 
                                 hjust=1))

p32

victim of bullying

感到煩躁的程度較平均高，表現攻擊性的程度較平均低，但感到害怕的程度也較平均低。

fail to fulfill assingments in class

呈現趨近因應較 6 種情形之平均高。

forbidden to do something by the mother

被母親禁止從事特定活動所感到的煩躁較平均高，也較傾向採取逃避因應，害怕程度較平均低。

not allowed to play with other children

和被母親禁止結果類似，感到較為煩躁，較傾向出現逃避因應，也較不感到害怕。

forbidden to do something by the teacher

被老師禁止活動沒有特別感到煩躁，也較不感到難過和害怕，相較於平均不採取趨近因應，而是逃避因應。在此情境中發現，學生對社會支持的需求較平均低。

too much school work

學生面對太多的學校作業感到較為煩躁，不會特別難過或害怕，採取趨近因應的策略。

Homework 5

Use the Cushings{MASS} data set to generate a plot similar to the following one:

dta5 <- MASS::Cushings

str(dta5)

## 'data.frame':    27 obs. of  3 variables:
##  $ Tetrahydrocortisone: num  3.1 3 1.9 3.8 4.1 1.9 8.3 3.8 3.9 7.8 ...
##  $ Pregnanetriol      : num  11.7 1.3 0.1 0.04 1.1 0.4 1 0.2 0.6 1.2 ...
##  $ Type               : Factor w/ 4 levels "a","b","c","u": 1 1 1 1 1 1 2 2 2 2 ...

dta5$lab <- factor(dta5$Type, levels(dta5$Type) [c(1, 2, 3, 4)], labels = c("Adenoma", "Bilateral Hyperplasia", "Carcinoma", "Unknown"))

head(dta5)

	Tetrahydrocortisone	Pregnanetriol	Type	lab
a1	3.1	11.70	a	Adenoma
a2	3.0	1.30	a	Adenoma
a3	1.9	0.10	a	Adenoma
a4	3.8	0.04	a	Adenoma
a5	4.1	1.10	a	Adenoma
a6	1.9	0.40	a	Adenoma

dta5 <- dta5 %>%
  mutate(type.x = c(rep(3, table(dta5$Type)['a']),  rep(9, table(dta5$Type)['b']), 
                rep(16, table(dta5$Type)['c']), rep(22, table(dta5$Type)['u'])),
         type.y = c(rep(11, table(dta5$Type)['a']), rep(2.5, table(dta5$Type)['b']),
                rep(7, table(dta5$Type)['c'] ), rep(1.5, table(dta5$Type)['u'] )))

head(dta5)

Tetrahydrocortisone	Pregnanetriol	Type	lab	type.x	type.y
3.1	11.70	a	Adenoma	3	11
3.0	1.30	a	Adenoma	3	11
1.9	0.10	a	Adenoma	3	11
3.8	0.04	a	Adenoma	3	11
4.1	1.10	a	Adenoma	3	11
1.9	0.40	a	Adenoma	3	11

p51 <- ggplot(dta5, aes(x = Tetrahydrocortisone, y = Pregnanetriol, label = lab)) + 
  geom_point(aes(color = Type)) + 
  xlab("Tetrahydrocortisone (mg/24 hours)") + 
  ylab("Pregnanetriol (mg/24 hours)") +
  scale_x_continuous(breaks = seq(0, 60, 10), limits = c(0, 60)) +
  scale_y_continuous(breaks = seq(0, 12, 2)) +
  ggtitle("Cushing\'s syndrome") +
  theme(legend.position = "none", 
        panel.background = element_blank(),
        panel.grid.major.x = element_blank(), 
        panel.grid.major.y = element_line(color = "gray"), 
        axis.ticks = element_blank(), 
        axis.text.y = element_text(angle = 90), 
        axis.text = element_text(size = 10)
        )
  
p51

p5 <- p51 +
  geom_text(aes(x = type.x, y = type.y, group = lab, color = lab))

p5

Grammar of Graphics: Homework

2020-Spring [Data Management] / Instructor: Prof. SHEU, Ching-Fan

CHIU, Ming-Tzu

2020-04-26

Homework 1

重整資料

1. Whether there are gender difference in state anxiety ?

2. individual differences in state anxiety ?

Homework 2

Interpreting the plots

Independent data, points not jittered (no overlapping points)

Independent data, points jittered

Paired or Non-independent Data: 1 Group, 2 Conditions

Paired or Non-independent Data: 2 Groups, 2 Conditions

Homework 3

重整資料

面對六種情境，普遍皆表現較多煩躁 ( annoy )的情緒，較少表現害怕的情緒

各情境的情緒與應對反應

victim of bullying

fail to fulfill assingments in class

forbidden to do something by the mother

not allowed to play with other children

forbidden to do something by the teacher

too much school work

Homework 5

Let it all hang out in informatics graphics.

Group 1	Group 2	Group 3	Group 4	Group 5
5	7	9	42	2
3	3	7	2	0
6	9	10	5	3
8	10	12	55	5
10	33	14	9	7
13	15	17	12	10
1	18	20	15	13
4	6	40	3	1
18	20	22	NA	15
4	30	35	NA	1
7	NA	42	NA	4
9	NA	13	NA	6
14	NA	NA	NA	11
15	NA	NA	NA	12
17	NA	NA	NA	14

Group 1	Group 2	Group 3	Group 4	Group 5
5	7	9	42	2
3	3	7	2	0
6	9	10	5	3
8	10	12	55	5
10	33	14	9	7
13	15	17	12	10
1	18	20	15	13
4	6	40	3	1
18	20	22	NA	15
4	30	35	NA	1
7	NA	42	NA	4
9	NA	13	NA	6
14	NA	NA	NA	11
15	NA	NA	NA	12
17	NA	NA	NA	14

Group 1	Group 2	Group 3	Group 4	Group 5
5	7	9	42	2
3	3	7	2	0
6	9	10	5	3
8	10	12	55	5
10	33	14	9	7
13	15	17	12	10
1	18	20	15	13
4	6	40	3	1
18	20	22	NA	15
4	30	35	NA	1
7	NA	42	NA	4
9	NA	13	NA	6
14	NA	NA	NA	11
15	NA	NA	NA	12
17	NA	NA	NA	14