Replace the births data with the gifted data.
# Load packages
library(openintro) # provides the gifted data set
library(dplyr) # for dplyr functions such as mutate() and filter()
library(ggplot2) # for ggplot2 functions such as ggplot()
# Load data
data(gifted)
# View its structure
str(gifted)
## 'data.frame': 36 obs. of 8 variables:
## $ score : int 159 164 154 157 156 150 155 161 163 162 ...
## $ fatheriq: int 115 117 115 113 110 113 118 117 111 122 ...
## $ motheriq: int 117 113 118 131 109 109 119 120 128 120 ...
## $ speak : int 18 20 20 12 17 13 19 18 22 18 ...
## $ count : int 26 37 32 24 34 28 24 32 28 27 ...
## $ read : num 1.9 2.5 2.2 1.7 2.2 1.9 1.8 2.3 2.1 2.1 ...
## $ edutv : num 3 1.75 2.75 2.75 2.25 1.25 2 2.25 1 2.25 ...
## $ cartoons: num 2 3.25 2.5 2.25 2.5 3.75 3 2.5 4 2.75 ...
Create a scatterplot to investigate the relationship between:
Describe the relationship between the two variables.
The points in the graph trend upward. Young gifted children with higher analytical skills scores tend to be read to for more hours per week by their mother or father.
ggplot(data = gifted, aes(x = score, y = read)) +
  geom_point()
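A correlation coefficient can back up the visual impression of an upward trend. The following is a minimal sketch, not part of the original exercise: it uses only the first ten values of `score` and `read` printed by `str(gifted)` above, so the result describes that partial sample and not the full 36 rows. The name `r_sample` is made up for illustration.

```r
# First ten values of each variable, copied from the str(gifted) output above.
# This is a partial sample, not the full 36-row data set.
score <- c(159, 164, 154, 157, 156, 150, 155, 161, 163, 162)
read  <- c(1.9, 2.5, 2.2, 1.7, 2.2, 1.9, 1.8, 2.3, 2.1, 2.1)

# Pearson correlation: a positive value is consistent with an upward trend
r_sample <- cor(score, read)
print(r_sample)

# With the full data loaded, the equivalent call would be:
# cor(gifted$score, gifted$read)
```

A value near zero would suggest the upward trend is weak even when the plot appears to slope upward.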
Suppose that you want to investigate whether the relationship we found above varies by mother’s IQ. Create a new categorical variable, motheriq_cat, and assign each child either “below average” or “at or above average”.
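# Compute the mean of mother's IQ; a separate name avoids a clash with the new column
motheriq_mean <- mean(gifted$motheriq, na.rm = TRUE)
motheriq_mean
## [1] 118.1667
# Label each child by whether the mother's IQ falls below the mean
gifted <- gifted %>%
  mutate(motheriq_cat = ifelse(motheriq < motheriq_mean, "below average", "at or above average"))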
head(gifted)
## score fatheriq motheriq speak count read edutv cartoons
## 1 159 115 117 18 26 1.9 3.00 2.00
## 2 164 117 113 20 37 2.5 1.75 3.25
## 3 154 115 118 20 32 2.2 2.75 2.50
## 4 157 113 131 12 24 1.7 2.75 2.25
## 5 156 110 109 17 34 2.2 2.25 2.50
## 6 150 113 109 13 28 1.9 1.25 3.75
## motheriq_cat
## 1 below average
## 2 below average
## 3 below average
## 4 at or above average
## 5 below average
## 6 below average
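To check that the labels were assigned as intended, it can help to tabulate them. The sketch below is an illustration only: it uses base R, the six `motheriq` values shown in the `head(gifted)` output above, and the mean reported earlier as a hard-coded cutoff. The names `iq_cutoff`, `iq_group`, and `group_counts` are made up for this example.

```r
# Mother's IQ for the six rows shown in the head(gifted) output above
motheriq <- c(117, 113, 118, 131, 109, 109)

# Cutoff: the mean reported above, hard-coded here for illustration
iq_cutoff <- 118.1667

# Same rule as in the exercise: strictly below the mean is "below average"
iq_group <- ifelse(motheriq < iq_cutoff, "below average", "at or above average")

# Count how many rows fall into each category
group_counts <- table(iq_group)
print(group_counts)
```

For these six rows the table should show five children in the "below average" group and one in the "at or above average" group, matching the printed column. On the full data, `table(gifted$motheriq_cat)` would give the corresponding counts.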
Add the third variable, motheriq_cat, to the scatterplot you created in Q1. Does the relationship you found in Q1 vary by mother’s IQ?
The relationship found in Q1 does vary by mother’s IQ. Children whose mothers have an at or above average IQ tend to have higher analytical scores and, generally speaking, are read to more by their parents. Children whose mothers have a below average IQ tend to have lower scores and are read to less.
ggplot(data = gifted, aes(x = score, y = read, color = motheriq_cat)) +
  geom_point(show.legend = FALSE) +
  facet_wrap(~ motheriq_cat)
You are only interested in gifted children with analytical skills greater than 150. Filter the data. How many such children are there?
There are 35 of these children.
gifted_above <- gifted %>%
  filter(score > 150)
str(gifted_above)
## 'data.frame': 35 obs. of 9 variables:
## $ score : int 159 164 154 157 156 155 161 163 162 154 ...
## $ fatheriq : int 115 117 115 113 110 118 117 111 122 111 ...
## $ motheriq : int 117 113 118 131 109 119 120 128 120 117 ...
## $ speak : int 18 20 20 12 17 19 18 22 18 19 ...
## $ count : int 26 37 32 24 34 24 32 28 27 32 ...
## $ read : num 1.9 2.5 2.2 1.7 2.2 1.8 2.3 2.1 2.1 2.2 ...
## $ edutv : num 3 1.75 2.75 2.75 2.25 2 2.25 1 2.25 1.75 ...
## $ cartoons : num 2 3.25 2.5 2.25 2.5 3 2.5 4 2.75 3.75 ...
## $ motheriq_cat: chr "below average" "below average" "below average" "at or above average" ...
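The count depends on the comparison being strict: a child with a score of exactly 150 is dropped by `score > 150`. The sketch below illustrates this using only the first ten scores printed by `str(gifted)` earlier, so the numbers describe that partial sample and not the full data. The names `n_above` and `n_at_least` are made up for this example.

```r
# First ten scores, copied from the str(gifted) output earlier in the document
score <- c(159, 164, 154, 157, 156, 150, 155, 161, 163, 162)

# Strictly greater than 150: the child scoring exactly 150 is excluded
n_above <- sum(score > 150)

# Greater than or equal to 150: that child is kept
n_at_least <- sum(score >= 150)

print(c(strictly_above = n_above, at_or_above = n_at_least))
```

In this sample the two counts are 9 and 10. On the filtered data frame, `nrow(gifted_above)` returns the count directly, without reading it from the `str()` output.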