Use the gifted data set from the openintro package.
# Load packages
library(openintro) # provides the gifted data set
library(dplyr) # data manipulation functions such as mutate() and filter()
library(ggplot2) # plotting functions such as ggplot()
# Load data
data(gifted)
# View its structure
str(gifted)
## 'data.frame': 36 obs. of 8 variables:
## $ score : int 159 164 154 157 156 150 155 161 163 162 ...
## $ fatheriq: int 115 117 115 113 110 113 118 117 111 122 ...
## $ motheriq: int 117 113 118 131 109 109 119 120 128 120 ...
## $ speak : int 18 20 20 12 17 13 19 18 22 18 ...
## $ count : int 26 37 32 24 34 28 24 32 28 27 ...
## $ read : num 1.9 2.5 2.2 1.7 2.2 1.9 1.8 2.3 2.1 2.1 ...
## $ edutv : num 3 1.75 2.75 2.75 2.25 1.25 2 2.25 1 2.25 ...
## $ cartoons: num 2 3.25 2.5 2.25 2.5 3.75 3 2.5 4 2.75 ...
Create a scatterplot to investigate the relationship between:
Describe the relationship between the two variables. Children who read more tend to receive higher scores, so the relationship appears to be positive.
ggplot(data = gifted, aes(x = score, y = read)) +
  geom_point()
Suppose that you want to investigate whether the relationship we found above varies by mother’s IQ. Create a new categorical variable, motheriq_cat, and assign “below average” or “at or above average”.
avg_motheriq <- mean(gifted$motheriq, na.rm = TRUE)
avg_motheriq
## [1] 118.1667
gifted <- gifted %>%
  mutate(motheriq_cat = ifelse(motheriq < avg_motheriq, "below average", "at or above average"))
head(gifted)
## score fatheriq motheriq speak count read edutv cartoons
## 1 159 115 117 18 26 1.9 3.00 2.00
## 2 164 117 113 20 37 2.5 1.75 3.25
## 3 154 115 118 20 32 2.2 2.75 2.50
## 4 157 113 131 12 24 1.7 2.75 2.25
## 5 156 110 109 17 34 2.2 2.25 2.50
## 6 150 113 109 13 28 1.9 1.25 3.75
## motheriq_cat
## 1 below average
## 2 below average
## 3 below average
## 4 at or above average
## 5 below average
## 6 below average
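One way to check the new variable is to tabulate it. The sketch below is only an illustration: it rebuilds the six motheriq values printed by head() above in a hypothetical data frame called motheriq_head, hard-codes the mean reported above instead of recomputing it, and uses dplyr's if_else() and count() (neither appears in the original code).

```r
library(dplyr)

# First six motheriq values, copied from the head() output above
# (motheriq_head is a hypothetical name used only for this sketch)
motheriq_head <- data.frame(motheriq = c(117, 113, 118, 131, 109, 109))

# Mean of the full data set as printed above; assumed here, not recomputed
avg_motheriq <- 118.1667

# Same rule as above, written with the stricter dplyr::if_else()
motheriq_head <- motheriq_head %>%
  mutate(motheriq_cat = if_else(motheriq < avg_motheriq,
                                "below average", "at or above average"))

# Count the children in each category
cat_counts <- count(motheriq_head, motheriq_cat)
cat_counts
```

On these six rows, five children fall in "below average" and one in "at or above average", which matches the head() output. Running count(gifted, motheriq_cat) on the full data would give the split for all 36 children.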
Add the third variable, motheriq_cat, to the scatterplot you created in Q1. Does the relationship you found in Q1 vary by mother’s IQ? The relationship found in Q1 appears to vary by mother’s IQ: children whose mothers have a higher IQ tend to read more.
ggplot(data = gifted, aes(x = score, y = read, color = motheriq_cat)) +
  geom_point(show.legend = FALSE) +
  facet_wrap(~ motheriq_cat)
You are only interested in gifted children with analytical skills greater than 150. Filter the data. How many such children are there? There are 35.
gifted_above <- gifted %>%
  filter(score > 150)
str(gifted_above)
## 'data.frame': 35 obs. of 9 variables:
## $ score : int 159 164 154 157 156 155 161 163 162 154 ...
## $ fatheriq : int 115 117 115 113 110 118 117 111 122 111 ...
## $ motheriq : int 117 113 118 131 109 119 120 128 120 117 ...
## $ speak : int 18 20 20 12 17 19 18 22 18 19 ...
## $ count : int 26 37 32 24 34 24 32 28 27 32 ...
## $ read : num 1.9 2.5 2.2 1.7 2.2 1.8 2.3 2.1 2.1 2.2 ...
## $ edutv : num 3 1.75 2.75 2.75 2.25 2 2.25 1 2.25 1.75 ...
## $ cartoons : num 2 3.25 2.5 2.25 2.5 3 2.5 4 2.75 3.75 ...
## $ motheriq_cat: chr "below average" "below average" "below average" "at or above average" ...
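The row count can also be read directly with nrow() instead of from the str() header. The sketch below is an illustration on the six scores printed by head() earlier, stored in a hypothetical data frame called score_head. It also shows that > is strict, so a score of exactly 150 is dropped.

```r
library(dplyr)

# Scores of the first six children, copied from the head() output earlier
# (score_head is a hypothetical name used only for this sketch)
score_head <- data.frame(score = c(159, 164, 154, 157, 156, 150))

# Keep only scores strictly greater than 150, then count the rows
n_above <- score_head %>%
  filter(score > 150) %>%
  nrow()

n_above
## [1] 5
```

On the full data, nrow(gifted_above) would return the count directly. This is consistent with the str() output above, where the child with a score of 150 no longer appears among the first scores listed.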