This document contains answers to the practical session Analysis_of_the_Student_Survey_DataSet_I.
First we load the required data set. I will assume that you have done this in advance of any code snippets you submit.
library(MASS)
data(survey)
colors <- c("thistle", "powderblue", "khaki1", "wheat", "lightgreen", "plum2", "paleturquoise", "tan")
How many missing values are in the variable representing preferred writing hand?
Building up the required command we have … first we output the variable
survey$W.Hnd
## [1] Right Left Right Right Right Right Right Right Right Right Right
## [12] Right Right Right Right Right Right Left Right Right Right Right
## [23] Right Right Right Right Right Right Right Right Right Right Right
## [34] Right Right Right Right Right Right Right Right Right Right Right
## [45] <NA> Right Right Right Right Right Left Right Right Right Right
## [56] Right Right Right Right Left Right Right Right Left Right Right
## [67] Right Right Right Right Right Right Right Right Right Right Right
## [78] Right Right Right Left Right Right Right Right Right Right Right
## [89] Right Left Right Right Right Right Right Right Right Right Right
## [100] Right Right Right Right Right Right Right Right Right Right Right
## [111] Right Right Right Right Right Right Right Left Left Right Right
## [122] Right Right Left Right Right Right Right Right Right Right Right
## [133] Right Left Right Right Left Right Right Right Right Right Right
## [144] Right Left Right Right Right Right Right Right Right Right Right
## [155] Right Right Right Right Right Right Right Left Right Right Right
## [166] Right Right Right Right Right Right Left Right Right Right Left
## [177] Right Right Right Right Right Right Right Right Right Right Right
## [188] Right Right Right Right Right Right Right Right Right Right Right
## [199] Right Right Right Right Right Right Right Right Right Right Left
## [210] Right Right Left Right Right Right Right Right Right Right Right
## [221] Right Right Right Right Right Right Right Right Right Right Right
## [232] Right Right Right Right Right Right
## Levels: Left Right
… next we apply the is.na test to check for missing values …
is.na(survey$W.Hnd)
## [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [45] TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [56] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [67] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [78] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [89] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [111] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [122] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [144] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [155] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [166] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [177] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [188] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [199] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [210] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [221] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [232] FALSE FALSE FALSE FALSE FALSE FALSE
… finally we count the TRUE values using the fact that R uses TRUE=1, FALSE=0.
sum(is.na(survey$W.Hnd))
## [1] 1
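As a small aside, the coercion of TRUE/FALSE to 1/0 can be checked on a toy vector, and table() can be asked to show missing values as their own category. This is only a sketch: the name hand_counts is my own choice, not part of the practical.

```r
library(MASS)  # provides the survey data frame

# Logical values are coerced to 1 (TRUE) and 0 (FALSE) when summed
sum(c(TRUE, FALSE, TRUE))
## [1] 2

# useNA = "ifany" adds an <NA> category when missing values are present
hand_counts <- table(survey$W.Hnd, useNA = "ifany")
hand_counts   # Left 18, Right 218, <NA> 1
```

The <NA> count agrees with the result of sum(is.na(survey$W.Hnd)).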
How many students prefer that their left hand is on top when clapping?
First note that
sum(survey$Clap=="Left")
## [1] NA
returns NA since the variable contains missing values. Hence we need to deal with these.
We can either omit the missing values first and then sum as in
sum(na.omit(survey$Clap)=="Left")
## [1] 39
or we can send the function sum a directive to ignore the missing values
sum(survey$Clap=="Left", na.rm=TRUE)
## [1] 39
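To see why the first attempt returns NA, it can help to try the comparison on a toy factor (made-up values, not taken from the survey). Comparing NA with anything gives NA, and a sum that includes an NA is itself NA unless na.rm=TRUE is given.

```r
# Toy data for illustration only
x <- factor(c("Left", NA, "Right", "Left"))

x == "Left"                      # the missing entry stays missing
## [1]  TRUE    NA FALSE  TRUE

sum(x == "Left")                 # NA propagates through the sum
## [1] NA

sum(x == "Left", na.rm = TRUE)   # drop the NA before summing
## [1] 2
```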
How many students do not have a preference regarding which hand is on top when folding or did not answer this question?
sum(survey$Fold=='Neither') + sum(is.na(survey$Fold))
## [1] 18
Code to calculate the number of students who do not have a preference regarding which hand is on top when clapping or did not answer this question.
sum(survey$Clap=='Neither', na.rm=TRUE) + sum(is.na(survey$Clap))
## [1] 51
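Both cases (no preference, or no answer) can also be counted with a single logical expression. The sketch below relies on the fact that TRUE | NA evaluates to TRUE in R, so no na.rm is needed. The helper name count_no_pref is hypothetical, introduced here only for illustration.

```r
library(MASS)  # provides the survey data frame

# Hypothetical helper: counts entries that are missing or equal to "Neither".
# For a missing entry is.na(x) is TRUE, and TRUE | NA is TRUE, so no NA survives.
count_no_pref <- function(x) sum(is.na(x) | x == "Neither")

count_no_pref(survey$Fold)
## [1] 18
count_no_pref(survey$Clap)
## [1] 51
```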
What percentage of students (excluding missing values) are left handed (answer to 2 decimal places)?
freq <- table(survey$W.Hnd)
sprintf("%.2f%%", 100*freq['Left'] / sum(freq))
## [1] "7.63%"
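An alternative sketch uses prop.table(), which converts a table of counts into proportions. Since table() drops missing values by default, they are excluded from the denominator. The name pct is my own.

```r
library(MASS)  # provides the survey data frame

# Proportions of each writing hand, as percentages rounded to 2 decimal places
pct <- round(100 * prop.table(table(survey$W.Hnd)), 2)
pct   # Left 7.63, Right 92.37
```

This returns both percentages at once as numbers, whereas sprintf gives a formatted string for a single value.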
Code to construct a bar chart of the distribution of preferred upper hand when clapping (ignore NA).
freq <- table(survey$Clap)
midpoints <- barplot(freq, ylab="Frequency", main="Which hand is on top when clapping?", col=colors)
text(midpoints, .9*freq, labels=freq)
Code to construct a pie chart of the distribution of preferred upper hand when clapping (ignore NA).
freq <- table(survey$Clap)
labels <- paste(names(freq)," ", "\n", freq, sep="")
pie(freq, ylab="Frequency", main="Which hand is on top when clapping?",
col=colors, labels = labels)
Code to construct a pie chart of the distribution of preferred upper hand when clapping (ignore NA and give percentages in labels).
freq <- table(survey$Clap)
labels <- sprintf("%s\n %.1f%%", names(freq), 100*freq/sum(freq))
pie(freq, ylab="Frequency", main="Which hand is on top when clapping?",
col=colors, labels = labels)
How many students are left handed and prefer their left hand on top when folding?
table(survey$W.Hnd, survey$Fold)
##
## L on R Neither R on L
## Left 10 1 7
## Right 88 17 113
and read off the value corresponding to being left handed with the left hand being the preferred hand on top when folding, or use a criterion as in
sum(survey$W.Hnd=="Left" & survey$Fold=="L on R", na.rm=TRUE)
## [1] 10
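Rather than reading the value off by eye, the cell can also be extracted from the stored table by its row and column names. A minimal sketch, where tab is a name of my own choosing:

```r
library(MASS)  # provides the survey data frame

tab <- table(survey$W.Hnd, survey$Fold)  # rows: writing hand, columns: fold

# Index by row name and column name to pick out a single cell
tab["Left", "L on R"]
## [1] 10
```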
What percentage of students who prefer their right hand on top when folding are also right handed (answer to 2 decimal places)?
table(survey$W.Hnd, survey$Fold)
##
## L on R Neither R on L
## Left 10 1 7
## Right 88 17 113
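One way to get the percentage from this table is to divide the Right cell of the "R on L" column by that column's total, that is 113 / (7 + 113). The sketch below assumes that "R on L" is the level meaning right hand on top. The names tab and pct_right are my own.

```r
library(MASS)  # provides the survey data frame

tab <- table(survey$W.Hnd, survey$Fold)  # rows: writing hand, columns: fold

# Share of right-handed writers among those in the "R on L" column
# (assumed here to mean right hand on top)
pct_right <- 100 * tab["Right", "R on L"] / sum(tab[, "R on L"])
sprintf("%.2f%%", pct_right)
## [1] "94.17%"
```

Students with a missing writing hand are dropped by table(), so they do not enter the denominator.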