Introduction

This document contains answers to the practical session Analysis_of_the_Student_Survey_DataSet_I.

First we load the required data set. I will assume that you have done this in advance of any code snippets you submit.

library(MASS)  # provides the survey data set
data(survey)
# fill colours used for the charts below
colors <- c("thistle", "powderblue", "khaki1", "wheat", "lightgreen", "plum2", "paleturquoise", "tan")

Question 1

How many missing values are in the variable representing preferred writing hand?

Building up the required command we have … first we output the variable

survey$W.Hnd
##   [1] Right Left  Right Right Right Right Right Right Right Right Right
##  [12] Right Right Right Right Right Right Left  Right Right Right Right
##  [23] Right Right Right Right Right Right Right Right Right Right Right
##  [34] Right Right Right Right Right Right Right Right Right Right Right
##  [45] <NA>  Right Right Right Right Right Left  Right Right Right Right
##  [56] Right Right Right Right Left  Right Right Right Left  Right Right
##  [67] Right Right Right Right Right Right Right Right Right Right Right
##  [78] Right Right Right Left  Right Right Right Right Right Right Right
##  [89] Right Left  Right Right Right Right Right Right Right Right Right
## [100] Right Right Right Right Right Right Right Right Right Right Right
## [111] Right Right Right Right Right Right Right Left  Left  Right Right
## [122] Right Right Left  Right Right Right Right Right Right Right Right
## [133] Right Left  Right Right Left  Right Right Right Right Right Right
## [144] Right Left  Right Right Right Right Right Right Right Right Right
## [155] Right Right Right Right Right Right Right Left  Right Right Right
## [166] Right Right Right Right Right Right Left  Right Right Right Left 
## [177] Right Right Right Right Right Right Right Right Right Right Right
## [188] Right Right Right Right Right Right Right Right Right Right Right
## [199] Right Right Right Right Right Right Right Right Right Right Left 
## [210] Right Right Left  Right Right Right Right Right Right Right Right
## [221] Right Right Right Right Right Right Right Right Right Right Right
## [232] Right Right Right Right Right Right
## Levels: Left Right

… next we apply the is.na test to check for missing values …

is.na(survey$W.Hnd)
##   [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [12] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [23] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [34] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [45]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [56] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [67] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [78] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
##  [89] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [100] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [111] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [122] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [133] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [144] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [155] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [166] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [177] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [188] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [199] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [210] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [221] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
## [232] FALSE FALSE FALSE FALSE FALSE FALSE

… finally we count the TRUE values, using the fact that R treats TRUE as 1 and FALSE as 0 when summing.

sum(is.na(survey$W.Hnd))
## [1] 1
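The same three steps can be tried on a small vector where the result is easy to check by eye. This is only a sketch: the factor hand below is made-up toy data, not taken from survey.

```r
# Toy factor with two missing entries (hypothetical data, not from survey)
hand <- factor(c("Right", "Left", NA, "Right", NA))

missing   <- is.na(hand)   # logical vector: FALSE FALSE TRUE FALSE TRUE
n_missing <- sum(missing)  # each TRUE counts as 1, each FALSE as 0
p_missing <- mean(missing) # proportion of entries that are missing

print(n_missing)
print(p_missing)
```

On the real data, colSums(is.na(survey)) should apply the same count to every column at once, since is.na on a data frame returns a logical matrix.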

Question 2

How many students prefer that their left hand is on top when clapping?

First note that

sum(survey$Clap=="Left")
## [1] NA

returns NA since the variable contains missing values: comparing NA with "Left" gives NA, and a sum that includes NA is itself NA. Hence we need to deal with these.

We can either omit the missing values first and then sum as in

sum(na.omit(survey$Clap)=="Left")
## [1] 39

or we can pass na.rm=TRUE to the function sum so that it ignores the missing values

sum(survey$Clap=="Left", na.rm=TRUE)
## [1] 39
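To see why the NA appears and how na.rm removes it, here is a minimal sketch on a toy vector. The factor clap and the names is_left, naive, fixed and counts are made up for illustration and are not part of survey.

```r
# Toy factor with one missing entry (hypothetical data, not from survey)
clap <- factor(c("Left", "Right", NA, "Left", "Neither"))

is_left <- clap == "Left"                # TRUE FALSE NA TRUE FALSE
naive   <- sum(is_left)                  # NA: one unknown makes the total unknown
fixed   <- sum(is_left, na.rm = TRUE)    # missing comparisons are dropped first
counts  <- table(clap, useNA = "ifany")  # full breakdown, with a cell for NA

print(naive)
print(fixed)
print(counts)
```

The table call is a third route to the same count: it tabulates every level at once and, with useNA = "ifany", also reports how many values are missing.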

Question 3

How many students do not have a preference regarding which hand is on top when folding or did not answer this question?

# na.rm=TRUE keeps the first count valid even if Fold contains missing values
sum(survey$Fold=='Neither', na.rm=TRUE) + sum(is.na(survey$Fold))
## [1] 18
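An alternative is to fold both conditions into a single logical test, which avoids the NA problem altogether. The sketch below uses a made-up factor fold (toy data, not from survey) so the count can be checked by hand.

```r
# Toy factor with two "Neither" and two missing entries (hypothetical data)
fold <- factor(c("L on R", "Neither", NA, "R on L", "Neither", NA))

# For a missing entry is.na() is TRUE, and TRUE | NA evaluates to TRUE,
# so the combined condition never contains NA and needs no na.rm.
no_pref_or_missing <- sum(is.na(fold) | fold == "Neither")

print(no_pref_or_missing)
```

The same pattern should carry over to any of the survey factors, for example with survey$Clap in place of fold.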

Question 4

Code to calculate the number of students who do not have a preference regarding which hand is on top when clapping or did not answer this question.

# na.rm=TRUE is needed because Clap contains a missing value
sum(survey$Clap=='Neither', na.rm=TRUE) + sum(is.na(survey$Clap))
## [1] 51

Question 5

What percentage of students (excluding missing values) are left handed (answer to 2 decimal places)?

freq <- table(survey$W.Hnd)
sprintf("%.2f%%", 100*freq['Left'] / sum(freq))
## [1] "7.63%"
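The division by sum(freq) can also be written with prop.table, which converts a table of counts into proportions. A minimal sketch on toy data follows; the factor hand is made up and not taken from survey.

```r
# Toy factor with one missing entry (hypothetical data, not from survey)
hand <- factor(c("Right", "Left", NA, "Right", "Right"))

freq <- table(hand)            # table() drops NA by default
pct  <- 100 * prop.table(freq) # percentages of the non-missing answers

print(round(pct, 2))
```

Because table leaves out missing values unless told otherwise, the percentages are automatically taken over the students who answered, which is what the question asks for.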

Question 6

Code to construct a bar chart of the distribution of preferred upper hand when clapping (ignore NA).

freq <- table(survey$Clap)
midpoints <- barplot(freq, ylab="Frequency", main="Which hand is on top when clapping?", col=colors)
text(midpoints, .9*freq, labels=freq)

Question 7

Code to construct a pie chart of the distribution of preferred upper hand when clapping (ignore NA).

freq <- table(survey$Clap)
labels <- paste(names(freq)," ", "\n", freq, sep="")
# no ylab here: a pie chart has no frequency axis to label
pie(freq, main="Which hand is on top when clapping?", 
    col=colors, labels = labels)

Question 8

Code to construct a pie chart of the distribution of preferred upper hand when clapping (ignore NA and give percentages in labels).

freq <- table(survey$Clap)  # table() drops NA by default
labels <- sprintf("%s\n %.1f%%", names(freq), 100*freq/sum(freq))
pie(freq, main="Which hand is on top when clapping?", 
  col=colors, labels = labels)

Question 9

How many students are left handed and prefer their left hand on top when folding?

table(survey$W.Hnd, survey$Fold)
##        
##         L on R Neither R on L
##   Left      10       1      7
##   Right     88      17    113

Read off the value in the Left row and L on R column of the table above, or count the students who meet both criteria directly, as in

sum(survey$W.Hnd=="Left" & survey$Fold=="L on R", na.rm=TRUE)
## [1] 10
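Rather than reading the cell off by eye, it can be extracted from the table by row and column name. The sketch below checks that this agrees with the direct count, using two made-up factors hand and fold (toy data, not from survey).

```r
# Toy paired answers, with one missing writing hand (hypothetical data)
hand <- factor(c("Left", "Left", "Right", "Right", NA, "Left"))
fold <- factor(c("L on R", "R on L", "L on R", "R on L", "L on R", "L on R"))

tab <- table(hand, fold)  # two-way table; rows with an NA are dropped

from_table  <- tab["Left", "L on R"]  # index the cell by name
from_filter <- sum(hand == "Left" & fold == "L on R", na.rm = TRUE)

print(tab)
print(c(from_table, from_filter))
```

Indexing by name is less error-prone than by position, because it does not depend on the order of the factor levels.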

Question 10

What percentage of students who prefer their right hand on top when folding are also right handed (answer to 2 decimal places)?

table(survey$W.Hnd, survey$Fold)
##        
##         L on R Neither R on L
##   Left      10       1      7
##   Right     88      17    113
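One possible way to get the percentage from this table is to take proportions within each column, so that each folding preference sums to 100%. The sketch below re-enters the counts printed above by hand so that it runs on its own; the names counts, col_pct and answer are introduced here for illustration only.

```r
# Counts copied from the two-way table printed above (filled column by column)
counts <- matrix(c(10, 88, 1, 17, 7, 113), nrow = 2,
                 dimnames = list(W.Hnd = c("Left", "Right"),
                                 Fold  = c("L on R", "Neither", "R on L")))
tab <- as.table(counts)

# margin = 2 divides each cell by its column total
col_pct <- 100 * prop.table(tab, margin = 2)

# share of right-handed students among those with right hand on top ("R on L")
answer <- sprintf("%.2f%%", col_pct["Right", "R on L"])
print(answer)
```

With the loaded data the first step would simply be tab <- table(survey$W.Hnd, survey$Fold). The choice of margin matters: margin = 2 conditions on the folding preference, which is what the question asks, whereas margin = 1 would condition on the writing hand instead.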