Week 3

Question 1 (Preparation) [2 marks]

  (a) What are the names of the R functions you would use to produce the following plots of numeric data:

• Histogram

hist()

• Box plot

boxplot()

• Scatter plot

plot()

  (b) What is the name of the R function you can use to calculate the sample standard deviation, 𝑠, of a numeric variable?

sd()
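As a quick check of what sd() computes, the sketch below compares it with the sample standard deviation formula, s = sqrt(sum((x - mean(x))^2) / (n - 1)). The vector x holds made-up values used only for illustration; it is not part of the worksheet data.

```r
# Hypothetical data, used only for illustration
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

n <- length(x)
xbar <- mean(x)

# Sample standard deviation from the formula (note the n - 1 divisor)
s_manual <- sqrt(sum((x - xbar)^2) / (n - 1))

# Built-in function
s_builtin <- sd(x)

print(c(manual = s_manual, builtin = s_builtin))
```

Both values should agree, because sd() uses the n - 1 divisor rather than n.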

Question 2 [2 marks]

The following data represent the height increase (in cm) of 10 tomato plants after receiving organic fertilizer for 6 weeks: 12.4, 14.1, 13.8, 15.0, 14.6, 16.2, 13.9, 15.4, 14.8, 12.9

Load the data into a variable:

tomato_heights <- c(12.4, 14.1, 13.8, 15.0, 14.6, 16.2, 13.9, 15.4, 14.8, 12.9)

  (a) Calculate the sample mean, x̄. Show your working and then use R.

```{r}
xbar <- mean(tomato_heights)
xbar  # 14.31
```
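The hand working for the mean is the sum of the ten heights divided by the number of plants. A minimal sketch of that working in R follows; the data are redefined so the block runs on its own, and the names total, n and xbar_manual are illustrative.

```r
tomato_heights <- c(12.4, 14.1, 13.8, 15.0, 14.6, 16.2, 13.9, 15.4, 14.8, 12.9)

total <- sum(tomato_heights)   # 143.1
n <- length(tomato_heights)    # 10
xbar_manual <- total / n       # 143.1 / 10 = 14.31

# Compare the hand working with the built-in function
all.equal(xbar_manual, mean(tomato_heights))
```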

  (b) Calculate the sample median, 𝑚. Show your working and then use R.

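Median = middle value of the sorted data. With n = 10 (an even number), it is the average of the 5th and 6th sorted values: (14.1 + 14.6) / 2 = 14.35

median(tomato_heights)

Median = 14.35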
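The median working can be reproduced step by step: sort the heights, pick the two middle values, and average them. This is a sketch with illustrative variable names, and the data are redefined so the block is self-contained.

```r
tomato_heights <- c(12.4, 14.1, 13.8, 15.0, 14.6, 16.2, 13.9, 15.4, 14.8, 12.9)

sorted_heights <- sort(tomato_heights)
n <- length(sorted_heights)   # 10, an even number

# For even n, the median is the average of the two middle sorted values
middle_two <- sorted_heights[c(n / 2, n / 2 + 1)]   # 14.1 and 14.6
median_manual <- mean(middle_two)                   # 14.35

# Compare with the built-in function
all.equal(median_manual, median(tomato_heights))
```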

Question 3 [2 marks]

The following is the R output from summarising the monthly rent (in NZD) paid by households in Hamilton.

Rent

Min. : 420

1st Qu. : 610

Median : 780

Mean : 825

3rd Qu. : 980

Max. : 1650

  (a) Calculate the range of the monthly rent. Show your working.

Range=Max-Min

Range=1650-420

Range=1230
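The same subtraction can be checked in R. The sketch below copies the values from the summary output into a named vector; the vector name rent_summary and its element names are illustrative, since the raw rent data are not given.

```r
# Values copied from the summary output in the question
rent_summary <- c(Min = 420, Q1 = 610, Median = 780,
                  Mean = 825, Q3 = 980, Max = 1650)

# Range = Max - Min
rent_range <- rent_summary[["Max"]] - rent_summary[["Min"]]
rent_range   # 1230
```

Note that R's own range() returns the minimum and maximum rather than their difference, so diff(range(x)) is the equivalent calculation when the raw data x are available.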

  (b) Calculate the interquartile range of the monthly rent. Show your working.

IQR = 3rd Qu. - 1st Qu.

IQR = 980 - 610

IQR = 370

Question 4 [4 marks]

We will again work with the penguin dataset here. Load the dataset in R.

penguins <- read.csv("/penguins.csv")

  (a) Produce a histogram of the penguins’ body mass. Your histogram must include the appropriate axis label(s).

hist(penguins$body_mass, main = "Histogram of Penguin Body Mass", xlab = "Penguin body mass")

  (b) Use the histogram produced for Question 4(a) to describe any features of the penguins’ body mass.

The most common body mass is between 3500 and 4000 (the modal class). More penguins have a body mass above this class than below it, which suggests the distribution is right-skewed.
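A visual description like this can be backed up by reading the bin counts from the object that hist() returns. The sketch below uses simulated body masses (hypothetical values, since the course CSV is not reproduced here), and the 500-wide bins are an assumption chosen to resemble the histogram described.

```r
set.seed(1)
# Simulated, right-skewed body masses; NOT the real penguin data
body_mass <- round(2700 + rgamma(300, shape = 3, scale = 500))

# plot = FALSE returns the bin information without drawing the histogram
h <- hist(body_mass,
          breaks = seq(2500, max(body_mass) + 500, by = 500),
          plot = FALSE)

# Modal class: the bin with the highest count
modal_bin <- which.max(h$counts)
modal_class <- h$breaks[c(modal_bin, modal_bin + 1)]

# Number of observations in bins below and above the modal class
below <- sum(h$counts[seq_len(modal_bin - 1)])
above <- sum(h$counts[-seq_len(modal_bin)])

print(modal_class)
print(c(below = below, above = above))
```

With the real data, replacing body_mass with penguins$body_mass (after removing any missing values) should give the counts behind the statements above.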

  (c) Produce a box plot of the penguins’ body mass. Your box plot must include the appropriate axis label(s).

boxplot(penguins$body_mass, main = "Box Plot of Penguin Body Mass", xlab = "Penguin body mass", horizontal = TRUE)

  (d) Use the box plot produced for Question 4(c) to describe any features of the penguins’ body mass.

The median is just above 4000, the lower quartile is around 3000, and the upper quartile is between 4500 and 5000. The distance from the upper quartile to the maximum is larger than the distance from the minimum to the lower quartile, which suggests the distribution is right-skewed.
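Values read off a box plot by eye can be checked numerically. The sketch below uses quantile() and boxplot.stats() from base R on simulated body masses (hypothetical values, not the real penguin data), with illustrative variable names.

```r
set.seed(1)
# Simulated, right-skewed body masses; NOT the real penguin data
body_mass <- round(2700 + rgamma(300, shape = 3, scale = 500))

# Quartiles using R's default quantile method
q <- quantile(body_mass, probs = c(0.25, 0.5, 0.75))

# boxplot.stats() returns the five values that boxplot() draws:
# lower whisker, lower hinge, median, upper hinge, upper whisker
bs <- boxplot.stats(body_mass)
print(bs$stats)
print(bs$out)   # points drawn individually as outliers, if any

# Compare the spread of the lower and upper tails
lower_tail <- q[["25%"]] - min(body_mass)
upper_tail <- max(body_mass) - q[["75%"]]
print(c(lower_tail = lower_tail, upper_tail = upper_tail))
```

The hinges drawn by boxplot() can differ slightly from the quartiles returned by quantile(), so small disagreements between the two are expected.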