A survey recorded the weights, in pounds, of the backpacks of 15 different students. Our goal is to analyze these weights by computing a 5-number summary and sketching a histogram and a boxplot.
Here, we record the weight of the 15 backpacks:
BackpackWeight <- c(17,18,19,20,9,10,13,14,25,15,16,21,11,12,22)
Again, these weights are all in pounds.
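Before summarizing, it can help to confirm that all 15 values were entered and to view them in sorted order, since the minimum, quartiles, median, and maximum are all read off the sorted list. The following is a minimal sketch; the names `n` and `SortedWeight` are illustrative and not part of the original analysis.

```r
# Re-create the data so this block runs on its own
BackpackWeight <- c(17,18,19,20,9,10,13,14,25,15,16,21,11,12,22)

n <- length(BackpackWeight)           # number of recorded backpacks (15)
SortedWeight <- sort(BackpackWeight)  # weights from lightest to heaviest

n
SortedWeight
anyNA(BackpackWeight)                 # FALSE means no missing values
```

With 15 values, the middle of the sorted list is the 8th entry, which is where the median will fall.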
Using R commands, we compute the minimum, 1st quartile, median, 3rd quartile, and maximum values of this data set.
min(BackpackWeight) #this calculates the minimum value
## [1] 9
max(BackpackWeight) #this calculates the maximum value
## [1] 25
median(BackpackWeight) #this calculates the median
## [1] 16
quantile(BackpackWeight, 0.5) #another way to calculate the median
## 50%
## 16
quantile(BackpackWeight, 0.25) #first quartile
## 25%
## 12.5
quantile(BackpackWeight, 0.75) #third quartile
## 75%
## 19.5
quantile(BackpackWeight, c(0, 0.25, 0.5, 0.75, 1)) #the full 5-number summary
## 0% 25% 50% 75% 100%
## 9.0 12.5 16.0 19.5 25.0
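The first quartile of 12.5 is not one of the recorded weights, so it may be worth seeing where it comes from. By default, `quantile()` interpolates between neighboring sorted values (R's "type 7" rule): for a proportion p it uses position h = 1 + (n - 1)p in the sorted list. The sketch below reproduces the first quartile by hand under that assumption; the names `p`, `h`, `Lower`, `Upper`, and `ManualQ1` are illustrative.

```r
# Re-create the data so this block runs on its own
BackpackWeight <- c(17,18,19,20,9,10,13,14,25,15,16,21,11,12,22)

SortedWeight <- sort(BackpackWeight)
n <- length(SortedWeight)

p <- 0.25
h <- 1 + (n - 1) * p                  # position 4.5 in the sorted list
Lower <- SortedWeight[floor(h)]       # 4th sorted value (12)
Upper <- SortedWeight[ceiling(h)]     # 5th sorted value (13)

# Interpolate between the two neighbors
ManualQ1 <- Lower + (h - floor(h)) * (Upper - Lower)
ManualQ1                              # 12.5, matching quantile()

# Base R also offers built-in shortcuts
fivenum(BackpackWeight)               # Tukey's five-number summary
IQR(BackpackWeight)                   # interquartile range, Q3 - Q1
```

Note that `fivenum()` uses Tukey's hinges rather than interpolated quantiles, so on other data sets its quartiles can differ slightly from those of `quantile()`; for these 15 weights the two agree.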
Finally, to display this data, we construct a boxplot and a histogram.
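library(ggplot2) #load the plotting package that provides ggplot()
BackpackDataSet <- data.frame(BackpackWeight = BackpackWeight) #ggplot() expects a data frame
ggplot(BackpackDataSet, aes(x=BackpackWeight))+
  geom_boxplot(col="darkgreen", fill="green") #horizontal boxplot of the weights
ggplot(BackpackDataSet, aes(x=BackpackWeight))+
  geom_histogram(binwidth=2, col="blue", fill="lightblue") #histogram with 2-pound bins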
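The boxplot's whiskers follow the common 1.5 × IQR rule: any value more than 1.5 interquartile ranges beyond a quartile would be drawn as a separate outlier point. The sketch below checks that rule by hand; the names `Q1`, `Q3`, `IQRWeight`, `LowerFence`, `UpperFence`, and `Outliers` are illustrative.

```r
# Re-create the data so this block runs on its own
BackpackWeight <- c(17,18,19,20,9,10,13,14,25,15,16,21,11,12,22)

Q1 <- unname(quantile(BackpackWeight, 0.25))   # 12.5
Q3 <- unname(quantile(BackpackWeight, 0.75))   # 19.5
IQRWeight <- Q3 - Q1                           # 7

LowerFence <- Q1 - 1.5 * IQRWeight             # 2
UpperFence <- Q3 + 1.5 * IQRWeight             # 30

# Values outside the fences would be flagged as outliers
Outliers <- BackpackWeight[BackpackWeight < LowerFence |
                           BackpackWeight > UpperFence]
length(Outliers)                               # 0
```

Every weight lies between 2 and 30 pounds, so no backpack is flagged as an outlier and the whiskers extend all the way to the minimum (9) and the maximum (25).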
This data set is quite small, but it shows how we can record quantitative data (i.e., a list of numbers) and summarize it with a 5-number summary and data displays such as a boxplot and a histogram.