Practice Exercise: DATA PRESENTATION
Before writing the scripts for each problem, let us first load the necessary package in RStudio.
# Load necessary packages
library(pander)
Problem 1
The following table shows the brands of 50 vehicles purchased at a certain car dealership during the last 5 years.
| Suzuki | Suzuki | Toyota | Suzuki | Toyota | Ford | Honda | Ford | Suzuki | Suzuki |
| Toyota | Suzuki | Ford | Ford | Toyota | Toyota | Ford | Suzuki | Suzuki | Ford |
| Mitsubishi | Honda | Mitsubishi | Ford | Mitsubishi | Ford | Ford | Honda | Suzuki | Ford |
| Honda | Toyota | Toyota | Suzuki | Suzuki | Mitsubishi | Ford | Ford | Honda | Mitsubishi |
| Honda | Ford | Toyota | Toyota | Honda | Suzuki | Suzuki | Toyota | Ford | Mitsubishi |
Table 5.15. Data for 50 vehicle purchases.
The R Script:
# Import the "vehicles.csv" file and store it to 'vehicles'.
vehicles <- read.csv("vehicles.csv")
# Determine the frequencies for each vehicle brand.
cars.freq <- table(vehicles$brand)
# Determine the relative frequencies for each vehicle brand.
cars.relfreq <- cars.freq/sum(cars.freq)
# Determine the percent frequencies for each vehicle brand.
cars.pctfreq <- cars.relfreq*100
# Combine columns
cars.freqdist <- cbind(cars.freq, cars.relfreq, cars.pctfreq)
# Label the columns of the frequency distribution table
colnames(cars.freqdist) <- c("Frequency", "Relative Frequency", "Percent Frequency")
# Generate the Frequency Distribution Table
pander(cars.freqdist)
| | Frequency | Relative Frequency | Percent Frequency |
|---|---|---|---|
| Ford | 14 | 0.28 | 28 |
| Honda | 7 | 0.14 | 14 |
| Mitsubishi | 6 | 0.12 | 12 |
| Suzuki | 13 | 0.26 | 26 |
| Toyota | 10 | 0.2 | 20 |
The frequency distribution table shows that Ford was the most purchased vehicle brand, with 14 purchases. It is followed by Suzuki with 13 purchases. Mitsubishi was the least purchased brand, with only 6 purchases.
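The script above assumes that a file named "vehicles.csv" with a column named `brand` is available. As a minimal sketch for readers who do not have that file, the data frame can be rebuilt from the brand counts in Table 5.15. The name `brand_counts` is introduced here for illustration only, and the rebuilt data do not preserve the original order of the purchases. The sketch also uses `prop.table()`, which is equivalent to dividing the frequencies by their sum.

```r
# Brand counts tallied from Table 5.15 (hypothetical helper vector).
brand_counts <- c(Ford = 14, Honda = 7, Mitsubishi = 6, Suzuki = 13, Toyota = 10)

# Rebuild a data frame with one row per purchase.
# The column name 'brand' matches vehicles$brand in the script above.
vehicles <- data.frame(brand = rep(names(brand_counts), times = brand_counts))

# Frequencies, relative frequencies, and percent frequencies.
cars.freq <- table(vehicles$brand)
cars.relfreq <- prop.table(cars.freq)  # same as cars.freq / sum(cars.freq)
cars.pctfreq <- cars.relfreq * 100

# Combine the columns and label them.
cars.freqdist <- cbind(cars.freq, cars.relfreq, cars.pctfreq)
colnames(cars.freqdist) <- c("Frequency", "Relative Frequency", "Percent Frequency")
print(cars.freqdist)
```

The relative frequencies should sum to 1 and the percent frequencies to 100, which is a quick check that no purchase was dropped.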
The Bar Chart
The R Script
# Load necessary packages
library(tidyverse)
library(forcats)
# Construct the bar chart.
bar <- ggplot(vehicles, aes(x = brand)) + geom_bar(width = 0.5) + ggtitle("Vehicle Purchases during the Last 5 Years")
# Arrange the bars in decreasing frequencies.
bar <- ggplot(mutate(vehicles, brand = fct_infreq(brand))) + geom_bar(aes(x = brand), width = 0.5) + ggtitle("Vehicle Purchases during the Last 5 Years")
# Present the bar graph.
bar
The bar graph shows that the vehicle brand Ford was the most purchased brand, while Mitsubishi was the least purchased brand.
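To see what the reordering step does, the following sketch reproduces the same ordering in base R, without `fct_infreq()`. It is an illustration only: `brand_counts` and `brand_ordered` are names introduced here, and the data are rebuilt from the counts in Table 5.15 rather than read from "vehicles.csv".

```r
# Brand counts tallied from Table 5.15 (hypothetical helper vector).
brand_counts <- c(Ford = 14, Honda = 7, Mitsubishi = 6, Suzuki = 13, Toyota = 10)
brand <- rep(names(brand_counts), times = brand_counts)

# Order the factor levels from the most to the least frequent brand.
ordered_levels <- names(sort(table(brand), decreasing = TRUE))
brand_ordered <- factor(brand, levels = ordered_levels)

# The levels now determine the left-to-right order of the bars.
print(levels(brand_ordered))
# "Ford" "Suzuki" "Toyota" "Honda" "Mitsubishi"
```

Because a bar chart draws bars in the order of the factor levels, changing the levels is all that is needed to sort the bars.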
The Pie Chart
The R Script
# Determine the frequencies for each car brand
car.freq <- table(vehicles$brand)
# Present the frequency table
car.freq
Ford Honda Mitsubishi Suzuki Toyota
14 7 6 13 10
# Create a vector of the frequencies and name it as freqs.
freqs <- c(14, 7, 6, 13, 10)
# Create a vector of the vehicle brands and name it 'brands'.
brands <- c("Ford", "Honda", "Mitsubishi", "Suzuki", "Toyota")
# Calculate corresponding percentages of the frequencies. Round-off values to 2 decimal places.
pcnts <- round(freqs/sum(freqs)*100, 2)
# Add computed percentages to the brands
brands <- paste(brands, pcnts)
# Add the "%" sign to the labels
brands <- paste(brands, "%", sep = " ")
# Construct and present the pie chart.
pie(freqs, labels = brands, col = rainbow(length(brands)), main = "Pie Chart of Data on Vehicle Purchases")
(Note: pie() draws the chart as soon as it is called and returns NULL, so the chart does not need to be stored in an object or printed with a separate command.)
The pie chart shows that Ford was the most purchased brand among all brands, corresponding to 28% of all the vehicle purchases. Mitsubishi, on the other hand, was the least purchased brand as indicated by only 12% of the purchases made.
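The pie chart script types the frequencies and brand names by hand. As a hedged alternative, the sketch below takes both directly from the frequency table, so the labels stay correct if the data change. The names `brand_counts` and `pie_labels` are introduced here for illustration, and the data are rebuilt from the counts in Table 5.15 rather than read from "vehicles.csv".

```r
# Brand counts tallied from Table 5.15 (hypothetical helper vector).
brand_counts <- c(Ford = 14, Honda = 7, Mitsubishi = 6, Suzuki = 13, Toyota = 10)
vehicles <- data.frame(brand = rep(names(brand_counts), times = brand_counts))

# Take the frequencies and the brand names from the frequency table.
car.freq <- table(vehicles$brand)
freqs <- as.vector(car.freq)
brands <- names(car.freq)

# Percentages rounded to 2 decimal places, then one label per slice.
pcnts <- round(freqs / sum(freqs) * 100, 2)
pie_labels <- paste0(brands, " ", pcnts, "%")
print(pie_labels)

# pie() draws the chart when it is called.
pie(freqs, labels = pie_labels, col = rainbow(length(pie_labels)),
    main = "Pie Chart of Data on Vehicle Purchases")
```

Using `paste0()` builds each label in one step and avoids the extra space before the percent sign.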
Problem 2
The data below shows the time in days required to complete year-end audits for a sample of 20 clients of Sanderson and Clifford, a small public accounting firm. Construct a dot plot for the sample.
Data:
| Year-end Audit Time (in days) | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 12 | 20 | 14 | 15 | 21 | 18 | 22 | 18 | 17 | 13 |
| 15 | 22 | 14 | 27 | 18 | 19 | 33 | 16 | 23 | 28 |
The R Script:
# No additional packages are needed; stripchart() is part of base R graphics.
# Create the data vector
audit <- c(12, 20, 14, 15, 21, 18, 22, 18, 17, 13, 15, 22, 14, 27, 18, 19, 33, 16, 23, 28)
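# Create and present the dot plot. stripchart() draws the plot as soon as it is called and returns NULL, so the result does not need to be stored or printed.
stripchart(audit, method = "stack", at = 0.05, pch = 20, cex = 3.2, las = 1, frame.plot = FALSE, xlim = c(10, 35), main = "Year-End Audit Times", xlab = "Audit Time (in days)")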
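As a quick check on the dot plot, the heights of the stacks can be computed with `table()`, since each stack holds one dot per occurrence of a value. This is a minimal sketch; the names `counts` and `most_common` are introduced here for illustration.

```r
# Year-end audit times (in days) for the 20 clients.
audit <- c(12, 20, 14, 15, 21, 18, 22, 18, 17, 13,
           15, 22, 14, 27, 18, 19, 33, 16, 23, 28)

# Number of dots stacked at each audit time.
counts <- table(audit)
print(counts)

# Audit time(s) with the tallest stack.
most_common <- names(counts)[counts == max(counts)]
print(most_common)

# Smallest and largest audit times, which should fall inside xlim = c(10, 35).
print(range(audit))
```

The tallest stack in the plot should match the most frequent value reported here, and every value should fall inside the plotted x-axis limits.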