This file contains a set of tasks that you need to complete in R for the lab assignment. The tasks may require you to add a code chunk, type code into a chunk, and/or execute code. In this lab you will also need to describe your results. Don’t forget that you need to acknowledge if you used any resources beyond class materials or got help to complete the assignment.
Instructions associated with this assignment can be found in the file “DescribingDataTutorial.html”. You can find the code book associated with the BBQ data on AsULearn.
The data set you will use is different than the one used in the instructions. Pay attention to the differences in the Excel file’s name, any variable names, or object names. You will need to adjust your code accordingly.
Once you have completed the assignment, you will need to knit this R Markdown file to produce an html file. You will then need to upload the .html file and this .Rmd file to AsULearn.
The first thing you need to do in this file is to add your name and date in the lines underneath this document’s title (see the code in lines 9 and 10).
Author: Jacob Stockton
Date: 9-19-25
You need to identify and set your working directory in this section. If you are working in the cloud version of RStudio, enter a note here to tell us that you did not need to change the working directory because you are working in the cloud.
getwd()
## [1] "/Users/jacobstockton/Downloads/DescribingDataFall2025"
setwd("/Users/jacobstockton/Downloads/DescribingDataFall2025")
You need to install and load the packages and data set you’ll use for the lab assignment in this section. In this lab, we will use the three packages we have used in previous labs (dplyr, tidyverse, and openxlsx) and one new package (modeest). Remember, the first time you use a package you need to install the package.
install.packages("modeest")
##
## The downloaded binary packages are in
## /var/folders/3j/tn1f_gtn4674lr97v0dkxc5c0000gn/T//Rtmpu2dia0/downloaded_packages
library(modeest)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(openxlsx)
DescribingData <- read.xlsx("DescribingDataAssignmentData.xlsx")
Display the names of the variables in your data set.
names(DescribingData)
## [1] "Observation" "Sex" "Age"
## [4] "Hometown" "Favorite.Meat" "Favorite.Sauce"
## [7] "Sweetness" "Favorite.Side" "Restaurant.City"
## [10] "Restaurant.Name" "Minutes.Driving" "Sandwich.Price"
## [13] "Dinner.Plate.Price" "Ribs.Price"
Display the last 5 observations in the data set.
tail(DescribingData, 5)
## Observation Sex Age Hometown Favorite.Meat Favorite.Sauce Sweetness
## 644 644 1 19 1 1 4 1
## 645 645 1 38 7 6 5 2
## 646 646 2 20 2 2 7 2
## 647 647 1 44 4 2 7 2
## 648 648 2 22 1 5 5 2
## Favorite.Side Restaurant.City Restaurant.Name Minutes.Driving
## 644 5 <NA> <NA> 0
## 645 6 <NA> <NA> 30
## 646 6 Charlotte NC Midwood Smokehouse 10
## 647 1 <NA> <NA> 20
## 648 5 <NA> <NA> 60
## Sandwich.Price Dinner.Plate.Price Ribs.Price
## 644 0 0 0
## 645 25 22 NA
## 646 20 15 36
## 647 8 40 52
## 648 40 15 45
Choose one variable other than Dinner.Plate.Price and display all the observations for that variable.
print(DescribingData$Sandwich.Price)
## [1] 20 12 5 23 22 5 15 16 15 22 20 15 22 15 30 20 15 16 25 15 15 11 7 10 10
## [26] 12 9 11 16 12 6 20 15 12 15 14 20 10 16 12 10 14 20 12 8 8 8 7 9 6
## [51] 7 6 5 15 20 15 10 15 10 15 10 20 15 20 8 20 15 12 10 16 5 10 8 8 1
## [76] 10 20 15 15 18 10 10 12 15 10 20 22 15 20 20 10 10 12 26 13 5 10 10 10 10
## [101] 13 69 15 15 13 10 15 25 8 13 15 10 20 25 10 20 13 3 3 12 3 13 40 15 12
## [126] 8 14 16 15 12 15 20 20 15 50 15 18 15 15 8 9 15 15 12 23 15 12 15 15 15
## [151] 12 13 0 5 14 12 12 20 24 22 11 15 12 15 13 8 10 9 9 13 10 9 13 17 20
## [176] 12 10 15 15 26 15 6 18 25 25 22 7 10 13 20 20 5 0 13 18 10 12 16 15 15
## [201] 0 20 16 15 18 10 15 14 15 10 15 15 15 15 15 12 15 10 10 15 10 14 10 1 10
## [226] 9 25 10 20 12 15 12 12 15 8 7 0 13 15 8 0 12 15 12 15 12 15 15 10 16
## [251] 20 6 22 10 25 16 8 15 30 12 20 6 15 8 20 10 0 15 16 NA 0 4 NA 15 15
## [276] 18 12 20 12 15 9 20 NA 25 25 12 10 15 15 20 16 14 10 15 16 12 30 12 24 25
## [301] 30 8 20 12 22 10 14 16 12 10 20 12 10 12 12 8 12 12 10 25 20 22 20 18 5
## [326] 5 10 6 14 6 11 12 10 10 15 15 15 20 20 15 15 26 12 10 15 10 30 30 15 14
## [351] 20 12 22 15 12 20 2 6 15 20 3 14 8 8 13 27 25 10 10 10 15 10 15 15 12
## [376] 0 10 15 30 10 10 15 7 12 15 15 14 9 10 12 10 10 22 14 13 10 12 20 10 10
## [401] 15 10 14 12 9 16 12 18 20 18 15 15 15 6 7 15 10 25 11 10 8 10 10 12 9
## [426] 18 0 20 13 15 10 9 20 12 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 15 17
## [451] 20 15 15 7 10 8 15 20 10 9 25 15 10 29 10 15 14 26 10 14 12 27 10 14 15
## [476] 15 22 15 10 0 15 15 13 10 14 12 15 10 10 15 40 16 11 10 15 16 10 15 6 20
## [501] 11 35 17 14 15 18 12 20 25 20 45 24 30 16 9 22 15 15 18 22 18 16 20 14 22
## [526] 13 20 12 NA 23 15 10 16 22 0 18 10 15 20 10 10 10 15 10 15 NA 15 14 20 10
## [551] 13 20 14 15 20 15 15 12 18 30 NA 20 15 12 10 10 12 7 16 15 15 10 8 12 10
## [576] 18 NA 20 10 14 12 15 24 15 15 22 12 17 12 15 18 20 22 16 15 15 15 15 12 15
## [601] 12 15 12 13 12 10 20 8 10 10 13 20 10 14 10 19 18 14 15 30 20 15 16 16 30
## [626] 10 13 10 20 15 5 26 6 10 20 30 12 15 15 20 7 22 NA 0 25 20 8 40
You need to calculate the means for variables measuring 1) the price of a dinner plate, 2) preferred sweetness of sauce, 3) how long the respondent is willing to drive, and 4) the price of a rib plate. Calculate the mean of each variable in a separate chunk of code (that is, you’ll need four distinct chunks of code). After each chunk of code, write a one-sentence description of the mean. Don’t forget about missing data.
mean(DescribingData$Dinner.Plate.Price)
## [1] NA
mean(DescribingData$Dinner.Plate.Price, na.rm=TRUE)
## [1] 19.74563
The mean of 19.74563 is the average price, in dollars, that respondents in the data are willing to pay for a pulled pork dinner plate with two sides.
mean(DescribingData$Sweetness)
## [1] NA
mean(DescribingData$Sweetness, na.rm=TRUE)
## [1] 2.889922
The mean of 2.889922 describes how sweet or sour respondents think BBQ sauce should be on a scale of 1 to 5, where 1 is vinegar and 5 is honey. Because the mean is just below the scale’s midpoint of 3, the average preference sits near the middle of the scale, leaning very slightly toward vinegar rather than toward a sweeter sauce.
mean(DescribingData$Minutes.Driving)
## [1] NA
mean(DescribingData$Minutes.Driving, na.rm=TRUE)
## [1] 41.71498
The mean of 41.71498 is the average time, in minutes, that respondents are willing to drive for good BBQ.
mean(DescribingData$Ribs.Price)
## [1] NA
mean(DescribingData$Ribs.Price, na.rm=TRUE)
## [1] 23.54849
The mean of 23.54849 is the average amount, in dollars, that respondents are willing to pay for a half rack of ribs with two sides.
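The pattern in the chunks above, where mean() returns NA until na.rm = TRUE is added, can be reproduced on a small vector. This is only a sketch: the vector `prices` and its values are made up for illustration and are not taken from the BBQ data.

```r
# Hypothetical prices with one missing response (not from the BBQ data)
prices <- c(10, 20, NA, 30)

# A single NA in the input makes the default mean NA
mean(prices)

# na.rm = TRUE drops the NA and averages the remaining three values
mean_no_na <- mean(prices, na.rm = TRUE)
mean_no_na

# Count how many responses were missing
n_missing <- sum(is.na(prices))
n_missing
```

Checking sum(is.na(...)) on a variable before averaging it shows how many responses are being dropped from the calculation.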
Recalculate the means, but round the calculated values. Again, use a separate chunk for each rounded mean. After each chunk of code, write a one sentence description of the mean. Don’t forget about missing data. Importantly, you need to round the means of the different variables to different decimal places.
round(mean(DescribingData$Dinner.Plate.Price, na.rm=TRUE),digits=2)
## [1] 19.75
The rounded mean of 19.75 says that respondents are willing to pay, on average, 19 dollars and 75 cents for a pulled pork dinner plate with two sides.
round(mean(DescribingData$Sweetness, na.rm=TRUE),digits=1)
## [1] 2.9
The rounded mean of 2.9 describes how sweet or sour respondents think BBQ sauce should be on a scale of 1 to 5, where 1 is vinegar and 5 is honey. Because the mean is just below the scale’s midpoint of 3, the average preference sits near the middle of the scale, leaning very slightly toward vinegar rather than toward a sweeter sauce.
round(mean(DescribingData$Minutes.Driving, na.rm=TRUE),digits=3)
## [1] 41.715
The rounded mean of 41.715 says that respondents are willing to drive, on average, about 41 minutes and 43 seconds to eat good BBQ.
round(mean(DescribingData$Ribs.Price, na.rm=TRUE),digits=2)
## [1] 23.55
The rounded mean of 23.55 says that respondents are willing to pay, on average, 23 dollars and 55 cents for a half rack of ribs with two sides.
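As a quick check on the rounding step, the sketch below applies round() with different digits values to the mean driving time reported earlier and converts the fractional minutes into seconds. The variable names (`mean_minutes`, `whole_minutes`, `seconds`) are illustrative choices, not part of the assignment.

```r
# Mean driving time reported earlier in the lab, in minutes
mean_minutes <- 41.71498

# digits controls how many decimal places round() keeps
round(mean_minutes, digits = 0)
round(mean_minutes, digits = 1)

# Split the mean into whole minutes and leftover seconds
whole_minutes <- floor(mean_minutes)
seconds <- round((mean_minutes - whole_minutes) * 60)
whole_minutes
seconds
```

Choosing digits to match the unit (two places for dollars, one for a 1-to-5 scale) keeps the rounded value easy to read.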
You need to calculate and describe the medians of the variables measuring 1) age of the respondent, 2) how long the respondent is willing to drive for good BBQ, and 3) the price of a sandwich. Use a separate chunk of code for each variable. After each chunk of code, write a one-sentence description of the median. Don’t forget about missing data.
median(DescribingData$Age, na.rm = TRUE)
## [1] 21
The median of 21 means that the middle age in the data set is 21 years old: half of the respondents are 21 or younger and half are 21 or older.
median(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 30
The median of 30 shows that the middle value for how long respondents are willing to drive for good BBQ is 30 minutes.
median(DescribingData$Sandwich.Price, na.rm = TRUE)
## [1] 15
The median of 15 shows that the middle price respondents are willing to pay for a BBQ sandwich is 15 dollars.
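The median driving time (30 minutes) is lower than the mean reported earlier (about 41.7 minutes). One plausible explanation is that a few very large answers pull the mean upward while leaving the median unchanged. The sketch below illustrates that effect with a made-up vector `drive`; the values are hypothetical and not taken from the BBQ data.

```r
# Hypothetical driving times in minutes, with one very large answer
drive <- c(10, 20, 30, 30, 500)

# The single large value pulls the mean far above most responses
drive_mean <- mean(drive)
drive_mean

# The median is the middle value after sorting, so it is unaffected
drive_median <- median(drive)
drive_median
```

When the two statistics differ this much, the median is often the better description of a typical respondent.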
You need to calculate and describe the modes of the variables for 1) favorite meat, 2) favorite sauce, and 3) favorite side. These are all categorical variables. Use a separate chunk of code for each variable. After each chunk of code, write a one-sentence description of the mode.
When describing these results, you need to convert the numerical modes of the different variables into words according to the survey code book, which is available on AsU Learn.
mfv(DescribingData$Favorite.Meat)
## [1] 1
This tells us that the most common favorite meat in the data set is pulled pork.
mfv(DescribingData$Favorite.Sauce)
## [1] 1
This tells us that the most common favorite sauce in the data set is Eastern style (with no tomato).
mfv(DescribingData$Favorite.Side)
## [1] 4
This shows us that the most common favorite side in the data set is hush puppies.
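The mode of a coded variable can also be found with base R's table(), and the numeric code can then be translated into words with a lookup vector. The sketch below is one way to do that under stated assumptions: `meat_codes` is made-up data, and every label in `meat_labels` other than 1 = pulled pork is a hypothetical placeholder, since the real labels come from the survey code book.

```r
# Hypothetical coded responses (not the real BBQ data)
meat_codes <- c(1, 2, 1, 3, 1, 2, NA)

# Placeholder labels; only "1" = pulled pork is confirmed by the lab text
meat_labels <- c("1" = "Pulled pork", "2" = "Brisket", "3" = "Ribs")

# table() counts each code and leaves out NA by default
counts <- table(meat_codes)

# The name of the largest count is the mode, stored as a character string
mode_code <- names(counts)[which.max(counts)]

# Look up the label that matches the modal code
mode_label <- unname(meat_labels[mode_code])
mode_label
```

Note that which.max() returns only the first code when two codes are tied for the highest count, so the full table is worth a look before reporting a single mode.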
You need to calculate and describe the ranges, maximums, and minimums of the variables that identify respondents’ 1) ages, 2) rib price, and 3) how many minutes they would drive for BBQ. Use a separate chunk of code for each variable. After each chunk of code write a one sentence description of the minimum, maximum, and range.
min(DescribingData$Age, na.rm = TRUE)
## [1] 10
The youngest respondent in the survey was 10 years old.
max(DescribingData$Age, na.rm = TRUE)
## [1] 99
The oldest respondent in the survey was 99 years old.
max(DescribingData$Age, na.rm = TRUE) - min(DescribingData$Age, na.rm = TRUE)
## [1] 89
The range of ages was 89 years.
min(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 0
The minimum number of minutes a respondent would drive for good BBQ is 0 minutes.
max(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 500
The maximum number of minutes a respondent would drive for good BBQ is 500 minutes.
max(DescribingData$Minutes.Driving, na.rm = TRUE) - min(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 500
The range of driving times for good BBQ is 500 minutes, from 0 minutes to 500 minutes.
max(DescribingData$Ribs.Price, na.rm = TRUE)
## [1] 75
The maximum price a respondent is willing to pay for BBQ ribs with two sides is $75.
min(DescribingData$Ribs.Price, na.rm = TRUE)
## [1] 0
The minimum price a respondent is willing to pay for BBQ ribs with two sides is $0.
max(DescribingData$Ribs.Price, na.rm = TRUE) - min(DescribingData$Ribs.Price, na.rm = TRUE)
## [1] 75
The range of prices respondents are willing to pay for BBQ ribs with two sides is 75 dollars, from 0 dollars to 75 dollars.
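The minimum, maximum, and range computed above can also be obtained in fewer steps: base R's range() returns the minimum and maximum together, and diff() subtracts one from the other. The sketch below uses a made-up vector `ages` (hypothetical values, not the BBQ data) to show the idea.

```r
# Hypothetical ages with one missing response (not the real BBQ data)
ages <- c(19, 38, NA, 20, 44, 22)

# range() returns c(minimum, maximum) once missing values are removed
age_limits <- range(ages, na.rm = TRUE)
age_limits

# diff() of those two values is the range as a single number
age_spread <- diff(age_limits)
age_spread
```

This gives the same result as max(...) - min(...) while calling na.rm = TRUE only once.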
You need to calculate and describe the standard deviation of the variables that identify 1) the number of minutes a respondent would drive for BBQ and 2) the price they would pay for a sandwich in this section.
sd(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 50.95685
sd(DescribingData$Sandwich.Price, na.rm = TRUE)
## [1] 6.608642
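To see what sd() is measuring, the sample standard deviation can be computed by hand: subtract the mean from each value, square the differences, divide their sum by n - 1, and take the square root. The sketch below does this for a made-up vector `prices` (hypothetical values, not the BBQ data) and compares the result with sd().

```r
# Hypothetical sandwich prices with one missing response
prices <- c(8, 12, 15, NA, 25)

# Drop the missing value, as na.rm = TRUE does inside sd()
x <- prices[!is.na(prices)]
n <- length(x)

# Sample standard deviation: square root of the sum of squared
# deviations from the mean, divided by n - 1
manual_sd <- sqrt(sum((x - mean(x))^2) / (n - 1))
manual_sd

# The built-in function should give the same answer
builtin_sd <- sd(prices, na.rm = TRUE)
builtin_sd
```

A larger standard deviation means responses are spread farther from the mean, which is why the driving-time value above is so much larger than the sandwich-price value.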
Enter the names of anyone who assisted you with completing this lab. If no one helped you complete it, just type out that no one helped you. - No one helped me.
Enter the names of anyone that you assisted with completing this lab. If you did not help anyone, then just type out that you helped no one. - Summer Simpson
Click the “Knit” button to publish your work as an html document. This document or file will appear in the folder specified by your working directory. You will need to upload both this RMarkdown file and the html file it produces to AsU Learn to get all of the lab points for this week.