Overview

This file contains a set of tasks that you need to complete in R for the lab assignment. The tasks may require you to add a code chunk, type code into a chunk, and/or execute code. In this lab you will also need to describe your results. Don’t forget that you need to acknowledge if you used any resources beyond class materials or got help to complete the assignment.

Instructions associated with this assignment can be found in the file “DescribingDataTutorial.html”. You can find the code book associated with the BBQ data on AsULearn.

The data set you will use is different from the one used in the instructions. Pay attention to the differences in the Excel file’s name, variable names, and object names. You will need to adjust your code accordingly.

Once you have completed the assignment, you will need to knit this R Markdown file to produce an html file. You will then need to upload the .html file and this .Rmd file to AsULearn.

1. Add your Name and the Date

The first thing you need to do in this file is to add your name and date in the lines underneath this document’s title (see the code in lines 9 and 10).

Author: Jacob Stockton

Date: 9-19-25

2. Identify and Set Your Working Directory

You need to identify and set your working directory in this section. If you are working in the cloud version of RStudio, enter a note here to tell us that you did not need to change the working directory because you are working in the cloud.

getwd()
## [1] "/Users/jacobstockton/Downloads/DescribingDataFall2025"
setwd("/Users/jacobstockton/Downloads/DescribingDataFall2025")

3. Installing and Loading Packages and Data Set

You need to install and load the packages and data set you’ll use for the lab assignment in this section. In this lab, we will use the three packages we have used in previous labs (dplyr, tidyverse, and openxlsx) and one new package (modeest). Remember, the first time you use a package you need to install the package.

install.packages("modeest")
## 
## The downloaded binary packages are in
##  /var/folders/3j/tn1f_gtn4674lr97v0dkxc5c0000gn/T//Rtmpu2dia0/downloaded_packages
library(modeest)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.2     ✔ tibble    3.3.0
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.1.0     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
library(openxlsx)
DescribingData <- read.xlsx("DescribingDataAssignmentData.xlsx")
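
Because the install.packages() line above runs again every time the file is knit, one common pattern is to install a package only when it is missing. The following is a minimal sketch using base R only; the helper name needs_install is hypothetical and not part of the lab materials.

```r
# Hypothetical helper: TRUE when a package is not yet installed.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Install modeest only if it is missing, then load it.
# (Left commented out so this sketch has no side effects.)
# if (needs_install("modeest")) install.packages("modeest")
# library(modeest)

# The "stats" package ships with R, so it is never missing.
needs_install("stats")
```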

4. Names of Variables

Display the names of the variables in your data set.

names(DescribingData)
##  [1] "Observation"        "Sex"                "Age"               
##  [4] "Hometown"           "Favorite.Meat"      "Favorite.Sauce"    
##  [7] "Sweetness"          "Favorite.Side"      "Restaurant.City"   
## [10] "Restaurant.Name"    "Minutes.Driving"    "Sandwich.Price"    
## [13] "Dinner.Plate.Price" "Ribs.Price"

5. Look at data

Display the last 5 observations in the data set.

tail(DescribingData, 5)
##     Observation Sex Age Hometown Favorite.Meat Favorite.Sauce Sweetness
## 644         644   1  19        1             1              4         1
## 645         645   1  38        7             6              5         2
## 646         646   2  20        2             2              7         2
## 647         647   1  44        4             2              7         2
## 648         648   2  22        1             5              5         2
##     Favorite.Side Restaurant.City    Restaurant.Name Minutes.Driving
## 644             5            <NA>               <NA>               0
## 645             6            <NA>               <NA>              30
## 646             6    Charlotte NC Midwood Smokehouse              10
## 647             1            <NA>               <NA>              20
## 648             5            <NA>               <NA>              60
##     Sandwich.Price Dinner.Plate.Price Ribs.Price
## 644              0                  0          0
## 645             25                 22         NA
## 646             20                 15         36
## 647              8                 40         52
## 648             40                 15         45

6. Look at one variable

Choose one variable other than Dinner.Plate.Price and display all the observations for that variable.

print(DescribingData$Sandwich.Price)
##   [1] 20 12  5 23 22  5 15 16 15 22 20 15 22 15 30 20 15 16 25 15 15 11  7 10 10
##  [26] 12  9 11 16 12  6 20 15 12 15 14 20 10 16 12 10 14 20 12  8  8  8  7  9  6
##  [51]  7  6  5 15 20 15 10 15 10 15 10 20 15 20  8 20 15 12 10 16  5 10  8  8  1
##  [76] 10 20 15 15 18 10 10 12 15 10 20 22 15 20 20 10 10 12 26 13  5 10 10 10 10
## [101] 13 69 15 15 13 10 15 25  8 13 15 10 20 25 10 20 13  3  3 12  3 13 40 15 12
## [126]  8 14 16 15 12 15 20 20 15 50 15 18 15 15  8  9 15 15 12 23 15 12 15 15 15
## [151] 12 13  0  5 14 12 12 20 24 22 11 15 12 15 13  8 10  9  9 13 10  9 13 17 20
## [176] 12 10 15 15 26 15  6 18 25 25 22  7 10 13 20 20  5  0 13 18 10 12 16 15 15
## [201]  0 20 16 15 18 10 15 14 15 10 15 15 15 15 15 12 15 10 10 15 10 14 10  1 10
## [226]  9 25 10 20 12 15 12 12 15  8  7  0 13 15  8  0 12 15 12 15 12 15 15 10 16
## [251] 20  6 22 10 25 16  8 15 30 12 20  6 15  8 20 10  0 15 16 NA  0  4 NA 15 15
## [276] 18 12 20 12 15  9 20 NA 25 25 12 10 15 15 20 16 14 10 15 16 12 30 12 24 25
## [301] 30  8 20 12 22 10 14 16 12 10 20 12 10 12 12  8 12 12 10 25 20 22 20 18  5
## [326]  5 10  6 14  6 11 12 10 10 15 15 15 20 20 15 15 26 12 10 15 10 30 30 15 14
## [351] 20 12 22 15 12 20  2  6 15 20  3 14  8  8 13 27 25 10 10 10 15 10 15 15 12
## [376]  0 10 15 30 10 10 15  7 12 15 15 14  9 10 12 10 10 22 14 13 10 12 20 10 10
## [401] 15 10 14 12  9 16 12 18 20 18 15 15 15  6  7 15 10 25 11 10  8 10 10 12  9
## [426] 18  0 20 13 15 10  9 20 12 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 15 17
## [451] 20 15 15  7 10  8 15 20 10  9 25 15 10 29 10 15 14 26 10 14 12 27 10 14 15
## [476] 15 22 15 10  0 15 15 13 10 14 12 15 10 10 15 40 16 11 10 15 16 10 15  6 20
## [501] 11 35 17 14 15 18 12 20 25 20 45 24 30 16  9 22 15 15 18 22 18 16 20 14 22
## [526] 13 20 12 NA 23 15 10 16 22  0 18 10 15 20 10 10 10 15 10 15 NA 15 14 20 10
## [551] 13 20 14 15 20 15 15 12 18 30 NA 20 15 12 10 10 12  7 16 15 15 10  8 12 10
## [576] 18 NA 20 10 14 12 15 24 15 15 22 12 17 12 15 18 20 22 16 15 15 15 15 12 15
## [601] 12 15 12 13 12 10 20  8 10 10 13 20 10 14 10 19 18 14 15 30 20 15 16 16 30
## [626] 10 13 10 20 15  5 26  6 10 20 30 12 15 15 20  7 22 NA  0 25 20  8 40

7. Means

You need to calculate the means for variables measuring 1) the price of a dinner plate, 2) preferred sweetness of sauce, 3) how long the respondent is willing to drive, and 4) the price of a rib plate. Calculate the mean of each variable in a separate chunk of code (that is, you’ll need four distinct chunks of code). After each chunk of code, write a one-sentence description of the mean. Don’t forget about missing data.

mean(DescribingData$Dinner.Plate.Price)
## [1] NA
mean(DescribingData$Dinner.Plate.Price, na.rm=TRUE)
## [1] 19.74563

The mean of 19.74563 is the average price, in dollars, that respondents are willing to pay for a pulled pork dinner plate with two sides.

mean(DescribingData$Sweetness)
## [1] NA
mean(DescribingData$Sweetness, na.rm=TRUE)
## [1] 2.889922

The mean of 2.889922 describes how sweet or sour respondents think BBQ sauce should be on a scale of 1 to 5, where 1 is vinegar and 5 is honey. Because the mean falls just below the scale’s midpoint of 3, the average preference sits near the middle and leans slightly toward the vinegar end rather than the sweet end.

mean(DescribingData$Minutes.Driving)
## [1] NA
mean(DescribingData$Minutes.Driving, na.rm=TRUE)
## [1] 41.71498

The mean of 41.71498 is the average time, in minutes, that people are willing to drive for good BBQ.

mean(DescribingData$Ribs.Price)
## [1] NA
mean(DescribingData$Ribs.Price, na.rm=TRUE)
## [1] 23.54849

The mean of 23.54849 is the average amount, in dollars, that individuals are willing to pay for a half rack of ribs with two sides.
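
As a cross-check on the four separate chunks, the effect of missing data can be sketched on a small data frame. The object toy and its values are made up for illustration and are not taken from the BBQ data; only base R is assumed.

```r
# Toy data standing in for DescribingData; values are made up.
toy <- data.frame(
  Dinner.Plate.Price = c(15, 22, NA, 40),
  Sweetness          = c(1, 2, 2, NA),
  Minutes.Driving    = c(0, 30, 10, 20),
  Ribs.Price         = c(0, NA, 36, 52)
)

# Without na.rm = TRUE, a single NA makes the mean NA.
mean(toy$Dinner.Plate.Price)

# sapply() applies mean() to each column, dropping missing values.
toy_means <- sapply(toy, mean, na.rm = TRUE)
toy_means
```

The assignment still asks for one chunk per variable, so this pattern is only a convenient way to confirm the individual results.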

8. Rounding

Recalculate the means, but round the calculated values. Again, use a separate chunk for each rounded mean. After each chunk of code, write a one sentence description of the mean. Don’t forget about missing data. Importantly, you need to round the means of the different variables to different decimal places.

round(mean(DescribingData$Dinner.Plate.Price, na.rm=TRUE),digits=2)
## [1] 19.75

The rounded mean of 19.75 says that people are willing to pay, on average, 19 dollars and 75 cents for a pulled pork dinner plate with two sides.

round(mean(DescribingData$Sweetness, na.rm=TRUE),digits=1)
## [1] 2.9

The rounded mean of 2.9 describes how sweet or sour respondents think BBQ sauce should be on a scale of 1 to 5, where 1 is vinegar and 5 is honey. Because 2.9 falls just below the scale’s midpoint of 3, the average preference sits near the middle and leans slightly toward the vinegar end rather than the sweet end.

round(mean(DescribingData$Minutes.Driving, na.rm=TRUE),digits=3)
## [1] 41.715

The rounded mean of 41.715 says that people are willing to drive, on average, about 41 minutes and 43 seconds to eat good BBQ.
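
The conversion from decimal minutes to minutes and seconds can be checked with a short calculation. This is a sketch of the arithmetic only; the variable names are chosen here for illustration.

```r
# Rounded mean of Minutes.Driving from the chunk above.
mean_minutes <- 41.715

# Whole minutes, then the leftover fraction converted to seconds.
whole_minutes <- floor(mean_minutes)
seconds <- round((mean_minutes - whole_minutes) * 60)

c(minutes = whole_minutes, seconds = seconds)
```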

round(mean(DescribingData$Ribs.Price, na.rm=TRUE),digits=2)
## [1] 23.55

The rounded mean of 23.55 says that people are willing to pay, on average, 23 dollars and 55 cents for a half rack of ribs with two sides.

9. Medians

You need to calculate and describe the medians of the variables measuring 1) age of the respondent, 2) how long the respondent is willing to drive for good BBQ, and 3) the price of a sandwich. Use a separate chunk of code for each variable. After each chunk of code, write a one-sentence description of the median. Don’t forget about missing data.

median(DescribingData$Age, na.rm = TRUE)
## [1] 21

The median of 21 means that the middle age in the data set is 21 years old: about half of the respondents are 21 or younger and about half are 21 or older.

median(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 30

The median of 30 shows that the middle value for how long respondents would drive for good BBQ is 30 minutes.
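
The median of 30 is well below the mean of 41.71498 reported in section 7, which is the pattern one would expect when a few respondents give very large driving times. A minimal sketch with made-up values (not the survey data) shows how a single large value pulls the mean up while leaving the median unchanged.

```r
# Made-up driving times in minutes, with one very large value.
drive <- c(0, 10, 20, 30, 30, 60, 500)

mean(drive)    # pulled upward by the 500
median(drive)  # middle value of the sorted data
```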

median(DescribingData$Sandwich.Price, na.rm = TRUE)
## [1] 15

The median of 15 shows that the middle price respondents would pay for a BBQ sandwich is 15 dollars.

10. Modes

You need to calculate and describe the modes of the variables for 1) favorite meat, 2) favorite sauce, and 3) favorite side. These are all categorical variables. Use a separate chunk of code for each variable. After each chunk of code, write a one-sentence description of the mode.

When describing these results, you need to convert the numerical modes of the different variables into words according to the survey code book, which is available on AsULearn.

mfv(DescribingData$Favorite.Meat)
## [1] 1

This tells us that the most common favorite meat in the data set is pulled pork.

mfv(DescribingData$Favorite.Sauce)
## [1] 1

This tells us that the most common favorite sauce in the data set is Eastern style (with no tomato).

mfv(DescribingData$Favorite.Side)
## [1] 4

This shows us that the most common favorite side in the data set is hush puppies.
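
The mode can also be cross-checked in base R by counting how often each code occurs. This is a sketch under stated assumptions: the codes are made up, and the labels are placeholders that would need to be replaced with the wording from the code book.

```r
# Made-up codes standing in for a categorical survey variable.
meat_codes <- c(1, 2, 1, 5, 1, 2, NA, 6)

# Count how often each code occurs (table() drops NA by default).
counts <- table(meat_codes)

# Code with the highest count, converted back to a number.
mode_code <- as.numeric(names(counts)[which.max(counts)])

# Placeholder labels; replace with the wording from the code book.
labels <- c("1" = "Label for code 1", "2" = "Label for code 2")
labels[as.character(mode_code)]
```

Note that which.max() keeps only the first code when two codes are tied for the highest count, so the full table is worth inspecting as well.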

11. Ranges, Maximums, and Minimums

You need to calculate and describe the ranges, maximums, and minimums of the variables that identify respondents’ 1) ages, 2) rib price, and 3) how many minutes they would drive for BBQ. Use a separate chunk of code for each variable. After each chunk of code write a one sentence description of the minimum, maximum, and range.

min(DescribingData$Age, na.rm = TRUE)
## [1] 10

The youngest respondent in the survey was 10 years old.

max(DescribingData$Age, na.rm = TRUE)
## [1] 99

The oldest respondent in the survey was 99 years old.

max(DescribingData$Age, na.rm = TRUE) - min(DescribingData$Age, na.rm = TRUE)
## [1] 89

The range of ages was 89 years.

min(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 0

The minimum number of minutes a respondent would drive for good BBQ is 0 minutes.

max(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 500

The maximum number of minutes a respondent would drive for good BBQ is 500 minutes.

max(DescribingData$Minutes.Driving, na.rm = TRUE) - min(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 500

The range of driving times for good BBQ is 500 minutes, from 0 minutes to 500 minutes.

max(DescribingData$Ribs.Price, na.rm = TRUE)
## [1] 75

The maximum price a respondent is willing to pay for BBQ ribs with two sides is $75.

min(DescribingData$Ribs.Price, na.rm = TRUE)
## [1] 0

The minimum price a respondent is willing to pay for BBQ ribs with two sides is $0.

max(DescribingData$Ribs.Price, na.rm = TRUE) - min(DescribingData$Ribs.Price, na.rm = TRUE)
## [1] 75

The range of prices people are willing to pay for BBQ ribs with two sides is 75 dollars, from 0 dollars to 75 dollars.
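
The minimum, maximum, and range can also be obtained together with base R’s range() and diff(). A minimal sketch on made-up ages, not the survey data:

```r
# Made-up ages with one missing value.
ages <- c(19, 38, 20, 44, 22, NA)

# range() returns c(minimum, maximum) once NAs are removed.
age_range <- range(ages, na.rm = TRUE)

# diff() of that pair is maximum minus minimum.
age_span <- diff(age_range)

age_range
age_span
```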

12. Standard Deviations

You need to calculate and describe the standard deviation of the variables that identify 1) the number of minutes a respondent would drive for BBQ and 2) the price they would pay for a sandwich in this section.

sd(DescribingData$Minutes.Driving, na.rm = TRUE)
## [1] 50.95685
sd(DescribingData$Sandwich.Price, na.rm = TRUE)
## [1] 6.608642
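
R’s sd() returns the sample standard deviation, which divides the sum of squared deviations by n - 1. The sketch below reproduces that calculation by hand on made-up prices (not the survey data) to show what the function computes.

```r
# Made-up sandwich prices with one missing value.
prices <- c(20, 12, 5, 23, NA, 15)

# Drop missing values, as na.rm = TRUE does.
x <- prices[!is.na(prices)]

# Sample standard deviation: sqrt of squared deviations over (n - 1).
manual_sd <- sqrt(sum((x - mean(x))^2) / (length(x) - 1))

# Should agree with the built-in function.
all.equal(manual_sd, sd(prices, na.rm = TRUE))
```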

13. Did you receive help?

Enter the names of anyone who assisted you with completing this lab. If no one helped you complete it, just type out that no one helped you. - No one helped me

14. Did you provide anyone help with completing this lab?

Enter the names of anyone that you assisted with completing this lab. If you did not help anyone, then just type out that you helped no one. - Summer Simpson

15. Knit the Document

Click the “Knit” button to publish your work as an html document. This document or file will appear in the folder specified by your working directory. You will need to upload both this RMarkdown file and the html file it produces to AsULearn to get all of the lab points for this week.