This file contains a set of tasks that you need to complete in R for the lab assignment. The tasks may require you to add a code chunk, type code into a chunk, and/or execute code. Some tasks may also ask you to answer specific questions. Don’t forget that you need to acknowledge any resources you used beyond class materials and any help you received to complete the assignment.
More information and examples that can help you with this assignment can be found in the file “GettingtoKnowYourDataTutorial.html”.
The data set you will use is different from the one used in the instructions. Pay attention to the differences in the Excel file name and any variable names. You will need to adjust your code accordingly.
Once you have completed the assignment, you will need to publish it to produce an html file. You will then need to upload the html file and this .Rmd file to AsULearn.
The first thing you need to do in this file is to add your name and date in the lines underneath this document’s title (see the code in lines 10 and 11). While you will change the things in lines 10 and 11, you should not add anything new in this file until after line 42. Do not delete anything in the file.
Author: Jacob Stockton Date: 9-9-25
You need to identify and set your working directory, load packages, and load your data in this section. In addition to the openxlsx package that we used in the Getting Started in R lab, you also need to load the packages dplyr and tidyverse. Remember that before you load a package for the first time, you need to install it. The name of the Excel file is different from what is in the instructions, so you will need to adjust the code to read in the Excel file that was downloaded as part of the zip file.
getwd()
## [1] "/Users/jacobstockton/Desktop/Research methods Lab/GettingToKnowYourDataFall2025"
setwd("/Users/jacobstockton/Desktop/Research methods Lab/GettingToKnowYourDataFall2025")
install.packages("openxlsx")
##
## The downloaded binary packages are in
## /var/folders/3j/tn1f_gtn4674lr97v0dkxc5c0000gn/T//RtmpBzimkB/downloaded_packages
library(openxlsx)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.2 ✔ tibble 3.3.0
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.1.0
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dplyr)
BBQData <- read.xlsx("BBQ_Assignment.xlsx")
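Because a package only needs to be installed once, one option is to check which packages are missing before calling install.packages(). The sketch below is only an illustration: missing_packages() is a hypothetical helper that is not part of the lab materials, and the interactive() guard is an assumption made so that knitting the document never triggers a download.

```r
# Hypothetical helper (not part of the lab): return the packages in `pkgs`
# that are not yet installed on this machine.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(c("openxlsx", "tidyverse"))

# Only install from an interactive session, so knitting never starts a download.
if (length(to_install) > 0 && interactive()) {
  install.packages(to_install)
}
```

After this runs, the library() calls can be used as usual.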
Display the first 15 observations of your data set.
head(BBQData, 10)
## Sex Age Hometown FavoriteMeat FavoriteSauce
## 1 Male 19 Eastern or Central NC beef brisket Western style (with tomato)
## 2 Female 24 Eastern or Central NC pulled pork Eastern style (with no tomato)
## 3 Female 2 Eastern or Central NC pulled pork Eastern style (with no tomato)
## 4 Male 50 Elsewhere pulled pork Western style (with tomato)
## 5 Female 19 Eastern or Central NC pork ribs South Carolina Mustard
## 6 Female 20 Eastern or Central NC pulled pork Kansas style (with molasses)
## 7 Other 19 Eastern or Central NC pulled pork Eastern style (with no tomato)
## 8 Male 23 Elsewhere beef brisket Kansas style (with molasses)
## 9 Female 25 Eastern or Central NC pulled pork Western style (with tomato)
## 10 Female 49 Elsewhere pulled pork Eastern style (with no tomato)
## Sweetness FavoriteSide RestaurantCity RestaurantName MinutesDriving
## 1 4 fries Wilmington Jackson's 20
## 2 1 other Wilmington Jackson's BBQ 20
## 3 4 hush puppies Angier Stephenson’s BBQ 20
## 4 3 fried okra Wilson, NC Parkers 30
## 5 3 baked beans <NA> <NA> 0
## 6 4 coleslaw <NA> Smithfields 15
## 7 3 hush puppies n/a n/a 20
## 8 3 hush puppies Waco, TX <NA> 240
## 9 3 hush puppies Smithfield, NC Smithfield 30
## 10 2 fried okra Wilmington, NC Jackson’s Big Oak 45
## SandwichPrice DinnerPlatePrice RibsPrice
## 1 13 11 15
## 2 10 15 20
## 3 10 16 20
## 4 15 20 35
## 5 5 7 9
## 6 3 18 20
## 7 6 8 10
## 8 15 20 25
## 9 20 25 30
## 10 15 90 35
tail(BBQData, 5)
## Sex Age Hometown FavoriteMeat FavoriteSauce Sweetness
## 11 Female 25 Elsewhere beef brisket Other 3
## 12 Female 18 Elsewhere pulled pork Korean Style 4
## 13 Male 180 Elsewhere pork ribs Eastern style (with no tomato) 3
## 14 Female 24 Elsewhere pork ribs Korean Style 3
## 15 Male 22 Piedmount pulled pork Eastern style (with no tomato) 3
## FavoriteSide RestaurantCity RestaurantName MinutesDriving
## 11 hush puppies N/A N/A 10
## 12 fries Charlotte Midwood Smokehouse 60
## 13 other Asheville Daddy Mac’s 30
## 14 fries cary red robin 20
## 15 hush puppies Mooresville, NC Lancaster's BBQ 30
## SandwichPrice DinnerPlatePrice RibsPrice
## 11 120 20 30
## 12 15 20 35
## 13 15 17 200
## 14 10 15 15
## 15 11 15 20
names(BBQData)
## [1] "Sex" "Age" "Hometown" "FavoriteMeat"
## [5] "FavoriteSauce" "Sweetness" "FavoriteSide" "RestaurantCity"
## [9] "RestaurantName" "MinutesDriving" "SandwichPrice" "DinnerPlatePrice"
## [13] "RibsPrice"
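Before recoding, it can help to see how R stored each column. The sketch below uses a small stand-in data frame (toy is a made-up name, with values copied from the first three rows shown above) and assumes the columns were read in as text.

```r
# Toy stand-in for the imported data; values copied from the first three rows.
toy <- data.frame(
  Sex = c("Male", "Female", "Female"),
  Age = c("19", "24", "2"),
  stringsAsFactors = FALSE
)

# Class of each column; "character" means the column is stored as text
# and would need as.numeric() before any numerical recode.
col_classes <- sapply(toy, class)
print(col_classes)
```

The same sapply() call can be pointed at the full data set instead of the toy data frame.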
Add a variable that is a unique id for each observation. Then take another look at the data. After creating the unique id variable, display the first 15 observations in your data set.
BBQData %>%
  rowid_to_column(var = "CaseID") -> BBQData
head(BBQData,5)
## CaseID Sex Age Hometown FavoriteMeat
## 1 1 Male 19 Eastern or Central NC beef brisket
## 2 2 Female 24 Eastern or Central NC pulled pork
## 3 3 Female 2 Eastern or Central NC pulled pork
## 4 4 Male 50 Elsewhere pulled pork
## 5 5 Female 19 Eastern or Central NC pork ribs
## FavoriteSauce Sweetness FavoriteSide RestaurantCity
## 1 Western style (with tomato) 4 fries Wilmington
## 2 Eastern style (with no tomato) 1 other Wilmington
## 3 Eastern style (with no tomato) 4 hush puppies Angier
## 4 Western style (with tomato) 3 fried okra Wilson, NC
## 5 South Carolina Mustard 3 baked beans <NA>
## RestaurantName MinutesDriving SandwichPrice DinnerPlatePrice RibsPrice
## 1 Jackson's 20 13 11 15
## 2 Jackson's BBQ 20 10 15 20
## 3 Stephenson’s BBQ 20 10 16 20
## 4 Parkers 30 15 20 35
## 5 <NA> 0 5 7 9
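A unique id can also be built without any extra packages. The sketch below is a base R alternative on a made-up data frame (toy), not the method from the tutorial: it numbers the rows with seq_len() and then moves the new column to the front.

```r
# Toy data frame standing in for the real data.
toy <- data.frame(Sex = c("Male", "Female", "Female"), stringsAsFactors = FALSE)

# Number the rows 1, 2, 3, ... and store the result as CaseID.
toy$CaseID <- seq_len(nrow(toy))

# Move CaseID to the first column, keeping every other column in order.
toy <- toy[, c("CaseID", setdiff(names(toy), "CaseID")), drop = FALSE]

# anyDuplicated() returns 0 when every id is unique.
stopifnot(anyDuplicated(toy$CaseID) == 0)
print(toy)
```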
In this section you need to convert variables into the numerical format and clean up any messy observations. The numerical variables you need to address in this section are: Age, MinutesDriving, SandwichPrice, DinnerPlatePrice, and RibsPrice.
You know the following things about your data that will be helpful when conducting your recodes. First, all the respondents should be between the ages of 18 and 90. Second, no respondent is willing to drive more than 100 minutes for BBQ. Third, it is unreasonable for the price of a sandwich to be less than $5, the price of a dinner plate to be less than $15, or the price of ribs to be less than $20. Fourth, no one should be willing to pay more than $50 for a sandwich, dinner plate, or ribs.
After you have reformatted the variables and done the necessary recodes, use the print command to compare the original variables to your new variables.
Remember > means greater than and < means less than.
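The rules above all follow the same pattern: convert the text to numbers, then set anything outside the allowed range to missing. As a minimal sketch of that pattern, the hypothetical helper clean_numeric() below (a made-up name, not a function from the tutorial) takes the lower and upper limits as arguments. The example values are a few of the ages from this data set.

```r
# Hypothetical helper: convert text to numbers and set out-of-range values to NA.
clean_numeric <- function(x, lower, upper) {
  out <- as.numeric(x)
  # Leave existing NAs alone; blank out values below `lower` or above `upper`.
  out[!is.na(out) & (out < lower | out > upper)] <- NA
  out
}

age_raw <- c("19", "24", "2", "50", "180")
age_clean <- clean_numeric(age_raw, lower = 18, upper = 90)
print(age_clean)  # 19 24 NA 50 NA
```

Writing the limits as arguments makes it easier to check each recode against the rule it is supposed to follow.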
BBQData$Age2 <- as.numeric(BBQData$Age)
BBQData$Age2[BBQData$Age2<18]<-NA
BBQData$Age2[BBQData$Age2>90]<-NA
print(BBQData$Age)
## [1] "19" "24" "2" "50" "19" "20" "19" "23" "25" "49" "25" "18"
## [13] "180" "24" "22"
print(BBQData$Age2)
## [1] 19 24 NA 50 19 20 19 23 25 49 25 18 NA 24 22
BBQData$MinutesDriving2 <- as.numeric(BBQData$MinutesDriving)
BBQData$MinutesDriving2[BBQData$MinutesDriving2>100]<-NA
print(BBQData$MinutesDriving)
## [1] "20" "20" "20" "30" "0" "15" "20" "240" "30" "45" "10" "60"
## [13] "30" "20" "30"
print(BBQData$MinutesDriving2)
## [1] 20 20 20 30 0 15 20 NA 30 45 10 60 30 20 30
BBQData$SandwichPrice2 <- as.numeric(BBQData$SandwichPrice)
BBQData$SandwichPrice2[BBQData$SandwichPrice2<5]<-NA
BBQData$SandwichPrice2[BBQData$SandwichPrice2>50]<-NA
print(BBQData$SandwichPrice)
## [1] "13" "10" "10" "15" "5" "3" "6" "15" "20" "15" "120" "15"
## [13] "15" "10" "11"
print(BBQData$SandwichPrice2)
## [1] 13 10 10 15 5 NA 6 15 20 15 NA 15 15 10 11
BBQData$DinnerPlatePrice2 <- as.numeric(BBQData$DinnerPlatePrice)
BBQData$DinnerPlatePrice2[BBQData$DinnerPlatePrice2<15]<-NA
BBQData$DinnerPlatePrice2[BBQData$DinnerPlatePrice2>50]<-NA
print(BBQData$DinnerPlatePrice)
## [1] "11" "15" "16" "20" "7" "18" "8" "20" "25" "90" "20" "20" "17" "15" "15"
print(BBQData$DinnerPlatePrice2)
## [1] NA 15 16 20 NA 18 NA 20 25 NA 20 20 17 15 15
BBQData$RibsPrice2 <- as.numeric(BBQData$RibsPrice)
BBQData$RibsPrice2[BBQData$RibsPrice2<20]<-NA
BBQData$RibsPrice2[BBQData$RibsPrice2>50]<-NA
print(BBQData$RibsPrice)
## [1] "15" "20" "20" "35" "9" "20" "10" "25" "30" "35" "30" "35"
## [13] "200" "15" "20"
print(BBQData$RibsPrice2)
## [1] NA 20 20 35 NA 20 NA 25 30 35 30 35 NA NA 20
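Printing the two vectors one after the other works, but placing them side by side can make each recode easier to check. The sketch below is one possible way to do that, using a few of the ribs prices from this data set. The names ribs_raw, ribs_num, and comparison are made up for the illustration.

```r
# A few of the original ribs prices, stored as text.
ribs_raw <- c("15", "20", "35", "200")

# Convert to numbers, then set prices below $20 or above $50 to missing.
ribs_num <- as.numeric(ribs_raw)
ribs_num[ribs_num < 20 | ribs_num > 50] <- NA

# One row per observation: original value next to the recoded value.
comparison <- data.frame(original = ribs_raw, recoded = ribs_num)
print(comparison)

# Count how many values were set to missing.
n_missing <- sum(is.na(ribs_num))
print(n_missing)  # 2
```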
In this section you need to recode categorical variables to assign
values to their different categories. The following are the categorical
variables in the data set: Sex
, FavoriteMeat
,
FavoriteSauce
, FavoriteSide
.
You should reference the code book for the BBQ data set to know the numerical values to assign to the different categories.
After you have completed recoding the variables, use the print command to compare the original variables to your new variables. Don’t forget capitalization matters.
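To see why capitalization matters, consider the sketch below, which recodes with a named lookup vector. This is an alternative illustration, not the method from the tutorial, and the numeric codes in sex_codes are assumptions that should be checked against the code book. A label that does not match exactly, such as "female" with a lowercase f, receives no code and becomes NA.

```r
# Assumed codes for illustration only; the real values come from the code book.
sex_codes <- c(Male = 1, Female = 2, Other = 3)

# The last label is deliberately miscapitalized.
sex_raw <- c("Male", "Female", "Other", "female")

# Look up each label by name; labels with no exact match return NA.
sex_num <- unname(sex_codes[sex_raw])
print(sex_num)  # 1 2 3 NA
```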
BBQData %>%
  mutate(gender2 = NA) %>%
  mutate(gender2 = replace(gender2, BBQData$Sex == "Male", 1)) %>%
  mutate(gender2 = replace(gender2, BBQData$Sex == "Female", 2)) %>%
  mutate(gender2 = replace(gender2, BBQData$Sex == "Other", 3)) -> BBQData
print(BBQData$Sex)
## [1] "Male" "Female" "Female" "Male" "Female" "Female" "Other" "Male"
## [9] "Female" "Female" "Female" "Female" "Male" "Female" "Male"
print(BBQData$gender2)
## [1] 1 2 2 1 2 2 3 1 2 2 2 2 1 2 1
BBQData %>%
  mutate(FavoriteMeat2 = NA) %>%
  mutate(FavoriteMeat2 = replace(FavoriteMeat2, BBQData$FavoriteMeat == "beef brisket", 1)) %>%
  mutate(FavoriteMeat2 = replace(FavoriteMeat2, BBQData$FavoriteMeat == "pulled pork", 2)) %>%
  mutate(FavoriteMeat2 = replace(FavoriteMeat2, BBQData$FavoriteMeat == "pork ribs", 3)) -> BBQData
print(BBQData$FavoriteMeat)
## [1] "beef brisket" "pulled pork" "pulled pork" "pulled pork" "pork ribs"
## [6] "pulled pork" "pulled pork" "beef brisket" "pulled pork" "pulled pork"
## [11] "beef brisket" "pulled pork" "pork ribs" "pork ribs" "pulled pork"
print(BBQData$FavoriteMeat2)
## [1] 1 2 2 2 3 2 2 1 2 2 1 2 3 3 2
BBQData %>%
  mutate(FavoriteSauce2 = NA) %>%
  mutate(FavoriteSauce2 = replace(FavoriteSauce2, BBQData$FavoriteSauce == "Western style (with tomato)", 1)) %>%
  mutate(FavoriteSauce2 = replace(FavoriteSauce2, BBQData$FavoriteSauce == "Eastern style (with no tomato)", 2)) %>%
  mutate(FavoriteSauce2 = replace(FavoriteSauce2, BBQData$FavoriteSauce == "South Carolina Mustard", 3)) %>%
  mutate(FavoriteSauce2 = replace(FavoriteSauce2, BBQData$FavoriteSauce == "Kansas style (with molasses)", 4)) %>%
  mutate(FavoriteSauce2 = replace(FavoriteSauce2, BBQData$FavoriteSauce == "Korean Style", 5)) %>%
  mutate(FavoriteSauce2 = replace(FavoriteSauce2, BBQData$FavoriteSauce == "Other", 6)) -> BBQData
print(BBQData$FavoriteSauce)
## [1] "Western style (with tomato)" "Eastern style (with no tomato)"
## [3] "Eastern style (with no tomato)" "Western style (with tomato)"
## [5] "South Carolina Mustard" "Kansas style (with molasses)"
## [7] "Eastern style (with no tomato)" "Kansas style (with molasses)"
## [9] "Western style (with tomato)" "Eastern style (with no tomato)"
## [11] "Other" "Korean Style"
## [13] "Eastern style (with no tomato)" "Korean Style"
## [15] "Eastern style (with no tomato)"
print(BBQData$FavoriteSauce2)
## [1] 1 2 2 1 3 4 2 4 1 2 6 5 2 5 2
BBQData %>%
  mutate(FavoriteSide2 = NA) %>%
  mutate(FavoriteSide2 = replace(FavoriteSide2, BBQData$FavoriteSide == "baked beans", 1)) %>%
  mutate(FavoriteSide2 = replace(FavoriteSide2, BBQData$FavoriteSide == "coleslaw", 2)) %>%
  mutate(FavoriteSide2 = replace(FavoriteSide2, BBQData$FavoriteSide == "fried okra", 3)) %>%
  mutate(FavoriteSide2 = replace(FavoriteSide2, BBQData$FavoriteSide == "hush puppies", 4)) %>%
  mutate(FavoriteSide2 = replace(FavoriteSide2, BBQData$FavoriteSide == "fries", 5)) %>%
  mutate(FavoriteSide2 = replace(FavoriteSide2, BBQData$FavoriteSide == "other", 6)) -> BBQData
print(BBQData$FavoriteSide)
## [1] "fries" "other" "hush puppies" "fried okra" "baked beans"
## [6] "coleslaw" "hush puppies" "hush puppies" "hush puppies" "fried okra"
## [11] "hush puppies" "fries" "other" "fries" "hush puppies"
print(BBQData$FavoriteSide2)
## [1] 5 6 4 3 1 2 4 4 4 3 4 5 6 5 4
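One way to confirm that every category received a code is to list the original labels whose recoded value is missing. The sketch below is only an illustration on made-up vectors (side_raw, side_codes), with one label deliberately miscapitalized in the lookup so that the check has something to find.

```r
# A few original labels, as they appear in the data.
side_raw <- c("fries", "other", "hush puppies")

# Lookup with a deliberate capitalization mismatch: "Other" instead of "other".
side_codes <- c("hush puppies" = 4, "fries" = 5, "Other" = 6)
side_num <- unname(side_codes[side_raw])

# Labels that did not receive a code.
unmatched <- unique(side_raw[is.na(side_num)])
print(unmatched)  # "other"

# Cross-tabulation of original labels against codes, with NA shown as its own column.
print(table(side_raw, side_num, useNA = "ifany"))
```

If unmatched comes back empty, every label was matched by the recode.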
Enter the names of anyone who assisted you with completing this lab. If no one helped you, just type out that no one helped you.
No one helped me
Enter the names of anyone that you assisted with completing this lab. If you did not help anyone, then just type out that you helped no one.
Brett Clayton, Summer Simpson
Click the “Knit” button to publish your work as an html document. This file will appear in the folder specified by your working directory. You will need to upload both this RMarkdown file and the html file it produces to AsULearn to get all of the lab points for this week.