Overview

This file contains a set of tasks that you need to complete in R for the lab assignment. The tasks may require you to add a code chunk, type code into a chunk, and/or execute code. Some tasks may also ask you to answer specific questions. Don’t forget that you need to acknowledge any resources you used beyond class materials and any help you received to complete the assignment.

More information and examples that can help you with this assignment can be found in the file “GettingtoKnowYourDataTutorial.html”.

The data set you will use is different from the one used in the instructions. Pay attention to the differences in the Excel file name and in the variable names. You will need to adjust your code accordingly.

Once you have completed the assignment, you will need to publish it to produce an html file. You will then need to upload the html file and this .Rmd file to AsULearn.

1. Add your name and the date

The first thing you need to do in this file is to add your name and date in the lines underneath this document’s title (see the code in lines 10 and 11). While you will change the things in lines 10 and 11, you should not add anything new in this file until after line 42. Do not delete anything in the file.

2. Getting started

You need to identify and set your working directory, load packages, and load your data in this section. In addition to the openxlsx package that we used in the Getting Started in R lab, you also need to load the packages dplyr and tidyverse. Remember that before you load a package for the first time, you need to install it. The name of the Excel file is different from the one in the instructions, so you will need to adjust the code to read in the Excel file that was downloaded as part of the zip file.

Insert your chunks of code here to identify and set your working directory, load packages, and load the data. I recommend doing one thing per chunk of code.
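Reinstalling a package that is already installed and in use can produce "cannot remove prior installation" warnings on Windows. One way to sidestep that is to check which packages are missing before installing anything. The sketch below is only an illustration: `missing_packages()` is a hypothetical helper name that is not part of the class materials, and it relies on base R's `requireNamespace()`.

```r
# Hypothetical helper: return the names of packages that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# The three packages this lab asks for.
needed <- c("openxlsx", "dplyr", "tidyverse")
to_install <- missing_packages(needed)

# Install only what is missing, and only in an interactive session,
# so that knitting the document never triggers a download.
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

With this pattern the `library()` calls stay in their own chunks, and the install step does nothing once the packages are in place.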

getwd()
## [1] "C:/Users/Hovis/Research Methods Lab/GettingToKnowYourDataSpring2026/GettingToKnowYourDataSpring2026"
setwd("C:/Users/Hovis/Research Methods Lab/GettingToKnowYourDataSpring2026/GettingToKnowYourDataSpring2026")
install.packages("openxlsx")
## Installing package into 'C:/Users/Hovis/AppData/Local/R/win-library/4.5'
## (as 'lib' is unspecified)
## package 'openxlsx' successfully unpacked and MD5 sums checked
## Warning: cannot remove prior installation of package 'openxlsx'
## Warning in file.copy(savedcopy, lib, recursive = TRUE): problem copying
## C:\Users\Hovis\AppData\Local\R\win-library\4.5\00LOCK\openxlsx\libs\x64\openxlsx.dll
## to
## C:\Users\Hovis\AppData\Local\R\win-library\4.5\openxlsx\libs\x64\openxlsx.dll:
## Permission denied
## Warning: restored 'openxlsx'
## 
## The downloaded binary packages are in
##  C:\Users\Hovis\AppData\Local\Temp\RtmpiqrfHH\downloaded_packages
install.packages("dplyr")
## Installing package into 'C:/Users/Hovis/AppData/Local/R/win-library/4.5'
## (as 'lib' is unspecified)
## package 'dplyr' successfully unpacked and MD5 sums checked
## Warning: cannot remove prior installation of package 'dplyr'
## Warning in file.copy(savedcopy, lib, recursive = TRUE): problem copying
## C:\Users\Hovis\AppData\Local\R\win-library\4.5\00LOCK\dplyr\libs\x64\dplyr.dll
## to C:\Users\Hovis\AppData\Local\R\win-library\4.5\dplyr\libs\x64\dplyr.dll:
## Permission denied
## Warning: restored 'dplyr'
## 
## The downloaded binary packages are in
##  C:\Users\Hovis\AppData\Local\Temp\RtmpiqrfHH\downloaded_packages
install.packages("tidyverse")
## Installing package into 'C:/Users/Hovis/AppData/Local/R/win-library/4.5'
## (as 'lib' is unspecified)
## package 'tidyverse' successfully unpacked and MD5 sums checked
## 
## The downloaded binary packages are in
##  C:\Users\Hovis\AppData\Local\Temp\RtmpiqrfHH\downloaded_packages
library("dplyr")
## Warning: package 'dplyr' was built under R version 4.5.2
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library("openxlsx")
## Warning: package 'openxlsx' was built under R version 4.5.2
library("tidyverse")
## Warning: package 'tidyverse' was built under R version 4.5.2
## Warning: package 'ggplot2' was built under R version 4.5.2
## Warning: package 'tibble' was built under R version 4.5.2
## Warning: package 'tidyr' was built under R version 4.5.2
## Warning: package 'readr' was built under R version 4.5.2
## Warning: package 'purrr' was built under R version 4.5.2
## Warning: package 'stringr' was built under R version 4.5.2
## Warning: package 'forcats' was built under R version 4.5.2
## Warning: package 'lubridate' was built under R version 4.5.2
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.1     ✔ readr     2.1.6
## ✔ ggplot2   4.0.2     ✔ stringr   1.6.0
## ✔ lubridate 1.9.5     ✔ tibble    3.3.1
## ✔ purrr     1.2.1     ✔ tidyr     1.3.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
BBQData<-read.xlsx("BBQ_Assignment.xlsx")
print(BBQData$DinnerPlatePrice)
##  [1] "11" "15" "16" "20" "7"  "18" "8"  "20" "25" "90" "20" "20" "17" "15" "15"
view(BBQData)
head(BBQData,10)
##       Sex Age              Hometown FavoriteMeat                  FavoriteSauce
## 1    Male  19 Eastern or Central NC beef brisket    Western style (with tomato)
## 2  Female  24 Eastern or Central NC  pulled pork Eastern style (with no tomato)
## 3  Female   2 Eastern or Central NC  pulled pork Eastern style (with no tomato)
## 4    Male  50             Elsewhere  pulled pork    Western style (with tomato)
## 5  Female  19 Eastern or Central NC    pork ribs         South Carolina Mustard
## 6  Female  20 Eastern or Central NC  pulled pork   Kansas style (with molasses)
## 7   Other  19 Eastern or Central NC  pulled pork Eastern style (with no tomato)
## 8    Male  23             Elsewhere beef brisket   Kansas style (with molasses)
## 9  Female  25 Eastern or Central NC  pulled pork    Western style (with tomato)
## 10 Female  49             Elsewhere  pulled pork Eastern style (with no tomato)
##    Sweetness FavoriteSide RestaurantCity    RestaurantName MinutesDriving
## 1          4        fries     Wilmington         Jackson's             20
## 2          1        other     Wilmington     Jackson's BBQ             20
## 3          4 hush puppies        Angier   Stephenson’s BBQ             20
## 4          3   fried okra     Wilson, NC           Parkers             30
## 5          3  baked beans           <NA>              <NA>              0
## 6          4     coleslaw           <NA>       Smithfields             15
## 7          3 hush puppies            n/a               n/a             20
## 8          3 hush puppies       Waco, TX              <NA>            240
## 9          3 hush puppies Smithfield, NC        Smithfield             30
## 10         2   fried okra Wilmington, NC Jackson’s Big Oak             45
##    SandwichPrice DinnerPlatePrice RibsPrice
## 1             13               11        15
## 2             10               15        20
## 3             10               16        20
## 4             15               20        35
## 5              5                7         9
## 6              3               18        20
## 7              6                8        10
## 8             15               20        25
## 9             20               25        30
## 10            15               90        35
tail(BBQData,5)
##       Sex Age  Hometown FavoriteMeat                  FavoriteSauce Sweetness
## 11 Female  25 Elsewhere beef brisket                          Other         3
## 12 Female  18 Elsewhere  pulled pork                   Korean Style         4
## 13   Male 180 Elsewhere    pork ribs Eastern style (with no tomato)         3
## 14 Female  24 Elsewhere    pork ribs                   Korean Style         3
## 15   Male  22 Piedmount  pulled pork Eastern style (with no tomato)         3
##    FavoriteSide  RestaurantCity      RestaurantName MinutesDriving
## 11 hush puppies             N/A                 N/A             10
## 12        fries       Charlotte Midwood Smokehouse              60
## 13        other       Asheville         Daddy Mac’s             30
## 14        fries            cary           red robin             20
## 15 hush puppies Mooresville, NC     Lancaster's BBQ             30
##    SandwichPrice DinnerPlatePrice RibsPrice
## 11           120               20        30
## 12            15               20        35
## 13            15               17       200
## 14            10               15        15
## 15            11               15        20
names(BBQData)
##  [1] "Sex"              "Age"              "Hometown"         "FavoriteMeat"    
##  [5] "FavoriteSauce"    "Sweetness"        "FavoriteSide"     "RestaurantCity"  
##  [9] "RestaurantName"   "MinutesDriving"   "SandwichPrice"    "DinnerPlatePrice"
## [13] "RibsPrice"
BBQData%>%
  rowid_to_column(var="CaseID")->BBQData
head(BBQData,5)
##   CaseID    Sex Age              Hometown FavoriteMeat
## 1      1   Male  19 Eastern or Central NC beef brisket
## 2      2 Female  24 Eastern or Central NC  pulled pork
## 3      3 Female   2 Eastern or Central NC  pulled pork
## 4      4   Male  50             Elsewhere  pulled pork
## 5      5 Female  19 Eastern or Central NC    pork ribs
##                    FavoriteSauce Sweetness FavoriteSide RestaurantCity
## 1    Western style (with tomato)         4        fries     Wilmington
## 2 Eastern style (with no tomato)         1        other     Wilmington
## 3 Eastern style (with no tomato)         4 hush puppies        Angier 
## 4    Western style (with tomato)         3   fried okra     Wilson, NC
## 5         South Carolina Mustard         3  baked beans           <NA>
##     RestaurantName MinutesDriving SandwichPrice DinnerPlatePrice RibsPrice
## 1        Jackson's             20            13               11        15
## 2    Jackson's BBQ             20            10               15        20
## 3 Stephenson’s BBQ             20            10               16        20
## 4          Parkers             30            15               20        35
## 5             <NA>              0             5                7         9
BBQData$Age2<-as.numeric(BBQData$Age)
BBQData$Age2[BBQData$Age2<17]<-NA
print(BBQData$Age)
##  [1] "19"  "24"  "2"   "50"  "19"  "20"  "19"  "23"  "25"  "49"  "25"  "18" 
## [13] "180" "24"  "22"
print(BBQData$Age2)
##  [1]  19  24  NA  50  19  20  19  23  25  49  25  18 180  24  22
BBQData%>%
  mutate(gender2=NA) %>%
  mutate(gender2=replace(gender2, BBQData$Sex=="Male",1))%>%
  mutate(gender2=replace(gender2,BBQData$Sex=="Female",2))%>%
  mutate(gender2=replace(gender2,BBQData$Sex=="Other",3))->BBQData
print(BBQData$Sex)
##  [1] "Male"   "Female" "Female" "Male"   "Female" "Female" "Other"  "Male"  
##  [9] "Female" "Female" "Female" "Female" "Male"   "Female" "Male"
print(BBQData$gender2)
##  [1] 1 2 2 1 2 2 3 1 2 2 2 2 1 2 1

If you are working in the cloud version of RStudio, you do not need to set the working directory, because you will already have uploaded this file and the Excel file to the cloud to be able to access them. Instead, right before your chunk of code, write in all capital letters that you are using RStudio in the cloud.

3. Take a look at your data

Display the first 15 observations of your data set.

head(BBQData,15)
##    CaseID    Sex Age              Hometown FavoriteMeat
## 1       1   Male  19 Eastern or Central NC beef brisket
## 2       2 Female  24 Eastern or Central NC  pulled pork
## 3       3 Female   2 Eastern or Central NC  pulled pork
## 4       4   Male  50             Elsewhere  pulled pork
## 5       5 Female  19 Eastern or Central NC    pork ribs
## 6       6 Female  20 Eastern or Central NC  pulled pork
## 7       7  Other  19 Eastern or Central NC  pulled pork
## 8       8   Male  23             Elsewhere beef brisket
## 9       9 Female  25 Eastern or Central NC  pulled pork
## 10     10 Female  49             Elsewhere  pulled pork
## 11     11 Female  25             Elsewhere beef brisket
## 12     12 Female  18             Elsewhere  pulled pork
## 13     13   Male 180             Elsewhere    pork ribs
## 14     14 Female  24             Elsewhere    pork ribs
## 15     15   Male  22             Piedmount  pulled pork
##                     FavoriteSauce Sweetness FavoriteSide  RestaurantCity
## 1     Western style (with tomato)         4        fries      Wilmington
## 2  Eastern style (with no tomato)         1        other      Wilmington
## 3  Eastern style (with no tomato)         4 hush puppies         Angier 
## 4     Western style (with tomato)         3   fried okra      Wilson, NC
## 5          South Carolina Mustard         3  baked beans            <NA>
## 6    Kansas style (with molasses)         4     coleslaw            <NA>
## 7  Eastern style (with no tomato)         3 hush puppies             n/a
## 8    Kansas style (with molasses)         3 hush puppies        Waco, TX
## 9     Western style (with tomato)         3 hush puppies  Smithfield, NC
## 10 Eastern style (with no tomato)         2   fried okra  Wilmington, NC
## 11                          Other         3 hush puppies             N/A
## 12                   Korean Style         4        fries       Charlotte
## 13 Eastern style (with no tomato)         3        other       Asheville
## 14                   Korean Style         3        fries            cary
## 15 Eastern style (with no tomato)         3 hush puppies Mooresville, NC
##         RestaurantName MinutesDriving SandwichPrice DinnerPlatePrice RibsPrice
## 1            Jackson's             20            13               11        15
## 2        Jackson's BBQ             20            10               15        20
## 3     Stephenson’s BBQ             20            10               16        20
## 4              Parkers             30            15               20        35
## 5                 <NA>              0             5                7         9
## 6          Smithfields             15             3               18        20
## 7                  n/a             20             6                8        10
## 8                 <NA>            240            15               20        25
## 9           Smithfield             30            20               25        30
## 10   Jackson’s Big Oak             45            15               90        35
## 11                 N/A             10           120               20        30
## 12 Midwood Smokehouse              60            15               20        35
## 13         Daddy Mac’s             30            15               17       200
## 14           red robin             20            10               15        15
## 15     Lancaster's BBQ             30            11               15        20
##    Age2 gender2
## 1    19       1
## 2    24       2
## 3    NA       2
## 4    50       1
## 5    19       2
## 6    20       2
## 7    19       3
## 8    23       1
## 9    25       2
## 10   49       2
## 11   25       2
## 12   18       2
## 13  180       1
## 14   24       2
## 15   22       1

4. Create a unique id

Add a variable that is a unique id for each observation. Then take another look at the data. After creating the unique id variable, display the first 15 observations in your data set.

Insert your chunk of code here to create your unique id and then display your data.
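A unique id can be as simple as the row number. The sketch below uses only base R on a small made-up data frame; `toy` is a hypothetical stand-in for the lab data, and its values are for illustration only.

```r
# Toy data frame standing in for the lab data (values are illustrative).
toy <- data.frame(
  Sex = c("Male", "Female", "Other"),
  Age = c("19", "24", "19"),
  stringsAsFactors = FALSE
)

# Number the rows 1..n, then move the id to the first column.
toy$CaseID <- seq_len(nrow(toy))
toy <- toy[, c("CaseID", setdiff(names(toy), "CaseID"))]

# Every id should appear exactly once.
stopifnot(!anyDuplicated(toy$CaseID))

# Display the first 15 observations (here there are only 3).
head(toy, 15)
```

Creating the id before any cleaning makes it possible to trace a recoded value back to its original row.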

5. Recode and clean numerical variables

In this section you need to convert variables into the numerical format and clean up any messy observations. The numerical variables you need to address in this section are: Age, MinutesDriving, SandwichPrice, DinnerPlatePrice, and RibsPrice.
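Columns read from Excel as text need to be converted before they can be compared with `<` or `>`. The sketch below uses a made-up vector (`raw_minutes` is hypothetical, not taken from the data set) to show what `as.numeric()` does with entries it cannot parse and how to list them.

```r
# Illustrative values: some clean, some messy, one already missing.
raw_minutes <- c("20", "240", "n/a", "thirty", NA)

# as.numeric() turns anything it cannot parse into NA (normally with a warning).
minutes_num <- suppressWarnings(as.numeric(raw_minutes))

# Entries that were present in the raw data but failed to convert.
failed <- !is.na(raw_minutes) & is.na(minutes_num)
raw_minutes[failed]
```

Checking the failed entries first helps separate typing problems from values that are numeric but implausible.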

BBQData$MinutesDriving2<-as.numeric(BBQData$MinutesDriving)
BBQData$MinutesDriving2[BBQData$MinutesDriving2>100]<-NA
print(BBQData$MinutesDriving)
##  [1] "20"  "20"  "20"  "30"  "0"   "15"  "20"  "240" "30"  "45"  "10"  "60" 
## [13] "30"  "20"  "30"
print(BBQData$MinutesDriving2)
##  [1] 20 20 20 30  0 15 20 NA 30 45 10 60 30 20 30
BBQData$SandwichPrice2<-as.numeric(BBQData$SandwichPrice)
# Sandwich prices below $5 or above $50 are treated as invalid
BBQData$SandwichPrice2[BBQData$SandwichPrice2<5|BBQData$SandwichPrice2>50]<-NA
print(BBQData$SandwichPrice)
##  [1] "13"  "10"  "10"  "15"  "5"   "3"   "6"   "15"  "20"  "15"  "120" "15" 
## [13] "15"  "10"  "11"
print(BBQData$SandwichPrice2)
##  [1] 13 10 10 15  5 NA  6 15 20 15 NA 15 15 10 11
BBQData$DinnerPlatePrice2<-as.numeric(BBQData$DinnerPlatePrice)
# Dinner plate prices below $15 or above $50 are treated as invalid
BBQData$DinnerPlatePrice2[BBQData$DinnerPlatePrice2<15|BBQData$DinnerPlatePrice2>50]<-NA
print(BBQData$DinnerPlatePrice)
##  [1] "11" "15" "16" "20" "7"  "18" "8"  "20" "25" "90" "20" "20" "17" "15" "15"
print(BBQData$DinnerPlatePrice2)
##  [1] NA 15 16 20 NA 18 NA 20 25 NA 20 20 17 15 15
BBQData$RibsPrice2<-as.numeric(BBQData$RibsPrice)
# Ribs prices below $20 or above $50 are treated as invalid
BBQData$RibsPrice2[BBQData$RibsPrice2<20|BBQData$RibsPrice2>50]<-NA
print(BBQData$RibsPrice)
##  [1] "15"  "20"  "20"  "35"  "9"   "20"  "10"  "25"  "30"  "35"  "30"  "35" 
## [13] "200" "15"  "20"
print(BBQData$RibsPrice2)
##  [1] NA 20 20 35 NA 20 NA 25 30 35 30 35 NA NA 20

You know the following things about your data that will be helpful when conducting your recodes. First, all the respondents should be between the ages of 18 and 90. Second, no respondent is willing to drive more than 100 minutes for BBQ. Third, it is unreasonable for the price of a sandwich to be less than $5, the price of a dinner plate to be less than $15, or the price of ribs to be less than $20. Fourth, no one should be willing to pay more than $50 for a sandwich, dinner plate, or ribs.

After you have reformatted the variables and done the necessary recodes, use the print command to compare the original variables to your new variables.

Remember > means greater than and < means less than.

Insert your code here for recoding and cleaning numerical variables. I recommend doing one thing per chunk of code.
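Several of the rules above have both a lower and an upper limit, so each recode needs to check both sides. A minimal sketch of one way to do that is shown below. `clean_range()` is a hypothetical helper written for this illustration, and the two vectors are copied from the Age and SandwichPrice values printed in this document rather than read from the Excel file.

```r
# Hypothetical helper: convert to numeric and set values outside
# [lower, upper] to NA. Values equal to a limit are kept.
clean_range <- function(x, lower = -Inf, upper = Inf) {
  x <- as.numeric(x)
  x[!is.na(x) & (x < lower | x > upper)] <- NA
  x
}

# Values copied from the Age and SandwichPrice columns shown in this document.
age <- c("19", "24", "2", "50", "19", "20", "19", "23",
         "25", "49", "25", "18", "180", "24", "22")
sandwich <- c("13", "10", "10", "15", "5", "3", "6", "15",
              "20", "15", "120", "15", "15", "10", "11")

# Ages must fall between 18 and 90; sandwich prices between $5 and $50.
age_clean <- clean_range(age, lower = 18, upper = 90)
sandwich_clean <- clean_range(sandwich, lower = 5, upper = 50)

# Side-by-side comparison of the original and cleaned values.
data.frame(age, age_clean, sandwich, sandwich_clean)
```

In this sketch the ages 2 and 180 and the prices 3 and 120 become NA, while a price of exactly 5 is kept because the rule only excludes values below the limit.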

6. Recode and clean categorical variables

In this section you need to recode categorical variables to assign values to their different categories. The following are the categorical variables in the data set: Sex, FavoriteMeat, FavoriteSauce, FavoriteSide.

BBQData$Sex2<-as.character(BBQData$Sex)
BBQData$FavoriteMeat2<-as.character(BBQData$FavoriteMeat)
BBQData$FavoriteSauce2<-as.character(BBQData$FavoriteSauce)
BBQData$FavoriteSide2<-as.character(BBQData$FavoriteSide)

You should reference the code book for the BBQ data set to know the numerical values to assign to the different categories.

After you have completed recoding the variables, use the print command to compare the original variables to your new variables. Don’t forget capitalization matters.

Insert your code here for recoding and cleaning categorical variables. I recommend doing one thing per chunk of code.
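One compact way to assign numbers to categories is a named lookup vector. The sketch below is an illustration only: the codes 1, 2, and 3 are assumed for the example and should be replaced with the values from the code book, and the lowercase "female" entry is made up to show why capitalization matters.

```r
# Three values as they appear in the Sex column, plus one deliberately
# mis-capitalized entry (made up for this example).
sex <- c("Male", "Female", "Other", "female")

# Assumed codes for illustration only; use the values from the code book.
sex_codes <- c(Male = 1, Female = 2, Other = 3)

# Look each category up by name; anything without an exact match becomes NA.
sex_num <- unname(sex_codes[sex])

# Cross-tabulate to confirm every category landed on the intended code.
table(sex, sex_num, useNA = "ifany")
```

Any category that shows up in the NA column of the table has a spelling or capitalization that does not match the lookup names and needs to be handled separately.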

Did you receive help?

Enter the names of anyone who assisted you with completing this lab. If no one helped you, just type out that no one helped you.

No one helped me

Did you provide anyone help with completing this lab?

Enter the names of anyone that you assisted with completing this lab. If you did not help anyone, then just type out that you helped no one.

No

Publish Document

Click the “Knit” button to publish your work as an html document. This file will appear in the folder specified by your working directory. You will need to upload both this RMarkdown file and the html file it produces to AsULearn to get all of the lab points for this week.