Overview

This file contains a set of tasks that you need to complete in R for the lab assignment. The tasks may require you to add a code chunk, type code into a chunk, and/or execute code. Don’t forget that you need to acknowledge if you used any resources beyond class materials or got help to complete the assignment.

Instructions associated with this assignment can be found in the file “RegressionTutorial.html”.

The data set you will use is different from the one used in the instructions. Pay attention to differences in the Excel file names, variable names, and object names. You will need to adjust your code accordingly.

When asked to describe a relationship, your answer needs to directly engage with the statistical analysis you conducted. The instructions file provides a detailed explanation of how to describe your results.

Once you have completed the assignment, you will need to knit this R Markdown file to produce an .html file. You will then need to upload the .html file and this .Rmd file to AsULearn.

1. Add your Name and the Date

The first thing you need to do in this file is to add your name and date in the lines underneath this document’s title.

2. Identify and Set Your Working Directory

You need to identify and set your working directory in this section. If you are working in the cloud version of RStudio, enter a note here to tell us that you did not need to change the working directory because you are working in the cloud.

getwd()
## [1] "/Users/rlmcollins/Desktop"
setwd("/Users/rlmcollins/Desktop")

3. Installing and Loading Packages and Data Set

You need to install and load the packages and data set you’ll use for the lab assignment in this section. In this lab, we will use the following packages: dplyr, tidyverse, forcats, ggplot2, janitor, texreg and openxlsx. You have not used the package texreg in previous labs, so make sure you install and load the package.

library("dplyr")
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library("tidyverse")
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats   1.0.0     ✔ readr     2.1.5
## ✔ ggplot2   3.5.2     ✔ stringr   1.5.1
## ✔ lubridate 1.9.4     ✔ tibble    3.3.0
## ✔ purrr     1.1.0     ✔ tidyr     1.3.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library("openxlsx")
library("forcats")
library("ggplot2")
library("janitor")
## 
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
## 
##     chisq.test, fisher.test
install.packages("texreg")
## 
## The downloaded binary packages are in
##  /var/folders/wg/xl6pvy053zx8fzvllrfstzmc0000gn/T//Rtmp7BLvh8/downloaded_packages
library("texreg")
## Version:  1.39.4
## Date:     2024-07-23
## Author:   Philip Leifeld (University of Manchester)
## 
## Consider submitting praise using the praise or praise_interactive functions.
## Please cite the JSS article in your publications -- see citation("texreg").
## 
## Attaching package: 'texreg'
## The following object is masked from 'package:tidyr':
## 
##     extract
BBQData <- read.xlsx("RegressionBBQ.xlsx")
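An unconditional install.packages() call downloads the package again every time the document is knit. A minimal sketch of one alternative is to install only when the package is missing; `ensure_package` is a hypothetical helper name, not part of the lab materials, and the check relies on base R's requireNamespace().

```r
# Hypothetical helper: install a package only if it is not already available.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # TRUE when the package can be loaded afterwards
  requireNamespace(pkg, quietly = TRUE)
}

# "stats" ships with R, so this call never triggers a download.
ensure_package("stats")
```

In this lab, calling ensure_package("texreg") before library("texreg") would take the place of the plain install.packages("texreg") line.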

4. Bivariate Regression Model of Rib Plate and Driving Distance

You want to know whether how far someone is willing to drive for BBQ influences the price they are willing to pay for a plate of ribs. Estimate a bivariate regression model, rounding your results to the 2nd decimal. Use your middle name as the name of the object where you store your regression results.

LeannMarie <- lm(Ribs.Price ~ Minutes.Driving, data = BBQData)
screenreg(LeannMarie, digits = 2)
## 
## ===========================
##                  Model 1   
## ---------------------------
## (Intercept)       21.05 ***
##                   (0.76)   
## Minutes.Driving    0.05 ***
##                   (0.01)   
## ---------------------------
## R^2                0.03    
## Adj. R^2           0.03    
## Num. obs.        379       
## ===========================
## *** p < 0.001; ** p < 0.01; * p < 0.05

5. Substantively Interpret the Relationship between Rib Plate and Driving Distance

Write a brief description of the relationship between Rib Plate and Driving.

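There is a small but statistically significant positive relationship between how far someone is willing to drive and the price they are willing to pay for a rib plate. Each additional minute of driving is associated with a 0.05 increase in the predicted price (p < 0.001), although driving time explains only about 3% of the variation in price (R^2 = 0.03).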

6. Substantively Interpret the y-intercept in the regression model of Rib Plate and Driving Distance.

The y-intercept tells us the predicted price someone is willing to pay for a rib plate when they are not willing to drive at all (Minutes.Driving = 0). In the model, if a person doesn’t drive, the predicted price of a rib plate is 21.05. This is the starting point of the regression line, before driving distance has any effect.
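The intercept and slope reported in the table can be turned into predicted prices by hand. The sketch below hard-codes the rounded estimates from the output (21.05 and 0.05); `predicted_price` is a hypothetical helper name, and the driving times are arbitrary examples chosen for illustration.

```r
# Rounded estimates copied from the regression table above.
intercept <- 21.05
slope     <- 0.05

# Hypothetical helper: predicted rib plate price for a given driving time.
predicted_price <- function(minutes) {
  intercept + slope * minutes
}

# At 0 minutes the prediction equals the intercept; each extra minute adds 0.05.
predicted_price(c(0, 30, 60))
```

With the fitted model object itself, predict() with a `newdata` data frame would give the same kind of result using the unrounded coefficients.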

7. Estimate a Bivariate Regression Model of Driving Distance and Age

You want to know whether someone’s age influences how far they are willing to drive for good BBQ. Estimate a bivariate regression model. Round your results so there is 1 digit after the decimal place. Use the name of your hometown as the name of the object where you store your regression results.

Maysville <- lm(Minutes.Driving ~ Age, data = BBQData)
screenreg(Maysville, digits = 1)
## 
## ======================
##              Model 1  
## ----------------------
## (Intercept)   33.6 ***
##               (4.2)   
## Age            0.2    
##               (0.1)   
## ----------------------
## R^2            0.0    
## Adj. R^2       0.0    
## Num. obs.    379      
## ======================
## *** p < 0.001; ** p < 0.01; * p < 0.05
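A regression table marks significance with stars, so a coefficient printed without a star (such as Age above) has p ≥ 0.05 under the legend shown. To see the exact p-values rather than the stars, the coefficient matrix from summary() can be inspected. This is a sketch only: it uses the built-in mtcars data as a stand-in because the lab's Excel file is not available here, and the variable names belong to mtcars, not to the BBQ data.

```r
# mtcars ships with base R and is used here only as a stand-in data set.
fit <- lm(mpg ~ wt, data = mtcars)

# summary() returns a matrix with estimates, standard errors, t values, p-values.
coef_table <- summary(fit)$coefficients
p_values   <- coef_table[, "Pr(>|t|)"]

round(p_values, 4)   # exact p-values, rounded for display
p_values < 0.05      # TRUE where a coefficient would earn at least one star
```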

8. Substantively Interpret the Relationship between Driving Distance and Age

9. Substantively Interpret the y-intercept in the Regression Model of Driving Distance and Age.

10. Creating Dichotomous Variables

You need to create three dichotomous variables based on existing variables in the data set in this section. The first should be named “Prefers.Eastern” and should take on a value of “1” if a respondent identified “Eastern Style (no tomato)” as their preferred type of BBQ sauce and a value of “0” if they did not. The second should be named “Prefers.HP” and should take on a value of “1” if a respondent identified hush puppies as their preferred side dish and a value of “0” if they did not. The third should be named “Pay.More” and should take on a value of “1” if a respondent is willing to pay above the average for a dinner plate and a value of “0” if they are not.

BBQData %>%
  mutate(Prefers.Eastern = NA) %>%
  mutate(Prefers.Eastern = replace(Prefers.Eastern, Favorite.Sauce == 5, 1)) %>%
  mutate(Prefers.Eastern = replace(Prefers.Eastern, Favorite.Sauce < 5, 0)) -> BBQData
BBQData %>%
  mutate(Prefers.HP = NA) %>%
  mutate(Prefers.HP = replace(Prefers.HP, Favorite.Side == 5, 1)) %>%
  mutate(Prefers.HP = replace(Prefers.HP, Favorite.Side < 5, 0)) -> BBQData
BBQData %>%
  mutate(Pay.More = NA) %>%
  mutate(Pay.More = replace(Pay.More, Dinner.Plate.Price == 5, 1)) %>%
  mutate(Pay.More = replace(Pay.More, Dinner.Plate.Price < 5, 0)) -> BBQData
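The task defines Pay.More relative to the average dinner plate price, which can be expressed as a comparison against the sample mean. The sketch below assumes Dinner.Plate.Price is a numeric price; the `toy` data frame and its values are made up for illustration and are not taken from the BBQ data. It uses base R so that it runs on its own, and the same comparison could be placed inside mutate().

```r
# Made-up prices for illustration only; one value is missing on purpose.
toy <- data.frame(Dinner.Plate.Price = c(10, 12, 15, 20, NA, 18))

# Average price, ignoring missing values.
avg_price <- mean(toy$Dinner.Plate.Price, na.rm = TRUE)

# 1 if strictly above the average, 0 otherwise; missing prices stay NA.
toy$Pay.More <- as.integer(toy$Dinner.Plate.Price > avg_price)

avg_price
toy
```

Checking table(toy$Pay.More, useNA = "ifany") afterwards is a quick way to confirm how many respondents fall into each category and how many are missing.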

11. Estimate a Multivariate Regression Model with Three Variables.

Estimate a multivariate regression with the variables driving distance, age, and a preference for hush puppies. You want to know if how far respondents are willing to drive is a function of their age and their love of hush puppies. Round your results so there are 3 digits after the decimal place. Use the name of your favorite BBQ side as the name of the object where you store your regression results.

Hushpuppies <- lm(Prefers.HP ~ Minutes.Driving + Age, data = BBQData)
screenreg(Hushpuppies, digits = 3)
## 
## ============================
##                  Model 1    
## ----------------------------
## (Intercept)        0.126 ***
##                   (0.036)   
## Minutes.Driving   -0.000    
##                   (0.000)   
## Age               -0.002    
##                   (0.001)   
## ----------------------------
## R^2                0.019    
## Adj. R^2           0.011    
## Num. obs.        241        
## ============================
## *** p < 0.001; ** p < 0.01; * p < 0.05
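In an lm() formula, the outcome goes to the left of `~` and the predictors go to the right, so the order of the variables determines which question the model answers. A minimal sketch using the built-in mtcars data as a stand-in (its variable names are not from the BBQ data): `mpg` plays the outcome, `wt` a numeric predictor, and `am` a 0/1 predictor. For the question posed in this section, where driving distance is a function of age and a preference for hush puppies, the corresponding formula would presumably be Minutes.Driving ~ Age + Prefers.HP.

```r
# mtcars is a stand-in: mpg is the outcome, wt is numeric, am is coded 0/1.
fit <- lm(mpg ~ wt + am, data = mtcars)

# The first variable in the formula is the outcome being predicted.
all.vars(formula(fit))[1]

# One coefficient per predictor, plus the intercept.
names(coef(fit))
```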

12. Substantively Interpret the Regression Model in the Previous Question.

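You do not need to interpret the y-intercept.

The coefficients for Minutes.Driving (-0.000) and Age (-0.002) are very close to zero and not statistically significant. Because Prefers.HP is the outcome in the model as estimated, this means that neither driving time nor age is a meaningful predictor of preferring hush puppies; the model does not directly estimate how age and a love of hush puppies affect how far respondents are willing to drive.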

13. Estimate a Multivariate Regression

Estimate a multivariate regression model with the variables for driving distance, age, preference for eastern style sauce, and being willing to pay above average price for a dinner plate. You want to know if the time someone is willing to spend driving is a function of their age, eastern sauce being their preferred sauce style, and if they are willing to pay a higher price for a BBQ plate. Come up with your own unique name for the object to store your regression results. Round to 4 digits.

Sweetpea <- lm(Minutes.Driving ~ Prefers.Eastern + Pay.More + Age, data = BBQData)
screenreg(Sweetpea, digits = 4)
## 
## =======================
##              Model 1   
## -----------------------
## (Intercept)   41.7748  
##              (27.9746) 
## Pay.More      65.1635 *
##              (22.3782) 
## Age           -0.8469  
##               (0.9263) 
## -----------------------
## R^2            0.7530  
## Adj. R^2       0.6296  
## Num. obs.      7       
## =======================
## *** p < 0.001; ** p < 0.01; * p < 0.05
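When a model reports far fewer observations than the data set contains, or a predictor is missing from the table, it helps to check for missing values and for predictors that do not vary. By default lm() drops every row with an NA in any model variable, and a predictor that is constant among the remaining rows cannot be estimated. The sketch below illustrates both effects with a made-up `toy` data frame; it is one possible explanation to check, not a diagnosis of the model above.

```r
# Made-up data: x1 and x2 each have one missing value.
toy <- data.frame(
  y  = c(1, 2, 3, 4, 5, 6),
  x1 = c(2, 1, 4, 3, NA, 6),
  x2 = c(0, 0, 0, 0, 1, NA)
)

colSums(is.na(toy))        # missing values per variable
sum(complete.cases(toy))   # rows with no missing values

fit <- lm(y ~ x1 + x2, data = toy)
nobs(fit)                  # observations actually used by the model
coef(fit)                  # x2 is NA: it is 0 in every complete row
```

Running the same checks on the variables in a real model, for example with table(..., useNA = "ifany") on each dichotomous variable, shows where observations are being lost.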

14. Substantively Interpret the Regression Model in the Previous Question.

You do not need to interpret the y-intercept.

Based on the model above, the coefficient for Pay.More (65.16) is statistically significant at the 0.05 level: holding age constant, people who are willing to pay a higher price are predicted to drive about 65 minutes longer than those who aren’t willing to pay more. Age is not statistically significant, so it doesn’t have a meaningful effect on driving time in this model. Prefers.Eastern does not appear in the output, and the model is based on only 7 observations, so these results should be interpreted with caution.

Publish Document

Click the “Knit” button to publish your work as an html document. This file will appear in the folder specified by your working directory. You will need to upload both this R Markdown file and the html file it produces to AsULearn to get all of the points associated with this lab.