Q1 Regression model from activity 1

library(tidyverse)
Activity1 <- read.csv("https://raw.githubusercontent.com/lebebr01/psqf_6243/main/data/rideshare_small.csv")
Lmmodel <- lm(price ~ distance, data = Activity1)
summary(Lmmodel)
## 
## Call:
## lm(formula = price ~ distance, data = Activity1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.074  -7.062  -1.544   5.067  43.304 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  10.4391     0.3825   27.29   <2e-16 ***
## distance      2.8899     0.1534   18.84   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.77 on 2498 degrees of freedom
## Multiple R-squared:  0.1244, Adjusted R-squared:  0.1241 
## F-statistic:   355 on 1 and 2498 DF,  p-value: < 2.2e-16

Q2

The intercept is not meaningful on its own: it estimates a price of about $10.44 for a ride of zero distance, which falls outside the observed data. Centering distance at a value that makes sense for these data fixes this. Centering at the minimum distance makes the intercept the predicted price of a ride at the minimum observed distance.
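A minimal sketch of why centering changes only the intercept. It uses hypothetical simulated data (the `toy` data frame and its coefficients are assumptions, not the rideshare file) and checks that the slope is unchanged while the intercept shifts by the slope times the minimum distance.

```r
# Hypothetical toy data for illustration only (not the rideshare data)
set.seed(1)
toy <- data.frame(distance = runif(50, 0.5, 7))
toy$price <- 10 + 3 * toy$distance + rnorm(50, sd = 2)

# Center distance at its minimum, as in the write-up
toy$cen_min <- toy$distance - min(toy$distance)

fit_raw <- lm(price ~ distance, data = toy)
fit_cen <- lm(price ~ cen_min, data = toy)

slope_raw <- unname(coef(fit_raw)["distance"])
slope_cen <- unname(coef(fit_cen)["cen_min"])
shift     <- unname(coef(fit_cen)[1] - coef(fit_raw)[1])

# The slope is identical; the intercept moves by slope * min(distance)
all.equal(slope_raw, slope_cen)
all.equal(shift, slope_raw * min(toy$distance))
```

Both comparisons should return TRUE, since subtracting a constant from the predictor only relocates where the fitted line is read off as the intercept.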

Preparing the data for centering at the minimum, mean, median, and maximum distance

Newdata <- Activity1 %>%
  mutate(
    cen_min    = distance - min(distance),
    cen_mean   = distance - mean(distance),
    cen_median = distance - median(distance),
    cen_max    = distance - max(distance)
  )

Sample of centered data

head(Newdata %>% select(price, distance, cen_min, cen_mean, cen_median, cen_max))
##   price distance cen_min  cen_mean cen_median cen_max
## 1  22.5     1.89    1.86 -0.326184     -0.385   -5.57
## 2  11.0     1.26    1.23 -0.956184     -1.015   -6.20
## 3  42.5     4.46    4.43  2.243816      2.185   -3.00
## 4  25.0     1.64    1.61 -0.576184     -0.635   -5.82
## 5  28.0     1.61    1.58 -0.606184     -0.665   -5.85
## 6   5.0     1.00    0.97 -1.216184     -1.275   -6.46

Q3

Regression with distance centered at the minimum

Lmmin <- lm(price ~ cen_min, data = Newdata)
summary(Lmmin)
## 
## Call:
## lm(formula = price ~ cen_min, data = Newdata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -21.074  -7.062  -1.544   5.067  43.304 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  10.5258     0.3784   27.82   <2e-16 ***
## cen_min       2.8899     0.1534   18.84   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 8.77 on 2498 degrees of freedom
## Multiple R-squared:  0.1244, Adjusted R-squared:  0.1241 
## F-statistic:   355 on 1 and 2498 DF,  p-value: < 2.2e-16
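The shift in the intercept can be checked by hand from the two outputs. The sketch below uses the Q1 coefficients and assumes a minimum distance of 0.03, which is inferred from the sample rows (for example 1.89 - 1.86) and not reported directly.

```r
# Coefficients from the uncentered model (Q1 output)
b0 <- 10.4391
b1 <- 2.8899

# Assumed minimum distance, inferred from distance - cen_min in the sample rows
min_distance <- 0.03

# Predicted price at the minimum distance = intercept of the centered model
b0_centered <- b0 + b1 * min_distance
round(b0_centered, 4)
```

The result agrees with the centered intercept of 10.5258, while the slope, residuals, and R-squared are the same in both models.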

Q4 Standard Error of Regression Coefficients

Summary <- summary(Lmmin)
Summary$coefficients[ ,2]
## (Intercept)     cen_min 
##   0.3784202   0.1533785
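As a cross-check, the same standard errors can be obtained from the coefficient covariance matrix. This is a sketch on the built-in `cars` data, used here only as a stand-in so the example runs without the rideshare file; the object names are hypothetical.

```r
# Stand-in model on a built-in dataset (not the rideshare data)
fit <- lm(dist ~ speed, data = cars)

# Standard errors taken from the summary table, by column name
se_summary <- coef(summary(fit))[, "Std. Error"]

# Standard errors as square roots of the diagonal of the covariance matrix
se_vcov <- sqrt(diag(vcov(fit)))

all.equal(se_summary, se_vcov)
```

Indexing by the column name "Std. Error" is a little safer than using position 2, since it does not depend on the column order.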

Q5

95% Confidence Interval for the Intercept

10.53 + c(-1, 1) * 1.96 * .3784
## [1]  9.788336 11.271664

95% Confidence Interval for the Slope

2.9 + c(-1, 1) * 1.96 * .1533
## [1] 2.599532 3.200468
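The intervals above use 1.96 and rounded estimates. A slightly more exact sketch, assuming the unrounded estimates and standard errors from the Q3 and Q4 output, uses the t critical value for 2498 residual degrees of freedom; `confint(Lmmin)` applies the same t-based approach to the fitted model directly.

```r
# Estimates and standard errors copied from the centered model output
est <- c(intercept = 10.5258, slope = 2.8899)
se  <- c(intercept = 0.3784202, slope = 0.1533785)

# t critical value for a 95% interval with 2498 residual degrees of freedom
df_resid <- 2498
t_crit <- qt(0.975, df_resid)

# Lower and upper limits for both coefficients
ci <- cbind(lower = est - t_crit * se, upper = est + t_crit * se)
round(ci, 3)
```

With this many degrees of freedom the critical value is very close to 1.96, so the limits differ from the hand calculation only slightly.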

Q6