7.25 The Coast Starlight, Part II. Exercise 7.13 introduces data on the Coast Starlight Amtrak train that runs from Seattle to Los Angeles. The mean travel time from one stop to the next on the Coast Starlight is 129 mins, with a standard deviation of 113 minutes. The mean distance traveled from one stop to the next is 108 miles with a standard deviation of 99 miles. The correlation between travel time and distance is 0.636.

pic

#(a)Write the equation of the regression line for predicting travel time. 

# Ans(a)
#Avg(TT)= 129 min  SD (TT)=113 mins
#Avg(dis)=108 miles SD(dis)=99 miles
#Correlation Coefficient(R)=0.636
#slope of the regression line=b=R * SD(TT)/SD (dis)

0.636 * (113/99)
## [1] 0.7259394
# Equation of regression line is:  y-y0=b (x-x0)
# y-129=.725(x-108)
# y=129-78.3 +.725x
# y=50.7 +.725x

#(b)Interpret the slope in this context. 

# Ans(b) slope of the line is 0.725 which is positive, that indicates  with increase in distance between stations, time of travel to reach the station also increases

#(c)Calculate R2 of the regression line for predicting travel time from distance traveled for the Coast Starlight, and interpret R2 in the context of the application. 
#Ans(C) #R^2 value, Variance (TT)=99 Residule from the model=42.63 (using value y=125.37 and actual=168)

(99^2-42.63^2)/99^2
## [1] 0.8145784
# above R^2 value explains around 81.45 % variablity in the data


# (d)The distance between Santa Barbara and Los Angeles is 103 miles. Use the model to estimate the time it takes for the Starlight to travel between these two cities. 
#Ans(d) distance between santa barbara to  Los angeles 103 , hence time taken for this travel is

50.7 +.725 *103
## [1] 125.375
# (e)It actually takes the Coast Starlight about 168 mins to travel from Santa Barbara to Los Angeles. Calculate the residual and explain the meaning of this residual value. 

#Ans(e) actual time taken is 168, residual is =
  168-125.37
## [1] 42.63
#this residule value implies the difference in values between actual time taken and values derived from model

#(f)Suppose Amtrak is considering adding a stop to the Coast Starlight 500 miles away from Los Angeles. Would it be appropriate to use this linear model to predict the travel time from Los Angeles to this point?

#Ans(f) Looking at the data set there are no points beyond 350 miles. That means using this model for this case will be carrying out extrapolation of data points which is an unknown territory here. This can increase chance of inaccurate outcome from this model.  

This is a product of OpenIntro that is released under a Creative Commons Attribution-ShareAlike 3.0 Unported. This lab was adapted for OpenIntro by Mine Çetinkaya-Rundel from a lab written by the faculty and TAs of UCLA Statistics.