data <- read.csv("C:/Users/ywang/Desktop/HW1DatasetToyotaPrices _2_.csv")
plot(Price~AgeInMonths, data)
## 2)
fit <- lm(Price~AgeInMonths, data)
summary(fit)
##
## Call:
## lm(formula = Price ~ AgeInMonths, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -8423.0  -997.4   -24.6   878.5 12889.7
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) 20294.059    146.097  138.91   <2e-16 ***
## AgeInMonths  -170.934      2.478  -68.98   <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1746 on 1434 degrees of freedom
## Multiple R-squared: 0.7684, Adjusted R-squared: 0.7682
## F-statistic: 4758 on 1 and 1434 DF, p-value: < 2.2e-16
The fitted equation is Price = 20294.059 - 170.934 * AgeInMonths.
Yes, I think the price is negatively associated with the age, as older cars will be cheaper. The sign of the estimated slope is negative, which is consistent with my expectation.
Since the p-value of the slope is < 2e-16, the slope is significantly different from 0, so there is strong evidence of a linear relationship between the price and the age of the car.
The R-squared of the model is 0.7684, so age explains about 76.8% of the variation in price.
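As a rough consistency check on the summary output: in a regression with a single predictor, the F-statistic should equal the squared t value of the slope, and R-squared can be recovered from F and the residual degrees of freedom. The sketch below only retypes the rounded numbers printed above, so small rounding differences are expected; the variable names are my own.

```r
# Rounded values copied from the summary(fit) output above
t_slope  <- -68.98   # t value of AgeInMonths
f_stat   <- 4758     # F-statistic
df_resid <- 1434     # residual degrees of freedom

# With a single predictor, F equals t^2 (up to rounding)
t_sq <- t_slope^2

# With a single predictor, R-squared = F / (F + residual df)
r2 <- f_stat / (f_stat + df_resid)

print(c(t_squared = t_sq, r_squared = r2))
```

Both values should agree with the printed F-statistic and Multiple R-squared up to rounding.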
predict(fit, data.frame(AgeInMonths=30))
## 1
## 15166.05
The predicted price is 15166.05 Euros.
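The same prediction can be reproduced by hand from the fitted equation. This is a minimal sketch that uses the rounded coefficients from the summary output, so it can differ from predict() in the last digit; b0, b1 and age are names I chose for illustration.

```r
# Rounded coefficients copied from the summary(fit) output
b0  <- 20294.059   # intercept
b1  <- -170.934    # slope for AgeInMonths
age <- 30          # age of the car in months

# Plug the age into the fitted line: Price = b0 + b1 * AgeInMonths
pred_by_hand <- b0 + b1 * age
print(pred_by_hand)
```

The result is within a few cents of the value returned by predict(), which uses the unrounded coefficients.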
predict(fit, data.frame(AgeInMonths=30), interval="prediction")
## fit lwr upr
## 1 15166.05 11737.48 18594.63
The 95% prediction interval is (11737.48, 18594.63).
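I think the price of 19500 is too high: it exceeds the predicted price of 15166.05 and also lies above the upper limit of the 95% prediction interval (18594.63).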
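The prediction interval can be approximated from the summary output alone. The sketch below assumes the usual simplification for a large sample, prediction plus or minus t times the residual standard error, which drops the small extra term for the uncertainty in the fitted line. It is therefore slightly narrower than the exact interval from predict(). All variable names are my own.

```r
# Values copied from the output above
pred     <- 15166.05   # point prediction at AgeInMonths = 30
s        <- 1746       # residual standard error
df_resid <- 1434       # residual degrees of freedom
price    <- 19500      # price being judged in the answer above

# 97.5% quantile of the t distribution for a two-sided 95% interval
t_crit <- qt(0.975, df_resid)

# Approximate interval: ignores the sqrt(1 + leverage) factor
approx_lwr <- pred - t_crit * s
approx_upr <- pred + t_crit * s

print(c(lwr = approx_lwr, upr = approx_upr))
print(price > approx_upr)   # is the price above the upper limit?
```

The approximate limits land within a few Euros of the exact ones, and the conclusion about the price of 19500 does not change.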
data <- read.csv("C:/Users/ywang/Desktop/AirportViolHW1 _2_.csv")
plot(ViolDet~TurnRate, data)
There is a very weak negative association between the two variables.
fit <- lm(ViolDet~TurnRate, data)
summary(fit)
##
## Call:
## lm(formula = ViolDet ~ TurnRate, data = data)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -9.4440 -5.9082 -0.8105  5.2192 13.8808
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) 21.86874    3.02808   7.222 1.43e-06 ***
## TurnRate    -0.03035    0.01680  -1.807   0.0885 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.509 on 17 degrees of freedom
## Multiple R-squared: 0.1611, Adjusted R-squared: 0.1118
## F-statistic: 3.266 on 1 and 17 DF, p-value: 0.08848
The p-value of the slope is 0.0885 > 0.05, so at the 5% significance level there is no significant evidence that the two variables are linearly related.
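The p-value can be checked directly from the t value of the slope and the residual degrees of freedom. This is a small sketch using the rounded t value from the output; the variable names are my own.

```r
# Values copied from the summary(fit) output above
t_slope  <- -1.807   # t value of TurnRate
df_resid <- 17       # residual degrees of freedom

# Two-sided p-value: probability of a t value at least this extreme
p_value <- 2 * pt(-abs(t_slope), df_resid)

# Critical value for a two-sided test at the 5% level
t_crit <- qt(0.975, df_resid)

print(c(p_value = p_value, t_crit = t_crit))
print(abs(t_slope) > t_crit)   # TRUE would mean rejecting a zero slope
```

Comparing |t| with the critical value gives the same decision as comparing the p-value with 0.05.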