log(f areit) = ηt + β1concenit + β2 log(disti) + β3[log(disti)]2 + ai + uit, t = 1, …, 4,
Since the coefficient is 0.36 and we are looking for 0.1 change the impact on lfare will be 0.036 or 3.6%
data("airfare")
str(airfare)
## 'data.frame': 4596 obs. of 14 variables:
## $ year : int 1997 1998 1999 2000 1997 1998 1999 2000 1997 1998 ...
## $ id : int 1 1 1 1 2 2 2 2 3 3 ...
## $ dist : int 528 528 528 528 861 861 861 861 852 852 ...
## $ passen : int 152 265 336 298 282 178 204 190 241 253 ...
## $ fare : int 106 106 113 123 104 105 115 129 207 188 ...
## $ bmktshr: num 0.839 0.813 0.826 0.861 0.58 ...
## $ ldist : num 6.27 6.27 6.27 6.27 6.76 ...
## $ y98 : int 0 1 0 0 0 1 0 0 0 1 ...
## $ y99 : int 0 0 1 0 0 0 1 0 0 0 ...
## $ y00 : int 0 0 0 1 0 0 0 1 0 0 ...
## $ lfare : num 4.66 4.66 4.73 4.81 4.64 ...
## $ ldistsq: num 39.3 39.3 39.3 39.3 45.7 ...
## $ concen : num 0.839 0.813 0.826 0.861 0.58 ...
## $ lpassen: num 5.02 5.58 5.82 5.7 5.64 ...
## - attr(*, "time.stamp")= chr "25 Jun 2011 23:03"
model1 <- (lfare~concen+ldist+ldistsq+y98+y99+y00)
fit.fare1 <- plm(model1, data = airfare,index=c("id", "year"), effect="individual", model="pooling") # OLS regression
coeftest(fit.fare1, vcovHC(fit.fare1, cluster="group", type="HC0"))
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.2092576 0.9107631 6.8176 1.045e-11 ***
## concen 0.3601203 0.0584923 6.1567 8.061e-10 ***
## ldist -0.9016004 0.2716505 -3.3190 0.0009105 ***
## ldistsq 0.1030196 0.0201382 5.1156 3.255e-07 ***
## y98 0.0211244 0.0041429 5.0990 3.553e-07 ***
## y99 0.0378496 0.0051739 7.3155 3.011e-13 ***
## y00 0.0998700 0.0056407 17.7052 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The OLS is CI(0.301170495, 0.41907017), that may not be correct because we are not using robust Standard erro. With Robust we have CI (0.2454, 0.4747) that indicates that we have a problem with heteroskedasticity
#0.3601203 - 0.0584923 * 1.96
#0.3601203 + 0.0584923 * 1.96
coefci(fit.fare1)
## 2.5 % 97.5 %
## (Intercept) 5.384630820 7.03388431
## concen 0.301170495 0.41907017
## ldist -1.153077230 -0.65012354
## ldistsq 0.083952912 0.12208632
## y98 -0.006404571 0.04865332
## y99 0.010321954 0.06537721
## y00 0.072338468 0.12740148
#confint(fit.fare1)
coefci(fit.fare1, vcov=vcovHC, type="HC3")
## 2.5 % 97.5 %
## (Intercept) 4.41562504 8.00289009
## concen 0.24523226 0.47500841
## ldist -1.43654587 -0.36665490
## ldistsq 0.06336509 0.14267413
## y98 0.01298980 0.02925895
## y99 0.02769066 0.04800850
## y00 0.08879362 0.11094633
min(airfare$lfare)
## [1] 3.610918
max(airfare$lfare)
## [1] 6.257668
min(airfare$ldist)
## [1] 4.553877
max(airfare$ldist)
## [1] 7.909857
ldist -0.9016004
ldistsq 0.1030196
(Intercept) 6.2092576
y = -0.9x+0.1x^2+6.2 y = -0.94.5+0.14.5^2+6.2 = 4.175 Solving that we equation we see that the inflection point occurs at X = 4.5. Looks like the point is inside the range of the data.
y = -0.9*4.5+0.1*4.5^2+6.2
plot(airfare$ldist, airfare$lfare)
#x <- min(airfare$ldist)- max(airfare$ldist)
#y <- min(airfare$lfare)- max(airfare$lfare)
#seq((100,0,abs(x)))
#xx <- seq(0,abs(x),0.1)
#yy <- seq(0,abs(y),0.1)
#plot(xx,yy)
Now B1 is 0.2089935, it became smaller coefficient, but same significance
fit.fare2 <- plm(model1, data = airfare,index=c("id", "year"), effect="individual", model="random") # Random regression
coeftest(fit.fare2, vcovHC(fit.fare2, cluster="group", type="HC0"))
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 6.2220050 0.9134118 6.8118 1.088e-11 ***
## concen 0.2089935 0.0421999 4.9525 7.591e-07 ***
## ldist -0.8520921 0.2717941 -3.1351 0.001729 **
## ldistsq 0.0974604 0.0201198 4.8440 1.314e-06 ***
## y98 0.0224743 0.0041416 5.4265 6.044e-08 ***
## y99 0.0366898 0.0051262 7.1573 9.537e-13 ***
## y00 0.0982120 0.0055181 17.7982 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Omar code doesn't match with mine
#fit.re <- plm(log(fare) ~ y98+y99+y00+concen+log(dist)+(log(dist)^2), data=airfare, effect="individual", index=c("id", "year"),
# model="random")
#summary(fit.re)
#coeftest(fit.re, vcovHC(fit.re, cluster="group", type="HC0"))
The result is almost the same because the Random effect is behaving almost like Fixed effect.
model2 <- (lfare~concen+ldist+ldistsq) # why using this model we get different results?
fit.fare3 <- plm(model1, data = airfare,index=c("id", "year"), effect="individual", model="within") # FE regression
coeftest(fit.fare3, vcovHC(fit.fare3, cluster="group", type="HC0"))
##
## t test of coefficients:
##
## Estimate Std. Error t value Pr(>|t|)
## concen 0.1688590 0.0494156 3.4171 0.0006402 ***
## y98 0.0228328 0.0041594 5.4895 4.323e-08 ***
## y99 0.0363819 0.0051230 7.1016 1.491e-12 ***
## y00 0.0977717 0.0055006 17.7746 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#library("sjPlot") #https://cran.r-project.org/web/packages/sjPlot/vignettes/tab_model_robust.html
#tab_model(fit.fare2, vcov.fun = "HC", show.se = TRUE)
#unname(sqrt(diag(sandwich::vcovHC(model))))
#tab_model(fit.fare2, vcov.fun = "CL", vcov.type = "HC1", show.se = TRUE)
I was not fully aware of the meaning of each variable. I would say that Passenger number is very specific to each rout, also bus type. Some might be correlated.
Yes, but not by that much, since we don’t know the scale of the variables we cannot confirm that. However, from the regression it increases in a positive way.