Why Quantile Regression?

Reason 1:

Quantile regression lets us study the impact of the independent variables on different quantiles of the dependent variable’s distribution, and thus gives a more complete picture of the relationship between Y and X than a regression on the mean alone.
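For reference, this idea is usually written as the following minimization problem (the standard Koenker and Bassett formulation). The tutorial itself does not state it, so the symbols below are notation introduced here: y_i is the dependent variable, x_i the vector of regressors, and τ the quantile level.

```latex
\hat{\beta}(\tau) = \arg\min_{\beta} \sum_{i=1}^{n} \rho_\tau\left(y_i - x_i^\top \beta\right),
\qquad
\rho_\tau(u) = u\left(\tau - \mathbf{1}\{u < 0\}\right), \quad 0 < \tau < 1 .
```

For τ = 0.5 the loss is proportional to |u|, so the fit is a median (least absolute deviations) regression. Other values of τ weight positive and negative residuals asymmetrically, which is what moves the fitted line to a lower or upper quantile.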

Reason 2:

Quantile regression is robust to outliers in the y observations.
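A small base-R illustration of this robustness, using made-up numbers rather than the tutorial's data. The helper names `check_loss` and `total_loss` are introduced here for the sketch. It compares the mean with the value that minimizes the quantile (check) loss at tau = 0.5.

```r
# Toy data (not from the tutorial): four ordinary values and one outlier.
y <- c(1, 2, 3, 4, 100)

# Check (pinball) loss used by quantile regression.
check_loss <- function(u, tau) u * (tau - (u < 0))

# Total loss for a candidate location q at quantile level tau.
total_loss <- function(q, y, tau) sum(check_loss(y - q, tau))

# Search over the observed values for the tau = 0.5 minimizer.
losses <- sapply(y, total_loss, y = y, tau = 0.5)
q_hat <- y[which.min(losses)]

print(mean(y))    # pulled toward the outlier
print(median(y))  # does not depend on how large the outlier is
print(q_hat)      # minimizer of the check loss at tau = 0.5
```

The check-loss minimizer coincides with the sample median here, and replacing 100 with any larger number would leave it unchanged, whereas the mean would keep moving.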

Reason 3:

Estimation and inference are distribution-free: no parametric assumption is made about the distribution of the errors.

In this tutorial session we will learn how to run quantile regressions in R.

Follow the procedure:

1. Load the quantreg package*:

library(quantreg)

## Loading required package: SparseM
## 
## Attaching package: 'SparseM'
## 
## The following object is masked from 'package:base':
## 
##     backsolve

2. Import your data:

QRData <- read.csv("https://dl.dropboxusercontent.com/u/18255955/PhD-INCEIF/MSData.csv")
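Remote links of this kind can stop resolving over time, so it may be worth guarding the import. The following is a minimal sketch, not part of the original tutorial: `safe_read_csv` is a hypothetical helper, and the temporary file stands in for a locally saved copy of the data.

```r
# Hypothetical helper (not part of the original tutorial): try to read a CSV and
# return NULL with a message, instead of stopping, if the source is unavailable.
safe_read_csv <- function(src) {
  tryCatch(
    suppressWarnings(read.csv(src)),
    error = function(e) {
      message("Could not read '", src, "': ", conditionMessage(e))
      NULL
    }
  )
}

# Demonstration with a temporary local file holding placeholder values.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(id = c("a", "b"), value = c(1, 2)), tmp, row.names = FALSE)

local_copy <- safe_read_csv(tmp)
missing_copy <- safe_read_csv(file.path(tempdir(), "no_such_file.csv"))

print(local_copy)
print(is.null(missing_copy))
```

The same helper can be pointed at the URL first and at a local path second, so that the rest of the session only runs once one of the two reads has succeeded.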

3. Attach your data and get descriptive statistics:

attach(QRData)

summary(QRData)
##       DATE         LRY             LRV               INT        
##  1997Q3 : 1   Min.   :11.39   Min.   : 0.1100   Min.   : 4.520  
##  1997Q4 : 1   1st Qu.:11.57   1st Qu.: 0.3125   1st Qu.: 5.015  
##  1998Q1 : 1   Median :11.83   Median : 0.6100   Median : 6.215  
##  1998Q2 : 1   Mean   :11.80   Mean   : 1.9211   Mean   : 6.619  
##  1998Q3 : 1   3rd Qu.:12.00   3rd Qu.: 1.5275   3rd Qu.: 7.025  
##  1998Q4 : 1   Max.   :12.21   Max.   :33.7100   Max.   :13.510  
##  (Other):60                                                     
##       LRC             LVS             LGS       
##  Min.   :13.22   Min.   :3.900   Min.   :3.360  
##  1st Qu.:13.31   1st Qu.:4.705   1st Qu.:3.853  
##  Median :13.36   Median :4.875   Median :3.935  
##  Mean   :13.43   Mean   :4.896   Mean   :3.978  
##  3rd Qu.:13.54   3rd Qu.:5.170   3rd Qu.:4.140  
##  Max.   :13.85   Max.   :5.400   Max.   :4.900  
## 

4. Define a data frame and obtain the correlation matrix for your variables:

datatable=data.frame(LRY, LRV, LRC, INT)

cor(datatable)
##            LRY        LRV        LRC        INT
## LRY  1.0000000 -0.4332241  0.8569548 -0.8145417
## LRV -0.4332241  1.0000000 -0.2451381  0.6998176
## LRC  0.8569548 -0.2451381  1.0000000 -0.6122857
## INT -0.8145417  0.6998176 -0.6122857  1.0000000

5. Make scatterplots:

pairs(datatable, col="blue", main="Scatterplots")

6. Define your dependent (Y) and independent (X) variables:

Y=cbind(LRY)

X=cbind(LRV, LRC, INT)

Then plot a histogram and a density line for the dependent variable:

hist(Y, prob=TRUE, col = "blue", border = "black")

lines(density(Y))

7. Run OLS regression and get the output:

OLSreg=lm(Y~X)

summary(OLSreg)
## 
## Call:
## lm(formula = Y ~ X)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.13441 -0.06907 -0.01374  0.03885  0.21060 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.591353   1.216473   1.308    0.196    
## XLRV         0.003671   0.003565   1.030    0.307    
## XLRC         0.792362   0.087463   9.059 5.88e-13 ***
## XINT        -0.065277   0.010298  -6.339 2.96e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.09046 on 62 degrees of freedom
## Multiple R-squared:  0.871,  Adjusted R-squared:  0.8647 
## F-statistic: 139.5 on 3 and 62 DF,  p-value: < 2.2e-16
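An alternative way to specify the same kind of model is the formula interface with a `data` argument, which avoids `attach()` and `cbind()` and labels the coefficients `LRV`, `LRC`, `INT` instead of `XLRV`, `XLRC`, `XINT`. The sketch below uses simulated stand-in data, not the tutorial's MSData.csv. Only the column names are borrowed, and the numbers used to generate the data are arbitrary assumptions.

```r
# Simulated stand-in data (NOT the tutorial's data); column names mirror the tutorial.
set.seed(1)
n <- 66
sim <- data.frame(
  LRV = rexp(n),
  LRC = rnorm(n, mean = 13.4, sd = 0.2),
  INT = runif(n, min = 4.5, max = 13.5)
)
sim$LRY <- 1.5 + 0.8 * sim$LRC - 0.06 * sim$INT + rnorm(n, sd = 0.09)

# Formula interface with data = ...: no attach() or cbind() needed.
ols_fit <- lm(LRY ~ LRV + LRC + INT, data = sim)

print(names(coef(ols_fit)))
print(summary(ols_fit))
```

`rq()` from quantreg accepts the same `formula` and `data` arguments, so the quantile regressions in the following steps could be written in the same style.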

8. Run the 0.25 quantile regression and get the output:

Qreg25=rq(Y~X, tau=0.25)

summary(Qreg25)
## 
## Call: rq(formula = Y ~ X, tau = 0.25)
## 
## tau: [1] 0.25
## 
## Coefficients:
##             coefficients lower bd upper bd
## (Intercept)  1.83794      0.69054  3.65746
## XLRV         0.00739     -0.14305  0.01842
## XLRC         0.77456      0.66301  0.85183
## XINT        -0.07835     -0.15756 -0.04587
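The `lower bd` and `upper bd` columns above are confidence bounds rather than standard errors. As far as the quantreg documentation describes it, `summary()` on a small sample defaults to the rank-inversion method (`se = "rank"`), which reports intervals only. If standard errors and p-values are wanted, one option is the bootstrap. The sketch below runs on simulated data (not the tutorial's), and is skipped if quantreg is not installed.

```r
if (requireNamespace("quantreg", quietly = TRUE)) {
  library(quantreg)

  # Simulated heteroskedastic data, for illustration only.
  set.seed(42)
  n <- 200
  x <- runif(n, 1, 10)
  y <- 1 + 2 * x + x * rnorm(n)
  sim <- data.frame(x = x, y = y)

  # 0.25 quantile regression, summarized with bootstrap standard errors.
  fit25 <- rq(y ~ x, tau = 0.25, data = sim)
  boot_summary <- summary(fit25, se = "boot", R = 200)
  print(boot_summary)
} else {
  message("quantreg is not installed; skipping the example")
}
```

Bootstrap results vary from run to run unless a seed is set, which is why `set.seed()` is called before fitting.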

9. Similarly, you can run the 0.75 quantile regression and get the output:

Qreg75=rq(Y~X, tau=0.75)

summary(Qreg75)
## 
## Call: rq(formula = Y ~ X, tau = 0.75)
## 
## tau: [1] 0.75
## 
## Coefficients:
##             coefficients lower bd upper bd
## (Intercept)  5.13640      1.18049  6.79399
## XLRV        -0.00233     -0.00246  0.13938
## XLRC         0.53063      0.41264  0.83651
## XINT        -0.05693     -0.10921 -0.05131

10. Test whether the coefficients of the 0.25 and 0.75 quantile regressions are different:

anova(Qreg25, Qreg75)
## Warning in summary.rq(x, se = se, covariance = TRUE): 8 non-positive fis
## Quantile Regression Analysis of Deviance Table
## 
## Model: Y ~ X
## Joint Test of Equality of Slopes: tau in {  0.25 0.75  }
## 
##   Df Resid Df F value  Pr(>F)   
## 1  3      129  4.3474 0.00593 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The null hypothesis is that the slope coefficients are equal across the two quantiles. With a p-value of 0.00593, it is rejected at the 1% level.

11. Run several quantile regressions simultaneously and get the outputs:

QR=rq(Y~X, tau=seq(0.2, 0.8, by=0.1))

sumQR=summary(QR)

12. Finally, plot the quantile regressions:

plot(sumQR)
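Besides the plot, the estimates can be inspected as a table with `coef()`, which for a multi-tau fit returns one column per quantile. The sketch below uses simulated data (not the tutorial's) in which the spread of y grows with x, so the slope should rise with tau. It is skipped if quantreg is not installed.

```r
if (requireNamespace("quantreg", quietly = TRUE)) {
  library(quantreg)

  # Simulated data: the tau-th conditional quantile of y is 1 + (2 + z_tau) * x,
  # where z_tau is the standard normal quantile, so the slope increases with tau.
  set.seed(123)
  n <- 500
  x <- runif(n, 1, 10)
  y <- 1 + 2 * x + x * rnorm(n)
  sim <- data.frame(x = x, y = y)

  taus <- seq(0.2, 0.8, by = 0.1)
  fits <- rq(y ~ x, tau = taus, data = sim)

  # Rows are model terms, columns are quantile levels.
  coef_by_tau <- coef(fits)
  print(round(coef_by_tau, 3))
} else {
  message("quantreg is not installed; skipping the example")
}
```

Reading across a row of this matrix gives the same information as one panel of the plot: how the effect of a regressor changes from the lower to the upper quantiles.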


Note:

* If you haven’t installed the package, run the following code first: install.packages("quantreg")