Introduction

Collinearity refers to the presence of a near-perfect linear relationship among variables. In econometrics, under multiple regression analysis, it is possible for regressors to be highly correlated; perfect multicollinearity is the limiting case of an exact linear relationship among two or more regressors. In such a case \(X^{'}X\) is singular and yields no inverse. This can be stated as \[\sum_{i=1}^k {\lambda_i X_i}=0\] where not all \(\lambda_i\) are zero. The \(X_i\) are then linear combinations of one another, either row- or column-wise, and \(\hat{\beta}={(X^{'}X)}^{-1} X^{'} Y\) does not exist. Thus, under perfect multicollinearity among any two or more regressors, estimation of the population parameters is impossible; a small numerical illustration appears after the variable list below.

Less-than-perfect multicollinearity still allows estimation of the population parameters, but such estimates are often imprecise and may bear the wrong sign. The estimated standard errors of such parameter estimates are wide or inflated, which leads to insignificant t-values alongside high R-square values (Gujarati, Porter, and Gunasekar 2012). This is because \({VCOV(\hat{\beta})}=\sigma^{2}{(X^{'}X)}^{-1}\): as \(X^{'}X\) approaches singularity, the elements of \({(X^{'}X)}^{-1}\) grow without bound and \({VCOV(\hat{\beta})}\) becomes arbitrarily large even with a finite variance of the residual random error term. Consequently, the calculated t-values shrink towards zero. Currently, most econometric and statistical programs remove the variables causing (multi)collinearity and then report the regression results. However, the user of a regression tool must be aware of the diagnostic procedure.

The next section outlines the diagnostic tools for detecting multicollinearity using the R software for statistical computing. Thereafter the paper outlines some corrective procedures for multicollinearity, followed by a conclusion. This paper uses the state.x77 dataset from the datasets package (United States Department Of Commerce. Bureau Of The Census 1984) in R (R Core Team 2021). This dataset contains data on 50 U.S. states on 8 parameters, namely,

  1. Population: population estimate as of July 1, 1975

  2. Income: per capita income (1974)

  3. Illiteracy: illiteracy (1970, percent of population)

  4. Life Exp: life expectancy in years (1969–71)

  5. Murder: murder and non-negligent manslaughter rate per 100,000 population (1976)

  6. HS Grad: percent of high-school graduates (1970)

  7. Frost: mean number of days with minimum temperature below freezing (1931–1960) in the capital or a large city

  8. Area: land area in square miles
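
As a minimal, hypothetical illustration of the singularity argument above (synthetic data generated only for this demonstration):

# x2 is an exact linear combination of x1, so X'X is singular
set.seed(1)
x1 <- rnorm(10)
x2 <- 2 * x1                   # perfect collinearity
X <- cbind(1, x1, x2)          # design matrix with intercept
det(crossprod(X))              # numerically zero
try(solve(crossprod(X)))       # inversion fails: X'X is singular
lm(rnorm(10) ~ x1 + x2)        # lm() reports NA for the redundant coefficient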

The following package and commands load the data and render it as an HTML table (Xie, Cheng, and Tan 2021):

# If the DT package is not already installed, install it with
# install.packages("DT")
library(DT)                      # load the DT package
datatable(datasets::state.x77)   # datatable() renders an HTML table

The hypothesis set here is that the murder rate per 100,000 population depends on population, income, illiteracy, life expectancy, percentage of high-school graduates, number of frost days and area of the state. The OLS regression is run with the lm command as demonstrated below:

states <- as.data.frame.array(state.x77)   # convert the state.x77 data from matrix to data frame
colnames(states) <- c("Population", "Income", "Illiteracy", "LifeExp", "Murder", "HSGrad", "Frost", "Area")   # assign syntactic column names for compatibility with the olsrr package
states.lm<-lm(Murder~.,data = states)
library(sjPlot)
tab_model(states.lm,show.fstat = TRUE)
Dependent variable: Murder

Predictors    Estimates   CI               p
(Intercept)   122.18      86.08 – 158.28   <0.001
Population    0.00        0.00 – 0.00      0.006
Income        -0.00       -0.00 – 0.00     0.782
Illiteracy    1.37        -0.31 – 3.05     0.106
LifeExp       -1.65       -2.17 – -1.14    <0.001
HSGrad        0.03        -0.08 – 0.15     0.575
Frost         -0.01       -0.03 – 0.00     0.089
Area          0.00        -0.00 – 0.00     0.124
Observations  50
R2 / R2 adjusted  0.808 / 0.776
anova(states.lm)
## Analysis of Variance Table
## 
## Response: Murder
##            Df  Sum Sq Mean Sq F value    Pr(>F)    
## Population  1  78.854  78.854 25.8674 8.049e-06 ***
## Income      1  63.507  63.507 20.8328 4.322e-05 ***
## Illiteracy  1 236.196 236.196 77.4817 4.380e-11 ***
## LifeExp     1 139.466 139.466 45.7506 3.166e-08 ***
## HSGrad      1   8.066   8.066  2.6460    0.1113    
## Frost       1   6.109   6.109  2.0039    0.1643    
## Area        1   7.514   7.514  2.4650    0.1239    
## Residuals  42 128.033   3.048                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From the above results it is evident that population and life expectancy are the significant factors affecting murder rates in the U.S. states. Before turning to the diagnostic tools, the zero-order correlation matrix is examined.

tab_corr(states)
             Population  Income     Illiteracy  LifeExp    Murder     HSGrad     Frost      Area
Population               0.208      0.108       -0.068     0.344*     -0.098     -0.332*    0.023
Income       0.208                  -0.437**    0.340*     -0.230     0.620***   0.226      0.363**
Illiteracy   0.108       -0.437**               -0.588***  0.703***   -0.657***  -0.672***  0.077
LifeExp      -0.068      0.340*     -0.588***              -0.781***  0.582***   0.262      -0.107
Murder       0.344*      -0.230     0.703***    -0.781***             -0.488***  -0.539***  0.228
HSGrad       -0.098      0.620***   -0.657***   0.582***   -0.488***             0.367**    0.334*
Frost        -0.332*     0.226      -0.672***   0.262      -0.539***  0.367**               0.059
Area         0.023       0.363**    0.077       -0.107     0.228      0.334*     0.059
Computed correlations use the Pearson method with listwise deletion.

High pairwise correlation among regressors is a sufficient but not a necessary indicator of multicollinearity; the absence of high pairwise correlations does not rule out multicollinearity. Therefore, the correlation matrix is not reported as part of the diagnostic tools. From the correlation table it is evident that life expectancy, a significant predictor of murder rates, is highly correlated with illiteracy, which in turn is highly correlated with the dependent variable. Illiteracy also has a moderate degree of correlation with Frost and HSGrad, apart from LifeExp. This suggests that Illiteracy or LifeExp is a variable to be examined further in the diagnostics. The diagnostic tools are demonstrated in the next section.

Diagnostic Tools

  1. Variance Inflation Factor (VIF): VIFs are the most commonly used tool to identify the regressors responsible for multicollinearity (Gujarati, Porter, and Gunasekar 2012). \[{VIF_j}=\frac{1}{1-R_j^{2}}={(X^{'}X)}^{-1}_{jj}\] where \(R_j^{2}\) is the coefficient of multiple determination from the auxiliary regression of regressor \(X_j\) on the rest of the regressors. If \(VIF_j \ge 10\), regressor \(X_j\) is of concern, and if \(4 \le VIF_j < 10\), the jth regressor needs to be investigated.

  2. Tolerance (TOL): It is the proportion of the variance of a regressor not explained by the other regressors, that is, one minus the coefficient of multiple determination from the auxiliary regression used in calculating VIF. \[Tolerance_j=1-R_j^{2}\] The logic here is that if a high degree of collinearity exists among the regressors, then the other regressors explain a large amount of the variation in \(X_j\), leading to a high \(R_j^{2}\) and a low tolerance. A manual computation for one regressor is sketched below, followed by the olsrr output.
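
As a check on the formulas above, the auxiliary regression for the Illiteracy regressor can be run by hand; the values should match the ols_vif_tol output that follows:

aux <- lm(Illiteracy ~ Population + Income + LifeExp + HSGrad + Frost + Area, data = states)   # auxiliary regression
r2_j <- summary(aux)$r.squared   # R-square of Illiteracy on the remaining regressors
1 - r2_j                         # tolerance, about 0.242
1 / (1 - r2_j)                   # VIF, about 4.14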

library(olsrr)
ols_vif_tol(states.lm)
##    Variables Tolerance      VIF
## 1 Population 0.7447732 1.342691
## 2     Income 0.5026655 1.989395
## 3 Illiteracy 0.2417821 4.135956
## 4    LifeExp 0.5259200 1.901430
## 5     HSGrad 0.2909280 3.437276
## 6      Frost 0.4213252 2.373463
## 7       Area 0.5914972 1.690625

From this table it can be observed that only Illiteracy has a VIF above 4 and the lowest tolerance value. Thus, the Illiteracy variable should be examined further.

  3. Condition Index (CI): The condition index is the square root of the ratio of the largest eigenvalue of the regressor matrix to the eigenvalue associated with a given principal component or dimension. Eigenvalues were chosen to construct the condition index because a singular matrix has zero eigenvalues; under exact multicollinearity the smallest eigenvalue is zero and the corresponding condition index becomes unbounded. Thus, high values of the condition index indicate the presence of multicollinearity (Chatterjee and Hadi 2006).

\[CI_j=\sqrt{\frac{\lambda_{max}}{\lambda_j}}\]

A condition index greater than 15 indicates the presence of multicollinearity and one greater than 30 indicates severe multicollinearity. Associated with the condition index is the variance-decomposition output, which splits the variance of the intercept and each regressor coefficient across the principal components. For each component whose condition index exceeds 15, one should look for variance proportions above 0.7 on at least two regressors; some literature suggests a variance-proportion threshold of 0.9. On the 7th component, Illiteracy and HSGrad have variance proportions above 0.6, which indicates that these two variables load very high on that principal component, meaning that they are highly correlated along that dimension.
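
The indices can also be formed by hand. Below is a minimal sketch using unit-length scaling of the columns of the design matrix; exact values may differ slightly from ols_eigen_cindex depending on the scaling convention used:

X <- model.matrix(states.lm)                                 # design matrix including intercept
X <- scale(X, center = FALSE, scale = sqrt(colSums(X^2)))    # scale columns to unit length
lambda <- eigen(crossprod(X))$values                         # eigenvalues of X'X
sqrt(max(lambda) / lambda)                                   # condition indices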

datatable(round(ols_eigen_cindex(states.lm), 2))

  4. Part and Partial Correlation Coefficients: It is known that the zero-order Pearson correlation coefficient does not reveal the full information required to declare the presence of multicollinearity. However, part and partial correlation coefficients can reveal useful information. The part correlation is the correlation between the dependent variable and an independent variable after the linear effect of the other independent variables has been removed from the independent variable only; the first-order partial correlation removes that effect from both the dependent and the independent variable.

ols_correlations(states.lm)
##                 Correlations                  
## ---------------------------------------------
## Variable      Zero Order    Partial     Part  
## ---------------------------------------------
## Population         0.344      0.409     0.196 
## Income            -0.230     -0.043    -0.019 
## Illiteracy         0.703      0.247     0.111 
## LifeExp           -0.781     -0.706    -0.436 
## HSGrad            -0.488      0.087     0.038 
## Frost             -0.539     -0.260    -0.118 
## Area               0.228      0.235     0.106 
## ---------------------------------------------

From the above results it is evident that only population and life expectancy have a high degree of relationship with murder rates when the linear effects of the other regressors are removed (partial correlation coefficients). The problematic variable is Illiteracy, on which the other regressors have a very strong effect; consequently its partial correlation coefficient is very low. The same holds for the Income, Frost and HSGrad variables. Part correlation coefficients also reveal, to some degree, the importance of a variable in the regression. Once again life expectancy and population emerge as the two most important regressors in the regression.
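
The partial correlations above can be reproduced from first principles as correlations between residuals. A minimal sketch for LifeExp follows; the value should be close to the -0.706 reported above:

e_y <- resid(lm(Murder ~ Population + Income + Illiteracy + HSGrad + Frost + Area, data = states))    # Murder purged of the other regressors
e_x <- resid(lm(LifeExp ~ Population + Income + Illiteracy + HSGrad + Frost + Area, data = states))   # LifeExp purged of the other regressors
cor(e_y, e_x)   # partial correlation of Murder and LifeExp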

  5. R2 from Auxiliary Regression: If the \(R_j^{2}\) from the auxiliary regression of \(X_j\) on the rest of the regressors is higher than the \(R^{2}\) from the main regression, then according to Klein's rule (Klein 1977), the collinearity is harmful. A minimal check for the Illiteracy regressor is sketched below.
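
Reusing the auxiliary-regression idea from the VIF section, Klein's rule can be checked in a few lines:

r2_main <- summary(states.lm)$r.squared   # R-square of the main regression
aux <- lm(Illiteracy ~ Population + Income + LifeExp + HSGrad + Frost + Area, data = states)
summary(aux)$r.squared > r2_main          # TRUE would flag harmful collinearity by Klein's rule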

  6. Farrar and Glauber Test: This test was developed by D.E. Farrar and R.R. Glauber (Farrar and Glauber 1967). It comprises three test statistics: a chi-square, an F and a t statistic. Farrar and Glauber developed the chi-square test for detecting the strength of the multicollinearity over the whole set of explanatory variables. The test is based on the fact that under perfect multicollinearity the simple correlation coefficients equal unity, so the determinant of the correlation matrix turns to zero. The chi-square test statistic is given by

\[\chi^{2}=-\left[n-1-\frac{1}{6}(2k+5)\right]{\log_e\Delta}\sim \chi^{2}_{\frac{1}{2}k(k-1)}\]

where \(\Delta\) is the determinant of the zero-order correlation matrix of the data matrix, \(n\) is the sample size and \(k\) is the number of explanatory variables. If the observed value of the chi-square test statistic is greater than the critical value of chi-square at the desired level of significance, we reject the assumption of orthogonality and accept the presence of multicollinearity in the model. If the observed value is less than the critical value, we accept that there is no problem of multicollinearity in the model.
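
A minimal hand computation of this statistic follows; it reproduces the Farrar chi-square reported by omcdiag() further down:

R <- cor(states[, c("Population", "Income", "Illiteracy", "LifeExp", "HSGrad", "Frost", "Area")])   # zero-order correlation matrix of the regressors
n <- nrow(states); k <- 7
-(n - 1 - (2 * k + 5) / 6) * log(det(R))   # about 140.54, matching the Farrar chi-square from omcdiag()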

The second test in the Farrar–Glauber battery is an F test (\(W_i\)) for the location of the multicollinearity. For this, the multiple correlation coefficients among the explanatory variables are computed and their statistical significance is tested using an F test. The test statistic is given as

\[F^{*}=\frac {{R^{2}_{X_i \cdot X_1,X_2,\ldots,X_{i-1},X_{i+1},\ldots,X_k}}/{(k-1)}}{{\left(1-R^{2}_{X_i \cdot X_1,X_2,\ldots,X_{i-1},X_{i+1},\ldots,X_k}\right)}/{(n-k)}}\sim F_{(k-1,\,n-k)}\]

The null hypothesis here is \(R^{2}_{X_i \cdot X_1,X_2,\ldots,X_{i-1},X_{i+1},\ldots,X_k}=0\) against the alternative \(R^{2}_{X_i \cdot X_1,X_2,\ldots,X_{i-1},X_{i+1},\ldots,X_k}\neq 0\).

If the observed value of F is greater than the theoretical value of F with the given degrees of freedom at the desired level of significance, we accept that the variable \(X_i\) is multicollinear. On the other hand, if the observed value of F is less than the theoretical value, we accept that the variable \(X_i\) is not multicollinear.

Finally, the Farrar–Glauber test concludes with a t-test for the pattern of multicollinearity, which aims at detecting the variables that cause the multicollinearity. The partial correlation coefficients among the explanatory variables are computed and their statistical significance is tested with the t-test. The null hypothesis is that the partial correlation coefficients are equal to zero.

\[t^{*}= \frac {{r_{X_i X_j \cdot X_1,X_2,\ldots,X_{i-1},X_{i+1},\ldots,X_{j-1},X_{j+1},\ldots,X_k}} \sqrt {n-k}}{\sqrt {1-r^{2}_{X_i X_j \cdot X_1,X_2,\ldots,X_{i-1},X_{i+1},\ldots,X_{j-1},X_{j+1},\ldots,X_k}}} \]

The above test statistic follows the t-distribution with \((n-k)\) degrees of freedom. Thus, if the computed value of the t-statistic is greater than the theoretical value of t with \((n-k)\) degrees of freedom at the desired level of significance, we accept that the variables \(X_i\) and \(X_j\) are responsible for the multicollinearity in the model; otherwise the variables are not the cause of multicollinearity, since their partial correlation coefficient is not statistically significant.

  7. Sum of the Inverse of Eigenvalues: Chatterjee and Hadi (2006) and Carlson, Dillon, and Goldstein (1986) suggested that a sum of the inverses of the eigenvalues of \({(X^{'}X)}\), or of its related correlation matrix, greater than or equal to five times the number of predictors indicates the presence of multicollinearity; see the one-line check below.
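
Reusing the regressor correlation matrix R from the Farrar–Glauber sketch above, the criterion can be checked directly; the value matches the "Sum of Lambda Inverse" reported by omcdiag() below:

sum(1 / eigen(R)$values)   # about 16.87, well below 5 * k = 35 here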

  8. Other Measures: Other measures include Theil's measure (Theil 1971), the determinant of the normalized correlation matrix of \({(X^{'}X)}\) (Asteriou and Hall 2007), the Red indicator (Kovàcs, Petres, and Tóth 2006), the Leamer index (Greene 2003), the corrected VIF (CVIF) (Curto and Pinto 2010), Klein's method (Klein 1977), and the IND1 & IND2 measures (Imdad Ullah, Altaf, and Ahmed 2019). Imdad Ullah, Altaf, and Ahmed (2019) provide an excellent summary of all these measures.

library(mctest)
omcdiag(states.lm)
## 
## Call:
## omcdiag(mod = states.lm)
## 
## 
## Overall Multicollinearity Diagnostics
## 
##                        MC Results detection
## Determinant |X'X|:         0.0466         0
## Farrar Chi-Square:       140.5382         1
## Red Indicator:             0.3753         0
## Sum of Lambda Inverse:    16.8708         0
## Theil's Method:           -1.1685         0
## Condition Number:        262.9508         1
## 
## 1 --> COLLINEARITY is detected by the test 
## 0 --> COLLINEARITY is not detected by the test
imcdiag(states.lm)
## 
## Call:
## imcdiag(mod = states.lm)
## 
## 
## All Individual Multicollinearity Diagnostics Result
## 
##               VIF    TOL      Wi      Fi Leamer    CVIF Klein   IND1   IND2
## Population 1.3427 0.7448  2.4559  3.0157 0.8630 -0.3009     0 0.1039 0.4853
## Income     1.9894 0.5027  7.0907  8.7067 0.7090 -0.4458     0 0.0701 0.9457
## Illiteracy 4.1360 0.2418 22.4743 27.5964 0.4917 -0.9269     0 0.0337 1.4418
## LifeExp    1.9014 0.5259  6.4602  7.9326 0.7252 -0.4261     0 0.0734 0.9015
## HSGrad     3.4373 0.2909 17.4671 21.4480 0.5394 -0.7703     0 0.0406 1.3484
## Frost      2.3735 0.4213  9.8432 12.0865 0.6491 -0.5319     0 0.0588 1.1004
## Area       1.6906 0.5915  4.9495  6.0775 0.7691 -0.3789     0 0.0825 0.7768
## 
## 1 --> COLLINEARITY is detected by the test 
## 0 --> COLLINEARITY is not detected by the test
## 
## Income , Illiteracy , HSGrad , Frost , Area , coefficient(s) are non-significant may be due to multicollinearity
## 
## R-square of y on all x: 0.8083 
## 
## * use method argument to check which regressors may be the reason of collinearity
## ===================================

Corrective Measures

Corrective steps to ameliorate the problem of multicollinearity can be any of the following:

  1. Do nothing: If the extent of multicollinearity is not severe, then one can safely ignore it, i.e. tolerate it.

  2. Remove the independent variable causing the problem: Once the tools and measures discussed above indicate that one or more regressors are the cause of multicollinearity, one can consider removing them from the regression equation. In the regression states.lm considered above, Illiteracy was the regressor found to have a moderate level of VIF. The following results show the regression without it.

states.lm1<-lm(Murder~Population+Income+LifeExp+HSGrad+Frost+Area,data = states)
tab_model(states.lm1)
Dependent variable: Murder

Predictors    Estimates   CI                p
(Intercept)   134.35      100.84 – 167.86   <0.001
Population    0.00        0.00 – 0.00       0.013
Income        -0.00       -0.00 – 0.00      0.653
LifeExp       -1.75       -2.27 – -1.24     <0.001
HSGrad        -0.01       -0.12 – 0.09      0.812
Frost         -0.02       -0.03 – -0.01     0.001
Area          0.00        0.00 – 0.00       0.020
Observations  50
R2 / R2 adjusted  0.796 / 0.767
ols_vif_tol(states.lm1)
##    Variables Tolerance      VIF
## 1 Population 0.7701675 1.298419
## 2     Income 0.5088160 1.965347
## 3    LifeExp 0.5561071 1.798215
## 4     HSGrad 0.3746059 2.669472
## 5      Frost 0.7589571 1.317598
## 6       Area 0.7130561 1.402414

The results show that after removing the Illiteracy variable, no other variable has a VIF greater than 4. Earlier only two regressors, life expectancy and population, were significant, but now Frost and Area are significant too. Thus, removing Illiteracy improved the regression output.

  3. Change the model specification: One reason for multicollinearity can be misspecification bias. The addition of unnecessary regressors or polynomial terms, or a wrong functional form, can lead to model misspecification. In some cases models are estimated under restrictions such as homogeneity and symmetry. In the states.lm model considered here, income, murder rates and the percentage of high-school graduates are all measured per unit of population, so it is possible to replace Population and Area by population density.

states$population_density=states$Population/states$Area
states.lm2<-lm(Murder~population_density+Income+Illiteracy+LifeExp+HSGrad+Frost,data = states)
ols_vif_tol(states.lm2)
##            Variables Tolerance      VIF
## 1 population_density 0.7115184 1.405445
## 2             Income 0.4524071 2.210398
## 3         Illiteracy 0.3000624 3.332640
## 4            LifeExp 0.5340291 1.872557
## 5             HSGrad 0.3263090 3.064580
## 6              Frost 0.5153251 1.940523

Now it can be observed that the coefficient of the regressor Income, which hitherto was not significant, is significant in this specification, and none of the variables involved has a VIF greater than 4.

  4. Add more observations: Multicollinearity is more often observed in small cross-section datasets and in time-series data. In a small dataset, non-random sampling or sampling from one subset of the population can leave the regressors highly correlated, causing multicollinearity. Adding observations sampled from different subsets of the population would reduce the problem. In the time-series case augmenting the data is not possible, but one can remove the lags of the dependent and independent variables that are causing the multicollinearity.

  5. Principal Component Regression: When multicollinearity is not tolerable and the researcher wants to keep all the regressors because of their importance or theoretical underpinnings, it is possible to extract principal components from the regressors, use the components instead of the original regressors, and later recover the coefficients of the original regressors from the principal-component regression, as sketched below.
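
A minimal base-R sketch of this idea follows; the choice of four components is purely illustrative (in practice it would be guided by the eigenvalues or by cross-validation), and the recovered coefficients are in standardized-regressor units:

Xs <- scale(states[, c("Population", "Income", "Illiteracy", "LifeExp", "HSGrad", "Frost", "Area")])   # standardize the regressors
pc <- prcomp(Xs)                             # principal components of the regressors
pcr.fit <- lm(states$Murder ~ pc$x[, 1:4])   # regress Murder on the first four components
beta_pc <- coef(pcr.fit)[-1]                 # component coefficients (drop the intercept)
pc$rotation[, 1:4] %*% beta_pc               # map back to coefficients of the standardized regressors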

Acknowledgement

This article was prepared using the Rmarkdown package (Xie, Dervieux, and Riederer 2020; Xie, Allaire, and Grolemund 2018; Allaire et al. 2021) in R. To analyze data and report results, the DT package (Xie, Cheng, and Tan 2021), stats package (R Core Team 2021), olsrr package (Hebbali 2020a), mctest package (Imdadullah, Aslam, and Altaf 2016; Imdad and Aslam 2020), and sjPlot package (Lüdecke 2021) were used. Mathematics support in Rmarkdown was obtained from Pruim (2016). R code support and contents were derived from Ghosh (2017), Hebbali (2020b) and Thondamallu, Sagar, and Veetil (2018).

References

Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2021. Rmarkdown: Dynamic Documents for r. https://github.com/rstudio/rmarkdown.
Asteriou, D., and S. G. Hall. 2007. Applied Econometrics: A Modern Approach Using Eviews and Microfit Revised Edition. Palgrave Macmillan. https://books.google.co.in/books?id=e4x9QgAACAAJ.
Carlson, James, William R. Dillon, and Matthew Goldstein. 1986. “Multivariate Analysis Methods and Applications.” Journal of the American Statistical Association 81 (395): 863. https://doi.org/10.2307/2289033.
Chatterjee, S., and A.S. Hadi. 2006. “Analysis of Collinear Data.” In, 221–58. John Wiley & Sons, Inc. https://doi.org/10.1002/0470055464.ch9.
Curto, José Dias, and José Castro Pinto. 2010. “The Corrected VIF (CVIF).” Journal of Applied Statistics 38 (7): 1499–1507. https://doi.org/10.1080/02664763.2010.505956.
Farrar, Donald E., and Robert R. Glauber. 1967. “Multicollinearity in Regression Analysis: The Problem Revisited.” The Review of Economics and Statistics 49 (1): 92. https://doi.org/10.2307/1937887.
Ghosh, Bidyut. 2017. “Multicollinearity in r | DataScience+.” https://datascienceplus.com/multicollinearity-in-r/.
Greene, W. H. 2003. Econometric Analysis. Prentice Hall. https://books.google.co.in/books?id=JJkWAQAAMAAJ.
Gujarati, D. N., D. C. Porter, and S. Gunasekar. 2012. Basic Econometrics. McGraw-Hill Education (India) Private Limited. https://books.google.co.in/books?id=WcCjAgAAQBAJ.
Hebbali, Aravind. 2020a. Olsrr: Tools for Building OLS Regression Models. https://CRAN.R-project.org/package=olsrr.
———. 2020b. “Collinearity Diagnostics, Model Fit & Variable Contribution.” https://cran.r-project.org/web/packages/olsrr/vignettes/regression_diagnostics.html.
Imdad, M. U., and M. Aslam. 2020. mctest: Multicollinearity Diagnostic Measures. https://CRAN.R-project.org/package=mctest.
Imdad Ullah, Muhammad, Saima Altaf, and Munir Ahmed. 2019. “Some New Diagnostics of Multicollinearity in Linear Regression Model.” Sains Malaysiana 48 (September): 2051–60. https://doi.org/10.17576/jsm-2019-4809-26.
Imdadullah, M., M. Aslam, and S. Altaf. 2016. “Mctest: An r Package for Detection of Collinearity Among Regressors.” The R Journal 8 (2): 499–509. https://journal.r-project.org/archive/2016/RJ-2016-062/index.html.
Klein, L. R. 1977. An Introduction to Econometrics. Greenwood Press. https://books.google.co.in/books?id=JiO7AAAAIAAJ.
Kovàcs, Péter, Tibor Petres, and László Tóth. 2006. “A New Measure of Multicollinearity in Linear Regression Models.” International Statistical Review 73 (3): 405–12. https://doi.org/10.1111/j.1751-5823.2005.tb00156.x.
Lüdecke, Daniel. 2021. sjPlot: Data Visualization for Statistics in Social Science. https://CRAN.R-project.org/package=sjPlot.
Pruim, R. 2016. “Mathematics in r Markdown.” https://rpruim.github.io/s341/S19/from-class/MathinRmd.html.
R Core Team. 2021. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Theil, H. 1971. Principles of Econometrics. New York: Wiley. https://books.google.co.in/books?id=X6W2AAAAIAAJ.
Thondamallu, Jyothirmayee, Chaitanya Sagar, and Saneesh Veetil. 2018. “Dealing with the Problem of Multicollinearity in r | r-Bloggers.” https://www.r-bloggers.com/2018/08/dealing-with-the-problem-of-multicollinearity-in-r/.
United States Department Of Commerce. Bureau Of The Census. 1984. “County and City Data Book, 1977.” ICPSR - Interuniversity Consortium for Political; Social Research. https://doi.org/10.3886/ICPSR07697.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown.
Xie, Yihui, Joe Cheng, and Xianying Tan. 2021. DT: A Wrapper of the JavaScript Library ’DataTables’. https://CRAN.R-project.org/package=DT.
Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook.