Variable Selection

Variable selection procedures in R using the ‘olsrr’ package. To complement what is covered in this tutorial, watch the video by Mike Crowson, Ph.D. (January 14, 2020).

Data file

Download the .RData file here: https://drive.google.com/open?id=1_cRuM2-4dNhNxVPzgj8ddlpcrV-TAUZ2

The data are contained in a data frame named ‘regdata’. Package information can be found here: https://cran.r-project.org/web/packages/olsrr/olsrr.pdf
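The download above is an .RData file, while the import step further down reads a CSV. A minimal sketch of a loader that accepts either, assuming the object stored in the .RData file is named ‘regdata’ as stated above. The helper name load_regdata and the file names in the usage comments are hypothetical:

```r
# Hypothetical helper: return the 'regdata' data frame from either an
# .RData/.rda file or a CSV file. File names are assumptions.
load_regdata <- function(path) {
  if (grepl("\\.(RData|rda)$", path, ignore.case = TRUE)) {
    env <- new.env()
    load(path, envir = env)          # load into a scratch environment
    get("regdata", envir = env)      # object name stated in the tutorial
  } else {
    read.csv(path)
  }
}

# Usage (assumed file names):
# regdata <- load_regdata("regdata.RData")
# regdata <- load_regdata("regdata.csv")
```

Loading into a scratch environment keeps the .RData file from silently overwriting other objects in the workspace.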

Note

If you haven’t already done so, install the package first with install.packages("olsrr").

To use the package, load it with the ‘library’ function.

library(olsrr)

Import the data and view the first few observations

regdata <- read.csv("regdata.csv") 
head(regdata,5)
  id perfgoal achieve  mastery interest  anxiety genderid masteryLMH
1  1 32.00000   6.125 5.714286      6.0 1.666667        1          3
2  2 32.25655   1.625 1.428571      4.0 6.333333        1          1
3  3 37.88265   4.500 1.285714      2.0 3.666667        1          1
4  4 58.09477   2.375 2.285714      4.0 3.666667        1          2
5  5 26.73999   5.125 4.571429      5.5 3.666667        0          3
  perfgoalLMH interestMS
1           2          2
2           2          2
3           2          1
4           3          2
5           2          2
str(regdata)
'data.frame':   140 obs. of  10 variables:
 $ id         : int  1 2 3 4 5 6 7 8 9 10 ...
 $ perfgoal   : num  32 32.3 37.9 58.1 26.7 ...
 $ achieve    : num  6.12 1.62 4.5 2.38 5.12 ...
 $ mastery    : num  5.71 1.43 1.29 2.29 4.57 ...
 $ interest   : num  6 4 2 4 5.5 4 4 5 4.5 4 ...
 $ anxiety    : num  1.67 6.33 3.67 3.67 3.67 ...
 $ genderid   : int  1 1 1 1 0 1 1 1 1 1 ...
 $ masteryLMH : int  3 1 1 2 3 2 2 2 2 2 ...
 $ perfgoalLMH: int  2 2 2 3 2 3 1 2 2 2 ...
 $ interestMS : int  2 2 1 2 2 2 2 2 2 2 ...

Forward regression using p-values

model<-lm(achieve~mastery+interest+anxiety+perfgoal+genderid,
data=regdata)
FWDfit.p<-ols_step_forward_p(model,penter=.05)

Printing the object gives a short summary of the model at each step

FWDfit.p

                            Selection Summary                             
-------------------------------------------------------------------------
        Variable                  Adj.                                       
Step    Entered     R-Square    R-Square     C(p)        AIC        RMSE     
-------------------------------------------------------------------------
   1    mastery       0.3304      0.3255    16.5745    414.0549    1.0467    
   2    interest      0.3846      0.3756     6.2151    404.2283    1.0070    
   3    perfgoal      0.4044      0.3912     3.7171    401.6636    0.9944    
-------------------------------------------------------------------------
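To make the selection rule concrete, here is a minimal base R sketch of forward selection by p-value. It is not the olsrr implementation: the function name forward_p is hypothetical, it assumes every candidate is a numeric predictor with a single coefficient, and it uses the built-in mtcars data as a stand-in because regdata does not ship with R. The threshold argument is named penter only to mirror the call above.

```r
# Sketch of forward selection by p-value (not the olsrr implementation).
# Assumes every candidate is a numeric predictor with one coefficient.
forward_p <- function(data, response, candidates, penter = 0.05) {
  selected <- character(0)
  repeat {
    remaining <- setdiff(candidates, selected)
    if (length(remaining) == 0) break
    # p-value of each remaining candidate when added to the current model
    pvals <- sapply(remaining, function(v) {
      fit <- lm(reformulate(c(selected, v), response = response), data = data)
      coef(summary(fit))[v, "Pr(>|t|)"]
    })
    best <- names(pvals)[which.min(pvals)]
    if (pvals[[best]] >= penter) break   # nothing left meets the entry threshold
    selected <- c(selected, best)
  }
  selected
}

# Stand-in data: the built-in mtcars set, since regdata is not bundled with R
forward_p(mtcars, "mpg", c("wt", "hp", "qsec", "drat"))
```

At each step the candidate with the smallest p-value enters, and the procedure stops as soon as no remaining candidate falls below the threshold.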

Forward regression using AIC

model<-lm(achieve~mastery+interest+anxiety+perfgoal+genderid,
data=regdata)
FWDfit.aic<-ols_step_forward_aic(model)

Printing the object gives a short summary of the model at each step

FWDfit.aic

                        Selection Summary                         
-----------------------------------------------------------------
Variable       AIC      Sum Sq      RSS       R-Sq      Adj. R-Sq 
-----------------------------------------------------------------
mastery      414.055    74.585    151.176    0.33037      0.32552 
interest     404.228    86.831    138.930    0.38462      0.37563 
perfgoal     401.664    91.288    134.473    0.40436      0.39122 
-----------------------------------------------------------------
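The AIC column can be reproduced from the RSS column. A small check, assuming the AIC shown follows the same Gaussian log-likelihood definition as R's AIC() for an lm fit (the error variance counted as one extra parameter), with n = 140 observations taken from the str() output:

```r
# Reproduce the AIC of the one-predictor model (mastery) from its RSS.
n   <- 140       # observations in regdata
rss <- 151.176   # RSS for achieve ~ mastery, from the table
k   <- 2         # estimated coefficients: intercept + mastery

# -2 * log-likelihood of a Gaussian linear model, plus 2 * (k + 1)
aic <- n * (log(2 * pi) + log(rss / n) + 1) + 2 * (k + 1)
round(aic, 2)    # agrees with the table's 414.055 up to rounding of the RSS
```

Because the first term depends only on RSS and n, a predictor lowers the AIC only when the drop in RSS outweighs the penalty of 2 for the extra coefficient.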

This plots the AIC at each step as predictors enter the model

plot(FWDfit.aic)

If you want the intermediate steps, set the ‘details’ argument to TRUE

FWDfit.aic<-ols_step_forward_aic(model,details=TRUE)
Forward Selection Method 
------------------------

Candidate Terms: 

1 . mastery 
2 . interest 
3 . anxiety 
4 . perfgoal 
5 . genderid 

 Step 0: AIC = 468.1994 
 achieve ~ 1 

---------------------------------------------------------------------
Variable     DF      AIC      Sum Sq      RSS      R-Sq     Adj. R-Sq 
---------------------------------------------------------------------
mastery       1    414.055    74.585    151.176    0.330        0.326 
interest      1    423.180    64.403    161.357    0.285        0.280 
perfgoal      1    458.101    18.691    207.070    0.083        0.076 
anxiety       1    462.003    12.838    212.923    0.057        0.050 
genderid      1    468.566     2.618    223.143    0.012        0.004 
---------------------------------------------------------------------


- mastery 


 Step 1 : AIC = 414.0549 
 achieve ~ mastery 

---------------------------------------------------------------------
Variable     DF      AIC      Sum Sq      RSS      R-Sq     Adj. R-Sq 
---------------------------------------------------------------------
interest      1    404.228    12.246    138.930    0.385        0.376 
perfgoal      1    410.577     5.800    145.375    0.356        0.347 
genderid      1    415.082     1.047    150.129    0.335        0.325 
anxiety       1    415.583     0.509    150.667    0.333        0.323 
---------------------------------------------------------------------

- interest 


 Step 2 : AIC = 404.2283 
 achieve ~ mastery + interest 

---------------------------------------------------------------------
Variable     DF      AIC      Sum Sq      RSS      R-Sq     Adj. R-Sq 
---------------------------------------------------------------------
perfgoal      1    401.664     4.457    134.473    0.404        0.391 
genderid      1    405.265     0.953    137.977    0.389        0.375 
anxiety       1    405.671     0.552    138.378    0.387        0.374 
---------------------------------------------------------------------

- perfgoal 


 Step 3 : AIC = 401.6636 
 achieve ~ mastery + interest + perfgoal 

---------------------------------------------------------------------
Variable     DF      AIC      Sum Sq      RSS      R-Sq     Adj. R-Sq 
---------------------------------------------------------------------
genderid      1    402.058     1.533    132.939    0.411        0.394 
anxiety       1    403.224     0.421    134.052    0.406        0.389 
---------------------------------------------------------------------


No more variables to be added.

Variables Entered: 

- mastery 
- interest 
- perfgoal 


Final Model Output 
------------------

                        Model Summary                          
--------------------------------------------------------------
R                       0.636       RMSE                0.994 
R-Squared               0.404       Coef. Var          29.714 
Adj. R-Squared          0.391       MSE                 0.989 
Pred R-Squared          0.355       MAE                 0.786 
--------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                               ANOVA                                 
--------------------------------------------------------------------
               Sum of                                               
              Squares         DF    Mean Square      F         Sig. 
--------------------------------------------------------------------
Regression     91.288          3         30.429    30.775    0.0000 
Residual      134.473        136          0.989                     
Total         225.761        139                                    
--------------------------------------------------------------------

                                  Parameter Estimates                                    
----------------------------------------------------------------------------------------
      model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper 
----------------------------------------------------------------------------------------
(Intercept)     2.002         0.290                  6.900    0.000     1.428     2.576 
    mastery     0.340         0.077        0.373     4.437    0.000     0.188     0.491 
   interest     0.199         0.060        0.278     3.321    0.001     0.081     0.318 
   perfgoal    -0.009         0.004       -0.145    -2.123    0.036    -0.018    -0.001 
----------------------------------------------------------------------------------------
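The entries in the Model Summary and ANOVA tables are linked by simple arithmetic. A short check using only numbers printed above (n = 140 observations, p = 3 predictors in the final model):

```r
# Sums of squares from the ANOVA table of the final model
ss_reg <- 91.288
ss_res <- 134.473
ss_tot <- 225.761
n <- 140   # observations
p <- 3     # predictors: mastery, interest, perfgoal

r2     <- ss_reg / ss_tot                           # R-Squared
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)      # Adj. R-Squared
f_stat <- (ss_reg / p) / (ss_res / (n - p - 1))     # F = MS regression / MS residual
rmse   <- sqrt(ss_res / (n - p - 1))                # RMSE = sqrt(MSE)

round(c(r2 = r2, adj_r2 = adj_r2, F = f_stat, rmse = rmse), 3)
```

The results agree with the printed R-Squared, Adj. R-Squared, F and RMSE to the precision shown in the tables.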

Backward regression using p-values

model<-lm(achieve~mastery+interest+anxiety+perfgoal+genderid,
data=regdata)
BWDfit.p<-ols_step_backward_p(model,prem=.05)
BWDfit.p

                          Elimination Summary                            
------------------------------------------------------------------------
        Variable                  Adj.                                      
Step    Removed     R-Square    R-Square     C(p)       AIC        RMSE     
------------------------------------------------------------------------
   1    anxiety       0.4111      0.3937    4.1695    402.0579    0.9923    
   2    genderid      0.4044      0.3912    3.7171    401.6636    0.9944    
------------------------------------------------------------------------
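The elimination rule can be sketched in base R as well. This is an illustration, not the olsrr implementation: the function name backward_p is hypothetical, each candidate is assumed to be a numeric predictor with a single coefficient, and the built-in mtcars data stands in for regdata. The argument is named prem only to mirror the call above.

```r
# Sketch of backward elimination by p-value (not the olsrr implementation).
# Assumes every candidate is a numeric predictor with one coefficient.
backward_p <- function(data, response, candidates, prem = 0.05) {
  kept <- candidates
  while (length(kept) > 0) {
    fit <- lm(reformulate(kept, response = response), data = data)
    pvals <- coef(summary(fit))[kept, "Pr(>|t|)"]
    names(pvals) <- kept                  # keep names even when one term is left
    worst <- which.max(pvals)
    if (pvals[worst] < prem) break        # every remaining term passes the threshold
    kept <- kept[-worst]                  # drop the least significant term and refit
  }
  kept
}

# Stand-in data: the built-in mtcars set, since regdata is not bundled with R
backward_p(mtcars, "mpg", c("wt", "hp", "qsec", "drat"))
```

Only one term is dropped per pass, because the p-values of the remaining terms change once the model is refitted.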

Backward regression using AIC

BWDfit.aic<-ols_step_backward_aic(model)
BWDfit.aic

                    Backward Elimination Summary                    
------------------------------------------------------------------
Variable        AIC        RSS      Sum Sq     R-Sq      Adj. R-Sq 
------------------------------------------------------------------
Full Model    403.881    132.772    92.989    0.41189      0.38995 
anxiety       402.058    132.939    92.821    0.41115      0.39370 
genderid      401.664    134.473    91.288    0.40436      0.39122 
------------------------------------------------------------------
plot(BWDfit.aic)

If you want the intermediate steps, set the ‘details’ argument to TRUE

BWDfit.aic<-ols_step_backward_aic(model,details=TRUE)
Backward Elimination Method 
---------------------------

Candidate Terms: 

1 . mastery 
2 . interest 
3 . anxiety 
4 . perfgoal 
5 . genderid 

 Step 0: AIC = 403.881 
 achieve ~ mastery + interest + anxiety + perfgoal + genderid 

---------------------------------------------------------------------
Variable     DF      AIC      Sum Sq      RSS      R-Sq     Adj. R-Sq 
---------------------------------------------------------------------
anxiety      1     402.058     0.168    132.939    0.411        0.394 
genderid     1     403.224     1.280    134.052    0.406        0.389 
perfgoal     1     406.940     4.886    137.657    0.390        0.372 
interest     1     412.775    10.745    143.516    0.364        0.345 
mastery      1     418.293    16.514    149.285    0.339        0.319 
---------------------------------------------------------------------


Variables Removed: 

- anxiety 


  Step 1 : AIC = 402.0579 
 achieve ~ mastery + interest + perfgoal + genderid 

---------------------------------------------------------------------
Variable     DF      AIC      Sum Sq      RSS      R-Sq     Adj. R-Sq 
---------------------------------------------------------------------
genderid     1     401.664     1.533    134.473    0.404        0.391 
perfgoal     1     405.265     5.037    137.977    0.389        0.375 
interest     1     410.897    10.701    143.640    0.364        0.350 
mastery      1     418.493    18.710    151.649    0.328        0.313 
---------------------------------------------------------------------

- genderid 


  Step 2 : AIC = 401.6636 
 achieve ~ mastery + interest + perfgoal 

---------------------------------------------------------------------
Variable     DF      AIC      Sum Sq      RSS      R-Sq     Adj. R-Sq 
---------------------------------------------------------------------
perfgoal     1     404.228     4.457    138.930    0.385        0.376 
interest     1     410.577    10.902    145.375    0.356        0.347 
mastery      1     418.590    19.466    153.939    0.318        0.308 
---------------------------------------------------------------------


No more variables to be removed.

Variables Removed: 

- anxiety 
- genderid 


Final Model Output 
------------------

                        Model Summary                          
--------------------------------------------------------------
R                       0.636       RMSE                0.994 
R-Squared               0.404       Coef. Var          29.714 
Adj. R-Squared          0.391       MSE                 0.989 
Pred R-Squared          0.355       MAE                 0.786 
--------------------------------------------------------------
 RMSE: Root Mean Square Error 
 MSE: Mean Square Error 
 MAE: Mean Absolute Error 

                               ANOVA                                 
--------------------------------------------------------------------
               Sum of                                               
              Squares         DF    Mean Square      F         Sig. 
--------------------------------------------------------------------
Regression     91.288          3         30.429    30.775    0.0000 
Residual      134.473        136          0.989                     
Total         225.761        139                                    
--------------------------------------------------------------------

                                  Parameter Estimates                                    
----------------------------------------------------------------------------------------
      model      Beta    Std. Error    Std. Beta      t        Sig      lower     upper 
----------------------------------------------------------------------------------------
(Intercept)     2.002         0.290                  6.900    0.000     1.428     2.576 
    mastery     0.340         0.077        0.373     4.437    0.000     0.188     0.491 
   interest     0.199         0.060        0.278     3.321    0.001     0.081     0.318 
   perfgoal    -0.009         0.004       -0.145    -2.123    0.036    -0.018    -0.001 
----------------------------------------------------------------------------------------

Stepwise regression using p-values

model<-lm(achieve~mastery+interest+anxiety+perfgoal+genderid,
data=regdata)
Bothfit.p<-ols_step_both_p(model,pent=.05,prem=.05)
Bothfit.p

                             Stepwise Selection Summary                               
-------------------------------------------------------------------------------------
                     Added/                   Adj.                                       
Step    Variable    Removed     R-Square    R-Square     C(p)        AIC        RMSE     
-------------------------------------------------------------------------------------
   1    mastery     addition       0.330       0.326    16.5750    414.0549    1.0467    
   2    interest    addition       0.385       0.376     6.2150    404.2283    1.0070    
   3    perfgoal    addition       0.404       0.391     3.7170    401.6636    0.9944    
-------------------------------------------------------------------------------------

Stepwise regression using AIC

model<-lm(achieve~mastery+interest+anxiety+perfgoal+genderid, data=regdata)
Bothfit.aic<-ols_step_both_aic(model)
Bothfit.aic

                               Stepwise Summary                               
----------------------------------------------------------------------------
Variable     Method       AIC        RSS      Sum Sq     R-Sq      Adj. R-Sq 
----------------------------------------------------------------------------
mastery     addition    414.055    151.176    74.585    0.33037      0.32552 
interest    addition    404.228    138.930    86.831    0.38462      0.37563 
perfgoal    addition    401.664    134.473    91.288    0.40436      0.39122 
----------------------------------------------------------------------------
plot(Bothfit.aic)
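For comparison, base R offers AIC-based stepwise selection through step(). The sketch below uses the built-in mtcars data as a stand-in for regdata, so the variables are not those of the tutorial. Note that step() reports AIC on a different scale than the tables above (it omits an additive constant for a given sample size), so its values are not expected to match olsrr's, although the ranking of models on the same data is unaffected.

```r
# Base R analogue of stepwise AIC selection; mtcars stands in for regdata.
null_fit <- lm(mpg ~ 1, data = mtcars)

both <- step(null_fit,
             scope = list(lower = ~ 1, upper = ~ wt + hp + qsec + drat),
             direction = "both",   # "forward" or "backward" are also accepted
             trace = 0)            # suppress the step-by-step printout

formula(both)   # model chosen by stepwise AIC
both$anova      # record of the terms added or removed at each step
```

For a purely backward search, start step() from the full model instead of the intercept-only model.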

All possible subsets regression

model<-lm(achieve~mastery+interest+anxiety+perfgoal+genderid,
data=regdata)
modcompare<-ols_step_all_possible(model)
modcompare
   Index N                                 Predictors   R-Square Adj. R-Square
1      1 1                                    mastery 0.33037183    0.32551945
2      2 1                                   interest 0.28527225    0.28009306
4      3 1                                   perfgoal 0.08279112    0.07614468
3      4 1                                    anxiety 0.05686591    0.05003161
5      5 1                                   genderid 0.01159687    0.00443453
6      6 2                           mastery interest 0.38461546    0.37563174
8      7 2                           mastery perfgoal 0.35606477    0.34666426
9      8 2                           mastery genderid 0.33500869    0.32530079
7      9 2                            mastery anxiety 0.33262424    0.32288153
11    10 2                          interest perfgoal 0.31813350    0.30817925
10    11 2                           interest anxiety 0.30354680    0.29337960
12    12 2                          interest genderid 0.29162154    0.28128025
13    13 2                           anxiety perfgoal 0.12526400    0.11249413
15    14 2                          perfgoal genderid 0.10151651    0.08839996
14    15 2                           anxiety genderid 0.06049667    0.04678130
17    16 3                  mastery interest perfgoal 0.40435677    0.39121758
18    17 3                  mastery interest genderid 0.38883726    0.37535573
16    18 3                   mastery interest anxiety 0.38705972    0.37353898
21    19 3                  mastery perfgoal genderid 0.36374981    0.34971487
19    20 3                   mastery anxiety perfgoal 0.35770200    0.34353366
20    21 3                   mastery anxiety genderid 0.33623356    0.32159166
22    22 3                  interest anxiety perfgoal 0.33280150    0.31808389
24    23 3                 interest perfgoal genderid 0.32827467    0.31345720
23    24 3                  interest anxiety genderid 0.30646123    0.29116258
25    25 3                  anxiety perfgoal genderid 0.13411781    0.11501747
28    26 4         mastery interest perfgoal genderid 0.41114901    0.39370157
26    27 4          mastery interest anxiety perfgoal 0.40622190    0.38862848
27    28 4          mastery interest anxiety genderid 0.39025253    0.37218594
29    29 4          mastery anxiety perfgoal genderid 0.36429984    0.34546428
30    30 4         interest anxiety perfgoal genderid 0.33874576    0.31915304
31    31 5 mastery interest anxiety perfgoal genderid 0.41189279    0.38994849
   Mallow's Cp
1    16.574518
2    26.850439
4    72.985686
3    78.892735
5    89.207268
6     6.215129
8    12.720391
9    17.518010
7    18.061308
11   21.363016
10   24.686590
12   27.403756
13   65.308257
15   70.719113
14   80.065467
17    3.717079
18    7.253192
16    7.658203
21   12.969359
19   14.347349
20   19.238924
22   20.020918
24   21.052355
23   26.022538
25   65.290920
28    4.169470
26    5.292109
27    8.930723
29   14.844035
30   20.666521
31    6.000000
as.data.frame(modcompare)
   mindex n                                 predictors    rsquare       adjr
1       1 1                                    mastery 0.33037183 0.32551945
2       2 1                                   interest 0.28527225 0.28009306
4       3 1                                   perfgoal 0.08279112 0.07614468
3       4 1                                    anxiety 0.05686591 0.05003161
5       5 1                                   genderid 0.01159687 0.00443453
6       6 2                           mastery interest 0.38461546 0.37563174
8       7 2                           mastery perfgoal 0.35606477 0.34666426
9       8 2                           mastery genderid 0.33500869 0.32530079
7       9 2                            mastery anxiety 0.33262424 0.32288153
11     10 2                          interest perfgoal 0.31813350 0.30817925
10     11 2                           interest anxiety 0.30354680 0.29337960
12     12 2                          interest genderid 0.29162154 0.28128025
13     13 2                           anxiety perfgoal 0.12526400 0.11249413
15     14 2                          perfgoal genderid 0.10151651 0.08839996
14     15 2                           anxiety genderid 0.06049667 0.04678130
17     16 3                  mastery interest perfgoal 0.40435677 0.39121758
18     17 3                  mastery interest genderid 0.38883726 0.37535573
16     18 3                   mastery interest anxiety 0.38705972 0.37353898
21     19 3                  mastery perfgoal genderid 0.36374981 0.34971487
19     20 3                   mastery anxiety perfgoal 0.35770200 0.34353366
20     21 3                   mastery anxiety genderid 0.33623356 0.32159166
22     22 3                  interest anxiety perfgoal 0.33280150 0.31808389
24     23 3                 interest perfgoal genderid 0.32827467 0.31345720
23     24 3                  interest anxiety genderid 0.30646123 0.29116258
25     25 3                  anxiety perfgoal genderid 0.13411781 0.11501747
28     26 4         mastery interest perfgoal genderid 0.41114901 0.39370157
26     27 4          mastery interest anxiety perfgoal 0.40622190 0.38862848
27     28 4          mastery interest anxiety genderid 0.39025253 0.37218594
29     29 4          mastery anxiety perfgoal genderid 0.36429984 0.34546428
30     30 4         interest anxiety perfgoal genderid 0.33874576 0.31915304
31     31 5 mastery interest anxiety perfgoal genderid 0.41189279 0.38994849
       predrsq        cp      aic      sbic      sbc     msep      fpe
1   0.30918074 16.574518 414.0549 16.408828 422.8798 153.3669 1.111126
2   0.26469681 26.850439 423.1799 25.276512 432.0049 163.6962 1.185961
4   0.04243592 72.985686 458.1006 59.259526 466.9256 210.0711 1.521941
3   0.02731876 78.892735 462.0029 63.063128 470.8278 216.0088 1.564959
5  -0.01698590 89.207268 468.5664 69.463903 477.3913 226.3769 1.640075
6   0.35215300  6.215129 404.2283  6.916331 415.9949 141.9797 1.035815
8   0.31654714 12.720391 410.5774 12.995632 422.3440 148.5668 1.083872
9   0.30255926 17.518010 415.0821 17.311607 426.8486 153.4248 1.119314
7   0.30059119 18.061308 415.5832 17.791873 427.3497 153.9749 1.123327
11  0.27907319 21.363016 418.5904 20.674793 430.3570 157.3182 1.147718
10  0.27096634 24.686590 421.5538 23.516715 433.3204 160.6836 1.172270
12  0.26108316 27.403756 423.9307 25.797072 435.6973 163.4350 1.192343
13  0.07295529 65.308257 453.4628 54.197491 465.2294 201.8165 1.472356
15  0.04819453 70.719113 457.2129 57.813378 468.9794 207.2954 1.512328
14  0.01508417 80.065467 463.4629 63.844712 475.2295 216.7594 1.581372
17  0.35455488  3.717079 401.6636  4.611247 416.3718 138.4430 1.017021
18  0.34601216  7.253192 405.2646  8.004617 419.9728 142.0501 1.043520
16  0.34386008  7.658203 405.6712  8.387930 420.3794 142.4633 1.046555
21  0.31368697 12.969359 410.8966 13.317196 425.6048 147.8811 1.086355
19  0.30828652 14.347349 412.2210 14.567508 426.9292 149.2868 1.096681
20  0.29169859 19.238924 416.8239 18.915616 431.5322 154.2766 1.133337
22  0.28218668 20.020918 417.5460 19.598079 432.2542 155.0743 1.139197
24  0.27976144 21.052355 418.4926 20.493058 433.2008 156.1265 1.146927
23  0.26283806 26.022538 422.9667 24.725494 437.6749 161.1965 1.184172
25  0.06684641 65.290920 454.0385 54.243999 468.7468 201.2536 1.478437
28  0.35117288  4.169470 402.0579  5.185944 419.7078 137.8857 1.019906
26  0.34633923  5.292109 403.2245  6.267266 420.8744 139.0394 1.028439
27  0.33572745  8.930723 406.9400  9.714043 424.5898 142.7788 1.056099
29  0.30315879 14.844035 412.7755 15.136027 430.4253 148.8559 1.101050
30  0.27748785 20.666521 418.2931 20.272308 435.9429 154.8397 1.145310
31  0.34093391  6.000000 403.8810  7.111514 424.4725 138.7469 1.033296
         apc         hsp
1  0.6890377 0.007996178
2  0.7354445 0.008534722
4  0.9437946 0.010952593
3  0.9704713 0.011262172
5  1.0170525 0.011802740
6  0.6423357 0.007456508
8  0.6721368 0.007802452
9  0.6941150 0.008057584
7  0.6966039 0.008086476
11 0.7117293 0.008262058
10 0.7269548 0.008438803
12 0.7394023 0.008583299
13 0.9130456 0.010599024
15 0.9378331 0.010886769
14 0.9806495 0.011383799
17 0.6306811 0.007324229
18 0.6471135 0.007515062
16 0.6489956 0.007536919
21 0.6736767 0.007823546
19 0.6800802 0.007897911
20 0.7028115 0.008161895
22 0.7064455 0.008204096
24 0.7112386 0.008259760
23 0.7343352 0.008527985
25 0.9168164 0.010647178
28 0.6324696 0.007348779
26 0.6377617 0.007410269
27 0.6549139 0.007609565
29 0.6827891 0.007933451
30 0.7102360 0.008252362
31 0.6407735 0.007449866

To obtain plots of Mallows' Cp and the other indices

plot(modcompare)
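What the all-possible-subsets table contains can be illustrated with a few lines of base R. The sketch below is not the olsrr implementation: the function name all_subsets is hypothetical, only three of the indices are computed, and the built-in mtcars data stands in for regdata. Mallows' Cp is computed as RSS / MSE_full - n + 2 * (k + 1) for a model with k predictors, which is why the full model always has Cp equal to its number of coefficients (6.000 for the five-predictor model above).

```r
# Sketch: fit every non-empty subset of the candidates and collect
# R-square, adjusted R-square and Mallows' Cp (not the olsrr implementation).
all_subsets <- function(data, response, candidates) {
  full     <- lm(reformulate(candidates, response = response), data = data)
  n        <- nobs(full)
  mse_full <- sum(resid(full)^2) / df.residual(full)
  rows <- list()
  for (k in seq_along(candidates)) {
    for (vars in combn(candidates, k, simplify = FALSE)) {
      fit <- lm(reformulate(vars, response = response), data = data)
      s   <- summary(fit)
      rss <- sum(resid(fit)^2)
      rows[[length(rows) + 1]] <- data.frame(
        n          = k,
        predictors = paste(vars, collapse = " "),
        rsquare    = s$r.squared,
        adjr       = s$adj.r.squared,
        cp         = rss / mse_full - n + 2 * (k + 1),
        stringsAsFactors = FALSE
      )
    }
  }
  do.call(rbind, rows)
}

# Stand-in data: the built-in mtcars set, since regdata is not bundled with R
subsets <- all_subsets(mtcars, "mpg", c("wt", "hp", "qsec", "drat"))
subsets[order(subsets$cp), ]   # models sorted by Mallows' Cp
```

With k candidates there are 2^k - 1 models to fit, so this approach is only practical for a modest number of predictors.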

Best subsets regression

model<-lm(achieve~mastery+interest+anxiety+perfgoal+genderid, data=regdata)
modcompare<-ols_step_best_subset(model)
modcompare
                 Best Subsets Regression                 
---------------------------------------------------------
Model Index    Predictors
---------------------------------------------------------
     1         mastery                                    
     2         mastery interest                           
     3         mastery interest perfgoal                  
     4         mastery interest perfgoal genderid         
     5         mastery interest anxiety perfgoal genderid 
---------------------------------------------------------

                                                   Subsets Regression Summary                                                    
---------------------------------------------------------------------------------------------------------------------------------
                       Adj.        Pred                                                                                           
Model    R-Square    R-Square    R-Square     C(p)        AIC        SBIC        SBC         MSEP       FPE       HSP       APC  
---------------------------------------------------------------------------------------------------------------------------------
  1        0.3304      0.3255      0.3092    16.5745    414.0549    16.4088    422.8798    153.3669    1.1111    0.0080    0.6890 
  2        0.3846      0.3756      0.3522     6.2151    404.2283     6.9163    415.9949    141.9797    1.0358    0.0075    0.6423 
  3        0.4044      0.3912      0.3546     3.7171    401.6636     4.6112    416.3718    138.4430    1.0170    0.0073    0.6307 
  4        0.4111      0.3937      0.3512     4.1695    402.0579     5.1859    419.7078    137.8857    1.0199    0.0073    0.6325 
  5        0.4119      0.3899      0.3409     6.0000    403.8810     7.1115    424.4725    138.7469    1.0333    0.0074    0.6408 
---------------------------------------------------------------------------------------------------------------------------------
AIC: Akaike Information Criteria 
 SBIC: Sawa's Bayesian Information Criteria 
 SBC: Schwarz Bayesian Criteria 
 MSEP: Estimated error of prediction, assuming multivariate normality 
 FPE: Final Prediction Error 
 HSP: Hocking's Sp 
 APC: Amemiya Prediction Criteria 

Make the plot

plot(modcompare)
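The criteria in the summary table do not all favour the same model. A small check, with values copied from the table above, that picks the preferred model under each of several criteria. Treating the smallest C(p) as best is a simplifying assumption (another common rule looks for C(p) close to the number of coefficients):

```r
# Values copied from the Subsets Regression Summary table
best_subsets <- data.frame(
  model   = 1:5,
  adjr    = c(0.3255, 0.3756, 0.3912, 0.3937, 0.3899),
  predrsq = c(0.3092, 0.3522, 0.3546, 0.3512, 0.3409),
  cp      = c(16.5745, 6.2151, 3.7171, 4.1695, 6.0000),
  aic     = c(414.0549, 404.2283, 401.6636, 402.0579, 403.8810),
  sbc     = c(422.8798, 415.9949, 416.3718, 419.7078, 424.4725)
)

# Index of the preferred model under each criterion
choice <- c(
  adjr    = which.max(best_subsets$adjr),      # larger is better
  predrsq = which.max(best_subsets$predrsq),   # larger is better
  cp      = which.min(best_subsets$cp),        # smaller is better (assumed rule)
  aic     = which.min(best_subsets$aic),       # smaller is better
  sbc     = which.min(best_subsets$sbc)        # smaller is better
)
choice
```

On these numbers, C(p), AIC and predicted R-square point to model 3 (mastery, interest, perfgoal), adjusted R-square to model 4, and SBC, which penalizes additional predictors more heavily, to model 2.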

Information on the ‘olsrr’ package and variable selection procedures

https://cran.r-project.org/web/packages/olsrr/olsrr.pdf
https://cran.r-project.org/web/packages/olsrr/vignettes/variable_selection.html
https://olsrr.rsquaredacademy.com/articles/variable_selection.html