\[ INTRODUCTION \space TO \space REGRESSION \space IN \space R \]

Simple Linear Regression

Regression analysis is a statistical tool used to explain the relationship between an outcome variable and one or more predictor variables. In a linear regression model, we assume that this relationship can be described by a linear function. A simple linear regression model has exactly one outcome variable and one predictor variable.

Linear Function

The data in a simple regression model (SRM) come in pairs. As in the correlation topic of high school statistics, x is the independent variable and y is the dependent variable.

\(X\): The predictor, with \(x_1,x_2,\ldots,x_n\) being the \(n\) observations of \(X\)

\(Y\): The response, with \(y_1,y_2,\ldots,y_n\) being the \(n\) observations of \(Y\)

A linear function is determined by the intercept and the slope of the line. In this model,

\[ y=\beta_1x+\beta_0 \tag{1.1} \]

where \(\beta_1\) is the slope and \(\beta_0\) is the intercept.

When plotting the line on a Cartesian plane, the slope is the rise over the run, \(\beta_1=\frac{\Delta y}{\Delta x}\).
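As a small illustration of rise over run, the sketch below recovers the slope and intercept of a line from two points. The two points and the names `x`, `y`, `slope` and `intercept` are made up for this example.

```r
# Two hypothetical points on a line: (1, 3) and (4, 9)
x <- c(1, 4)
y <- c(3, 9)

# Slope: rise over run, the change in y divided by the change in x
slope <- diff(y) / diff(x)

# Intercept: solve y = slope * x + intercept at the first point
intercept <- y[1] - slope * x[1]

slope      # 2
intercept  # 1
```

With these values the line is \(y=2x+1\), which has the same form as (1.1).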

Linear Regression as a statistical model

The model in (1.1) rarely applies exactly in real-life situations because observed data contain errors. The regression model therefore has two parts: a linear function and an error term. \(\beta_0\) and \(\beta_1\) are called parameters and are estimated from the data.

\[ Y=\beta_0 +\beta_1x+\epsilon \tag{1.2} \]

The linear function predicts the mean value of the outcome for a given value of the predictor variable.

The error term is the unexplained uncertainty of the model. It is a random variable with mean zero and unknown variance \(\sigma^2\), and it is usually assumed to follow a normal distribution. In a simple regression model, for each individual or unit \(i=1,2,\ldots,n\) in the study, we have a pair of observations \((x_i,y_i)\).
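To see how the linear function and the error term combine, the following sketch simulates data from a model of the form (1.2) and then estimates the parameters with `lm`. The sample size, the true values of \(\beta_0\), \(\beta_1\) and \(\sigma\), and the range of the predictor are all assumptions chosen only for illustration.

```r
set.seed(1)  # make the simulated errors reproducible

n     <- 200   # number of (x_i, y_i) pairs (assumed)
beta0 <- 2     # true intercept (assumed)
beta1 <- 0.5   # true slope (assumed)
sigma <- 1     # standard deviation of the error term (assumed)

x   <- runif(n, min = 0, max = 10)     # predictor values
eps <- rnorm(n, mean = 0, sd = sigma)  # errors: mean zero, variance sigma^2
y   <- beta0 + beta1 * x + eps         # outcome = linear function + error

fit <- lm(y ~ x)   # estimate beta0 and beta1 from the simulated data
coef(fit)          # the estimates should land near 2 and 0.5
```

Because the errors are random, the estimates will not equal the true values exactly, but they should be close for a sample of this size.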

Ordinary Least Squares

Decomposing the sum of squares and ANOVA

Running the regression model in RStudio

d <- read.csv("C:/Users/USER/Dropbox/PC/Downloads/elemapi2v2.csv")

The data are in .csv format, so we use the read.csv function to read and load them into R. The result is a data.frame object that appears in the Environment tab. Clicking the data.frame object in the Environment pane opens it as a spreadsheet.
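The path above is specific to one computer. To try read.csv without that file, the sketch below writes a three-row stand-in CSV to a temporary file and reads it back. The object name `small` and the temporary file are hypothetical; the three rows copy the api00, meals and enroll values of the first three schools in the data.

```r
# Write a three-row stand-in file so the example runs without the real CSV
tmp <- tempfile(fileext = ".csv")
writeLines(c("api00,meals,enroll",
             "693,67,247",
             "570,92,463",
             "546,97,395"), tmp)

small <- read.csv(tmp)   # header = TRUE is the default for read.csv
class(small)             # "data.frame"
head(small)              # print the first rows in the console
unlink(tmp)              # remove the temporary file
```

Printing with head shows only the first rows, which is usually easier to read than printing a whole data frame.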

Check the structure of the data to make sure it is suitable for linear regression.

The class function reports the class (type) of the object that holds the data.

class(d)
## [1] "data.frame"

The names function returns the names of the variables, which are stored as columns.

names(d)
##  [1] "snum"     "dnum"     "api00"    "api99"    "growth"   "meals"   
##  [7] "ell"      "yr_rnd"   "mobility" "acs_k3"   "acs_46"   "not_hsg" 
## [13] "hsg"      "some_col" "col_grad" "grad_sch" "avg_ed"   "full"    
## [19] "emer"     "enroll"   "mealcat"  "collcat"  "abv_hsg"  "lgenroll"
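Beyond class and names, it can help to confirm that the columns are numeric and to count missing values before fitting a model. The sketch below does this on a small hypothetical data frame called `toy`. Its two columns are named after variables in d, but the rows are only a stand-in and include one missing value on purpose.

```r
# Hypothetical stand-in with two columns named after variables in d
toy <- data.frame(api00  = c(693, 570, 546, 571),
                  acs_k3 = c(16, 15, NA, 20))

class(toy)                # "data.frame"
names(toy)                # "api00" "acs_k3"
sapply(toy, is.numeric)   # TRUE for columns that can enter a regression directly
colSums(is.na(toy))       # number of missing values in each column
```

The same four calls can be run on d itself once it has been loaded.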

The dim function returns the number of rows (observations) and columns (variables).

dim(d)
## [1] 400  24
d
##     snum dnum api00 api99 growth meals ell yr_rnd mobility acs_k3 acs_46
## 1    906   41   693   600     93    67   9      0       11     16     22
## 2    889   41   570   501     69    92  21      0       33     15     32
## 3    887   41   546   472     74    97  29      0       36     17     25
## 4    876   41   571   487     84    90  27      0       27     20     30
## 5    888   41   478   425     53    89  30      0       44     18     31
## 6   4284   98   858   844     14    10   3      0       10     20     33
## 7   4271   98   918   864     54     5   2      0       16     19     28
## 8   2910  108   831   791     40     2   3      0       44     20     31
## 9   2899  108   860   838     22     5   6      0       10     20     30
## 10  2887  108   737   703     34    29  15      0       17     21     29
## 11  2911  108   851   808     43     1   2      0       16     20     30
## 12  2882  108   536   496     40    71  69      0        8     21     27
## 13  2907  108   847   815     32     3   2      0       11     20     29
## 14  2908  108   765   711     54    13   8      0       19     21     27
## 15  2895  108   809   802      7     7   8      0       16     20     31
## 16  2880  108   813   780     33    22  11      0       12     21     32
## 17  2890  108   856   816     40     7   6      0       13     21     29
## 18  3948  131   712   677     35    40  12      1       19     19     30
## 19  3956  131   805   759     46    10   1      0       26     21     31
## 20  3947  131   678   632     46    37  16      1       16     20     32
## 21  3952  131   619   570     49    57  25      1       24     19     31
## 22  3954  131   713   704      9    29  14      1       14     20     30
## 23  3943  131   704   618     86    57  19      1       22     20     30
## 24  3945  131   523   452     71    69  43      1       31     19     29
## 25  4293  135   655   632     23    65  44      0       21     20     29
## 26  4299  135   523   441     82    74  52      1       23     19     28
## 27  4318  135   521   473     48    74  63      0       15     19     31
## 28  4319  135   709   701      8    18  16      0       20     17     27
## 29  4296  135   505   452     53    75  59      0       27     19     30
## 30  4317  135   762   744     18    18  16      0       15     20     30
## 31  4322  135   722   684     38    11  21      0       20     20     28
## 32  4307  135   603   520     83    71  54      0       19     19     30
## 33  4302  135   657   579     78    67  36      0       17     19     30
## 34  4314  135   705   655     50    46  42      0       32     17     27
## 35  4292  135   754   748      6    25  25      0       18     20     27
## 36  4304  135   490   431     59    71  54      0       25     23     30
## 37  4308  135   698   680     18    44  18      0       21     19     29
## 38   600  140   843   805     38    25   6      0       17     20     32
## 39   596  140   800   769     31    56   6      0       41     19     30
## 40   611  140   857   857      0    13   4      0       12     20     32
## 41   595  140   713   664     49    63  35      0       15     21     32
## 42   592  140   804   873    -69    38   8      0       33     21     35
## 43   602  140   864   831     33    24   5      0       35     21     35
## 44  5222  166   940   917     23     0   2      0        3     20     31
## 45  5210  166   792   779     13     9   3      0       19     20     31
## 46  5217  166   887   882      5     3   5      0       14     21     31
## 47  3644  209   775   726     49    26  13      0       18     21     33
## 48  3643  209   822   766     56    16  10      1       47     21     32
## 49  3624  209   806   745     61    21   6      1       25     19     30
## 50  3623  209   816   745     71    16   6      0       18     19     32
## 51  3629  209   553   485     68    86  38      1       32     20     32
## 52  3622  209   686   633     53    75  33      1       24     21     32
## 53  4017  238   483   444     39    79  40      1       24     20     28
## 54    58  248   801   765     36    20  12      0       18     19     30
## 55    70  248   753   694     59    25  19      0       14     21     29
## 56    65  248   903   889     14     7   8      0       21     22     30
## 57   657  253   454   443     11   100  53      1       21     19     29
## 58   697  253   440   428     12   100  67      1       19     19     29
## 59   646  253   519   491     28    93  33      0       19     19     29
## 60   640  253   512   477     35    84  33      1       24     19     29
## 61   629  253   525   522      3    92  41      1       15     19     30
## 62   659  253   486   480      6    90  44      1       27     19     30
## 63   699  253   437   409     28    90  70      1       17     19     29
## 64   637  253   447   409     38   100  51      0       22     20     29
## 65   663  253   518   499     19    86  22      0       18     19     29
## 66   633  253   479   466     13   100  59      1       21     19     30
## 67   690  253   424   343     81   100  36      0       20     17     26
## 68  2982  259   536   465     71    93  84      0       14     21     29
## 69  3000  259   628   554     74    75  74      0       25     20     28
## 70  2997  259   627   546     81    71  71      0       15     20     33
## 71  3005  259   596   534     62    65  46      0       20     21     29
## 72  2972  259   664   617     47    72  71      0       13     19     30
## 73  3011  259   655   620     35    56  52      0       19     21     29
## 74  2977  259   616   566     50    75  65      0       23     21     31
## 75  3013  259   538   497     41    80  82      0       19     21     31
## 76  3024  259   727   670     57    60  50      0       24     21     30
## 77  3004  259   822   787     35     8   4      0       10     19     27
## 78  3010  259   638   569     69    80  71      0       11     20     31
## 79  2460  284   581   497     84    70  26      0       14     23     25
## 80  2479  284   784   739     45    23  25      0       21     21     36
## 81  2459  284   871   840     31    14  15      0       15     21     34
## 82   105  294   567   535     32    62  43      0       20     20     31
## 83    94  294   620   592     28    51  20      0       20     19     31
## 84    93  294   761   726     35    15   3      0       21     20     29
## 85   116  294   513   508      5    82  56      0       16     NA     31
## 86  3236  316   912   900     12     6   7      0       22     21     30
## 87  3240  316   883   821     62     7  20      0       35     19     31
## 88  3241  316   844   834     10    12  14      0       21     21     26
## 89  3256  316   880   841     39    11   7      0       11     21     31
## 90  3250  316   894   881     13     4   4      0       13     20     26
## 91  3247  316   858   818     40     4   8      0       17     20     27
## 92  3258  316   847   817     30     7   7      0       17     20     31
## 93  1497  395   604   529     75   100  68      1       12     20     35
## 94  1478  395   553   466     87   100  70      1       11     19     35
## 95  1474  395   657   567     90    64  19      0       25     19     32
## 96  1511  395   492   500     -8    89  48      0       42     18     31
## 97  1539  395   700   711    -11    80  51      0       11     20     34
## 98  1490  395   590   552     38    89  58      1       16     19     33
## 99  1500  395   777   752     25    37  11      0       15     20     34
## 100 1515  395   667   625     42    85  52      1       18     19     34
## 101 1512  395   769   752     17    47  20      0       17     16     25
## 102 1475  395   650   631     19    78  33      0       21     19     32
## 103 1522  395   561   427    134   100  86      1        8     20     35
## 104 1472  395   561   537     24    83  28      0       21     19     34
## 105 1516  395   597   543     54   100  71      1       15     20     34
## 106 1489  395   568   507     61    84  55      1       16     20     33
## 107 1493  395   637   622     15    78  30      0       28     19     29
## 108 1606  401   588   538     50    61   1      0       20     16     26
## 109 1866  401   821   793     28    48   9      0       20     18     29
## 110 1747  401   559   511     48   100  74      0       11     17     27
## 111 1905  401   693   663     30    50   4      0       11     18     27
## 112 1699  401   542   456     86    99  53      0       10     19     30
## 113 2077  401   840   816     24    33  25      0       11     20     29
## 114 1959  401   621   605     16    82  45      0       17     18     25
## 115 1914  401   487   489     -2    91  48      1       22     18     27
## 116 1685  401   686   621     65    62  21      0       10     16     26
## 117 1757  401   463   438     25    88  75      1       24     19     27
## 118 2087  401   476   430     46    90  70      1       10     18     29
## 119 1946  401   821   767     54    21   4      0       12     18     29
## 120 1821  401   412   379     33   100  77      1        8     19     29
## 121 1881  401   394   382     12    99  42      1       22     17     26
## 122 1862  401   815   785     30    24  18      0       15     18     30
## 123 1932  401   694   631     63    56  24      0       24     18     30
## 124 1919  401   433   425      8   100  74      1       15     19     31
## 125 1788  401   560   504     56    72   1      0       26     17     27
## 126 1664  401   572   535     37    76  44      0       12     18     25
## 127 1885  401   605   600      5    68  31      0       15     17     24
## 128 1682  401   755   716     39    27   2      0       10     16     31
## 129 1853  401   475   415     60    95  64      0       21     20     28
## 130 2082  401   520   476     44    95  53      1       27     18     29
## 131 1806  401   406   367     39   100  75      1       15     19     32
## 132 1997  401   471   416     55    99  46      1       15     19     33
## 133 1701  401   664   594     70    52  22      0       19     16     31
## 134 1926  401   457   399     58    91  74      1       31     20     28
## 135 1633  401   584   501     83    97  61      0       15     14     25
## 136 1690  401   735   693     42    71  24      0       10     18     27
## 137 1696  401   536   508     28    77   9      0       18     19     29
## 138 1775  401   771   708     63    31  11      0       16     17     28
## 139 1673  401   483   436     47   100  76      0       12     19     31
## 140 1863  401   387   338     49    96  80      1       14     19     28
## 141 1990  401   650   611     39    75  64      1        9     20     28
## 142 1752  401   467   442     25   100  70      0       16     19     28
## 143 1782  401   548   512     36    70  59      0       14     16     34
## 144 1995  401   756   717     39    17   4      0       14     19     31
## 145 1638  401   440   403     37    97  54      0       20     18     30
## 146 1815  401   581   548     33   100  63      0       23     19     24
## 147 1709  401   515   487     28    96  66      0       18     19     27
## 148 1812  401   917   902     15     6   3      0        8     17     31
## 149 1742  401   657   615     42    63  23      0       22     16     30
## 150 1799  401   723   673     50    41  11      1       15     18     31
## 151 1907  401   506   482     24    79   7      0       24     17     27
## 152 1924  401   569   514     55    76  28      0       13     16     24
## 153 1781  401   466   442     24    94  13      0       16     18     29
## 154 1897  401   660   624     36    55  31      0       14     18     29
## 155 1596  401   536   456     80    95  70      1       13     20     29
## 156 1868  401   486   422     64    96  75      1       16     19     30
## 157 1909  401   453   423     30    98  53      0       13     18     29
## 158 1600  401   683   631     52    63   5      0       21     17     28
## 159 1894  401   459   426     33    99  69      1        8     19     24
## 160 1895  401   456   415     41    95  65      1       20     18     29
## 161 1889  401   612   555     57    83  20      0       19     19     27
## 162 1741  401   387   365     22    99  75      1       21     18     28
## 163 1769  401   420   407     13    98  91      1       31     19     30
## 164 1941  401   495   451     44    96  41      1       21     18     29
## 165 1744  401   552   491     61    97  86      1       10     19     28
## 166 1680  401   744   737      7    31   2      0       22     16     29
## 167 1858  401   477   424     53    98  56      0       22     19     28
## 168 2074  401   474   427     47    89  64      0       28     19     27
## 169 1949  401   696   636     60    66  22      0       17     16     29
## 170 1903  401   592   489    103   100  67      0        8     18     27
## 171 1651  401   650   599     51    73  40      0        7     19     29
## 172 1723  401   369   354     15    99  83      1       16     19     31
## 173 1783  401   437   377     60    99  90      1        9     19     28
## 174 1634  401   551   505     46    76  59      0       16     16     28
## 175 2088  401   429   419     10    92  78      1       15     19     26
## 176 1925  401   435   425     10    98  72      1       14     18     27
## 177 1851  401   747   735     12    47  14      0       19     18     24
## 178 1616  401   512   480     32    88  27      0       13     16     22
## 179 2092  401   474   423     51    98  65      1       14     18     28
## 180 1652  401   850   832     18    12   5      0        5     18     31
## 181 1820  401   551   504     47    83  37      1       17     19     28
## 182 1791  401   482   441     41   100  67      1       35     18     29
## 183 1677  401   534   482     52   100  49      0       22     19     25
## 184 1836  401   506   462     44    87  22      0       22     18     23
## 185 1743  401   556   508     48    86  56      0       18     18     27
## 186 1801  401   457   395     62   100  59      1       21     19     27
## 187 1678  401   497   440     57    95  66      1       18     19     30
## 188 1731  401   719   684     35    79  48      0       15     19     32
## 189 1961  401   582   538     44    89  67      1       26     19     27
## 190 1900  401   486   453     33    96  84      1       19     19     26
## 191 1740  401   521   452     69    99  64      0       12     18     29
## 192 1977  401   386   358     28    96  65      1       23     19     25
## 193 1952  401   397   333     64    98  84      1        8     19     27
## 194 1625  401   631   604     27    65  18      0       23     18     27
## 195 1994  401   428   396     32    97  43      1       24     19     30
## 196 1978  401   894   860     34    13   1      0       22     19     26
## 197 1805  401   421   399     22    99  86      1        9     18     29
## 198 1704  401   524   511     13    97  53      1        9     17     26
## 199 1597  401   532   506     26   100  74      1        7     19     29
## 200 1872  401   674   634     40    72  44      0        7     17     21
## 201 1729  401   589   511     78    82  43      0       20     18     25
## 202 1612  401   437   363     74    97  71      1       25     19     29
## 203 1621  401   717   667     50    72  29      0       20     18     29
## 204 1763  401   586   539     47    80  50      0       19     19     27
## 205 1854  401   470   444     26    98  45      0       17     19     29
## 206 1611  401   474   415     59   100  73      1       12     19     31
## 207 1778  401   699   660     39    67  35      0       21     19     27
## 208 1615  401   657   625     32    75   6      0       26     18     26
## 209 1795  401   759   725     34    33  18      0       19     14     29
## 210 2080  401   493   484      9    82  67      1       20     19     28
## 211 1671  401   853   812     41    37  16      0       15     18     34
## 212  416  473   866   836     30     1   1      0       11     22     33
## 213  425  473   657   593     64    45  11      0       16     21     31
## 214  419  473   850   808     42    12   5      0       13     21     33
## 215  402  473   893   845     48     7   4      0       16     20     30
## 216  430  473   745   663     82    17   3      0       17     20     33
## 217  406  473   719   670     49    31   6      0       17     21     30
## 218  412  473   783   725     58    46   5      0       21     20     29
## 219  413  473   615   523     92    75  36      0       17     22     29
## 220 3070  491   479   443     36    91  82      0       14     20     28
## 221 3072  491   763   769     -6    15  19      0       18     19     NA
## 222 3060  491   669   670     -1    48  40      0       19     18     28
## 223 3051  491   892   882     10     3   1      0       20     19     30
## 224 3055  491   590   587      3    66  52      0       10     19     NA
## 225  194  507   514   527    -13    76  48      0       16     19     30
## 226  211  507   386   360     26    75  48      1       14     20     29
## 227  182  507   411   413     -2    81  45      0       22     19     28
## 228  167  507   774   757     17    69  51      0        7     19     26
## 229  203  507   844   827     17    11   2      0        6     19     29
## 230  187  507   498   480     18    77  30      0       19     20     30
## 231  201  507   624   558     66    59  16      0       13     18     29
## 232  210  507   432   430      2    90  29      0       17     19     29
## 233  184  507   435   445    -10    87  70      1        4     19     30
## 234  165  507   449   434     15    71  21      0       19     20     31
## 235  181  507   724   714     10    15   3      0       18     21     29
## 236  198  507   894   859     35     9   2      0        7     20     30
## 237 2240  541   599   541     58    76  12      0       20     19     24
## 238 2247  541   508   486     22    77  43      0       16     20     28
## 239 2267  570   897   888      9    11   1      0       13     22     31
## 240 2278  570   457   404     53    93  62      1       16     21     30
## 241 2282  570   582   553     29    64  25      0       20     21     31
## 242 4448  575   903   884     19     2  12      0        7     20     28
## 243 4435  575   863   839     24     5  10      0       12     18     33
## 244 4449  575   844   848     -4     3  13      0       15     19     28
## 245 4431  575   858   826     32     6   5      0       21     20     29
## 246 4427  575   802   784     18    21  11      0       11     18     32
## 247 4443  575   875   848     27     4   9      0        9     19     29
## 248 3698  600   720   643     77    55   9      0       19     18     26
## 249 3696  600   483   450     33    89  43      0       38     19     27
## 250 3715  600   696   636     60    50  11      0       23     19     30
## 251 3697  600   690   653     37    65   5      0       26     21     25
## 252 3700  600   717   665     52    77  11      0       29     19     27
## 253 3701  600   687   617     70    62  18      0       19     21     26
## 254 3518  605   623   564     59    60  23      0       16     19     27
## 255 3511  605   688   662     26    40  10      0       23     20     28
## 256 3520  605   643   575     68    63  29      1       19     19     27
## 257 3516  605   574   519     55    65  21      0       24     20     29
## 258 3525  605   632   568     64    51  18      1       19     19     31
## 259 3537  605   694   678     16    12   4      1       15     19     29
## 260 3519  605   532   525      7    62  36      0       19     19     34
## 261 3523  605   657   618     39    48  14      0       23     20     32
## 262 3535  605   705   673     32    20   6      0       15     19     30
## 263 3765  620   758   701     57    45  10      0       21     22     34
## 264 3736  620   556   542     14    61  46      0       39     22     25
## 265 3785  620   459   439     20    98  52      0       36     19     27
## 266 3741  620   573   516     57   100  51      0       31     20     32
## 267 3757  620   707   667     40    57  11      0       22     21     31
## 268 3751  620   592   533     59   100  22      0       35     20     30
## 269 3791  620   724   674     50    49  20      0       24     19     35
## 270 3772  620   829   824      5    22   7      0       14     19     33
## 271 3758  620   690   642     48    98  51      0       10     20     28
## 272 3794  620   610   540     70   100  47      0       31     20     23
## 273 3793  620   528   460     68    99  35      0       27     18     33
## 274 3759  620   585   556     29    54  11      0       27     25     37
## 275 3784  620   859   801     58    18   1      0        9     21     33
## 276 3754  620   732   672     60    63   8      0       37     19     30
## 277 3196  621   799   762     37    21   8      0       12     19     32
## 278 3203  621   845   799     46    14   4      0       12     19     30
## 279 3200  621   854   854      0     6   1      0       13     19     32
## 280 3184  621   790   758     32    26  15      0       17     19     32
## 281 3202  621   839   810     29    12   3      0       17     17     30
## 282 3187  621   689   682      7    51  36      0       13     19     32
## 283 3193  621   841   840      1     6   5      0       17     19     31
## 284 4132  627   493   471     22    94  41      1       33     20     32
## 285 4128  627   643   622     21    82  24      1       19     20     27
## 286 4131  627   698   686     12    62  29      1       15     20     32
## 287 4173  627   486   476     10    95  40      1       24     20     34
## 288 4167  627   612   551     61    75  26      0       16     20     32
## 289 4140  627   630   550     80    67  21      1       21     20     32
## 290 4145  627   525   519      6    78  18      1       25     20     31
## 291 4143  627   485   448     37    94  40      1       37     19     32
## 292 4136  627   474   437     37    92  34      1       28     20     31
## 293 4488  630   521   568    -47    97  77      0       14     19     NA
## 294 4576  630   801   762     39    32  18      0       14     20     30
## 295 4506  630   709   619     90    59  17      0       20     16     33
## 296 4554  630   836   785     51    29  27      0       17     19     32
## 297 4518  630   769   723     46    47   6      0       13     19     25
## 298 4486  630   706   701      5    62  15      0       13     21     28
## 299 4537  630   655   614     41    78  27      0       16     19     32
## 300 4585  630   845   843      2    31  24      0       25     19     31
## 301 4573  630   789   747     42    41  27      0       16     20     30
## 302 4534  630   445   421     24   100  70      0       NA     NA     29
## 303 4581  630   708   691     17    74   2      0       24     19     30
## 304 4507  630   751   679     72    71  19      0       20     17     20
## 305 4522  630   461   441     20   100  74      0       22     19     31
## 306 4530  630   757   723     34    58  14      0       16     18     30
## 307 4583  630   865   837     28    40   8      0       17     19     30
## 308 4485  630   567   524     43    81  45      0       26     18     25
## 309 4558  630   748   684     64    62  21      0       11     18     29
## 310 4480  630   612   528     84    79  34      0       15     20     31
## 311 4533  630   500   463     37    95  69      0       24     19     26
## 312 4574  630   740   672     68    54  18      0       14     20     33
## 313 4580  630   740   712     28    66   2      0       19     19     29
## 314 4596  630   883   864     19     4   3      0        9     20     30
## 315 4514  630   799   789     10    38  10      0       14     19     30
## 316 4519  630   887   852     35    19   8      0       10     19     27
## 317 4502  630   493   489      4    91  71      0       27     19     30
## 318 4516  630   586   529     57    93  59      0       21     20     32
## 319 4511  630   626   626      0    79  36      0       11     19     33
## 320 4487  630   528   476     52    91  68      1       10     18     27
## 321 4528  630   741   700     41    44  18      0       17     20     33
## 322 4539  630   620   621     -1    71  23      0       14     20     31
## 323 4547  630   803   751     52    48  18      0       17     18     29
## 324 4731  632   655   635     20    56  28      0       13     18     32
## 325 4720  632   597   576     21    71  37      0        8     17     30
## 326 4783  632   789   730     59    20  11      0        4     19     25
## 327 4737  632   808   812     -4    61  26      0        7     21     32
## 328 4736  632   532   450     82    83  39      0       10     18     28
## 329 4714  632   503   504     -1    68  29      0       10     18     21
## 330 4781  632   655   638     17    53  12      0        7     20     23
## 331 4698  632   781   750     31    28  26      0        9     20     29
## 332 4775  632   808   806      2    14   5      0        3     19     29
## 333 4747  632   643   661    -18    72  13      0       12     19     27
## 334 4744  632   700   666     34    78  44      0        5     20     25
## 335 4729  632   558   496     62    82  41      0        8     19     20
## 336 4780  632   795   781     14    15   6      0        5     19     30
## 337 4774  632   682   641     41    35  22      0        2     19     27
## 338 4777  632   682   675      7    75   6      0       10     19     25
## 339 5387  635   892   861     31     2   3      0       13     19     31
## 340 5386  635   876   796     80     9   3      0       14     19     33
## 341 5370  635   790   753     37    17  10      0       17     18     28
## 342 5366  635   558   508     50    57  47      0       22     17     30
## 343 5358  635   575   578     -3    56  50      0       19     18     30
## 344 5388  635   686   658     28    28  23      0       13     18     31
## 345 5371  635   710   684     26    37  23      0       18     20     27
## 346 5362  635   487   454     33    62  47      0       21     18     27
## 347 3867  636   762   739     23    23   1      0        8     20     42
## 348 3848  636   801   794      7    27   1      0       19     20     45
## 349 3834  636   640   622     18    69  42      0       22     19     43
## 350 3833  636   601   499    102   100  24      0       31     20     33
## 351 3828  636   624   596     28    78  12      0       36     20     44
## 352 3839  636   863   848     15    14   1      0       18     20     46
## 353 3843  636   709   690     19    52   3      0       24     20     31
## 354 3854  636   883   851     32    20   1      0       24     21     40
## 355 3845  636   763   748     15    40   2      0       24     17     41
## 356 3850  636   727   655     72    40   1      0       29     18     39
## 357 3864  636   568   527     41    78  30      0       36     20     49
## 358 3853  636   743   726     17    48   2      0       27     20     42
## 359 3869  636   724   686     38    48  25      0       19     20     42
## 360 3826  636   831   842    -11    11   2      0        9     20     46
## 361 3824  636   876   850     26    23   1      0       21     23     34
## 362 3835  636   556   537     19   100  28      0       32     20     50
## 363 3865  636   847   837     10     9   0      0        8     20     35
## 364 3822  636   622   588     34    76   6      0       38     20     35
## 365 3129  653   531   502     29    82  74      1       13     22     28
## 366 3128  653   521   502     19    90  74      1       12     21     28
## 367 3127  653   473   437     36    95  86      1       15     21     30
## 368 3121  653   529   502     27    88  75      1       13     19     29
## 369 3145  653   806   815     -9    32  14      0       20     21     30
## 370 3133  653   744   677     67    58  35      0       13     20     31
## 371 3151  653   444   413     31    91  88      1       14     21     27
## 372 6068  689   714   672     42    45  21      0       13     18     28
## 373 6072  689   865   843     22     5   0      0        4     20     30
## 374 6057  689   641   616     25    45  17      0       13     20     32
## 375 6065  689   772   745     27    11   2      0       16     18     28
## 376 6062  689   750   730     20    21   4      0       15     19     31
## 377 6060  689   738   708     30    18   2      0       13     19     30
## 378 4879  716   529   482     47    83  28      0       16     18     23
## 379 4871  716   554   513     41    84  24      0       23     19     26
## 380 4875  716   504   451     53    90  46      1       19     20     26
## 381 4859  716   592   556     36    80  33      0       22     18     27
## 382 4877  716   627   582     45    86  35      1       16     20     25
## 383 4880  716   625   578     47    59   8      1       19     19     28
## 384 4881  716   436   415     21   100  49      0       19     19     24
## 385 4876  716   487   443     44    86  26      0       26     18     21
## 386 4868  716   610   520     90    84  34      0       27     20     27
## 387 4862  716   516   433     83   100  35      1       33     18     25
## 388 4878  716   465   440     25    90  27      0       21     21     26
## 389 5920  779   654   581     73    48  15      0       16     18     31
## 390 5917  779   735   672     63    42   4      0       16     19     29
## 391 5927  779   576   518     58    68  26      1       21     19     29
## 392 5926  779   469   483    -14    91  54      0       17     18     28
## 393 5933  779   515   468     47    91  28      0       32     17     26
## 394  469  796   543   527     16    82  25      0       23     20     27
## 395  468  796   783   758     25    31   6      0       15     19     27
## 396  482  796   745   736      9    27  12      0       11     18     28
## 397  489  796   720   678     42    34   8      0       20     19     24
## 398  504  796   802   787     15    26  10      0       21     19     33
## 399  488  796   539   424    115    98  12      0       18     20     24
## 400  479  796   512   447     65    98  34      0       31     19     32
##     not_hsg hsg some_col col_grad grad_sch avg_ed full emer enroll mealcat
## 1         0   0        0        0        0     NA   76   24    247       2
## 2         0   0        0        0        0     NA   79   19    463       3
## 3         0   0        0        0        0     NA   68   29    395       3
## 4        36  45        9        9        0   1.91   87   11    418       3
## 5        50  50        0        0        0   1.50   87   13    520       3
## 6         1   8       24       36       31   3.89  100    0    343       1
## 7         1   4       18       34       43   4.13  100    0    303       1
## 8         0   4       16       50       30   4.06   96    2   1513       1
## 9         2   9       15       42       33   3.96  100    0    660       1
## 10        8  25       34       27        7   2.98   96    7    362       1
## 11        0   3       16       45       37   4.15  100    2    768       1
## 12       61  15        6       10        8   1.90  100    6    404       2
## 13        1   5       17       41       37   4.08   97    3    586       1
## 14        5  34       27       31        3   2.93   98    2    633       1
## 15        5  27       25       37        6   3.12   89    7    379       1
## 16       10  35       23       27        6   2.84  100    0    417       1
## 17        1  10       15       42       32   3.95  100    0    670       1
## 18       14  30       31       19        6   2.71   85   12    589       1
## 19        2  16       41       29       12   3.31   93    7    770       1
## 20       12  27       29       27        6   2.87   83   15    694       1
## 21       19  25       29       20        6   2.68   75   25    611       2
## 22       11  29       28       26        7   2.88   80   15    594       1
## 23       15  32       35       14        5   2.62   69   29    564       2
## 24       32  34       18       12        4   2.22   71   29    645       2
## 25        0   0        0        0        0     NA  100    0    414       2
## 26        0   0        0      100        0   4.00   87   13    631       2
## 27        0   0        0        0        0     NA   94    3    343       2
## 28        0   0        0        0        0     NA   96    7    404       1
## 29        0   0        0        0        0     NA   94    6    504       2
## 30        0 100        0        0        0   2.00  100    0    512       1
## 31        0   0        0        0        0     NA   95    5    637       1
## 32       50  25       25        0        0   1.75   88   15    594       2
## 33       22  44       33        0        0   2.11  100    0    500       2
## 34        0   0        0      100        0   4.00   96    0    320       1
## 35       13  40       33        0       13   2.60  100    4    309       1
## 36        0   0        0        0        0     NA   65   43    748       2
## 37      100   0        0        0        0   1.00  100    0    527       1
## 38        6  35       22       28        9   2.98   91    6    465       1
## 39        9  32       32       21        5   2.82   94    6    451       2
## 40        4  11       35       31       19   3.48  100    0    432       1
## 41       40  29       15       12        4   2.09   92    8    524       2
## 42       10  30       35       20        5   2.79   97    3    427       1
## 43        4  19       33       30       14   3.31  100    3    562       1
## 44        0   0        4       28       67   4.62   90    0    305       1
## 45        2   5       17       38       38   4.05   84    0    467       1
## 46        0   4        7       32       57   4.42   93   10    520       1
## 47        4  10       35       36       16   3.51   93    7    816       1
## 48        1  11       29       38       19   3.63   95    3    830       1
## 49        5  30       28       21       16   3.12   90    5    487       1
## 50        6  17       27       36       14   3.35  100    0    307       1
## 51       33  28       27        8        4   2.20   95    5    819       3
## 52       32  35       16       14        3   2.21   98    2    744       2
## 53       37  27       27        7        3   2.13   87   13    672       2
## 54        7  16       25       36       15   3.35   89   14    416       1
## 55       10  22       19       34       16   3.25   85   20    286       1
## 56        1   9       15       34       41   4.06   77   23    488       1
## 57       47  33       11        8        0   1.82   93   11    656       3
## 58       58  26       12        3        1   1.64   88   19    577       3
## 59       43  27       19       10        1   2.01  100    5    523       3
## 60       27  35       21       16        2   2.31   98   10    672       3
## 61       45  34       14        6        1   1.85   98    2    678       3
## 62       47  32       12        8        2   1.85   94   15    680       3
## 63       64  20        8        7        0   1.59   93   15    700       3
## 64       61  19       15        4        2   1.67  100    7    396       3
## 65       15  36       27       19        2   2.57   93   13    495       3
## 66       59  26        9        5        1   1.63   94   10    929       3
## 67       46  28       16       10        0   1.91   87   20    205       3
## 68        0   0        0        0        0     NA   92    8    570       3
## 69        0   0       50       50        0   3.50   84   16    485       2
## 70       39  24       15       20        2   2.22   96    4    410       2
## 71       40  42       14        3        0   1.81  100    0    224       2
## 72       19  30       26       23        2   2.58   91   13    306       2
## 73       24  34       32       10        0   2.27   96    4    447       2
## 74       34  31       15       19        1   2.22   93    7    432       2
## 75       24  42       22        7        5   2.27   91    9    544       2
## 76       21  24       17       25       12   2.83  100    0    547       2
## 77        0  19       19       29       33   3.76   87   13    313       1
## 78       64  30        4        2        0   1.44   93    7    705       2
## 79       22  29       23       21        6   2.60   63   32    182       2
## 80        5  21       20       44       10   3.32   58   25    166       1
## 81        1   8       13       41       36   4.02   83   13    404       1
## 82       19  24       32       21        4   2.67   80   25    626       2
## 83       10  28       40       15        7   2.79   88   12    242       2
## 84        1  14       30       37       19   3.60   86   14    267       1
## 85       25  36       26       10        4   2.31   46   42    558       3
## 86        0   0       33        0       67   4.33   97    6    611       1
## 87        0   0        0      100        0   4.00  100    0    440       1
## 88        0   0       67       33        0   3.33  100    3    381       1
## 89        0  50       50        0        0   2.50   90    3    503       1
## 90        0   0        0        0        0     NA   97    0    466       1
## 91        0  20        0       80        0   3.60   95    5    357       1
## 92        0   0        0        0        0     NA  100    0    388       1
## 93       48  33       13        6        1   1.78   37   53    591       3
## 94       62  25        9        3        0   1.54   73   17    637       3
## 95       23  46       17       14        1   2.24   90    3    403       2
## 96       32  32       24       10        2   2.19   44   47    810       3
## 97       26  27       30       13        4   2.44   83   14    295       2
## 98       29  39       22        7        2   2.13   58   29    837       3
## 99        4  32       23       31       10   3.11   84    9    669       1
## 100      41  36       12        8        3   1.97   80   16    621       3
## 101      16  11       30       30       13   3.13   95    5    467       2
## 102      19  37       26       11        8   2.52   85   15    262       2
## 103      44  37       14        5        0   1.81   48   31    795       3
## 104       7  39       25       26        3   2.79   45   41    750       3
## 105      55  26       13        5        1   1.71   68   32    590       3
## 106      33  47       14        4        2   1.96   78   15   1029       3
## 107      24  25       24       22        5   2.59   59   23    891       2
## 108       0  10       36       48        6   3.49   73   19    285       2
## 109       5  17       15       40       23   3.58   82   18    393       2
## 110      53  28       12        5        2   1.75   84    8    419       3
## 111      13  42       19       21        5   2.64   81   12    353       2
## 112      33  33       20       15        0   2.16   67   24    414       3
## 113       0   0        0        0        0     NA  100    0    276       1
## 114      42  50        8        0        0   1.67   86   11    320       3
## 115      33  32       22        8        5   2.21   72   21    717       3
## 116      10  40       28       19        3   2.66   85   15    278       2
## 117      48  32       10       11        1   1.85   75   20    631       3
## 118      28  43       14       14        1   2.18   71   25    852       3
## 119       4  25       22       35       14   3.31   76   12    405       1
## 120      67  21        8        4        1   1.50   62   28   1264       3
## 121      29  35       23       11        3   2.23   68   24    629       3
## 122      10   9       10       32       39   3.81  100    0    300       1
## 123      18  36       16       25        5   2.64   81   12    361       2
## 124      49  29        9        5        8   1.96   50   36    859       3
## 125       3   6       52       27       13   3.39   73   19    321       2
## 126      19  26       24       22        8   2.73   90    7    380       2
## 127      11  19       15       52        3   3.16   81   19    277       2
## 128       1  18       34       29       19   3.48   84    8    283       1
## 129      32  36       18       11        2   2.14   67   29    307       3
## 130      43  30       17        7        3   1.97   48   42   1036       3
## 131       0 100        0        0        0   2.00   59   33    699       3
## 132      36  27       15       13        9   2.31   64   30    743       3
## 133       4  17       28       38       13   3.39   64   20    371       2
## 134      45  36       11        8        1   1.84   66   23   1059       3
## 135      52  43        2        3        0   1.56   86    7    244       3
## 136       7  17       17       31       29   3.58   80    7    388       2
## 137       7  25       28       36        4   3.05   50   35    528       2
## 138      10  26       23       26       14   3.07   81   15    328       1
## 139      31  33       14       16        5   2.30   59   26    492       3
## 140      67  21        7        4        1   1.50   53   32   1013       3
## 141      19  24       13       38        6   2.88   80   10    848       2
## 142      52  32        9        6        1   1.73   77   17    621       3
## 143      13  31       17       30        9   2.90   83   17    336       2
## 144       5  48       10       32        6   2.85   90   10    455       1
## 145      26  27       12       32        1   2.55   45   39   1149       3
## 146      43  35       12        8        2   1.92   80   12    290       3
## 147       9  21       18       29       24   3.38   61   21    726       3
## 148       1   2        7       37       54   4.41   95    0    436       1
## 149      19  19       29       29        3   2.77   63   27    403       2
## 150      11  34       19       30        7   2.87   85    9    807       1
## 151      46  25       16       12        1   1.96   56   38    637       2
## 152      14  40       22       22        3   2.60   73   27    306       2
## 153       8  31       33       25        3   2.84   73   29    715       3
## 154      12  20       19       31       18   3.22   80   15    559       2
## 155      28  43       19        7        2   2.12   61   24    514       3
## 156      79  17        3        2        0   1.28   47   53    617       3
## 157      19  36       17       25        4   2.57   61   30    570       3
## 158       3  20       22       48        7   3.37   64   30    428       2
## 159      46  29       13        7        4   1.94   77   25    833       3
## 160      98   1        1        1        0   1.06   58   28    821       3
## 161       2  48       25       15       11   2.85   81    6    259       3
## 162      23  41       13       16        7   2.43   66   28    761       3
## 163      49  35       10        6        1   1.75   67   23   1529       3
## 164       0 100        0        0        0   2.00   57   30   1127       3
## 165      32  30       21       16        2   2.26   69   27   1009       3
## 166       1   6       21       55       16   3.78   81   11    326       1
## 167      10  48       10       33        0   2.67   42   39    446       3
## 168      44  30       18        5        4   1.94   75   17    450       3
## 169      11  44       25       19        2   2.57   96   11    373       2
## 170      42  21       19       16        3   2.18   80   10    443       3
## 171       7  25       16       43       10   3.24   56   33    722       2
## 172      74  18        7        1        0   1.34   60   33   1009       3
## 173      71  19        5        4        1   1.44   65   31    943       3
## 174       0  50       25       25        0   2.75   67   24    167       2
## 175      61  23       10        4        3   1.66   62   26    768       3
## 176      41  28       17       13        2   2.06   73   21    965       3
## 177       7  29       28       25       10   3.02   95    5    239       2
## 178      10  24       35       30        2   2.90   85   19    297       3
## 179      32  32       23        6        6   2.22   52   33    587       3
## 180       3   5       13       38       41   4.08   94    6    235       1
## 181      29  37       15       17        3   2.28   76   17    651       3
## 182      42  32       17        9        0   1.93   75   21    787       3
## 183      43  41       12        3        1   1.77   67   21    259       3
## 184      22  47       25        5        1   2.13   72   17    213       3
## 185      46  31        2       16        4   2.02   81   14    372       3
## 186      18  52       14       16        0   2.29   79   21    333       3
## 187      45  36       11        8        0   1.82   67   33   1112       3
## 188       7  19       28       34       13   3.26   69   23    367       2
## 189      26  17       16       28       13   2.84   68   32    401       3
## 190      48  27       13       11        1   1.89   69   27    739       3
## 191      49  25       17        7        1   1.87   79   15    484       3
## 192      40  60        0        0        0   1.60   46   42    364       3
## 193      33   0        0       67        0   3.00   54   40   1004       3
## 194      13  24       31       25        6   2.86   82   15    464       2
## 195      37  30       14       18        1   2.16   59   39    922       3
## 196       1  21       19       43       15   3.50   93    5    615       1
## 197      67   0        0        0       33   2.33   83   12   1081       3
## 198      39  29       16       13        2   2.09   61   33    338       3
## 199      42  26       15       15        3   2.13   75   20   1007       3
## 200      58  34        5        2        1   1.53   57   36    151       2
## 201      43  30       15       10        2   1.97   73   18    245       3
## 202      89  11        0        0        0   1.11   51   37    704       3
## 203      33  36       12       17        2   2.20   90   10    236       2
## 204      39  41       12        8        1   1.90   79   24    499       2
## 205      28  56       12        3        1   1.94   50   31    703       3
## 206      41  35       15        7        1   1.92   65   26    738       3
## 207       6  12       20       46       16   3.54   70   17    665       2
## 208       4  23       25       36       12   3.29   81   22    508       2
## 209       2  39       18       34        7   3.05   94   13    355       1
## 210      37  33       14       13        2   2.09   65   29   1570       3
## 211       2   6       11       29       53   4.25   92    8    311       1
## 212       0   2       18       51       30   4.07  100    3    517       1
## 213      10  24       40       19        7   2.90  100    3    510       1
## 214       5   7       27       40       22   3.68  100    3    422       1
## 215       0   3       14       39       43   4.22   93    4    382       1
## 216       4  26       42       23        5   3.00   89   11    240       1
## 217       6  25       34       26        8   3.05  100    0    384       1
## 218       7  35       38       15        5   2.77  100    4    333       1
## 219       0   0        0        0        0     NA   91   15    419       2
## 220      55  21       16        6        2   1.79   94    6    396       3
## 221       7  18       36       27       11   3.17  100    0    187       1
## 222      20  17       33       20       10   2.83   91   13    235       2
## 223       0   4       10       38       48   4.29  100    0    523       1
## 224      30  23       26       15        6   2.46   87   13    198       2
## 225      31  44       16        8        1   2.03   76    2    602       2
## 226      34  42       21        4        0   1.94   57   22    413       2
## 227      12  57       22        8        1   2.30   64   29    494       3
## 228      19  30       24       20        8   2.67   95    0    255       2
## 229       4  25       24       31       16   3.29  100    0    179       1
## 230      31  32       31        5        1   2.13   78   11    322       2
## 231      20  31       31       12        5   2.50  100    0    191       2
## 232      11  32       19       35        4   2.88   58   25    595       3
## 233      51  24       17        5        3   1.85   69   15    643       3
## 234      30  55        9        6        0   1.92   45   31    367       2
## 235       2  15       32       34       18   3.51   93    7    196       1
## 236       0  13       17       34       36   3.92   95    5    213       1
## 237       0   0        0        0        0     NA   61   44    198       2
## 238       0   0        0        0        0     NA   44   48    333       2
## 239       1  28       23       39        8   3.25   75   25    547       1
## 240      47  32       16        3        1   1.79   53   44    684       3
## 241      19  30       28       17        6   2.61   86   14    312       2
## 242       1   2       10       40       47   4.32  100    0    696       1
## 243       0   4       16       40       40   4.16  100    7    393       1
## 244       0   3       21       41       34   4.05  100    0    462       1
## 245       0   6       24       38       31   3.94  100    2    672       1
## 246       8   9       35       29       20   3.43  100    0    541       1
## 247       0   2       16       43       38   4.17  100    0    606       1
## 248      10  68       18        5        0   2.17  100    4    332       2
## 249      46  44        8        3        0   1.69  100    0    334       3
## 250       6  64       23        6        0   2.32  100    0    495       2
## 251       2  90        7        1        0   2.07  100    0    319       2
## 252      28  42       24        6        0   2.08  100    0    351       2
## 253       5  65       25        4        1   2.30  100    0    318       2
## 254      25  31       28       14        3   2.40   94   16    384       2
## 255      15  18       38       17       13   2.96   92   11    554       1
## 256      24  20       28       10       17   2.76   87   23    573       2
## 257      24  26       36       10        4   2.44   84   16    243       2
## 258      21  27       33       12        7   2.57   91   15    555       2
## 259       3  12       41       27       17   3.41   89   16    617       1
## 260      39  29       22        6        3   2.06   87   23    541       2
## 261      14  33       42        7        4   2.54   89   17    515       2
## 262       7  19       38       18       19   3.24   96   11    490       1
## 263       0  46       25       23        5   2.87   94   12    207       1
## 264      22  26       30       20        3   2.57   96   15    657       2
## 265      31  29       22       17        1   2.28  100   17    645       3
## 266      37  31       15       16        2   2.14  100    6    434       3
## 267       4  15       46       23       12   3.23  100   11    309       2
## 268       0  85       11        2        2   2.22  100   13    222       3
## 269       0  22       33       36        9   3.33  100   11    401       2
## 270      10  26       21       30       13   3.09  100    0    329       1
## 271      28  25       28       15        4   2.40  100    5    230       3
## 272      75  15        0        0       10   1.55  100   18    170       3
## 273      19  41       30        8        3   2.36   96    4    358       3
## 274       8  31       22       29       10   3.02  100    8    133       2
## 275       0   7       17       35       42   4.12  100    0    406       1
## 276      16  36       24       22        2   2.59  100    9    292       2
## 277      15  43       21       16        5   2.54   98    3    560       1
## 278       6  25       27       35        9   3.16   97    6    410       1
## 279       4  31       25       36        4   3.04   97    0    586       1
## 280      16  27       23       29        6   2.82   95    7    539       1
## 281       8  20       24       39        9   3.23  100    8    203       1
## 282      31  25       16       23        5   2.47   90   10    519       2
## 283       3  23       24       35       16   3.39   97    3    510       1
## 284      53  27       15        4        1   1.71   73   29    590       3
## 285      30  35       22        9        4   2.23   85   18    522       3
## 286      19  33       22       19        7   2.62   98    0    583       2
## 287      55  31       12        2        0   1.61   87   15    645       3
## 288      28  24       34       11        3   2.38  100    5    277       2
## 289      25  28       30       12        4   2.42   85   13    607       2
## 290      23  32       30       10        4   2.40   86   14    439       2
## 291      60  23       14        3        1   1.61   76   18    583       3
## 292      48  31       17        3        0   1.77   81   19    470       3
## 293      61  18       15        4        1   1.68   79    4    210       3
## 294       7  22       29       33        9   3.15   98    0    548       1
## 295      10  25       36       20        9   2.94  100    0    135       2
## 296      19  13        8       26       33   3.41  100    0    231       1
## 297       7  18       36       20       19   3.25   96    4    295       2
## 298      12  27       27       29        6   2.90  100    2    645       2
## 299      21  26       32       16        7   2.62  100    0    438       2
## 300       3   5       11       27       54   4.24  100    0    469       1
## 301      12  26       25       32        5   2.92  100    2    575       1
## 302      44  25       15       13        3   2.07   90    0    512       3
## 303       1  55       31       10        4   2.61   98    2    603       2
## 304      10  16       41       16       18   3.16  100    0    130       2
## 305      55  31       10        3        1   1.65   85    3    726       3
## 306       6  17       36       30       10   3.21   98    4    525       2
## 307       4  17       27       29       23   3.51  100    0    344       1
## 308      45  31       18        5        1   1.88   97    3    400       3
## 309      11  20       39       21        9   2.98   95    5    253       2
## 310      26  31       25       14        4   2.38   82    7    354       2
## 311      45  29       12        8        5   1.99   97    0    350       3
## 312      10  21       33       28        9   3.05  100    0    612       2
## 313       2  27       44       23        4   3.00  100    0    543       2
## 314       1  10        9       52       28   3.98   97    3    556       1
## 315       5  18       29       26       21   3.40  100    0    378       1
## 316       3  14       19       33       31   3.74  100    0    256       1
## 317      70  19        8        3        1   1.46   87    7    410       3
## 318      53  35        8        4        0   1.65   90    3    763       3
## 319      24  29       28       14        6   2.49   92    8    159       2
## 320      47  28       14        8        4   1.94   81    4    459       3
## 321       5  21       41       22       11   3.13  100    2    353       1
## 322      12  33       24       22        9   2.84   41   59    216       2
## 323       9  22       36       24        8   3.00  100    0    266       2
## 324       9  26       34       20       12   3.02   82   14    267       2
## 325      19  53       17       11        1   2.21   81    0    197       2
## 326       2   7       17       37       37   3.99   76   10    237       1
## 327       8  33       23       32        4   2.93   81   11    329       2
## 328      22  39       24       15        0   2.33   69   21    335       3
## 329      31  38       17       13        1   2.14   83   13    206       2
## 330       9  74       11        4        1   2.14   94    0    187       2
## 331       7  14       17       38       24   3.60   95    5    280       1
## 332       1   7       12       35       45   4.15   87   10    448       1
## 333       9  36       35       16        4   2.68   94    6    229       2
## 334      31  38       15       13        3   2.17   78    6    215       2
## 335      47  37        7        6        3   1.80   90    0    189       3
## 336       0   6       14       43       36   4.08   77   17    404       1
## 337       4   9       14       45       28   3.84   80    0    222       1
## 338       0  36       29       29        7   3.07   63   26    240       2
## 339       0   3       10       46       40   4.23   78    5    602       1
## 340       2   5       13       54       27   3.99   96    0    387       1
## 341       5  17       28       38       11   3.33   96    4    350       1
## 342      41  30       18        9        2   2.02  100    0    327       2
## 343      32  24       24       14        5   2.37   92    0    257       2
## 344       6  15       33       27       18   3.36   95    5    281       1
## 345      16  18       28       17       21   3.09   71    8    344       1
## 346      28  44       14       13        1   2.14   86    9    310       2
## 347       1  17       38       36        8   3.32  100    0    337       1
## 348       1  24       33       30       11   3.26   95    5    317       1
## 349       8  30       34       24        4   2.87   95    5    297       2
## 350      30  31       25       13        1   2.26   85   10    269       3
## 351      16  28       39       12        4   2.60   86   10    290       2
## 352       0   9       24       39       28   3.85  100    0    355       1
## 353       7  34       37       18        4   2.77  100    0    339       2
## 354       1  14       28       35       22   3.64   96    4    394       1
## 355      11  24       33       22       11   2.98   80   15    300       1
## 356      10  35       25       23        8   2.85  100    0    146       1
## 357      17  38       35        9        2   2.40   75   10    321       2
## 358       6  28       38       21        7   2.95   92    8    367       2
## 359      10  26       33       27        4   2.90   90    7    457       2
## 360       1   6       27       48       18   3.75   88    8    318       1
## 361       3  11       28       29       30   3.71   95    0    399       1
## 362      33  26       25       13        3   2.27   79   15    410       3
## 363       1  15       25       38       21   3.62   95    5    312       1
## 364       7  59       29        6        0   2.34   95    5    222       2
## 365      43  37       16        3        1   1.80   83   17    602       3
## 366      54  28       14        3        1   1.70   75   24    779       3
## 367      83   8        8        0        0   1.25   71   29    590       3
## 368      46  33       15        4        3   1.86   83   17    611       3
## 369      10  24       43       14        9   2.88   91    9    695       1
## 370      22  26       32       10       10   2.60   88   12    688       2
## 371      58  30       10        1        0   1.56   70   30    593       3
## 372      50  50        0        0        0   1.50   96    0    368       1
## 373       0   0        0        0        0     NA  100    0    451       1
## 374      50   0        0       50        0   2.50  100    0    657       1
## 375       0   0        0        0        0     NA   96    0    374       1
## 376       0   0        0        0        0     NA  100    0    397       1
## 377       0   0        0      100        0   4.00   97    3    392       1
## 378      31  23       29       14        4   2.37   80   13    352       3
## 379      75   9       10        5        1   1.47   83   12    535       3
## 380      33  17       20       20       10   2.57   83   17    657       3
## 381      22  31       29       16        2   2.45   92    8    532       2
## 382      41  26       22        8        2   2.03   80    9    612       3
## 383      18  44       25       13        1   2.36   93    5    559       2
## 384      40  20       13       28        0   2.28   80   17    363       3
## 385      50   0        0       50        0   2.50   80    8    300       3
## 386      27  39       15       18        1   2.27   82   12    331       3
## 387      32  26       25       13        4   2.31   72   21    704       3
## 388      13  25       40       17        4   2.73   74   15    466       3
## 389      21  35       21       19        4   2.50  100    0    364       2
## 390       8  42       24       20        7   2.75  100    0    360       1
## 391      36  37       16        9        2   2.03   96    4    657       2
## 392      52  36        6        6        0   1.66  100    3    391       3
## 393      40  45       11        4        0   1.79   88   12    218       3
## 394      27  46       16        9        1   2.12   80    6    400       3
## 395       7  38       22       23       10   2.89   96    8    317       1
## 396       5  29       23       37        7   3.12   95    5    266       1
## 397       8  34       26       26        6   2.88   85    7    461       1
## 398       3  27       21       37       12   3.29   91    5    360       1
## 399      10  75       13        1        0   2.06   93    7    301       3
## 400      31  48       11        9        1   2.01   83    6    269       3
##     collcat abv_hsg lgenroll
## 1         1     100 2.392697
## 2         1     100 2.665581
## 3         1     100 2.596597
## 4         1      64 2.621176
## 5         1      50 2.716003
## 6         2      99 2.535294
## 7         2      99 2.481443
## 8         2     100 3.179839
## 9         2      98 2.819544
## 10        3      92 2.558709
## 11        2     100 2.885361
## 12        1      39 2.606381
## 13        2      99 2.767898
## 14        3      95 2.801404
## 15        3      95 2.578639
## 16        2      90 2.620136
## 17        2      99 2.826075
## 18        3      86 2.770115
## 19        3      98 2.886491
## 20        3      88 2.841359
## 21        3      81 2.786041
## 22        3      89 2.773786
## 23        3      85 2.751279
## 24        2      68 2.809560
## 25        1     100 2.617000
## 26        1     100 2.800029
## 27        1     100 2.535294
## 28        1     100 2.606381
## 29        1     100 2.702431
## 30        1     100 2.709270
## 31        1     100 2.804139
## 32        3      50 2.773786
## 33        3      78 2.698970
## 34        1     100 2.505150
## 35        3      87 2.489958
## 36        1     100 2.873902
## 37        1       0 2.721811
## 38        2      94 2.667453
## 39        3      91 2.654177
## 40        3      96 2.635484
## 41        2      60 2.719331
## 42        3      90 2.630428
## 43        3      96 2.749736
## 44        1     100 2.484300
## 45        2      98 2.669317
## 46        1     100 2.716003
## 47        3      96 2.911690
## 48        3      99 2.919078
## 49        3      95 2.687529
## 50        3      94 2.487138
## 51        3      67 2.913284
## 52        2      68 2.871573
## 53        3      63 2.827369
## 54        3      93 2.619093
## 55        2      90 2.456366
## 56        2      99 2.688420
## 57        1      53 2.816904
## 58        1      42 2.761176
## 59        2      57 2.718502
## 60        2      73 2.827369
## 61        1      55 2.831230
## 62        1      53 2.832509
## 63        1      36 2.845098
## 64        2      39 2.597695
## 65        3      85 2.694605
## 66        1      41 2.968016
## 67        2      54 2.311754
## 68        1     100 2.755875
## 69        3     100 2.685742
## 70        2      61 2.612784
## 71        1      60 2.350248
## 72        3      81 2.485721
## 73        3      76 2.650308
## 74        2      66 2.635484
## 75        2      76 2.735599
## 76        2      79 2.737987
## 77        2     100 2.495544
## 78        1      36 2.848189
## 79        2      78 2.260071
## 80        2      95 2.220108
## 81        1      99 2.606381
## 82        3      81 2.796574
## 83        3      90 2.383815
## 84        3      99 2.426511
## 85        3      75 2.746634
## 86        3     100 2.786041
## 87        1     100 2.643453
## 88        3     100 2.580925
## 89        3     100 2.701568
## 90        1     100 2.668386
## 91        1     100 2.552668
## 92        1     100 2.588832
## 93        1      52 2.771587
## 94        1      38 2.804139
## 95        2      77 2.605305
## 96        2      68 2.908485
## 97        3      74 2.469822
## 98        2      71 2.922725
## 99        2      96 2.825426
## 100       1      59 2.793092
## 101       3      84 2.669317
## 102       3      81 2.418301
## 103       1      56 2.900367
## 104       3      93 2.875061
## 105       1      45 2.770852
## 106       1      67 3.012415
## 107       2      76 2.949878
## 108       3     100 2.454845
## 109       2      95 2.594393
## 110       1      47 2.622214
## 111       2      87 2.547775
## 112       2      67 2.617000
## 113       1     100 2.440909
## 114       1      58 2.505150
## 115       2      67 2.855519
## 116       3      90 2.444045
## 117       1      52 2.800029
## 118       1      72 2.930440
## 119       2      96 2.607455
## 120       1      33 3.101747
## 121       2      71 2.798651
## 122       1      90 2.477121
## 123       2      82 2.557507
## 124       1      51 2.933993
## 125       3      97 2.506505
## 126       2      81 2.579784
## 127       2      89 2.442480
## 128       3      99 2.451786
## 129       2      68 2.487138
## 130       2      57 3.015360
## 131       1     100 2.844477
## 132       2      64 2.870989
## 133       3      96 2.569374
## 134       1      55 3.024896
## 135       1      48 2.387390
## 136       2      93 2.588832
## 137       3      93 2.722634
## 138       2      90 2.515874
## 139       1      69 2.691965
## 140       1      33 3.005609
## 141       1      81 2.928396
## 142       1      48 2.793092
## 143       2      87 2.526339
## 144       1      95 2.658011
## 145       1      74 3.060320
## 146       1      57 2.462398
## 147       2      91 2.860937
## 148       1      99 2.639486
## 149       3      81 2.605305
## 150       2      89 2.906874
## 151       2      54 2.804139
## 152       2      86 2.485721
## 153       3      92 2.854306
## 154       2      88 2.747412
## 155       2      72 2.710963
## 156       1      21 2.790285
## 157       2      81 2.755875
## 158       2      97 2.631444
## 159       1      54 2.920645
## 160       1       2 2.914343
## 161       3      98 2.413300
## 162       1      77 2.881385
## 163       1      51 3.184407
## 164       1     100 3.051924
## 165       2      68 3.003891
## 166       2      99 2.513218
## 167       1      90 2.649335
## 168       2      56 2.653213
## 169       3      89 2.571709
## 170       2      58 2.646404
## 171       2      93 2.858537
## 172       1      26 3.003891
## 173       1      29 2.974512
## 174       3     100 2.222716
## 175       1      39 2.885361
## 176       2      59 2.984527
## 177       3      93 2.378398
## 178       3      90 2.472756
## 179       2      68 2.768638
## 180       1      97 2.371068
## 181       2      71 2.813581
## 182       2      58 2.895975
## 183       1      57 2.413300
## 184       3      78 2.328380
## 185       1      54 2.570543
## 186       1      82 2.522444
## 187       1      55 3.046105
## 188       3      93 2.564666
## 189       2      74 2.603144
## 190       1      52 2.868644
## 191       2      51 2.684845
## 192       1      60 2.561101
## 193       1      67 3.001734
## 194       3      87 2.666518
## 195       1      63 2.964731
## 196       2      99 2.788875
## 197       1      33 3.033826
## 198       2      61 2.528917
## 199       2      58 3.003029
## 200       1      42 2.178977
## 201       2      57 2.389166
## 202       1      11 2.847573
## 203       1      67 2.372912
## 204       1      61 2.698101
## 205       1      72 2.846955
## 206       2      59 2.868056
## 207       2      94 2.822822
## 208       3      96 2.705864
## 209       2      98 2.550228
## 210       1      63 3.195900
## 211       1      98 2.492760
## 212       2     100 2.713491
## 213       3      90 2.707570
## 214       3      95 2.625312
## 215       1     100 2.582063
## 216       3      96 2.380211
## 217       3      94 2.584331
## 218       3      93 2.522444
## 219       1     100 2.622214
## 220       2      45 2.597695
## 221       3      93 2.271842
## 222       3      80 2.371068
## 223       1     100 2.718502
## 224       3      70 2.296665
## 225       2      69 2.779596
## 226       2      66 2.615950
## 227       2      88 2.693727
## 228       2      81 2.406540
## 229       2      96 2.252853
## 230       3      69 2.507856
## 231       3      80 2.281033
## 232       2      89 2.774517
## 233       2      49 2.808211
## 234       1      70 2.564666
## 235       3      98 2.292256
## 236       2     100 2.328380
## 237       1     100 2.296665
## 238       1     100 2.522444
## 239       2      99 2.737987
## 240       2      53 2.835056
## 241       3      81 2.494155
## 242       1      99 2.842609
## 243       2     100 2.594393
## 244       2     100 2.664642
## 245       2     100 2.827369
## 246       3      92 2.733197
## 247       2     100 2.782473
## 248       2      90 2.521138
## 249       1      54 2.523746
## 250       2      94 2.694605
## 251       1      98 2.503791
## 252       2      72 2.545307
## 253       3      95 2.502427
## 254       3      75 2.584331
## 255       3      85 2.743510
## 256       3      76 2.758155
## 257       3      76 2.385606
## 258       3      79 2.744293
## 259       3      97 2.790285
## 260       2      61 2.733197
## 261       3      86 2.711807
## 262       3      93 2.690196
## 263       3     100 2.315970
## 264       3      78 2.817565
## 265       2      69 2.809560
## 266       2      63 2.637490
## 267       3      96 2.489958
## 268       1     100 2.346353
## 269       3     100 2.603144
## 270       2      90 2.517196
## 271       3      72 2.361728
## 272       1      25 2.230449
## 273       3      81 2.553883
## 274       2      92 2.123852
## 275       2     100 2.608526
## 276       2      84 2.465383
## 277       2      85 2.748188
## 278       3      94 2.612784
## 279       3      96 2.767898
## 280       2      84 2.731589
## 281       2      92 2.307496
## 282       2      69 2.715167
## 283       2      97 2.707570
## 284       2      47 2.770852
## 285       2      70 2.717671
## 286       2      81 2.765669
## 287       1      45 2.809560
## 288       3      72 2.442480
## 289       3      75 2.783189
## 290       3      77 2.642465
## 291       1      40 2.765669
## 292       2      52 2.672098
## 293       2      39 2.322219
## 294       3      93 2.738781
## 295       3      90 2.130334
## 296       1      81 2.363612
## 297       3      93 2.469822
## 298       3      88 2.809560
## 299       3      79 2.641474
## 300       1      97 2.671173
## 301       3      88 2.759668
## 302       2      56 2.709270
## 303       3      99 2.780317
## 304       3      90 2.113943
## 305       1      45 2.860937
## 306       3      94 2.720159
## 307       3      96 2.536558
## 308       2      55 2.602060
## 309       3      89 2.403121
## 310       3      74 2.549003
## 311       1      55 2.544068
## 312       3      90 2.786751
## 313       3      98 2.734800
## 314       1      99 2.745075
## 315       3      95 2.577492
## 316       2      97 2.408240
## 317       1      30 2.612784
## 318       1      47 2.882525
## 319       3      76 2.201397
## 320       1      53 2.661813
## 321       3      95 2.547775
## 322       2      88 2.334454
## 323       3      91 2.424882
## 324       3      91 2.426511
## 325       2      81 2.294466
## 326       2      98 2.374748
## 327       2      92 2.517196
## 328       2      78 2.525045
## 329       2      69 2.313867
## 330       1      91 2.271842
## 331       2      93 2.447158
## 332       1      99 2.651278
## 333       3      91 2.359835
## 334       2      69 2.332438
## 335       1      53 2.276462
## 336       1     100 2.606381
## 337       1      96 2.346353
## 338       3     100 2.380211
## 339       1     100 2.779596
## 340       1      98 2.587711
## 341       3      95 2.544068
## 342       2      59 2.514548
## 343       2      68 2.409933
## 344       3      94 2.448706
## 345       3      84 2.536558
## 346       1      72 2.491362
## 347       3      99 2.527630
## 348       3      99 2.501059
## 349       3      92 2.472756
## 350       3      70 2.429752
## 351       3      84 2.462398
## 352       2     100 2.550228
## 353       3      93 2.530200
## 354       3      99 2.595496
## 355       3      89 2.477121
## 356       3      90 2.164353
## 357       3      83 2.506505
## 358       3      94 2.564666
## 359       3      90 2.659916
## 360       3      99 2.502427
## 361       3      97 2.600973
## 362       3      67 2.612784
## 363       3      99 2.494155
## 364       3      93 2.346353
## 365       2      57 2.779596
## 366       1      46 2.891537
## 367       1      17 2.770852
## 368       2      54 2.786041
## 369       3      90 2.841985
## 370       3      78 2.837588
## 371       1      42 2.773055
## 372       1      50 2.565848
## 373       1     100 2.654177
## 374       1      50 2.817565
## 375       1     100 2.572872
## 376       1     100 2.598791
## 377       1     100 2.593286
## 378       3      69 2.546543
## 379       1      25 2.728354
## 380       2      67 2.817565
## 381       3      78 2.725912
## 382       2      59 2.786751
## 383       3      82 2.747412
## 384       1      60 2.559907
## 385       1      50 2.477121
## 386       2      73 2.519828
## 387       3      68 2.847573
## 388       3      87 2.668386
## 389       2      79 2.561101
## 390       2      92 2.556303
## 391       2      64 2.817565
## 392       1      48 2.592177
## 393       1      60 2.338456
## 394       2      73 2.602060
## 395       2      93 2.501059
## 396       2      95 2.424882
## 397       3      92 2.663701
## 398       2      97 2.556303
## 399       1      90 2.478566
## 400       1      69 2.429752

Exploring the data

Descriptive analysis will help us to understand our data better and investigate if there are any problems in the data. In this workshop we skip this part, but in a real study descriptive analysis is an important part of a good data analysis.

In descriptive analysis we usually compute measures of central tendency (such as the mean and median) and measures of variability (such as the variance and standard deviation).
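
As a minimal sketch of these measures, the snippet below computes them for ten api00 values copied from the str(d) listing. The vector name scores and the choice of these ten values are assumptions made only for illustration; in the workshop data one would apply the same functions to d$api00.

```r
# Ten api00 values copied from the str(d) listing (illustration only)
scores <- c(693, 570, 546, 571, 478, 858, 918, 831, 860, 737)

# Measures of central tendency
center <- c(mean = mean(scores), median = median(scores))

# Measures of variability
spread <- c(variance = var(scores), sd = sd(scores), iqr = IQR(scores))

print(center)
print(spread)
```

The standard deviation is the square root of the variance, so it is expressed in the same units as the scores themselves.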

str(d)
## 'data.frame':    400 obs. of  24 variables:
##  $ snum    : int  906 889 887 876 888 4284 4271 2910 2899 2887 ...
##  $ dnum    : int  41 41 41 41 41 98 98 108 108 108 ...
##  $ api00   : int  693 570 546 571 478 858 918 831 860 737 ...
##  $ api99   : int  600 501 472 487 425 844 864 791 838 703 ...
##  $ growth  : int  93 69 74 84 53 14 54 40 22 34 ...
##  $ meals   : int  67 92 97 90 89 10 5 2 5 29 ...
##  $ ell     : int  9 21 29 27 30 3 2 3 6 15 ...
##  $ yr_rnd  : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ mobility: int  11 33 36 27 44 10 16 44 10 17 ...
##  $ acs_k3  : int  16 15 17 20 18 20 19 20 20 21 ...
##  $ acs_46  : int  22 32 25 30 31 33 28 31 30 29 ...
##  $ not_hsg : int  0 0 0 36 50 1 1 0 2 8 ...
##  $ hsg     : int  0 0 0 45 50 8 4 4 9 25 ...
##  $ some_col: int  0 0 0 9 0 24 18 16 15 34 ...
##  $ col_grad: int  0 0 0 9 0 36 34 50 42 27 ...
##  $ grad_sch: int  0 0 0 0 0 31 43 30 33 7 ...
##  $ avg_ed  : num  NA NA NA 1.91 1.5 ...
##  $ full    : int  76 79 68 87 87 100 100 96 100 96 ...
##  $ emer    : int  24 19 29 11 13 0 0 2 0 7 ...
##  $ enroll  : int  247 463 395 418 520 343 303 1513 660 362 ...
##  $ mealcat : int  2 3 3 3 3 1 1 1 1 1 ...
##  $ collcat : int  1 1 1 1 1 2 2 2 2 3 ...
##  $ abv_hsg : int  100 100 100 64 50 99 99 100 98 92 ...
##  $ lgenroll: num  2.39 2.67 2.6 2.62 2.72 ...
summary(d)
##       snum           dnum           api00           api99      
##  Min.   :  58   Min.   : 41.0   Min.   :369.0   Min.   :333.0  
##  1st Qu.:1720   1st Qu.:395.0   1st Qu.:523.8   1st Qu.:484.8  
##  Median :3008   Median :401.0   Median :643.0   Median :602.0  
##  Mean   :2867   Mean   :457.7   Mean   :647.6   Mean   :610.2  
##  3rd Qu.:4198   3rd Qu.:630.0   3rd Qu.:762.2   3rd Qu.:731.2  
##  Max.   :6072   Max.   :796.0   Max.   :940.0   Max.   :917.0  
##                                                                
##      growth           meals             ell            yr_rnd    
##  Min.   :-69.00   Min.   :  0.00   Min.   : 0.00   Min.   :0.00  
##  1st Qu.: 19.00   1st Qu.: 31.00   1st Qu.: 9.75   1st Qu.:0.00  
##  Median : 36.00   Median : 67.50   Median :25.00   Median :0.00  
##  Mean   : 37.41   Mean   : 60.31   Mean   :31.45   Mean   :0.23  
##  3rd Qu.: 53.25   3rd Qu.: 90.00   3rd Qu.:50.25   3rd Qu.:0.00  
##  Max.   :134.00   Max.   :100.00   Max.   :91.00   Max.   :1.00  
##                                                                  
##     mobility         acs_k3          acs_46         not_hsg      
##  Min.   : 2.00   Min.   :14.00   Min.   :20.00   Min.   :  0.00  
##  1st Qu.:13.00   1st Qu.:18.00   1st Qu.:27.00   1st Qu.:  4.00  
##  Median :17.00   Median :19.00   Median :29.00   Median : 14.00  
##  Mean   :18.25   Mean   :19.16   Mean   :29.69   Mean   : 21.25  
##  3rd Qu.:22.00   3rd Qu.:20.00   3rd Qu.:31.00   3rd Qu.: 34.00  
##  Max.   :47.00   Max.   :25.00   Max.   :50.00   Max.   :100.00  
##  NA's   :1       NA's   :2       NA's   :3                       
##       hsg            some_col        col_grad        grad_sch     
##  Min.   :  0.00   Min.   : 0.00   Min.   :  0.0   Min.   : 0.000  
##  1st Qu.: 17.00   1st Qu.:12.00   1st Qu.:  7.0   1st Qu.: 1.000  
##  Median : 26.00   Median :19.00   Median : 16.0   Median : 4.000  
##  Mean   : 26.02   Mean   :19.71   Mean   : 19.7   Mean   : 8.637  
##  3rd Qu.: 34.00   3rd Qu.:28.00   3rd Qu.: 30.0   3rd Qu.:10.000  
##  Max.   :100.00   Max.   :67.00   Max.   :100.0   Max.   :67.000  
##                                                                   
##      avg_ed           full             emer           enroll      
##  Min.   :1.000   Min.   : 37.00   Min.   : 0.00   Min.   : 130.0  
##  1st Qu.:2.070   1st Qu.: 76.00   1st Qu.: 3.00   1st Qu.: 320.0  
##  Median :2.600   Median : 88.00   Median :10.00   Median : 435.0  
##  Mean   :2.668   Mean   : 84.55   Mean   :12.66   Mean   : 483.5  
##  3rd Qu.:3.220   3rd Qu.: 97.00   3rd Qu.:19.00   3rd Qu.: 608.0  
##  Max.   :4.620   Max.   :100.00   Max.   :59.00   Max.   :1570.0  
##  NA's   :19                                                       
##     mealcat         collcat        abv_hsg          lgenroll    
##  Min.   :1.000   Min.   :1.00   Min.   :  0.00   Min.   :2.114  
##  1st Qu.:1.000   1st Qu.:1.00   1st Qu.: 66.00   1st Qu.:2.505  
##  Median :2.000   Median :2.00   Median : 86.00   Median :2.638  
##  Mean   :2.015   Mean   :2.02   Mean   : 78.75   Mean   :2.640  
##  3rd Qu.:3.000   3rd Qu.:3.00   3rd Qu.: 96.00   3rd Qu.:2.784  
##  Max.   :3.000   Max.   :3.00   Max.   :100.00   Max.   :3.196  
## 

lm function

The lm() function is used to fit linear models. It can carry out regression, analysis of variance, and analysis of covariance.

Our first linear model

In our data example we are interested in studying the relationship between student academic performance and characteristics of the school. For example, we could use variable api00, a school-wide measure of academic performance, as the outcome, and variable enroll, the number of students in the school, as the predictor.

For our first model, we use api00 as the outcome and ell, the English language learners measure for the school, as the predictor.

Fitting the model and checking that the result is an object of class lm:

m1 <- lm(api00 ~ ell, data = d)
class(m1)
## [1] "lm"
print(m1)
## 
## Call:
## lm(formula = api00 ~ ell, data = d)
## 
## Coefficients:
## (Intercept)          ell  
##     785.890       -4.396

The estimated linear function is: \[ \widehat{api00} = 785.890 - 4.396 \times ell \] The coefficient for \(ell\) is about -4.4; hence, for every one-unit increase in ell, we expect api00 to decrease by about 4.4 points on average. For example, a school with ell equal to 30 is expected, on average, to have an api00 score about 44 points lower than a school with ell equal to 20.
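
The arithmetic behind this interpretation can be sketched as follows. The coefficients are copied from the printed output of m1; the helper name predict_api00 and the ell values 20 and 30 are assumptions chosen for illustration (both lie inside the observed range of ell).

```r
# Coefficients copied from the printed output of m1
b0 <- 785.890   # intercept
b1 <- -4.396    # slope for ell

# Predicted mean api00 for a given value of ell (hypothetical helper)
predict_api00 <- function(ell) b0 + b1 * ell

p20 <- predict_api00(20)   # about 697.97
p30 <- predict_api00(30)   # about 654.01

# A 10-unit increase in ell changes the prediction by 10 * b1
p30 - p20                  # about -43.96
```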

summary(m1)
## 
## Call:
## lm(formula = api00 ~ ell, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -262.741  -63.605    1.443   68.242  212.310 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  785.890      7.370  106.64   <2e-16 ***
## ell           -4.396      0.184  -23.89   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 91.28 on 398 degrees of freedom
## Multiple R-squared:  0.5893, Adjusted R-squared:  0.5882 
## F-statistic:   571 on 1 and 398 DF,  p-value: < 2.2e-16

We can see in the summary that both the intercept and the slope are statistically significant.

The multiple R-squared of the model is 0.5893 and the adjusted R-squared, which is adjusted for the number of predictors, is 0.5882.
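
As a check on the adjusted value, the usual adjustment formula can be applied by hand, with \(n = 400\) observations and \(p = 1\) predictor (the symbols \(n\) and \(p\) are notation introduced here):

\[ R^2_{adj} = 1 - (1 - R^2)\frac{n-1}{n-p-1} = 1 - (1 - 0.5893)\frac{399}{398} \approx 0.588 \]

This agrees with the adjusted R-squared in the summary output, up to the rounding of \(R^2\).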

In the simple linear regression model \(R^2\) is equal to the square of the correlation between the response and predictor variables. We can run the function cor() to confirm this.

r<-cor(d$ell,d$api00)
r^2
## [1] 0.5892615
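
The same identity can be verified on any data set. The sketch below uses simulated data (the variables x and y and their generating values are hypothetical, not the workshop data) and compares the R-squared stored in the model summary with the squared correlation.

```r
set.seed(1)
x <- runif(100, 0, 90)                    # hypothetical predictor
y <- 800 - 4 * x + rnorm(100, sd = 90)    # hypothetical outcome

fit <- lm(y ~ x)

r2_model <- summary(fit)$r.squared   # R-squared reported by the model
r2_cor   <- cor(x, y)^2              # squared Pearson correlation

all.equal(r2_model, r2_cor)
```

This equality holds only for simple linear regression; with several predictors, R-squared is the squared correlation between the observed and fitted values instead.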

The last line of the summary gives the overall F-statistic, which tests the fit of the current model against the null model, the model with only an intercept. Here F = 571 with 1 and 398 degrees of freedom, and the p-value is smaller than 2.2e-16. In a simple regression model this p-value is exactly equal to the p-value of the slope. We can check the ANOVA table by using the function anova().

anova(m1)
## Analysis of Variance Table
## 
## Response: api00
##            Df  Sum Sq Mean Sq F value    Pr(>F)    
## ell         1 4757504 4757504  570.99 < 2.2e-16 ***
## Residuals 398 3316168    8332                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
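
The numbers in this table can be tied back to the summary output by hand. The F value is the ratio of the two mean squares, and in a simple regression it equals the square of the t statistic of the slope (the small discrepancy below comes from rounding in the printed values):

\[ F = \frac{MS_{ell}}{MS_{Residuals}} = \frac{4757504}{8332} \approx 571.0, \qquad t^2 = (-23.89)^2 \approx 570.7 \]

The same table also reproduces the multiple R-squared:

\[ R^2 = \frac{4757504}{4757504 + 3316168} \approx 0.5893 \]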

As we said, the output of the lm() function is an object of class lm. We can use the functions ls() or str() to list the components of object m1.

ls(m1)
##  [1] "assign"        "call"          "coefficients"  "df.residual"  
##  [5] "effects"       "fitted.values" "model"         "qr"           
##  [9] "rank"          "residuals"     "terms"         "xlevels"
m1$coefficients
## (Intercept)         ell 
##  785.890265   -4.396082

The component fitted.values is a vector of fitted values; here are the first ten:

m1$fitted.values[1:10]
##        1        2        3        4        5        6        7        8 
## 746.3255 693.5725 658.4039 667.1961 654.0078 772.7020 777.0981 772.7020 
##        9       10 
## 759.5138 719.9490

Storing an extracted component in a new object:

residuals <- m1$residuals
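
Besides the $ operator, base R provides extractor functions for the same components. The sketch below, on simulated data (x, y, and fit are hypothetical names), checks that coef(), fitted(), and resid() return the same values as the stored components, and that each residual is the observed value minus the fitted value.

```r
set.seed(2)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50)   # hypothetical data
fit <- lm(y ~ x)

# Extractor functions versus direct component access
same_coef <- isTRUE(all.equal(coef(fit),   fit$coefficients))
same_fit  <- isTRUE(all.equal(fitted(fit), fit$fitted.values))
same_res  <- isTRUE(all.equal(resid(fit),  fit$residuals))

# residual = observed - fitted
decomp_ok <- isTRUE(all.equal(unname(y - fitted(fit)), unname(resid(fit))))

c(same_coef, same_fit, same_res, decomp_ok)
```

The extractor functions are usually preferred because they keep working if the internal layout of the object differs between model classes.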

Getting confidence intervals for the coefficients of the model:

confint(m1)
##                  2.5 %     97.5 %
## (Intercept) 771.401848 800.378681
## ell          -4.757761  -4.034403
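
The interval for the slope can be reproduced approximately by hand as estimate plus or minus a t critical value times the standard error. The inputs below are the rounded values printed by summary(m1), so the result matches confint(m1) only up to rounding; the variable names are assumptions for illustration.

```r
est      <- -4.396   # slope estimate from summary(m1)
se       <- 0.184    # its standard error
df_resid <- 398      # residual degrees of freedom

# Two-sided 95% critical value of the t distribution
t_crit <- qt(0.975, df = df_resid)

ci <- est + c(-1, 1) * t_crit * se
round(ci, 2)
```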

Using anova() to extract the sums of squares of the model:

anova(m1)
## Analysis of Variance Table
## 
## Response: api00
##            Df  Sum Sq Mean Sq F value    Pr(>F)    
## ell         1 4757504 4757504  570.99 < 2.2e-16 ***
## Residuals 398 3316168    8332                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

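The F-test in the ANOVA table indicates that the relationship between api00 and ell is statistically significant.

Another important function is predict(), which can be used to get predicted values for new data points. Note that the ell values used below (500, 600, and 700) lie far outside the observed range of ell (0 to 91), so the resulting negative predictions are extrapolations and should not be read as plausible api00 scores.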

new.data <- data.frame(ell = c(500, 600, 700))
predict(m1, newdata = new.data)
##         1         2         3 
## -1412.151 -1851.759 -2291.367
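
predict() can also return interval estimates, and it is usually safer to predict only inside the observed range of the predictor. The sketch below illustrates both points on simulated data; the data frame sim, its generating values, and the new points 10, 40, and 80 are assumptions, not the workshop data.

```r
set.seed(3)
sim <- data.frame(ell = runif(200, 0, 90))                 # hypothetical predictor
sim$api00 <- 786 - 4.4 * sim$ell + rnorm(200, sd = 90)     # hypothetical outcome
fit <- lm(api00 ~ ell, data = sim)

new_pts <- data.frame(ell = c(10, 40, 80))

# Check that the new points fall inside the observed range of ell
in_range <- new_pts$ell >= min(sim$ell) & new_pts$ell <= max(sim$ell)

# Predicted means with 95% confidence intervals (columns fit, lwr, upr)
pred <- predict(fit, newdata = new_pts, interval = "confidence", level = 0.95)
pred
```

Using interval = "prediction" instead would give wider intervals for an individual new observation rather than for the mean.
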
library(sjPlot)
## Warning: package 'sjPlot' was built under R version 4.3.2
## Learn more about sjPlot with 'browseVignettes("sjPlot")'.
tab_model(m1)
                 api00
Predictors       Estimates   CI                p
(Intercept)      785.89      771.40 – 800.38   <0.001
ell              -4.40       -4.76 – -4.03     <0.001
Observations     400
R2 / R2 adjusted 0.589 / 0.588

Plotting a scatter plot and the regression line

plot(api00 ~ ell, data = d)
abline(m1, col = "blue")   

No obvious outliers appear in the figure. To identify individual schools, we can label each point with its school number (snum):

plot(api00 ~ ell, data = d) 
text(d$ell, d$api00+20, labels = d$snum, cex = .7)
abline(m1, col = "blue")

 library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.3.2
 ggplot(d, aes(x = ell, y = api00)) + 
   geom_point() +
   stat_smooth(method = "lm", col = "red")
## `geom_smooth()` using formula = 'y ~ x'

Exercise 1

In the data set elemapi2v2 the variable full is the percentage of teachers with full credentials for each school. Run the regression model of api00 on full. Use this model to predict the mean of api00 for full equal to 25%, 50%, 75%, and 90%.

Is the predicted value for full = 25% valid?

Factor and dummy variable

In regression analysis, it is possible to include categorical predictors.

R uses the object class factor to store and manipulate categorical variables. The lm function automatically treats a variable of type character as a factor, but it is always safer to change the variable’s class to factor before the analysis. The function factor() is used to encode a variable (a vector) as a factor.

mealcat_F <-factor(d$mealcat)
str(mealcat_F)
##  Factor w/ 3 levels "1","2","3": 2 3 3 3 3 1 1 1 1 1 ...
d$yr_rnd_F <- factor(d$yr_rnd)
levels(d$yr_rnd_F)<-c("NO","Yes")
table(d$yr_rnd_F)
## 
##  NO Yes 
## 308  92

In R, when we include a factor as a predictor in the model, the lm function by default generates dummy variables for the categories of the factor. A dummy variable is a variable that takes the values 0 and 1: it is 1 if the observation is in that category and 0 otherwise. The number of dummy variables is equal to the number of levels of the categorical variable minus 1. For example, if the categorical variable has 3 levels, lm generates 2 dummy variables. If the values of both dummy variables are 0, then the observation is in the remaining category, which is called the reference category. By default, R uses the first level of the factor as the reference category.
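
The dummy coding can be inspected directly with model.matrix(). The sketch below uses a small made-up factor (the names grp and grp2 and the levels low, mid, and high are hypothetical) and also shows how relevel() changes the reference category.

```r
# Hypothetical three-level factor; "low" is the first level
grp <- factor(c("low", "mid", "high", "mid", "low", "high"),
              levels = c("low", "mid", "high"))

# Design matrix: an intercept plus 2 dummy variables for 3 levels
X <- model.matrix(~ grp)
X

# Rows belonging to the reference level "low" have 0 in both dummy columns
ref_rows <- X[grp == "low", c("grpmid", "grphigh"), drop = FALSE]

# Make "high" the reference category instead
grp2 <- relevel(grp, ref = "high")
colnames(model.matrix(~ grp2))
```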

Regression of api00 on yr_rnd, a two-level categorical variable

m2<-lm(api00~yr_rnd_F, data=d)
summary(m2)
## 
## Call:
## lm(formula = api00 ~ yr_rnd_F, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -273.539  -95.662    0.967  103.341  297.967 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   684.54       7.14   95.88   <2e-16 ***
## yr_rnd_FYes  -160.51      14.89  -10.78   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 125.3 on 398 degrees of freedom
## Multiple R-squared:  0.226,  Adjusted R-squared:  0.2241 
## F-statistic: 116.2 on 1 and 398 DF,  p-value: < 2.2e-16
model.matrix(m2)[1:6,]
##   (Intercept) yr_rnd_FYes
## 1           1           0
## 2           1           0
## 3           1           0
## 4           1           0
## 5           1           0
## 6           1           0

Writing out the regression equation, we have

\[api00 = 684.539 - 160.5064 \times yr\_rnd\_F(Yes)\]

If a school is not a year-round school (i.e. yr_rnd_F is 0) the regression equation would simplify to

\[api00 = 684.539\]

If a school is a year-round school (i.e. the dummy variable is 1), the regression equation would simplify to \[api00 = 684.539 - 160.5064 = 524.0326\] This indicates that the mean of api00 for year-round schools is 524.0326.

In general when we have a two level categorical variable the intercept is the expected outcome for the reference group and the slope of the other category is the difference of expected outcome of that group with the reference category.
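
This relationship between the coefficients and the group means can be checked numerically. The sketch below uses simulated two-group data (the names grp and y and the generating values are hypothetical) and compares the lm coefficients with the sample means of the two groups.

```r
set.seed(4)
grp <- factor(rep(c("No", "Yes"), times = c(30, 10)))       # "No" is the reference
y   <- ifelse(grp == "Yes", 520, 680) + rnorm(40, sd = 50)  # hypothetical outcome

fit   <- lm(y ~ grp)
b     <- coef(fit)
means <- sapply(split(y, grp), mean)   # sample mean of each group

# Intercept equals the mean of the reference group
c(intercept = unname(b[1]), mean_no = unname(means["No"]))

# Slope equals the difference between the two group means
c(slope = unname(b[2]), diff = unname(means["Yes"] - means["No"]))
```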

plot(api00~yr_rnd, data=d)
abline(m2)

Multiple Regression

Adding more predictors to a simple regression model

Now, let’s look at an example of multiple regression, in which we have one outcome (dependent) variable and multiple predictors.

A simple regression of api00 on enroll alone explains only about 10.12% of the variability of api00. In order to explain more variation, we can add more predictors. In R we do this by adding variables with + to the formula of our lm() function. We add meals, the percentage of students who get free meals, as an indicator of socioeconomic status, and full, the percentage of teachers with full credentials, to our model.

m3 <- lm(api00~enroll+meals+full, data=d)
summary(m3)
## 
## Call:
## lm(formula = api00 ~ enroll + meals + full, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -181.721  -40.802    1.129   39.983  158.774 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 801.82983   26.42660  30.342  < 2e-16 ***
## enroll       -0.05146    0.01384  -3.719 0.000229 ***
## meals        -3.65973    0.10880 -33.639  < 2e-16 ***
## full          1.08109    0.23945   4.515 8.37e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 58.73 on 396 degrees of freedom
## Multiple R-squared:  0.8308, Adjusted R-squared:  0.8295 
## F-statistic: 648.2 on 3 and 396 DF,  p-value: < 2.2e-16
anova(m3)
## Analysis of Variance Table
## 
## Response: api00
##            Df  Sum Sq Mean Sq  F value    Pr(>F)    
## enroll      1  817326  817326  236.947 < 2.2e-16 ***
## meals       1 5820066 5820066 1687.263 < 2.2e-16 ***
## full        1   70313   70313   20.384 8.369e-06 ***
## Residuals 396 1365967    3449                       
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

We see that, on the F-test, the overall model is a significant improvement in fit compared to the intercept-only model. Also, all of the tests of the coefficients against the null value of zero are significant.

The R-squared is 0.8308, meaning that approximately 83% of the variability of api00 is accounted for by the variables in the model. The adjusted R-squared shows that, after taking the number of predictors in the model into account, R-squared is still about 0.83.

The coefficient for each of the variables indicates the amount of change one could expect in api00 given a one-unit increase in the value of that variable, given that all other variables in the model are held constant. For example, consider the variable meals. We would expect a decrease of about 3.66 in the api00 score for every one-unit increase in percent free meals, assuming that all other variables in the model are held constant.

We see quite a difference in the coefficient of variable enroll compared to a simple linear regression of api00 on enroll alone. There the coefficient for enroll is -0.1999, and now it is -0.05146.

The ANOVA table shows the sum of squares explained by adding each variable sequentially to the model, or equivalently, the amount of sum of square residuals reduced by each additional variable.

For example, variable enroll explains 817326 of the total sum of squares. Adding variable meals removes a further 5820066 from the residual sum of squares, and adding variable full reduces it by another 70313. Finally, 1365967 is left as unexplained error, the residual sum of squares (RSS). The total sum of squares (TSS) is the sum of all of these sums of squares. To get the total sum of squares of variable api00 directly, we can multiply its variance by (n−1).

sum(anova(m3)[["Sum Sq"]])
## [1] 8073672
(400-1)*var(d$api00)
## [1] 8073672
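
Written out with the values from the ANOVA table, the decomposition is

\[ TSS = 817326 + 5820066 + 70313 + 1365967 = 8073672 \]

and the multiple R-squared reported by summary(m3) follows from the residual and total sums of squares:

\[ R^2 = 1 - \frac{RSS}{TSS} = 1 - \frac{1365967}{8073672} \approx 0.8308 \]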

Standardized regression coefficients

Some researchers are interested in comparing the relative strength of the various predictors within the model. To address this problem we use standardized regression coefficients, which can be obtained by transforming the outcome and predictor variables to their standardized scores, also called z-scores, before running the regression.

m3.sd <- lm(scale(api00) ~  scale(enroll) + scale(meals) + scale(full), data = d)
summary(m3.sd)
## 
## Call:
## lm(formula = scale(api00) ~ scale(enroll) + scale(meals) + scale(full), 
##     data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.27749 -0.28683  0.00793  0.28108  1.11617 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    4.454e-16  2.064e-02   0.000 1.000000    
## scale(enroll) -8.191e-02  2.203e-02  -3.719 0.000229 ***
## scale(meals)  -8.210e-01  2.441e-02 -33.639  < 2e-16 ***
## scale(full)    1.136e-01  2.517e-02   4.515 8.37e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.4129 on 396 degrees of freedom
## Multiple R-squared:  0.8308, Adjusted R-squared:  0.8295 
## F-statistic: 648.2 on 3 and 396 DF,  p-value: < 2.2e-16

In the standardized regression summary we see that the intercept is essentially zero (the value 4.454e-16 is floating-point rounding error) and that the t-statistics and p-values for the other coefficients are exactly the same as in the original model.

Because the coefficients are all in the same standardized units (standard deviations), you can compare them to assess the relative strength of each predictor. In this example, meals has the largest standardized (Beta) coefficient in absolute value, -0.821.

Thus, a one standard deviation increase in meals is associated with a 0.821 standard deviation decrease in predicted api00, with the other variables held constant.
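A standardized coefficient can also be recovered from the unstandardized fit by rescaling each slope by sd(predictor)/sd(outcome). The sketch below illustrates this on the built-in mtcars data, not the api data; the model and the object names are assumptions made for the example.

```r
# Illustration on the built-in mtcars data (not the api data used in this section)
fit_raw <- lm(mpg ~ wt + hp, data = mtcars)
fit_std <- lm(scale(mpg) ~ scale(wt) + scale(hp), data = mtcars)

# Rescale the raw slopes by sd(predictor) / sd(outcome)
sds <- sapply(mtcars[c("wt", "hp")], sd)
beta_by_hand <- coef(fit_raw)[c("wt", "hp")] * sds / sd(mtcars$mpg)

# Slopes from the model fitted to z-scores (drop the intercept)
beta_scaled <- coef(fit_std)[-1]

# Both routes give the same standardized coefficients
all.equal(unname(beta_by_hand), unname(beta_scaled))
```

This is why the t-statistics do not change under standardization: each slope and its standard error are multiplied by the same constant.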

Regression model with interaction between two continuous predictors

In this section, we consider a model in which two continuous predictors, enroll and meals, interact.

m4 <- lm(api00 ~  enroll + meals + enroll:meals , data = d)
summary(m4) 
## 
## Call:
## lm(formula = api00 ~ enroll + meals + enroll:meals, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -211.186  -38.834   -1.137   38.997  163.713 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   8.833e+02  1.665e+01  53.034   <2e-16 ***
## enroll        8.835e-04  3.362e-02   0.026   0.9790    
## meals        -3.425e+00  2.344e-01 -14.614   <2e-16 ***
## enroll:meals -9.537e-04  4.292e-04  -2.222   0.0269 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 59.85 on 396 degrees of freedom
## Multiple R-squared:  0.8243, Adjusted R-squared:  0.823 
## F-statistic: 619.3 on 3 and 396 DF,  p-value: < 2.2e-16

The intercept is interpreted as before: it is the expected api00 when both enroll and meals are set to zero.

The coefficient of enroll is interpreted as an increase of 0.0008835 in the mean of api00 for a one-unit increase in enrollment, given that the value of meals is zero.

The coefficient of meals is interpreted as a decrease of 3.425 in the mean of api00 for a one-unit (one percentage point) increase in meals, given that the value of enroll is zero.

Adding an interaction term means that the effect of each of enroll and meals is no longer constant: it depends on the value of the other predictor.

The interaction term can be interpreted in two ways.

If we use enroll as the moderator variable, then we can say that the effect of meals changes by −0.0009537 for a one-unit increase in enroll.

If we use meals as the moderator variable, then we can say that the effect of enroll changes by −0.0009537 for a one-unit increase in meals.
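To make the second reading concrete, the slope of enroll at a given value of meals can be written out from the estimates in the summary of m4. The value meals = 50 is chosen here only for illustration.

\[ \text{slope of enroll} = 0.0008835 + (-0.0009537)\times \text{meals} \]

\[ 0.0008835 + (-0.0009537)(50) = -0.0468015 \]

So, under this fitted model, at meals = 50 a one-unit increase in enroll is associated with a decrease of about 0.047 in predicted api00, whereas at meals = 0 the slope is the enroll coefficient itself, 0.0008835.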

Regression model with interaction between two categorical variables

Here, instead of two continuous variables, we consider the interaction between two categorical variables.

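# Make level "3" the reference category of mealcat_F
d$mealcat_F <- relevel(d$mealcat_F, ref = "3")
# yr_rnd_F * mealcat_F expands to both main effects plus their interaction
m6 <- lm(api00 ~ d$yr_rnd_F*mealcat_F, data = d)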
summary(m6) 
## 
## Call:
## lm(formula = api00 ~ d$yr_rnd_F * mealcat_F, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -207.533  -50.764   -1.843   48.874  179.000 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)               521.493      8.414  61.978  < 2e-16 ***
## d$yr_rnd_FYes             -33.493     11.771  -2.845  0.00467 ** 
## mealcat_F1                288.193     10.443  27.597  < 2e-16 ***
## mealcat_F2                123.781     10.552  11.731  < 2e-16 ***
## d$yr_rnd_FYes:mealcat_F1  -40.764     29.231  -1.395  0.16394    
## d$yr_rnd_FYes:mealcat_F2  -18.248     22.256  -0.820  0.41278    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 68.87 on 394 degrees of freedom
## Multiple R-squared:  0.7685, Adjusted R-squared:  0.7656 
## F-statistic: 261.6 on 5 and 394 DF,  p-value: < 2.2e-16

The intercept is interpreted as the expected api00 when both yr_rnd_F and mealcat_F are at their reference levels. Here, it is the expected api00 for a school that is not year round and whose meals category is at level “3”.

The coefficient of yr_rnd_FYes is the difference in expected api00 between a year-round school and a school that is not year round when the meals category is at level “3”.

The coefficient of mealcat_F1 is the difference in expected api00 between mealcat_F at level “1” and mealcat_F at level “3”, given that the school is not year round.

The coefficient of mealcat_F2 is the difference in expected api00 between mealcat_F at level “2” and mealcat_F at level “3”, given that the school is not year round.

The interaction terms can be interpreted in two ways. Here we only interpret them in one way:

The coefficient of yr_rnd_FYes:mealcat_F1 is the difference in expected api00 between year-round and non-year-round schools when the meals category is “1”, minus the same difference when the meals category is “3”.

The coefficient of yr_rnd_FYes:mealcat_F2 is the difference in expected api00 between year-round and non-year-round schools when the meals category is “2”, minus the same difference when the meals category is “3”.

For example, the expected mean of api00 for a school that is year round with mealcat equal to “2” is:

\[521.493+(-33.493)+123.781+(-18.248) = 593.533\tag{5.2.1}\]

The expected mean of api00 for a school that is not year round with mealcat equal to “2” is:

\[521.493+123.781=645.274 \tag{5.2.2}\]

Thus, subtracting (5.2.2) from (5.2.1), the expected difference in api00 between a year-round school and a school that is not year round when the meals category equals “2” is:

\[593.533-645.274 = -51.741 \tag{5.2.3}\]

The corresponding difference when the meals category is at level “3” is the coefficient of yr_rnd_FYes, -33.493.

If we calculate the difference between these two differences we get:

\[-51.741-(-33.493) = -18.248 \tag{5.2.4}\]

which is the coefficient for yr_rnd_FYes:mealcat_F2.

If we need to test the overall effect of the interaction (i.e. simultaneously test that both interaction coefficients are equal to zero), we can use an F test that compares nested models, via the anova function. To do so, we fit a model without the interaction terms and use anova to test the difference between the two models.

m0 <- lm(api00 ~ yr_rnd_F + mealcat_F , data = d)

anova(m0, m6)
## Analysis of Variance Table
## 
## Model 1: api00 ~ yr_rnd_F + mealcat_F
## Model 2: api00 ~ d$yr_rnd_F * mealcat_F
##   Res.Df     RSS Df Sum of Sq      F Pr(>F)
## 1    396 1879528                           
## 2    394 1868944  2     10584 1.1156 0.3288

The F test is not significant (p = 0.3288), so we do not have enough evidence to reject the base model m0. There is therefore no evidence that the effect of yr_rnd differs across the levels of mealcat.
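The F statistic in this table can be reproduced by hand from the two residual sums of squares and their degrees of freedom. Writing \(RSS_0, df_0\) for the model without interaction (m0) and \(RSS_1, df_1\) for the model with interaction (m6), which is notation introduced here only for this calculation:

\[ F = \frac{(RSS_0 - RSS_1)/(df_0 - df_1)}{RSS_1/df_1} = \frac{(1879528 - 1868944)/2}{1868944/394} = \frac{5292}{4743.51} \approx 1.1156 \]

This value is compared with an F distribution on 2 and 394 degrees of freedom, which gives the reported p-value of 0.3288.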

Regression model with interaction between continuous predictor and a categorical predictor

Here we consider two predictors, one continuous and one categorical. Continuing the example above, we use enroll as the continuous predictor and yr_rnd_F as the categorical predictor. To include their interaction, we join the two predictors with * in the linear model formula, which adds both main effects and their product term.

m7 <- lm(api00 ~ yr_rnd_F * enroll , data = d)
summary(m7) 
## 
## Call:
## lm(formula = api00 ~ yr_rnd_F * enroll, data = d)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -274.043  -94.781    0.417   97.666  309.573 
## 
## Coefficients:
##                      Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        682.068556  18.797436  36.285   <2e-16 ***
## yr_rnd_FYes        -74.858236  48.224281  -1.552   0.1214    
## enroll               0.006021   0.042396   0.142   0.8871    
## yr_rnd_FYes:enroll  -0.120218   0.072075  -1.668   0.0961 .  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 125 on 396 degrees of freedom
## Multiple R-squared:  0.2335, Adjusted R-squared:  0.2277 
## F-statistic: 40.21 on 3 and 396 DF,  p-value: < 2.2e-16

The intercept is interpreted as before: it is the expected api00 when enroll is zero and the school is not year round.

The coefficient of enroll is interpreted as an increase of 0.006021 in the mean of api00 for a one-unit increase in enrollment, given that the school is not year round.

The coefficient of yr_rnd_FYes is interpreted as the difference in expected api00 between a year-round school and a school that is not year round when enroll is zero.

The interaction term can be interpreted as the change in the effect of enroll on the expected api00 when moving from a school that is not year round to a school that is year round. Here the slope of enroll is 0.120218 lower for year-round schools.
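Written out from the estimates in the summary of m7, the model implies one fitted line for each type of school (intercepts rounded to three decimals):

\[ \text{not year round:} \quad \widehat{api00} = 682.069 + 0.006021 \times \text{enroll} \]

\[ \text{year round:} \quad \widehat{api00} = (682.069 - 74.858) + (0.006021 - 0.120218) \times \text{enroll} = 607.210 - 0.114197 \times \text{enroll} \]

The two lines differ in both intercept and slope, which is what the interaction term allows.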

Plotting the interaction between a continuous predictor and categorical predictor

Sometimes it is a good idea to plot the predicted values for different levels of the predictors to visualize the interaction. We use the function predict on a new data set to plot this interaction.

First, get the range of the continuous predictor:

range(d$enroll)
## [1]  130 1570
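# Make a sequence of enroll values over the observed range (130 to 1570) in steps of 5
new.enroll <- seq(130, 1570, 5)

# Repeat each level of yr_rnd_F once for every value of new.enroll
new.yr_rnd <- rep(levels(d$yr_rnd_F), each = length(new.enroll))

# Combine into a data frame holding every enroll value for both school types
new.data.plot <- data.frame(enroll = rep(new.enroll, times = 2), yr_rnd_F = new.yr_rnd)

# Predicted api00 from the interaction model m7
new.data.plot$predicted_api00 <- predict(m7, newdata = new.data.plot)

library(ggplot2)
# One predicted line per school type
ggplot(new.data.plot, aes(x = enroll, y = predicted_api00, colour = yr_rnd_F)) +
  geom_line(lwd=1.2)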

Regression Diagnostics

Introduction

In the previous part, we learned how to do ordinary linear regression with R. Without verifying that the data have met the assumptions underlying OLS regression, the results of a regression analysis may be misleading. Here we will explore how you can use R to check how well your data meet the assumptions of OLS regression. In particular, we will consider the following assumptions.

Homogeneity of variance (homoscedasticity): The error variance should be constant

Linearity: the relationships between the predictors and the outcome variable should be linear.

Independence: The errors associated with one observation are not correlated with the errors of any other observation

Normality: the errors should be normally distributed. Technically normality is necessary only for hypothesis tests to be valid.

Model specification: The model should be properly specified (including all relevant variables, and excluding irrelevant variables)

Additionally, there are issues that can arise during the analysis that, while strictly speaking not assumptions of regression, are nonetheless of great concern to data analysts.

Influence: individual observations that exert undue influence on the coefficients

Collinearity: predictors that are highly collinear, i.e., linearly related, can cause problems in estimating the regression coefficients.

Many graphical methods and numerical tests have been developed over the years for regression diagnostics.

R provides many of these methods in the stats package, which is installed and loaded by default. Other tools are available in additional packages, which we can use by installing and loading them in our R environment.

Unusual and Influential data

A single observation that is substantially different from all other observations can make a large difference in the results of your regression analysis. If a single observation (or small group of observations) substantially changes your results, you would want to know about this and investigate further. There are three ways that an observation can be unusual.

Outliers: In linear regression, an outlier is an observation with large residual. In other words, it is an observation whose dependent-variable value is unusual given its values on the predictor variables. An outlier may indicate a sample peculiarity or may indicate a data entry error or other problem.

Leverage: An observation with an extreme value on a predictor variable is called a point with high leverage. Leverage is a measure of how far an observation deviates from the mean of that variable. These leverage points can have an effect on the estimate of regression coefficients.

Influence: An observation is said to be influential if removing the observation substantially changes the estimate of coefficients. Influence can be thought of as the product of leverage and outlierness.

library(car)
## Loading required package: carData
m9<-lm(api00~enroll+meals+full, data=d)
scatterplotMatrix(~api00 + enroll + meals+full, data=d)

The plots of api00 against the other variables show some potential problems. In every plot, we see one or more data points that are far away from the rest of the data points.

Studentized residuals can be used to identify outliers. In R, the rstandard() function computes standardized (internally Studentized) residuals, and rstudent() computes externally Studentized residuals.

res.std <- rstandard(m9)
plot(res.std, ylab="Standardized Residual", ylim=c(-3.5,3.5))
abline(h =c(-3,0,3),lty =2)
index <- which(res.std > 3 | res.std < -3)
text(index-20, res.std[index] , labels = d$snum[index])

print(index)
## 226 
## 226
print(d$snum[index])
## [1] 211

We should pay attention to studentized residuals that exceed +2 or -2, and get even more concerned about residuals that exceed +2.5 or -2.5 and even yet more concerned about residuals that exceed +3 or -3. These results show that school number 211 is the most worrisome observation.
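R offers two closely related residuals here: rstandard() returns internally Studentized (standardized) residuals, while rstudent() returns externally Studentized residuals, which are the ones reported as rstudent by outlierTest(). The sketch below shows how one can be converted into the other. It uses the built-in mtcars data rather than the api data, and the model and object names are assumptions made for the example.

```r
# Illustration on the built-in mtcars data (not the api data used in this section)
fit <- lm(mpg ~ wt + hp, data = mtcars)

r_int <- rstandard(fit)  # internally Studentized (standardized) residuals
r_ext <- rstudent(fit)   # externally Studentized residuals

n <- nrow(mtcars)
p <- length(coef(fit))   # number of coefficients, including the intercept

# Convert internally Studentized residuals to externally Studentized ones
r_ext_by_hand <- r_int * sqrt((n - p - 1) / (n - p - r_int^2))

# The conversion reproduces rstudent() up to floating-point error
all.equal(r_ext_by_hand, r_ext)
```

For large residuals the externally Studentized value is larger in magnitude, which is why the two functions can disagree slightly about which points cross a given threshold.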

outlierTest(m9)
## No Studentized residuals with Bonferroni p < 0.05
## Largest |rstudent|:
##      rstudent unadjusted p-value Bonferroni p
## 226 -3.151186            0.00175      0.70001
library(faraway)
## 
## Attaching package: 'faraway'
## The following objects are masked from 'package:car':
## 
##     logit, vif
# Leverage (hat) values of each observation
h <- influence(m9)$hat
# Half-normal plot of the leverages
halfnorm(h, ylab = "leverage")
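The leverage values plotted above are the diagonal elements of the hat matrix. As a minimal sketch, assuming the built-in mtcars data and an illustrative model (not the api data or m9), they can be computed directly from the design matrix:

```r
# Illustration on the built-in mtcars data (not the api data used in this section)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hat matrix H = X (X'X)^{-1} X', where X is the design matrix
X <- model.matrix(fit)
H <- X %*% solve(crossprod(X)) %*% t(X)
h_by_hand <- diag(H)

# Matches the leverages returned by hatvalues()
all.equal(unname(h_by_hand), unname(hatvalues(fit)))

# The leverages sum to the number of coefficients (here 3)
sum(h_by_hand)
```

Because the leverages always sum to the number of coefficients, their average is that number divided by n, which gives a natural baseline for judging whether a leverage is large.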

Cook’s distance is a measure of influence. A point with a large Cook’s distance is considered a highly influential point. A cut-off value for Cook’s distance can be calculated as

\[\frac{4}{n-p-1}\]

where n is the sample size and p is the number of predictors. We can plot Cook’s distance using the following code:

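# m9 has p + 1 coefficients (p predictors plus the intercept), so this is 4/(n - p - 1)
cutoff <- 4/(nrow(d) - length(m9$coefficients))
plot(m9, which = 4, cook.levels = cutoff)
# Mark the cut-off on the Cook's distance plot
abline(h = cutoff, lty = 2)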
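Cook’s distance combines the size of the residual with the leverage, which is the sense in which influence is the product of leverage and outlierness. The sketch below computes it by hand and applies the 4/(n−p−1) cut-off. It uses the built-in mtcars data and an illustrative model, not the api data, and the object names are assumptions made for the example.

```r
# Illustration on the built-in mtcars data (not the api data used in this section)
fit <- lm(mpg ~ wt + hp, data = mtcars)

r <- rstandard(fit)      # standardized residuals
h <- hatvalues(fit)      # leverages
k <- length(coef(fit))   # number of coefficients, i.e. p + 1

# Cook's distance from standardized residuals and leverages
cook_by_hand <- (r^2 / k) * h / (1 - h)
all.equal(cook_by_hand, cooks.distance(fit))

# Cut-off 4 / (n - p - 1), written with k = p + 1
cutoff <- 4 / (nrow(mtcars) - k)

# Names of the observations whose Cook's distance exceeds the cut-off
flagged <- names(which(cook_by_hand > cutoff))
flagged
```

A point needs both a sizeable residual and non-trivial leverage to be flagged, so a high-leverage point that lies on the fitted line has a Cook’s distance close to zero.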

We can use the influencePlot() function in the car package to identify influential points. It plots Studentized residuals against leverage, with circle sizes proportional to Cook’s distance.

influencePlot(m9, main="Influence Plot",
sub="Circle size is proportional to Cook's Distance" )

##         StudRes        Hat        CookD
## 8    0.18718812 0.08016299 7.652779e-04
## 93   2.76307269 0.02940688 5.687488e-02
## 210  0.03127861 0.06083329 1.588292e-05
## 226 -3.15118603 0.01417076 3.489753e-02
## 346 -2.83932062 0.00412967 8.211170e-03
infIndexPlot(m9)

Checking Homoscedasticity

One of the main assumptions for the ordinary least squares regression is the homogeneity of variance of the residuals. If the model is well-fitted, there should be no pattern to the residuals plotted against the fitted values. If the variance of the residuals is non-constant then the residual variance is said to be “heteroscedastic.” There are graphical and non-graphical methods for detecting heteroscedasticity. A commonly used graphical method is to plot the residuals versus fitted (predicted) values.

# Residuals against fitted values, using the extractor functions
plot(residuals(m9) ~ fitted(m9))

abline(h = 0, lty = 2)
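As one example of a non-graphical check, the sketch below computes a Breusch-Pagan type statistic by hand, in its studentized (Koenker) form: the squared residuals are regressed on the predictors, and n times the R-squared of that auxiliary regression is compared with a chi-squared distribution. This particular test is an addition chosen for illustration and is not taken from the text above. It uses the built-in mtcars data rather than the api data, and all object names are assumptions.

```r
# Illustration on the built-in mtcars data (not the api data used in this section)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Auxiliary regression: squared residuals on the same predictors
e2 <- residuals(fit)^2
aux <- lm(e2 ~ wt + hp, data = mtcars)

# Test statistic: n * R^2 of the auxiliary regression
n <- nrow(mtcars)
bp_stat <- n * summary(aux)$r.squared

# Degrees of freedom: number of predictors in the auxiliary regression
bp_df <- length(coef(aux)) - 1

# Small p-values suggest that the error variance is not constant
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
c(statistic = bp_stat, df = bp_df, p.value = bp_p)
```

A formal test like this complements the residual plot but does not replace it, since the plot also shows the form of any pattern.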

Checking Linearity

To check linearity, residuals should be plotted against the fitted values as well as against each predictor. If any of these plots shows a systematic shape, then the linear model is not appropriate and some nonlinear terms may need to be added.

residualPlots(m9)

##            Test stat Pr(>|Test stat|)
## enroll        0.0111           0.9911
## meals        -0.6238           0.5331
## full          1.1565           0.2482
## Tukey test   -0.8411           0.4003

Issues on independence

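When observations are collected over time, a simple visual check of independence is to plot the residuals against the time variable. Here we plot the residuals against the school number, snum.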

plot(m9$resid ~ d$snum)

Checking Normality of Residuals

Normality of residuals is only required for valid hypothesis testing, that is, the normality assumption assures that the p-values for the t-tests and F-test will be valid. Normality is not required in order to obtain unbiased estimates of the regression coefficients.

OLS regression merely requires that the residuals (errors) be identically and independently distributed. Furthermore, there is no assumption or requirement that the predictor variables be normally distributed. If this were the case, then we would not be able to use dummy coded variables in our models.

Because of large sample theory, if the sample size is large enough we do not even need the residuals to be normally distributed. For small sample sizes, however, normality is required.

qqnorm(m9$resid)
qqline(m9$resid)

Checking for Multicollinearity

When there is a perfect linear relationship among the predictors, the estimates for a regression model cannot be uniquely computed. The term collinearity implies that two variables are near perfect linear combinations of one another. When more than two variables are involved it is often called multicollinearity, although the two terms are often used interchangeably.

The primary concern is that as the degree of multicollinearity increases, the regression model estimates of the coefficients become unstable and the standard errors for the coefficients can get wildly inflated.

The variance inflation factor (VIF) is used to measure the degree of multicollinearity. As a rule of thumb, a variable whose VIF is greater than 10 may merit further investigation. Tolerance, defined as 1/VIF, is used by many researchers to check the degree of collinearity. A tolerance lower than 0.1 corresponds to a VIF greater than 10. It means that the variable could be considered as a linear combination of the other independent variables.

car::vif(m9)
##   enroll    meals     full 
## 1.135733 1.394279 1.482305
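The VIF of a predictor is 1/(1 − R²), where R² comes from regressing that predictor on all the other predictors. The sketch below computes it by hand for one predictor and cross-checks it against the diagonal of the inverse correlation matrix. It uses the built-in mtcars data rather than the api data, and the choice of predictors and object names are assumptions made for the example.

```r
# Illustration on the built-in mtcars data (not the api data used in this section)
# Predictors of interest: wt, hp and disp

# VIF of wt: regress wt on the other predictors and use 1 / (1 - R^2)
r2_wt <- summary(lm(wt ~ hp + disp, data = mtcars))$r.squared
vif_wt <- 1 / (1 - r2_wt)

# Tolerance is the reciprocal of the VIF, i.e. 1 - R^2
tolerance_wt <- 1 / vif_wt

# Cross-check: the VIFs are the diagonal of the inverse correlation matrix
X <- mtcars[, c("wt", "hp", "disp")]
vif_all <- diag(solve(cor(X)))

all.equal(vif_wt, unname(vif_all[1]))
```

A VIF of 10 therefore means that 90% of the variance of that predictor is explained by the other predictors.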

Model Specification

A model specification error can occur when one or more relevant variables are omitted from the model or one or more irrelevant variables are included in the model.

If relevant variables are omitted from the model, the common variance they share with included variables may be wrongly attributed to those variables, and the error term is inflated. On the other hand, if irrelevant variables are included in the model, the common variance they share with included variables may be wrongly attributed to them. Model specification errors can substantially affect the estimate of regression coefficients.

avPlots(m9)