##ANALYSES

Running descriptive statistics for all quantitative variables (columns 5 to 30) and storing them in the descriptives data frame.

library(psych)
projectdata<-read.csv("Project2Data.csv")
descriptives<-describe(projectdata[,5:30], na.rm = TRUE)
options(digits=6)
options(scipen = 999)

Determining the skewed distributions: the variables in SkewedVar have a skewness below -1 or above 1, so they do not have a normal distribution.

SkewedVar <- descriptives[abs(descriptives$skew) > 1.0, ]
print(SkewedVar)
##            vars    n     mean       sd  median trimmed     mad min
## UGDS          5 6990  2332.16  5438.85  406.00 1052.09  526.32   0
## UGDS_BLACK    7 6990     0.19     0.22    0.10    0.14    0.12   0
## UGDS_HISP     8 6990     0.16     0.22    0.07    0.11    0.08   0
## UGDS_ASIAN    9 6990     0.03     0.08    0.01    0.02    0.02   0
## UGDS_AIAN    10 6990     0.01     0.07    0.00    0.00    0.00   0
## UGDS_NHPI    11 6990     0.00     0.03    0.00    0.00    0.00   0
## UGDS_2MOR    12 6990     0.02     0.03    0.02    0.02    0.03   0
## UGDS_NRA     13 6990     0.02     0.05    0.00    0.01    0.00   0
## UGDS_UNKN    14 6990     0.05     0.09    0.01    0.02    0.02   0
## PPTUG_EF     15 6969     0.23     0.25    0.15    0.19    0.22   0
## TUITFTE      19 7270 10401.08 17375.87 9015.00 9227.39 6676.15   0
## INEXPFTE     20 7270  7360.21 12726.34 5490.00 5875.23 3399.60   0
## RET_FT4      25 2293     0.71     0.20    0.74    0.73    0.15   0
##                   max      range  skew kurtosis     se
## UGDS        151558.00  151558.00  6.84   104.77  65.05
## UGDS_BLACK       1.00       1.00  1.73     2.49   0.00
## UGDS_HISP        1.00       1.00  2.24     4.83   0.00
## UGDS_ASIAN       0.97       0.97  6.38    55.98   0.00
## UGDS_AIAN        1.00       1.00 11.16   136.65   0.00
## UGDS_NHPI        1.00       1.00 22.92   608.03   0.00
## UGDS_2MOR        0.53       0.53  4.11    36.08   0.00
## UGDS_NRA         0.93       0.93  8.29   101.58   0.00
## UGDS_UNKN        0.90       0.90  4.37    23.94   0.00
## PPTUG_EF         1.00       1.00  1.02     0.22   0.00
## TUITFTE    1292154.00 1292154.00 55.94  4076.80 203.79
## INEXPFTE    735077.00  735077.00 31.03  1570.11 149.26
## RET_FT4          1.00       1.00 -1.26     2.31   0.00
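To make the |skew| > 1 rule concrete, here is a minimal sketch that computes skewness by hand on toy data and flags the skewed columns. The data frame `toy` and the helper `sample_skew` are hypothetical, and the formula is the plain third standardized moment, which may differ slightly from the small-sample variant that psych reports.

```r
# Sample skewness as the third standardized moment (one common definition;
# psych::describe may apply a slightly different small-sample correction).
sample_skew <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  m2 <- mean((x - m)^2)   # second central moment
  m3 <- mean((x - m)^3)   # third central moment
  m3 / m2^1.5
}

# Hypothetical toy data: one symmetric and one right-skewed variable.
toy <- data.frame(
  symmetric    = c(-2, -1, 0, 1, 2),
  right_skewed = c(1, 1, 1, 2, 50)
)

skews <- sapply(toy, sample_skew)
flagged <- names(skews)[abs(skews) > 1]   # same cut-off as used for SkewedVar
print(skews)
print(flagged)
```

Only the right-skewed column passes the cut-off here, which mirrors how the long right tails of variables such as UGDS or TUITFTE put them in SkewedVar.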

Among the skewed variables, determining those that also have a kurtosis > 3:

SkewedKurtosisVar <- SkewedVar[SkewedVar$kurtosis > 3.0, ]
print(SkewedKurtosisVar)
##            vars    n     mean       sd  median trimmed     mad min
## UGDS          5 6990  2332.16  5438.85  406.00 1052.09  526.32   0
## UGDS_HISP     8 6990     0.16     0.22    0.07    0.11    0.08   0
## UGDS_ASIAN    9 6990     0.03     0.08    0.01    0.02    0.02   0
## UGDS_AIAN    10 6990     0.01     0.07    0.00    0.00    0.00   0
## UGDS_NHPI    11 6990     0.00     0.03    0.00    0.00    0.00   0
## UGDS_2MOR    12 6990     0.02     0.03    0.02    0.02    0.03   0
## UGDS_NRA     13 6990     0.02     0.05    0.00    0.01    0.00   0
## UGDS_UNKN    14 6990     0.05     0.09    0.01    0.02    0.02   0
## TUITFTE      19 7270 10401.08 17375.87 9015.00 9227.39 6676.15   0
## INEXPFTE     20 7270  7360.21 12726.34 5490.00 5875.23 3399.60   0
##                   max      range  skew kurtosis     se
## UGDS        151558.00  151558.00  6.84   104.77  65.05
## UGDS_HISP        1.00       1.00  2.24     4.83   0.00
## UGDS_ASIAN       0.97       0.97  6.38    55.98   0.00
## UGDS_AIAN        1.00       1.00 11.16   136.65   0.00
## UGDS_NHPI        1.00       1.00 22.92   608.03   0.00
## UGDS_2MOR        0.53       0.53  4.11    36.08   0.00
## UGDS_NRA         0.93       0.93  8.29   101.58   0.00
## UGDS_UNKN        0.90       0.90  4.37    23.94   0.00
## TUITFTE    1292154.00 1292154.00 55.94  4076.80 203.79
## INEXPFTE    735077.00  735077.00 31.03  1570.11 149.26

###Question 1

The variables UGDS, UGDS_HISP, UGDS_ASIAN, UGDS_AIAN, UGDS_NHPI, UGDS_2MOR, UGDS_NRA, UGDS_UNKN, TUITFTE and INEXPFTE do not have a normal distribution, judging by both their skewness and their kurtosis. The variables UGDS_BLACK, PPTUG_EF and RET_FT4 do not have a normal distribution, judging by their skewness alone (their kurtosis is below 3).

Calculating variances:

variances_data<-data.frame(descriptives[,4]*descriptives[,4])
print(variances_data)
##    descriptives...4....descriptives...4.
## 1                            0.699779293
## 2                          130.935905752
## 3                            0.043651078
## 4                        17784.083824963
## 5                     29581070.838806145
## 6                            0.082554181
## 7                            0.050492190
## 8                            0.049550061
## 9                            0.005654128
## 10                           0.004851592
## 11                           0.001080576
## 12                           0.000972495
## 13                           0.002492304
## 14                           0.008708988
## 15                           0.060708464
## 16                    21805832.123101581
## 17                    52883812.827343225
## 18                   162884718.941394806
## 19                   301920777.617115259
## 20                   161959774.003822237
## 21                           0.093798135
## 22                           0.051049405
## 23                           0.045700584
## 24                           0.066115544
## 25                           0.038276927
## 26                           0.080706267
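The table above loses the variable names because it is built from the bare sd column. As a hedged alternative, the sketch below computes the variances directly with var(), which keeps the names; the data frame `toy` and its values are made up for illustration and only borrow two column names from the data set.

```r
# Hypothetical toy data standing in for the quantitative columns.
toy <- data.frame(
  ADM_RATE = c(0.5, 0.7, NA, 0.9),
  SAT_AVG  = c(1000, 1100, 1200, NA)
)

# var() on each column, ignoring missing values, keeps the variable names.
variances <- sapply(toy, var, na.rm = TRUE)
variances_table <- data.frame(variance = variances)
print(variances_table)
```

On the real data, the same call on projectdata[, 5:30] should agree with the squared sd values, since both use the n - 1 denominator on the non-missing observations.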

The variables TUITFTE, INEXPFTE and UGDS_NHPI contain outliers. For each of them, a copy of the data with the boxplot outliers removed is stored in a new data frame. These trimmed data frames are only used to compare the distributions with and without outliers and will not be included in the analysis (since I do not know how correct they are).

#trimming TUITFTE outliers
boxplot(projectdata$TUITFTE)
boxplot(projectdata$TUITFTE)$out

##   [1]   33714   71514   52798   30191   29651   29456   35778   27468
##   [9]   30407   30090   41132   30042   28151   33136   28202   31045
##  [17]   29570   27323   30355   47001   28399   27427   29662   31738
##  [25]   33097   37339   30675   34134   45599   28609   28347   35692
##  [33]   32490   27462   30787   29384   30440   28464   34183   29248
##  [41]   29286   34848   37194   38242   35867   28496   31502   27614
##  [49]   43479   27753   32781   36925   28158   32693   28395   29645
##  [57]   42907   31326   33907   50828   33733   37489   41051   28543
##  [65]   28496   36302   47530   33960   30778   29527   51665   28408
##  [73]   31367   62516   30239   30294   28155   28992   30965   37015
##  [81]   32990   29609   29681   28118   32881   29707   31418   39920
##  [89]   29358   27708   28282   43240   29892   58704   29924   29760
##  [97]   30401   33371   34733   64887   30824   30208   27721   41862
## [105]   41508   32018   29171   29583   29420   30389   30996   27513
## [113]   31892   29439   66287   30430   29409   32051   29026   28527
## [121]   29422   31788   36119   27496  109761   29020   29130   57333
## [129]   27378   87002   34560   34269   32135   32258   27431   33544
## [137]   28816   30816   36379   34663   28482   28727   42962  148376
## [145]   30649   29574   82287   27917   30363   38322   31715   27917
## [153]   35953   41623   27940   28924   31130   97668   29469   47231
## [161]   33765   62836   30888   35407   30110   39443   29112   31573
## [169]   44185   29353   27630   46351   30696   28036   52353   29890
## [177]   30494   32575   37667   27473   39198   40481   33369   51396
## [185]   35088   35928   61752   31193   34675   45692   44996   36857
## [193]   71583   35813   31949   35312   47796   37098   37721   29853
## [201]   44096   30001   39362   27984   30964   31038   34253   32822
## [209]   57588   48680   43041   41191   30523   39119   37084  248996
## [217]   29727   29006   56456   61433   32907   43931   37575   39634
## [225]   37738   43458   43610   36985   55197   29091   36374   32206
## [233]   27734 1292154   33953   42779   70354   44918   51452  163977
## [241]   62687   28735
outliers_TUITFTE <- boxplot(projectdata$TUITFTE, plot=FALSE)$out
# logical indexing keeps all rows when there are no outliers (unlike -which())
projectdata_TUITFE_Trimmed <- projectdata[!(projectdata$TUITFTE %in% outliers_TUITFTE),]
boxplot(projectdata_TUITFE_Trimmed$TUITFTE)

#trimming INEXPFTE outliers
boxplot(projectdata$INEXPFTE)
boxplot(projectdata$INEXPFTE)$out

##   [1]  17920  16892  21957  55110  38544  62580  17070  17174  19131  15586
##  [11]  23717  92590  18988  22820  18713  40020  24195 116672  21618  18391
##  [21]  18911  27210  25915  28406  15743  30078  16048  25295  21675  17405
##  [31]  18464  19228  18824  25056  37082  15669  24420  22285  28832  17809
##  [41]  41743  43170  26708  21002  19122  22554  33453  23628  25359 107982
##  [51]  19979  17395  16368  35595  20938  31102  22663  15547  15962  28897
##  [61]  18075  18531  40450 138585  19856  38813  22541  19027  15606  26316
##  [71]  83779  18050  17456  26053  17132  16167  20947  39052  29847  17333
##  [81]  37339  25848  28776  17894  15675  18823  27622  18608  21481  17148
##  [91]  21898  26632  20942  26091  18087  23944  17782  15813  69655  17259
## [101]  16874  60533  27497  17605  22995  21627  21684  27158  19918  77339
## [111]  33355  19343  25196  20631  20571  30750  26542  27699  32388  20890
## [121]  49500  46634  40311  18884  35861  62770  55789  22711  28846  26936
## [131]  15688  17905  15593  25201  18161  26786  21543  34705  39712  23528
## [141]  46336  22728  17259  26922  19210  37432  16837  21056  18549  17879
## [151]  20532  18186  35616  28657  17005  22992  16155  18230  18233  16440
## [161] 105933  49995  26182  25102  16809  16320  17080  19530  20653  52224
## [171]  18006  20507  39856  42245  23461  21263  16368  17511  19621  31328
## [181]  20625  22975  80944  21635  23818 193088  41614  33424  45328  33646
## [191]  25879  30436  16162  17646  17346  28982  18431  25582  26430  28869
## [201]  30033  30237  19412  18834  18698  19069  17731  28983  15966  20449
## [211]  19668  16834  15806  20564  55586  72057  60181  18301  19835  26884
## [221]  15985  29919  31658  33568  19228  50756  25710  22850  26803  30754
## [231]  15698  15817  17961  64962  30078  23413  17967  20711  54483  22319
## [241]  25737  28894  16075  22200  17261  23763 115982  47557  15982  15866
## [251]  21005  64019  18558  22586 105951  27370  20198  31807  52148  18027
## [261]  80997  19474  16497  28622  21109  19976  20469  29256  64202  17872
## [271]  49018  16511  16522  21426  16267  18097  57712  22330  17221  21294
## [281]  28135  32124  18435  41171  15732  18281  33278  23184  16533  18080
## [291]  20889  39537  32856  23526  22579  79372  17794  72196  24974  40056
## [301]  16152  16035 357489 116877  85666  16749  15779  20073  20116 118456
## [311]  46499  20658  21424  24640 111574  20798  16336  16893  59753  15806
## [321]  17109  28914  21467  16287  36226  21942  33533  16988  31496  16268
## [331]  27673  20016  16430  18755  24561  19220  16815  27076  27048  26244
## [341]  22859  25033  15635  55613  24333  40582  93146  16568  16043  34563
## [351]  16715  17344  38032  22543  45489  18983  18879  20186  18446  20148
## [361]  16936  64474  19807  19263  30299  17613  28733  22571  61857  16260
## [371]  94762  30352  28611  15607  25310  16462  16486  18925  16199  37599
## [381]  19037  16521  15917  15759  42605  16550  16308  17038  20904  36492
## [391]  44494  16189  16393  38265  91469  27472  28201  26330  18283  16120
## [401]  17550  30582  18174  19040  25627  18513  16306  21167  27397  20551
## [411]  29285  15697  20566  17037  17215  38718  21433  16496  17194  22750
## [421]  20382  49593  37165  17324  17740  30169  30513  18059  38734  15612
## [431]  17747  15885  18305  24553  20455 199372  16254  33569  16775  16892
## [441]  24621  18581  21921  16467  19615  53541  22740  18963  29338  17647
## [451]  20309 735077  24175  16783  42259  19355  23999  24815  16348  38669
## [461]  37932
outliers_INEXPFTE <- boxplot(projectdata$INEXPFTE, plot=FALSE)$out
# logical indexing keeps all rows when there are no outliers (unlike -which())
projectdata_INEXPFTE_Trimmed <- projectdata[!(projectdata$INEXPFTE %in% outliers_INEXPFTE),]
boxplot(projectdata_INEXPFTE_Trimmed$INEXPFTE)

#trimming UGDS_NHPI outliers
boxplot(projectdata$UGDS_NHPI)
boxplot(projectdata$UGDS_NHPI)$out

##   [1] 0.0096 0.0161 0.0086 0.0147 0.0109 0.0112 0.0577 0.0084 0.0400 0.0079
##  [11] 0.0449 0.0625 0.0430 0.0063 0.0067 0.0103 0.0087 0.0101 0.0081 0.0084
##  [21] 0.0342 0.0079 0.0098 0.0075 0.0090 0.0081 0.0085 0.0095 0.0069 0.0113
##  [31] 0.0076 0.0106 0.0119 0.0073 0.0108 0.0091 0.0085 0.0158 0.0127 0.0316
##  [41] 0.0345 0.0104 0.0170 0.0128 0.0095 0.0070 0.0114 0.0486 0.0231 0.0078
##  [51] 0.0067 0.0064 0.0192 0.0110 0.0243 0.0073 0.0228 0.0256 0.0123 0.0071
##  [61] 0.0068 0.0274 0.0069 0.0394 0.0211 0.0104 0.0495 0.0301 0.0092 0.0391
##  [71] 0.0065 0.0090 0.0207 0.0120 0.0188 0.0074 0.0170 0.0185 0.0072 0.0087
##  [81] 0.0063 0.0133 0.0196 0.0205 0.0164 0.0233 0.0348 0.0083 0.0067 0.0265
##  [91] 0.0231 0.0105 0.0098 0.0103 0.0118 0.0068 0.0109 0.0278 0.0181 0.0271
## [101] 0.0306 0.0260 0.0067 0.0098 0.0403 0.0103 0.0208 0.0086 0.0181 0.0076
## [111] 0.0079 0.0444 0.0122 0.0096 0.0179 0.0067 0.0087 0.0065 0.0100 0.0078
## [121] 0.0200 0.0099 0.0460 0.0256 0.0073 0.0504 0.0124 0.0071 0.0082 0.0077
## [131] 0.0298 0.0112 0.0114 0.0240 0.0067 0.0120 0.0403 0.0209 0.0120 0.0182
## [141] 0.0091 0.0193 0.0123 0.0114 0.0064 0.0237 0.0138 0.0117 0.0073 0.0107
## [151] 0.0075 0.0385 0.0123 0.0107 0.0087 0.0132 0.0152 0.0070 0.0260 0.0097
## [161] 0.0070 0.0092 0.0079 0.1770 0.1587 0.1111 0.0349 0.3333 0.0230 0.0776
## [171] 0.0338 0.0842 0.0596 0.1075 0.2391 0.6144 0.0693 0.1072 0.0101 0.0132
## [181] 0.0065 0.0113 0.0131 0.0074 0.0189 0.0105 0.0082 0.0065 0.0071 0.0089
## [191] 0.0159 0.0767 0.0093 0.0086 0.0113 0.0152 0.0164 0.0080 0.0066 0.0078
## [201] 0.0159 0.0086 0.0065 0.0065 0.0076 0.0098 0.0157 0.0200 0.0167 0.0120
## [211] 0.0071 0.0082 0.0073 0.0095 0.0263 0.0104 0.0164 0.0656 0.0072 0.0086
## [221] 0.0252 0.0107 0.0075 0.0155 0.0116 0.0753 0.0455 0.0075 0.0067 0.0222
## [231] 0.0069 0.0182 0.0400 0.0109 0.0144 0.0081 0.0092 0.0091 0.0069 0.0080
## [241] 0.0093 0.0090 0.0231 0.0816 0.0070 0.0091 0.0127 0.0108 0.0962 0.0375
## [251] 0.0066 0.0064 0.0075 0.0675 0.0179 0.0082 0.0343 0.0142 0.0071 0.0082
## [261] 0.0683 0.0179 0.0070 0.0064 0.0096 0.0099 0.0088 0.0097 0.0091 0.0063
## [271] 0.0084 0.0157 0.0113 0.0278 0.0114 0.0513 0.0109 0.0082 0.0072 0.0115
## [281] 0.0114 0.0432 0.0294 0.0127 0.0138 0.0064 0.0108 0.0283 0.0102 0.1743
## [291] 0.0110 0.0066 0.0133 0.0065 0.0081 0.0097 0.0113 0.0148 0.0432 0.0066
## [301] 0.0069 0.0984 0.0069 0.0069 0.0068 0.0250 0.0110 0.0073 0.0156 0.0092
## [311] 0.0067 0.0163 0.0065 0.0161 0.0248 0.0288 0.0317 0.0111 0.0227 0.0120
## [321] 0.0082 0.0090 0.0064 0.0233 0.0105 0.0202 0.0088 0.0074 0.0167 0.0090
## [331] 0.0105 0.0063 0.0086 0.0071 0.0087 0.0148 0.0069 0.0102 0.3509 0.0100
## [341] 0.0162 0.0065 0.0071 0.0139 0.0145 0.0067 0.0066 0.0152 0.0120 0.0091
## [351] 0.0081 0.0229 0.0801 0.0102 0.0500 0.0317 0.0141 0.0323 0.0092 0.0115
## [361] 0.0691 0.0247 0.0120 0.0185 0.0234 0.0094 0.0128 0.0110 0.0066 0.0109
## [371] 0.0163 0.0095 0.0100 0.0070 0.0076 0.0086 0.0517 0.0556 0.0130 0.0173
## [381] 0.0086 0.0323 0.0142 0.0199 0.0078 0.0154 0.0110 0.0322 0.0102 0.0143
## [391] 0.0191 0.0165 0.0115 0.0077 0.0091 0.0220 0.0088 0.0163 0.0068 0.0102
## [401] 0.0097 0.0121 0.0102 0.0082 0.0092 0.0068 0.0533 0.0073 0.0111 0.9193
## [411] 0.5318 0.4946 0.5232 0.9881 0.9983 0.0179 0.0066 0.0072 0.0139 0.0300
## [421] 0.0413 0.0100 0.0282 0.0152 0.0119 0.0067 0.0076 0.0115 0.0139 0.0075
## [431] 0.0117 0.0083 0.0144 0.0171 0.0087 0.0526 0.0078 0.0099 0.0200 0.0109
## [441] 0.0091 0.6730 0.0270 0.0064 0.0078 0.0118 0.0247 0.0113 0.0195 0.0530
## [451] 0.0204 0.0992 0.0071 0.0226 0.0146 0.0155 0.0073 0.0070 0.3009 0.0152
## [461] 0.0360 0.0265 0.9917 0.0108 0.0095 0.0213 0.0098 0.0169 0.0071 0.0163
## [471] 0.0250 0.0072 0.0450 0.0082 0.0204 0.0082 0.0093 0.0162 0.0164 0.0189
## [481] 0.0078 0.0074 0.0088 0.0322 0.1189 0.0096 0.0120 0.0080 0.0119 0.0419
## [491] 0.0194 0.0158 0.0190 0.0528 0.0071 0.0256 0.0063 0.0106 0.0070 0.0130
## [501] 0.0072 0.0132 0.0081 0.0112 0.0243 0.0172 0.0513 0.0242 0.0400 0.0139
## [511] 0.0459 0.0175 0.0161 0.0096 0.0130 0.0625 0.0094 0.0303 0.0070 0.0102
## [521] 0.0098 0.0078 0.0088 0.1215 0.0139 0.0092 0.0476 0.0092 0.0073 0.0123
## [531] 0.0127 0.0075 0.0233 0.0074 0.0121 0.0095 0.0098 0.0068 0.0132 0.0113
## [541] 0.0423 0.0092 0.0065 0.0891 0.0511 0.0121 0.0183 0.0073 0.0080 0.0508
## [551] 0.0459 0.0063 0.0098 0.0225 0.0063 0.0119 0.0298 0.0109 0.0099 0.0250
## [561] 0.0089 0.0263 0.0127 0.0170 0.0253 0.0471 0.9538 0.0347 0.0500 0.0173
## [571] 0.0139 0.0101 0.0117 0.0228 0.0095 0.0283 0.0108 0.0120 0.0317 0.0163
## [581] 0.0092 0.0304 0.0070 0.0080 0.0076 0.0103 0.0455 0.0349 0.0122 0.0200
## [591] 0.0095 0.0273 0.0088 0.0133 0.0667 0.0180 0.0080 0.0471 0.0143 0.0064
## [601] 0.0268 0.0143 0.0063 0.0626 0.0358 0.0063 0.0071 0.0076 0.0112 0.0136
## [611] 0.0134 0.0065 0.0112 0.0072 0.0212 0.0171 0.0156 0.0400 0.0068 0.0090
## [621] 0.0212 0.0201 0.0087 0.0401 0.0185 0.0086 0.0067 0.0119 0.0087 0.0132
## [631] 0.0152 0.0155 0.0223 0.0321 0.0068 0.0216 0.0526 0.0111 0.0226 0.0085
## [641] 0.0072 0.0188 0.0238 0.0076 0.0182 0.0160 0.0094 0.0256 0.0142 0.0096
## [651] 0.0093 0.0080 0.0200 0.0133 0.0099 0.0108 0.0241 0.0070 0.0121 0.0075
## [661] 0.0169 0.0500 0.0109 0.0256 0.0093 0.0107 0.0094 0.0080 0.0273 0.0271
## [671] 0.0168 0.0076 0.0084 0.0147 0.0117 0.0204 0.0086 0.0303 0.0105 0.0085
## [681] 0.0078 0.0088 0.0096 0.0132 0.0091 0.0080 0.0093 0.0229 0.0131 0.0283
## [691] 0.0310 0.0063 0.0167 0.0164 0.0241 0.0724 0.0096 0.0379 0.0502 0.0107
## [701] 0.0159 0.0072 0.0112 0.0196 0.0185 0.0067 0.0110 0.0105 0.0075 0.0094
## [711] 0.0114 0.0086 0.0108 0.0244 0.0119 0.2000 0.0202 0.0677 0.0262 0.0082
## [721] 0.0125 0.0263 0.0087 0.0088 0.1771 0.0101 0.0230 0.0084 0.0136 0.0127
## [731] 0.0100 0.0132 0.0328 0.0179 0.0196 0.0254 0.0294 0.0286 0.0085 0.0158
## [741] 0.0078 0.0106 0.0065 0.0066 0.0150 0.0116 0.0080 0.0210 0.0197 0.0367
## [751] 0.0181 0.0083 0.0179 0.0118 0.0500 0.0444 0.0129 0.0323 0.0097 0.0119
## [761] 0.0170 0.0095 0.3333 0.0154 0.0196 0.0081 0.0111 0.0086 0.1250 0.0227
## [771] 0.0642 0.0238 0.3425 0.0233 0.0333 0.0090 0.0182 0.0092 0.0169 0.0320
## [781] 0.0176 0.0134 0.0103 0.0189 0.0185 0.0129 0.0100 0.0107 0.0130 0.0204
## [791] 0.0091 0.0091 0.0157 0.0101 0.0114 0.0070 0.0094 0.0066 0.0063 0.0476
## [801] 0.0207 0.0072 0.0066 0.0084 0.0140 0.0208 0.0267 0.0357 0.0676 0.0175
## [811] 0.0143 0.0159 0.0217 0.0093 0.0543 0.0200 0.0847 0.0175 0.0138 0.0063
## [821] 0.0151 0.0063 0.0156 0.0080 0.0159 0.0066 0.0133 0.0256 0.0690 0.0366
## [831] 0.0139 0.0341 0.0136 0.0089 0.0112 0.0089 0.0156 0.0519 0.0086 0.0094
## [841] 0.0156 0.0066 0.0085 0.0253 0.0067 0.0588 0.0108 0.0206 0.0084 0.0074
## [851] 0.0086 0.0242 0.0089 0.0345 0.0294 0.0632 0.0526 0.0068 0.0173 0.0089
## [861] 0.0199 0.0105 0.0200 0.1842 0.0132 0.0149
outliers_UGDS_NHPI <- boxplot(projectdata$UGDS_NHPI, plot=FALSE)$out
# logical indexing keeps all rows when there are no outliers (unlike -which())
projectdata_UGDS_NHPI_Trimmed <- projectdata[!(projectdata$UGDS_NHPI %in% outliers_UGDS_NHPI),]
boxplot(projectdata_UGDS_NHPI_Trimmed$UGDS_NHPI)
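The three trimming steps above follow the same pattern, so they could be wrapped in one helper. The sketch below is only an illustration on made-up data: the function name `trim_boxplot_outliers` and the data frames `toy` and `clean` are hypothetical, and it assumes that boxplot.stats() with its default coef = 1.5 flags the same points as boxplot(..., plot=FALSE)$out.

```r
# Drop rows whose value in `col` is flagged by the boxplot rule
# (boxplot.stats() with its default coef = 1.5). Rows with NA are kept.
trim_boxplot_outliers <- function(df, col) {
  out <- boxplot.stats(df[[col]])$out
  df[!(df[[col]] %in% out), , drop = FALSE]
}

# Hypothetical toy data with one extreme value and one missing value.
toy <- data.frame(id = 1:7, value = c(10, 11, 12, 13, 14, NA, 500))
trimmed <- trim_boxplot_outliers(toy, "value")
print(trimmed)

# With no outliers, logical indexing keeps every row, whereas
# df[-which(...), ] would return zero rows because -integer(0) selects nothing.
clean <- data.frame(id = 1:3, value = c(1, 2, 3))
print(nrow(trim_boxplot_outliers(clean, "value")))
```

Note that the rule removes every point beyond 1.5 times the interquartile range from the hinges, which for a heavily skewed proportion such as UGDS_NHPI flags many legitimate small values, not only data errors.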


###Question 2

Plotting the scatterplot matrix for the variables C150_4, SAT_AVG, UGDS_WHITE and PCTFLOAN:

pairs(~ C150_4 + SAT_AVG + UGDS_WHITE + PCTFLOAN, data = projectdata, row1attop=FALSE)

The scatterplot matrix suggests a positive linear relation between C150_4 and SAT_AVG and a negative linear relation between SAT_AVG and PCTFLOAN, with the former appearing stronger. The remaining pairs of variables do not show a clear linear relation.
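A visual impression like this is easier to check when the scatterplot matrix also prints the correlations. The following is a minimal sketch on simulated data, not the project data: the panel function `panel_cor` is a hypothetical helper, and the columns of `toy` only mimic the real variable names.

```r
# Upper-panel function that prints the pairwise Pearson correlation.
panel_cor <- function(x, y, ...) {
  old <- par(usr = c(0, 1, 0, 1))   # temporarily use unit coordinates
  on.exit(par(old))
  r <- cor(x, y, use = "complete.obs")
  text(0.5, 0.5, sprintf("r = %.2f", r))
}

# Hypothetical simulated data; the column names only mimic the real ones.
set.seed(1)
sat <- rnorm(100, mean = 1050, sd = 130)
toy <- data.frame(
  SAT_AVG  = sat,
  C150_4   = 0.5 + 0.001 * (sat - 1050) + rnorm(100, sd = 0.05),
  PCTFLOAN = 0.5 - 0.001 * (sat - 1050) + rnorm(100, sd = 0.15)
)

# Scatterplots below the diagonal, correlation coefficients above it.
pairs(~ C150_4 + SAT_AVG + PCTFLOAN, data = toy, upper.panel = panel_cor)
```

Passing the same upper.panel argument to the pairs() call on projectdata would put the coefficient next to each scatterplot.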

###Question 3

Finding the covariance matrix:

projectdata_matrix_quant<-data.matrix(projectdata[7:30], rownames.force = NA)
covar_matrix<-cov(projectdata_matrix_quant, y = projectdata_matrix_quant, use = "pairwise.complete.obs", method = "pearson")
print(covar_matrix)
##                   ADM_RATE        SAT_AVG            UGDS    UGDS_WHITE
## ADM_RATE       0.043651078     -9.1520820      -168.15310   0.005891843
## SAT_AVG       -9.152081974  17784.0838250    243791.75977   6.366862739
## UGDS        -168.153102500 243791.7597651  29581070.83881   6.762644608
## UGDS_WHITE     0.005891843      6.3668627         6.76264   0.082554181
## UGDS_BLACK    -0.001159646    -11.6270340      -119.05278  -0.032796138
## UGDS_HISP      0.000754761     -0.3110483        36.36321  -0.035446161
## UGDS_ASIAN    -0.002837813      3.7112738        52.37768  -0.004642505
## UGDS_AIAN      0.000320177     -0.3600650       -14.31268  -0.001798072
## UGDS_NHPI      0.000249786     -0.0712613        -2.65967  -0.000876311
## UGDS_2MOR     -0.000380094      0.7298147        17.13435  -0.000212685
## UGDS_NRA      -0.002637014      2.2638397        30.74179  -0.000971644
## UGDS_UNKN     -0.000202460     -0.7023401        -5.01941  -0.005300157
## PPTUG_EF       0.002788057     -5.8663104       217.06344  -0.001857259
## NPT4_PUB      22.287908711 154640.1284935   5346029.39297 260.162736663
## NPT4_PRIV   -162.639885169 314374.7201384   3119630.40280  81.207081643
## COSTT4_A    -746.514831982 925104.2174771 -13647207.01894 295.148815436
## TUITFTE     -215.481576282 467680.9591030  -4938152.91700 -53.860150986
## INEXPFTE    -669.256980891 735007.6231667   2801231.36981 228.723734132
## PFTFAC        -0.010183456      5.2872382        94.63431   0.014265993
## PCTPELL        0.010327329    -13.7517582      -319.31466  -0.022802708
## C150_4        -0.012291160     18.6214134       212.97709   0.013378217
## PFTFTUG1_EF   -0.002831299     10.4913317      -113.23971   0.005217475
## RET_FT4       -0.006093242     11.1577961       236.65239   0.011387344
## PCTFLOAN       0.004891957    -11.5521207      -281.30726   0.000890360
##                 UGDS_BLACK      UGDS_HISP    UGDS_ASIAN      UGDS_AIAN
## ADM_RATE      -0.001159646    0.000754761  -0.002837813    0.000320177
## SAT_AVG      -11.627033986   -0.311048251   3.711273821   -0.360064984
## UGDS        -119.052780239   36.363207141  52.377679715  -14.312681861
## UGDS_WHITE    -0.032796138   -0.035446161  -0.004642505   -0.001798072
## UGDS_BLACK     0.050492190   -0.010256731  -0.002185419   -0.001476804
## UGDS_HISP     -0.010256731    0.049550061   0.000552435   -0.000922533
## UGDS_ASIAN    -0.002185419    0.000552435   0.005654128   -0.000255289
## UGDS_AIAN     -0.001476804   -0.000922533  -0.000255289    0.004851592
## UGDS_NHPI     -0.000372339   -0.000116499   0.000258271   -0.000012265
## UGDS_2MOR     -0.000678691   -0.000615635   0.000250582   -0.000037438
## UGDS_NRA      -0.001361128   -0.000636489   0.000609475   -0.000118046
## UGDS_UNKN     -0.001174675   -0.001946117  -0.000207806   -0.000217406
## PPTUG_EF       0.001450254   -0.000362191   0.000718448    0.000702498
## NPT4_PUB     -24.657390012 -204.326024722   7.956597567  -50.327450792
## NPT4_PRIV    -48.454593345 -322.347557680  44.368208662  -19.539684747
## COSTT4_A    -189.039018340 -465.880822029 116.110053363 -120.299835993
## TUITFTE       55.258771677 -178.241323861  33.395045806  -53.976849602
## INEXPFTE    -177.902368858 -210.801636229  59.561253236    8.012348373
## PFTFAC        -0.008998135   -0.006329072  -0.000339975    0.001005123
## PCTPELL        0.017991040    0.010375593  -0.002565856    0.000186754
## C150_4        -0.013754691   -0.002437542   0.003741012   -0.001613990
## PFTFTUG1_EF   -0.003383117   -0.001987702  -0.000371023   -0.000431654
## RET_FT4       -0.012263766   -0.000788216   0.002978017   -0.000813807
## PCTFLOAN       0.013667358   -0.011181201  -0.003315926   -0.002840080
##                  UGDS_NHPI     UGDS_2MOR       UGDS_NRA      UGDS_UNKN
## ADM_RATE      0.0002497858 -0.0003800944  -0.0026370141  -0.0002024596
## SAT_AVG      -0.0712613221  0.7298147184   2.2638396660  -0.7023401403
## UGDS         -2.6596659709 17.1343532577  30.7417895619  -5.0194075289
## UGDS_WHITE   -0.0008763112 -0.0002126852  -0.0009716440  -0.0053001574
## UGDS_BLACK   -0.0003723387 -0.0006786910  -0.0013611281  -0.0011746753
## UGDS_HISP    -0.0001164985 -0.0006156348  -0.0006364885  -0.0019461172
## UGDS_ASIAN    0.0002582712  0.0002505822   0.0006094753  -0.0002078060
## UGDS_AIAN    -0.0000122650 -0.0000374380  -0.0001180458  -0.0002174059
## UGDS_NHPI     0.0010805755  0.0000674821  -0.0000104207  -0.0000139173
## UGDS_2MOR     0.0000674821  0.0009724945   0.0000467315   0.0002310853
## UGDS_NRA     -0.0000104207  0.0000467315   0.0024923038  -0.0000347095
## UGDS_UNKN    -0.0000139173  0.0002310853  -0.0000347095   0.0087089875
## PPTUG_EF      0.0000935616  0.0002083970  -0.0009876967   0.0000339962
## NPT4_PUB    -17.9221260879  5.1570790469  21.8246296769   2.1374055035
## NPT4_PRIV    -0.0893201556 43.8284227786  92.2463517510 127.9576167144
## COSTT4_A    -25.8975956709 38.5875710136 213.3744366696 133.4981393558
## TUITFTE      -2.3474377591 24.1699733818  74.0864556336  97.8220751594
## INEXPFTE     -0.5790653933 19.2105469156  73.9838572128  -0.9153065496
## PFTFAC       -0.0000151138  0.0001831850   0.0019258589  -0.0017790384
## PCTPELL       0.0002954484 -0.0007441574  -0.0031394392   0.0004440981
## C150_4       -0.0000527187  0.0006519264   0.0029001986  -0.0029494838
## PFTFTUG1_EF  -0.0000365025 -0.0000549873   0.0023148606  -0.0012676337
## RET_FT4      -0.0001330618  0.0004767810   0.0026655226  -0.0035090560
## PCTFLOAN     -0.0004383001  0.0004538808  -0.0013690732   0.0042744289
##                     PPTUG_EF       NPT4_PUB        NPT4_PRIV
## ADM_RATE        0.0027880573       22.28791     -162.6398852
## SAT_AVG        -5.8663104270   154640.12849   314374.7201384
## UGDS          217.0634360336  5346029.39297  3119630.4028008
## UGDS_WHITE     -0.0018572589      260.16274       81.2070816
## UGDS_BLACK      0.0014502540      -24.65739      -48.4545933
## UGDS_HISP      -0.0003621910     -204.32602     -322.3475577
## UGDS_ASIAN      0.0007184478        7.95660       44.3682087
## UGDS_AIAN       0.0007024979      -50.32745      -19.5396847
## UGDS_NHPI       0.0000935616      -17.92213       -0.0893202
## UGDS_2MOR       0.0002083970        5.15708       43.8284228
## UGDS_NRA       -0.0009876967       21.82463       92.2463518
## UGDS_UNKN       0.0000339962        2.13741      127.9576167
## PPTUG_EF        0.0607084636     -494.74628      -33.1514380
## NPT4_PUB     -494.7462837076 21805832.12310               NA
## NPT4_PRIV     -33.1514380193             NA 52883812.8273432
## COSTT4_A    -1464.0351911282 20420278.27211 60259561.8222371
## TUITFTE      -214.9796535216 11215184.60953 29011477.6565112
## INEXPFTE      -79.9221479595  6239926.76979 12658912.3186321
## PFTFAC         -0.0175401442      292.43019      -69.2458067
## PCTPELL        -0.0073582758      -82.78810     -499.5590628
## C150_4         -0.0171589736      450.89530      520.9487913
## PFTFTUG1_EF    -0.0427906546      359.34794      236.9126486
## RET_FT4        -0.0137613119      161.13032      293.7806259
## PCTFLOAN       -0.0160746129      729.64347      525.8665706
##                   COSTT4_A         TUITFTE         INEXPFTE         PFTFAC
## ADM_RATE         -746.5148      -215.48158      -669.256981  -0.0101834565
## SAT_AVG        925104.2175    467680.95910    735007.623167   5.2872381713
## UGDS        -13647207.0189  -4938152.91700   2801231.369813  94.6343055005
## UGDS_WHITE        295.1488       -53.86015       228.723734   0.0142659935
## UGDS_BLACK       -189.0390        55.25877      -177.902369  -0.0089981355
## UGDS_HISP        -465.8808      -178.24132      -210.801636  -0.0063290716
## UGDS_ASIAN        116.1101        33.39505        59.561253  -0.0003399748
## UGDS_AIAN        -120.2998       -53.97685         8.012348   0.0010051229
## UGDS_NHPI         -25.8976        -2.34744        -0.579065  -0.0000151138
## UGDS_2MOR          38.5876        24.16997        19.210547   0.0001831850
## UGDS_NRA          213.3744        74.08646        73.983857   0.0019258589
## UGDS_UNKN         133.4981        97.82208        -0.915307  -0.0017790384
## PPTUG_EF        -1464.0352      -214.97965       -79.922148  -0.0175401442
## NPT4_PUB     20420278.2721  11215184.60953   6239926.769789 292.4301896586
## NPT4_PRIV    60259561.8222  29011477.65651  12658912.318632 -69.2458066978
## COSTT4_A    162884718.9414  75891018.51725  38078362.824948 498.6624125201
## TUITFTE      75891018.5173 301920777.61712 161489240.074271 202.8546875181
## INEXPFTE     38078362.8249 161489240.07427 161959774.003822 732.1314752178
## PFTFAC            498.6624       202.85469       732.131475   0.0937981354
## PCTPELL          -686.4275      -208.65408      -566.606051  -0.0172571941
## C150_4           1501.5868       555.91617       763.661161   0.0186312462
## PFTFTUG1_EF      1522.7458       746.49789       450.675307   0.0170839536
## RET_FT4           809.5945       266.67038       530.979354   0.0159396898
## PCTFLOAN         1551.4418       664.30383      -178.500531  -0.0036342407
##                    PCTPELL          C150_4     PFTFTUG1_EF       RET_FT4
## ADM_RATE       0.010327329   -0.0122911600   -0.0028312987  -0.006093242
## SAT_AVG      -13.751758181   18.6214133765   10.4913316735  11.157796086
## UGDS        -319.314661688  212.9770901298 -113.2397061814 236.652385191
## UGDS_WHITE    -0.022802708    0.0133782167    0.0052174752   0.011387344
## UGDS_BLACK     0.017991040   -0.0137546905   -0.0033831173  -0.012263766
## UGDS_HISP      0.010375593   -0.0024375422   -0.0019877022  -0.000788216
## UGDS_ASIAN    -0.002565856    0.0037410123   -0.0003710226   0.002978017
## UGDS_AIAN      0.000186754   -0.0016139900   -0.0004316542  -0.000813807
## UGDS_NHPI      0.000295448   -0.0000527187   -0.0000365025  -0.000133062
## UGDS_2MOR     -0.000744157    0.0006519264   -0.0000549873   0.000476781
## UGDS_NRA      -0.003139439    0.0029001986    0.0023148606   0.002665523
## UGDS_UNKN      0.000444098   -0.0029494838   -0.0012676337  -0.003509056
## PPTUG_EF      -0.007358276   -0.0171589736   -0.0427906546  -0.013761312
## NPT4_PUB     -82.788101963  450.8952965380  359.3479413797 161.130321649
## NPT4_PRIV   -499.559062819  520.9487912600  236.9126486444 293.780625852
## COSTT4_A    -686.427546284 1501.5867861709 1522.7458450175 809.594518252
## TUITFTE     -208.654081105  555.9161716995  746.4978857114 266.670382196
## INEXPFTE    -566.606050543  763.6611606261  450.6753065749 530.979353542
## PFTFAC        -0.017257194    0.0186312462    0.0170839536   0.015939690
## PCTPELL        0.051049405   -0.0229234103   -0.0014985256  -0.018235528
## C150_4        -0.022923410    0.0457005844    0.0217898121   0.023206013
## PFTFTUG1_EF   -0.001498526    0.0217898121    0.0661155441   0.018617523
## RET_FT4       -0.018235528    0.0232060134    0.0186175229   0.038276927
## PCTFLOAN       0.026869500   -0.0044591631    0.0156008719  -0.010957764
##                   PCTFLOAN
## ADM_RATE       0.004891957
## SAT_AVG      -11.552120720
## UGDS        -281.307264777
## UGDS_WHITE     0.000890360
## UGDS_BLACK     0.013667358
## UGDS_HISP     -0.011181201
## UGDS_ASIAN    -0.003315926
## UGDS_AIAN     -0.002840080
## UGDS_NHPI     -0.000438300
## UGDS_2MOR      0.000453881
## UGDS_NRA      -0.001369073
## UGDS_UNKN      0.004274429
## PPTUG_EF      -0.016074613
## NPT4_PUB     729.643474564
## NPT4_PRIV    525.866570647
## COSTT4_A    1551.441788778
## TUITFTE      664.303830814
## INEXPFTE    -178.500530954
## PFTFAC        -0.003634241
## PCTPELL        0.026869500
## C150_4        -0.004459163
## PFTFTUG1_EF    0.015600872
## RET_FT4       -0.010957764
## PCTFLOAN       0.080706267
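The NA between NPT4_PUB and NPT4_PRIV in the matrix above is what pairwise.complete.obs returns when two columns have no rows in which both are observed. Presumably each institution reports only one of the two net prices, but that is an assumption and has not been checked against the data. The sketch below reproduces the effect on made-up data; the data frame `toy` and its column names are hypothetical.

```r
# Hypothetical toy data: pub_price and priv_price are never observed together.
toy <- data.frame(
  pub_price  = c(9000, 11000, 12000, NA, NA, NA),
  priv_price = c(NA, NA, NA, 20000, 24000, 21000),
  cost       = c(15000, 18000, 20000, 30000, 36000, 31000)
)

# Pairwise covariances: each entry uses only the rows where both columns exist.
cv <- cov(toy, use = "pairwise.complete.obs")
print(cv)

# Count how many rows are jointly observed for each pair of columns.
pair_counts <- crossprod(!is.na(as.matrix(toy)))
print(pair_counts)
```

A side effect of pairwise deletion is that each entry can rest on a different subset of rows, so the entries of the matrix are not all based on the same sample size.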

Finding the correlation matrix:

correlation_matrix<-cor(projectdata_matrix_quant, y = projectdata_matrix_quant, use = "pairwise.complete.obs", method = "pearson")
print(correlation_matrix)
##               ADM_RATE    SAT_AVG        UGDS  UGDS_WHITE UGDS_BLACK
## ADM_RATE     1.0000000 -0.3543624 -0.12727960  0.11168300 -0.0290549
## SAT_AVG     -0.3543624  1.0000000  0.24961110  0.21669481 -0.4694846
## UGDS        -0.1272796  0.2496111  1.00000000  0.00432753 -0.0974138
## UGDS_WHITE   0.1116830  0.2166948  0.00432753  1.00000000 -0.5079735
## UGDS_BLACK  -0.0290549 -0.4694846 -0.09741380 -0.50797351  1.0000000
## UGDS_HISP    0.0203580 -0.0207152  0.03003538 -0.55421410 -0.2050571
## UGDS_ASIAN  -0.2185492  0.4657344  0.12807268 -0.21488201 -0.1293421
## UGDS_AIAN    0.0465511 -0.0835808 -0.03778087 -0.08984532 -0.0943558
## UGDS_NHPI    0.0312677 -0.0727155 -0.01487622 -0.09278148 -0.0504079
## UGDS_2MOR   -0.0712327  0.2437690  0.10102229 -0.02373690 -0.0968537
## UGDS_NRA    -0.2261192  0.3609904  0.11321962 -0.06773875 -0.1213351
## UGDS_UNKN   -0.0153158 -0.1044088 -0.00988921 -0.19766756 -0.0560173
## PPTUG_EF     0.0790390 -0.3470369  0.16177720 -0.02627129  0.0261768
## NPT4_PUB     0.0278255  0.3416979  0.14680391  0.21739600 -0.0298171
## NPT4_PRIV   -0.1054266  0.3629225  0.13204619  0.03835043 -0.0278392
## COSTT4_A    -0.2775023  0.5211713 -0.15917285  0.08629270 -0.0715568
## TUITFTE     -0.1414303  0.5252813 -0.05231641 -0.01081433  0.0141871
## INEXPFTE    -0.4089865  0.6351096  0.04265807  0.06601190 -0.0656525
## PFTFAC      -0.1725667  0.1689907  0.04540840  0.17884731 -0.1470991
## PCTPELL      0.2448835 -0.7122496 -0.25943897 -0.35141655  0.3540659
## C150_4      -0.3201865  0.8014783  0.13824336  0.23617301 -0.3172370
## PFTFTUG1_EF -0.0590828  0.3992876 -0.06883520  0.07690362 -0.0645547
## RET_FT4     -0.2102650  0.7419929  0.16824880  0.21713437 -0.3053423
## PCTFLOAN     0.1076242 -0.5103375 -0.18178246  0.01091330  0.2139277
##               UGDS_HISP UGDS_ASIAN  UGDS_AIAN    UGDS_NHPI   UGDS_2MOR
## ADM_RATE     0.02035797 -0.2185492  0.0465511  0.031267683 -0.07123272
## SAT_AVG     -0.02071523  0.4657344 -0.0835808 -0.072715506  0.24376903
## UGDS         0.03003538  0.1280727 -0.0377809 -0.014876220  0.10102229
## UGDS_WHITE  -0.55421410 -0.2148820 -0.0898453 -0.092781475 -0.02373690
## UGDS_BLACK  -0.20505708 -0.1293421 -0.0943558 -0.050407889 -0.09685375
## UGDS_HISP    1.00000000  0.0330047 -0.0595001 -0.015921018 -0.08868649
## UGDS_ASIAN   0.03300474  1.0000000 -0.0487424  0.104487793  0.10686212
## UGDS_AIAN   -0.05950010 -0.0487424  1.0000000 -0.005356698 -0.01723561
## UGDS_NHPI   -0.01592102  0.1044878 -0.0053567  1.000000000  0.06582903
## UGDS_2MOR   -0.08868649  0.1068621 -0.0172356  0.065829028  1.00000000
## UGDS_NRA    -0.05727538  0.1623577 -0.0339475 -0.006349936  0.03001691
## UGDS_UNKN   -0.09368349 -0.0296136 -0.0334461 -0.004536746  0.07940446
## PPTUG_EF    -0.00659837  0.0391123  0.0408755  0.011534641  0.02711095
## NPT4_PUB    -0.24145384  0.0294243 -0.0978197 -0.077583855  0.03663669
## NPT4_PRIV   -0.18494949  0.0824256 -0.0605881 -0.000501632  0.19581515
## COSTT4_A    -0.18425313  0.1582516 -0.1107736 -0.050629060  0.10983277
## TUITFTE     -0.04620914  0.0257312 -0.0446480 -0.004114318  0.04468052
## INEXPFTE    -0.07855471  0.0659662  0.0095265 -0.001458849  0.05104587
## PFTFAC      -0.10325707 -0.0193071  0.0380120 -0.001318111  0.02074955
## PCTPELL      0.20614434 -0.1522731  0.0118459  0.039706884 -0.10555957
## C150_4      -0.06106946  0.2888132 -0.1082722 -0.007114467  0.12051172
## PFTFTUG1_EF -0.04066460 -0.0246686 -0.0186501 -0.003444423 -0.00792163
## RET_FT4     -0.02128777  0.2524328 -0.0626292 -0.018949487  0.09734256
## PCTFLOAN    -0.17668579 -0.1565130 -0.1432786 -0.046850088  0.05120700
##                UGDS_NRA    UGDS_UNKN    PPTUG_EF   NPT4_PUB    NPT4_PRIV
## ADM_RATE    -0.22611922 -0.015315791  0.07903904  0.0278255 -0.105426630
## SAT_AVG      0.36099040 -0.104408770 -0.34703695  0.3416979  0.362922516
## UGDS         0.11321962 -0.009889212  0.16177720  0.1468039  0.132046190
## UGDS_WHITE  -0.06773875 -0.197667558 -0.02627129  0.2173960  0.038350432
## UGDS_BLACK  -0.12133509 -0.056017266  0.02617678 -0.0298171 -0.027839186
## UGDS_HISP   -0.05727538 -0.093683489 -0.00659837 -0.2414538 -0.184949487
## UGDS_ASIAN   0.16235768 -0.029613622  0.03911231  0.0294243  0.082425571
## UGDS_AIAN   -0.03394752 -0.033446091  0.04087551 -0.0978197 -0.060588083
## UGDS_NHPI   -0.00634994 -0.004536746  0.01153464 -0.0775839 -0.000501632
## UGDS_2MOR    0.03001691  0.079404465  0.02711095  0.0366367  0.195815154
## UGDS_NRA     1.00000000 -0.007450126 -0.08280563  0.1859134  0.247282716
## UGDS_UNKN   -0.00745013  1.000000000  0.00147759  0.0119817  0.166799674
## PPTUG_EF    -0.08280563  0.001477585  1.00000000 -0.4557973 -0.022019218
## NPT4_PUB     0.18591339  0.011981654 -0.45579731  1.0000000           NA
## NPT4_PRIV    0.24728272  0.166799674 -0.02201922         NA  1.000000000
## COSTT4_A     0.31615029  0.115186024 -0.48886305  0.9227081  0.708253429
## TUITFTE      0.08550662  0.060405858 -0.05028508  0.5806977  0.552026283
## INEXPFTE     0.12273734 -0.000812433 -0.02684930  0.2999074  0.269072986
## PFTFAC       0.11459295 -0.076987123 -0.23922490  0.2348420 -0.031521872
## PCTPELL     -0.28692824  0.021042973 -0.13224343 -0.1060222 -0.308497254
## C150_4       0.23192301 -0.144386225 -0.40474959  0.5679805  0.336135199
## PFTFTUG1_EF  0.15824294 -0.063471918 -0.71555914  0.4000409  0.121247282
## RET_FT4      0.21675128 -0.193997676 -0.36163965  0.3238603  0.190148653
## PCTFLOAN    -0.09951822  0.161087155 -0.22986218  0.6133979  0.291874643
##               COSTT4_A     TUITFTE     INEXPFTE      PFTFAC    PCTPELL
## ADM_RATE    -0.2775023 -0.14143032 -0.408986460 -0.17256667  0.2448835
## SAT_AVG      0.5211713  0.52528127  0.635109575  0.16899071 -0.7122496
## UGDS        -0.1591728 -0.05231641  0.042658066  0.04540840 -0.2594390
## UGDS_WHITE   0.0862927 -0.01081433  0.066011899  0.17884731 -0.3514166
## UGDS_BLACK  -0.0715568  0.01418705 -0.065652524 -0.14709910  0.3540659
## UGDS_HISP   -0.1842531 -0.04620914 -0.078554714 -0.10325707  0.2061443
## UGDS_ASIAN   0.1582516  0.02573125  0.065966199 -0.01930712 -0.1522731
## UGDS_AIAN   -0.1107736 -0.04464802  0.009526498  0.03801198  0.0118459
## UGDS_NHPI   -0.0506291 -0.00411432 -0.001458849 -0.00131811  0.0397069
## UGDS_2MOR    0.1098328  0.04468052  0.051045871  0.02074955 -0.1055596
## UGDS_NRA     0.3161503  0.08550662  0.122737340  0.11459295 -0.2869282
## UGDS_UNKN    0.1151860  0.06040586 -0.000812433 -0.07698712  0.0210430
## PPTUG_EF    -0.4888631 -0.05028508 -0.026849302 -0.23922490 -0.1322434
## NPT4_PUB     0.9227081  0.58069767  0.299907392  0.23484204 -0.1060222
## NPT4_PRIV    0.7082534  0.55202628  0.269072986 -0.03152187 -0.3084973
## COSTT4_A     1.0000000  0.76021421  0.455454439  0.12116344 -0.2527832
## TUITFTE      0.7602142  1.00000000  0.730286898  0.03024187 -0.0532675
## INEXPFTE     0.4554544  0.73028690  1.000000000  0.15337005 -0.2077151
## PFTFAC       0.1211634  0.03024187  0.153370047  1.00000000 -0.2773455
## PCTPELL     -0.2527832 -0.05326752 -0.207715114 -0.27734550  1.0000000
## C150_4       0.5508938  0.37689253  0.470649399  0.29737479 -0.5196974
## PFTFTUG1_EF  0.4500205  0.36173404  0.266256095  0.22371045 -0.0282553
## RET_FT4      0.3208939  0.19783707  0.350911976  0.30244389 -0.4661687
## PCTFLOAN     0.4346896  0.13489866 -0.052051315 -0.04399367  0.4186110
##                  C150_4 PFTFTUG1_EF    RET_FT4   PCTFLOAN
## ADM_RATE    -0.32018650 -0.05908279 -0.2102650  0.1076242
## SAT_AVG      0.80147835  0.39928761  0.7419929 -0.5103375
## UGDS         0.13824336 -0.06883520  0.1682488 -0.1817825
## UGDS_WHITE   0.23617301  0.07690362  0.2171344  0.0109133
## UGDS_BLACK  -0.31723700 -0.06455469 -0.3053423  0.2139277
## UGDS_HISP   -0.06106946 -0.04066460 -0.0212878 -0.1766858
## UGDS_ASIAN   0.28881316 -0.02466865  0.2524328 -0.1565130
## UGDS_AIAN   -0.10827221 -0.01865012 -0.0626292 -0.1432786
## UGDS_NHPI   -0.00711447 -0.00344442 -0.0189495 -0.0468501
## UGDS_2MOR    0.12051172 -0.00792163  0.0973426  0.0512070
## UGDS_NRA     0.23192301  0.15824294  0.2167513 -0.0995182
## UGDS_UNKN   -0.14438622 -0.06347192 -0.1939977  0.1610872
## PPTUG_EF    -0.40474959 -0.71555914 -0.3616396 -0.2298622
## NPT4_PUB     0.56798052  0.40004087  0.3238603  0.6133979
## NPT4_PRIV    0.33613520  0.12124728  0.1901487  0.2918746
## COSTT4_A     0.55089376  0.45002045  0.3208939  0.4346896
## TUITFTE      0.37689253  0.36173404  0.1978371  0.1348987
## INEXPFTE     0.47064940  0.26625610  0.3509120 -0.0520513
## PFTFAC       0.29737479  0.22371045  0.3024439 -0.0439937
## PCTPELL     -0.51969735 -0.02825530 -0.4661687  0.4186110
## C150_4       1.00000000  0.42646167  0.5810345 -0.0934037
## PFTFTUG1_EF  0.42646167  1.00000000  0.4135862  0.2171052
## RET_FT4      0.58103449  0.41358619  1.0000000 -0.2543560
## PCTFLOAN    -0.09340366  0.21710522 -0.2543560  1.0000000
correlation_matrix_round<-round(correlation_matrix, 2)
library(corrplot)
## corrplot 0.84 loaded
corrplot(correlation_matrix_round, method="circle")

The variable pairs with notable correlations were not identified in R; Excel was used instead to pick out the correlations above 0.3 or below -0.3. A cutoff on the size of a correlation is not a significance test, so the pairs below are described as correlated rather than significant.
-SAT_AVG is correlated with UGDS_ASIAN (0.47), UGDS_NRA (0.36), NPT4_PUB (0.34), NPT4_PRIV (0.36), COSTT4_A (0.52), TUITFTE (0.53), INEXPFTE (0.64), C150_4 (0.80), PFTFTUG1_EF (0.40) and RET_FT4 (0.74).
-UGDS_BLACK is correlated with PCTPELL (0.35) and C150_4 (-0.32).
-UGDS_NRA is correlated with COSTT4_A (0.32).
-NPT4_PUB is correlated with COSTT4_A (0.92), TUITFTE (0.58), C150_4 (0.57), PFTFTUG1_EF (0.40), RET_FT4 (0.32) and PCTFLOAN (0.61).
-NPT4_PRIV is correlated with COSTT4_A (0.71), TUITFTE (0.55) and C150_4 (0.34).
-COSTT4_A is correlated with TUITFTE (0.76), INEXPFTE (0.46), C150_4 (0.55), PFTFTUG1_EF (0.45), RET_FT4 (0.32) and PCTFLOAN (0.43).
-TUITFTE is correlated with INEXPFTE (0.73), C150_4 (0.38) and PFTFTUG1_EF (0.36).
-INEXPFTE is correlated with C150_4 (0.47) and RET_FT4 (0.35).
-PCTPELL is correlated with PCTFLOAN (0.42).

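In the correlation matrix visualization, circle size and color intensity scale with the magnitude of the correlation, so cells that look blank correspond to correlations close to zero, i.e. no notable linear relation.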
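The Excel step described above (picking out correlations above 0.3 or below -0.3) can also be done in R. The following is a minimal sketch, not the method used in this report: `strong_pairs` is a hypothetical helper name, and the built-in `mtcars` data stands in for `projectdata_matrix_quant`.

```r
# Toy stand-in for correlation_matrix; mtcars ships with base R.
cm <- cor(mtcars[, c("mpg", "disp", "hp", "wt", "qsec")],
          use = "pairwise.complete.obs", method = "pearson")

# Hypothetical helper: list each variable pair once when |r| exceeds a cutoff.
strong_pairs <- function(cm, cutoff = 0.3) {
  # upper.tri() keeps one copy of each pair and drops the diagonal;
  # NA entries (such as NPT4_PUB vs NPT4_PRIV) are skipped.
  keep <- upper.tri(cm) & !is.na(cm) & abs(cm) > cutoff
  idx <- which(keep, arr.ind = TRUE)
  out <- data.frame(
    var1 = rownames(cm)[idx[, "row"]],
    var2 = colnames(cm)[idx[, "col"]],
    r    = round(cm[idx], 2)
  )
  # Strongest pairs first.
  out[order(-abs(out$r)), ]
}

print(strong_pairs(cm, cutoff = 0.3))
```

Passing `correlation_matrix` instead of `cm` should give the same kind of table for the project data. For an actual significance test on a single pair, base R's `cor.test()` reports a p-value.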

###Question 4

Strongest relation: INEXPFTE and TUITFTE have the largest covariance, 161489240.1. The correlation matrix gives their correlation as 0.73028690.

Weakest relation: UGDS_NRA and UGDS_NHPI have the covariance closest to zero, -0.0000104207. The correlation matrix gives their correlation as -0.006349936.
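Covariance depends on the units of the variables, so dollar-valued variables tend to have large covariances and proportion-valued variables small ones, whatever the strength of the relation. The sketch below illustrates this on simulated toy data (an assumption for illustration, not the project data): rescaling changes the covariance but not the correlation.

```r
# Simulated toy data, not the project data.
set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)

small <- cbind(x = x, y = y)                # original units
large <- cbind(x = 1000 * x, y = 1000 * y)  # same data, units 1000 times larger

cov(small)["x", "y"]   # covariance in the original units
cov(large)["x", "y"]   # one million times larger after rescaling

# The correlation is unchanged by the rescaling.
all.equal(cor(small), cor(large))

# cov2cor() converts a covariance matrix into the matching correlation matrix.
all.equal(cov2cor(cov(small)), cor(small))
```

With missing values and `use = "pairwise.complete.obs"`, `cov2cor()` applied to the covariance matrix may differ slightly from `cor()`, because the two use different sets of rows for the standard deviations.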

###Question 5 Grouping the schools:

# CONTROL codes: 1 = public, 2 = private non-profit, 3 = private for-profit
projectdata_public <- subset(projectdata, CONTROL == 1)
projectdata_private_np <- subset(projectdata, CONTROL == 2)
projectdata_private_fp <- subset(projectdata, CONTROL == 3)

Running the descriptives for groups of schools:

-Descriptives for public schools:

options(digits=6)
options(scipen = 999)
descriptives_public<-describe(projectdata_public[,5:30], na.rm = TRUE)
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning
## Inf
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning
## -Inf
print(descriptives_public)
##             vars    n     mean       sd   median  trimmed     mad      min
## CONTROL        1 2044     1.00     0.00     1.00     1.00    0.00     1.00
## CCBASIC*       2 2044    15.73    11.95    13.00    15.29   16.31     1.00
## ADM_RATE       3  636     0.69     0.18     0.71     0.70    0.19     0.16
## SAT_AVG        4  516  1038.56   111.64  1029.00  1032.59  102.30   774.00
## UGDS           5 1970  6032.04  7745.08  3211.50  4462.04 4077.89     5.00
## UGDS_WHITE     6 1970     0.59     0.26     0.64     0.61    0.25     0.00
## UGDS_BLACK     7 1970     0.14     0.18     0.08     0.10    0.09     0.00
## UGDS_HISP      8 1970     0.14     0.18     0.06     0.10    0.07     0.00
## UGDS_ASIAN     9 1970     0.03     0.06     0.01     0.02    0.02     0.00
## UGDS_AIAN     10 1970     0.03     0.11     0.00     0.01    0.00     0.00
## UGDS_NHPI     11 1970     0.01     0.05     0.00     0.00    0.00     0.00
## UGDS_2MOR     12 1970     0.03     0.03     0.02     0.02    0.02     0.00
## UGDS_NRA      13 1970     0.01     0.02     0.00     0.01    0.01     0.00
## UGDS_UNKN     14 1970     0.03     0.04     0.01     0.02    0.02     0.00
## PPTUG_EF      15 1970     0.35     0.24     0.35     0.35    0.31     0.00
## NPT4_PUB      16 1911  9624.66  4669.67  8751.00  9341.96 4293.61 -2434.00
## NPT4_PRIV     17    0      NaN       NA       NA      NaN      NA      Inf
## COSTT4_A      18 1619 14922.46  4956.71 13646.00 14485.10 4318.81  4610.00
## TUITFTE       19 1984  4505.03  5395.95  3119.50  3726.21 2775.43     9.00
## INEXPFTE      20 1984  8340.32 11578.61  6332.50  6743.12 2484.10     0.00
## PFTFAC        21 1627     0.60     0.28     0.57     0.60    0.37     0.00
## PCTPELL       22 1968     0.42     0.17     0.40     0.41    0.16     0.00
## C150_4        23  668     0.46     0.18     0.44     0.45    0.18     0.04
## PFTFTUG1_EF   24 1597     0.45     0.20     0.43     0.44    0.23     0.02
## RET_FT4       25  624     0.74     0.12     0.75     0.75    0.11     0.00
## PCTFLOAN      26 1968     0.32     0.26     0.31     0.30    0.34     0.00
##                   max     range  skew kurtosis     se
## CONTROL          1.00      0.00   NaN      NaN   0.00
## CCBASIC*        35.00     34.00  0.28    -1.40   0.26
## ADM_RATE         1.00      0.84 -0.44    -0.36   0.01
## SAT_AVG       1400.00    626.00  0.53     0.27   4.91
## UGDS         77657.00  77652.00  2.60    10.47 174.50
## UGDS_WHITE       1.00      1.00 -0.66    -0.44   0.01
## UGDS_BLACK       0.96      0.96  2.36     6.11   0.00
## UGDS_HISP        1.00      1.00  2.41     6.36   0.00
## UGDS_ASIAN       0.44      0.44  3.56    15.03   0.00
## UGDS_AIAN        1.00      1.00  7.12    53.41   0.00
## UGDS_NHPI        1.00      1.00 17.81   335.12   0.00
## UGDS_2MOR        0.43      0.43  5.46    51.55   0.00
## UGDS_NRA         0.29      0.29  3.53    19.80   0.00
## UGDS_UNKN        0.44      0.44  3.67    21.92   0.00
## PPTUG_EF         1.00      1.00  0.17    -1.08   0.01
## NPT4_PUB     28201.00  30635.00  0.61     0.10 106.82
## NPT4_PRIV        -Inf      -Inf    NA       NA     NA
## COSTT4_A     33826.00  29216.00  0.86     0.43 123.19
## TUITFTE     109761.00 109752.00  8.62   131.16 121.14
## INEXPFTE    357489.00 357489.00 16.72   435.66 259.95
## PFTFAC           1.00      1.00  0.14    -1.35   0.01
## PCTPELL          1.00      1.00  0.57     0.40   0.00
## C150_4           0.94      0.90  0.24    -0.35   0.01
## PFTFTUG1_EF      1.00      0.98  0.24    -0.77   0.01
## RET_FT4          1.00      1.00 -0.79     2.74   0.00
## PCTFLOAN         1.00      1.00  0.33    -0.91   0.01
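The min/max warnings above arise because NPT4_PRIV has no non-missing values for public schools (n = 0 in the table). One way to avoid them is to drop columns that are entirely missing before computing descriptives. This is a minimal sketch on a toy data frame; `toy_public` and its values are made up for illustration.

```r
# Toy stand-in for projectdata_public: NPT4_PRIV is entirely missing.
toy_public <- data.frame(
  NPT4_PUB  = c(9000, 12000, 8500),
  NPT4_PRIV = c(NA_real_, NA_real_, NA_real_),
  PCTPELL   = c(0.40, NA, 0.55)
)

# Keep only the columns that have at least one non-missing value.
has_data  <- colSums(!is.na(toy_public)) > 0
toy_clean <- toy_public[, has_data, drop = FALSE]

names(toy_clean)
```

The same filter could be applied to each school group before calling the descriptives function, since which column is empty differs by group.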

-Descriptives for private non-profit schools:

descriptives_private_np<-describe(projectdata_private_np[,5:30], na.rm = TRUE)
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning
## Inf
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning
## -Inf
print(descriptives_private_np)
##             vars    n     mean       sd   median  trimmed      mad  min
## CONTROL        1 1956     2.00     0.00     2.00     2.00     0.00    2
## CCBASIC*       2 1956    15.04     7.52    15.00    14.89     5.93    1
## ADM_RATE       3 1177     0.65     0.21     0.67     0.66     0.20    0
## SAT_AVG        4  782  1073.09   144.36  1049.00  1060.34   109.71  720
## UGDS           5 1636  1694.12  3054.70   930.50  1133.83  1146.79    0
## UGDS_WHITE     6 1636     0.57     0.28     0.64     0.60     0.25    0
## UGDS_BLACK     7 1636     0.14     0.20     0.06     0.09     0.07    0
## UGDS_HISP      8 1636     0.12     0.21     0.06     0.07     0.06    0
## UGDS_ASIAN     9 1636     0.04     0.07     0.02     0.02     0.02    0
## UGDS_AIAN     10 1636     0.01     0.07     0.00     0.00     0.00    0
## UGDS_NHPI     11 1636     0.00     0.03     0.00     0.00     0.00    0
## UGDS_2MOR     12 1636     0.02     0.03     0.02     0.02     0.03    0
## UGDS_NRA      13 1636     0.04     0.08     0.01     0.02     0.02    0
## UGDS_UNKN     14 1636     0.05     0.07     0.02     0.03     0.03    0
## PPTUG_EF      15 1623     0.15     0.22     0.06     0.10     0.09    0
## NPT4_PUB      16    0      NaN       NA       NA      NaN       NA  Inf
## NPT4_PRIV     17 1477 20266.36  7663.64 20045.00 20210.20  6932.64 1881
## COSTT4_A      18 1396 35607.98 13581.30 35299.00 35346.38 14768.18 6428
## TUITFTE       19 1872 15178.31 30872.95 13017.50 13641.76  6831.08    0
## INEXPFTE      20 1872 11332.52 20326.59  8285.00  9015.70  4754.70    0
## PFTFAC        21 1521     0.66     0.29     0.67     0.68     0.41    0
## PCTPELL       22 1623     0.42     0.21     0.40     0.41     0.20    0
## C150_4        23 1274     0.54     0.21     0.54     0.54     0.21    0
## PFTFTUG1_EF   24 1321     0.63     0.25     0.68     0.65     0.26    0
## RET_FT4       25 1272     0.74     0.17     0.76     0.75     0.14    0
## PCTFLOAN      26 1623     0.56     0.26     0.63     0.59     0.20    0
##                    max      range  skew kurtosis     se
## CONTROL           2.00       0.00   NaN      NaN   0.00
## CCBASIC*         35.00      34.00  0.33     0.86   0.17
## ADM_RATE          1.00       1.00 -0.48    -0.02   0.01
## SAT_AVG        1545.00     825.00  0.81     0.88   5.16
## UGDS          49340.00   49340.00  7.37    83.70  75.52
## UGDS_WHITE        1.00       1.00 -0.67    -0.47   0.01
## UGDS_BLACK        1.00       1.00  2.63     6.78   0.01
## UGDS_HISP         1.00       1.00  3.22    10.19   0.01
## UGDS_ASIAN        0.95       0.95  5.22    42.56   0.00
## UGDS_AIAN         0.95       0.95 11.74   142.12   0.00
## UGDS_NHPI         0.95       0.95 29.87  1036.52   0.00
## UGDS_2MOR         0.53       0.53  5.22    73.88   0.00
## UGDS_NRA          0.93       0.93  5.20    40.78   0.00
## UGDS_UNKN         0.71       0.71  3.60    20.53   0.00
## PPTUG_EF          1.00       1.00  2.06     4.06   0.01
## NPT4_PUB          -Inf       -Inf    NA       NA     NA
## NPT4_PRIV     46509.00   44628.00  0.15     0.15 199.41
## COSTT4_A      64988.00   58560.00  0.12    -0.78 363.50
## TUITFTE     1292154.00 1292154.00 37.87  1560.77 713.55
## INEXPFTE     735077.00  735077.00 25.19   861.92 469.80
## PFTFAC            1.00       1.00 -0.34    -1.11   0.01
## PCTPELL           1.00       1.00  0.52    -0.23   0.01
## C150_4            1.00       1.00 -0.17    -0.26   0.01
## PFTFTUG1_EF       1.00       1.00 -0.62    -0.45   0.01
## RET_FT4           1.00       1.00 -1.30     3.21   0.00
## PCTFLOAN          1.00       1.00 -0.90    -0.07   0.01

-Descriptives for private for-profit schools:

descriptives_private_fp<-describe(projectdata_private_fp[,5:30], na.rm = TRUE)
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning
## Inf
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning
## -Inf
print(descriptives_private_fp)
##             vars    n     mean      sd   median  trimmed     mad    min
## CONTROL        1 3703     3.00    0.00     3.00     3.00    0.00    3.0
## CCBASIC*       2 3703     8.74   11.86     1.00     6.46    0.00    1.0
## ADM_RATE       3  385     0.83    0.17     0.88     0.86    0.15    0.1
## SAT_AVG        4    6   995.67  129.57   971.50   995.67  103.04  855.0
## UGDS           5 3384   486.73 3180.20   161.00   219.51  166.05    0.0
## UGDS_WHITE     6 3384     0.43    0.29     0.42     0.43    0.36    0.0
## UGDS_BLACK     7 3384     0.24    0.25     0.16     0.20    0.19    0.0
## UGDS_HISP      8 3384     0.20    0.25     0.09     0.14    0.12    0.0
## UGDS_ASIAN     9 3384     0.03    0.09     0.01     0.01    0.02    0.0
## UGDS_AIAN     10 3384     0.01    0.02     0.00     0.00    0.00    0.0
## UGDS_NHPI     11 3384     0.00    0.02     0.00     0.00    0.00    0.0
## UGDS_2MOR     12 3384     0.02    0.03     0.01     0.02    0.02    0.0
## UGDS_NRA      13 3384     0.01    0.04     0.00     0.00    0.00    0.0
## UGDS_UNKN     14 3384     0.06    0.12     0.01     0.02    0.02    0.0
## PPTUG_EF      15 3376     0.19    0.24     0.08     0.14    0.11    0.0
## NPT4_PUB      16    0      NaN      NA       NA      NaN      NA    Inf
## NPT4_PRIV     17 3211 17293.57 6886.68 17346.00 17135.56 6763.62 -581.0
## COSTT4_A      18 1015 25902.30 6036.42 25962.00 25912.42 4213.55 8160.0
## TUITFTE       19 3414 11207.99 8380.16  9899.00 10430.98 4869.60    0.0
## INEXPFTE      20 3414  4612.49 4876.88  3701.00  4031.71 2258.00    0.0
## PFTFAC        21  897     0.35    0.28     0.26     0.31    0.22    0.0
## PCTPELL       22 3375     0.65    0.20     0.67     0.66    0.19    0.0
## C150_4        23  539     0.37    0.20     0.34     0.35    0.17    0.0
## PFTFTUG1_EF   24  746     0.52    0.31     0.53     0.52    0.40    0.0
## RET_FT4       25  397     0.54    0.28     0.52     0.55    0.26    0.0
## PCTFLOAN      26 3375     0.62    0.25     0.67     0.65    0.21    0.0
##                   max     range  skew kurtosis     se
## CONTROL          3.00      0.00   NaN      NaN   0.00
## CCBASIC*        35.00     34.00  1.25     0.01   0.19
## ADM_RATE         1.00      0.90 -1.48     2.41   0.01
## SAT_AVG       1211.00    356.00  0.48    -1.44  52.90
## UGDS        151558.00 151558.00 34.90  1540.42  54.67
## UGDS_WHITE       1.00      1.00  0.16    -1.14   0.00
## UGDS_BLACK       1.00      1.00  1.23     0.74   0.00
## UGDS_HISP        1.00      1.00  1.80     2.72   0.00
## UGDS_ASIAN       0.97      0.97  6.86    56.74   0.00
## UGDS_AIAN        0.40      0.40  7.45    82.40   0.00
## UGDS_NHPI        0.67      0.67 18.44   443.00   0.00
## UGDS_2MOR        0.44      0.44  3.21    20.21   0.00
## UGDS_NRA         0.87      0.87 13.05   209.30   0.00
## UGDS_UNKN        0.90      0.90  3.57    14.47   0.00
## PPTUG_EF         1.00      1.00  1.36     1.38   0.00
## NPT4_PUB         -Inf      -Inf    NA       NA     NA
## NPT4_PRIV    89406.00  89987.00  1.06     7.33 121.53
## COSTT4_A     79212.00  71052.00  0.71     7.16 189.47
## TUITFTE     248996.00 248996.00 10.05   229.30 143.42
## INEXPFTE    199372.00 199372.00 20.23   753.35  83.47
## PFTFAC           1.00      1.00  1.03     0.09   0.01
## PCTPELL          1.00      1.00 -0.69     0.39   0.00
## C150_4           1.00      1.00  0.75     0.75   0.01
## PFTFTUG1_EF      1.00      1.00 -0.12    -1.27   0.01
## RET_FT4          1.00      1.00 -0.11    -0.49   0.01
## PCTFLOAN         1.00      1.00 -1.02     0.52   0.00

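Compared with public schools, private schools show larger skewness in TUITFTE and INEXPFTE, most markedly the non-profit schools (TUITFTE skew is 8.62 for public, 37.87 for private non-profit and 10.05 for private for-profit schools; INEXPFTE skew is 16.72, 25.19 and 20.23). Average net tuition revenue per full-time equivalent student (TUITFTE) and average instructional expenditures per full-time equivalent student (INEXPFTE) are both higher in private non-profit schools (15178.31 and 11332.52) than in private for-profit schools (11207.99 and 4612.49).

Within each group, several variables are closer to a normal distribution than in the ungrouped data (for example, UGDS skew is 6.84 overall but 2.60 for public schools), although this does not hold everywhere (UGDS skew is 34.90 for for-profit schools). Average UGDS decreases from public (6032.04) to non-profit (1694.12) to for-profit schools (486.73). Average ADM_RATE is similar for public (0.69) and non-profit schools (0.65).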
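Group comparisons like the ones above can be computed directly rather than read off three separate tables. The sketch below uses a toy data frame and a simple moment-based `skewness` helper; both are assumptions for illustration, and the helper is not guaranteed to match the exact formula used by the psych package.

```r
# Hypothetical helper: moment-based skewness, ignoring missing values.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# Toy data with made-up values: CONTROL is the school type, UGDS the enrollment.
toy <- data.frame(
  CONTROL = c(1, 1, 1, 1, 2, 2, 2, 2),
  UGDS    = c(100, 200, 300, 5000, 100, 200, 300, 400)
)

# Skewness of UGDS within each CONTROL group.
skew_by_group <- tapply(toy$UGDS, toy$CONTROL, skewness)
print(skew_by_group)

# Group means in one call.
print(aggregate(UGDS ~ CONTROL, data = toy, FUN = mean))
```

In the toy data, group 1 contains one very large school and is right-skewed, while group 2 is symmetric and has zero skewness.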

###Question 6

-Finding scatterplot matrix for the variables C150_4, SAT_AVG, UGDS_WHITE, PCTFLOAN for public schools.

pairs(~ C150_4 + SAT_AVG + UGDS_WHITE + PCTFLOAN, data = projectdata_public, row1attop=FALSE)

The variables C150_4 and SAT_AVG display a positive linear relation, while SAT_AVG and PCTFLOAN display a negative linear relation. These results are similar to the ungrouped scatterplot. The variables SAT_AVG and UGDS_WHITE display a non-linear relation.

-Finding scatterplot matrix for the variables C150_4, SAT_AVG, UGDS_WHITE, PCTFLOAN for non-profit private schools.

pairs(~ C150_4 + SAT_AVG + UGDS_WHITE + PCTFLOAN, data = projectdata_private_np, row1attop=FALSE)

The variables C150_4 and SAT_AVG display a positive linear relation, while SAT_AVG and PCTFLOAN display a negative linear relation, which seems to be stronger than for public schools. These results are similar to the ungrouped scatterplot. The variables SAT_AVG and UGDS_WHITE display a non-linear relation, but it does not seem to be as strong as for public schools.

-Finding scatterplot matrix for the variables C150_4, SAT_AVG, UGDS_WHITE, PCTFLOAN for for-profit private schools.

pairs(~ C150_4 + SAT_AVG + UGDS_WHITE + PCTFLOAN, data = projectdata_private_fp, row1attop=FALSE)

No clear relation between these variables is visible for for-profit private schools. Only 6 for-profit schools report SAT_AVG, so the SAT_AVG panels contain too few points to judge a relation.
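A scatterplot with very few points says little about a relation, so it can help to count how many schools in each group have all plotted variables available. This is a minimal sketch on a toy data frame with made-up values, not the project data.

```r
# Toy data with made-up values: school type plus two of the plotted variables.
toy <- data.frame(
  CONTROL = c(1, 1, 2, 2, 3, 3),
  SAT_AVG = c(1000, 1100, 1050, NA, NA, NA),
  C150_4  = c(0.5, 0.6, 0.4, 0.7, 0.3, NA)
)

# Rows where both plotted variables are present.
complete <- complete.cases(toy[, c("SAT_AVG", "C150_4")])

# Count usable rows per group; factor() keeps groups with zero usable rows.
usable <- table(factor(toy$CONTROL, levels = 1:3)[complete])
print(usable)
```

Groups with only a handful of usable rows are better described as lacking enough data than as showing no relation.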

###Question 7

-Covariance matrix for public schools:

projectdata_p_matrix_quant<-data.matrix(projectdata_public[7:30], rownames.force = NA)
covar_matrix_p<-cov(projectdata_p_matrix_quant, y = projectdata_p_matrix_quant, use = "pairwise.complete.obs", method = "pearson")
print(covar_matrix_p)
##                    ADM_RATE        SAT_AVG          UGDS     UGDS_WHITE
## ADM_RATE       0.0336011727     -3.8686471     -212.5850    0.012402598
## SAT_AVG       -3.8686471337  12463.5085234   545068.3450    7.768439572
## UGDS        -212.5849738248 545068.3450102 59986256.4915 -402.643176991
## UGDS_WHITE     0.0124025981      7.7684396     -402.6432    0.065545908
## UGDS_BLACK    -0.0076538385    -11.2890796      -70.2881   -0.020816997
## UGDS_HISP     -0.0017084621     -0.0575393      305.1411   -0.027691548
## UGDS_ASIAN    -0.0033843472      2.9721245      164.0611   -0.005566508
## UGDS_AIAN      0.0007446680     -0.5342317     -107.4524   -0.006626126
## UGDS_NHPI      0.0006792911     -0.0324790      -12.1503   -0.001916665
## UGDS_2MOR      0.0002595622      0.3434692       30.8256   -0.000936299
## UGDS_NRA      -0.0004671890      1.2601887       76.8146   -0.000905470
## UGDS_UNKN     -0.0008725182     -0.4301800       15.6980   -0.001085878
## PPTUG_EF       0.0000954369     -4.2294166       89.4676   -0.010066535
## NPT4_PUB      22.2879087113 154640.1284935  5346029.3930  260.162736663
## NPT4_PRIV                NA             NA            NA             NA
## COSTT4_A    -103.2138961521 226194.8570397 11142000.7720  169.917311164
## TUITFTE      -10.1815344619 227986.7303426  6336576.3369  187.983704810
## INEXPFTE    -151.0771627139 241038.7476521 -1806406.3802   58.037873702
## PFTFAC         0.0016301072      2.4444223      -43.6577    0.005844623
## PCTPELL       -0.0005362106     -9.4702374     -284.4552   -0.006598399
## C150_4        -0.0052418089     14.2573628      757.4312    0.011827856
## PFTFTUG1_EF    0.0011840643      5.2694309      181.4096    0.004703051
## RET_FT4       -0.0043001302      7.8400274      491.8125    0.000410025
## PCTFLOAN       0.0015668700     -6.5138583       91.0523    0.015825849
##                UGDS_BLACK       UGDS_HISP     UGDS_ASIAN       UGDS_AIAN
## ADM_RATE     -0.007653838   -0.0017084621  -0.0033843472    0.0007446680
## SAT_AVG     -11.289079618   -0.0575393001   2.9721244502   -0.5342317250
## UGDS        -70.288139721  305.1411086094 164.0611333528 -107.4523552103
## UGDS_WHITE   -0.020816997   -0.0276915476  -0.0055665082   -0.0066261256
## UGDS_BLACK    0.030931962   -0.0050938181  -0.0010045128   -0.0024622749
## UGDS_HISP    -0.005093818    0.0326939649   0.0022412020   -0.0018724187
## UGDS_ASIAN   -0.001004513    0.0022412020   0.0033785450   -0.0006222649
## UGDS_AIAN    -0.002462275   -0.0018724187  -0.0006222649    0.0122932759
## UGDS_NHPI    -0.000483496   -0.0002986142   0.0003475660   -0.0000765116
## UGDS_2MOR    -0.000572416   -0.0000759129   0.0006569181   -0.0001897741
## UGDS_NRA     -0.000302662    0.0001092018   0.0004981175   -0.0002098941
## UGDS_UNKN    -0.000196143   -0.0000123227   0.0000708987   -0.0002340234
## PPTUG_EF     -0.000344859    0.0082208271   0.0009672637    0.0006569355
## NPT4_PUB    -24.657390012 -204.3260247216   7.9565975666  -50.3274507922
## NPT4_PRIV              NA              NA             NA              NA
## COSTT4_A    -36.206960690 -169.4033123712  52.4599092872  -53.2376832673
## TUITFTE     -60.699057245 -135.4007046202  26.4656643767  -43.3675931753
## INEXPFTE    -43.923711823  -80.8312285453  50.2257485056   17.2605623571
## PFTFAC       -0.000770282   -0.0066239972  -0.0012416101    0.0013896573
## PCTPELL       0.010390380   -0.0013850462  -0.0018940078    0.0011892883
## C150_4       -0.010463509   -0.0020199748   0.0037532020   -0.0035351973
## PFTFTUG1_EF  -0.001083308   -0.0032139455  -0.0011638976    0.0006136038
## RET_FT4      -0.005770553    0.0025524841   0.0030285946   -0.0011300382
## PCTFLOAN      0.005410460   -0.0145011063  -0.0014205867   -0.0053096948
##                  UGDS_NHPI      UGDS_2MOR      UGDS_NRA     UGDS_UNKN
## ADM_RATE      0.0006792911  0.00025956225 -0.0004671890 -0.0008725182
## SAT_AVG      -0.0324789636  0.34346922669  1.2601886848 -0.4301800045
## UGDS        -12.1503253439 30.82564738595 76.8146345980 15.6979901242
## UGDS_WHITE   -0.0019166651 -0.00093629865 -0.0009054697 -0.0010858781
## UGDS_BLACK   -0.0004834959 -0.00057241603 -0.0003026625 -0.0001961427
## UGDS_HISP    -0.0002986142 -0.00007591289  0.0001092018 -0.0000123227
## UGDS_ASIAN    0.0003475660  0.00065691814  0.0004981175  0.0000708987
## UGDS_AIAN    -0.0000765116 -0.00018977411 -0.0002098941 -0.0002340234
## UGDS_NHPI     0.0023745684  0.00008111316  0.0000247540 -0.0000526212
## UGDS_2MOR     0.0000811132  0.00090088652  0.0001255937  0.0000099156
## UGDS_NRA      0.0000247540  0.00012559370  0.0006214688  0.0000389606
## UGDS_UNKN    -0.0000526212  0.00000991560  0.0000389606  0.0014612872
## PPTUG_EF      0.0000975370  0.00025801446 -0.0011201983  0.0013311214
## NPT4_PUB    -17.9221260879  5.15707904691 21.8246296769  2.1374055035
## NPT4_PRIV               NA             NA            NA            NA
## COSTT4_A    -23.5268873687 13.39656493215 43.9638837981  2.6420805828
## TUITFTE      -6.3587580810  2.77551642169 27.9439680321  0.6592410772
## INEXPFTE     -5.0313468968 -3.97746566296 16.2428071613 -8.0177549656
## PFTFAC        0.0002952458  0.00006410420  0.0012561471 -0.0002140569
## PCTPELL       0.0009818011 -0.00093171297 -0.0007846350 -0.0009678444
## C150_4       -0.0007818610  0.00023910492  0.0016349433 -0.0006543867
## PFTFTUG1_EF   0.0007842150 -0.00051591510  0.0009330706 -0.0010569228
## RET_FT4       0.0002091550  0.00006617733  0.0009719275 -0.0003378427
## PCTFLOAN     -0.0009334880  0.00000454354  0.0007577333  0.0001666833
##                    PPTUG_EF       NPT4_PUB NPT4_PRIV       COSTT4_A
## ADM_RATE       0.0000954369       22.28791        NA     -103.21390
## SAT_AVG       -4.2294166268   154640.12849        NA   226194.85704
## UGDS          89.4675752305  5346029.39297        NA 11142000.77200
## UGDS_WHITE    -0.0100665353      260.16274        NA      169.91731
## UGDS_BLACK    -0.0003448588      -24.65739        NA      -36.20696
## UGDS_HISP      0.0082208271     -204.32602        NA     -169.40331
## UGDS_ASIAN     0.0009672637        7.95660        NA       52.45991
## UGDS_AIAN      0.0006569355      -50.32745        NA      -53.23768
## UGDS_NHPI      0.0000975370      -17.92213        NA      -23.52689
## UGDS_2MOR      0.0002580145        5.15708        NA       13.39656
## UGDS_NRA      -0.0011201983       21.82463        NA       43.96388
## UGDS_UNKN      0.0013311214        2.13741        NA        2.64208
## PPTUG_EF       0.0559198372     -494.74628        NA     -662.95187
## NPT4_PUB    -494.7462837076 21805832.12310        NA 20420278.27211
## NPT4_PRIV                NA             NA        NA             NA
## COSTT4_A    -662.9518662241 20420278.27211        NA 24568968.58818
## TUITFTE     -399.4380944136 11215184.60953        NA 13486786.88130
## INEXPFTE    -464.0903082576  6239926.76979        NA  8950068.16662
## PFTFAC        -0.0255763136      292.43019        NA      368.13577
## PCTPELL       -0.0075163227      -82.78810        NA     -153.10940
## C150_4        -0.0177376815      450.89530        NA      595.65463
## PFTFTUG1_EF   -0.0328960439      359.34794        NA      444.70495
## RET_FT4       -0.0078700556      161.13032        NA      255.23165
## PCTFLOAN      -0.0271153765      729.64347        NA      717.23173
##                     TUITFTE        INEXPFTE         PFTFAC         PCTPELL
## ADM_RATE         -10.181534      -151.07716   0.0016301072   -0.0005362106
## SAT_AVG       227986.730343    241038.74765   2.4444222799   -9.4702374145
## UGDS         6336576.336850  -1806406.38016 -43.6577038172 -284.4552191197
## UGDS_WHITE       187.983705        58.03787   0.0058446230   -0.0065983987
## UGDS_BLACK       -60.699057       -43.92371  -0.0007702821    0.0103903800
## UGDS_HISP       -135.400705       -80.83123  -0.0066239972   -0.0013850462
## UGDS_ASIAN        26.465664        50.22575  -0.0012416101   -0.0018940078
## UGDS_AIAN        -43.367593        17.26056   0.0013896573    0.0011892883
## UGDS_NHPI         -6.358758        -5.03135   0.0002952458    0.0009818011
## UGDS_2MOR          2.775516        -3.97747   0.0000641042   -0.0009317130
## UGDS_NRA          27.943968        16.24281   0.0012561471   -0.0007846350
## UGDS_UNKN          0.659241        -8.01775  -0.0002140569   -0.0009678444
## PPTUG_EF        -399.438094      -464.09031  -0.0255763136   -0.0075163227
## NPT4_PUB    11215184.609534   6239926.76979 292.4301896586  -82.7881019627
## NPT4_PRIV                NA              NA             NA              NA
## COSTT4_A    13486786.881299   8950068.16662 368.1357704102 -153.1094027403
## TUITFTE     29116322.530603  25045435.39216 297.7349183726  -98.9527060368
## INEXPFTE    25045435.392156 134064195.62761 370.6366835022 -169.9340449429
## PFTFAC           297.734918       370.63668   0.0779592446   -0.0000969015
## PCTPELL          -98.952706      -169.93404  -0.0000969015    0.0288607627
## C150_4           422.699344       375.04814   0.0058391470   -0.0138992187
## PFTFTUG1_EF      323.381000       211.82217   0.0164332154    0.0028016766
## RET_FT4          197.187210       208.24043   0.0022860569   -0.0066317863
## PCTFLOAN         481.686849       385.05094   0.0196303662    0.0105145812
##                    C150_4   PFTFTUG1_EF        RET_FT4        PCTFLOAN
## ADM_RATE     -0.005241809   0.001184064  -0.0043001302   0.00156687003
## SAT_AVG      14.257362770   5.269430885   7.8400274284  -6.51385826338
## UGDS        757.431230038 181.409646380 491.8124856690  91.05230287144
## UGDS_WHITE    0.011827856   0.004703051   0.0004100246   0.01582584942
## UGDS_BLACK   -0.010463509  -0.001083308  -0.0057705531   0.00541045978
## UGDS_HISP    -0.002019975  -0.003213946   0.0025524841  -0.01450110632
## UGDS_ASIAN    0.003753202  -0.001163898   0.0030285946  -0.00142058665
## UGDS_AIAN    -0.003535197   0.000613604  -0.0011300382  -0.00530969480
## UGDS_NHPI    -0.000781861   0.000784215   0.0002091550  -0.00093348797
## UGDS_2MOR     0.000239105  -0.000515915   0.0000661773   0.00000454354
## UGDS_NRA      0.001634943   0.000933071   0.0009719275   0.00075773329
## UGDS_UNKN    -0.000654387  -0.001056923  -0.0003378427   0.00016668327
## PPTUG_EF     -0.017737682  -0.032896044  -0.0078700556  -0.02711537653
## NPT4_PUB    450.895296538 359.347941380 161.1303216488 729.64347456410
## NPT4_PRIV              NA            NA             NA              NA
## COSTT4_A    595.654629266 444.704951963 255.2316529109 717.23172941724
## TUITFTE     422.699344143 323.381000004 197.1872102852 481.68684943052
## INEXPFTE    375.048144975 211.822174023 208.2404316644 385.05094120121
## PFTFAC        0.005839147   0.016433215   0.0022860569   0.01963036622
## PCTPELL      -0.013899219   0.002801677  -0.0066317863   0.01051458120
## C150_4        0.032312755   0.015134162   0.0164080616   0.00415976947
## PFTFTUG1_EF   0.015134162   0.040962807   0.0077027546   0.01932123718
## RET_FT4       0.016408062   0.007702755   0.0136728973  -0.00279392609
## PCTFLOAN      0.004159769   0.019321237  -0.0027939261   0.06593407537

-Correlation matrix for public schools:

correlation_matrix_p<-cor(projectdata_p_matrix_quant, y = projectdata_p_matrix_quant, use = "pairwise.complete.obs", method = "pearson")
print(correlation_matrix_p)
##                ADM_RATE     SAT_AVG       UGDS UGDS_WHITE  UGDS_BLACK
## ADM_RATE     1.00000000 -0.19654913 -0.1289392  0.2609101 -0.20674287
## SAT_AVG     -0.19654913  1.00000000  0.5370659  0.2907833 -0.49420723
## UGDS        -0.12893924  0.53706595  1.0000000 -0.2030587 -0.05160032
## UGDS_WHITE   0.26091012  0.29078330 -0.2030587  1.0000000 -0.46231881
## UGDS_BLACK  -0.20674287 -0.49420723 -0.0516003 -0.4623188  1.00000000
## UGDS_HISP   -0.04954429 -0.00358081  0.2178919 -0.5981921 -0.16017912
## UGDS_ASIAN  -0.28345346  0.39289918  0.3644307 -0.3740634 -0.09826229
## UGDS_AIAN    0.07237888 -0.09819105 -0.1251285 -0.2334281 -0.12626966
## UGDS_NHPI    0.06639876 -0.04654875 -0.0321936 -0.1536318 -0.05641527
## UGDS_2MOR    0.05115751  0.12441009  0.1326024 -0.1218447 -0.10843588
## UGDS_NRA    -0.08244967  0.38371121  0.3978400 -0.1418703 -0.06903116
## UGDS_UNKN   -0.12634535 -0.11152722  0.0530213 -0.1109534 -0.02917431
## PPTUG_EF     0.00367178 -0.32617487  0.0488491 -0.1662740 -0.00829191
## NPT4_PUB     0.02782553  0.34169792  0.1468039  0.2173960 -0.02981708
## NPT4_PRIV            NA          NA         NA         NA          NA
## COSTT4_A    -0.12742621  0.48318171  0.2802674  0.1358600 -0.04083219
## TUITFTE     -0.01200375  0.58939726  0.1944799  0.1745755 -0.08203029
## INEXPFTE    -0.20485437  0.54944882 -0.0217912  0.0211846 -0.02333119
## PFTFAC       0.04953418  0.12249666 -0.0194689  0.0833372 -0.01564803
## PCTPELL     -0.01777064 -0.65483119 -0.2161195 -0.1516916  0.34760307
## C150_4      -0.18390170  0.78405891  0.4630921  0.2560546 -0.29273837
## PFTFTUG1_EF  0.03725845  0.27929752  0.1116179  0.0924172 -0.03007132
## RET_FT4     -0.25587034  0.73079261  0.4765554  0.0134587 -0.24170436
## PCTFLOAN     0.04480260 -0.39718439  0.0457688  0.2407071  0.11975269
##               UGDS_HISP UGDS_ASIAN  UGDS_AIAN   UGDS_NHPI    UGDS_2MOR
## ADM_RATE    -0.04954429 -0.2834535  0.0723789  0.06639876  0.051157507
## SAT_AVG     -0.00358081  0.3928992 -0.0981911 -0.04654875  0.124410094
## UGDS         0.21789188  0.3644307 -0.1251285 -0.03219361  0.132602369
## UGDS_WHITE  -0.59819206 -0.3740634 -0.2334281 -0.15363179 -0.121844678
## UGDS_BLACK  -0.16017912 -0.0982623 -0.1262697 -0.05641527 -0.108435880
## UGDS_HISP    1.00000000  0.2132467 -0.0933975 -0.03389099 -0.013987712
## UGDS_ASIAN   0.21324672  1.0000000 -0.0965554  0.12271002  0.376540208
## UGDS_AIAN   -0.09339752 -0.0965554  1.0000000 -0.01416123 -0.057025369
## UGDS_NHPI   -0.03389099  0.1227100 -0.0141612  1.00000000  0.055457965
## UGDS_2MOR   -0.01398771  0.3765402 -0.0570254  0.05545796  1.000000000
## UGDS_NRA     0.02422625  0.3437615 -0.0759376  0.02037714  0.167850694
## UGDS_UNKN   -0.00178281  0.0319085 -0.0552151 -0.02824886  0.008642035
## PPTUG_EF     0.19226435  0.0703715  0.0250557  0.00846436  0.036351811
## NPT4_PUB    -0.24145384  0.0294243 -0.0978197 -0.07758385  0.036636685
## NPT4_PRIV            NA         NA         NA          NA           NA
## COSTT4_A    -0.18504184  0.1724786 -0.0939407 -0.08858280  0.089495452
## TUITFTE     -0.17799642  0.1082204 -0.0929608 -0.03101322  0.021985286
## INEXPFTE    -0.04176526  0.0807232  0.0145424 -0.00964505 -0.012383414
## PFTFAC      -0.12819576 -0.0712472  0.0434149  0.01968683  0.007593058
## PCTPELL     -0.04507280 -0.1917202  0.0631076  0.11853811 -0.182696543
## C150_4      -0.06191522  0.3105361 -0.1877793 -0.09553903  0.047267729
## PFTFTUG1_EF -0.08622650 -0.0934539  0.0257779  0.07182541 -0.084243762
## RET_FT4      0.11813979  0.3793869 -0.1030183  0.03812590  0.021151584
## PCTFLOAN    -0.31221251 -0.0951377 -0.1864073 -0.07456623  0.000589443
##               UGDS_NRA   UGDS_UNKN    PPTUG_EF   NPT4_PUB NPT4_PRIV
## ADM_RATE    -0.0824497 -0.12634535  0.00367178  0.0278255        NA
## SAT_AVG      0.3837112 -0.11152722 -0.32617487  0.3416979        NA
## UGDS         0.3978400  0.05302130  0.04884913  0.1468039        NA
## UGDS_WHITE  -0.1418703 -0.11095337 -0.16627397  0.2173960        NA
## UGDS_BLACK  -0.0690312 -0.02917431 -0.00829191 -0.0298171        NA
## UGDS_HISP    0.0242263 -0.00178281  0.19226435 -0.2414538        NA
## UGDS_ASIAN   0.3437615  0.03190845  0.07037155  0.0294243        NA
## UGDS_AIAN   -0.0759376 -0.05521507  0.02505567 -0.0978197        NA
## UGDS_NHPI    0.0203771 -0.02824886  0.00846436 -0.0775839        NA
## UGDS_2MOR    0.1678507  0.00864203  0.03635181  0.0366367        NA
## UGDS_NRA     1.0000000  0.04088352 -0.19002131  0.1859134        NA
## UGDS_UNKN    0.0408835  1.00000000  0.14725394  0.0119817        NA
## PPTUG_EF    -0.1900213  0.14725394  1.00000000 -0.4557973        NA
## NPT4_PUB     0.1859134  0.01198165 -0.45579731  1.0000000        NA
## NPT4_PRIV           NA          NA          NA         NA        NA
## COSTT4_A     0.3355728  0.01378607 -0.60014713  0.9227081        NA
## TUITFTE      0.2664402  0.00409934 -0.40157978  0.5806977        NA
## INEXPFTE     0.0608721 -0.01959601 -0.18338783  0.2999074        NA
## PFTFAC       0.1696686 -0.02048827 -0.41317250  0.2348420        NA
## PCTPELL     -0.1851993 -0.14898227 -0.18706290 -0.1060222        NA
## C150_4       0.2871060 -0.11057601 -0.59422589  0.5679805        NA
## PFTFTUG1_EF  0.1738276 -0.13971730 -0.73767993  0.4000409        NA
## RET_FT4      0.2688806 -0.08874030 -0.48068026  0.3238603        NA
## PCTFLOAN     0.1183278  0.01697540 -0.44647466  0.6133979        NA
##               COSTT4_A     TUITFTE    INEXPFTE      PFTFAC     PCTPELL
## ADM_RATE    -0.1274262 -0.01200375 -0.20485437  0.04953418 -0.01777064
## SAT_AVG      0.4831817  0.58939726  0.54944882  0.12249666 -0.65483119
## UGDS         0.2802674  0.19447995 -0.02179121 -0.01946888 -0.21611950
## UGDS_WHITE   0.1358600  0.17457550  0.02118460  0.08333716 -0.15169159
## UGDS_BLACK  -0.0408322 -0.08203029 -0.02333119 -0.01564803  0.34760307
## UGDS_HISP   -0.1850418 -0.17799642 -0.04176526 -0.12819576 -0.04507280
## UGDS_ASIAN   0.1724786  0.10822037  0.08072318 -0.07124724 -0.19172018
## UGDS_AIAN   -0.0939407 -0.09296081  0.01454237  0.04341494  0.06310760
## UGDS_NHPI   -0.0885828 -0.03101322 -0.00964505  0.01968683  0.11853811
## UGDS_2MOR    0.0894955  0.02198529 -0.01238341  0.00759306 -0.18269654
## UGDS_NRA     0.3355728  0.26644025  0.06087213  0.16966860 -0.18519928
## UGDS_UNKN    0.0137861  0.00409934 -0.01959601 -0.02048827 -0.14898227
## PPTUG_EF    -0.6001471 -0.40157978 -0.18338783 -0.41317250 -0.18706290
## NPT4_PUB     0.9227081  0.58069767  0.29990739  0.23484204 -0.10602216
## NPT4_PRIV           NA          NA          NA          NA          NA
## COSTT4_A     1.0000000  0.78742176  0.49775812  0.26572184 -0.20918041
## TUITFTE      0.7874218  1.00000000  0.40087034  0.23098977 -0.13850326
## INEXPFTE     0.4977581  0.40087034  1.00000000  0.11164839 -0.09348854
## PFTFAC       0.2657218  0.23098977  0.11164839  1.00000000 -0.00240791
## PCTPELL     -0.2091804 -0.13850326 -0.09348854 -0.00240791  1.00000000
## C150_4       0.6872294  0.65096780  0.51615425  0.15561680 -0.54165168
## PFTFTUG1_EF  0.4443971  0.47193010  0.30866292  0.29140123  0.09639879
## RET_FT4      0.4783138  0.47833487  0.46812526  0.09968770 -0.39660397
## PCTFLOAN     0.6252842  0.44606284  0.14015064  0.30634055  0.24103667
##                 C150_4 PFTFTUG1_EF    RET_FT4     PCTFLOAN
## ADM_RATE    -0.1839017   0.0372585 -0.2558703  0.044802604
## SAT_AVG      0.7840589   0.2792975  0.7307926 -0.397184388
## UGDS         0.4630921   0.1116179  0.4765554  0.045768841
## UGDS_WHITE   0.2560546   0.0924172  0.0134587  0.240707119
## UGDS_BLACK  -0.2927384  -0.0300713 -0.2417044  0.119752693
## UGDS_HISP   -0.0619152  -0.0862265  0.1181398 -0.312212514
## UGDS_ASIAN   0.3105361  -0.0934539  0.3793869 -0.095137722
## UGDS_AIAN   -0.1877793   0.0257779 -0.1030183 -0.186407327
## UGDS_NHPI   -0.0955390   0.0718254  0.0381259 -0.074566228
## UGDS_2MOR    0.0472677  -0.0842438  0.0211516  0.000589443
## UGDS_NRA     0.2871060   0.1738276  0.2688806  0.118327842
## UGDS_UNKN   -0.1105760  -0.1397173 -0.0887403  0.016975399
## PPTUG_EF    -0.5942259  -0.7376799 -0.4806803 -0.446474665
## NPT4_PUB     0.5679805   0.4000409  0.3238603  0.613397901
## NPT4_PRIV           NA          NA         NA           NA
## COSTT4_A     0.6872294   0.4443971  0.4783138  0.625284169
## TUITFTE      0.6509678   0.4719301  0.4783349  0.446062843
## INEXPFTE     0.5161543   0.3086629  0.4681253  0.140150643
## PFTFAC       0.1556168   0.2914012  0.0996877  0.306340546
## PCTPELL     -0.5416517   0.0963988 -0.3966040  0.241036669
## C150_4       1.0000000   0.4631214  0.7911738  0.123278190
## PFTFTUG1_EF  0.4631214   1.0000000  0.3729936  0.419293438
## RET_FT4      0.7911738   0.3729936  1.0000000 -0.139265961
## PCTFLOAN     0.1232782   0.4192934 -0.1392660  1.000000000
library(corrplot)
corrplot(correlation_matrix_p, method="circle")
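
In the public-school matrices, the NPT4_PRIV row and column are NA throughout. One possible way to keep such a variable out of the matrix and the plot is to drop columns with no observed values before calling cor(). The sketch below is only an illustration: the toy data frame and the names toy, all_missing, toy_kept and cm_toy are hypothetical and do not come from Project2Data.csv.

```r
# Toy data (hypothetical values, NOT taken from Project2Data.csv).
# NPT4_PRIV is entirely missing, as it is for public schools.
toy <- data.frame(
  NPT4_PUB  = c(9000, 12000, 15000, 11000, 13000),
  NPT4_PRIV = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_),
  COSTT4_A  = c(18000, 22000, 27000, NA, 24000)
)

# Flag the columns that have no observed values at all.
all_missing <- colSums(!is.na(toy)) == 0

# Keep only the columns with at least one observed value.
toy_kept <- toy[, !all_missing, drop = FALSE]

# Same cor() settings as in the analysis above.
cm_toy <- cor(toy_kept, use = "pairwise.complete.obs", method = "pearson")

print(names(toy)[all_missing])  # columns that were dropped
print(cm_toy)
```

The same filter could be applied to projectdata_p_matrix_quant before computing correlation_matrix_p, assuming the all-NA column is not needed for the public-school plot.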

The UGDS_BLACK and UGDS_HISP variables are negatively related to UGDS_WHITE (r = -0.46 and r = -0.60). PCTPELL is negatively related to SAT_AVG (r = -0.65). PFTFTUG1_EF (share of undergraduate students who are first-time, full-time, degree-seeking undergraduates) is negatively related to PPTUG_EF (share of undergraduate degree/certificate-seeking students who are part-time) (r = -0.74), which is expected given the split between part-time and full-time students. SAT_AVG is positively related to RET_FT4 (first-time, full-time student retention rate at four-year institutions) (r = 0.73), which may reflect a more informed choice of institution by students with higher SAT scores. The financial variables (tuition, loans, etc.) are generally positively related, but with less strength. The relation of the ethnic distributions to the financial variables also seems to be weaker. However, PCTPELL (percentage of undergraduates who receive a Pell Grant) is related in opposite directions to UGDS_WHITE (negative, r = -0.15) and UGDS_BLACK (positive, r = 0.35).
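
Relationships like the ones described above can also be read off the matrix programmatically rather than by eye. The sketch below is a minimal illustration: the 4 x 4 sub-matrix is copied from the printed public-school correlation matrix, while the helper name strongest_pairs and the 0.5 cutoff are assumptions made for this example, not part of the original analysis.

```r
# Sub-matrix copied from the printed public-school correlation matrix.
vars <- c("SAT_AVG", "PCTPELL", "C150_4", "RET_FT4")
cm <- matrix(c(
   1.0000000, -0.6548312,  0.7840589,  0.7307926,
  -0.6548312,  1.0000000, -0.5416517, -0.3966040,
   0.7840589, -0.5416517,  1.0000000,  0.7911738,
   0.7307926, -0.3966040,  0.7911738,  1.0000000
), nrow = 4, byrow = TRUE, dimnames = list(vars, vars))

# Hypothetical helper: list variable pairs whose |r| is at least `threshold`.
# Only the upper triangle is scanned so each pair appears once; NA cells
# (such as the NPT4_PRIV row in the full matrix) are skipped.
strongest_pairs <- function(cm, threshold = 0.5) {
  idx <- which(upper.tri(cm) & !is.na(cm) & abs(cm) >= threshold,
               arr.ind = TRUE)
  out <- data.frame(
    var1 = rownames(cm)[idx[, "row"]],
    var2 = colnames(cm)[idx[, "col"]],
    r    = cm[idx],
    stringsAsFactors = FALSE
  )
  # Sort from strongest to weakest absolute correlation.
  out[order(-abs(out$r)), ]
}

pairs_p <- strongest_pairs(cm, threshold = 0.5)
print(pairs_p)
```

Passing correlation_matrix_p instead of cm would apply the same filter to the full public-school matrix; the cutoff is arbitrary and can be adjusted.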

-Covariance matrix for private non-profit schools:

projectdata_np_matrix_quant<-data.matrix(projectdata_private_np[7:30], rownames.force = NA)
covar_matrix_np<-cov(projectdata_np_matrix_quant, y = projectdata_np_matrix_quant, use = "pairwise.complete.obs", method = "pearson")
print(covar_matrix_np)
##                    ADM_RATE        SAT_AVG           UGDS    UGDS_WHITE
## ADM_RATE       0.0453689515     -11.908729     -110.76148   0.012746322
## SAT_AVG      -11.9087289581   20838.907239   155728.63317   5.106529069
## UGDS        -110.7614787900  155728.633174  9331218.64065 -20.594208659
## UGDS_WHITE     0.0127463220       5.106529      -20.59421   0.077553693
## UGDS_BLACK    -0.0030065986     -11.607361      -40.18257  -0.029650378
## UGDS_HISP     -0.0023196088      -0.126625       11.42529  -0.032652941
## UGDS_ASIAN    -0.0031350879       4.317976       21.03326  -0.004098247
## UGDS_AIAN     -0.0000871576      -0.193983       -9.41054  -0.002588628
## UGDS_NHPI     -0.0000507946      -0.100983       -1.79909  -0.000569806
## UGDS_2MOR     -0.0009704476       1.023550        9.21648  -0.000139274
## UGDS_NRA      -0.0029370425       2.752199       15.93747  -0.003803603
## UGDS_UNKN     -0.0002404558      -1.171670       19.54116  -0.002297023
## PPTUG_EF       0.0015815362      -6.234855       -3.96359  -0.006997340
## NPT4_PUB                 NA             NA             NA            NA
## NPT4_PRIV   -178.8793297321  320389.904046  5167513.08795 258.692059381
## COSTT4_A    -800.1455263117 1113884.973655 10438624.22181 588.729348241
## TUITFTE     -404.4904502340  517356.750215  3227969.28929 -97.161577467
## INEXPFTE    -853.1034785319 1032819.728866  3532012.28581  48.659956834
## PFTFAC        -0.0033877668       7.623504       23.94198   0.007097292
## PCTPELL        0.0053266341     -16.304788     -122.88680  -0.023807666
## C150_4        -0.0132518603      20.460465      131.29893   0.009853973
## PFTFTUG1_EF   -0.0077760267      12.719346       73.50802   0.004431194
## RET_FT4       -0.0056044842      13.333832       95.72951   0.007280116
## PCTFLOAN       0.0029752079     -16.271363      -16.27088   0.001903675
##                 UGDS_BLACK       UGDS_HISP      UGDS_ASIAN
## ADM_RATE      -0.003006599   -0.0023196088  -0.00313508788
## SAT_AVG      -11.607360984   -0.1266249693   4.31797630260
## UGDS         -40.182569660   11.4252946328  21.03326428748
## UGDS_WHITE    -0.029650378   -0.0326529408  -0.00409824677
## UGDS_BLACK     0.041404575   -0.0055318546  -0.00178374345
## UGDS_HISP     -0.005531855    0.0429432174  -0.00016361327
## UGDS_ASIAN    -0.001783743   -0.0001636133   0.00463245988
## UGDS_AIAN     -0.000708290   -0.0006828854  -0.00023905014
## UGDS_NHPI     -0.000192794   -0.0000422158   0.00011943681
## UGDS_2MOR     -0.000595397   -0.0004574627   0.00035459357
## UGDS_NRA      -0.001947165   -0.0012528201   0.00130168874
## UGDS_UNKN     -0.000570368   -0.0017786575  -0.00000959184
## PPTUG_EF       0.006105967    0.0020299213  -0.00114918383
## NPT4_PUB                NA              NA              NA
## NPT4_PRIV   -197.306048083 -405.8424164275 112.52150417329
## COSTT4_A    -633.316257688 -604.4282642539 266.37872495491
## TUITFTE       37.640371438 -203.0514036913  96.46181580454
## INEXPFTE    -143.810851630 -202.8239868737 173.51245863820
## PFTFAC        -0.001420142   -0.0032362485  -0.00049797771
## PCTPELL        0.017144172    0.0145329908  -0.00260347249
## C150_4        -0.012219382   -0.0046496476   0.00414471275
## PFTFTUG1_EF   -0.005915548   -0.0006838199   0.00138854396
## RET_FT4       -0.009843601   -0.0013492707   0.00279455864
## PCTFLOAN       0.013921967   -0.0101884062  -0.00250230920
##                    UGDS_AIAN       UGDS_NHPI      UGDS_2MOR
## ADM_RATE      -0.00008715762  -0.00005079462  -0.0009704476
## SAT_AVG       -0.19398334370  -0.10098307600   1.0235497292
## UGDS          -9.41053645858  -1.79909269061   9.2164786896
## UGDS_WHITE    -0.00258862776  -0.00056980559  -0.0001392742
## UGDS_BLACK    -0.00070829013  -0.00019279361  -0.0005953970
## UGDS_HISP     -0.00068288545  -0.00004221583  -0.0004574627
## UGDS_ASIAN    -0.00023905014   0.00011943681   0.0003545936
## UGDS_AIAN      0.00469859540   0.00000242073  -0.0000982416
## UGDS_NHPI      0.00000242073   0.00069664266   0.0000556151
## UGDS_2MOR     -0.00009824162   0.00005561508   0.0007657043
## UGDS_NRA      -0.00023532369  -0.00003241483   0.0001630202
## UGDS_UNKN     -0.00011546092  -0.00002626496   0.0000217808
## PPTUG_EF       0.00032368053   0.00008230183  -0.0004914570
## NPT4_PUB                  NA              NA             NA
## NPT4_PRIV    -65.63268523242  -3.86793811331  64.4358427011
## COSTT4_A    -137.20053279940 -22.07545476995 134.2530488192
## TUITFTE      -66.79438955305 -10.23563023858  81.6440330612
## INEXPFTE     -15.72634727466  -7.20926909172  62.1188635466
## PFTFAC         0.00015439198  -0.00004781543  -0.0000838565
## PCTPELL        0.00185053150   0.00036780322  -0.0012181417
## C150_4        -0.00114052948   0.00029056971   0.0014005282
## PFTFTUG1_EF   -0.00056061368  -0.00030392701   0.0008282901
## RET_FT4       -0.00114303563  -0.00027706113   0.0006279306
## PCTFLOAN      -0.00165586128  -0.00033147641   0.0008719972
##                    UGDS_NRA       UGDS_UNKN        PPTUG_EF NPT4_PUB
## ADM_RATE     -0.00293704245  -0.00024045579    0.0015815362       NA
## SAT_AVG       2.75219870289  -1.17167007263   -6.2348550578       NA
## UGDS         15.93747031261  19.54115695326   -3.9635885352       NA
## UGDS_WHITE   -0.00380360334  -0.00229702258   -0.0069973397       NA
## UGDS_BLACK   -0.00194716522  -0.00057036753    0.0061059666       NA
## UGDS_HISP    -0.00125282009  -0.00177865750    0.0020299213       NA
## UGDS_ASIAN    0.00130168874  -0.00000959184   -0.0011491838       NA
## UGDS_AIAN    -0.00023532369  -0.00011546092    0.0003236805       NA
## UGDS_NHPI    -0.00003241483  -0.00002626496    0.0000823018       NA
## UGDS_2MOR     0.00016302017   0.00002178079   -0.0004914570       NA
## UGDS_NRA      0.00591900769   0.00000790681   -0.0019970661       NA
## UGDS_UNKN     0.00000790681   0.00490897074    0.0020931840       NA
## PPTUG_EF     -0.00199706606   0.00209318398    0.0463292422       NA
## NPT4_PUB                 NA              NA              NA       NA
## NPT4_PRIV   127.24092957824 111.26084492356 -235.0124229864       NA
## COSTT4_A    260.27267622177 150.10181839110 -919.5262339096       NA
## TUITFTE      87.86793405931  70.08076511929  511.3895290661       NA
## INEXPFTE    117.19028369426 -29.00857894024   74.8328648514       NA
## PFTFAC       -0.00072988636  -0.00149102242   -0.0092037444       NA
## PCTPELL      -0.00498958210  -0.00153071952    0.0039124762       NA
## C150_4        0.00289521278  -0.00079732767   -0.0120833723       NA
## PFTFTUG1_EF   0.00238261759  -0.00156715181   -0.0303882137       NA
## RET_FT4       0.00245667251  -0.00054661046   -0.0079115045       NA
## PCTFLOAN     -0.00388669549   0.00248778217    0.0028908245       NA
##                  NPT4_PRIV       COSTT4_A        TUITFTE        INEXPFTE
## ADM_RATE        -178.87933      -800.1455      -404.4905      -853.10348
## SAT_AVG       320389.90405   1113884.9737    517356.7502   1032819.72887
## UGDS         5167513.08795  10438624.2218   3227969.2893   3532012.28581
## UGDS_WHITE       258.69206       588.7293       -97.1616        48.65996
## UGDS_BLACK      -197.30605      -633.3163        37.6404      -143.81085
## UGDS_HISP       -405.84242      -604.4283      -203.0514      -202.82399
## UGDS_ASIAN       112.52150       266.3787        96.4618       173.51246
## UGDS_AIAN        -65.63269      -137.2005       -66.7944       -15.72635
## UGDS_NHPI         -3.86794       -22.0755       -10.2356        -7.20927
## UGDS_2MOR         64.43584       134.2530        81.6440        62.11886
## UGDS_NRA         127.24093       260.2727        87.8679       117.19028
## UGDS_UNKN        111.26084       150.1018        70.0808       -29.00858
## PPTUG_EF        -235.01242      -919.5262       511.3895        74.83286
## NPT4_PUB                NA             NA             NA              NA
## NPT4_PRIV   58731353.43213  79611798.0941  38021056.4114  19866656.71815
## COSTT4_A    79611798.09413 184451822.5193  71409621.9854  65152783.84114
## TUITFTE     38021056.41143  71409621.9854 953139208.7634 542929991.97454
## INEXPFTE    19866656.71815  65152783.8411 542929991.9745 413170211.25653
## PFTFAC           -75.20620       316.5501       294.9318       589.58433
## PCTPELL         -839.79281     -1882.0627      -896.8566      -897.30906
## C150_4           708.65096      1800.9177       731.5198       961.19691
## PFTFTUG1_EF      424.01687      1497.0406       519.0187       681.79408
## RET_FT4          344.58691      1051.2906       425.5470       563.24991
## PCTFLOAN         526.22334       420.6266      -243.4516      -688.55275
##                     PFTFAC         PCTPELL         C150_4    PFTFTUG1_EF
## ADM_RATE     -0.0033877668     0.005326634   -0.013251860   -0.007776027
## SAT_AVG       7.6235039809   -16.304788297   20.460465273   12.719346411
## UGDS         23.9419796076  -122.886799153  131.298927028   73.508015485
## UGDS_WHITE    0.0070972925    -0.023807666    0.009853973    0.004431194
## UGDS_BLACK   -0.0014201421     0.017144172   -0.012219382   -0.005915548
## UGDS_HISP    -0.0032362485     0.014532991   -0.004649648   -0.000683820
## UGDS_ASIAN   -0.0004979777    -0.002603472    0.004144713    0.001388544
## UGDS_AIAN     0.0001543920     0.001850532   -0.001140529   -0.000560614
## UGDS_NHPI    -0.0000478154     0.000367803    0.000290570   -0.000303927
## UGDS_2MOR    -0.0000838565    -0.001218142    0.001400528    0.000828290
## UGDS_NRA     -0.0007298864    -0.004989582    0.002895213    0.002382618
## UGDS_UNKN    -0.0014910224    -0.001530720   -0.000797328   -0.001567152
## PPTUG_EF     -0.0092037444     0.003912476   -0.012083372   -0.030388214
## NPT4_PUB                NA              NA             NA             NA
## NPT4_PRIV   -75.2061973363  -839.792811020  708.650963767  424.016870727
## COSTT4_A    316.5500623777 -1882.062735106 1800.917662807 1497.040594097
## TUITFTE     294.9317664121  -896.856597545  731.519820716  519.018653502
## INEXPFTE    589.5843259118  -897.309061919  961.196907620  681.794083882
## PFTFAC        0.0826429088    -0.007665825    0.010467380    0.013912811
## PCTPELL      -0.0076658250     0.046162849   -0.024428260   -0.012576102
## C150_4        0.0104673800    -0.024428260    0.045770300    0.020763589
## PFTFTUG1_EF   0.0139128113    -0.012576102    0.020763589    0.061324537
## RET_FT4       0.0067340070    -0.016154747    0.019680184    0.015086202
## PCTFLOAN     -0.0043503049     0.009849093   -0.005069604   -0.004878319
##                    RET_FT4       PCTFLOAN
## ADM_RATE      -0.005604484    0.002975208
## SAT_AVG       13.333832398  -16.271363429
## UGDS          95.729509510  -16.270877988
## UGDS_WHITE     0.007280116    0.001903675
## UGDS_BLACK    -0.009843601    0.013921967
## UGDS_HISP     -0.001349271   -0.010188406
## UGDS_ASIAN     0.002794559   -0.002502309
## UGDS_AIAN     -0.001143036   -0.001655861
## UGDS_NHPI     -0.000277061   -0.000331476
## UGDS_2MOR      0.000627931    0.000871997
## UGDS_NRA       0.002456673   -0.003886695
## UGDS_UNKN     -0.000546610    0.002487782
## PPTUG_EF      -0.007911504    0.002890825
## NPT4_PUB                NA             NA
## NPT4_PRIV    344.586910636  526.223336920
## COSTT4_A    1051.290593681  420.626556191
## TUITFTE      425.547029882 -243.451586630
## INEXPFTE     563.249908317 -688.552747737
## PFTFAC         0.006734007   -0.004350305
## PCTPELL       -0.016154747    0.009849093
## C150_4         0.019680184   -0.005069604
## PFTFTUG1_EF    0.015086202   -0.004878319
## RET_FT4        0.028206445   -0.010527662
## PCTFLOAN      -0.010527662    0.065085252

-Correlation matrix for private non-profit schools:

correlation_matrix_np<-cor(projectdata_np_matrix_quant, y = projectdata_np_matrix_quant, use = "pairwise.complete.obs", method = "pearson")
print(correlation_matrix_np)
##                ADM_RATE    SAT_AVG        UGDS  UGDS_WHITE  UGDS_BLACK
## ADM_RATE     1.00000000 -0.4081278 -0.17562870  0.25171516 -0.08274692
## SAT_AVG     -0.40812784  1.0000000  0.34235941  0.17115911 -0.46664959
## UGDS        -0.17562870  0.3423594  1.00000000 -0.02420888 -0.06464640
## UGDS_WHITE   0.25171516  0.1711591 -0.02420888  1.00000000 -0.52324468
## UGDS_BLACK  -0.08274692 -0.4666496 -0.06464640 -0.52324468  1.00000000
## UGDS_HISP   -0.07168950 -0.0103186  0.01804891 -0.56581410 -0.13118957
## UGDS_ASIAN  -0.25105772  0.5559890  0.10116532 -0.21621759 -0.12879601
## UGDS_AIAN   -0.03056691 -0.1035242 -0.04494291 -0.13560770 -0.05078120
## UGDS_NHPI   -0.00781908 -0.0873136 -0.02231410 -0.07752119 -0.03589744
## UGDS_2MOR   -0.18793468  0.3417968  0.10903484 -0.01807336 -0.10574307
## UGDS_NRA    -0.20472686  0.3520770  0.06781500 -0.17752909 -0.12438104
## UGDS_UNKN   -0.01730108 -0.1460293  0.09130317 -0.11772503 -0.04000695
## PPTUG_EF     0.04644420 -0.3340795 -0.00601116 -0.11736925  0.13905951
## NPT4_PUB             NA         NA          NA          NA          NA
## NPT4_PRIV   -0.11602308  0.3736482  0.22292129  0.12262826 -0.12553620
## COSTT4_A    -0.29794493  0.7097195  0.24921743  0.16331019 -0.23390664
## TUITFTE     -0.27089575  0.5726043  0.03241599 -0.01072718  0.00567438
## INEXPFTE    -0.41632623  0.6722937  0.05700451  0.00863415 -0.03484281
## PFTFAC      -0.06060472  0.2013073  0.02679969  0.09642936 -0.02573481
## PCTPELL      0.13760766 -0.7355873 -0.18669795 -0.39867084  0.39103271
## C150_4      -0.32304393  0.8121478  0.19280411  0.17772058 -0.29202472
## PFTFTUG1_EF -0.15745050  0.4259549  0.10212831  0.06778534 -0.12121931
## RET_FT4     -0.18620908  0.7568601  0.17878605  0.16663777 -0.29896125
## PCTFLOAN     0.06276960 -0.6526462 -0.02081854  0.02684698  0.26742511
##               UGDS_HISP  UGDS_ASIAN   UGDS_AIAN   UGDS_NHPI  UGDS_2MOR
## ADM_RATE    -0.07168950 -0.25105772 -0.03056691 -0.00781908 -0.1879347
## SAT_AVG     -0.01031857  0.55598897 -0.10352416 -0.08731361  0.3417968
## UGDS         0.01804891  0.10116532 -0.04494291 -0.02231410  0.1090348
## UGDS_WHITE  -0.56581410 -0.21621759 -0.13560770 -0.07752119 -0.0180734
## UGDS_BLACK  -0.13118957 -0.12879601 -0.05078120 -0.03589744 -0.1057431
## UGDS_HISP    1.00000000 -0.01160020 -0.04807470 -0.00771832 -0.0797770
## UGDS_ASIAN  -0.01160020  1.00000000 -0.05123882  0.06648556  0.1882758
## UGDS_AIAN   -0.04807470 -0.05123882  1.00000000  0.00133801 -0.0517942
## UGDS_NHPI   -0.00771832  0.06648556  0.00133801  1.00000000  0.0761477
## UGDS_2MOR   -0.07977704  0.18827579 -0.05179418  0.07614773  1.0000000
## UGDS_NRA    -0.07858089  0.24858611 -0.04462284 -0.01596299  0.0765749
## UGDS_UNKN   -0.12250387 -0.00201141 -0.02404118 -0.01420288  0.0112343
## PPTUG_EF     0.04537615 -0.08136710  0.02185314  0.01443016 -0.0825216
## NPT4_PUB             NA          NA          NA          NA         NA
## NPT4_PRIV   -0.25010803  0.22724819 -0.11888230 -0.01829407  0.3492992
## COSTT4_A    -0.23325534  0.33173370 -0.13678672 -0.05735471  0.4163865
## TUITFTE     -0.03022032  0.04347091 -0.02988263 -0.01189237  0.0905414
## INEXPFTE    -0.04851425  0.12566981 -0.01130743 -0.01346177  0.1107142
## PFTFAC      -0.05884598 -0.03213546  0.00747275 -0.02076612 -0.0108205
## PCTPELL      0.32541626 -0.18464830  0.12516230  0.06460384 -0.2048507
## C150_4      -0.11925499  0.32150919 -0.08935202  0.04626030  0.2849257
## PFTFTUG1_EF -0.01448851  0.09489338 -0.03064254 -0.04417119  0.1422659
## RET_FT4     -0.04318268  0.27678361 -0.11404995 -0.05612715  0.1585182
## PCTFLOAN    -0.19213015 -0.14946458 -0.09432048 -0.04903436  0.1234979
##                UGDS_NRA   UGDS_UNKN    PPTUG_EF NPT4_PUB  NPT4_PRIV
## ADM_RATE    -0.20472686 -0.01730108  0.04644420       NA -0.1160231
## SAT_AVG      0.35207705 -0.14602928 -0.33407952       NA  0.3736482
## UGDS         0.06781500  0.09130317 -0.00601116       NA  0.2229213
## UGDS_WHITE  -0.17752909 -0.11772503 -0.11736925       NA  0.1226283
## UGDS_BLACK  -0.12438104 -0.04000695  0.13905951       NA -0.1255362
## UGDS_HISP   -0.07858089 -0.12250387  0.04537615       NA -0.2501080
## UGDS_ASIAN   0.24858611 -0.00201141 -0.08136710       NA  0.2272482
## UGDS_AIAN   -0.04462284 -0.02404118  0.02185314       NA -0.1188823
## UGDS_NHPI   -0.01596299 -0.01420288  0.01443016       NA -0.0182941
## UGDS_2MOR    0.07657488  0.01123435 -0.08252155       NA  0.3492992
## UGDS_NRA     1.00000000  0.00146684 -0.12051537       NA  0.2211421
## UGDS_UNKN    0.00146684  1.00000000  0.13869313       NA  0.2174576
## PPTUG_EF    -0.12051537  0.13869313  1.00000000       NA -0.1691922
## NPT4_PUB             NA          NA          NA       NA         NA
## NPT4_PRIV    0.22114209  0.21745762 -0.16919218       NA  1.0000000
## COSTT4_A     0.25016485  0.16772262 -0.37104040       NA  0.7717459
## TUITFTE      0.03503441  0.03068879  0.07266169       NA  0.6739957
## INEXPFTE     0.07509536 -0.02041571  0.01709140       NA  0.2770262
## PFTFAC      -0.03214406 -0.07302097 -0.16108951       NA -0.0358702
## PCTPELL     -0.30162399 -0.10164310  0.08493861       NA -0.5168906
## C150_4       0.18891061 -0.05468976 -0.33811189       NA  0.4487164
## PFTFTUG1_EF  0.11982486 -0.09769252 -0.69149972       NA  0.2261556
## RET_FT4      0.19833296 -0.04851303 -0.28669646       NA  0.2753361
## PCTFLOAN    -0.19787332  0.13912323  0.05297763       NA  0.2765490
##               COSTT4_A     TUITFTE    INEXPFTE      PFTFAC    PCTPELL
## ADM_RATE    -0.2979449 -0.27089575 -0.41632623 -0.06060472  0.1376077
## SAT_AVG      0.7097195  0.57260434  0.67229367  0.20130734 -0.7355873
## UGDS         0.2492174  0.03241599  0.05700451  0.02679969 -0.1866980
## UGDS_WHITE   0.1633102 -0.01072718  0.00863415  0.09642936 -0.3986708
## UGDS_BLACK  -0.2339066  0.00567438 -0.03484281 -0.02573481  0.3910327
## UGDS_HISP   -0.2332553 -0.03022032 -0.04851425 -0.05884598  0.3254163
## UGDS_ASIAN   0.3317337  0.04347091  0.12566981 -0.03213546 -0.1846483
## UGDS_AIAN   -0.1367867 -0.02988263 -0.01130743  0.00747275  0.1251623
## UGDS_NHPI   -0.0573547 -0.01189237 -0.01346177 -0.02076612  0.0646038
## UGDS_2MOR    0.4163865  0.09054137  0.11071416 -0.01082051 -0.2048507
## UGDS_NRA     0.2501649  0.03503441  0.07509536 -0.03214406 -0.3016240
## UGDS_UNKN    0.1677226  0.03068879 -0.02041571 -0.07302097 -0.1016431
## PPTUG_EF    -0.3710404  0.07266169  0.01709140 -0.16108951  0.0849386
## NPT4_PUB            NA          NA          NA          NA         NA
## NPT4_PRIV    0.7717459  0.67399569  0.27702619 -0.03587016 -0.5168906
## COSTT4_A     1.0000000  0.73577001  0.50867706  0.08439357 -0.6749549
## TUITFTE      0.7357700  1.00000000  0.86516943  0.03028048 -0.1279547
## INEXPFTE     0.5086771  0.86516943  1.00000000  0.09467623 -0.2057503
## PFTFAC       0.0843936  0.03028048  0.09467623  1.00000000 -0.1335008
## PCTPELL     -0.6749549 -0.12795475 -0.20575025 -0.13350079  1.0000000
## C150_4       0.6405116  0.48939824  0.47243477  0.18015840 -0.5672350
## PFTFTUG1_EF  0.4478425  0.29162092  0.29248156  0.20648485 -0.2528092
## RET_FT4      0.4744694  0.35959455  0.35140911  0.15085146 -0.4769415
## PCTFLOAN     0.1292588 -0.02923821 -0.13290468 -0.06808458  0.1796837
##                 C150_4 PFTFTUG1_EF    RET_FT4   PCTFLOAN
## ADM_RATE    -0.3230439  -0.1574505 -0.1862091  0.0627696
## SAT_AVG      0.8121478   0.4259549  0.7568601 -0.6526462
## UGDS         0.1928041   0.1021283  0.1787861 -0.0208185
## UGDS_WHITE   0.1777206   0.0677853  0.1666378  0.0268470
## UGDS_BLACK  -0.2920247  -0.1212193 -0.2989613  0.2674251
## UGDS_HISP   -0.1192550  -0.0144885 -0.0431827 -0.1921301
## UGDS_ASIAN   0.3215092   0.0948934  0.2767836 -0.1494646
## UGDS_AIAN   -0.0893520  -0.0306425 -0.1140499 -0.0943205
## UGDS_NHPI    0.0462603  -0.0441712 -0.0561271 -0.0490344
## UGDS_2MOR    0.2849257   0.1422659  0.1585182  0.1234979
## UGDS_NRA     0.1889106   0.1198249  0.1983330 -0.1978733
## UGDS_UNKN   -0.0546898  -0.0976925 -0.0485130  0.1391232
## PPTUG_EF    -0.3381119  -0.6914997 -0.2866965  0.0529776
## NPT4_PUB            NA          NA         NA         NA
## NPT4_PRIV    0.4487164   0.2261556  0.2753361  0.2765490
## COSTT4_A     0.6405116   0.4478425  0.4744694  0.1292588
## TUITFTE      0.4893982   0.2916209  0.3595946 -0.0292382
## INEXPFTE     0.4724348   0.2924816  0.3514091 -0.1329047
## PFTFAC       0.1801584   0.2064848  0.1508515 -0.0680846
## PCTPELL     -0.5672350  -0.2528092 -0.4769415  0.1796837
## C150_4       1.0000000   0.4162888  0.5665879 -0.1012121
## PFTFTUG1_EF  0.4162888   1.0000000  0.4010402 -0.0813882
## RET_FT4      0.5665879   0.4010402  1.0000000 -0.2647972
## PCTFLOAN    -0.1012121  -0.0813882 -0.2647972  1.0000000
library(corrplot)
corrplot(correlation_matrix_np, method="circle")

The correlation plot indicates a strong positive relation between SAT_AVG and the financial variables, with the exception of PCTPELL and PCTFLOAN, which are negatively related to SAT_AVG. NPT4_PRIV (average net price for Title IV institutions, private for-profit and nonprofit) and COSTT4_A (average cost of attendance) are negatively related to PCTPELL. UGDS_WHITE is also negatively related to UGDS_BLACK and UGDS_HISP.
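Reading strong pairs off a circle plot can be cross-checked by listing them directly from the matrix. The following is a minimal sketch on made-up data; the data frame `toy`, the helper table `pair_table`, and the 0.5 cutoff are illustrative assumptions, not part of the original analysis.

```r
# Toy data standing in for the quantitative columns (values are made up).
set.seed(1)
x <- rnorm(100)
toy <- data.frame(
  a = x,
  b = x + rnorm(100, sd = 0.3),   # strongly related to a
  c = -x + rnorm(100, sd = 0.3),  # strongly and negatively related to a
  d = rnorm(100)                  # unrelated noise
)
cm <- cor(toy, use = "pairwise.complete.obs", method = "pearson")

# Keep each pair once (upper triangle) and skip NA entries.
idx <- which(upper.tri(cm) & !is.na(cm), arr.ind = TRUE)
pair_table <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Hypothetical cutoff: report pairs with |r| >= 0.5, strongest first.
strong_pairs <- pair_table[abs(pair_table$r) >= 0.5, ]
strong_pairs <- strong_pairs[order(-abs(strong_pairs$r)), ]
print(strong_pairs)
```

The same steps should work on a matrix such as correlation_matrix_np, where the all-NA NPT4_PUB row and column are skipped by the `!is.na(cm)` filter.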

-Covariance matrix for private for-profit schools:

projectdata_fp_matrix_quant<-data.matrix(projectdata_private_fp[7:30], rownames.force = NA)
covar_matrix_fp<-cov(projectdata_fp_matrix_quant, y = projectdata_fp_matrix_quant, use = "pairwise.complete.obs", method = "pearson")
print(covar_matrix_fp)
##                   ADM_RATE         SAT_AVG             UGDS
## ADM_RATE       0.030185280      -5.8130800      -47.8750644
## SAT_AVG       -5.813080000   16788.6666667  -140465.8000000
## UGDS         -47.875064386 -140465.8000000 10113641.9849641
## UGDS_WHITE    -0.001894686      -1.0583733      -34.9333201
## UGDS_BLACK    -0.000591637     -10.2013067        0.0174207
## UGDS_HISP      0.002906497      -2.5640133        1.1417769
## UGDS_ASIAN    -0.001526879       2.2801600       -2.5344615
## UGDS_AIAN      0.000267522      -0.1828133       -0.1088131
## UGDS_NHPI      0.000175481      -0.0592467        0.4945693
## UGDS_2MOR      0.000302380       1.6495800        5.1066832
## UGDS_NRA      -0.000460534       7.6161667        1.8725798
## UGDS_UNKN      0.000821507       2.5142600       29.2327989
## PPTUG_EF      -0.003562037     -14.8132867       33.2051063
## NPT4_PUB                NA              NA               NA
## NPT4_PRIV   -263.540648813   79303.2000000   937483.0697138
## COSTT4_A    -282.334004175  101399.0000000 -1054563.6294216
## TUITFTE      -54.828833594  364376.3333333   361288.6912671
## INEXPFTE     -70.297512995  367148.6666667  -525283.5728357
## PFTFAC        -0.003889060      15.0304000      -66.6861893
## PCTPELL        0.006797696      -8.2362400      -10.1543959
## C150_4        -0.002131956      14.3739467     -100.9311464
## PFTFTUG1_EF    0.016947413      15.7048733     -132.1670958
## RET_FT4       -0.008682872      11.3060667     -115.8528014
## PCTFLOAN      -0.000922060      -7.8462867        3.2957711
##                   UGDS_WHITE    UGDS_BLACK       UGDS_HISP     UGDS_ASIAN
## ADM_RATE      -0.00189468641  -0.000591637    0.0029064968 -0.00152687922
## SAT_AVG       -1.05837333333 -10.201306667   -2.5640133333  2.28016000000
## UGDS         -34.93332007912   0.017420719    1.1417768701 -2.53446151445
## UGDS_WHITE     0.08337790966  -0.033444264   -0.0363193825 -0.00468934410
## UGDS_BLACK    -0.03344426402   0.060892007   -0.0190279862 -0.00284058207
## UGDS_HISP     -0.03631938248  -0.019027986    0.0603273507  0.00006762256
## UGDS_ASIAN    -0.00468934410  -0.002840582    0.0000676226  0.00746468639
## UGDS_AIAN      0.00048268090  -0.000704966   -0.0001546391 -0.00006426224
## UGDS_NHPI     -0.00041816930  -0.000402310   -0.0000557663  0.00027503423
## UGDS_2MOR     -0.00000307124  -0.000673063   -0.0009452677 -0.00003839737
## UGDS_NRA      -0.00110480882  -0.000602317   -0.0000148682  0.00027930027
## UGDS_UNKN     -0.00762612894  -0.003052765   -0.0037610664 -0.00043539328
## PPTUG_EF      -0.00102267550   0.004049642   -0.0045413167  0.00141289628
## NPT4_PUB                  NA            NA              NA             NA
## NPT4_PRIV   -124.00657978294 114.638361612 -218.0677274519  6.73071699699
## COSTT4_A     136.62126605727 107.193396071 -390.8326208749 68.71750661478
## TUITFTE       25.82910801684  29.934961988 -230.7020828285  4.95773438022
## INEXPFTE      76.44523790220 -32.574193794 -129.6399315509 -1.96172537035
## PFTFAC         0.00370567694  -0.008937824    0.0034985059 -0.00000125873
## PCTPELL       -0.01429712612   0.010864005    0.0075761234 -0.00245861940
## C150_4         0.00266467594  -0.009138176    0.0083801056  0.00228254530
## PFTFTUG1_EF    0.00338693415  -0.001067939    0.0037227346 -0.00130161746
## RET_FT4        0.00803120835  -0.011971118    0.0036121878  0.00190125603
## PCTFLOAN       0.00731198380   0.008454763   -0.0155242172 -0.00452514202
##                 UGDS_AIAN      UGDS_NHPI      UGDS_2MOR       UGDS_NRA
## ADM_RATE     0.0002675216  0.00017548106  0.00030238000 -0.00046053363
## SAT_AVG     -0.1828133333 -0.05924666667  1.64958000000  7.61616666667
## UGDS        -0.1088131472  0.49456925613  5.10668321178  1.87257975607
## UGDS_WHITE   0.0004826809 -0.00041816930 -0.00000307124 -0.00110480882
## UGDS_BLACK  -0.0007049658 -0.00040231046 -0.00067306258 -0.00060231669
## UGDS_HISP   -0.0001546391 -0.00005576631 -0.00094526771 -0.00001486824
## UGDS_ASIAN  -0.0000642622  0.00027503423 -0.00003839737  0.00027930027
## UGDS_AIAN    0.0004743664  0.00001114916  0.00005467154 -0.00002517022
## UGDS_NHPI    0.0000111492  0.00051267416  0.00006369530 -0.00000667052
## UGDS_2MOR    0.0000546715  0.00006369530  0.00110916191 -0.00005444729
## UGDS_NRA    -0.0000251702 -0.00000667052 -0.00005444729  0.00155199495
## UGDS_UNKN   -0.0000691083  0.00002305510  0.00050016566 -0.00001969338
## PPTUG_EF    -0.0003013514  0.00000252910  0.00025562899  0.00008796026
## NPT4_PUB               NA             NA             NA             NA
## NPT4_PRIV   -1.5991696091  2.58375807661 33.88250271621 41.99022906075
## COSTT4_A     5.0136015877 -0.05355118625 14.02868667298 58.64872513685
## TUITFTE     -0.5818547961  8.88246206522 20.83927416210 47.63389258223
## INEXPFTE    -2.2562229522  6.67966354700  9.21244940657 23.67812145687
## PFTFAC       0.0004505304 -0.00011253086  0.00020654764  0.00113484903
## PCTPELL      0.0001131439 -0.00015155570 -0.00015426804 -0.00121552725
## C150_4      -0.0000723199  0.00035142700 -0.00029852673  0.00027380861
## PFTFTUG1_EF  0.0000352866 -0.00081739147  0.00010670493 -0.00007699709
## RET_FT4      0.0000872715  0.00024518451 -0.00009552644  0.00184503234
## PCTFLOAN     0.0000770569 -0.00009339149  0.00094653304 -0.00093168696
##                  UGDS_UNKN       PPTUG_EF NPT4_PUB      NPT4_PRIV
## ADM_RATE      0.0008215073  -0.0035620368       NA     -263.54065
## SAT_AVG       2.5142600000 -14.8132866667       NA    79303.20000
## UGDS         29.2327989294  33.2051063463       NA   937483.06971
## UGDS_WHITE   -0.0076261289  -0.0010226755       NA     -124.00658
## UGDS_BLACK   -0.0030527650   0.0040496424       NA      114.63836
## UGDS_HISP    -0.0037610664  -0.0045413167       NA     -218.06773
## UGDS_ASIAN   -0.0004353933   0.0014128963       NA        6.73072
## UGDS_AIAN    -0.0000691083  -0.0003013514       NA       -1.59917
## UGDS_NHPI     0.0000230551   0.0000025291       NA        2.58376
## UGDS_2MOR     0.0005001657   0.0002556290       NA       33.88250
## UGDS_NRA     -0.0000196934   0.0000879603       NA       41.99023
## UGDS_UNKN     0.0144737638   0.0000566491       NA      143.85835
## PPTUG_EF      0.0000566491   0.0570955492       NA       94.44090
## NPT4_PUB                NA             NA       NA             NA
## NPT4_PRIV   143.8583517192  94.4408979587       NA 47426314.07995
## COSTT4_A      0.6722633108  78.1106850835       NA 36529821.66504
## TUITFTE      93.5043605247 173.5878360520       NA 21807216.30778
## INEXPFTE     50.3107405402  23.3256439843       NA  4297221.04031
## PFTFAC       -0.0000444073  -0.0170696659       NA      -25.05588
## PCTPELL      -0.0000967285  -0.0039372726       NA     -132.61733
## C150_4       -0.0044432380  -0.0089963812       NA      199.98953
## PFTFTUG1_EF  -0.0039881814  -0.0428607141       NA      -37.57500
## RET_FT4      -0.0036557521  -0.0133598090       NA      295.01101
## PCTFLOAN      0.0043713202   0.0012708673       NA      577.29911
##                     COSTT4_A         TUITFTE       INEXPFTE
## ADM_RATE        -282.3340042      -54.828834      -70.29751
## SAT_AVG       101399.0000000   364376.333333   367148.66667
## UGDS        -1054563.6294216   361288.691267  -525283.57284
## UGDS_WHITE       136.6212661       25.829108       76.44524
## UGDS_BLACK       107.1933961       29.934962      -32.57419
## UGDS_HISP       -390.8326209     -230.702083     -129.63993
## UGDS_ASIAN        68.7175066        4.957734       -1.96173
## UGDS_AIAN          5.0136016       -0.581855       -2.25622
## UGDS_NHPI         -0.0535512        8.882462        6.67966
## UGDS_2MOR         14.0286867       20.839274        9.21245
## UGDS_NRA          58.6487251       47.633893       23.67812
## UGDS_UNKN          0.6722633       93.504361       50.31074
## PPTUG_EF          78.1106851      173.587836       23.32564
## NPT4_PUB                  NA              NA             NA
## NPT4_PRIV   36529821.6650392 21807216.307780  4297221.04031
## COSTT4_A    36438414.8549606 18136430.015341  5311616.67379
## TUITFTE     18136430.0153409 70227037.783148 26923072.09514
## INEXPFTE     5311616.6737896 26923072.095138 23783911.59706
## PFTFAC           152.3768831       26.077511      254.88947
## PCTPELL         -355.9557290     -191.643388     -109.74078
## C150_4           195.4150430       88.350315      108.25320
## PFTFTUG1_EF      -84.2746411      200.759507      -85.61791
## RET_FT4          292.0636731      226.081248       87.75900
## PCTFLOAN         253.9930416      337.304509       32.63181
##                      PFTFAC         PCTPELL          C150_4
## ADM_RATE     -0.00388906044    0.0067976960   -0.0021319555
## SAT_AVG      15.03040000000   -8.2362400000   14.3739466667
## UGDS        -66.68618928737  -10.1543958705 -100.9311463560
## UGDS_WHITE    0.00370567694   -0.0142971261    0.0026646759
## UGDS_BLACK   -0.00893782448    0.0108640046   -0.0091381759
## UGDS_HISP     0.00349850594    0.0075761234    0.0083801056
## UGDS_ASIAN   -0.00000125873   -0.0024586194    0.0022825453
## UGDS_AIAN     0.00045053036    0.0001131439   -0.0000723199
## UGDS_NHPI    -0.00011253086   -0.0001515557    0.0003514270
## UGDS_2MOR     0.00020654764   -0.0001542680   -0.0002985267
## UGDS_NRA      0.00113484903   -0.0012155272    0.0002738086
## UGDS_UNKN    -0.00004440726   -0.0000967285   -0.0044432380
## PPTUG_EF     -0.01706966589   -0.0039372726   -0.0089963812
## NPT4_PUB                 NA              NA              NA
## NPT4_PRIV   -25.05588485374 -132.6173302476  199.9895334019
## COSTT4_A    152.37688310732 -355.9557290053  195.4150430381
## TUITFTE      26.07751071030 -191.6433881382   88.3503145909
## INEXPFTE    254.88947315842 -109.7407820002  108.2532014480
## PFTFAC        0.07681148567   -0.0079480216    0.0160575752
## PCTPELL      -0.00794802163    0.0397572102   -0.0028238148
## C150_4        0.01605757515   -0.0028238148    0.0417717027
## PFTFTUG1_EF   0.00099900570    0.0119318084    0.0115778778
## RET_FT4       0.01481703235   -0.0063026671    0.0210652247
## PCTFLOAN     -0.00479599544    0.0220653582    0.0000224020
##                 PFTFTUG1_EF         RET_FT4       PCTFLOAN
## ADM_RATE       0.0169474129   -0.0086828725  -0.0009220602
## SAT_AVG       15.7048733333   11.3060666667  -7.8462866667
## UGDS        -132.1670957760 -115.8528014216   3.2957710673
## UGDS_WHITE     0.0033869341    0.0080312084   0.0073119838
## UGDS_BLACK    -0.0010679388   -0.0119711183   0.0084547628
## UGDS_HISP      0.0037227346    0.0036121878  -0.0155242172
## UGDS_ASIAN    -0.0013016175    0.0019012560  -0.0045251420
## UGDS_AIAN      0.0000352866    0.0000872715   0.0000770569
## UGDS_NHPI     -0.0008173915    0.0002451845  -0.0000933915
## UGDS_2MOR      0.0001067049   -0.0000955264   0.0009465330
## UGDS_NRA      -0.0000769971    0.0018450323  -0.0009316870
## UGDS_UNKN     -0.0039881814   -0.0036557521   0.0043713202
## PPTUG_EF      -0.0428607141   -0.0133598090   0.0012708673
## NPT4_PUB                 NA              NA             NA
## NPT4_PRIV    -37.5749984908  295.0110115608 577.2991089007
## COSTT4_A     -84.2746410746  292.0636731489 253.9930416133
## TUITFTE      200.7595069851  226.0812475434 337.3045088767
## INEXPFTE     -85.6179147112   87.7589996572  32.6318137137
## PFTFAC         0.0009990057    0.0148170324  -0.0047959954
## PCTPELL        0.0119318084   -0.0063026671   0.0220653582
## C150_4         0.0115778778    0.0210652247   0.0000224020
## PFTFTUG1_EF    0.0941052022    0.0129155141   0.0019342202
## RET_FT4        0.0129155141    0.0774854966   0.0001649988
## PCTFLOAN       0.0019342202    0.0001649988   0.0627023010

-Correlation matrix for private for-profit schools:

correlation_matrix_fp<-cor(projectdata_fp_matrix_quant, y = projectdata_fp_matrix_quant, use = "pairwise.complete.obs", method = "pearson")
print(correlation_matrix_fp)
##               ADM_RATE    SAT_AVG          UGDS   UGDS_WHITE    UGDS_BLACK
## ADM_RATE     1.0000000 -0.5568969 -0.1139367866 -0.045495357 -0.0165705203
## SAT_AVG     -0.5568969  1.0000000 -0.5145798977 -0.048155961 -0.9272817308
## UGDS        -0.1139368 -0.5145799  1.0000000000 -0.038041758  0.0000221989
## UGDS_WHITE  -0.0454954 -0.0481560 -0.0380417576  1.000000000 -0.4693707255
## UGDS_BLACK  -0.0165705 -0.9272817  0.0000221989 -0.469370725  1.0000000000
## UGDS_HISP    0.0781878 -0.4483872  0.0014617407 -0.512101240 -0.3139464378
## UGDS_ASIAN  -0.1308863  0.3712411 -0.0092241452 -0.187966657 -0.1332359478
## UGDS_AIAN    0.0760779 -0.4142103 -0.0015709799  0.076749873 -0.1311689212
## UGDS_NHPI    0.0532776 -0.2407702  0.0068683531 -0.063959627 -0.0720046366
## UGDS_2MOR    0.0681265  0.6210040  0.0482156090 -0.000319368 -0.0818988554
## UGDS_NRA    -0.0734653  0.4827147  0.0149465646 -0.097121746 -0.0619583156
## UGDS_UNKN    0.0570248  0.1361392  0.0764057197 -0.219526979 -0.1028306291
## PPTUG_EF    -0.0992728 -0.5284660  0.0436462386 -0.014834485  0.0686640249
## NPT4_PUB            NA         NA            NA           NA            NA
## NPT4_PRIV   -0.2451089  0.0609736  0.0417131129 -0.062326404  0.0673526370
## COSTT4_A    -0.2903676  0.0816682 -0.0306653773  0.086925768  0.0770745745
## TUITFTE     -0.0513230  0.2889447  0.0138040814  0.010879298  0.0147653811
## INEXPFTE    -0.1283049  0.6052686 -0.0339516154  0.054469751 -0.0271802286
## PFTFAC      -0.0800636  0.4105594 -0.0398558401  0.055537330 -0.1453901634
## PCTPELL      0.2163453 -0.5351338 -0.0159882678 -0.248328018  0.2206885528
## C150_4      -0.0838293  0.6311848 -0.0637333790  0.055175020 -0.2216000958
## PFTFTUG1_EF  0.3171937  0.5718464 -0.1658424506  0.042691099 -0.0150193033
## RET_FT4     -0.1735640  0.6074220 -0.0462813508  0.121976395 -0.2041337024
## PCTFLOAN    -0.0276072 -0.3479708  0.0041321521  0.101130962  0.1367613052
##               UGDS_HISP    UGDS_ASIAN   UGDS_AIAN    UGDS_NHPI
## ADM_RATE     0.07818781 -0.1308863132  0.07607786  0.053277614
## SAT_AVG     -0.44838718  0.3712411163 -0.41421026 -0.240770245
## UGDS         0.00146174 -0.0092241452 -0.00157098  0.006868353
## UGDS_WHITE  -0.51210124 -0.1879666569  0.07674987 -0.063959627
## UGDS_BLACK  -0.31394644 -0.1332359478 -0.13116892 -0.072004637
## UGDS_HISP    1.00000000  0.0031866086 -0.02890716 -0.010027533
## UGDS_ASIAN   0.00318661  1.0000000000 -0.03415019  0.140591883
## UGDS_AIAN   -0.02890716 -0.0341501899  1.00000000  0.022608115
## UGDS_NHPI   -0.01002753  0.1405918832  0.02260811  1.000000000
## UGDS_2MOR   -0.11555806 -0.0133443745  0.07537145  0.084467412
## UGDS_NRA    -0.00153659  0.0820579051 -0.02933492 -0.007478147
## UGDS_UNKN   -0.12728087 -0.0418875729 -0.02637438  0.008463609
## PPTUG_EF    -0.07734242  0.0686222379 -0.05790028  0.000466929
## NPT4_PUB             NA            NA          NA           NA
## NPT4_PRIV   -0.12748663  0.0125485808 -0.01058156  0.016373075
## COSTT4_A    -0.28911473  0.2417902780  0.05935236 -0.000337176
## TUITFTE     -0.11422730  0.0070338168 -0.00324646  0.047668259
## INEXPFTE    -0.10858521 -0.0047082497 -0.02129568  0.060640671
## PFTFAC       0.05485111 -0.0000856367  0.07649870 -0.014880133
## PCTPELL      0.15465220 -0.1430376858  0.02604121 -0.033517211
## C150_4       0.20669041  0.2190671611 -0.02037234  0.056560038
## PFTFTUG1_EF  0.06102147 -0.0858416650  0.00738219 -0.097799395
## RET_FT4      0.06455256  0.1548141778  0.02326999  0.025316590
## PCTFLOAN    -0.25234268 -0.2096347747  0.01412256 -0.016446551
##                UGDS_2MOR    UGDS_NRA    UGDS_UNKN     PPTUG_EF NPT4_PUB
## ADM_RATE     0.068126495 -0.07346531  0.057024811 -0.099272758       NA
## SAT_AVG      0.621003957  0.48271470  0.136139177 -0.528466037       NA
## UGDS         0.048215609  0.01494656  0.076405720  0.043646239       NA
## UGDS_WHITE  -0.000319368 -0.09712175 -0.219526979 -0.014834485       NA
## UGDS_BLACK  -0.081898855 -0.06195832 -0.102830629  0.068664025       NA
## UGDS_HISP   -0.115558055 -0.00153659 -0.127280870 -0.077342418       NA
## UGDS_ASIAN  -0.013344375  0.08205791 -0.041887573  0.068622238       NA
## UGDS_AIAN    0.075371449 -0.02933492 -0.026374381 -0.057900285       NA
## UGDS_NHPI    0.084467412 -0.00747815  0.008463609  0.000466929       NA
## UGDS_2MOR    1.000000000 -0.04149862  0.124831921  0.032102090       NA
## UGDS_NRA    -0.041498615  1.00000000 -0.004155127  0.010432075       NA
## UGDS_UNKN    0.124831921 -0.00415513  1.000000000  0.001969471       NA
## PPTUG_EF     0.032102090  0.01043207  0.001969471  1.000000000       NA
## NPT4_PUB              NA          NA           NA           NA       NA
## NPT4_PRIV    0.147230594  0.21158213  0.175455457  0.063226846       NA
## COSTT4_A     0.084108668  0.29786438  0.000745963  0.056007057       NA
## TUITFTE      0.076083794  0.14691938  0.094455492  0.088948630       NA
## INEXPFTE     0.056898153  0.12354459  0.085974477  0.020075514       NA
## PFTFAC       0.025923718  0.11140965 -0.001378630 -0.258219918       NA
## PCTPELL     -0.023216129 -0.17268570 -0.004028267 -0.082703168       NA
## C150_4      -0.055908652  0.03438323 -0.136293799 -0.176071281       NA
## PFTFTUG1_EF  0.014081728 -0.00582273 -0.097429529 -0.612138572       NA
## RET_FT4     -0.013354879  0.11908737 -0.080655490 -0.178099556       NA
## PCTFLOAN     0.113428335 -0.10539831  0.144960029  0.021248057       NA
##              NPT4_PRIV     COSTT4_A     TUITFTE    INEXPFTE        PFTFAC
## ADM_RATE    -0.2451089 -0.290367647 -0.05132298 -0.12830488 -0.0800635926
## SAT_AVG      0.0609736  0.081668228  0.28894473  0.60526858  0.4105594313
## UGDS         0.0417131 -0.030665377  0.01380408 -0.03395162 -0.0398558401
## UGDS_WHITE  -0.0623264  0.086925768  0.01087930  0.05446975  0.0555373300
## UGDS_BLACK   0.0673526  0.077074574  0.01476538 -0.02718023 -0.1453901634
## UGDS_HISP   -0.1274866 -0.289114733 -0.11422730 -0.10858521  0.0548511078
## UGDS_ASIAN   0.0125486  0.241790278  0.00703382 -0.00470825 -0.0000856367
## UGDS_AIAN   -0.0105816  0.059352356 -0.00324646 -0.02129568  0.0764986957
## UGDS_NHPI    0.0163731 -0.000337176  0.04766826  0.06064067 -0.0148801326
## UGDS_2MOR    0.1472306  0.084108668  0.07608379  0.05689815  0.0259237181
## UGDS_NRA     0.2115821  0.297864383  0.14691938  0.12354459  0.1114096525
## UGDS_UNKN    0.1754555  0.000745963  0.09445549  0.08597448 -0.0013786304
## PPTUG_EF     0.0632268  0.056007057  0.08894863  0.02007551 -0.2582199178
## NPT4_PUB            NA           NA          NA          NA            NA
## NPT4_PRIV    1.0000000  0.967357557  0.45727302  0.18241881 -0.0150769478
## COSTT4_A     0.9673576  1.000000000  0.40204302  0.31963997  0.1019986964
## TUITFTE      0.4572730  0.402043024  1.00000000  0.65876543  0.0139443941
## INEXPFTE     0.1824188  0.319639973  0.65876543  1.00000000  0.2956579316
## PFTFAC      -0.0150769  0.101998696  0.01394439  0.29565793  1.0000000000
## PCTPELL     -0.1017748 -0.332307279 -0.11780181 -0.11328996 -0.1658382295
## C150_4       0.1804407  0.185581964  0.08245811  0.18622118  0.3429518514
## PFTFTUG1_EF -0.0199389 -0.046508764  0.08331965 -0.09867887  0.0133467776
## RET_FT4      0.1756141  0.177174856  0.14915732  0.12505271  0.2309278001
## PCTFLOAN     0.3412980  0.208851934  0.16520422  0.02684142 -0.0878965077
##                 PCTPELL       C150_4 PFTFTUG1_EF     RET_FT4     PCTFLOAN
## ADM_RATE     0.21634529 -0.083829278  0.31719373 -0.17356395 -0.027607175
## SAT_AVG     -0.53513378  0.631184774  0.57184638  0.60742201 -0.347970803
## UGDS        -0.01598827 -0.063733379 -0.16584245 -0.04628135  0.004132152
## UGDS_WHITE  -0.24832802  0.055175020  0.04269110  0.12197640  0.101130962
## UGDS_BLACK   0.22068855 -0.221600096 -0.01501930 -0.20413370  0.136761305
## UGDS_HISP    0.15465220  0.206690405  0.06102147  0.06455256 -0.252342682
## UGDS_ASIAN  -0.14303769  0.219067161 -0.08584166  0.15481418 -0.209634775
## UGDS_AIAN    0.02604121 -0.020372343  0.00738219  0.02326999  0.014122556
## UGDS_NHPI   -0.03351721  0.056560038 -0.09779939  0.02531659 -0.016446551
## UGDS_2MOR   -0.02321613 -0.055908652  0.01408173 -0.01335488  0.113428335
## UGDS_NRA    -0.17268570  0.034383230 -0.00582273  0.11908737 -0.105398310
## UGDS_UNKN   -0.00402827 -0.136293799 -0.09742953 -0.08065549  0.144960029
## PPTUG_EF    -0.08270317 -0.176071281 -0.61213857 -0.17809956  0.021248057
## NPT4_PUB             NA           NA          NA          NA           NA
## NPT4_PRIV   -0.10177479  0.180440701 -0.01993887  0.17561405  0.341297971
## COSTT4_A    -0.33230728  0.185581964 -0.04650876  0.17717486  0.208851934
## TUITFTE     -0.11780181  0.082458112  0.08331965  0.14915732  0.165204223
## INEXPFTE    -0.11328996  0.186221181 -0.09867887  0.12505271  0.026841418
## PFTFAC      -0.16583823  0.342951851  0.01334678  0.23092780 -0.087896508
## PCTPELL      1.00000000 -0.092793128  0.24013401 -0.14047872  0.441937946
## C150_4      -0.09279313  1.000000000  0.20299115  0.37974183  0.000721788
## PFTFTUG1_EF  0.24013401  0.202991152  1.00000000  0.17019919  0.036983617
## RET_FT4     -0.14047872  0.379741827  0.17019919  1.00000000  0.003430510
## PCTFLOAN     0.44193795  0.000721788  0.03698362  0.00343051  1.000000000
library(corrplot)
corrplot(correlation_matrix_fp, method="circle")

The case of for-profit schools differs from that of non-profit schools. There is a strong negative relation between UGDS_BLACK and SAT_AVG (r = -0.93) and a weaker negative relation between UGDS_HISP and SAT_AVG (r = -0.45); both pairs also have a negative covariance. SAT_AVG has a strong positive relation with INEXPFTE (r = 0.61), which indicates that instructional expenditure per full-time equivalent student rises with SAT_AVG. These variables also have a large positive covariance (367148.67). SAT_AVG and ADM_RATE are negatively related (r = -0.56) with a negative covariance (-5.81), which is an expected result: admission rates decrease as institutions admit students with higher SAT scores.
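When comparing the covariance and correlation values, it may help to keep in mind that covariance depends on the units of the variables while correlation does not. A minimal sketch on made-up data (the vectors `sat` and `spend` are hypothetical stand-ins, not columns from projectdata):

```r
set.seed(42)
sat   <- rnorm(200, mean = 1050, sd = 130)   # made-up SAT-like scores
spend <- 20 * sat + rnorm(200, sd = 3000)    # made-up dollar amounts

cov_dollars   <- cov(sat, spend)
cov_thousands <- cov(sat, spend / 1000)      # same data, expressed in thousands
r_dollars     <- cor(sat, spend)
r_thousands   <- cor(sat, spend / 1000)

# Covariance shrinks by a factor of 1000; correlation is unchanged.
print(c(cov_dollars = cov_dollars, cov_thousands = cov_thousands))
print(c(r_dollars = r_dollars, r_thousands = r_thousands))

# Correlation is covariance divided by the product of the standard deviations.
r_by_hand <- cov_dollars / (sd(sat) * sd(spend))
print(all.equal(r_by_hand, r_dollars))
```

For this reason the size of a covariance such as 367148.67 mostly reflects the scales of SAT_AVG and INEXPFTE, and the correlation is the better guide to the strength of the relation.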

###Question 8

The analyses indicate that:

-The relation of SAT_AVG to ADM_RATE is considerably stronger for for-profit private schools than for non-profit private and public schools.

-The negative relation of UGDS_BLACK and SAT_AVG is also considerably stronger for for-profit private schools.

-The positive relation of SAT_AVG to INEXPFTE is stronger in private schools than in public schools.

-PCTPELL (percentage of undergraduates who receive a Pell Grant) is negatively related to COSTT4_A (average cost of attendance) for private schools while they have a positive relation in public schools, while they have a negative covariance in each case.
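A sector-by-sector comparison of one pair of variables can be sketched as follows. The data and the `sector` column are made up for illustration; the report's own subsets (for example projectdata_private_fp) were built elsewhere, so this is only an assumption about how such a comparison might be laid out.

```r
set.seed(7)
n <- 150
# Hypothetical grouping column and made-up values.
toy <- data.frame(
  sector  = rep(c("public", "private_np", "private_fp"), each = n / 3),
  SAT_AVG = rnorm(n, mean = 1050, sd = 130)
)
toy$ADM_RATE <- 0.9 - 0.0003 * toy$SAT_AVG + rnorm(n, sd = 0.15)

# One Pearson correlation per sector, using pairwise-complete rows.
by_sector <- sapply(split(toy, toy$sector), function(d) {
  cor(d$SAT_AVG, d$ADM_RATE, use = "pairwise.complete.obs")
})
print(round(by_sector, 3))
```

Placing the coefficients side by side in this way makes differences in strength across sectors easier to see than comparing three separate matrices.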

###Question 9

REGRESSION ANALYSIS: y=COSTT4_A, x=SAT_AVG

Loading the necessary packages:

library(tidyverse)
## -- Attaching packages ---------------------------------------------------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.1     v purrr   0.3.2
## v tibble  2.1.3     v dplyr   0.8.3
## v tidyr   0.8.3     v stringr 1.4.0
## v readr   1.3.1     v forcats 0.4.0
## -- Conflicts ------------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x ggplot2::%+%()   masks psych::%+%()
## x ggplot2::alpha() masks psych::alpha()
## x dplyr::filter()  masks stats::filter()
## x dplyr::lag()     masks stats::lag()
library(ggpubr)
## Loading required package: magrittr
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
## 
##     set_names
## The following object is masked from 'package:tidyr':
## 
##     extract
detach(package:dplyr)
theme_set(theme_pubr())

Scatter Plot for the variables:

ggplot(projectdata, aes(x = SAT_AVG, y = COSTT4_A)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 6405 rows containing non-finite values (stat_smooth).
## Warning: Removed 6405 rows containing missing values (geom_point).

cor(x=projectdata$SAT_AVG, y = projectdata$COSTT4_A, use = "complete.obs",
    method = "pearson")
## [1] 0.521171

Looking for best-fit line:

model_9 <- lm(COSTT4_A ~ SAT_AVG, data = projectdata)
model_9
## 
## Call:
## lm(formula = COSTT4_A ~ SAT_AVG, data = projectdata)
## 
## Coefficients:
## (Intercept)      SAT_AVG  
##    -22454.5         51.9

Plotting with regression line:

ggplot(projectdata, aes(x = SAT_AVG, y = COSTT4_A)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 6405 rows containing non-finite values (stat_smooth).
## Warning: Removed 6405 rows containing missing values (geom_point).

Model assessment:

print(summary(model_9))
## 
## Call:
## lm(formula = COSTT4_A ~ SAT_AVG, data = projectdata)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -32053 -10000    942   9354  25867 
## 
## Coefficients:
##              Estimate Std. Error t value            Pr(>|t|)    
## (Intercept) -22454.50    2520.61   -8.91 <0.0000000000000002 ***
## SAT_AVG         51.90       2.36   21.98 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11400 on 1296 degrees of freedom
##   (6405 observations deleted due to missingness)
## Multiple R-squared:  0.272,  Adjusted R-squared:  0.271 
## F-statistic:  483 on 1 and 1296 DF,  p-value: <0.0000000000000002

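The model explains only about 27% of the variance in COSTT4_A (R-squared = 0.272). Cost of attendance and SAT average have a positive linear relation with a correlation of 0.52, and the fitted slope implies that each additional SAT point is associated with an increase of about 51.90 in average cost of attendance.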
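The link between the correlation and R-squared reported above can be checked by hand: in a linear model with a single predictor, R-squared equals the squared Pearson correlation. The short sketch below uses the values printed in the output; the SAT average of 1000 is a hypothetical input chosen only to illustrate a fitted value.

```r
# Values copied from the output above.
r         <- 0.521171   # Pearson correlation of SAT_AVG and COSTT4_A
intercept <- -22454.50
slope     <- 51.90

# With one predictor, R-squared equals the squared correlation.
r_squared <- r^2
print(round(r_squared, 3))   # 0.272

# Fitted cost at a hypothetical SAT average of 1000 (illustrative input only).
sat_example <- 1000
fitted_cost <- intercept + slope * sat_example
print(fitted_cost)           # 29445.5
```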

###Question 10

REGRESSION ANALYSIS: y=COSTT4_A, x=ADM_RATE

ggplot(projectdata, aes(x = ADM_RATE, y = COSTT4_A)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 5707 rows containing non-finite values (stat_smooth).
## Warning: Removed 5707 rows containing missing values (geom_point).

cor(x=projectdata$ADM_RATE, y = projectdata$COSTT4_A, use = "complete.obs",
    method = "pearson")
## [1] -0.277502

The correlation is -0.28, a weak negative linear relation between admission rate and cost of attendance.

model_10 <- lm(COSTT4_A ~ ADM_RATE, data = projectdata)
ggplot(projectdata, aes(x = ADM_RATE, y = COSTT4_A)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 5707 rows containing non-finite values (stat_smooth).
## Warning: Removed 5707 rows containing missing values (geom_point).

print(summary(model_10))
## 
## Call:
## lm(formula = COSTT4_A ~ ADM_RATE, data = projectdata)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -31765  -9986  -1323   9705  36241 
## 
## Coefficients:
##             Estimate Std. Error t value            Pr(>|t|)    
## (Intercept)    43930        982    44.8 <0.0000000000000002 ***
## ADM_RATE      -17799       1380   -12.9 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12600 on 1994 degrees of freedom
##   (5707 observations deleted due to missingness)
## Multiple R-squared:  0.077,  Adjusted R-squared:  0.0765 
## F-statistic:  166 on 1 and 1994 DF,  p-value: <0.0000000000000002

The residual distribution is positively skewed (median = -1323 < mean = 0). R-squared is 0.077, meaning that the model explains only 7.7% of the variance in COSTT4_A, which is low. There are also a number of outliers in the data.
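The skew and outlier statements can be checked directly from a fitted model rather than read off the summary. A minimal sketch on made-up data follows; the variables `x` and `y`, the model `fit`, and the cutoff of 3 for standardized residuals are assumptions for illustration, not choices taken from the analysis above.

```r
set.seed(3)
x <- runif(300)
# Made-up response with right-skewed noise, so the residuals are skewed too.
y <- 40000 - 18000 * x + (rexp(300, rate = 1 / 8000) - 8000)
fit <- lm(y ~ x)

res <- residuals(fit)
# Residuals of a least-squares fit with an intercept average to zero,
# so a negative median points to a longer right tail.
print(c(mean = mean(res), median = median(res)))

# Flag observations whose standardized residual exceeds 3 in absolute value
# (a common rule of thumb, used here only as an example).
flagged <- which(abs(rstandard(fit)) > 3)
print(length(flagged))
```

Applied to model_10, the same calls would give a count of candidate outliers instead of a visual impression from the scatter plot.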

###Question 11

REGRESSION ANALYSIS: y=UGDS, x=SAT_AVG

ggplot(projectdata, aes(x = SAT_AVG, y = UGDS)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 6399 rows containing non-finite values (stat_smooth).
## Warning: Removed 6399 rows containing missing values (geom_point).

cor(x=projectdata$SAT_AVG, y = projectdata$UGDS, use = "complete.obs",
    method = "pearson")
## [1] 0.249611
model_11 <- lm(UGDS ~ SAT_AVG, data = projectdata)
ggplot(projectdata, aes(SAT_AVG, UGDS)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 6399 rows containing non-finite values (stat_smooth).
## Warning: Removed 6399 rows containing missing values (geom_point).

print(summary(model_11))
## 
## Call:
## lm(formula = UGDS ~ SAT_AVG, data = projectdata)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -11197  -3981  -2485   1077  45035 
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept) -8999.26    1573.21   -5.72          0.000000013 ***
## SAT_AVG        13.71       1.47    9.30 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 7090 on 1302 degrees of freedom
##   (6399 observations deleted due to missingness)
## Multiple R-squared:  0.0623, Adjusted R-squared:  0.0616 
## F-statistic: 86.5 on 1 and 1302 DF,  p-value: <0.0000000000000002

The residual distribution is positively skewed (the median, -2485, is below the mean of 0). R-squared is 0.0623, so the model explains only 6.2% of the variance, which is low. Most observations are concentrated at low values of UGDS, which suggests that the very large values of UGDS are outliers. These outliers need to be removed for a more accurate prediction.

Trying to remove outliers from UGDS in projectdata.

Running the regression analysis with the trimmed UGDS:

model_11_2 <- lm(UGDS ~ SAT_AVG, data = projectdata_UGDS_Trimmed)
ggplot(projectdata_UGDS_Trimmed, aes(SAT_AVG, UGDS)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 5878 rows containing non-finite values (stat_smooth).
## Warning: Removed 5878 rows containing missing values (geom_point).

print(summary(model_11_2))
## 
## Call:
## lm(formula = UGDS ~ SAT_AVG, data = projectdata_UGDS_Trimmed)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -2222   -806   -202    647   2993 
## 
## Coefficients:
##             Estimate Std. Error t value  Pr(>|t|)    
## (Intercept)   305.96     303.08    1.01      0.31    
## SAT_AVG         1.41       0.29    4.88 0.0000013 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1080 on 872 degrees of freedom
##   (5878 observations deleted due to missingness)
## Multiple R-squared:  0.0266, Adjusted R-squared:  0.0254 
## F-statistic: 23.8 on 1 and 872 DF,  p-value: 0.00000128

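The residuals of the trimmed model are closer to a normal distribution (the minimum and maximum are of similar magnitude). However, R-squared drops from 0.0623 to 0.0266, so the trimmed model explains less of the variance than the untrimmed one.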

###Question 12

REGRESSION ANALYSIS: y=UGDS, x=ADM_RATE

I will run this analysis with trimmed values of UGDS.

ggplot(projectdata_UGDS_Trimmed, aes(x = ADM_RATE, y = UGDS)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 5045 rows containing non-finite values (stat_smooth).
## Warning: Removed 5045 rows containing missing values (geom_point).

cor(x=projectdata_UGDS_Trimmed$ADM_RATE, y = projectdata_UGDS_Trimmed$UGDS, use = "complete.obs",
    method = "pearson")
## [1] -0.154717

cor= -0.154717

model_12 <- lm(UGDS ~ ADM_RATE, data = projectdata_UGDS_Trimmed)
ggplot(projectdata_UGDS_Trimmed, aes(ADM_RATE, UGDS)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 5045 rows containing non-finite values (stat_smooth).
## Warning: Removed 5045 rows containing missing values (geom_point).

print(summary(model_12))
## 
## Call:
## lm(formula = UGDS ~ ADM_RATE, data = projectdata_UGDS_Trimmed)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -1724   -850   -351    608   3483 
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)     1822         94   19.37 < 0.0000000000000002 ***
## ADM_RATE        -827        128   -6.47        0.00000000013 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1100 on 1705 degrees of freedom
##   (5045 observations deleted due to missingness)
## Multiple R-squared:  0.0239, Adjusted R-squared:  0.0234 
## F-statistic: 41.8 on 1 and 1705 DF,  p-value: 0.000000000131

The correlation between UGDS and ADM_RATE is -0.154717, which is weak. The relation is statistically significant (p < 0.001), but the model explains only 2.4% of the variance, so ADM_RATE is not a useful predictor of UGDS.
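
A small p-value and a weak correlation can occur together when the sample is large. The sketch below illustrates this with simulated data; `x` and `y` are hypothetical variables, not columns of `projectdata`, and the sample size is only chosen to be of a similar order to the model above.

```r
set.seed(1)
n <- 1700
x <- runif(n)
y <- -0.6 * x + rnorm(n)   # weak negative dependence on x, mostly noise

# cor.test() reports the Pearson correlation together with its p-value.
result <- cor.test(x, y, method = "pearson")
print(result$estimate)   # strength of the linear relation
print(result$p.value)    # evidence that the correlation is not zero
```

The estimate describes how strong the relation is, while the p-value only says whether it is distinguishable from zero.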

###QUESTION 13

REGRESSION ANALYSIS: y=C150_4, x=SAT_AVG

ggplot(projectdata, aes(x = SAT_AVG, y = C150_4)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 6432 rows containing non-finite values (stat_smooth).
## Warning: Removed 6432 rows containing missing values (geom_point).

cor(x=projectdata$SAT_AVG, y = projectdata$C150_4, use = "complete.obs",
    method = "pearson")
## [1] 0.801478

cor= 0.801478

model_13 <- lm(C150_4 ~ SAT_AVG, data = projectdata)
ggplot(projectdata, aes(SAT_AVG, C150_4)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 6432 rows containing non-finite values (stat_smooth).
## Warning: Removed 6432 rows containing missing values (geom_point).

print(summary(model_13))
## 
## Call:
## lm(formula = C150_4 ~ SAT_AVG, data = projectdata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.5219 -0.0654  0.0072  0.0690  0.4613 
## 
## Coefficients:
##               Estimate Std. Error t value            Pr(>|t|)    
## (Intercept) -0.5741737  0.0237624   -24.2 <0.0000000000000002 ***
## SAT_AVG      0.0010598  0.0000222    47.7 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.105 on 1269 degrees of freedom
##   (6432 observations deleted due to missingness)
## Multiple R-squared:  0.642,  Adjusted R-squared:  0.642 
## F-statistic: 2.28e+03 on 1 and 1269 DF,  p-value: <0.0000000000000002

This regression has a much higher R-squared of 0.642, meaning that 64% of the variance is explained by the model. The residuals are roughly normally distributed: the minimum and maximum have similar magnitude and the median (0.0072) is close to zero. These findings indicate that average SAT scores can be used to predict the completion rate. The correlation is correspondingly high (0.801478).

###QUESTION 14

REGRESSION ANALYSIS: y=C150_4, x=ADM_RATE

ggplot(projectdata, aes(x = ADM_RATE, y = C150_4)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 5897 rows containing non-finite values (stat_smooth).
## Warning: Removed 5897 rows containing missing values (geom_point).

cor(x=projectdata$ADM_RATE, y = projectdata$C150_4, use = "complete.obs",
    method = "pearson")
## [1] -0.320186

cor=-0.320186

model_14 <- lm(C150_4 ~ ADM_RATE, data = projectdata)
ggplot(projectdata, aes(ADM_RATE, C150_4)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 5897 rows containing non-finite values (stat_smooth).
## Warning: Removed 5897 rows containing missing values (geom_point).

print(summary(model_14))
## 
## Call:
## lm(formula = C150_4 ~ ADM_RATE, data = projectdata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.5848 -0.1237  0.0017  0.1391  0.5728 
## 
## Coefficients:
##             Estimate Std. Error t value            Pr(>|t|)    
## (Intercept)   0.7348     0.0151    48.7 <0.0000000000000002 ***
## ADM_RATE     -0.3076     0.0214   -14.4 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.182 on 1804 degrees of freedom
##   (5897 observations deleted due to missingness)
## Multiple R-squared:  0.103,  Adjusted R-squared:  0.102 
## F-statistic:  206 on 1 and 1804 DF,  p-value: <0.0000000000000002

The correlation is -0.320186, therefore rate of completion and admission rate are negatively related. The R-squared is 0.103, meaning that this model can only explain 10% of the variance in the prediction. The residuals have a relatively normal distribution.

Therefore the admission rate is not a very good predictor of the completion rate.

###QUESTION 15

REGRESSION ANALYSIS: y=PCTFLOAN, x=SAT_AVG

ggplot(projectdata, aes(x = SAT_AVG, y = PCTFLOAN)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 6401 rows containing non-finite values (stat_smooth).
## Warning: Removed 6401 rows containing missing values (geom_point).

cor(x=projectdata$SAT_AVG, y = projectdata$PCTFLOAN, use = "complete.obs",
    method = "pearson")
## [1] -0.510337

cor=(-0.510337)

model_15 <- lm(PCTFLOAN ~ SAT_AVG, data = projectdata)
ggplot(projectdata, aes(SAT_AVG, PCTFLOAN)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 6401 rows containing non-finite values (stat_smooth).
## Warning: Removed 6401 rows containing missing values (geom_point).

print(summary(model_15))
## 
## Call:
## lm(formula = PCTFLOAN ~ SAT_AVG, data = projectdata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.7234 -0.0792  0.0172  0.1009  0.3460 
## 
## Coefficients:
##               Estimate Std. Error t value            Pr(>|t|)    
## (Intercept)  1.2759381  0.0323876    39.4 <0.0000000000000002 ***
## SAT_AVG     -0.0006493  0.0000303   -21.4 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.146 on 1300 degrees of freedom
##   (6401 observations deleted due to missingness)
## Multiple R-squared:  0.26,   Adjusted R-squared:  0.26 
## F-statistic:  458 on 1 and 1300 DF,  p-value: <0.0000000000000002

The regression model explains 26% of the variation in PCTFLOAN. The correlation is -0.510337, which indicates a negative relation between the two variables. The residuals have a long tail to the left. SAT_AVG is a better predictor of PCTFLOAN for higher values of SAT_AVG.

###QUESTION 16

REGRESSION ANALYSIS: y=PCTFLOAN, x=ADM_RATE

ggplot(projectdata, aes(x = ADM_RATE, y = PCTFLOAN)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 5508 rows containing non-finite values (stat_smooth).
## Warning: Removed 5508 rows containing missing values (geom_point).

cor(x=projectdata$ADM_RATE, y = projectdata$PCTFLOAN, use = "complete.obs",
    method = "pearson")
## [1] 0.107624

cor=0.107624

model_16 <- lm(PCTFLOAN ~ ADM_RATE, data = projectdata)
ggplot(projectdata, aes(ADM_RATE, PCTFLOAN)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 5508 rows containing non-finite values (stat_smooth).
## Warning: Removed 5508 rows containing missing values (geom_point).

print(summary(model_16))
## 
## Call:
## lm(formula = PCTFLOAN ~ ADM_RATE, data = projectdata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.6243 -0.1127  0.0309  0.1566  0.4431 
## 
## Coefficients:
##             Estimate Std. Error t value             Pr(>|t|)    
## (Intercept)   0.5119     0.0160   31.92 < 0.0000000000000002 ***
## ADM_RATE      0.1125     0.0222    5.07           0.00000043 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.217 on 2193 degrees of freedom
##   (5508 observations deleted due to missingness)
## Multiple R-squared:  0.0116, Adjusted R-squared:  0.0111 
## F-statistic: 25.7 on 1 and 2193 DF,  p-value: 0.000000432

The model can only explain 1% of the variance in the prediction. The correlation is not strong since cor=(0.107624). Therefore ADM_RATE is not a good predictor of PCTFLOAN.

###EXTRA REGRESSION ANALYSIS 1

REGRESSION ANALYSIS: y=UGDS_ASIAN, x=COSTT4_A for private for-profit schools

ggplot(projectdata_private_fp, aes(x = COSTT4_A, y = UGDS_ASIAN)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 2688 rows containing non-finite values (stat_smooth).
## Warning: Removed 2688 rows containing missing values (geom_point).

cor(x=projectdata_private_fp$COSTT4_A, y = projectdata_private_fp$UGDS_ASIAN, use = "complete.obs",
    method = "pearson")
## [1] 0.24179

cor= 0.24179

model_17 <- lm(UGDS_ASIAN ~ COSTT4_A, data = projectdata_private_fp)
ggplot(projectdata_private_fp, aes(COSTT4_A, UGDS_ASIAN)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 2688 rows containing non-finite values (stat_smooth).
## Warning: Removed 2688 rows containing missing values (geom_point).

print(summary(model_17))
## 
## Call:
## lm(formula = UGDS_ASIAN ~ COSTT4_A, data = projectdata_private_fp)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.0593 -0.0198 -0.0111  0.0032  0.6091 
## 
## Coefficients:
##                 Estimate   Std. Error t value           Pr(>|t|)    
## (Intercept) -0.023182923  0.006324043   -3.67            0.00026 ***
## COSTT4_A     0.000001886  0.000000238    7.93 0.0000000000000057 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.0457 on 1013 degrees of freedom
##   (2688 observations deleted due to missingness)
## Multiple R-squared:  0.0585, Adjusted R-squared:  0.0575 
## F-statistic: 62.9 on 1 and 1013 DF,  p-value: 0.00000000000000572

R-squared is low (about 6%) and the correlation is 0.24179. The residual distribution has a long tail to the right. This regression is affected by outliers at high values of cost of attendance. For private for-profit schools, COSTT4_A is not a good predictor of UGDS_ASIAN.

###EXTRA REGRESSION ANALYSIS 2

REGRESSION ANALYSIS: y=INEXPFTE, x=COSTT4_A

ggplot(projectdata, aes(x = COSTT4_A, y = INEXPFTE)) +
  geom_point() +
  stat_smooth()
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
## Warning: Removed 3675 rows containing non-finite values (stat_smooth).
## Warning: Removed 3675 rows containing missing values (geom_point).

cor(x=projectdata$COSTT4_A, y = projectdata$INEXPFTE, use = "complete.obs",
    method = "pearson")
## [1] 0.455454

cor=0.455454

model_18 <- lm(INEXPFTE ~ COSTT4_A, data = projectdata)
ggplot(projectdata, aes(COSTT4_A, INEXPFTE)) +
  geom_point() +
  stat_smooth(method = lm)
## Warning: Removed 3675 rows containing non-finite values (stat_smooth).
## Warning: Removed 3675 rows containing missing values (geom_point).

print(summary(model_18))
## 
## Call:
## lm(formula = INEXPFTE ~ COSTT4_A, data = projectdata)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -10453  -3085   -518   1655  92003 
## 
## Coefficients:
##              Estimate Std. Error t value             Pr(>|t|)    
## (Intercept) 1577.6523   201.1727    7.84   0.0000000000000056 ***
## COSTT4_A       0.2337     0.0072   32.46 < 0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5830 on 4026 degrees of freedom
##   (3675 observations deleted due to missingness)
## Multiple R-squared:  0.207,  Adjusted R-squared:  0.207 
## F-statistic: 1.05e+03 on 1 and 4026 DF,  p-value: <0.0000000000000002

The regression model explains about 21% of the variance (R-squared = 0.207), and the correlation is 0.455454. The bulk of the residuals is only lightly skewed (1Q = -3085, 3Q = 1655), but the maximum residual (92003) is far larger in magnitude than the minimum (-10453), which points to outliers in INEXPFTE. These outliers compress the plot and make the pattern hard to see. Removing them may give a more accurate prediction.

###QUESTION 17

-Checking normality of SAT_AVG:

describe(projectdata$SAT_AVG)
qqnorm(projectdata$SAT_AVG,main="QQ plot of SAT_AVG")
qqline(projectdata$SAT_AVG)

The mean and the median are relatively close. The distribution has a moderate skewness and a low kurtosis. However, the qqplot indicates that the scores deviate from the normal distribution in higher values. Since the mean and median are close with a moderate skew, we can say that the distribution is relatively normal.

-Checking normality of ADM_RATE:

describe(projectdata$ADM_RATE)
qqnorm(projectdata$ADM_RATE,main="QQ plot of ADM_RATE")
qqline(projectdata$ADM_RATE)

The mean and median of the distribution are close. The distribution is moderately skewed (-0.58). The Q-Q plot shows a strong deviation from the normal line at the lowest and highest values, so ADM_RATE is not normally distributed in the tails.

-Checking normality of UGDS:

describe(projectdata$UGDS)
qqnorm(projectdata$UGDS,main="QQ plot of UGDS")
qqline(projectdata$UGDS)

UGDS shows a large positive skew (6.84), since the mean (2332.16) is much higher than the median (406). There are outliers among the maximum values and a strong deviation from the normal line. The distribution of UGDS is not normal.

###QUESTION 18

The previous plots indicate that ADM_RATE and UGDS deviate from a normal distribution, while SAT_AVG can be considered approximately normal, although not perfectly so.

#-First, normalizing ADM_RATE:

Looking at the boxplot to see the outliers:

boxplot(projectdata$ADM_RATE)

There are outlier scores below 0.2.

-Storing outliers in a vector:

The outlier values are as follows:

print(outliers_ADM_RATE)
##  [1] 0.0883 0.1077 0.1302 0.1219 0.0630 0.0876 0.1311 0.0000 0.1384 0.0596
## [11] 0.0788 0.0831 0.1150 0.0000 0.0744 0.0695 0.0843 0.1141 0.0505 0.1037
## [21] 0.0874 0.1309 0.1000 0.0509 0.1346 0.0000 0.1200 0.1111
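
The code that built `outliers_ADM_RATE` is not shown in this report. One common way to collect the points a boxplot draws as outliers is `boxplot.stats()`, which flags values more than 1.5 times the hinge spread beyond the hinges. The sketch below assumes that rule and uses a hypothetical toy vector (`adm_rate`) rather than the project data:

```r
# Hypothetical toy vector standing in for projectdata$ADM_RATE.
adm_rate <- c(0.05, 0.55, 0.60, 0.65, 0.70, 0.72, 0.75, 0.80, 0.85, 0.90, NA)

# boxplot.stats() ignores NAs and returns the flagged points in $out.
outliers <- boxplot.stats(adm_rate)$out
print(outliers)

# Logical indexing drops the flagged values and keeps everything else,
# including missing values.
cleaned <- adm_rate[!(adm_rate %in% outliers)]
print(cleaned)
```

With the real data the same call would be `boxplot.stats(projectdata$ADM_RATE)$out`; whether the original analysis used this rule is an assumption.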

-Finding the location of outliers:

projectdata[which(projectdata$ADM_RATE %in% outliers_ADM_RATE),]

-Removing the outliers:

# Logical indexing stays correct even when no outliers are found, unlike -which().
projectdata_ADM_RATE_clean <- projectdata[!(projectdata$ADM_RATE %in% outliers_ADM_RATE),]

-Checking with boxplot:

boxplot(projectdata_ADM_RATE_clean$ADM_RATE)

library(psych)
describe(projectdata$ADM_RATE)
describe(projectdata_ADM_RATE_clean$ADM_RATE)

The mean and median are much closer now with moderate values of skewness and kurtosis.

-Checking with a Q-Q plot:

qqnorm(projectdata_ADM_RATE_clean$ADM_RATE,main="QQ plot of ADM_RATE_outliers_removed")
qqline(projectdata_ADM_RATE_clean$ADM_RATE)

The scores fit the line better. (I am not sure what to do with the scores at the top-right of the plot.) The new mean is 0.7, the median is 0.72, the standard deviation is 0.2 and the variance is 0.04.

#-Second, normalizing the UGDS:

Looking at the outliers with a box plot:

boxplot(projectdata$UGDS)

Storing outliers in a vector:

Finding the location of outliers:

projectdata[which(projectdata$UGDS %in% outliers_UGDS),]

Removing the outliers:

# Logical indexing stays correct even when no outliers are found, unlike -which().
projectdata_UGDS_clean <- projectdata[!(projectdata$UGDS %in% outliers_UGDS),]

Checking the new distribution with a boxplot:

boxplot(projectdata_UGDS_clean$UGDS)

The outcome still does not look very good. Checking with describeBy:

library(psych)
describe(projectdata_UGDS_clean$UGDS)

The distribution is still strongly skewed.

Transforming UGDS with square-root to decrease the skew:

projectdata_UGDS_clean$sqrt_UGDS <- sqrt(projectdata_UGDS_clean$UGDS)
describe(projectdata_UGDS_clean$sqrt_UGDS)

The mean and median are closer now with less skewness.
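
A square-root transform compresses large values more than small ones, which is why it pulls in a long right tail. The sketch below shows the effect on a hypothetical right-skewed sample (`enrollment` is simulated, not taken from `projectdata`), using a simple moment-based skewness that may differ slightly from the value `describe()` reports:

```r
set.seed(42)
# Hypothetical right-skewed sample standing in for enrollment counts.
enrollment <- rlnorm(5000, meanlog = 6, sdlog = 1.2)

# Moment-based skewness: third central moment over the cubed standard deviation.
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^1.5
}

skew_raw <- skewness(enrollment)
skew_sqrt <- skewness(sqrt(enrollment))
print(c(raw = skew_raw, sqrt = skew_sqrt))
```

The transformed values are still positively skewed in this example, only less so, which matches what the Q-Q plot of sqrt_UGDS suggests.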

Checking with qqplot:

qqnorm(projectdata_UGDS_clean$sqrt_UGDS,main="QQ plot of sqrt_UGDS_outliers_removed")
qqline(projectdata_UGDS_clean$sqrt_UGDS)

The scores still do not follow the line but it is better than the previous version. The new mean is (22.33), median is (16.97), standard deviation is (15.92) and variance is (253.44).

###QUESTION 19

#The probability that the average SAT score for a school is greater than 1400:

Finding the mean of SAT_AVG:

mean(projectdata$SAT_AVG, na.rm=TRUE)
## [1] 1059.07

Finding the standard deviation of SAT_AVG:

sd(projectdata$SAT_AVG, na.rm=TRUE)
## [1] 133.357

-Probability of SAT_AVG > 1400:

pnorm(1400, mean=1059.072, sd=133.357, lower.tail=FALSE)
## [1] 0.00528646

There is a 0.53% probability of SAT_AVG > 1400.

-Probability of SAT_AVG < 800:

pnorm(800, mean=1059.072, sd=133.357, lower.tail=TRUE)
## [1] 0.0260265

There is a 2.6% probability of SAT_AVG < 800.
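
These probabilities rest on the assumption that SAT_AVG is normally distributed. The same numbers can be reached by standardizing each cutoff to a z-score first, which makes the distance from the mean explicit. A short sketch using the mean and standard deviation reported above (the variable names are illustrative):

```r
m <- 1059.072   # mean of SAT_AVG reported above
s <- 133.357    # standard deviation of SAT_AVG reported above

# Standardize the cutoffs, then use the standard normal distribution.
z_high <- (1400 - m) / s
z_low <- (800 - m) / s
p_high <- pnorm(z_high, lower.tail = FALSE)   # P(SAT_AVG > 1400)
p_low <- pnorm(z_low)                         # P(SAT_AVG < 800)

print(c(z_high = z_high, p_high = p_high))
print(c(z_low = z_low, p_low = p_low))
```

If the data frame is loaded, `mean(projectdata$SAT_AVG > 1400, na.rm = TRUE)` gives the observed share of schools above 1400 for comparison with the normal-model value.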

###QUESTION 20

I need to build a sampling distribution of means but could not figure out how to do it.
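
One possible approach is to draw many random samples, compute the mean of each, and look at the distribution of those means. The sketch below is a minimal version under stated assumptions: the vector `sat_avg` is simulated from the mean and standard deviation found in Question 19 instead of being read from `projectdata`, and the number of samples (1000) and the sample size (30) are arbitrary choices.

```r
set.seed(123)
# Hypothetical stand-in for projectdata$SAT_AVG.
sat_avg <- rnorm(1300, mean = 1059.072, sd = 133.357)
sat_avg <- sat_avg[!is.na(sat_avg)]   # drop missing values before sampling

n_samples <- 1000   # number of repeated samples (assumed)
sample_size <- 30   # observations per sample (assumed)

# Draw repeated samples with replacement and keep the mean of each one.
sample_means <- replicate(n_samples, mean(sample(sat_avg, sample_size, replace = TRUE)))

print(mean(sample_means))   # close to mean(sat_avg)
print(sd(sample_means))     # close to sd(sat_avg) / sqrt(sample_size)
hist(sample_means, main = "Sampling distribution of the mean of SAT_AVG")
```

With the real data, the simulated vector would be replaced by `projectdata$SAT_AVG`; the rest of the sketch stays the same.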