#1) Load your preferred dataset into RStudio
library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(car)
## Loading required package: carData
##
## Attaching package: 'car'
##
## The following object is masked from 'package:dplyr':
##
## recode
##
## The following object is masked from 'package:purrr':
##
## some
district<-read_excel("district.xls")
library(lmtest)
## Loading required package: zoo
##
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
##
## as.Date, as.Date.numeric
library(MASS)
##
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
##
## select
cars_data<-mtcars
#1) Load your preferred dataset into RStudio
#2) Create a linear model “lm()” from the variables, with a continuous dependent variable as the outcome
#3) Check the following assumptions:
a) Linearity (plot and raintest)
b) Independence of errors (Durbin-Watson)
c) Homoscedasticity (plot, bptest)
d) Normality of residuals (QQ plot, Shapiro test)
e) No multicollinearity (VIF, cor)
#4) Does your model meet those assumptions? You don’t have to be perfectly right, just make a good case.
#5) If your model violates an assumption, which one?
#6) What would you do to mitigate this assumption? Show your work.
str(district)
## tibble [1,207 × 137] (S3: tbl_df/tbl/data.frame)
## $ DISTNAME : chr [1:1207] "CAYUGA ISD" "ELKHART ISD" "FRANKSTON ISD" "NECHES ISD" ...
## $ DISTRICT : chr [1:1207] "001902" "001903" "001904" "001906" ...
## $ DZCNTYNM : chr [1:1207] "001 ANDERSON" "001 ANDERSON" "001 ANDERSON" "001 ANDERSON" ...
## $ REGION : chr [1:1207] "07" "07" "07" "07" ...
## $ DZRATING : chr [1:1207] "A" "A" "A" "A" ...
## $ DZCAMPUS : num [1:1207] 3 4 3 2 6 4 2 6 4 5 ...
## $ DPETALLC : num [1:1207] 574 1150 808 342 3360 ...
## $ DPETBLAP : num [1:1207] 4.4 4 8.5 8.2 25.1 19.7 0.3 0.8 15.7 7.2 ...
## $ DPETHISP : num [1:1207] 11.5 11.8 11.3 13.5 42.9 26.2 8.6 68.7 31.2 27.9 ...
## $ DPETWHIP : num [1:1207] 79.1 80.3 75.2 75.1 27.3 48 87 28.2 48.5 60.6 ...
## $ DPETINDP : num [1:1207] 0 0.3 0.4 0.3 0.2 0.7 0 0.3 0.1 0.3 ...
## $ DPETASIP : num [1:1207] 0.5 0.2 1 0.3 0.7 0.5 0.6 0.3 1 1 ...
## $ DPETPCIP : num [1:1207] 0 0 0 0 0.1 0.1 0 0 0.1 0.1 ...
## $ DPETTWOP : num [1:1207] 4.5 3.4 3.6 2.6 3.7 4.9 3.6 1.7 3.4 3 ...
## $ DPETECOP : num [1:1207] 40.8 45.4 54.2 54.1 81.6 74 46.8 49.6 57.8 50.1 ...
## $ DPETLEPP : num [1:1207] 1 2.8 4.1 2 17.7 7.1 0.6 14.2 5.1 6.9 ...
## $ DPETSPEP : num [1:1207] 14.6 12.1 13.1 10.5 13.5 14.5 14.7 10.4 11.6 11.9 ...
## $ DPETBILP : num [1:1207] 1 2.7 4.1 2 16.1 6.8 0.6 15.2 5 6 ...
## $ DPETVOCP : num [1:1207] 30.5 31.8 43.9 29.5 30.6 38.7 37.7 24.8 18.9 34.4 ...
## $ DPETGIFP : num [1:1207] 6.1 4.6 7.3 5.6 2.3 3.2 3.3 6.8 9.2 6 ...
## $ DA0AT21R : num [1:1207] 96.7 96 95.4 95.8 93.7 94.5 96.7 92.8 97.3 95.2 ...
## $ DA0912DR21R : num [1:1207] 0 0.3 0.4 0 0 0 0 0.4 0.4 0.7 ...
## $ DAGC4X21R : num [1:1207] 100 100 95.2 95.8 99 97.8 100 96.8 100 94.1 ...
## $ DAGC5X20R : num [1:1207] 100 98.9 100 97 99.6 97 100 97.2 100 95.6 ...
## $ DAGC6X19R : num [1:1207] 96 98.8 33.3 100 98.6 97.4 100 96.7 100 95.9 ...
## $ DA0GR21N : num [1:1207] 36 91 41 23 201 95 32 293 52 196 ...
## $ DA0GS21N : num [1:1207] 34 79 40 17 198 77 27 238 52 154 ...
## $ DDA00A001S22R: num [1:1207] 84 85 83 90 74 69 86 76 82 86 ...
## $ DDA00A001222R: num [1:1207] 62 59 57 64 46 40 55 47 56 60 ...
## $ DDA00A001322R: num [1:1207] 33 30 25 27 20 16 25 21 30 31 ...
## $ DDA00AR01S22R: num [1:1207] 81 85 84 87 72 70 86 75 82 84 ...
## $ DDA00AR01222R: num [1:1207] 67 64 63 67 48 45 66 50 60 62 ...
## $ DDA00AR01322R: num [1:1207] 39 34 24 30 20 19 31 22 31 31 ...
## $ DDA00AM01S22R: num [1:1207] 88 84 85 94 75 66 81 76 81 88 ...
## $ DDA00AM01222R: num [1:1207] 65 49 57 69 44 34 42 44 53 62 ...
## $ DDA00AM01322R: num [1:1207] 34 23 26 27 20 14 19 21 29 33 ...
## $ DDA00AC01S22R: num [1:1207] 85 86 81 90 78 73 96 75 83 84 ...
## $ DDA00AC01222R: num [1:1207] 54 63 49 54 48 41 45 46 57 52 ...
## $ DDA00AC01322R: num [1:1207] 22 29 21 23 22 15 16 18 27 21 ...
## $ DDA00AS01S22R: num [1:1207] 78 90 74 83 72 68 92 81 82 87 ...
## $ DDA00AS01222R: num [1:1207] 47 63 48 51 42 38 73 50 51 60 ...
## $ DDA00AS01322R: num [1:1207] 21 42 26 26 20 15 38 27 32 36 ...
## $ DDB00A001S22R: num [1:1207] 60 46 74 88 64 56 -1 71 68 71 ...
## $ DDB00A001222R: num [1:1207] 17 22 38 48 33 26 -1 41 38 37 ...
## $ DDB00A001322R: num [1:1207] 3 8 6 19 11 11 -1 13 14 14 ...
## $ DDH00A001S22R: num [1:1207] 74 85 75 91 73 69 87 72 81 81 ...
## $ DDH00A001222R: num [1:1207] 53 56 46 69 44 36 57 42 50 53 ...
## $ DDH00A001322R: num [1:1207] 24 25 19 26 19 12 20 17 24 24 ...
## $ DDW00A001S22R: num [1:1207] 87 88 85 89 83 75 86 84 88 89 ...
## $ DDW00A001222R: num [1:1207] 66 61 62 66 60 48 55 58 67 66 ...
## $ DDW00A001322R: num [1:1207] 35 32 28 29 29 21 26 29 40 35 ...
## $ DDI00A001S22R: num [1:1207] NA 100 80 -1 75 NA NA 83 -1 62 ...
## $ DDI00A001222R: num [1:1207] NA 100 20 -1 50 NA NA 28 -1 8 ...
## $ DDI00A001322R: num [1:1207] NA 100 20 -1 17 NA NA 6 -1 0 ...
## $ DD300A001S22R: num [1:1207] 33 -1 84 -1 85 100 NA 100 93 97 ...
## $ DD300A001222R: num [1:1207] 33 -1 53 -1 77 100 NA 87 73 82 ...
## $ DD300A001322R: num [1:1207] 17 -1 16 -1 44 88 NA 67 53 56 ...
## $ DD400A001S22R: num [1:1207] NA NA NA NA -1 -1 NA NA -1 -1 ...
## $ DD400A001222R: num [1:1207] NA NA NA NA -1 -1 NA NA -1 -1 ...
## $ DD400A001322R: num [1:1207] NA NA NA NA -1 -1 NA NA -1 -1 ...
## $ DD200A001S22R: num [1:1207] 83 77 75 -1 74 62 88 85 74 83 ...
## $ DD200A001222R: num [1:1207] 54 46 58 -1 44 38 50 58 48 50 ...
## $ DD200A001322R: num [1:1207] 34 23 28 -1 18 13 6 31 13 29 ...
## $ DDE00A001S22R: num [1:1207] 76 77 77 86 70 65 81 67 77 78 ...
## $ DDE00A001222R: num [1:1207] 50 42 49 53 40 34 45 36 48 46 ...
## $ DDE00A001322R: num [1:1207] 23 19 17 17 16 14 17 14 23 19 ...
## $ DA0CT21R : num [1:1207] 58.3 51.6 92.7 87 43.3 40 12.5 42 9.6 38.3 ...
## $ DA0CC21R : num [1:1207] 19 27.7 36.8 15 49.4 28.9 -1 35.8 60 60 ...
## $ DA0CSA21R : num [1:1207] 980 979 980 1007 1048 ...
## $ DA0CAA21R : num [1:1207] NA -1 -1 18.8 21 -1 -1 22.3 NA 23.1 ...
## $ DPSATOFC : num [1:1207] 99.9 186.6 146.7 60.1 553.4 ...
## $ DPSTTOFC : num [1:1207] 46.7 104.9 74.5 30.2 260.3 ...
## $ DPSCTOFP : num [1:1207] 1.5 1.1 1.4 3.1 2.1 1.1 4.1 1.5 4.5 0.9 ...
## $ DPSSTOFP : num [1:1207] 5 2.1 3.5 5 3.4 4.6 3.4 2.6 3.1 3.9 ...
## $ DPSUTOFP : num [1:1207] 5.4 4.9 2 1.7 8.3 4.4 3 5.8 10 6 ...
## $ DPSTTOFP : num [1:1207] 46.8 56.2 50.8 50.3 47 45.5 56.7 50.8 50 49.7 ...
## $ DPSETOFP : num [1:1207] 14.8 16.2 15 13.7 19.7 19.2 9.8 15.4 11.1 8.2 ...
## $ DPSXTOFP : num [1:1207] 26.5 19.5 27.4 26.2 19.5 25.2 23 23.9 21.4 31.3 ...
## $ DPSCTOSA : num [1:1207] 93333 100313 98293 85537 99324 ...
## $ DPSSTOSA : num [1:1207] 73300 79305 71215 81593 80415 ...
## $ DPSUTOSA : num [1:1207] 59550 60616 58022 77642 63829 ...
## $ DPSTTOSA : num [1:1207] 55570 47916 50382 55346 48825 ...
## $ DPSAMIFP : num [1:1207] 15.6 13.4 10.9 16.3 32.1 29.9 1.9 41.3 22.2 18.8 ...
## $ DPSAKIDR : num [1:1207] 5.7 6.2 5.5 5.7 6.1 5 5.2 7.3 7.4 6.5 ...
## $ DPSTKIDR : num [1:1207] 12.3 11 10.8 11.3 12.9 11 9.3 14.4 14.8 13.2 ...
## $ DPST05FP : num [1:1207] 10.4 23.8 32.7 9.7 33.8 44.8 17.9 21.5 35 21.9 ...
## $ DPSTEXPA : num [1:1207] 16.7 13.5 12.8 14.8 12.7 10.3 15.4 13.8 10.2 13.8 ...
## $ DPSTADFP : num [1:1207] 14.8 19 30.7 9.6 15.4 17.4 16.9 24.3 18.5 22.4 ...
## $ DPSTURNR : num [1:1207] 19.1 13.9 21.6 18.3 17.9 30.6 14.6 11.5 17 9.5 ...
## $ DPSTBLFP : num [1:1207] 8.3 2.9 4 6.5 9.6 11.6 0 1.4 4.4 0.5 ...
## $ DPSTHIFP : num [1:1207] 0 6.7 1.3 0 13.8 6.6 0 25.7 8.9 5.6 ...
## $ DPSTWHFP : num [1:1207] 91.7 90.5 93.3 93.5 74.6 80.9 100 69 86.7 93.9 ...
## $ DPSTINFP : num [1:1207] 0 0 0 0 0 0.8 0 0.3 0 0 ...
## $ DPSTASFP : num [1:1207] 0 0 0 0 0 0 0 0.7 0 0 ...
## $ DPSTPIFP : num [1:1207] 0 0 0 0 0 0 0 0 0 0 ...
## $ DPSTTWFP : num [1:1207] 0 0 1.3 0 1.9 0 0 2.8 0 0 ...
## $ DPSTREFP : num [1:1207] 81.6 71.5 87.6 70 71.4 71.4 61 41.7 82.7 66.4 ...
## $ DPSTSPFP : num [1:1207] 9.9 8.4 7.5 5.5 10.2 6.4 5.8 14.4 6.8 9.6 ...
## $ DPSTCOFP : num [1:1207] 0 4.9 2.7 12 5 6.1 19.2 6.5 7.4 9.2 ...
## [list output truncated]
summary(district)
## DISTNAME DISTRICT DZCNTYNM REGION
## Length:1207 Length:1207 Length:1207 Length:1207
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
##
## DZRATING DZCAMPUS DPETALLC DPETBLAP
## Length:1207 Min. : 1.000 Min. : 4.0 Min. : 0.000
## Class :character 1st Qu.: 2.000 1st Qu.: 337.5 1st Qu.: 0.700
## Mode :character Median : 3.000 Median : 884.0 Median : 2.900
## Mean : 7.428 Mean : 4476.3 Mean : 8.765
## 3rd Qu.: 5.000 3rd Qu.: 2746.0 3rd Qu.:10.750
## Max. :273.000 Max. :193727.0 Max. :98.100
##
## DPETHISP DPETWHIP DPETINDP DPETASIP
## Min. : 0.00 Min. : 0.00 Min. : 0.0000 Min. : 0.000
## 1st Qu.: 21.00 1st Qu.:18.55 1st Qu.: 0.0000 1st Qu.: 0.000
## Median : 37.90 Median :44.40 Median : 0.2000 Median : 0.400
## Mean : 43.29 Mean :43.15 Mean : 0.3283 Mean : 1.614
## 3rd Qu.: 61.90 3rd Qu.:67.75 3rd Qu.: 0.4000 3rd Qu.: 1.000
## Max. :100.00 Max. :97.10 Max. :19.8000 Max. :54.300
##
## DPETPCIP DPETTWOP DPETECOP DPETLEPP
## Min. : 0.0000 Min. : 0.000 Min. : 0.00 Min. : 0.00
## 1st Qu.: 0.0000 1st Qu.: 1.200 1st Qu.: 47.95 1st Qu.: 2.90
## Median : 0.0000 Median : 2.400 Median : 61.90 Median : 7.50
## Mean : 0.1005 Mean : 2.758 Mean : 60.75 Mean : 12.69
## 3rd Qu.: 0.1000 3rd Qu.: 3.900 3rd Qu.: 77.15 3rd Qu.: 17.00
## Max. :14.5000 Max. :15.000 Max. :100.00 Max. :100.00
##
## DPETSPEP DPETBILP DPETVOCP DPETGIFP
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.000
## 1st Qu.: 9.90 1st Qu.: 2.90 1st Qu.:23.00 1st Qu.: 3.100
## Median :12.10 Median : 7.30 Median :27.80 Median : 5.400
## Mean :12.27 Mean : 12.58 Mean :26.47 Mean : 5.574
## 3rd Qu.:14.20 3rd Qu.: 16.80 3rd Qu.:32.90 3rd Qu.: 7.500
## Max. :51.70 Max. :100.00 Max. :82.80 Max. :100.000
##
## DA0AT21R DA0912DR21R DAGC4X21R DAGC5X20R
## Min. : -1.00 Min. :-1.000 Min. : -1.00 Min. : -1.00
## 1st Qu.: 94.05 1st Qu.: 0.000 1st Qu.: 93.20 1st Qu.: 95.50
## Median : 95.40 Median : 0.400 Median : 96.90 Median : 98.30
## Mean : 94.76 Mean : 1.243 Mean : 93.91 Mean : 95.76
## 3rd Qu.: 96.40 3rd Qu.: 1.400 3rd Qu.:100.00 3rd Qu.:100.00
## Max. :100.00 Max. :50.500 Max. :100.00 Max. :100.00
## NA's :4 NA's :112 NA's :133 NA's :141
## DAGC6X19R DA0GR21N DA0GS21N DDA00A001S22R
## Min. : -1.00 Min. : 1.0 Min. : 0.0 Min. : 4.00
## 1st Qu.: 95.20 1st Qu.: 29.0 1st Qu.: 26.0 1st Qu.: 68.00
## Median : 98.20 Median : 69.0 Median : 61.0 Median : 76.00
## Mean : 95.72 Mean : 331.6 Mean : 278.9 Mean : 74.77
## 3rd Qu.:100.00 3rd Qu.: 208.0 3rd Qu.: 167.0 3rd Qu.: 83.00
## Max. :100.00 Max. :11588.0 Max. :9607.0 Max. :100.00
## NA's :149 NA's :126 NA's :126 NA's :5
## DDA00A001222R DDA00A001322R DDA00AR01S22R DDA00AR01222R
## Min. : 0.00 Min. : 0.00 Min. : -1.00 Min. : -1.00
## 1st Qu.:37.00 1st Qu.:15.00 1st Qu.: 70.00 1st Qu.: 43.00
## Median :46.00 Median :20.00 Median : 77.00 Median : 52.00
## Mean :46.48 Mean :21.05 Mean : 76.22 Mean : 52.12
## 3rd Qu.:55.00 3rd Qu.:26.00 3rd Qu.: 84.00 3rd Qu.: 61.00
## Max. :88.00 Max. :64.00 Max. :100.00 Max. :100.00
## NA's :5 NA's :5 NA's :5 NA's :5
## DDA00AR01322R DDA00AM01S22R DDA00AM01222R DDA00AM01322R
## Min. :-1.00 Min. : -1.00 Min. :-1.00 Min. :-1.00
## 1st Qu.:17.00 1st Qu.: 66.00 1st Qu.:30.00 1st Qu.:11.00
## Median :22.00 Median : 74.00 Median :40.00 Median :17.00
## Mean :23.64 Mean : 72.78 Mean :40.51 Mean :18.21
## 3rd Qu.:29.00 3rd Qu.: 82.00 3rd Qu.:50.00 3rd Qu.:23.00
## Max. :66.00 Max. :100.00 Max. :91.00 Max. :65.00
## NA's :5 NA's :5 NA's :5 NA's :5
## DDA00AC01S22R DDA00AC01222R DDA00AC01322R DDA00AS01S22R
## Min. : -1.00 Min. : -1.00 Min. :-1.00 Min. : -1.00
## 1st Qu.: 68.00 1st Qu.: 34.00 1st Qu.:10.00 1st Qu.: 66.00
## Median : 77.00 Median : 44.00 Median :16.00 Median : 75.00
## Mean : 75.04 Mean : 44.29 Mean :17.39 Mean : 73.09
## 3rd Qu.: 85.00 3rd Qu.: 55.00 3rd Qu.:23.00 3rd Qu.: 83.00
## Max. :100.00 Max. :100.00 Max. :56.00 Max. :100.00
## NA's :11 NA's :11 NA's :11 NA's :43
## DDA00AS01222R DDA00AS01322R DDB00A001S22R DDB00A001222R
## Min. : -1.0 Min. :-1.00 Min. : -1.00 Min. : -1.0
## 1st Qu.: 37.0 1st Qu.:17.00 1st Qu.: 50.00 1st Qu.: 21.0
## Median : 46.0 Median :24.00 Median : 63.00 Median : 31.0
## Mean : 45.7 Mean :25.29 Mean : 57.65 Mean : 31.5
## 3rd Qu.: 55.0 3rd Qu.:32.00 3rd Qu.: 74.50 3rd Qu.: 43.0
## Max. :100.0 Max. :77.00 Max. :100.00 Max. :100.0
## NA's :43 NA's :43 NA's :192 NA's :192
## DDB00A001322R DDH00A001S22R DDH00A001222R DDH00A001322R
## Min. :-1.00 Min. : -1.0 Min. : -1.00 Min. :-1.00
## 1st Qu.: 5.00 1st Qu.: 66.0 1st Qu.: 34.00 1st Qu.:12.00
## Median :11.00 Median : 73.0 Median : 41.00 Median :16.00
## Mean :12.45 Mean : 71.7 Mean : 41.63 Mean :16.86
## 3rd Qu.:17.00 3rd Qu.: 80.0 3rd Qu.: 49.00 3rd Qu.:21.00
## Max. :90.00 Max. :100.0 Max. :100.00 Max. :59.00
## NA's :192 NA's :6 NA's :6 NA's :6
## DDW00A001S22R DDW00A001222R DDW00A001322R DDI00A001S22R
## Min. : -1.00 Min. : -1.00 Min. :-1.00 Min. : -1.0
## 1st Qu.: 75.00 1st Qu.: 45.00 1st Qu.:19.00 1st Qu.: -1.0
## Median : 82.00 Median : 54.00 Median :26.00 Median : 66.0
## Mean : 78.69 Mean : 53.24 Mean :26.15 Mean : 50.8
## 3rd Qu.: 88.00 3rd Qu.: 63.00 3rd Qu.:33.00 3rd Qu.: 83.0
## Max. :100.00 Max. :100.00 Max. :79.00 Max. :100.0
## NA's :26 NA's :26 NA's :26 NA's :472
## DDI00A001222R DDI00A001322R DD300A001S22R DD300A001222R
## Min. : -1.00 Min. : -1.00 Min. : -1.0 Min. : -1.00
## 1st Qu.: -1.00 1st Qu.: -1.00 1st Qu.: 50.0 1st Qu.: 17.00
## Median : 35.00 Median : 11.00 Median : 89.0 Median : 67.00
## Mean : 32.08 Mean : 14.35 Mean : 67.8 Mean : 53.76
## 3rd Qu.: 54.00 3rd Qu.: 23.00 3rd Qu.: 96.0 3rd Qu.: 80.00
## Max. :100.00 Max. :100.00 Max. :100.0 Max. :100.00
## NA's :472 NA's :472 NA's :423 NA's :423
## DD300A001322R DD400A001S22R DD400A001222R DD400A001322R
## Min. : -1.00 Min. : -1.00 Min. : -1.00 Min. :-1.00
## 1st Qu.: 0.00 1st Qu.: -1.00 1st Qu.: -1.00 1st Qu.:-1.00
## Median : 37.00 Median : 60.00 Median : 22.50 Median : 0.00
## Mean : 33.08 Mean : 44.08 Mean : 29.23 Mean :14.05
## 3rd Qu.: 52.00 3rd Qu.: 83.00 3rd Qu.: 56.75 3rd Qu.:25.75
## Max. :100.00 Max. :100.00 Max. :100.00 Max. :83.00
## NA's :423 NA's :797 NA's :797 NA's :797
## DD200A001S22R DD200A001222R DD200A001322R DDE00A001S22R
## Min. : -1.00 Min. : -1.00 Min. : -1.00 Min. : -1.00
## 1st Qu.: 64.00 1st Qu.: 33.00 1st Qu.: 10.00 1st Qu.: 63.00
## Median : 76.00 Median : 47.00 Median : 20.00 Median : 70.00
## Mean : 68.18 Mean : 43.98 Mean : 21.06 Mean : 69.68
## 3rd Qu.: 86.00 3rd Qu.: 60.00 3rd Qu.: 30.00 3rd Qu.: 77.00
## Max. :100.00 Max. :100.00 Max. :100.00 Max. :100.00
## NA's :133 NA's :133 NA's :133 NA's :10
## DDE00A001222R DDE00A001322R DA0CT21R DA0CC21R
## Min. : -1.00 Min. :-1.0 Min. : -2.00 Min. :-1.00
## 1st Qu.: 32.00 1st Qu.:12.0 1st Qu.: 40.42 1st Qu.:12.90
## Median : 39.00 Median :15.0 Median : 63.05 Median :23.55
## Mean : 39.23 Mean :15.9 Mean : 60.76 Mean :26.10
## 3rd Qu.: 46.00 3rd Qu.:19.0 3rd Qu.: 85.38 3rd Qu.:37.08
## Max. :100.00 Max. :80.0 Max. :100.00 Max. :97.70
## NA's :10 NA's :10 NA's :125 NA's :147
## DA0CSA21R DA0CAA21R DPSATOFC DPSTTOFC
## Min. : -1.0 Min. :-1.00 Min. : 1.0 Min. : 0.00
## 1st Qu.: 887.0 1st Qu.:16.30 1st Qu.: 58.9 1st Qu.: 30.57
## Median : 973.0 Median :19.00 Median : 144.1 Median : 72.00
## Mean : 823.9 Mean :16.13 Mean : 622.5 Mean : 307.06
## 3rd Qu.:1039.0 3rd Qu.:21.20 3rd Qu.: 405.2 3rd Qu.: 196.50
## Max. :1344.0 Max. :31.40 Max. :23716.2 Max. :10619.50
## NA's :262 NA's :236 NA's :3 NA's :3
## DPSCTOFP DPSSTOFP DPSUTOFP DPSTTOFP
## Min. : 0.000 Min. : 0.00 Min. : 0.000 Min. : 0.00
## 1st Qu.: 1.200 1st Qu.: 2.60 1st Qu.: 4.100 1st Qu.:46.60
## Median : 1.800 Median : 3.10 Median : 6.600 Median :50.80
## Mean : 2.178 Mean : 3.58 Mean : 7.169 Mean :50.98
## 3rd Qu.: 2.600 3rd Qu.: 3.90 3rd Qu.: 9.800 3rd Qu.:54.80
## Max. :14.900 Max. :100.00 Max. :49.700 Max. :88.30
## NA's :3 NA's :3 NA's :3 NA's :3
## DPSETOFP DPSXTOFP DPSCTOSA DPSSTOSA
## Min. : 0.00 Min. : 0.00 Min. : -2 Min. : -2
## 1st Qu.: 9.70 1st Qu.:19.38 1st Qu.: 95459 1st Qu.: 73469
## Median :12.60 Median :23.65 Median :106674 Median : 78723
## Mean :12.95 Mean :23.14 Mean :108039 Mean : 79435
## 3rd Qu.:16.20 3rd Qu.:27.50 3rd Qu.:119540 3rd Qu.: 84945
## Max. :48.80 Max. :55.40 Max. :270000 Max. :192500
## NA's :3 NA's :3 NA's :10 NA's :11
## DPSUTOSA DPSTTOSA DPSAMIFP DPSAKIDR
## Min. : -2 Min. : 36081 Min. : 0.00 Min. : 0.100
## 1st Qu.: 57969 1st Qu.: 50439 1st Qu.: 13.30 1st Qu.: 5.400
## Median : 63015 Median : 53382 Median : 26.60 Median : 6.300
## Mean : 62424 Mean : 53971 Mean : 35.24 Mean : 6.734
## 3rd Qu.: 67941 3rd Qu.: 56919 3rd Qu.: 50.62 3rd Qu.: 7.300
## Max. :228972 Max. :110560 Max. :100.00 Max. :349.100
## NA's :54 NA's :4 NA's :3 NA's :3
## DPSTKIDR DPST05FP DPSTEXPA DPSTADFP
## Min. :-2.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.:10.80 1st Qu.: 24.35 1st Qu.:10.07 1st Qu.:14.88
## Median :12.70 Median : 32.40 Median :12.00 Median :20.90
## Mean :12.56 Mean : 34.88 Mean :11.75 Mean :20.86
## 3rd Qu.:14.40 3rd Qu.: 41.65 3rd Qu.:13.90 3rd Qu.:26.12
## Max. :37.30 Max. :100.00 Max. :22.90 Max. :78.70
## NA's :3 NA's :3 NA's :3 NA's :3
## DPSTURNR DPSTBLFP DPSTHIFP DPSTWHFP
## Min. : 0.00 Min. : 0.00 Min. : 0.00 Min. : 0.00
## 1st Qu.: 14.80 1st Qu.: 0.00 1st Qu.: 4.20 1st Qu.: 58.67
## Median : 19.50 Median : 1.60 Median : 10.10 Median : 82.40
## Mean : 21.51 Mean : 6.99 Mean : 19.05 Mean : 71.57
## 3rd Qu.: 25.90 3rd Qu.: 6.20 3rd Qu.: 22.50 3rd Qu.: 92.60
## Max. :100.00 Max. :100.00 Max. :100.00 Max. :100.00
## NA's :7 NA's :3 NA's :3 NA's :3
## DPSTINFP DPSTASFP DPSTPIFP DPSTTWFP
## Min. : 0.0000 Min. : 0.000 Min. :0.00000 Min. : 0.0000
## 1st Qu.: 0.0000 1st Qu.: 0.000 1st Qu.:0.00000 1st Qu.: 0.0000
## Median : 0.0000 Median : 0.000 Median :0.00000 Median : 0.0000
## Mean : 0.3566 Mean : 1.118 Mean :0.08206 Mean : 0.7566
## 3rd Qu.: 0.3000 3rd Qu.: 0.900 3rd Qu.:0.00000 3rd Qu.: 1.2000
## Max. :16.7000 Max. :94.800 Max. :7.80000 Max. :11.7000
## NA's :3 NA's :3 NA's :3 NA's :3
## DPSTREFP DPSTSPFP DPSTCOFP DPSTBIFP
## Min. : 0.00 Min. : 0.000 Min. : 0.000 Min. : 0.000
## 1st Qu.: 71.00 1st Qu.: 4.500 1st Qu.: 0.000 1st Qu.: 0.000
## Median : 76.90 Median : 7.000 Median : 3.350 Median : 0.000
## Mean : 76.87 Mean : 7.145 Mean : 4.131 Mean : 2.311
## 3rd Qu.: 82.42 3rd Qu.: 9.600 3rd Qu.: 6.125 3rd Qu.: 2.100
## Max. :100.00 Max. :22.800 Max. :32.500 Max. :94.300
## NA's :3 NA's :3 NA's :3 NA's :3
## DPSTVOFP DPSTGOFP DPFVTOTK DPFTADPR
## Min. : 0.000 Min. : 0.000 Min. : 0 Min. :0.0000
## 1st Qu.: 4.800 1st Qu.: 0.000 1st Qu.: 238299 1st Qu.:0.9892
## Median : 7.000 Median : 1.200 Median : 419144 Median :1.1670
## Mean : 7.154 Mean : 2.308 Mean : 665067 Mean :1.0212
## 3rd Qu.: 9.425 3rd Qu.: 3.700 3rd Qu.: 670248 3rd Qu.:1.3017
## Max. :71.900 Max. :30.800 Max. :26416597 Max. :1.7480
## NA's :3 NA's :3 NA's :5 NA's :5
## DPFRAALLT DPFRAALLK DPFRAOPRT DPFRASTAP
## Min. :6.428e+05 Min. : 8923 Min. :6.428e+05 Min. : 1.70
## 1st Qu.:5.828e+06 1st Qu.: 12953 1st Qu.:5.524e+06 1st Qu.: 33.85
## Median :1.381e+07 Median : 14653 Median :1.241e+07 Median : 51.00
## Mean :5.995e+07 Mean : 16365 Mean :5.129e+07 Mean : 49.05
## 3rd Qu.:3.805e+07 3rd Qu.: 17081 3rd Qu.:3.241e+07 3rd Qu.: 64.80
## Max. :2.619e+09 Max. :214078 Max. :2.213e+09 Max. :103.40
## NA's :5 NA's :5 NA's :5 NA's :5
## DZRVLOCP DPFRAFEDP DPFRAORVT DPFUNAB1T
## Min. :-6.20 Min. : 0.00 Min. : -655726 Min. : -746998
## 1st Qu.:21.10 1st Qu.: 8.70 1st Qu.: 98311 1st Qu.: 1226730
## Median :35.30 Median :12.30 Median : 1093332 Median : 3589384
## Mean :37.92 Mean :13.04 Mean : 8659622 Mean : 13498263
## 3rd Qu.:53.70 3rd Qu.:16.20 3rd Qu.: 4471240 3rd Qu.: 9357248
## Max. :97.60 Max. :49.00 Max. :405596099 Max. :662450197
## NA's :5 NA's :5 NA's :5 NA's :5
## DPFUNA4T DPFEAALLT DPFEAOPFT DPFEAOPFK
## Min. : -7033092 Min. :6.229e+05 Min. :6.120e+05 Min. : 6755
## 1st Qu.: 0 1st Qu.:5.475e+06 1st Qu.:4.754e+06 1st Qu.: 10916
## Median : 0 Median :1.328e+07 Median :1.102e+07 Median : 12228
## Mean : 510514 Mean :6.597e+07 Mean :4.951e+07 Mean : 13121
## 3rd Qu.: 0 3rd Qu.:3.787e+07 3rd Qu.:3.038e+07 3rd Qu.: 14012
## Max. :126144201 Max. :2.656e+09 Max. :2.068e+09 Max. :178467
## NA's :5 NA's :5 NA's :5 NA's :5
## DPFEAINSP DZEXADMP DZEXADSP DZEXPLAP
## Min. :18.50 Min. : 2.700 Min. : 0.000 Min. : 0.20
## 1st Qu.:52.00 1st Qu.: 7.125 1st Qu.: 4.900 1st Qu.:10.40
## Median :55.10 Median : 8.800 Median : 5.700 Median :11.80
## Mean :54.73 Mean : 9.606 Mean : 6.015 Mean :12.46
## 3rd Qu.:57.80 3rd Qu.:11.200 3rd Qu.: 6.400 3rd Qu.:13.50
## Max. :84.40 Max. :35.800 Max. :22.700 Max. :43.10
## NA's :5 NA's :5 NA's :5 NA's :5
## DZEXOTHP DPFEAINST DPFEAINSK DPFPAREGP
## Min. : 0.30 Min. :2.439e+05 Min. : 3122 Min. : 0.00
## 1st Qu.:15.30 1st Qu.:2.563e+06 1st Qu.: 6056 1st Qu.:35.12
## Median :18.00 Median :6.013e+06 Median : 6702 Median :39.70
## Mean :17.15 Mean :2.835e+07 Mean : 7074 Mean :39.80
## 3rd Qu.:20.00 3rd Qu.:1.683e+07 3rd Qu.: 7577 3rd Qu.:43.90
## Max. :69.30 Max. :1.177e+09 Max. :54954 Max. :79.10
## NA's :5 NA's :5 NA's :5 NA's :5
## DPFPASPEP DPFPACOMP DPFPABILP DPFPAVOCP
## Min. : 0.000 Min. : 0.000 Min. : 0.0000 Min. : 0.00
## 1st Qu.: 5.800 1st Qu.: 6.500 1st Qu.: 0.1000 1st Qu.: 2.90
## Median : 8.900 Median : 9.200 Median : 0.4000 Median : 4.10
## Mean : 9.711 Mean : 9.883 Mean : 0.7496 Mean : 3.96
## 3rd Qu.:12.500 3rd Qu.:12.100 3rd Qu.: 1.0000 3rd Qu.: 5.20
## Max. :49.000 Max. :90.600 Max. :26.0000 Max. :19.80
## NA's :5 NA's :5 NA's :5 NA's :5
## DPFPAGIFP DPFPAATHP DPFPAHSAP DPFPREKP
## Min. :0.0000 Min. :0.000 Min. :0.0000 Min. : 0.0000
## 1st Qu.:0.0000 1st Qu.:1.600 1st Qu.:0.0000 1st Qu.: 0.0000
## Median :0.2000 Median :2.900 Median :0.0000 Median : 0.6000
## Mean :0.3823 Mean :2.809 Mean :0.1578 Mean : 0.8909
## 3rd Qu.:0.4000 3rd Qu.:4.000 3rd Qu.:0.1000 3rd Qu.: 1.3000
## Max. :6.9000 Max. :9.000 Max. :3.4000 Max. :31.7000
## NA's :5 NA's :5 NA's :5 NA's :5
## DPFPAOTHP DISTSIZE COMMTYPE PROPWLTH
## Min. : 3.50 Length:1207 Length:1207 Length:1207
## 1st Qu.:25.40 Class :character Class :character Class :character
## Median :28.70 Mode :character Mode :character Mode :character
## Mean :29.22
## 3rd Qu.:32.40
## Max. :76.30
## NA's :5
## TAXRATE
## Length:1207
## Class :character
## Mode :character
##
##
##
##
better_district<-district %>% dplyr::select(DAGC4X21R,DPETECOP,DPETWHIP,DPETHISP, DPSTKIDR, DA0912DR21R, DA0AT21R) %>% na.omit(.)
grad_model<- lm(DAGC4X21R ~ DPETECOP + DPETWHIP + DPSTKIDR + DA0912DR21R + DA0AT21R, data = better_district)
summary(grad_model)
##
## Call:
## lm(formula = DAGC4X21R ~ DPETECOP + DPETWHIP + DPSTKIDR + DA0912DR21R +
## DA0AT21R, data = better_district)
##
## Residuals:
## Min 1Q Median 3Q Max
## -96.938 -0.846 1.421 3.387 24.592
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 76.46763 14.32451 5.338 1.15e-07 ***
## DPETECOP -0.03607 0.02117 -1.704 0.08873 .
## DPETWHIP 0.02292 0.01630 1.406 0.16002
## DPSTKIDR 0.35319 0.12509 2.823 0.00484 **
## DA0912DR21R -2.01511 0.11816 -17.054 < 2e-16 ***
## DA0AT21R 0.17625 0.14169 1.244 0.21382
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.42 on 1066 degrees of freedom
## Multiple R-squared: 0.3395, Adjusted R-squared: 0.3364
## F-statistic: 109.6 on 5 and 1066 DF, p-value: < 2.2e-16
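The `summary(district)` output above reports minimum values of -1 for DAGC4X21R, DA0912DR21R and DA0AT21R and -2 for DPSTKIDR, which cannot be real rates or ratios, and `na.omit()` does not remove them. A minimal sketch of how such values could be counted and recoded before fitting is shown here. It assumes the negative entries are placeholder codes rather than measurements, and it uses a small made-up data frame (`toy`, with hypothetical column names) so that it runs without `district.xls`.

```r
# Toy stand-in for a few district columns; all values are invented for illustration.
toy <- data.frame(
  grad_rate  = c(100, 95.2, -1, 99.0, 97.8),
  attendance = c(96.7, 96.0, 95.4, -1, 94.5),
  ratio      = c(12.3, 11.0, -2, 11.3, 12.9)
)

# Count negative entries per column (assumed here to be placeholder codes).
neg_counts <- colSums(toy < 0, na.rm = TRUE)
print(neg_counts)

# Recode negatives to NA, then drop incomplete rows before fitting a model.
toy_clean <- toy
toy_clean[toy_clean < 0] <- NA
toy_clean <- na.omit(toy_clean)
print(nrow(toy_clean))
```

The same two steps could be applied to `better_district` before calling `lm()`; whether that is appropriate depends on what the negative codes mean in the data dictionary.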
#2) Create a linear model “lm()” from the variables, with a continuous dependent variable as the outcome
#3) Check the following assumptions:
#a) Linearity (plot and raintest)
plot(grad_model, which = 1)
raintest(grad_model)
##
## Rainbow test
##
## data: grad_model
## Rain = 0.92701, df1 = 536, df2 = 530, p-value = 0.8091
#b) Independence of errors (Durbin-Watson)
durbinWatsonTest(grad_model)
## lag Autocorrelation D-W Statistic p-value
## 1 0.02870758 1.942524 0.23
## Alternative hypothesis: rho != 0
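The p-value printed by `durbinWatsonTest()` comes from a bootstrap simulation, so it can differ slightly from one run to the next. A minimal sketch of two ways to get a stable number follows: fixing the random seed before the call, or using `dwtest()` from lmtest, which does not rely on resampling. The model `fit` on the built-in `mtcars` data is only a stand-in so the example runs on its own; it is not the `grad_model` fitted above.

```r
library(car)     # durbinWatsonTest()
library(lmtest)  # dwtest()

# Stand-in model on built-in data (illustrative only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Same seed before each call gives the same bootstrap p-value.
set.seed(123)
dw1 <- durbinWatsonTest(fit)
set.seed(123)
dw2 <- durbinWatsonTest(fit)
print(identical(dw1$p, dw2$p))

# Alternative: Durbin-Watson test without resampling.
dw_exact <- dwtest(fit, alternative = "two.sided")
print(dw_exact)
```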
#c) Homoscedasticity (plot, bptest)
plot(grad_model, which = 3)
bptest(grad_model)
##
## studentized Breusch-Pagan test
##
## data: grad_model
## BP = 25.929, df = 5, p-value = 9.212e-05
#d) Normality of residuals (QQ plot, Shapiro test)
plot(grad_model, which = 2)
shapiro.test(grad_model$residuals)
##
## Shapiro-Wilk normality test
##
## data: grad_model$residuals
## W = 0.3828, p-value < 2.2e-16
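To help read the W statistic: values close to 1 are consistent with normality, and values well below 1 indicate a departure from it. The sketch below compares W for simulated normal residuals and for strongly left-skewed ones. Both vectors are simulated for illustration and are not taken from `grad_model`.

```r
set.seed(1)

# Simulated "residuals": one normal sample, one with a long left tail.
normal_res <- rnorm(500)
skewed_res <- -rexp(500)

w_normal <- unname(shapiro.test(normal_res)$statistic)
w_skewed <- unname(shapiro.test(skewed_res)$statistic)

# The skewed sample should give the smaller W.
print(c(normal = w_normal, skewed = w_skewed))
```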
#e) No multicollinearity (VIF, cor)
kitchen_sink<- lm(DAGC4X21R ~ DPETECOP + DPETWHIP + DPSTKIDR + DA0912DR21R + DA0AT21R, data = better_district)
vif(kitchen_sink)
## DPETECOP DPETWHIP DPSTKIDR DA0912DR21R DA0AT21R
## 1.838740 1.878939 1.223098 1.543300 1.690313
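The prompt also mentions `cor`, which complements VIF by showing pairwise correlations between predictors. A minimal base-R sketch follows, using three `mtcars` columns as stand-in predictors so it runs without the district file. It also computes VIF by hand as 1 / (1 - R²), where R² comes from regressing each predictor on the others; this is the quantity `vif()` reports for a model with only numeric main-effect terms.

```r
# Stand-in predictors from built-in data (illustrative only).
predictors <- mtcars[, c("wt", "hp", "disp")]

# Pairwise correlations between predictors.
cor_mat <- cor(predictors)
print(round(cor_mat, 2))

# VIF by hand: regress each predictor on the others and use 1 / (1 - R^2).
vif_manual <- sapply(names(predictors), function(v) {
  others <- setdiff(names(predictors), v)
  r2 <- summary(lm(reformulate(others, response = v), data = predictors))$r.squared
  1 / (1 - r2)
})
print(round(vif_manual, 2))
```

For the model in this document, the analogous call would be `cor()` on the five predictor columns of `better_district`.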
#4) Does your model meet those assumptions? You don’t have to be perfectly right, just make a good case.
Linearity
Visual check: In the residuals-versus-fitted plot the line is fairly straight throughout the graph, so the relationship looks mostly linear.
Rainbow test: The test gives a p-value of 0.8091, which is greater than 0.05, so we fail to reject the null hypothesis that the relationship is linear. The linearity assumption appears to be met.

Independence of errors
Durbin-Watson test: The D-W statistic is 1.942524, close to 2, and the p-value is 0.23, which is greater than 0.05. We fail to reject the null hypothesis of no first-order autocorrelation, so this assumption is probably met.

Homoscedasticity
Visual check: In the scale-location plot most points sit near the line, but the spread of the residuals is uneven.
Breusch-Pagan test: The test gives a p-value of 9.212e-05, which is less than 0.05, so we reject the null hypothesis of homoscedasticity (equal variance of residuals). The residuals are heteroscedastic and this assumption is violated.

Normality of residuals
QQ plot: Most points follow the straight line, but they depart from it at both tails.
Shapiro-Wilk test: The test gives W = 0.3828 with a p-value < 2.2e-16, so we reject the null hypothesis that the residuals are normally distributed. The departure is statistically significant and not just due to random chance. The residual summary points the same way: the minimum residual is -96.938 while the first quartile is -0.846, which indicates a long left tail.

Multicollinearity
VIF values: All VIF values fall between 1.22 and 1.88, well below the common threshold of 5, so there is no sign of problematic multicollinearity.

Overall: The model meets the assumptions of linearity, independence of errors, and no multicollinearity, but it does not meet the assumptions of homoscedasticity or normality of residuals.
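#5) If your model violates an assumption, which one?
The model violates two assumptions: homoscedasticity (Breusch-Pagan p-value = 9.212e-05) and normality of residuals (Shapiro-Wilk W = 0.3828, p-value < 2.2e-16).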
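#6) What would you do to mitigate this assumption? Show your work.
Two steps seem reasonable. First, check the data: summary(district) shows minimum values of -1 for DAGC4X21R, DA0912DR21R and DA0AT21R and -2 for DPSTKIDR. These cannot be real rates or ratios and are probably placeholder codes, which na.omit() does not remove. The largest negative residual (-96.938) is consistent with such rows. Recoding them to NA and refitting the model may remove much of the skew in the residuals. Second, if the Breusch-Pagan test still rejects homoscedasticity after refitting, report heteroscedasticity-robust standard errors or transform the outcome. With more than 1,000 observations, mild non-normality of the residuals matters less for the coefficient tests than the unequal variance does.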
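When the Breusch-Pagan test rejects homoscedasticity, as a p-value of 9.212e-05 does, one common response is to keep the fitted coefficients and recompute their standard errors with a heteroscedasticity-consistent estimator. A minimal sketch, assuming the sandwich package is installed (it is not loaded elsewhere in this document) and using a stand-in model on `mtcars` rather than `grad_model`:

```r
library(lmtest)    # coeftest()
library(sandwich)  # vcovHC()

# Stand-in model on built-in data (illustrative only).
fit <- lm(mpg ~ wt + hp, data = mtcars)

# HC3 heteroscedasticity-consistent covariance matrix.
robust_vcov <- vcovHC(fit, type = "HC3")

# Coefficient table using the robust standard errors.
robust_tab <- coeftest(fit, vcov. = robust_vcov)
print(robust_tab)
```

The estimates are unchanged from `summary(fit)`; only the standard errors, t values and p-values differ. The choice of HC3 here is an assumption of this sketch, and other `type` values are available in `vcovHC()`.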