library(ISLR2)
data("Auto")
head(Auto)
## mpg cylinders displacement horsepower weight acceleration year origin
## 1 18 8 307 130 3504 12.0 70 1
## 2 15 8 350 165 3693 11.5 70 1
## 3 18 8 318 150 3436 11.0 70 1
## 4 16 8 304 150 3433 12.0 70 1
## 5 17 8 302 140 3449 10.5 70 1
## 6 15 8 429 198 4341 10.0 70 1
## name
## 1 chevrolet chevelle malibu
## 2 buick skylark 320
## 3 plymouth satellite
## 4 amc rebel sst
## 5 ford torino
## 6 ford galaxie 500
auto <- na.omit(Auto)
Qualitative: name, origin (origin is stored as a numeric code but labels a category).
Quantitative: mpg, cylinders, displacement, horsepower, weight, acceleration, year.
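Because origin is stored as a number, R treats it as quantitative unless it is recoded. The following is a minimal sketch of one way to make the split explicit. The object name `auto_typed` and the level labels are assumptions for illustration; the labels follow the ISLR2 help page for Auto.

```r
library(ISLR2)

auto_typed <- na.omit(Auto)

# Assumption: origin codes 1, 2, 3 mean American, European, Japanese,
# as described in the ISLR2 help page for Auto.
auto_typed$origin <- factor(auto_typed$origin, levels = 1:3,
                            labels = c("American", "European", "Japanese"))

is_quant <- sapply(auto_typed, is.numeric)
names(auto_typed)[is_quant]   # quantitative columns
names(auto_typed)[!is_quant]  # qualitative columns
```

With this recoding, origin would also drop out of the numeric summaries computed with `sapply(..., is.numeric)`.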
range(auto$mpg)
## [1] 9.0 46.6
range(auto$cylinders)
## [1] 3 8
range(auto$displacement)
## [1] 68 455
range(auto$horsepower)
## [1] 46 230
range(auto$weight)
## [1] 1613 5140
range(auto$acceleration)
## [1] 8.0 24.8
range(auto$year)
## [1] 70 82
sapply(auto[, sapply(auto, is.numeric)], mean, na.rm = TRUE)
## mpg cylinders displacement horsepower weight acceleration
## 23.445918 5.471939 194.411990 104.469388 2977.584184 15.541327
## year origin
## 75.979592 1.576531
sapply(auto[, sapply(auto, is.numeric)], sd, na.rm = TRUE)
## mpg cylinders displacement horsepower weight acceleration
## 7.8050075 1.7057832 104.6440039 38.4911599 849.4025600 2.7588641
## year origin
## 3.6837365 0.8055182
subset_auto <- auto[-(10:85), ]
sapply(subset_auto[, sapply(subset_auto, is.numeric)], range)
## mpg cylinders displacement horsepower weight acceleration year origin
## [1,] 11.0 3 68 46 1649 8.5 70 1
## [2,] 46.6 8 455 230 4997 24.8 82 3
sapply(subset_auto[, sapply(subset_auto, is.numeric)], mean, na.rm = TRUE)
## mpg cylinders displacement horsepower weight acceleration
## 24.404430 5.373418 187.240506 100.721519 2935.971519 15.726899
## year origin
## 77.145570 1.601266
sapply(subset_auto[, sapply(subset_auto, is.numeric)], sd, na.rm = TRUE)
## mpg cylinders displacement horsepower weight acceleration
## 7.867283 1.654179 99.678367 35.708853 811.300208 2.693721
## year origin
## 3.106217 0.819910
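To make the full-data and subset statistics easier to compare, they could be placed side by side. This is only a sketch; `summarise_cols` is a hypothetical helper name, not something from the original analysis.

```r
library(ISLR2)

auto <- na.omit(Auto)
subset_auto <- auto[-(10:85), ]          # drop the 10th through 85th rows
num_cols <- sapply(auto, is.numeric)

# Hypothetical helper: min, max, mean, and sd for each column
summarise_cols <- function(df) {
  t(sapply(df, function(x) c(min = min(x), max = max(x),
                             mean = mean(x), sd = sd(x))))
}

full_stats   <- summarise_cols(auto[, num_cols])
subset_stats <- summarise_cols(subset_auto[, num_cols])

# Side-by-side means and standard deviations
round(cbind(full_mean   = full_stats[, "mean"],
            subset_mean = subset_stats[, "mean"],
            full_sd     = full_stats[, "sd"],
            subset_sd   = subset_stats[, "sd"]), 2)
```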
pairs(auto)
The scatterplot matrix shows the strongest correlations between mpg and
weight, displacement, and horsepower, and weaker correlations between mpg
and year and acceleration.
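The visual impression from the scatterplot matrix can be checked numerically. A minimal sketch, using Pearson correlation of each numeric column with mpg:

```r
library(ISLR2)

auto <- na.omit(Auto)
num_cols <- sapply(auto, is.numeric)

# Correlation of every numeric column with mpg, sorted from most negative
mpg_cor <- cor(auto[, num_cols])[, "mpg"]
round(sort(mpg_cor), 2)
```

Pearson correlation only measures linear association, so a curved relationship may be understated.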
plot(auto$weight, auto$mpg)
plot(auto$horsepower, auto$mpg)
plot(auto$displacement, auto$mpg)
Yes. The plots of mpg against weight, horsepower, and displacement all
show a clear negative relationship that could be used for prediction: as
weight, horsepower, or displacement increases, mpg generally decreases.
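As a rough illustration of how such a relationship could be used for prediction, the sketch below fits a simple linear regression of mpg on weight. This is an assumed example, not part of the original analysis, and a straight line may be too crude if the relationship is curved. The weight value 3000 is an arbitrary illustrative input.

```r
library(ISLR2)

auto <- na.omit(Auto)

# Simple linear model: mpg as a function of weight
fit_weight <- lm(mpg ~ weight, data = auto)
coef(fit_weight)

# Predicted mpg for a hypothetical car with weight 3000
predict(fit_weight, newdata = data.frame(weight = 3000))
```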
library(ISLR2)
data(Boston)
dim(Boston)
## [1] 506 13
506 rows, 13 columns. The columns represent:
crim: per capita crime rate by town
zn: proportion of residential land zoned for lots over 25,000 sq.ft.
indus: proportion of non-retail business acres per town
chas: Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
nox: nitric oxides concentration (parts per 10 million)
rm: average number of rooms per dwelling
age: proportion of owner-occupied units built prior to 1940
dis: weighted distances to five Boston employment centres
rad: index of accessibility to radial highways
tax: full-value property-tax rate per $10,000
ptratio: pupil-teacher ratio by town
lstat: % lower status of the population
medv: median value of owner-occupied homes in $1000s
Each row is one numbered Boston suburb (census tract).
pairs(Boston)
We can see some correlation between:
crim: age, dis, and medv
zn: lstat and nox
indus: dis
nox: age and dis
rm: lstat and medv
age: crim, nox, and lstat
dis: crim, nox, indus, and lstat
lstat: zn, age, dis, and medv
medv: lstat, crim, and rm
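The pairs read off the scatterplot matrix can be cross-checked against the correlation matrix. A minimal sketch, where the cutoff of 0.6 is an arbitrary assumption chosen only to shorten the list:

```r
library(ISLR2)

cm <- cor(Boston)
cm[upper.tri(cm, diag = TRUE)] <- NA     # keep each pair once

# Assumption: |r| > 0.6 is an arbitrary cutoff for "noticeable" correlation
strong <- which(abs(cm) > 0.6, arr.ind = TRUE)
strong_pairs <- data.frame(
  var1 = rownames(cm)[strong[, "row"]],
  var2 = colnames(cm)[strong[, "col"]],
  r    = round(cm[strong], 2)
)

# Strongest pairs first
strong_pairs[order(-abs(strong_pairs$r)), ]
```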
plot(Boston$age, Boston$crim)
plot(Boston$dis, Boston$crim)
plot(Boston$medv, Boston$crim)
There is a slight positive relationship between age and crim: towns with
a larger share of older housing tend to have higher per capita crime
rates. Crime also falls as the weighted distance to the five employment
centres increases. Finally, there is a weak negative relationship between
crim and medv: crime rates tend to be lower where the median value of
owner-occupied homes is higher.
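The direction of each relationship can be confirmed with a rank correlation. The sketch below uses Spearman correlation, which is my choice here rather than anything from the original analysis, on the assumption that crim is heavily skewed and a rank-based measure is less sensitive to its extreme values.

```r
library(ISLR2)

# Spearman correlation of crim with each of the three plotted predictors
crim_cor <- sapply(c("age", "dis", "medv"), function(v) {
  cor(Boston$crim, Boston[[v]], method = "spearman")
})
round(crim_cor, 2)
```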
range(Boston$crim)
## [1] 0.00632 88.97620
range(Boston$tax)
## [1] 187 711
range(Boston$ptratio)
## [1] 12.6 22.0
The range of crim is approximately 88.97 (from 0.00632 to 88.9762), the range of tax is 524 (from 187 to 711), and the range of ptratio is 9.4 (from 12.6 to 22.0).
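The widths quoted above are the maximum minus the minimum of each variable. A short sketch that computes them directly rather than by hand:

```r
library(ISLR2)

# Width of the range (max - min) for each of the three predictors
widths <- sapply(Boston[, c("crim", "tax", "ptratio")],
                 function(x) diff(range(x)))
widths
```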
sum(Boston$chas == 1)
## [1] 35
median(Boston$ptratio)
## [1] 19.05
Boston[Boston$medv == min(Boston$medv), ]
## crim zn indus chas nox rm age dis rad tax ptratio lstat medv
## 399 38.3518 0 18.1 0 0.693 5.453 100 1.4896 24 666 20.2 30.59 5
## 406 67.9208 0 18.1 0 0.693 5.683 100 1.4254 24 666 20.2 22.98 5
# Compare these tracts against the range of each predictor
sapply(Boston, range)
## crim zn indus chas nox rm age dis rad tax ptratio lstat
## [1,] 0.00632 0 0.46 0 0.385 3.561 2.9 1.1296 1 187 12.6 1.73
## [2,] 88.97620 100 27.74 1 0.871 8.780 100.0 12.1265 24 711 22.0 37.97
## medv
## [1,] 5
## [2,] 50
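Census tracts 399 and 406 share the lowest median home value, medv = 5 (that is, $5,000). Compared with the overall ranges, both tracts have high crim (38.35 and 67.92), the minimum zn (0), fairly high indus (18.1) and nox (0.693), fairly low rm (5.453 and 5.683), the maximum age (100) and rad (24), low dis (about 1.4 to 1.5), high tax (666), ptratio (20.2), and lstat (30.59 and 22.98), and neither bounds the Charles River (chas = 0).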
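Comparing against the minimum and maximum alone can be misleading when a variable is skewed. One alternative, sketched here as an assumption rather than the original method, is to report where each value falls in the empirical distribution of its column.

```r
library(ISLR2)

low <- Boston[Boston$medv == min(Boston$medv), ]

# For each column, the fraction of tracts with a value <= the low-medv tract's value
pct <- sapply(names(Boston), function(v) ecdf(Boston[[v]])(low[[v]]))
rownames(pct) <- rownames(low)
round(pct, 2)
```

A value near 1 means the tract is at the top of that variable's distribution; a value near 0 means it is at the bottom.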
sum(Boston$rm > 7)
## [1] 64
Boston[Boston$rm > 7, ]
## crim zn indus chas nox rm age dis rad tax ptratio lstat
## 3 0.02729 0.0 7.07 0 0.4690 7.185 61.1 4.9671 2 242 17.8 4.03
## 5 0.06905 0.0 2.18 0 0.4580 7.147 54.2 6.0622 3 222 18.7 5.33
## 41 0.03359 75.0 2.95 0 0.4280 7.024 15.8 5.4011 3 252 18.3 1.98
## 56 0.01311 90.0 1.22 0 0.4030 7.249 21.9 8.6966 5 226 17.9 4.81
## 65 0.01951 17.5 1.38 0 0.4161 7.104 59.5 9.2229 3 216 18.6 8.05
## 89 0.05660 0.0 3.41 0 0.4890 7.007 86.3 3.4217 2 270 17.8 5.50
## 90 0.05302 0.0 3.41 0 0.4890 7.079 63.1 3.4145 2 270 17.8 5.70
## 98 0.12083 0.0 2.89 0 0.4450 8.069 76.0 3.4952 2 276 18.0 4.21
## 99 0.08187 0.0 2.89 0 0.4450 7.820 36.9 3.4952 2 276 18.0 3.57
## 100 0.06860 0.0 2.89 0 0.4450 7.416 62.5 3.4952 2 276 18.0 6.19
## 162 1.46336 0.0 19.58 0 0.6050 7.489 90.8 1.9709 5 403 14.7 1.73
## 163 1.83377 0.0 19.58 1 0.6050 7.802 98.2 2.0407 5 403 14.7 1.92
## 164 1.51902 0.0 19.58 1 0.6050 8.375 93.9 2.1620 5 403 14.7 3.32
## 167 2.01019 0.0 19.58 0 0.6050 7.929 96.2 2.0459 5 403 14.7 3.70
## 181 0.06588 0.0 2.46 0 0.4880 7.765 83.3 2.7410 3 193 17.8 7.56
## 183 0.09103 0.0 2.46 0 0.4880 7.155 92.2 2.7006 3 193 17.8 4.82
## 187 0.05602 0.0 2.46 0 0.4880 7.831 53.6 3.1992 3 193 17.8 4.45
## 190 0.08370 45.0 3.44 0 0.4370 7.185 38.9 4.5667 5 398 15.2 5.39
## 193 0.08664 45.0 3.44 0 0.4370 7.178 26.3 6.4798 5 398 15.2 2.87
## 196 0.01381 80.0 0.46 0 0.4220 7.875 32.0 5.6484 4 255 14.4 2.97
## 197 0.04011 80.0 1.52 0 0.4040 7.287 34.1 7.3090 2 329 12.6 4.08
## 198 0.04666 80.0 1.52 0 0.4040 7.107 36.6 7.3090 2 329 12.6 8.61
## 199 0.03768 80.0 1.52 0 0.4040 7.274 38.3 7.3090 2 329 12.6 6.62
## 201 0.01778 95.0 1.47 0 0.4030 7.135 13.9 7.6534 3 402 17.0 4.45
## 203 0.02177 82.5 2.03 0 0.4150 7.610 15.7 6.2700 2 348 14.7 3.11
## 204 0.03510 95.0 2.68 0 0.4161 7.853 33.2 5.1180 4 224 14.7 3.81
## 205 0.02009 95.0 2.68 0 0.4161 8.034 31.9 5.1180 4 224 14.7 2.88
## 225 0.31533 0.0 6.20 0 0.5040 8.266 78.3 2.8944 8 307 17.4 4.14
## 226 0.52693 0.0 6.20 0 0.5040 8.725 83.0 2.8944 8 307 17.4 4.63
## 227 0.38214 0.0 6.20 0 0.5040 8.040 86.5 3.2157 8 307 17.4 3.13
## 228 0.41238 0.0 6.20 0 0.5040 7.163 79.9 3.2157 8 307 17.4 6.36
## 229 0.29819 0.0 6.20 0 0.5040 7.686 17.0 3.3751 8 307 17.4 3.92
## 232 0.46296 0.0 6.20 0 0.5040 7.412 76.9 3.6715 8 307 17.4 5.25
## 233 0.57529 0.0 6.20 0 0.5070 8.337 73.3 3.8384 8 307 17.4 2.47
## 234 0.33147 0.0 6.20 0 0.5070 8.247 70.4 3.6519 8 307 17.4 3.95
## 238 0.51183 0.0 6.20 0 0.5070 7.358 71.6 4.1480 8 307 17.4 4.73
## 254 0.36894 22.0 5.86 0 0.4310 8.259 8.4 8.9067 7 330 19.1 3.54
## 257 0.01538 90.0 3.75 0 0.3940 7.454 34.2 6.3361 3 244 15.9 3.11
## 258 0.61154 20.0 3.97 0 0.6470 8.704 86.9 1.8010 5 264 13.0 5.12
## 259 0.66351 20.0 3.97 0 0.6470 7.333 100.0 1.8946 5 264 13.0 7.79
## 261 0.54011 20.0 3.97 0 0.6470 7.203 81.8 2.1121 5 264 13.0 9.59
## 262 0.53412 20.0 3.97 0 0.6470 7.520 89.4 2.1398 5 264 13.0 7.26
## 263 0.52014 20.0 3.97 0 0.6470 8.398 91.5 2.2885 5 264 13.0 5.91
## 264 0.82526 20.0 3.97 0 0.6470 7.327 94.5 2.0788 5 264 13.0 11.25
## 265 0.55007 20.0 3.97 0 0.6470 7.206 91.6 1.9301 5 264 13.0 8.10
## 267 0.78570 20.0 3.97 0 0.6470 7.014 84.6 2.1329 5 264 13.0 14.79
## 268 0.57834 20.0 3.97 0 0.5750 8.297 67.0 2.4216 5 264 13.0 7.44
## 269 0.54050 20.0 3.97 0 0.5750 7.470 52.6 2.8720 5 264 13.0 3.16
## 274 0.22188 20.0 6.96 1 0.4640 7.691 51.8 4.3665 3 223 18.6 6.58
## 277 0.10469 40.0 6.41 1 0.4470 7.267 49.0 4.7872 4 254 17.6 6.05
## 281 0.03578 20.0 3.33 0 0.4429 7.820 64.5 4.6947 5 216 14.9 3.76
## 283 0.06129 20.0 3.33 1 0.4429 7.645 49.7 5.2119 5 216 14.9 3.01
## 284 0.01501 90.0 1.21 1 0.4010 7.923 24.8 5.8850 1 198 13.6 3.16
## 285 0.00906 90.0 2.97 0 0.4000 7.088 20.8 7.3073 1 285 15.3 7.85
## 292 0.07886 80.0 4.95 0 0.4110 7.148 27.7 5.1167 4 245 19.2 3.56
## 300 0.05561 70.0 2.24 0 0.4000 7.041 10.0 7.8278 5 358 14.8 4.74
## 305 0.05515 33.0 2.18 0 0.4720 7.236 41.1 4.0220 7 222 18.4 6.93
## 307 0.07503 33.0 2.18 0 0.4720 7.420 71.9 3.0992 7 222 18.4 6.47
## 342 0.01301 35.0 1.52 0 0.4420 7.241 49.3 7.0379 1 284 15.5 5.49
## 365 3.47428 0.0 18.10 1 0.7180 8.780 82.9 1.9047 24 666 20.2 5.29
## 371 6.53876 0.0 18.10 1 0.6310 7.016 97.5 1.2024 24 666 20.2 2.96
## 376 19.60910 0.0 18.10 0 0.6710 7.313 97.9 1.3163 24 666 20.2 13.44
## 454 8.24809 0.0 18.10 0 0.7130 7.393 99.3 2.4527 24 666 20.2 16.74
## 483 5.73116 0.0 18.10 0 0.5320 7.061 77.0 3.4106 24 666 20.2 7.01
## medv
## 3 34.7
## 5 36.2
## 41 34.9
## 56 35.4
## 65 33.0
## 89 23.6
## 90 28.7
## 98 38.7
## 99 43.8
## 100 33.2
## 162 50.0
## 163 50.0
## 164 50.0
## 167 50.0
## 181 39.8
## 183 37.9
## 187 50.0
## 190 34.9
## 193 36.4
## 196 50.0
## 197 33.3
## 198 30.3
## 199 34.6
## 201 32.9
## 203 42.3
## 204 48.5
## 205 50.0
## 225 44.8
## 226 50.0
## 227 37.6
## 228 31.6
## 229 46.7
## 232 31.7
## 233 41.7
## 234 48.3
## 238 31.5
## 254 42.8
## 257 44.0
## 258 50.0
## 259 36.0
## 261 33.8
## 262 43.1
## 263 48.8
## 264 31.0
## 265 36.5
## 267 30.7
## 268 50.0
## 269 43.5
## 274 35.2
## 277 33.2
## 281 45.4
## 283 46.0
## 284 50.0
## 285 32.2
## 292 37.3
## 300 29.0
## 305 36.1
## 307 33.4
## 342 32.7
## 365 21.9
## 371 50.0
## 376 15.0
## 454 17.8
## 483 25.0
64 census tracts average more than 7 rooms per dwelling. The other variables for these tracts are displayed in the table above.
sum(Boston$rm > 8)
## [1] 13
Boston[Boston$rm > 8, ]
## crim zn indus chas nox rm age dis rad tax ptratio lstat medv
## 98 0.12083 0 2.89 0 0.4450 8.069 76.0 3.4952 2 276 18.0 4.21 38.7
## 164 1.51902 0 19.58 1 0.6050 8.375 93.9 2.1620 5 403 14.7 3.32 50.0
## 205 0.02009 95 2.68 0 0.4161 8.034 31.9 5.1180 4 224 14.7 2.88 50.0
## 225 0.31533 0 6.20 0 0.5040 8.266 78.3 2.8944 8 307 17.4 4.14 44.8
## 226 0.52693 0 6.20 0 0.5040 8.725 83.0 2.8944 8 307 17.4 4.63 50.0
## 227 0.38214 0 6.20 0 0.5040 8.040 86.5 3.2157 8 307 17.4 3.13 37.6
## 233 0.57529 0 6.20 0 0.5070 8.337 73.3 3.8384 8 307 17.4 2.47 41.7
## 234 0.33147 0 6.20 0 0.5070 8.247 70.4 3.6519 8 307 17.4 3.95 48.3
## 254 0.36894 22 5.86 0 0.4310 8.259 8.4 8.9067 7 330 19.1 3.54 42.8
## 258 0.61154 20 3.97 0 0.6470 8.704 86.9 1.8010 5 264 13.0 5.12 50.0
## 263 0.52014 20 3.97 0 0.6470 8.398 91.5 2.2885 5 264 13.0 5.91 48.8
## 268 0.57834 20 3.97 0 0.5750 8.297 67.0 2.4216 5 264 13.0 7.44 50.0
## 365 3.47428 0 18.10 1 0.7180 8.780 82.9 1.9047 24 666 20.2 5.29 21.9
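13 census tracts average more than 8 rooms per dwelling; their other variables are displayed in the table above. Apart from tract 365, these tracts have crim below 1.6 and medv between 37.6 and 50, and all 13 have lstat below 7.5.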
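To comment on these tracts as a group, their column means can be set against the means for all tracts. A minimal sketch; the row labels are illustrative names only.

```r
library(ISLR2)

over8 <- Boston$rm > 8

# Column means for tracts averaging more than 8 rooms versus all tracts
group_means <- rbind(over_8_rooms = colMeans(Boston[over8, ]),
                     all_tracts   = colMeans(Boston))
round(group_means, 2)
```

Means are sensitive to outliers such as tract 365, so medians (for example via `apply(Boston[over8, ], 2, median)`) may be a useful cross-check.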