library(MASS)
dim(Boston)
## [1] 506 14
pairs(Boston)
I see a collection of all 14 variables plotted against one another.
The variables zn, indus, chas, rad, tax, and ptratio each show a large concentration of observations at a single value. chas is a binary indicator (1 if the suburb bounds the Charles River), so the only take-away from its panel is that the highest crime rates occur in suburbs that do not bound the river. The spikes in the other variables probably all come from the same general area in Boston.
There is a positive correlation between crime and age. Because age records the proportion of owner-occupied units built before 1940, not the age of residents, this suggests that crime rates tend to be higher in suburbs with older housing stock.
Crime is negatively correlated with both distance from employment centers (dis) and median home value (medv). The former might reflect a conscious decision by policymakers to place employment centers close to high-crime areas, though it could equally mean that high-crime suburbs cluster near the urban core. The negative correlation with median home value suggests a relationship between poverty and crime rate.
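The associations read off the scatterplot matrix can be checked numerically. The following is a minimal sketch, not part of the original analysis; the name crim_cor is introduced here for illustration only.

```r
library(MASS)  # provides the Boston data frame

# Pearson correlation of crim with every other column.
crim_cor <- cor(Boston)["crim", ]
crim_cor <- crim_cor[names(crim_cor) != "crim"]  # drop the trivial self-correlation

# Sorted so the strongest negative and positive associations sit at the two ends.
sort(crim_cor)
```

Pearson correlation only measures linear association, so a heavily skewed variable such as crim may be related to a predictor more strongly than its coefficient suggests.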
hist(Boston$crim)
hist(Boston$tax)
hist(Boston$ptratio)
A small number of suburbs have very high crime rates; the distribution is highly right-skewed.
Tax rates (tax is the property-tax rate per $10,000 of value) approximate a symmetrical bell-shaped distribution between 200 and 500, with a separate spike of suburbs at a much higher rate, above 650.
The pupil-teacher ratio is skewed left. There is one area of the town with a high ratio.
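The shapes described from the histograms can be cross-checked without plotting. This is a rough sketch under two assumptions: that comparing mean and median is an adequate informal skew check, and that the spike in tax comes from many suburbs sharing one value. The names tax_counts and skew_check are illustrative.

```r
library(MASS)  # provides the Boston data frame

# Frequency of each distinct tax value, most common first.
tax_counts <- sort(table(Boston$tax), decreasing = TRUE)
head(tax_counts, 3)

# Mean versus median per variable: mean > median hints at right skew,
# mean < median hints at left skew.
skew_check <- sapply(Boston[c("crim", "tax", "ptratio")],
                     function(x) c(mean = mean(x), median = median(x)))
skew_check
```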
sum(Boston$chas)
## [1] 35
There are 35 suburbs that bound the Charles River.
median(Boston$ptratio)
## [1] 19.05
The median pupil-teacher ratio is 19.05.
min(Boston$medv)
## [1] 5
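The lowest median value of owner-occupied homes is $5,000 (medv is recorded in thousands of dollars). There are two suburbs with that median value. Their most significant deviations from the overall medians are a high property-tax rate and a high crime rate. The former could be explained by the presence of a few high-income families or a higher population density, and the latter is consistent with the negative correlation between crime and median home value noted earlier.
# Suburbs averaging more than 7 rooms per dwelling (this also includes those above 8)
Boston7 <- subset(Boston, rm > 7)
# Suburbs averaging more than 8 rooms per dwelling
Boston8 <- subset(Boston, rm > 8)
# Suburbs averaging fewer than 7 rooms per dwelling
Bostonless <- subset(Boston, rm < 7)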
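The comparison against the overall medians can be made explicit. The sketch below is one possible way to do it, not the code used for the statement above; lowest and overall_median are names introduced here.

```r
library(MASS)  # provides the Boston data frame

# Rows whose median home value equals the minimum.
lowest <- Boston[Boston$medv == min(Boston$medv), ]
lowest

# Overall median of every column, for comparison.
overall_median <- sapply(Boston, median)

# Each lowest-value suburb minus the overall median, column by column
# (sweep subtracts by default).
sweep(as.matrix(lowest), 2, overall_median)
```

Raw differences are on each variable's own scale, so a large number in tax is not directly comparable to a large number in crim; dividing by a spread measure such as the IQR would be one way to put them on a common footing.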
summary(Bostonless)
## crim zn indus chas
## Min. : 0.00632 Min. : 0.00 Min. : 0.74 Min. :0.00000
## 1st Qu.: 0.09168 1st Qu.: 0.00 1st Qu.: 5.96 1st Qu.:0.00000
## Median : 0.26600 Median : 0.00 Median :10.01 Median :0.00000
## Mean : 3.99498 Mean : 8.93 Mean :11.91 Mean :0.06109
## 3rd Qu.: 4.51201 3rd Qu.: 0.00 3rd Qu.:18.10 3rd Qu.:0.00000
## Max. :88.97620 Max. :100.00 Max. :27.74 Max. :1.00000
## nox rm age dis
## Min. :0.3850 Min. :3.561 Min. : 2.90 Min. : 1.130
## 1st Qu.:0.4585 1st Qu.:5.857 1st Qu.: 45.80 1st Qu.: 2.055
## Median :0.5380 Median :6.127 Median : 79.75 Median : 3.079
## Mean :0.5620 Mean :6.099 Mean : 69.72 Mean : 3.736
## 3rd Qu.:0.6310 3rd Qu.:6.431 3rd Qu.: 94.67 3rd Qu.: 5.113
## Max. :0.8710 Max. :6.998 Max. :100.00 Max. :12.127
## rad tax ptratio black
## Min. : 1.00 Min. :187.0 Min. :13.00 Min. : 0.32
## 1st Qu.: 4.00 1st Qu.:287.0 1st Qu.:17.80 1st Qu.:372.56
## Median : 5.00 Median :364.0 Median :19.20 Median :391.48
## Mean :10.07 Mean :422.1 Mean :18.77 Mean :352.10
## 3rd Qu.:24.00 3rd Qu.:666.0 3rd Qu.:20.20 3rd Qu.:396.78
## Max. :24.00 Max. :711.0 Max. :22.00 Max. :396.90
## lstat medv
## Min. : 2.940 Min. : 5.00
## 1st Qu.: 8.303 1st Qu.:16.10
## Median :12.620 Median :20.30
## Mean :13.693 Mean :20.24
## 3rd Qu.:17.600 3rd Qu.:23.57
## Max. :37.970 Max. :50.00
summary(Boston7)
## crim zn indus chas
## Min. : 0.00906 Min. : 0.00 Min. : 0.460 Min. :0.000
## 1st Qu.: 0.04502 1st Qu.: 0.00 1st Qu.: 2.460 1st Qu.:0.000
## Median : 0.09786 Median :20.00 Median : 3.970 Median :0.000
## Mean : 0.97911 Mean :28.17 Mean : 5.776 Mean :0.125
## 3rd Qu.: 0.54289 3rd Qu.:45.00 3rd Qu.: 6.200 3rd Qu.:0.000
## Max. :19.60910 Max. :95.00 Max. :19.580 Max. :1.000
## nox rm age dis
## Min. :0.3940 Min. :7.007 Min. : 8.40 Min. :1.202
## 1st Qu.:0.4303 1st Qu.:7.183 1st Qu.: 36.00 1st Qu.:2.445
## Median :0.4880 Median :7.414 Median : 63.80 Median :3.495
## Mean :0.5045 Mean :7.570 Mean : 60.64 Mean :4.200
## 3rd Qu.:0.5825 3rd Qu.:7.859 3rd Qu.: 85.03 3rd Qu.:5.463
## Max. :0.7180 Max. :8.780 Max. :100.00 Max. :9.223
## rad tax ptratio black
## Min. : 1.000 Min. :193.0 Min. :12.60 Min. :354.3
## 1st Qu.: 3.000 1st Qu.:244.8 1st Qu.:14.70 1st Qu.:384.9
## Median : 5.000 Median :273.0 Median :17.40 Median :390.7
## Mean : 5.984 Mean :312.2 Mean :16.26 Mean :388.3
## 3rd Qu.: 7.000 3rd Qu.:329.0 3rd Qu.:17.93 3rd Qu.:395.3
## Max. :24.000 Max. :666.0 Max. :20.20 Max. :396.9
## lstat medv
## Min. : 1.730 Min. :15.00
## 1st Qu.: 3.555 1st Qu.:32.98
## Median : 4.775 Median :36.45
## Mean : 5.474 Mean :38.40
## 3rd Qu.: 6.590 3rd Qu.:46.17
## Max. :16.740 Max. :50.00
summary(Boston8)
## crim zn indus chas
## Min. :0.02009 Min. : 0.00 Min. : 2.680 Min. :0.0000
## 1st Qu.:0.33147 1st Qu.: 0.00 1st Qu.: 3.970 1st Qu.:0.0000
## Median :0.52014 Median : 0.00 Median : 6.200 Median :0.0000
## Mean :0.71879 Mean :13.62 Mean : 7.078 Mean :0.1538
## 3rd Qu.:0.57834 3rd Qu.:20.00 3rd Qu.: 6.200 3rd Qu.:0.0000
## Max. :3.47428 Max. :95.00 Max. :19.580 Max. :1.0000
## nox rm age dis
## Min. :0.4161 Min. :8.034 Min. : 8.40 Min. :1.801
## 1st Qu.:0.5040 1st Qu.:8.247 1st Qu.:70.40 1st Qu.:2.288
## Median :0.5070 Median :8.297 Median :78.30 Median :2.894
## Mean :0.5392 Mean :8.349 Mean :71.54 Mean :3.430
## 3rd Qu.:0.6050 3rd Qu.:8.398 3rd Qu.:86.50 3rd Qu.:3.652
## Max. :0.7180 Max. :8.780 Max. :93.90 Max. :8.907
## rad tax ptratio black
## Min. : 2.000 Min. :224.0 Min. :13.00 Min. :354.6
## 1st Qu.: 5.000 1st Qu.:264.0 1st Qu.:14.70 1st Qu.:384.5
## Median : 7.000 Median :307.0 Median :17.40 Median :386.9
## Mean : 7.462 Mean :325.1 Mean :16.36 Mean :385.2
## 3rd Qu.: 8.000 3rd Qu.:307.0 3rd Qu.:17.40 3rd Qu.:389.7
## Max. :24.000 Max. :666.0 Max. :20.20 Max. :396.9
## lstat medv
## Min. :2.47 Min. :21.9
## 1st Qu.:3.32 1st Qu.:41.7
## Median :4.14 Median :48.3
## Mean :4.31 Mean :44.2
## 3rd Qu.:5.12 3rd Qu.:50.0
## Max. :7.44 Max. :50.0
The most significant difference is in median home value, which rises from 20.30 for suburbs averaging fewer than 7 rooms per dwelling to 36.45 for suburbs averaging more than 7, and 48.3 for suburbs averaging more than 8. The property-tax rate is higher for suburbs with fewer than 7 rooms (median 364, versus 273 and 307), likely again because of population density.
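The three summaries can be condensed into one table of group medians. This sketch departs from the subsets above in one respect, which is an assumption of the example: it uses disjoint groups (7 or fewer, 7 to 8, more than 8), whereas Boston7 also contains the suburbs in Boston8. The names grouped, room_group, and group_medians are illustrative.

```r
library(MASS)  # provides the Boston data frame

# Copy of the data with a disjoint room-count grouping.
grouped <- Boston
grouped$room_group <- cut(Boston$rm,
                          breaks = c(-Inf, 7, 8, Inf),
                          labels = c("7 or fewer", "7 to 8", "more than 8"))

# Number of suburbs in each group.
table(grouped$room_group)

# Median home value, tax rate, and crime rate per group.
group_medians <- aggregate(cbind(medv, tax, crim) ~ room_group,
                           data = grouped, FUN = median)
group_medians
```

Because the middle group here excludes suburbs above 8 rooms, its medians will not match the summary(Boston7) figures exactly.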