2.

a) Regression, inference, n = 500, p = 3
b) Classification, prediction, n = 20, p = 13
c) Regression, prediction, n = 52, p = 3

5.

A very flexible approach can be disadvantageous when the model overfits the training data, which leads to a higher test MSE. It can be advantageous because it does not impose a strict shape on the data, which reduces the bias of the model, but the variance will increase. A less flexible approach may be preferred if we want to make inferences and understand the relationship between X and Y. Less flexible approaches are also less complex and easier to describe than very flexible models.
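The trade-off can be illustrated with a small simulation. This is only a sketch: the true function, noise level, sample sizes, and polynomial degrees are all assumptions chosen for illustration, not part of the exercise.

```r
# Simulated data: a smooth nonlinear signal plus noise (all values are assumptions).
set.seed(1)
f <- function(x) sin(2 * x)
train <- data.frame(x = runif(60, 0, 3))
train$y <- f(train$x) + rnorm(60, sd = 0.3)
test <- data.frame(x = runif(1000, 0, 3))
test$y <- f(test$x) + rnorm(1000, sd = 0.3)

# Fit polynomials of increasing flexibility and record both error rates.
degrees <- c(1, 3, 12)
mse <- sapply(degrees, function(d) {
  fit <- lm(y ~ poly(x, d), data = train)
  c(train = mean((train$y - predict(fit, train))^2),
    test  = mean((test$y - predict(fit, test))^2))
})
colnames(mse) <- paste0("degree_", degrees)
print(round(mse, 3))
```

Training MSE can only fall as the degree grows because the models are nested. Test MSE is the quantity to watch, since it typically stops improving once the extra flexibility starts fitting noise.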

6.

A parametric approach assumes a functional form for f and reduces the problem to estimating a small set of parameters, while a non-parametric approach makes no explicit assumption about the form of f and instead tries to get as close to the data as possible without being too wiggly.

Parametric: Parametric models are much more structured and follow defined shapes that aren’t very flexible, so the fit suffers if the assumed shape is far from the true f. They do not require as many observations to fit, and they take less computation time.

Non-parametric: Non-parametric models are very flexible and have a better potential to fit the shape of f because they allow a wider range of shapes. A large disadvantage of non-parametric methods is that they require many more observations in order to make an accurate estimate of f.
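As a rough illustration of the contrast, the sketch below fits a straight line (parametric) and a local regression (non-parametric) to the same simulated data. The data-generating function and the `span` value are assumptions made for this example only.

```r
# Simulated nonlinear data (an assumption for illustration).
set.seed(2)
x <- sort(runif(200, 0, 10))
y <- sin(x) + rnorm(200, sd = 0.3)
dat <- data.frame(x = x, y = y)

# Parametric: assumes f is linear, so only two coefficients are estimated.
fit_param <- lm(y ~ x, data = dat)

# Non-parametric: local regression, no global shape assumed.
fit_nonpar <- loess(y ~ x, data = dat, span = 0.3)

n_coef <- length(coef(fit_param))
mse_param <- mean(residuals(fit_param)^2)
mse_nonpar <- mean(residuals(fit_nonpar)^2)
print(c(linear = mse_param, loess = mse_nonpar))
```

The linear fit is summarized by two numbers and is easy to describe, but it cannot follow the curve. The local fit tracks the curve more closely at the cost of needing enough observations in every neighborhood of x.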


8.

b)
View(college)
summary(college)
##      ...1             Private               Apps           Accept     
##  Length:777         Length:777         Min.   :   81   Min.   :   72  
##  Class :character   Class :character   1st Qu.:  776   1st Qu.:  604  
##  Mode  :character   Mode  :character   Median : 1558   Median : 1110  
##                                        Mean   : 3002   Mean   : 2019  
##                                        3rd Qu.: 3624   3rd Qu.: 2424  
##                                        Max.   :48094   Max.   :26330  
##      Enroll       Top10perc       Top25perc      F.Undergrad   
##  Min.   :  35   Min.   : 1.00   Min.   :  9.0   Min.   :  139  
##  1st Qu.: 242   1st Qu.:15.00   1st Qu.: 41.0   1st Qu.:  992  
##  Median : 434   Median :23.00   Median : 54.0   Median : 1707  
##  Mean   : 780   Mean   :27.56   Mean   : 55.8   Mean   : 3700  
##  3rd Qu.: 902   3rd Qu.:35.00   3rd Qu.: 69.0   3rd Qu.: 4005  
##  Max.   :6392   Max.   :96.00   Max.   :100.0   Max.   :31643  
##   P.Undergrad         Outstate       Room.Board       Books       
##  Min.   :    1.0   Min.   : 2340   Min.   :1780   Min.   :  96.0  
##  1st Qu.:   95.0   1st Qu.: 7320   1st Qu.:3597   1st Qu.: 470.0  
##  Median :  353.0   Median : 9990   Median :4200   Median : 500.0  
##  Mean   :  855.3   Mean   :10441   Mean   :4358   Mean   : 549.4  
##  3rd Qu.:  967.0   3rd Qu.:12925   3rd Qu.:5050   3rd Qu.: 600.0  
##  Max.   :21836.0   Max.   :21700   Max.   :8124   Max.   :2340.0  
##     Personal         PhD            Terminal       S.F.Ratio    
##  Min.   : 250   Min.   :  8.00   Min.   : 24.0   Min.   : 2.50  
##  1st Qu.: 850   1st Qu.: 62.00   1st Qu.: 71.0   1st Qu.:11.50  
##  Median :1200   Median : 75.00   Median : 82.0   Median :13.60  
##  Mean   :1341   Mean   : 72.66   Mean   : 79.7   Mean   :14.09  
##  3rd Qu.:1700   3rd Qu.: 85.00   3rd Qu.: 92.0   3rd Qu.:16.50  
##  Max.   :6800   Max.   :103.00   Max.   :100.0   Max.   :39.80  
##   perc.alumni        Expend        Grad.Rate     
##  Min.   : 0.00   Min.   : 3186   Min.   : 10.00  
##  1st Qu.:13.00   1st Qu.: 6751   1st Qu.: 53.00  
##  Median :21.00   Median : 8377   Median : 65.00  
##  Mean   :22.74   Mean   : 9660   Mean   : 65.46  
##  3rd Qu.:31.00   3rd Qu.:10830   3rd Qu.: 78.00  
##  Max.   :64.00   Max.   :56233   Max.   :118.00

c)

pairs(college[ ,3:13])

college$Private = as.factor(college$Private)
plot(college$Private, college$Outstate)

Elite=rep("No",nrow(college))
Elite[college$Top10perc >50]="Yes"
Elite=as.factor(Elite)
college=data.frame(college, Elite)

summary(college$Elite)
##  No Yes 
## 699  78
plot(college$Elite, college$Outstate)

par(mfrow = c(2, 2))
hist(college$Apps)
hist(college$Grad.Rate)
hist(college$S.F.Ratio)
hist(college$Expend)

college_glm = glm(Private ~.-...1, data = college, family = binomial)
summary(college_glm)
## 
## Call:
## glm(formula = Private ~ . - ...1, family = binomial, data = college)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -1.832e-02  1.904e+00  -0.010  0.99233    
## Apps        -4.002e-04  2.382e-04  -1.680  0.09296 .  
## Accept      -9.533e-05  4.634e-04  -0.206  0.83702    
## Enroll       1.470e-03  8.710e-04   1.688  0.09139 .  
## Top10perc    5.025e-02  3.371e-02   1.491  0.13609    
## Top25perc   -6.060e-03  2.005e-02  -0.302  0.76245    
## F.Undergrad -4.257e-04  1.472e-04  -2.892  0.00383 ** 
## P.Undergrad  2.079e-06  1.367e-04   0.015  0.98787    
## Outstate     7.250e-04  1.172e-04   6.185 6.22e-10 ***
## Room.Board   1.211e-04  2.690e-04   0.450  0.65252    
## Books        1.932e-03  1.354e-03   1.427  0.15369    
## Personal    -3.746e-04  2.706e-04  -1.384  0.16625    
## PhD         -6.917e-02  2.693e-02  -2.568  0.01022 *  
## Terminal    -2.618e-02  2.567e-02  -1.020  0.30786    
## S.F.Ratio   -8.065e-02  6.301e-02  -1.280  0.20053    
## perc.alumni  4.686e-02  2.108e-02   2.223  0.02620 *  
## Expend       1.799e-04  1.198e-04   1.501  0.13324    
## Grad.Rate    1.636e-02  1.184e-02   1.382  0.16696    
## EliteYes    -3.191e+00  1.219e+00  -2.617  0.00887 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 910.75  on 776  degrees of freedom
## Residual deviance: 232.98  on 758  degrees of freedom
## AIC: 270.98
## 
## Number of Fisher Scoring iterations: 8

I ran a logistic regression of Private on all the other variables and found that F.Undergrad, Outstate, PhD, perc.alumni, and EliteYes are all significant predictors of a university being private at an alpha level of 0.05.

college_glm = glm(Private ~  `F.Undergrad` + Outstate + PhD 
                  +  `perc.alumni` + Elite, data = college, family = binomial)
summary(college_glm)
## 
## Call:
## glm(formula = Private ~ F.Undergrad + Outstate + PhD + perc.alumni + 
##     Elite, family = binomial, data = college)
## 
## Coefficients:
##               Estimate Std. Error z value Pr(>|z|)    
## (Intercept) -3.349e-01  8.938e-01  -0.375 0.707911    
## F.Undergrad -4.795e-04  6.326e-05  -7.579 3.47e-14 ***
## Outstate     8.228e-04  8.545e-05   9.629  < 2e-16 ***
## PhD         -6.991e-02  1.456e-02  -4.802 1.57e-06 ***
## perc.alumni  6.608e-02  1.967e-02   3.359 0.000781 ***
## EliteYes    -1.426e+00  8.892e-01  -1.604 0.108819    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 910.75  on 776  degrees of freedom
## Residual deviance: 260.86  on 771  degrees of freedom
## AIC: 272.86
## 
## Number of Fisher Scoring iterations: 7

When the model is refit with only those significant predictors, Elite is no longer significant in predicting Private (p = 0.109).
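Because the reduced model is nested in the full one, the two fits can also be compared with a likelihood ratio test on the residual deviances. The sketch below uses only the deviances and degrees of freedom printed in the two summaries above; the variable names are made up for this example.

```r
# Residual deviances and degrees of freedom copied from the two summaries above.
dev_full <- 232.98
df_full <- 758
dev_reduced <- 260.86
df_reduced <- 771

# Likelihood ratio statistic: drop in deviance when the 13 predictors are added back.
lr_stat <- dev_reduced - dev_full
lr_df <- df_reduced - df_full
p_value <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

print(c(statistic = lr_stat, df = lr_df, p_value = p_value))
```

A small p-value here would suggest that the dropped predictors jointly carry some information even though none of them is individually significant. With both fitted objects available under separate names, `anova(reduced, full, test = "Chisq")` performs the same comparison.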

9.

a)

auto = read_csv("~/R-Studio/Predictive Modeling/ALL CSV FILES - 2nd Edition/Auto.csv")
## Rows: 397 Columns: 9
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): horsepower, name
## dbl (7): mpg, cylinders, displacement, weight, acceleration, year, origin
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
auto = na.omit(auto)
summary(auto)
##       mpg          cylinders      displacement    horsepower       
##  Min.   : 9.00   Min.   :3.000   Min.   : 68.0   Length:397        
##  1st Qu.:17.50   1st Qu.:4.000   1st Qu.:104.0   Class :character  
##  Median :23.00   Median :4.000   Median :146.0   Mode  :character  
##  Mean   :23.52   Mean   :5.458   Mean   :193.5                     
##  3rd Qu.:29.00   3rd Qu.:8.000   3rd Qu.:262.0                     
##  Max.   :46.60   Max.   :8.000   Max.   :455.0                     
##      weight      acceleration        year           origin     
##  Min.   :1613   Min.   : 8.00   Min.   :70.00   Min.   :1.000  
##  1st Qu.:2223   1st Qu.:13.80   1st Qu.:73.00   1st Qu.:1.000  
##  Median :2800   Median :15.50   Median :76.00   Median :1.000  
##  Mean   :2970   Mean   :15.56   Mean   :75.99   Mean   :1.574  
##  3rd Qu.:3609   3rd Qu.:17.10   3rd Qu.:79.00   3rd Qu.:2.000  
##  Max.   :5140   Max.   :24.80   Max.   :82.00   Max.   :3.000  
##      name          
##  Length:397        
##  Class :character  
##  Mode  :character  
##                    
##                    
## 

According to the summary, everything is numeric except horsepower and name. Horsepower should be numeric, so I will convert it.

auto$horsepower = as.numeric(auto$horsepower)
## Warning: NAs introduced by coercion
auto = na.omit(auto)
summary(auto)
##       mpg          cylinders      displacement     horsepower        weight    
##  Min.   : 9.00   Min.   :3.000   Min.   : 68.0   Min.   : 46.0   Min.   :1613  
##  1st Qu.:17.00   1st Qu.:4.000   1st Qu.:105.0   1st Qu.: 75.0   1st Qu.:2225  
##  Median :22.75   Median :4.000   Median :151.0   Median : 93.5   Median :2804  
##  Mean   :23.45   Mean   :5.472   Mean   :194.4   Mean   :104.5   Mean   :2978  
##  3rd Qu.:29.00   3rd Qu.:8.000   3rd Qu.:275.8   3rd Qu.:126.0   3rd Qu.:3615  
##  Max.   :46.60   Max.   :8.000   Max.   :455.0   Max.   :230.0   Max.   :5140  
##   acceleration        year           origin          name          
##  Min.   : 8.00   Min.   :70.00   Min.   :1.000   Length:392        
##  1st Qu.:13.78   1st Qu.:73.00   1st Qu.:1.000   Class :character  
##  Median :15.50   Median :76.00   Median :1.000   Mode  :character  
##  Mean   :15.54   Mean   :75.98   Mean   :1.577                     
##  3rd Qu.:17.02   3rd Qu.:79.00   3rd Qu.:2.000                     
##  Max.   :24.80   Max.   :82.00   Max.   :3.000

NA’s were introduced when switching horsepower to numeric but have been removed. Now only ‘name’ should be the character and everything else numeric.
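An alternative that avoids the coercion warning is to declare the missing-value placeholder when the file is read. The sketch below assumes the non-numeric entries are coded as "?" and uses a tiny hypothetical file in place of Auto.csv.

```r
# Hypothetical three-row file that mimics a "?" placeholder for missing horsepower.
tmp <- tempfile(fileext = ".csv")
writeLines(c("mpg,horsepower,name",
             "18,130,car a",
             "25,?,car b",
             "30,95,car c"), tmp)

# Treat "?" as NA at read time so horsepower is parsed as a number.
auto_demo <- read.csv(tmp, na.strings = "?")
str(auto_demo$horsepower)

# Drop the incomplete row, as done above with na.omit().
auto_demo <- na.omit(auto_demo)
print(nrow(auto_demo))
unlink(tmp)
```

`readr::read_csv()` accepts a similar `na` argument, so the same idea should carry over to the call used above.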

b)

Range of auto

sapply(auto[, 1:7], range)
##       mpg cylinders displacement horsepower weight acceleration year
## [1,]  9.0         3           68         46   1613          8.0   70
## [2,] 46.6         8          455        230   5140         24.8   82

c)

Mean of auto

sapply(auto[, 1:7], mean)
##          mpg    cylinders displacement   horsepower       weight acceleration 
##    23.445918     5.471939   194.411990   104.469388  2977.584184    15.541327 
##         year 
##    75.979592

Standard Deviation of auto

sapply(auto[, 1:7], sd)
##          mpg    cylinders displacement   horsepower       weight acceleration 
##     7.805007     1.705783   104.644004    38.491160   849.402560     2.758864 
##         year 
##     3.683737

d)

Subsample Range

auto2 <- auto[-c(10:85), ]
sapply(auto2[, 1:7], range)
##       mpg cylinders displacement horsepower weight acceleration year
## [1,] 11.0         3           68         46   1649          8.5   70
## [2,] 46.6         8          455        230   4997         24.8   82

Subsample Mean

sapply(auto2[, 1:7], mean)
##          mpg    cylinders displacement   horsepower       weight acceleration 
##    24.404430     5.373418   187.240506   100.721519  2935.971519    15.726899 
##         year 
##    77.145570

Subsample Standard Deviation

sapply(auto2[, 1:7], sd)
##          mpg    cylinders displacement   horsepower       weight acceleration 
##     7.867283     1.654179    99.678367    35.708853   811.300208     2.693721 
##         year 
##     3.106217
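The range, mean, and standard deviation can also be collected in a single call, which makes the full sample and the subsample easier to compare side by side. This is a sketch on a small made-up data frame; `stats` and `demo` are hypothetical names.

```r
# Small made-up data frame standing in for the quantitative columns.
demo <- data.frame(a = c(1, 4, 7, 10), b = c(2.5, 3.5, 9, 1))

# One helper that returns all four statistics for a column.
stats <- function(x) c(min = min(x), max = max(x), mean = mean(x), sd = sd(x))

# Full data versus a subsample with rows 2 through 3 removed.
full_stats <- sapply(demo, stats)
sub_stats <- sapply(demo[-c(2:3), ], stats)

print(full_stats)
print(sub_stats)
```

Applied to the data above, the same pattern would be `sapply(auto[, 1:7], stats)` and `sapply(auto2[, 1:7], stats)`.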

e)

pairs(auto[ ,1:8])

par(mfrow = c(1, 1))
plot(auto$weight, auto$acceleration)

There is a slight negative correlation: as weight increases, acceleration decreases.

plot(auto$weight, auto$mpg)

There is a clear negative correlation: as weight increases, miles per gallon decreases.

f)

Yes, weight seems to be a good predictor of mpg because there is a clear downward trend: past a weight of about 3500, mpg falls from ~20 to ~10. We can also fit a model to check this.

auto_lm = lm(mpg~weight, data = auto)
summary(auto_lm)
## 
## Call:
## lm(formula = mpg ~ weight, data = auto)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -11.9736  -2.7556  -0.3358   2.1379  16.5194 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 46.216524   0.798673   57.87   <2e-16 ***
## weight      -0.007647   0.000258  -29.64   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.333 on 390 degrees of freedom
## Multiple R-squared:  0.6926, Adjusted R-squared:  0.6918 
## F-statistic: 878.8 on 1 and 390 DF,  p-value: < 2.2e-16

Weight is a significant predictor based on this linear model, and it explains about 69% of the variance in mpg (R-squared = 0.6926).
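To make the fitted line concrete, the coefficients from the summary above can be plugged in by hand. The weight of 3000 is a hypothetical value chosen for illustration.

```r
# Coefficients copied from the lm summary above.
b0 <- 46.216524
b1 <- -0.007647

# Hypothetical car weight, chosen only for illustration.
weight_new <- 3000

# Predicted mpg from the fitted line: b0 + b1 * weight.
mpg_hat <- b0 + b1 * weight_new
print(mpg_hat)

# Change in predicted mpg for each additional 1000 units of weight.
print(b1 * 1000)
```

With the fitted object in memory, `predict(auto_lm, data.frame(weight = 3000))` should give the same prediction up to rounding of the printed coefficients.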

10.

a)

#install.packages("ISLR2")
library(ISLR2)
## Warning: package 'ISLR2' was built under R version 4.3.2
boston = ISLR2::Boston
dim(boston)
## [1] 506  13
?ISLR2::Boston
## starting httpd help server ... done

There are 506 rows and 13 columns: each row is a suburb of Boston, and the 13 columns are variables describing those suburbs, including the median housing value. An explanation of each column can be found on the help page called above.

b)

pairs(boston)

There is a lot going on in the pairwise scatterplots, but some pairs that look correlated at first glance are zn & crim, indus & nox, lstat & medv, and rad & tax. There are possibly more that are hard to see by eye.
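When a scatterplot matrix is too dense to read, ranking the pairwise correlations is one way to shortlist pairs worth plotting. The sketch below uses the built-in `mtcars` data as a stand-in, and `top_correlations()` is a hypothetical helper written for this example.

```r
# Hypothetical helper: list the n variable pairs with the largest |correlation|.
top_correlations <- function(df, n = 5) {
  cm <- cor(df)
  # Keep each pair once by taking only the upper triangle.
  idx <- which(upper.tri(cm), arr.ind = TRUE)
  out <- data.frame(var1 = rownames(cm)[idx[, "row"]],
                    var2 = colnames(cm)[idx[, "col"]],
                    r = cm[idx])
  out <- out[order(-abs(out$r)), ]
  head(out, n)
}

# Stand-in data; any all-numeric data frame would work the same way.
top <- top_correlations(mtcars, n = 5)
print(top)
```

Correlation only measures linear association, so a pair with a strong curved relationship can still rank low. The list is a starting point for plotting, not a replacement for it.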

par(mfrow = c(2, 2))
plot(boston$tax, boston$crim)
plot(boston$ptratio, boston$crim)
plot(boston$medv, boston$crim)
plot(boston$rm, boston$crim)

It seems that more crime occurs in the higher tax range (~650 to be specific). More crime seems to occur when the pupil-teacher ratio is ~20. Most crime occurs in the lower range of median home values, around 10 (in $1000s, so ~$10,000). More crime occurs in suburbs averaging 4-6 rooms per dwelling.

c)

boston_lm = lm(crim ~., data = boston)
summary(boston_lm)
## 
## Call:
## lm(formula = crim ~ ., data = boston)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -8.534 -2.248 -0.348  1.087 73.923 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 13.7783938  7.0818258   1.946 0.052271 .  
## zn           0.0457100  0.0187903   2.433 0.015344 *  
## indus       -0.0583501  0.0836351  -0.698 0.485709    
## chas        -0.8253776  1.1833963  -0.697 0.485841    
## nox         -9.9575865  5.2898242  -1.882 0.060370 .  
## rm           0.6289107  0.6070924   1.036 0.300738    
## age         -0.0008483  0.0179482  -0.047 0.962323    
## dis         -1.0122467  0.2824676  -3.584 0.000373 ***
## rad          0.6124653  0.0875358   6.997 8.59e-12 ***
## tax         -0.0037756  0.0051723  -0.730 0.465757    
## ptratio     -0.3040728  0.1863598  -1.632 0.103393    
## lstat        0.1388006  0.0757213   1.833 0.067398 .  
## medv        -0.2200564  0.0598240  -3.678 0.000261 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 6.46 on 493 degrees of freedom
## Multiple R-squared:  0.4493, Adjusted R-squared:  0.4359 
## F-statistic: 33.52 on 12 and 493 DF,  p-value: < 2.2e-16

Using a linear model to find significant predictors of crim (alpha = 0.05), the significant predictors were zn, dis, rad, and medv. The coefficients of zn and rad are positive, while those of dis and medv are negative. This tells us that, holding the other variables fixed, crime tends to increase as zn and rad increase and to decrease as dis and medv increase. Based on these four variables alone, a suburb with a lot of crime would have higher zn and rad and lower dis and medv.

d)

par(mfrow=c(1,1))

hist(boston$crim,breaks=50)

range(boston$crim)
## [1]  0.00632 88.97620

Most suburbs have low crime, but the distribution has a long right tail that reaches a per capita crime rate of ~89.

hist(boston$tax)

range(boston$tax)
## [1] 187 711

Most suburbs have a tax rate between ~200 and ~400, then there is a gap from 400 to 600, followed by a big jump in frequency near 700.

hist(boston$ptratio)

range(boston$ptratio)
## [1] 12.6 22.0

There is a very high frequency at a pupil-teacher ratio of ~20, which could be influencing our earlier look at crime vs ptratio in b).

e)

dim(subset(Boston, chas == 1))
## [1] 35 13

35 suburbs bound the Charles River.

f)

median(Boston$ptratio)
## [1] 19.05

The median pupil-teacher ratio is 19.05.

g)

boston[boston$medv == min(boston$medv), ]
##        crim zn indus chas   nox    rm age    dis rad tax ptratio lstat medv
## 399 38.3518  0  18.1    0 0.693 5.453 100 1.4896  24 666    20.2 30.59    5
## 406 67.9208  0  18.1    0 0.693 5.683 100 1.4254  24 666    20.2 22.98    5
summary(boston)
##       crim                zn             indus            chas        
##  Min.   : 0.00632   Min.   :  0.00   Min.   : 0.46   Min.   :0.00000  
##  1st Qu.: 0.08205   1st Qu.:  0.00   1st Qu.: 5.19   1st Qu.:0.00000  
##  Median : 0.25651   Median :  0.00   Median : 9.69   Median :0.00000  
##  Mean   : 3.61352   Mean   : 11.36   Mean   :11.14   Mean   :0.06917  
##  3rd Qu.: 3.67708   3rd Qu.: 12.50   3rd Qu.:18.10   3rd Qu.:0.00000  
##  Max.   :88.97620   Max.   :100.00   Max.   :27.74   Max.   :1.00000  
##       nox               rm             age              dis        
##  Min.   :0.3850   Min.   :3.561   Min.   :  2.90   Min.   : 1.130  
##  1st Qu.:0.4490   1st Qu.:5.886   1st Qu.: 45.02   1st Qu.: 2.100  
##  Median :0.5380   Median :6.208   Median : 77.50   Median : 3.207  
##  Mean   :0.5547   Mean   :6.285   Mean   : 68.57   Mean   : 3.795  
##  3rd Qu.:0.6240   3rd Qu.:6.623   3rd Qu.: 94.08   3rd Qu.: 5.188  
##  Max.   :0.8710   Max.   :8.780   Max.   :100.00   Max.   :12.127  
##       rad              tax           ptratio          lstat      
##  Min.   : 1.000   Min.   :187.0   Min.   :12.60   Min.   : 1.73  
##  1st Qu.: 4.000   1st Qu.:279.0   1st Qu.:17.40   1st Qu.: 6.95  
##  Median : 5.000   Median :330.0   Median :19.05   Median :11.36  
##  Mean   : 9.549   Mean   :408.2   Mean   :18.46   Mean   :12.65  
##  3rd Qu.:24.000   3rd Qu.:666.0   3rd Qu.:20.20   3rd Qu.:16.95  
##  Max.   :24.000   Max.   :711.0   Max.   :22.00   Max.   :37.97  
##       medv      
##  Min.   : 5.00  
##  1st Qu.:17.02  
##  Median :21.20  
##  Mean   :22.53  
##  3rd Qu.:25.00  
##  Max.   :50.00

Two suburbs tie for the lowest medv (5). Compared to the overall ranges, crim, nox, and lstat are above the 3rd quartile; indus, tax, and ptratio sit at the 3rd quartile; rm and dis are below the 1st quartile; age and rad are at their maximum; and zn and chas are at their minimum of 0.
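Instead of reading each value against the summary table by eye, the position of a row within every column can be computed with the empirical CDF. The sketch below uses the built-in `mtcars` data as a stand-in, and `percentile_rank()` is a hypothetical helper written for this example.

```r
# Hypothetical helper: for each column, the share of observations that are
# less than or equal to the value in the chosen row.
percentile_rank <- function(df, row) {
  sapply(names(df), function(v) ecdf(df[[v]])(row[[v]]))
}

# Stand-in data: pick the row with the lowest mpg (first one if there are ties).
worst <- mtcars[which.min(mtcars$mpg), ]
ranks <- percentile_rank(mtcars, worst)
print(round(ranks, 2))
```

A rank near 1 means the value is at or near the column maximum. A value sitting at the column minimum gets the share of observations tied at that minimum, which is close to 0 unless many rows share it.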

dim(subset(boston, rm > 7))
## [1] 64 13
dim(subset(boston, rm > 8))
## [1] 13 13

64 suburbs average more than 7 rooms per dwelling, and 13 average more than 8.

summary((subset(boston, rm > 8)))
##       crim               zn            indus             chas       
##  Min.   :0.02009   Min.   : 0.00   Min.   : 2.680   Min.   :0.0000  
##  1st Qu.:0.33147   1st Qu.: 0.00   1st Qu.: 3.970   1st Qu.:0.0000  
##  Median :0.52014   Median : 0.00   Median : 6.200   Median :0.0000  
##  Mean   :0.71879   Mean   :13.62   Mean   : 7.078   Mean   :0.1538  
##  3rd Qu.:0.57834   3rd Qu.:20.00   3rd Qu.: 6.200   3rd Qu.:0.0000  
##  Max.   :3.47428   Max.   :95.00   Max.   :19.580   Max.   :1.0000  
##       nox               rm             age             dis       
##  Min.   :0.4161   Min.   :8.034   Min.   : 8.40   Min.   :1.801  
##  1st Qu.:0.5040   1st Qu.:8.247   1st Qu.:70.40   1st Qu.:2.288  
##  Median :0.5070   Median :8.297   Median :78.30   Median :2.894  
##  Mean   :0.5392   Mean   :8.349   Mean   :71.54   Mean   :3.430  
##  3rd Qu.:0.6050   3rd Qu.:8.398   3rd Qu.:86.50   3rd Qu.:3.652  
##  Max.   :0.7180   Max.   :8.780   Max.   :93.90   Max.   :8.907  
##       rad              tax           ptratio          lstat           medv     
##  Min.   : 2.000   Min.   :224.0   Min.   :13.00   Min.   :2.47   Min.   :21.9  
##  1st Qu.: 5.000   1st Qu.:264.0   1st Qu.:14.70   1st Qu.:3.32   1st Qu.:41.7  
##  Median : 7.000   Median :307.0   Median :17.40   Median :4.14   Median :48.3  
##  Mean   : 7.462   Mean   :325.1   Mean   :16.36   Mean   :4.31   Mean   :44.2  
##  3rd Qu.: 8.000   3rd Qu.:307.0   3rd Qu.:17.40   3rd Qu.:5.12   3rd Qu.:50.0  
##  Max.   :24.000   Max.   :666.0   Max.   :20.20   Max.   :7.44   Max.   :50.0

Crime is very low in suburbs averaging more than 8 rooms per dwelling: 75% of them have a per capita crime rate below 1. Their medv ranges from 21.9 to 50, with a mean of 44.2.