library(ggplot2)
library(rmarkdown)
library(caret)
library(dplyr)
library(tidyr)
Reading in the data for West Roxbury and Boston Housing:
housing.df <- read.csv("WestRoxbury.csv")
boston.df <- read.csv("BostonHousing.csv")
head(housing.df,9)
Create a titled and labeled scatter plot:
sp <- ggplot(data = housing.df, aes(x = GROSS.AREA, y = TOTAL.VALUE, color = REMODEL)) +
  geom_point() +
  labs(
    subtitle = "Total home value seems to be determined by gross area, irrespective of remodelling status.",
    caption = "West Roxbury Dataset") +
  ggtitle("Newer, or Larger? Which Aspect Counts More to Housing Prices in W.Roxbury?") +
  xlab("Gross Area") +
  ylab("Total Home Value")
sp + theme(panel.background = element_rect(fill = "linen"))
Create a boxplot showing variation in a target variable by some categorical variable (factor):
housing.df$ROOMS <- factor(housing.df$ROOMS)
bp <- ggplot(housing.df) +
  geom_boxplot(aes(x = ROOMS, y = LIVING.AREA)) +
  xlab("Number of Rooms") +
  ylab("Household Living Area") +
  ggtitle("Living Area per Number of Rooms in West Roxbury") +
  labs(
    subtitle = "The median living area increases with the # of rooms up to 12 rooms, and then plateaus.",
    caption = "West Roxbury Dataset")
bp + theme(panel.background = element_rect(fill = "linen"))
Fit your best regression to the Boston Housing data with MEDV as the target variable.
As the prompt suggested, the analysis began with a model which contained all predictors from the Boston dataset:
# I started by regressing all predictors:
mb <- lm( MEDV ~ ., data = boston.df)
summary(mb)
##
## Call:
## lm(formula = MEDV ~ ., data = boston.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -9.8156 -1.9975 -0.2335 1.6757 16.0932
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 42.954458 3.816870 11.254 < 2e-16 ***
## CRIM -0.129678 0.025517 -5.082 5.32e-07 ***
## ZN -0.005113 0.011103 -0.460 0.645396
## INDUS 0.114290 0.048362 2.363 0.018506 *
## CHAS 2.359846 0.673138 3.506 0.000497 ***
## NOX -15.362403 2.983384 -5.149 3.79e-07 ***
## RM 1.058350 0.354782 2.983 0.002995 **
## AGE -0.006162 0.010319 -0.597 0.550689
## DIS -0.733482 0.161312 -4.547 6.86e-06 ***
## RAD 0.205249 0.051933 3.952 8.88e-05 ***
## TAX -0.009369 0.002944 -3.182 0.001554 **
## PTRATIO -0.558002 0.104307 -5.350 1.35e-07 ***
## LSTAT -0.478377 0.039373 -12.150 < 2e-16 ***
## CAT..MEDV 11.813994 0.647596 18.243 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 3.709 on 492 degrees of freedom
## Multiple R-squared: 0.8415, Adjusted R-squared: 0.8373
## F-statistic: 200.9 on 13 and 492 DF, p-value: < 2.2e-16
Considering this regression, neither ZN nor AGE is a statistically significant predictor of MEDV. Since both essentially function as noise, it was decided to exclude them from the model:
# Subset the original data by column name, excluding ZN and AGE.
# Note: this selection also drops CAT..MEDV, matching the original index-based selection.
boston.sub <- subset(boston.df, select = -c(ZN, AGE, CAT..MEDV))
# Next, regress the new subset:
msub <- lm( MEDV ~ ., data = boston.sub)
summary(msub)
##
## Call:
## lm(formula = MEDV ~ ., data = boston.sub)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.4802 -2.8870 -0.7469 1.9880 26.9137
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 42.133053 4.967853 8.481 2.59e-16 ***
## CRIM -0.111672 0.033186 -3.365 0.000825 ***
## INDUS -0.009089 0.062373 -0.146 0.884197
## CHAS 2.878963 0.876962 3.283 0.001100 **
## NOX -19.378105 3.743070 -5.177 3.28e-07 ***
## RM 3.868604 0.411732 9.396 < 2e-16 ***
## DIS -1.206625 0.172645 -6.989 8.99e-12 ***
## RAD 0.265822 0.066998 3.968 8.33e-05 ***
## TAX -0.009815 0.003741 -2.624 0.008961 **
## PTRATIO -1.078801 0.125852 -8.572 < 2e-16 ***
## LSTAT -0.545408 0.048165 -11.324 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 4.844 on 495 degrees of freedom
## Multiple R-squared: 0.7281, Adjusted R-squared: 0.7226
## F-statistic: 132.6 on 10 and 495 DF, p-value: < 2.2e-16
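Regressing on the subset yielded a lower R-squared (0.73 versus 0.84). Note that the subset omits CAT..MEDV as well as AGE and ZN, and CAT..MEDV was the strongest predictor in the original model (t = 18.2), so most of the drop likely reflects its removal rather than the removal of AGE and ZN. In the current model, the INDUS predictor becomes statistically insignificant.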
The final regression excludes this predictor as well:
# Create a subset of the boston.sub data set, excluding INDUS.
# Note: this selection also drops LSTAT, matching the original index-based selection.
boston.sub2 <- subset(boston.sub, select = -c(INDUS, LSTAT))
# Now regressing:
msub2 <- lm( MEDV ~ ., data = boston.sub2)
summary(msub2)
##
## Call:
## lm(formula = MEDV ~ ., data = boston.sub2)
##
## Residuals:
## Min 1Q Median 3Q Max
## -15.471 -2.959 -0.613 1.984 37.049
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 27.666569 5.366424 5.155 3.66e-07 ***
## CRIM -0.180275 0.036542 -4.933 1.10e-06 ***
## CHAS 3.310918 0.977003 3.389 0.000758 ***
## NOX -28.545936 3.920340 -7.281 1.30e-12 ***
## RM 6.278281 0.393085 15.972 < 2e-16 ***
## DIS -1.021003 0.183439 -5.566 4.27e-08 ***
## RAD 0.270489 0.072313 3.741 0.000205 ***
## TAX -0.011664 0.003785 -3.082 0.002171 **
## PTRATIO -1.207237 0.138362 -8.725 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 5.43 on 497 degrees of freedom
## Multiple R-squared: 0.657, Adjusted R-squared: 0.6515
## F-statistic: 119 on 8 and 497 DF, p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(msub2)
In the final regression, all predictors are statistically significant. At the same time, the adjusted R-squared is substantially lower than in the original regression, standing at 0.65 as opposed to the original 0.84. Part of this drop comes from LSTAT, which the column selection removed together with INDUS even though it was highly significant in the previous model. The model is nevertheless considered more reliable than the previous ones, since all predictors demonstrate statistical significance.
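Whether dropping predictors costs explanatory power can be checked with a partial F-test between nested models, alongside the adjusted R-squared. The sketch below is a minimal illustration on simulated data, not on BostonHousing.csv; the names `sim`, `x1`, `x2`, `noise`, `full`, `reduced` and `too_small` are hypothetical.

```r
# Simulated data (an assumption for illustration; not the Boston data set).
set.seed(42)
n <- 200
sim <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  noise = rnorm(n)  # unrelated to y by construction
)
sim$y <- 3 + 2 * sim$x1 - 1.5 * sim$x2 + rnorm(n)

full      <- lm(y ~ x1 + x2 + noise, data = sim)
reduced   <- lm(y ~ x1 + x2, data = sim)  # drops only the noise predictor
too_small <- lm(y ~ x1, data = sim)       # also drops a real predictor

# Partial F-tests: a small p-value means the dropped terms carried real signal.
anova(reduced, full)
anova(too_small, full)

# Adjusted R-squared penalises extra predictors, so it can be compared across models.
sapply(list(full = full, reduced = reduced, too_small = too_small),
       function(m) summary(m)$adj.r.squared)
```

Applied to the models in this report, the analogous call would presumably be `anova(msub2, mb)`, since both are fitted to the same rows and the predictors of `msub2` are a subset of those in `mb`.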
mr <- lm(TOTAL.VALUE ~ LOT.SQFT + YR.BUILT + GROSS.AREA + LIVING.AREA + FLOORS + FULL.BATH + HALF.BATH + KITCHEN + FIREPLACE + REMODEL , data = housing.df)
summary(mr)
##
## Call:
## lm(formula = TOTAL.VALUE ~ LOT.SQFT + YR.BUILT + GROSS.AREA +
## LIVING.AREA + FLOORS + FULL.BATH + HALF.BATH + KITCHEN +
## FIREPLACE + REMODEL, data = housing.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -268.010 -26.138 -0.054 25.052 292.966
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.314e+01 3.385e+01 -1.865 0.062180 .
## LOT.SQFT 8.530e-03 2.391e-04 35.680 < 2e-16 ***
## YR.BUILT 5.929e-02 1.691e-02 3.506 0.000458 ***
## GROSS.AREA 3.159e-02 1.618e-03 19.529 < 2e-16 ***
## LIVING.AREA 5.207e-02 2.916e-03 17.853 < 2e-16 ***
## FLOORS 4.031e+01 1.641e+00 24.570 < 2e-16 ***
## FULL.BATH 1.976e+01 1.314e+00 15.038 < 2e-16 ***
## HALF.BATH 1.890e+01 1.211e+00 15.611 < 2e-16 ***
## KITCHEN -1.439e+01 4.773e+00 -3.015 0.002581 **
## FIREPLACE 1.901e+01 1.056e+00 18.002 < 2e-16 ***
## REMODELOld 4.208e+00 1.909e+00 2.205 0.027528 *
## REMODELRecent 2.512e+01 1.647e+00 15.250 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 42.88 on 5790 degrees of freedom
## Multiple R-squared: 0.8135, Adjusted R-squared: 0.8131
## F-statistic: 2295 on 11 and 5790 DF, p-value: < 2.2e-16
par(mfrow=c(2,2))
plot(mr)
In this case, choosing the best model was more straightforward; the TAX predictor was simply excluded from the model, enabling the rest of the predictors to demonstrate statistical significance. The R-squared is satisfying, at 0.813.
Explain why one of your regression models should exclude TAX from the predictors.
The West Roxbury regression model should exclude TAX from the predictors because TAX is almost a direct function of TOTAL.VALUE, which makes the other predictors statistically insignificant. Consider the regression of TOTAL.VALUE on TAX alone, where the R-squared equals 1:
mtax <- lm(TOTAL.VALUE ~ TAX,data = housing.df)
summary(mtax)
##
## Call:
## lm(formula = TOTAL.VALUE ~ TAX, data = housing.df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.04094 -0.01978 0.00017 0.01978 0.03953
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 3.892e-02 1.217e-03 31.97 <2e-16 ***
## TAX 7.949e-02 2.390e-07 332651.40 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.02271 on 5800 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 1.107e+11 on 1 and 5800 DF, p-value: < 2.2e-16
The TAX predictor makes most other variables statistically insignificant. In part, it may account for them; that is, the tax on a house is often determined by the size of the living area, the lot size, the house's gross area, etc. This dependency might create a strong multicollinearity problem, rendering other predictors statistically insignificant compared to TAX. Once TAX is removed from the regression, most other variables demonstrate a high degree of statistical significance (three stars). The regression in part 5 excludes TAX and demonstrates the impact of its exclusion from the model. A model without the TAX predictor is much more reliable.
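The effect described above can be reproduced on simulated data. The sketch below does not use WestRoxbury.csv; the variables `area`, `value` and `tax`, and the multiplier 12.5, are assumptions made only for illustration. It shows a tax variable that is nearly a fixed multiple of the home value, the resulting R-squared close to 1, and a variance inflation factor computed by hand.

```r
# Simulated data (an assumption for illustration; not WestRoxbury.csv).
set.seed(1)
n <- 500
area  <- runif(n, 1000, 4000)                 # hypothetical living area
value <- 50 + 0.1 * area + rnorm(n, sd = 20)  # hypothetical home value
tax   <- 12.5 * value + runif(n, -0.5, 0.5)   # tax is a near-exact multiple of value
sim <- data.frame(value, area, tax)

# Regressing value on tax alone reproduces it almost perfectly.
r2_tax <- summary(lm(value ~ tax, data = sim))$r.squared

# Coefficient tables with and without tax in the model.
with_tax    <- summary(lm(value ~ area + tax, data = sim))$coefficients
without_tax <- summary(lm(value ~ area, data = sim))$coefficients

# Variance inflation factor for area, computed by hand as 1 / (1 - R^2)
# from regressing area on the other predictor.
vif_area <- 1 / (1 - summary(lm(area ~ tax, data = sim))$r.squared)

r2_tax
with_tax["area", "t value"]     # typically collapses once tax is included
without_tax["area", "t value"]  # large when tax is excluded
vif_area
```

Because the tax variable already encodes the target, keeping it in the model leaves nothing for the other predictors to explain, which is the pattern the West Roxbury output suggests.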
Consider the following table of graphs, which is based on data from West Roxbury:
housing.df %>%
gather(-TOTAL.VALUE, -LIVING.AREA, key = "var", value = "Predictors") %>%
ggplot(aes(x = Predictors, y = TOTAL.VALUE, color = LIVING.AREA)) +
geom_point() +
facet_wrap(~ var, scales = "free") +
theme_bw()
## Warning: attributes are not identical across measure variables;
## they will be dropped
The table of graphs above plots the relationship between the total value of the property and some predictors. From it, I drew the following estimations:
It is evident that the total value of the property is positively correlated with the number of bedrooms.
The number of fireplaces appears less significant, with total value varying across all 4 values.
The number of floors shows similar variation, with properties of higher value having 2 rather than 2.5 or 3 floors.
The number of full baths shows a similar pattern as floors, though with less variation towards 4 and 5 full baths, which demonstrate higher values (though the highest value properties have 3 full baths).
The gross area of the property seems roughly positively correlated with the value of the property (which makes intuitive sense).
Half baths do not seem particularly correlated with the value of the property, nor do the number of kitchens or the lot size.
Indeed, the scatter plot even suggests that some houses with smaller lot size have higher value compared to houses with larger lots.
Surprisingly, the remodel status of the house appears to be insignificant in terms of its value, as well as the year in which the property was built.
Out of the predictors inspected thus far from the West Roxbury data set, the number of bedrooms, gross area and living area appear to be the most meaningful in terms of determining property value.
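Visual impressions from faceted scatter plots can be cross-checked by ranking predictors on their correlation with the target. The sketch below uses simulated data with hypothetical column names (`living_area`, `bedrooms`, `fireplaces`, `total_value`), not the West Roxbury file, and only measures linear association.

```r
# Simulated data (an assumption for illustration; not WestRoxbury.csv).
set.seed(7)
n <- 300
sim <- data.frame(
  living_area = runif(n, 800, 3500),
  bedrooms    = sample(1:6, n, replace = TRUE),
  fireplaces  = sample(0:3, n, replace = TRUE)  # unrelated to value by construction
)
sim$total_value <- 100 + 0.08 * sim$living_area + 15 * sim$bedrooms +
  rnorm(n, sd = 30)

# Pearson correlation of each numeric predictor with the target.
predictors <- setdiff(names(sim), "total_value")
cors <- sapply(sim[predictors], function(x) cor(x, sim$total_value))

# Rank predictors by the strength of their linear association.
sort(abs(cors), decreasing = TRUE)
```

On real data, categorical columns such as REMODEL would need to be excluded or encoded first, since `cor()` requires numeric input.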
The analysis now proceeds to the Boston data set.
It considers MEDV, the median value of owner-occupied homes, to represent property value, with regard to question 7.
Consider the following table of graphs:
boston.df %>%
gather(-MEDV, -RM, key = "var", value = "Predictors") %>%
ggplot(aes(x = Predictors, y = MEDV, color = RM)) +
geom_point() +
facet_wrap(~ var, scales = "free") +
theme_bw()
The table of graphs above suggests most evidently that:
There is a negative, roughly logarithmic relationship between the percentage of lower-status population and the median value of owner-occupied homes.
The crime rate increases as the median value of owner-occupied homes decreases, though with substantial variation across low-crime areas.
MEDV appears to be negatively correlated with the concentration of nitric oxides (parts per 10 million), though the data varies across concentration levels.
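A logarithmic relationship of the kind described for LSTAT can be checked by comparing a linear fit with a fit on the log-transformed predictor. The sketch below is a minimal illustration on simulated data, not BostonHousing.csv; the variables `lstat` and `medv` and the coefficients used to generate them are assumptions.

```r
# Simulated data (an assumption for illustration; not the Boston data set).
set.seed(3)
n <- 400
lstat <- runif(n, 2, 38)                            # hypothetical % lower-status population
medv  <- 52 - 12.5 * log(lstat) + rnorm(n, sd = 3)  # hypothetical median home value
sim <- data.frame(lstat, medv)

# Fit the same response against the raw and the log-transformed predictor.
linear_fit <- lm(medv ~ lstat, data = sim)
log_fit    <- lm(medv ~ log(lstat), data = sim)

# A clearly higher R-squared for the log model supports a logarithmic shape.
c(linear = summary(linear_fit)$r.squared,
  log    = summary(log_fit)$r.squared)
```

With the report's data, the corresponding comparison would presumably be between `lm(MEDV ~ LSTAT, data = boston.df)` and `lm(MEDV ~ log(LSTAT), data = boston.df)`.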
This report was made by Noam Maman for BUS 212: Analyzing Big Data II Fall 2021 Brandeis University - International Business School