Problem 1

I chose to use X4 and Y4

1.1

1.1.a

\[P(X > x \mid Y > y) = \frac{P(X > x, Y > y)}{P(Y > y)}\] The above reads as the probability that an X value is greater than the 3rd quartile of the X values, given that the Y value is greater than the 1st quartile of the Y values.

## [1] 0.2

This means that when Y is greater than the 1st quartile of Y, there is a 20% chance that X is greater than the 3rd quartile of X.

1.1.b

\(P(X > x, Y > y)\)

The above reads as the joint probability that X is greater than the 3rd quartile of X and Y is greater than the 1st quartile of Y.

## [1] 0.15

The probability of both of these events occurring is 15%.

1.1.c

\[P(X < x \mid Y > y) = \frac{P(X < x, Y > y)}{P(Y > y)}\] The above reads as the probability that X is less than the 3rd quartile of X, given that Y is greater than the 1st quartile of Y.

## [1] 0.8

This means that when Y is greater than the 1st quartile of Y, there is an 80% probability that X is less than the 3rd quartile of X.

I am adding the result below in case the probability was supposed to read "less than or equal to": \(P(X \leq x \mid Y > y)\)

## [1] 0.8

There is also an 80% chance of this occurring.

1.2

##     X/Y X<=x  X>x Total
## 1  Y<=y 0.15 0.10  0.25
## 2   Y>y 0.60 0.15  0.75
## 3 Total 0.75 0.25  1.00
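
The three probabilities in 1.1 can be read directly off the joint table above. The following is a minimal Python sketch of that arithmetic (the report's own output came from R, and its code is not shown). The dictionary `joint` and the helper names are illustrative assumptions; the cell values are copied from the table.

```python
# Joint probabilities copied from the table in 1.2.
# Keys are (row label, column label); the names are illustrative.
joint = {
    ("Y<=y", "X<=x"): 0.15, ("Y<=y", "X>x"): 0.10,
    ("Y>y", "X<=x"): 0.60, ("Y>y", "X>x"): 0.15,
}

def marginal_y(row):
    """P(Y in row): sum the joint probabilities across that row."""
    return sum(p for (r, _), p in joint.items() if r == row)

def conditional(col, row):
    """P(X in col | Y in row) = P(X in col, Y in row) / P(Y in row)."""
    return joint[(row, col)] / marginal_y(row)

p_a = conditional("X>x", "Y>y")    # 1.1.a: conditional probability
p_b = joint[("Y>y", "X>x")]        # 1.1.b: joint probability
p_c = conditional("X<=x", "Y>y")   # 1.1.c: conditional probability

print(round(p_a, 4), round(p_b, 4), round(p_c, 4))
```

Rounded, these agree with the 0.2, 0.15, and 0.8 reported in 1.1.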

1.3

## 
##  Pearson's Chi-squared test with Yates' continuity correction
## 
## data:  test
## X-squared = 0.8, df = 1, p-value = 0.3711

The chi-square test's p-value of 0.3711 is above the 0.05 significance level, so we fail to reject the null hypothesis that X and Y are independent. The data do not provide evidence of an association between X and Y.
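
For reference, here is a minimal Python sketch of a chi-square test of independence on a 2x2 table. The report does not state the sample size or the counts passed to the test, so the counts used here (the proportions from 1.2 scaled to an assumed n = 20) are hypothetical, and `chi_square_2x2` is an illustrative helper rather than the R routine that produced the output above.

```python
import math

def chi_square_2x2(table, yates=False):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With one degree of freedom the
    upper-tail p-value equals erfc(sqrt(statistic / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(table[i][j] - expected)
            if yates:
                # Continuity correction, capped so it never exceeds the difference
                diff -= min(0.5, diff)
            stat += diff ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: rows are Y<=y, Y>y; columns are X<=x, X>x (assumed n = 20)
counts = [[3, 2], [12, 3]]
stat, p_value = chi_square_2x2(counts)
print(f"X-squared = {stat:.4f}, p-value = {p_value:.4f}")
```

A p-value above 0.05 means the null hypothesis of independence is not rejected. With these assumed counts, the uncorrected statistic comes to 0.8; passing `yates=True` gives a smaller statistic.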

Problem 2

## Parsed with column specification:
## cols(
##   .default = col_character(),
##   Id = col_integer(),
##   MSSubClass = col_integer(),
##   LotFrontage = col_integer(),
##   LotArea = col_integer(),
##   OverallQual = col_integer(),
##   OverallCond = col_integer(),
##   YearBuilt = col_integer(),
##   YearRemodAdd = col_integer(),
##   MasVnrArea = col_integer(),
##   BsmtFinSF1 = col_integer(),
##   BsmtFinSF2 = col_integer(),
##   BsmtUnfSF = col_integer(),
##   TotalBsmtSF = col_integer(),
##   `1stFlrSF` = col_integer(),
##   `2ndFlrSF` = col_integer(),
##   LowQualFinSF = col_integer(),
##   GrLivArea = col_integer(),
##   BsmtFullBath = col_integer(),
##   BsmtHalfBath = col_integer(),
##   FullBath = col_integer()
##   # ... with 18 more columns
## )
## See spec(...) for full column specifications.
## Parsed with column specification:
## cols(
##   .default = col_character(),
##   Id = col_integer(),
##   MSSubClass = col_integer(),
##   LotFrontage = col_integer(),
##   LotArea = col_integer(),
##   OverallQual = col_integer(),
##   OverallCond = col_integer(),
##   YearBuilt = col_integer(),
##   YearRemodAdd = col_integer(),
##   MasVnrArea = col_integer(),
##   BsmtFinSF1 = col_integer(),
##   BsmtFinSF2 = col_integer(),
##   BsmtUnfSF = col_integer(),
##   TotalBsmtSF = col_integer(),
##   `1stFlrSF` = col_integer(),
##   `2ndFlrSF` = col_integer(),
##   LowQualFinSF = col_integer(),
##   GrLivArea = col_integer(),
##   BsmtFullBath = col_integer(),
##   BsmtHalfBath = col_integer(),
##   FullBath = col_integer()
##   # ... with 17 more columns
## )
## See spec(...) for full column specifications.

2a

EDA

Correlation

OverallQual, GrLivArea, and GarageCars are the variables most highly correlated with SalePrice.

Correlation Matrix

##             SalePrice OverallQual GrLivArea
## SalePrice   1.0000000   0.7909816 0.7086245
## OverallQual 0.7909816   1.0000000 0.5930074
## GrLivArea   0.7086245   0.5930074 1.0000000

Scatterplot Matrix

Hypothesis Test

GrLivArea & SalePrice
## 
##  Pearson's product-moment correlation
## 
## data:  train$GrLivArea and train$SalePrice
## t = 38.348, df = 1458, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 80 percent confidence interval:
##  0.6915087 0.7249450
## sample estimates:
##       cor 
## 0.7086245
  • The estimated correlation between GrLivArea and SalePrice is ~0.7086. The p-value is very small (< 2.2e-16), so we reject the null hypothesis that the true correlation is zero.
  • The 80% confidence interval for the correlation between GrLivArea and SalePrice is ~0.6915 to 0.7249.
GarageCars & SalePrice
## 
##  Pearson's product-moment correlation
## 
## data:  train$GarageCars and train$SalePrice
## t = 31.839, df = 1458, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 80 percent confidence interval:
##  0.6201771 0.6597899
## sample estimates:
##       cor 
## 0.6404092
  • The estimated correlation between GarageCars and SalePrice is ~0.6404. The p-value is very small (< 2.2e-16), so we reject the null hypothesis that the true correlation is zero.
  • The 80% confidence interval for the correlation between GarageCars and SalePrice is ~0.6202 to 0.6598.
OverallQual & SalePrice
## 
##  Pearson's product-moment correlation
## 
## data:  train$OverallQual and train$SalePrice
## t = 49.364, df = 1458, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 80 percent confidence interval:
##  0.7780752 0.8032204
## sample estimates:
##       cor 
## 0.7909816
  • The estimated correlation between OverallQual and SalePrice is ~0.7910. The p-value is very small (< 2.2e-16), so we reject the null hypothesis that the true correlation is zero.
  • The 80% confidence interval for the correlation between OverallQual and SalePrice is ~0.7781 to 0.8032.
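
The t statistic and the 80% confidence interval in the output above can be rebuilt from the correlation and the sample size alone. A minimal Python sketch follows, using the Fisher z transformation for the interval. `correlation_summary` is an illustrative helper, and n = 1460 is inferred from df = 1458.

```python
import math
from statistics import NormalDist

def correlation_summary(r, n, conf_level=0.80):
    """t statistic and Fisher z confidence interval for a Pearson correlation."""
    df = n - 2
    t_stat = r * math.sqrt(df / (1 - r ** 2))
    z = math.atanh(r)                  # Fisher transformation of r
    se = 1 / math.sqrt(n - 3)          # approximate standard error of z
    z_crit = NormalDist().inv_cdf(0.5 + conf_level / 2)
    lower = math.tanh(z - z_crit * se)  # transform the bounds back to the r scale
    upper = math.tanh(z + z_crit * se)
    return t_stat, (lower, upper)

# GrLivArea vs SalePrice: r from the output above, n = df + 2
t_stat, (lower, upper) = correlation_summary(0.7086245, 1460)
print(f"t = {t_stat:.3f}, 80% CI = ({lower:.4f}, {upper:.4f})")
```

The same call with r = 0.6404092 or r = 0.7909816 covers the GarageCars and OverallQual tests.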

Family-Wise Error

## [1] 0.488

I am worried about family-wise error: the FWER is 0.488, which is close to a 50% chance of at least one false positive across the three tests.
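
The 0.488 figure is consistent with three independent tests each run at a 0.20 level, which matches the 80% confidence level used above. A minimal Python sketch, assuming that per-test level; the Bonferroni adjustment is shown only as one common remedy, not as something the report applies.

```python
def family_wise_error_rate(alpha, m):
    """Probability of at least one false positive across m independent tests."""
    return 1 - (1 - alpha) ** m

alpha = 0.20   # assumed per-test level, matching the 80% confidence intervals
m = 3          # three correlation tests
fwer = family_wise_error_rate(alpha, m)

# Bonferroni: test each hypothesis at alpha / m to keep the FWER at or below alpha
bonferroni_alpha = alpha / m

print(round(fwer, 3), round(bonferroni_alpha, 4))
```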

2b

Invert Correlation Matrix

##             SalePrice OverallQual GrLivArea
## SalePrice   1.0000000   0.7909816 0.7086245
## OverallQual 0.7909816   1.0000000 0.5930074
## GrLivArea   0.7086245   0.5930074 1.0000000

Multiply by Precision Matrix

##             SalePrice OverallQual GrLivArea
## SalePrice    2.127801    2.002183  1.886307
## OverallQual  2.002183    1.977310  1.746524
## GrLivArea    1.886307    1.746524  1.853806
##             SalePrice OverallQual GrLivArea
## SalePrice    2.127801    2.002183  1.886307
## OverallQual  2.002183    1.977310  1.746524
## GrLivArea    1.886307    1.746524  1.853806
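
One check worth running here: a correlation matrix multiplied by its inverse should give the identity matrix. The matrix printed under "Invert Correlation Matrix" is identical to the correlation matrix, and the product printed above matches the correlation matrix multiplied by itself (for example, 1 + 0.7909816² + 0.7086245² ≈ 2.1278), so the inversion step may not have been applied. A minimal Python sketch of the intended computation, assuming Gauss-Jordan elimination as the inversion method (the report's R code is not shown; `invert` and `matmul` are illustrative helpers):

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Correlation matrix from the output above (SalePrice, OverallQual, GrLivArea)
corr = [
    [1.0000000, 0.7909816, 0.7086245],
    [0.7909816, 1.0000000, 0.5930074],
    [0.7086245, 0.5930074, 1.0000000],
]

precision = invert(corr)            # the precision matrix
product = matmul(corr, precision)   # should be close to the identity matrix
squared = matmul(corr, corr)        # for comparison with the printed product

for row in product:
    print([round(v, 6) for v in row])
```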

LU Decomposition

## $L
##             SalePrice OverallQual GrLivArea
## SalePrice   1.0000000   0.0000000         0
## OverallQual 0.9409636   1.0000000         0
## GrLivArea   0.8865055  -0.3045399         1
## 
## $U
##             SalePrice OverallQual  GrLivArea
## SalePrice    2.127801  2.00218278  1.8863069
## OverallQual  0.000000  0.09332866 -0.0284223
## GrLivArea    0.000000  0.00000000  0.1729292
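
The printed L has a unit diagonal, which is consistent with a Doolittle-style factorization. A minimal Python sketch of that decomposition, applied to the matrix printed under "Multiply by Precision Matrix"; `lu_decompose` is an illustrative helper that assumes no pivoting is needed (all pivots are nonzero), not the R function used in the report.

```python
def lu_decompose(a):
    """Doolittle LU decomposition without pivoting: a = L * U, unit diagonal on L."""
    n = len(a)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    upper = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Row i of U
        for j in range(i, n):
            upper[i][j] = a[i][j] - sum(lower[i][k] * upper[k][j] for k in range(i))
        # Column i of L, below the diagonal
        for j in range(i + 1, n):
            partial = sum(lower[j][k] * upper[k][i] for k in range(i))
            lower[j][i] = (a[j][i] - partial) / upper[i][i]
    return lower, upper

# Matrix copied from the output above
a = [
    [2.127801, 2.002183, 1.886307],
    [2.002183, 1.977310, 1.746524],
    [1.886307, 1.746524, 1.853806],
]

lower, upper = lu_decompose(a)
for row in lower:
    print([round(v, 7) for v in row])
for row in upper:
    print([round(v, 7) for v in row])
```

Because the input is rounded to seven significant digits, the last digits can differ slightly from the printed L and U.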

2c

## [1] 1300
## [1] 1e-09

Fitdistr

##         rate 
## 0.0001084972

CDF 5, 95%

##     rate 
## 472.7615
##     rate 
## 27611.15

Empirical 5, 95%

##     5% 
## 2011.7
##      95% 
## 16101.15
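
The fitted percentiles follow from the exponential inverse CDF, \(F^{-1}(p) = -\ln(1 - p)/\lambda\), where the maximum-likelihood rate \(\lambda\) is one over the sample mean. A minimal Python sketch using the rate reported above; the function names are illustrative assumptions.

```python
import math

def fit_exponential_rate(values):
    """Maximum-likelihood rate of an exponential distribution: 1 / sample mean."""
    return len(values) / sum(values)

def exponential_quantile(p, rate):
    """Inverse CDF of the exponential distribution: -ln(1 - p) / rate."""
    return -math.log(1 - p) / rate

rate = 0.0001084972   # fitted rate reported above
q05 = exponential_quantile(0.05, rate)
q95 = exponential_quantile(0.95, rate)
print(f"5%: {q05:.2f}, 95%: {q95:.2f}")
```

These can then be set against the empirical 5% and 95% values (2011.7 and 16101.15) to judge the fit.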

Confidence Interval

## [1] 8704.418 9729.238

Summary Table

The exponential distribution does not estimate the actual data well. Its 5th and 95th percentiles (about 473 and 27,611) are far from the empirical percentiles (about 2,012 and 16,101). The bootstrapped confidence interval is a better estimator, although still far from perfect.
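
For readers unfamiliar with the technique, a percentile bootstrap interval for a mean can be sketched as follows. This is a minimal Python illustration on synthetic data, not the report's computation: the 95% level, the number of resamples, the seed, and the stand-in sample are all assumptions, since the report does not state how its interval was produced.

```python
import random
import statistics

def bootstrap_mean_ci(values, n_resamples=2000, conf_level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    tail = (1 - conf_level) / 2
    lower = means[int(tail * n_resamples)]
    upper = means[int((1 - tail) * n_resamples) - 1]
    return lower, upper

# Synthetic stand-in data (hypothetical): 500 draws from an exponential distribution
data_rng = random.Random(42)
sample = [data_rng.expovariate(1 / 9000) for _ in range(500)]

lower, upper = bootstrap_mean_ci(sample)
print(f"mean = {statistics.fmean(sample):.1f}, 95% CI = ({lower:.1f}, {upper:.1f})")
```

Note that an interval of this kind describes uncertainty in the mean, not the spread of individual values, so it is not directly comparable to the 5th and 95th percentiles.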

2d

Clean Dataset

##   missForest iteration 1 in progress...done!
##   missForest iteration 2 in progress...done!
##   missForest iteration 3 in progress...done!
##   missForest iteration 1 in progress...done!
##   missForest iteration 2 in progress...done!
##   missForest iteration 3 in progress...done!
##   missForest iteration 4 in progress...done!

Regression Subsetting

## Reordering variables and trying again:
## Subset selection object
## Call: regsubsets.formula(SalePrice ~ ., data = newDfs[[1]], method = "exhaustive", 
##     nvmax = NULL, nbest = 1, really.big = T)
## 36 Variables  (and intercept)
##               Forced in Forced out
## MSSubClass        FALSE      FALSE
## LotFrontage       FALSE      FALSE
## LotArea           FALSE      FALSE
## OverallQual       FALSE      FALSE
## OverallCond       FALSE      FALSE
## YearBuilt         FALSE      FALSE
## YearRemodAdd      FALSE      FALSE
## MasVnrArea        FALSE      FALSE
## BsmtFinSF1        FALSE      FALSE
## BsmtFinSF2        FALSE      FALSE
## BsmtUnfSF         FALSE      FALSE
## `1stFlrSF`        FALSE      FALSE
## `2ndFlrSF`        FALSE      FALSE
## LowQualFinSF      FALSE      FALSE
## BsmtFullBath      FALSE      FALSE
## BsmtHalfBath      FALSE      FALSE
## FullBath          FALSE      FALSE
## HalfBath          FALSE      FALSE
## BedroomAbvGr      FALSE      FALSE
## KitchenAbvGr      FALSE      FALSE
## TotRmsAbvGrd      FALSE      FALSE
## Fireplaces        FALSE      FALSE
## GarageYrBlt       FALSE      FALSE
## GarageCars        FALSE      FALSE
## GarageArea        FALSE      FALSE
## WoodDeckSF        FALSE      FALSE
## OpenPorchSF       FALSE      FALSE
## EnclosedPorch     FALSE      FALSE
## `3SsnPorch`       FALSE      FALSE
## ScreenPorch       FALSE      FALSE
## PoolArea          FALSE      FALSE
## MiscVal           FALSE      FALSE
## MoSold            FALSE      FALSE
## YrSold            FALSE      FALSE
## TotalBsmtSF       FALSE      FALSE
## GrLivArea         FALSE      FALSE
## 1 subsets of each size up to 34
## Selection Algorithm: exhaustive
##           MSSubClass LotFrontage LotArea OverallQual OverallCond YearBuilt
## 1  ( 1 )  " "        " "         " "     "*"         " "         " "      
## 2  ( 1 )  " "        " "         " "     "*"         " "         " "      
## 3  ( 1 )  " "        " "         " "     "*"         " "         " "      
## 4  ( 1 )  " "        " "         " "     "*"         " "         " "      
## 5  ( 1 )  "*"        " "         " "     "*"         " "         " "      
## 6  ( 1 )  "*"        " "         " "     "*"         " "         "*"      
## 7  ( 1 )  "*"        " "         " "     "*"         " "         " "      
## 8  ( 1 )  "*"        " "         " "     "*"         "*"         "*"      
## 9  ( 1 )  "*"        " "         " "     "*"         "*"         "*"      
## 10  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 11  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 12  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 13  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 14  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 15  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 16  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 17  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 18  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 19  ( 1 ) "*"        " "         "*"     "*"         "*"         "*"      
## 20  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 21  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 22  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 23  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 24  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 25  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 26  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 27  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 28  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 29  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 30  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 31  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 32  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 33  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
## 34  ( 1 ) "*"        "*"         "*"     "*"         "*"         "*"      
##           YearRemodAdd MasVnrArea BsmtFinSF1 BsmtFinSF2 BsmtUnfSF
## 1  ( 1 )  " "          " "        " "        " "        " "      
## 2  ( 1 )  " "          " "        " "        " "        " "      
## 3  ( 1 )  " "          " "        "*"        " "        " "      
## 4  ( 1 )  " "          " "        "*"        " "        " "      
## 5  ( 1 )  " "          " "        "*"        " "        " "      
## 6  ( 1 )  " "          " "        "*"        " "        " "      
## 7  ( 1 )  "*"          "*"        "*"        " "        " "      
## 8  ( 1 )  " "          " "        "*"        " "        " "      
## 9  ( 1 )  " "          "*"        " "        " "        " "      
## 10  ( 1 ) " "          "*"        " "        " "        " "      
## 11  ( 1 ) " "          "*"        " "        " "        " "      
## 12  ( 1 ) " "          "*"        "*"        " "        " "      
## 13  ( 1 ) " "          "*"        "*"        " "        " "      
## 14  ( 1 ) " "          "*"        "*"        " "        " "      
## 15  ( 1 ) " "          "*"        "*"        " "        " "      
## 16  ( 1 ) "*"          "*"        "*"        " "        " "      
## 17  ( 1 ) "*"          "*"        "*"        " "        " "      
## 18  ( 1 ) "*"          "*"        "*"        " "        " "      
## 19  ( 1 ) "*"          "*"        "*"        " "        " "      
## 20  ( 1 ) "*"          "*"        "*"        " "        " "      
## 21  ( 1 ) "*"          "*"        "*"        " "        " "      
## 22  ( 1 ) "*"          "*"        "*"        " "        " "      
## 23  ( 1 ) "*"          "*"        "*"        " "        " "      
## 24  ( 1 ) "*"          "*"        "*"        " "        " "      
## 25  ( 1 ) "*"          "*"        "*"        " "        " "      
## 26  ( 1 ) "*"          "*"        "*"        " "        " "      
## 27  ( 1 ) "*"          "*"        "*"        " "        " "      
## 28  ( 1 ) "*"          "*"        "*"        " "        " "      
## 29  ( 1 ) "*"          "*"        "*"        " "        " "      
## 30  ( 1 ) "*"          "*"        " "        "*"        "*"      
## 31  ( 1 ) "*"          "*"        "*"        "*"        "*"      
## 32  ( 1 ) "*"          "*"        "*"        "*"        "*"      
## 33  ( 1 ) "*"          "*"        "*"        "*"        "*"      
## 34  ( 1 ) "*"          "*"        "*"        "*"        "*"      
##           TotalBsmtSF `1stFlrSF` `2ndFlrSF` LowQualFinSF GrLivArea
## 1  ( 1 )  " "         " "        " "        " "          " "      
## 2  ( 1 )  " "         " "        " "        " "          "*"      
## 3  ( 1 )  " "         " "        " "        " "          "*"      
## 4  ( 1 )  " "         " "        " "        " "          "*"      
## 5  ( 1 )  " "         " "        " "        " "          "*"      
## 6  ( 1 )  " "         " "        " "        " "          "*"      
## 7  ( 1 )  " "         " "        " "        " "          "*"      
## 8  ( 1 )  " "         " "        " "        " "          "*"      
## 9  ( 1 )  " "         " "        " "        " "          "*"      
## 10  ( 1 ) " "         " "        " "        " "          "*"      
## 11  ( 1 ) "*"         " "        " "        " "          "*"      
## 12  ( 1 ) " "         " "        " "        " "          "*"      
## 13  ( 1 ) " "         " "        " "        " "          "*"      
## 14  ( 1 ) " "         " "        " "        " "          "*"      
## 15  ( 1 ) "*"         " "        " "        " "          "*"      
## 16  ( 1 ) "*"         " "        " "        " "          "*"      
## 17  ( 1 ) "*"         " "        " "        " "          "*"      
## 18  ( 1 ) "*"         " "        " "        " "          "*"      
## 19  ( 1 ) "*"         " "        " "        " "          "*"      
## 20  ( 1 ) "*"         " "        " "        " "          "*"      
## 21  ( 1 ) "*"         " "        " "        " "          "*"      
## 22  ( 1 ) "*"         " "        " "        " "          "*"      
## 23  ( 1 ) "*"         " "        " "        "*"          "*"      
## 24  ( 1 ) "*"         " "        " "        "*"          "*"      
## 25  ( 1 ) "*"         " "        " "        "*"          "*"      
## 26  ( 1 ) "*"         " "        " "        "*"          "*"      
## 27  ( 1 ) "*"         " "        " "        "*"          "*"      
## 28  ( 1 ) "*"         " "        " "        "*"          "*"      
## 29  ( 1 ) "*"         " "        " "        "*"          "*"      
## 30  ( 1 ) "*"         " "        " "        "*"          "*"      
## 31  ( 1 ) " "         " "        " "        "*"          "*"      
## 32  ( 1 ) " "         " "        " "        "*"          "*"      
## 33  ( 1 ) " "         " "        " "        "*"          "*"      
## 34  ( 1 ) " "         "*"        "*"        "*"          " "      
##           BsmtFullBath BsmtHalfBath FullBath HalfBath BedroomAbvGr
## 1  ( 1 )  " "          " "          " "      " "      " "         
## 2  ( 1 )  " "          " "          " "      " "      " "         
## 3  ( 1 )  " "          " "          " "      " "      " "         
## 4  ( 1 )  " "          " "          " "      " "      " "         
## 5  ( 1 )  " "          " "          " "      " "      " "         
## 6  ( 1 )  " "          " "          " "      " "      " "         
## 7  ( 1 )  " "          " "          " "      " "      " "         
## 8  ( 1 )  " "          " "          " "      " "      "*"         
## 9  ( 1 )  "*"          " "          " "      " "      "*"         
## 10  ( 1 ) "*"          " "          " "      " "      "*"         
## 11  ( 1 ) "*"          " "          " "      " "      "*"         
## 12  ( 1 ) "*"          " "          " "      " "      "*"         
## 13  ( 1 ) "*"          " "          " "      " "      "*"         
## 14  ( 1 ) "*"          " "          " "      " "      "*"         
## 15  ( 1 ) "*"          " "          " "      " "      "*"         
## 16  ( 1 ) "*"          " "          " "      " "      "*"         
## 17  ( 1 ) "*"          " "          " "      " "      "*"         
## 18  ( 1 ) "*"          " "          " "      " "      "*"         
## 19  ( 1 ) "*"          " "          "*"      " "      "*"         
## 20  ( 1 ) "*"          " "          "*"      " "      "*"         
## 21  ( 1 ) "*"          " "          "*"      " "      "*"         
## 22  ( 1 ) "*"          " "          "*"      " "      "*"         
## 23  ( 1 ) "*"          " "          "*"      " "      "*"         
## 24  ( 1 ) "*"          " "          "*"      "*"      "*"         
## 25  ( 1 ) "*"          " "          "*"      "*"      "*"         
## 26  ( 1 ) "*"          " "          "*"      "*"      "*"         
## 27  ( 1 ) "*"          " "          "*"      "*"      "*"         
## 28  ( 1 ) "*"          "*"          "*"      "*"      "*"         
## 29  ( 1 ) "*"          "*"          "*"      "*"      "*"         
## 30  ( 1 ) "*"          "*"          "*"      "*"      "*"         
## 31  ( 1 ) "*"          "*"          "*"      "*"      "*"         
## 32  ( 1 ) "*"          "*"          "*"      "*"      "*"         
## 33  ( 1 ) "*"          "*"          "*"      "*"      "*"         
## 34  ( 1 ) "*"          "*"          "*"      "*"      "*"         
##           KitchenAbvGr TotRmsAbvGrd Fireplaces GarageYrBlt GarageCars
## 1  ( 1 )  " "          " "          " "        " "         " "       
## 2  ( 1 )  " "          " "          " "        " "         " "       
## 3  ( 1 )  " "          " "          " "        " "         " "       
## 4  ( 1 )  " "          " "          " "        " "         "*"       
## 5  ( 1 )  " "          " "          " "        " "         "*"       
## 6  ( 1 )  " "          " "          " "        " "         "*"       
## 7  ( 1 )  " "          " "          " "        " "         "*"       
## 8  ( 1 )  " "          " "          " "        " "         "*"       
## 9  ( 1 )  " "          " "          " "        " "         "*"       
## 10  ( 1 ) " "          " "          " "        " "         "*"       
## 11  ( 1 ) " "          " "          " "        " "         "*"       
## 12  ( 1 ) " "          "*"          " "        " "         "*"       
## 13  ( 1 ) " "          "*"          " "        " "         "*"       
## 14  ( 1 ) " "          "*"          " "        " "         "*"       
## 15  ( 1 ) " "          "*"          " "        " "         "*"       
## 16  ( 1 ) " "          "*"          " "        " "         "*"       
## 17  ( 1 ) "*"          "*"          " "        " "         "*"       
## 18  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 19  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 20  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 21  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 22  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 23  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 24  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 25  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 26  ( 1 ) "*"          "*"          "*"        " "         "*"       
## 27  ( 1 ) "*"          "*"          "*"        "*"         "*"       
## 28  ( 1 ) "*"          "*"          "*"        "*"         "*"       
## 29  ( 1 ) "*"          "*"          "*"        "*"         "*"       
## 30  ( 1 ) "*"          "*"          "*"        "*"         "*"       
## 31  ( 1 ) "*"          "*"          "*"        "*"         "*"       
## 32  ( 1 ) "*"          "*"          "*"        "*"         "*"       
## 33  ( 1 ) "*"          "*"          "*"        "*"         "*"       
## 34  ( 1 ) "*"          "*"          "*"        "*"         "*"       
##           GarageArea WoodDeckSF OpenPorchSF EnclosedPorch `3SsnPorch`
## 1  ( 1 )  " "        " "        " "         " "           " "        
## 2  ( 1 )  " "        " "        " "         " "           " "        
## 3  ( 1 )  " "        " "        " "         " "           " "        
## 4  ( 1 )  " "        " "        " "         " "           " "        
## 5  ( 1 )  " "        " "        " "         " "           " "        
## 6  ( 1 )  " "        " "        " "         " "           " "        
## 7  ( 1 )  " "        " "        " "         " "           " "        
## 8  ( 1 )  " "        " "        " "         " "           " "        
## 9  ( 1 )  " "        " "        " "         " "           " "        
## 10  ( 1 ) " "        " "        " "         " "           " "        
## 11  ( 1 ) " "        " "        " "         " "           " "        
## 12  ( 1 ) " "        " "        " "         " "           " "        
## 13  ( 1 ) " "        "*"        " "         " "           " "        
## 14  ( 1 ) " "        "*"        " "         " "           " "        
## 15  ( 1 ) " "        "*"        " "         " "           " "        
## 16  ( 1 ) " "        "*"        " "         " "           " "        
## 17  ( 1 ) " "        "*"        " "         " "           " "        
## 18  ( 1 ) " "        "*"        " "         " "           " "        
## 19  ( 1 ) " "        "*"        " "         " "           " "        
## 20  ( 1 ) " "        "*"        " "         " "           " "        
## 21  ( 1 ) " "        "*"        " "         " "           " "        
## 22  ( 1 ) " "        "*"        " "         " "           " "        
## 23  ( 1 ) " "        "*"        " "         " "           " "        
## 24  ( 1 ) " "        "*"        " "         " "           " "        
## 25  ( 1 ) " "        "*"        " "         "*"           " "        
## 26  ( 1 ) " "        "*"        " "         "*"           "*"        
## 27  ( 1 ) " "        "*"        " "         "*"           "*"        
## 28  ( 1 ) " "        "*"        " "         "*"           "*"        
## 29  ( 1 ) " "        "*"        " "         "*"           "*"        
## 30  ( 1 ) " "        "*"        " "         "*"           "*"        
## 31  ( 1 ) " "        "*"        " "         "*"           "*"        
## 32  ( 1 ) "*"        "*"        " "         "*"           "*"        
## 33  ( 1 ) "*"        "*"        "*"         "*"           "*"        
## 34  ( 1 ) "*"        "*"        "*"         "*"           "*"        
##           ScreenPorch PoolArea MiscVal MoSold YrSold
## 1  ( 1 )  " "         " "      " "     " "    " "   
## 2  ( 1 )  " "         " "      " "     " "    " "   
## 3  ( 1 )  " "         " "      " "     " "    " "   
## 4  ( 1 )  " "         " "      " "     " "    " "   
## 5  ( 1 )  " "         " "      " "     " "    " "   
## 6  ( 1 )  " "         " "      " "     " "    " "   
## 7  ( 1 )  " "         " "      " "     " "    " "   
## 8  ( 1 )  " "         " "      " "     " "    " "   
## 9  ( 1 )  " "         " "      " "     " "    " "   
## 10  ( 1 ) " "         " "      " "     " "    " "   
## 11  ( 1 ) " "         " "      " "     " "    " "   
## 12  ( 1 ) " "         " "      " "     " "    " "   
## 13  ( 1 ) " "         " "      " "     " "    " "   
## 14  ( 1 ) "*"         " "      " "     " "    " "   
## 15  ( 1 ) "*"         " "      " "     " "    " "   
## 16  ( 1 ) "*"         " "      " "     " "    " "   
## 17  ( 1 ) "*"         " "      " "     " "    " "   
## 18  ( 1 ) "*"         " "      " "     " "    " "   
## 19  ( 1 ) "*"         " "      " "     " "    " "   
## 20  ( 1 ) "*"         " "      " "     " "    " "   
## 21  ( 1 ) "*"         "*"      " "     " "    " "   
## 22  ( 1 ) "*"         "*"      " "     " "    "*"   
## 23  ( 1 ) "*"         "*"      " "     " "    "*"   
## 24  ( 1 ) "*"         "*"      " "     " "    "*"   
## 25  ( 1 ) "*"         "*"      " "     " "    "*"   
## 26  ( 1 ) "*"         "*"      " "     " "    "*"   
## 27  ( 1 ) "*"         "*"      " "     " "    "*"   
## 28  ( 1 ) "*"         "*"      " "     " "    "*"   
## 29  ( 1 ) "*"         "*"      "*"     " "    "*"   
## 30  ( 1 ) "*"         "*"      "*"     " "    "*"   
## 31  ( 1 ) "*"         "*"      "*"     "*"    "*"   
## 32  ( 1 ) "*"         "*"      "*"     "*"    "*"   
## 33  ( 1 ) "*"         "*"      "*"     "*"    "*"   
## 34  ( 1 ) "*"         "*"      "*"     "*"    "*"

##   (Intercept)    MSSubClass   LotFrontage       LotArea   OverallQual 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##   OverallCond     YearBuilt  YearRemodAdd    MasVnrArea    BsmtFinSF1 
##          TRUE          TRUE          TRUE          TRUE          TRUE 
##    BsmtFinSF2     BsmtUnfSF   TotalBsmtSF    `1stFlrSF`    `2ndFlrSF` 
##         FALSE         FALSE          TRUE         FALSE         FALSE 
##  LowQualFinSF     GrLivArea  BsmtFullBath  BsmtHalfBath      FullBath 
##         FALSE          TRUE          TRUE         FALSE          TRUE 
##      HalfBath  BedroomAbvGr  KitchenAbvGr  TotRmsAbvGrd    Fireplaces 
##         FALSE          TRUE          TRUE          TRUE          TRUE 
##   GarageYrBlt    GarageCars    GarageArea    WoodDeckSF   OpenPorchSF 
##         FALSE          TRUE         FALSE          TRUE         FALSE 
## EnclosedPorch   `3SsnPorch`   ScreenPorch      PoolArea       MiscVal 
##         FALSE         FALSE          TRUE         FALSE         FALSE 
##        MoSold        YrSold 
##         FALSE         FALSE

Model

## 
## Call:
## lm(formula = SalePrice ~ MSSubClass + LotFrontage + LotArea + 
##     OverallQual + OverallCond + YearBuilt + YearRemodAdd + MasVnrArea + 
##     BsmtFinSF1 + TotalBsmtSF + GrLivArea + BsmtFullBath + FullBath + 
##     BedroomAbvGr + KitchenAbvGr + TotRmsAbvGrd + Fireplaces + 
##     GarageCars + WoodDeckSF + ScreenPorch, data = dfs[[1]])
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -469771  -17778   -2466   14317  294973 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -9.801e+05  1.412e+05  -6.941 6.42e-12 ***
## MSSubClass   -1.947e+02  3.110e+01  -6.261 5.35e-10 ***
## LotFrontage  -1.054e+02  5.682e+01  -1.855 0.063885 .  
## LotArea       5.623e-01  1.537e-01   3.657 0.000266 ***
## OverallQual   1.760e+04  1.367e+03  12.873  < 2e-16 ***
## OverallCond   4.304e+03  1.209e+03   3.561 0.000384 ***
## YearBuilt     2.912e+02  6.114e+01   4.763 2.15e-06 ***
## YearRemodAdd  1.768e+02  7.528e+01   2.349 0.019012 *  
## MasVnrArea    3.704e+01  6.782e+00   5.461 5.78e-08 ***
## BsmtFinSF1    1.046e+01  3.559e+00   2.939 0.003359 ** 
## TotalBsmtSF   7.376e+00  3.650e+00   2.021 0.043506 *  
## GrLivArea     4.441e+01  4.861e+00   9.135  < 2e-16 ***
## BsmtFullBath  9.861e+03  2.785e+03   3.540 0.000416 ***
## FullBath      7.485e+03  2.988e+03   2.505 0.012371 *  
## BedroomAbvGr -1.013e+04  1.926e+03  -5.258 1.73e-07 ***
## KitchenAbvGr -1.227e+04  5.538e+03  -2.215 0.026931 *  
## TotRmsAbvGrd  5.263e+03  1.407e+03   3.740 0.000193 ***
## Fireplaces    4.683e+03  2.061e+03   2.273 0.023236 *  
## GarageCars    1.139e+04  1.898e+03   6.003 2.58e-09 ***
## WoodDeckSF    2.405e+01  9.581e+00   2.510 0.012212 *  
## ScreenPorch   5.403e+01  1.959e+01   2.758 0.005907 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 36480 on 1174 degrees of freedom
##   (265 observations deleted due to missingness)
## Multiple R-squared:  0.8109, Adjusted R-squared:  0.8077 
## F-statistic: 251.7 on 20 and 1174 DF,  p-value: < 2.2e-16

Evaluation

The diagnostic plots show that a linear regression was appropriate. With an adjusted R-squared of 0.8077, the model accounts for ~81% of the variation in SalePrice.
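
The adjusted R-squared and F-statistic in the summary can be recovered from the multiple R-squared, the number of predictors, and the residual degrees of freedom. A minimal Python sketch with illustrative function names; n is inferred from the summary as residual df + predictors + intercept.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for n observations and p predictors (plus intercept)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

def f_statistic(r2, n, p):
    """Overall F statistic: explained variation per predictor over residual variation per df."""
    return (r2 / p) / ((1 - r2) / (n - p - 1))

r2 = 0.8109         # multiple R-squared from the summary
p = 20              # predictors in the fitted model
n = 1174 + p + 1    # residual df + predictors + intercept = 1195

adj = adjusted_r_squared(r2, n, p)
f_stat = f_statistic(r2, n, p)
print(f"adjusted R-squared = {adj:.4f}, F = {f_stat:.1f}")
```

The penalty for 20 predictors is small here because n is large relative to p, which is why the adjusted value stays close to 0.8109.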

Predict

Overall, the model does a good job of predicting SalePrice, with the exception of a few outliers.

Kaggle

Username: https://www.kaggle.com/baroncurtin2

Score: 0.24747