Data621 Final Project

Kaggle Advanced Housing Regression

Angus Huang, Pavan Akula, Nnaemezue Obi-Eyisi, Aryeh Sturm, Nathan Cooper, Joshua Sturm

May 14, 2018

Goals

Team Structure

Tentative Project Breakdown:

##   Tasks Data.Exploration Data.Visualization Model.Building Analysis Write.up Presentation
## 1  Lead            Angus              Pavan          Pavan   Joshua    Aryeh       Nathan
## 2  Lead             Mezu              Aryeh         Nathan   Joshua   Joshua       Nathan

1. Data Exploration

a. Read Data In

##   Id MSSubClass MSZoning LotFrontage LotArea Street
## 1  1         60       RL          65    8450   Pave
## 2  2         20       RL          80    9600   Pave
## 3  3         60       RL          68   11250   Pave
## 4  4         70       RL          60    9550   Pave
## 5  5         60       RL          84   14260   Pave
## 6  6         50       RL          85   14115   Pave
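
A minimal sketch of this step, assuming the training file is read with base R's read.csv and previewed with head. The inline text is a stand-in for the real Kaggle file, and stringsAsFactors = TRUE is an assumption made so that text columns become factors, as the later output suggests. Only the name housetrain comes from this deck.

```r
# Stand-in for the Kaggle CSV: three rows and the six columns shown above
csv_text <- "Id,MSSubClass,MSZoning,LotFrontage,LotArea,Street
1,60,RL,65,8450,Pave
2,20,RL,80,9600,Pave
3,60,RL,68,11250,Pave"

# With the real file, the text argument would be replaced by the file path
housetrain <- read.csv(text = csv_text, stringsAsFactors = TRUE)

head(housetrain)   # preview the first rows
str(housetrain)    # column types: text columns are read as factors
```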

a. (cont.) Descriptions

A short description of each predictor variable is given below.

## # A tibble: 81 x 2
##    Columns_Name Short_Desc                                  
##    <fct>        <fct>                                       
##  1 Id           Row index                                   
##  2 MSSubClass   Building Class                              
##  3 MSZoning     Zoning Classification                       
##  4 LotFrontage  Linear feet of street connected to property 
##  5 LotArea      Lot size in square feet                     
##  6 Street       Type of road access                         
##  7 Alley        Type of alley access                        
##  8 LotShape     General shape of property                   
##  9 LandContour  Flatness of the property                    
## 10 Utilities    Type of utilities available                 
## 11 LotConfig    Lot Configuration                           
## 12 LandSlope    Slope of property                           
## 13 Neighborhood Physical locations within Ames city limits  
## # ... with 68 more rows

b. Check for Missing and NA Values

Each column is checked for blank or NA values. The table below lists the columns that contain at least one NA, sorted by count.

##    na_count    col_names
## 17     1453       PoolQC
## 19     1406  MiscFeature
## 2      1369        Alley
## 18     1179        Fence
## 11      690  FireplaceQu
## 1       259  LotFrontage
## 12       81   GarageType
## 13       81  GarageYrBlt
## 14       81 GarageFinish
## 15       81   GarageQual
## 16       81   GarageCond
## 7        38 BsmtExposure
## 9        38 BsmtFinType2
## 5        37     BsmtQual
## 6        37     BsmtCond
## 8        37 BsmtFinType1
## 3         8   MasVnrType
## 4         8   MasVnrArea
## 10        1   Electrical
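
One way to build a table like this is sketched here with base R. The toy data frame and its values are illustrative assumptions, not the project data. Blank strings are not counted by is.na, so they would need to be converted to NA first (for example through the na.strings argument of read.csv).

```r
# Toy data frame standing in for housetrain (values are illustrative only)
toy <- data.frame(
  PoolQC      = c(NA, NA, "Gd", NA),
  LotFrontage = c(65, NA, 68, 60),
  LotArea     = c(8450, 9600, 11250, 9550),
  stringsAsFactors = TRUE
)

# Count NA values in every column
na_count <- sapply(toy, function(col) sum(is.na(col)))

# Keep only columns with at least one NA, sorted from most to fewest
na_table <- data.frame(na_count = na_count, col_names = names(na_count))
na_table <- na_table[na_table$na_count > 0, ]
na_table <- na_table[order(na_table$na_count, decreasing = TRUE), ]
rownames(na_table) <- NULL
print(na_table)
```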

b. (cont.) Identify Categorical Data

Many of the variables are categorical, as the examples below show.

unique(housetrain$GarageFinish)     
## [1] RFn  Unf  Fin  <NA>
## Levels: Fin RFn Unf
unique(housetrain$GarageQual)
## [1] TA   Fa   Gd   <NA> Ex   Po  
## Levels: Ex Fa Gd Po TA
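
Rather than inspecting columns one at a time, the categorical columns can be found programmatically. The following is a sketch on a toy data frame (an assumption, not the project data) that splits column names by type, lists the levels of each factor, and flags the factors that contain NA.

```r
# Toy data frame standing in for housetrain (values are illustrative only)
toy <- data.frame(
  GarageFinish = factor(c("RFn", "Unf", "Fin", NA)),
  GarageQual   = factor(c("TA", "Fa", "Gd", NA)),
  LotArea      = c(8450, 9600, 11250, 9550)
)

# Split column names into categorical (factor) and everything else
is_cat   <- sapply(toy, is.factor)
cat_cols <- names(toy)[is_cat]
num_cols <- names(toy)[!is_cat]

# Levels of each categorical column (NA is not stored as a level)
cat_levels <- lapply(toy[cat_cols], levels)

# Categorical columns that contain at least one NA
cat_with_na <- cat_cols[sapply(toy[cat_cols], anyNA)]
```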

c. Separate categorical data and numerical data

We will separate out the categorical data columns with missing values and create a new data frame.

Here is an example that uses the numeric variable PoolArea to recode the categorical variable PoolQC: houses with a pool area of zero receive the level NP (No Pool) instead of NA.

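# Convert the factor to character so that a new value can be assigned
housetrain.asis$PoolQC <- as.character(housetrain.asis$PoolQC)
# Houses with zero pool area have no pool: use 'NP' (No Pool) instead of NA
housetrain.asis$PoolQC <- ifelse(housetrain.asis$PoolArea == 0, 'NP', housetrain.asis$PoolQC)
# Convert back to a factor, which now includes the 'NP' level
housetrain.asis$PoolQC <- factor(housetrain.asis$PoolQC)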
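
The same idea extends to other factors where NA means the feature is absent. The sketch below uses a hypothetical helper, add_none_level, on toy data. The labels NG and NF match levels that appear in the model summary later in this deck, but the column-to-label mapping here is an assumption.

```r
# Hypothetical helper: replace NA in a factor with an explicit "none" level
add_none_level <- function(x, none_label) {
  x <- as.character(x)
  x[is.na(x)] <- none_label
  factor(x)
}

# Toy data standing in for the training set (values are illustrative only)
toy <- data.frame(
  GarageType  = factor(c("Attchd", NA, "Detchd")),
  FireplaceQu = factor(c(NA, "Gd", NA))
)

# Assumed mapping: NG = No Garage, NF = No Fireplace
none_labels <- c(GarageType = "NG", FireplaceQu = "NF")
for (col in names(none_labels)) {
  toy[[col]] <- add_none_level(toy[[col]], none_labels[[col]])
}
```

Unlike the PoolQC example, this helper replaces every NA in the column, so it only suits variables where NA always means "not present". A variable such as LotFrontage, where NA is a genuinely unknown value, needs imputation instead.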

d. Check for missing values

d. (cont.) Checking for Zeros

Variables With Zero Values
variable       Total  Percentage
YearBuilt         64       4.38%
YearRemodAdd     124       8.49%
MasVnrArea       861      58.97%
BsmtFinSF1       467      31.99%
BsmtFinSF2      1293      88.56%
BsmtUnfSF        118       8.08%
TotalBsmtSF       37       2.53%
X2ndFlrSF        829      56.78%
LowQualFinSF    1434      98.22%
GarageYrBlt      165      11.30%
GarageArea        81       5.55%
WoodDeckSF       761      52.12%
OpenPorchSF      656      44.93%
EnclosedPorch   1252      85.75%
X3SsnPorch      1436      98.36%
ScreenPorch     1344      92.05%
PoolArea        1453      99.52%
MiscVal         1408      96.44%
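
A zero-count table of this shape can be produced as in the following sketch, which assumes the percentage is taken over all rows. The toy data frame is illustrative only.

```r
# Toy data frame standing in for the numeric columns (values are illustrative only)
toy <- data.frame(
  PoolArea   = c(0, 0, 0, 512),
  GarageArea = c(548, 0, 608, 642),
  LotArea    = c(8450, 9600, 11250, 9550)
)

# Count exact zeros in each numeric column
num_cols   <- names(toy)[sapply(toy, is.numeric)]
zero_total <- sapply(toy[num_cols], function(col) sum(col == 0, na.rm = TRUE))

# Report count and share of rows, keeping only columns that contain zeros
zero_table <- data.frame(
  variable   = names(zero_total),
  Total      = as.integer(zero_total),
  Percentage = sprintf("%.2f%%", 100 * zero_total / nrow(toy))
)
zero_table <- zero_table[zero_table$Total > 0, ]
print(zero_table, row.names = FALSE)
```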

e. Data Distribution of Select Variables

Data Distribution of Select Variables - Multi-modal

Data Distribution of Select Variables - Weak Correlations

Data Distribution of Select Variables - Strong Correlations

2. Building Models

a. Data imputation

Tuning KNN Imputation Parameters

The plots show that 11 or 13 is the optimal number of neighbors. We selected k = 11.
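
The deck does not show which imputation package or error measure was used, so the following is only a base R illustration of how k can be tuned: hide some known values, impute them for each candidate k, and compare the error. The toy data, the one-dimensional distance, and the function knn_impute_x2 are all assumptions made for the sketch.

```r
set.seed(621)

# Toy data: x2 is roughly linear in x1, so neighbors in x1 are informative
n <- 200
toy <- data.frame(x1 = runif(n, 0, 10))
toy$x2 <- 3 * toy$x1 + rnorm(n, sd = 2)

# Hide 20% of the known x2 values so the imputation error can be measured
hidden <- sample(n, size = 40)
truth  <- toy$x2[hidden]
masked <- toy
masked$x2[hidden] <- NA

# Impute each missing x2 with the mean x2 of its k nearest complete rows
# (nearest by absolute distance in x1)
knn_impute_x2 <- function(data, k) {
  complete     <- data[!is.na(data$x2), ]
  missing_rows <- which(is.na(data$x2))
  for (i in missing_rows) {
    d       <- abs(complete$x1 - data$x1[i])
    nearest <- order(d)[seq_len(k)]
    data$x2[i] <- mean(complete$x2[nearest])
  }
  data
}

# Evaluate odd values of k and record the RMSE on the hidden cells
k_grid <- seq(1, 21, by = 2)
rmse <- sapply(k_grid, function(k) {
  imputed <- knn_impute_x2(masked, k)
  sqrt(mean((imputed$x2[hidden] - truth)^2))
})
tuning <- data.frame(k = k_grid, rmse = rmse)
best_k <- tuning$k[which.min(tuning$rmse)]

plot(tuning$k, tuning$rmse, type = "b", xlab = "k", ylab = "RMSE on hidden values")
```

When two neighboring values of k give nearly the same error, as with 11 and 13 here, picking the smaller one is a reasonable tie-break because it averages over fewer, closer neighbors.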

b. Initial Model

## 
## Call:
## lm(formula = SalePrice ~ ., data = housetrain.knn)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -175318   -8247       0    8403  175318 
## 
## Coefficients: (12 not defined because of singularities)
##                          Estimate   Std. Error t value Pr(>|t|)    
## (Intercept)          -856585.8319  157716.3303  -5.431 6.82e-08 ***
## MSSubClass30             312.5315    4749.6819   0.066 0.947548    
## MSSubClass40            -409.5717   17178.9075  -0.024 0.980983    
## MSSubClass45           -8448.1143   24905.5631  -0.339 0.734516    
## MSSubClass50           -7789.8938    8883.9598  -0.877 0.380751    
## MSSubClass60           -3455.2813    7988.2812  -0.433 0.665427    
## MSSubClass70            1407.6420    8632.4992   0.163 0.870497    
## MSSubClass75          -16035.4407   17178.6156  -0.933 0.350781    
## MSSubClass80          -12511.4539   12767.3952  -0.980 0.327315    
## MSSubClass85          -19952.9352   11440.4242  -1.744 0.081413 .  
## MSSubClass90          -13673.4045    8666.1110  -1.578 0.114885    
## MSSubClass120         -16420.7858   14393.4399  -1.141 0.254168    
## MSSubClass160         -23288.8672   17582.6446  -1.325 0.185587    
## MSSubClass180         -27703.9733   19593.5634  -1.414 0.157652    
## MSSubClass190         -11951.5154   28135.6002  -0.425 0.671074    
## MSZoningFV             42969.1076   12180.5092   3.528 0.000436 ***
## MSZoningRH             35748.6420   12351.1849   2.894 0.003871 ** 
## MSZoningRL             37363.5453   10587.6176   3.529 0.000434 ***
## MSZoningRM             32768.1718    9986.3092   3.281 0.001064 ** 
## LotFrontage               20.9988      45.9370   0.457 0.647669    
## LotArea                    0.6401       0.1119   5.721 1.35e-08 ***
## StreetPave             26130.3168   12945.7496   2.018 0.043776 *  
## AlleyNAly              -2668.2968    4213.4032  -0.633 0.526671    
## AlleyPave              -1682.7633    6157.6373  -0.273 0.784686    
## LotShapeIR2             6358.7594    4106.9414   1.548 0.121826    
## LotShapeIR3             9541.4823    8621.2235   1.107 0.268636    
## LotShapeReg             1894.3697    1575.0553   1.203 0.229327    
## LandContourHLS          7728.6638    5115.8398   1.511 0.131131    
## LandContourLow         -3788.3299    6471.8550  -0.585 0.558424    
## LandContourLvl          6462.7773    3699.0781   1.747 0.080881 .  
## UtilitiesNoSeWa       -54405.5796   27880.6483  -1.951 0.051254 .  
## LotConfigCulDSac        7679.3814    3250.9841   2.362 0.018334 *  
## LotConfigFR2           -6874.7706    3926.6985  -1.751 0.080250 .  
## LotConfigFR3          -16419.4216   12253.8526  -1.340 0.180529    
## LotConfigInside        -1253.4217    1772.8608  -0.707 0.479706    
## LandSlopeMod            8787.3664    3979.4485   2.208 0.027427 *  
## LandSlopeSev          -35399.1020   11398.8698  -3.105 0.001946 ** 
## NeighborhoodBlueste     7572.3432   19589.2186   0.387 0.699156    
## NeighborhoodBrDale       222.9895   11597.3364   0.019 0.984663    
## NeighborhoodBrkSide    -2714.4237    9603.5291  -0.283 0.777497    
## NeighborhoodClearCr    -9383.2870    9400.8443  -0.998 0.318425    
## NeighborhoodCollgCr    -6779.0451    7339.0932  -0.924 0.355841    
## NeighborhoodCrawfor    14877.3462    8737.1968   1.703 0.088883 .  
## NeighborhoodEdwards   -20142.1236    8117.2991  -2.481 0.013229 *  
## NeighborhoodGilbert    -5301.0633    7745.8229  -0.684 0.493875    
## NeighborhoodIDOTRR     -9914.5639   10845.9875  -0.914 0.360843    
## NeighborhoodMeadowV   -12908.5356   12283.7122  -1.051 0.293540    
## NeighborhoodMitchel   -15578.3349    8289.1501  -1.879 0.060447 .  
## NeighborhoodNAmes     -13625.8055    7993.8666  -1.705 0.088551 .  
## NeighborhoodNoRidge    20006.9890    8449.2484   2.368 0.018054 *  
## NeighborhoodNPkVill    14385.0367   13742.7571   1.047 0.295441    
## NeighborhoodNridgHt    14416.5586    7548.3604   1.910 0.056395 .  
## NeighborhoodNWAmes    -11363.7422    8167.2857  -1.391 0.164381    
## NeighborhoodOldTown   -11497.1115    9740.5410  -1.180 0.238110    
## NeighborhoodSawyer     -7551.0614    8228.7155  -0.918 0.358995    
## NeighborhoodSawyerW    -1267.9200    7857.6183  -0.161 0.871837    
## NeighborhoodSomerst     -143.1670    9062.2290  -0.016 0.987398    
## NeighborhoodStoneBr    34444.5007    8497.2360   4.054 5.38e-05 ***
## NeighborhoodSWISU      -7443.5025    9823.7496  -0.758 0.448783    
## NeighborhoodTimber     -6187.1498    8176.1157  -0.757 0.449363    
## NeighborhoodVeenker     3661.4979   10550.4316   0.347 0.728619    
## Condition1Feedr         4936.5026    4941.8952   0.999 0.318048    
## Condition1Norm         14939.4485    4128.3378   3.619 0.000309 ***
## Condition1PosA         12556.2385    9770.1399   1.285 0.198993    
## Condition1PosN         13904.7262    7324.1268   1.898 0.057882 .  
## Condition1RRAe        -15239.3626    8854.0985  -1.721 0.085489 .  
## Condition1RRAn         13984.8142    6793.2366   2.059 0.039753 *  
## Condition1RRNe         -1202.2300   16928.8206  -0.071 0.943397    
## Condition1RRNn         13839.8090   12546.4842   1.103 0.270221    
## Condition2Feedr         8776.1690   26428.8248   0.332 0.739898    
## Condition2Norm          2378.5998   24002.3491   0.099 0.921077    
## Condition2PosA        -18834.5971   39170.0381  -0.481 0.630720    
## Condition2PosN       -258489.4013   30430.3119  -8.494  < 2e-16 ***
## Condition2RRAe       -107562.5019   73291.8879  -1.468 0.142488    
## Condition2RRAn         -4699.2716   33637.6551  -0.140 0.888919    
## Condition2RRNn         10020.6099   29937.9729   0.335 0.737903    
## BldgType2fmCon         -2212.3330   27346.4174  -0.081 0.935535    
## BldgTypeDuplex                 NA           NA      NA       NA    
## BldgTypeTwnhs          -2088.1771   15296.6039  -0.137 0.891440    
## BldgTypeTwnhsE           554.7325   14557.6446   0.038 0.969610    
## HouseStyle1.5Unf        7316.6389   24498.4576   0.299 0.765255    
## HouseStyle1Story       -7828.3626    9029.7557  -0.867 0.386149    
## HouseStyle2.5Fin       -6561.5638   19009.3043  -0.345 0.730025    
## HouseStyle2.5Unf        5641.4270   17247.6215   0.327 0.743663    
## HouseStyle2Story      -10516.5652    8237.3960  -1.277 0.201970    
## HouseStyleSFoyer        2310.0048   12573.1599   0.184 0.854261    
## HouseStyleSLvl          2065.1615   14037.0544   0.147 0.883061    
## OverallQual2           26840.6553   31267.5125   0.858 0.390839    
## OverallQual3           11242.8658   28835.4037   0.390 0.696684    
## OverallQual4           10677.5264   28585.6786   0.374 0.708825    
## OverallQual5           10001.3553   28733.4023   0.348 0.727848    
## OverallQual6           13383.2947   28803.0126   0.465 0.642270    
## OverallQual7           20047.1855   28836.3697   0.695 0.487067    
## OverallQual8           32971.8738   28978.0187   1.138 0.255431    
## OverallQual9           63180.6625   29527.6733   2.140 0.032588 *  
## OverallQual10         101560.1520   30421.2659   3.338 0.000869 ***
## OverallCond2          -23026.8519   52799.2337  -0.436 0.662831    
## OverallCond3          -51538.9146   54972.2965  -0.938 0.348675    
## OverallCond4          -40842.2151   55222.5517  -0.740 0.459698    
## OverallCond5          -33476.7489   55321.9175  -0.605 0.545214    
## OverallCond6          -27577.4167   55313.3528  -0.499 0.618179    
## OverallCond7          -21069.9957   55309.5363  -0.381 0.703313    
## OverallCond8          -17420.5720   55363.9373  -0.315 0.753080    
## OverallCond9           -9748.6585   55477.2757  -0.176 0.860542    
## YearBuilt               -398.0932      83.3638  -4.775 2.02e-06 ***
## YearRemodAdd             -75.4919      55.8749  -1.351 0.176933    
## RoofStyleGable         -3871.8558   18240.0897  -0.212 0.831933    
## RoofStyleGambrel       -3081.4416   20092.1772  -0.153 0.878137    
## RoofStyleHip           -4582.1507   18303.3675  -0.250 0.802365    
## RoofStyleMansard        1227.4811   21859.3903   0.056 0.955229    
## RoofStyleShed          85967.0079   40024.9183   2.148 0.031935 *  
## RoofMatlCompShg       585529.3349   54166.5268  10.810  < 2e-16 ***
## RoofMatlMembran       661166.2063   63596.3180  10.396  < 2e-16 ***
## RoofMatlMetal         635156.8076   63237.2752  10.044  < 2e-16 ***
## RoofMatlRoll          587564.7093   59654.3825   9.849  < 2e-16 ***
## RoofMatlTarGrv        574270.6328   57660.1437   9.960  < 2e-16 ***
## RoofMatlWdShake       578481.5889   56417.7754  10.254  < 2e-16 ***
## RoofMatlWdShngl       630849.0006   55216.2901  11.425  < 2e-16 ***
## Exterior1stAsphShn    -12740.4988   32393.4394  -0.393 0.694167    
## Exterior1stBrkComm      9229.6717   28221.9229   0.327 0.743698    
## Exterior1stBrkFace     19898.7413   12885.3882   1.544 0.122793    
## Exterior1stCBlock      -6262.5973   27440.6493  -0.228 0.819513    
## Exterior1stCemntBd      4142.8709   19080.6403   0.217 0.828150    
## Exterior1stHdBoard       478.0806   13067.8272   0.037 0.970823    
## Exterior1stImStucc    -15290.1013   27565.1749  -0.555 0.579215    
## Exterior1stMetalSd      7396.1988   14788.9052   0.500 0.617087    
## Exterior1stPlywood      -594.1876   12859.6453  -0.046 0.963154    
## Exterior1stStone        6864.0167   23986.0633   0.286 0.774802    
## Exterior1stStucco       4875.4891   14525.0416   0.336 0.737188    
## Exterior1stVinylSd       792.7966   13587.0974   0.058 0.953481    
## Exterior1stWd Sdng        95.3849   12583.1559   0.008 0.993953    
## Exterior1stWdShing      3269.5070   13659.5921   0.239 0.810872    
## Exterior2ndAsphShn     10495.9599   22184.0199   0.473 0.636209    
## Exterior2ndBrk Cmn      2109.6978   20304.2606   0.104 0.917263    
## Exterior2ndBrkFace     -3872.1856   13257.6970  -0.292 0.770285    
## Exterior2ndCBlock              NA           NA      NA       NA    
## Exterior2ndCmentBd      4310.1523   18665.2459   0.231 0.817419    
## Exterior2ndHdBoard      1071.2443   12513.4361   0.086 0.931793    
## Exterior2ndImStucc      4670.4074   14451.5051   0.323 0.746619    
## Exterior2ndMetalSd     -1449.7693   14371.1190  -0.101 0.919663    
## Exterior2ndOther      -21576.2416   26520.0761  -0.814 0.416053    
## Exterior2ndPlywood       774.8445   12156.8698   0.064 0.949191    
## Exterior2ndStone      -13676.9448   17235.1148  -0.794 0.427620    
## Exterior2ndStucco      -1879.4273   13861.3453  -0.136 0.892171    
## Exterior2ndVinylSd      4633.7771   13043.7155   0.355 0.722467    
## Exterior2ndWd Sdng      5956.6046   12106.0224   0.492 0.622787    
## Exterior2ndWd Shng       681.3413   12719.7997   0.054 0.957291    
## MasVnrTypeBrkFace       7664.6659    6208.2553   1.235 0.217234    
## MasVnrTypeNone          9208.9986    6246.7646   1.474 0.140700    
## MasVnrTypeStone        10195.0205    6570.4481   1.552 0.121021    
## MasVnrArea                18.0445       5.6607   3.188 0.001473 ** 
## ExterQualFa            -2204.5097   12989.7205  -0.170 0.865267    
## ExterQualGd            -5895.7940    5137.4725  -1.148 0.251369    
## ExterQualTA            -6570.6929    5565.0549  -1.181 0.237963    
## ExterCondFa            -9579.9182   18315.4165  -0.523 0.601038    
## ExterCondGd           -14087.9576   17453.0756  -0.807 0.419723    
## ExterCondPo           -42296.7164   35966.5800  -1.176 0.239837    
## ExterCondTA           -11641.5717   17500.0714  -0.665 0.506036    
## FoundationCBlock        1115.6569    3196.5688   0.349 0.727140    
## FoundationPConc         3774.7042    3407.6568   1.108 0.268217    
## FoundationSlab         -3844.9456    9952.4991  -0.386 0.699324    
## FoundationStone         4080.2253   12316.4734   0.331 0.740492    
## FoundationWood        -33236.7575   14480.1387  -2.295 0.021893 *  
## BsmtQualFa             -4372.2416    6420.5463  -0.681 0.496023    
## BsmtQualGd            -11284.2298    3388.1880  -3.330 0.000894 ***
## BsmtQualNB               732.5989   13512.1403   0.054 0.956771    
## BsmtQualTA             -9243.4600    4149.4311  -2.228 0.026097 *  
## BsmtCondGd              3164.3826    5295.5566   0.598 0.550255    
## BsmtCondNB                     NA           NA      NA       NA    
## BsmtCondPo            -10843.0162   37529.5570  -0.289 0.772695    
## BsmtCondTA              5816.9029    4295.9220   1.354 0.175985    
## BsmtExposureGd         10706.9710    3004.4654   3.564 0.000381 ***
## BsmtExposureMn         -2003.0047    2977.6050  -0.673 0.501279    
## BsmtExposureNB                 NA           NA      NA       NA    
## BsmtExposureNo         -4166.6345    2156.1283  -1.932 0.053547 .  
## BsmtFinType1BLQ         1506.4290    2748.5311   0.548 0.583740    
## BsmtFinType1GLQ         5177.3063    2473.9355   2.093 0.036590 *  
## BsmtFinType1LwQ        -2403.3611    3662.5666  -0.656 0.511829    
## BsmtFinType1NB                 NA           NA      NA       NA    
## BsmtFinType1Rec          173.7837    2946.8354   0.059 0.952984    
## BsmtFinType1Unf         2749.8198    2900.7502   0.948 0.343344    
## BsmtFinSF1                33.8308       5.1443   6.576 7.29e-11 ***
## BsmtFinType2BLQ        -8805.3928    7282.6886  -1.209 0.226878    
## BsmtFinType2GLQ        -3284.9591    9100.3017  -0.361 0.718186    
## BsmtFinType2LwQ        -8467.4203    7133.7992  -1.187 0.235494    
## BsmtFinType2NB                 NA           NA      NA       NA    
## BsmtFinType2Rec        -6382.6716    6802.4816  -0.938 0.348294    
## BsmtFinType2Unf        -3855.7317    7250.4149  -0.532 0.594971    
## BsmtFinSF2                29.0478       8.8092   3.297 0.001005 ** 
## BsmtUnfSF                 14.9952       4.7291   3.171 0.001560 ** 
## TotalBsmtSF                    NA           NA      NA       NA    
## HeatingGasA            22192.1664   25410.0888   0.873 0.382648    
## HeatingGasW            22970.5173   26232.4959   0.876 0.381402    
## HeatingGrav             7560.5202   27647.0722   0.273 0.784544    
## HeatingOthW             8770.9961   31266.3951   0.281 0.779125    
## HeatingWall            33611.6671   29252.1490   1.149 0.250781    
## HeatingQCFa            -1341.0831    4734.4495  -0.283 0.777028    
## HeatingQCGd            -2938.4171    2024.0437  -1.452 0.146842    
## HeatingQCPo             9488.6207   25966.2178   0.365 0.714864    
## HeatingQCTA            -2589.3972    2037.7375  -1.271 0.204084    
## CentralAirY             -416.1486    3868.0706  -0.108 0.914343    
## ElectricalFuseF        -2745.2904    5841.1398  -0.470 0.638449    
## ElectricalFuseP       -10212.7563   18649.6849  -0.548 0.584066    
## ElectricalMix                  NA           NA      NA       NA    
## ElectricalSBrkr        -2825.2548    2950.5700  -0.958 0.338501    
## X1stFlrSF                 48.4013       5.5595   8.706  < 2e-16 ***
## X2ndFlrSF                 52.5641       6.0736   8.654  < 2e-16 ***
## LowQualFinSF              -4.6670      19.2784  -0.242 0.808759    
## GrLivArea                      NA           NA      NA       NA    
## BsmtFullBath1           1177.8488    1997.6882   0.590 0.555570    
## BsmtFullBath2           5505.5181    9962.1301   0.553 0.580614    
## BsmtFullBath3          29366.4887   27298.1445   1.076 0.282256    
## BsmtHalfBath1           2649.5711    3118.1774   0.850 0.395658    
## BsmtHalfBath2         -23464.8224   29650.6071  -0.791 0.428887    
## FullBath1              -7464.1327   17478.7877  -0.427 0.669430    
## FullBath2              -7047.3866   17772.4836  -0.397 0.691785    
## FullBath3              16637.4679   18535.7249   0.898 0.369592    
## HalfBath1               3307.8658    2207.3000   1.499 0.134250    
## HalfBath2              -1763.0251    9266.7583  -0.190 0.849145    
## BedroomAbvGr1          18147.7194   16784.4248   1.081 0.279824    
## BedroomAbvGr2          22185.8470   16541.3568   1.341 0.180108    
## BedroomAbvGr3          15933.9648   16679.2490   0.955 0.339618    
## BedroomAbvGr4          17253.2694   16911.5135   1.020 0.307844    
## BedroomAbvGr5           8833.1865   18068.8629   0.489 0.625032    
## BedroomAbvGr6          20038.4118   20235.5406   0.990 0.322256    
## BedroomAbvGr8          37862.6508   35315.2674   1.072 0.283885    
## KitchenAbvGr1          -5630.6399   44118.1705  -0.128 0.898467    
## KitchenAbvGr2         -15664.0709   44590.5941  -0.351 0.725438    
## KitchenAbvGr3         -26808.1967   48140.5373  -0.557 0.577722    
## KitchenQualFa         -17447.3368    6256.0149  -2.789 0.005376 ** 
## KitchenQualGd         -17430.3145    3516.6326  -4.957 8.25e-07 ***
## KitchenQualTA         -17969.5274    3933.3294  -4.569 5.44e-06 ***
## TotRmsAbvGrd            1317.5771     950.2140   1.387 0.165828    
## FunctionalMaj2         -9876.0786   14324.0639  -0.689 0.490663    
## FunctionalMin1          3385.2590    8540.6642   0.396 0.691906    
## FunctionalMin2          3180.4629    8724.4960   0.365 0.715519    
## FunctionalMod          -7104.6055   10488.3939  -0.677 0.498302    
## FunctionalSev         -34338.6173   29070.0654  -1.181 0.237752    
## FunctionalTyp          13063.0907    7564.5761   1.727 0.084458 .  
## Fireplaces1            -6582.4482    5483.7242  -1.200 0.230246    
## Fireplaces2              127.2270    6014.5918   0.021 0.983127    
## Fireplaces3             1869.1869   12866.5352   0.145 0.884519    
## FireplaceQuFa           4549.1504    6786.9507   0.670 0.502814    
## FireplaceQuGd           8208.6063    5316.1123   1.544 0.122839    
## FireplaceQuNF                  NA           NA      NA       NA    
## FireplaceQuPo          13969.8680    7798.4762   1.791 0.073498 .  
## FireplaceQuTA           9360.7897    5505.9034   1.700 0.089375 .  
## GarageTypeAttchd       33119.6708   11809.5244   2.804 0.005124 ** 
## GarageTypeBasment      39938.5136   13369.5144   2.987 0.002874 ** 
## GarageTypeBuiltIn      30354.1390   12287.1570   2.470 0.013640 *  
## GarageTypeCarPort      38737.7148   15411.6653   2.514 0.012088 *  
## GarageTypeDetchd       34919.9286   11803.0625   2.959 0.003154 ** 
## GarageTypeNG          -99198.6001   49654.0658  -1.998 0.045974 *  
## GarageYrBlt                2.2244      60.7921   0.037 0.970818    
## GarageFinishNG        114505.6266   47214.7084   2.425 0.015452 *  
## GarageFinishRFn         -701.9726    1929.9356  -0.364 0.716127    
## GarageFinishUnf        -1033.6589    2384.3036  -0.434 0.664713    
## GarageCars1           -15045.6134   19130.8915  -0.786 0.431762    
## GarageCars2           -15940.7516   19061.8343  -0.836 0.403179    
## GarageCars3            -6305.7140   19187.2391  -0.329 0.742487    
## GarageArea                20.6652       7.8375   2.637 0.008484 ** 
## GarageQualFa          -73062.0313   31793.5482  -2.298 0.021739 *  
## GarageQualGd          -68166.6546   32566.1915  -2.093 0.036552 *  
## GarageQualNG                   NA           NA      NA       NA    
## GarageQualPo          -78037.5768   40775.8366  -1.914 0.055891 .  
## GarageQualTA          -68294.4160   31429.2263  -2.173 0.029986 *  
## GarageCondFa           63157.9279   35385.9877   1.785 0.074552 .  
## GarageCondGd           63734.0725   36764.4691   1.734 0.083260 .  
## GarageCondNG                   NA           NA      NA       NA    
## GarageCondPo           69100.6168   38774.4055   1.782 0.074993 .  
## GarageCondTA           64411.1056   35167.4126   1.832 0.067275 .  
## PavedDriveP            -4946.4662    5489.7861  -0.901 0.367760    
## PavedDriveY             -703.1284    3461.6202  -0.203 0.839076    
## WoodDeckSF                10.1153       5.7623   1.755 0.079453 .  
## OpenPorchSF                5.0249      11.4393   0.439 0.660553    
## EnclosedPorch              7.8812      12.4620   0.632 0.527237    
## X3SsnPorch                52.1939      21.7640   2.398 0.016635 *  
## ScreenPorch               46.3312      12.2859   3.771 0.000171 ***
## PoolArea                 651.3711     226.5377   2.875 0.004110 ** 
## PoolQCFa             -143233.9603   40580.7579  -3.530 0.000433 ***
## PoolQCGd             -108692.0229   36466.6736  -2.981 0.002937 ** 
## PoolQCNP              252148.3295  123801.7614   2.037 0.041907 *  
## FenceGdWo               3602.4174    4875.0861   0.739 0.460091    
## FenceMnPrv              6919.0492    3968.7734   1.743 0.081535 .  
## FenceMnWw                617.7255    8009.5008   0.077 0.938538    
## FenceNF                 5291.5861    3626.2491   1.459 0.144770    
## MiscFeatureNM           -452.7031   98780.6709  -0.005 0.996344    
## MiscFeatureOthr        14837.5813   90648.6409   0.164 0.870010    
## MiscFeatureShed         1399.1800   94548.0229   0.015 0.988195    
## MiscFeatureTenC         6595.4966   97756.4521   0.067 0.946220    
## MiscVal                    0.2458       6.2278   0.039 0.968524    
## MoSold2                -7731.8508    4640.6001  -1.666 0.095959 .  
## MoSold3                -2669.7935    4084.7448  -0.654 0.513499    
## MoSold4                -2592.0432    3873.1388  -0.669 0.503479    
## MoSold5                 -952.1368    3706.0580  -0.257 0.797291    
## MoSold6                -2476.5190    3655.8633  -0.677 0.498282    
## MoSold7                 -347.0836    3713.8167  -0.093 0.925556    
## MoSold8                -6521.0467    3919.2241  -1.664 0.096412 .  
## MoSold9                -5516.9199    4484.2655  -1.230 0.218842    
## MoSold10               -7657.0159    4256.0910  -1.799 0.072269 .  
## MoSold11               -4929.3721    4292.5573  -1.148 0.251061    
## MoSold12               -4544.2673    4605.6940  -0.987 0.324015    
## YrSold2007                86.9338    1923.1899   0.045 0.963953    
## YrSold2008              2545.6042    2015.5626   1.263 0.206854    
## YrSold2009                -3.1244    1957.9171  -0.002 0.998727    
## YrSold2010              3058.1111    2448.0367   1.249 0.211842    
## SaleTypeCon            25106.9899   17511.8883   1.434 0.151926    
## SaleTypeConLD          14126.8993    9891.0703   1.428 0.153491    
## SaleTypeConLI            547.0299   11431.8229   0.048 0.961843    
## SaleTypeConLw           2737.2419   12140.6171   0.225 0.821660    
## SaleTypeCWD            11139.1104   12647.6958   0.881 0.378652    
## SaleTypeNew            25491.2792   15371.7619   1.658 0.097525 .  
## SaleTypeOth             7346.2181   14491.6217   0.507 0.612302    
## SaleTypeWD               145.1526    4109.8105   0.035 0.971832    
## SaleConditionAdjLand   27076.3022   16205.4457   1.671 0.095030 .  
## SaleConditionAlloca    -3537.0955    9952.9400  -0.355 0.722368    
## SaleConditionFamily      753.0752    6013.8156   0.125 0.900368    
## SaleConditionNormal     7021.8337    2873.5884   2.444 0.014692 *  
## SaleConditionPartial   -2789.4755   14754.9403  -0.189 0.850084    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 21670 on 1153 degrees of freedom
## Multiple R-squared:  0.9412, Adjusted R-squared:  0.9256 
## F-statistic: 60.31 on 306 and 1153 DF,  p-value: < 2.2e-16
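
The header line "12 not defined because of singularities" and the twelve NA rows mean that lm dropped coefficients whose columns are exact linear combinations of others. TotalBsmtSF and GrLivArea appear to be such cases, since each is a total of component areas already in the model. The toy example below (an assumption, not the project data) reproduces the effect with a column that is the exact sum of two others.

```r
set.seed(1)

# Toy data: total is an exact linear combination of first and second
toy <- data.frame(
  first  = runif(50, 500, 1500),
  second = runif(50, 0, 1000)
)
toy$total <- toy$first + toy$second
toy$price <- 50 * toy$first + 40 * toy$second + rnorm(50, sd = 1000)

# lm keeps first and second and reports NA for the redundant column
fit <- lm(price ~ first + second + total, data = toy)
print(coef(fit))

# Names of the coefficients that could not be estimated
aliased <- names(coef(fit))[is.na(coef(fit))]
print(aliased)

# alias() shows which combination of other terms each dropped column equals
print(alias(fit))
```

Dropping the redundant column before fitting gives the same fitted values and a summary without NA rows.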

c. Convert data into wide form
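
The deck does not show the conversion itself. Assuming "wide form" here means one indicator column per factor level, a sketch with base R's model.matrix on toy data looks like this.

```r
# Toy data standing in for the imputed training set (values are illustrative only)
toy <- data.frame(
  SalePrice = c(208500, 181500, 223500, 140000),
  BsmtQual  = factor(c("Gd", "Gd", "TA", "NB")),
  LotArea   = c(8450, 9600, 11250, 9550)
)

# Expand the factor into 0/1 indicator columns; "- 1" removes the intercept
# so that the first factor keeps a column for every level
wide <- as.data.frame(model.matrix(~ BsmtQual + LotArea - 1, data = toy))
wide$SalePrice <- toy$SalePrice
print(wide)
```

With each level in its own column, individual indicators (rather than whole factors) can be added to or removed from a model formula.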

d. Build Model Using Wide-Form Dataset

# These values are linearly dependent on other variables
housetrain.lm6 <- lm(SalePrice~BsmtQual+BsmtExposure, data = housetrain.knn)
summary(housetrain.lm6)

m. Model-8 - Best Performing Linear Model

KNN Models

v. Prediction-8

Top 10 Rows Prediction Using Model-8
         fit        lwr        upr
 1  11537.87  -54293.97   77369.71
 2  19599.66  -47171.58   86370.90
 3  28033.83  -85935.58  142003.24
 4  28797.12  -38749.81   96344.06
 5  30153.52  -77718.34  138025.37
 6  32288.34  -34975.26   99551.94
 7  35331.33  -29525.27  100187.93
 8  38340.33  -35238.25  111918.91
 9  42118.46  -32085.87  116322.78
10  43876.84  -73913.55  161667.22
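
The fit, lwr, and upr columns match the layout that predict returns for an lm object when an interval is requested. The deck does not say whether a prediction or a confidence interval was used, or how the rows were ordered, so the sketch below assumes a 95% prediction interval and ascending order by fit, on toy data.

```r
set.seed(8)

# Toy training data (illustrative only): price grows with area
train <- data.frame(area = runif(100, 500, 3000))
train$price <- 20000 + 90 * train$area + rnorm(100, sd = 15000)

fit <- lm(price ~ area, data = train)

# New observations to score
new_houses <- data.frame(area = c(2400, 800, 1500))

# interval = "prediction" adds lower and upper bounds for a single new sale
pred <- as.data.frame(
  predict(fit, newdata = new_houses, interval = "prediction", level = 0.95)
)

# Show the rows with the smallest fitted values first
print(head(pred[order(pred$fit), ], 10))
```

A lower bound below zero, as in the tables in this section, signals that the interval is wide relative to the fitted price for those rows.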

w. Prediction-9

Top 10 Rows Prediction Using Model-9
         fit         lwr        upr
 1  -9050.59  -129702.34  111601.16
 2   4541.86   -66614.21   75697.92
 3  16437.65   -55704.25   88579.55
 4  19254.72   -94764.09  133273.54
 5  19519.60   -53449.37   92488.56
 6  22202.95   -49356.22   93762.11
 7  24145.89   -95026.96  143318.74
 8  27868.54   -98062.90  153799.98
 9  28388.78   -41718.84   98496.41
10  30323.87   -90381.49  151029.24

x. Prediction-10

Top 10 Rows Prediction Using Model-10
          fit         lwr        upr
 1  -10807.48  -132920.28  111305.31
 2    1923.39  -113212.93  117059.70
 3    4076.57   -67828.73   75981.87
 4   15594.48   -57272.02   88460.97
 5   17371.82   -56382.23   91125.87
 6   21698.34   -98938.62  142335.29
 7   24441.14   -47841.68   96723.96
 8   24983.82  -102478.96  152446.60
 9   28217.72   -42700.22   99135.67
10   30684.52   -91468.75  152837.80

Conclusion - Model Performance on Kaggle

Sources