This is a simplified version of the program used for the Kaggle competition “House Prices”
12/2016



Data science problem

Data descriptions

Ask a home buyer to describe their dream house, and they probably won’t begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition’s dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.

With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.

The potential for creative feature engineering provides a rich opportunity for fun and learning. This dataset lends itself to advanced regression techniques like random forests and gradient boosting with the popular XGBoost library. We encourage Kagglers to create benchmark code and tutorials on Kernels for community learning. Top kernels will be awarded swag prizes at the competition close.

Acknowledgments

The Ames Housing dataset was compiled by Dean De Cock for use in data science education. It’s an incredible alternative for data scientists looking for a modernized and expanded version of the often cited Boston Housing dataset.




Find the data and the code on Kaggle or on Github



File descriptions

train.csv - the training set
test.csv - the test set
data_description.txt - full description of each column, originally prepared by Dean De Cock but lightly edited to match the column names used here
sample_submission.csv - a benchmark submission from a linear regression on year and month of sale, lot square footage, and number of bedrooms
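
The benchmark described for sample_submission.csv can be approximated with a few lines of base R. The sketch below is only an illustration: the toy data frames stand in for train.csv and test.csv, their values are made up, and the two-column Id/SalePrice layout is an assumption about the submission format rather than something stated in the file list.

```r
# Toy stand-ins for train.csv and test.csv (values are made up for illustration).
set.seed(1)
toy_train <- data.frame(
  Id           = 1:50,
  YrSold       = sample(2006:2010, 50, replace = TRUE),
  MoSold       = sample(1:12, 50, replace = TRUE),
  LotArea      = round(runif(50, 2000, 20000)),
  BedroomAbvGr = sample(1:5, 50, replace = TRUE)
)
toy_train$SalePrice <- 50000 + 8 * toy_train$LotArea +
  10000 * toy_train$BedroomAbvGr + rnorm(50, sd = 5000)

toy_test <- data.frame(
  Id           = 51:60,
  YrSold       = sample(2006:2010, 10, replace = TRUE),
  MoSold       = sample(1:12, 10, replace = TRUE),
  LotArea      = round(runif(10, 2000, 20000)),
  BedroomAbvGr = sample(1:5, 10, replace = TRUE)
)

# Linear regression on the four predictors named in the file description.
benchmark_fit <- lm(SalePrice ~ YrSold + MoSold + LotArea + BedroomAbvGr,
                    data = toy_train)

# Assumed submission layout: one row per test Id with a predicted SalePrice.
submission <- data.frame(
  Id        = toy_test$Id,
  SalePrice = as.numeric(predict(benchmark_fit, newdata = toy_test))
)
head(submission)
```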







Variables

SalePrice - the property’s sale price in dollars. This is the target variable that you’re trying to predict.
MSSubClass: The building class
MSZoning: The general zoning classification
LotFrontage: Linear feet of street connected to property
LotArea: Lot size in square feet
Street: Type of road access
Alley: Type of alley access
LotShape: General shape of property
LandContour: Flatness of the property
Utilities: Type of utilities available
LotConfig: Lot configuration
LandSlope: Slope of property
Neighborhood: Physical locations within Ames city limits
Condition1: Proximity to main road or railroad
Condition2: Proximity to main road or railroad (if a second is present)
BldgType: Type of dwelling
HouseStyle: Style of dwelling
OverallQual: Overall material and finish quality
OverallCond: Overall condition rating
YearBuilt: Original construction date
YearRemodAdd: Remodel date
RoofStyle: Type of roof
RoofMatl: Roof material
Exterior1st: Exterior covering on house
Exterior2nd: Exterior covering on house (if more than one material)
MasVnrType: Masonry veneer type
MasVnrArea: Masonry veneer area in square feet
ExterQual: Exterior material quality
ExterCond: Present condition of the material on the exterior
Foundation: Type of foundation
BsmtQual: Height of the basement
BsmtCond: General condition of the basement
BsmtExposure: Walkout or garden level basement walls
BsmtFinType1: Quality of basement finished area
BsmtFinSF1: Type 1 finished square feet
BsmtFinType2: Quality of second finished area (if present)
BsmtFinSF2: Type 2 finished square feet
BsmtUnfSF: Unfinished square feet of basement area
TotalBsmtSF: Total square feet of basement area
Heating: Type of heating
HeatingQC: Heating quality and condition
CentralAir: Central air conditioning
Electrical: Electrical system
1stFlrSF: First Floor square feet
2ndFlrSF: Second floor square feet
LowQualFinSF: Low quality finished square feet (all floors)
GrLivArea: Above grade (ground) living area square feet
BsmtFullBath: Basement full bathrooms
BsmtHalfBath: Basement half bathrooms
FullBath: Full bathrooms above grade
HalfBath: Half baths above grade
BedroomAbvGr: Number of bedrooms above basement level
KitchenAbvGr: Number of kitchens
KitchenQual: Kitchen quality
TotRmsAbvGrd: Total rooms above grade (does not include bathrooms)
Functional: Home functionality rating
Fireplaces: Number of fireplaces
FireplaceQu: Fireplace quality
GarageType: Garage location
GarageYrBlt: Year garage was built
GarageFinish: Interior finish of the garage
GarageCars: Size of garage in car capacity
GarageArea: Size of garage in square feet
GarageQual: Garage quality
GarageCond: Garage condition
PavedDrive: Paved driveway
WoodDeckSF: Wood deck area in square feet
OpenPorchSF: Open porch area in square feet
EnclosedPorch: Enclosed porch area in square feet
3SsnPorch: Three season porch area in square feet
ScreenPorch: Screen porch area in square feet
PoolArea: Pool area in square feet
PoolQC: Pool quality
Fence: Fence quality
MiscFeature: Miscellaneous feature not covered in other categories
MiscVal: $Value of miscellaneous feature
MoSold: Month Sold
YrSold: Year Sold
SaleType: Type of sale
SaleCondition: Condition of sale







About Kaggle

In 2010, Kaggle was founded as a platform for predictive modelling and analytics competitions on which companies and researchers post their data and statisticians and data miners from all over the world compete to produce the best models.

This crowdsourcing approach relies on the fact that there are countless strategies that can be applied to any predictive modelling task and it is impossible to know at the outset which technique or analyst will be most effective. Kaggle also hosts recruiting competitions in which data scientists compete for a chance to interview at leading data science companies like Facebook, Winton Capital, and Walmart.







Libraries


# For data manipulation and tidying
library(MASS)
library(tidyr)
library(plyr)
library(dplyr)
library(broom)
library(data.table)
library(testthat)
library(gridExtra)

# For data visualizations, feature selection and missing-value handling
library(ggplot2)
library(plotly)
library(DT)
library(corrplot)
library(GGally)
library(Boruta)
library(pROC)
library(VIM)
library(mice)

# For modeling and predictions
library(mlbench)
library(caret)
library(glmnet)
library(ranger)
library(clValid)
library(e1071)
library(xgboost)



Data Exploration


Data

train <- read.csv("train.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)
data.test <- read.csv("test.csv", header = TRUE, sep = ",", stringsAsFactors = FALSE)
datatable(head(train, n=20),options = list(scrollX = TRUE))



Summary

summary(train)
##        Id           MSSubClass      MSZoning          LotFrontage    
##  Min.   :   1.0   Min.   : 20.0   Length:1460        Min.   : 21.00  
##  1st Qu.: 365.8   1st Qu.: 20.0   Class :character   1st Qu.: 59.00  
##  Median : 730.5   Median : 50.0   Mode  :character   Median : 69.00  
##  Mean   : 730.5   Mean   : 56.9                      Mean   : 70.05  
##  3rd Qu.:1095.2   3rd Qu.: 70.0                      3rd Qu.: 80.00  
##  Max.   :1460.0   Max.   :190.0                      Max.   :313.00  
##                                                      NA's   :259     
##     LotArea          Street             Alley             LotShape        
##  Min.   :  1300   Length:1460        Length:1460        Length:1460       
##  1st Qu.:  7554   Class :character   Class :character   Class :character  
##  Median :  9478   Mode  :character   Mode  :character   Mode  :character  
##  Mean   : 10517                                                           
##  3rd Qu.: 11602                                                           
##  Max.   :215245                                                           
##                                                                           
##  LandContour         Utilities          LotConfig        
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##   LandSlope         Neighborhood        Condition1       
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##   Condition2          BldgType          HouseStyle         OverallQual    
##  Length:1460        Length:1460        Length:1460        Min.   : 1.000  
##  Class :character   Class :character   Class :character   1st Qu.: 5.000  
##  Mode  :character   Mode  :character   Mode  :character   Median : 6.000  
##                                                           Mean   : 6.099  
##                                                           3rd Qu.: 7.000  
##                                                           Max.   :10.000  
##                                                                           
##   OverallCond      YearBuilt     YearRemodAdd   RoofStyle        
##  Min.   :1.000   Min.   :1872   Min.   :1950   Length:1460       
##  1st Qu.:5.000   1st Qu.:1954   1st Qu.:1967   Class :character  
##  Median :5.000   Median :1973   Median :1994   Mode  :character  
##  Mean   :5.575   Mean   :1971   Mean   :1985                     
##  3rd Qu.:6.000   3rd Qu.:2000   3rd Qu.:2004                     
##  Max.   :9.000   Max.   :2010   Max.   :2010                     
##                                                                  
##    RoofMatl         Exterior1st        Exterior2nd       
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##   MasVnrType          MasVnrArea      ExterQual          ExterCond        
##  Length:1460        Min.   :   0.0   Length:1460        Length:1460       
##  Class :character   1st Qu.:   0.0   Class :character   Class :character  
##  Mode  :character   Median :   0.0   Mode  :character   Mode  :character  
##                     Mean   : 103.7                                        
##                     3rd Qu.: 166.0                                        
##                     Max.   :1600.0                                        
##                     NA's   :8                                             
##   Foundation          BsmtQual           BsmtCond        
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##  BsmtExposure       BsmtFinType1         BsmtFinSF1     BsmtFinType2      
##  Length:1460        Length:1460        Min.   :   0.0   Length:1460       
##  Class :character   Class :character   1st Qu.:   0.0   Class :character  
##  Mode  :character   Mode  :character   Median : 383.5   Mode  :character  
##                                        Mean   : 443.6                     
##                                        3rd Qu.: 712.2                     
##                                        Max.   :5644.0                     
##                                                                           
##    BsmtFinSF2        BsmtUnfSF       TotalBsmtSF       Heating         
##  Min.   :   0.00   Min.   :   0.0   Min.   :   0.0   Length:1460       
##  1st Qu.:   0.00   1st Qu.: 223.0   1st Qu.: 795.8   Class :character  
##  Median :   0.00   Median : 477.5   Median : 991.5   Mode  :character  
##  Mean   :  46.55   Mean   : 567.2   Mean   :1057.4                     
##  3rd Qu.:   0.00   3rd Qu.: 808.0   3rd Qu.:1298.2                     
##  Max.   :1474.00   Max.   :2336.0   Max.   :6110.0                     
##                                                                        
##   HeatingQC          CentralAir         Electrical          X1stFlrSF   
##  Length:1460        Length:1460        Length:1460        Min.   : 334  
##  Class :character   Class :character   Class :character   1st Qu.: 882  
##  Mode  :character   Mode  :character   Mode  :character   Median :1087  
##                                                           Mean   :1163  
##                                                           3rd Qu.:1391  
##                                                           Max.   :4692  
##                                                                         
##    X2ndFlrSF     LowQualFinSF       GrLivArea     BsmtFullBath   
##  Min.   :   0   Min.   :  0.000   Min.   : 334   Min.   :0.0000  
##  1st Qu.:   0   1st Qu.:  0.000   1st Qu.:1130   1st Qu.:0.0000  
##  Median :   0   Median :  0.000   Median :1464   Median :0.0000  
##  Mean   : 347   Mean   :  5.845   Mean   :1515   Mean   :0.4253  
##  3rd Qu.: 728   3rd Qu.:  0.000   3rd Qu.:1777   3rd Qu.:1.0000  
##  Max.   :2065   Max.   :572.000   Max.   :5642   Max.   :3.0000  
##                                                                  
##   BsmtHalfBath        FullBath        HalfBath       BedroomAbvGr  
##  Min.   :0.00000   Min.   :0.000   Min.   :0.0000   Min.   :0.000  
##  1st Qu.:0.00000   1st Qu.:1.000   1st Qu.:0.0000   1st Qu.:2.000  
##  Median :0.00000   Median :2.000   Median :0.0000   Median :3.000  
##  Mean   :0.05753   Mean   :1.565   Mean   :0.3829   Mean   :2.866  
##  3rd Qu.:0.00000   3rd Qu.:2.000   3rd Qu.:1.0000   3rd Qu.:3.000  
##  Max.   :2.00000   Max.   :3.000   Max.   :2.0000   Max.   :8.000  
##                                                                    
##   KitchenAbvGr   KitchenQual         TotRmsAbvGrd     Functional       
##  Min.   :0.000   Length:1460        Min.   : 2.000   Length:1460       
##  1st Qu.:1.000   Class :character   1st Qu.: 5.000   Class :character  
##  Median :1.000   Mode  :character   Median : 6.000   Mode  :character  
##  Mean   :1.047                      Mean   : 6.518                     
##  3rd Qu.:1.000                      3rd Qu.: 7.000                     
##  Max.   :3.000                      Max.   :14.000                     
##                                                                        
##    Fireplaces    FireplaceQu         GarageType         GarageYrBlt  
##  Min.   :0.000   Length:1460        Length:1460        Min.   :1900  
##  1st Qu.:0.000   Class :character   Class :character   1st Qu.:1961  
##  Median :1.000   Mode  :character   Mode  :character   Median :1980  
##  Mean   :0.613                                         Mean   :1979  
##  3rd Qu.:1.000                                         3rd Qu.:2002  
##  Max.   :3.000                                         Max.   :2010  
##                                                        NA's   :81    
##  GarageFinish         GarageCars      GarageArea      GarageQual       
##  Length:1460        Min.   :0.000   Min.   :   0.0   Length:1460       
##  Class :character   1st Qu.:1.000   1st Qu.: 334.5   Class :character  
##  Mode  :character   Median :2.000   Median : 480.0   Mode  :character  
##                     Mean   :1.767   Mean   : 473.0                     
##                     3rd Qu.:2.000   3rd Qu.: 576.0                     
##                     Max.   :4.000   Max.   :1418.0                     
##                                                                        
##   GarageCond         PavedDrive          WoodDeckSF      OpenPorchSF    
##  Length:1460        Length:1460        Min.   :  0.00   Min.   :  0.00  
##  Class :character   Class :character   1st Qu.:  0.00   1st Qu.:  0.00  
##  Mode  :character   Mode  :character   Median :  0.00   Median : 25.00  
##                                        Mean   : 94.24   Mean   : 46.66  
##                                        3rd Qu.:168.00   3rd Qu.: 68.00  
##                                        Max.   :857.00   Max.   :547.00  
##                                                                         
##  EnclosedPorch      X3SsnPorch      ScreenPorch        PoolArea      
##  Min.   :  0.00   Min.   :  0.00   Min.   :  0.00   Min.   :  0.000  
##  1st Qu.:  0.00   1st Qu.:  0.00   1st Qu.:  0.00   1st Qu.:  0.000  
##  Median :  0.00   Median :  0.00   Median :  0.00   Median :  0.000  
##  Mean   : 21.95   Mean   :  3.41   Mean   : 15.06   Mean   :  2.759  
##  3rd Qu.:  0.00   3rd Qu.:  0.00   3rd Qu.:  0.00   3rd Qu.:  0.000  
##  Max.   :552.00   Max.   :508.00   Max.   :480.00   Max.   :738.000  
##                                                                      
##     PoolQC             Fence           MiscFeature       
##  Length:1460        Length:1460        Length:1460       
##  Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character  
##                                                          
##                                                          
##                                                          
##                                                          
##     MiscVal             MoSold           YrSold       SaleType        
##  Min.   :    0.00   Min.   : 1.000   Min.   :2006   Length:1460       
##  1st Qu.:    0.00   1st Qu.: 5.000   1st Qu.:2007   Class :character  
##  Median :    0.00   Median : 6.000   Median :2008   Mode  :character  
##  Mean   :   43.49   Mean   : 6.322   Mean   :2008                     
##  3rd Qu.:    0.00   3rd Qu.: 8.000   3rd Qu.:2009                     
##  Max.   :15500.00   Max.   :12.000   Max.   :2010                     
##                                                                       
##  SaleCondition        SalePrice     
##  Length:1460        Min.   : 34900  
##  Class :character   1st Qu.:129975  
##  Mode  :character   Median :163000  
##                     Mean   :180921  
##                     3rd Qu.:214000  
##                     Max.   :755000  
## 



Visualisation

Thanks to Laurae2 for this code, which plots all the data as table plots (tabplot package). The objective is to spot some of the good features visually. As Laurae2 says, you can think of the vertical axis as “sort by SalePrice”:

invisible(library(tabplot))
invisible(library(data.table))

columns <- c("numeric",
             rep("character", 2),
             rep("numeric", 2),
             rep("character", 12),
             rep("numeric", 4),
             rep("character", 5),
             "numeric",
             rep("character", 7),
             "numeric",
             "character",
             rep("numeric", 3),
             rep("character", 4),
             rep("numeric", 10),
             "character",
             "numeric",
             "character",
             "numeric",
             rep("character", 2),
             "numeric",
             "character",
             rep("numeric", 2),
             rep("character", 3),
             rep("numeric", 6),
             rep("character", 3),
             rep("numeric", 3),
             rep("character", 2),
             rep("numeric"))

train$SalePrice <- log(train$SalePrice) # Log-transform the target to match the log-RMSE metric
train_visu <- as.data.frame(train)

for (i in 1:80) {
  if (typeof(train_visu[, i]) == "character") {
    train_visu[is.na(train_visu[, i]), i] <- ""
    train_visu[, i] <- as.factor(train_visu[, i])
  }
}

for (i in 1:16) {
  plot(tableplot(train_visu, select = c(((i - 1) * 5 + 1):(i * 5), 81), sortCol = 6, nBins = 73, plot = FALSE), fontsize = 12, title = paste("log(SalePrice) vs ", paste(colnames(train_visu)[((i - 1) * 5 + 1):(i * 5)], collapse = "+"), sep = ""), showTitle = TRUE, fontsize.title = 12)
}
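
To make the idea behind the table plots above more concrete, here is a minimal base R sketch of the same three steps: sort the rows by the target, cut them into bins, and summarise each column per bin. It uses made-up toy data and does not call tabplot; the column names GrLivArea and LogSalePrice are only illustrative.

```r
# Toy data: a numeric target and one numeric feature (made-up values).
set.seed(42)
toy <- data.frame(GrLivArea = round(runif(200, 500, 3000)))
toy$LogSalePrice <- 10 + 0.0005 * toy$GrLivArea + rnorm(200, sd = 0.1)

# 1. Sort rows by the target, highest first.
toy_sorted <- toy[order(toy$LogSalePrice, decreasing = TRUE), ]

# 2. Split the sorted rows into equally sized bins (10 bins of 20 rows here).
n_bins <- 10
bin_id <- rep(seq_len(n_bins), each = nrow(toy_sorted) / n_bins)

# 3. Summarise each column per bin; a table plot draws such summaries as bars.
bin_means <- aggregate(toy_sorted, by = list(bin = bin_id), FUN = mean)
print(bin_means)
```

A feature whose per-bin mean moves steadily from the first bin to the last, as GrLivArea does in this toy example, is one that visibly tracks the sale price.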




Boruta

Thanks to Jim Thompson (JMT5802) for this Boruta Feature Importance Analysis. This report determines what features may be relevant to predicting house sale price. This analysis is based on the Boruta package. The code can be found here.
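
As a rough intuition for what Boruta does, the sketch below compares each real feature with "shadow" features, which are shuffled copies that cannot carry any signal. It is a simplified, single-round illustration on made-up data: Boruta itself uses random forest importance and repeats the comparison over many runs, whereas this sketch swaps in absolute correlation as a stand-in importance score.

```r
set.seed(13)
n <- 300
toy <- data.frame(
  signal = rnorm(n),   # truly related to the response
  noise  = rnorm(n)    # unrelated to the response
)
response <- 3 * toy$signal + rnorm(n)

# Shadow features: each column shuffled independently, which breaks any link
# with the response while keeping the column's own distribution.
shadows <- as.data.frame(lapply(toy, sample))
names(shadows) <- paste0("shadow_", names(toy))

# Stand-in importance score (absolute correlation, NOT random forest importance).
importance <- function(df) sapply(df, function(col) abs(cor(col, response)))

real_imp   <- importance(toy)
shadow_max <- max(importance(shadows))

# A feature looks relevant in this round if it beats the best shadow.
beats_shadow <- real_imp > shadow_max
print(beats_shadow)
```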

ID.VAR <- "Id"
TARGET.VAR <- "SalePrice"

# Data preparation for Boruta analysis
# retrieve data for analysis (ROOT.DIR is not defined in this excerpt; it must point to the directory that contains input/)
sample.df <- read.csv(file.path(ROOT.DIR,"input/train.csv"),stringsAsFactors = FALSE)
# extract only candidate feature names
candidate.features <- setdiff(names(sample.df),c(ID.VAR,TARGET.VAR))
data.type <- sapply(candidate.features,function(x){class(sample.df[[x]])})
# determine data types
explanatory.attributes <- setdiff(names(sample.df),c(ID.VAR,TARGET.VAR))
data.classes <- sapply(explanatory.attributes,function(x){class(sample.df[[x]])})
# categorize data types in the data set
unique.classes <- unique(data.classes)
attr.data.types <- lapply(unique.classes,function(x){names(data.classes[data.classes==x])})
names(attr.data.types) <- unique.classes

#Prepare data set for Boruta analysis.  For this analysis, missing values are
#handled as follows:
#* missing numeric data is set to -1
#* missing character data is set to __*MISSING*__

# pull out the response variable
response <- sample.df$SalePrice

# remove identifier and response variables
sample.df <- sample.df[candidate.features]

# for numeric set missing values to -1 for purposes of the random forest run
for (x in attr.data.types$integer){
  sample.df[[x]][is.na(sample.df[[x]])] <- -1
}

for (x in attr.data.types$character){
  sample.df[[x]][is.na(sample.df[[x]])] <- "*MISSING*"
}

# Run Boruta Analysis
set.seed(13)
bor.results <- Boruta(sample.df,response,
                   maxRuns=101,
                   doTrace=0)
cat("\nSummary of Boruta run:\n")
print(bor.results)

cat("\n\nRelevant Attributes:\n")
getSelectedAttributes(bor.results)
plot(bor.results)

# Detailed results for each candidate explanatory attribute

cat("\n\nAttribute Importance Details:\n")
options(width=125)
arrange(cbind(attr=rownames(attStats(bor.results)), attStats(bor.results)),desc(medianImp))
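
The missing-value convention described in the comments above (-1 for numeric data, *MISSING* for character data) can be wrapped in a small helper. This is a sketch under assumptions: fill_missing is a hypothetical function name, the toy data frame is made up, and unlike the loops above it treats every numeric column the same way rather than only integer columns.

```r
# Hypothetical helper: replace NAs with a sentinel chosen by column type.
fill_missing <- function(df, numeric_fill = -1, character_fill = "*MISSING*") {
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      df[[col]][is.na(df[[col]])] <- numeric_fill
    } else if (is.character(df[[col]])) {
      df[[col]][is.na(df[[col]])] <- character_fill
    }
  }
  df
}

# Made-up example with one numeric and one character column.
toy <- data.frame(
  LotFrontage = c(65, NA, 80),
  Alley       = c(NA, "Grvl", NA),
  stringsAsFactors = FALSE
)
filled <- fill_missing(toy)
print(filled)
```

Sentinel values like these suit tree-based methods such as the random forest behind Boruta, which can split them off from the real values; they would distort a linear model.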




Missing values

aggr(train, prop = FALSE, numbers = TRUE)

apply(is.na(train),2,sum)
##            Id    MSSubClass      MSZoning   LotFrontage       LotArea 
##             0             0             0           259             0 
##        Street         Alley      LotShape   LandContour     Utilities 
##             0          1369             0             0             0 
##     LotConfig     LandSlope  Neighborhood    Condition1    Condition2 
##             0             0             0             0             0 
##      BldgType    HouseStyle   OverallQual   OverallCond     YearBuilt 
##             0             0             0             0             0 
##  YearRemodAdd     RoofStyle      RoofMatl   Exterior1st   Exterior2nd 
##             0             0             0             0             0 
##    MasVnrType    MasVnrArea     ExterQual     ExterCond    Foundation 
##             8             8             0             0             0 
##      BsmtQual      BsmtCond  BsmtExposure  BsmtFinType1    BsmtFinSF1 
##            37            37            38            37             0 
##  BsmtFinType2    BsmtFinSF2     BsmtUnfSF   TotalBsmtSF       Heating 
##            38             0             0             0             0 
##     HeatingQC    CentralAir    Electrical     X1stFlrSF     X2ndFlrSF 
##             0             0             1             0             0 
##  LowQualFinSF     GrLivArea  BsmtFullBath  BsmtHalfBath      FullBath 
##             0             0             0             0             0 
##      HalfBath  BedroomAbvGr  KitchenAbvGr   KitchenQual  TotRmsAbvGrd 
##             0             0             0             0             0 
##    Functional    Fireplaces   FireplaceQu    GarageType   GarageYrBlt 
##             0             0           690            81            81 
##  GarageFinish    GarageCars    GarageArea    GarageQual    GarageCond 
##            81             0             0            81            81 
##    PavedDrive    WoodDeckSF   OpenPorchSF EnclosedPorch    X3SsnPorch 
##             0             0             0             0             0 
##   ScreenPorch      PoolArea        PoolQC         Fence   MiscFeature 
##             0             0          1453          1179          1406 
##       MiscVal        MoSold        YrSold      SaleType SaleCondition 
##             0             0             0             0             0 
##     SalePrice 
##             0
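
Raw counts are easier to compare once they are expressed as a share of the rows; for example, the 1453 missing PoolQC values out of 1460 rows amount to roughly 99.5%. A minimal sketch of that calculation, on a made-up toy data frame rather than the real training set:

```r
# Made-up example with two incomplete columns and one complete column.
toy <- data.frame(
  LotFrontage = c(65, NA, 80, NA),
  PoolQC      = c(NA, NA, NA, "Gd"),
  LotArea     = c(8450, 9600, 11250, 9550),
  stringsAsFactors = FALSE
)

# Fraction of missing values per column, largest first, incomplete columns only.
missing_share <- colMeans(is.na(toy))
missing_share <- sort(missing_share[missing_share > 0], decreasing = TRUE)
print(round(missing_share, 2))
```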



Data Preparation


The goal here is to select the most relevant features, reshape them, handle missing values and outliers and get data ready to be processed by different machine learning models.


Feature selection

# 1. Incorporate results of Boruta analysis
Boruta_analysis <- c("MSSubClass","MSZoning","LotArea","LotShape","LandContour","Neighborhood",
                    "BldgType","HouseStyle","OverallQual","OverallCond","YearBuilt",
                    "YearRemodAdd","Exterior1st","Exterior2nd","MasVnrArea","ExterQual",
                    "Foundation","BsmtQual","BsmtCond","BsmtFinType1","BsmtFinSF1",
                    "BsmtFinType2","BsmtUnfSF","TotalBsmtSF","HeatingQC","CentralAir",
                    "X1stFlrSF","X2ndFlrSF","GrLivArea","BsmtFullBath","FullBath","HalfBath",
                    "BedroomAbvGr","KitchenAbvGr","KitchenQual","TotRmsAbvGrd","Functional",
                    "Fireplaces","FireplaceQu","GarageType","GarageYrBlt","GarageFinish",
                    "GarageCars","GarageArea","GarageQual","GarageCond","PavedDrive","WoodDeckSF",
                    "OpenPorchSF","Fence", "SalePrice")

train_selected_boruta <- train[Boruta_analysis]

# Identify near zero variance predictors: remove_cols
remove_cols <- nearZeroVar(train_selected_boruta, names = TRUE, 
                           freqCut = 2, uniqueCut = 20)

# Remove predictors with low variance 
all_cols <- names(train_selected_boruta)
train_selected <- train_selected_boruta[ , setdiff(all_cols, remove_cols)]
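
The two thresholds passed to nearZeroVar above can be illustrated in base R. As I understand caret's documented rule, a predictor is flagged when the ratio of its most common value to its second most common value exceeds freqCut and its percentage of distinct values is at most uniqueCut. The helper nzv_metrics and the toy data are hypothetical, and the sketch does not call caret.

```r
# Hypothetical helper: the two quantities the near-zero-variance rule relies on.
nzv_metrics <- function(x) {
  counts <- sort(table(x), decreasing = TRUE)
  freq_ratio <- if (length(counts) > 1) counts[[1]] / counts[[2]] else Inf
  percent_unique <- 100 * length(counts) / length(x)
  c(freq_ratio = freq_ratio, percent_unique = percent_unique)
}

toy <- data.frame(
  CentralAir = c(rep("Y", 18), rep("N", 2)),   # dominated by one level
  GrLivArea  = seq(900, 2800, by = 100)        # 20 distinct values
)

metrics <- t(sapply(toy, nzv_metrics))

# Same cut-offs as in the call above: freqCut = 2, uniqueCut = 20.
flagged <- metrics[, "freq_ratio"] > 2 & metrics[, "percent_unique"] <= 20
print(metrics)
print(names(flagged)[flagged])
```

With freqCut = 2 and uniqueCut = 20 the filter is far stricter than caret's defaults (freqCut = 95/5, uniqueCut = 10), so it also drops predictors that are only moderately unbalanced.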



Types

# transform all the character variables into factors
train_selected[sapply(train_selected, is.character)] <- lapply(train_selected[sapply(train_selected, is.character)], as.factor)



Outliers

train_selected_outliers <- train_selected
# remove outliers: keep only rows whose SalePrice lies between the 1st and 99th percentiles
price_bounds <- quantile(train_selected$SalePrice, probs = c(.01, .99))
train_selected <- subset(train_selected, !(train_selected$SalePrice > price_bounds[2] | train_selected$SalePrice < price_bounds[1]))

par(mfrow=c(1,2))
boxplot(train_selected_outliers$SalePrice, main="Before")
boxplot(train_selected$SalePrice, main="After")




Missing values

Heuristics or rules of thumb

  • Numeric variables: pmm (predictive mean matching)
  • Binary variables (2 levels): logreg (logistic regression)
  • Factor variables (> 2 levels): polyreg (Bayesian polytomous regression)

Since we are keeping this model simple, we just use pmm for every variable.
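
For readers unfamiliar with predictive mean matching, the sketch below shows the core idea in base R: predict the incomplete variable with a regression, find the observed cases whose predictions are closest, and borrow the real value of one of them. It is a deliberately simplified, single-imputation version on made-up data. The helper pmm_impute and the choice of k = 5 donors are assumptions for illustration; the mice implementation is more elaborate (for instance, it also draws the regression parameters at random and produces several imputed data sets).

```r
# Hypothetical helper: impute missing y values by predictive mean matching on x.
pmm_impute <- function(y, x, k = 5) {
  observed <- !is.na(y)
  dat <- data.frame(x = x, y = y)
  fit <- lm(y ~ x, data = dat[observed, ])
  predicted <- predict(fit, newdata = dat)
  donors_y <- y[observed]
  donors_pred <- predicted[observed]
  for (i in which(!observed)) {
    # k observed cases whose predictions are closest to this case's prediction
    n_donors <- min(k, length(donors_y))
    nearest <- order(abs(donors_pred - predicted[i]))[seq_len(n_donors)]
    # borrow the observed value of one randomly chosen donor
    y[i] <- donors_y[nearest[sample.int(length(nearest), 1)]]
  }
  y
}

# Made-up data: y depends on x, and 15 of the 100 y values are removed.
set.seed(500)
x <- runif(100, 0, 10)
y <- 2 * x + rnorm(100)
y_missing <- y
y_missing[sample(100, 15)] <- NA

y_imputed <- pmm_impute(y_missing, x)
summary(y_imputed)
```

Because every imputed value is copied from an observed case, the method never produces values outside the range actually seen in the data, such as a negative area or a year in the future.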

# numeric variables with missing values: MasVnrArea, GarageYrBlt
# factor variables with missing values: BsmtQual, BsmtFinType1, FireplaceQu, GarageFinish
## FireplaceQu is missing almost 50% of its values!
tempData <- mice(train_selected,m=5,maxit=50,meth='pmm',seed=500)
## 
##  iter imp variable
##   1   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   1   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   1   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   1   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   1   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   2   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   3   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   4   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   5   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   6   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   7   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   8   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   9   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   10   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   11   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   12   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   13   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   14   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   15   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   16   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   17   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   18   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   19   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   20   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   21   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   22   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   23   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   24   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   25   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   26   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   27   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   28   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   29   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   30   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   31   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   32   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   33   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   34   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   35   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   36   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   37   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   38   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   39   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   40   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   41   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   42   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   43   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   44   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   45   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   46   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   47   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   48   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   49   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   1  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   2  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   3  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   4  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
##   50   5  MasVnrArea  BsmtQual  BsmtFinType1  FireplaceQu  GarageYrBlt  GarageFinish
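# Keep the first of the imputed data sets produced by mice
train_selected_reform <- complete(tempData,1)
# Count the remaining missing values per column (every count should be 0)
apply(is.na(train_selected_reform),2,sum)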
##   MSSubClass      LotArea     LotShape Neighborhood   HouseStyle 
##            0            0            0            0            0 
##  OverallQual    YearBuilt YearRemodAdd   MasVnrArea    ExterQual 
##            0            0            0            0            0 
##   Foundation     BsmtQual BsmtFinType1   BsmtFinSF1    BsmtUnfSF 
##            0            0            0            0            0 
##  TotalBsmtSF    HeatingQC    X1stFlrSF    X2ndFlrSF    GrLivArea 
##            0            0            0            0            0 
## BsmtFullBath     FullBath     HalfBath  KitchenQual TotRmsAbvGrd 
##            0            0            0            0            0 
##   Fireplaces  FireplaceQu  GarageYrBlt GarageFinish   GarageArea 
##            0            0            0            0            0 
##    SalePrice 
##            0



Modeling


We now build and evaluate two simple regression models that will need to be further tuned.
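To make the evaluation step concrete, here is a minimal sketch of k-fold cross-validation written in base R. It is an illustration only: the built-in mtcars data and a plain lm() fit are stand-ins (assumptions, not part of this analysis) for the housing data and the caret models.

```r
# Minimal k-fold cross-validation in base R (illustration only).
# Assumption: mtcars and lm() stand in for the housing data and caret models.
set.seed(42)
k <- 8
n <- nrow(mtcars)

# Randomly assign each row to one of k folds of near-equal size
fold_id <- sample(rep(seq_len(k), length.out = n))

# For each fold: fit on the other folds, predict the held-out rows, score RMSE
rmse_per_fold <- sapply(seq_len(k), function(i) {
  train_rows <- mtcars[fold_id != i, ]
  test_rows  <- mtcars[fold_id == i, ]
  fit  <- lm(mpg ~ wt + hp, data = train_rows)
  pred <- predict(fit, newdata = test_rows)
  sqrt(mean((test_rows$mpg - pred)^2))
})

# The cross-validated error is the average over the held-out folds
cv_rmse <- mean(rmse_per_fold)
print(rmse_per_fold)
print(cv_rmse)
```

Each row is held out exactly once, so the averaged error is computed only on data the corresponding fit never saw.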

Generalized Boosted Regression Models (GBM)
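When no tuning grid is supplied, caret falls back on a small default grid for this method. As a hedged sketch of how the further tuning could be set up, the grid below lists the four parameters caret tunes for method="gbm". The candidate values and the names gbm_grid and gbm_tuned are hypothetical choices for illustration, not the settings used in this analysis.

```r
# Hypothetical tuning grid for caret's "gbm" method (values are illustrative).
# caret tunes four parameters for this method:
# n.trees, interaction.depth, shrinkage and n.minobsinnode.
gbm_grid <- expand.grid(
  n.trees           = c(150, 300, 500),
  interaction.depth = c(1, 2, 3),
  shrinkage         = c(0.05, 0.1),
  n.minobsinnode    = 10
)
print(nrow(gbm_grid))  # 3 * 3 * 2 * 1 candidate settings

# Runs only when caret, gbm and the objects defined in this post are available
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("gbm", quietly = TRUE) &&
    exists("train_selected_reform") && exists("train_control")) {
  gbm_tuned <- caret::train(SalePrice ~ ., data = train_selected_reform,
                            method = "gbm", trControl = train_control,
                            tuneGrid = gbm_grid,
                            verbose = FALSE)  # passed on to gbm to silence the iteration log
  print(gbm_tuned$bestTune)
}
```

A larger grid multiplies the number of fits by the number of resamples, so it is usually worth starting with a coarse grid and refining around the best setting.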

### 1
# Train with 8-fold cross-validation
# (`repeats` only takes effect with method="repeatedcv", so it is dropped here)
train_control <- trainControl(method="cv", number=8)

# Build the Generalized Boosted Regression Models
gbm <- train(SalePrice~., data=train_selected_reform, trControl=train_control, method="gbm")
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1219             nan     0.1000    0.0121
##      2        0.1123             nan     0.1000    0.0097
##      3        0.1041             nan     0.1000    0.0080
##      4        0.0969             nan     0.1000    0.0071
##      5        0.0909             nan     0.1000    0.0058
##      6        0.0851             nan     0.1000    0.0052
##      7        0.0800             nan     0.1000    0.0049
##      8        0.0756             nan     0.1000    0.0038
##      9        0.0715             nan     0.1000    0.0040
##     10        0.0682             nan     0.1000    0.0031
##     20        0.0438             nan     0.1000    0.0013
##     40        0.0271             nan     0.1000    0.0004
##     60        0.0212             nan     0.1000    0.0001
##     80        0.0182             nan     0.1000    0.0000
##    100        0.0168             nan     0.1000   -0.0000
##    120        0.0159             nan     0.1000    0.0000
##    140        0.0153             nan     0.1000   -0.0000
##    150        0.0150             nan     0.1000    0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1206             nan     0.1000    0.0136
##      2        0.1092             nan     0.1000    0.0113
##      3        0.0995             nan     0.1000    0.0098
##      4        0.0901             nan     0.1000    0.0091
##      5        0.0825             nan     0.1000    0.0070
##      6        0.0756             nan     0.1000    0.0066
##      7        0.0703             nan     0.1000    0.0054
##      8        0.0654             nan     0.1000    0.0047
##      9        0.0606             nan     0.1000    0.0045
##     10        0.0566             nan     0.1000    0.0038
##     20        0.0330             nan     0.1000    0.0013
##     40        0.0196             nan     0.1000    0.0002
##     60        0.0158             nan     0.1000    0.0000
##     80        0.0144             nan     0.1000    0.0000
##    100        0.0135             nan     0.1000    0.0000
##    120        0.0130             nan     0.1000   -0.0000
##    140        0.0123             nan     0.1000   -0.0000
##    150        0.0121             nan     0.1000   -0.0001
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1185             nan     0.1000    0.0151
##      2        0.1053             nan     0.1000    0.0134
##      3        0.0945             nan     0.1000    0.0110
##      4        0.0846             nan     0.1000    0.0098
##      5        0.0771             nan     0.1000    0.0069
##      6        0.0702             nan     0.1000    0.0062
##      7        0.0640             nan     0.1000    0.0062
##      8        0.0590             nan     0.1000    0.0047
##      9        0.0541             nan     0.1000    0.0043
##     10        0.0499             nan     0.1000    0.0038
##     20        0.0276             nan     0.1000    0.0009
##     40        0.0165             nan     0.1000    0.0001
##     60        0.0139             nan     0.1000    0.0001
##     80        0.0126             nan     0.1000   -0.0000
##    100        0.0119             nan     0.1000    0.0000
##    120        0.0112             nan     0.1000   -0.0000
##    140        0.0106             nan     0.1000   -0.0000
##    150        0.0103             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1209             nan     0.1000    0.0123
##      2        0.1114             nan     0.1000    0.0098
##      3        0.1029             nan     0.1000    0.0079
##      4        0.0959             nan     0.1000    0.0070
##      5        0.0899             nan     0.1000    0.0060
##      6        0.0846             nan     0.1000    0.0052
##      7        0.0799             nan     0.1000    0.0043
##      8        0.0754             nan     0.1000    0.0041
##      9        0.0715             nan     0.1000    0.0039
##     10        0.0678             nan     0.1000    0.0037
##     20        0.0444             nan     0.1000    0.0014
##     40        0.0277             nan     0.1000    0.0003
##     60        0.0218             nan     0.1000    0.0001
##     80        0.0188             nan     0.1000    0.0001
##    100        0.0173             nan     0.1000    0.0000
##    120        0.0164             nan     0.1000   -0.0000
##    140        0.0158             nan     0.1000   -0.0000
##    150        0.0155             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1191             nan     0.1000    0.0143
##      2        0.1079             nan     0.1000    0.0113
##      3        0.0973             nan     0.1000    0.0107
##      4        0.0886             nan     0.1000    0.0082
##      5        0.0814             nan     0.1000    0.0072
##      6        0.0751             nan     0.1000    0.0068
##      7        0.0696             nan     0.1000    0.0055
##      8        0.0646             nan     0.1000    0.0044
##      9        0.0599             nan     0.1000    0.0046
##     10        0.0565             nan     0.1000    0.0034
##     20        0.0336             nan     0.1000    0.0014
##     40        0.0201             nan     0.1000    0.0002
##     60        0.0163             nan     0.1000    0.0000
##     80        0.0149             nan     0.1000   -0.0000
##    100        0.0140             nan     0.1000   -0.0000
##    120        0.0132             nan     0.1000   -0.0000
##    140        0.0126             nan     0.1000   -0.0000
##    150        0.0125             nan     0.1000   -0.0001
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1172             nan     0.1000    0.0162
##      2        0.1040             nan     0.1000    0.0125
##      3        0.0930             nan     0.1000    0.0105
##      4        0.0845             nan     0.1000    0.0081
##      5        0.0768             nan     0.1000    0.0075
##      6        0.0693             nan     0.1000    0.0069
##      7        0.0634             nan     0.1000    0.0053
##      8        0.0582             nan     0.1000    0.0047
##      9        0.0534             nan     0.1000    0.0044
##     10        0.0496             nan     0.1000    0.0033
##     20        0.0274             nan     0.1000    0.0010
##     40        0.0166             nan     0.1000    0.0001
##     60        0.0140             nan     0.1000   -0.0000
##     80        0.0127             nan     0.1000    0.0000
##    100        0.0119             nan     0.1000    0.0000
##    120        0.0113             nan     0.1000   -0.0000
##    140        0.0107             nan     0.1000   -0.0000
##    150        0.0105             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1180             nan     0.1000    0.0116
##      2        0.1082             nan     0.1000    0.0091
##      3        0.1005             nan     0.1000    0.0070
##      4        0.0934             nan     0.1000    0.0069
##      5        0.0873             nan     0.1000    0.0060
##      6        0.0822             nan     0.1000    0.0051
##      7        0.0771             nan     0.1000    0.0047
##      8        0.0731             nan     0.1000    0.0039
##      9        0.0694             nan     0.1000    0.0037
##     10        0.0659             nan     0.1000    0.0030
##     20        0.0431             nan     0.1000    0.0014
##     40        0.0265             nan     0.1000    0.0003
##     60        0.0206             nan     0.1000    0.0002
##     80        0.0175             nan     0.1000   -0.0000
##    100        0.0159             nan     0.1000    0.0000
##    120        0.0149             nan     0.1000   -0.0001
##    140        0.0143             nan     0.1000    0.0000
##    150        0.0141             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1172             nan     0.1000    0.0133
##      2        0.1056             nan     0.1000    0.0112
##      3        0.0958             nan     0.1000    0.0094
##      4        0.0864             nan     0.1000    0.0086
##      5        0.0791             nan     0.1000    0.0072
##      6        0.0732             nan     0.1000    0.0053
##      7        0.0677             nan     0.1000    0.0053
##      8        0.0626             nan     0.1000    0.0051
##      9        0.0583             nan     0.1000    0.0041
##     10        0.0546             nan     0.1000    0.0038
##     20        0.0326             nan     0.1000    0.0014
##     40        0.0191             nan     0.1000    0.0002
##     60        0.0151             nan     0.1000    0.0000
##     80        0.0135             nan     0.1000   -0.0000
##    100        0.0126             nan     0.1000    0.0000
##    120        0.0120             nan     0.1000   -0.0000
##    140        0.0115             nan     0.1000   -0.0001
##    150        0.0113             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1148             nan     0.1000    0.0151
##      2        0.1016             nan     0.1000    0.0128
##      3        0.0912             nan     0.1000    0.0100
##      4        0.0819             nan     0.1000    0.0093
##      5        0.0742             nan     0.1000    0.0073
##      6        0.0675             nan     0.1000    0.0065
##      7        0.0616             nan     0.1000    0.0057
##      8        0.0565             nan     0.1000    0.0045
##      9        0.0520             nan     0.1000    0.0038
##     10        0.0480             nan     0.1000    0.0038
##     20        0.0271             nan     0.1000    0.0009
##     40        0.0157             nan     0.1000    0.0002
##     60        0.0128             nan     0.1000    0.0000
##     80        0.0118             nan     0.1000    0.0000
##    100        0.0111             nan     0.1000   -0.0000
##    120        0.0104             nan     0.1000   -0.0000
##    140        0.0099             nan     0.1000   -0.0001
##    150        0.0096             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1205             nan     0.1000    0.0118
##      2        0.1106             nan     0.1000    0.0098
##      3        0.1027             nan     0.1000    0.0076
##      4        0.0960             nan     0.1000    0.0064
##      5        0.0894             nan     0.1000    0.0059
##      6        0.0842             nan     0.1000    0.0052
##      7        0.0795             nan     0.1000    0.0044
##      8        0.0750             nan     0.1000    0.0047
##      9        0.0709             nan     0.1000    0.0041
##     10        0.0675             nan     0.1000    0.0035
##     20        0.0446             nan     0.1000    0.0013
##     40        0.0277             nan     0.1000    0.0004
##     60        0.0214             nan     0.1000    0.0002
##     80        0.0187             nan     0.1000    0.0000
##    100        0.0171             nan     0.1000    0.0001
##    120        0.0162             nan     0.1000    0.0000
##    140        0.0155             nan     0.1000   -0.0000
##    150        0.0153             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1181             nan     0.1000    0.0138
##      2        0.1067             nan     0.1000    0.0115
##      3        0.0970             nan     0.1000    0.0094
##      4        0.0895             nan     0.1000    0.0078
##      5        0.0818             nan     0.1000    0.0075
##      6        0.0756             nan     0.1000    0.0055
##      7        0.0695             nan     0.1000    0.0060
##      8        0.0648             nan     0.1000    0.0046
##      9        0.0599             nan     0.1000    0.0044
##     10        0.0563             nan     0.1000    0.0035
##     20        0.0333             nan     0.1000    0.0013
##     40        0.0201             nan     0.1000    0.0003
##     60        0.0165             nan     0.1000    0.0000
##     80        0.0149             nan     0.1000   -0.0000
##    100        0.0141             nan     0.1000   -0.0000
##    120        0.0133             nan     0.1000   -0.0000
##    140        0.0128             nan     0.1000   -0.0000
##    150        0.0125             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1174             nan     0.1000    0.0153
##      2        0.1048             nan     0.1000    0.0124
##      3        0.0938             nan     0.1000    0.0110
##      4        0.0844             nan     0.1000    0.0087
##      5        0.0768             nan     0.1000    0.0070
##      6        0.0705             nan     0.1000    0.0063
##      7        0.0641             nan     0.1000    0.0060
##      8        0.0590             nan     0.1000    0.0049
##      9        0.0546             nan     0.1000    0.0041
##     10        0.0503             nan     0.1000    0.0038
##     20        0.0280             nan     0.1000    0.0010
##     40        0.0167             nan     0.1000    0.0002
##     60        0.0140             nan     0.1000    0.0000
##     80        0.0128             nan     0.1000   -0.0000
##    100        0.0120             nan     0.1000   -0.0000
##    120        0.0113             nan     0.1000   -0.0000
##    140        0.0106             nan     0.1000   -0.0000
##    150        0.0104             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1198             nan     0.1000    0.0123
##      2        0.1105             nan     0.1000    0.0098
##      3        0.1022             nan     0.1000    0.0077
##      4        0.0947             nan     0.1000    0.0072
##      5        0.0886             nan     0.1000    0.0056
##      6        0.0831             nan     0.1000    0.0054
##      7        0.0786             nan     0.1000    0.0043
##      8        0.0740             nan     0.1000    0.0044
##      9        0.0702             nan     0.1000    0.0039
##     10        0.0668             nan     0.1000    0.0035
##     20        0.0436             nan     0.1000    0.0014
##     40        0.0275             nan     0.1000    0.0004
##     60        0.0215             nan     0.1000    0.0001
##     80        0.0189             nan     0.1000   -0.0001
##    100        0.0173             nan     0.1000    0.0000
##    120        0.0164             nan     0.1000    0.0000
##    140        0.0157             nan     0.1000    0.0000
##    150        0.0155             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1181             nan     0.1000    0.0135
##      2        0.1067             nan     0.1000    0.0116
##      3        0.0962             nan     0.1000    0.0102
##      4        0.0875             nan     0.1000    0.0081
##      5        0.0810             nan     0.1000    0.0066
##      6        0.0746             nan     0.1000    0.0066
##      7        0.0693             nan     0.1000    0.0053
##      8        0.0647             nan     0.1000    0.0043
##      9        0.0602             nan     0.1000    0.0043
##     10        0.0561             nan     0.1000    0.0039
##     20        0.0332             nan     0.1000    0.0012
##     40        0.0200             nan     0.1000    0.0002
##     60        0.0164             nan     0.1000    0.0000
##     80        0.0149             nan     0.1000    0.0000
##    100        0.0139             nan     0.1000   -0.0000
##    120        0.0131             nan     0.1000   -0.0000
##    140        0.0125             nan     0.1000   -0.0000
##    150        0.0122             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1162             nan     0.1000    0.0162
##      2        0.1034             nan     0.1000    0.0129
##      3        0.0921             nan     0.1000    0.0112
##      4        0.0830             nan     0.1000    0.0091
##      5        0.0753             nan     0.1000    0.0077
##      6        0.0686             nan     0.1000    0.0058
##      7        0.0630             nan     0.1000    0.0053
##      8        0.0574             nan     0.1000    0.0051
##      9        0.0530             nan     0.1000    0.0048
##     10        0.0492             nan     0.1000    0.0035
##     20        0.0276             nan     0.1000    0.0011
##     40        0.0166             nan     0.1000    0.0001
##     60        0.0139             nan     0.1000   -0.0001
##     80        0.0126             nan     0.1000   -0.0000
##    100        0.0116             nan     0.1000   -0.0000
##    120        0.0109             nan     0.1000   -0.0001
##    140        0.0103             nan     0.1000   -0.0000
##    150        0.0100             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1196             nan     0.1000    0.0121
##      2        0.1099             nan     0.1000    0.0096
##      3        0.1019             nan     0.1000    0.0070
##      4        0.0950             nan     0.1000    0.0069
##      5        0.0889             nan     0.1000    0.0058
##      6        0.0833             nan     0.1000    0.0056
##      7        0.0783             nan     0.1000    0.0045
##      8        0.0738             nan     0.1000    0.0044
##      9        0.0698             nan     0.1000    0.0040
##     10        0.0661             nan     0.1000    0.0033
##     20        0.0430             nan     0.1000    0.0014
##     40        0.0269             nan     0.1000    0.0003
##     60        0.0212             nan     0.1000    0.0001
##     80        0.0184             nan     0.1000    0.0001
##    100        0.0170             nan     0.1000   -0.0000
##    120        0.0161             nan     0.1000   -0.0000
##    140        0.0156             nan     0.1000   -0.0000
##    150        0.0153             nan     0.1000   -0.0001
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1179             nan     0.1000    0.0143
##      2        0.1071             nan     0.1000    0.0109
##      3        0.0973             nan     0.1000    0.0092
##      4        0.0886             nan     0.1000    0.0090
##      5        0.0816             nan     0.1000    0.0064
##      6        0.0750             nan     0.1000    0.0065
##      7        0.0691             nan     0.1000    0.0055
##      8        0.0639             nan     0.1000    0.0048
##      9        0.0594             nan     0.1000    0.0042
##     10        0.0557             nan     0.1000    0.0034
##     20        0.0329             nan     0.1000    0.0011
##     40        0.0192             nan     0.1000    0.0002
##     60        0.0157             nan     0.1000    0.0000
##     80        0.0142             nan     0.1000   -0.0000
##    100        0.0135             nan     0.1000    0.0000
##    120        0.0129             nan     0.1000   -0.0000
##    140        0.0125             nan     0.1000   -0.0000
##    150        0.0122             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1160             nan     0.1000    0.0150
##      2        0.1032             nan     0.1000    0.0128
##      3        0.0923             nan     0.1000    0.0103
##      4        0.0830             nan     0.1000    0.0086
##      5        0.0757             nan     0.1000    0.0063
##      6        0.0691             nan     0.1000    0.0069
##      7        0.0625             nan     0.1000    0.0063
##      8        0.0573             nan     0.1000    0.0049
##      9        0.0528             nan     0.1000    0.0042
##     10        0.0492             nan     0.1000    0.0037
##     20        0.0274             nan     0.1000    0.0009
##     40        0.0168             nan     0.1000    0.0002
##     60        0.0140             nan     0.1000   -0.0000
##     80        0.0130             nan     0.1000   -0.0000
##    100        0.0121             nan     0.1000    0.0000
##    120        0.0114             nan     0.1000   -0.0000
##    140        0.0108             nan     0.1000   -0.0000
##    150        0.0105             nan     0.1000    0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1181             nan     0.1000    0.0120
##      2        0.1087             nan     0.1000    0.0096
##      3        0.1008             nan     0.1000    0.0076
##      4        0.0943             nan     0.1000    0.0064
##      5        0.0875             nan     0.1000    0.0063
##      6        0.0822             nan     0.1000    0.0050
##      7        0.0771             nan     0.1000    0.0048
##      8        0.0728             nan     0.1000    0.0040
##      9        0.0689             nan     0.1000    0.0034
##     10        0.0650             nan     0.1000    0.0036
##     20        0.0427             nan     0.1000    0.0013
##     40        0.0269             nan     0.1000    0.0004
##     60        0.0209             nan     0.1000    0.0002
##     80        0.0182             nan     0.1000    0.0001
##    100        0.0168             nan     0.1000    0.0000
##    120        0.0159             nan     0.1000   -0.0000
##    140        0.0152             nan     0.1000   -0.0000
##    150        0.0150             nan     0.1000    0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1158             nan     0.1000    0.0136
##      2        0.1046             nan     0.1000    0.0113
##      3        0.0953             nan     0.1000    0.0092
##      4        0.0870             nan     0.1000    0.0076
##      5        0.0797             nan     0.1000    0.0068
##      6        0.0728             nan     0.1000    0.0066
##      7        0.0674             nan     0.1000    0.0055
##      8        0.0632             nan     0.1000    0.0042
##      9        0.0592             nan     0.1000    0.0040
##     10        0.0551             nan     0.1000    0.0040
##     20        0.0328             nan     0.1000    0.0012
##     40        0.0196             nan     0.1000    0.0002
##     60        0.0156             nan     0.1000    0.0001
##     80        0.0142             nan     0.1000    0.0000
##    100        0.0134             nan     0.1000   -0.0001
##    120        0.0128             nan     0.1000   -0.0000
##    140        0.0123             nan     0.1000   -0.0000
##    150        0.0121             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1146             nan     0.1000    0.0158
##      2        0.1020             nan     0.1000    0.0125
##      3        0.0915             nan     0.1000    0.0102
##      4        0.0830             nan     0.1000    0.0080
##      5        0.0753             nan     0.1000    0.0074
##      6        0.0682             nan     0.1000    0.0070
##      7        0.0625             nan     0.1000    0.0050
##      8        0.0571             nan     0.1000    0.0052
##      9        0.0525             nan     0.1000    0.0044
##     10        0.0488             nan     0.1000    0.0035
##     20        0.0272             nan     0.1000    0.0011
##     40        0.0162             nan     0.1000    0.0001
##     60        0.0137             nan     0.1000   -0.0000
##     80        0.0125             nan     0.1000    0.0000
##    100        0.0116             nan     0.1000   -0.0001
##    120        0.0110             nan     0.1000    0.0000
##    140        0.0105             nan     0.1000   -0.0000
##    150        0.0102             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1197             nan     0.1000    0.0121
##      2        0.1096             nan     0.1000    0.0097
##      3        0.1012             nan     0.1000    0.0080
##      4        0.0941             nan     0.1000    0.0069
##      5        0.0886             nan     0.1000    0.0053
##      6        0.0839             nan     0.1000    0.0048
##      7        0.0789             nan     0.1000    0.0050
##      8        0.0741             nan     0.1000    0.0044
##      9        0.0701             nan     0.1000    0.0040
##     10        0.0665             nan     0.1000    0.0034
##     20        0.0438             nan     0.1000    0.0014
##     40        0.0273             nan     0.1000    0.0005
##     60        0.0212             nan     0.1000    0.0001
##     80        0.0185             nan     0.1000    0.0001
##    100        0.0170             nan     0.1000    0.0000
##    120        0.0162             nan     0.1000   -0.0001
##    140        0.0155             nan     0.1000   -0.0000
##    150        0.0152             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1190             nan     0.1000    0.0133
##      2        0.1066             nan     0.1000    0.0116
##      3        0.0960             nan     0.1000    0.0097
##      4        0.0876             nan     0.1000    0.0083
##      5        0.0806             nan     0.1000    0.0073
##      6        0.0737             nan     0.1000    0.0066
##      7        0.0679             nan     0.1000    0.0056
##      8        0.0627             nan     0.1000    0.0048
##      9        0.0586             nan     0.1000    0.0040
##     10        0.0547             nan     0.1000    0.0036
##     20        0.0327             nan     0.1000    0.0013
##     40        0.0195             nan     0.1000    0.0003
##     60        0.0158             nan     0.1000    0.0001
##     80        0.0143             nan     0.1000   -0.0000
##    100        0.0135             nan     0.1000   -0.0000
##    120        0.0127             nan     0.1000   -0.0001
##    140        0.0122             nan     0.1000   -0.0000
##    150        0.0119             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1173             nan     0.1000    0.0155
##      2        0.1042             nan     0.1000    0.0135
##      3        0.0936             nan     0.1000    0.0102
##      4        0.0842             nan     0.1000    0.0092
##      5        0.0767             nan     0.1000    0.0073
##      6        0.0697             nan     0.1000    0.0070
##      7        0.0634             nan     0.1000    0.0060
##      8        0.0586             nan     0.1000    0.0046
##      9        0.0537             nan     0.1000    0.0043
##     10        0.0495             nan     0.1000    0.0040
##     20        0.0279             nan     0.1000    0.0012
##     40        0.0169             nan     0.1000    0.0001
##     60        0.0140             nan     0.1000    0.0001
##     80        0.0127             nan     0.1000    0.0000
##    100        0.0117             nan     0.1000    0.0000
##    120        0.0109             nan     0.1000   -0.0000
##    140        0.0103             nan     0.1000   -0.0000
##    150        0.0100             nan     0.1000   -0.0000
## 
## Iter   TrainDeviance   ValidDeviance   StepSize   Improve
##      1        0.1160             nan     0.1000    0.0158
##      2        0.1032             nan     0.1000    0.0126
##      3        0.0922             nan     0.1000    0.0106
##      4        0.0833             nan     0.1000    0.0084
##      5        0.0756             nan     0.1000    0.0072
##      6        0.0689             nan     0.1000    0.0067
##      7        0.0635             nan     0.1000    0.0055
##      8        0.0581             nan     0.1000    0.0051
##      9        0.0538             nan     0.1000    0.0043
##     10        0.0498             nan     0.1000    0.0037
##     20        0.0282             nan     0.1000    0.0012
##     40        0.0170             nan     0.1000    0.0001
##     60        0.0141             nan     0.1000   -0.0000
##     80        0.0128             nan     0.1000   -0.0000
##    100        0.0120             nan     0.1000   -0.0000
##    120        0.0113             nan     0.1000   -0.0000
##    140        0.0107             nan     0.1000   -0.0001
##    150        0.0105             nan     0.1000   -0.0000
# make prediction on the train data
prediction <- predict(gbm, train_selected_reform)
binded <- cbind(train_selected_reform, prediction)

Calculate the RMSE

res <- binded$SalePrice - prediction
rmse <- sqrt(mean(res ^ 2))
print(rmse)
## [1] 0.1023828
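
The calculation above can be wrapped in a small helper so that the same formula is reused for every model. This is only a sketch: the function name rmse_of and the toy numbers are made up for illustration and are not part of the original script. Keep in mind that the predictions above are made on the same rows the model was fitted on, so this RMSE will typically look better than the error on unseen data; the resampled RMSE that caret shows when you print the model is usually a more honest estimate.

```r
# Hypothetical helper (not in the original script): root mean squared error
rmse_of <- function(actual, predicted) {
  # both vectors must describe the same observations
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((actual - predicted)^2))
}

# Toy values for illustration only, not taken from the Ames data
actual    <- c(12.0, 11.5, 12.3)
predicted <- c(12.1, 11.4, 12.3)
print(rmse_of(actual, predicted))
```

With the objects from this post, the call would be rmse_of(binded$SalePrice, prediction).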

Example: this code isn’t run (it’s just an idea of how this algorithm could be further tuned)

library(hydroGOF)
library(Metrics)

# grid of candidate tuning parameters for gbm
caretGrid <- expand.grid(interaction.depth = c(1, 3, 5),
                         n.trees = (0:50) * 50,
                         shrinkage = c(0.01, 0.001),
                         n.minobsinnode = 10)
# metric used by caret to pick the best combination
metric <- "RMSE"

set.seed(99)
# train_control is the resampling setup defined earlier
gbm.caret <- train(SalePrice ~ ., data = train_selected_reform, method = "gbm",
                   trControl = train_control, verbose = FALSE,
                   tuneGrid = caretGrid, metric = metric, bag.fraction = 0.75)

print(gbm.caret)
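
To get a feeling for how big this search is before launching it, the grid can be inspected with base R alone. The sketch below rebuilds the same grid and counts the candidates. As far as I know, caret can derive the predictions for smaller n.trees values from the largest fit of each remaining parameter combination, so the number of gbm fits per resample should be closer to the second number than to the first; treat that as an assumption and check it on your own setup.

```r
# Same grid as in the example above, rebuilt with base R only
caretGrid <- expand.grid(interaction.depth = c(1, 3, 5),
                         n.trees = (0:50) * 50,
                         shrinkage = c(0.01, 0.001),
                         n.minobsinnode = 10)

# total number of candidate settings: 3 depths x 51 tree counts x 2 shrinkage values
n_candidates <- nrow(caretGrid)
print(n_candidates)

# distinct combinations once n.trees is left out
other_params <- unique(caretGrid[, c("interaction.depth", "shrinkage", "n.minobsinnode")])
n_other <- nrow(other_params)
print(n_other)

# smallest and largest number of trees in the grid
print(range(caretGrid$n.trees))
```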



Extreme Gradient Boosting (XGBoost)

xgbTree <- train(SalePrice~., data=train_selected_reform, trControl=train_control, method="xgbTree")
# make prediction on the train data
prediction <- predict(xgbTree, train_selected_reform)
binded <- cbind(train_selected_reform, prediction)

Calculate the RMSE

res <- binded$SalePrice - prediction
rmse <- sqrt(mean(res ^ 2))
print(rmse)
## [1] 0.09031233

*Learn more about how to tune an XGBoost model here: https://github.com/topepo/caret/blob/master/RegressionTests/Code/xgbTree.R
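
As a starting point, a tuning grid for method="xgbTree" could look like the sketch below. The seven column names are the tuning parameters caret expects for this method; the values themselves are illustrative guesses and were not tuned on the Ames data. The train call is left commented out because it needs caret, xgboost and the objects defined earlier in this post.

```r
# Illustrative values only; none of these were tuned on the Ames data
xgbGrid <- expand.grid(nrounds = c(100, 200),
                       max_depth = c(2, 4, 6),
                       eta = c(0.05, 0.1),
                       gamma = 0,
                       colsample_bytree = c(0.6, 0.8),
                       min_child_weight = 1,
                       subsample = 0.75)

# number of candidate settings in the grid
print(nrow(xgbGrid))

# Not run here: requires caret, xgboost and the objects from this post
# xgbTree.tuned <- train(SalePrice ~ ., data = train_selected_reform,
#                        method = "xgbTree", trControl = train_control,
#                        tuneGrid = xgbGrid, metric = "RMSE")
```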







