The original data set is available on Kaggle.
This is an individual project and is not intended for commercial use.
Here is a brief version of what you will find in the data description file.
SalePrice: the property’s sale price in dollars. This is the target variable that you’re trying to predict.
MSSubClass: The building class
MSZoning: The general zoning classification
LotFrontage: Linear feet of street connected to property
LotArea: Lot size in square feet
Street: Type of road access
Alley: Type of alley access
LotShape: General shape of property
LandContour: Flatness of the property
Utilities: Type of utilities available
LotConfig: Lot configuration
LandSlope: Slope of property
Neighborhood: Physical locations within Ames city limits
Condition1: Proximity to main road or railroad
Condition2: Proximity to main road or railroad (if a second is present)
BldgType: Type of dwelling
HouseStyle: Style of dwelling
OverallQual: Overall material and finish quality
OverallCond: Overall condition rating
YearBuilt: Original construction date
YearRemodAdd: Remodel date
RoofStyle: Type of roof
RoofMatl: Roof material
Exterior1st: Exterior covering on house
Exterior2nd: Exterior covering on house (if more than one material)
MasVnrType: Masonry veneer type
MasVnrArea: Masonry veneer area in square feet
ExterQual: Exterior material quality
ExterCond: Present condition of the material on the exterior
Foundation: Type of foundation
BsmtQual: Height of the basement
BsmtCond: General condition of the basement
BsmtExposure: Walkout or garden level basement walls
BsmtFinType1: Quality of basement finished area
BsmtFinSF1: Type 1 finished square feet
BsmtFinType2: Quality of second finished area (if present)
BsmtFinSF2: Type 2 finished square feet
BsmtUnfSF: Unfinished square feet of basement area
TotalBsmtSF: Total square feet of basement area
Heating: Type of heating
HeatingQC: Heating quality and condition
CentralAir: Central air conditioning
Electrical: Electrical system
1stFlrSF: First Floor square feet
2ndFlrSF: Second floor square feet
LowQualFinSF: Low quality finished square feet (all floors)
GrLivArea: Above grade (ground) living area square feet
BsmtFullBath: Basement full bathrooms
BsmtHalfBath: Basement half bathrooms
FullBath: Full bathrooms above grade
HalfBath: Half baths above grade
BedroomAbvGr: Number of bedrooms above basement level
KitchenAbvGr: Number of kitchens
KitchenQual: Kitchen quality
TotRmsAbvGrd: Total rooms above grade (does not include bathrooms)
Functional: Home functionality rating
Fireplaces: Number of fireplaces
FireplaceQu: Fireplace quality
GarageType: Garage location
GarageYrBlt: Year garage was built
GarageFinish: Interior finish of the garage
GarageCars: Size of garage in car capacity
GarageArea: Size of garage in square feet
GarageQual: Garage quality
GarageCond: Garage condition
PavedDrive: Paved driveway
WoodDeckSF: Wood deck area in square feet
OpenPorchSF: Open porch area in square feet
EnclosedPorch: Enclosed porch area in square feet
3SsnPorch: Three season porch area in square feet
ScreenPorch: Screen porch area in square feet
PoolArea: Pool area in square feet
PoolQC: Pool quality
Fence: Fence quality
MiscFeature: Miscellaneous feature not covered in other categories
MiscVal: Dollar value of miscellaneous feature
MoSold: Month Sold
YrSold: Year Sold
SaleType: Type of sale
SaleCondition: Condition of sale
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
## ✓ ggplot2 3.3.5 ✓ purrr 0.3.4
## ✓ tibble 3.1.6 ✓ dplyr 1.0.8
## ✓ tidyr 1.2.0 ✓ stringr 1.4.0
## ✓ readr 2.1.2 ✓ forcats 0.5.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(caret)
## Loading required package: lattice
##
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
##
## lift
library(ggpubr)
data_df <- read.csv("~/Desktop/r_houseprice/house/train.csv")
glimpse(data_df)
## Rows: 1,460
## Columns: 81
## $ Id <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 1…
## $ MSSubClass <int> 60, 20, 60, 70, 60, 50, 20, 60, 50, 190, 20, 60, 20, 20,…
## $ MSZoning <chr> "RL", "RL", "RL", "RL", "RL", "RL", "RL", "RL", "RM", "R…
## $ LotFrontage <int> 65, 80, 68, 60, 84, 85, 75, NA, 51, 50, 70, 85, NA, 91, …
## $ LotArea <int> 8450, 9600, 11250, 9550, 14260, 14115, 10084, 10382, 612…
## $ Street <chr> "Pave", "Pave", "Pave", "Pave", "Pave", "Pave", "Pave", …
## $ Alley <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ LotShape <chr> "Reg", "Reg", "IR1", "IR1", "IR1", "IR1", "Reg", "IR1", …
## $ LandContour <chr> "Lvl", "Lvl", "Lvl", "Lvl", "Lvl", "Lvl", "Lvl", "Lvl", …
## $ Utilities <chr> "AllPub", "AllPub", "AllPub", "AllPub", "AllPub", "AllPu…
## $ LotConfig <chr> "Inside", "FR2", "Inside", "Corner", "FR2", "Inside", "I…
## $ LandSlope <chr> "Gtl", "Gtl", "Gtl", "Gtl", "Gtl", "Gtl", "Gtl", "Gtl", …
## $ Neighborhood <chr> "CollgCr", "Veenker", "CollgCr", "Crawfor", "NoRidge", "…
## $ Condition1 <chr> "Norm", "Feedr", "Norm", "Norm", "Norm", "Norm", "Norm",…
## $ Condition2 <chr> "Norm", "Norm", "Norm", "Norm", "Norm", "Norm", "Norm", …
## $ BldgType <chr> "1Fam", "1Fam", "1Fam", "1Fam", "1Fam", "1Fam", "1Fam", …
## $ HouseStyle <chr> "2Story", "1Story", "2Story", "2Story", "2Story", "1.5Fi…
## $ OverallQual <int> 7, 6, 7, 7, 8, 5, 8, 7, 7, 5, 5, 9, 5, 7, 6, 7, 6, 4, 5,…
## $ OverallCond <int> 5, 8, 5, 5, 5, 5, 5, 6, 5, 6, 5, 5, 6, 5, 5, 8, 7, 5, 5,…
## $ YearBuilt <int> 2003, 1976, 2001, 1915, 2000, 1993, 2004, 1973, 1931, 19…
## $ YearRemodAdd <int> 2003, 1976, 2002, 1970, 2000, 1995, 2005, 1973, 1950, 19…
## $ RoofStyle <chr> "Gable", "Gable", "Gable", "Gable", "Gable", "Gable", "G…
## $ RoofMatl <chr> "CompShg", "CompShg", "CompShg", "CompShg", "CompShg", "…
## $ Exterior1st <chr> "VinylSd", "MetalSd", "VinylSd", "Wd Sdng", "VinylSd", "…
## $ Exterior2nd <chr> "VinylSd", "MetalSd", "VinylSd", "Wd Shng", "VinylSd", "…
## $ MasVnrType <chr> "BrkFace", "None", "BrkFace", "None", "BrkFace", "None",…
## $ MasVnrArea <int> 196, 0, 162, 0, 350, 0, 186, 240, 0, 0, 0, 286, 0, 306, …
## $ ExterQual <chr> "Gd", "TA", "Gd", "TA", "Gd", "TA", "Gd", "TA", "TA", "T…
## $ ExterCond <chr> "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA", "T…
## $ Foundation <chr> "PConc", "CBlock", "PConc", "BrkTil", "PConc", "Wood", "…
## $ BsmtQual <chr> "Gd", "Gd", "Gd", "TA", "Gd", "Gd", "Ex", "Gd", "TA", "T…
## $ BsmtCond <chr> "TA", "TA", "TA", "Gd", "TA", "TA", "TA", "TA", "TA", "T…
## $ BsmtExposure <chr> "No", "Gd", "Mn", "No", "Av", "No", "Av", "Mn", "No", "N…
## $ BsmtFinType1 <chr> "GLQ", "ALQ", "GLQ", "ALQ", "GLQ", "GLQ", "GLQ", "ALQ", …
## $ BsmtFinSF1 <int> 706, 978, 486, 216, 655, 732, 1369, 859, 0, 851, 906, 99…
## $ BsmtFinType2 <chr> "Unf", "Unf", "Unf", "Unf", "Unf", "Unf", "Unf", "BLQ", …
## $ BsmtFinSF2 <int> 0, 0, 0, 0, 0, 0, 0, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ BsmtUnfSF <int> 150, 284, 434, 540, 490, 64, 317, 216, 952, 140, 134, 17…
## $ TotalBsmtSF <int> 856, 1262, 920, 756, 1145, 796, 1686, 1107, 952, 991, 10…
## $ Heating <chr> "GasA", "GasA", "GasA", "GasA", "GasA", "GasA", "GasA", …
## $ HeatingQC <chr> "Ex", "Ex", "Ex", "Gd", "Ex", "Ex", "Ex", "Ex", "Gd", "E…
## $ CentralAir <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "…
## $ Electrical <chr> "SBrkr", "SBrkr", "SBrkr", "SBrkr", "SBrkr", "SBrkr", "S…
## $ X1stFlrSF <int> 856, 1262, 920, 961, 1145, 796, 1694, 1107, 1022, 1077, …
## $ X2ndFlrSF <int> 854, 0, 866, 756, 1053, 566, 0, 983, 752, 0, 0, 1142, 0,…
## $ LowQualFinSF <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ GrLivArea <int> 1710, 1262, 1786, 1717, 2198, 1362, 1694, 2090, 1774, 10…
## $ BsmtFullBath <int> 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 1,…
## $ BsmtHalfBath <int> 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ FullBath <int> 2, 2, 2, 1, 2, 1, 2, 2, 2, 1, 1, 3, 1, 2, 1, 1, 1, 2, 1,…
## $ HalfBath <int> 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1,…
## $ BedroomAbvGr <int> 3, 3, 3, 3, 4, 1, 3, 3, 2, 2, 3, 4, 2, 3, 2, 2, 2, 2, 3,…
## $ KitchenAbvGr <int> 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1,…
## $ KitchenQual <chr> "Gd", "TA", "Gd", "Gd", "Gd", "TA", "Gd", "TA", "TA", "T…
## $ TotRmsAbvGrd <int> 8, 6, 6, 7, 9, 5, 7, 7, 8, 5, 5, 11, 4, 7, 5, 5, 5, 6, 6…
## $ Functional <chr> "Typ", "Typ", "Typ", "Typ", "Typ", "Typ", "Typ", "Typ", …
## $ Fireplaces <int> 0, 1, 1, 1, 1, 0, 1, 2, 2, 2, 0, 2, 0, 1, 1, 0, 1, 0, 0,…
## $ FireplaceQu <chr> NA, "TA", "TA", "Gd", "TA", NA, "Gd", "TA", "TA", "TA", …
## $ GarageType <chr> "Attchd", "Attchd", "Attchd", "Detchd", "Attchd", "Attch…
## $ GarageYrBlt <int> 2003, 1976, 2001, 1998, 2000, 1993, 2004, 1973, 1931, 19…
## $ GarageFinish <chr> "RFn", "RFn", "RFn", "Unf", "RFn", "Unf", "RFn", "RFn", …
## $ GarageCars <int> 2, 2, 2, 3, 3, 2, 2, 2, 2, 1, 1, 3, 1, 3, 1, 2, 2, 2, 2,…
## $ GarageArea <int> 548, 460, 608, 642, 836, 480, 636, 484, 468, 205, 384, 7…
## $ GarageQual <chr> "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA", "Fa", "G…
## $ GarageCond <chr> "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA", "TA", "T…
## $ PavedDrive <chr> "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "…
## $ WoodDeckSF <int> 0, 298, 0, 0, 192, 40, 255, 235, 90, 0, 0, 147, 140, 160…
## $ OpenPorchSF <int> 61, 0, 42, 35, 84, 30, 57, 204, 0, 4, 0, 21, 0, 33, 213,…
## $ EnclosedPorch <int> 0, 0, 0, 272, 0, 0, 0, 228, 205, 0, 0, 0, 0, 0, 176, 0, …
## $ X3SsnPorch <int> 0, 0, 0, 0, 0, 320, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
## $ ScreenPorch <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 176, 0, 0, 0, 0, 0, …
## $ PoolArea <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ PoolQC <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
## $ Fence <chr> NA, NA, NA, NA, NA, "MnPrv", NA, NA, NA, NA, NA, NA, NA,…
## $ MiscFeature <chr> NA, NA, NA, NA, NA, "Shed", NA, "Shed", NA, NA, NA, NA, …
## $ MiscVal <int> 0, 0, 0, 0, 0, 700, 0, 350, 0, 0, 0, 0, 0, 0, 0, 0, 700,…
## $ MoSold <int> 2, 5, 9, 2, 12, 10, 8, 11, 4, 1, 2, 7, 9, 8, 5, 7, 3, 10…
## $ YrSold <int> 2008, 2007, 2008, 2006, 2008, 2009, 2007, 2009, 2008, 20…
## $ SaleType <chr> "WD", "WD", "WD", "WD", "WD", "WD", "WD", "WD", "WD", "W…
## $ SaleCondition <chr> "Normal", "Normal", "Normal", "Abnorml", "Normal", "Norm…
## $ SalePrice <int> 208500, 181500, 223500, 140000, 250000, 143000, 307000, …
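The glimpse output lists `X1stFlrSF`, `X2ndFlrSF` and `X3SsnPorch`, while the data description calls these columns `1stFlrSF`, `2ndFlrSF` and `3SsnPorch`. This comes from `read.csv()`, which by default rewrites headers that are not syntactically valid R names. The sketch below reproduces the effect on a small inline CSV; the CSV text is made up for illustration and is not the housing file.

```r
# Toy CSV text with the same style of column names as the housing file
csv_text <- "Id,1stFlrSF,2ndFlrSF,3SsnPorch\n1,856,854,0\n2,1262,0,0"

# Default behaviour: check.names = TRUE runs make.names() on the header,
# so names starting with a digit get an "X" prefix
default_names <- names(read.csv(text = csv_text))

# check.names = FALSE keeps the header exactly as written in the file
raw_names <- names(read.csv(text = csv_text, check.names = FALSE))

print(default_names)
print(raw_names)
```

Keeping the default is usually the more convenient choice, because names such as `1stFlrSF` would otherwise need backticks in formulas and in `aes()`.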
data_df %>%
ggplot(aes(SalePrice)) +
geom_histogram(color = "black", bins = 20, fill = "#b3cde0") +
theme_minimal()+
labs(title = "The SalePrice ranges of property")+
theme(plot.title = element_text(hjust = 0.5))
MSSubClass: Identifies the type of dwelling involved in the sale.
20 1-STORY 1946 & NEWER ALL STYLES
30 1-STORY 1945 & OLDER
40 1-STORY W/FINISHED ATTIC ALL AGES
45 1-1/2 STORY - UNFINISHED ALL AGES
50 1-1/2 STORY FINISHED ALL AGES
60 2-STORY 1946 & NEWER
70 2-STORY 1945 & OLDER
75 2-1/2 STORY ALL AGES
80 SPLIT OR MULTI-LEVEL
85 SPLIT FOYER
90 DUPLEX - ALL STYLES AND AGES
120 1-STORY PUD (Planned Unit Development) - 1946 & NEWER
150 1-1/2 STORY PUD - ALL AGES
160 2-STORY PUD - 1946 & NEWER
180 PUD - MULTILEVEL - INCL SPLIT LEV/FOYER
190 2 FAMILY CONVERSION - ALL STYLES AND AGES
data_df %>%
ggplot(aes(MSSubClass, SalePrice))+
geom_jitter(color = "#68a7d9") +
theme_minimal() +
labs(title = "The types of dwelling involved in SalePrice")+
theme(plot.title = element_text(hjust = 0.5))
MSZoning: Identifies the general zoning classification of the sale.
A Agriculture
C Commercial
FV Floating Village Residential
I Industrial
RH Residential High Density
RL Residential Low Density
RP Residential Low Density Park
RM Residential Medium Density
data_df %>%
ggplot(aes(MSZoning, SalePrice))+
geom_jitter(color = "#68a7d9") +
theme_minimal() +
labs(title = "The types of dwelling zones involved in SalePrice")+
theme(plot.title = element_text(hjust = 0.5))
Street: Type of road access to property
Grvl Gravel
Pave Paved
Alley: Type of alley access to property
Grvl Gravel
Pave Paved
NA No alley access
street <- data_df %>%
ggplot(aes( Street))+
geom_bar(color = "black", fill = "#b3cde0", alpha = 0.8) +
theme_minimal() +
labs(title = "The types of Street access to property")+
theme(plot.title = element_text(hjust = 0.5))
alley <- data_df %>%
ggplot(aes( Alley))+
geom_bar(color = "black", fill = "#b3cde0", alpha = 0.8) +
theme_minimal() +
labs(title = "The types of Alley access to property")+
theme(plot.title = element_text(hjust = 0.5))
ggarrange(street, alley)
data_df %>%
ggplot(aes( Street, SalePrice))+
geom_jitter(color = "#68a7d9") +
theme_minimal() +
facet_wrap(~Alley)+
labs(title = "The types of Street and Alley access to property involved in SalePrice",
subtitle = "Alley")+
theme(plot.title = element_text(hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5) )
LotShape: General shape of property
Reg Regular
IR1 Slightly irregular
IR2 Moderately Irregular
IR3 Irregular
LandContour: Flatness of the property
Lvl Near Flat/Level
Bnk Banked - Quick and significant rise from street grade to building
HLS Hillside - Significant slope from side to side
Low Depression
Utilities: Type of utilities available
AllPub All public Utilities (E,G,W,& S)
NoSeWa Electricity and Gas Only
LandSlope: Slope of property
Gtl Gentle slope
Mod Moderate Slope
Sev Severe Slope
lot_shape <- data_df %>%
ggplot(aes(LotShape, SalePrice))+
geom_col(color ="#b3cde0") +
theme_minimal() +
labs(title = "The general shape of property vs. SalePrice")+
theme(plot.title = element_text(hjust = 0.5))
land_contour <- data_df %>%
ggplot(aes(LandContour, SalePrice))+
geom_col(color = "#b3cde0") +
theme_minimal() +
labs(title = "The Flatness of the property vs. SalePrice")+
theme(plot.title = element_text(hjust = 0.5))
util <- data_df %>%
ggplot(aes(Utilities, SalePrice))+
geom_col(color = "#b3cde0") +
theme_minimal() +
labs(title = "The utilities available vs. SalePrice")+
theme(plot.title = element_text(hjust = 0.5))
land_slope <- data_df %>%
ggplot(aes(LandSlope, SalePrice))+
geom_col(color = "#b3cde0") +
theme_minimal() +
labs(title = "The slope of property vs. SalePrice")+
theme(plot.title = element_text(hjust = 0.5))
ggarrange(lot_shape, land_contour, util, land_slope + rremove("x.text"),
ncol = 2, nrow = 2)
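A note on reading these bar charts: `geom_col()` stacks one segment per row, so with one row per house the bar height for a category is the sum of all sale prices in that category, not a typical price. Categories with many houses therefore look tall even if each house is cheap. The sketch below uses a small made-up data frame (the values are illustrative only) to compare the stacked total with a per-group median, which is one possible alternative summary.

```r
# Toy data: three "Reg" lots and two "IR1" lots (illustrative values)
toy <- data.frame(
  LotShape  = c("Reg", "Reg", "Reg", "IR1", "IR1"),
  SalePrice = c(208500, 181500, 143000, 223500, 250000)
)

# Height that geom_col() would draw per category: the sum of the rows
stacked_total <- tapply(toy$SalePrice, toy$LotShape, sum)

# A per-category typical value: the median sale price
group_median <- tapply(toy$SalePrice, toy$LotShape, median)

summary_df <- data.frame(
  LotShape    = names(group_median),
  MedianPrice = as.numeric(group_median)
)

print(stacked_total)
print(summary_df)

# Optional plot of the medians, built only if ggplot2 is installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  median_plot <- ggplot2::ggplot(summary_df, ggplot2::aes(LotShape, MedianPrice)) +
    ggplot2::geom_col(color = "black", fill = "#b3cde0") +
    ggplot2::theme_minimal()
}
```

In the toy data "Reg" has the larger total but the smaller median, which is the kind of difference the stacked bars can hide.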
Neighborhood: Physical locations within Ames city limits
Blmngtn Bloomington Heights
Blueste Bluestem
BrDale Briardale
BrkSide Brookside
ClearCr Clear Creek
CollgCr College Creek
Crawfor Crawford
Edwards Edwards
Gilbert Gilbert
IDOTRR Iowa DOT and Rail Road
MeadowV Meadow Village
Mitchel Mitchell
NAmes North Ames
NoRidge Northridge
NPkVill Northpark Villa
NridgHt Northridge Heights
NWAmes Northwest Ames
OldTown Old Town
SWISU South & West of Iowa State University
Sawyer Sawyer
SawyerW Sawyer West
Somerst Somerset
StoneBr Stone Brook
Timber Timberland
Veenker Veenker
data_df %>%
ggplot(aes(Neighborhood, SalePrice))+
geom_col(color = "#b3cde0") +
theme_minimal() +
labs(title = "The Physical locations within Ames city limits(Neighborhood) involved in SalePrice")+
theme(plot.title = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 90, vjust = 0.5))
Condition1: Proximity to various conditions
Artery Adjacent to arterial street
Feedr Adjacent to feeder street
Norm Normal
RRNn Within 200' of North-South Railroad
RRAn Adjacent to North-South Railroad
PosN Near positive off-site feature--park, greenbelt, etc.
PosA Adjacent to positive off-site feature
RRNe Within 200' of East-West Railroad
RRAe Adjacent to East-West Railroad
Condition2: Proximity to various conditions (if more than one is present)
Artery Adjacent to arterial street
Feedr Adjacent to feeder street
Norm Normal
RRNn Within 200' of North-South Railroad
RRAn Adjacent to North-South Railroad
PosN Near positive off-site feature--park, greenbelt, etc.
PosA Adjacent to positive off-site feature
RRNe Within 200' of East-West Railroad
RRAe Adjacent to East-West Railroad
con1 <- data_df %>%
ggplot(aes(Condition1, SalePrice))+
geom_col(color = "#b3cde0") +
theme_minimal() +
labs(title = "The Condition1 vs. SalePrice")+
theme(plot.title = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 90, vjust = 0.5))
con2 <- data_df %>%
ggplot(aes(Condition2, SalePrice))+
geom_col(color = "#b3cde0") +
theme_minimal() +
labs(title = "The Condition2 vs. SalePrice")+
theme(plot.title = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 90, vjust = 0.5))
ggarrange(con1, con2)
BldgType: Type of dwelling
1Fam Single-family Detached
2FmCon Two-family Conversion; originally built as one-family dwelling
Duplx Duplex
TwnhsE Townhouse End Unit
TwnhsI Townhouse Inside Unit
data_df %>%
ggplot(aes(BldgType)) +
geom_bar(color = "black",fill = "#b3cde0") +
theme_minimal() +
labs(title = "The type of dwelling") +
theme(plot.title = element_text(hjust = 0.5))
HouseStyle: Style of dwelling
1Story One story
1.5Fin One and one-half story: 2nd level finished
1.5Unf One and one-half story: 2nd level unfinished
2Story Two story
2.5Fin Two and one-half story: 2nd level finished
2.5Unf Two and one-half story: 2nd level unfinished
SFoyer Split Foyer
SLvl Split Level
data_df %>%
ggplot(aes(HouseStyle)) +
geom_bar(color = "black",fill = "#b3cde0") +
theme_minimal() +
labs(title = "The style of dwelling") +
theme(plot.title = element_text(hjust = 0.5))
data_df %>%
ggplot(aes(BldgType, SalePrice)) +
geom_boxplot(fill = "#b3cde0") +
theme_minimal() +
labs(title = "The type of dwelling vs. SalePrice") +
theme(plot.title = element_text(hjust = 0.5))
data_df %>%
ggplot(aes(HouseStyle, SalePrice)) +
geom_boxplot(fill = "#b3cde0") +
theme_minimal() +
labs(title = "The style of dwelling vs. SalePrice") +
theme(plot.title = element_text(hjust = 0.5))
OverallQual: Rates the overall material and finish of the house
10 Very Excellent
9 Excellent
8 Very Good
7 Good
6 Above Average
5 Average
4 Below Average
3 Fair
2 Poor
1 Very Poor
OverallCond: Rates the overall condition of the house
10 Very Excellent
9 Excellent
8 Very Good
7 Good
6 Above Average
5 Average
4 Below Average
3 Fair
2 Poor
1 Very Poor
data_df %>%
ggplot(aes(OverallQual)) +
geom_bar(color = "black", fill = "#b3cde0") +
theme_minimal() +
labs(title = "Rates the overall material and finish of the house") +
theme(plot.title = element_text(hjust = 0.5))
data_df %>%
ggplot(aes(OverallCond)) +
geom_bar(color = "black", fill = "#b3cde0") +
theme_minimal() +
labs(title = "Rates the overall condition of the house") +
theme(plot.title = element_text(hjust = 0.5))
data_df %>%
ggplot(aes(OverallQual, SalePrice)) +
geom_point(color = "#7db0df") +
geom_rug(color = "#6b7b86") +
theme_minimal() +
labs(title = "Rates the overall material and finish of the house") +
theme(plot.title = element_text(hjust = 0.5))
data_df %>%
ggplot(aes(OverallCond, SalePrice)) +
geom_point(color = "#7db0df") +
geom_rug(color = "#6b7b86") +
theme_minimal() +
labs(title = "Rates the overall condition of the house") +
theme(plot.title = element_text(hjust = 0.5))
data_df %>%
ggplot(aes(OverallQual, SalePrice)) +
geom_col(color = "#6fa8dc") +
facet_grid(~MSZoning)+
theme_minimal() +
labs(title = "Zone of property with overall quality and sale price",
subtitle = "Zoning Classification of property sale") +
theme(plot.title = element_text(hjust = 0.5))
YearBuilt: Original construction date
YearRemodAdd: Remodel date (same as construction date if no remodeling or additions)
y_original <- data_df %>%
ggplot(aes(YearBuilt)) +
geom_histogram(bins = 15, color = "black", fill = "#b3cde0") +
theme_minimal() +
labs(title = "Original construction date")+
theme(plot.title = element_text(hjust = 0.5))
y_remodel <- data_df %>%
ggplot(aes(YearRemodAdd)) +
geom_histogram(bins = 15, color = "black", fill = "#b3cde0") +
theme_minimal() +
labs(title = "Remodel date")+
theme(plot.title = element_text(hjust = 0.5))
ggarrange(y_original, y_remodel)
RoofStyle: Type of roof
Flat Flat
Gable Gable
Gambrel Gambrel (Barn)
Hip Hip
Mansard Mansard
Shed Shed
RoofMatl: Roof material
ClyTile Clay or Tile
CompShg Standard (Composite) Shingle
Membran Membrane
Metal Metal
Roll Roll
Tar&Grv Gravel & Tar
WdShake Wood Shakes
WdShngl Wood Shingles
r_style <- data_df %>%
ggplot(aes(RoofStyle, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Type of roof vs SalePrice") +
theme(plot.title = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 90, vjust = 0.5))
r_matl <- data_df %>%
ggplot(aes(RoofMatl, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Material of roof vs SalePrice") +
theme(plot.title = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 90, vjust = 0.5))
ggarrange(r_style, r_matl,nrow = 1, ncol = 2 )
Foundation: Type of foundation
BrkTil Brick & Tile
CBlock Cinder Block
PConc Poured Concrete
Slab Slab
Stone Stone
Wood Wood
KitchenQual: Kitchen quality
Ex Excellent
Gd Good
TA Typical/Average
Fa Fair
Po Poor
f_style <- data_df %>%
ggplot(aes(Foundation, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Type of foundation") +
theme(plot.title = element_text(hjust = 0.5))
kitchen <- data_df %>%
ggplot(aes(KitchenQual, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Kitchen quality") +
theme(plot.title = element_text(hjust = 0.5))
ggarrange(f_style, kitchen, nrow = 1, ncol = 2)
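The quality codes (Ex, Gd, TA, Fa, Po) have a natural order, but a character column is plotted in alphabetical order, so the boxes do not run from worst to best. A minimal sketch of one way to fix the order is to convert the codes to an ordered factor before plotting. The vector below is made up for illustration.

```r
# Quality codes from worst to best, as listed in the data description
quality_levels <- c("Po", "Fa", "TA", "Gd", "Ex")

# Illustrative values standing in for a column such as KitchenQual
kitchen_qual <- c("Gd", "TA", "Gd", "Ex", "Fa", "TA")

# Default factor: levels are sorted alphabetically
default_levels <- levels(factor(kitchen_qual))

# Ordered factor: levels follow the quality scale, including unused "Po"
kitchen_ord <- factor(kitchen_qual, levels = quality_levels, ordered = TRUE)

print(default_levels)
print(levels(kitchen_ord))

# Ordered factors also support comparisons, e.g. kitchens rated Gd or better
n_good_or_better <- sum(kitchen_ord >= "Gd")
print(n_good_or_better)
```

The same conversion would apply to the other columns that share this scale, such as `GarageQual`, `GarageCond` and `FireplaceQu`.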
Fireplaces: Number of fireplaces
FireplaceQu: Fireplace quality
Ex Excellent - Exceptional Masonry Fireplace
Gd Good - Masonry Fireplace in main level
TA Average - Prefabricated Fireplace in main living area or Masonry Fireplace in basement
Fa Fair - Prefabricated Fireplace in basement
Po Poor - Ben Franklin Stove
NA No Fireplace
data_df %>%
ggplot(aes(Fireplaces)) +
geom_bar(color = "black", fill = "#b3cde0")+
theme_minimal() +
facet_wrap(~MSZoning)+
labs(title = "Fireplace Number in Zoning") +
theme(plot.title = element_text(hjust = 0.5))
data_df %>%
ggplot(aes(FireplaceQu, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Fireplace quality") +
theme(plot.title = element_text(hjust = 0.5))
GarageType: Garage location
2Types More than one type of garage
Attchd Attached to home
Basment Basement Garage
BuiltIn Built-In (Garage part of house - typically has room above garage)
CarPort Car Port
Detchd Detached from home
NA No Garage
GarageFinish: Interior finish of the garage
Fin Finished
RFn Rough Finished
Unf Unfinished
NA No Garage
GarageQual: Garage quality
Ex Excellent
Gd Good
TA Typical/Average
Fa Fair
Po Poor
NA No Garage
GarageCond: Garage condition
Ex Excellent
Gd Good
TA Typical/Average
Fa Fair
Po Poor
NA No Garage
garage_t <- data_df %>%
ggplot(aes(GarageType, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Garage type vs Price") +
theme(plot.title = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 90, vjust = 0.5))
garage_f <- data_df %>%
ggplot(aes(GarageFinish, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Garage finish interior vs Price") +
theme(plot.title = element_text(hjust = 0.5))
garage_q <- data_df %>%
ggplot(aes(GarageQual, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Garage quality vs Price") +
theme(plot.title = element_text(hjust = 0.5))
garage_c <- data_df %>%
ggplot(aes(GarageCond, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Garage condition vs Price") +
theme(plot.title = element_text(hjust = 0.5))
ggarrange(garage_t, garage_f, garage_q, garage_c, nrow = 2, ncol = 2 )
PoolQC: Pool quality
Ex Excellent
Gd Good
TA Average/Typical
Fa Fair
NA No Pool
Fence: Fence quality
GdPrv Good Privacy
MnPrv Minimum Privacy
GdWo Good Wood
MnWw Minimum Wood/Wire
NA No Fence
pool <- data_df %>%
ggplot(aes(PoolQC, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Pool quality vs SalePrice") +
theme(plot.title = element_text(hjust = 0.5))
fence <- data_df %>%
ggplot(aes(Fence, SalePrice)) +
geom_boxplot(fill = "#b3cde0")+
theme_minimal() +
labs(title = "Fence quality vs SalePrice") +
theme(plot.title = element_text(hjust = 0.5))
ggarrange( pool, fence,nrow = 1, ncol = 2 )
SaleType: Type of sale
WD Warranty Deed - Conventional
CWD Warranty Deed - Cash
VWD Warranty Deed - VA Loan
New Home just constructed and sold
COD Court Officer Deed/Estate
Con Contract 15% Down payment regular terms
ConLw Contract Low Down payment and low interest
ConLI Contract Low Interest
ConLD Contract Low Down
Oth Other
data_df %>%
ggplot(aes(SaleType, SalePrice)) +
geom_col(color = "#6fa8dc") +
facet_grid(~MSZoning)+
theme_minimal() +
labs(title = "Sale type vs sale price")+
theme(plot.title = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 90, vjust =0.5))
SaleCondition: Condition of sale
Normal Normal Sale
Abnorml Abnormal Sale - trade, foreclosure, short sale
AdjLand Adjoining Land Purchase
Alloca Allocation - two linked properties with separate deeds, typically condo with a garage unit
Family Sale between family members
Partial Home was not completed when last assessed (associated with New Homes)
data_df %>%
ggplot(aes(SaleCondition, SalePrice)) +
geom_col(color = "#6fa8dc") +
facet_grid(~MSZoning)+
theme_minimal() +
labs(title = "Zone of property with sale condition and sale price",
subtitle = "Zoning Classification of property sale") +
theme(plot.title = element_text(hjust = 0.5),
plot.subtitle = element_text(hjust = 0.5),
axis.text.x = element_text(angle = 90, vjust = 0.5))
Check for NA values in the data.
NA values in numeric columns are replaced with 0.
NA values in character columns are replaced with "None".
mean(complete.cases(data_df))
## [1] 0
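A result of 0 from `mean(complete.cases(...))` means that no row is entirely free of NA. That is expected here, because in columns such as `Alley` and `PoolQC` an NA encodes "no alley access" or "no pool" rather than a missing measurement. A per-column count makes this easier to see. The sketch below uses a small made-up data frame, not the housing data.

```r
# Toy data frame with NA patterns similar in spirit to the housing data
toy <- data.frame(
  LotFrontage = c(65, 80, NA, 60),
  Alley       = c(NA, NA, "Grvl", NA),
  PoolQC      = rep(NA_character_, 4),
  SalePrice   = c(208500, 181500, 223500, 140000),
  stringsAsFactors = FALSE
)

# Number of NA values per column, largest first
na_counts <- sort(colSums(is.na(toy)), decreasing = TRUE)

# Share of NA values per column
na_share <- na_counts / nrow(toy)

# Share of rows without any NA (the statistic used in the text)
complete_share <- mean(complete.cases(toy))

print(na_counts)
print(na_share)
print(complete_share)
```

A single column that is NA in every row, like `PoolQC` in the toy data, is enough to push the share of complete rows to 0.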
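# keep the Id values aside, since the Id column is dropped below
data_id <- data_df$Id
# min-max normalization to [0,1] for all numeric columns (this includes SalePrice)
process_norm <- preProcess(as.data.frame(data_df), method = c('range'))
data_df <- predict(process_norm, as.data.frame(data_df))
# replace NA with 0 in numeric columns; after scaling, 0 is the column minimum
data_df <- data_df %>% mutate_if(is.numeric, ~replace_na(.,0))
# replace NA with "None" in character columns
data_df <- data_df %>% mutate_if(is.character, ~replace_na(.,"None"))
# drop the first column (Id)
data_df <- select(data_df, -1)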
mean(complete.cases(data_df))
## [1] 1
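The `'range'` method rescales each numeric column with x' = (x - min) / (max - min). Because `SalePrice` is rescaled too, model errors reported on this data are in [0,1] units, not dollars. The sketch below shows the transformation and its inverse in base R on five sale prices taken from the glimpse output. The helper names `scale_minmax` and `unscale_minmax` and the error value 0.05 are hypothetical, and the minimum and maximum come from the five values only, not from the full data set.

```r
# Five sale prices from the first rows of the data
sale_price <- c(208500, 181500, 223500, 140000, 250000)

# Minimum and maximum of this small sample (not of the full data set)
rng <- range(sale_price)

# Hypothetical helpers for the forward and inverse transformation
scale_minmax   <- function(x, lo, hi) (x - lo) / (hi - lo)
unscale_minmax <- function(z, lo, hi) z * (hi - lo) + lo

scaled   <- scale_minmax(sale_price, rng[1], rng[2])
restored <- unscale_minmax(scaled, rng[1], rng[2])

print(scaled)
print(restored)

# An error measured on the scaled target converts to dollars by
# multiplying with the range; 0.05 is a made-up value for illustration
rmse_scaled  <- 0.05
rmse_dollars <- rmse_scaled * diff(rng)
print(rmse_dollars)
```

Storing the original minimum and maximum of `SalePrice` before scaling would make it possible to report predictions and errors in dollars later.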
data_df <- data_df %>%
mutate(across(where(is.character), as.factor))
glimpse(data_df)
## Rows: 1,460
## Columns: 80
## $ MSSubClass <dbl> 0.2352941, 0.0000000, 0.2352941, 0.2941176, 0.2352941, 0…
## $ MSZoning <fct> RL, RL, RL, RL, RL, RL, RL, RL, RM, RL, RL, RL, RL, RL, …
## $ LotFrontage <dbl> 0.15068493, 0.20205479, 0.16095890, 0.13356164, 0.215753…
## $ LotArea <dbl> 0.03341980, 0.03879502, 0.04650728, 0.03856131, 0.060576…
## $ Street <fct> Pave, Pave, Pave, Pave, Pave, Pave, Pave, Pave, Pave, Pa…
## $ Alley <fct> None, None, None, None, None, None, None, None, None, No…
## $ LotShape <fct> Reg, Reg, IR1, IR1, IR1, IR1, Reg, IR1, Reg, Reg, Reg, I…
## $ LandContour <fct> Lvl, Lvl, Lvl, Lvl, Lvl, Lvl, Lvl, Lvl, Lvl, Lvl, Lvl, L…
## $ Utilities <fct> AllPub, AllPub, AllPub, AllPub, AllPub, AllPub, AllPub, …
## $ LotConfig <fct> Inside, FR2, Inside, Corner, FR2, Inside, Inside, Corner…
## $ LandSlope <fct> Gtl, Gtl, Gtl, Gtl, Gtl, Gtl, Gtl, Gtl, Gtl, Gtl, Gtl, G…
## $ Neighborhood <fct> CollgCr, Veenker, CollgCr, Crawfor, NoRidge, Mitchel, So…
## $ Condition1 <fct> Norm, Feedr, Norm, Norm, Norm, Norm, Norm, PosN, Artery,…
## $ Condition2 <fct> Norm, Norm, Norm, Norm, Norm, Norm, Norm, Norm, Norm, Ar…
## $ BldgType <fct> 1Fam, 1Fam, 1Fam, 1Fam, 1Fam, 1Fam, 1Fam, 1Fam, 1Fam, 2f…
## $ HouseStyle <fct> 2Story, 1Story, 2Story, 2Story, 2Story, 1.5Fin, 1Story, …
## $ OverallQual <dbl> 0.6666667, 0.5555556, 0.6666667, 0.6666667, 0.7777778, 0…
## $ OverallCond <dbl> 0.500, 0.875, 0.500, 0.500, 0.500, 0.500, 0.500, 0.625, …
## $ YearBuilt <dbl> 0.9492754, 0.7536232, 0.9347826, 0.3115942, 0.9275362, 0…
## $ YearRemodAdd <dbl> 0.8833333, 0.4333333, 0.8666667, 0.3333333, 0.8333333, 0…
## $ RoofStyle <fct> Gable, Gable, Gable, Gable, Gable, Gable, Gable, Gable, …
## $ RoofMatl <fct> CompShg, CompShg, CompShg, CompShg, CompShg, CompShg, Co…
## $ Exterior1st <fct> VinylSd, MetalSd, VinylSd, Wd Sdng, VinylSd, VinylSd, Vi…
## $ Exterior2nd <fct> VinylSd, MetalSd, VinylSd, Wd Shng, VinylSd, VinylSd, Vi…
## $ MasVnrType <fct> BrkFace, None, BrkFace, None, BrkFace, None, Stone, Ston…
## $ MasVnrArea <dbl> 0.122500, 0.000000, 0.101250, 0.000000, 0.218750, 0.0000…
## $ ExterQual <fct> Gd, TA, Gd, TA, Gd, TA, Gd, TA, TA, TA, TA, Ex, TA, Gd, …
## $ ExterCond <fct> TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, …
## $ Foundation <fct> PConc, CBlock, PConc, BrkTil, PConc, Wood, PConc, CBlock…
## $ BsmtQual <fct> Gd, Gd, Gd, TA, Gd, Gd, Ex, Gd, TA, TA, TA, Ex, TA, Gd, …
## $ BsmtCond <fct> TA, TA, TA, Gd, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, …
## $ BsmtExposure <fct> No, Gd, Mn, No, Av, No, Av, Mn, No, No, No, No, No, Av, …
## $ BsmtFinType1 <fct> GLQ, ALQ, GLQ, ALQ, GLQ, GLQ, GLQ, ALQ, Unf, GLQ, Rec, G…
## $ BsmtFinSF1 <dbl> 0.12508859, 0.17328136, 0.08610914, 0.03827073, 0.116052…
## $ BsmtFinType2 <fct> Unf, Unf, Unf, Unf, Unf, Unf, Unf, BLQ, Unf, Unf, Unf, U…
## $ BsmtFinSF2 <dbl> 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.000000…
## $ BsmtUnfSF <dbl> 0.06421233, 0.12157534, 0.18578767, 0.23116438, 0.209760…
## $ TotalBsmtSF <dbl> 0.1400982, 0.2065466, 0.1505728, 0.1237316, 0.1873977, 0…
## $ Heating <fct> GasA, GasA, GasA, GasA, GasA, GasA, GasA, GasA, GasA, Ga…
## $ HeatingQC <fct> Ex, Ex, Ex, Gd, Ex, Ex, Ex, Ex, Gd, Ex, Ex, Ex, TA, Ex, …
## $ CentralAir <fct> Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y,…
## $ Electrical <fct> SBrkr, SBrkr, SBrkr, SBrkr, SBrkr, SBrkr, SBrkr, SBrkr, …
## $ X1stFlrSF <dbl> 0.1197797, 0.2129417, 0.1344654, 0.1438733, 0.1860945, 0…
## $ X2ndFlrSF <dbl> 0.4135593, 0.0000000, 0.4193705, 0.3661017, 0.5099274, 0…
## $ LowQualFinSF <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ GrLivArea <dbl> 0.25923135, 0.17483044, 0.27354936, 0.26055011, 0.351168…
## $ BsmtFullBath <dbl> 0.3333333, 0.0000000, 0.3333333, 0.3333333, 0.3333333, 0…
## $ BsmtHalfBath <dbl> 0.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0…
## $ FullBath <dbl> 0.6666667, 0.6666667, 0.6666667, 0.3333333, 0.6666667, 0…
## $ HalfBath <dbl> 0.5, 0.0, 0.5, 0.0, 0.5, 0.5, 0.0, 0.5, 0.0, 0.0, 0.0, 0…
## $ BedroomAbvGr <dbl> 0.375, 0.375, 0.375, 0.375, 0.500, 0.125, 0.375, 0.375, …
## $ KitchenAbvGr <dbl> 0.3333333, 0.3333333, 0.3333333, 0.3333333, 0.3333333, 0…
## $ KitchenQual <fct> Gd, TA, Gd, Gd, Gd, TA, Gd, TA, TA, TA, TA, Ex, TA, Gd, …
## $ TotRmsAbvGrd <dbl> 0.5000000, 0.3333333, 0.3333333, 0.4166667, 0.5833333, 0…
## $ Functional <fct> Typ, Typ, Typ, Typ, Typ, Typ, Typ, Typ, Min1, Typ, Typ, …
## $ Fireplaces <dbl> 0.0000000, 0.3333333, 0.3333333, 0.3333333, 0.3333333, 0…
## $ FireplaceQu <fct> None, TA, TA, Gd, TA, None, Gd, TA, TA, TA, None, Gd, No…
## $ GarageType <fct> Attchd, Attchd, Attchd, Detchd, Attchd, Attchd, Attchd, …
## $ GarageYrBlt <dbl> 0.9363636, 0.6909091, 0.9181818, 0.8909091, 0.9090909, 0…
## $ GarageFinish <fct> RFn, RFn, RFn, Unf, RFn, Unf, RFn, RFn, Unf, RFn, Unf, F…
## $ GarageCars <dbl> 0.50, 0.50, 0.50, 0.75, 0.75, 0.50, 0.50, 0.50, 0.50, 0.…
## $ GarageArea <dbl> 0.3864598, 0.3244006, 0.4287729, 0.4527504, 0.5895628, 0…
## $ GarageQual <fct> TA, TA, TA, TA, TA, TA, TA, TA, Fa, Gd, TA, TA, TA, TA, …
## $ GarageCond <fct> TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, TA, …
## $ PavedDrive <fct> Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y, Y,…
## $ WoodDeckSF <dbl> 0.00000000, 0.34772462, 0.00000000, 0.00000000, 0.224037…
## $ OpenPorchSF <dbl> 0.111517367, 0.000000000, 0.076782450, 0.063985375, 0.15…
## $ EnclosedPorch <dbl> 0.0000000, 0.0000000, 0.0000000, 0.4927536, 0.0000000, 0…
## $ X3SsnPorch <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0…
## $ ScreenPorch <dbl> 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0.0000000, 0…
## $ PoolArea <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,…
## $ PoolQC <fct> None, None, None, None, None, None, None, None, None, No…
## $ Fence <fct> None, None, None, None, None, MnPrv, None, None, None, N…
## $ MiscFeature <fct> None, None, None, None, None, Shed, None, Shed, None, No…
## $ MiscVal <dbl> 0.00000000, 0.00000000, 0.00000000, 0.00000000, 0.000000…
## $ MoSold <dbl> 0.09090909, 0.36363636, 0.72727273, 0.09090909, 1.000000…
## $ YrSold <dbl> 0.50, 0.25, 0.50, 0.00, 0.50, 0.75, 0.25, 0.75, 0.50, 0.…
## $ SaleType <fct> WD, WD, WD, WD, WD, WD, WD, WD, WD, WD, WD, New, WD, New…
## $ SaleCondition <fct> Normal, Normal, Normal, Abnorml, Normal, Normal, Normal,…
## $ SalePrice <dbl> 0.24107763, 0.20358284, 0.26190807, 0.14595195, 0.298708…
Split the data: 80% for training and 20% for testing.
set.seed(55)
nrow(data_df)
## [1] 1460
# Stratified 80/20 split on the outcome
id <- createDataPartition(y = data_df$SalePrice,
                          p = 0.8,
                          list = FALSE)
train_df <- data_df[id, ]
test_df <- data_df[-id, ]
nrow(train_df)
## [1] 1169
nrow(test_df)
## [1] 291
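The training set has 1169 rows rather than exactly 0.8 × 1460 = 1168, which is consistent with `createDataPartition` sampling inside groups of the outcome and rounding up within each group. A minimal base-R sketch of that idea follows. It assumes five quantile-based groups and uses a simulated `y` as a hypothetical stand-in for `data_df$SalePrice`; the exact grouping caret applies may differ.

```r
set.seed(55)

# Simulated outcome; a hypothetical stand-in for data_df$SalePrice
y <- runif(1460)

# Group the outcome into five quantile-based bins
breaks <- quantile(y, probs = seq(0, 1, length.out = 6))
bins <- cut(y, breaks = breaks, include.lowest = TRUE)

# Sample 80% of the row indices inside each bin, rounding up
train_idx <- unlist(lapply(split(seq_along(y), bins), function(idx) {
  idx[sample.int(length(idx), ceiling(0.8 * length(idx)))]
}), use.names = FALSE)

train_idx <- sort(train_idx)
test_idx <- setdiff(seq_along(y), train_idx)

# Share of rows in the training part (close to 0.8)
length(train_idx) / length(y)

# The outcome has a similar mean in both parts
c(train = mean(y[train_idx]), test = mean(y[test_idx]))
```

Sampling within bins keeps the distribution of the outcome similar in the training and test parts, which a plain random split does not guarantee on a small data set.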
set.seed(55)
# Linear regression baseline, evaluated with 5-fold cross-validation repeated 5 times
control <- trainControl(method = "repeatedcv",
                        number = 5,
                        repeats = 5,
                        verboseIter = FALSE)
lm_model <- train(SalePrice ~ .,
                  data = train_df,
                  method = "lm",
                  trControl = control)
lm_model
## Linear Regression
##
## 1169 samples
## 79 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 5 times)
## Summary of sample sizes: 934, 935, 935, 936, 936, 936, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 0.07895367 0.6360975 0.03034698
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
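The RMSE, Rsquared and MAE above are averages over 25 held-out folds (5 folds, repeated 5 times). The sketch below shows one way to reproduce that procedure in base R. It uses a hypothetical toy data set in place of `train_df`, plain random folds (caret's own fold construction may differ), and Rsquared computed as the squared correlation between observed and predicted values.

```r
set.seed(55)

# Hypothetical toy data standing in for train_df
n <- 200
toy <- data.frame(x1 = runif(n), x2 = runif(n))
toy$y <- 0.3 * toy$x1 + 0.5 * toy$x2 + rnorm(n, sd = 0.05)

k <- 5        # number of folds
repeats <- 5  # number of repetitions

results <- do.call(rbind, lapply(seq_len(repeats), function(r) {
  # New random fold labels for every repetition
  fold <- sample(rep(seq_len(k), length.out = n))
  do.call(rbind, lapply(seq_len(k), function(f) {
    fit <- lm(y ~ ., data = toy[fold != f, ])          # fit on k - 1 folds
    obs <- toy$y[fold == f]                            # held-out outcomes
    pred <- predict(fit, newdata = toy[fold == f, ])   # held-out predictions
    data.frame(RMSE = sqrt(mean((obs - pred)^2)),
               Rsquared = cor(obs, pred)^2,
               MAE = mean(abs(obs - pred)))
  }))
}))

nrow(results)      # one row per held-out fold
colMeans(results)  # averaged metrics, analogous to the table above
```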
set.seed(55)
# Random forest with the same resampling scheme; caret tunes mtry by default
control <- trainControl(method = "repeatedcv",
                        number = 5,
                        repeats = 5,
                        verboseIter = FALSE)
rf_model <- train(SalePrice ~ .,
                  data = train_df,
                  method = "rf",
                  trControl = control)
rf_model
## Random Forest
##
## 1169 samples
## 79 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 5 times)
## Summary of sample sizes: 934, 935, 935, 936, 936, 936, ...
## Resampling results across tuning parameters:
##
## mtry RMSE Rsquared MAE
## 2 0.06774708 0.7727620 0.04299229
## 131 0.04076242 0.8692313 0.02358575
## 260 0.04167289 0.8609170 0.02447494
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was mtry = 131.
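The `mtry` candidates go up to 260 although the model reports 79 predictors. A likely explanation is that the formula interface expands every factor into dummy columns before fitting, so the forest sees many more columns than there are original variables. The sketch below illustrates the expansion with `model.matrix` on a hypothetical three-column data frame; the column names and values are made up.

```r
# Hypothetical miniature data set with one numeric and one factor predictor
toy <- data.frame(
  price = c(0.20, 0.30, 0.25, 0.40),
  area = c(0.10, 0.50, 0.30, 0.90),
  zoning = factor(c("RL", "RM", "RL", "FV"))
)

# Expand the formula into a design matrix and drop the intercept column
X <- model.matrix(price ~ ., data = toy)[, -1]

colnames(X)  # the factor contributes one column per non-reference level
ncol(X)      # more columns than the two original predictors
```

With roughly 40 factors in the housing data, the same mechanism can plausibly turn 79 predictors into a few hundred columns.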
set.seed(55)
# Lasso (alpha = 1): tune lambda over a fine grid from 0.001 to 0.1
control <- trainControl(method = "repeatedcv",
                        number = 5,
                        repeats = 5,
                        verboseIter = FALSE)
grid <- expand.grid(alpha = 1,
                    lambda = seq(0.001, 0.1, by = 0.0002))
lasso_model <- train(SalePrice ~ .,
                     data = train_df,
                     method = "glmnet",
                     trControl = control,
                     tuneGrid = grid)
lasso_model
## glmnet
##
## 1169 samples
## 79 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 5 times)
## Summary of sample sizes: 934, 935, 935, 936, 936, 936, ...
## Resampling results across tuning parameters:
##
## lambda RMSE Rsquared MAE
## 0.0010 0.05226546 0.7849217 0.02610218
## 0.0012 0.05207359 0.7857421 0.02623435
## 0.0014 0.05190789 0.7865095 0.02640744
## 0.0016 0.05175243 0.7873268 0.02655905
## 0.0018 0.05160822 0.7881541 0.02669449
## 0.0020 0.05147554 0.7889570 0.02682417
## 0.0022 0.05136430 0.7896807 0.02697084
## 0.0024 0.05128917 0.7901747 0.02713268
## 0.0026 0.05123168 0.7906097 0.02728465
## 0.0028 0.05118295 0.7910329 0.02742169
## 0.0030 0.05114188 0.7914336 0.02753800
## 0.0032 0.05111793 0.7917323 0.02764722
## 0.0034 0.05110687 0.7919780 0.02775666
## 0.0036 0.05110274 0.7922053 0.02787167
## 0.0038 0.05111690 0.7923209 0.02799838
## 0.0040 0.05114556 0.7923279 0.02813092
## 0.0042 0.05119038 0.7922295 0.02826661
## 0.0044 0.05125279 0.7919964 0.02840915
## 0.0046 0.05132559 0.7916825 0.02854963
## 0.0048 0.05141128 0.7912749 0.02869435
## 0.0050 0.05150414 0.7908201 0.02884708
## 0.0052 0.05159512 0.7903932 0.02898953
## 0.0054 0.05169268 0.7899258 0.02913321
## 0.0056 0.05179221 0.7894537 0.02926440
## 0.0058 0.05189805 0.7889437 0.02939244
## 0.0060 0.05200396 0.7884392 0.02951537
## 0.0062 0.05211556 0.7878903 0.02963836
## 0.0064 0.05223353 0.7872896 0.02976349
## 0.0066 0.05235501 0.7866649 0.02988964
## 0.0068 0.05248017 0.7860107 0.03001732
## 0.0070 0.05260881 0.7853175 0.03014629
## 0.0072 0.05273427 0.7846379 0.03027608
## 0.0074 0.05284954 0.7840486 0.03040061
## 0.0076 0.05296858 0.7834229 0.03052606
## 0.0078 0.05309234 0.7827547 0.03065536
## 0.0080 0.05321002 0.7821345 0.03078241
## 0.0082 0.05332552 0.7815290 0.03090880
## 0.0084 0.05344313 0.7809028 0.03103547
## 0.0086 0.05355929 0.7802871 0.03115803
## 0.0088 0.05367159 0.7797028 0.03127742
## 0.0090 0.05378306 0.7791196 0.03139517
## 0.0092 0.05389777 0.7785053 0.03151345
## 0.0094 0.05401319 0.7778790 0.03163160
## 0.0096 0.05412754 0.7772642 0.03174674
## 0.0098 0.05424502 0.7766278 0.03186104
## 0.0100 0.05436434 0.7759716 0.03197587
## 0.0102 0.05448538 0.7752998 0.03209068
## 0.0104 0.05460844 0.7746180 0.03220548
## 0.0106 0.05473312 0.7739290 0.03231859
## 0.0108 0.05486168 0.7732052 0.03243159
## 0.0110 0.05499315 0.7724513 0.03254525
## 0.0112 0.05512860 0.7716560 0.03266115
## 0.0114 0.05526693 0.7708326 0.03277540
## 0.0116 0.05540622 0.7700080 0.03288943
## 0.0118 0.05554635 0.7691817 0.03300400
## 0.0120 0.05568685 0.7683536 0.03311758
## 0.0122 0.05582886 0.7675125 0.03323164
## 0.0124 0.05597165 0.7666710 0.03334654
## 0.0126 0.05611326 0.7658458 0.03346065
## 0.0128 0.05625259 0.7650466 0.03357103
## 0.0130 0.05639313 0.7642360 0.03368194
## 0.0132 0.05653575 0.7634051 0.03379334
## 0.0134 0.05668004 0.7625595 0.03390518
## 0.0136 0.05682367 0.7617308 0.03401522
## 0.0138 0.05696631 0.7609149 0.03412371
## 0.0140 0.05710522 0.7601386 0.03422855
## 0.0142 0.05724294 0.7593773 0.03433276
## 0.0144 0.05738185 0.7586044 0.03443815
## 0.0146 0.05752075 0.7578302 0.03454377
## 0.0148 0.05766066 0.7570473 0.03464960
## 0.0150 0.05779924 0.7562806 0.03475404
## 0.0152 0.05793679 0.7555234 0.03485804
## 0.0154 0.05806982 0.7548128 0.03496040
## 0.0156 0.05820357 0.7540939 0.03506403
## 0.0158 0.05833765 0.7533705 0.03516774
## 0.0160 0.05847228 0.7526401 0.03527160
## 0.0162 0.05860676 0.7519102 0.03537543
## 0.0164 0.05873751 0.7512171 0.03547722
## 0.0166 0.05886234 0.7505885 0.03557586
## 0.0168 0.05898390 0.7499969 0.03567407
## 0.0170 0.05910481 0.7494155 0.03577302
## 0.0172 0.05922508 0.7488438 0.03587281
## 0.0174 0.05934621 0.7482631 0.03597383
## 0.0176 0.05946824 0.7476730 0.03607625
## 0.0178 0.05959034 0.7470819 0.03617881
## 0.0180 0.05971023 0.7465170 0.03628039
## 0.0182 0.05982758 0.7459853 0.03638092
## 0.0184 0.05994130 0.7455062 0.03647969
## 0.0186 0.06005231 0.7450664 0.03657636
## 0.0188 0.06016284 0.7446364 0.03667258
## 0.0190 0.06027365 0.7442072 0.03676887
## 0.0192 0.06038471 0.7437839 0.03686497
## 0.0194 0.06049656 0.7433568 0.03696148
## 0.0196 0.06060703 0.7429531 0.03705743
## 0.0198 0.06071647 0.7425729 0.03715312
## 0.0200 0.06082533 0.7422078 0.03724830
## 0.0202 0.06093109 0.7418977 0.03734134
## 0.0204 0.06103457 0.7416287 0.03743271
## 0.0206 0.06113828 0.7413646 0.03752451
## 0.0208 0.06124247 0.7411030 0.03761654
## 0.0210 0.06134657 0.7408514 0.03770848
## 0.0212 0.06145143 0.7405971 0.03780133
## 0.0214 0.06155665 0.7403452 0.03789504
## 0.0216 0.06166220 0.7401017 0.03798955
## 0.0218 0.06176723 0.7398756 0.03808334
## 0.0220 0.06187277 0.7396514 0.03817735
## 0.0222 0.06197740 0.7394519 0.03827059
## 0.0224 0.06208283 0.7392498 0.03836490
## 0.0226 0.06218916 0.7390436 0.03845995
## 0.0228 0.06229621 0.7388355 0.03855571
## 0.0230 0.06240411 0.7386244 0.03865208
## 0.0232 0.06251286 0.7384086 0.03874902
## 0.0234 0.06262244 0.7381891 0.03884700
## 0.0236 0.06273264 0.7379717 0.03894596
## 0.0238 0.06284373 0.7377500 0.03904633
## 0.0240 0.06295562 0.7375238 0.03914728
## 0.0242 0.06306833 0.7372931 0.03924905
## 0.0244 0.06318164 0.7370596 0.03935116
## 0.0246 0.06329584 0.7368198 0.03945412
## 0.0248 0.06341090 0.7365748 0.03955850
## 0.0250 0.06352679 0.7363247 0.03966396
## 0.0252 0.06364350 0.7360698 0.03977012
## 0.0254 0.06376095 0.7358110 0.03987683
## 0.0256 0.06387923 0.7355468 0.03998454
## 0.0258 0.06399824 0.7352789 0.04009349
## 0.0260 0.06411779 0.7350100 0.04020307
## 0.0262 0.06423803 0.7347357 0.04031308
## 0.0264 0.06435898 0.7344560 0.04042375
## 0.0266 0.06448062 0.7341704 0.04053469
## 0.0268 0.06460311 0.7338759 0.04064545
## 0.0270 0.06472630 0.7335748 0.04075665
## 0.0272 0.06485025 0.7332683 0.04086871
## 0.0274 0.06497494 0.7329562 0.04098198
## 0.0276 0.06510038 0.7326385 0.04109604
## 0.0278 0.06522641 0.7323174 0.04121087
## 0.0280 0.06535316 0.7319905 0.04132642
## 0.0282 0.06548044 0.7316615 0.04144284
## 0.0284 0.06560784 0.7313366 0.04155960
## 0.0286 0.06573566 0.7310104 0.04167679
## 0.0288 0.06586393 0.7306826 0.04179458
## 0.0290 0.06599287 0.7303497 0.04191305
## 0.0292 0.06612255 0.7300102 0.04203234
## 0.0294 0.06625304 0.7296625 0.04215220
## 0.0296 0.06638423 0.7293083 0.04227254
## 0.0298 0.06651612 0.7289474 0.04239350
## 0.0300 0.06664850 0.7285815 0.04251476
## 0.0302 0.06678151 0.7282096 0.04263644
## 0.0304 0.06691485 0.7278404 0.04275824
## 0.0306 0.06704861 0.7274721 0.04288009
## 0.0308 0.06718301 0.7270983 0.04300274
## 0.0310 0.06731784 0.7267249 0.04312587
## 0.0312 0.06745323 0.7263471 0.04324940
## 0.0314 0.06758871 0.7259719 0.04337319
## 0.0316 0.06772361 0.7256120 0.04349695
## 0.0318 0.06785913 0.7252465 0.04362151
## 0.0320 0.06799486 0.7248837 0.04374615
## 0.0322 0.06812970 0.7245408 0.04386984
## 0.0324 0.06826501 0.7241940 0.04399418
## 0.0326 0.06840095 0.7238408 0.04411911
## 0.0328 0.06853732 0.7234850 0.04424452
## 0.0330 0.06867402 0.7231275 0.04437017
## 0.0332 0.06881118 0.7227691 0.04449612
## 0.0334 0.06894880 0.7224098 0.04462266
## 0.0336 0.06908705 0.7220434 0.04474986
## 0.0338 0.06922592 0.7216699 0.04487792
## 0.0340 0.06936525 0.7212919 0.04500645
## 0.0342 0.06950477 0.7209139 0.04513521
## 0.0344 0.06964444 0.7205399 0.04526443
## 0.0346 0.06978320 0.7201897 0.04539317
## 0.0348 0.06992230 0.7198374 0.04552248
## 0.0350 0.07006186 0.7194832 0.04565206
## 0.0352 0.07020107 0.7191468 0.04578150
## 0.0354 0.07034032 0.7188169 0.04591106
## 0.0356 0.07048017 0.7184802 0.04604106
## 0.0358 0.07062060 0.7181368 0.04617133
## 0.0360 0.07076157 0.7177876 0.04630174
## 0.0362 0.07090313 0.7174310 0.04643250
## 0.0364 0.07104527 0.7170671 0.04656365
## 0.0366 0.07118800 0.7166959 0.04669514
## 0.0368 0.07133132 0.7163168 0.04682721
## 0.0370 0.07147523 0.7159298 0.04695991
## 0.0372 0.07161968 0.7155352 0.04709307
## 0.0374 0.07176419 0.7151421 0.04722577
## 0.0376 0.07190902 0.7147446 0.04735838
## 0.0378 0.07205398 0.7143480 0.04749086
## 0.0380 0.07219819 0.7139685 0.04762239
## 0.0382 0.07234289 0.7135821 0.04775415
## 0.0384 0.07248814 0.7131875 0.04788625
## 0.0386 0.07263325 0.7127977 0.04801829
## 0.0388 0.07277853 0.7124091 0.04815051
## 0.0390 0.07292376 0.7120256 0.04828250
## 0.0392 0.07306953 0.7116338 0.04841464
## 0.0394 0.07321476 0.7112574 0.04854599
## 0.0396 0.07336016 0.7108788 0.04867710
## 0.0398 0.07350606 0.7104923 0.04880831
## 0.0400 0.07365249 0.7100974 0.04893973
## 0.0402 0.07379941 0.7096943 0.04907166
## 0.0404 0.07394683 0.7092827 0.04920380
## 0.0406 0.07409478 0.7088622 0.04933606
## 0.0408 0.07424310 0.7084359 0.04946829
## 0.0410 0.07439110 0.7080173 0.04959996
## 0.0412 0.07453895 0.7076033 0.04973132
## 0.0414 0.07468698 0.7071910 0.04986273
## 0.0416 0.07483421 0.7068009 0.04999322
## 0.0418 0.07498052 0.7064326 0.05012252
## 0.0420 0.07512731 0.7060563 0.05025197
## 0.0422 0.07527441 0.7056758 0.05038147
## 0.0424 0.07542013 0.7053320 0.05050978
## 0.0426 0.07556501 0.7050085 0.05063731
## 0.0428 0.07571015 0.7046804 0.05076484
## 0.0430 0.07585574 0.7043455 0.05089267
## 0.0432 0.07600170 0.7040056 0.05102107
## 0.0434 0.07614782 0.7036636 0.05114944
## 0.0436 0.07629424 0.7033171 0.05127790
## 0.0438 0.07644106 0.7029641 0.05140673
## 0.0440 0.07658811 0.7026096 0.05153559
## 0.0442 0.07673527 0.7022600 0.05166433
## 0.0444 0.07688287 0.7019033 0.05179337
## 0.0446 0.07703090 0.7015393 0.05192262
## 0.0448 0.07717900 0.7011808 0.05205151
## 0.0450 0.07732714 0.7008236 0.05218026
## 0.0452 0.07747523 0.7004722 0.05230863
## 0.0454 0.07762354 0.7001217 0.05243717
## 0.0456 0.07777206 0.6997701 0.05256584
## 0.0458 0.07792037 0.6994262 0.05269426
## 0.0460 0.07806900 0.6990767 0.05282295
## 0.0462 0.07821781 0.6987262 0.05295179
## 0.0464 0.07836628 0.6983842 0.05307993
## 0.0466 0.07851351 0.6980648 0.05320624
## 0.0468 0.07866062 0.6977484 0.05333215
## 0.0470 0.07880792 0.6974287 0.05345811
## 0.0472 0.07895558 0.6971029 0.05358430
## 0.0474 0.07910362 0.6967706 0.05371092
## 0.0476 0.07925203 0.6964317 0.05383798
## 0.0478 0.07940079 0.6960862 0.05396542
## 0.0480 0.07954993 0.6957338 0.05409309
## 0.0482 0.07969927 0.6953789 0.05422077
## 0.0484 0.07984896 0.6950174 0.05434864
## 0.0486 0.07999900 0.6946488 0.05447676
## 0.0488 0.08014940 0.6942727 0.05460512
## 0.0490 0.08030015 0.6938891 0.05473375
## 0.0492 0.08045125 0.6934977 0.05486299
## 0.0494 0.08060271 0.6930984 0.05499264
## 0.0496 0.08075450 0.6926909 0.05512269
## 0.0498 0.08090665 0.6922752 0.05525307
## 0.0500 0.08105914 0.6918509 0.05538371
## 0.0502 0.08121197 0.6914179 0.05551475
## 0.0504 0.08136514 0.6909760 0.05564602
## 0.0506 0.08151865 0.6905249 0.05577743
## 0.0508 0.08167249 0.6900645 0.05590894
## 0.0510 0.08182638 0.6895944 0.05604011
## 0.0512 0.08198057 0.6891149 0.05617159
## 0.0514 0.08213508 0.6886257 0.05630327
## 0.0516 0.08228991 0.6881266 0.05643517
## 0.0518 0.08244506 0.6876173 0.05656734
## 0.0520 0.08260052 0.6870975 0.05669973
## 0.0522 0.08275630 0.6865670 0.05683233
## 0.0524 0.08291240 0.6860256 0.05696531
## 0.0526 0.08306880 0.6854730 0.05709877
## 0.0528 0.08322544 0.6849111 0.05723267
## 0.0530 0.08338230 0.6843401 0.05736696
## 0.0532 0.08353947 0.6837572 0.05750141
## 0.0534 0.08369695 0.6831620 0.05763618
## 0.0536 0.08385472 0.6825543 0.05777115
## 0.0538 0.08401239 0.6819492 0.05790614
## 0.0540 0.08416857 0.6813990 0.05804037
## 0.0542 0.08432506 0.6808368 0.05817483
## 0.0544 0.08448184 0.6802623 0.05830945
## 0.0546 0.08463851 0.6796932 0.05844400
## 0.0548 0.08479539 0.6791152 0.05857869
## 0.0550 0.08495256 0.6785244 0.05871360
## 0.0552 0.08511002 0.6779203 0.05884872
## 0.0554 0.08526777 0.6773026 0.05898400
## 0.0556 0.08542581 0.6766709 0.05911939
## 0.0558 0.08558413 0.6760249 0.05925487
## 0.0560 0.08574274 0.6753642 0.05939056
## 0.0562 0.08590163 0.6746883 0.05952641
## 0.0564 0.08606081 0.6739970 0.05966251
## 0.0566 0.08622026 0.6732896 0.05979902
## 0.0568 0.08637999 0.6725659 0.05993573
## 0.0570 0.08654000 0.6718252 0.06007259
## 0.0572 0.08670029 0.6710672 0.06020960
## 0.0574 0.08685911 0.6703600 0.06034543
## 0.0576 0.08701817 0.6696377 0.06048152
## 0.0578 0.08717749 0.6688984 0.06061775
## 0.0580 0.08733578 0.6681975 0.06075333
## 0.0582 0.08749349 0.6675197 0.06088839
## 0.0584 0.08765086 0.6668561 0.06102305
## 0.0586 0.08780850 0.6661767 0.06115800
## 0.0588 0.08796640 0.6654811 0.06129318
## 0.0590 0.08812457 0.6647689 0.06142856
## 0.0592 0.08828291 0.6640430 0.06156399
## 0.0594 0.08844119 0.6633132 0.06169923
## 0.0596 0.08859932 0.6625829 0.06183433
## 0.0598 0.08875742 0.6618517 0.06196929
## 0.0600 0.08891505 0.6611445 0.06210383
## 0.0602 0.08907221 0.6604565 0.06223779
## 0.0604 0.08922853 0.6597922 0.06237102
## 0.0606 0.08938445 0.6591322 0.06250392
## 0.0608 0.08954063 0.6584560 0.06263701
## 0.0610 0.08969675 0.6577768 0.06276993
## 0.0612 0.08985219 0.6571242 0.06290218
## 0.0614 0.09000637 0.6565293 0.06303363
## 0.0616 0.09015859 0.6560326 0.06316367
## 0.0618 0.09031076 0.6555390 0.06329349
## 0.0620 0.09046318 0.6550326 0.06342339
## 0.0622 0.09061583 0.6545130 0.06355336
## 0.0624 0.09076872 0.6539797 0.06368357
## 0.0626 0.09092186 0.6534323 0.06381386
## 0.0628 0.09107520 0.6528714 0.06394426
## 0.0630 0.09122861 0.6523048 0.06407465
## 0.0632 0.09138224 0.6517230 0.06420522
## 0.0634 0.09153611 0.6511253 0.06433586
## 0.0636 0.09169021 0.6505112 0.06446669
## 0.0638 0.09184450 0.6498835 0.06459769
## 0.0640 0.09199864 0.6492642 0.06472857
## 0.0642 0.09215301 0.6486274 0.06485958
## 0.0644 0.09230761 0.6479726 0.06499090
## 0.0646 0.09246243 0.6472991 0.06512247
## 0.0648 0.09261748 0.6466063 0.06525428
## 0.0650 0.09277275 0.6458935 0.06538617
## 0.0652 0.09292704 0.6452376 0.06551726
## 0.0654 0.09307999 0.6446583 0.06564698
## 0.0656 0.09323295 0.6440739 0.06577659
## 0.0658 0.09338612 0.6434725 0.06590635
## 0.0660 0.09353901 0.6428932 0.06603578
## 0.0662 0.09368973 0.6424661 0.06616299
## 0.0664 0.09383848 0.6421445 0.06628853
## 0.0666 0.09398731 0.6418195 0.06641417
## 0.0668 0.09413635 0.6414846 0.06654002
## 0.0670 0.09428537 0.6411551 0.06666582
## 0.0672 0.09443425 0.6408334 0.06679141
## 0.0674 0.09458160 0.6406166 0.06691560
## 0.0676 0.09472803 0.6404808 0.06703876
## 0.0678 0.09487454 0.6403516 0.06716194
## 0.0680 0.09502123 0.6402180 0.06728516
## 0.0682 0.09516814 0.6400796 0.06740845
## 0.0684 0.09531526 0.6399360 0.06753181
## 0.0686 0.09546240 0.6398005 0.06765505
## 0.0688 0.09560906 0.6397058 0.06777782
## 0.0690 0.09575594 0.6396075 0.06790085
## 0.0692 0.09590302 0.6395052 0.06802409
## 0.0694 0.09605030 0.6393990 0.06814747
## 0.0696 0.09619779 0.6392885 0.06827101
## 0.0698 0.09634548 0.6391737 0.06839475
## 0.0700 0.09649338 0.6390542 0.06851871
## 0.0702 0.09664147 0.6389299 0.06864278
## 0.0704 0.09678977 0.6388005 0.06876687
## 0.0706 0.09693826 0.6386658 0.06889116
## 0.0708 0.09708696 0.6385256 0.06901572
## 0.0710 0.09723585 0.6383795 0.06914032
## 0.0712 0.09738493 0.6382274 0.06926499
## 0.0714 0.09753421 0.6380687 0.06938976
## 0.0716 0.09768369 0.6379034 0.06951456
## 0.0718 0.09783321 0.6377434 0.06963923
## 0.0720 0.09798291 0.6375778 0.06976396
## 0.0722 0.09813280 0.6374049 0.06988879
## 0.0724 0.09828287 0.6372262 0.07001378
## 0.0726 0.09843295 0.6370662 0.07013877
## 0.0728 0.09858322 0.6368994 0.07026384
## 0.0730 0.09873368 0.6367246 0.07038906
## 0.0732 0.09888433 0.6365413 0.07051431
## 0.0734 0.09903517 0.6363491 0.07063962
## 0.0736 0.09918619 0.6361474 0.07076505
## 0.0738 0.09933679 0.6359806 0.07089004
## 0.0740 0.09948676 0.6358684 0.07101445
## 0.0742 0.09963651 0.6357877 0.07113863
## 0.0744 0.09978645 0.6357029 0.07126291
## 0.0746 0.09993599 0.6356741 0.07138660
## 0.0748 0.10008567 0.6356477 0.07151044
## 0.0750 0.10023553 0.6356206 0.07163440
## 0.0752 0.10038557 0.6355927 0.07175843
## 0.0754 0.10053557 0.6355874 0.07188238
## 0.0756 0.10068569 0.6355874 0.07200637
## 0.0758 0.10083600 0.6355874 0.07213039
## 0.0760 0.10098647 0.6355874 0.07225450
## 0.0762 0.10113712 0.6355874 0.07237883
## 0.0764 0.10128794 0.6355874 0.07250332
## 0.0766 0.10143894 0.6355874 0.07262797
## 0.0768 0.10159010 0.6355874 0.07275272
## 0.0770 0.10174144 0.6355874 0.07287762
## 0.0772 0.10189294 0.6355874 0.07300269
## 0.0774 0.10204462 0.6355874 0.07312782
## 0.0776 0.10219646 0.6355874 0.07325319
## 0.0778 0.10234847 0.6355874 0.07337885
## 0.0780 0.10250064 0.6355874 0.07350464
## 0.0782 0.10265298 0.6355874 0.07363050
## 0.0784 0.10280549 0.6355874 0.07375649
## 0.0786 0.10295816 0.6355874 0.07388272
## 0.0788 0.10311099 0.6355874 0.07400902
## 0.0790 0.10326399 0.6355874 0.07413548
## 0.0792 0.10341715 0.6355874 0.07426195
## 0.0794 0.10357047 0.6355874 0.07438848
## 0.0796 0.10372395 0.6355874 0.07451512
## 0.0798 0.10387759 0.6355874 0.07464191
## 0.0800 0.10403138 0.6355874 0.07476875
## 0.0802 0.10418534 0.6355874 0.07489561
## 0.0804 0.10433946 0.6355874 0.07502270
## 0.0806 0.10449373 0.6355874 0.07514984
## 0.0808 0.10464816 0.6355874 0.07527708
## 0.0810 0.10480274 0.6355874 0.07540440
## 0.0812 0.10495748 0.6355874 0.07553180
## 0.0814 0.10511237 0.6355874 0.07565934
## 0.0816 0.10526742 0.6355874 0.07578693
## 0.0818 0.10542262 0.6355874 0.07591467
## 0.0820 0.10557797 0.6355874 0.07604245
## 0.0822 0.10573347 0.6355874 0.07617027
## 0.0824 0.10588913 0.6355874 0.07629817
## 0.0826 0.10604493 0.6355874 0.07642611
## 0.0828 0.10620089 0.6355874 0.07655424
## 0.0830 0.10635676 0.6367800 0.07668237
## 0.0832 0.10650637 0.6367800 0.07680517
## 0.0834 0.10665612 0.6367800 0.07692827
## 0.0836 0.10680601 0.6367800 0.07705152
## 0.0838 0.10695604 0.6367800 0.07717487
## 0.0840 0.10710359 0.6359810 0.07729644
## 0.0842 0.10724723 0.6359810 0.07741480
## 0.0844 0.10738810 0.6352574 0.07753082
## 0.0846 0.10752555 0.6352574 0.07764388
## 0.0848 0.10766312 0.6352574 0.07775705
## 0.0850 0.10780083 0.6352574 0.07787031
## 0.0852 0.10793866 0.6352574 0.07798365
## 0.0854 0.10807661 0.6352574 0.07809741
## 0.0856 0.10821470 0.6352574 0.07821140
## 0.0858 0.10834714 0.6375306 0.07832041
## 0.0860 0.10847953 0.6375306 0.07842937
## 0.0862 0.10860196 0.6345138 0.07853012
## 0.0864 0.10871640 0.6303803 0.07862439
## 0.0866 0.10882853 0.6303803 0.07871674
## 0.0868 0.10893691 0.6323395 0.07880635
## 0.0870 0.10904303 0.6323395 0.07889427
## 0.0872 0.10914443 0.6278031 0.07897825
## 0.0874 0.10923181 0.6263924 0.07905069
## 0.0876 0.10930490 0.6353794 0.07911149
## 0.0878 0.10937349 0.6353794 0.07916847
## 0.0880 0.10944215 0.6353794 0.07922547
## 0.0882 0.10951087 0.6353794 0.07928247
## 0.0884 0.10957966 0.6353794 0.07933950
## 0.0886 0.10964347 0.6335537 0.07939220
## 0.0888 0.10970486 0.6337036 0.07944290
## 0.0890 0.10974487 0.6259503 0.07947652
## 0.0892 0.10976753 0.6009704 0.07949590
## 0.0894 0.10977941 0.6009704 0.07950619
## 0.0896 0.10979129 0.6009704 0.07951648
## 0.0898 0.10979785 0.5842176 0.07952231
## 0.0900 0.10980354 0.5842176 0.07952742
## 0.0902 0.10980924 0.5842176 0.07953253
## 0.0904 0.10981495 0.5842176 0.07953764
## 0.0906 0.10982066 0.5842176 0.07954275
## 0.0908 0.10982136 NaN 0.07954336
## 0.0910 0.10982136 NaN 0.07954336
## 0.0912 0.10982136 NaN 0.07954336
## 0.0914 0.10982136 NaN 0.07954336
## 0.0916 0.10982136 NaN 0.07954336
## 0.0918 0.10982136 NaN 0.07954336
## 0.0920 0.10982136 NaN 0.07954336
## 0.0922 0.10982136 NaN 0.07954336
## 0.0924 0.10982136 NaN 0.07954336
## 0.0926 0.10982136 NaN 0.07954336
## 0.0928 0.10982136 NaN 0.07954336
## 0.0930 0.10982136 NaN 0.07954336
## 0.0932 0.10982136 NaN 0.07954336
## 0.0934 0.10982136 NaN 0.07954336
## 0.0936 0.10982136 NaN 0.07954336
## 0.0938 0.10982136 NaN 0.07954336
## 0.0940 0.10982136 NaN 0.07954336
## 0.0942 0.10982136 NaN 0.07954336
## 0.0944 0.10982136 NaN 0.07954336
## 0.0946 0.10982136 NaN 0.07954336
## 0.0948 0.10982136 NaN 0.07954336
## 0.0950 0.10982136 NaN 0.07954336
## 0.0952 0.10982136 NaN 0.07954336
## 0.0954 0.10982136 NaN 0.07954336
## 0.0956 0.10982136 NaN 0.07954336
## 0.0958 0.10982136 NaN 0.07954336
## 0.0960 0.10982136 NaN 0.07954336
## 0.0962 0.10982136 NaN 0.07954336
## 0.0964 0.10982136 NaN 0.07954336
## 0.0966 0.10982136 NaN 0.07954336
## 0.0968 0.10982136 NaN 0.07954336
## 0.0970 0.10982136 NaN 0.07954336
## 0.0972 0.10982136 NaN 0.07954336
## 0.0974 0.10982136 NaN 0.07954336
## 0.0976 0.10982136 NaN 0.07954336
## 0.0978 0.10982136 NaN 0.07954336
## 0.0980 0.10982136 NaN 0.07954336
## 0.0982 0.10982136 NaN 0.07954336
## 0.0984 0.10982136 NaN 0.07954336
## 0.0986 0.10982136 NaN 0.07954336
## 0.0988 0.10982136 NaN 0.07954336
## 0.0990 0.10982136 NaN 0.07954336
## 0.0992 0.10982136 NaN 0.07954336
## 0.0994 0.10982136 NaN 0.07954336
## 0.0996 0.10982136 NaN 0.07954336
## 0.0998 0.10982136 NaN 0.07954336
## 0.1000 0.10982136 NaN 0.07954336
##
## Tuning parameter 'alpha' was held constant at a value of 1
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were alpha = 1 and lambda = 0.0036.
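From lambda = 0.0908 onward the RMSE and MAE stop changing and Rsquared is NaN. This pattern is consistent with the penalty having shrunk every coefficient to zero, so that the model predicts a constant and the correlation with the observed values is undefined. The sketch below illustrates both effects. It uses the soft-thresholding rule, which is the lasso solution for a single standardized predictor and is a simplification of what glmnet does; all numbers are hypothetical.

```r
# Soft-thresholding: shrink a coefficient toward zero by lambda, clipping at zero
soft_threshold <- function(beta, lambda) {
  sign(beta) * pmax(abs(beta) - lambda, 0)
}

# Hypothetical unpenalized coefficients
beta <- c(0.08, -0.03, 0.01)

soft_threshold(beta, lambda = 0.02)  # the smallest coefficient is already zero
soft_threshold(beta, lambda = 0.10)  # every coefficient is zero

# With all coefficients at zero the prediction is a constant (hypothetical value)
obs <- c(0.24, 0.20, 0.26, 0.15, 0.30)
pred <- rep(0.25, length(obs))

# Correlation with a constant vector is undefined, so R returns NA (with a warning)
r2 <- suppressWarnings(cor(obs, pred)^2)
r2

# Averaging values that are all NA with na.rm = TRUE yields NaN
mean(c(r2, r2), na.rm = TRUE)
```

In practice this suggests the upper part of the grid adds nothing: the selected lambda of 0.0036 sits near the lower end, so a narrower grid would give the same answer in less time.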
set.seed(55)
# Stochastic gradient boosting (gbm) with caret's default tuning grid
control <- trainControl(method = "repeatedcv",
                        number = 5,
                        repeats = 5,
                        verboseIter = FALSE)
gbm_model <- train(SalePrice ~ .,
                   data = train_df,
                   method = "gbm",
                   trControl = control)
gbm_model
## Stochastic Gradient Boosting
##
## 1169 samples
## 79 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 5 times)
## Summary of sample sizes: 934, 935, 935, 936, 936, 936, ...
## Resampling results across tuning parameters:
##
## interaction.depth n.trees RMSE Rsquared MAE
## 1 50 0.04988156 0.8108476 0.03208121
## 1 100 0.04549907 0.8325540 0.02850307
## 1 150 0.04474028 0.8378777 0.02753639
## 2 50 0.04559703 0.8335385 0.02811659
## 2 100 0.04330617 0.8482977 0.02585975
## 2 150 0.04227966 0.8551180 0.02493875
## 3 50 0.04389564 0.8445934 0.02642121
## 3 100 0.04185455 0.8574481 0.02447859
## 3 150 0.04080081 0.8651538 0.02376450
##
## Tuning parameter 'shrinkage' was held constant at a value of 0.1
##
## Tuning parameter 'n.minobsinnode' was held constant at a value of 10
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were n.trees = 150, interaction.depth =
## 3, shrinkage = 0.1 and n.minobsinnode = 10.
set.seed(55)
# XGBoost (xgbTree) with caret's default tuning grid
control <- trainControl(method = "repeatedcv",
                        number = 5,
                        repeats = 5,
                        verboseIter = FALSE)
xgboost_model <- train(SalePrice ~ .,
                       data = train_df,
                       method = "xgbTree",
                       trControl = control)
xgboost_model
## eXtreme Gradient Boosting
##
## 1169 samples
## 79 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 5 times)
## Summary of sample sizes: 934, 935, 935, 936, 936, 936, ...
## Resampling results across tuning parameters:
##
## eta max_depth colsample_bytree subsample nrounds RMSE Rsquared MAE
## 0.3 1 0.6 0.50 50 0.04666810 0.8222816 0.03001082
## 0.3 1 0.6 0.50 100 0.04389896 0.8434252 0.02736020
## 0.3 1 0.6 0.50 150 0.04307133 0.8492586 0.02637073
## 0.3 1 0.6 0.75 50 0.04522468 0.8340231 0.02928508
## 0.3 1 0.6 0.75 100 0.04314480 0.8490399 0.02684440
## 0.3 1 0.6 0.75 150 0.04210926 0.8564211 0.02577014
## 0.3 1 0.6 1.00 50 0.04450992 0.8400402 0.02864958
## 0.3 1 0.6 1.00 100 0.04179542 0.8590594 0.02638172
## 0.3 1 0.6 1.00 150 0.04072539 0.8662231 0.02527026
## 0.3 1 0.8 0.50 50 0.04603343 0.8282794 0.02950558
## 0.3 1 0.8 0.50 100 0.04460901 0.8379362 0.02779960
## 0.3 1 0.8 0.50 150 0.04420564 0.8408984 0.02686287
## 0.3 1 0.8 0.75 50 0.04546706 0.8322798 0.02919927
## 0.3 1 0.8 0.75 100 0.04342871 0.8469451 0.02690909
## 0.3 1 0.8 0.75 150 0.04251441 0.8533721 0.02572395
## 0.3 1 0.8 1.00 50 0.04472090 0.8382726 0.02859874
## 0.3 1 0.8 1.00 100 0.04229729 0.8551669 0.02639199
## 0.3 1 0.8 1.00 150 0.04128870 0.8618764 0.02535859
## 0.3 2 0.6 0.50 50 0.04455563 0.8392005 0.02705550
## 0.3 2 0.6 0.50 100 0.04354272 0.8461002 0.02594649
## 0.3 2 0.6 0.50 150 0.04292920 0.8509368 0.02545092
## 0.3 2 0.6 0.75 50 0.04195861 0.8565356 0.02573945
## 0.3 2 0.6 0.75 100 0.04119405 0.8630885 0.02444504
## 0.3 2 0.6 0.75 150 0.04068898 0.8664690 0.02394855
## 0.3 2 0.6 1.00 50 0.04124803 0.8616308 0.02524177
## 0.3 2 0.6 1.00 100 0.03993713 0.8702749 0.02399348
## 0.3 2 0.6 1.00 150 0.03950825 0.8732313 0.02347391
## 0.3 2 0.8 0.50 50 0.04426876 0.8415029 0.02652959
## 0.3 2 0.8 0.50 100 0.04276487 0.8525843 0.02516193
## 0.3 2 0.8 0.50 150 0.04233918 0.8562013 0.02485469
## 0.3 2 0.8 0.75 50 0.04215356 0.8556172 0.02533370
## 0.3 2 0.8 0.75 100 0.04096162 0.8636792 0.02421569
## 0.3 2 0.8 0.75 150 0.04047555 0.8670668 0.02377764
## 0.3 2 0.8 1.00 50 0.04096968 0.8651001 0.02529678
## 0.3 2 0.8 1.00 100 0.03967595 0.8736744 0.02404901
## 0.3 2 0.8 1.00 150 0.03928677 0.8760867 0.02358020
## 0.3 3 0.6 0.50 50 0.04395390 0.8446129 0.02614266
## 0.3 3 0.6 0.50 100 0.04350677 0.8480621 0.02588245
## 0.3 3 0.6 0.50 150 0.04354990 0.8478870 0.02593841
## 0.3 3 0.6 0.75 50 0.04192066 0.8590029 0.02497857
## 0.3 3 0.6 0.75 100 0.04151973 0.8620221 0.02449108
## 0.3 3 0.6 0.75 150 0.04130315 0.8634516 0.02435101
## 0.3 3 0.6 1.00 50 0.04086992 0.8653029 0.02416022
## 0.3 3 0.6 1.00 100 0.04022240 0.8692102 0.02351333
## 0.3 3 0.6 1.00 150 0.04005197 0.8702001 0.02330846
## 0.3 3 0.8 0.50 50 0.04259803 0.8533075 0.02544564
## 0.3 3 0.8 0.50 100 0.04217077 0.8563402 0.02503152
## 0.3 3 0.8 0.50 150 0.04191975 0.8581681 0.02498328
## 0.3 3 0.8 0.75 50 0.04116572 0.8620571 0.02458161
## 0.3 3 0.8 0.75 100 0.04071964 0.8655462 0.02414855
## 0.3 3 0.8 0.75 150 0.04065327 0.8659396 0.02417589
## 0.3 3 0.8 1.00 50 0.04082889 0.8658059 0.02422676
## 0.3 3 0.8 1.00 100 0.04017897 0.8702059 0.02366818
## 0.3 3 0.8 1.00 150 0.04008968 0.8708165 0.02357368
## 0.4 1 0.6 0.50 50 0.04909205 0.8070808 0.03129159
## 0.4 1 0.6 0.50 100 0.04629840 0.8279314 0.02878475
## 0.4 1 0.6 0.50 150 0.04529632 0.8353251 0.02761522
## 0.4 1 0.6 0.75 50 0.04537138 0.8331203 0.02936444
## 0.4 1 0.6 0.75 100 0.04326242 0.8488621 0.02694053
## 0.4 1 0.6 0.75 150 0.04233739 0.8556701 0.02591381
## 0.4 1 0.6 1.00 50 0.04477728 0.8376783 0.02923004
## 0.4 1 0.6 1.00 100 0.04198712 0.8572797 0.02661285
## 0.4 1 0.6 1.00 150 0.04075005 0.8656987 0.02544173
## 0.4 1 0.8 0.50 50 0.04672960 0.8222226 0.03009048
## 0.4 1 0.8 0.50 100 0.04477340 0.8377956 0.02796800
## 0.4 1 0.8 0.50 150 0.04399271 0.8436509 0.02684121
## 0.4 1 0.8 0.75 50 0.04608577 0.8282226 0.02947253
## 0.4 1 0.8 0.75 100 0.04395508 0.8437708 0.02705066
## 0.4 1 0.8 0.75 150 0.04328696 0.8477514 0.02614667
## 0.4 1 0.8 1.00 50 0.04464676 0.8386428 0.02898076
## 0.4 1 0.8 1.00 100 0.04201217 0.8575910 0.02645329
## 0.4 1 0.8 1.00 150 0.04085886 0.8652383 0.02533884
## 0.4 2 0.6 0.50 50 0.04570131 0.8312397 0.02780905
## 0.4 2 0.6 0.50 100 0.04495909 0.8366255 0.02675767
## 0.4 2 0.6 0.50 150 0.04480582 0.8379476 0.02648729
## 0.4 2 0.6 0.75 50 0.04306136 0.8504284 0.02651737
## 0.4 2 0.6 0.75 100 0.04174834 0.8591871 0.02518546
## 0.4 2 0.6 0.75 150 0.04161111 0.8603478 0.02478980
## 0.4 2 0.6 1.00 50 0.04180129 0.8588491 0.02586885
## 0.4 2 0.6 1.00 100 0.04065542 0.8669360 0.02463968
## 0.4 2 0.6 1.00 150 0.04015888 0.8702007 0.02414782
## 0.4 2 0.8 0.50 50 0.04503367 0.8345773 0.02727373
## 0.4 2 0.8 0.50 100 0.04380941 0.8429576 0.02638282
## 0.4 2 0.8 0.50 150 0.04367132 0.8443337 0.02608367
## 0.4 2 0.8 0.75 50 0.04217579 0.8545164 0.02591073
## 0.4 2 0.8 0.75 100 0.04106282 0.8621827 0.02479389
## 0.4 2 0.8 0.75 150 0.04078791 0.8637340 0.02437764
## 0.4 2 0.8 1.00 50 0.04113160 0.8630861 0.02548640
## 0.4 2 0.8 1.00 100 0.04037025 0.8684953 0.02436406
## 0.4 2 0.8 1.00 150 0.03990118 0.8716743 0.02389855
## 0.4 3 0.6 0.50 50 0.04485524 0.8379344 0.02750330
## 0.4 3 0.6 0.50 100 0.04437346 0.8420352 0.02741406
## 0.4 3 0.6 0.50 150 0.04455475 0.8406748 0.02764022
## 0.4 3 0.6 0.75 50 0.04291738 0.8509567 0.02607903
## 0.4 3 0.6 0.75 100 0.04277700 0.8520056 0.02587065
## 0.4 3 0.6 0.75 150 0.04266493 0.8529617 0.02584057
## 0.4 3 0.6 1.00 50 0.04256776 0.8535387 0.02556683
## 0.4 3 0.6 1.00 100 0.04195062 0.8576612 0.02500330
## 0.4 3 0.6 1.00 150 0.04180185 0.8587902 0.02489852
## 0.4 3 0.8 0.50 50 0.04503155 0.8372493 0.02751847
## 0.4 3 0.8 0.50 100 0.04497415 0.8375359 0.02750749
## 0.4 3 0.8 0.50 150 0.04509112 0.8366474 0.02761151
## 0.4 3 0.8 0.75 50 0.04316254 0.8483986 0.02583099
## 0.4 3 0.8 0.75 100 0.04291841 0.8506797 0.02554132
## 0.4 3 0.8 0.75 150 0.04281373 0.8514134 0.02561996
## 0.4 3 0.8 1.00 50 0.04129532 0.8628121 0.02514468
## 0.4 3 0.8 1.00 100 0.04086382 0.8655789 0.02469098
## 0.4 3 0.8 1.00 150 0.04074503 0.8663587 0.02452470
##
## Tuning parameter 'gamma' was held constant at a value of 0
## Tuning parameter 'min_child_weight' was held constant at a value of 1
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were nrounds = 150, max_depth = 2, eta
## = 0.3, gamma = 0, colsample_bytree = 0.8, min_child_weight = 1 and subsample
## = 1.
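The 108 rows in the results table are the combinations of tuning values caret tried by default. The sketch below shows how the same grid could be written out explicitly with `expand.grid`, in the same way the lasso grid was defined. The object name `xgb_grid` is hypothetical, and the values are copied from the output above.

```r
# Hypothetical explicit grid reproducing the combinations listed in the output
xgb_grid <- expand.grid(nrounds = c(50, 100, 150),
                        max_depth = 1:3,
                        eta = c(0.3, 0.4),
                        gamma = 0,
                        colsample_bytree = c(0.6, 0.8),
                        min_child_weight = 1,
                        subsample = c(0.5, 0.75, 1))

# Number of combinations: 3 * 3 * 2 * 2 * 3
nrow(xgb_grid)
```

A data frame like this could be passed to `train()` through `tuneGrid`. Since the best rows above all use `eta = 0.3` and `subsample = 1`, a grid that explores smaller `eta` values may be a more useful next step than rerunning the defaults.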
# Collect the best (lowest) cross-validated RMSE of each model
rmse_models <- data.frame(model = c("lm", "rf", "lasso", "gbm", "xgboost"),
                          RMSE = rep(0, times = 5))
lm <- min(lm_model$results$RMSE)
rf <- min(rf_model$results$RMSE)
lasso <- min(lasso_model$results$RMSE)
gbm <- min(gbm_model$results$RMSE)
xgboost <- min(xgboost_model$results$RMSE)
# Fill the RMSE column in the same order as the model names
rmse_models$RMSE <- c(lm, rf, lasso, gbm, xgboost)
rmse_models
##     model       RMSE
## 1      lm 0.07895367
## 2      rf 0.04076242
## 3   lasso 0.05110274
## 4     gbm 0.04080081
## 5 xgboost 0.03928677
rmse_models %>%
  ggplot(aes(model, RMSE)) +
  geom_point(color = "#6fa8dc", size = 3) +
  scale_x_discrete(name = "Model",
                   limits = c("lm", "lasso", "rf", "gbm", "xgboost")) +
  theme_minimal() +
  labs(title = "RMSE of train models with RepeatedCV 5 reps") +
  theme(plot.title = element_text(hjust = 0.5))
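The comparison above can also be done programmatically. The following is a minimal sketch in base R that rebuilds the table from the RMSE values printed above and ranks the models; the names `ranked` and `best_model` are introduced here for illustration only and are not part of the original analysis.

```r
# Rebuild the comparison table from the cross-validated RMSE values printed above.
rmse_models <- data.frame(
  model = c("lm", "rf", "lasso", "gbm", "xgboost"),
  RMSE  = c(0.07895367, 0.04076242, 0.05110274, 0.04080081, 0.03928677)
)

# Sort from best (lowest RMSE) to worst.
ranked <- rmse_models[order(rmse_models$RMSE), ]

# Illustrative name: the model with the smallest RMSE.
best_model <- as.character(ranked$model[1])

print(ranked)
print(best_model)
```

Ranking by RMSE rather than reading the plot makes the choice of the final model explicit and repeatable if the models are retrained.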
The eXtreme Gradient Boosting (XGBoost) model, which has the lowest RMSE, is used to predict on the test data.
xgboost_pred <- predict(xgboost_model, newdata = test_df)
xgboost_pred
## [1] 0.13150251 0.44251642 0.25439680 0.17116161 0.14257967 0.15466762
## [7] 0.13675860 0.02799611 0.07449380 0.05345283 0.17289414 0.10403908
## [13] 0.15116212 0.28815109 0.05480476 0.11477421 0.11968883 0.10978008
## [19] 0.06646258 0.41327205 0.26152790 0.45851231 0.07919944 0.07636599
## [25] 0.24337997 0.24508503 0.15811375 0.12343169 0.30912596 0.28613058
## [31] 0.22271970 0.10243414 0.23148160 0.23764129 0.24974819 0.11196637
## [37] 0.21666287 0.15151586 0.28270608 0.14155176 0.13543865 0.26642409
## [43] 0.22964168 0.18288073 0.14534533 0.13676836 0.13539080 0.15092991
## [49] 0.33427900 0.21512036 0.24807067 0.22445007 0.21119039 0.32499161
## [55] 0.19627394 0.17907615 0.31452334 0.20481187 0.14581819 0.26950783
## [61] 0.09320923 0.49407145 0.30332670 0.05720381 0.34738237 0.12261432
## [67] 0.26205599 0.36085150 0.06661431 0.14788111 0.39886516 0.18646064
## [73] 0.15192996 0.18586703 0.23160821 0.05268119 0.15621568 0.12527624
## [79] 0.18733585 0.34893045 0.08611120 0.28900957 0.08832998 0.24036060
## [85] 0.09708202 0.13939299 0.15248437 0.16009468 0.24568677 0.22660200
## [91] 0.11887615 0.27373832 0.09770814 0.11843304 0.19613321 0.14204632
## [97] 0.33370957 0.45607248 0.19986430 0.46360207 0.13290356 0.21588953
## [103] 0.21792135 0.09701488 0.15922403 0.22134537 0.12542012 0.21933360
## [109] 0.11122136 0.45698202 0.28073066 0.11548221 0.09895558 0.11429666
## [115] 0.21407373 0.11335277 0.10693188 0.14832817 0.23150600 0.18698069
## [121] 0.16716379 0.25191271 0.38142872 0.43913841 0.36676925 0.02669199
## [127] 0.33664861 0.27153680 0.40985250 0.15649031 0.43348277 0.08837362
## [133] 0.26677337 0.15738320 0.38629380 0.18934050 0.28617549 0.16239768
## [139] 0.16678494 0.15947028 0.24206369 0.24941534 0.07198805 0.38887686
## [145] 0.21220617 0.32104957 0.26406756 0.13399285 0.26776198 0.40295184
## [151] 0.21369059 0.15836287 0.16764195 0.14922999 0.17664856 0.56240207
## [157] 0.22914331 0.14560381 0.08633146 0.14659415 0.14809911 0.18666627
## [163] 0.14015257 0.18344754 0.14261857 0.19146712 0.14181578 0.20267802
## [169] 0.20886506 0.16610961 0.15190566 0.24268080 0.39168942 0.43590230
## [175] 0.15152995 0.11650676 0.22192718 0.14591286 0.21977021 0.21988076
## [181] 0.22568497 0.27137905 0.27486956 0.25116962 0.12541458 0.38111007
## [187] 0.16188148 0.12980931 0.17115897 0.21023828 0.09753671 0.17998758
## [193] 0.28217828 0.19206160 0.21467726 0.11713880 0.28229335 0.06138631
## [199] 0.41509894 0.20702155 0.13764364 0.54881531 0.22673418 0.27730498
## [205] 0.41055229 0.23054627 0.14308000 0.31015179 0.22322597 0.12924951
## [211] 0.15492305 0.15638930 0.14095055 0.13639787 0.10015264 0.21818969
## [217] 0.13948335 0.07463707 0.17571108 0.28945550 0.24134709 0.25195652
## [223] 0.25830767 0.10971408 0.22105433 0.20453313 0.18250215 0.17515698
## [229] 0.13202187 0.09886902 0.22007719 0.23213108 0.09765033 0.12236453
## [235] 0.06487637 0.16121688 0.14195485 0.16901554 0.14525028 0.16411780
## [241] 0.26578727 0.24731115 0.17508003 0.13228719 0.20442735 0.21831466
## [247] 0.15339199 0.14577495 0.12253786 0.15535423 0.15445724 0.23753421
## [253] 0.23274359 0.15245944 0.16486089 0.16169873 0.14730576 0.18775576
## [259] 0.37469339 0.19548468 0.47440901 0.10817692 0.26178977 0.12640603
## [265] 0.13251229 0.05894818 0.32768363 0.45641726 0.19958034 0.18651265
## [271] 0.27640688 0.28581539 0.27146292 0.14948590 0.17688911 0.08806807
## [277] 0.15303612 0.14972748 0.16079675 0.18398161 0.14680324 0.10120692
## [283] 0.24523167 0.27141416 0.21198121 0.14505224 0.09843111 0.21380579
## [289] 0.06798968 0.15830880 0.20961148
In this part, the predicted values are compared with the actual values to compute the Root Mean Square Error (RMSE).
RMSE of train model vs. RMSE of test model
# RMSE on the test set: square root of the mean squared prediction error.
test_rmse <- sqrt(mean((test_df$SalePrice - xgboost_pred)^2))
cat("The train model RMSE: ",min(xgboost_model$results$RMSE))
## The train model RMSE: 0.03928677
cat("\nThe test model RMSE: ", test_rmse)
##
## The test model RMSE: 0.03853753
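The RMSE calculation used above can be wrapped in a small reusable function. This is a sketch only: the helper name `rmse` and the toy vectors are assumptions made for illustration and do not come from the original analysis.

```r
# Hypothetical helper (not defined in the original analysis):
# root mean square error between actual and predicted values.
rmse <- function(actual, predicted) {
  stopifnot(length(actual) == length(predicted))
  sqrt(mean((actual - predicted)^2))
}

# Toy vectors, made up purely for illustration.
actual    <- c(0.10, 0.20, 0.30, 0.40)
predicted <- c(0.12, 0.18, 0.33, 0.36)

toy_rmse <- rmse(actual, predicted)
print(toy_rmse)
```

With the objects from this analysis, the equivalent call would be `rmse(test_df$SalePrice, xgboost_pred)`.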
In this House Prices project, the data were cleaned and normalized before modeling.
eXtreme Gradient Boosting (XGBoost) is the best model, giving the lowest Root Mean Square Error (RMSE) on the training data, 0.03928677.
When the model is applied to the test data (an unseen data set), the RMSE is 0.03853753.
Because the test RMSE is close to the training RMSE, the xgboost model appears to generalize well and is an efficient model for predicting house prices.
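How close the two RMSE values reported above are can be expressed as a relative difference. The sketch below uses the two numbers printed in the output; the variable name `relative_gap` is introduced here for illustration.

```r
# RMSE values as reported in the output above.
train_rmse <- 0.03928677  # best cross-validated RMSE of the xgboost model
test_rmse  <- 0.03853753  # RMSE on the held-out test set

# Relative difference, expressed as a fraction of the training RMSE.
relative_gap <- (train_rmse - test_rmse) / train_rmse
cat("Relative gap:", round(100 * relative_gap, 1), "%\n")
```

The gap is about 1.9% of the training RMSE, which is consistent with the model not overfitting the training data.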