When we build a model, we often start with many potential features. Including every available feature can lead to overfitting, where the model performs well on training data but poorly on unseen data. Conversely, a model with too few features can underfit, missing real structure in the data.
Model selection techniques aim to find the optimal subset of features that balances the model’s complexity with its explanatory power, often measured by the Akaike Information Criterion (AIC).
The first block of code handles our R environment. Note the packages required for this specific task:
| Package | Primary Function | Purpose |
|---|---|---|
| tidyverse | Data manipulation (%>%) | General utility and pipeline management. |
| modeldata | ames dataset | Provides the housing data for our analysis. |
| MASS | stepAIC() | Core function for stepwise model selection. |
| modelr | model_matrix() | Crucial for correctly handling categorical features. |
# Load Necessary Packages
library(MASS) # For Stepwise Selection (stepAIC)
library(tidyverse)
library(modeldata)
library(modelr) # For model_matrix()
Our goal is to build a Logistic Regression model, which requires a binary (0/1) outcome variable.
We start by selecting a diverse set of potentially relevant
predictors from the Ames dataset. Importantly, we use
drop_na() to quickly remove rows with any missing values
across these selected columns.
Since Sale_Price is continuous, we create a binary
classification problem: predicting whether a house’s sale price is
greater than $160,000.
\[ Y = \begin{cases} 1 & \text{if } \text{Sale_Price} > \$160,000 \\ 0 & \text{if } \text{Sale_Price} \le \$160,000 \end{cases} \]
# Load the Ames Housing data
data(ames)
# Select a broad set of key predictors, then drop rows with missing values in them
full_data <- ames %>%
  dplyr::select(Gr_Liv_Area, Year_Built, Garage_Area, Total_Bsmt_SF, Garage_Cars, Year_Remod_Add,
                First_Flr_SF, Full_Bath, Garage_Type, Fireplaces, Second_Flr_SF, Longitude, Foundation,
                Lot_Area, Latitude, Heating_QC, TotRms_AbvGrd, Open_Porch_SF, Mas_Vnr_Area, Overall_Cond,
                Sale_Price) %>%
  drop_na() %>% # Runs after select(), so only the chosen columns are checked
  mutate(Sale_Price_gt160K = as.factor(if_else(Sale_Price > 160000, 1, 0))) %>%
  dplyr::select(-Sale_Price) # Remove the original continuous outcome
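The thresholding rule can be checked on a few made-up prices before trusting it on the full data. The sketch below uses base R's ifelse() rather than dplyr's if_else(), and the toy_prices vector is purely illustrative. Note that the strict inequality sends a house priced at exactly $160,000 to class 0.

```r
# Hypothetical prices chosen to sit below, on, and above the threshold
toy_prices <- c(120000, 160000, 160001, 215000)

# Same rule as in the text: 1 if strictly greater than 160,000, otherwise 0
toy_outcome <- factor(ifelse(toy_prices > 160000, 1, 0), levels = c(0, 1))

# Pair each price with its class, then count the classes
print(data.frame(price = toy_prices, outcome = toy_outcome))
print(table(toy_outcome))
```

Running table(full_data$Sale_Price_gt160K) on the real data gives the same kind of class count and is a quick way to check how balanced the two classes are.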
This step is easy to overlook. Logistic Regression (glm) is fitted
on a numeric design matrix. The formula interface would expand
categorical variables (like Garage_Type or Foundation) into dummy
variables automatically (one column for each level, excluding one
reference level), but stepAIC() would then add or drop each factor
as a whole. Building the dummy columns ourselves lets the search
treat each level as a separate candidate term.
The modelr::model_matrix() function handles this process
efficiently, creating a design matrix (\(\mathbf{X}\)).
# Convert all categorical predictors into dummy variables and remove the intercept column
design_tibble<-model_matrix(full_data, ~ . -Sale_Price_gt160K) %>%
dplyr::select(-`(Intercept)`)
# Combine the new binary outcome with the fully numeric design matrix
model_data<-cbind(full_data$Sale_Price_gt160K, design_tibble) %>%
rename(Sale_Price_gt160K=`full_data$Sale_Price_gt160K`)
model_data is now the complete dataset required for our
glm function. Every column except the outcome is a numeric
predictor.
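To see what the expansion does, here is a minimal sketch on a made-up data frame. It uses base R's model.matrix(), which modelr::model_matrix() wraps. The toy data frame and its column names are assumptions for illustration only.

```r
# Hypothetical data: one numeric predictor and one three-level factor
toy <- data.frame(
  area = c(900, 1500, 2100, 1200),
  foundation = factor(c("CBlock", "PConc", "Slab", "PConc"))
)

# Expand every predictor; the first factor level (CBlock) becomes the reference
X <- model.matrix(~ ., data = toy)
print(colnames(X))

# Drop the intercept column, mirroring the select(-`(Intercept)`) step
X_no_intercept <- X[, colnames(X) != "(Intercept)", drop = FALSE]
print(X_no_intercept)
```

The factor contributes one 0/1 column per non-reference level, which matches the naming pattern of columns such as FoundationPConc in model_data.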
Logistic Regression is a Generalized Linear Model (GLM) that models the relationship between predictors and the logit of the probability of the outcome.
The relationship is defined by:
Random Component: The response follows a Binomial distribution (each observation is a Bernoulli trial).
Systematic Component: The linear predictor \(\eta = \beta_0 + \beta_1 X_1 + \dots + \beta_k X_k\).
Link Function: The logit function, which links the mean of the response (the probability \(p\)) to the linear predictor:
\[\text{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \eta\]
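The link and its inverse are easy to try out numerically. The helper names logit and inv_logit below are illustrative, not part of any package used here; base R's qlogis() and plogis() compute the same two transformations.

```r
# Logit: probability -> log-odds (the scale of the linear predictor)
logit <- function(p) log(p / (1 - p))

# Inverse logit: log-odds -> probability
inv_logit <- function(eta) 1 / (1 + exp(-eta))

p <- c(0.1, 0.5, 0.9)
eta <- logit(p)

print(eta)             # log-odds; p = 0.5 maps to 0
print(inv_logit(eta))  # recovers the original probabilities
print(qlogis(p))       # base R equivalent of logit(p)
```

A probability of 0.5 corresponds to a linear predictor of 0, and probabilities equally far from 0.5 map to log-odds of equal size and opposite sign.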
Stepwise selection requires defining the limits of the search space:
Null Model: The simplest model, containing only the intercept. This is the lower bound of our search.
Full Model: The most complex model, containing all available predictors. This is the upper bound of our search.
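With no predictors, the fitted probability is the same for every observation, so the intercept of the null model is simply the logit of the observed share of 1s. A small check of that, assuming the built-in mtcars data (with am as a stand-in binary outcome) in place of the housing data:

```r
# Intercept-only logistic regression on a built-in binary variable
toy_null <- glm(am ~ 1, data = mtcars, family = binomial)

# Observed share of 1s in the outcome
share_of_ones <- mean(mtcars$am)

print(unname(coef(toy_null)))  # fitted intercept
print(qlogis(share_of_ones))   # logit of the observed share
```

The two printed values agree up to the numerical tolerance of the fitting algorithm.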
# Null Model (Intercept Only)
null_model<-glm(Sale_Price_gt160K~1,
data=model_data,
family=binomial)
# Full Model (All Predictors)
full_model<-glm(Sale_Price_gt160K~.,
data = model_data,
family = binomial)
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
# Report the AIC of the full model (the upper bound of the search)
full_model$aic
[1] 1256.541
The Akaike Information Criterion (AIC) is a measure of the relative quality of statistical models for a given set of data. It is defined as:
\[\text{AIC} = 2k - 2\ln(\mathcal{L})\]
where \(k\) is the number of estimated parameters (model complexity) and \(\mathcal{L}\) is the maximized value of the model's likelihood function (goodness of fit). Lower values are better, so the goal is to find the model with the minimum AIC.
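The formula can be verified by hand on any fitted model. The sketch below assumes the built-in mtcars data as a stand-in for the housing data and compares a manual calculation with R's AIC().

```r
# A small logistic regression on built-in data
fit <- glm(am ~ wt, data = mtcars, family = binomial)

k <- length(coef(fit))              # intercept + one slope = 2 parameters
log_lik <- as.numeric(logLik(fit))  # maximized log-likelihood, ln(L)

manual_aic <- 2 * k - 2 * log_lik
print(c(manual = manual_aic, builtin = AIC(fit)))

# For 0/1 outcomes the residual deviance equals -2 * ln(L),
# so AIC can also be read as deviance + 2k
print(deviance(fit) + 2 * k)
```

The deviance form is the one that appears in the stepAIC() trace in this section: the intercept-only model has deviance 4061.8 and one parameter, giving 4061.8 + 2 = 4063.8.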
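The MASS::stepAIC() function performs the automated
search. Starting from the null model with direction = "both", each
step evaluates every single-term addition (+) and removal (-)
within the scope and takes the move that lowers AIC the most. The
search stops when making no change (<none>) gives the lowest AIC.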
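Because the trace on the housing data is long, it can help to run the same kind of call on a tiny problem first. This is a sketch under stated assumptions: the built-in mtcars data replaces the housing data, and the two candidate predictors (wt and hp) are arbitrary choices for illustration.

```r
library(MASS)  # provides stepAIC()

# Lower bound of the search: intercept-only model
toy_null <- glm(am ~ 1, data = mtcars, family = binomial)

# Search between the null model and a model with both candidates
toy_step <- stepAIC(toy_null,
                    scope = list(lower = ~ 1, upper = ~ wt + hp),
                    direction = "both",
                    trace = FALSE)  # suppress the step-by-step printout

print(formula(toy_step))  # terms that survived the search
print(toy_step$anova)     # summary of the steps taken
print(c(null = AIC(toy_null), selected = AIC(toy_step)))
```

The selected model's AIC is never higher than that of the starting model, because the search only accepts moves that lower AIC. The anova component gives a compact record of the path, which is easier to read than the full trace.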
step_model <- stepAIC(null_model,
scope = list(lower = null_model, upper = full_model),
direction = "both",
trace = TRUE)
Start: AIC=4063.84
Sale_Price_gt160K ~ 1
Df Deviance AIC
+ Gr_Liv_Area 1 2793.3 2797.3
+ Full_Bath 1 2803.4 2807.4
+ Garage_Cars 1 2807.3 2811.3
+ Year_Built 1 2837.2 2841.2
+ FoundationPConc 1 3084.4 3088.4
+ Year_Remod_Add 1 3101.6 3105.6
+ Garage_Area 1 3127.7 3131.7
+ Total_Bsmt_SF 1 3332.6 3336.6
+ First_Flr_SF 1 3400.6 3404.6
+ Fireplaces 1 3443.6 3447.6
+ Garage_TypeDetchd 1 3466.4 3470.4
+ TotRms_AbvGrd 1 3577.9 3581.9
+ Overall_CondAverage 1 3610.3 3614.3
+ FoundationCBlock 1 3623.5 3627.5
+ Heating_QCTypical 1 3644.2 3648.2
+ Mas_Vnr_Area 1 3722.4 3726.4
+ Open_Porch_SF 1 3762.6 3766.6
+ Longitude 1 3821.3 3825.3
+ Lot_Area 1 3827.9 3831.9
+ Second_Flr_SF 1 3831.7 3835.7
+ Garage_TypeNo_Garage 1 3881.1 3885.1
+ Latitude 1 3886.2 3890.2
+ Garage_TypeBuiltIn 1 3922.8 3926.8
+ Overall_CondAbove_Average 1 3962.1 3966.1
+ Overall_CondGood 1 3982.6 3986.6
+ Overall_CondBelow_Average 1 4001.4 4005.4
+ Overall_CondFair 1 4014.7 4018.7
+ FoundationSlab 1 4016.0 4020.0
+ Overall_CondVery_Good 1 4029.1 4033.1
+ Heating_QCFair 1 4029.8 4033.8
+ Heating_QCGood 1 4037.9 4041.9
+ Garage_TypeCarPort 1 4048.4 4052.4
+ Garage_TypeBasment 1 4054.4 4058.4
+ Overall_CondPoor 1 4054.5 4058.5
+ Heating_QCPoor 1 4057.7 4061.7
+ FoundationStone 1 4059.5 4063.5
<none> 4061.8 4063.8
+ Garage_TypeMore_Than_Two_Types 1 4060.7 4064.7
+ Overall_CondExcellent 1 4061.6 4065.6
+ FoundationWood 1 4061.6 4065.6
Step: AIC=2797.3
Sale_Price_gt160K ~ Gr_Liv_Area
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Year_Built 1 1804.2 1810.2
+ Year_Remod_Add 1 2195.8 2201.8
+ FoundationPConc 1 2222.9 2228.9
+ Garage_Cars 1 2225.6 2231.6
+ Garage_TypeDetchd 1 2403.4 2409.4
+ Garage_Area 1 2433.4 2439.4
+ Full_Bath 1 2466.7 2472.7
+ Total_Bsmt_SF 1 2494.2 2500.2
+ Overall_CondAverage 1 2527.2 2533.2
+ Heating_QCTypical 1 2552.9 2558.9
+ FoundationCBlock 1 2617.5 2623.5
+ Longitude 1 2621.3 2627.3
+ Fireplaces 1 2644.7 2650.7
+ Garage_TypeNo_Garage 1 2646.4 2652.4
+ TotRms_AbvGrd 1 2672.0 2678.0
+ First_Flr_SF 1 2674.3 2680.3
+ Second_Flr_SF 1 2694.7 2700.7
+ Latitude 1 2707.8 2713.8
+ Overall_CondBelow_Average 1 2712.5 2718.5
+ Mas_Vnr_Area 1 2721.7 2727.7
+ Open_Porch_SF 1 2728.0 2734.0
+ FoundationSlab 1 2732.7 2738.7
+ Overall_CondFair 1 2743.2 2749.2
+ Overall_CondAbove_Average 1 2748.8 2754.8
+ Overall_CondGood 1 2761.7 2767.7
+ Heating_QCFair 1 2771.7 2777.7
+ Garage_TypeBuiltIn 1 2773.0 2779.0
+ Overall_CondVery_Good 1 2773.5 2779.5
+ Lot_Area 1 2781.2 2787.2
+ Overall_CondPoor 1 2781.3 2787.3
+ FoundationStone 1 2781.6 2787.6
+ Garage_TypeCarPort 1 2785.1 2791.1
+ Heating_QCGood 1 2785.3 2791.3
+ Garage_TypeMore_Than_Two_Types 1 2787.1 2793.1
+ Garage_TypeBasment 1 2787.8 2793.8
<none> 2793.3 2797.3
+ Heating_QCPoor 1 2791.8 2797.8
+ Overall_CondExcellent 1 2793.0 2799.0
+ FoundationWood 1 2793.2 2799.2
- Gr_Liv_Area 1 4061.8 4063.8
Step: AIC=1810.23
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Fireplaces 1 1633.8 1641.8
+ Garage_Cars 1 1702.5 1710.5
+ Total_Bsmt_SF 1 1733.0 1741.0
+ Garage_Area 1 1743.2 1751.2
+ FoundationSlab 1 1744.0 1752.0
+ Garage_TypeNo_Garage 1 1744.3 1752.3
+ Year_Remod_Add 1 1748.3 1756.3
+ Heating_QCTypical 1 1750.5 1758.5
+ Garage_TypeDetchd 1 1762.0 1770.0
+ Lot_Area 1 1763.1 1771.1
+ TotRms_AbvGrd 1 1765.9 1773.9
+ FoundationPConc 1 1778.6 1786.6
+ Overall_CondFair 1 1780.9 1788.9
+ First_Flr_SF 1 1780.9 1788.9
+ Overall_CondBelow_Average 1 1781.7 1789.7
+ Overall_CondGood 1 1783.4 1791.4
+ Overall_CondExcellent 1 1784.5 1792.5
+ Second_Flr_SF 1 1785.7 1793.7
+ FoundationCBlock 1 1791.6 1799.6
+ Full_Bath 1 1792.8 1800.8
+ Overall_CondVery_Good 1 1793.7 1801.7
+ Overall_CondAverage 1 1794.7 1802.7
+ Latitude 1 1795.2 1803.2
+ Open_Porch_SF 1 1795.5 1803.5
+ Overall_CondPoor 1 1796.3 1804.3
+ Mas_Vnr_Area 1 1798.4 1806.4
+ Heating_QCGood 1 1801.0 1809.0
+ Garage_TypeBasment 1 1801.1 1809.1
+ Garage_TypeCarPort 1 1801.8 1809.8
<none> 1804.2 1810.2
+ FoundationWood 1 1803.0 1811.0
+ FoundationStone 1 1803.6 1811.6
+ Garage_TypeMore_Than_Two_Types 1 1803.7 1811.7
+ Garage_TypeBuiltIn 1 1803.8 1811.8
+ Heating_QCFair 1 1804.0 1812.0
+ Overall_CondAbove_Average 1 1804.0 1812.0
+ Heating_QCPoor 1 1804.1 1812.1
+ Longitude 1 1804.2 1812.2
- Year_Built 1 2793.3 2797.3
- Gr_Liv_Area 1 2837.2 2841.2
Step: AIC=1641.83
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Year_Remod_Add 1 1552.2 1562.2
+ Garage_Cars 1 1558.6 1568.6
+ Heating_QCTypical 1 1568.4 1578.4
+ FoundationSlab 1 1579.2 1589.2
+ FoundationPConc 1 1583.8 1593.8
+ Total_Bsmt_SF 1 1587.2 1597.2
+ Garage_Area 1 1588.4 1598.4
+ Full_Bath 1 1598.2 1608.2
+ FoundationCBlock 1 1598.5 1608.5
+ Garage_TypeNo_Garage 1 1602.2 1612.2
+ Overall_CondFair 1 1608.3 1618.3
+ Overall_CondExcellent 1 1609.4 1619.4
+ Overall_CondBelow_Average 1 1612.9 1622.9
+ Overall_CondGood 1 1613.0 1623.0
+ Lot_Area 1 1615.4 1625.4
+ Garage_TypeDetchd 1 1616.1 1626.1
+ TotRms_AbvGrd 1 1617.7 1627.7
+ Overall_CondVery_Good 1 1618.4 1628.4
+ Open_Porch_SF 1 1623.9 1633.9
+ Overall_CondPoor 1 1625.0 1635.0
+ First_Flr_SF 1 1626.6 1636.6
+ Overall_CondAverage 1 1628.1 1638.1
+ Garage_TypeBasment 1 1628.4 1638.4
+ Latitude 1 1628.5 1638.5
+ Second_Flr_SF 1 1628.7 1638.7
+ Heating_QCGood 1 1629.1 1639.1
+ FoundationStone 1 1630.5 1640.5
+ Garage_TypeCarPort 1 1631.8 1641.8
<none> 1633.8 1641.8
+ Mas_Vnr_Area 1 1633.1 1643.1
+ Longitude 1 1633.4 1643.4
+ FoundationWood 1 1633.6 1643.6
+ Garage_TypeMore_Than_Two_Types 1 1633.6 1643.6
+ Overall_CondAbove_Average 1 1633.6 1643.6
+ Heating_QCFair 1 1633.6 1643.6
+ Heating_QCPoor 1 1633.8 1643.8
+ Garage_TypeBuiltIn 1 1633.8 1643.8
- Fireplaces 1 1804.2 1810.2
- Gr_Liv_Area 1 2295.3 2301.3
- Year_Built 1 2644.8 2650.8
Step: AIC=1562.22
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Df Deviance AIC
+ Garage_Cars 1 1471.3 1483.3
+ Garage_Area 1 1502.9 1514.9
+ Total_Bsmt_SF 1 1503.4 1515.4
+ FoundationSlab 1 1504.8 1516.8
+ Garage_TypeNo_Garage 1 1519.4 1531.4
+ Full_Bath 1 1521.3 1533.3
+ Heating_QCTypical 1 1521.4 1533.4
+ Lot_Area 1 1527.5 1539.5
+ Garage_TypeDetchd 1 1529.8 1541.8
+ FoundationPConc 1 1531.4 1543.4
+ Overall_CondFair 1 1533.9 1545.9
+ Overall_CondBelow_Average 1 1535.6 1547.6
+ First_Flr_SF 1 1538.9 1550.9
+ Second_Flr_SF 1 1542.0 1554.0
+ TotRms_AbvGrd 1 1542.7 1554.7
+ FoundationCBlock 1 1543.3 1555.3
+ Overall_CondExcellent 1 1543.5 1555.5
+ Overall_CondPoor 1 1544.5 1556.5
+ Overall_CondGood 1 1546.1 1558.1
+ Latitude 1 1546.3 1558.3
+ Garage_TypeBasment 1 1546.6 1558.6
+ Open_Porch_SF 1 1546.8 1558.8
+ Mas_Vnr_Area 1 1547.0 1559.0
+ Heating_QCGood 1 1548.6 1560.6
+ Garage_TypeCarPort 1 1549.5 1561.5
<none> 1552.2 1562.2
+ Overall_CondVery_Good 1 1550.6 1562.6
+ Heating_QCFair 1 1551.0 1563.0
+ Overall_CondAverage 1 1551.1 1563.1
+ FoundationStone 1 1551.4 1563.4
+ Overall_CondAbove_Average 1 1551.7 1563.7
+ FoundationWood 1 1551.8 1563.8
+ Longitude 1 1552.1 1564.1
+ Heating_QCPoor 1 1552.2 1564.2
+ Garage_TypeBuiltIn 1 1552.2 1564.2
+ Garage_TypeMore_Than_Two_Types 1 1552.2 1564.2
- Year_Remod_Add 1 1633.8 1641.8
- Fireplaces 1 1748.3 1756.3
- Year_Built 1 1967.2 1975.2
- Gr_Liv_Area 1 2119.1 2127.1
Step: AIC=1483.28
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Total_Bsmt_SF 1 1418.9 1432.9
+ FoundationSlab 1 1427.1 1441.1
+ Garage_TypeDetchd 1 1434.0 1448.0
+ Heating_QCTypical 1 1441.7 1455.7
+ Full_Bath 1 1448.1 1462.1
+ FoundationPConc 1 1450.4 1464.4
+ Overall_CondBelow_Average 1 1454.0 1468.0
+ Overall_CondFair 1 1454.9 1468.9
+ Lot_Area 1 1455.4 1469.4
+ TotRms_AbvGrd 1 1462.8 1476.8
+ Overall_CondPoor 1 1463.3 1477.3
+ First_Flr_SF 1 1463.7 1477.7
+ Overall_CondGood 1 1464.3 1478.3
+ FoundationCBlock 1 1464.4 1478.4
+ Overall_CondExcellent 1 1464.9 1478.9
+ Open_Porch_SF 1 1465.5 1479.5
+ Second_Flr_SF 1 1465.9 1479.9
+ Garage_TypeBasment 1 1467.0 1481.0
+ Garage_TypeMore_Than_Two_Types 1 1467.4 1481.4
+ Latitude 1 1467.8 1481.8
+ Mas_Vnr_Area 1 1468.3 1482.3
+ Heating_QCGood 1 1468.7 1482.7
+ Overall_CondAverage 1 1468.8 1482.8
+ Garage_TypeCarPort 1 1469.0 1483.0
<none> 1471.3 1483.3
+ Overall_CondVery_Good 1 1469.4 1483.4
+ Heating_QCFair 1 1469.7 1483.7
+ Garage_TypeNo_Garage 1 1470.2 1484.2
+ Overall_CondAbove_Average 1 1470.4 1484.4
+ FoundationWood 1 1470.7 1484.7
+ FoundationStone 1 1470.9 1484.9
+ Garage_Area 1 1471.0 1485.0
+ Garage_TypeBuiltIn 1 1471.2 1485.2
+ Heating_QCPoor 1 1471.2 1485.2
+ Longitude 1 1471.3 1485.3
- Garage_Cars 1 1552.2 1562.2
- Year_Remod_Add 1 1558.6 1568.6
- Fireplaces 1 1642.3 1652.3
- Year_Built 1 1676.6 1686.6
- Gr_Liv_Area 1 1862.5 1872.5
Warning: glm.fit: algorithm did not converge
Step: AIC=1432.91
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Heating_QCTypical 1 1388.8 1404.8
+ Garage_TypeDetchd 1 1390.0 1406.0
+ FoundationPConc 1 1395.5 1411.5
+ Full_Bath 1 1397.2 1413.2
+ Overall_CondBelow_Average 1 1398.4 1414.4
+ Overall_CondFair 1 1398.8 1414.8
+ FoundationSlab 1 1400.0 1416.0
+ FoundationCBlock 1 1403.4 1419.4
+ Second_Flr_SF 1 1406.0 1422.0
+ Overall_CondPoor 1 1409.3 1425.3
+ First_Flr_SF 1 1409.7 1425.7
+ Overall_CondExcellent 1 1411.1 1427.1
+ TotRms_AbvGrd 1 1411.2 1427.2
+ Overall_CondGood 1 1411.3 1427.3
+ Lot_Area 1 1411.7 1427.7
+ Garage_TypeMore_Than_Two_Types 1 1414.7 1430.7
+ Open_Porch_SF 1 1415.1 1431.1
+ Heating_QCGood 1 1415.2 1431.2
+ Garage_TypeBuiltIn 1 1415.4 1431.4
+ Overall_CondAverage 1 1415.8 1431.8
+ Garage_Area 1 1415.9 1431.9
+ Garage_TypeNo_Garage 1 1416.2 1432.2
+ Latitude 1 1416.2 1432.2
+ Overall_CondVery_Good 1 1416.6 1432.6
<none> 1418.9 1432.9
+ Garage_TypeBasment 1 1416.9 1432.9
+ Garage_TypeCarPort 1 1417.0 1433.0
+ Overall_CondAbove_Average 1 1417.6 1433.6
+ Mas_Vnr_Area 1 1417.7 1433.7
+ Heating_QCFair 1 1418.0 1434.0
+ FoundationStone 1 1418.2 1434.2
+ FoundationWood 1 1418.6 1434.6
+ Longitude 1 1418.7 1434.7
+ Heating_QCPoor 1 1418.9 1434.9
- Total_Bsmt_SF 1 1471.3 1483.3
- Garage_Cars 1 1503.4 1515.4
- Year_Remod_Add 1 1519.3 1531.3
- Fireplaces 1 1568.2 1580.2
- Year_Built 1 1575.9 1587.9
- Gr_Liv_Area 1 1775.4 1787.4
Warning: glm.fit: algorithm did not converge
Step: AIC=1404.79
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Garage_TypeDetchd 1 1359.0 1377.0
+ Overall_CondFair 1 1366.6 1384.6
+ Full_Bath 1 1366.8 1384.8
+ Overall_CondBelow_Average 1 1370.0 1388.0
+ FoundationSlab 1 1374.8 1392.8
+ FoundationPConc 1 1376.3 1394.3
+ Second_Flr_SF 1 1378.7 1396.7
+ Overall_CondPoor 1 1379.0 1397.0
+ Overall_CondGood 1 1380.8 1398.8
+ Lot_Area 1 1381.7 1399.7
+ Overall_CondExcellent 1 1381.7 1399.7
+ First_Flr_SF 1 1381.7 1399.7
+ FoundationCBlock 1 1381.9 1399.9
+ TotRms_AbvGrd 1 1383.1 1401.1
+ Overall_CondAverage 1 1384.6 1402.6
+ Garage_TypeMore_Than_Two_Types 1 1384.8 1402.8
+ Garage_TypeBasment 1 1385.8 1403.8
+ Garage_TypeBuiltIn 1 1385.8 1403.8
+ Latitude 1 1386.3 1404.3
+ Garage_Area 1 1386.6 1404.6
+ Garage_TypeNo_Garage 1 1386.7 1404.7
+ Overall_CondAbove_Average 1 1386.7 1404.7
<none> 1388.8 1404.8
+ Open_Porch_SF 1 1386.8 1404.8
+ Overall_CondVery_Good 1 1387.0 1405.0
+ Garage_TypeCarPort 1 1387.1 1405.1
+ Mas_Vnr_Area 1 1387.6 1405.6
+ FoundationStone 1 1388.1 1406.1
+ FoundationWood 1 1388.4 1406.4
+ Heating_QCPoor 1 1388.8 1406.8
+ Heating_QCFair 1 1388.8 1406.8
+ Heating_QCGood 1 1388.8 1406.8
+ Longitude 1 1388.8 1406.8
- Heating_QCTypical 1 1418.9 1432.9
- Total_Bsmt_SF 1 1441.7 1455.7
- Year_Remod_Add 1 1450.7 1464.7
- Garage_Cars 1 1471.5 1485.5
- Year_Built 1 1540.7 1554.7
- Fireplaces 1 1542.4 1556.4
- Gr_Liv_Area 1 1731.6 1745.6
Step: AIC=1377
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Full_Bath 1 1335.5 1355.5
+ Overall_CondFair 1 1337.2 1357.2
+ Overall_CondBelow_Average 1 1337.4 1357.4
+ Second_Flr_SF 1 1341.8 1361.8
+ FoundationPConc 1 1341.8 1361.8
+ FoundationSlab 1 1343.4 1363.4
+ First_Flr_SF 1 1344.7 1364.7
+ FoundationCBlock 1 1345.7 1365.7
+ Overall_CondPoor 1 1348.4 1368.4
+ Overall_CondGood 1 1350.4 1370.4
+ Garage_TypeMore_Than_Two_Types 1 1351.1 1371.1
+ Overall_CondExcellent 1 1352.1 1372.1
+ TotRms_AbvGrd 1 1352.9 1372.9
+ Garage_TypeBasment 1 1354.3 1374.3
+ Garage_TypeNo_Garage 1 1355.9 1375.9
+ Overall_CondAverage 1 1356.4 1376.4
+ Lot_Area 1 1356.5 1376.5
+ Latitude 1 1356.5 1376.5
+ Garage_TypeCarPort 1 1356.8 1376.8
<none> 1359.0 1377.0
+ Overall_CondVery_Good 1 1357.1 1377.1
+ Garage_Area 1 1357.3 1377.3
+ Open_Porch_SF 1 1357.4 1377.4
+ Garage_TypeBuiltIn 1 1357.5 1377.5
+ Mas_Vnr_Area 1 1357.7 1377.7
+ Overall_CondAbove_Average 1 1357.8 1377.8
+ FoundationWood 1 1358.3 1378.3
+ FoundationStone 1 1358.6 1378.6
+ Heating_QCGood 1 1358.7 1378.7
+ Longitude 1 1358.9 1378.9
+ Heating_QCPoor 1 1359.0 1379.0
+ Heating_QCFair 1 1359.0 1379.0
- Garage_TypeDetchd 1 1388.8 1404.8
- Heating_QCTypical 1 1390.0 1406.0
- Total_Bsmt_SF 1 1402.6 1418.6
- Year_Remod_Add 1 1423.8 1439.8
- Year_Built 1 1438.7 1454.7
- Garage_Cars 1 1443.9 1459.9
- Fireplaces 1 1480.1 1496.1
- Gr_Liv_Area 1 1678.5 1694.5
Step: AIC=1355.54
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath
Df Deviance AIC
+ Overall_CondFair 1 1312.8 1334.8
+ Overall_CondBelow_Average 1 1314.9 1336.9
+ FoundationPConc 1 1319.0 1341.0
+ FoundationSlab 1 1319.5 1341.5
+ Second_Flr_SF 1 1321.0 1343.0
+ First_Flr_SF 1 1323.7 1345.7
+ Overall_CondPoor 1 1323.8 1345.8
+ FoundationCBlock 1 1324.5 1346.5
+ Overall_CondGood 1 1325.4 1347.4
+ TotRms_AbvGrd 1 1327.6 1349.6
+ Garage_TypeMore_Than_Two_Types 1 1328.3 1350.3
+ Overall_CondExcellent 1 1329.6 1351.6
+ Lot_Area 1 1331.5 1353.5
+ Garage_TypeNo_Garage 1 1331.8 1353.8
+ Garage_TypeBasment 1 1332.0 1354.0
+ Overall_CondAverage 1 1332.0 1354.0
<none> 1335.5 1355.5
+ Garage_TypeCarPort 1 1333.5 1355.5
+ Overall_CondVery_Good 1 1333.8 1355.8
+ Mas_Vnr_Area 1 1333.9 1355.9
+ Overall_CondAbove_Average 1 1334.0 1356.0
+ Open_Porch_SF 1 1334.1 1356.1
+ Garage_TypeBuiltIn 1 1334.4 1356.4
+ Latitude 1 1334.6 1356.6
+ Garage_Area 1 1334.7 1356.7
+ Longitude 1 1335.1 1357.1
+ FoundationWood 1 1335.2 1357.2
+ Heating_QCGood 1 1335.3 1357.3
+ FoundationStone 1 1335.4 1357.4
+ Heating_QCFair 1 1335.5 1357.5
+ Heating_QCPoor 1 1335.5 1357.5
- Full_Bath 1 1359.0 1377.0
- Garage_TypeDetchd 1 1366.8 1384.8
- Heating_QCTypical 1 1367.0 1385.0
- Total_Bsmt_SF 1 1378.0 1396.0
- Year_Built 1 1386.4 1404.4
- Year_Remod_Add 1 1394.9 1412.9
- Garage_Cars 1 1412.9 1430.9
- Fireplaces 1 1472.4 1490.4
- Gr_Liv_Area 1 1476.2 1494.2
Step: AIC=1334.77
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Overall_CondBelow_Average 1 1290.4 1314.4
+ FoundationSlab 1 1296.6 1320.6
+ FoundationPConc 1 1297.1 1321.1
+ Second_Flr_SF 1 1300.4 1324.4
+ Overall_CondPoor 1 1300.5 1324.5
+ FoundationCBlock 1 1301.5 1325.5
+ First_Flr_SF 1 1302.9 1326.9
+ Overall_CondGood 1 1303.2 1327.2
+ TotRms_AbvGrd 1 1303.4 1327.4
+ Garage_TypeMore_Than_Two_Types 1 1305.0 1329.0
+ Overall_CondExcellent 1 1306.7 1330.7
+ Overall_CondAverage 1 1306.9 1330.9
+ Lot_Area 1 1309.0 1333.0
+ Garage_TypeNo_Garage 1 1309.4 1333.4
+ Garage_TypeCarPort 1 1310.7 1334.7
<none> 1312.8 1334.8
+ Garage_TypeBasment 1 1310.9 1334.9
+ Overall_CondVery_Good 1 1311.0 1335.0
+ Mas_Vnr_Area 1 1311.3 1335.3
+ Open_Porch_SF 1 1311.5 1335.5
+ Garage_TypeBuiltIn 1 1311.7 1335.7
+ Latitude 1 1311.8 1335.8
+ Overall_CondAbove_Average 1 1312.0 1336.0
+ Garage_Area 1 1312.1 1336.1
+ Longitude 1 1312.3 1336.3
+ FoundationWood 1 1312.5 1336.5
+ Heating_QCFair 1 1312.5 1336.5
+ FoundationStone 1 1312.7 1336.7
+ Heating_QCPoor 1 1312.7 1336.7
+ Heating_QCGood 1 1312.8 1336.8
- Overall_CondFair 1 1335.5 1355.5
- Full_Bath 1 1337.2 1357.2
- Garage_TypeDetchd 1 1343.7 1363.7
- Heating_QCTypical 1 1346.5 1366.5
- Total_Bsmt_SF 1 1357.5 1377.5
- Year_Built 1 1364.0 1384.0
- Year_Remod_Add 1 1366.0 1386.0
- Garage_Cars 1 1387.8 1407.8
- Fireplaces 1 1455.3 1475.3
- Gr_Liv_Area 1 1459.2 1479.2
Step: AIC=1314.37
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair + Overall_CondBelow_Average
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ FoundationSlab 1 1274.8 1300.8
+ FoundationPConc 1 1275.5 1301.5
+ Overall_CondPoor 1 1277.3 1303.3
+ Overall_CondAverage 1 1279.3 1305.3
+ Second_Flr_SF 1 1280.0 1306.0
+ FoundationCBlock 1 1280.3 1306.3
+ TotRms_AbvGrd 1 1281.4 1307.4
+ First_Flr_SF 1 1282.6 1308.6
+ Overall_CondGood 1 1282.9 1308.9
+ Garage_TypeMore_Than_Two_Types 1 1284.5 1310.5
+ Overall_CondExcellent 1 1284.7 1310.7
+ Lot_Area 1 1285.4 1311.4
+ Garage_TypeNo_Garage 1 1286.5 1312.5
+ Garage_TypeCarPort 1 1288.2 1314.2
<none> 1290.4 1314.4
+ Garage_TypeBasment 1 1288.4 1314.4
+ Open_Porch_SF 1 1289.0 1315.0
+ Mas_Vnr_Area 1 1289.0 1315.0
+ Overall_CondVery_Good 1 1289.2 1315.2
+ Garage_TypeBuiltIn 1 1289.3 1315.3
+ Longitude 1 1289.7 1315.7
+ Heating_QCFair 1 1289.9 1315.9
+ Garage_Area 1 1290.0 1316.0
+ Latitude 1 1290.0 1316.0
+ FoundationWood 1 1290.0 1316.0
+ FoundationStone 1 1290.2 1316.2
+ Overall_CondAbove_Average 1 1290.3 1316.3
+ Heating_QCPoor 1 1290.3 1316.3
+ Heating_QCGood 1 1290.4 1316.4
- Overall_CondBelow_Average 1 1312.8 1334.8
- Full_Bath 1 1313.8 1335.8
- Overall_CondFair 1 1314.9 1336.9
- Heating_QCTypical 1 1322.0 1344.0
- Garage_TypeDetchd 1 1324.0 1346.0
- Year_Built 1 1337.3 1359.3
- Total_Bsmt_SF 1 1338.9 1360.9
- Year_Remod_Add 1 1341.3 1363.3
- Garage_Cars 1 1365.5 1387.5
- Fireplaces 1 1430.2 1452.2
- Gr_Liv_Area 1 1447.3 1469.3
Step: AIC=1300.8
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair + Overall_CondBelow_Average +
FoundationSlab
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ FoundationCBlock 1 1261.6 1289.6
+ Overall_CondPoor 1 1261.9 1289.9
+ Overall_CondAverage 1 1263.1 1291.1
+ FoundationPConc 1 1263.4 1291.4
+ Overall_CondGood 1 1266.8 1294.8
+ TotRms_AbvGrd 1 1267.2 1295.2
+ Garage_TypeMore_Than_Two_Types 1 1268.3 1296.3
+ Lot_Area 1 1269.0 1297.0
+ Overall_CondExcellent 1 1269.3 1297.3
+ Second_Flr_SF 1 1271.4 1299.4
+ Garage_TypeNo_Garage 1 1271.6 1299.6
+ Garage_TypeBasment 1 1272.3 1300.3
+ Garage_TypeCarPort 1 1272.8 1300.8
<none> 1274.8 1300.8
+ First_Flr_SF 1 1273.1 1301.1
+ Longitude 1 1273.3 1301.3
+ Open_Porch_SF 1 1273.6 1301.6
+ Mas_Vnr_Area 1 1273.7 1301.7
+ Overall_CondVery_Good 1 1273.7 1301.7
+ Heating_QCFair 1 1274.4 1302.4
+ FoundationWood 1 1274.4 1302.4
+ Garage_TypeBuiltIn 1 1274.4 1302.4
+ Latitude 1 1274.5 1302.5
+ Overall_CondAbove_Average 1 1274.7 1302.7
+ FoundationStone 1 1274.7 1302.7
+ Heating_QCPoor 1 1274.7 1302.7
+ Garage_Area 1 1274.8 1302.8
+ Heating_QCGood 1 1274.8 1302.8
- FoundationSlab 1 1290.4 1314.4
- Overall_CondBelow_Average 1 1296.6 1320.6
- Total_Bsmt_SF 1 1297.8 1321.8
- Full_Bath 1 1299.2 1323.2
- Overall_CondFair 1 1299.5 1323.5
- Heating_QCTypical 1 1301.7 1325.7
- Garage_TypeDetchd 1 1310.2 1334.2
- Year_Remod_Add 1 1319.8 1343.8
- Year_Built 1 1326.1 1350.1
- Garage_Cars 1 1348.7 1372.7
- Fireplaces 1 1414.5 1438.5
- Gr_Liv_Area 1 1440.3 1464.3
Step: AIC=1289.56
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair + Overall_CondBelow_Average +
FoundationSlab + FoundationCBlock
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Overall_CondAverage 1 1243.0 1273.0
+ Overall_CondPoor 1 1247.5 1277.5
+ Overall_CondGood 1 1252.6 1282.6
+ Lot_Area 1 1254.4 1284.4
+ Garage_TypeMore_Than_Two_Types 1 1255.0 1285.0
+ TotRms_AbvGrd 1 1255.2 1285.2
+ Overall_CondExcellent 1 1255.5 1285.5
+ Garage_TypeNo_Garage 1 1257.5 1287.5
+ Garage_TypeBasment 1 1259.0 1289.0
+ Second_Flr_SF 1 1259.4 1289.4
+ Mas_Vnr_Area 1 1259.4 1289.4
<none> 1261.6 1289.6
+ Overall_CondVery_Good 1 1259.8 1289.8
+ Garage_TypeCarPort 1 1260.0 1290.0
+ Latitude 1 1260.5 1290.5
+ Open_Porch_SF 1 1260.7 1290.7
+ Overall_CondAbove_Average 1 1260.8 1290.8
+ Longitude 1 1260.8 1290.8
+ First_Flr_SF 1 1260.9 1290.9
+ FoundationWood 1 1260.9 1290.9
+ FoundationPConc 1 1261.0 1291.0
+ Heating_QCFair 1 1261.2 1291.2
+ Heating_QCGood 1 1261.2 1291.2
+ Garage_TypeBuiltIn 1 1261.3 1291.3
+ Heating_QCPoor 1 1261.5 1291.5
+ FoundationStone 1 1261.5 1291.5
+ Garage_Area 1 1261.5 1291.5
- FoundationCBlock 1 1274.8 1300.8
- Heating_QCTypical 1 1278.1 1304.1
- FoundationSlab 1 1280.3 1306.3
- Overall_CondBelow_Average 1 1281.8 1307.8
- Full_Bath 1 1283.9 1309.9
- Overall_CondFair 1 1286.6 1312.6
- Total_Bsmt_SF 1 1287.8 1313.8
- Year_Remod_Add 1 1291.8 1317.8
- Garage_TypeDetchd 1 1304.2 1330.2
- Year_Built 1 1307.7 1333.7
- Garage_Cars 1 1335.0 1361.0
- Fireplaces 1 1408.9 1434.9
- Gr_Liv_Area 1 1415.5 1441.5
Step: AIC=1273.01
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair + Overall_CondBelow_Average +
FoundationSlab + FoundationCBlock + Overall_CondAverage
Warning: glm.fit: algorithm did not converge
Df Deviance AIC
+ Overall_CondPoor 1 1226.2 1258.2
+ Lot_Area 1 1235.3 1267.3
+ TotRms_AbvGrd 1 1235.8 1267.8
+ Overall_CondAbove_Average 1 1236.0 1268.0
+ Garage_TypeMore_Than_Two_Types 1 1237.0 1269.0
+ Overall_CondExcellent 1 1237.2 1269.2
+ Overall_CondGood 1 1240.0 1272.0
+ Garage_TypeNo_Garage 1 1240.0 1272.0
+ Mas_Vnr_Area 1 1240.3 1272.3
+ Garage_TypeBasment 1 1240.7 1272.7
<none> 1243.0 1273.0
+ Second_Flr_SF 1 1241.4 1273.4
+ Garage_TypeCarPort 1 1241.6 1273.6
+ Latitude 1 1241.6 1273.6
+ Longitude 1 1241.9 1273.9
+ Open_Porch_SF 1 1242.1 1274.1
+ Overall_CondVery_Good 1 1242.1 1274.1
+ FoundationPConc 1 1242.2 1274.2
+ FoundationWood 1 1242.4 1274.4
+ Heating_QCGood 1 1242.6 1274.6
+ First_Flr_SF 1 1242.6 1274.6
+ Garage_TypeBuiltIn 1 1242.8 1274.8
+ Heating_QCFair 1 1242.9 1274.9
+ FoundationStone 1 1243.0 1275.0
+ Heating_QCPoor 1 1243.0 1275.0
+ Garage_Area 1 1243.0 1275.0
- Heating_QCTypical 1 1258.6 1286.6
- Overall_CondAverage 1 1261.6 1289.6
- FoundationCBlock 1 1263.1 1291.1
- FoundationSlab 1 1263.2 1291.2
- Year_Remod_Add 1 1263.6 1291.6
- Full_Bath 1 1267.4 1295.4
- Overall_CondBelow_Average 1 1269.8 1297.8
- Total_Bsmt_SF 1 1272.6 1300.6
- Overall_CondFair 1 1273.4 1301.4
- Garage_TypeDetchd 1 1283.3 1311.3
- Year_Built 1 1307.2 1335.2
- Garage_Cars 1 1316.8 1344.8
- Fireplaces 1 1388.7 1416.7
- Gr_Liv_Area 1 1404.5 1432.5
Step: AIC=1258.22
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair + Overall_CondBelow_Average +
FoundationSlab + FoundationCBlock + Overall_CondAverage +
Overall_CondPoor
Warning: glm.fit: algorithm did not converge
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
Df Deviance AIC
+ Overall_CondAbove_Average 1 1216.6 1250.6
+ Lot_Area 1 1217.6 1251.6
+ TotRms_AbvGrd 1 1219.4 1253.4
+ Overall_CondExcellent 1 1220.5 1254.5
+ Garage_TypeMore_Than_Two_Types 1 1221.2 1255.2
+ Garage_TypeNo_Garage 1 1223.6 1257.6
+ Mas_Vnr_Area 1 1223.7 1257.7
+ Overall_CondGood 1 1223.7 1257.7
+ Garage_TypeBasment 1 1223.9 1257.9
<none> 1226.2 1258.2
+ Second_Flr_SF 1 1224.2 1258.2
+ Garage_TypeCarPort 1 1224.8 1258.8
+ Latitude 1 1224.8 1258.8
+ Longitude 1 1225.3 1259.3
+ FoundationPConc 1 1225.4 1259.4
+ Overall_CondVery_Good 1 1225.4 1259.4
+ Open_Porch_SF 1 1225.5 1259.5
+ FoundationWood 1 1225.6 1259.6
+ First_Flr_SF 1 1225.7 1259.7
+ Heating_QCGood 1 1225.8 1259.8
+ Garage_TypeBuiltIn 1 1226.0 1260.0
+ Heating_QCPoor 1 1226.2 1260.2
+ FoundationStone 1 1226.2 1260.2
+ Heating_QCFair 1 1226.2 1260.2
+ Garage_Area 1 1226.2 1260.2
- Heating_QCTypical 1 1241.9 1271.9
- Overall_CondPoor 1 1243.0 1273.0
- Year_Remod_Add 1 1245.2 1275.2
- FoundationSlab 1 1246.6 1276.6
- Overall_CondAverage 1 1247.5 1277.5
- FoundationCBlock 1 1248.3 1278.3
- Overall_CondBelow_Average 1 1254.6 1284.6
- Overall_CondFair 1 1257.8 1287.8
- Total_Bsmt_SF 1 1258.2 1288.2
- Full_Bath 1 1263.4 1293.4
- Garage_TypeDetchd 1 1268.3 1298.3
- Year_Built 1 1289.9 1319.9
- Garage_Cars 1 1302.0 1332.0
- Fireplaces 1 1371.5 1401.5
- Gr_Liv_Area 1 1389.7 1419.7
Step: AIC=1250.58
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair + Overall_CondBelow_Average +
FoundationSlab + FoundationCBlock + Overall_CondAverage +
Overall_CondPoor + Overall_CondAbove_Average
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred (repeated)
Warning: glm.fit: algorithm did not converge (repeated)
Df Deviance AIC
+ Lot_Area 1 1208.8 1244.8
+ TotRms_AbvGrd 1 1210.0 1246.0
+ Garage_TypeMore_Than_Two_Types 1 1210.8 1246.8
+ Overall_CondExcellent 1 1212.2 1248.2
+ Mas_Vnr_Area 1 1213.9 1249.9
+ Garage_TypeNo_Garage 1 1214.3 1250.3
+ Garage_TypeBasment 1 1214.5 1250.5
<none> 1216.6 1250.6
+ Garage_TypeCarPort 1 1214.9 1250.9
+ Latitude 1 1215.0 1251.0
+ Second_Flr_SF 1 1215.1 1251.1
+ Open_Porch_SF 1 1215.5 1251.5
+ Overall_CondGood 1 1215.6 1251.6
+ FoundationPConc 1 1215.6 1251.6
+ Longitude 1 1215.8 1251.8
+ FoundationWood 1 1215.9 1251.9
+ Heating_QCGood 1 1216.0 1252.0
+ Garage_TypeBuiltIn 1 1216.2 1252.2
+ First_Flr_SF 1 1216.2 1252.2
+ Heating_QCPoor 1 1216.5 1252.5
+ FoundationStone 1 1216.5 1252.5
+ Heating_QCFair 1 1216.6 1252.6
+ Overall_CondVery_Good 1 1216.6 1252.6
+ Garage_Area 1 1216.6 1252.6
- Overall_CondAbove_Average 1 1226.2 1258.2
- Year_Remod_Add 1 1226.4 1258.4
- Heating_QCTypical 1 1231.6 1263.6
- Overall_CondPoor 1 1236.0 1268.0
- FoundationSlab 1 1236.8 1268.8
- FoundationCBlock 1 1238.6 1270.6
- Overall_CondAverage 1 1246.9 1278.9
- Total_Bsmt_SF 1 1249.5 1281.5
- Full_Bath 1 1252.7 1284.7
- Overall_CondBelow_Average 1 1253.1 1285.1
- Overall_CondFair 1 1254.6 1286.6
- Garage_TypeDetchd 1 1259.9 1291.9
- Year_Built 1 1289.8 1321.8
- Garage_Cars 1 1304.3 1336.3
- Fireplaces 1 1362.5 1394.5
- Gr_Liv_Area 1 1384.6 1416.6
Step: AIC=1244.79
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair + Overall_CondBelow_Average +
FoundationSlab + FoundationCBlock + Overall_CondAverage +
Overall_CondPoor + Overall_CondAbove_Average + Lot_Area
Warning: glm.fit: algorithm did not converge (repeated)
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred (repeated)
Df Deviance AIC
+ Garage_TypeMore_Than_Two_Types 1 1200.9 1238.9
+ Overall_CondExcellent 1 1204.2 1242.2
+ Mas_Vnr_Area 1 1205.6 1243.6
+ Garage_TypeNo_Garage 1 1206.0 1244.0
+ Latitude 1 1206.2 1244.2
<none> 1208.8 1244.8
+ Garage_TypeCarPort 1 1206.9 1244.9
+ Garage_TypeBasment 1 1207.1 1245.1
+ Overall_CondGood 1 1207.7 1245.7
+ Open_Porch_SF 1 1207.8 1245.8
+ Heating_QCGood 1 1207.9 1245.9
+ FoundationPConc 1 1208.1 1246.1
+ FoundationWood 1 1208.1 1246.1
+ Garage_TypeBuiltIn 1 1208.5 1246.5
+ Longitude 1 1208.6 1246.6
+ Garage_Area 1 1208.6 1246.6
+ Heating_QCFair 1 1208.7 1246.7
+ Heating_QCPoor 1 1208.8 1246.8
+ FoundationStone 1 1208.8 1246.8
+ Overall_CondVery_Good 1 1208.8 1246.8
+ TotRms_AbvGrd 1 1210.6 1248.6
- Lot_Area 1 1216.6 1250.6
- Overall_CondAbove_Average 1 1217.6 1251.6
- Year_Remod_Add 1 1218.8 1252.8
+ Second_Flr_SF 1 1216.3 1254.3
- Heating_QCTypical 1 1222.6 1256.6
+ First_Flr_SF 1 1218.9 1256.9
- Overall_CondPoor 1 1228.9 1262.9
- FoundationCBlock 1 1232.7 1266.7
- Total_Bsmt_SF 1 1235.5 1269.5
- Overall_CondAverage 1 1238.9 1272.9
- FoundationSlab 1 1241.3 1275.3
- Full_Bath 1 1244.7 1278.7
- Overall_CondFair 1 1246.2 1280.2
- Overall_CondBelow_Average 1 1246.3 1280.3
- Garage_TypeDetchd 1 1253.5 1287.5
- Garage_Cars 1 1282.4 1316.4
- Year_Built 1 1286.8 1320.8
- Fireplaces 1 1348.2 1382.2
- Gr_Liv_Area 1 1357.8 1391.8
Warning: glm.fit: algorithm did not converge
Step: AIC=1238.93
Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built + Fireplaces + Year_Remod_Add +
Garage_Cars + Total_Bsmt_SF + Heating_QCTypical + Garage_TypeDetchd +
Full_Bath + Overall_CondFair + Overall_CondBelow_Average +
FoundationSlab + FoundationCBlock + Overall_CondAverage +
Overall_CondPoor + Overall_CondAbove_Average + Lot_Area +
Garage_TypeMore_Than_Two_Types
Warning: glm.fit: algorithm did not converge (repeated)
Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred (repeated)
Df Deviance AIC
<none> 1200.9 1238.9
+ Garage_TypeNo_Garage 1 1198.9 1238.9
+ Garage_TypeCarPort 1 1199.0 1239.0
+ Garage_TypeBasment 1 1199.2 1239.2
+ Latitude 1 1199.3 1239.3
+ Overall_CondGood 1 1199.8 1239.8
+ Open_Porch_SF 1 1200.0 1240.0
+ Heating_QCGood 1 1200.0 1240.0
+ FoundationWood 1 1200.1 1240.1
+ FoundationPConc 1 1200.4 1240.4
+ Garage_TypeBuiltIn 1 1200.7 1240.7
+ Heating_QCFair 1 1200.8 1240.8
+ Longitude 1 1200.8 1240.8
+ Heating_QCPoor 1 1200.9 1240.9
+ Garage_Area 1 1200.9 1240.9
+ FoundationStone 1 1200.9 1240.9
+ Overall_CondVery_Good 1 1200.9 1240.9
+ TotRms_AbvGrd 1 1203.2 1243.2
- Garage_TypeMore_Than_Two_Types 1 1208.8 1244.8
- Year_Remod_Add 1 1210.0 1246.0
- Overall_CondAbove_Average 1 1210.4 1246.4
- Lot_Area 1 1210.8 1246.8
+ Second_Flr_SF 1 1208.5 1248.5
+ Overall_CondExcellent 1 1208.6 1248.6
- Heating_QCTypical 1 1214.2 1250.2
+ Mas_Vnr_Area 1 1210.3 1250.3
+ First_Flr_SF 1 1211.0 1251.0
- Overall_CondPoor 1 1220.2 1256.2
- FoundationCBlock 1 1224.7 1260.7
- Total_Bsmt_SF 1 1226.9 1262.9
- Overall_CondAverage 1 1231.0 1267.0
- FoundationSlab 1 1233.8 1269.8
- Full_Bath 1 1234.8 1270.8
- Overall_CondBelow_Average 1 1237.0 1273.0
- Overall_CondFair 1 1238.8 1274.8
- Garage_TypeDetchd 1 1248.4 1284.4
- Year_Built 1 1275.0 1311.0
- Garage_Cars 1 1281.8 1317.8
- Fireplaces 1 1331.5 1367.5
- Gr_Liv_Area 1 1350.5 1386.5
direction = "both": This specifies bidirectional (stepwise) selection. At each step the algorithm can either add a predictor (forward) or remove one already in the model (backward), and it stops when no single addition or removal lowers the AIC.
trace = TRUE: Instructs R to print the table of candidate moves and the resulting AIC at every step, showing the path the algorithm takes to the final model.
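To see these two arguments in isolation, here is a minimal sketch on a much smaller problem. It is not the Ames model: it assumes the built-in mtcars data, treats am as the binary outcome, and uses wt, hp and qsec as an arbitrary candidate set. The names null_fit and toy_step are illustrative only, and glm may emit convergence warnings on data this small.

```r
library(MASS)  # provides stepAIC()

# Toy data (assumption): mtcars ships with base R; `am` is already coded 0/1.
# Start from the intercept-only logistic regression.
null_fit <- glm(am ~ 1, data = mtcars, family = binomial)

# direction = "both": every step considers adding any term from `upper`
# that is not yet in the model and dropping any term that is.
# trace = FALSE keeps the console quiet; set it to TRUE to print the
# per-step candidate tables like the ones shown above.
toy_step <- stepAIC(
  null_fit,
  scope     = list(lower = ~ 1, upper = ~ wt + hp + qsec),
  direction = "both",
  trace     = FALSE
)

formula(toy_step)  # the selected model
toy_step$anova     # compact record of the steps taken
```

Even with trace = FALSE, the anova component of the returned object records which term was added or removed at each step, which is often easier to read than the full trace.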
summary(step_model)
Call:
glm(formula = Sale_Price_gt160K ~ Gr_Liv_Area + Year_Built +
Fireplaces + Year_Remod_Add + Garage_Cars + Total_Bsmt_SF +
Heating_QCTypical + Garage_TypeDetchd + Full_Bath + Overall_CondFair +
Overall_CondBelow_Average + FoundationSlab + FoundationCBlock +
Overall_CondAverage + Overall_CondPoor + Overall_CondAbove_Average +
Lot_Area + Garage_TypeMore_Than_Two_Types, family = binomial,
data = model_data)
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.190e+02 1.106e+01 -10.766 < 2e-16 ***
Gr_Liv_Area 3.360e-03 3.056e-04 10.995 < 2e-16 ***
Year_Built 3.931e-02 4.883e-03 8.050 8.28e-16 ***
Fireplaces 1.415e+00 1.320e-01 10.721 < 2e-16 ***
Year_Remod_Add 1.636e-02 5.444e-03 3.006 0.002648 **
Garage_Cars 1.254e+00 1.577e-01 7.950 1.86e-15 ***
Total_Bsmt_SF 1.266e-03 2.528e-04 5.006 5.55e-07 ***
Heating_QCTypical -6.548e-01 1.810e-01 -3.617 0.000298 ***
Garage_TypeDetchd -1.202e+00 1.969e-01 -6.105 1.03e-09 ***
Full_Bath 8.946e-01 1.739e-01 5.143 2.71e-07 ***
Overall_CondFair -6.280e+00 1.532e+00 -4.100 4.13e-05 ***
Overall_CondBelow_Average -3.073e+00 5.552e-01 -5.536 3.10e-08 ***
FoundationSlab -3.980e+00 1.011e+00 -3.937 8.25e-05 ***
FoundationCBlock -9.349e-01 1.917e-01 -4.876 1.08e-06 ***
Overall_CondAverage -1.417e+00 2.641e-01 -5.365 8.11e-08 ***
Overall_CondPoor -7.902e+00 1.480e+00 -5.338 9.40e-08 ***
Overall_CondAbove_Average -7.647e-01 2.497e-01 -3.062 0.002198 **
Lot_Area 4.921e-05 1.597e-05 3.081 0.002062 **
Garage_TypeMore_Than_Two_Types -1.941e+00 7.145e-01 -2.716 0.006600 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 4061.8 on 2929 degrees of freedom
Residual deviance: 1200.9 on 2911 degrees of freedom
AIC: 1238.9
Number of Fisher Scoring iterations: 25
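As a quick consistency check on the numbers above, the reported AIC can be reproduced by hand. For a logistic regression fitted to individual 0/1 outcomes, as is the case here, the residual deviance equals \(-2\log L\), so the AIC is the residual deviance plus twice the number of estimated parameters \(k\). The final model has 18 predictors plus the intercept, so \(k = 19\), which agrees with the drop in degrees of freedom from 2929 to 2911.

\[ \text{AIC} = -2\log L + 2k = \text{Residual deviance} + 2k = 1200.9 + 2 \times 19 = 1238.9 \]

The same relationship links the Deviance and AIC columns in every row of the stepwise trace.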
Once stepAIC returns the final model, we need to
interpret the coefficients (\(\beta_k\)).