1 Business Problem

As property sellers, we want to build a model that predicts property prices from the information available in the data.

Define the variables:

  • target: price
  • predictors: all variables except price

2 Data Wrangling & EDA

1. Read data house_data.csv

house <- read.csv("data_input/house_data.csv")
head(house)

2. Check the data structure

library(dplyr)
glimpse(house)
#> Rows: 21,613
#> Columns: 9
#> $ price       <int> 221900, 538000, 180000, 604000, 510000, 1225000, 257500, 2…
#> $ bedrooms    <int> 3, 3, 2, 4, 3, 4, 3, 3, 3, 3, 3, 2, 3, 3, 5, 4, 3, 4, 2, 3…
#> $ bathrooms   <dbl> 1.00, 2.25, 1.00, 3.00, 2.00, 4.50, 2.25, 1.50, 1.00, 2.50…
#> $ sqft_living <int> 1180, 2570, 770, 1960, 1680, 5420, 1715, 1060, 1780, 1890,…
#> $ sqft_lot    <int> 5650, 7242, 10000, 5000, 8080, 101930, 6819, 9711, 7470, 6…
#> $ floors      <dbl> 1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0, 1.0…
#> $ waterfront  <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
#> $ grade       <int> 7, 7, 6, 7, 8, 11, 7, 7, 7, 7, 8, 7, 7, 7, 7, 9, 7, 7, 7, …
#> $ yr_built    <int> 1955, 1951, 1933, 1965, 1987, 2001, 1995, 1963, 1960, 2003…

💡 Result of the structure check: three columns hold discrete categories and should be converted to factor:

  • bedrooms -> factor
  • floors -> factor
  • grade -> factor

3. Cleansing Data

library(lubridate)
house <- house %>% 
  mutate_at(vars(bedrooms, floors, grade), as.factor) # convert these columns to factor
glimpse(house)
#> Rows: 21,613
#> Columns: 9
#> $ price       <int> 221900, 538000, 180000, 604000, 510000, 1225000, 257500, 2…
#> $ bedrooms    <fct> 3, 3, 2, 4, 3, 4, 3, 3, 3, 3, 3, 2, 3, 3, 5, 4, 3, 4, 2, 3…
#> $ bathrooms   <dbl> 1.00, 2.25, 1.00, 3.00, 2.00, 4.50, 2.25, 1.50, 1.00, 2.50…
#> $ sqft_living <int> 1180, 2570, 770, 1960, 1680, 5420, 1715, 1060, 1780, 1890,…
#> $ sqft_lot    <int> 5650, 7242, 10000, 5000, 8080, 101930, 6819, 9711, 7470, 6…
#> $ floors      <fct> 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1.5, 1, 1.5, 2, 2, 1.5…
#> $ waterfront  <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
#> $ grade       <fct> 7, 7, 6, 7, 8, 11, 7, 7, 7, 7, 8, 7, 7, 7, 7, 9, 7, 7, 7, …
#> $ yr_built    <int> 1955, 1951, 1933, 1965, 1987, 2001, 1995, 1963, 1960, 2003…
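
As a side note, `mutate_at()` is superseded in current dplyr in favor of `mutate(across(...))`. The sketch below shows the same conversion in base R on a tiny made-up data frame (`toy` and its values are illustrative and not taken from house_data.csv), and then checks the resulting levels.

```r
# Toy data frame standing in for `house` (values are made up for illustration)
toy <- data.frame(
  price    = c(221900, 538000, 180000, 604000),
  bedrooms = c(3, 3, 2, 4),
  floors   = c(1, 2, 1, 1.5),
  grade    = c(7, 7, 6, 8)
)

# Convert the selected columns to factor, leaving the others untouched
factor_cols <- c("bedrooms", "floors", "grade")
toy[factor_cols] <- lapply(toy[factor_cols], as.factor)

str(toy)             # bedrooms, floors, grade are now factors
levels(toy$floors)   # levels follow the sorted numeric values
```

The first level of each factor becomes the reference level when the column is later used in `lm()`.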

4. EDA

# data distribution across all columns
boxplot(house)

# distribution of the target (price)
boxplot(house$price)

# check correlations
library(GGally)
ggcorr(house,label = TRUE )

💡 Insights:

  • there are outliers
  • the distribution is skewed to the right
  • the predictor with a good correlation with the target is bathrooms

# Modeling

Build three models based on the feature selection methods covered so far:

  1. model with all predictors
  2. model selected by correlation (correlation > 0.5)
  3. model selected by stepwise regression (backward/forward/both)
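
Selection by correlation can be sketched as a simple filter: compute the correlation of each numeric predictor with the target and keep those above the threshold. The data frame `toy` below is made up for illustration (only the column names mirror the house data), and the 0.5 threshold is the one stated in the task.

```r
# Made-up numeric data; only the column names mirror the house data
toy <- data.frame(
  price       = c(100, 150, 200, 250, 300, 350),
  sqft_living = c(10, 15, 19, 26, 30, 34),
  sqft_lot    = c(5, 1, 4, 2, 6, 3)
)

# Pearson correlation of every predictor with the target
predictors <- setdiff(names(toy), "price")
cors <- cor(toy[, predictors], toy$price)[, 1]

# Keep predictors whose absolute correlation exceeds the threshold
threshold <- 0.5
selected <- names(cors)[abs(cors) > threshold]
selected

# Build the model formula from the selected names
reformulate(selected, response = "price")
```

For the house data, the R-squared of the single-predictor bathrooms model (0.2758) corresponds to a correlation of about 0.53 with price, which is consistent with bathrooms passing the 0.5 threshold.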

head(house)

2.0.1 Model with All Predictors

model_all <- lm(formula = price~., data = house)
summary(model_all)
#> 
#> Call:
#> lm(formula = price ~ ., data = house)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -1517103  -107587   -10970    85414  3813468 
#> 
#> Coefficients:
#>                  Estimate    Std. Error t value             Pr(>|t|)    
#> (Intercept) 7147857.06153  249085.31875  28.696 < 0.0000000000000002 ***
#> bedrooms1     37424.97778   62910.89983   0.595              0.55192    
#> bedrooms2     45623.36394   61348.99816   0.744              0.45708    
#> bedrooms3     -6857.60993   61294.14995  -0.112              0.91092    
#> bedrooms4    -44750.89791   61356.85974  -0.729              0.46579    
#> bedrooms5    -35177.34817   61609.08243  -0.571              0.56802    
#> bedrooms6    -78883.53811   62799.42407  -1.256              0.20909    
#> bedrooms7   -131940.67030   70420.24010  -1.874              0.06100 .  
#> bedrooms8     61874.15704   84454.54665   0.733              0.46379    
#> bedrooms9   -264200.67930  105777.05165  -2.498              0.01251 *  
#> bedrooms10   -85275.78492  135616.78332  -0.629              0.52949    
#> bedrooms11  -321420.39093  218087.79358  -1.474              0.14055    
#> bedrooms33   183125.68799  217960.15021   0.840              0.40082    
#> bathrooms     59323.17930    3339.47381  17.764 < 0.0000000000000002 ***
#> sqft_living     150.82615       3.34495  45.091 < 0.0000000000000002 ***
#> sqft_lot         -0.26362       0.03527  -7.474    0.000000000000081 ***
#> floors1.5      2065.20064    5516.86935   0.374              0.70815    
#> floors2       -2086.69407    4016.66492  -0.520              0.60341    
#> floors2.5    103156.01348   16939.45136   6.090    0.000000001150417 ***
#> floors3      119955.20419    9456.47094  12.685 < 0.0000000000000002 ***
#> floors3.5    221434.69776   74840.40158   2.959              0.00309 ** 
#> waterfront   694092.03079   16684.68661  41.601 < 0.0000000000000002 ***
#> grade3       -96635.81614  249563.42063  -0.387              0.69860    
#> grade4      -159916.49032  220291.08263  -0.726              0.46789    
#> grade5      -193776.86952  218338.52545  -0.888              0.37482    
#> grade6      -140947.75035  217972.54746  -0.647              0.51788    
#> grade7       -50876.31110  217938.41841  -0.233              0.81542    
#> grade8        43430.97033  217988.12226   0.199              0.84208    
#> grade9       194335.96998  218070.62723   0.891              0.37285    
#> grade10      373542.29828  218182.72098   1.712              0.08690 .  
#> grade11      634497.10197  218451.15444   2.905              0.00368 ** 
#> grade12     1089430.43387  219317.87445   4.967    0.000000683896980 ***
#> grade13     2253954.55040  226376.98970   9.957 < 0.0000000000000002 ***
#> yr_built      -3588.43279      68.87915 -52.098 < 0.0000000000000002 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 209200 on 21579 degrees of freedom
#> Multiple R-squared:  0.6759, Adjusted R-squared:  0.6754 
#> F-statistic:  1364 on 33 and 21579 DF,  p-value: < 0.00000000000000022
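
The coefficient table above has one row per factor level (bedrooms1, floors1.5, grade3, ...) rather than one row per factor. That is because `lm()` expands each factor into dummy columns and drops the first level as the reference. A minimal sketch with a made-up factor (`toy` is illustrative only) shows the design matrix that results:

```r
# Made-up factor with three levels; "6" is the first level, hence the reference
toy <- data.frame(grade = factor(c(6, 7, 8, 7, 6)))

# Design matrix that lm() would build for `~ grade`
X <- model.matrix(~ grade, data = toy)
X
colnames(X)   # the reference level "6" has no column of its own
```

Each dummy coefficient is therefore a difference from the reference level, not an absolute price.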

2.0.2 Model Based on Correlation (bathrooms)

model_bathrooms <- lm(formula = price~bathrooms, data = house)
summary(model_bathrooms)
#> 
#> Call:
#> lm(formula = price ~ bathrooms, data = house)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -1438157  -184525   -41525   113220  5925322 
#> 
#> Coefficients:
#>             Estimate Std. Error t value            Pr(>|t|)    
#> (Intercept)    10708       6211   1.724              0.0847 .  
#> bathrooms     250326       2760  90.714 <0.0000000000000002 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 312400 on 21611 degrees of freedom
#> Multiple R-squared:  0.2758, Adjusted R-squared:  0.2757 
#> F-statistic:  8229 on 1 and 21611 DF,  p-value: < 0.00000000000000022

$$\widehat{price} = 10708 + 250326 \times bathrooms$$
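
As a worked example, the fitted equation can be evaluated by hand. The helper `predict_price` below is a hypothetical name introduced for illustration; the two coefficients are copied from the summary output of model_bathrooms.

```r
# Coefficients copied from summary(model_bathrooms)
intercept <- 10708
slope     <- 250326

# Hypothetical helper: predicted price for a given number of bathrooms
predict_price <- function(bathrooms) intercept + slope * bathrooms

predict_price(2)      # 10708 + 250326 * 2    = 511360
predict_price(2.25)   # 10708 + 250326 * 2.25 = 573941.5
```

Up to rounding of the printed coefficients, `predict(model_bathrooms, newdata = data.frame(bathrooms = 2))` should give the same figure.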

plot(house$bathrooms, house$price)
abline(model_bathrooms, col = "red")

# STEPWISE

2.1 Backward

backward <- step(object = model_all, direction = "backward")
#> Start:  AIC=529587.7
#> price ~ bedrooms + bathrooms + sqft_living + sqft_lot + floors + 
#>     waterfront + grade + yr_built
#> 
#>               Df       Sum of Sq              RSS    AIC
#> <none>                            943960178918381 529588
#> - sqft_lot     1   2443362459096  946403541377477 529642
#> - floors       5  10007431499118  953967610417499 529806
#> - bedrooms    12  12540920524799  956501099443180 529849
#> - bathrooms    1  13804322987608  957764501905989 529899
#> - waterfront   1  75704268598963 1019664447517344 531253
#> - sqft_living  1  88939924950670 1032900103869050 531532
#> - yr_built     1 118728880745514 1062689059663895 532146
#> - grade       11 242625670736310 1186585849654690 534510
summary(backward)
#> 
#> Call:
#> lm(formula = price ~ bedrooms + bathrooms + sqft_living + sqft_lot + 
#>     floors + waterfront + grade + yr_built, data = house)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -1517103  -107587   -10970    85414  3813468 
#> 
#> Coefficients:
#>                  Estimate    Std. Error t value             Pr(>|t|)    
#> (Intercept) 7147857.06153  249085.31875  28.696 < 0.0000000000000002 ***
#> bedrooms1     37424.97778   62910.89983   0.595              0.55192    
#> bedrooms2     45623.36394   61348.99816   0.744              0.45708    
#> bedrooms3     -6857.60993   61294.14995  -0.112              0.91092    
#> bedrooms4    -44750.89791   61356.85974  -0.729              0.46579    
#> bedrooms5    -35177.34817   61609.08243  -0.571              0.56802    
#> bedrooms6    -78883.53811   62799.42407  -1.256              0.20909    
#> bedrooms7   -131940.67030   70420.24010  -1.874              0.06100 .  
#> bedrooms8     61874.15704   84454.54665   0.733              0.46379    
#> bedrooms9   -264200.67930  105777.05165  -2.498              0.01251 *  
#> bedrooms10   -85275.78492  135616.78332  -0.629              0.52949    
#> bedrooms11  -321420.39093  218087.79358  -1.474              0.14055    
#> bedrooms33   183125.68799  217960.15021   0.840              0.40082    
#> bathrooms     59323.17930    3339.47381  17.764 < 0.0000000000000002 ***
#> sqft_living     150.82615       3.34495  45.091 < 0.0000000000000002 ***
#> sqft_lot         -0.26362       0.03527  -7.474    0.000000000000081 ***
#> floors1.5      2065.20064    5516.86935   0.374              0.70815    
#> floors2       -2086.69407    4016.66492  -0.520              0.60341    
#> floors2.5    103156.01348   16939.45136   6.090    0.000000001150417 ***
#> floors3      119955.20419    9456.47094  12.685 < 0.0000000000000002 ***
#> floors3.5    221434.69776   74840.40158   2.959              0.00309 ** 
#> waterfront   694092.03079   16684.68661  41.601 < 0.0000000000000002 ***
#> grade3       -96635.81614  249563.42063  -0.387              0.69860    
#> grade4      -159916.49032  220291.08263  -0.726              0.46789    
#> grade5      -193776.86952  218338.52545  -0.888              0.37482    
#> grade6      -140947.75035  217972.54746  -0.647              0.51788    
#> grade7       -50876.31110  217938.41841  -0.233              0.81542    
#> grade8        43430.97033  217988.12226   0.199              0.84208    
#> grade9       194335.96998  218070.62723   0.891              0.37285    
#> grade10      373542.29828  218182.72098   1.712              0.08690 .  
#> grade11      634497.10197  218451.15444   2.905              0.00368 ** 
#> grade12     1089430.43387  219317.87445   4.967    0.000000683896980 ***
#> grade13     2253954.55040  226376.98970   9.957 < 0.0000000000000002 ***
#> yr_built      -3588.43279      68.87915 -52.098 < 0.0000000000000002 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 209200 on 21579 degrees of freedom
#> Multiple R-squared:  0.6759, Adjusted R-squared:  0.6754 
#> F-statistic:  1364 on 33 and 21579 DF,  p-value: < 0.00000000000000022
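
The AIC values in the `step()` trace can be reproduced from the residual sum of squares. For `lm` objects, `step()` relies on `extractAIC()`, which computes n * log(RSS / n) + 2 * edf, where edf is the number of estimated coefficients. The sketch below plugs in the numbers printed above for the full model.

```r
# Values copied from the step() trace and the model summary
n   <- 21613             # number of rows
rss <- 943960178918381   # RSS of the full model (the <none> row)
edf <- 34                # estimated coefficients: 33 slopes + intercept

# AIC as computed by extractAIC() for lm objects (penalty k = 2)
aic <- n * log(rss / n) + 2 * edf
round(aic, 1)
```

The result should agree with the `Start:  AIC=529587.7` line. Every candidate removal in the trace has a higher AIC than `<none>`, which is why backward elimination stops without dropping anything.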

2.2 Forward

forward <- step(object = model_bathrooms, direction = "forward", scope = list(lower = model_bathrooms, upper = model_all))
#> Start:  AIC=546904.4
#> price ~ bathrooms
#> 
#>               Df       Sum of Sq              RSS    AIC
#> + grade       11 752394678422453 1357228777001178 537394
#> + sqft_living  1 632494286111913 1477129169311718 539203
#> + yr_built     1 175510961437577 1934112493986054 545029
#> + waterfront   1 158641800230300 1950981655193331 545217
#> + floors       5  46043144331791 2063580311091840 546438
#> + bedrooms    12  33177360169758 2076446095253873 546586
#> + sqft_lot     1   5576578954819 2104046876468812 546849
#> <none>                           2109623455423631 546904
#> 
#> Step:  AIC=537393.7
#> price ~ bathrooms + grade
#> 
#>               Df       Sum of Sq              RSS    AIC
#> + yr_built     1 225794937949651 1131433839051526 533463
#> + sqft_living  1 139395971837453 1217832805163725 535053
#> + waterfront   1 107174563579753 1250054213421424 535618
#> + floors       5  60169818251324 1297058958749854 536424
#> + bedrooms    12  22264807429182 1334963969571996 537060
#> + sqft_lot     1    174875436325 1357053901564852 537393
#> <none>                           1357228777001178 537394
#> 
#> Step:  AIC=533463
#> price ~ bathrooms + grade + yr_built
#> 
#>               Df      Sum of Sq              RSS    AIC
#> + waterfront   1 88000861122237 1043432977929290 531715
#> + sqft_living  1 79053517962434 1052380321089093 531900
#> + bedrooms    12  5893198032337 1125540641019190 533374
#> + floors       5  5025122100976 1126408716950551 533377
#> <none>                          1131433839051526 533463
#> + sqft_lot     1    58165483405 1131375673568121 533464
#> 
#> Step:  AIC=531715
#> price ~ bathrooms + grade + yr_built + waterfront
#> 
#>               Df      Sum of Sq              RSS    AIC
#> + sqft_living  1 73076552051395  970356425877895 530148
#> + bedrooms    12  5727253131939 1037705724797350 531620
#> + floors       5  4757227546980 1038675750382310 531626
#> <none>                          1043432977929290 531715
#> + sqft_lot     1    87180090980 1043345797838310 531715
#> 
#> Step:  AIC=530147.8
#> price ~ bathrooms + grade + yr_built + waterfront + sqft_living
#> 
#>            Df      Sum of Sq             RSS    AIC
#> + bedrooms 12 13627551392089 956728874485806 529866
#> + floors    5 12027309816169 958329116061726 529888
#> + sqft_lot  1  2138877360195 968217548517699 530102
#> <none>                       970356425877895 530148
#> 
#> Step:  AIC=529866.1
#> price ~ bathrooms + grade + yr_built + waterfront + sqft_living + 
#>     bedrooms
#> 
#>            Df      Sum of Sq             RSS    AIC
#> + floors    5 10325333108330 946403541377476 529642
#> + sqft_lot  1  2761264068308 953967610417498 529806
#> <none>                       956728874485806 529866
#> 
#> Step:  AIC=529641.6
#> price ~ bathrooms + grade + yr_built + waterfront + sqft_living + 
#>     bedrooms + floors
#> 
#>            Df     Sum of Sq             RSS    AIC
#> + sqft_lot  1 2443362459095 943960178918381 529588
#> <none>                      946403541377476 529642
#> 
#> Step:  AIC=529587.7
#> price ~ bathrooms + grade + yr_built + waterfront + sqft_living + 
#>     bedrooms + floors + sqft_lot
summary(forward)
#> 
#> Call:
#> lm(formula = price ~ bathrooms + grade + yr_built + waterfront + 
#>     sqft_living + bedrooms + floors + sqft_lot, data = house)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -1517103  -107587   -10970    85414  3813468 
#> 
#> Coefficients:
#>                  Estimate    Std. Error t value             Pr(>|t|)    
#> (Intercept) 7147857.06153  249085.31875  28.696 < 0.0000000000000002 ***
#> bathrooms     59323.17930    3339.47381  17.764 < 0.0000000000000002 ***
#> grade3       -96635.81614  249563.42063  -0.387              0.69860    
#> grade4      -159916.49032  220291.08263  -0.726              0.46789    
#> grade5      -193776.86952  218338.52545  -0.888              0.37482    
#> grade6      -140947.75035  217972.54746  -0.647              0.51788    
#> grade7       -50876.31110  217938.41841  -0.233              0.81542    
#> grade8        43430.97033  217988.12226   0.199              0.84208    
#> grade9       194335.96998  218070.62723   0.891              0.37285    
#> grade10      373542.29828  218182.72098   1.712              0.08690 .  
#> grade11      634497.10197  218451.15444   2.905              0.00368 ** 
#> grade12     1089430.43387  219317.87445   4.967    0.000000683896980 ***
#> grade13     2253954.55040  226376.98970   9.957 < 0.0000000000000002 ***
#> yr_built      -3588.43279      68.87915 -52.098 < 0.0000000000000002 ***
#> waterfront   694092.03079   16684.68661  41.601 < 0.0000000000000002 ***
#> sqft_living     150.82615       3.34495  45.091 < 0.0000000000000002 ***
#> bedrooms1     37424.97778   62910.89983   0.595              0.55192    
#> bedrooms2     45623.36394   61348.99816   0.744              0.45708    
#> bedrooms3     -6857.60993   61294.14995  -0.112              0.91092    
#> bedrooms4    -44750.89791   61356.85974  -0.729              0.46579    
#> bedrooms5    -35177.34817   61609.08243  -0.571              0.56802    
#> bedrooms6    -78883.53811   62799.42407  -1.256              0.20909    
#> bedrooms7   -131940.67030   70420.24010  -1.874              0.06100 .  
#> bedrooms8     61874.15704   84454.54665   0.733              0.46379    
#> bedrooms9   -264200.67930  105777.05165  -2.498              0.01251 *  
#> bedrooms10   -85275.78492  135616.78332  -0.629              0.52949    
#> bedrooms11  -321420.39093  218087.79358  -1.474              0.14055    
#> bedrooms33   183125.68799  217960.15021   0.840              0.40082    
#> floors1.5      2065.20064    5516.86935   0.374              0.70815    
#> floors2       -2086.69407    4016.66492  -0.520              0.60341    
#> floors2.5    103156.01348   16939.45136   6.090    0.000000001150417 ***
#> floors3      119955.20419    9456.47094  12.685 < 0.0000000000000002 ***
#> floors3.5    221434.69776   74840.40158   2.959              0.00309 ** 
#> sqft_lot         -0.26362       0.03527  -7.474    0.000000000000081 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 209200 on 21579 degrees of freedom
#> Multiple R-squared:  0.6759, Adjusted R-squared:  0.6754 
#> F-statistic:  1364 on 33 and 21579 DF,  p-value: < 0.00000000000000022
# Backward model performance
summary(backward)$adj.r.squared
#> [1] 0.6754443
# Forward model performance
summary(forward)$adj.r.squared
#> [1] 0.6754443

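Both stepwise directions end at the same model: backward elimination removes nothing from model_all, and forward selection starting from model_bathrooms adds the remaining seven predictors. The final AIC (529587.7) and adjusted R-squared (0.6754443) are therefore identical.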
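
The third stepwise option, `direction = "both"`, is not run in this notebook. A minimal sketch of how it could be called is shown here on the built-in `mtcars` data, which only stands in for the house data; `null_model` and `full_model` are names chosen for illustration.

```r
# Built-in mtcars stands in for the house data (illustration only)
null_model <- lm(mpg ~ 1, data = mtcars)
full_model <- lm(mpg ~ ., data = mtcars)

# Start from the intercept-only model; terms may be added or dropped at each step
both <- step(
  object    = null_model,
  scope     = list(lower = formula(null_model), upper = formula(full_model)),
  direction = "both",
  trace     = 0          # suppress the step-by-step log
)

formula(both)                  # predictors kept by the search
AIC(both) <= AIC(null_model)   # the search never ends at a worse AIC
```

On the house data the equivalent call would start from model_bathrooms or model_all, with the same `scope` bounds used in the forward run.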

3 PREDICT

house$price_bathrooms <- predict(object = model_bathrooms, newdata = house)
house$priceBackward <-predict(object = backward, newdata = house)

4 Model Evaluation

Based on RMSE, which regression model is best?

library(MLmetrics) # provides RMSE() and MSE()
RMSE(y_pred = house$priceBackward, y_true = house$price) # model_all
#> [1] 208987
RMSE(y_pred = house$price_bathrooms, y_true = house$price) # Model_bathrooms
#> [1] 312424.4
MSE(y_pred = house$priceBackward, y_true = house$price) # model_all
#> [1] 43675573910
MSE(y_pred = house$price_bathrooms, y_true = house$price) # Model_bathrooms
#> [1] 97609006405

💡 Conclusion: Based on the prediction errors measured with RMSE and MSE, the all-predictor model (identical to the backward result) is the best, with a lower error (RMSE 208987) than the simple linear regression on bathrooms chosen by correlation (RMSE 312424.4).
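
Because both models were scored on the same rows they were fitted on, RMSE and MSE follow directly from the residual sum of squares (RSS) printed in the `step()` traces: MSE = RSS / n and RMSE = sqrt(MSE). The sketch below reproduces the figures from the copied RSS values; nothing new is fitted.

```r
# Values copied from the step() traces
n             <- 21613
rss_all       <- 943960178918381    # full model
rss_bathrooms <- 2109623455423631   # price ~ bathrooms

# In-sample error metrics derived from RSS
mse_all        <- rss_all / n
rmse_all       <- sqrt(mse_all)
rmse_bathrooms <- sqrt(rss_bathrooms / n)

round(c(all = rmse_all, bathrooms = rmse_bathrooms), 1)
```

These are in-sample errors. The residual standard error in `summary()` is slightly larger (209200) because it divides RSS by the residual degrees of freedom (21579) instead of n.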

5 Interpreting the Best Model

summary(model_all)
#> 
#> Call:
#> lm(formula = price ~ ., data = house)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -1517103  -107587   -10970    85414  3813468 
#> 
#> Coefficients:
#>                  Estimate    Std. Error t value             Pr(>|t|)    
#> (Intercept) 7147857.06153  249085.31875  28.696 < 0.0000000000000002 ***
#> bedrooms1     37424.97778   62910.89983   0.595              0.55192    
#> bedrooms2     45623.36394   61348.99816   0.744              0.45708    
#> bedrooms3     -6857.60993   61294.14995  -0.112              0.91092    
#> bedrooms4    -44750.89791   61356.85974  -0.729              0.46579    
#> bedrooms5    -35177.34817   61609.08243  -0.571              0.56802    
#> bedrooms6    -78883.53811   62799.42407  -1.256              0.20909    
#> bedrooms7   -131940.67030   70420.24010  -1.874              0.06100 .  
#> bedrooms8     61874.15704   84454.54665   0.733              0.46379    
#> bedrooms9   -264200.67930  105777.05165  -2.498              0.01251 *  
#> bedrooms10   -85275.78492  135616.78332  -0.629              0.52949    
#> bedrooms11  -321420.39093  218087.79358  -1.474              0.14055    
#> bedrooms33   183125.68799  217960.15021   0.840              0.40082    
#> bathrooms     59323.17930    3339.47381  17.764 < 0.0000000000000002 ***
#> sqft_living     150.82615       3.34495  45.091 < 0.0000000000000002 ***
#> sqft_lot         -0.26362       0.03527  -7.474    0.000000000000081 ***
#> floors1.5      2065.20064    5516.86935   0.374              0.70815    
#> floors2       -2086.69407    4016.66492  -0.520              0.60341    
#> floors2.5    103156.01348   16939.45136   6.090    0.000000001150417 ***
#> floors3      119955.20419    9456.47094  12.685 < 0.0000000000000002 ***
#> floors3.5    221434.69776   74840.40158   2.959              0.00309 ** 
#> waterfront   694092.03079   16684.68661  41.601 < 0.0000000000000002 ***
#> grade3       -96635.81614  249563.42063  -0.387              0.69860    
#> grade4      -159916.49032  220291.08263  -0.726              0.46789    
#> grade5      -193776.86952  218338.52545  -0.888              0.37482    
#> grade6      -140947.75035  217972.54746  -0.647              0.51788    
#> grade7       -50876.31110  217938.41841  -0.233              0.81542    
#> grade8        43430.97033  217988.12226   0.199              0.84208    
#> grade9       194335.96998  218070.62723   0.891              0.37285    
#> grade10      373542.29828  218182.72098   1.712              0.08690 .  
#> grade11      634497.10197  218451.15444   2.905              0.00368 ** 
#> grade12     1089430.43387  219317.87445   4.967    0.000000683896980 ***
#> grade13     2253954.55040  226376.98970   9.957 < 0.0000000000000002 ***
#> yr_built      -3588.43279      68.87915 -52.098 < 0.0000000000000002 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 209200 on 21579 degrees of freedom
#> Multiple R-squared:  0.6759, Adjusted R-squared:  0.6754 
#> F-statistic:  1364 on 33 and 21579 DF,  p-value: < 0.00000000000000022

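1. Interpreting coefficients of categorical predictors:

  • Each row for a factor level (bedrooms, floors, grade) is the difference in predicted price relative to that factor's reference level, which is the level missing from the coefficient table, with all other predictors held constant. For example, grade12 = 1089430.43 means a grade 12 house is predicted to cost about 1,089,430 more than an otherwise identical house at the reference grade, and floors3 = 119955.20 means about 119,955 more than a house with 1 floor.

2. Interpreting coefficients of numeric predictors:

  • Each coefficient is the change in predicted price for a one-unit increase in that predictor, with the others held constant. For example, sqft_living = 150.83 means each additional square foot of living area adds about 150.83 to the predicted price, and yr_built = -3588.43 means each later year of construction lowers it by about 3,588.
  • The intercepts (7147857.06 in model_all, 10708 in model_bathrooms) are the predicted price when every numeric predictor is 0 and every factor is at its reference level, so they are not meaningful on their own here.

3. Predictor significance: at the 0.001 level, bathrooms, sqft_living, sqft_lot, waterfront, and yr_built are significant, as are floors 2.5 and 3 and grades 12 and 13. By absolute t value the strongest are yr_built (-52.10), sqft_living (45.09), waterfront (41.60), and bathrooms (17.76). Among the bedrooms levels, only bedrooms9 is significant at the 0.05 level.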

4. Adjusted R Squared:

summary(model_all)$adj.r.squared
#> [1] 0.6754443
summary(model_bathrooms)$adj.r.squared
#> [1] 0.2757359
  • The predictors in model_all explain about 67.5% of the variance in price (adjusted R-squared 0.675), against about 27.6% for model_bathrooms.
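
The adjusted R-squared can be recovered from the multiple R-squared with the usual correction 1 - (1 - R²)(n - 1) / (n - p - 1), where p is the number of estimated slopes. The sketch below uses the rounded R-squared printed in the summary, so it matches the reported value only to about four decimals.

```r
# Values copied from summary(model_all)
r2 <- 0.6759   # multiple R-squared (rounded as printed)
n  <- 21613    # observations
p  <- 33       # estimated slopes (dummy columns included)

# Penalize R-squared for the number of estimated slopes
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
round(adj_r2, 4)
```

With 21,613 rows and only 33 slopes the penalty is tiny, which is why the multiple and adjusted values differ only in the fourth decimal.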