Email:
Linkedin: https://www.linkedin.com/in/sherly-taurin-8a50221b4/
RPubs: https://rpubs.com/sherlytaurin/


1 Introduction

The aim of this report is to apply Exploratory Data Analysis (EDA) to house sales in King County, Washington State, USA. The data set contains historical records of houses sold between May 2014 and May 2015.

  • The data set consists of 21 variables and 21,597 observations (the count reported by the R output in Section 2).
  • Variables, with description and data type:
    • id: a notation for a house (Numeric)
    • date: date the house was sold (String)
    • price: sale price; this is the prediction target (Numeric)
    • bedrooms: number of bedrooms per house (Numeric)
    • bathrooms: number of bathrooms; fractional values such as 2.25 occur in the data (Numeric)
    • sqft_living: square footage of the home (Numeric)
    • sqft_lot: square footage of the lot (Numeric)
    • floors: total floors (levels) in the house (Numeric)
    • waterfront: whether the house has a view of a waterfront, coded 0 or 1 (Numeric)
    • view: has been viewed; takes values 0 to 4 in the data (Numeric)
    • condition: overall condition, from 1 (worn-out property) to 5 (excellent); see http://info.kingcounty.gov/assessor/esales/Glossary.aspx?type=r#g (Numeric)
    • grade: overall grade given to the housing unit under the King County grading system, from 1 (poor) to 13 (excellent) (Numeric)
    • sqft_above: square footage of the house excluding the basement (Numeric)
    • sqft_basement: square footage of the basement (Numeric)
    • yr_built: year the house was built (Numeric)
    • yr_renovated: year the house was renovated (Numeric)
    • zipcode: ZIP code (Numeric)
    • lat: latitude coordinate (Numeric)
    • long: longitude coordinate (Numeric)
    • sqft_living15: living room area in 2015 (implies some renovations); this might or might not have affected the lot size area (Numeric)
    • sqft_lot15: lot size area in 2015 (implies some renovations) (Numeric)

2 Data Preparation

## 'data.frame':    21597 obs. of  21 variables:
##  $ id           : num  7.13e+09 6.41e+09 5.63e+09 2.49e+09 1.95e+09 ...
##  $ date         : chr  "10/13/2014" "12/9/2014" "2/25/2015" "12/9/2014" ...
##  $ price        : num  221900 538000 180000 604000 510000 ...
##  $ bedrooms     : int  3 3 2 4 3 4 3 3 3 3 ...
##  $ bathrooms    : num  1 2.25 1 3 2 4.5 2.25 1.5 1 2.5 ...
##  $ sqft_living  : int  1180 2570 770 1960 1680 5420 1715 1060 1780 1890 ...
##  $ sqft_lot     : int  5650 7242 10000 5000 8080 101930 6819 9711 7470 6560 ...
##  $ floors       : num  1 2 1 1 1 1 2 1 1 2 ...
##  $ waterfront   : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ view         : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ condition    : int  3 3 3 5 3 3 3 3 3 3 ...
##  $ grade        : int  7 7 6 7 8 11 7 7 7 7 ...
##  $ sqft_above   : int  1180 2170 770 1050 1680 3890 1715 1060 1050 1890 ...
##  $ sqft_basement: int  0 400 0 910 0 1530 0 0 730 0 ...
##  $ yr_built     : int  1955 1951 1933 1965 1987 2001 1995 1963 1960 2003 ...
##  $ yr_renovated : int  0 1991 0 0 0 0 0 0 0 0 ...
##  $ zipcode      : int  98178 98125 98028 98136 98074 98053 98003 98198 98146 98038 ...
##  $ lat          : num  47.5 47.7 47.7 47.5 47.6 ...
##  $ long         : num  -122 -122 -122 -122 -122 ...
##  $ sqft_living15: int  1340 1690 2720 1360 1800 4760 2238 1650 1780 2390 ...
##  $ sqft_lot15   : int  5650 7639 8062 5000 7503 101930 6819 9711 8113 7570 ...
## integer(0)
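
The date column is stored as character strings in month/day/year form, so sorting or filtering by sale date may need a conversion first. The following is a minimal Python sketch, not the report's own R code (which is not shown). The sample strings are taken from this report's output, and the format string `%m/%d/%Y` is an assumption based on how they look.

```python
from datetime import datetime

# Sample date strings as they appear in this report's output.
raw_dates = ["10/13/2014", "12/9/2014", "2/25/2015", "9/9/2014", "1/10/2015"]

# Assumed format: month/day/year, without zero padding.
parsed = [datetime.strptime(s, "%m/%d/%Y").date() for s in raw_dates]

# Ordering the raw text and ordering the parsed dates give different answers.
text_latest = max(raw_dates)
calendar_latest = max(parsed)

print("latest when compared as text: ", text_latest)
print("latest when compared as dates:", calendar_latest.isoformat())
```

Comparing the strings directly ranks them character by character, which is why a conversion step is usually worth doing before any time-based summary.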

3 Part 1

##     
##         1    2    3    4    5
##   3     0    0    0    0    1
##   4     1    4   12   10    0
##   5     9   15  100   84   34
##   6    11   59 1035  685  248
##   7     6   75 5229 2831  833
##   8     2   13 4266 1394  390
##   9     0    2 2041  446  126
##   10    0    2  921  156   55
##   11    0    0  332   56   11
##   12    0    0   73   13    3
##   13    0    0   11    2    0
##     
##                 1            2            3            4            5
##   3  0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 4.630273e-05
##   4  4.630273e-05 1.852109e-04 5.556327e-04 4.630273e-04 0.000000e+00
##   5  4.167245e-04 6.945409e-04 4.630273e-03 3.889429e-03 1.574293e-03
##   6  5.093300e-04 2.731861e-03 4.792332e-02 3.171737e-02 1.148308e-02
##   7  2.778164e-04 3.472705e-03 2.421170e-01 1.310830e-01 3.857017e-02
##   8  9.260545e-05 6.019355e-04 1.975274e-01 6.454600e-02 1.805806e-02
##   9  0.000000e+00 9.260545e-05 9.450387e-02 2.065102e-02 5.834144e-03
##   10 0.000000e+00 9.260545e-05 4.264481e-02 7.223225e-03 2.546650e-03
##   11 0.000000e+00 0.000000e+00 1.537251e-02 2.592953e-03 5.093300e-04
##   12 0.000000e+00 0.000000e+00 3.380099e-03 6.019355e-04 1.389082e-04
##   13 0.000000e+00 0.000000e+00 5.093300e-04 9.260545e-05 0.000000e+00
##  [1] "id"            "price"         "bedrooms"      "bathrooms"    
##  [5] "sqft_living"   "sqft_lot"      "floors"        "waterfront"   
##  [9] "view"          "condition"     "grade"         "sqft_above"   
## [13] "sqft_basement" "yr_built"      "yr_renovated"  "zipcode"      
## [17] "lat"           "long"          "sqft_living15" "sqft_lot15"
##        id                price            bedrooms        bathrooms    
##  Min.   :1.000e+06   Min.   :  78000   Min.   : 1.000   Min.   :0.500  
##  1st Qu.:2.123e+09   1st Qu.: 322000   1st Qu.: 3.000   1st Qu.:1.750  
##  Median :3.905e+09   Median : 450000   Median : 3.000   Median :2.250  
##  Mean   :4.580e+09   Mean   : 540297   Mean   : 3.373   Mean   :2.116  
##  3rd Qu.:7.309e+09   3rd Qu.: 645000   3rd Qu.: 4.000   3rd Qu.:2.500  
##  Max.   :9.900e+09   Max.   :7700000   Max.   :33.000   Max.   :8.000  
##   sqft_living       sqft_lot           floors        waterfront      
##  Min.   :  370   Min.   :    520   Min.   :1.000   Min.   :0.000000  
##  1st Qu.: 1430   1st Qu.:   5040   1st Qu.:1.000   1st Qu.:0.000000  
##  Median : 1910   Median :   7618   Median :1.500   Median :0.000000  
##  Mean   : 2080   Mean   :  15099   Mean   :1.494   Mean   :0.007547  
##  3rd Qu.: 2550   3rd Qu.:  10685   3rd Qu.:2.000   3rd Qu.:0.000000  
##  Max.   :13540   Max.   :1651359   Max.   :3.500   Max.   :1.000000  
##       view          condition        grade          sqft_above  
##  Min.   :0.0000   Min.   :1.00   Min.   : 3.000   Min.   : 370  
##  1st Qu.:0.0000   1st Qu.:3.00   1st Qu.: 7.000   1st Qu.:1190  
##  Median :0.0000   Median :3.00   Median : 7.000   Median :1560  
##  Mean   :0.2343   Mean   :3.41   Mean   : 7.658   Mean   :1789  
##  3rd Qu.:0.0000   3rd Qu.:4.00   3rd Qu.: 8.000   3rd Qu.:2210  
##  Max.   :4.0000   Max.   :5.00   Max.   :13.000   Max.   :9410  
##  sqft_basement       yr_built     yr_renovated        zipcode     
##  Min.   :   0.0   Min.   :1900   Min.   :   0.00   Min.   :98001  
##  1st Qu.:   0.0   1st Qu.:1951   1st Qu.:   0.00   1st Qu.:98033  
##  Median :   0.0   Median :1975   Median :   0.00   Median :98065  
##  Mean   : 291.7   Mean   :1971   Mean   :  84.46   Mean   :98078  
##  3rd Qu.: 560.0   3rd Qu.:1997   3rd Qu.:   0.00   3rd Qu.:98118  
##  Max.   :4820.0   Max.   :2015   Max.   :2015.00   Max.   :98199  
##       lat             long        sqft_living15    sqft_lot15    
##  Min.   :47.16   Min.   :-122.5   Min.   : 399   Min.   :   651  
##  1st Qu.:47.47   1st Qu.:-122.3   1st Qu.:1490   1st Qu.:  5100  
##  Median :47.57   Median :-122.2   Median :1840   Median :  7620  
##  Mean   :47.56   Mean   :-122.2   Mean   :1987   Mean   : 12758  
##  3rd Qu.:47.68   3rd Qu.:-122.1   3rd Qu.:2360   3rd Qu.: 10083  
##  Max.   :47.78   Max.   :-121.3   Max.   :6210   Max.   :871200
## [1] 0.7881271
## Rows: 21,597
## Columns: 21
## $ id            <dbl> 7129300520, 6414100192, 5631500400, 2487200875, 19544...
## $ date          <chr> "10/13/2014", "12/9/2014", "2/25/2015", "12/9/2014", ...
## $ price         <dbl> 221900, 538000, 180000, 604000, 510000, 1230000, 2575...
## $ bedrooms      <int> 3, 3, 2, 4, 3, 4, 3, 3, 3, 3, 3, 2, 3, 3, 5, 4, 3, 4,...
## $ bathrooms     <dbl> 1.00, 2.25, 1.00, 3.00, 2.00, 4.50, 2.25, 1.50, 1.00,...
## $ sqft_living   <int> 1180, 2570, 770, 1960, 1680, 5420, 1715, 1060, 1780, ...
## $ sqft_lot      <int> 5650, 7242, 10000, 5000, 8080, 101930, 6819, 9711, 74...
## $ floors        <dbl> 1.0, 2.0, 1.0, 1.0, 1.0, 1.0, 2.0, 1.0, 1.0, 2.0, 1.0...
## $ waterfront    <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ view          <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0,...
## $ condition     <int> 3, 3, 3, 5, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 3, 3, 3, 4,...
## $ grade         <int> 7, 7, 6, 7, 8, 11, 7, 7, 7, 7, 8, 7, 7, 7, 7, 9, 7, 7...
## $ sqft_above    <int> 1180, 2170, 770, 1050, 1680, 3890, 1715, 1060, 1050, ...
## $ sqft_basement <int> 0, 400, 0, 910, 0, 1530, 0, 0, 730, 0, 1700, 300, 0, ...
## $ yr_built      <int> 1955, 1951, 1933, 1965, 1987, 2001, 1995, 1963, 1960,...
## $ yr_renovated  <int> 0, 1991, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ zipcode       <int> 98178, 98125, 98028, 98136, 98074, 98053, 98003, 9819...
## $ lat           <dbl> 47.5112, 47.7210, 47.7379, 47.5208, 47.6168, 47.6561,...
## $ long          <dbl> -122.257, -122.319, -122.233, -122.393, -122.045, -12...
## $ sqft_living15 <int> 1340, 1690, 2720, 1360, 1800, 4760, 2238, 1650, 1780,...
## $ sqft_lot15    <int> 5650, 7639, 8062, 5000, 7503, 101930, 6819, 9711, 811...
##         variable q_zeros p_zeros q_na p_na q_inf p_inf      type unique
## 1             id       0    0.00    0    0     0     0   numeric  21420
## 2           date       0    0.00    0    0     0     0 character    372
## 3          price       0    0.00    0    0     0     0   numeric   3622
## 4       bedrooms       0    0.00    0    0     0     0   integer     12
## 5      bathrooms       0    0.00    0    0     0     0   numeric     29
## 6    sqft_living       0    0.00    0    0     0     0   integer   1034
## 7       sqft_lot       0    0.00    0    0     0     0   integer   9776
## 8         floors       0    0.00    0    0     0     0   numeric      6
## 9     waterfront   21434   99.25    0    0     0     0   integer      2
## 10          view   19475   90.17    0    0     0     0   integer      5
## 11     condition       0    0.00    0    0     0     0   integer      5
## 12         grade       0    0.00    0    0     0     0   integer     11
## 13    sqft_above       0    0.00    0    0     0     0   integer    942
## 14 sqft_basement   13110   60.70    0    0     0     0   integer    306
## 15      yr_built       0    0.00    0    0     0     0   integer    116
## 16  yr_renovated   20683   95.77    0    0     0     0   integer     70
## 17       zipcode       0    0.00    0    0     0     0   integer     70
## 18           lat       0    0.00    0    0     0     0   numeric   5033
## 19          long       0    0.00    0    0     0     0   numeric    751
## 20 sqft_living15       0    0.00    0    0     0     0   integer    777
## 21    sqft_lot15       0    0.00    0    0     0     0   integer   8682
## Warning in freq_logic(data = data, input = input, plot, na.rm, path_out =
## path_out): Skipping plot for variable 'date' (more than 100 categories)
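
The second cross-tabulation in this part appears to be the first one divided by the total number of rows, and the p_zeros column appears to be q_zeros expressed as a percentage of the rows. The Python sketch below checks that arithmetic using counts copied from the output above. The reading that rows are grade and columns are condition is an assumption, based on the row and column totals matching the frequencies reported for those variables.

```python
# Counts copied from this report's output.
n_rows = 21597

# Row "7" of the first cross-tabulation (columns 1 to 5).
row_7 = [6, 75, 5229, 2831, 833]

# Each cell divided by the total number of rows gives the proportion table.
row_7_props = [count / n_rows for count in row_7]

# Zero counts (q_zeros) from the variable-status table.
zero_counts = {
    "waterfront": 21434,
    "view": 19475,
    "sqft_basement": 13110,
    "yr_renovated": 20683,
}
# Percentage of zeros, rounded to two decimals like the p_zeros column.
p_zeros = {name: round(100 * q / n_rows, 2) for name, q in zero_counts.items()}

print("row total:", sum(row_7))
print("proportions:", [f"{p:.6e}" for p in row_7_props])
print("p_zeros:", p_zeros)
```

The row total can be compared with the frequency reported for grade 7 elsewhere in the output, which is one way to confirm which variable indexes the rows.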

## data 
## 
##  21  Variables      21597  Observations
## --------------------------------------------------------------------------------
## id 
##         n   missing  distinct      Info      Mean       Gmd       .05       .10 
##     21597         0     21420         1  4.58e+09 3.297e+09 5.127e+08 1.036e+09 
##       .25       .50       .75       .90       .95 
## 2.123e+09 3.905e+09 7.309e+09 8.732e+09 9.297e+09 
## 
## lowest :    1000102    1200019    1200021    2800031    3600057
## highest: 9842300095 9842300485 9842300540 9895000040 9900000190
## --------------------------------------------------------------------------------
## date 
##        n  missing distinct 
##    21597        0      372 
## 
## lowest : 1/10/2015 1/12/2015 1/13/2015 1/14/2015 1/15/2015
## highest: 9/5/2014  9/6/2014  9/7/2014  9/8/2014  9/9/2014 
## --------------------------------------------------------------------------------
## price 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0     3622        1   540297   329526   210000   245000 
##      .25      .50      .75      .90      .95 
##   322000   450000   645000   887000  1160000 
## 
## lowest :   78000   80000   81000   82000   82500
## highest: 5350000 5570000 6890000 7060000 7700000
## --------------------------------------------------------------------------------
## bedrooms 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0       12    0.871    3.373   0.9427        2        2 
##      .25      .50      .75      .90      .95 
##        3        3        4        4        5 
## 
## lowest :  1  2  3  4  5, highest:  8  9 10 11 33
##                                                                             
## Value          1     2     3     4     5     6     7     8     9    10    11
## Frequency    196  2760  9824  6882  1601   272    38    13     6     3     1
## Proportion 0.009 0.128 0.455 0.319 0.074 0.013 0.002 0.001 0.000 0.000 0.000
##                 
## Value         33
## Frequency      1
## Proportion 0.000
## --------------------------------------------------------------------------------
## bathrooms 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0       29    0.974    2.116   0.8432     1.00     1.00 
##      .25      .50      .75      .90      .95 
##     1.75     2.25     2.50     3.00     3.50 
## 
## lowest : 0.50 0.75 1.00 1.25 1.50, highest: 6.50 6.75 7.50 7.75 8.00
## --------------------------------------------------------------------------------
## sqft_living 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0     1034        1     2080      978      940     1090 
##      .25      .50      .75      .90      .95 
##     1430     1910     2550     3254     3760 
## 
## lowest :   370   380   390   410   420, highest:  9640  9890 10040 12050 13540
## --------------------------------------------------------------------------------
## sqft_lot 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0     9776        1    15099    17841     1801     3323 
##      .25      .50      .75      .90      .95 
##     5040     7618    10685    21372    43307 
## 
## lowest :     520     572     600     609     635
## highest:  982998 1024068 1074218 1164794 1651359
## --------------------------------------------------------------------------------
## floors 
##        n  missing distinct     Info     Mean      Gmd 
##    21597        0        6    0.823    1.494   0.5561 
## 
## lowest : 1.0 1.5 2.0 2.5 3.0, highest: 1.5 2.0 2.5 3.0 3.5
##                                               
## Value        1.0   1.5   2.0   2.5   3.0   3.5
## Frequency  10673  1910  8235   161   611     7
## Proportion 0.494 0.088 0.381 0.007 0.028 0.000
## --------------------------------------------------------------------------------
## waterfront 
##        n  missing distinct     Info      Sum     Mean      Gmd 
##    21597        0        2    0.022      163 0.007547  0.01498 
## 
## --------------------------------------------------------------------------------
## view 
##        n  missing distinct     Info     Mean      Gmd 
##    21597        0        5    0.267   0.2343   0.4322 
## 
## lowest : 0 1 2 3 4, highest: 0 1 2 3 4
##                                         
## Value          0     1     2     3     4
## Frequency  19475   332   961   510   319
## Proportion 0.902 0.015 0.044 0.024 0.015
## --------------------------------------------------------------------------------
## condition 
##        n  missing distinct     Info     Mean      Gmd 
##    21597        0        5    0.708     3.41   0.6159 
## 
## lowest : 1 2 3 4 5, highest: 1 2 3 4 5
##                                         
## Value          1     2     3     4     5
## Frequency     29   170 14020  5677  1701
## Proportion 0.001 0.008 0.649 0.263 0.079
## --------------------------------------------------------------------------------
## grade 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0       11    0.903    7.658    1.229        6        6 
##      .25      .50      .75      .90      .95 
##        7        7        8        9       10 
## 
## lowest :  3  4  5  6  7, highest:  9 10 11 12 13
##                                                                             
## Value          3     4     5     6     7     8     9    10    11    12    13
## Frequency      1    27   242  2038  8974  6065  2615  1134   399    89    13
## Proportion 0.000 0.001 0.011 0.094 0.416 0.281 0.121 0.053 0.018 0.004 0.001
## --------------------------------------------------------------------------------
## sqft_above 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0      942        1     1789    875.8      850      970 
##      .25      .50      .75      .90      .95 
##     1190     1560     2210     2950     3400 
## 
## lowest :  370  380  390  410  420, highest: 7880 8020 8570 8860 9410
## --------------------------------------------------------------------------------
## sqft_basement 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0      306    0.776    291.7    422.4        0        0 
##      .25      .50      .75      .90      .95 
##        0        0      560      970     1190 
## 
## lowest :    0   10   20   40   50, highest: 3260 3480 3500 4130 4820
## --------------------------------------------------------------------------------
## yr_built 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0      116        1     1971    33.38     1915     1926 
##      .25      .50      .75      .90      .95 
##     1951     1975     1997     2007     2011 
## 
## lowest : 1900 1901 1902 1903 1904, highest: 2011 2012 2013 2014 2015
## --------------------------------------------------------------------------------
## yr_renovated 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0       70    0.122    84.46    161.8        0        0 
##      .25      .50      .75      .90      .95 
##        0        0        0        0        0 
## 
## lowest :    0 1934 1940 1944 1945, highest: 2011 2012 2013 2014 2015
##                                                                             
## Value          0  1935  1940  1945  1950  1955  1960  1965  1970  1975  1980
## Frequency  20683     1     2     6     4    13    12    16    27    25    43
## Proportion 0.958 0.000 0.000 0.000 0.000 0.001 0.001 0.001 0.001 0.001 0.002
##                                                     
## Value       1985  1990  1995  2000  2005  2010  2015
## Frequency     88    99    84   112   156    82   144
## Proportion 0.004 0.005 0.004 0.005 0.007 0.004 0.007
## 
## For the frequency table, variable is rounded to the nearest 5
## --------------------------------------------------------------------------------
## zipcode 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0       70        1    98078    60.78    98004    98008 
##      .25      .50      .75      .90      .95 
##    98033    98065    98118    98155    98177 
## 
## lowest : 98001 98002 98003 98004 98005, highest: 98177 98178 98188 98198 98199
## --------------------------------------------------------------------------------
## lat 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0     5033        1    47.56   0.1573    47.31    47.35 
##      .25      .50      .75      .90      .95 
##    47.47    47.57    47.68    47.73    47.75 
## 
## lowest : 47.1559 47.1593 47.1622 47.1647 47.1764
## highest: 47.7771 47.7772 47.7774 47.7775 47.7776
## --------------------------------------------------------------------------------
## long 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0      751        1   -122.2   0.1557   -122.4   -122.4 
##      .25      .50      .75      .90      .95 
##   -122.3   -122.2   -122.1   -122.0   -122.0 
## 
## lowest : -122.519 -122.515 -122.514 -122.512 -122.511
## highest: -121.325 -121.321 -121.319 -121.316 -121.315
## --------------------------------------------------------------------------------
## sqft_living15 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0      777        1     1987    743.1     1140     1258 
##      .25      .50      .75      .90      .95 
##     1490     1840     2360     2930     3300 
## 
## lowest :  399  460  620  670  690, highest: 5600 5610 5790 6110 6210
## --------------------------------------------------------------------------------
## sqft_lot15 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    21597        0     8682        1    12758    13385     2002     3668 
##      .25      .50      .75      .90      .95 
##     5100     7620    10083    17822    37045 
## 
## lowest :    651    659    660    748    750, highest: 434728 438213 560617 858132 871200
## --------------------------------------------------------------------------------
## 
## 

4 Part 2

In this part, I examine how several factors affect house prices in the kc_house_data data set.

4.1 Hypotheses

\(H_0\): Price is not affected by bedrooms, bathrooms, floors, grade, or sqft_living.

\(H_1\): Price is affected by bedrooms, bathrooms, floors, grade, and sqft_living.
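
One possible way to state these hypotheses formally uses the coefficients \(a_1, \dots, a_5\) of the model in Section 4.2 and the overall F-test of the regression. This formalization is an assumption about what the test is meant to check; it is not taken from the report's code.

```latex
H_0:\; a_1 = a_2 = a_3 = a_4 = a_5 = 0
\qquad\text{versus}\qquad
H_1:\; a_j \neq 0 \ \text{for at least one } j \in \{1, \dots, 5\}
```

Under this reading, rejecting \(H_0\) shows that at least one predictor matters, while the individual t-tests in the coefficient table address each coefficient separately.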

4.2 Model

The model used is \(price = a_0 + a_1\,bedrooms + a_2\,bathrooms + a_3\,floors + a_4\,grade + a_5\,sqft\_living\). I estimate this model with OLS and examine the p-values, R-squared, residuals, and coefficients to assess price.
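
To illustrate what OLS estimation does, here is a minimal Python sketch of the one-predictor case, regressing price on sqft_living alone. It is a simplification of the five-predictor model and not the report's own code. The six rows are the first observations shown in this report's output, so the result cannot be expected to reproduce the coefficients of the full fit.

```python
# First six rows of sqft_living and price, copied from this report's output.
sqft_living = [1180, 2570, 770, 1960, 1680, 5420]
price = [221900, 538000, 180000, 604000, 510000, 1230000]

n = len(price)
mean_x = sum(sqft_living) / n
mean_y = sum(price) / n

# Closed-form OLS for one predictor: slope = Sxy / Sxx.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sqft_living, price))
s_xx = sum((x - mean_x) ** 2 for x in sqft_living)
slope = s_xy / s_xx

# The fitted line passes through the point of means.
intercept = mean_y - slope * mean_x

# Residuals are observed minus fitted values; with an intercept they sum to ~0.
residuals = [y - (intercept + slope * x) for x, y in zip(sqft_living, price)]

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"sum of residuals = {sum(residuals):.6f}")
```

With several predictors the same least-squares idea applies, but the coefficients are solved jointly, which is why each estimate in the full model is interpreted as holding the other predictors fixed.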

## 
## Call:
## lm(formula = Model1, data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1061981  -136474   -23050    97388  4631488 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -4.818e+05  1.494e+04 -32.261  < 2e-16 ***
## sqft_living  2.215e+02  3.622e+00  61.156  < 2e-16 ***
## floors      -3.844e+04  3.745e+03 -10.265  < 2e-16 ***
## bedrooms    -4.087e+04  2.302e+03 -17.757  < 2e-16 ***
## bathrooms   -1.405e+04  3.712e+03  -3.786 0.000153 ***
## grade        1.027e+05  2.389e+03  42.984  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 247600 on 21591 degrees of freedom
## Multiple R-squared:  0.5458, Adjusted R-squared:  0.5457 
## F-statistic:  5189 on 5 and 21591 DF,  p-value: < 2.2e-16
## Subset selection object
## Call: regsubsets.formula(Model1, data = df, nbest = 1)
## 5 Variables  (and intercept)
##             Forced in Forced out
## sqft_living     FALSE      FALSE
## floors          FALSE      FALSE
## bedrooms        FALSE      FALSE
## bathrooms       FALSE      FALSE
## grade           FALSE      FALSE
## 1 subsets of each size up to 5
## Selection Algorithm: exhaustive
##          sqft_living floors bedrooms bathrooms grade
## 1  ( 1 ) "*"         " "    " "      " "       " "  
## 2  ( 1 ) "*"         " "    " "      " "       "*"  
## 3  ( 1 ) "*"         " "    "*"      " "       "*"  
## 4  ( 1 ) "*"         "*"    "*"      " "       "*"  
## 5  ( 1 ) "*"         "*"    "*"      "*"       "*"
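
The columns of the regression summary are linked by simple arithmetic, which gives a quick consistency check. The Python sketch below recomputes three t values, the F statistic, and the adjusted R-squared from the rounded numbers printed above. Small differences from the printed values are expected because the inputs are rounded.

```python
# Values copied from the regression summary in this section.
estimates = {"sqft_living": 2.215e+02, "grade": 1.027e+05, "bathrooms": -1.405e+04}
std_errors = {"sqft_living": 3.622e+00, "grade": 2.389e+03, "bathrooms": 3.712e+03}

# t value = estimate / standard error.
t_values = {name: estimates[name] / std_errors[name] for name in estimates}

r_squared = 0.5458
k = 5                 # number of predictors
df_resid = 21591      # residual degrees of freedom
n = df_resid + k + 1  # number of observations

# Overall F statistic derived from R-squared.
f_stat = (r_squared / k) / ((1 - r_squared) / df_resid)

# Adjusted R-squared penalizes the number of predictors.
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / df_resid

for name, t in t_values.items():
    print(f"t({name}) = {t:.3f}")
print(f"F = {f_stat:.1f}, adjusted R-squared = {adj_r_squared:.4f}, n = {n}")
```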

4.3 Conclusion

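Because the p-values are below 0.05, both for the overall F-test and for each coefficient's t-test, we reject \(H_0\) in favor of \(H_1\): price is affected by bedrooms, bathrooms, floors, grade, and sqft_living. The multiple R-squared of 0.5458 indicates that the model explains about 55% of the variation in price, while the residual standard error of 247,600 shows that a sizable share of the variation remains unexplained. In conclusion, at the 5% significance level (95% confidence), the data support the claim that price is affected by these variables.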