Analysis Objective

Using one of the two unsupervised learning algorithms we’ve learned, we will produce a simple R Markdown document that demonstrates either clustering or dimensionality reduction on wholesale.csv, nyc.csv, or a dataset of our own.

We will explain our choice of parameters (how we choose k for k-means clustering, or how many dimensions we retain for PCA) and describe the business utility of the unsupervised model we develop. The R Markdown document should contain one or two visualizations.

Libraries and Setup

We’ll set up caching for this notebook, since some of the code we will write is computationally expensive. We also raise scipen so that large numbers print in fixed rather than scientific notation.

knitr::opts_chunk$set(cache=TRUE)
options(scipen = 9999)

The libraries used in this analysis:

library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(Hmisc)
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
## 
## Attaching package: 'Hmisc'
## The following objects are masked from 'package:dplyr':
## 
##     src, summarize
## The following objects are masked from 'package:base':
## 
##     format.pval, units
library(FactoMineR)

PCA Analysis on Property Sales in NYC

In this analysis, we use the nyc dataset and look for its principal components.

Import the nyc data:

property_data <- read.csv("data_input/nyc.csv", stringsAsFactors = F, sep = ",")

Data exploration:

describe(property_data)
## property_data 
## 
##  22  Variables      84548  Observations
## ---------------------------------------------------------------------------
## X 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    84548        0    26736        1    10344     8149      849     1695 
##      .25      .50      .75      .90      .95 
##     4231     8942    15987    21167    23281 
## 
## lowest :     4     5     6     7     8, highest: 26735 26736 26737 26738 26739
## ---------------------------------------------------------------------------
## BOROUGH 
##        n  missing distinct     Info     Mean      Gmd 
##    84548        0        5    0.934    2.999    1.424 
##                                         
## Value          1     2     3     4     5
## Frequency  18306  7049 24047 26736  8410
## Proportion 0.217 0.083 0.284 0.316 0.099
## ---------------------------------------------------------------------------
## NEIGHBORHOOD 
##        n  missing distinct 
##    84548        0      254 
## 
## lowest : AIRPORT LA GUARDIA ALPHABET CITY      ANNADALE           ARDEN HEIGHTS      ARROCHAR          
## highest: WOODHAVEN          WOODLAWN           WOODROW            WOODSIDE           WYCKOFF HEIGHTS   
## ---------------------------------------------------------------------------
## BUILDING.CLASS.CATEGORY 
##        n  missing distinct 
##    84548        0       47 
## 
## lowest : 01 ONE FAMILY DWELLINGS                     02 TWO FAMILY DWELLINGS                     03 THREE FAMILY DWELLINGS                   04 TAX CLASS 1 CONDOS                       05 TAX CLASS 1 VACANT LAND                 
## highest: 45 CONDO HOTELS                             46 CONDO STORE BUILDINGS                    47 CONDO NON-BUSINESS STORAGE               48 CONDO TERRACES/GARDENS/CABANAS           49 CONDO WAREHOUSES/FACTORY/INDUS          
## ---------------------------------------------------------------------------
## TAX.CLASS.AT.PRESENT 
##        n  missing distinct 
##    83810      738       10 
##                                                                       
## Value          1    1A    1B    1C     2    2A    2B    2C     3     4
## Frequency  38633  1444  1234   186 30919  2521   814  1915     4  6140
## Proportion 0.461 0.017 0.015 0.002 0.369 0.030 0.010 0.023 0.000 0.073
## ---------------------------------------------------------------------------
## BLOCK 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    84548        0    11566        1     4237     3872      276      633 
##      .25      .50      .75      .90      .95 
##     1323     3311     6281     9151    11616 
## 
## lowest :     1     3     5     6     7, highest: 16315 16316 16317 16319 16322
## ---------------------------------------------------------------------------
## LOT 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    84548        0     2627        1    376.2    544.8        2        7 
##      .25      .50      .75      .90      .95 
##       22       50     1001     1207     1403 
## 
## lowest :    1    2    3    4    5, highest: 9080 9081 9085 9099 9106
## ---------------------------------------------------------------------------
## BUILDING.CLASS.AT.PRESENT 
##        n  missing distinct 
##    83810      738      166 
## 
## lowest : A0 A1 A2 A3 A4, highest: Z0 Z2 Z3 Z7 Z9
## ---------------------------------------------------------------------------
## ADDRESS 
##        n  missing distinct 
##    84548        0    67563 
## 
## lowest : ****** 95TH   STREET  1 12TH   ST EXTENSION 1 5 AVENUE            1 5TH AVENUE, 23A     1 ASCAN AVE, 35      
## highest: WOODROW ROAD          WOODYCREST AVENUE     WORTMAN AVENUE        YORK AVENUE           ZEREGA AVENUE        
## ---------------------------------------------------------------------------
## APARTMENT.NUMBER 
##        n  missing distinct 
##    19052    65496     3988 
## 
## lowest : #4   #PHC `    0    0.25, highest: W6B  W8D  WS2  Y1   Z   
## ---------------------------------------------------------------------------
## ZIP.CODE 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    84548        0      186        1    10732      842    10011    10019 
##      .25      .50      .75      .90      .95 
##    10305    11209    11357    11414    11427 
##                                                                       
## Value          0 10000 10100 10200 10300 10500 10800 11000 11100 11200
## Frequency    982 15806  2117     1  8350  6990     1   529  1954 23732
## Proportion 0.012 0.187 0.025 0.000 0.099 0.083 0.000 0.006 0.023 0.281
##                       
## Value      11400 11700
## Frequency  23079  1007
## Proportion 0.273 0.012
## ---------------------------------------------------------------------------
## RESIDENTIAL.UNITS 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    84548        0      176    0.899    2.025    2.833        0        0 
##      .25      .50      .75      .90      .95 
##        0        1        2        3        4 
## 
## lowest :    0    1    2    3    4, highest:  889  894  948 1641 1844
## ---------------------------------------------------------------------------
## COMMERCIAL.UNITS 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    84548        0       55    0.171   0.1936   0.3788        0        0 
##      .25      .50      .75      .90      .95 
##        0        0        0        0        1 
## 
## lowest :    0    1    2    3    4, highest:  254  318  422  436 2261
## ---------------------------------------------------------------------------
## TOTAL.UNITS 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    84548        0      192    0.887    2.249    3.072        0        0 
##      .25      .50      .75      .90      .95 
##        1        1        2        3        4 
## 
## lowest :    0    1    2    3    4, highest:  902  955 1653 1866 2261
## ---------------------------------------------------------------------------
## LAND.SQUARE.FEET 
##        n  missing distinct 
##    84548        0     6062 
## 
## lowest :  -    0     100   1000  10000, highest: 998   9980  999   9992  9996 
## ---------------------------------------------------------------------------
## GROSS.SQUARE.FEET 
##        n  missing distinct 
##    84548        0     5691 
## 
## lowest :  -    0     100   1000  10000, highest: 9975  998   999   9990  9992 
## ---------------------------------------------------------------------------
## YEAR.BUILT 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##    84548        0      158    0.998     1789    327.5        0     1899 
##      .25      .50      .75      .90      .95 
##     1920     1940     1965     2006     2013 
##                                                                       
## Value          0  1120  1680  1800  1820  1840  1860  1880  1900  1920
## Frequency   6970     1     1    37     2    22    15   122  5882 24002
## Proportion 0.082 0.000 0.000 0.000 0.000 0.000 0.000 0.001 0.070 0.284
##                                         
## Value       1940  1960  1980  2000  2020
## Frequency  10018 18118  5666  9017  4675
## Proportion 0.118 0.214 0.067 0.107 0.055
## ---------------------------------------------------------------------------
## TAX.CLASS.AT.TIME.OF.SALE 
##        n  missing distinct     Info     Mean      Gmd 
##    84548        0        4    0.799    1.657   0.7752 
##                                   
## Value          1     2     3     4
## Frequency  41533 36726     4  6285
## Proportion 0.491 0.434 0.000 0.074
## ---------------------------------------------------------------------------
## BUILDING.CLASS.AT.TIME.OF.SALE 
##        n  missing distinct 
##    84548        0      166 
## 
## lowest : A0 A1 A2 A3 A4, highest: Z0 Z2 Z3 Z7 Z9
## ---------------------------------------------------------------------------
## SALE.PRICE 
##        n  missing distinct 
##    84548        0    10008 
## 
## lowest :  -      0       1       10      100    
## highest: 999988  99999   999990  999999  9999999
## ---------------------------------------------------------------------------
## SALE.DATE 
##        n  missing distinct 
##    84548        0      364 
## 
## lowest : 2016-09-01 00:00:00 2016-09-02 00:00:00 2016-09-03 00:00:00 2016-09-04 00:00:00 2016-09-05 00:00:00
## highest: 2017-08-27 00:00:00 2017-08-28 00:00:00 2017-08-29 00:00:00 2017-08-30 00:00:00 2017-08-31 00:00:00
## ---------------------------------------------------------------------------
## 
## Variables with all observations missing:
## 
## [1] EASE.MENT
glimpse(property_data)
## Observations: 84,548
## Variables: 22
## $ X                              <int> 4, 5, 6, 7, 8, 9, 10, 11, 12, 1...
## $ BOROUGH                        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1...
## $ NEIGHBORHOOD                   <chr> "ALPHABET CITY", "ALPHABET CITY...
## $ BUILDING.CLASS.CATEGORY        <chr> "07 RENTALS - WALKUP APARTMENTS...
## $ TAX.CLASS.AT.PRESENT           <chr> "2A", "2", "2", "2B", "2A", "2"...
## $ BLOCK                          <int> 392, 399, 399, 402, 404, 405, 4...
## $ LOT                            <int> 6, 26, 39, 21, 55, 16, 32, 18, ...
## $ EASE.MENT                      <lgl> NA, NA, NA, NA, NA, NA, NA, NA,...
## $ BUILDING.CLASS.AT.PRESENT      <chr> "C2", "C7", "C7", "C4", "C2", "...
## $ ADDRESS                        <chr> "153 AVENUE B", "234 EAST 4TH  ...
## $ APARTMENT.NUMBER               <chr> " ", " ", " ", " ", " ", " ", "...
## $ ZIP.CODE                       <int> 10009, 10009, 10009, 10009, 100...
## $ RESIDENTIAL.UNITS              <int> 5, 28, 16, 10, 6, 20, 8, 44, 15...
## $ COMMERCIAL.UNITS               <int> 0, 3, 1, 0, 0, 0, 0, 2, 0, 0, 4...
## $ TOTAL.UNITS                    <int> 5, 31, 17, 10, 6, 20, 8, 46, 15...
## $ LAND.SQUARE.FEET               <chr> "1633", "4616", "2212", "2272",...
## $ GROSS.SQUARE.FEET              <chr> "6440", "18690", "7803", "6794"...
## $ YEAR.BUILT                     <int> 1900, 1900, 1900, 1913, 1900, 1...
## $ TAX.CLASS.AT.TIME.OF.SALE      <int> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2...
## $ BUILDING.CLASS.AT.TIME.OF.SALE <chr> "C2", "C7", "C7", "C4", "C2", "...
## $ SALE.PRICE                     <chr> "6625000", " -  ", " -  ", "393...
## $ SALE.DATE                      <chr> "2017-07-19 00:00:00", "2016-12...

Variable explanations:

BOROUGH: A digit code for the borough the property is located in; in order these are Manhattan (1), Bronx (2), Brooklyn (3), Queens (4), and Staten Island (5)
NEIGHBORHOOD: The neighborhood name
BUILDING.CLASS.CATEGORY: Category class of the property[^6]
TAX.CLASS.AT.PRESENT, TAX.CLASS.AT.TIME.OF.SALE: The tax class of the property, at present and at the time of sale
BLOCK, LOT: The combination of borough, block, and lot forms a unique key for property in New York City
EASE.MENT: An easement is a right, such as a right of way, which allows an entity to make limited use of another’s real property
BUILDING.CLASS.AT.PRESENT, BUILDING.CLASS.AT.TIME.OF.SALE: The type of building at various points in time (for example, “A” signifies one-family homes, “O” signifies office buildings, and “R” signifies condominiums)[^6]
ADDRESS: Street address of the property
APARTMENT.NUMBER: Apartment number if applicable
ZIP.CODE: The property’s postal code
RESIDENTIAL.UNITS: The number of residential units at the listed property
COMMERCIAL.UNITS: The number of commercial units at the listed property
LAND.SQUARE.FEET: The land area of the property listed in square feet
GROSS.SQUARE.FEET: The total area of all the floors of a building as measured from the exterior surfaces of the outside walls of the building, including the land area and space within any building or structure on the property
YEAR.BUILT: Year the property was built
SALE.PRICE: Price paid for the property
SALE.DATE: Date the property sold

This dataset uses the financial definition of a building/building unit, for tax purposes. In case a single entity owns the building in question, a sale covers the value of the entire building. In case a building is owned piecemeal by its residents (a condominium), a sale refers to a single apartment (or group of apartments) owned by some individual.

Cleaning and pre-processing using dplyr

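We only want to use numerical data (integers), keeping BOROUGH and TAX.CLASS.AT.TIME.OF.SALE as categorical variables (factors) in the data we analyse. The square-footage and sale-price columns are stored as text, so we convert them to integers first; entries that are not numbers (such as the " - " placeholder) become NA, and rows containing NA are dropped.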

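library(dplyr)
property <- property_data %>% 
  # Text columns: non-numeric entries such as " -  " become NA (with a coercion warning)
  mutate(LAND.SQUARE.FEET = as.integer(LAND.SQUARE.FEET),
         GROSS.SQUARE.FEET = as.integer(GROSS.SQUARE.FEET),
         SALE.PRICE = as.integer(SALE.PRICE)
         ) %>% 
  # Keep integer columns only (this also drops the all-NA logical EASE.MENT)
  select_if(is.integer) %>% 
  # Drop row index and location identifiers, which are codes rather than measurements
  select(-c(X, BLOCK, LOT, ZIP.CODE)) %>% 
  # Keep only rows with no missing values
  filter(complete.cases(.))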

property$BOROUGH <- as.factor(property$BOROUGH)
property$TAX.CLASS.AT.TIME.OF.SALE <- as.factor(property$TAX.CLASS.AT.TIME.OF.SALE)
glimpse(property)
## Observations: 48,243
## Variables: 9
## $ BOROUGH                   <fct> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
## $ RESIDENTIAL.UNITS         <int> 5, 10, 6, 8, 24, 10, 24, 3, 4, 5, 0,...
## $ COMMERCIAL.UNITS          <int> 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, ...
## $ TOTAL.UNITS               <int> 5, 10, 6, 8, 24, 10, 24, 4, 5, 6, 1,...
## $ LAND.SQUARE.FEET          <int> 1633, 2272, 2369, 1750, 4489, 3717, ...
## $ GROSS.SQUARE.FEET         <int> 6440, 6794, 4615, 4226, 18523, 12350...
## $ YEAR.BUILT                <int> 1900, 1913, 1900, 1920, 1920, 2009, ...
## $ TAX.CLASS.AT.TIME.OF.SALE <fct> 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 4, 1, ...
## $ SALE.PRICE                <int> 6625000, 3936272, 8000000, 3192840, ...
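
The drop from 84,548 to 48,243 observations comes from the coercion step followed by the complete-case filter. The following is a minimal sketch on made-up values (the toy vector is an assumption for illustration, not taken from nyc.csv) of how as.integer() and as.numeric() treat strings like the ones seen in these columns:

```r
# Toy strings mimicking SALE.PRICE: a normal value, the " -  " placeholder,
# and a hypothetical value above R's integer maximum (2147483647).
raw_price <- c("6625000", " -  ", "3000000000")

# as.integer() warns and returns NA for anything it cannot represent.
as_int <- suppressWarnings(as.integer(raw_price))

# as.numeric() also turns the placeholder into NA, but keeps the large value.
as_num <- suppressWarnings(as.numeric(raw_price))

toy <- data.frame(price_int = as_int, price_num = as_num)

# complete.cases() keeps only the rows that contain no NA at all.
kept <- toy[complete.cases(toy), , drop = FALSE]

print(toy)
print(nrow(kept))
```

If any sale price in the real data exceeds the integer limit, the integer conversion would turn it into NA and the row would be dropped; converting with as.numeric() would keep it, though the is.integer filter in the pipeline would then need adjusting.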

PCA Analysis

We have to scale the data first (with scale.unit = TRUE inside PCA()) because the variables are measured in different units. Without scaling, the variance explained by the principal components would be dominated by the variables with the largest ranges, such as SALE.PRICE.
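
To illustrate the effect of scaling, here is a small sketch on simulated data. It uses prcomp() from base R's stats package rather than FactoMineR, and the two toy columns (units, price) are assumptions chosen only to have very different ranges:

```r
set.seed(1)
n <- 200

# Two unrelated toy variables on very different scales (hypothetical data).
toy <- data.frame(
  units = rnorm(n, mean = 3, sd = 2),
  price = rnorm(n, mean = 1e6, sd = 5e5)
)

# Share of total variance carried by each principal component.
share <- function(fit) fit$sdev^2 / sum(fit$sdev^2)

unscaled_share <- share(prcomp(toy, scale. = FALSE))
scaled_share   <- share(prcomp(toy, scale. = TRUE))

print(round(unscaled_share, 4))  # first component absorbs almost everything
print(round(scaled_share, 4))    # variance is split far more evenly
```

Without scaling, the first component is essentially the price column on its own; with scaling, both variables get a comparable say.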

(property_PCA <- PCA(property, quali.sup = c(1,8), scale.unit = TRUE, graph = FALSE))
## **Results for the Principal Component Analysis (PCA)**
## The analysis was performed on 48243 individuals, described by 9 variables
## *The results are available in the following objects:
## 
##    name               
## 1  "$eig"             
## 2  "$var"             
## 3  "$var$coord"       
## 4  "$var$cor"         
## 5  "$var$cos2"        
## 6  "$var$contrib"     
## 7  "$ind"             
## 8  "$ind$coord"       
## 9  "$ind$cos2"        
## 10 "$ind$contrib"     
## 11 "$quali.sup"       
## 12 "$quali.sup$coord" 
## 13 "$quali.sup$v.test"
## 14 "$call"            
## 15 "$call$centre"     
## 16 "$call$ecart.type" 
## 17 "$call$row.w"      
## 18 "$call$col.w"      
##    description                                          
## 1  "eigenvalues"                                        
## 2  "results for the variables"                          
## 3  "coord. for the variables"                           
## 4  "correlations variables - dimensions"                
## 5  "cos2 for the variables"                             
## 6  "contributions of the variables"                     
## 7  "results for the individuals"                        
## 8  "coord. for the individuals"                         
## 9  "cos2 for the individuals"                           
## 10 "contributions of the individuals"                   
## 11 "results for the supplementary categorical variables"
## 12 "coord. for the supplementary categories"            
## 13 "v-test of the supplementary categories"             
## 14 "summary statistics"                                 
## 15 "mean of the variables"                              
## 16 "standard error of the variables"                    
## 17 "weights for the individuals"                        
## 18 "weights for the variables"
summary(property_PCA)
## 
## Call:
## PCA(X = property, scale.unit = TRUE, quali.sup = c(1, 8), graph = FALSE) 
## 
## 
## Eigenvalues
##                        Dim.1   Dim.2   Dim.3   Dim.4   Dim.5   Dim.6
## Variance               2.908   1.179   0.999   0.976   0.706   0.232
## % of var.             41.541  16.843  14.274  13.941  10.092   3.309
## Cumulative % of var.  41.541  58.384  72.658  86.598  96.690 100.000
##                        Dim.7
## Variance               0.000
## % of var.              0.000
## Cumulative % of var. 100.000
## 
## Individuals (the 10 first)
##                                 Dist    Dim.1    ctr   cos2    Dim.2
## 1                           |  0.675 |  0.294  0.000  0.189 | -0.148
## 2                           |  0.671 |  0.508  0.000  0.574 | -0.030
## 3                           |  0.833 |  0.359  0.000  0.186 | -0.158
## 4                           |  0.507 |  0.331  0.000  0.426 | -0.005
## 5                           |  2.403 |  1.808  0.002  0.566 | -0.304
## 6                           |  1.279 |  0.792  0.000  0.384 | -0.261
## 7                           |  2.071 |  1.663  0.002  0.644 | -0.176
## 8                           |  0.323 |  0.088  0.000  0.074 |  0.034
## 9                           |  0.727 |  0.289  0.000  0.158 | -0.080
## 10                          |  0.500 |  0.242  0.000  0.234 |  0.016
##                                ctr   cos2    Dim.3    ctr   cos2  
## 1                            0.000  0.048 |  0.198  0.000  0.086 |
## 2                            0.000  0.002 |  0.195  0.000  0.085 |
## 3                            0.000  0.036 |  0.206  0.000  0.061 |
## 4                            0.000  0.000 |  0.209  0.000  0.170 |
## 5                            0.000  0.016 |  0.279  0.000  0.014 |
## 6                            0.000  0.042 |  0.447  0.000  0.122 |
## 7                            0.000  0.007 |  0.263  0.000  0.016 |
## 8                            0.000  0.011 |  0.198  0.000  0.375 |
## 9                            0.000  0.012 |  0.204  0.000  0.079 |
## 10                           0.000  0.001 |  0.206  0.000  0.169 |
## 
## Variables
##                                Dim.1    ctr   cos2    Dim.2    ctr   cos2
## RESIDENTIAL.UNITS           |  0.855 25.139  0.731 | -0.097  0.799  0.009
## COMMERCIAL.UNITS            |  0.316  3.441  0.100 |  0.886 66.516  0.784
## TOTAL.UNITS                 |  0.887 27.055  0.787 |  0.387 12.720  0.150
## LAND.SQUARE.FEET            |  0.640 14.104  0.410 | -0.285  6.884  0.081
## GROSS.SQUARE.FEET           |  0.854 25.071  0.729 | -0.310  8.136  0.096
## YEAR.BUILT                  |  0.044  0.068  0.002 | -0.018  0.026  0.000
## SALE.PRICE                  |  0.386  5.124  0.149 | -0.241  4.919  0.058
##                                Dim.3    ctr   cos2  
## RESIDENTIAL.UNITS           | -0.019  0.037  0.000 |
## COMMERCIAL.UNITS            |  0.015  0.022  0.000 |
## TOTAL.UNITS                 | -0.009  0.008  0.000 |
## LAND.SQUARE.FEET            | -0.075  0.556  0.006 |
## GROSS.SQUARE.FEET           | -0.005  0.002  0.000 |
## YEAR.BUILT                  |  0.994 98.883  0.988 |
## SALE.PRICE                  |  0.070  0.492  0.005 |
## 
## Supplementary categories
##                                  Dist     Dim.1    cos2  v.test     Dim.2
## BOROUGH 1                   |   2.469 |   1.981   0.644  37.226 |  -0.352
## BOROUGH 2                   |   0.200 |   0.048   0.057   2.535 |   0.004
## BOROUGH 3                   |   0.150 |  -0.076   0.253  -9.705 |   0.014
## BOROUGH 4                   |   0.243 |  -0.016   0.004  -1.125 |   0.013
## BOROUGH 5                   |   0.339 |  -0.066   0.037  -2.893 |  -0.031
## TAX.CLASS.AT.TIME.OF.SALE 1 |   0.190 |  -0.106   0.312 -19.555 |   0.000
## TAX.CLASS.AT.TIME.OF.SALE 2 |   0.329 |   0.170   0.265  12.710 |   0.023
## TAX.CLASS.AT.TIME.OF.SALE 3 |   3.947 |  -0.383   0.009  -0.318 |   0.102
## TAX.CLASS.AT.TIME.OF.SALE 4 |   0.892 |   0.382   0.183  13.933 |  -0.079
##                                cos2  v.test     Dim.3    cos2  v.test  
## BOROUGH 1                     0.020 -10.398 |   0.290   0.014   9.279 |
## BOROUGH 2                     0.000   0.360 |  -0.183   0.838 -16.610 |
## BOROUGH 3                     0.009   2.870 |  -0.128   0.723 -27.997 |
## BOROUGH 4                     0.003   1.393 |   0.235   0.934  28.209 |
## BOROUGH 5                     0.008  -2.164 |   0.289   0.727  21.780 |
## TAX.CLASS.AT.TIME.OF.SALE 1   0.000  -0.002 |   0.147   0.597  46.181 |
## TAX.CLASS.AT.TIME.OF.SALE 2   0.005   2.741 |  -0.224   0.460 -28.584 |
## TAX.CLASS.AT.TIME.OF.SALE 3   0.001   0.132 |  -3.911   0.982  -5.533 |
## TAX.CLASS.AT.TIME.OF.SALE 4   0.008  -4.546 |  -0.565   0.400 -35.125 |

From the summary, the first four principal components have a cumulative variance of 86.6%, so four components are enough to retain more than 80% of the variation in our data.
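
The choice of four components can be reproduced from the eigenvalues printed in the summary. This is a minimal sketch: the 80% threshold is the one used in this analysis, and the eigenvalue-greater-than-one rule (often called the Kaiser criterion) is shown only as an alternative heuristic that the analysis itself does not use:

```r
# Eigenvalues as printed (rounded) in summary(property_PCA).
eig <- c(2.908, 1.179, 0.999, 0.976, 0.706, 0.232, 0.000)

# Cumulative percentage of variance explained.
cum_pct <- cumsum(eig) / sum(eig) * 100

# Smallest number of components whose cumulative share reaches the threshold.
threshold <- 80
n_keep <- which(cum_pct >= threshold)[1]

# Alternative heuristic: keep components with eigenvalue above 1.
n_kaiser <- sum(eig > 1)

print(round(cum_pct, 1))
print(n_keep)
print(n_kaiser)
```

The two rules disagree here, which is a reminder that the number of retained dimensions is a judgment call tied to how much information loss is acceptable.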

glimpse(property_PCA)
## List of 6
##  $ eig      : num [1:7, 1:3] 2.908 1.179 0.999 0.976 0.706 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : chr [1:7] "comp 1" "comp 2" "comp 3" "comp 4" ...
##   .. ..$ : chr [1:3] "eigenvalue" "percentage of variance" "cumulative percentage of variance"
##  $ var      :List of 4
##   ..$ coord  : num [1:7, 1:5] 0.855 0.316 0.887 0.64 0.854 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ cor    : num [1:7, 1:5] 0.855 0.316 0.887 0.64 0.854 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ cos2   : num [1:7, 1:5] 0.731 0.1 0.787 0.41 0.729 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ contrib: num [1:7, 1:5] 25.14 3.44 27.05 14.1 25.07 ...
##   .. ..- attr(*, "dimnames")=List of 2
##  $ ind      :List of 4
##   ..$ coord  : num [1:48243, 1:5] 0.294 0.508 0.359 0.331 1.808 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ cos2   : num [1:48243, 1:5] 0.189 0.574 0.186 0.426 0.566 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ contrib: num [1:48243, 1:5] 0.0000614 0.0001841 0.000092 0.000078 0.0023293 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ dist   : Named num [1:48243] 0.675 0.671 0.833 0.507 2.403 ...
##   .. ..- attr(*, "names")= chr [1:48243] "1" "2" "3" "4" ...
##  $ svd      :List of 3
##   ..$ vs: num [1:7] 1.705 1.086 1 0.988 0.841 ...
##   ..$ U : num [1:48243, 1:5] 0.172 0.298 0.211 0.194 1.06 ...
##   ..$ V : num [1:7, 1:5] 0.501 0.185 0.52 0.376 0.501 ...
##  $ quali.sup:List of 5
##   ..$ coord : num [1:9, 1:5] 1.9814 0.0476 -0.0756 -0.016 -0.0656 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ cos2  : num [1:9, 1:5] 0.6443 0.05679 0.253 0.00432 0.03733 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ v.test: num [1:9, 1:5] 37.23 2.54 -9.7 -1.12 -2.89 ...
##   .. ..- attr(*, "dimnames")=List of 2
##   ..$ dist  : Named num [1:9] 2.469 0.2 0.15 0.243 0.339 ...
##   .. ..- attr(*, "names")= chr [1:9] "BOROUGH 1" "BOROUGH 2" "BOROUGH 3" "BOROUGH 4" ...
##   ..$ eta2  : num [1:2, 1:5] 0.029395 0.008823 0.0024 0.000513 0.036288 ...
##   .. ..- attr(*, "dimnames")=List of 2
##  $ call     :List of 10
##   ..$ row.w     : num [1:48243] 0.0000207 0.0000207 0.0000207 0.0000207 0.0000207 ...
##   ..$ col.w     : num [1:7] 1 1 1 1 1 1 1
##   ..$ scale.unit: logi TRUE
##   ..$ ncp       : num 5
##   ..$ centre    : num [1:7] 2.567 0.248 2.834 3356.5 3636.935 ...
##   ..$ ecart.type: num [1:7] 17.5 11 20.7 31433.9 28579.9 ...
##   ..$ X         :'data.frame':   48243 obs. of  9 variables:
##   .. ..$ BOROUGH                  : Factor w/ 5 levels "1","2","3","4",..: 1 1 1 1 1 1 1 1 1 1 ...
##   .. ..$ RESIDENTIAL.UNITS        : int [1:48243] 5 10 6 8 24 10 24 3 4 5 ...
##   .. ..$ COMMERCIAL.UNITS         : int [1:48243] 0 0 0 0 0 0 0 1 1 1 ...
##   .. ..$ TOTAL.UNITS              : int [1:48243] 5 10 6 8 24 10 24 4 5 6 ...
##   .. ..$ LAND.SQUARE.FEET         : int [1:48243] 1633 2272 2369 1750 4489 3717 4131 1520 2201 1779 ...
##   .. ..$ GROSS.SQUARE.FEET        : int [1:48243] 6440 6794 4615 4226 18523 12350 16776 3360 5608 3713 ...
##   .. ..$ YEAR.BUILT               : int [1:48243] 1900 1913 1900 1920 1920 2009 1928 1910 1900 1910 ...
##   .. ..$ TAX.CLASS.AT.TIME.OF.SALE: Factor w/ 4 levels "1","2","3","4": 2 2 2 2 2 2 2 2 2 2 ...
##   .. ..$ SALE.PRICE               : int [1:48243] 6625000 3936272 8000000 3192840 16232000 10350000 11900000 3300000 7215000 4750000 ...
##   ..$ row.w.init: num [1:48243] 1 1 1 1 1 1 1 1 1 1 ...
##   ..$ call      : language PCA(X = property, scale.unit = TRUE, quali.sup = c(1, 8), graph = FALSE)
##   ..$ quali.sup :List of 5
##   .. ..$ quali.sup :'data.frame':    48243 obs. of  2 variables:
##   .. ..$ modalite  : int [1:2] 5 4
##   .. ..$ nombre    : num [1:9] 1005 7049 24047 11078 5064 ...
##   .. ..$ barycentre:'data.frame':    9 obs. of  7 variables:
##   .. ..$ numero    : num [1:2] 1 8
##  - attr(*, "class")= chr [1:2] "PCA" "list "

Below are the eigenvalues ($eig), the results for the variables ($var), the means of the variables ($call$centre), the standard deviations of the variables ($call$ecart.type), and the results for the supplementary categorical variables ($quali.sup):

property_PCA$eig
##           eigenvalue percentage of variance
## comp 1 2.90787699968          41.5410999955
## comp 2 1.17898738072          16.8426768674
## comp 3 0.99918293523          14.2740419318
## comp 4 0.97584668055          13.9406668651
## comp 5 0.70644025663          10.0920036662
## comp 6 0.23163936792           3.3091338275
## comp 7 0.00002637926           0.0003768466
##        cumulative percentage of variance
## comp 1                          41.54110
## comp 2                          58.38378
## comp 3                          72.65782
## comp 4                          86.59849
## comp 5                          96.69049
## comp 6                          99.99962
## comp 7                         100.00000
property_PCA$var
## $coord
##                        Dim.1       Dim.2        Dim.3       Dim.4
## RESIDENTIAL.UNITS 0.85498392 -0.09703775 -0.019233785 -0.16704593
## COMMERCIAL.UNITS  0.31630325  0.88555744  0.014968969  0.14712653
## TOTAL.UNITS       0.88697377  0.38725903 -0.008856898 -0.06267080
## LAND.SQUARE.FEET  0.64040411 -0.28489754 -0.074548106 -0.38758659
## GROSS.SQUARE.FEET 0.85382986 -0.30971630 -0.004504338  0.08036685
## YEAR.BUILT        0.04435943 -0.01755000  0.993991182 -0.09605043
## SALE.PRICE        0.38600345 -0.24081261  0.070102083  0.86974744
##                         Dim.5
## RESIDENTIAL.UNITS -0.47077704
## COMMERCIAL.UNITS   0.30401284
## TOTAL.UNITS       -0.23530494
## LAND.SQUARE.FEET   0.55477904
## GROSS.SQUARE.FEET  0.15132125
## YEAR.BUILT         0.02109446
## SALE.PRICE         0.07677326
## 
## $cor
##                        Dim.1       Dim.2        Dim.3       Dim.4
## RESIDENTIAL.UNITS 0.85498392 -0.09703775 -0.019233785 -0.16704593
## COMMERCIAL.UNITS  0.31630325  0.88555744  0.014968969  0.14712653
## TOTAL.UNITS       0.88697377  0.38725903 -0.008856898 -0.06267080
## LAND.SQUARE.FEET  0.64040411 -0.28489754 -0.074548106 -0.38758659
## GROSS.SQUARE.FEET 0.85382986 -0.30971630 -0.004504338  0.08036685
## YEAR.BUILT        0.04435943 -0.01755000  0.993991182 -0.09605043
## SALE.PRICE        0.38600345 -0.24081261  0.070102083  0.86974744
##                         Dim.5
## RESIDENTIAL.UNITS -0.47077704
## COMMERCIAL.UNITS   0.30401284
## TOTAL.UNITS       -0.23530494
## LAND.SQUARE.FEET   0.55477904
## GROSS.SQUARE.FEET  0.15132125
## YEAR.BUILT         0.02109446
## SALE.PRICE         0.07677326
## 
## $cos2
##                         Dim.1        Dim.2         Dim.3       Dim.4
## RESIDENTIAL.UNITS 0.730997502 0.0094163244 0.00036993848 0.027904344
## COMMERCIAL.UNITS  0.100047744 0.7842119871 0.00022407004 0.021646216
## TOTAL.UNITS       0.786722475 0.1499695595 0.00007844464 0.003927629
## LAND.SQUARE.FEET  0.410117418 0.0811666068 0.00555742010 0.150223366
## GROSS.SQUARE.FEET 0.729025437 0.0959241882 0.00002028906 0.006458830
## YEAR.BUILT        0.001967759 0.0003080025 0.98801847081 0.009225685
## SALE.PRICE        0.148998665 0.0579907122 0.00491430210 0.756460610
##                          Dim.5
## RESIDENTIAL.UNITS 0.2216310242
## COMMERCIAL.UNITS  0.0924238079
## TOTAL.UNITS       0.0553684128
## LAND.SQUARE.FEET  0.3077797807
## GROSS.SQUARE.FEET 0.0228981219
## YEAR.BUILT        0.0004449763
## SALE.PRICE        0.0058941328
## 
## $contrib
##                         Dim.1       Dim.2        Dim.3      Dim.4
## RESIDENTIAL.UNITS 25.13852898  0.79867898  0.037024099  2.8595008
## COMMERCIAL.UNITS   3.44057687 66.51572357  0.022425327  2.2181985
## TOTAL.UNITS       27.05487456 12.72020057  0.007850879  0.4024842
## LAND.SQUARE.FEET  14.10367145  6.88443389  0.556196458 15.3941566
## GROSS.SQUARE.FEET 25.07071093  8.13615055  0.002030565  0.6618694
## YEAR.BUILT         0.06766996  0.02612432 98.882640604  0.9454032
## SALE.PRICE         5.12396724  4.91868812  0.491832069 77.5183873
##                         Dim.5
## RESIDENTIAL.UNITS 31.37293240
## COMMERCIAL.UNITS  13.08303244
## TOTAL.UNITS        7.83766387
## LAND.SQUARE.FEET  43.56770127
## GROSS.SQUARE.FEET  3.24133877
## YEAR.BUILT         0.06298852
## SALE.PRICE         0.83434272
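
The tables in $var are linked to each other. As a sketch, using the Dim.1 numbers printed above and assuming unit column weights (as shown in $call$col.w), cos2 is the squared coordinate and the contribution is that cos2 expressed as a percentage of the component's eigenvalue:

```r
# First eigenvalue and Dim.1 coordinates, copied from the output above.
eig1 <- 2.90787699968
coord <- c(
  RESIDENTIAL.UNITS = 0.85498392,
  COMMERCIAL.UNITS  = 0.31630325,
  TOTAL.UNITS       = 0.88697377,
  LAND.SQUARE.FEET  = 0.64040411,
  GROSS.SQUARE.FEET = 0.85382986,
  YEAR.BUILT        = 0.04435943,
  SALE.PRICE        = 0.38600345
)

# Quality of representation: squared coordinate (squared correlation).
cos2 <- coord^2

# Contribution (%) of each variable to the component.
contrib <- 100 * cos2 / eig1

print(round(cos2, 3))
print(round(contrib, 3))
print(sum(cos2))  # should be close to the eigenvalue of Dim.1
```

Read this way, Dim.1 is built mostly from the unit counts and gross square footage, while YEAR.BUILT contributes almost nothing to it.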
property_PCA$call$centre
## [1]       2.5665900       0.2484506       2.8339655    3356.5001969
## [5]    3636.9349958    1827.7623075 1107495.5967083
property_PCA$call$ecart.type
## [1]      17.46548      10.98693      20.74990   31433.89188   28579.92478
## [6]     464.36073 8857712.41336
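
These two vectors are what scale.unit = TRUE uses to standardize each variable. As a rough sketch, the first row shown in glimpse(property) can be standardized by hand; the variable order is assumed to follow the column order of the numeric variables in property:

```r
vars <- c("RESIDENTIAL.UNITS", "COMMERCIAL.UNITS", "TOTAL.UNITS",
          "LAND.SQUARE.FEET", "GROSS.SQUARE.FEET", "YEAR.BUILT", "SALE.PRICE")

# Means and standard deviations copied from the output above.
centre <- setNames(c(2.5665900, 0.2484506, 2.8339655, 3356.5001969,
                     3636.9349958, 1827.7623075, 1107495.5967083), vars)
ecart_type <- setNames(c(17.46548, 10.98693, 20.74990, 31433.89188,
                         28579.92478, 464.36073, 8857712.41336), vars)

# First observation, read off the glimpse(property) output.
first_row <- setNames(c(5, 0, 5, 1633, 6440, 1900, 6625000), vars)

# Standardized values: subtract the mean, divide by the standard deviation.
z <- (first_row - centre) / ecart_type

# Distance of the observation from the origin of the standardized space.
dist_origin <- sqrt(sum(z^2))

print(round(z, 3))
print(round(dist_origin, 3))
```

The resulting distance should come out close to the Dist value of 0.675 reported for individual 1 in the summary. The mean of about 1828 and standard deviation of about 464 for YEAR.BUILT also suggest that the zero values seen in describe() are still present after cleaning, which may be worth checking.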
property_PCA$quali.sup
## $coord
##                                   Dim.1           Dim.2      Dim.3
## BOROUGH 1                    1.98143909 -0.352399007367  0.2895092
## BOROUGH 2                    0.04758252  0.004300310523 -0.1827417
## BOROUGH 3                   -0.07557940  0.014233585870 -0.1278099
## BOROUGH 4                   -0.01599782  0.012617273994  0.2351416
## BOROUGH 5                   -0.06557540 -0.031240341073  0.2894424
## TAX.CLASS.AT.TIME.OF.SALE 1 -0.10589099 -0.000007537488  0.1465885
## TAX.CLASS.AT.TIME.OF.SALE 2  0.16955791  0.023283324304 -0.2235252
## TAX.CLASS.AT.TIME.OF.SALE 3 -0.38342543  0.101643219101 -3.9105636
## TAX.CLASS.AT.TIME.OF.SALE 4  0.38207685 -0.079372880927 -0.5646255
##                                    Dim.4       Dim.5
## BOROUGH 1                    1.366165184 -0.30492111
## BOROUGH 2                   -0.051049816 -0.03294969
## BOROUGH 3                    0.002985195 -0.01579772
## BOROUGH 4                   -0.045953693  0.03322987
## BOROUGH 5                   -0.113715606  0.10870366
## TAX.CLASS.AT.TIME.OF.SALE 1 -0.054309713  0.01691500
## TAX.CLASS.AT.TIME.OF.SALE 2  0.034504129 -0.16691869
## TAX.CLASS.AT.TIME.OF.SALE 3  0.334317954 -0.09123121
## TAX.CLASS.AT.TIME.OF.SALE 4  0.374523938  0.41581442
## 
## $cos2
##                                   Dim.1             Dim.2      Dim.3
## BOROUGH 1                   0.644304411 0.020379757187860 0.01375481
## BOROUGH 2                   0.056789658 0.000463846214325 0.83762539
## BOROUGH 3                   0.252995218 0.008972926352411 0.72349355
## BOROUGH 4                   0.004324796 0.002690141222544 0.93433590
## BOROUGH 5                   0.037328674 0.008472121804581 0.72725193
## TAX.CLASS.AT.TIME.OF.SALE 1 0.311650540 0.000000001579076 0.59724103
## TAX.CLASS.AT.TIME.OF.SALE 2 0.264894838 0.004994906545456 0.46035252
## TAX.CLASS.AT.TIME.OF.SALE 3 0.009438117 0.000663254705965 0.98175296
## TAX.CLASS.AT.TIME.OF.SALE 4 0.183309179 0.007910921253578 0.40031679
##                                    Dim.4       Dim.5
## BOROUGH 1                   0.3062923001 0.015258247
## BOROUGH 2                   0.0653676321 0.027231845
## BOROUGH 3                   0.0003946852 0.011053358
## BOROUGH 4                   0.0356849294 0.018659556
## BOROUGH 5                   0.1122537338 0.102576767
## TAX.CLASS.AT.TIME.OF.SALE 1 0.0819794271 0.007952322
## TAX.CLASS.AT.TIME.OF.SALE 2 0.0109693155 0.256712685
## TAX.CLASS.AT.TIME.OF.SALE 3 0.0071753472 0.000534331
## TAX.CLASS.AT.TIME.OF.SALE 4 0.1761334887 0.217111017
## 
## $v.test
##                                   Dim.1         Dim.2      Dim.3
## BOROUGH 1                    37.2256802 -10.397526364   9.278758
## BOROUGH 2                     2.5352386   0.359836044 -16.610192
## BOROUGH 3                    -9.7048031   2.870324932 -27.997124
## BOROUGH 4                    -1.1249910   1.393436620  28.208713
## BOROUGH 5                    -2.8925185  -2.164135250  21.780248
## TAX.CLASS.AT.TIME.OF.SALE 1 -19.5551384  -0.002186059  46.181420
## TAX.CLASS.AT.TIME.OF.SALE 2  12.7102782   2.741040352 -28.584400
## TAX.CLASS.AT.TIME.OF.SALE 3  -0.3179892   0.132386388  -5.532690
## TAX.CLASS.AT.TIME.OF.SALE 4  13.9329450  -4.545668756 -35.125154
##                                   Dim.4       Dim.5
## BOROUGH 1                    44.3059804 -11.6225040
## BOROUGH 2                    -4.6952943  -3.5618288
## BOROUGH 3                     0.6616879  -4.1155511
## BOROUGH 4                    -5.5783516   4.7409698
## BOROUGH 5                    -8.6586959   9.7281367
## TAX.CLASS.AT.TIME.OF.SALE 1 -17.3131708   6.3375912
## TAX.CLASS.AT.TIME.OF.SALE 2   4.4648338 -25.3858827
## TAX.CLASS.AT.TIME.OF.SALE 3   0.4786173  -0.1535059
## TAX.CLASS.AT.TIME.OF.SALE 4  23.5759394  30.7639462
## 
## $dist
##                   BOROUGH 1                   BOROUGH 2 
##                   2.4685116                   0.1996700 
##                   BOROUGH 3                   BOROUGH 4 
##                   0.1502613                   0.2432641 
##                   BOROUGH 5 TAX.CLASS.AT.TIME.OF.SALE 1 
##                   0.3394061                   0.1896815 
## TAX.CLASS.AT.TIME.OF.SALE 2 TAX.CLASS.AT.TIME.OF.SALE 3 
##                   0.3294438                   3.9467375 
## TAX.CLASS.AT.TIME.OF.SALE 4 
##                   0.8923981 
## 
## $eta2
##                                 Dim.1       Dim.2      Dim.3      Dim.4
## BOROUGH                   0.029394967 0.002400124 0.03628809 0.04212607
## TAX.CLASS.AT.TIME.OF.SALE 0.008823114 0.000513238 0.05143083 0.01301342
##                                 Dim.5
## BOROUGH                   0.005257136
## TAX.CLASS.AT.TIME.OF.SALE 0.028414011
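
The $v.test values above are usually read like z-scores: an absolute value above roughly 1.96 suggests that the category's coordinate on that dimension differs from zero at about the 5% level. The small sketch below applies this rule to the Dim.1 and Dim.2 values for BOROUGH, copied from the output above and rounded to two decimals. The object names are only illustrative.

```r
# v.test values for BOROUGH, copied from the output above (rounded to 2 decimals)
v_test <- matrix(
  c(37.23, -10.40,
     2.54,   0.36,
    -9.70,   2.87,
    -1.12,   1.39,
    -2.89,  -2.16),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste("BOROUGH", 1:5), c("Dim.1", "Dim.2"))
)

# Flag categories whose coordinate differs from 0 at roughly the 5% level
significant <- abs(v_test) > 1.96
significant
```

With a dataset of this size, even small shifts can pass this threshold, so the $eta2 values (at most about 0.05 here) are a better guide to how much of each dimension the categorical variables actually explain.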

When using select, we include:
select = "contrib 5": label the 5 elements that have the highest contribution on the 2 dimensions of our plot

## Individuals Factor Map
plot.PCA(property_PCA, 
         cex = 0.6, 
         choix = c("ind"),
         select = "contrib 5",
         habillage = 8) # Tax class for legend
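
The select argument accepts other selectors besides "contrib 5". The sketch below shows two of them on the decathlon data that ships with FactoMineR, used here as a stand-in for the nyc data. The column indices (11:12 and 13) belong to that example dataset, not to ours.

```r
library(FactoMineR)

# decathlon ships with FactoMineR; used here as a stand-in for the nyc data
data(decathlon)
res <- PCA(decathlon, quanti.sup = 11:12, quali.sup = 13, graph = FALSE)

# Label the 5 individuals contributing most to the plotted plane
plot(res, choix = "ind", select = "contrib 5", habillage = 13, cex = 0.6)

# Label only the individuals whose cos2 on the plane exceeds 0.8
plot(res, choix = "ind", select = "cos2 0.8", habillage = 13, cex = 0.6)
```

Selecting by contribution highlights the points that shape the axes most, while selecting by cos2 highlights the points that are best represented on the plane, so the two can label quite different individuals.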

## Variable Factor Map
plot.PCA(property_PCA,
         cex=0.6, 
         choix = c("var"))

With our variables factor map plotted (the second plot), let’s use the dimdesc() function to understand which variables and categories are most characteristic of each dimension obtained in the PCA process. (This is also consistent with the result from summary().)

Here we focus on the quantitative variables that best describe the first and second dimensions. Note that the function automatically sorts them in descending order of correlation.

dimdesc(property_PCA)
## $Dim.1
## $Dim.1$quanti
##                   correlation                       p.value
## TOTAL.UNITS        0.88697377 0.000000000000000000000000000
## RESIDENTIAL.UNITS  0.85498392 0.000000000000000000000000000
## GROSS.SQUARE.FEET  0.85382986 0.000000000000000000000000000
## LAND.SQUARE.FEET   0.64040411 0.000000000000000000000000000
## SALE.PRICE         0.38600345 0.000000000000000000000000000
## COMMERCIAL.UNITS   0.31630325 0.000000000000000000000000000
## YEAR.BUILT         0.04435943 0.000000000000000000000188547
## 
## $Dim.1$quali
##                                    R2
## BOROUGH                   0.029394967
## TAX.CLASS.AT.TIME.OF.SALE 0.008823114
##                                                                                                                                                                                                                                                                                                                                                 p.value
## BOROUGH                   0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000213916
## TAX.CLASS.AT.TIME.OF.SALE 0.000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000024294661395084662660018860854146705015708451330674062836799611835927685182349483674513938722391525017081656021498561652619427484751253128718524302445455902281821974844141251854343014066901787618619004426259248844299883302641
## 
## $Dim.1$category
##     Estimate
## 1  1.6070653
## 4  0.3664973
## 2  0.1539783
## 2 -0.3267913
## 5 -0.4399492
## 3 -0.4499532
## 1 -0.1214706
##                                                                                                                                                                                                                                                                                                                       p.value
## 1 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001043815
## 4 0.0000000000000000000000000000000000000000000329153775615770020911642912732372638933177643462111757427108925700012128833096382648177450962799388765763259246006910974102765976567752659320831298828125000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 2 0.0000000000000000000000000000000000004535404307008517481500053664971626594853625738998926919166963189718464064188430565909318098471742636279557814305007923394441604614257812500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 2 0.0112356300103029615317096201465574267785996198654174804687500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 5 0.0038206879643984691975744372172130169929005205631256103515625000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 3 0.0000000000000000000002749672808466575869358360091487126560195843103841720985759927015079640000294602941721677780151367187500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 1 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000001748147462519148452359124632461778150748826580477113888583998314459509434395101347782169052558164076420359718310264061442021928818821925178510865763372309608994316562134031164193239453759913311311294132149537529363758636691272841
## 
## 
## $Dim.2
## $Dim.2$quanti
##                   correlation
## COMMERCIAL.UNITS   0.88555744
## TOTAL.UNITS        0.38725903
## YEAR.BUILT        -0.01755000
## RESIDENTIAL.UNITS -0.09703775
## SALE.PRICE        -0.24081261
## LAND.SQUARE.FEET  -0.28489754
## GROSS.SQUARE.FEET -0.30971630
##                                                                                                                         p.value
## COMMERCIAL.UNITS  0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## TOTAL.UNITS       0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## YEAR.BUILT        0.00011576294932938061650259942148011305107502266764640808105468750000000000000000000000000000000000000000000
## RESIDENTIAL.UNITS 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000002918995
## SALE.PRICE        0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## LAND.SQUARE.FEET  0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## GROSS.SQUARE.FEET 0.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 
## $Dim.2$quali
##                                    R2                          p.value
## BOROUGH                   0.002400124 0.000000000000000000000003973015
## TAX.CLASS.AT.TIME.OF.SALE 0.000513238 0.000017292220738148899770395689
## 
## $Dim.2$category
##      Estimate                          p.value
## 3  0.08473122 0.004099491264680589323876613861
## 2  0.01189679 0.006123303495678600674723135455
## 5  0.03925729 0.030452522172288578466980979442
## 4 -0.09075941 0.000005465298646257913447316090
## 1 -0.28190137 0.000000000000000000000000239752
## 
## 
## $Dim.3
## $Dim.3$quanti
##                   correlation
## YEAR.BUILT         0.99399118
## SALE.PRICE         0.07010208
## COMMERCIAL.UNITS   0.01496897
## RESIDENTIAL.UNITS -0.01923378
## LAND.SQUARE.FEET  -0.07454811
##                                                                                p.value
## YEAR.BUILT        0.000000000000000000000000000000000000000000000000000000000000000000
## SALE.PRICE        0.000000000000000000000000000000000000000000000000000012779263457765
## COMMERCIAL.UNITS  0.001009281260546321607254882657400685275206342339515686035156250000
## RESIDENTIAL.UNITS 0.000023910481720145631851642820109304921061266213655471801757812500
## LAND.SQUARE.FEET  0.000000000000000000000000000000000000000000000000000000000002030628
## 
## $Dim.3$quali
##                                   R2 p.value
## BOROUGH                   0.03628809       0
## TAX.CLASS.AT.TIME.OF.SALE 0.05143083       0
## 
## $Dim.3$category
##     Estimate
## 1  1.2846200
## 4  0.5734059
## 2  0.9145062
## 4  0.1344333
## 5  0.1887341
## 1  0.1888009
## 3 -2.7725321
## 2 -0.2834500
## 3 -0.2285182
##                                                                                                                                                                                                                                                                                      p.value
## 1 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 4 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000009226337
## 2 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003200863174319934831159695851198660266247852320594460240302195111672638963014795625248319663702820550
## 4 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000167037749568152415667071495338133056811324669913256487057695090182692141192490737002168992786056607706283
## 5 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000011096322854539416200659894539489379598694373288985007149737575645869886495684904237520340918333770171176914928764463170639847323161363357279599016303429988552573965820012337472
## 1 0.0000000000000000000165147381655864597382188648610615061114022303989091368793366843004122301863390021026134490966796875000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 3 0.0000000313935158681424085383343037559955579496318023302592337131500244140625000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
## 2 0.0000000000000000000000000000000000000000000000000000000000000396915802406432462492421110789460489473228378621684504727858589446452456021403627630265892588902191755044054875595762851899583728648927367902983960273421808195859483703316072933375835418701171875000000000000000000000000
## 3 0.0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000071029404444766175710980547551243671287715718013212165327325541538632974825706470415557429633311878021600959
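
The output above is hard to read for two reasons: dimdesc() describes three dimensions by default, and the options(scipen = 9999) setting from our setup chunk forces very small p-values to print as long runs of zeros. The sketch below restricts the description to the first two dimensions and temporarily restores the default scipen for printing. It runs on the decathlon data shipped with FactoMineR as a stand-in, so the column indices are those of that dataset.

```r
library(FactoMineR)

data(decathlon)  # stand-in data shipped with FactoMineR
res <- PCA(decathlon, quanti.sup = 11:12, quali.sup = 13, graph = FALSE)

# Describe only the first two dimensions, keeping variables with p < 0.05
dd <- dimdesc(res, axes = 1:2, proba = 0.05)

# Temporarily restore the default scipen so tiny p-values print in scientific notation
old <- options(scipen = 0)
print(dd$Dim.1$quanti)
options(old)
```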

In the following graphs, we see that PC1 and PC2 combined can separate the Tax Class categories of our dataset reasonably well, but do not separate the Boroughs as clearly:

plotellipses(property_PCA, keepvar = "quali.sup")

plotellipses(property_PCA, keepvar = 8, 
             ylim = c(-50, 50), xlim = c(-10, 60))

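From this analysis, we can see how properties of each tax class are positioned along the first two principal components, which can help us decide which tax class of property to consider when buying or selling in New York City.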