---
title: "Test"
output: html_document
date: "2023-01-01"
---


R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see http://rmarkdown.rstudio.com.

When you click the Knit button, a document will be generated that includes both the content and the output of any embedded R code chunks within the document. You can embed an R code chunk like this:

summary(cars)
##      speed           dist       
##  Min.   : 4.0   Min.   :  2.00  
##  1st Qu.:12.0   1st Qu.: 26.00  
##  Median :15.0   Median : 36.00  
##  Mean   :15.4   Mean   : 42.98  
##  3rd Qu.:19.0   3rd Qu.: 56.00  
##  Max.   :25.0   Max.   :120.00

Including Plots

You can also embed plots, for example:

Note that the echo = FALSE parameter was added to the code chunk to prevent printing of the R code that generated the plot.
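As a minimal sketch of such a chunk, the body below draws a plot from a data set built into R. The data set (`pressure`) and the chunk label are assumptions borrowed from the default RStudio template, not taken from this document's source.

```r
# Sketch of the chunk body. In the .Rmd source the chunk would open with a
# header such as {r pressure, echo=FALSE} (the label "pressure" is assumed).
# With echo=FALSE, knitr keeps the figure in the output but hides this code.
plot(pressure)  # `pressure` is a data frame shipped with base R
```

With `echo=TRUE` (the default), the `plot(pressure)` line would be printed above the figure as well.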

GitHub

Two plus two equals 4

Boston Housing Data

The Boston housing data contain information collected by the U.S. Census Service concerning housing in the area of Boston, Mass. The dataset has been used extensively throughout the literature to benchmark algorithms. It has \(506\) cases and \(14\) variables:

  1. Crime - per capita crime rate by town
  2. #Large lots - proportion of residential land zoned for lots over 25,000 sq.ft.
  3. Nonretail acres - proportion of non-retail business acres per town.
  4. Charles - Charles River dummy variable (1 if tract bounds river; 0 otherwise)
  5. Nitric oxides - nitric oxides concentration (parts per 10 million)
  6. Rooms - average number of rooms per dwelling
  7. #Old houses - proportion of owner-occupied units built prior to 1940
  8. Dist.empl.centers - weighted distances to five Boston employment centres
  9. Access to highway - index of accessibility to radial highways
  10. Tax - full-value property-tax rate per $10,000
  11. Pupil/teacher - pupil-teacher ratio by town
  12. African American - \(1000\times (B_{k} - 0.63)^2\) where \(B_k\) is the proportion of Black residents by town
  13. # Lower status - % lower status of the population
  14. Value - Median value of owner-occupied homes in $1000’s
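The dimensions quoted above can be checked with a short sketch. It assumes that the copy of the data in the `MASS` package (the data frame `Boston`) is the one used here, which the text does not state.

```r
# Assumption: the analysis uses the Boston data frame from the MASS package.
library(MASS)

data(Boston)
dim(Boston)    # number of cases, then number of variables
names(Boston)  # short column names, in the same order as the list above
```

In that copy the columns carry short names (`crim`, `zn`, `indus`, and so on) rather than the descriptive labels used in the list.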

We will apply factor analysis to this dataset. To that end, we first determine the number of factors using principal factor analysis (PFA). The steps are:

  1. First, estimate the communalities of the variables: the \(i\)-th communality is \(h_i=\max_{j\neq i} |r_{i,j}|\), where \(r_{i,j}\) is the sample correlation between variables \(i\) and \(j\).

Communality and individual variance

| Variable | Communality | Individual variance |
|---|---|---|
| Crime | 0.3912567 | 0.6087433 |
| #Large lots | 0.4414383 | 0.5585617 |
| Nonretail acres | 0.5831635 | 0.4168365 |
| Nitric oxides | 0.5831635 | 0.4168365 |
| Rooms | 0.4835255 | 0.5164745 |
| #Old houses | 0.5350485 | 0.4649515 |
| Dist.empl.centers | 0.4414383 | 0.5585617 |
| Access to highway | 0.8285154 | 0.1714846 |
| Tax | 0.8285154 | 0.1714846 |
| Pupil/teacher | 0.2159844 | 0.7840156 |
| African American | 0.1111961 | 0.8888039 |
| # Lower status | 0.3645741 | 0.6354259 |
| Value | 0.4835255 | 0.5164745 |
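Step 1 can be sketched in a few lines of R. The sketch assumes the data are `MASS::Boston` and that the Charles River dummy is left out, since it has no row in the table; neither choice is stated in the text. It computes the communality as written in step 1 and, for comparison, a second variant (the square of the largest signed correlation) that appears to match the tabulated values. That second reading is an inference from the numbers, not the documented method.

```r
# Assumption: the data are MASS::Boston and the Charles River dummy (chas)
# is dropped because it has no row in the table.
library(MASS)

X <- Boston[, setdiff(names(Boston), "chas")]
R <- cor(X)    # 13 x 13 sample correlation matrix
diag(R) <- NA  # exclude j = i when taking the maximum

# Step 1 as stated: h_i = max over j != i of |r_ij|
h_abs <- apply(abs(R), 1, max, na.rm = TRUE)

# Variant inferred from the table: square of the largest signed correlation
h_tab <- apply(R, 1, max, na.rm = TRUE)^2

round(cbind(h_abs, h_tab, individual = 1 - h_tab), 7)
```

The two variants differ whenever a variable's strongest correlation is negative, so the choice between them affects the communalities carried into the later steps.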

Orthogonal projection of a vector