Approach

For this project, I will work with a housing-related dataset from NYC Open Data that covers housing programs, population counts, and rental costs. I will first identify a dataset relevant to housing conditions in New York City, then store a copy in a public GitHub repository so that the analysis can be reproduced.

My approach is to load the data directly from a public URL into R and then select the subset of variables most useful for downstream analysis. I will rename the selected columns so that they are easier to interpret and follow a consistent naming convention. The transformed dataset will serve as a clean foundation for future exploratory analysis.

One anticipated challenge is that NYC Open Data files often contain many columns, inconsistent naming conventions, or missing values. I may need to inspect the data structure carefully and decide which variables are relevant and which can be safely removed. Another challenge may be ensuring that all data types are correctly interpreted when importing the CSV file.
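The import concern can be illustrated with a minimal sketch. It assumes a tiny inline CSV whose values mimic the formatting seen in the NYCHA file; the object names `csv_text` and `sample_data` are illustrative and are not part of the analysis itself. It reports the class `read.csv` infers for each column and lists columns that contain only missing values.

```r
# Tiny inline CSV mimicking the formatting seen in the NYCHA file
# (illustrative assumption, not the real file).
csv_text <- 'PROGRAM,Total Families,All Average Gross Rent,Average Family Size,Residents 62 Plus
FEDERAL,"131,286",$622,2.1,
FORMER NEW YORK STATE,"7,668",$614,2.2,'

sample_data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Class of each column as inferred by read.csv
col_classes <- sapply(sample_data, class)
print(col_classes)

# Names of columns that contain only missing values
all_na_cols <- names(sample_data)[colSums(!is.na(sample_data)) == 0]
print(all_na_cols)
```

In this sketch, counts written with thousands separators and amounts with a dollar sign are read as character, while a column with no values at all is read as logical.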

Overview

This report uses housing-related data obtained from NYC Open Data (the NYCHA Resident Data Book Summary file). The dataset includes information on housing programs, population counts, and rental costs, and is used here to demonstrate data loading and transformation in R. Source: https://opendata.cityofnewyork.us/

Data Transformation

The original NYC Open Data file contains 43 demographic and housing-related variables. To simplify downstream analysis, five relevant columns were selected: program type, Section 8 status, total families, total population, and average gross rent. The selected columns were then renamed to improve readability and consistency.

dataset <- read.csv(
  "https://raw.githubusercontent.com/japhet125/r-workflow-assignment/main/NYCHA_Resident_Data_Book_Summary_20260125.csv",
  stringsAsFactors = FALSE
)

head(dataset)
##                 PROGRAM         STATECITY_SECTION8_FLAG Total.Families
## 1               FEDERAL                TOTAL HOUSEHOLDS        131,286
## 2 FORMER NEW YORK STATE                TOTAL HOUSEHOLDS          7,668
## 3 FORMER NEW YORK STATE       PUBLIC HOUSING HOUSEHOLDS          6,319
## 4 FORMER NEW YORK STATE SECTION 8 TRANSITION HOUSEHOLDS          1,349
## 5  FORMER NEW YORK CITY                TOTAL HOUSEHOLDS          4,020
## 6  FORMER NEW YORK CITY       PUBLIC HOUSING HOUSEHOLDS          3,305
##   Total.Female.Headed.Families Total.Male.Headed.Families Total.Population
## 1                      101,586                     29,610          272,972
## 2                        5,754                      1,908           17,237
## 3                        4,730                      1,584           13,928
## 4                        1,024                        324            3,309
## 5                        3,224                        790            7,997
## 6                        2,653                        646            6,448
##   Average.Family.Size Total.Minors.Under.18 Average.Minors.per.Family
## 1                 2.1                62,730                       0.5
## 2                 2.2                 4,012                       0.5
## 3                 2.2                 3,099                       0.5
## 4                 2.5                   913                       0.7
## 5                 2.0                 1,964                       0.5
## 6                 2.0                 1,555                       0.5
##   Total.Minors.as.Percent.of.Population All.Average.Total.Gross.Income
## 1                                22.98%                        $26,105
## 2                                23.28%                        $26,667
## 3                                22.25%                        $27,295
## 4                                27.59%                        $23,650
## 5                                24.56%                        $25,913
## 6                                24.12%                        $26,459
##   All.Average.Gross.Rent Total.HOH.62.Years.and.Over
## 1                   $622                      59,366
## 2                   $614                       3,399
## 3                   $624                       3,071
## 4                   $569                         328
## 5                   $617                       1,529
## 6                   $627                       1,376
##   Total.HOH.62.Years.and.Over.as.Percent.of.Families
## 1                                             45.22%
## 2                                             44.33%
## 3                                              48.6%
## 4                                             24.31%
## 5                                             38.03%
## 6                                             41.63%
##   Total.Female.Headed.HOH.62.Years.and.Over
## 1                                    43,526
## 2                                     2,358
## 3                                     2,201
## 4                                       157
## 5                                     1,152
## 6                                     1,075
##   Total.Male.Headed.HOH.62.Years.and.Over Total.Elderly.Single.Person.Families
## 1                                  15,831                               35,343
## 2                                   1,039                                1,848
## 3                                     869                                1,616
## 4                                     170                                  232
## 5                                     376                                  996
## 6                                     300                                  892
##   Total.Elderly.Population Total.62.Years.and.Over.as.Percent.of.Population
## 1                   70,229                                           25.73%
## 2                    4,078                                           23.66%
## 3                    3,723                                           26.73%
## 4                      355                                           10.73%
## 5                    1,735                                            21.7%
## 6                    1,565                                           24.27%
##   Total.Families.on.Welfare Total.Families.on.Welfare.and.HOH.Elderly
## 1                    20,174                                     2,202
## 2                     1,178                                       129
## 3                       860                                       108
## 4                       318                                        21
## 5                       655                                        50
## 6                       523                                        41
##   Total.Families.on.Full.Welfare
## 1                         11,346
## 2                            627
## 3                            469
## 4                            158
## 5                            373
## 6                            300
##   Total.Families.on.Welfare.as.Percent.of.Families
## 1                                           15.37%
## 2                                           15.36%
## 3                                           13.61%
## 4                                           23.57%
## 5                                           16.29%
## 6                                           15.82%
##   Total.Single.Parent.Grandparent.Families.with.Minors
## 1                                               26,428
## 2                                                1,367
## 3                                                1,126
## 4                                                  241
## 5                                                  939
## 6                                                  775
##   Total.Female.Headed.Single.Parent.Grandparent.with.Minors
## 1                                                    25,086
## 2                                                     1,294
## 3                                                     1,063
## 4                                                       231
## 5                                                       894
## 6                                                       736
##   Total.Male.Headed.Single.Parent.Grandparent.with.Minors
## 1                                                   1,320
## 2                                                      73
## 3                                                      63
## 4                                                      10
## 5                                                      43
## 6                                                      37
##   Total.Single.Parent.Grandparent.Families.on.Welfare
## 1                                              10,001
## 2                                                 508
## 3                                                 419
## 4                                                  89
## 5                                                 351
## 6                                                 310
##   Total.Single.Parent.Grandparent.with.Minors.as...of.Families
## 1                                                       20.13%
## 2                                                       17.83%
## 3                                                       17.82%
## 4                                                       17.87%
## 5                                                       23.36%
## 6                                                       23.45%
##   Total.Families...1.or.More.Employed
## 1                          5,053,500%
## 2                            285,300%
## 3                            239,400%
## 4                             45,900%
## 5                            149,700%
## 6                            123,200%
##   Total.Families...1.or.More.Employed.as.Percent.of.Families
## 1                                                     38.49%
## 2                                                     37.21%
## 3                                                     37.89%
## 4                                                     34.03%
## 5                                                     37.24%
## 6                                                     37.28%
##   Total.Families...2nd.Adult.Employed
## 1                               7,008
## 2                                 482
## 3                                 402
## 4                                  80
## 5                                 140
## 6                                 112
##   All.Families.Average.Years.in.Public.Housing Residents.Under.4
## 1                                         27.1             6,494
## 2                                         25.8               338
## 3                                         28.7               281
## 4                                         11.7                57
## 5                                         24.4               211
## 6                                         27.0               187
##   Residents.4.to.5 Residents.6.to.9 Residents.10.to.13 Residents.14.to.17
## 1            5,492           14,363             16,880             19,501
## 2              358              890              1,075              1,351
## 3              293              709                825                991
## 4               65              181                250                360
## 5              164              459                493                637
## 6              139              386                389                454
##   Residents.18.to.20 Residents.21.to.49 Residents.50.to.61 Residents.62.Plus
## 1             14,427             89,029             36,556                NA
## 2              1,057              5,828              2,262                NA
## 3                723              4,536              1,847                NA
## 4                334              1,292                415                NA
## 5                428              2,807              1,063                NA
## 6                295              2,174                859                NA
##   Total.Fixed.Income.Families
## 1                      60,097
## 2                       3,413
## 3                       2,888
## 4                         525
## 5                       1,777
## 6                       1,503
##   Total.Fixed.Income.Families.as.Percent.of.Families
## 1                                             0.4578
## 2                                             0.4451
## 3                                             0.4570
## 4                                             0.3892
## 5                                             0.4420
## 6                                             0.4548
str(dataset)
## 'data.frame':    26 obs. of  43 variables:
##  $ PROGRAM                                                     : chr  "FEDERAL" "FORMER NEW YORK STATE" "FORMER NEW YORK STATE" "FORMER NEW YORK STATE" ...
##  $ STATECITY_SECTION8_FLAG                                     : chr  "TOTAL HOUSEHOLDS" "TOTAL HOUSEHOLDS" "PUBLIC HOUSING HOUSEHOLDS" "SECTION 8 TRANSITION HOUSEHOLDS" ...
##  $ Total.Families                                              : chr  "131,286" "7,668" "6,319" "1,349" ...
##  $ Total.Female.Headed.Families                                : chr  "101,586" "5,754" "4,730" "1,024" ...
##  $ Total.Male.Headed.Families                                  : chr  "29,610" "1,908" "1,584" "324" ...
##  $ Total.Population                                            : chr  "272,972" "17,237" "13,928" "3,309" ...
##  $ Average.Family.Size                                         : num  2.1 2.2 2.2 2.5 2 2 2.2 2.2 2.1 2.4 ...
##  $ Total.Minors.Under.18                                       : chr  "62,730" "4,012" "3,099" "913" ...
##  $ Average.Minors.per.Family                                   : num  0.5 0.5 0.5 0.7 0.5 0.5 0.6 0.5 0.5 0.6 ...
##  $ Total.Minors.as.Percent.of.Population                       : chr  "22.98%" "23.28%" "22.25%" "27.59%" ...
##  $ All.Average.Total.Gross.Income                              : chr  "$26,105" "$26,667" "$27,295" "$23,650" ...
##  $ All.Average.Gross.Rent                                      : chr  "$622" "$614" "$624" "$569" ...
##  $ Total.HOH.62.Years.and.Over                                 : chr  "59,366" "3,399" "3,071" "328" ...
##  $ Total.HOH.62.Years.and.Over.as.Percent.of.Families          : chr  "45.22%" "44.33%" "48.6%" "24.31%" ...
##  $ Total.Female.Headed.HOH.62.Years.and.Over                   : chr  "43,526" "2,358" "2,201" "157" ...
##  $ Total.Male.Headed.HOH.62.Years.and.Over                     : chr  "15,831" "1,039" "869" "170" ...
##  $ Total.Elderly.Single.Person.Families                        : chr  "35,343" "1,848" "1,616" "232" ...
##  $ Total.Elderly.Population                                    : chr  "70,229" "4,078" "3,723" "355" ...
##  $ Total.62.Years.and.Over.as.Percent.of.Population            : chr  "25.73%" "23.66%" "26.73%" "10.73%" ...
##  $ Total.Families.on.Welfare                                   : chr  "20,174" "1,178" "860" "318" ...
##  $ Total.Families.on.Welfare.and.HOH.Elderly                   : chr  "2,202" "129" "108" "21" ...
##  $ Total.Families.on.Full.Welfare                              : chr  "11,346" "627" "469" "158" ...
##  $ Total.Families.on.Welfare.as.Percent.of.Families            : chr  "15.37%" "15.36%" "13.61%" "23.57%" ...
##  $ Total.Single.Parent.Grandparent.Families.with.Minors        : chr  "26,428" "1,367" "1,126" "241" ...
##  $ Total.Female.Headed.Single.Parent.Grandparent.with.Minors   : chr  "25,086" "1,294" "1,063" "231" ...
##  $ Total.Male.Headed.Single.Parent.Grandparent.with.Minors     : chr  "1,320" "73" "63" "10" ...
##  $ Total.Single.Parent.Grandparent.Families.on.Welfare         : chr  "10,001" "508" "419" "89" ...
##  $ Total.Single.Parent.Grandparent.with.Minors.as...of.Families: chr  "20.13%" "17.83%" "17.82%" "17.87%" ...
##  $ Total.Families...1.or.More.Employed                         : chr  "5,053,500%" "285,300%" "239,400%" "45,900%" ...
##  $ Total.Families...1.or.More.Employed.as.Percent.of.Families  : chr  "38.49%" "37.21%" "37.89%" "34.03%" ...
##  $ Total.Families...2nd.Adult.Employed                         : chr  "7,008" "482" "402" "80" ...
##  $ All.Families.Average.Years.in.Public.Housing                : num  27.1 25.8 28.7 11.7 24.4 27 12.1 25.3 28.2 11.9 ...
##  $ Residents.Under.4                                           : chr  "6,494" "338" "281" "57" ...
##  $ Residents.4.to.5                                            : chr  "5,492" "358" "293" "65" ...
##  $ Residents.6.to.9                                            : chr  "14,363" "890" "709" "181" ...
##  $ Residents.10.to.13                                          : chr  "16,880" "1,075" "825" "250" ...
##  $ Residents.14.to.17                                          : chr  "19,501" "1,351" "991" "360" ...
##  $ Residents.18.to.20                                          : chr  "14,427" "1,057" "723" "334" ...
##  $ Residents.21.to.49                                          : chr  "89,029" "5,828" "4,536" "1,292" ...
##  $ Residents.50.to.61                                          : chr  "36,556" "2,262" "1,847" "415" ...
##  $ Residents.62.Plus                                           : logi  NA NA NA NA NA NA ...
##  $ Total.Fixed.Income.Families                                 : chr  "60,097" "3,413" "2,888" "525" ...
##  $ Total.Fixed.Income.Families.as.Percent.of.Families          : num  0.458 0.445 0.457 0.389 0.442 ...
colnames(dataset)
##  [1] "PROGRAM"                                                     
##  [2] "STATECITY_SECTION8_FLAG"                                     
##  [3] "Total.Families"                                              
##  [4] "Total.Female.Headed.Families"                                
##  [5] "Total.Male.Headed.Families"                                  
##  [6] "Total.Population"                                            
##  [7] "Average.Family.Size"                                         
##  [8] "Total.Minors.Under.18"                                       
##  [9] "Average.Minors.per.Family"                                   
## [10] "Total.Minors.as.Percent.of.Population"                       
## [11] "All.Average.Total.Gross.Income"                              
## [12] "All.Average.Gross.Rent"                                      
## [13] "Total.HOH.62.Years.and.Over"                                 
## [14] "Total.HOH.62.Years.and.Over.as.Percent.of.Families"          
## [15] "Total.Female.Headed.HOH.62.Years.and.Over"                   
## [16] "Total.Male.Headed.HOH.62.Years.and.Over"                     
## [17] "Total.Elderly.Single.Person.Families"                        
## [18] "Total.Elderly.Population"                                    
## [19] "Total.62.Years.and.Over.as.Percent.of.Population"            
## [20] "Total.Families.on.Welfare"                                   
## [21] "Total.Families.on.Welfare.and.HOH.Elderly"                   
## [22] "Total.Families.on.Full.Welfare"                              
## [23] "Total.Families.on.Welfare.as.Percent.of.Families"            
## [24] "Total.Single.Parent.Grandparent.Families.with.Minors"        
## [25] "Total.Female.Headed.Single.Parent.Grandparent.with.Minors"   
## [26] "Total.Male.Headed.Single.Parent.Grandparent.with.Minors"     
## [27] "Total.Single.Parent.Grandparent.Families.on.Welfare"         
## [28] "Total.Single.Parent.Grandparent.with.Minors.as...of.Families"
## [29] "Total.Families...1.or.More.Employed"                         
## [30] "Total.Families...1.or.More.Employed.as.Percent.of.Families"  
## [31] "Total.Families...2nd.Adult.Employed"                         
## [32] "All.Families.Average.Years.in.Public.Housing"                
## [33] "Residents.Under.4"                                           
## [34] "Residents.4.to.5"                                            
## [35] "Residents.6.to.9"                                            
## [36] "Residents.10.to.13"                                          
## [37] "Residents.14.to.17"                                          
## [38] "Residents.18.to.20"                                          
## [39] "Residents.21.to.49"                                          
## [40] "Residents.50.to.61"                                          
## [41] "Residents.62.Plus"                                           
## [42] "Total.Fixed.Income.Families"                                 
## [43] "Total.Fixed.Income.Families.as.Percent.of.Families"
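The runs of dots in these names appear where `read.csv`, with its default `check.names = TRUE`, replaced characters that are not valid in R names. As an alternative to renaming columns by hand, the following is a minimal base-R sketch that converts such names to lower snake_case. The helper `clean_names_base` is hypothetical and is not used elsewhere in this report.

```r
# Hypothetical helper (not part of the original analysis): normalise
# read.csv-style names to lower snake_case using base R only.
clean_names_base <- function(x) {
  x <- gsub("[^A-Za-z0-9]+", "_", x)  # collapse runs of dots/punctuation
  x <- gsub("^_+|_+$", "", x)         # drop leading/trailing underscores
  tolower(x)
}

raw_names <- c(
  "PROGRAM",
  "STATECITY_SECTION8_FLAG",
  "Total.Families...1.or.More.Employed",
  "All.Average.Gross.Rent"
)
cleaned <- clean_names_base(raw_names)
print(cleaned)
```

Applied to a whole data frame, this would be `colnames(dataset) <- clean_names_base(colnames(dataset))`. Manual renaming remains preferable when only a handful of columns are kept and more descriptive names are wanted.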
# Select a subset of columns

subset_dataset <- dataset[, c(
  "PROGRAM",
  "STATECITY_SECTION8_FLAG",
  "Total.Families",
  "Total.Population",
  "All.Average.Gross.Rent"
)]

head(subset_dataset)
##                 PROGRAM         STATECITY_SECTION8_FLAG Total.Families
## 1               FEDERAL                TOTAL HOUSEHOLDS        131,286
## 2 FORMER NEW YORK STATE                TOTAL HOUSEHOLDS          7,668
## 3 FORMER NEW YORK STATE       PUBLIC HOUSING HOUSEHOLDS          6,319
## 4 FORMER NEW YORK STATE SECTION 8 TRANSITION HOUSEHOLDS          1,349
## 5  FORMER NEW YORK CITY                TOTAL HOUSEHOLDS          4,020
## 6  FORMER NEW YORK CITY       PUBLIC HOUSING HOUSEHOLDS          3,305
##   Total.Population All.Average.Gross.Rent
## 1          272,972                   $622
## 2           17,237                   $614
## 3           13,928                   $624
## 4            3,309                   $569
## 5            7,997                   $617
## 6            6,448                   $627
# Rename columns for readability

colnames(subset_dataset) <- c(
  "CITY_PROGRAM",
  "SECTION_8",
  "Total_Families",
  "Total_Population",
  "All_Average_Gross_Rent"
)
head(subset_dataset)
##            CITY_PROGRAM                       SECTION_8 Total_Families
## 1               FEDERAL                TOTAL HOUSEHOLDS        131,286
## 2 FORMER NEW YORK STATE                TOTAL HOUSEHOLDS          7,668
## 3 FORMER NEW YORK STATE       PUBLIC HOUSING HOUSEHOLDS          6,319
## 4 FORMER NEW YORK STATE SECTION 8 TRANSITION HOUSEHOLDS          1,349
## 5  FORMER NEW YORK CITY                TOTAL HOUSEHOLDS          4,020
## 6  FORMER NEW YORK CITY       PUBLIC HOUSING HOUSEHOLDS          3,305
##   Total_Population All_Average_Gross_Rent
## 1          272,972                   $622
## 2           17,237                   $614
## 3           13,928                   $624
## 4            3,309                   $569
## 5            7,997                   $617
## 6            6,448                   $627
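The count and rent columns are still stored as text ("131,286", "$622"), so they cannot be summed or averaged yet. The following is a minimal sketch of one way to convert them, using a toy data frame that copies the first three rows shown above. The helper `to_number` and the object `toy` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: strip "$", "," and "%" and convert to numeric.
to_number <- function(x) {
  as.numeric(gsub("[$,%]", "", x))
}

# Toy data frame reproducing the first three rows of the subset
toy <- data.frame(
  CITY_PROGRAM = c("FEDERAL", "FORMER NEW YORK STATE", "FORMER NEW YORK STATE"),
  Total_Families = c("131,286", "7,668", "6,319"),
  Total_Population = c("272,972", "17,237", "13,928"),
  All_Average_Gross_Rent = c("$622", "$614", "$624"),
  stringsAsFactors = FALSE
)

# Convert only the columns that hold formatted numbers
numeric_cols <- c("Total_Families", "Total_Population", "All_Average_Gross_Rent")
toy[numeric_cols] <- lapply(toy[numeric_cols], to_number)

str(toy)
```

If the same helper were applied to percentage columns, the result would be in percentage points (22.98) rather than fractions (0.2298). Values such as "5,053,500%" in `Total.Families...1.or.More.Employed` look malformed in the source file and would need to be checked separately.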

Conclusion

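This analysis prepared a reproducible, clearly named subset of NYC housing data for future exploration. The count and rent columns are still stored as text and will need to be converted to numeric values before any calculations. Future work could examine relationships between housing programs, population size, and rental costs, or incorporate additional years of data for trend analysis.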