This R Markdown document analyses the 2015 US demographic data by census tract. The R commands are shown together with their output. Each row of the data describes one census tract within a state and county.
| Column | Meaning |
|---|---|
| CensusTract | Census tract ID |
| State | State, DC, or Puerto Rico |
| County | County or county equivalent |
| TotalPop | Total population |
| Men | Number of men |
| Women | Number of women |
| Hispanic | Percent of population that is Hispanic/Latino |
| White | Percent of population that is white |
| Black | Percent of population that is black |
| Native | Percent of population that is Native American or Native Alaskan |
| Asian | Percent of population that is Asian |
| Pacific | Percent of population that is Native Hawaiian or Pacific Islander |
| Citizen | Number of citizens |
| Income | Median household income |
| IncomeErr | Median household income error |
| IncomePerCap | Income per capita |
| IncomePerCapErr | Income per capita error |
| Poverty | Percent under poverty level |
| ChildPoverty | Percent of children under poverty level |
| Professional | Percent employed in management, business, science, and arts |
| Service | Percent employed in service jobs |
| Office | Percent employed in sales and office jobs |
| Construction | Percent employed in natural resources, construction, and maintenance |
| Production | Percent employed in production, transportation, and material movement |
| Drive | Percent commuting alone in a car, van, or truck |
| Carpool | Percent carpooling in a car, van, or truck |
| Transit | Percent commuting on public transportation |
| Walk | Percent walking to work |
| OtherTransp | Percent commuting via other means |
| WorkAtHome | Percent working at home |
| MeanCommute | Mean commute time (minutes) |
| Employed | Number employed (16+) |
| PrivateWork | Percent employed in private industry |
| PublicWork | Percent employed in public jobs |
| SelfEmployed | Percent self-employed |
| FamilyWork | Percent in unpaid family work |
| Unemployment | Unemployment rate (percent) |
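As a quick sanity check on the column definitions, the sketch below confirms that `Men + Women` equals `TotalPop`. It uses only the first three tracts, copied from the `str()` output shown later in this document, rather than the full file; the data frame name `tracts` is made up for illustration.

```r
# First three tracts, copied from the str() output of US_data
tracts <- data.frame(
  TotalPop = c(1948, 2156, 2968),
  Men      = c(940, 1059, 1364),
  Women    = c(1008, 1097, 1604)
)

# TRUE for every tract where the two counts add up to the total
consistent <- tracts$Men + tracts$Women == tracts$TotalPop
print(consistent)
```

On the full data the same comparison would be `all(US_data$Men + US_data$Women == US_data$TotalPop)`.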
setwd("D:/R Internship")
# stringsAsFactors = TRUE keeps State and County as factors (the default before R 4.0)
US_data<-read.csv("US Demographic Data 2015_Census Tract.csv",stringsAsFactors = TRUE)
View(US_data)
str(US_data)
## 'data.frame': 74001 obs. of 37 variables:
## $ CensusTract : num 1e+09 1e+09 1e+09 1e+09 1e+09 ...
## $ State : Factor w/ 52 levels "Alabama","Alaska",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ County : Factor w/ 1928 levels "Añasco","Abbeville",..: 90 90 90 90 90 90 90 90 90 90 ...
## $ TotalPop : int 1948 2156 2968 4423 10763 3851 2761 3187 10915 5668 ...
## $ Men : int 940 1059 1364 2172 4922 1787 1210 1502 5486 2897 ...
## $ Women : int 1008 1097 1604 2251 5841 2064 1551 1685 5429 2771 ...
## $ Hispanic : num 0.9 0.8 0 10.5 0.7 13.1 3.8 1.3 1.4 0.4 ...
## $ White : num 87.4 40.4 74.5 82.8 68.5 72.9 74.5 84 89.5 85.5 ...
## $ Black : num 7.7 53.3 18.6 3.7 24.8 11.9 19.7 10.7 8.4 12.1 ...
## $ Native : num 0.3 0 0.5 1.6 0 0 0 3.1 0 0 ...
## $ Asian : num 0.6 2.3 1.4 0 3.8 0 0 0 0 0.3 ...
## $ Pacific : num 0 0 0.3 0 0 0 0 0 0 0 ...
## $ Citizen : int 1503 1662 2335 3306 7666 2642 2060 2391 7778 4217 ...
## $ Income : int 61838 32303 44922 54329 51965 63092 34821 73728 60063 41287 ...
## $ IncomeErr : int 11900 13538 5629 7003 6935 9585 7867 2447 8602 7857 ...
## $ IncomePerCap : int 25713 18021 20689 24125 27526 30480 20442 32813 24028 24710 ...
## $ IncomePerCapErr: int 4548 2474 2817 2870 2813 7550 3245 4669 2233 4149 ...
## $ Poverty : num 8.1 25.5 12.7 2.1 11.4 14.4 28.9 13 13.9 6.8 ...
## $ ChildPoverty : num 8.4 40.3 19.7 1.6 17.5 21.9 41.9 25.9 18.3 10 ...
## $ Professional : num 34.7 22.3 31.4 27 49.6 24.2 19.5 42.8 31.5 29.3 ...
## $ Service : num 17 24.7 24.9 20.8 14.2 17.5 29.6 10.7 17.5 13.7 ...
## $ Office : num 21.3 21.5 22.1 27 18.2 35.4 25.3 34.2 26.1 17.7 ...
## $ Construction : num 11.9 9.4 9.2 8.7 2.1 7.9 10.1 5.5 7.8 11 ...
## $ Production : num 15.2 22 12.4 16.4 15.8 14.9 15.5 6.8 17.1 28.3 ...
## $ Drive : num 90.2 86.3 94.8 86.6 88 82.7 92.4 84.3 90.1 88.7 ...
## $ Carpool : num 4.8 13.1 2.8 9.1 10.5 6.9 7.6 8.1 8.6 7.9 ...
## $ Transit : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Walk : num 0.5 0 0 0 0 0 0 0 0 0 ...
## $ OtherTransp : num 2.3 0.7 0 2.6 0.6 6 0 1.7 0 1.2 ...
## $ WorkAtHome : num 2.1 0 2.5 1.6 0.9 4.5 0 5.9 1.3 2.1 ...
## $ MeanCommute : num 25 23.4 19.6 25.3 24.8 19.8 20 24.3 29.4 32.9 ...
## $ Employed : int 943 753 1373 1782 5037 1560 1166 1502 4348 2485 ...
## $ PrivateWork : num 77.1 77 64.1 75.7 67.1 79.4 82 78.1 73.3 77.9 ...
## $ PublicWork : num 18.3 16.9 23.6 21.2 27.6 14.7 14.6 14.8 22.1 15.2 ...
## $ SelfEmployed : num 4.6 6.1 12.3 3.1 5.3 5.8 3.4 7.1 4.6 6.9 ...
## $ FamilyWork : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Unemployment : num 5.4 13.3 6.2 10.8 4.2 10.9 11.4 8.2 8.7 7.2 ...
dim(US_data)
## [1] 74001 37
library(psych)
summary(US_data)
## CensusTract State County
## Min. :1.001e+09 California : 8057 Los Angeles: 2346
## 1st Qu.:1.304e+10 Texas : 5265 Cook : 1326
## Median :2.805e+10 New York : 4918 Orange : 939
## Mean :2.839e+10 Florida : 4245 Jefferson : 927
## 3rd Qu.:4.200e+10 Pennsylvania: 3218 Maricopa : 916
## Max. :7.215e+10 Illinois : 3123 Montgomery : 833
## (Other) :45175 (Other) :66714
## TotalPop Men Women Hispanic
## Min. : 0 Min. : 0 Min. : 0 Min. : 0.00
## 1st Qu.: 2891 1st Qu.: 1409 1st Qu.: 1461 1st Qu.: 2.40
## Median : 4063 Median : 1986 Median : 2066 Median : 7.00
## Mean : 4326 Mean : 2128 Mean : 2198 Mean : 16.86
## 3rd Qu.: 5442 3rd Qu.: 2674 3rd Qu.: 2774 3rd Qu.: 20.40
## Max. :53812 Max. :27962 Max. :27250 Max. :100.00
## NA's :690
## White Black Native Asian
## Min. : 0.00 Min. : 0.00 Min. : 0.0000 Min. : 0.000
## 1st Qu.: 39.40 1st Qu.: 0.70 1st Qu.: 0.0000 1st Qu.: 0.200
## Median : 71.40 Median : 3.70 Median : 0.0000 Median : 1.400
## Mean : 62.03 Mean : 13.27 Mean : 0.7277 Mean : 4.588
## 3rd Qu.: 88.30 3rd Qu.: 14.40 3rd Qu.: 0.4000 3rd Qu.: 4.800
## Max. :100.00 Max. :100.00 Max. :100.0000 Max. :91.300
## NA's :690 NA's :690 NA's :690 NA's :690
## Pacific Citizen Income IncomeErr
## Min. : 0.000 Min. : 0 Min. : 2611 Min. : 390
## 1st Qu.: 0.000 1st Qu.: 2037 1st Qu.: 37683 1st Qu.: 5317
## Median : 0.000 Median : 2863 Median : 51094 Median : 7732
## Mean : 0.145 Mean : 3043 Mean : 57226 Mean : 9134
## 3rd Qu.: 0.000 3rd Qu.: 3838 3rd Qu.: 70117 3rd Qu.: 11258
## Max. :84.700 Max. :37416 Max. :248750 Max. :123116
## NA's :690 NA's :1100 NA's :1100
## IncomePerCap IncomePerCapErr Poverty ChildPoverty
## Min. : 128 Min. : 85 Min. : 0.00 Min. : 0.00
## 1st Qu.: 19123 1st Qu.: 2312 1st Qu.: 7.20 1st Qu.: 7.00
## Median : 25344 Median : 3127 Median : 13.40 Median : 17.80
## Mean : 28491 Mean : 3943 Mean : 16.96 Mean : 22.49
## 3rd Qu.: 33894 3rd Qu.: 4537 3rd Qu.: 23.10 3rd Qu.: 33.80
## Max. :254204 Max. :134380 Max. :100.00 Max. :100.00
## NA's :740 NA's :740 NA's :835 NA's :1118
## Professional Service Office Construction
## Min. : 0.00 Min. : 0.0 Min. : 0.00 Min. : 0.000
## 1st Qu.: 24.10 1st Qu.: 13.4 1st Qu.: 20.10 1st Qu.: 5.000
## Median : 32.60 Median : 17.9 Median : 23.80 Median : 8.400
## Mean : 34.80 Mean : 19.1 Mean : 23.95 Mean : 9.292
## 3rd Qu.: 43.88 3rd Qu.: 23.6 3rd Qu.: 27.50 3rd Qu.: 12.500
## Max. :100.00 Max. :100.0 Max. :100.00 Max. :100.000
## NA's :807 NA's :807 NA's :807 NA's :807
## Production Drive Carpool Transit
## Min. : 0.00 Min. : 0.00 Min. : 0.000 Min. : 0.000
## 1st Qu.: 7.10 1st Qu.: 72.00 1st Qu.: 6.000 1st Qu.: 0.000
## Median : 11.80 Median : 79.70 Median : 8.800 Median : 1.100
## Mean : 12.86 Mean : 75.53 Mean : 9.627 Mean : 5.456
## 3rd Qu.: 17.40 3rd Qu.: 84.90 3rd Qu.: 12.300 3rd Qu.: 4.700
## Max. :100.00 Max. :100.00 Max. :100.000 Max. :100.000
## NA's :807 NA's :797 NA's :797 NA's :797
## Walk OtherTransp WorkAtHome MeanCommute
## Min. : 0.000 Min. : 0.000 Min. : 0.000 Min. : 1.20
## 1st Qu.: 0.400 1st Qu.: 0.400 1st Qu.: 1.800 1st Qu.:20.80
## Median : 1.400 Median : 1.100 Median : 3.500 Median :25.00
## Mean : 3.123 Mean : 1.892 Mean : 4.368 Mean :25.67
## 3rd Qu.: 3.500 3rd Qu.: 2.500 3rd Qu.: 5.900 3rd Qu.:29.80
## Max. :100.000 Max. :100.000 Max. :100.000 Max. :80.00
## NA's :797 NA's :797 NA's :797 NA's :949
## Employed PrivateWork PublicWork SelfEmployed
## Min. : 0 Min. : 0.00 Min. : 0.00 Min. : 0.000
## 1st Qu.: 1249 1st Qu.: 74.60 1st Qu.: 9.60 1st Qu.: 3.500
## Median : 1846 Median : 80.10 Median : 13.40 Median : 5.500
## Mean : 1984 Mean : 78.98 Mean : 14.62 Mean : 6.234
## 3rd Qu.: 2553 3rd Qu.: 84.60 3rd Qu.: 18.20 3rd Qu.: 8.100
## Max. :24075 Max. :100.00 Max. :100.00 Max. :100.000
## NA's :807 NA's :807 NA's :807
## FamilyWork Unemployment
## Min. : 0.0000 Min. : 0.000
## 1st Qu.: 0.0000 1st Qu.: 5.100
## Median : 0.0000 Median : 7.700
## Mean : 0.1698 Mean : 9.029
## 3rd Qu.: 0.0000 3rd Qu.: 11.400
## Max. :26.5000 Max. :100.000
## NA's :807 NA's :802
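The `NA's` rows in the summary above are easier to compare as a single vector. A minimal sketch using base R on a made-up data frame (`toy` is hypothetical, not the census file):

```r
toy <- data.frame(
  Income  = c(61838, NA, 44922, 54329),
  Poverty = c(8.1, 25.5, NA, NA),
  Men     = c(940, 1059, 1364, 2172)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Rows with no missing value in any column; these are the rows that
# lm() keeps under its default na.action (na.omit)
complete_rows <- sum(complete.cases(toy))
print(complete_rows)
```

On the real data the corresponding call would be `colSums(is.na(US_data))`.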
describe(US_data)
## vars n mean sd median
## CensusTract 1 74001 2.839113e+10 1.647593e+10 2.8047e+10
## State* 2 74001 2.534000e+01 1.510000e+01 2.5000e+01
## County* 3 74001 9.879300e+02 5.229400e+02 1.0340e+03
## TotalPop 4 74001 4.325590e+03 2.129310e+03 4.0630e+03
## Men 5 74001 2.127650e+03 1.072330e+03 1.9860e+03
## Women 6 74001 2.197940e+03 1.095730e+03 2.0660e+03
## Hispanic 7 73311 1.686000e+01 2.294000e+01 7.0000e+00
## White 8 73311 6.203000e+01 3.068000e+01 7.1400e+01
## Black 9 73311 1.327000e+01 2.176000e+01 3.7000e+00
## Native 10 73311 7.300000e-01 4.490000e+00 0.0000e+00
## Asian 11 73311 4.590000e+00 8.790000e+00 1.4000e+00
## Pacific 12 73311 1.500000e-01 1.040000e+00 0.0000e+00
## Citizen 13 74001 3.043080e+03 1.475490e+03 2.8630e+03
## Income 14 72901 5.722556e+04 2.866333e+04 5.1094e+04
## IncomeErr 15 72901 9.134490e+03 5.920340e+03 7.7320e+03
## IncomePerCap 16 73261 2.849123e+04 1.504707e+04 2.5344e+04
## IncomePerCapErr 17 73261 3.942910e+03 3.023030e+03 3.1270e+03
## Poverty 18 73166 1.696000e+01 1.320000e+01 1.3400e+01
## ChildPoverty 19 72883 2.249000e+01 1.919000e+01 1.7800e+01
## Professional 20 73194 3.480000e+01 1.501000e+01 3.2600e+01
## Service 21 73194 1.910000e+01 8.280000e+00 1.7900e+01
## Office 22 73194 2.395000e+01 5.960000e+00 2.3800e+01
## Construction 23 73194 9.290000e+00 6.020000e+00 8.4000e+00
## Production 24 73194 1.286000e+01 7.670000e+00 1.1800e+01
## Drive 25 73204 7.553000e+01 1.537000e+01 7.9700e+01
## Carpool 26 73204 9.630000e+00 5.370000e+00 8.8000e+00
## Transit 27 73204 5.460000e+00 1.172000e+01 1.1000e+00
## Walk 28 73204 3.120000e+00 5.880000e+00 1.4000e+00
## OtherTransp 29 73204 1.890000e+00 2.600000e+00 1.1000e+00
## WorkAtHome 30 73204 4.370000e+00 3.900000e+00 3.5000e+00
## MeanCommute 31 73052 2.567000e+01 6.960000e+00 2.5000e+01
## Employed 32 74001 1.983910e+03 1.073430e+03 1.8460e+03
## PrivateWork 33 73194 7.898000e+01 8.350000e+00 8.0100e+01
## PublicWork 34 73194 1.462000e+01 7.540000e+00 1.3400e+01
## SelfEmployed 35 73194 6.230000e+00 4.040000e+00 5.5000e+00
## FamilyWork 36 73194 1.700000e-01 4.600000e-01 0.0000e+00
## Unemployment 37 73199 9.030000e+00 5.960000e+00 7.7000e+00
## trimmed mad min max
## CensusTract 2.803846e+10 2.080416e+10 1.00102e+09 7.215375e+10
## State* 2.516000e+01 2.076000e+01 1.00000e+00 5.200000e+01
## County* 9.921900e+02 6.553100e+02 1.00000e+00 1.928000e+03
## TotalPop 4.165830e+03 1.866590e+03 0.00000e+00 5.381200e+04
## Men 2.041010e+03 9.236600e+02 0.00000e+00 2.796200e+04
## Women 2.117140e+03 9.622100e+02 0.00000e+00 2.725000e+04
## Hispanic 1.161000e+01 8.450000e+00 0.00000e+00 1.000000e+02
## White 6.496000e+01 3.039000e+01 0.00000e+00 1.000000e+02
## Black 7.770000e+00 5.340000e+00 0.00000e+00 1.000000e+02
## Native 1.700000e-01 0.000000e+00 0.00000e+00 1.000000e+02
## Asian 2.520000e+00 2.080000e+00 0.00000e+00 9.130000e+01
## Pacific 0.000000e+00 0.000000e+00 0.00000e+00 8.470000e+01
## Citizen 2.935270e+03 1.319510e+03 0.00000e+00 3.741600e+04
## Income 5.378624e+04 2.265116e+04 2.61100e+03 2.487500e+05
## IncomeErr 8.270770e+03 4.123110e+03 3.90000e+02 1.231160e+05
## IncomePerCap 2.649608e+04 1.054425e+04 1.28000e+02 2.542040e+05
## IncomePerCapErr 3.422590e+03 1.460360e+03 8.50000e+01 1.343800e+05
## Poverty 1.509000e+01 1.082000e+01 0.00000e+00 1.000000e+02
## ChildPoverty 2.013000e+01 1.838000e+01 0.00000e+00 1.000000e+02
## Professional 3.383000e+01 1.423000e+01 0.00000e+00 1.000000e+02
## Service 1.847000e+01 7.410000e+00 0.00000e+00 1.000000e+02
## Office 2.383000e+01 5.490000e+00 0.00000e+00 1.000000e+02
## Construction 8.740000e+00 5.490000e+00 0.00000e+00 1.000000e+02
## Production 1.224000e+01 7.560000e+00 0.00000e+00 1.000000e+02
## Drive 7.831000e+01 8.900000e+00 0.00000e+00 1.000000e+02
## Carpool 9.140000e+00 4.600000e+00 0.00000e+00 1.000000e+02
## Transit 2.470000e+00 1.630000e+00 0.00000e+00 1.000000e+02
## Walk 1.920000e+00 2.080000e+00 0.00000e+00 1.000000e+02
## OtherTransp 1.410000e+00 1.480000e+00 0.00000e+00 1.000000e+02
## WorkAtHome 3.860000e+00 2.820000e+00 0.00000e+00 1.000000e+02
## MeanCommute 2.528000e+01 6.520000e+00 1.20000e+00 8.000000e+01
## Employed 1.899360e+03 9.562800e+02 0.00000e+00 2.407500e+04
## PrivateWork 7.963000e+01 7.410000e+00 0.00000e+00 1.000000e+02
## PublicWork 1.387000e+01 6.230000e+00 0.00000e+00 1.000000e+02
## SelfEmployed 5.800000e+00 3.260000e+00 0.00000e+00 1.000000e+02
## FamilyWork 6.000000e-02 0.000000e+00 0.00000e+00 2.650000e+01
## Unemployment 8.210000e+00 4.450000e+00 0.00000e+00 1.000000e+02
## range skew kurtosis se
## CensusTract 7.115273e+10 0.13 -0.92 60566306.31
## State* 5.100000e+01 0.02 -1.33 0.06
## County* 1.927000e+03 -0.08 -1.04 1.92
## TotalPop 5.381200e+04 1.83 14.56 7.83
## Men 2.796200e+04 1.96 17.07 3.94
## Women 2.725000e+04 1.79 13.80 4.03
## Hispanic 1.000000e+02 2.00 3.39 0.08
## White 1.000000e+02 -0.67 -0.86 0.11
## Black 1.000000e+02 2.31 4.77 0.08
## Native 1.000000e+02 15.89 289.71 0.02
## Asian 9.130000e+01 3.95 19.99 0.03
## Pacific 8.470000e+01 26.92 1295.51 0.00
## Citizen 3.741600e+04 1.61 12.68 5.42
## Income 2.461390e+05 1.48 3.46 106.16
## IncomeErr 1.227260e+05 3.00 21.40 21.93
## IncomePerCap 2.540760e+05 2.33 10.80 55.59
## IncomePerCapErr 1.342950e+05 5.81 96.59 11.17
## Poverty 1.000000e+02 1.46 2.69 0.05
## ChildPoverty 1.000000e+02 1.02 0.63 0.07
## Professional 1.000000e+02 0.60 0.09 0.06
## Service 1.000000e+02 0.97 2.44 0.03
## Office 1.000000e+02 0.73 7.23 0.02
## Construction 1.000000e+02 1.49 6.34 0.02
## Production 1.000000e+02 0.97 2.55 0.03
## Drive 1.000000e+02 -2.19 5.67 0.06
## Carpool 1.000000e+02 1.77 12.50 0.02
## Transit 1.000000e+02 3.63 14.68 0.04
## Walk 1.000000e+02 5.61 46.31 0.02
## OtherTransp 1.000000e+02 5.12 70.50 0.01
## WorkAtHome 1.000000e+02 3.95 50.07 0.01
## MeanCommute 7.880000e+01 0.63 0.85 0.03
## Employed 2.407500e+04 1.63 10.65 3.95
## PrivateWork 1.000000e+02 -1.32 5.17 0.03
## PublicWork 1.000000e+02 1.75 7.57 0.03
## SelfEmployed 1.000000e+02 2.92 38.47 0.01
## FamilyWork 2.650000e+01 7.29 188.67 0.00
## Unemployment 1.000000e+02 2.16 10.67 0.02
US_data1<-US_data[c(1:50,1182:1231),]
US_data2<-US_data[(US_data$State=="Alabama") | (US_data$State=="Alaska"),]
US_data1$State_1<-droplevels(US_data1$State)
US_data1$County_1<-droplevels(US_data1$County)
US_data2$State_1<-droplevels(US_data2$State)
US_data2$County_1<-droplevels(US_data2$County)
dim(US_data1)
## [1] 100 39
View(US_data1)
dim(US_data2)
## [1] 1348 39
View(US_data2)
str(US_data2)
## 'data.frame': 1348 obs. of 39 variables:
## $ CensusTract : num 1e+09 1e+09 1e+09 1e+09 1e+09 ...
## $ State : Factor w/ 52 levels "Alabama","Alaska",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ County : Factor w/ 1928 levels "Añasco","Abbeville",..: 90 90 90 90 90 90 90 90 90 90 ...
## $ TotalPop : int 1948 2156 2968 4423 10763 3851 2761 3187 10915 5668 ...
## $ Men : int 940 1059 1364 2172 4922 1787 1210 1502 5486 2897 ...
## $ Women : int 1008 1097 1604 2251 5841 2064 1551 1685 5429 2771 ...
## $ Hispanic : num 0.9 0.8 0 10.5 0.7 13.1 3.8 1.3 1.4 0.4 ...
## $ White : num 87.4 40.4 74.5 82.8 68.5 72.9 74.5 84 89.5 85.5 ...
## $ Black : num 7.7 53.3 18.6 3.7 24.8 11.9 19.7 10.7 8.4 12.1 ...
## $ Native : num 0.3 0 0.5 1.6 0 0 0 3.1 0 0 ...
## $ Asian : num 0.6 2.3 1.4 0 3.8 0 0 0 0 0.3 ...
## $ Pacific : num 0 0 0.3 0 0 0 0 0 0 0 ...
## $ Citizen : int 1503 1662 2335 3306 7666 2642 2060 2391 7778 4217 ...
## $ Income : int 61838 32303 44922 54329 51965 63092 34821 73728 60063 41287 ...
## $ IncomeErr : int 11900 13538 5629 7003 6935 9585 7867 2447 8602 7857 ...
## $ IncomePerCap : int 25713 18021 20689 24125 27526 30480 20442 32813 24028 24710 ...
## $ IncomePerCapErr: int 4548 2474 2817 2870 2813 7550 3245 4669 2233 4149 ...
## $ Poverty : num 8.1 25.5 12.7 2.1 11.4 14.4 28.9 13 13.9 6.8 ...
## $ ChildPoverty : num 8.4 40.3 19.7 1.6 17.5 21.9 41.9 25.9 18.3 10 ...
## $ Professional : num 34.7 22.3 31.4 27 49.6 24.2 19.5 42.8 31.5 29.3 ...
## $ Service : num 17 24.7 24.9 20.8 14.2 17.5 29.6 10.7 17.5 13.7 ...
## $ Office : num 21.3 21.5 22.1 27 18.2 35.4 25.3 34.2 26.1 17.7 ...
## $ Construction : num 11.9 9.4 9.2 8.7 2.1 7.9 10.1 5.5 7.8 11 ...
## $ Production : num 15.2 22 12.4 16.4 15.8 14.9 15.5 6.8 17.1 28.3 ...
## $ Drive : num 90.2 86.3 94.8 86.6 88 82.7 92.4 84.3 90.1 88.7 ...
## $ Carpool : num 4.8 13.1 2.8 9.1 10.5 6.9 7.6 8.1 8.6 7.9 ...
## $ Transit : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Walk : num 0.5 0 0 0 0 0 0 0 0 0 ...
## $ OtherTransp : num 2.3 0.7 0 2.6 0.6 6 0 1.7 0 1.2 ...
## $ WorkAtHome : num 2.1 0 2.5 1.6 0.9 4.5 0 5.9 1.3 2.1 ...
## $ MeanCommute : num 25 23.4 19.6 25.3 24.8 19.8 20 24.3 29.4 32.9 ...
## $ Employed : int 943 753 1373 1782 5037 1560 1166 1502 4348 2485 ...
## $ PrivateWork : num 77.1 77 64.1 75.7 67.1 79.4 82 78.1 73.3 77.9 ...
## $ PublicWork : num 18.3 16.9 23.6 21.2 27.6 14.7 14.6 14.8 22.1 15.2 ...
## $ SelfEmployed : num 4.6 6.1 12.3 3.1 5.3 5.8 3.4 7.1 4.6 6.9 ...
## $ FamilyWork : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Unemployment : num 5.4 13.3 6.2 10.8 4.2 10.9 11.4 8.2 8.7 7.2 ...
## $ State_1 : Factor w/ 2 levels "Alabama","Alaska": 1 1 1 1 1 1 1 1 1 1 ...
## $ County_1 : Factor w/ 96 levels "Aleutians East Borough",..: 4 4 4 4 4 4 4 4 4 4 ...
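The output shows why `droplevels()` is applied: `State` still carries all 52 levels after subsetting, while `State_1` keeps only the two that actually occur. A small sketch of that behaviour on a made-up factor (the values are illustrative only):

```r
state <- factor(c("Alabama", "Alaska", "Texas", "Alabama"))

# Subsetting keeps unused levels, so "Texas" is still a level here
subset_state <- state[state != "Texas"]
print(nlevels(subset_state))

# droplevels() removes levels that no longer occur
clean_state <- droplevels(subset_state)
print(levels(clean_state))
```

Without this step, `table()` and `boxplot()` would show empty groups for every unused level.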
table(droplevels(US_data2$State))
##
## Alabama Alaska
## 1181 167
table(droplevels(US_data1$County))
##
## Aleutians East Borough Aleutians West Census Area
## 1 2
## Anchorage Municipality Autauga
## 47 12
## Baldwin Barbour
## 32 6
xtabs(~County_1+State_1,data = US_data1)
## State_1
## County_1 Alabama Alaska
## Aleutians East Borough 0 1
## Aleutians West Census Area 0 2
## Anchorage Municipality 0 47
## Autauga 12 0
## Baldwin 32 0
## Barbour 6 0
boxplot(Income~State_1,data = US_data2,
main="Boxplot of Income in 2 specific states in US",
xlab="States",ylab="Income")
boxplot(TotalPop~County_1,data = US_data1,
main="Boxplot of Population in 6 counties",
xlab="Counties",ylab="Population")
axis(side=1,at=2,labels = "Aleutians West")
hist(US_data$Employed,main="Histogram of No. of Employed people above 16 years in US",
xlab="No. of employed people per tract",ylab = "Count",xlim = c(0,10000),
col = "light blue")
hist(US_data$Drive,main="Histogram of percent of people who commute alone in US",
xlab="percent of people who commute alone per tract",ylab = "Count",col = "grey")
library(car)
##
## Attaching package: 'car'
## The following object is masked from 'package:psych':
##
## logit
scatterplot(Professional~White,data = US_data2,spread=FALSE,
smoother.args=list(lty=2),pch=1,
main="Plot between percent of White people and percent employed in Management, Business, Science and Arts")
scatterplot(Construction~Men,data = US_data2,spread=FALSE,
smoother.args=list(lty=2),pch=1,
main="Plot between no. of men and percent employed in Construction sector")
corr.test(US_data[,4:20])
## Call:corr.test(x = US_data[, 4:20])
## Correlation matrix
## TotalPop Men Women Hispanic White Black Native Asian
## TotalPop 1.00 0.98 0.98 0.11 -0.03 -0.11 -0.04 0.10
## Men 0.98 1.00 0.93 0.12 -0.02 -0.13 -0.03 0.10
## Women 0.98 0.93 1.00 0.10 -0.03 -0.09 -0.04 0.10
## Hispanic 0.11 0.12 0.10 1.00 -0.66 -0.12 -0.04 0.03
## White -0.03 -0.02 -0.03 -0.66 1.00 -0.58 -0.07 -0.25
## Black -0.11 -0.13 -0.09 -0.12 -0.58 1.00 -0.05 -0.11
## Native -0.04 -0.03 -0.04 -0.04 -0.07 -0.05 1.00 -0.04
## Asian 0.10 0.10 0.10 0.03 -0.25 -0.11 -0.04 1.00
## Pacific 0.02 0.03 0.02 0.02 -0.09 -0.04 0.01 0.16
## Citizen 0.94 0.92 0.92 -0.11 0.17 -0.13 -0.04 0.03
## Income 0.17 0.18 0.17 -0.23 0.31 -0.31 -0.07 0.28
## IncomeErr -0.01 0.00 -0.01 -0.10 0.10 -0.13 -0.06 0.22
## IncomePerCap 0.03 0.02 0.04 -0.31 0.38 -0.28 -0.07 0.20
## IncomePerCapErr -0.10 -0.10 -0.09 -0.16 0.17 -0.12 -0.05 0.13
## Poverty -0.15 -0.15 -0.15 0.34 -0.52 0.40 0.09 -0.12
## ChildPoverty -0.15 -0.15 -0.14 0.32 -0.50 0.41 0.07 -0.16
## Professional 0.08 0.07 0.08 -0.33 0.35 -0.25 -0.04 0.26
## Pacific Citizen Income IncomeErr IncomePerCap
## TotalPop 0.02 0.94 0.17 -0.01 0.03
## Men 0.03 0.92 0.18 0.00 0.02
## Women 0.02 0.92 0.17 -0.01 0.04
## Hispanic 0.02 -0.11 -0.23 -0.10 -0.31
## White -0.09 0.17 0.31 0.10 0.38
## Black -0.04 -0.13 -0.31 -0.13 -0.28
## Native 0.01 -0.04 -0.07 -0.06 -0.07
## Asian 0.16 0.03 0.28 0.22 0.20
## Pacific 1.00 0.00 0.01 0.01 -0.03
## Citizen 0.00 1.00 0.20 0.01 0.11
## Income 0.01 0.20 1.00 0.61 0.83
## IncomeErr 0.01 0.01 0.61 1.00 0.60
## IncomePerCap -0.03 0.11 0.83 0.60 1.00
## IncomePerCapErr -0.01 -0.05 0.50 0.52 0.77
## Poverty 0.01 -0.24 -0.70 -0.35 -0.61
## ChildPoverty 0.00 -0.24 -0.66 -0.35 -0.59
## Professional -0.03 0.18 0.73 0.49 0.80
## IncomePerCapErr Poverty ChildPoverty Professional
## TotalPop -0.10 -0.15 -0.15 0.08
## Men -0.10 -0.15 -0.15 0.07
## Women -0.09 -0.15 -0.14 0.08
## Hispanic -0.16 0.34 0.32 -0.33
## White 0.17 -0.52 -0.50 0.35
## Black -0.12 0.40 0.41 -0.25
## Native -0.05 0.09 0.07 -0.04
## Asian 0.13 -0.12 -0.16 0.26
## Pacific -0.01 0.01 0.00 -0.03
## Citizen -0.05 -0.24 -0.24 0.18
## Income 0.50 -0.70 -0.66 0.73
## IncomeErr 0.52 -0.35 -0.35 0.49
## IncomePerCap 0.77 -0.61 -0.59 0.80
## IncomePerCapErr 1.00 -0.29 -0.29 0.53
## Poverty -0.29 1.00 0.90 -0.54
## ChildPoverty -0.29 0.90 1.00 -0.57
## Professional 0.53 -0.54 -0.57 1.00
## Sample Size
## TotalPop Men Women Hispanic White Black Native Asian
## TotalPop 74001 74001 74001 73311 73311 73311 73311 73311
## Men 74001 74001 74001 73311 73311 73311 73311 73311
## Women 74001 74001 74001 73311 73311 73311 73311 73311
## Hispanic 73311 73311 73311 73311 73311 73311 73311 73311
## White 73311 73311 73311 73311 73311 73311 73311 73311
## Black 73311 73311 73311 73311 73311 73311 73311 73311
## Native 73311 73311 73311 73311 73311 73311 73311 73311
## Asian 73311 73311 73311 73311 73311 73311 73311 73311
## Pacific 73311 73311 73311 73311 73311 73311 73311 73311
## Citizen 74001 74001 74001 73311 73311 73311 73311 73311
## Income 72901 72901 72901 72901 72901 72901 72901 72901
## IncomeErr 72901 72901 72901 72901 72901 72901 72901 72901
## IncomePerCap 73261 73261 73261 73261 73261 73261 73261 73261
## IncomePerCapErr 73261 73261 73261 73261 73261 73261 73261 73261
## Poverty 73166 73166 73166 73166 73166 73166 73166 73166
## ChildPoverty 72883 72883 72883 72883 72883 72883 72883 72883
## Professional 73194 73194 73194 73194 73194 73194 73194 73194
## Pacific Citizen Income IncomeErr IncomePerCap
## TotalPop 73311 74001 72901 72901 73261
## Men 73311 74001 72901 72901 73261
## Women 73311 74001 72901 72901 73261
## Hispanic 73311 73311 72901 72901 73261
## White 73311 73311 72901 72901 73261
## Black 73311 73311 72901 72901 73261
## Native 73311 73311 72901 72901 73261
## Asian 73311 73311 72901 72901 73261
## Pacific 73311 73311 72901 72901 73261
## Citizen 73311 74001 72901 72901 73261
## Income 72901 72901 72901 72901 72901
## IncomeErr 72901 72901 72901 72901 72901
## IncomePerCap 73261 73261 72901 72901 73261
## IncomePerCapErr 73261 73261 72901 72901 73261
## Poverty 73166 73166 72901 72901 73120
## ChildPoverty 72883 72883 72748 72748 72874
## Professional 73194 73194 72899 72899 73156
## IncomePerCapErr Poverty ChildPoverty Professional
## TotalPop 73261 73166 72883 73194
## Men 73261 73166 72883 73194
## Women 73261 73166 72883 73194
## Hispanic 73261 73166 72883 73194
## White 73261 73166 72883 73194
## Black 73261 73166 72883 73194
## Native 73261 73166 72883 73194
## Asian 73261 73166 72883 73194
## Pacific 73261 73166 72883 73194
## Citizen 73261 73166 72883 73194
## Income 72901 72901 72748 72899
## IncomeErr 72901 72901 72748 72899
## IncomePerCap 73261 73120 72874 73156
## IncomePerCapErr 73261 73120 72874 73156
## Poverty 73120 73166 72883 73144
## ChildPoverty 72874 72883 72883 72880
## Professional 73156 73144 72880 73194
## Probability values (Entries above the diagonal are adjusted for multiple tests.)
## TotalPop Men Women Hispanic White Black Native Asian
## TotalPop 0.00 0.00 0.00 0 0 0 0.00 0
## Men 0.00 0.00 0.00 0 0 0 0.00 0
## Women 0.00 0.00 0.00 0 0 0 0.00 0
## Hispanic 0.00 0.00 0.00 0 0 0 0.00 0
## White 0.00 0.00 0.00 0 0 0 0.00 0
## Black 0.00 0.00 0.00 0 0 0 0.00 0
## Native 0.00 0.00 0.00 0 0 0 0.00 0
## Asian 0.00 0.00 0.00 0 0 0 0.00 0
## Pacific 0.00 0.00 0.00 0 0 0 0.02 0
## Citizen 0.00 0.00 0.00 0 0 0 0.00 0
## Income 0.00 0.00 0.00 0 0 0 0.00 0
## IncomeErr 0.06 0.23 0.01 0 0 0 0.00 0
## IncomePerCap 0.00 0.00 0.00 0 0 0 0.00 0
## IncomePerCapErr 0.00 0.00 0.00 0 0 0 0.00 0
## Poverty 0.00 0.00 0.00 0 0 0 0.00 0
## ChildPoverty 0.00 0.00 0.00 0 0 0 0.00 0
## Professional 0.00 0.00 0.00 0 0 0 0.00 0
## Pacific Citizen Income IncomeErr IncomePerCap
## TotalPop 0.00 0.00 0.00 0.35 0
## Men 0.00 0.00 0.00 0.69 0
## Women 0.00 0.00 0.00 0.10 0
## Hispanic 0.00 0.00 0.00 0.00 0
## White 0.00 0.00 0.00 0.00 0
## Black 0.00 0.00 0.00 0.00 0
## Native 0.15 0.00 0.00 0.00 0
## Asian 0.00 0.00 0.00 0.00 0
## Pacific 0.00 0.93 0.23 0.01 0
## Citizen 0.46 0.00 0.00 0.40 0
## Income 0.03 0.00 0.00 0.00 0
## IncomeErr 0.00 0.10 0.00 0.00 0
## IncomePerCap 0.00 0.00 0.00 0.00 0
## IncomePerCapErr 0.01 0.00 0.00 0.00 0
## Poverty 0.06 0.00 0.00 0.00 0
## ChildPoverty 0.54 0.00 0.00 0.00 0
## Professional 0.00 0.00 0.00 0.00 0
## IncomePerCapErr Poverty ChildPoverty Professional
## TotalPop 0.00 0.00 0.00 0
## Men 0.00 0.00 0.00 0
## Women 0.00 0.00 0.00 0
## Hispanic 0.00 0.00 0.00 0
## White 0.00 0.00 0.00 0
## Black 0.00 0.00 0.00 0
## Native 0.00 0.00 0.00 0
## Asian 0.00 0.00 0.00 0
## Pacific 0.05 0.35 0.93 0
## Citizen 0.00 0.00 0.00 0
## Income 0.00 0.00 0.00 0
## IncomeErr 0.00 0.00 0.00 0
## IncomePerCap 0.00 0.00 0.00 0
## IncomePerCapErr 0.00 0.00 0.00 0
## Poverty 0.00 0.00 0.00 0
## ChildPoverty 0.00 0.00 0.00 0
## Professional 0.00 0.00 0.00 0
##
## To see confidence intervals of the correlations, print with the short=FALSE option
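The note that entries above the diagonal are adjusted for multiple tests refers to a multiple-comparison correction. Assuming the default `adjust = "holm"` of `corr.test()` was in effect (the call above does not override it), base R's `p.adjust()` shows what that correction does to a few made-up p-values:

```r
# Hypothetical raw p-values from three correlation tests
raw_p <- c(0.01, 0.02, 0.03)

# Holm: multiply the smallest p by 3, the next by 2, the largest by 1,
# then enforce that adjusted values never decrease
holm_p <- p.adjust(raw_p, method = "holm")
print(holm_p)
```

This is why a few pairs, such as `Pacific` with `Citizen`, have different values above and below the diagonal of the probability table.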
library(corrgram)
## Warning: package 'corrgram' was built under R version 3.3.3
corrgram(US_data2[,4:20],order = TRUE,lower.panel = panel.shade,
upper.panel = panel.pie,text.panel = panel.txt,
main="Corrgram of numeric variables in the US Demographic Dataset")
scatterplotMatrix(formula=~Professional+Service+Office+Construction+Production,
cex=0.6,data = US_data2)
t.test(US_data2$Native,US_data2$Asian)
##
## Welch Two Sample t-test
##
## data: US_data2$Native and US_data2$Asian
## t = 2.9768, df = 1735, p-value = 0.002953
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## 0.2857514 1.3896023
## sample estimates:
## mean of x mean of y
## 2.435071 1.597394
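The p-value (0.003) is below 0.05, so the difference is statistically significant: across the Alabama and Alaska tracts, the mean percentage of Native residents (2.44) is higher than the mean percentage of Asian residents (1.60).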
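The Welch statistic in the output above is the difference in means divided by the unpooled standard error. The sketch below reproduces that calculation on simulated data (the vectors `x` and `y` are invented stand-ins, not the census columns) and compares it with `t.test()`.

```r
set.seed(1)
x <- rnorm(200, mean = 2.4, sd = 6)  # stand-in for one percentage column
y <- rnorm(200, mean = 1.6, sd = 3)  # stand-in for the other

# Welch t statistic computed by hand
se <- sqrt(var(x) / length(x) + var(y) / length(y))
t_manual <- (mean(x) - mean(y)) / se

res <- t.test(x, y)  # Welch is the default (var.equal = FALSE)
print(c(manual = t_manual, t.test = unname(res$statistic)))

# Decision rule: reject equal means at the 5% level when this is TRUE
print(res$p.value < 0.05)
```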
t.test(US_data2$Men,US_data2$Women)
##
## Welch Two Sample t-test
##
## data: US_data2$Men and US_data2$Women
## t = -2.0056, df = 2694, p-value = 0.045
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -166.502622 -1.878683
## sample estimates:
## mean of x mean of y
## 2021.701 2105.892
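The p-value (0.045) is just below 0.05, so at the 5% level the mean number of men per tract (about 2,022) differs significantly from the mean number of women (about 2,106) in the Alabama and Alaska tracts, although only marginally.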
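`Men` and `Women` are measured on the same tracts, so a paired test is arguably a closer fit than the two-sample test used above. This is a suggested cross-check rather than part of the original analysis, sketched on the first five tracts from the `str()` output:

```r
men   <- c(940, 1059, 1364, 2172, 4922)
women <- c(1008, 1097, 1604, 2251, 5841)

# Paired test: each tract contributes one Men - Women difference
paired <- t.test(men, women, paired = TRUE)

# Equivalent one-sample test on the differences
one_sample <- t.test(men - women, mu = 0)

print(paired$p.value)
print(one_sample$p.value)
```

On the subset used here the call would be `t.test(US_data2$Men, US_data2$Women, paired = TRUE)`.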
fit1<-lm(Income~Men+Women+Hispanic+White+Black+Native+Asian+Pacific+Poverty+ChildPoverty
+Professional+Service+Office+Construction+Production+WorkAtHome+Employed
+PrivateWork
+PublicWork+SelfEmployed+Unemployment,data = US_data)
summary(fit1)
##
## Call:
## lm(formula = Income ~ Men + Women + Hispanic + White + Black +
## Native + Asian + Pacific + Poverty + ChildPoverty + Professional +
## Service + Office + Construction + Production + WorkAtHome +
## Employed + PrivateWork + PublicWork + SelfEmployed + Unemployment,
## data = US_data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -97309 -8956 -1307 7053 154502
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6091.0429 91246.7609 -0.067 0.94678
## Men 3.3171 0.1721 19.275 < 2e-16 ***
## Women -2.0055 0.1884 -10.647 < 2e-16 ***
## Hispanic -7.6404 24.7683 -0.308 0.75772
## White -124.1370 25.0030 -4.965 6.89e-07 ***
## Black -71.1451 25.1202 -2.832 0.00462 **
## Native -68.6338 29.3203 -2.341 0.01924 *
## Asian 163.6730 27.3585 5.983 2.21e-09 ***
## Pacific 188.5138 71.4594 2.638 0.00834 **
## Poverty -1125.4385 11.1288 -101.129 < 2e-16 ***
## ChildPoverty 82.7290 7.2094 11.475 < 2e-16 ***
## Professional 1232.6952 902.9948 1.365 0.17222
## Service 212.0743 903.0404 0.235 0.81433
## Office 352.2388 903.0593 0.390 0.69650
## Construction 436.8530 903.0663 0.484 0.62857
## Production 270.1197 903.0637 0.299 0.76485
## WorkAtHome 730.3336 19.6336 37.198 < 2e-16 ***
## Employed 0.2020 0.1840 1.098 0.27227
## PrivateWork 230.3000 128.5837 1.791 0.07329 .
## PublicWork -23.2482 129.0127 -0.180 0.85700
## SelfEmployed -36.1729 131.0663 -0.276 0.78256
## Unemployment 195.1275 14.6693 13.302 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 15670 on 72725 degrees of freedom
## (1254 observations deleted due to missingness)
## Multiple R-squared: 0.7014, Adjusted R-squared: 0.7013
## F-statistic: 8135 on 21 and 72725 DF, p-value: < 2.2e-16
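The five occupation shares all have standard errors near 903, far larger than those of the other predictors. One plausible explanation, offered here as an assumption rather than something the output proves, is that the shares add up to roughly 100 in every tract, which makes them almost perfectly collinear with the intercept. The first three tracts from the `str()` output illustrate this:

```r
occupation <- data.frame(
  Professional = c(34.7, 22.3, 31.4),
  Service      = c(17.0, 24.7, 24.9),
  Office       = c(21.3, 21.5, 22.1),
  Construction = c(11.9, 9.4, 9.2),
  Production   = c(15.2, 22.0, 12.4)
)

# Each row should be close to 100 (up to rounding in the source data)
share_total <- rowSums(occupation)
print(share_total)
```

If the same pattern holds across the full data, dropping one of the five shares from the formula would make it the reference category and should shrink the standard errors of the remaining four.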