Background Information on the Dataset

In the wake of the Great Recession of 2009, there has been a good deal of focus on employment statistics, one of the most important metrics policymakers use to gauge the overall strength of the economy. In the United States, the government measures unemployment using the Current Population Survey (CPS), which collects demographic and employment information from a wide range of Americans each month. In this exercise, we will apply the topics reviewed in the lectures, as well as a few new techniques, to the September 2013 version of this rich, nationally representative dataset (available online).

The observations in the dataset represent people surveyed in the September 2013 CPS who actually completed a survey. While the full dataset has 385 variables, in this exercise we will use a more compact version of the dataset, CPSData.csv, which has the 14 variables listed in the str() output below.

R Exercises

Load the dataset from CPSData.csv into a data frame called CPS, and view the dataset with the summary() and str() commands.

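# Load knitr, which provides kable() for the tables below
library(knitr)

# Read in the datasets. stringsAsFactors = TRUE reproduces the Factor columns
# shown below on R 4.0 and later, where the default is FALSE.
CPS = read.csv("CPSData.csv", stringsAsFactors = TRUE)
MetroAreaMap = read.csv("MetroAreaCodes.csv", stringsAsFactors = TRUE)
CountryMap = read.csv("CountryCodes.csv", stringsAsFactors = TRUE)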

How many interviewees are in the dataset?

# Display the structure of the data frame
str(CPS)
## 'data.frame':    131302 obs. of  14 variables:
##  $ PeopleInHousehold : int  1 3 3 3 3 3 3 2 2 2 ...
##  $ Region            : Factor w/ 4 levels "Midwest","Northeast",..: 3 3 3 3 3 3 3 3 3 3 ...
##  $ State             : Factor w/ 51 levels "Alabama","Alaska",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ MetroAreaCode     : int  26620 13820 13820 13820 26620 26620 26620 33660 33660 26620 ...
##  $ Age               : int  85 21 37 18 52 24 26 71 43 52 ...
##  $ Married           : Factor w/ 5 levels "Divorced","Married",..: 5 3 3 3 5 3 3 1 1 3 ...
##  $ Sex               : Factor w/ 2 levels "Female","Male": 1 2 1 2 1 2 2 1 2 2 ...
##  $ Education         : Factor w/ 8 levels "Associate degree",..: 1 4 4 6 1 2 4 4 4 2 ...
##  $ Race              : Factor w/ 6 levels "American Indian",..: 6 3 3 3 6 6 6 6 6 6 ...
##  $ Hispanic          : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ CountryOfBirthCode: int  57 57 57 57 57 57 57 57 57 57 ...
##  $ Citizenship       : Factor w/ 3 levels "Citizen, Native",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ EmploymentStatus  : Factor w/ 5 levels "Disabled","Employed",..: 4 5 1 3 2 2 2 2 3 2 ...
##  $ Industry          : Factor w/ 14 levels "Agriculture, forestry, fishing, and hunting",..: NA 11 NA NA 11 4 14 4 NA 12 ...

From str(CPS), we can read that there are 131302 interviewees.

Among the interviewees with a value reported for the Industry variable, what is the most common industry of employment?

# Output a summary
z = summary(CPS)
kable(z)
PeopleInHousehold Region State MetroAreaCode Age Married Sex Education Race Hispanic CountryOfBirthCode Citizenship EmploymentStatus Industry
Min. : 1.000 Midwest :30684 California :11570 Min. :10420 Min. : 0.00 Divorced :11151 Female:67481 High school :30906 American Indian : 1433 Min. :0.0000 Min. : 57.00 Citizen, Native :116639 Disabled : 5712 Educational and health services :15017
1st Qu.: 2.000 Northeast:25939 Texas : 7077 1st Qu.:21780 1st Qu.:19.00 Married :55509 Male :63821 Bachelor’s degree :19443 Asian : 6520 1st Qu.:0.0000 1st Qu.: 57.00 Citizen, Naturalized: 7073 Employed :61733 Trade : 8933
Median : 3.000 South :41502 New York : 5595 Median :34740 Median :39.00 Never Married:30772 NA Some college, no degree:18863 Black : 13913 Median :0.0000 Median : 57.00 Non-Citizen : 7590 Not in Labor Force:15246 Professional and business services: 7519
Mean : 3.284 West :33177 Florida : 5149 Mean :35075 Mean :38.83 Separated : 2027 NA No high school diploma :16095 Multiracial : 2897 Mean :0.1393 Mean : 82.68 NA Retired :18619 Manufacturing : 6791
3rd Qu.: 4.000 NA Pennsylvania: 3930 3rd Qu.:41860 3rd Qu.:57.00 Widowed : 6505 NA Associate degree : 9913 Pacific Islander: 618 3rd Qu.:0.0000 3rd Qu.: 57.00 NA Unemployed : 4203 Leisure and hospitality : 6364
Max. :15.000 NA Illinois : 3912 Max. :79600 Max. :85.00 NA’s :25338 NA (Other) :10744 White :105921 Max. :1.0000 Max. :555.00 NA NA’s :25789 (Other) :21618
NA NA (Other) :94069 NA’s :34238 NA NA NA NA’s :25338 NA NA NA NA NA NA’s :65060

# Tabulate the number of interviewees in each industry
z = table(CPS$Industry)
kable(z)
Var1 Freq
Agriculture, forestry, fishing, and hunting 1307
Armed forces 29
Construction 4387
Educational and health services 15017
Financial 4347
Information 1328
Leisure and hospitality 6364
Manufacturing 6791
Mining 550
Other services 3224
Professional and business services 7519
Public administration 3186
Trade 8933
Transportation and utilities 3260

The output of summary(CPS) orders the levels of a factor variable like Industry from largest to smallest, so we can see that “Educational and health services” is the most common Industry. table(CPS$Industry) would have provided the breakdown across all industries.
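
Rather than reading the top entry off a printed table, the most common level can also be extracted directly. The following is a minimal sketch on a toy factor; the `industry` vector and its values are hypothetical stand-ins for CPS$Industry, not taken from the dataset.

```r
# Toy stand-in for CPS$Industry (hypothetical values; NA marks no reported industry)
industry <- factor(c("Trade", "Construction", "Trade", NA, "Mining", "Trade", NA))

# table() drops NA by default, so only reported industries are counted
counts <- table(industry)

# Sort from most to least common and take the name of the first entry
most_common <- names(sort(counts, decreasing = TRUE))[1]
print(most_common)
```

On the real data, the same pattern would be applied to table(CPS$Industry).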

Which state has the fewest interviewees?

# Sorts the tabulation
z = sort(table(CPS$State))
kable(z)
Var1 Freq
New Mexico 1102
Montana 1214
Mississippi 1230
Alabama 1376
West Virginia 1409
Arkansas 1421
Louisiana 1450
Idaho 1518
Oklahoma 1523
Arizona 1528
Alaska 1590
Wyoming 1624
North Dakota 1645
South Carolina 1658
Tennessee 1784
District of Columbia 1791
Kentucky 1841
Utah 1842
Nevada 1856
Vermont 1890
Kansas 1935
Oregon 1943
Nebraska 1949
Massachusetts 1987
South Dakota 2000
Indiana 2004
Hawaii 2099
Missouri 2145
Rhode Island 2209
Delaware 2214
Maine 2263
Washington 2366
Iowa 2528
New Jersey 2567
North Carolina 2619
New Hampshire 2662
Wisconsin 2686
Georgia 2807
Connecticut 2836
Colorado 2925
Virginia 2953
Michigan 3063
Minnesota 3139
Maryland 3200
Ohio 3678
Illinois 3912
Pennsylvania 3930
Florida 5149
New York 5595
Texas 7077
California 11570

New Mexico, with 1102 interviewees.

Which state has the largest number of interviewees?

# Sorts the tabulation
z = sort(table(CPS$State))
kable(z)
Var1 Freq
New Mexico 1102
Montana 1214
Mississippi 1230
Alabama 1376
West Virginia 1409
Arkansas 1421
Louisiana 1450
Idaho 1518
Oklahoma 1523
Arizona 1528
Alaska 1590
Wyoming 1624
North Dakota 1645
South Carolina 1658
Tennessee 1784
District of Columbia 1791
Kentucky 1841
Utah 1842
Nevada 1856
Vermont 1890
Kansas 1935
Oregon 1943
Nebraska 1949
Massachusetts 1987
South Dakota 2000
Indiana 2004
Hawaii 2099
Missouri 2145
Rhode Island 2209
Delaware 2214
Maine 2263
Washington 2366
Iowa 2528
New Jersey 2567
North Carolina 2619
New Hampshire 2662
Wisconsin 2686
Georgia 2807
Connecticut 2836
Colorado 2925
Virginia 2953
Michigan 3063
Minnesota 3139
Maryland 3200
Ohio 3678
Illinois 3912
Pennsylvania 3930
Florida 5149
New York 5595
Texas 7077
California 11570

California, with 11570 interviewees.

What proportion of interviewees are citizens of the United States?

# Calculate the proportion
m = table(CPS$Citizenship)
kable(m)
Var1 Freq
Citizen, Native 116639
Citizen, Naturalized 7073
Non-Citizen 7590

(m[1]+m[2])/(m[1]+m[2]+m[3])
## Citizen, Native 
##       0.9421943

From table(CPS$Citizenship), we see that 123,712 of the 131,302 interviewees are citizens of the United States (either native or naturalized). This is a proportion of 123712/131302=0.942.
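
The arithmetic can be reproduced without indexing the table by position. This sketch rebuilds the three counts from the table output above as a named vector (the vector `m` here is typed in by hand, not read from the CSV) and selects categories by name.

```r
# Counts copied from the table(CPS$Citizenship) output above
m <- c("Citizen, Native" = 116639,
       "Citizen, Naturalized" = 7073,
       "Non-Citizen" = 7590)

# Share of each category; the three shares sum to 1
shares <- prop.table(m)

# Citizens are everyone who is not in the "Non-Citizen" category
citizen_share <- sum(shares[names(shares) != "Non-Citizen"])
print(round(citizen_share, 3))
```

Selecting by name avoids relying on the order of the factor levels. On the full data frame, mean(CPS$Citizenship != "Non-Citizen") should give the same proportion, since Citizenship has no missing values in the summary output.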

For which races are there at least 250 interviewees of Hispanic ethnicity in the CPS dataset?

# Cross-tabulate Race and Hispanic, flagging cells with at least 250 interviewees
z = table(CPS$Race, CPS$Hispanic) >=250
kable(z)
0 1
American Indian TRUE TRUE
Asian TRUE FALSE
Black TRUE TRUE
Multiracial TRUE TRUE
Pacific Islander TRUE FALSE
White TRUE TRUE

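The breakdown of race and Hispanic ethnicity can be obtained with table(CPS$Race, CPS$Hispanic). Comparing each count against 250, the column for Hispanic = 1 is TRUE for American Indian, Black, Multiracial, and White.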

Which variables have at least one interviewee with a missing (NA) value?

# Outputs a summary
z = summary(CPS)
kable(z)
PeopleInHousehold Region State MetroAreaCode Age Married Sex Education Race Hispanic CountryOfBirthCode Citizenship EmploymentStatus Industry
Min. : 1.000 Midwest :30684 California :11570 Min. :10420 Min. : 0.00 Divorced :11151 Female:67481 High school :30906 American Indian : 1433 Min. :0.0000 Min. : 57.00 Citizen, Native :116639 Disabled : 5712 Educational and health services :15017
1st Qu.: 2.000 Northeast:25939 Texas : 7077 1st Qu.:21780 1st Qu.:19.00 Married :55509 Male :63821 Bachelor’s degree :19443 Asian : 6520 1st Qu.:0.0000 1st Qu.: 57.00 Citizen, Naturalized: 7073 Employed :61733 Trade : 8933
Median : 3.000 South :41502 New York : 5595 Median :34740 Median :39.00 Never Married:30772 NA Some college, no degree:18863 Black : 13913 Median :0.0000 Median : 57.00 Non-Citizen : 7590 Not in Labor Force:15246 Professional and business services: 7519
Mean : 3.284 West :33177 Florida : 5149 Mean :35075 Mean :38.83 Separated : 2027 NA No high school diploma :16095 Multiracial : 2897 Mean :0.1393 Mean : 82.68 NA Retired :18619 Manufacturing : 6791
3rd Qu.: 4.000 NA Pennsylvania: 3930 3rd Qu.:41860 3rd Qu.:57.00 Widowed : 6505 NA Associate degree : 9913 Pacific Islander: 618 3rd Qu.:0.0000 3rd Qu.: 57.00 NA Unemployed : 4203 Leisure and hospitality : 6364
Max. :15.000 NA Illinois : 3912 Max. :79600 Max. :85.00 NA’s :25338 NA (Other) :10744 White :105921 Max. :1.0000 Max. :555.00 NA NA’s :25789 (Other) :21618
NA NA (Other) :94069 NA’s :34238 NA NA NA NA’s :25338 NA NA NA NA NA NA’s :65060

This can be read from the output of summary(CPS): MetroAreaCode, Married, Education, EmploymentStatus, and Industry each report a count of NA values.
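
The same check can be done programmatically instead of scanning the summary. A minimal sketch, assuming a toy data frame `toy` with hypothetical values in place of CPS:

```r
# Toy data frame standing in for CPS (hypothetical values)
toy <- data.frame(
  Age     = c(3, 40, 67),
  Married = c(NA, "Married", "Widowed"),
  Region  = c("South", "West", "South")
)

# is.na() gives a logical matrix; colSums() counts the TRUE values per column
na_counts <- colSums(is.na(toy))

# Names of the columns with at least one missing value
cols_with_na <- names(na_counts)[na_counts > 0]
print(cols_with_na)
```

Applied to the real data, colSums(is.na(CPS)) would also report how many values are missing in each variable.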

Often when evaluating a new dataset, we try to identify whether there is a pattern in the missing values. Here we check whether a missing Married value is related to Region, Sex, Age, or Citizenship.

# Cross-tabulate each variable against whether Married is missing
z = table(CPS$Region, is.na(CPS$Married))
kable(z)
FALSE TRUE
Midwest 24609 6075
Northeast 21432 4507
South 33535 7967
West 26388 6789

z = table(CPS$Sex, is.na(CPS$Married))
kable(z)
FALSE TRUE
Female 55264 12217
Male 50700 13121

z = table(CPS$Age, is.na(CPS$Married))
kable(z)
FALSE TRUE
0 0 1283
1 0 1559
2 0 1574
3 0 1693
4 0 1695
5 0 1795
6 0 1721
7 0 1681
8 0 1729
9 0 1748
10 0 1750
11 0 1721
12 0 1797
13 0 1802
14 0 1790
15 1795 0
16 1751 0
17 1764 0
18 1596 0
19 1517 0
20 1398 0
21 1525 0
22 1536 0
23 1638 0
24 1627 0
25 1604 0
26 1643 0
27 1657 0
28 1736 0
29 1645 0
30 1854 0
31 1762 0
32 1790 0
33 1804 0
34 1653 0
35 1716 0
36 1663 0
37 1531 0
38 1530 0
39 1542 0
40 1571 0
41 1673 0
42 1711 0
43 1819 0
44 1764 0
45 1749 0
46 1665 0
47 1647 0
48 1791 0
49 1989 0
50 1966 0
51 1931 0
52 1935 0
53 1994 0
54 1912 0
55 1895 0
56 1935 0
57 1827 0
58 1874 0
59 1758 0
60 1746 0
61 1735 0
62 1595 0
63 1596 0
64 1519 0
65 1569 0
66 1577 0
67 1227 0
68 1130 0
69 1062 0
70 1195 0
71 1031 0
72 941 0
73 896 0
74 842 0
75 763 0
76 729 0
77 698 0
78 659 0
79 661 0
80 2664 0
85 2446 0

z = table(CPS$Citizenship, is.na(CPS$Married))
kable(z)
FALSE TRUE
Citizen, Native 91956 24683
Citizen, Naturalized 6910 163
Non-Citizen 7098 492

For each possible value of Region, Sex, and Citizenship, there are interviewees with both missing and non-missing Married values. However, Married is missing for all interviewees aged 0-14 and is present for all interviewees aged 15 and older. This is because the CPS does not ask about marital status for interviewees 14 and younger.
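
With 80-odd distinct ages, the cross-tabulation is long to scan by eye. One compact alternative is to compare the age ranges of the two groups. This is a sketch on a toy data frame with hypothetical values, not the CPS data itself.

```r
# Toy data frame standing in for CPS (hypothetical values)
toy <- data.frame(
  Age     = c(2, 14, 15, 30, 9, 52),
  Married = c(NA, NA, "Never Married", "Married", NA, "Divorced")
)

# Age range among rows where Married is missing, and where it is present
age_range_missing <- range(toy$Age[is.na(toy$Married)])
age_range_present <- range(toy$Age[!is.na(toy$Married)])

# If the ranges do not overlap, age alone separates missing from present
no_overlap <- age_range_missing[2] < age_range_present[1]
print(age_range_missing)
print(age_range_present)
print(no_overlap)
```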

How many states had all interviewees living in a non-metropolitan area (that is, every interviewee has a missing MetroAreaCode value)?

# Cross-tabulate State against whether MetroAreaCode is missing
z = table(CPS$State, is.na(CPS$MetroAreaCode))
kable(z)
FALSE TRUE
Alabama 1020 356
Alaska 0 1590
Arizona 1327 201
Arkansas 724 697
California 11333 237
Colorado 2545 380
Connecticut 2593 243
Delaware 1696 518
District of Columbia 1791 0
Florida 4947 202
Georgia 2250 557
Hawaii 1576 523
Idaho 761 757
Illinois 3473 439
Indiana 1420 584
Iowa 1297 1231
Kansas 1234 701
Kentucky 908 933
Louisiana 1216 234
Maine 909 1354
Maryland 2978 222
Massachusetts 1858 129
Michigan 2517 546
Minnesota 2150 989
Mississippi 376 854
Missouri 1440 705
Montana 199 1015
Nebraska 816 1133
Nevada 1609 247
New Hampshire 1148 1514
New Jersey 2567 0
New Mexico 832 270
New York 5144 451
North Carolina 1642 977
North Dakota 432 1213
Ohio 2754 924
Oklahoma 1024 499
Oregon 1519 424
Pennsylvania 3245 685
Rhode Island 2209 0
South Carolina 1139 519
South Dakota 595 1405
Tennessee 1149 635
Texas 6060 1017
Utah 1455 387
Vermont 657 1233
Virginia 2367 586
Washington 1937 429
West Virginia 344 1065
Wisconsin 1882 804
Wyoming 0 1624

2 states: Alaska and Wyoming.
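
Instead of looking for zeros in the table, the states can be picked out with tapply(). The sketch below uses a toy data frame; the state rows and the codes are hypothetical placeholders.

```r
# Toy data frame standing in for CPS (hypothetical rows and codes)
toy <- data.frame(
  State         = c("Alaska", "Alaska", "Ohio", "Ohio", "Rhode Island"),
  MetroAreaCode = c(NA, NA, 11111, NA, 22222)
)

# For each state, is every MetroAreaCode missing?
all_non_metro <- tapply(is.na(toy$MetroAreaCode), toy$State, all)

# Count the states for which the answer is TRUE, and list them
n_all_non_metro <- sum(all_non_metro)
print(names(all_non_metro)[all_non_metro])
print(n_all_non_metro)
```

Replacing `all` with `function(x) !any(x)` would instead flag states in which no interviewee has a missing code, that is, states that are entirely metropolitan.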

How many states had all interviewees living in a metropolitan area?

# Tabulates the number of observations not living in a metro area
z = table(CPS$State, is.na(CPS$MetroAreaCode))
kable(z)
FALSE TRUE
Alabama 1020 356
Alaska 0 1590
Arizona 1327 201
Arkansas 724 697
California 11333 237
Colorado 2545 380
Connecticut 2593 243
Delaware 1696 518
District of Columbia 1791 0
Florida 4947 202
Georgia 2250 557
Hawaii 1576 523
Idaho 761 757
Illinois 3473 439
Indiana 1420 584
Iowa 1297 1231
Kansas 1234 701
Kentucky 908 933
Louisiana 1216 234
Maine 909 1354
Maryland 2978 222
Massachusetts 1858 129
Michigan 2517 546
Minnesota 2150 989
Mississippi 376 854
Missouri 1440 705
Montana 199 1015
Nebraska 816 1133
Nevada 1609 247
New Hampshire 1148 1514
New Jersey 2567 0
New Mexico 832 270
New York 5144 451
North Carolina 1642 977
North Dakota 432 1213
Ohio 2754 924
Oklahoma 1024 499
Oregon 1519 424
Pennsylvania 3245 685
Rhode Island 2209 0
South Carolina 1139 519
South Dakota 595 1405
Tennessee 1149 635
Texas 6060 1017
Utah 1455 387
Vermont 657 1233
Virginia 2367 586
Washington 1937 429
West Virginia 344 1065
Wisconsin 1882 804
Wyoming 0 1624

3 states: District of Columbia, New Jersey, and Rhode Island (the State variable treats the District of Columbia as a state).

Which region of the United States has the largest proportion of interviewees living in a non-metropolitan area?

# Calculate the row proportions (z); the table printed below shows the counts (m)
m = table(CPS$Region, is.na(CPS$MetroAreaCode))
z = prop.table(m,1)
kable(m)
FALSE TRUE
Midwest 20010 10674
Northeast 20330 5609
South 31631 9871
West 25093 8084

Explanation: The table above shows counts; prop.table(m, 1) converts each row to proportions. The proportion of interviewees living in a non-metropolitan area is 34.8% in the Midwest, 21.6% in the Northeast, 23.8% in the South, and 24.4% in the West, so the Midwest has the largest proportion.
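
The percentages can be checked from the counts alone. This sketch re-enters the region counts from the table above as a matrix (typed in by hand rather than computed from the CSV) and applies prop.table() with margin 1.

```r
# Counts copied from table(CPS$Region, is.na(CPS$MetroAreaCode)) above
m <- matrix(
  c(20010, 10674,
    20330,  5609,
    31631,  9871,
    25093,  8084),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("Midwest", "Northeast", "South", "West"),
                  c("FALSE", "TRUE"))
)

# margin = 1 divides each cell by its row total
p <- prop.table(m, 1)

# Column "TRUE" holds the non-metropolitan share for each region
non_metro <- p[, "TRUE"]
print(round(non_metro, 3))
print(names(which.max(non_metro)))
```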

Which state has a proportion of interviewees living in a non-metropolitan area closest to 30%?

# Proportion of non-metropolitan interviewees by state, sorted ascending
z = sort(tapply(is.na(CPS$MetroAreaCode), CPS$State, mean))
kable(z)
x
District of Columbia 0.00000000
New Jersey 0.00000000
Rhode Island 0.00000000
California 0.02048401
Florida 0.03923092
Massachusetts 0.06492199
Maryland 0.06937500
New York 0.08060769
Connecticut 0.08568406
Illinois 0.11221881
Colorado 0.12991453
Arizona 0.13154450
Nevada 0.13308190
Texas 0.14370496
Louisiana 0.16137931
Pennsylvania 0.17430025
Michigan 0.17825661
Washington 0.18131868
Georgia 0.19843249
Virginia 0.19844226
Utah 0.21009772
Oregon 0.21821925
Delaware 0.23396567
New Mexico 0.24500907
Hawaii 0.24916627
Ohio 0.25122349
Alabama 0.25872093
Indiana 0.29141717
Wisconsin 0.29932986
South Carolina 0.31302774
Minnesota 0.31506849
Oklahoma 0.32764281
Missouri 0.32867133
Tennessee 0.35594170
Kansas 0.36227390
North Carolina 0.37304315
Iowa 0.48694620
Arkansas 0.49049965
Idaho 0.49868248
Kentucky 0.50678979
New Hampshire 0.56874530
Nebraska 0.58132376
Maine 0.59832081
Vermont 0.65238095
Mississippi 0.69430894
South Dakota 0.70250000
North Dakota 0.73738602
West Virginia 0.75585522
Montana 0.83607908
Alaska 1.00000000
Wyoming 1.00000000

Wisconsin.
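
The closest value to a target can be found by minimizing the absolute difference. The sketch below uses four of the proportions from the sorted output above, copied by hand; the variable names are illustrative.

```r
# A few proportions copied from the sorted output above
p <- c(Indiana = 0.29141717, Wisconsin = 0.29932986,
       "South Carolina" = 0.31302774, Minnesota = 0.31506849)

# Distance of each proportion from the 30% target
gap <- abs(p - 0.30)

# which.min() returns the named position of the smallest gap
closest <- names(which.min(gap))
print(closest)
```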

Which state has the largest proportion of non-metropolitan interviewees, ignoring states where all interviewees were non-metropolitan?

# Proportion of non-metropolitan interviewees by state, sorted ascending
z = sort(tapply(is.na(CPS$MetroAreaCode), CPS$State, mean))
kable(z)
x
District of Columbia 0.00000000
New Jersey 0.00000000
Rhode Island 0.00000000
California 0.02048401
Florida 0.03923092
Massachusetts 0.06492199
Maryland 0.06937500
New York 0.08060769
Connecticut 0.08568406
Illinois 0.11221881
Colorado 0.12991453
Arizona 0.13154450
Nevada 0.13308190
Texas 0.14370496
Louisiana 0.16137931
Pennsylvania 0.17430025
Michigan 0.17825661
Washington 0.18131868
Georgia 0.19843249
Virginia 0.19844226
Utah 0.21009772
Oregon 0.21821925
Delaware 0.23396567
New Mexico 0.24500907
Hawaii 0.24916627
Ohio 0.25122349
Alabama 0.25872093
Indiana 0.29141717
Wisconsin 0.29932986
South Carolina 0.31302774
Minnesota 0.31506849
Oklahoma 0.32764281
Missouri 0.32867133
Tennessee 0.35594170
Kansas 0.36227390
North Carolina 0.37304315
Iowa 0.48694620
Arkansas 0.49049965
Idaho 0.49868248
Kentucky 0.50678979
New Hampshire 0.56874530
Nebraska 0.58132376
Maine 0.59832081
Vermont 0.65238095
Mississippi 0.69430894
South Dakota 0.70250000
North Dakota 0.73738602
West Virginia 0.75585522
Montana 0.83607908
Alaska 1.00000000
Wyoming 1.00000000

Montana, where 83.6% of interviewees are non-metropolitan.

Integrating Metropolitan Area Data

Codes like MetroAreaCode and CountryOfBirthCode are a compact way to encode factor variables with text as their possible values, and they are therefore quite common in survey datasets. In fact, all but one of the variables in this dataset were actually stored by a numeric code in the original CPS datafile.

When analyzing a variable stored by a numeric code, we will often want to convert it into the values the codes represent. To do this, we will use a dictionary, which maps the code to the actual value of the variable. We have provided dictionaries MetroAreaCodes.csv and CountryCodes.csv, which respectively map MetroAreaCode and CountryOfBirthCode into their true values. Read these two dictionaries into data frames MetroAreaMap and CountryMap.

How many observations (codes for metropolitan areas) are there in MetroAreaMap?

# Display the structure of the data frame
str(MetroAreaMap)
## 'data.frame':    271 obs. of  2 variables:
##  $ Code     : int  460 3000 3160 3610 3720 6450 10420 10500 10580 10740 ...
##  $ MetroArea: Factor w/ 271 levels "Akron, OH","Albany-Schenectady-Troy, NY",..: 12 92 97 117 122 195 1 3 2 4 ...

271 observations.

How many observations (codes for countries) are there in CountryMap?

str(CountryMap)
## 'data.frame':    149 obs. of  2 variables:
##  $ Code   : int  57 66 73 78 96 100 102 103 104 105 ...
##  $ Country: Factor w/ 149 levels "Afghanistan",..: 139 57 105 135 97 3 11 18 24 37 ...

149 observations.

To merge in the metropolitan areas, we want to connect the field MetroAreaCode from the CPS data frame with the field Code in MetroAreaMap. The following command merges the two data frames on these columns, overwriting the CPS data frame with the result:

# Merge data
CPS = merge(CPS, MetroAreaMap, by.x="MetroAreaCode", by.y="Code", all.x=TRUE)
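
The effect of all.x=TRUE is easiest to see on a small case. The following sketch uses two toy data frames, `people` and `lookup`, with hypothetical codes and area names standing in for CPS and MetroAreaMap.

```r
# Toy stand-ins (hypothetical codes and names) for CPS and MetroAreaMap
people <- data.frame(Id = 1:4, MetroAreaCode = c(200, NA, 100, 300))
lookup <- data.frame(Code = c(100, 200), MetroArea = c("Area A", "Area B"))

# all.x = TRUE keeps every row of `people`, whether or not its code is in `lookup`
merged <- merge(people, lookup,
                by.x = "MetroAreaCode", by.y = "Code", all.x = TRUE)

print(merged)

# Rows whose code had no match (including the NA code) get NA for MetroArea
n_unmatched <- sum(is.na(merged$MetroArea))
print(n_unmatched)
```

Without all.x = TRUE, merge() would drop the unmatched rows, which in the CPS case would discard the interviewees with no metropolitan area code. Note that merge() moves the key column to the front and may reorder the rows.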

What is the name of the variable that was added to the data frame by the merge() operation?

# Output a summary
z = summary(CPS)
kable(z)
MetroAreaCode PeopleInHousehold Region State Age Married Sex Education Race Hispanic CountryOfBirthCode Citizenship EmploymentStatus Industry MetroArea
Min. :10420 Min. : 1.000 Midwest :30684 California :11570 Min. : 0.00 Divorced :11151 Female:67481 High school :30906 American Indian : 1433 Min. :0.0000 Min. : 57.00 Citizen, Native :116639 Disabled : 5712 Educational and health services :15017 New York-Northern New Jersey-Long Island, NY-NJ-PA: 5409
1st Qu.:21780 1st Qu.: 2.000 Northeast:25939 Texas : 7077 1st Qu.:19.00 Married :55509 Male :63821 Bachelor’s degree :19443 Asian : 6520 1st Qu.:0.0000 1st Qu.: 57.00 Citizen, Naturalized: 7073 Employed :61733 Trade : 8933 Washington-Arlington-Alexandria, DC-VA-MD-WV : 4177
Median :34740 Median : 3.000 South :41502 New York : 5595 Median :39.00 Never Married:30772 NA Some college, no degree:18863 Black : 13913 Median :0.0000 Median : 57.00 Non-Citizen : 7590 Not in Labor Force:15246 Professional and business services: 7519 Los Angeles-Long Beach-Santa Ana, CA : 4102
Mean :35075 Mean : 3.284 West :33177 Florida : 5149 Mean :38.83 Separated : 2027 NA No high school diploma :16095 Multiracial : 2897 Mean :0.1393 Mean : 82.68 NA Retired :18619 Manufacturing : 6791 Philadelphia-Camden-Wilmington, PA-NJ-DE : 2855
3rd Qu.:41860 3rd Qu.: 4.000 NA Pennsylvania: 3930 3rd Qu.:57.00 Widowed : 6505 NA Associate degree : 9913 Pacific Islander: 618 3rd Qu.:0.0000 3rd Qu.: 57.00 NA Unemployed : 4203 Leisure and hospitality : 6364 Chicago-Naperville-Joliet, IN-IN-WI : 2772
Max. :79600 Max. :15.000 NA Illinois : 3912 Max. :85.00 NA’s :25338 NA (Other) :10744 White :105921 Max. :1.0000 Max. :555.00 NA NA’s :25789 (Other) :21618 (Other) :77749
NA’s :34238 NA NA (Other) :94069 NA NA NA NA’s :25338 NA NA NA NA NA NA’s :65060 NA’s :34238

MetroArea

How many interviewees have a missing value for the new metropolitan area variable?

# Output a summary
z = summary(CPS)
kable(z)
MetroAreaCode PeopleInHousehold Region State Age Married Sex Education Race Hispanic CountryOfBirthCode Citizenship EmploymentStatus Industry MetroArea
Min. :10420 Min. : 1.000 Midwest :30684 California :11570 Min. : 0.00 Divorced :11151 Female:67481 High school :30906 American Indian : 1433 Min. :0.0000 Min. : 57.00 Citizen, Native :116639 Disabled : 5712 Educational and health services :15017 New York-Northern New Jersey-Long Island, NY-NJ-PA: 5409
1st Qu.:21780 1st Qu.: 2.000 Northeast:25939 Texas : 7077 1st Qu.:19.00 Married :55509 Male :63821 Bachelor’s degree :19443 Asian : 6520 1st Qu.:0.0000 1st Qu.: 57.00 Citizen, Naturalized: 7073 Employed :61733 Trade : 8933 Washington-Arlington-Alexandria, DC-VA-MD-WV : 4177
Median :34740 Median : 3.000 South :41502 New York : 5595 Median :39.00 Never Married:30772 NA Some college, no degree:18863 Black : 13913 Median :0.0000 Median : 57.00 Non-Citizen : 7590 Not in Labor Force:15246 Professional and business services: 7519 Los Angeles-Long Beach-Santa Ana, CA : 4102
Mean :35075 Mean : 3.284 West :33177 Florida : 5149 Mean :38.83 Separated : 2027 NA No high school diploma :16095 Multiracial : 2897 Mean :0.1393 Mean : 82.68 NA Retired :18619 Manufacturing : 6791 Philadelphia-Camden-Wilmington, PA-NJ-DE : 2855
3rd Qu.:41860 3rd Qu.: 4.000 NA Pennsylvania: 3930 3rd Qu.:57.00 Widowed : 6505 NA Associate degree : 9913 Pacific Islander: 618 3rd Qu.:0.0000 3rd Qu.: 57.00 NA Unemployed : 4203 Leisure and hospitality : 6364 Chicago-Naperville-Joliet, IN-IN-WI : 2772
Max. :79600 Max. :15.000 NA Illinois : 3912 Max. :85.00 NA’s :25338 NA (Other) :10744 White :105921 Max. :1.0000 Max. :555.00 NA NA’s :25789 (Other) :21618 (Other) :77749
NA’s :34238 NA NA (Other) :94069 NA NA NA NA’s :25338 NA NA NA NA NA NA’s :65060 NA’s :34238

34238 interviewees.

Which of the following metropolitan areas has the largest number of interviewees?

# Sort the tabulation
z = sort(table(CPS$MetroArea))
kable(z)
Var1 Freq
Appleton-Oshkosh-Neenah, WI 0
Grand Rapids-Muskegon-Holland, MI 0
Greenville-Spartanburg-Anderson, SC 0
Hinesville-Fort Stewart, GA 0
Jamestown, NY 0
Kalamazoo-Battle Creek, MI 0
Portsmouth-Rochester, NH-ME 0
Bowling Green, KY 29
Ocean City, NJ 30
Springfield, OH 34
Bloomington-Normal IL 40
Valdosta, GA 42
Warner Robins, GA 42
Tallahassee, FL 43
Columbia, MO 47
Punta Gorda, FL 48
Midland, TX 51
Niles-Benton Harbor, MI 51
Johnson City, TN 52
Santa Fe, NM 52
Prescott, AZ 54
Vineland-Millville-Bridgeton, NJ 54
Hickory-Morgantown-Lenoir, NC 57
Madera, CA 57
Columbus, GA-AL 59
Joplin, MO 59
Panama City-Lynn Haven, FL 59
Chico, CA 60
Anniston-Oxford, AL 61
Napa, CA 61
Anderson, IN 62
Florence, AL 63
Jacksonville, NC 63
Johnstown, PA 63
Lubbock, TX 63
Monroe, MI 63
Anderson, SC 64
Farmington, NM 64
Athens-Clark County, GA 65
Gulfport-Biloxi, MS 65
Longview, TX 65
Macon, GA 65
Leominster-Fitchburg-Gardner, MA 66
Roanoke, VA 66
Santa-Cruz-Watsonville, CA 66
Kingsport-Bristol, TN-VA 67
Albany, GA 68
Bellingham, WA 70
Gainesville, FL 70
Jackson, MI 70
Binghamton, NY 73
Lynchburg, VA 73
Saginaw-Saginaw Township North, MI 74
Salisbury, MD 74
Barnstable Town, MA 75
Ocala, FL 76
Springfield, IL 76
Fayetteville, NC 77
Michigan City-La Porte, IN 77
San Luis Obispo-Paso Robles, CA 77
Holland-Grand Haven, MI 78
Tuscaloosa, AL 78
Brownsville-Harlingen, TX 79
Vero Beach, FL 79
Waco, TX 79
Fort Walton Beach-Crestview-Destin, FL 80
Utica-Rome, NY 80
Decatur, IL 81
Lake Charles, LA 81
South Bend-Mishawaka, IN-MI 81
Altoona, PA 82
Huntington-Ashland, WV-KY-OH 82
Medford, OR 82
Naples-Marco Island, FL 82
St. Cloud, MN 82
Ann Arbor, MI 85
Oshkosh-Neenah, WI 85
Hagerstown-Martinsburg, MD-WV 86
Bremerton-Silverdale, WA 87
Erie, PA 87
Kankakee-Bradley, IL 87
Kingston, NY 87
Amarillo, TX 88
Laredo, TX 89
Harrisonburg, VA 90
Muskegon-Norton Shores, MI 90
Trenton-Ewing, NJ 91
Decatur, Al 96
Wausau, WI 96
Lawton, OK 97
Lawrence, KS 98
El Centro, CA 99
Evansville, IN-KY 99
Janesville, WI 99
Olympia, WA 99
Spartanburg, SC 99
Killeen-Temple-Fort Hood, TX 101
Flint, MI 102
Myrtle Beach-Conway-North Myrtle Beach, SC 102
Montgomery, AL 103
Bloomington, IN 104
Salinas, CA 104
Fort Smith, AR-OK 105
Merced, CA 106
Las Cruses, NM 107
Pensacola-Ferry Pass-Brent, FL 107
Port St. Lucie-Fort Pierce, FL 109
Eau Claire, WI 110
Mobile, AL 110
Atlantic City, NJ 111
Danbury, CT 112
Peoria, IL 112
Yakima, WA 112
La Crosse, WI 114
Rockford, IL 114
Asheville, NC 116
Victoria, TX 116
Coeur d’Alene, ID 117
Huntsville, AL 117
York-Hanover, PA 117
Canton-Massillon, OH 118
Lansing-East Lansing, MI 119
Racine, WI 119
Visalia-Porterville, CA 121
Champaign-Urbana, IL 122
Beaumont-Port Author, TX 123
Appleton,WI 125
Duluth, MN-WI 126
Kalamazoo-Portage, MI 127
Winston-Salem, NC 127
Santa Rosa-Petaluma, CA 129
Pueblo, CO 130
Iowa City, IA 131
Corpus Christi, TX 132
Santa Barbara-Santa Maria-Goleta, CA 132
Vallejo-Fairfield, CA 133
Fort Wayne, IN 136
Green Bay, WI 136
Bend, OR 140
Deltona-Daytona Beach-Ormond Beach, FL 140
Reading, PA 142
Worcester, MA-CT 144
Cape Coral-Fort Myers, FL 146
Shreveport-Bossier City, LA 146
Lakeland-Winter Haven, FL 149
Youngstown-Warren-Boardman, OH 153
Springfield, MA-CT 155
Lancaster, PA 156
Spokane, WA 156
Waterloo-Cedar Falls, IA 156
Waterbury, CT 157
Modesto, CA 158
Augusta-Richmond County, GA-SC 161
Springfield, MO 161
Greeley, CO 162
Chattanooga, TN-GA 167
Knoxville, TN 168
Palm Bay-Melbourne-Titusville, FL 168
Salem, OR 170
Boulder, CO 171
Harrisburg-Carlisle, PA 174
Scranton-Wilkes Barre, PA 176
Monroe, LA 179
Lafayette, LA 181
Topeka, KS 182
Greenville, SC 185
Durham, NC 189
Sarasota-Bradenton-Venice, FL 192
Stockton, CA 193
McAllen-Edinburg-Pharr, TX 195
Cedar Rapids, IA 196
Eugene-Springfield, OR 196
Lexington-Fayette, KY 198
Billings, MT 199
Poughkeepsie-Newburgh-Middletown, NY 201
Savannah, GA 202
Norwich-New London, CT-RI 203
Fort Collins-Loveland, CO 206
Bangor, ME 208
Fayetteville-Springdale-Rogers, AR-MO 215
Jackson, MS 222
Syracuse, NY 223
Akron, OH 231
Charleston-North Charleston, SC 232
Toledo, OH 235
Davenport-Moline-Rock Island, IA-IL 240
El Paso, TX 244
Bakersfield, CA 245
Greensboro-High Point, NC 251
Baton Rouge, LA 262
Charleston, WV 262
Rochester-Dover, NH-ME 262
Oxnard-Thousand Oaks-Ventura, CA 267
Albany-Schenectady-Troy, NY 268
Dayton, OH 268
Madison, WI 284
Columbia, SC 291
Tucson, AZ 302
Fresno, CA 303
Grand Rapids-Wyoming, MI 304
Rochester, NY 307
Provo-Orem, UT 309
Reno-Sparks, NV 310
Tulsa, OK 323
Allentown-Bethlehem-Easton, PA-NJ 334
Raleigh-Cary, NC 336
Buffalo-Niagara Falls, NY 344
Memphis, TN-MS-AR 348
New Orleans-Metairie-Kenner, LA 367
Colorado Springs, CO 372
Birmingham-Hoover, AL 392
Jacksonville, FL 393
Little Rock-North Little Rock, AR 404
Ogden-Clearfield, UT 423
Wichita, KS 427
Fargo, ND-MN 432
Dover, DE 456
Richmond, VA 490
Des Moines, IA 501
Nashville-Davidson-Murfreesboro, TN 505
New Haven, CT 506
Austin-Round Rock, TX 516
Charlotte-Gastonia-Concord, NC-SC 517
Louisville, KY-IN 519
Columbus, OH 551
Indianapolis, IN 570
Sioux Falls, SD 595
Virginia Beach-Norfolk-Newport News, VA-NC 597
Oklahoma City, OK 604
San Antonio, TX 607
Albuquerque, NM 609
Orlando, FL 610
Boise City-Nampa, ID 644
Burlington-South Burlington, VT 657
Sacramento-Arden-Arcade-Roseville, CA 667
San Jose-Sunnyvale-Santa Clara, CA 670
Cleveland-Elyria-Mentor, OH 681
Portland-South Portland, ME 701
Milwaukee-Waukesha-West Allis, WI 714
Cincinnati-Middletown, OH-KY-IN 719
Salt Lake City, UT 723
Bridgeport-Stamford-Norwalk, CT 730
Pittsburgh, PA 732
Tampa-St. Petersburg-Clearwater, FL 842
Hartford-West Hartford-East Hartford, CT 885
San Diego-Carlsbad-San Marcos, CA 907
St. Louis, MO-IL 956
Omaha-Council Bluffs, NE-IA 957
Kansas City, MO-KS 962
Phoenix-Mesa-Scottsdale, AZ 971
Portland-Vancouver-Beaverton, OR-WA 1089
Seattle-Tacoma-Bellevue, WA 1255
Riverside-San Bernardino, CA 1290
Las Vegas-Paradise, NV 1299
Detroit-Warren-Livonia, MI 1354
San Francisco-Oakland-Fremont, CA 1386
Baltimore-Towson, MD 1483
Denver-Aurora, CO 1504
Atlanta-Sandy Springs-Marietta, GA 1552
Miami-Fort Lauderdale-Miami Beach, FL 1554
Honolulu, HI 1576
Houston-Baytown-Sugar Land, TX 1649
Dallas-Fort Worth-Arlington, TX 1863
Minneapolis-St Paul-Bloomington, MN-WI 1942
Boston-Cambridge-Quincy, MA-NH 2229
Providence-Fall River-Warwick, MA-RI 2284
Chicago-Naperville-Joliet, IN-IN-WI 2772
Philadelphia-Camden-Wilmington, PA-NJ-DE 2855
Los Angeles-Long Beach-Santa Ana, CA 4102
Washington-Arlington-Alexandria, DC-VA-MD-WV 4177
New York-Northern New Jersey-Long Island, NY-NJ-PA 5409

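From table(CPS$MetroArea), we can read that Boston-Cambridge-Quincy, MA-NH has the largest number of interviewees of these options, with 2229. Across all metropolitan areas, New York-Northern New Jersey-Long Island, NY-NJ-PA has the most, with 5409.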

Which metropolitan area has the highest proportion of interviewees of Hispanic ethnicity?

# Proportion of Hispanic interviewees by metropolitan area, sorted ascending
z = sort(tapply(CPS$Hispanic, CPS$MetroArea, mean))

96.6% of the interviewees from Laredo, TX, are of Hispanic ethnicity, the highest proportion among metropolitan areas in the United States.
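
The command works because the mean of a 0/1 indicator equals the share of 1s, and tapply() computes that mean within each group. A minimal sketch on a toy data frame with hypothetical area names:

```r
# Toy data frame standing in for CPS (hypothetical values)
toy <- data.frame(
  MetroArea = c("Area A", "Area A", "Area B", "Area B", "Area B", NA),
  Hispanic  = c(1, 0, 1, 1, 0, 1)
)

# Mean of the indicator within each area; rows with NA MetroArea are left out
share <- tapply(toy$Hispanic, toy$MetroArea, mean)

# sort() is ascending, so the largest share is the last element
top <- tail(sort(share), 1)
print(top)
```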

In how many metropolitan areas in the United States are at least 20% of interviewees Asian?

# Proportion of Asian interviewees by metropolitan area, sorted ascending
z = sort(tapply(CPS$Race == "Asian", CPS$MetroArea, mean))

We can read from the sorted output that four metropolitan areas had at least 20% Asian interviewees: Honolulu, HI; San Francisco-Oakland-Fremont, CA; San Jose-Sunnyvale-Santa Clara, CA; and Vallejo-Fairfield, CA.
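
The count can also be computed rather than read off the sorted list. This is a sketch on a toy data frame; the areas and races are hypothetical, and the 0.2 threshold matches the question.

```r
# Toy data frame standing in for CPS (hypothetical values)
toy <- data.frame(
  MetroArea = c("Area A", "Area A", "Area B", "Area B", "Area B",
                "Area C", "Area C", "Area C", "Area C", "Area C"),
  Race      = c("Asian", "White", "Asian", "Asian", "Black",
                "White", "White", "White", "White", "Black")
)

# Share of Asian interviewees within each area
share <- tapply(toy$Race == "Asian", toy$MetroArea, mean)

# Number of areas at or above the threshold; na.rm guards against empty groups
n_areas <- sum(share >= 0.2, na.rm = TRUE)
print(names(share)[which(share >= 0.2)])
print(n_areas)
```

The na.rm argument matters on the real data because metropolitan areas with no interviewees produce NA rather than a proportion.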

Normally, we would look at the sorted proportion of interviewees from each metropolitan area who have not received a high school diploma with the command:

# Proportion without a high school diploma by metropolitan area (no NA handling)
z = sort(tapply(CPS$Education == "No high school diploma", CPS$MetroArea, mean))
kable(z)
x

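The output is empty: every metropolitan area contains at least one missing Education value, so mean() returns NA for all of them, and sort() drops NA values by default. Passing na.rm=TRUE to mean() removes the missing values before the proportion is computed.

Which metropolitan area has the smallest proportion of interviewees who have not received a high school diploma?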

# Same computation, ignoring missing Education values
z = sort(tapply(CPS$Education == "No high school diploma", CPS$MetroArea, mean, na.rm=TRUE))
kable(z)
x
Iowa City, IA 0.02912621
Bowling Green, KY 0.03703704
Kalamazoo-Portage, MI 0.05050505
Champaign-Urbana, IL 0.05154639
Bremerton-Silverdale, WA 0.05405405
Lawrence, KS 0.05952381
Bloomington-Normal IL 0.06060606
Jacksonville, NC 0.06122449
Eau Claire, WI 0.06250000
Palm Bay-Melbourne-Titusville, FL 0.06666667
Salisbury, MD 0.06779661
Gainesville, FL 0.06896552
Fort Collins-Loveland, CO 0.06936416
Altoona, PA 0.07142857
Madison, WI 0.07423581
Tallahassee, FL 0.07500000
Fargo, ND-MN 0.07902736
Albany-Schenectady-Troy, NY 0.07929515
Ocean City, NJ 0.08000000
Lakeland-Winter Haven, FL 0.08130081
Billings, MT 0.08280255
Coeur d’Alene, ID 0.08333333
Burlington-South Burlington, VT 0.08394161
Akron, OH 0.08421053
Ann Arbor, MI 0.08695652
Asheville, NC 0.08695652
Pensacola-Ferry Pass-Brent, FL 0.08695652
Oshkosh-Neenah, WI 0.08823529
Rochester-Dover, NH-ME 0.08928571
Knoxville, TN 0.08965517
Pittsburgh, PA 0.09060403
Barnstable Town, MA 0.09090909
Bridgeport-Stamford-Norwalk, CT 0.09563758
Johnstown, PA 0.09615385
Austin-Round Rock, TX 0.09629630
La Crosse, WI 0.09677419
Boulder, CO 0.09701493
Charleston-North Charleston, SC 0.09890110
Fort Wayne, IN 0.09900990
Roanoke, VA 0.10169492
Prescott, AZ 0.10204082
Santa Rosa-Petaluma, CA 0.10280374
Evansville, IN-KY 0.10389610
Spokane, WA 0.10434783
Poughkeepsie-Newburgh-Middletown, NY 0.10559006
Tampa-St. Petersburg-Clearwater, FL 0.10579710
Grand Rapids-Wyoming, MI 0.10612245
Portland-South Portland, ME 0.10638298
Honolulu, HI 0.10739300
Michigan City-La Porte, IN 0.10769231
Eugene-Springfield, OR 0.11038961
Boston-Cambridge-Quincy, MA-NH 0.11080485
Bend, OR 0.11111111
Vero Beach, FL 0.11428571
Sarasota-Bradenton-Venice, FL 0.11464968
Fort Walton Beach-Crestview-Destin, FL 0.11475410
Flint, MI 0.11538462
Cedar Rapids, IA 0.11564626
Minneapolis-St Paul-Bloomington, MN-WI 0.11638204
Portland-Vancouver-Beaverton, OR-WA 0.11657143
Washington-Arlington-Alexandria, DC-VA-MD-WV 0.11683748
Mobile, AL 0.11702128
Scranton-Wilkes Barre, PA 0.11724138
Topeka, KS 0.11724138
Colorado Springs, CO 0.11764706
Olympia, WA 0.11764706
Reno-Sparks, NV 0.11764706
Appleton,WI 0.11827957
Santa Fe, NM 0.11904762
Virginia Beach-Norfolk-Newport News, VA-NC 0.11909651
Allentown-Bethlehem-Easton, PA-NJ 0.11929825
Rochester, NY 0.12132353
Seattle-Tacoma-Bellevue, WA 0.12168793
Kansas City, MO-KS 0.12172775
Napa, CA 0.12244898
Duluth, MN-WI 0.12264151
New Haven, CT 0.12354312
Canton-Massillon, OH 0.12371134
Fayetteville, NC 0.12500000
San Luis Obispo-Paso Robles, CA 0.12500000
Worcester, MA-CT 0.12605042
Philadelphia-Camden-Wilmington, PA-NJ-DE 0.12717253
Davenport-Moline-Rock Island, IA-IL 0.12727273
Waterloo-Cedar Falls, IA 0.12800000
Pueblo, CO 0.12844037
Baton Rouge, LA 0.12871287
Racine, WI 0.12903226
Des Moines, IA 0.12944162
Detroit-Warren-Livonia, MI 0.12964642
Omaha-Council Bluffs, NE-IA 0.12972973
Richmond, VA 0.12990196
Savannah, GA 0.13013699
Danbury, CT 0.13043478
Bloomington, IN 0.13095238
Valdosta, GA 0.13157895
Wausau, WI 0.13157895
Deltona-Daytona Beach-Ormond Beach, FL 0.13178295
Tulsa, OK 0.13178295
Harrisburg-Carlisle, PA 0.13286713
Las Vegas-Paradise, NV 0.13307985
Myrtle Beach-Conway-North Myrtle Beach, SC 0.13333333
Provo-Orem, UT 0.13366337
Anderson, IN 0.13461538
Chico, CA 0.13461538
St. Louis, MO-IL 0.13461538
Niles-Benton Harbor, MI 0.13513514
Ogden-Clearfield, UT 0.13571429
Baltimore-Towson, MD 0.13583333
Buffalo-Niagara Falls, NY 0.13684211
Milwaukee-Waukesha-West Allis, WI 0.13693694
Chicago-Naperville-Joliet, IN-IN-WI 0.13737734
Louisville, KY-IN 0.13785047
Lynchburg, VA 0.13793103
Peoria, IL 0.13829787
Sioux Falls, SD 0.13832200
Ocala, FL 0.13888889
Leominster-Fitchburg-Gardner, MA 0.14035088
Oklahoma City, OK 0.14137214
San Diego-Carlsbad-San Marcos, CA 0.14188267
Jacksonville, FL 0.14244186
Atlantic City, NJ 0.14285714
Holland-Grand Haven, MI 0.14285714
Medford, OR 0.14285714
Naples-Marco Island, FL 0.14285714
Punta Gorda, FL 0.14285714
Victoria, TX 0.14285714
Winston-Salem, NC 0.14285714
Salt Lake City, UT 0.14338235
Atlanta-Sandy Springs-Marietta, GA 0.14421553
Decatur, IL 0.14516129
Springfield, IL 0.14516129
Monroe, MI 0.14545455
Denver-Aurora, CO 0.14574558
Hartford-West Hartford-East Hartford, CT 0.14574899
Greeley, CO 0.14615385
San Francisco-Oakland-Fremont, CA 0.14651368
Boise City-Nampa, ID 0.14653465
Greenville, SC 0.14666667
Birmingham-Hoover, AL 0.14678899
Saginaw-Saginaw Township North, MI 0.14754098
Santa-Cruz-Watsonville, CA 0.14814815
Trenton-Ewing, NJ 0.14814815
Lexington-Fayette, KY 0.14838710
San Jose-Sunnyvale-Santa Clara, CA 0.14922481
Bellingham, WA 0.15000000
Norwich-New London, CT-RI 0.15060241
Lubbock, TX 0.15094340
Huntington-Ashland, WV-KY-OH 0.15151515
St. Cloud, MN 0.15151515
Jackson, MS 0.15168539
Dayton, OH 0.15207373
Chattanooga, TN-GA 0.15217391
Syracuse, NY 0.15428571
New York-Northern New Jersey-Long Island, NY-NJ-PA 0.15573586
Columbia, SC 0.15600000
Columbus, OH 0.15617716
Memphis, TN-MS-AR 0.15714286
Orlando, FL 0.16108787
Warner Robins, GA 0.16216216
Cleveland-Elyria-Mentor, OH 0.16250000
Columbia, MO 0.16279070
Durham, NC 0.16326531
Miami-Fort Lauderdale-Miami Beach, FL 0.16356589
Indianapolis, IN 0.16371681
Albuquerque, NM 0.16424116
Cape Coral-Fort Myers, FL 0.16528926
Amarillo, TX 0.16666667
Anniston-Oxford, AL 0.16666667
Athens-Clark County, GA 0.16666667
Binghamton, NY 0.16666667
Phoenix-Mesa-Scottsdale, AZ 0.16687737
Green Bay, WI 0.16831683
Bangor, ME 0.16860465
Providence-Fall River-Warwick, MA-RI 0.16915688
Muskegon-Norton Shores, MI 0.16923077
Tuscaloosa, AL 0.16949153
Rockford, IL 0.17021277
Las Cruses, NM 0.17283951
Gulfport-Biloxi, MS 0.17307692
Huntsville, AL 0.17391304
Utica-Rome, NY 0.17391304
Fort Smith, AR-OK 0.17441860
Charlotte-Gastonia-Concord, NC-SC 0.17444717
El Centro, CA 0.17567568
Erie, PA 0.17567568
Jackson, MI 0.17741935
Cincinnati-Middletown, OH-KY-IN 0.17773788
Springfield, MA-CT 0.17829457
Reading, PA 0.17857143
Vallejo-Fairfield, CA 0.17924528
Salem, OR 0.17985612
Nashville-Davidson-Murfreesboro, TN 0.18112245
Johnson City, TN 0.18181818
Wichita, KS 0.18181818
York-Hanover, PA 0.18181818
Janesville, WI 0.18292683
Lansing-East Lansing, MI 0.18348624
Greensboro-High Point, NC 0.18357488
Decatur, Al 0.18421053
Albany, GA 0.18604651
Augusta-Richmond County, GA-SC 0.18796992
Charleston, WV 0.18834081
Shreveport-Bossier City, LA 0.18918919
Raleigh-Cary, NC 0.18959108
Toledo, OH 0.18965517
Spartanburg, SC 0.18987342
Dallas-Fort Worth-Arlington, TX 0.19077135
Sacramento-Arden-Arcade-Roseville, CA 0.19136961
Santa Barbara-Santa Maria-Goleta, CA 0.19191919
Monroe, LA 0.19205298
Dover, DE 0.19220056
South Bend-Mishawaka, IN-MI 0.19354839
Fayetteville-Springdale-Rogers, AR-MO 0.19393939
Columbus, GA-AL 0.19607843
Kingston, NY 0.19696970
Port St. Lucie-Fort Pierce, FL 0.19767442
Waterbury, CT 0.19852941
Little Rock-North Little Rock, AR 0.19939577
Springfield, MO 0.20000000
Modesto, CA 0.20325203
Houston-Baytown-Sugar Land, TX 0.20439739
Oxnard-Thousand Oaks-Ventura, CA 0.20657277
Anderson, SC 0.20689655
Midland, TX 0.21052632
New Orleans-Metairie-Kenner, LA 0.21088435
Fresno, CA 0.21120690
Lake Charles, LA 0.21739130
Visalia-Porterville, CA 0.21782178
San Antonio, TX 0.22004357
Hagerstown-Martinsburg, MD-WV 0.22222222
Yakima, WA 0.22222222
Hickory-Morgantown-Lenoir, NC 0.22448980
Los Angeles-Long Beach-Santa Ana, CA 0.22882883
Panama City-Lynn Haven, FL 0.22916667
Harrisonburg, VA 0.23287671
Kankakee-Bradley, IL 0.23437500
Beaumont-Port Author, TX 0.23469388
Youngstown-Warren-Boardman, OH 0.23622047
Riverside-San Bernardino, CA 0.23780488
Farmington, NM 0.23913043
Killeen-Temple-Fort Hood, TX 0.24050633
Waco, TX 0.24074074
Montgomery, AL 0.24137931
Tucson, AZ 0.24603175
Lafayette, LA 0.24822695
Joplin, MO 0.25000000
Stockton, CA 0.25333333
Brownsville-Harlingen, TX 0.25396825
Lancaster, PA 0.26771654
Bakersfield, CA 0.27218935
Vineland-Millville-Bridgeton, NJ 0.27500000
Lawton, OK 0.28000000
Merced, CA 0.28358209
Corpus Christi, TX 0.29702970
El Paso, TX 0.30219780
Springfield, OH 0.31034483
Florence, AL 0.32075472
Madera, CA 0.33333333
Salinas, CA 0.34090909
Laredo, TX 0.34426230
Kingsport-Bristol, TN-VA 0.36363636
Longview, TX 0.38297872
McAllen-Edinburg-Pharr, TX 0.38297872
Macon, GA 0.40816327

We can see that in Iowa City, IA, 2.9% of interviewees had not received a high school diploma, the smallest proportion of any metropolitan area.
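
The effect of na.rm=TRUE in the command above can be seen on a small made-up data frame. This is only a sketch: the toy values below are assumptions for illustration, not CPS records. Because the mean of a logical vector is the proportion of TRUE values, tapply() with mean returns a per-group proportion, and a single NA in a group makes that group's result NA unless na.rm=TRUE is passed.

```r
# Toy data: hypothetical values, not taken from CPSData.csv
toy = data.frame(
  MetroArea = c("A", "A", "A", "B", "B", "B"),
  Education = c("No high school diploma", "High school", NA,
                "High school", "Bachelor's degree", "No high school diploma"),
  stringsAsFactors = TRUE
)

# Without na.rm, the NA in group A propagates to that group's mean
without_narm = tapply(toy$Education == "No high school diploma", toy$MetroArea, mean)

# With na.rm=TRUE, missing values are dropped before averaging
with_narm = tapply(toy$Education == "No high school diploma", toy$MetroArea, mean, na.rm=TRUE)

print(without_narm)
print(with_narm)

# Name of the group with the smallest proportion
smallest = names(which.min(with_narm))
print(smallest)
```

Using which.min() on the result is one way to read off the answer without scanning the full sorted table.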

Integrating Country of Birth Data

# Combine dataset
CPS = merge(CPS, CountryMap, by.x="CountryOfBirthCode", by.y="Code", all.x=TRUE)

What is the name of the variable added to the CPS data frame by this merge operation? How many interviewees have a missing value for the new country of birth variable?

# Output summary
z = summary(CPS)
kable(z)
CountryOfBirthCode: Min. 57.00; 1st Qu. 57.00; Median 57.00; Mean 82.68; 3rd Qu. 57.00; Max. 555.00
MetroAreaCode: Min. 10420; 1st Qu. 21780; Median 34740; Mean 35075; 3rd Qu. 41860; Max. 79600; NA's 34238
PeopleInHousehold: Min. 1.000; 1st Qu. 2.000; Median 3.000; Mean 3.284; 3rd Qu. 4.000; Max. 15.000
Region: Midwest 30684; Northeast 25939; South 41502; West 33177
State: California 11570; Texas 7077; New York 5595; Florida 5149; Pennsylvania 3930; Illinois 3912; (Other) 94069
Age: Min. 0.00; 1st Qu. 19.00; Median 39.00; Mean 38.83; 3rd Qu. 57.00; Max. 85.00
Married: Divorced 11151; Married 55509; Never Married 30772; Separated 2027; Widowed 6505; NA's 25338
Sex: Female 67481; Male 63821
Education: High school 30906; Bachelor's degree 19443; Some college, no degree 18863; No high school diploma 16095; Associate degree 9913; (Other) 10744; NA's 25338
Race: American Indian 1433; Asian 6520; Black 13913; Multiracial 2897; Pacific Islander 618; White 105921
Hispanic: Min. 0.0000; 1st Qu. 0.0000; Median 0.0000; Mean 0.1393; 3rd Qu. 0.0000; Max. 1.0000
Citizenship: Citizen, Native 116639; Citizen, Naturalized 7073; Non-Citizen 7590
EmploymentStatus: Disabled 5712; Employed 61733; Not in Labor Force 15246; Retired 18619; Unemployed 4203; NA's 25789
Industry: Educational and health services 15017; Trade 8933; Professional and business services 7519; Manufacturing 6791; Leisure and hospitality 6364; (Other) 21618; NA's 65060
MetroArea: New York-Northern New Jersey-Long Island, NY-NJ-PA 5409; Washington-Arlington-Alexandria, DC-VA-MD-WV 4177; Los Angeles-Long Beach-Santa Ana, CA 4102; Philadelphia-Camden-Wilmington, PA-NJ-DE 2855; Chicago-Naperville-Joliet, IN-IN-WI 2772; (Other) 77749; NA's 34238
Country: United States 115063; Mexico 3921; Philippines 839; India 770; China 581; (Other) 9952; NA's 176

From summary(CPS), we can read that Country is the name of the added variable, and that it has 176 missing values.
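
As a rough illustration of how the merge produces the new column and its missing values, the sketch below repeats the same merge() call on two tiny data frames. The codes and rows are made up for the example (code 999 is deliberately absent from the lookup table), and the names people and code_map are hypothetical.

```r
# Toy data: hypothetical rows, not the real CPS or CountryCodes.csv contents
people = data.frame(CountryOfBirthCode = c(57, 57, 303, 999))
code_map = data.frame(Code = c(57, 303), Country = c("United States", "Mexico"))

# all.x=TRUE keeps every row of 'people'; rows whose code has no match
# in code_map get NA in the Country column
merged = merge(people, code_map, by.x="CountryOfBirthCode", by.y="Code", all.x=TRUE)

# Columns present after the merge but not before
added = setdiff(names(merged), names(people))
print(added)

# Number of rows with no matching country
n_missing = sum(is.na(merged$Country))
print(n_missing)
```

Counting with sum(is.na(...)) gives the same figure that summary() reports on its NA's line, without printing the whole summary.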

Among all interviewees born outside of North America, which country was the most common place of birth?

# Tabulate country of birth, sorted by ascending count
z = sort(table(CPS$Country))
kable(z)
Var1 Freq
Cyprus 0
Kosovo 0
Oceania, not specified 0
Other U. S. Island Areas 0
Wales 0
Northern Ireland 2
Tanzania 2
Azerbaijan 3
Czechoslovakia 3
St. Kitts–Nevis 3
Georgia 5
Barbados 6
Denmark 6
Latvia 6
Samoa 6
Senegal 6
Singapore 6
Slovakia 6
Tonga 6
Zimbabwe 6
South America, not specified 7
St. Lucia 7
Algeria 9
Americas, not specified 9
Belize 9
Fiji 9
St. Vincent and the Grenadines 9
Bahamas 10
Finland 10
Kuwait 10
Lithuania 10
Czech Republic 11
Dominica 11
Paraguay 11
Croatia 12
Macedonia 12
Moldova 12
Antigua and Barbuda 13
Belgium 13
Bermuda 13
Bolivia 13
Grenada 13
Sudan 13
Cape Verde 15
Eritrea 15
Sierra Leone 15
Uganda 15
Austria 17
Morocco 17
Sri Lanka 17
U. S. Virgin Islands 17
Uruguay 17
Albania 18
Norway 18
Europe, not specified 19
Uzbekistan 19
West Indies, not specified 19
Malaysia 20
Serbia 20
Azores 22
USSR 22
New Zealand 23
Switzerland 23
Yemen 23
Belarus 24
Scotland 24
Yugoslavia 24
Hungary 25
Afghanistan 26
Indonesia 26
Netherlands 28
Sweden 28
Bulgaria 29
Costa Rica 29
Saudi Arabia 29
Guam 31
Cameroon 32
Syria 32
Armenia 35
Jordan 36
Chile 37
Asia, not specified 39
Ireland 39
Spain 41
Bangladesh 42
Australia 43
Nepal 44
Panama 44
Lebanon 45
Myanmar (Burma) 45
South Africa 48
Turkey 48
Cambodia 49
Liberia 52
Kenya 55
Romania 55
Greece 56
Israel 57
Trinidad and Tobago 60
Bosnia & Herzegovina 61
Venezuela 61
Argentina 64
Hong Kong 64
Portugal 64
Egypt 65
Somalia 72
France 73
South Korea 73
Ghana 76
Nicaragua 76
Ethiopia 80
Elsewhere 81
Nigeria 85
Iraq 97
Laos 98
Taiwan 102
Ukraine 104
Guyana 109
Pakistan 109
United Kingdom 111
Thailand 128
Africa, not specified 129
Ecuador 136
Peru 136
Iran 144
Italy 149
Brazil 159
Poland 162
Haiti 167
Russia 173
England 179
Japan 187
Honduras 189
Columbia 206
Jamaica 217
Guatemala 309
Dominican Republic 330
Korea 334
Canada 410
Cuba 426
Germany 438
Vietnam 458
El Salvador 477
Puerto Rico 518
China 581
India 770
Philippines 839
Mexico 3921
United States 115063

From the summary(CPS) output, or alternatively from sort(table(CPS$Country)), we see that the top two countries of birth were the United States and Mexico, both of which are in North America. The third-highest count, 839, was for the Philippines, so it is the most common place of birth outside North America.
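
The same reading can be done programmatically by dropping the North American entries from the sorted table. The sketch below uses a made-up vector, and the exclusion list north_america is a hand-picked assumption that would need extending to cover every North American country in the real data.

```r
# Toy vector: hypothetical counts, not CPS data
country = factor(c(rep("United States", 5), rep("Mexico", 3),
                   rep("Philippines", 2), "India"))

# Hand-picked exclusion list (an assumption; extend as needed)
north_america = c("United States", "Mexico", "Canada")

# Sort counts from largest to smallest, then drop the excluded names
counts = sort(table(country), decreasing=TRUE)
outside = counts[!(names(counts) %in% north_america)]

# Most common remaining country of birth
top_outside = names(outside)[1]
print(top_outside)
```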

What proportion of the interviewees from the “New York-Northern New Jersey-Long Island, NY-NJ-PA” metropolitan area have a country of birth that is not the United States?

# Calculates the proportion
m = table(CPS$MetroArea == "New York-Northern New Jersey-Long Island, NY-NJ-PA", CPS$Country != "United States")

z = prop.table(m,1)
kable(z)
FALSE TRUE
FALSE 0.8607228 0.1392772
TRUE 0.6913397 0.3086603

From table(CPS$MetroArea == "New York-Northern New Jersey-Long Island, NY-NJ-PA", CPS$Country != "United States"), we can see that 1668 interviewees from this metropolitan area were born outside the United States and 3736 were born in the United States (an additional 5 have a missing country of birth and are not counted). Therefore, the proportion is 1668/(1668+3736) = 0.309, which matches the TRUE/TRUE entry of the row-wise proportions above.
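
The relationship between the raw counts and the row-wise proportions can be checked on a small example. The vectors below are made up for illustration; the point is that table() drops NA comparisons by default, so a person with a missing country appears in neither column, and prop.table(m, 1) divides each row by its own total.

```r
# Toy data: hypothetical values, not CPS records
metro   = c("NY", "NY", "NY", "NY", "Other", "Other", "Other")
country = c("United States", "India", "United States", NA,
            "United States", "United States", "Mexico")

# Raw counts: rows are metro == "NY", columns are country != "United States".
# The comparison involving NA is dropped, so the fourth person is not counted.
m = table(metro == "NY", country != "United States")

# Row-wise proportions: each row sums to 1
p = prop.table(m, 1)

print(m)
print(p)

# Foreign-born count and share within the "NY" row
ny_foreign_count = m["TRUE", "TRUE"]
ny_foreign_share = p["TRUE", "TRUE"]
```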

Which metropolitan area has the largest number (note – not proportion) of interviewees with a country of birth in India? In Brazil? In Somalia?

# Count interviewees born in India by metropolitan area, sorted ascending
z = sort(tapply(CPS$Country == "India", CPS$MetroArea, sum, na.rm=TRUE))
kable(z)
x
Akron, OH 0
Albany-Schenectady-Troy, NY 0
Albany, GA 0
Allentown-Bethlehem-Easton, PA-NJ 0
Altoona, PA 0
Amarillo, TX 0
Anderson, IN 0
Ann Arbor, MI 0
Anniston-Oxford, AL 0
Appleton,WI 0
Asheville, NC 0
Athens-Clark County, GA 0
Augusta-Richmond County, GA-SC 0
Bangor, ME 0
Barnstable Town, MA 0
Baton Rouge, LA 0
Beaumont-Port Author, TX 0
Bellingham, WA 0
Bend, OR 0
Billings, MT 0
Binghamton, NY 0
Bloomington, IN 0
Boulder, CO 0
Bowling Green, KY 0
Bremerton-Silverdale, WA 0
Buffalo-Niagara Falls, NY 0
Canton-Massillon, OH 0
Cape Coral-Fort Myers, FL 0
Cedar Rapids, IA 0
Champaign-Urbana, IL 0
Charleston, WV 0
Chattanooga, TN-GA 0
Chico, CA 0
Coeur d’Alene, ID 0
Colorado Springs, CO 0
Columbia, MO 0
Columbus, GA-AL 0
Columbus, OH 0
Corpus Christi, TX 0
Danbury, CT 0
Davenport-Moline-Rock Island, IA-IL 0
Dayton, OH 0
Decatur, Al 0
Decatur, IL 0
Denver-Aurora, CO 0
Dover, DE 0
Duluth, MN-WI 0
Durham, NC 0
Eau Claire, WI 0
El Centro, CA 0
El Paso, TX 0
Erie, PA 0
Eugene-Springfield, OR 0
Evansville, IN-KY 0
Fargo, ND-MN 0
Farmington, NM 0
Fayetteville, NC 0
Flint, MI 0
Florence, AL 0
Fort Collins-Loveland, CO 0
Fort Smith, AR-OK 0
Fort Walton Beach-Crestview-Destin, FL 0
Gainesville, FL 0
Grand Rapids-Wyoming, MI 0
Greeley, CO 0
Green Bay, WI 0
Greensboro-High Point, NC 0
Gulfport-Biloxi, MS 0
Hagerstown-Martinsburg, MD-WV 0
Harrisonburg, VA 0
Hickory-Morgantown-Lenoir, NC 0
Holland-Grand Haven, MI 0
Huntington-Ashland, WV-KY-OH 0
Huntsville, AL 0
Jackson, MI 0
Jackson, MS 0
Jacksonville, NC 0
Janesville, WI 0
Johnson City, TN 0
Johnstown, PA 0
Joplin, MO 0
Kalamazoo-Portage, MI 0
Kankakee-Bradley, IL 0
Killeen-Temple-Fort Hood, TX 0
Kingsport-Bristol, TN-VA 0
Kingston, NY 0
Knoxville, TN 0
La Crosse, WI 0
Lafayette, LA 0
Lake Charles, LA 0
Lakeland-Winter Haven, FL 0
Lancaster, PA 0
Lansing-East Lansing, MI 0
Laredo, TX 0
Las Cruses, NM 0
Lawton, OK 0
Leominster-Fitchburg-Gardner, MA 0
Lexington-Fayette, KY 0
Longview, TX 0
Louisville, KY-IN 0
Lubbock, TX 0
Lynchburg, VA 0
Macon, GA 0
Madera, CA 0
McAllen-Edinburg-Pharr, TX 0
Medford, OR 0
Merced, CA 0
Michigan City-La Porte, IN 0
Midland, TX 0
Mobile, AL 0
Modesto, CA 0
Monroe, LA 0
Monroe, MI 0
Montgomery, AL 0
Muskegon-Norton Shores, MI 0
Myrtle Beach-Conway-North Myrtle Beach, SC 0
Napa, CA 0
Niles-Benton Harbor, MI 0
Ocala, FL 0
Ocean City, NJ 0
Oshkosh-Neenah, WI 0
Palm Bay-Melbourne-Titusville, FL 0
Panama City-Lynn Haven, FL 0
Pensacola-Ferry Pass-Brent, FL 0
Port St. Lucie-Fort Pierce, FL 0
Portland-South Portland, ME 0
Poughkeepsie-Newburgh-Middletown, NY 0
Prescott, AZ 0
Pueblo, CO 0
Punta Gorda, FL 0
Racine, WI 0
Raleigh-Cary, NC 0
Reading, PA 0
Richmond, VA 0
Riverside-San Bernardino, CA 0
Roanoke, VA 0
Rockford, IL 0
Saginaw-Saginaw Township North, MI 0
Salem, OR 0
Salinas, CA 0
Salisbury, MD 0
San Antonio, TX 0
San Luis Obispo-Paso Robles, CA 0
Santa-Cruz-Watsonville, CA 0
Santa Barbara-Santa Maria-Goleta, CA 0
Santa Fe, NM 0
Santa Rosa-Petaluma, CA 0
Sarasota-Bradenton-Venice, FL 0
Savannah, GA 0
Scranton-Wilkes Barre, PA 0
Shreveport-Bossier City, LA 0
Sioux Falls, SD 0
South Bend-Mishawaka, IN-MI 0
Spartanburg, SC 0
Spokane, WA 0
Springfield, MA-CT 0
Springfield, MO 0
Springfield, OH 0
St. Cloud, MN 0
St. Louis, MO-IL 0
Stockton, CA 0
Tallahassee, FL 0
Toledo, OH 0
Topeka, KS 0
Tuscaloosa, AL 0
Utica-Rome, NY 0
Valdosta, GA 0
Vallejo-Fairfield, CA 0
Vero Beach, FL 0
Victoria, TX 0
Vineland-Millville-Bridgeton, NJ 0
Virginia Beach-Norfolk-Newport News, VA-NC 0
Waco, TX 0
Waterbury, CT 0
Waterloo-Cedar Falls, IA 0
Wausau, WI 0
Wichita, KS 0
Worcester, MA-CT 0
Yakima, WA 0
York-Hanover, PA 0
Youngstown-Warren-Boardman, OH 0
Anderson, SC 1
Bloomington-Normal IL 1
Boise City-Nampa, ID 1
Cincinnati-Middletown, OH-KY-IN 1
Columbia, SC 1
Greenville, SC 1
Harrisburg-Carlisle, PA 1
Jacksonville, FL 1
Lawrence, KS 1
Naples-Marco Island, FL 1
New Orleans-Metairie-Kenner, LA 1
Olympia, WA 1
Provo-Orem, UT 1
Syracuse, NY 1
Tucson, AZ 1
Atlantic City, NJ 2
Bakersfield, CA 2
Birmingham-Hoover, AL 2
Burlington-South Burlington, VT 2
Charleston-North Charleston, SC 2
Cleveland-Elyria-Mentor, OH 2
Deltona-Daytona Beach-Ormond Beach, FL 2
Fort Wayne, IN 2
Las Vegas-Paradise, NV 2
Memphis, TN-MS-AR 2
Miami-Fort Lauderdale-Miami Beach, FL 2
Nashville-Davidson-Murfreesboro, TN 2
Ogden-Clearfield, UT 2
Oklahoma City, OK 2
Oxnard-Thousand Oaks-Ventura, CA 2
Phoenix-Mesa-Scottsdale, AZ 2
Rochester, NY 2
Salt Lake City, UT 2
Springfield, IL 2
Winston-Salem, NC 2
Albuquerque, NM 3
Iowa City, IA 3
Madison, WI 3
Norwich-New London, CT-RI 3
Reno-Sparks, NV 3
Visalia-Porterville, CA 3
Charlotte-Gastonia-Concord, NC-SC 4
Indianapolis, IN 4
Omaha-Council Bluffs, NE-IA 4
Peoria, IL 4
Rochester-Dover, NH-ME 4
San Diego-Carlsbad-San Marcos, CA 4
Trenton-Ewing, NJ 4
Tulsa, OK 4
Orlando, FL 5
Seattle-Tacoma-Bellevue, WA 5
Austin-Round Rock, TX 6
Brownsville-Harlingen, TX 6
Des Moines, IA 6
Little Rock-North Little Rock, AR 6
New Haven, CT 6
Portland-Vancouver-Beaverton, OR-WA 6
Warner Robins, GA 6
Tampa-St. Petersburg-Clearwater, FL 7
Fayetteville-Springdale-Rogers, AR-MO 8
Sacramento-Arden-Arcade-Roseville, CA 8
Honolulu, HI 9
Boston-Cambridge-Quincy, MA-NH 11
Kansas City, MO-KS 11
Bridgeport-Stamford-Norwalk, CT 12
Milwaukee-Waukesha-West Allis, WI 12
Providence-Fall River-Warwick, MA-RI 14
Houston-Baytown-Sugar Land, TX 15
Baltimore-Towson, MD 16
Fresno, CA 16
Pittsburgh, PA 16
Dallas-Fort Worth-Arlington, TX 18
Los Angeles-Long Beach-Santa Ana, CA 19
San Jose-Sunnyvale-Santa Clara, CA 19
Minneapolis-St Paul-Bloomington, MN-WI 23
Hartford-West Hartford-East Hartford, CT 26
Atlanta-Sandy Springs-Marietta, GA 27
San Francisco-Oakland-Fremont, CA 27
Detroit-Warren-Livonia, MI 30
Chicago-Naperville-Joliet, IN-IN-WI 31
Philadelphia-Camden-Wilmington, PA-NJ-DE 32
Washington-Arlington-Alexandria, DC-VA-MD-WV 50
New York-Northern New Jersey-Long Island, NY-NJ-PA 96


z = sort(tapply(CPS$Country == "Brazil", CPS$MetroArea, sum, na.rm=TRUE))
kable(z)
x
Albany-Schenectady-Troy, NY 0
Albany, GA 0
Allentown-Bethlehem-Easton, PA-NJ 0
Altoona, PA 0
Amarillo, TX 0
Anderson, IN 0
Anderson, SC 0
Ann Arbor, MI 0
Anniston-Oxford, AL 0
Appleton,WI 0
Asheville, NC 0
Athens-Clark County, GA 0
Atlantic City, NJ 0
Augusta-Richmond County, GA-SC 0
Austin-Round Rock, TX 0
Bakersfield, CA 0
Baltimore-Towson, MD 0
Bangor, ME 0
Baton Rouge, LA 0
Beaumont-Port Author, TX 0
Bellingham, WA 0
Bend, OR 0
Billings, MT 0
Binghamton, NY 0
Birmingham-Hoover, AL 0
Bloomington-Normal IL 0
Bloomington, IN 0
Boise City-Nampa, ID 0
Boulder, CO 0
Bowling Green, KY 0
Brownsville-Harlingen, TX 0
Buffalo-Niagara Falls, NY 0
Burlington-South Burlington, VT 0
Cedar Rapids, IA 0
Champaign-Urbana, IL 0
Charleston-North Charleston, SC 0
Charleston, WV 0
Chattanooga, TN-GA 0
Cleveland-Elyria-Mentor, OH 0
Coeur d’Alene, ID 0
Colorado Springs, CO 0
Columbia, MO 0
Columbus, GA-AL 0
Columbus, OH 0
Corpus Christi, TX 0
Dayton, OH 0
Decatur, Al 0
Decatur, IL 0
Deltona-Daytona Beach-Ormond Beach, FL 0
Des Moines, IA 0
Detroit-Warren-Livonia, MI 0
Dover, DE 0
Duluth, MN-WI 0
Durham, NC 0
Eau Claire, WI 0
El Centro, CA 0
El Paso, TX 0
Erie, PA 0
Eugene-Springfield, OR 0
Evansville, IN-KY 0
Fargo, ND-MN 0
Farmington, NM 0
Fayetteville-Springdale-Rogers, AR-MO 0
Fayetteville, NC 0
Flint, MI 0
Florence, AL 0
Fort Collins-Loveland, CO 0
Fort Smith, AR-OK 0
Fort Walton Beach-Crestview-Destin, FL 0
Fort Wayne, IN 0
Fresno, CA 0
Gainesville, FL 0
Grand Rapids-Wyoming, MI 0
Greeley, CO 0
Green Bay, WI 0
Greensboro-High Point, NC 0
Greenville, SC 0
Gulfport-Biloxi, MS 0
Hagerstown-Martinsburg, MD-WV 0
Harrisburg-Carlisle, PA 0
Harrisonburg, VA 0
Hickory-Morgantown-Lenoir, NC 0
Holland-Grand Haven, MI 0
Honolulu, HI 0
Houston-Baytown-Sugar Land, TX 0
Huntington-Ashland, WV-KY-OH 0
Huntsville, AL 0
Indianapolis, IN 0
Iowa City, IA 0
Jackson, MI 0
Jackson, MS 0
Jacksonville, NC 0
Janesville, WI 0
Johnson City, TN 0
Johnstown, PA 0
Joplin, MO 0
Kalamazoo-Portage, MI 0
Kankakee-Bradley, IL 0
Killeen-Temple-Fort Hood, TX 0
Kingsport-Bristol, TN-VA 0
Kingston, NY 0
Knoxville, TN 0
La Crosse, WI 0
Lafayette, LA 0
Lake Charles, LA 0
Lakeland-Winter Haven, FL 0
Lancaster, PA 0
Lansing-East Lansing, MI 0
Laredo, TX 0
Las Cruses, NM 0
Las Vegas-Paradise, NV 0
Lawrence, KS 0
Lawton, OK 0
Lexington-Fayette, KY 0
Little Rock-North Little Rock, AR 0
Longview, TX 0
Lubbock, TX 0
Lynchburg, VA 0
Macon, GA 0
Madera, CA 0
Madison, WI 0
McAllen-Edinburg-Pharr, TX 0
Medford, OR 0
Memphis, TN-MS-AR 0
Merced, CA 0
Michigan City-La Porte, IN 0
Midland, TX 0
Milwaukee-Waukesha-West Allis, WI 0
Mobile, AL 0
Modesto, CA 0
Monroe, MI 0
Muskegon-Norton Shores, MI 0
Myrtle Beach-Conway-North Myrtle Beach, SC 0
Napa, CA 0
Naples-Marco Island, FL 0
Nashville-Davidson-Murfreesboro, TN 0
New Haven, CT 0
New Orleans-Metairie-Kenner, LA 0
Niles-Benton Harbor, MI 0
Norwich-New London, CT-RI 0
Ocala, FL 0
Ocean City, NJ 0
Ogden-Clearfield, UT 0
Oklahoma City, OK 0
Olympia, WA 0
Omaha-Council Bluffs, NE-IA 0
Oshkosh-Neenah, WI 0
Palm Bay-Melbourne-Titusville, FL 0
Panama City-Lynn Haven, FL 0
Peoria, IL 0
Pittsburgh, PA 0
Port St. Lucie-Fort Pierce, FL 0
Portland-South Portland, ME 0
Portland-Vancouver-Beaverton, OR-WA 0
Poughkeepsie-Newburgh-Middletown, NY 0
Prescott, AZ 0
Provo-Orem, UT 0
Pueblo, CO 0
Punta Gorda, FL 0
Raleigh-Cary, NC 0
Reading, PA 0
Reno-Sparks, NV 0
Richmond, VA 0
Riverside-San Bernardino, CA 0
Roanoke, VA 0
Rochester-Dover, NH-ME 0
Rockford, IL 0
Saginaw-Saginaw Township North, MI 0
Salinas, CA 0
Salisbury, MD 0
San Antonio, TX 0
San Diego-Carlsbad-San Marcos, CA 0
San Luis Obispo-Paso Robles, CA 0
Santa-Cruz-Watsonville, CA 0
Santa Barbara-Santa Maria-Goleta, CA 0
Santa Fe, NM 0
Santa Rosa-Petaluma, CA 0
Sarasota-Bradenton-Venice, FL 0
Savannah, GA 0
Scranton-Wilkes Barre, PA 0
Shreveport-Bossier City, LA 0
Sioux Falls, SD 0
South Bend-Mishawaka, IN-MI 0
Spartanburg, SC 0
Spokane, WA 0
Springfield, IL 0
Springfield, MA-CT 0
Springfield, MO 0
Springfield, OH 0
St. Cloud, MN 0
St. Louis, MO-IL 0
Stockton, CA 0
Syracuse, NY 0
Tallahassee, FL 0
Toledo, OH 0
Topeka, KS 0
Tucson, AZ 0
Tulsa, OK 0
Tuscaloosa, AL 0
Utica-Rome, NY 0
Valdosta, GA 0
Vallejo-Fairfield, CA 0
Vero Beach, FL 0
Victoria, TX 0
Vineland-Millville-Bridgeton, NJ 0
Visalia-Porterville, CA 0
Waco, TX 0
Warner Robins, GA 0
Waterloo-Cedar Falls, IA 0
Wausau, WI 0
Winston-Salem, NC 0
Worcester, MA-CT 0
Yakima, WA 0
York-Hanover, PA 0
Youngstown-Warren-Boardman, OH 0
Akron, OH 1
Albuquerque, NM 1
Atlanta-Sandy Springs-Marietta, GA 1
Bremerton-Silverdale, WA 1
Cape Coral-Fort Myers, FL 1
Chico, CA 1
Cincinnati-Middletown, OH-KY-IN 1
Denver-Aurora, CO 1
Hartford-West Hartford-East Hartford, CT 1
Kansas City, MO-KS 1
Leominster-Fitchburg-Gardner, MA 1
Louisville, KY-IN 1
Minneapolis-St Paul-Bloomington, MN-WI 1
Monroe, LA 1
Montgomery, AL 1
Oxnard-Thousand Oaks-Ventura, CA 1
Pensacola-Ferry Pass-Brent, FL 1
Racine, WI 1
Rochester, NY 1
Salem, OR 1
San Jose-Sunnyvale-Santa Clara, CA 1
Seattle-Tacoma-Bellevue, WA 1
Tampa-St. Petersburg-Clearwater, FL 1
Trenton-Ewing, NJ 1
Virginia Beach-Norfolk-Newport News, VA-NC 1
Waterbury, CT 1
Wichita, KS 1
Barnstable Town, MA 2
Charlotte-Gastonia-Concord, NC-SC 2
Chicago-Naperville-Joliet, IN-IN-WI 2
Columbia, SC 2
Dallas-Fort Worth-Arlington, TX 2
Jacksonville, FL 2
Orlando, FL 2
Sacramento-Arden-Arcade-Roseville, CA 2
Canton-Massillon, OH 3
Phoenix-Mesa-Scottsdale, AZ 3
Providence-Fall River-Warwick, MA-RI 3
Salt Lake City, UT 3
Davenport-Moline-Rock Island, IA-IL 4
Philadelphia-Camden-Wilmington, PA-NJ-DE 4
Danbury, CT 5
San Francisco-Oakland-Fremont, CA 6
Bridgeport-Stamford-Norwalk, CT 7
New York-Northern New Jersey-Long Island, NY-NJ-PA 7
Washington-Arlington-Alexandria, DC-VA-MD-WV 8
Los Angeles-Long Beach-Santa Ana, CA 9
Miami-Fort Lauderdale-Miami Beach, FL 16
Boston-Cambridge-Quincy, MA-NH 18


z = sort(tapply(CPS$Country == "Somalia", CPS$MetroArea, sum, na.rm=TRUE))
kable(z)
x
Akron, OH 0
Albany-Schenectady-Troy, NY 0
Albany, GA 0
Albuquerque, NM 0
Allentown-Bethlehem-Easton, PA-NJ 0
Altoona, PA 0
Amarillo, TX 0
Anderson, IN 0
Anderson, SC 0
Ann Arbor, MI 0
Anniston-Oxford, AL 0
Appleton,WI 0
Asheville, NC 0
Athens-Clark County, GA 0
Atlanta-Sandy Springs-Marietta, GA 0
Atlantic City, NJ 0
Augusta-Richmond County, GA-SC 0
Austin-Round Rock, TX 0
Bakersfield, CA 0
Baltimore-Towson, MD 0
Bangor, ME 0
Barnstable Town, MA 0
Baton Rouge, LA 0
Beaumont-Port Author, TX 0
Bellingham, WA 0
Bend, OR 0
Billings, MT 0
Binghamton, NY 0
Birmingham-Hoover, AL 0
Bloomington-Normal IL 0
Bloomington, IN 0
Boise City-Nampa, ID 0
Boston-Cambridge-Quincy, MA-NH 0
Boulder, CO 0
Bowling Green, KY 0
Bremerton-Silverdale, WA 0
Bridgeport-Stamford-Norwalk, CT 0
Brownsville-Harlingen, TX 0
Buffalo-Niagara Falls, NY 0
Canton-Massillon, OH 0
Cape Coral-Fort Myers, FL 0
Cedar Rapids, IA 0
Champaign-Urbana, IL 0
Charleston-North Charleston, SC 0
Charleston, WV 0
Charlotte-Gastonia-Concord, NC-SC 0
Chattanooga, TN-GA 0
Chicago-Naperville-Joliet, IN-IN-WI 0
Chico, CA 0
Cincinnati-Middletown, OH-KY-IN 0
Cleveland-Elyria-Mentor, OH 0
Coeur d’Alene, ID 0
Colorado Springs, CO 0
Columbia, MO 0
Columbia, SC 0
Columbus, GA-AL 0
Corpus Christi, TX 0
Dallas-Fort Worth-Arlington, TX 0
Danbury, CT 0
Davenport-Moline-Rock Island, IA-IL 0
Decatur, Al 0
Decatur, IL 0
Deltona-Daytona Beach-Ormond Beach, FL 0
Denver-Aurora, CO 0
Des Moines, IA 0
Detroit-Warren-Livonia, MI 0
Dover, DE 0
Duluth, MN-WI 0
Durham, NC 0
Eau Claire, WI 0
El Centro, CA 0
El Paso, TX 0
Erie, PA 0
Eugene-Springfield, OR 0
Evansville, IN-KY 0
Farmington, NM 0
Fayetteville-Springdale-Rogers, AR-MO 0
Fayetteville, NC 0
Flint, MI 0
Florence, AL 0
Fort Collins-Loveland, CO 0
Fort Smith, AR-OK 0
Fort Walton Beach-Crestview-Destin, FL 0
Fort Wayne, IN 0
Fresno, CA 0
Gainesville, FL 0
Grand Rapids-Wyoming, MI 0
Greeley, CO 0
Green Bay, WI 0
Greensboro-High Point, NC 0
Greenville, SC 0
Gulfport-Biloxi, MS 0
Hagerstown-Martinsburg, MD-WV 0
Harrisburg-Carlisle, PA 0
Harrisonburg, VA 0
Hartford-West Hartford-East Hartford, CT 0
Hickory-Morgantown-Lenoir, NC 0
Holland-Grand Haven, MI 0
Honolulu, HI 0
Huntington-Ashland, WV-KY-OH 0
Huntsville, AL 0
Indianapolis, IN 0
Iowa City, IA 0
Jackson, MI 0
Jackson, MS 0
Jacksonville, FL 0
Jacksonville, NC 0
Janesville, WI 0
Johnson City, TN 0
Johnstown, PA 0
Joplin, MO 0
Kalamazoo-Portage, MI 0
Kankakee-Bradley, IL 0
Kansas City, MO-KS 0
Killeen-Temple-Fort Hood, TX 0
Kingsport-Bristol, TN-VA 0
Kingston, NY 0
Knoxville, TN 0
La Crosse, WI 0
Lafayette, LA 0
Lake Charles, LA 0
Lakeland-Winter Haven, FL 0
Lancaster, PA 0
Lansing-East Lansing, MI 0
Laredo, TX 0
Las Cruses, NM 0
Las Vegas-Paradise, NV 0
Lawrence, KS 0
Lawton, OK 0
Leominster-Fitchburg-Gardner, MA 0
Lexington-Fayette, KY 0
Little Rock-North Little Rock, AR 0
Longview, TX 0
Los Angeles-Long Beach-Santa Ana, CA 0
Louisville, KY-IN 0
Lubbock, TX 0
Lynchburg, VA 0
Macon, GA 0
Madera, CA 0
Madison, WI 0
McAllen-Edinburg-Pharr, TX 0
Medford, OR 0
Memphis, TN-MS-AR 0
Merced, CA 0
Miami-Fort Lauderdale-Miami Beach, FL 0
Michigan City-La Porte, IN 0
Midland, TX 0
Milwaukee-Waukesha-West Allis, WI 0
Mobile, AL 0
Modesto, CA 0
Monroe, LA 0
Monroe, MI 0
Montgomery, AL 0
Muskegon-Norton Shores, MI 0
Myrtle Beach-Conway-North Myrtle Beach, SC 0
Napa, CA 0
Naples-Marco Island, FL 0
Nashville-Davidson-Murfreesboro, TN 0
New Haven, CT 0
New Orleans-Metairie-Kenner, LA 0
New York-Northern New Jersey-Long Island, NY-NJ-PA 0
Niles-Benton Harbor, MI 0
Norwich-New London, CT-RI 0
Ocala, FL 0
Ocean City, NJ 0
Ogden-Clearfield, UT 0
Oklahoma City, OK 0
Olympia, WA 0
Omaha-Council Bluffs, NE-IA 0
Orlando, FL 0
Oshkosh-Neenah, WI 0
Oxnard-Thousand Oaks-Ventura, CA 0
Palm Bay-Melbourne-Titusville, FL 0
Panama City-Lynn Haven, FL 0
Pensacola-Ferry Pass-Brent, FL 0
Peoria, IL 0
Philadelphia-Camden-Wilmington, PA-NJ-DE 0
Pittsburgh, PA 0
Port St. Lucie-Fort Pierce, FL 0
Poughkeepsie-Newburgh-Middletown, NY 0
Prescott, AZ 0
Providence-Fall River-Warwick, MA-RI 0
Provo-Orem, UT 0
Pueblo, CO 0
Punta Gorda, FL 0
Racine, WI 0
Raleigh-Cary, NC 0
Reading, PA 0
Reno-Sparks, NV 0
Riverside-San Bernardino, CA 0
Roanoke, VA 0
Rochester-Dover, NH-ME 0
Rochester, NY 0
Rockford, IL 0
Sacramento-Arden-Arcade-Roseville, CA 0
Saginaw-Saginaw Township North, MI 0
Salem, OR 0
Salinas, CA 0
Salisbury, MD 0
Salt Lake City, UT 0
San Antonio, TX 0
San Diego-Carlsbad-San Marcos, CA 0
San Francisco-Oakland-Fremont, CA 0
San Jose-Sunnyvale-Santa Clara, CA 0
San Luis Obispo-Paso Robles, CA 0
Santa-Cruz-Watsonville, CA 0
Santa Barbara-Santa Maria-Goleta, CA 0
Santa Fe, NM 0
Santa Rosa-Petaluma, CA 0
Sarasota-Bradenton-Venice, FL 0
Savannah, GA 0
Scranton-Wilkes Barre, PA 0
Shreveport-Bossier City, LA 0
South Bend-Mishawaka, IN-MI 0
Spartanburg, SC 0
Spokane, WA 0
Springfield, IL 0
Springfield, MA-CT 0
Springfield, MO 0
Springfield, OH 0
St. Louis, MO-IL 0
Stockton, CA 0
Syracuse, NY 0
Tallahassee, FL 0
Tampa-St. Petersburg-Clearwater, FL 0
Toledo, OH 0
Topeka, KS 0
Trenton-Ewing, NJ 0
Tucson, AZ 0
Tulsa, OK 0
Tuscaloosa, AL 0
Utica-Rome, NY 0
Valdosta, GA 0
Vallejo-Fairfield, CA 0
Vero Beach, FL 0
Victoria, TX 0
Vineland-Millville-Bridgeton, NJ 0
Virginia Beach-Norfolk-Newport News, VA-NC 0
Visalia-Porterville, CA 0
Waco, TX 0
Warner Robins, GA 0
Washington-Arlington-Alexandria, DC-VA-MD-WV 0
Waterbury, CT 0
Waterloo-Cedar Falls, IA 0
Wausau, WI 0
Wichita, KS 0
Winston-Salem, NC 0
Worcester, MA-CT 0
Yakima, WA 0
York-Hanover, PA 0
Youngstown-Warren-Boardman, OH 0
Dayton, OH 1
Richmond, VA 1
Houston-Baytown-Sugar Land, TX 2
Sioux Falls, SD 2
Burlington-South Burlington, VT 3
Portland-South Portland, ME 3
Portland-Vancouver-Beaverton, OR-WA 3
Columbus, OH 5
Fargo, ND-MN 5
Phoenix-Mesa-Scottsdale, AZ 7
Seattle-Tacoma-Bellevue, WA 7
St. Cloud, MN 7
Minneapolis-St Paul-Bloomington, MN-WI 17

We see that New York has the most interviewees born in India (96), Boston has the most born in Brazil (18), and Minneapolis has the most born in Somalia (17).
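
Rather than printing three full tables, the top metropolitan area for each country could be extracted directly. The following is a minimal sketch on made-up data; top_metro is a hypothetical helper name, and which.max() returns only the first maximum, so ties would need separate handling.

```r
# Toy data: hypothetical values, not CPS records
toy = data.frame(
  MetroArea = c("A", "A", "B", "B", "B", "C"),
  Country   = c("India", "India", "Brazil", "India", NA, "Somalia")
)

# Hypothetical helper: metro area with the largest count for one country
top_metro = function(df, country) {
  counts = tapply(df$Country == country, df$MetroArea, sum, na.rm=TRUE)
  names(which.max(counts))
}

# Apply the helper to each country of interest
result = sapply(c("India", "Brazil", "Somalia"), function(cn) top_metro(toy, cn))
print(result)
```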