In the wake of the Great Recession of 2009, there has been a good deal of focus on employment statistics, one of the most important metrics policymakers use to gauge the overall strength of the economy. In the United States, the government measures unemployment using the Current Population Survey (CPS), which collects demographic and employment information from a wide range of Americans each month. In this exercise, we will employ the topics reviewed in the lectures as well as a few new techniques using the September 2013 version of this rich, nationally representative dataset (available online).
The observations in the dataset represent people surveyed in the September 2013 CPS who actually completed a survey. While the full dataset has 385 variables, in this exercise we will use a more compact version of the dataset, CPSData.csv, which has the following variables:
PeopleInHousehold: The number of people in the interviewee’s household.
Region: The census region where the interviewee lives.
State: The state where the interviewee lives.
MetroAreaCode: A code that identifies the metropolitan area in which the interviewee lives (missing if the interviewee does not live in a metropolitan area). The mapping from codes to names of metropolitan areas is provided in the file MetroAreaCodes.csv.
Age: The age, in years, of the interviewee. 80 represents people aged 80-84, and 85 represents people aged 85 and higher.
Married: The marriage status of the interviewee.
Sex: The sex of the interviewee.
Education: The maximum level of education obtained by the interviewee.
Race: The race of the interviewee.
Hispanic: Whether the interviewee is of Hispanic ethnicity.
CountryOfBirthCode: A code identifying the country of birth of the interviewee. The mapping from codes to names of countries is provided in the file CountryCodes.csv.
Citizenship: The United States citizenship status of the interviewee.
EmploymentStatus: The status of employment of the interviewee.
Industry: The industry of employment of the interviewee (only available if they are employed).
Load the dataset from CPSData.csv into a data frame called CPS, and view the dataset with the summary() and str() commands.
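CPS = read.csv('CPSData.csv', stringsAsFactors = TRUE)  # factors match the str() output below; this was the default before R 4.0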
summary(CPS)
PeopleInHousehold Region State MetroAreaCode Age Married
Min. : 1.000 Midwest :30684 California :11570 Min. :10420 Min. : 0.00 Divorced :11151
1st Qu.: 2.000 Northeast:25939 Texas : 7077 1st Qu.:21780 1st Qu.:19.00 Married :55509
Median : 3.000 South :41502 New York : 5595 Median :34740 Median :39.00 Never Married:30772
Mean : 3.284 West :33177 Florida : 5149 Mean :35075 Mean :38.83 Separated : 2027
3rd Qu.: 4.000 Pennsylvania: 3930 3rd Qu.:41860 3rd Qu.:57.00 Widowed : 6505
Max. :15.000 Illinois : 3912 Max. :79600 Max. :85.00 NA's :25338
(Other) :94069 NA's :34238
Sex Education Race Hispanic CountryOfBirthCode
Female:67481 High school :30906 American Indian : 1433 Min. :0.0000 Min. : 57.00
Male :63821 Bachelor's degree :19443 Asian : 6520 1st Qu.:0.0000 1st Qu.: 57.00
Some college, no degree:18863 Black : 13913 Median :0.0000 Median : 57.00
No high school diploma :16095 Multiracial : 2897 Mean :0.1393 Mean : 82.68
Associate degree : 9913 Pacific Islander: 618 3rd Qu.:0.0000 3rd Qu.: 57.00
(Other) :10744 White :105921 Max. :1.0000 Max. :555.00
NA's :25338
Citizenship EmploymentStatus Industry
Citizen, Native :116639 Disabled : 5712 Educational and health services :15017
Citizen, Naturalized: 7073 Employed :61733 Trade : 8933
Non-Citizen : 7590 Not in Labor Force:15246 Professional and business services: 7519
Retired :18619 Manufacturing : 6791
Unemployed : 4203 Leisure and hospitality : 6364
NA's :25789 (Other) :21618
NA's :65060
str(CPS)
'data.frame': 131302 obs. of 14 variables:
$ PeopleInHousehold : int 1 3 3 3 3 3 3 2 2 2 ...
$ Region : Factor w/ 4 levels "Midwest","Northeast",..: 3 3 3 3 3 3 3 3 3 3 ...
$ State : Factor w/ 51 levels "Alabama","Alaska",..: 1 1 1 1 1 1 1 1 1 1 ...
$ MetroAreaCode : int 26620 13820 13820 13820 26620 26620 26620 33660 33660 26620 ...
$ Age : int 85 21 37 18 52 24 26 71 43 52 ...
$ Married : Factor w/ 5 levels "Divorced","Married",..: 5 3 3 3 5 3 3 1 1 3 ...
$ Sex : Factor w/ 2 levels "Female","Male": 1 2 1 2 1 2 2 1 2 2 ...
$ Education : Factor w/ 8 levels "Associate degree",..: 1 4 4 6 1 2 4 4 4 2 ...
$ Race : Factor w/ 6 levels "American Indian",..: 6 3 3 3 6 6 6 6 6 6 ...
$ Hispanic : int 0 0 0 0 0 0 0 0 0 0 ...
$ CountryOfBirthCode: int 57 57 57 57 57 57 57 57 57 57 ...
$ Citizenship : Factor w/ 3 levels "Citizen, Native",..: 1 1 1 1 1 1 1 1 1 1 ...
$ EmploymentStatus : Factor w/ 5 levels "Disabled","Employed",..: 4 5 1 3 2 2 2 2 3 2 ...
$ Industry : Factor w/ 14 levels "Agriculture, forestry, fishing, and hunting",..: NA 11 NA NA 11 4 14 4 NA 12 ...
Among the interviewees with a value reported for the Industry variable, what is the most common industry of employment? Please enter the name exactly how you see it.
sort(table(CPS$Industry))
Armed forces Mining
29 550
Agriculture, forestry, fishing, and hunting Information
1307 1328
Public administration Other services
3186 3224
Transportation and utilities Financial
3260 4347
Construction Leisure and hospitality
4387 6364
Manufacturing Professional and business services
6791 7519
Trade Educational and health services
8933 15017
print("Educational and health services")
[1] "Educational and health services"
Recall from the homework assignment “The Analytical Detective” that you can call the sort() function on the output of the table() function to obtain a sorted breakdown of a variable. For instance, sort(table(CPS$Region)) sorts the regions by the number of interviewees from that region.
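As a minimal sketch of the sort(table()) idiom described above, the following uses a small made-up vector (the values are hypothetical and are not drawn from CPSData.csv). Because sort() orders counts ascending by default, the least common value comes first and the most common comes last.

```r
# Hypothetical toy data, not taken from CPSData.csv
region <- c("South", "West", "South", "Midwest", "South", "West")

counts <- table(region)   # frequency of each distinct value
sorted <- sort(counts)    # ascending by count

print(sorted)
least_common <- names(sorted)[1]               # first entry has the smallest count
most_common  <- names(sorted)[length(sorted)]  # last entry has the largest count

# decreasing = TRUE puts the most common value first instead
sort(counts, decreasing = TRUE)
```

On the real data, the same pattern applies to any factor column, for example sort(table(CPS$State)).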
Which state has the fewest interviewees?
sort(table(CPS$State))
New Mexico Montana Mississippi Alabama West Virginia
1102 1214 1230 1376 1409
Arkansas Louisiana Idaho Oklahoma Arizona
1421 1450 1518 1523 1528
Alaska Wyoming North Dakota South Carolina Tennessee
1590 1624 1645 1658 1784
District of Columbia Kentucky Utah Nevada Vermont
1791 1841 1842 1856 1890
Kansas Oregon Nebraska Massachusetts South Dakota
1935 1943 1949 1987 2000
Indiana Hawaii Missouri Rhode Island Delaware
2004 2099 2145 2209 2214
Maine Washington Iowa New Jersey North Carolina
2263 2366 2528 2567 2619
New Hampshire Wisconsin Georgia Connecticut Colorado
2662 2686 2807 2836 2925
Virginia Michigan Minnesota Maryland Ohio
2953 3063 3139 3200 3678
Illinois Pennsylvania Florida New York Texas
3912 3930 5149 5595 7077
California
11570
print('New Mexico')
[1] "New Mexico"
Which state has the largest number of interviewees?
print('California')
[1] "California"
What proportion of interviewees are citizens of the United States?
table(CPS$Citizenship)
Citizen, Native Citizen, Naturalized Non-Citizen
116639 7073 7590
print((116639+7073)/(116639+7073+7590))
[1] 0.9421943
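The same proportion can be obtained without retyping the counts by hand. The sketch below rebuilds the counts from the table(CPS$Citizenship) output above as a named vector (the variable names are illustrative) and lets prop.table() do the division.

```r
# Counts copied from the table(CPS$Citizenship) output above
citizenship <- c("Citizen, Native"      = 116639,
                 "Citizen, Naturalized" = 7073,
                 "Non-Citizen"          = 7590)

shares <- prop.table(citizenship)  # each count divided by the total
citizen_share <- sum(shares[c("Citizen, Native", "Citizen, Naturalized")])
print(citizen_share)
```

With the CPS data frame loaded, mean(CPS$Citizenship != "Non-Citizen") should give the same figure, since Citizenship has no missing values in the summary output.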
The CPS differentiates between race (with possible values American Indian, Asian, Black, Pacific Islander, White, or Multiracial) and ethnicity. A number of interviewees are of Hispanic ethnicity, as captured by the Hispanic variable. For which races are there at least 250 interviewees in the CPS dataset of Hispanic ethnicity? (Select all that apply.)
table(CPS$Race,CPS$Hispanic)
0 1
American Indian 1129 304
Asian 6407 113
Black 13292 621
Multiracial 2449 448
Pacific Islander 541 77
White 89190 16731
print("American Indian,Black,Multiracial,White")
[1] "American Indian,Black,Multiracial,White"
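Rather than reading the qualifying races off the printed table, the threshold can be applied programmatically. This sketch rebuilds the cross-tabulation above as a matrix (the object name tab is illustrative) and keeps the rows whose Hispanic count is at least 250.

```r
# Counts copied from the table(CPS$Race, CPS$Hispanic) output above
tab <- matrix(c(1129, 6407, 13292, 2449, 541, 89190,   # Hispanic == 0
                 304,  113,   621,  448,  77, 16731),  # Hispanic == 1
              ncol = 2,
              dimnames = list(c("American Indian", "Asian", "Black",
                                "Multiracial", "Pacific Islander", "White"),
                              c("0", "1")))

# Row names where the "1" (Hispanic) column meets the threshold
races <- rownames(tab)[tab[, "1"] >= 250]
print(races)
```

The same indexing should work directly on the object returned by table(CPS$Race, CPS$Hispanic).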
Which variables have at least one interviewee with a missing (NA) value? (Select all that apply.)
summary(CPS)
PeopleInHousehold Region State MetroAreaCode Age Married
Min. : 1.000 Midwest :30684 California :11570 Min. :10420 Min. : 0.00 Divorced :11151
1st Qu.: 2.000 Northeast:25939 Texas : 7077 1st Qu.:21780 1st Qu.:19.00 Married :55509
Median : 3.000 South :41502 New York : 5595 Median :34740 Median :39.00 Never Married:30772
Mean : 3.284 West :33177 Florida : 5149 Mean :35075 Mean :38.83 Separated : 2027
3rd Qu.: 4.000 Pennsylvania: 3930 3rd Qu.:41860 3rd Qu.:57.00 Widowed : 6505
Max. :15.000 Illinois : 3912 Max. :79600 Max. :85.00 NA's :25338
(Other) :94069 NA's :34238
Sex Education Race Hispanic CountryOfBirthCode
Female:67481 High school :30906 American Indian : 1433 Min. :0.0000 Min. : 57.00
Male :63821 Bachelor's degree :19443 Asian : 6520 1st Qu.:0.0000 1st Qu.: 57.00
Some college, no degree:18863 Black : 13913 Median :0.0000 Median : 57.00
No high school diploma :16095 Multiracial : 2897 Mean :0.1393 Mean : 82.68
Associate degree : 9913 Pacific Islander: 618 3rd Qu.:0.0000 3rd Qu.: 57.00
(Other) :10744 White :105921 Max. :1.0000 Max. :555.00
NA's :25338
Citizenship EmploymentStatus Industry
Citizen, Native :116639 Disabled : 5712 Educational and health services :15017
Citizen, Naturalized: 7073 Employed :61733 Trade : 8933
Non-Citizen : 7590 Not in Labor Force:15246 Professional and business services: 7519
Retired :18619 Manufacturing : 6791
Unemployed : 4203 Leisure and hospitality : 6364
NA's :25789 (Other) :21618
NA's :65060
print('MetroAreaCode,Married,Education,EmploymentStatus,Industry')
[1] "MetroAreaCode,Married,Education,EmploymentStatus,Industry"
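Scanning summary() output for NA's rows works, but it is easy to miss a column. One alternative, sketched here on a small hypothetical data frame (the values are made up), is to count missing entries per column with colSums(is.na(...)).

```r
# Hypothetical toy data frame with some missing values
toy <- data.frame(
  Age      = c(10, 35, 62),
  Married  = c(NA, "Married", "Widowed"),
  Industry = c(NA, "Trade", NA),
  stringsAsFactors = FALSE
)

na_counts <- colSums(is.na(toy))          # number of NAs in each column
print(na_counts)
with_na <- names(na_counts)[na_counts > 0]  # columns with at least one NA
print(with_na)
```

Applied to the real data, colSums(is.na(CPS)) would be expected to report non-zero counts for the five variables listed in the answer above.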
Often when evaluating a new dataset, we try to identify if there is a pattern in the missing values in the dataset. We will try to determine if there is a pattern in the missing values of the Married variable. The function
is.na(CPS$Married)
returns a vector of TRUE/FALSE values for whether the Married variable is missing. We can see the breakdown of whether Married is missing based on the reported value of the Region variable with the function
table(CPS$Region, is.na(CPS$Married))
Which statement about the missing values of the Married variable is the most accurate?
table(CPS$Region, is.na(CPS$Married))
FALSE TRUE
Midwest 24609 6075
Northeast 21432 4507
South 33535 7967
West 26388 6789
table(CPS$Sex, is.na(CPS$Married))
FALSE TRUE
Female 55264 12217
Male 50700 13121
table(CPS$Age, is.na(CPS$Married))
FALSE TRUE
0 0 1283
1 0 1559
2 0 1574
3 0 1693
4 0 1695
5 0 1795
6 0 1721
7 0 1681
8 0 1729
9 0 1748
10 0 1750
11 0 1721
12 0 1797
13 0 1802
14 0 1790
15 1795 0
16 1751 0
17 1764 0
18 1596 0
19 1517 0
20 1398 0
21 1525 0
22 1536 0
23 1638 0
24 1627 0
25 1604 0
26 1643 0
27 1657 0
28 1736 0
29 1645 0
30 1854 0
31 1762 0
32 1790 0
33 1804 0
34 1653 0
35 1716 0
36 1663 0
37 1531 0
38 1530 0
39 1542 0
40 1571 0
41 1673 0
42 1711 0
43 1819 0
44 1764 0
45 1749 0
46 1665 0
47 1647 0
48 1791 0
49 1989 0
50 1966 0
51 1931 0
52 1935 0
53 1994 0
54 1912 0
55 1895 0
56 1935 0
57 1827 0
58 1874 0
59 1758 0
60 1746 0
61 1735 0
62 1595 0
63 1596 0
64 1519 0
65 1569 0
66 1577 0
67 1227 0
68 1130 0
69 1062 0
70 1195 0
71 1031 0
72 941 0
73 896 0
74 842 0
75 763 0
76 729 0
77 698 0
78 659 0
79 661 0
80 2664 0
85 2446 0
table(CPS$Citizenship, is.na(CPS$Married))
FALSE TRUE
Citizen, Native 91956 24683
Citizen, Naturalized 6910 163
Non-Citizen 7098 492
print('The Married variable being missing is related to the Age value for the interviewee.')
[1] "The Married variable being missing is related to the Age value for the interviewee."
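The Age table above shows every interviewee aged 0 to 14 with a missing Married value and no one aged 15 or older with one. A compact way to check a pattern like this is to cross-tabulate the missingness flag against a derived grouping. The sketch below does so on a small hypothetical data frame (values are made up); the cutoff of 15 is read off the table above.

```r
# Hypothetical toy data mimicking the pattern seen above
toy <- data.frame(
  Age     = c(3, 9, 14, 15, 22, 40, 67),
  Sex     = c("Female", "Male", "Male", "Female", "Male", "Female", "Female"),
  Married = c(NA, NA, NA, "Never Married", "Never Married", "Married", "Widowed"),
  stringsAsFactors = FALSE
)

missing <- is.na(toy$Married)   # TRUE where Married is missing

# Missingness is spread across both sexes ...
print(table(toy$Sex, missing))

# ... but lines up exactly with being under 15
by_age <- table(Under15 = toy$Age < 15, Missing = missing)
print(by_age)
```

A table whose off-diagonal cells are all zero, as here, indicates that the grouping fully determines whether the value is missing.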
As mentioned in the variable descriptions, MetroAreaCode is missing if an interviewee does not live in a metropolitan area. Using the same technique as in the previous question, answer the following questions about people who live in non-metropolitan areas.
How many states had all interviewees living in a non-metropolitan area (i.e., every interviewee has a missing MetroAreaCode value)? For this question, treat the District of Columbia as a state (even though it is not technically a state).
table(CPS$State,(is.na(CPS$MetroAreaCode)))
FALSE TRUE
Alabama 1020 356
Alaska 0 1590
Arizona 1327 201
Arkansas 724 697
California 11333 237
Colorado 2545 380
Connecticut 2593 243
Delaware 1696 518
District of Columbia 1791 0
Florida 4947 202
Georgia 2250 557
Hawaii 1576 523
Idaho 761 757
Illinois 3473 439
Indiana 1420 584
Iowa 1297 1231
Kansas 1234 701
Kentucky 908 933
Louisiana 1216 234
Maine 909 1354
Maryland 2978 222
Massachusetts 1858 129
Michigan 2517 546
Minnesota 2150 989
Mississippi 376 854
Missouri 1440 705
Montana 199 1015
Nebraska 816 1133
Nevada 1609 247
New Hampshire 1148 1514
New Jersey 2567 0
New Mexico 832 270
New York 5144 451
North Carolina 1642 977
North Dakota 432 1213
Ohio 2754 924
Oklahoma 1024 499
Oregon 1519 424
Pennsylvania 3245 685
Rhode Island 2209 0
South Carolina 1139 519
South Dakota 595 1405
Tennessee 1149 635
Texas 6060 1017
Utah 1455 387
Vermont 657 1233
Virginia 2367 586
Washington 1937 429
West Virginia 344 1065
Wisconsin 1882 804
Wyoming 0 1624
print('Two states: Alaska and Wyoming')
[1] "Two states: Alaska and Wyoming"
How many states had all interviewees living in a metropolitan area? Again, treat the District of Columbia as a state.
print('Three states: District of Columbia, New Jersey, and Rhode Island')
[1] "Three states: District of Columbia, New Jersey, and Rhode Island"
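Both of the preceding questions can also be answered without scanning the 51-row table for zeros. One possible approach, sketched here on hypothetical data (state labels A, B, C and the codes are made up), uses tapply() with all() to test whether every interviewee in a state is non-metropolitan or metropolitan.

```r
# Hypothetical toy data: state A is entirely non-metro, B entirely metro, C mixed
toy <- data.frame(
  State         = c("A", "A", "B", "B", "C", "C"),
  MetroAreaCode = c(NA, NA, 10420, 10500, NA, 10580),
  stringsAsFactors = FALSE
)

non_metro <- is.na(toy$MetroAreaCode)

# all() is TRUE for a state only if every one of its rows satisfies the condition
all_non_metro <- tapply(non_metro,  toy$State, all)
all_metro     <- tapply(!non_metro, toy$State, all)

only_non_metro_states <- names(all_non_metro)[all_non_metro]
only_metro_states     <- names(all_metro)[all_metro]
print(only_non_metro_states)
print(only_metro_states)
```

On the CPS data, sum(tapply(is.na(CPS$MetroAreaCode), CPS$State, all)) would be expected to return the count of entirely non-metropolitan states.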
Which region of the United States has the largest proportion of interviewees living in a non-metropolitan area?
table(CPS$Region,is.na(CPS$MetroAreaCode))
FALSE TRUE
Midwest 20010 10674
Northeast 20330 5609
South 31631 9871
West 25093 8084
10674/(20010+10674)
[1] 0.3478686
5609/(20330+5609)
[1] 0.2162381
9871/(31631+9871)
[1] 0.237844
8084/(25093+8084)
[1] 0.2436628
print('Midwest')
[1] "Midwest"
While we were able to use the table() command to compute the proportion of interviewees from each region not living in a metropolitan area, it was somewhat tedious (it involved manually computing the proportion for each region) and isn’t something you would want to do if there were a larger number of options. It turns out there is a less tedious way to compute the proportion of values that are TRUE. The mean() function, which takes the average of the values passed to it, will treat TRUE as 1 and FALSE as 0, meaning it returns the proportion of values that are true. For instance, mean(c(TRUE, FALSE, TRUE, TRUE)) returns 0.75. Knowing this, use tapply() with the mean function to answer the following questions:
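To make the idea concrete, here is a small sketch on hypothetical data (the rows are made up) showing that mean() of a logical vector is a proportion, and that tapply() applies it once per group.

```r
# TRUE counts as 1 and FALSE as 0, so the mean is the share of TRUE values
example_mean <- mean(c(TRUE, FALSE, TRUE, TRUE))
print(example_mean)

# Hypothetical toy data, not taken from CPSData.csv
toy <- data.frame(
  Region        = c("Midwest", "Midwest", "South", "South", "South", "West"),
  MetroAreaCode = c(NA, 19780, 13820, NA, 26620, NA),
  stringsAsFactors = FALSE
)

# Proportion of rows with a missing MetroAreaCode, computed separately per region
prop_non_metro <- tapply(is.na(toy$MetroAreaCode), toy$Region, mean)
print(prop_non_metro)
```

The three arguments to tapply() are the values to summarize, the grouping variable, and the function to apply within each group.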
Which state has a proportion of interviewees living in a non-metropolitan area closest to 30%?
tapply(is.na(CPS$MetroAreaCode),CPS$State,mean)
Alabama Alaska Arizona Arkansas California
0.25872093 1.00000000 0.13154450 0.49049965 0.02048401
Colorado Connecticut Delaware District of Columbia Florida
0.12991453 0.08568406 0.23396567 0.00000000 0.03923092
Georgia Hawaii Idaho Illinois Indiana
0.19843249 0.24916627 0.49868248 0.11221881 0.29141717
Iowa Kansas Kentucky Louisiana Maine
0.48694620 0.36227390 0.50678979 0.16137931 0.59832081
Maryland Massachusetts Michigan Minnesota Mississippi
0.06937500 0.06492199 0.17825661 0.31506849 0.69430894
Missouri Montana Nebraska Nevada New Hampshire
0.32867133 0.83607908 0.58132376 0.13308190 0.56874530
New Jersey New Mexico New York North Carolina North Dakota
0.00000000 0.24500907 0.08060769 0.37304315 0.73738602
Ohio Oklahoma Oregon Pennsylvania Rhode Island
0.25122349 0.32764281 0.21821925 0.17430025 0.00000000
South Carolina South Dakota Tennessee Texas Utah
0.31302774 0.70250000 0.35594170 0.14370496 0.21009772
Vermont Virginia Washington West Virginia Wisconsin
0.65238095 0.19844226 0.18131868 0.75585522 0.29932986
Wyoming
1.00000000
print('Wisconsin')
[1] "Wisconsin"
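Finding the value closest to a target by eye is error-prone with 51 entries. A possible shortcut, sketched below with four proportions copied from the tapply() output above, is to minimize the absolute distance to 0.30 with which.min().

```r
# Proportions copied from the tapply() output above (a subset near 30%)
p <- c(Indiana          = 0.29141717,
       Wisconsin        = 0.29932986,
       "South Carolina" = 0.31302774,
       Minnesota        = 0.31506849)

# which.min() returns the position of the smallest distance, keeping the name
closest <- names(which.min(abs(p - 0.30)))
print(closest)
```

On the full result the same expression applies unchanged, with p replaced by tapply(is.na(CPS$MetroAreaCode), CPS$State, mean).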
Which state has the largest proportion of non-metropolitan interviewees, ignoring states where all interviewees were non-metropolitan?
sort(tapply(is.na(CPS$MetroAreaCode),CPS$State,mean))
District of Columbia New Jersey Rhode Island California Florida
0.00000000 0.00000000 0.00000000 0.02048401 0.03923092
Massachusetts Maryland New York Connecticut Illinois
0.06492199 0.06937500 0.08060769 0.08568406 0.11221881
Colorado Arizona Nevada Texas Louisiana
0.12991453 0.13154450 0.13308190 0.14370496 0.16137931
Pennsylvania Michigan Washington Georgia Virginia
0.17430025 0.17825661 0.18131868 0.19843249 0.19844226
Utah Oregon Delaware New Mexico Hawaii
0.21009772 0.21821925 0.23396567 0.24500907 0.24916627
Ohio Alabama Indiana Wisconsin South Carolina
0.25122349 0.25872093 0.29141717 0.29932986 0.31302774
Minnesota Oklahoma Missouri Tennessee Kansas
0.31506849 0.32764281 0.32867133 0.35594170 0.36227390
North Carolina Iowa Arkansas Idaho Kentucky
0.37304315 0.48694620 0.49049965 0.49868248 0.50678979
New Hampshire Nebraska Maine Vermont Mississippi
0.56874530 0.58132376 0.59832081 0.65238095 0.69430894
South Dakota North Dakota West Virginia Montana Alaska
0.70250000 0.73738602 0.75585522 0.83607908 1.00000000
Wyoming
1.00000000
print('Montana')
[1] "Montana"
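The exclusion of fully non-metropolitan states can be expressed as a filter before taking the maximum. This sketch uses the four largest proportions from the sorted output above; the variable names are illustrative.

```r
# Largest proportions copied from the sorted tapply() output above
p <- c("West Virginia" = 0.75585522,
       Montana         = 0.83607908,
       Alaska          = 1,
       Wyoming         = 1)

partial <- p[p < 1]                      # drop states that are entirely non-metro
largest <- names(which.max(partial))     # name of the largest remaining proportion
print(largest)
```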
Codes like MetroAreaCode and CountryOfBirthCode are a compact way to encode factor variables with text as their possible values, and they are therefore quite common in survey datasets. In fact, all but one of the variables in this dataset were actually stored by a numeric code in the original CPS datafile.
When analyzing a variable stored by a numeric code, we will often want to convert it into the values the codes represent. To do this, we will use a dictionary, which maps the code to the actual value of the variable. We have provided dictionaries MetroAreaCodes.csv and CountryCodes.csv, which respectively map MetroAreaCode and CountryOfBirthCode into their true values. Read these two dictionaries into data frames MetroAreaMap and CountryMap.
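To illustrate what a dictionary lookup does, the sketch below decodes a short vector of codes with match(). The two code/name pairs are the ones that can be read off the str() outputs elsewhere in this exercise (10420 and 10580); the data frame name dictionary and the input vector codes are hypothetical.

```r
# Two-row stand-in for a code dictionary such as MetroAreaCodes.csv
dictionary <- data.frame(
  Code      = c(10420, 10580),
  MetroArea = c("Akron, OH", "Albany-Schenectady-Troy, NY"),
  stringsAsFactors = FALSE
)

# Hypothetical codes to decode; NA stands for a non-metropolitan interviewee
codes <- c(10580, NA, 10420, 10420)

# match() gives the dictionary row for each code (NA when there is no match)
decoded <- dictionary$MetroArea[match(codes, dictionary$Code)]
print(decoded)
```

This is only a way to see the mapping in isolation; for whole data frames a merge on the code column does the same job.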
How many observations (codes for metropolitan areas) are there in MetroAreaMap?
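# The assignment text calls these data frames MetroAreaMap and CountryMap; here they are named after their source files.
MetroAreaCodes=read.csv("MetroAreaCodes.csv")
CountryCodes=read.csv("CountryCodes.csv")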
str(MetroAreaCodes)
'data.frame': 271 obs. of 2 variables:
$ Code : int 460 3000 3160 3610 3720 6450 10420 10500 10580 10740 ...
$ MetroArea: Factor w/ 271 levels "Akron, OH","Albany-Schenectady-Troy, NY",..: 12 92 97 117 122 195 1 3 2 4 ...
print(271)
[1] 271
How many observations (codes for countries) are there in CountryMap?
str(CountryCodes)
'data.frame': 149 obs. of 2 variables:
$ Code : int 57 66 73 78 96 100 102 103 104 105 ...
$ Country: Factor w/ 149 levels "Afghanistan",..: 139 57 105 135 97 3 11 18 24 37 ...
print(149)
[1] 149
To merge in the metropolitan areas, we want to connect the field MetroAreaCode from the CPS data frame with the field Code in MetroAreaMap. The following command merges the two data frames on these columns, overwriting the CPS data frame with the result:
CPS = merge(CPS, MetroAreaMap, by.x="MetroAreaCode", by.y="Code", all.x=TRUE)
The first two arguments determine the data frames to be merged (they are called "x" and "y", respectively, in the subsequent parameters to the merge function). by.x="MetroAreaCode" means we’re matching on the MetroAreaCode variable from the "x" data frame (CPS), while by.y="Code" means we’re matching on the Code variable from the "y" data frame (MetroAreaMap). Finally, all.x=TRUE means we want to keep all rows from the "x" data frame (CPS), even if some of the rows’ MetroAreaCode doesn’t match any codes in MetroAreaMap (for those familiar with database terminology, this parameter makes the operation a left outer join instead of an inner join).
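The difference between the left outer join and the inner join is easiest to see on a tiny example. In this sketch the data frames people and lookup and the Name column are hypothetical; the two code/name pairs are the ones visible in the str() outputs of this exercise.

```r
# Hypothetical "x" data frame: one person has no metropolitan area code
people <- data.frame(
  Name          = c("a", "b", "c"),
  MetroAreaCode = c(10420, NA, 10580),
  stringsAsFactors = FALSE
)

# Hypothetical "y" data frame: the code dictionary
lookup <- data.frame(
  Code      = c(10420, 10580),
  MetroArea = c("Akron, OH", "Albany-Schenectady-Troy, NY"),
  stringsAsFactors = FALSE
)

# Left outer join: unmatched rows of x are kept, with NA in the new column
left <- merge(people, lookup, by.x = "MetroAreaCode", by.y = "Code", all.x = TRUE)

# Inner join (the default): unmatched rows of x are dropped
inner <- merge(people, lookup, by.x = "MetroAreaCode", by.y = "Code")

print(left)
print(inner)
```

Note that merge() places the key column first and, by default, sorts the rows by it, so the merged data frame is not in the original row order.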
Review the new version of the CPS data frame with the summary() and str() functions. What is the name of the variable that was added to the data frame by the merge() operation?
CPS = merge(CPS, MetroAreaCodes, by.x="MetroAreaCode", by.y="Code", all.x=TRUE)
summary(CPS)
MetroAreaCode PeopleInHousehold Region State Age Married
Min. :10420 Min. : 1.000 Midwest :30684 California :11570 Min. : 0.00 Divorced :11151
1st Qu.:21780 1st Qu.: 2.000 Northeast:25939 Texas : 7077 1st Qu.:19.00 Married :55509
Median :34740 Median : 3.000 South :41502 New York : 5595 Median :39.00 Never Married:30772
Mean :35075 Mean : 3.284 West :33177 Florida : 5149 Mean :38.83 Separated : 2027
3rd Qu.:41860 3rd Qu.: 4.000 Pennsylvania: 3930 3rd Qu.:57.00 Widowed : 6505
Max. :79600 Max. :15.000 Illinois : 3912 Max. :85.00 NA's :25338
NA's :34238 (Other) :94069
Sex Education Race Hispanic CountryOfBirthCode
Female:67481 High school :30906 American Indian : 1433 Min. :0.0000 Min. : 57.00
Male :63821 Bachelor's degree :19443 Asian : 6520 1st Qu.:0.0000 1st Qu.: 57.00
Some college, no degree:18863 Black : 13913 Median :0.0000 Median : 57.00
No high school diploma :16095 Multiracial : 2897 Mean :0.1393 Mean : 82.68
Associate degree : 9913 Pacific Islander: 618 3rd Qu.:0.0000 3rd Qu.: 57.00
(Other) :10744 White :105921 Max. :1.0000 Max. :555.00
NA's :25338
Citizenship EmploymentStatus Industry
Citizen, Native :116639 Disabled : 5712 Educational and health services :15017
Citizen, Naturalized: 7073 Employed :61733 Trade : 8933
Non-Citizen : 7590 Not in Labor Force:15246 Professional and business services: 7519
Retired :18619 Manufacturing : 6791
Unemployed : 4203 Leisure and hospitality : 6364
NA's :25789 (Other) :21618
NA's :65060
MetroArea
New York-Northern New Jersey-Long Island, NY-NJ-PA: 5409
Washington-Arlington-Alexandria, DC-VA-MD-WV : 4177
Los Angeles-Long Beach-Santa Ana, CA : 4102
Philadelphia-Camden-Wilmington, PA-NJ-DE : 2855
Chicago-Naperville-Joliet, IN-IN-WI : 2772
(Other) :77749
NA's :34238
str(CPS)
'data.frame': 131302 obs. of 15 variables:
$ MetroAreaCode : int 10420 10420 10420 10420 10420 10420 10420 10420 10420 10420 ...
$ PeopleInHousehold : int 4 4 2 4 1 3 4 4 2 3 ...
$ Region : Factor w/ 4 levels "Midwest","Northeast",..: 1 1 1 1 1 1 1 1 1 1 ...
$ State : Factor w/ 51 levels "Alabama","Alaska",..: 36 36 36 36 36 36 36 36 36 36 ...
$ Age : int 2 9 73 40 63 19 30 6 60 32 ...
$ Married : Factor w/ 5 levels "Divorced","Married",..: NA NA 2 2 3 3 2 NA 2 2 ...
$ Sex : Factor w/ 2 levels "Female","Male": 2 2 1 1 2 1 1 1 1 2 ...
$ Education : Factor w/ 8 levels "Associate degree",..: NA NA 8 4 6 4 2 NA 4 4 ...
$ Race : Factor w/ 6 levels "American Indian",..: 6 6 6 6 6 6 2 6 6 6 ...
$ Hispanic : int 0 0 0 0 0 0 0 1 0 0 ...
$ CountryOfBirthCode: int 57 57 57 362 57 57 203 57 57 57 ...
$ Citizenship : Factor w/ 3 levels "Citizen, Native",..: 1 1 1 2 1 1 3 1 1 1 ...
$ EmploymentStatus : Factor w/ 5 levels "Disabled","Employed",..: NA NA 4 3 1 2 3 NA 2 2 ...
$ Industry : Factor w/ 14 levels "Agriculture, forestry, fishing, and hunting",..: NA NA NA NA NA 7 NA NA 4 13 ...
$ MetroArea : Factor w/ 271 levels "Akron, OH","Albany-Schenectady-Troy, NY",..: 1 1 1 1 1 1 1 1 1 1 ...
print('MetroArea')
[1] "MetroArea"
How many interviewees have a missing value for the new metropolitan area variable? Note that all of these interviewees would have been removed from the merged data frame if we did not include the all.x=TRUE parameter.
sum(is.na(CPS$MetroArea))
[1] 34238
Which of the following metropolitan areas has the largest number of interviewees?
sort(table(CPS$MetroArea),decreasing = TRUE)
New York-Northern New Jersey-Long Island, NY-NJ-PA Washington-Arlington-Alexandria, DC-VA-MD-WV
5409 4177
Los Angeles-Long Beach-Santa Ana, CA Philadelphia-Camden-Wilmington, PA-NJ-DE
4102 2855
Chicago-Naperville-Joliet, IN-IN-WI Providence-Fall River-Warwick, MA-RI
2772 2284
Boston-Cambridge-Quincy, MA-NH Minneapolis-St Paul-Bloomington, MN-WI
2229 1942
Dallas-Fort Worth-Arlington, TX Houston-Baytown-Sugar Land, TX
1863 1649
Honolulu, HI Miami-Fort Lauderdale-Miami Beach, FL
1576 1554
Atlanta-Sandy Springs-Marietta, GA Denver-Aurora, CO
1552 1504
Baltimore-Towson, MD San Francisco-Oakland-Fremont, CA
1483 1386
Detroit-Warren-Livonia, MI Las Vegas-Paradise, NV
1354 1299
Riverside-San Bernardino, CA Seattle-Tacoma-Bellevue, WA
1290 1255
Portland-Vancouver-Beaverton, OR-WA Phoenix-Mesa-Scottsdale, AZ
1089 971
Kansas City, MO-KS Omaha-Council Bluffs, NE-IA
962 957
St. Louis, MO-IL San Diego-Carlsbad-San Marcos, CA
956 907
Hartford-West Hartford-East Hartford, CT Tampa-St. Petersburg-Clearwater, FL
885 842
Pittsburgh, PA Bridgeport-Stamford-Norwalk, CT
732 730
Salt Lake City, UT Cincinnati-Middletown, OH-KY-IN
723 719
Milwaukee-Waukesha-West Allis, WI Portland-South Portland, ME
714 701
Cleveland-Elyria-Mentor, OH San Jose-Sunnyvale-Santa Clara, CA
681 670
Sacramento-Arden-Arcade-Roseville, CA Burlington-South Burlington, VT
667 657
Boise City-Nampa, ID Orlando, FL
644 610
Albuquerque, NM San Antonio, TX
609 607
Oklahoma City, OK Virginia Beach-Norfolk-Newport News, VA-NC
604 597
Sioux Falls, SD Indianapolis, IN
595 570
Columbus, OH Louisville, KY-IN
551 519
Charlotte-Gastonia-Concord, NC-SC Austin-Round Rock, TX
517 516
New Haven, CT Nashville-Davidson-Murfreesboro, TN
506 505
Des Moines, IA Richmond, VA
501 490
Dover, DE Fargo, ND-MN
456 432
Wichita, KS Ogden-Clearfield, UT
427 423
Little Rock-North Little Rock, AR Jacksonville, FL
404 393
Birmingham-Hoover, AL Colorado Springs, CO
392 372
New Orleans-Metairie-Kenner, LA Memphis, TN-MS-AR
367 348
Buffalo-Niagara Falls, NY Raleigh-Cary, NC
344 336
Allentown-Bethlehem-Easton, PA-NJ Tulsa, OK
334 323
Reno-Sparks, NV Provo-Orem, UT
310 309
Rochester, NY Grand Rapids-Wyoming, MI
307 304
Fresno, CA Tucson, AZ
303 302
Columbia, SC Madison, WI
291 284
Albany-Schenectady-Troy, NY Dayton, OH
268 268
Oxnard-Thousand Oaks-Ventura, CA Baton Rouge, LA
267 262
Charleston, WV Rochester-Dover, NH-ME
262 262
Greensboro-High Point, NC Bakersfield, CA
251 245
El Paso, TX Davenport-Moline-Rock Island, IA-IL
244 240
Toledo, OH Charleston-North Charleston, SC
235 232
Akron, OH Syracuse, NY
231 223
Jackson, MS Fayetteville-Springdale-Rogers, AR-MO
222 215
Bangor, ME Fort Collins-Loveland, CO
208 206
Norwich-New London, CT-RI Savannah, GA
203 202
Poughkeepsie-Newburgh-Middletown, NY Billings, MT
201 199
Lexington-Fayette, KY Cedar Rapids, IA
198 196
Eugene-Springfield, OR McAllen-Edinburg-Pharr, TX
196 195
Stockton, CA Sarasota-Bradenton-Venice, FL
193 192
Durham, NC Greenville, SC
189 185
Topeka, KS Lafayette, LA
182 181
Monroe, LA Scranton-Wilkes Barre, PA
179 176
Harrisburg-Carlisle, PA Boulder, CO
174 171
Salem, OR Knoxville, TN
170 168
Palm Bay-Melbourne-Titusville, FL Chattanooga, TN-GA
168 167
Greeley, CO Augusta-Richmond County, GA-SC
162 161
Springfield, MO Modesto, CA
161 158
Waterbury, CT Lancaster, PA
157 156
Spokane, WA Waterloo-Cedar Falls, IA
156 156
Springfield, MA-CT Youngstown-Warren-Boardman, OH
155 153
Lakeland-Winter Haven, FL Cape Coral-Fort Myers, FL
149 146
Shreveport-Bossier City, LA Worcester, MA-CT
146 144
Reading, PA Bend, OR
142 140
Deltona-Daytona Beach-Ormond Beach, FL Fort Wayne, IN
140 136
Green Bay, WI Vallejo-Fairfield, CA
136 133
Corpus Christi, TX Santa Barbara-Santa Maria-Goleta, CA
132 132
Iowa City, IA Pueblo, CO
131 130
Santa Rosa-Petaluma, CA Kalamazoo-Portage, MI
129 127
Winston-Salem, NC Duluth, MN-WI
127 126
Appleton,WI Beaumont-Port Author, TX
125 123
Champaign-Urbana, IL Visalia-Porterville, CA
122 121
Lansing-East Lansing, MI Racine, WI
119 119
Canton-Massillon, OH Coeur d'Alene, ID
118 117
Huntsville, AL York-Hanover, PA
117 117
Asheville, NC Victoria, TX
116 116
La Crosse, WI Rockford, IL
114 114
Danbury, CT Peoria, IL
112 112
Yakima, WA Atlantic City, NJ
112 111
Eau Claire, WI Mobile, AL
110 110
Port St. Lucie-Fort Pierce, FL Las Cruses, NM
109 107
Pensacola-Ferry Pass-Brent, FL Merced, CA
107 106
Fort Smith, AR-OK Bloomington, IN
105 104
Salinas, CA Montgomery, AL
104 103
Flint, MI Myrtle Beach-Conway-North Myrtle Beach, SC
102 102
Killeen-Temple-Fort Hood, TX El Centro, CA
101 99
Evansville, IN-KY Janesville, WI
99 99
Olympia, WA Spartanburg, SC
99 99
Lawrence, KS Lawton, OK
98 97
Decatur, Al Wausau, WI
96 96
Trenton-Ewing, NJ Harrisonburg, VA
91 90
Muskegon-Norton Shores, MI Laredo, TX
90 89
Amarillo, TX Bremerton-Silverdale, WA
88 87
Erie, PA Kankakee-Bradley, IL
87 87
Kingston, NY Hagerstown-Martinsburg, MD-WV
87 86
Ann Arbor, MI Oshkosh-Neenah, WI
85 85
Altoona, PA Huntington-Ashland, WV-KY-OH
82 82
Medford, OR Naples-Marco Island, FL
82 82
St. Cloud, MN Decatur, IL
82 81
Lake Charles, LA South Bend-Mishawaka, IN-MI
81 81
Fort Walton Beach-Crestview-Destin, FL Utica-Rome, NY
80 80
Brownsville-Harlingen, TX Vero Beach, FL
79 79
Waco, TX Holland-Grand Haven, MI
79 78
Tuscaloosa, AL Fayetteville, NC
78 77
Michigan City-La Porte, IN San Luis Obispo-Paso Robles, CA
77 77
Ocala, FL Springfield, IL
76 76
Barnstable Town, MA Saginaw-Saginaw Township North, MI
75 74
Salisbury, MD Binghamton, NY
74 73
Lynchburg, VA Bellingham, WA
73 70
Gainesville, FL Jackson, MI
70 70
Albany, GA Kingsport-Bristol, TN-VA
68 67
Leominster-Fitchburg-Gardner, MA Roanoke, VA
66 66
Santa-Cruz-Watsonville, CA Athens-Clark County, GA
66 65
Gulfport-Biloxi, MS Longview, TX
65 65
Macon, GA Anderson, SC
65 64
Farmington, NM Florence, AL
64 63
Jacksonville, NC Johnstown, PA
63 63
Lubbock, TX Monroe, MI
63 63
Anderson, IN Anniston-Oxford, AL
62 61
Napa, CA Chico, CA
61 60
Columbus, GA-AL Joplin, MO
59 59
Panama City-Lynn Haven, FL Hickory-Morgantown-Lenoir, NC
59 57
Madera, CA Prescott, AZ
57 54
Vineland-Millville-Bridgeton, NJ Johnson City, TN
54 52
Santa Fe, NM Midland, TX
52 51
Niles-Benton Harbor, MI Punta Gorda, FL
51 48
Columbia, MO Tallahassee, FL
47 43
Valdosta, GA Warner Robins, GA
42 42
Bloomington-Normal IL Springfield, OH
40 34
Ocean City, NJ Bowling Green, KY
30 29
Appleton-Oshkosh-Neenah, WI Grand Rapids-Muskegon-Holland, MI
0 0
Greenville-Spartanburg-Anderson, SC Hinesville-Fort Stewart, GA
0 0
Jamestown, NY Kalamazoo-Battle Creek, MI
0 0
Portsmouth-Rochester, NH-ME
0
print('Boston-Cambridge-Quincy, MA-NH')
[1] "Boston-Cambridge-Quincy, MA-NH"
Which metropolitan area has the highest proportion of interviewees of Hispanic ethnicity? Hint: Use tapply() with mean, as in the previous subproblem. Calling sort() on the output of tapply() could also be helpful here.
sort(tapply(CPS$Hispanic, CPS$MetroArea, mean), decreasing = TRUE)
Laredo, TX McAllen-Edinburg-Pharr, TX
0.966292135 0.948717949
Brownsville-Harlingen, TX El Paso, TX
0.797468354 0.790983607
El Centro, CA San Antonio, TX
0.686868687 0.644151565
Madera, CA Corpus Christi, TX
0.614035088 0.606060606
Merced, CA Salinas, CA
0.566037736 0.557692308
Las Cruses, NM Tucson, AZ
0.542056075 0.506622517
Riverside-San Bernardino, CA Bakersfield, CA
0.502325581 0.489795918
Miami-Fort Lauderdale-Miami Beach, FL Victoria, TX
0.467824968 0.465517241
Santa Fe, NM Los Angeles-Long Beach-Santa Ana, CA
0.461538462 0.460263286
Albuquerque, NM Cape Coral-Fort Myers, FL
0.441707718 0.438356164
Visalia-Porterville, CA Fresno, CA
0.438016529 0.409240924
Vineland-Millville-Bridgeton, NJ Santa Barbara-Santa Maria-Goleta, CA
0.407407407 0.401515152
Killeen-Temple-Fort Hood, TX Oxnard-Thousand Oaks-Ventura, CA
0.386138614 0.359550562
Houston-Baytown-Sugar Land, TX Yakima, WA
0.359005458 0.357142857
Midland, TX Modesto, CA
0.352941176 0.341772152
Danbury, CT Waco, TX
0.339285714 0.329113924
Stockton, CA San Jose-Sunnyvale-Santa Clara, CA
0.321243523 0.316417910
Austin-Round Rock, TX Pueblo, CO
0.310077519 0.307692308
Longview, TX Lubbock, TX
0.292307692 0.285714286
Dallas-Fort Worth-Arlington, TX Poughkeepsie-Newburgh-Middletown, NY
0.283950617 0.273631841
San Diego-Carlsbad-San Marcos, CA Sacramento-Arden-Arcade-Roseville, CA
0.269018743 0.263868066
Amarillo, TX Phoenix-Mesa-Scottsdale, AZ
0.261363636 0.254376931
Las Vegas-Paradise, NV Waterbury, CT
0.251732102 0.248407643
San Luis Obispo-Paso Robles, CA Farmington, NM
0.246753247 0.234375000
Santa Rosa-Petaluma, CA Denver-Aurora, CO
0.232558140 0.232047872
Napa, CA New York-Northern New Jersey-Long Island, NY-NJ-PA
0.229508197 0.228508042
Beaumont-Port Author, TX Springfield, MA-CT
0.227642276 0.219354839
Orlando, FL Salem, OR
0.213114754 0.211764706
Reading, PA Vallejo-Fairfield, CA
0.211267606 0.210526316
Columbus, GA-AL San Francisco-Oakland-Fremont, CA
0.203389831 0.199855700
Reno-Sparks, NV Naples-Marco Island, FL
0.196774194 0.182926829
Chicago-Naperville-Joliet, IN-IN-WI Greeley, CO
0.167388167 0.160493827
Tampa-St. Petersburg-Clearwater, FL Ocala, FL
0.159144893 0.157894737
Fayetteville, NC Salt Lake City, UT
0.155844156 0.154910097
Santa-Cruz-Watsonville, CA Fayetteville-Springdale-Rogers, AR-MO
0.151515152 0.148837209
Boulder, CO Ogden-Clearfield, UT
0.146198830 0.144208038
Grand Rapids-Wyoming, MI Scranton-Wilkes Barre, PA
0.138157895 0.136363636
Lakeland-Winter Haven, FL Wichita, KS
0.134228188 0.133489461
Trenton-Ewing, NJ Prescott, AZ
0.131868132 0.129629630
Jacksonville, NC Green Bay, WI
0.126984127 0.125000000
Lawton, OK Athens-Clark County, GA
0.123711340 0.123076923
Kansas City, MO-KS Washington-Arlington-Alexandria, DC-VA-MD-WV
0.121621622 0.121378980
Fort Collins-Loveland, CO Olympia, WA
0.121359223 0.121212121
Colorado Springs, CO Raleigh-Cary, NC
0.120967742 0.119047619
Charlotte-Gastonia-Concord, NC-SC Chico, CA
0.117988395 0.116666667
Kankakee-Bradley, IL Tulsa, OK
0.114942529 0.114551084
Providence-Fall River-Warwick, MA-RI Fort Walton Beach-Crestview-Destin, FL
0.114273205 0.112500000
Bridgeport-Stamford-Norwalk, CT New Orleans-Metairie-Kenner, LA
0.112328767 0.111716621
Durham, NC Waterloo-Cedar Falls, IA
0.111111111 0.108974359
Oklahoma City, OK Hartford-West Hartford-East Hartford, CT
0.107615894 0.105084746
Norwich-New London, CT-RI Lancaster, PA
0.103448276 0.102564103
Tuscaloosa, AL Port St. Lucie-Fort Pierce, FL
0.102564103 0.100917431
Deltona-Daytona Beach-Ormond Beach, FL Portland-Vancouver-Beaverton, OR-WA
0.100000000 0.094582185
Topeka, KS Augusta-Richmond County, GA-SC
0.093406593 0.093167702
Boise City-Nampa, ID Davenport-Moline-Rock Island, IA-IL
0.093167702 0.091666667
Jacksonville, FL Leominster-Fitchburg-Gardner, MA
0.091603053 0.090909091
Atlantic City, NJ Seattle-Tacoma-Bellevue, WA
0.090090090 0.088446215
Hickory-Morgantown-Lenoir, NC Allentown-Bethlehem-Easton, PA-NJ
0.087719298 0.086826347
Fort Smith, AR-OK Atlanta-Sandy Springs-Marietta, GA
0.085714286 0.085695876
Milwaukee-Waukesha-West Allis, WI Medford, OR
0.085434174 0.085365854
Lansing-East Lansing, MI Worcester, MA-CT
0.084033613 0.083333333
Baltimore-Towson, MD Shreveport-Bossier City, LA
0.082265678 0.082191781
Syracuse, NY Columbia, SC
0.080717489 0.079037801
Philadelphia-Camden-Wilmington, PA-NJ-DE Chattanooga, TN-GA
0.078458844 0.077844311
Eugene-Springfield, OR Canton-Massillon, OH
0.076530612 0.076271186
Vero Beach, FL Greensboro-High Point, NC
0.075949367 0.075697211
Utica-Rome, NY Des Moines, IA
0.075000000 0.073852295
New Haven, CT Indianapolis, IN
0.073122530 0.071929825
Omaha-Council Bluffs, NE-IA Tallahassee, FL
0.070010449 0.069767442
Boston-Cambridge-Quincy, MA-NH Nashville-Davidson-Murfreesboro, TN
0.069537909 0.069306931
Kingston, NY Panama City-Lynn Haven, FL
0.068965517 0.067796610
Ocean City, NJ Provo-Orem, UT
0.066666667 0.064724919
Anderson, IN Monroe, MI
0.064516129 0.063492063
Peoria, IL Lafayette, LA
0.062500000 0.060773481
Asheville, NC Cleveland-Elyria-Mentor, OH
0.060344828 0.060205580
Honolulu, HI Myrtle Beach-Conway-North Myrtle Beach, SC
0.059644670 0.058823529
Racine, WI Rochester, NY
0.058823529 0.058631922
Bremerton-Silverdale, WA Dover, DE
0.057471264 0.057017544
Winston-Salem, NC Birmingham-Hoover, AL
0.055118110 0.053571429
Palm Bay-Melbourne-Titusville, FL Decatur, Al
0.053571429 0.052083333
Minneapolis-St Paul-Bloomington, MN-WI Virginia Beach-Norfolk-Newport News, VA-NC
0.052008239 0.050251256
South Bend-Mishawaka, IN-MI Anniston-Oxford, AL
0.049382716 0.049180328
Valdosta, GA Sarasota-Bradenton-Venice, FL
0.047619048 0.046875000
Albany, GA Rockford, IL
0.044117647 0.043859649
Columbus, OH Springfield, MO
0.043557169 0.043478261
Gainesville, FL Richmond, VA
0.042857143 0.042857143
York-Hanover, PA Columbia, MO
0.042735043 0.042553191
Sioux Falls, SD Punta Gorda, FL
0.042016807 0.041666667
Binghamton, NY Albany-Schenectady-Troy, NY
0.041095890 0.041044776
Lawrence, KS Lexington-Fayette, KY
0.040816327 0.040404040
Cincinnati-Middletown, OH-KY-IN Flint, MI
0.040333797 0.039215686
Michigan City-La Porte, IN Louisville, KY-IN
0.038961039 0.038535645
Johnson City, TN Baton Rouge, LA
0.038461538 0.038167939
Greenville, SC Detroit-Warren-Livonia, MI
0.037837838 0.037666174
Little Rock-North Little Rock, AR Fort Wayne, IN
0.037128713 0.036764706
Toledo, OH Champaign-Urbana, IL
0.034042553 0.032786885
Youngstown-Warren-Boardman, OH Kalamazoo-Portage, MI
0.032679739 0.031496063
Iowa City, IA Rochester-Dover, NH-ME
0.030534351 0.030534351
St. Louis, MO-IL Janesville, WI
0.030334728 0.030303030
Roanoke, VA Billings, MT
0.030303030 0.030150754
Springfield, OH Memphis, TN-MS-AR
0.029411765 0.028735632
Pensacola-Ferry Pass-Brent, FL Lynchburg, VA
0.028037383 0.027397260
Saginaw-Saginaw Township North, MI Coeur d'Alene, ID
0.027027027 0.025641026
Spokane, WA Fargo, ND-MN
0.025641026 0.025462963
Lake Charles, LA Madison, WI
0.024691358 0.024647887
Erie, PA Harrisburg-Carlisle, PA
0.022988506 0.022988506
Muskegon-Norton Shores, MI Bend, OR
0.022222222 0.021428571
Evansville, IN-KY Spartanburg, SC
0.020202020 0.020202020
Niles-Benton Harbor, MI La Crosse, WI
0.019607843 0.017543860
Buffalo-Niagara Falls, NY Charleston-North Charleston, SC
0.017441860 0.017241379
Joplin, MO Pittsburgh, PA
0.016949153 0.016393443
Duluth, MN-WI Gulfport-Biloxi, MS
0.015873016 0.015384615
Cedar Rapids, IA Kingsport-Bristol, TN-VA
0.015306122 0.014925373
Bangor, ME Bellingham, WA
0.014423077 0.014285714
Springfield, IL Akron, OH
0.013157895 0.012987013
Holland-Grand Haven, MI Altoona, PA
0.012820513 0.012195122
St. Cloud, MN Oshkosh-Neenah, WI
0.012195122 0.011764706
Portland-South Portland, ME Wausau, WI
0.011412268 0.010416667
Montgomery, AL Burlington-South Burlington, VT
0.009708738 0.009132420
Jackson, MS Appleton,WI
0.009009009 0.008000000
Charleston, WV Knoxville, TN
0.007633588 0.005952381
Monroe, LA Dayton, OH
0.005586592 0.003731343
Anderson, SC Ann Arbor, MI
0.000000000 0.000000000
Barnstable Town, MA Bloomington-Normal IL
0.000000000 0.000000000
Bloomington, IN Bowling Green, KY
0.000000000 0.000000000
Decatur, IL Eau Claire, WI
0.000000000 0.000000000
Florence, AL Hagerstown-Martinsburg, MD-WV
0.000000000 0.000000000
Harrisonburg, VA Huntington-Ashland, WV-KY-OH
0.000000000 0.000000000
Huntsville, AL Jackson, MI
0.000000000 0.000000000
Johnstown, PA Macon, GA
0.000000000 0.000000000
Mobile, AL Salisbury, MD
0.000000000 0.000000000
Savannah, GA Warner Robins, GA
0.000000000 0.000000000
print('Laredo, TX')
[1] "Laredo, TX"
Remembering that CPS$Race == "Asian" returns a TRUE/FALSE vector of whether an interviewee is Asian, determine the number of metropolitan areas in the United States from which at least 20% of interviewees are Asian.
sort(tapply(CPS$Race=="Asian", CPS$MetroArea, mean,na.rm=T),decreasing = T)
Honolulu, HI San Francisco-Oakland-Fremont, CA
0.501903553 0.246753247
San Jose-Sunnyvale-Santa Clara, CA Vallejo-Fairfield, CA
0.241791045 0.203007519
Fresno, CA Warner Robins, GA
0.184818482 0.166666667
Stockton, CA Atlantic City, NJ
0.155440415 0.144144144
Sacramento-Arden-Arcade-Roseville, CA San Diego-Carlsbad-San Marcos, CA
0.142428786 0.142227122
Los Angeles-Long Beach-Santa Ana, CA Olympia, WA
0.135056070 0.131313131
Salinas, CA New York-Northern New Jersey-Long Island, NY-NJ-PA
0.125000000 0.104270660
Seattle-Tacoma-Bellevue, WA Visalia-Porterville, CA
0.099601594 0.090909091
Green Bay, WI La Crosse, WI
0.088235294 0.087719298
Ann Arbor, MI Bakersfield, CA
0.082352941 0.081632653
Greensboro-High Point, NC Las Vegas-Paradise, NV
0.079681275 0.078521940
Minneapolis-St Paul-Bloomington, MN-WI Brownsville-Harlingen, TX
0.076725026 0.075949367
Bloomington-Normal IL Oxnard-Thousand Oaks-Ventura, CA
0.075000000 0.074906367
Lake Charles, LA Norwich-New London, CT-RI
0.074074074 0.073891626
Atlanta-Sandy Springs-Marietta, GA Washington-Arlington-Alexandria, DC-VA-MD-WV
0.072809278 0.070624850
Portland-Vancouver-Beaverton, OR-WA Hartford-West Hartford-East Hartford, CT
0.069788797 0.066666667
Cedar Rapids, IA Rochester, NY
0.066326531 0.065146580
Columbia, MO Dallas-Fort Worth-Arlington, TX
0.063829787 0.062801932
Danbury, CT Riverside-San Bernardino, CA
0.062500000 0.062015504
Houston-Baytown-Sugar Land, TX Boulder, CO
0.061249242 0.058479532
Chicago-Naperville-Joliet, IN-IN-WI Reno-Sparks, NV
0.058441558 0.058064516
Baltimore-Towson, MD Lancaster, PA
0.057990560 0.057692308
Nashville-Davidson-Murfreesboro, TN Fort Smith, AR-OK
0.057425743 0.057142857
Merced, CA Madison, WI
0.056603774 0.056338028
Peoria, IL Iowa City, IA
0.053571429 0.053435115
Springfield, IL Austin-Round Rock, TX
0.052631579 0.052325581
Buffalo-Niagara Falls, NY Boston-Cambridge-Quincy, MA-NH
0.052325581 0.052041274
San Luis Obispo-Paso Robles, CA Fayetteville-Springdale-Rogers, AR-MO
0.051948052 0.051162791
Orlando, FL Raleigh-Cary, NC
0.050819672 0.050595238
Tulsa, OK Anniston-Oxford, AL
0.049535604 0.049180328
Burlington-South Burlington, VT Jacksonville, FL
0.048706240 0.048346056
Jacksonville, NC Milwaukee-Waukesha-West Allis, WI
0.047619048 0.047619048
New Haven, CT Trenton-Ewing, NJ
0.047430830 0.043956044
Detroit-Warren-Livonia, MI Gainesville, FL
0.043574594 0.042857143
Portland-South Portland, ME Decatur, Al
0.042796006 0.041666667
Albuquerque, NM Syracuse, NY
0.041050903 0.040358744
Duluth, MN-WI Tampa-St. Petersburg-Clearwater, FL
0.039682540 0.039192399
Providence-Fall River-Warwick, MA-RI Bridgeport-Stamford-Norwalk, CT
0.038966725 0.038356164
Pittsburgh, PA Phoenix-Mesa-Scottsdale, AZ
0.038251366 0.038105046
Des Moines, IA Fort Walton Beach-Crestview-Destin, FL
0.037924152 0.037500000
Fort Wayne, IN Richmond, VA
0.036764706 0.036734694
Huntington-Ashland, WV-KY-OH Mobile, AL
0.036585366 0.036363636
Salt Lake City, UT Palm Bay-Melbourne-Titusville, FL
0.035961272 0.035714286
Miami-Fort Lauderdale-Miami Beach, FL Lexington-Fayette, KY
0.035392535 0.035353535
Hickory-Morgantown-Lenoir, NC Oklahoma City, OK
0.035087719 0.034768212
Worcester, MA-CT Kansas City, MO-KS
0.034722222 0.034303534
Cape Coral-Fort Myers, FL Harrisonburg, VA
0.034246575 0.033333333
Philadelphia-Camden-Wilmington, PA-NJ-DE Greenville, SC
0.032924694 0.032432432
Denver-Aurora, CO Anderson, SC
0.031914894 0.031250000
Athens-Clark County, GA Gulfport-Biloxi, MS
0.030769231 0.030769231
Wichita, KS Akron, OH
0.030444965 0.030303030
Omaha-Council Bluffs, NE-IA Montgomery, AL
0.029258098 0.029126214
Bellingham, WA Fargo, ND-MN
0.028571429 0.027777778
Columbia, SC Lakeland-Winter Haven, FL
0.027491409 0.026845638
Virginia Beach-Norfolk-Newport News, VA-NC Rochester-Dover, NH-ME
0.026800670 0.026717557
Ogden-Clearfield, UT Fayetteville, NC
0.026004728 0.025974026
Holland-Grand Haven, MI Augusta-Richmond County, GA-SC
0.025641026 0.024844720
Indianapolis, IN Naples-Marco Island, FL
0.024561404 0.024390244
Bangor, ME Bremerton-Silverdale, WA
0.024038462 0.022988506
Baton Rouge, LA Albany-Schenectady-Troy, NY
0.022900763 0.022388060
Little Rock-North Little Rock, AR Cincinnati-Middletown, OH-KY-IN
0.022277228 0.022253129
Topeka, KS Deltona-Daytona Beach-Ormond Beach, FL
0.021978022 0.021428571
Davenport-Moline-Rock Island, IA-IL Eugene-Springfield, OR
0.020833333 0.020408163
El Centro, CA Tucson, AZ
0.020202020 0.019867550
Savannah, GA Flint, MI
0.019801980 0.019607843
Fort Collins-Loveland, CO Spokane, WA
0.019417476 0.019230769
Las Cruses, NM Pensacola-Ferry Pass-Brent, FL
0.018691589 0.018691589
Prescott, AZ Columbus, OH
0.018518519 0.018148820
Memphis, TN-MS-AR Panama City-Lynn Haven, FL
0.017241379 0.016949153
Champaign-Urbana, IL Napa, CA
0.016393443 0.016393443
Colorado Springs, CO Johnstown, PA
0.016129032 0.015873016
Kalamazoo-Portage, MI Winston-Salem, NC
0.015748031 0.015748031
Sarasota-Bradenton-Venice, FL Charlotte-Gastonia-Concord, NC-SC
0.015625000 0.015473888
Dover, DE Corpus Christi, TX
0.015350877 0.015151515
Allentown-Bethlehem-Easton, PA-NJ Ocala, FL
0.014970060 0.013157895
Youngstown-Warren-Boardman, OH Provo-Orem, UT
0.013071895 0.012944984
Waterloo-Cedar Falls, IA Birmingham-Hoover, AL
0.012820513 0.012755102
Springfield, MO Greeley, CO
0.012422360 0.012345679
Medford, OR Louisville, KY-IN
0.012195122 0.011560694
Harrisburg-Carlisle, PA Kingston, NY
0.011494253 0.011494253
Boise City-Nampa, ID Lawton, OK
0.010869565 0.010309278
Cleveland-Elyria-Mentor, OH Lawrence, KS
0.010279001 0.010204082
Evansville, IN-KY Sioux Falls, SD
0.010101010 0.010084034
Grand Rapids-Wyoming, MI Yakima, WA
0.009868421 0.008928571
Coeur d'Alene, ID York-Hanover, PA
0.008547009 0.008547009
Toledo, OH Santa Rosa-Petaluma, CA
0.008510638 0.007751938
Santa Barbara-Santa Maria-Goleta, CA Dayton, OH
0.007575758 0.007462687
Bend, OR Modesto, CA
0.007142857 0.006329114
Chattanooga, TN-GA Monroe, LA
0.005988024 0.005586592
Charleston-North Charleston, SC San Antonio, TX
0.004310345 0.003294893
New Orleans-Metairie-Kenner, LA St. Louis, MO-IL
0.002724796 0.002092050
Albany, GA Altoona, PA
0.000000000 0.000000000
Amarillo, TX Anderson, IN
0.000000000 0.000000000
Appleton,WI Asheville, NC
0.000000000 0.000000000
Barnstable Town, MA Beaumont-Port Author, TX
0.000000000 0.000000000
Billings, MT Binghamton, NY
0.000000000 0.000000000
Bloomington, IN Bowling Green, KY
0.000000000 0.000000000
Canton-Massillon, OH Charleston, WV
0.000000000 0.000000000
Chico, CA Columbus, GA-AL
0.000000000 0.000000000
Decatur, IL Durham, NC
0.000000000 0.000000000
Eau Claire, WI El Paso, TX
0.000000000 0.000000000
Erie, PA Farmington, NM
0.000000000 0.000000000
Florence, AL Hagerstown-Martinsburg, MD-WV
0.000000000 0.000000000
Huntsville, AL Jackson, MI
0.000000000 0.000000000
Jackson, MS Janesville, WI
0.000000000 0.000000000
Johnson City, TN Joplin, MO
0.000000000 0.000000000
Kankakee-Bradley, IL Killeen-Temple-Fort Hood, TX
0.000000000 0.000000000
Kingsport-Bristol, TN-VA Knoxville, TN
0.000000000 0.000000000
Lafayette, LA Lansing-East Lansing, MI
0.000000000 0.000000000
Laredo, TX Leominster-Fitchburg-Gardner, MA
0.000000000 0.000000000
Longview, TX Lubbock, TX
0.000000000 0.000000000
Lynchburg, VA Macon, GA
0.000000000 0.000000000
Madera, CA McAllen-Edinburg-Pharr, TX
0.000000000 0.000000000
Michigan City-La Porte, IN Midland, TX
0.000000000 0.000000000
Monroe, MI Muskegon-Norton Shores, MI
0.000000000 0.000000000
Myrtle Beach-Conway-North Myrtle Beach, SC Niles-Benton Harbor, MI
0.000000000 0.000000000
Ocean City, NJ Oshkosh-Neenah, WI
0.000000000 0.000000000
Port St. Lucie-Fort Pierce, FL Poughkeepsie-Newburgh-Middletown, NY
0.000000000 0.000000000
Pueblo, CO Punta Gorda, FL
0.000000000 0.000000000
Racine, WI Reading, PA
0.000000000 0.000000000
Roanoke, VA Rockford, IL
0.000000000 0.000000000
Saginaw-Saginaw Township North, MI Salem, OR
0.000000000 0.000000000
Salisbury, MD Santa-Cruz-Watsonville, CA
0.000000000 0.000000000
Santa Fe, NM Scranton-Wilkes Barre, PA
0.000000000 0.000000000
Shreveport-Bossier City, LA South Bend-Mishawaka, IN-MI
0.000000000 0.000000000
Spartanburg, SC Springfield, MA-CT
0.000000000 0.000000000
Springfield, OH St. Cloud, MN
0.000000000 0.000000000
Tallahassee, FL Tuscaloosa, AL
0.000000000 0.000000000
Utica-Rome, NY Valdosta, GA
0.000000000 0.000000000
Vero Beach, FL Victoria, TX
0.000000000 0.000000000
Vineland-Millville-Bridgeton, NJ Waco, TX
0.000000000 0.000000000
Waterbury, CT Wausau, WI
0.000000000 0.000000000
print('four')
[1] "four"
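Rather than reading the answer off the sorted listing, the count can be computed directly by comparing the per-area proportions to the threshold. The sketch below uses a small hypothetical data frame (toy, with made-up areas "A", "B" and "C") in place of CPS, so the numbers are illustrative only.

```r
# Hypothetical toy data standing in for the CPS data frame.
toy <- data.frame(
  MetroArea = c("A", "A", "A", "A", "B", "B", "B", "B", "C", "C"),
  Race      = c("Asian", "Asian", "White", "Black",
                "Asian", "White", "White", "White",
                "White", "Black")
)

# Proportion of Asian interviewees in each metropolitan area.
asian_share <- tapply(toy$Race == "Asian", toy$MetroArea, mean)

# Number of areas at or above the 20% threshold, and their names.
n_areas    <- sum(asian_share >= 0.2)
high_areas <- names(asian_share)[asian_share >= 0.2]
```

On the real data the same two lines would be applied to tapply(CPS$Race == "Asian", CPS$MetroArea, mean), which should agree with the four areas visible at the top of the sorted output.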
Normally, we would look at the sorted proportion of interviewees from each metropolitan area who have not received a high school diploma with the command:
sort(tapply(CPS$Education == "No high school diploma", CPS$MetroArea, mean))
However, none of the interviewees aged 14 and younger have an education value reported, so the mean value is reported as NA for each metropolitan area. To get mean (and related functions, like sum) to ignore missing values, you can pass the parameter na.rm=TRUE. Passing na.rm=TRUE to the tapply function, determine which metropolitan area has the smallest proportion of interviewees who have received no high school diploma.
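To see how na.rm=TRUE changes the result, here is a minimal sketch on a hypothetical toy data frame (the areas "A" and "B" and their values are invented). Extra arguments given to tapply are forwarded to the function it applies, so mean receives na.rm=TRUE and drops the missing values within each group.

```r
# Hypothetical toy data; NA marks interviewees with no education value reported.
toy <- data.frame(
  MetroArea = c("A", "A", "A", "B", "B", "B", "B"),
  Education = c("No high school diploma", "High school", NA,
                "Bachelor's degree", "High school", "High school", NA)
)

no_diploma <- toy$Education == "No high school diploma"

# Without na.rm, one NA in a group is enough to make that group's mean NA.
raw_share <- tapply(no_diploma, toy$MetroArea, mean)

# With na.rm = TRUE, tapply passes the argument on to mean().
share <- sort(tapply(no_diploma, toy$MetroArea, mean, na.rm = TRUE))

# sort() is ascending by default, so the first name has the smallest proportion.
smallest <- names(share)[1]
```

Applied to CPS, the first element of the ascending sort names the metropolitan area with the smallest proportion of interviewees without a high school diploma.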
Just as we did with the metropolitan area information, merge in the country of birth information from the CountryMap data frame (named CountryCodes in the command below), replacing the CPS data frame with the result. If you accidentally overwrite CPS with the wrong values, remember that you can restore it by re-loading the data frame from CPSData.csv and then merging in the metropolitan area information using the command provided in the previous subproblem.
What is the name of the variable added to the CPS data frame by this merge operation?
CPS = merge(CPS, CountryCodes, by.x="CountryOfBirthCode", by.y="Code", all.x=TRUE)
summary(CPS)
CountryOfBirthCode MetroAreaCode PeopleInHousehold Region State Age
Min. : 57.00 Min. :10420 Min. : 1.000 Midwest :30684 California :11570 Min. : 0.00
1st Qu.: 57.00 1st Qu.:21780 1st Qu.: 2.000 Northeast:25939 Texas : 7077 1st Qu.:19.00
Median : 57.00 Median :34740 Median : 3.000 South :41502 New York : 5595 Median :39.00
Mean : 82.68 Mean :35075 Mean : 3.284 West :33177 Florida : 5149 Mean :38.83
3rd Qu.: 57.00 3rd Qu.:41860 3rd Qu.: 4.000 Pennsylvania: 3930 3rd Qu.:57.00
Max. :555.00 Max. :79600 Max. :15.000 Illinois : 3912 Max. :85.00
NA's :34238 (Other) :94069
Married Sex Education Race Hispanic
Divorced :11151 Female:67481 High school :30906 American Indian : 1433 Min. :0.0000
Married :55509 Male :63821 Bachelor's degree :19443 Asian : 6520 1st Qu.:0.0000
Never Married:30772 Some college, no degree:18863 Black : 13913 Median :0.0000
Separated : 2027 No high school diploma :16095 Multiracial : 2897 Mean :0.1393
Widowed : 6505 Associate degree : 9913 Pacific Islander: 618 3rd Qu.:0.0000
NA's :25338 (Other) :10744 White :105921 Max. :1.0000
NA's :25338
Citizenship EmploymentStatus Industry
Citizen, Native :116639 Disabled : 5712 Educational and health services :15017
Citizen, Naturalized: 7073 Employed :61733 Trade : 8933
Non-Citizen : 7590 Not in Labor Force:15246 Professional and business services: 7519
Retired :18619 Manufacturing : 6791
Unemployed : 4203 Leisure and hospitality : 6364
NA's :25789 (Other) :21618
NA's :65060
MetroArea Country
New York-Northern New Jersey-Long Island, NY-NJ-PA: 5409 United States:115063
Washington-Arlington-Alexandria, DC-VA-MD-WV : 4177 Mexico : 3921
Los Angeles-Long Beach-Santa Ana, CA : 4102 Philippines : 839
Philadelphia-Camden-Wilmington, PA-NJ-DE : 2855 India : 770
Chicago-Naperville-Joliet, IN-IN-WI : 2772 China : 581
(Other) :77749 (Other) : 9952
NA's :34238 NA's : 176
print('Country')
[1] "Country"
How many interviewees have a missing value for the new country of birth variable?
sum(is.na(CPS$Country))
[1] 176
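The behaviour of the merge can be checked on a small example. The sketch below uses two hypothetical data frames (people and lookup, with invented codes 999 and NA that have no match) to show that all.x=TRUE keeps every row of the first data frame, adds the lookup's remaining column, and leaves NA where no code matches.

```r
# Hypothetical toy versions of the survey data and the country lookup table.
people <- data.frame(
  Id = 1:5,
  CountryOfBirthCode = c(57, 303, 57, 999, NA)
)
lookup <- data.frame(
  Code    = c(57, 303),
  Country = c("United States", "Mexico")
)

# all.x = TRUE keeps every row of `people`, even when no code matches.
merged <- merge(people, lookup,
                by.x = "CountryOfBirthCode", by.y = "Code", all.x = TRUE)

# Columns present after the merge that were not there before.
new_cols <- setdiff(names(merged), names(people))

# Unmatched or missing codes produce NA in the new column.
n_missing <- sum(is.na(merged$Country))
```

Comparing names() before and after the merge is a quick way to identify the added variable without scanning the whole summary() output.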
Among all interviewees born outside of North America, which country was the most common place of birth?
sort(table(CPS$CountryOfBirthCode),decreasing = T)
57 303 233 210 207 73 312 247 110 327 301 217 329 313 333 364
115063 3921 839 770 581 518 477 458 438 426 410 334 330 309 217 206
314 215 139 163 332 128 362 120 212 365 370 462 242 138 231 368
189 187 179 173 167 162 159 149 144 136 136 129 128 111 109 109
164 240 223 213 440 555 416 315 421 109 220 448 414 129 209 360
104 102 98 97 85 81 80 76 76 73 73 72 65 64 64 64
150 373 341 214 116 132 427 203 429 206 243 449 205 224 229 316
61 61 60 57 56 55 55 52 52 49 48 48 45 45 44 44
501 202 134 119 249 363 216 158 239 407 66 104 235 311 126 136
43 42 41 39 39 37 36 35 32 32 31 29 29 29 28 28
200 211 117 140 147 160 512 137 248 515 130 165 154 226 166 246
26 26 25 24 24 24 24 23 23 23 22 22 20 20 19 19
343 100 127 412 78 102 238 372 436 511 408 417 447 457 60 103
19 18 18 18 17 17 17 17 17 16 15 15 15 15 13 13
300 321 330 361 451 151 152 162 148 218 328 369 454 108 157 222
13 13 13 13 13 12 12 12 11 11 11 11 11 10 10 10
323 310 340 399 400 508 339 374 106 149 156 236 324 444 461 523
10 9 9 9 9 9 7 7 6 6 6 6 6 6 6 6
527 161 245 459 155 168 105 118 159 338 423 425 142 228 453 430
6 5 5 5 4 4 3 3 3 3 3 3 2 2 2 1
460
1
table(CPS$Country, CPS$CountryOfBirthCode == 233)
FALSE TRUE
Afghanistan 26 0
Africa, not specified 129 0
Albania 18 0
Algeria 9 0
Americas, not specified 9 0
Antigua and Barbuda 13 0
Argentina 64 0
Armenia 35 0
Asia, not specified 39 0
Australia 43 0
Austria 17 0
Azerbaijan 3 0
Azores 22 0
Bahamas 10 0
Bangladesh 42 0
Barbados 6 0
Belarus 24 0
Belgium 13 0
Belize 9 0
Bermuda 13 0
Bolivia 13 0
Bosnia & Herzegovina 61 0
Brazil 159 0
Bulgaria 29 0
Cambodia 49 0
Cameroon 32 0
Canada 410 0
Cape Verde 15 0
Chile 37 0
China 581 0
Columbia 206 0
Costa Rica 29 0
Croatia 12 0
Cuba 426 0
Cyprus 0 0
Czech Republic 11 0
Czechoslovakia 3 0
Denmark 6 0
Dominica 11 0
Dominican Republic 330 0
Ecuador 136 0
Egypt 65 0
El Salvador 477 0
Elsewhere 81 0
England 179 0
Eritrea 15 0
Ethiopia 80 0
Europe, not specified 19 0
Fiji 9 0
Finland 10 0
France 73 0
Georgia 5 0
Germany 438 0
Ghana 76 0
Greece 56 0
Grenada 13 0
Guam 31 0
Guatemala 309 0
Guyana 109 0
Haiti 167 0
Honduras 189 0
Hong Kong 64 0
Hungary 25 0
India 770 0
Indonesia 26 0
Iran 144 0
Iraq 97 0
Ireland 39 0
Israel 57 0
Italy 149 0
Jamaica 217 0
Japan 187 0
Jordan 36 0
Kenya 55 0
Korea 334 0
Kosovo 0 0
Kuwait 10 0
Laos 98 0
Latvia 6 0
Lebanon 45 0
Liberia 52 0
Lithuania 10 0
Macedonia 12 0
Malaysia 20 0
Mexico 3921 0
Moldova 12 0
Morocco 17 0
Myanmar (Burma) 45 0
Nepal 44 0
Netherlands 28 0
New Zealand 23 0
Nicaragua 76 0
Nigeria 85 0
Northern Ireland 2 0
Norway 18 0
Oceania, not specified 0 0
Other U. S. Island Areas 0 0
Pakistan 109 0
Panama 44 0
Paraguay 11 0
Peru 136 0
Philippines 0 839
Poland 162 0
Portugal 64 0
Puerto Rico 518 0
Romania 55 0
Russia 173 0
Samoa 6 0
Saudi Arabia 29 0
Scotland 24 0
Senegal 6 0
Serbia 20 0
Sierra Leone 15 0
Singapore 6 0
Slovakia 6 0
Somalia 72 0
South Africa 48 0
South America, not specified 7 0
South Korea 73 0
Spain 41 0
Sri Lanka 17 0
St. Kitts--Nevis 3 0
St. Lucia 7 0
St. Vincent and the Grenadines 9 0
Sudan 13 0
Sweden 28 0
Switzerland 23 0
Syria 32 0
Taiwan 102 0
Tanzania 2 0
Thailand 128 0
Tonga 6 0
Trinidad and Tobago 60 0
Turkey 48 0
U. S. Virgin Islands 17 0
Uganda 15 0
Ukraine 104 0
United Kingdom 111 0
United States 115063 0
Uruguay 17 0
USSR 22 0
Uzbekistan 19 0
Venezuela 61 0
Vietnam 458 0
Wales 0 0
West Indies, not specified 19 0
Yemen 23 0
Yugoslavia 24 0
Zimbabwe 6 0
print('Philippines')
[1] "Philippines"
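Since the merge already added country names, the same answer can be reached by tabulating the names directly instead of looking up a numeric code. This is a sketch on a hypothetical toy vector, and the north_america list is an illustrative assumption rather than a complete definition of the region.

```r
# Hypothetical toy vector standing in for CPS$Country.
country <- c("United States", "United States", "United States", "Mexico",
             "Mexico", "Canada", "Philippines", "Philippines", "India", NA)

# table() drops NA by default; sort the counts from largest to smallest.
counts <- sort(table(country), decreasing = TRUE)

# Exclude North American countries (illustrative list), then take the top entry.
north_america <- c("United States", "Mexico", "Canada")
outside       <- counts[!(names(counts) %in% north_america)]
top_outside   <- names(outside)[1]
```

On the real data, sort(table(CPS$Country), decreasing=TRUE) gives the ranked list by name, which avoids the second lookup step for code 233.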
What proportion of the interviewees from the “New York-Northern New Jersey-Long Island, NY-NJ-PA” metropolitan area have a country of birth that is not the United States? For this computation, don’t include people from this metropolitan area who have a missing country of birth.
newyork = subset(CPS, MetroArea == "New York-Northern New Jersey-Long Island, NY-NJ-PA")
mean(newyork$Country != "United States", na.rm=TRUE)
[1] 0.3086603
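Two details in this computation are worth checking on a small case: subset() drops rows where the condition evaluates to NA, and na.rm=TRUE in mean() removes interviewees whose country of birth is missing from both the numerator and the denominator. The toy data frame below is hypothetical.

```r
# Hypothetical toy data; NA marks a missing metro area or country of birth.
toy <- data.frame(
  MetroArea = c("NY", "NY", "NY", "NY", "NY", "LA", NA),
  Country   = c("United States", "India", "United States", NA, "China",
                "Mexico", "United States")
)

# subset() keeps only rows where the condition is TRUE (NA rows are dropped).
ny <- subset(toy, MetroArea == "NY")

# Of the four NY rows with a known country, two were born outside the U.S.
foreign_share <- mean(ny$Country != "United States", na.rm = TRUE)
```

Here the NY interviewee with a missing country is excluded entirely, so the proportion is 2 of 4 rather than 2 of 5.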
Which metropolitan area has the largest number (note – not proportion) of interviewees with a country of birth in India? Hint – remember to include na.rm=TRUE if you are using tapply() to answer this question.
sort(tapply(CPS$Country=="India", CPS$MetroArea, sum, na.rm=T),decreasing = T)
New York-Northern New Jersey-Long Island, NY-NJ-PA Washington-Arlington-Alexandria, DC-VA-MD-WV
96 50
Philadelphia-Camden-Wilmington, PA-NJ-DE Chicago-Naperville-Joliet, IN-IN-WI
32 31
Detroit-Warren-Livonia, MI Atlanta-Sandy Springs-Marietta, GA
30 27
San Francisco-Oakland-Fremont, CA Hartford-West Hartford-East Hartford, CT
27 26
Minneapolis-St Paul-Bloomington, MN-WI Los Angeles-Long Beach-Santa Ana, CA
23 19
San Jose-Sunnyvale-Santa Clara, CA Dallas-Fort Worth-Arlington, TX
19 18
Baltimore-Towson, MD Fresno, CA
16 16
Pittsburgh, PA Houston-Baytown-Sugar Land, TX
16 15
Providence-Fall River-Warwick, MA-RI Bridgeport-Stamford-Norwalk, CT
14 12
Milwaukee-Waukesha-West Allis, WI Boston-Cambridge-Quincy, MA-NH
12 11
Kansas City, MO-KS Honolulu, HI
11 9
Fayetteville-Springdale-Rogers, AR-MO Sacramento-Arden-Arcade-Roseville, CA
8 8
Tampa-St. Petersburg-Clearwater, FL Austin-Round Rock, TX
7 6
Brownsville-Harlingen, TX Des Moines, IA
6 6
Little Rock-North Little Rock, AR New Haven, CT
6 6
Portland-Vancouver-Beaverton, OR-WA Warner Robins, GA
6 6
Orlando, FL Seattle-Tacoma-Bellevue, WA
5 5
Charlotte-Gastonia-Concord, NC-SC Indianapolis, IN
4 4
Omaha-Council Bluffs, NE-IA Peoria, IL
4 4
Rochester-Dover, NH-ME San Diego-Carlsbad-San Marcos, CA
4 4
Trenton-Ewing, NJ Tulsa, OK
4 4
Albuquerque, NM Iowa City, IA
3 3
Madison, WI Norwich-New London, CT-RI
3 3
Reno-Sparks, NV Visalia-Porterville, CA
3 3
Atlantic City, NJ Bakersfield, CA
2 2
Birmingham-Hoover, AL Burlington-South Burlington, VT
2 2
Charleston-North Charleston, SC Cleveland-Elyria-Mentor, OH
2 2
Deltona-Daytona Beach-Ormond Beach, FL Fort Wayne, IN
2 2
Las Vegas-Paradise, NV Memphis, TN-MS-AR
2 2
Miami-Fort Lauderdale-Miami Beach, FL Nashville-Davidson-Murfreesboro, TN
2 2
Ogden-Clearfield, UT Oklahoma City, OK
2 2
Oxnard-Thousand Oaks-Ventura, CA Phoenix-Mesa-Scottsdale, AZ
2 2
Rochester, NY Salt Lake City, UT
2 2
Springfield, IL Winston-Salem, NC
2 2
Anderson, SC Bloomington-Normal IL
1 1
Boise City-Nampa, ID Cincinnati-Middletown, OH-KY-IN
1 1
Columbia, SC Greenville, SC
1 1
Harrisburg-Carlisle, PA Jacksonville, FL
1 1
Lawrence, KS Naples-Marco Island, FL
1 1
New Orleans-Metairie-Kenner, LA Olympia, WA
1 1
Provo-Orem, UT Syracuse, NY
1 1
Tucson, AZ Akron, OH
1 0
Albany-Schenectady-Troy, NY Albany, GA
0 0
Allentown-Bethlehem-Easton, PA-NJ Altoona, PA
0 0
Amarillo, TX Anderson, IN
0 0
Ann Arbor, MI Anniston-Oxford, AL
0 0
Appleton,WI Asheville, NC
0 0
Athens-Clark County, GA Augusta-Richmond County, GA-SC
0 0
Bangor, ME Barnstable Town, MA
0 0
Baton Rouge, LA Beaumont-Port Author, TX
0 0
Bellingham, WA Bend, OR
0 0
Billings, MT Binghamton, NY
0 0
Bloomington, IN Boulder, CO
0 0
Bowling Green, KY Bremerton-Silverdale, WA
0 0
Buffalo-Niagara Falls, NY Canton-Massillon, OH
0 0
Cape Coral-Fort Myers, FL Cedar Rapids, IA
0 0
Champaign-Urbana, IL Charleston, WV
0 0
Chattanooga, TN-GA Chico, CA
0 0
Coeur d'Alene, ID Colorado Springs, CO
0 0
Columbia, MO Columbus, GA-AL
0 0
Columbus, OH Corpus Christi, TX
0 0
Danbury, CT Davenport-Moline-Rock Island, IA-IL
0 0
Dayton, OH Decatur, Al
0 0
Decatur, IL Denver-Aurora, CO
0 0
Dover, DE Duluth, MN-WI
0 0
Durham, NC Eau Claire, WI
0 0
El Centro, CA El Paso, TX
0 0
Erie, PA Eugene-Springfield, OR
0 0
Evansville, IN-KY Fargo, ND-MN
0 0
Farmington, NM Fayetteville, NC
0 0
Flint, MI Florence, AL
0 0
Fort Collins-Loveland, CO Fort Smith, AR-OK
0 0
Fort Walton Beach-Crestview-Destin, FL Gainesville, FL
0 0
Grand Rapids-Wyoming, MI Greeley, CO
0 0
Green Bay, WI Greensboro-High Point, NC
0 0
Gulfport-Biloxi, MS Hagerstown-Martinsburg, MD-WV
0 0
Harrisonburg, VA Hickory-Morgantown-Lenoir, NC
0 0
Holland-Grand Haven, MI Huntington-Ashland, WV-KY-OH
0 0
Huntsville, AL Jackson, MI
0 0
Jackson, MS Jacksonville, NC
0 0
Janesville, WI Johnson City, TN
0 0
Johnstown, PA Joplin, MO
0 0
Kalamazoo-Portage, MI Kankakee-Bradley, IL
0 0
Killeen-Temple-Fort Hood, TX Kingsport-Bristol, TN-VA
0 0
Kingston, NY Knoxville, TN
0 0
La Crosse, WI Lafayette, LA
0 0
Lake Charles, LA Lakeland-Winter Haven, FL
0 0
Lancaster, PA Lansing-East Lansing, MI
0 0
Laredo, TX Las Cruses, NM
0 0
Lawton, OK Leominster-Fitchburg-Gardner, MA
0 0
Lexington-Fayette, KY Longview, TX
0 0
Louisville, KY-IN Lubbock, TX
0 0
Lynchburg, VA Macon, GA
0 0
Madera, CA McAllen-Edinburg-Pharr, TX
0 0
Medford, OR Merced, CA
0 0
Michigan City-La Porte, IN Midland, TX
0 0
Mobile, AL Modesto, CA
0 0
Monroe, LA Monroe, MI
0 0
Montgomery, AL Muskegon-Norton Shores, MI
0 0
Myrtle Beach-Conway-North Myrtle Beach, SC Napa, CA
0 0
Niles-Benton Harbor, MI Ocala, FL
0 0
Ocean City, NJ Oshkosh-Neenah, WI
0 0
Palm Bay-Melbourne-Titusville, FL Panama City-Lynn Haven, FL
0 0
Pensacola-Ferry Pass-Brent, FL Port St. Lucie-Fort Pierce, FL
0 0
Portland-South Portland, ME Poughkeepsie-Newburgh-Middletown, NY
0 0
Prescott, AZ Pueblo, CO
0 0
Punta Gorda, FL Racine, WI
0 0
Raleigh-Cary, NC Reading, PA
0 0
Richmond, VA Riverside-San Bernardino, CA
0 0
Roanoke, VA Rockford, IL
0 0
Saginaw-Saginaw Township North, MI Salem, OR
0 0
Salinas, CA Salisbury, MD
0 0
San Antonio, TX San Luis Obispo-Paso Robles, CA
0 0
Santa-Cruz-Watsonville, CA Santa Barbara-Santa Maria-Goleta, CA
0 0
Santa Fe, NM Santa Rosa-Petaluma, CA
0 0
Sarasota-Bradenton-Venice, FL Savannah, GA
0 0
Scranton-Wilkes Barre, PA Shreveport-Bossier City, LA
0 0
Sioux Falls, SD South Bend-Mishawaka, IN-MI
0 0
Spartanburg, SC Spokane, WA
0 0
Springfield, MA-CT Springfield, MO
0 0
Springfield, OH St. Cloud, MN
0 0
St. Louis, MO-IL Stockton, CA
0 0
Tallahassee, FL Toledo, OH
0 0
Topeka, KS Tuscaloosa, AL
0 0
Utica-Rome, NY Valdosta, GA
0 0
Vallejo-Fairfield, CA Vero Beach, FL
0 0
Victoria, TX Vineland-Millville-Bridgeton, NJ
0 0
Virginia Beach-Norfolk-Newport News, VA-NC Waco, TX
0 0
Waterbury, CT Waterloo-Cedar Falls, IA
0 0
Wausau, WI Wichita, KS
0 0
Worcester, MA-CT Yakima, WA
0 0
York-Hanover, PA Youngstown-Warren-Boardman, OH
0 0
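When only the top metropolitan area is needed, which.max() can pick it out without printing the full sorted listing. The sketch below wraps the tapply call in a hypothetical helper, top_metro, and runs it on invented toy data; note that which.max() returns only the first maximum, so ties would need a separate check.

```r
# Hypothetical toy data standing in for CPS.
toy <- data.frame(
  MetroArea = c("A", "A", "B", "B", "B", "C", "C", NA),
  Country   = c("India", "Brazil", "India", "India", NA,
                "Brazil", "Brazil", "India")
)

# Hypothetical helper: name of the area with the most interviewees born in `country`.
top_metro <- function(country, data = toy) {
  counts <- tapply(data$Country == country, data$MetroArea, sum, na.rm = TRUE)
  names(which.max(counts))
}

top_india  <- top_metro("India")
top_brazil <- top_metro("Brazil")
```

The same helper could be called with data = CPS for any country name that appears in CPS$Country.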
Which metropolitan area has the largest number of interviewees with a country of birth in Brazil?
sort(tapply(CPS$Country=="Brazil", CPS$MetroArea, sum, na.rm=T),decreasing = T)
Boston-Cambridge-Quincy, MA-NH Miami-Fort Lauderdale-Miami Beach, FL
18 16
Los Angeles-Long Beach-Santa Ana, CA Washington-Arlington-Alexandria, DC-VA-MD-WV
9 8
Bridgeport-Stamford-Norwalk, CT New York-Northern New Jersey-Long Island, NY-NJ-PA
7 7
San Francisco-Oakland-Fremont, CA Danbury, CT
6 5
Davenport-Moline-Rock Island, IA-IL Philadelphia-Camden-Wilmington, PA-NJ-DE
4 4
Canton-Massillon, OH Phoenix-Mesa-Scottsdale, AZ
3 3
Providence-Fall River-Warwick, MA-RI Salt Lake City, UT
3 3
Barnstable Town, MA Charlotte-Gastonia-Concord, NC-SC
2 2
Chicago-Naperville-Joliet, IN-IN-WI Columbia, SC
2 2
Dallas-Fort Worth-Arlington, TX Jacksonville, FL
2 2
Orlando, FL Sacramento-Arden-Arcade-Roseville, CA
2 2
Akron, OH Albuquerque, NM
1 1
Atlanta-Sandy Springs-Marietta, GA Bremerton-Silverdale, WA
1 1
Cape Coral-Fort Myers, FL Chico, CA
1 1
Cincinnati-Middletown, OH-KY-IN Denver-Aurora, CO
1 1
Hartford-West Hartford-East Hartford, CT Kansas City, MO-KS
1 1
Leominster-Fitchburg-Gardner, MA Louisville, KY-IN
1 1
Minneapolis-St Paul-Bloomington, MN-WI Monroe, LA
1 1
Montgomery, AL Oxnard-Thousand Oaks-Ventura, CA
1 1
Pensacola-Ferry Pass-Brent, FL Racine, WI
1 1
Rochester, NY Salem, OR
1 1
San Jose-Sunnyvale-Santa Clara, CA Seattle-Tacoma-Bellevue, WA
1 1
Tampa-St. Petersburg-Clearwater, FL Trenton-Ewing, NJ
1 1
Virginia Beach-Norfolk-Newport News, VA-NC Waterbury, CT
1 1
Wichita, KS Albany-Schenectady-Troy, NY
1 0
Albany, GA Allentown-Bethlehem-Easton, PA-NJ
0 0
Altoona, PA Amarillo, TX
0 0
Anderson, IN Anderson, SC
0 0
Ann Arbor, MI Anniston-Oxford, AL
0 0
Appleton,WI Asheville, NC
0 0
Athens-Clark County, GA Atlantic City, NJ
0 0
Augusta-Richmond County, GA-SC Austin-Round Rock, TX
0 0
Bakersfield, CA Baltimore-Towson, MD
0 0
Bangor, ME Baton Rouge, LA
0 0
Beaumont-Port Author, TX Bellingham, WA
0 0
Bend, OR Billings, MT
0 0
Binghamton, NY Birmingham-Hoover, AL
0 0
Bloomington-Normal IL Bloomington, IN
0 0
Boise City-Nampa, ID Boulder, CO
0 0
Bowling Green, KY Brownsville-Harlingen, TX
0 0
Buffalo-Niagara Falls, NY Burlington-South Burlington, VT
0 0
Cedar Rapids, IA Champaign-Urbana, IL
0 0
Charleston-North Charleston, SC Charleston, WV
0 0
Chattanooga, TN-GA Cleveland-Elyria-Mentor, OH
0 0
Coeur d'Alene, ID Colorado Springs, CO
0 0
Columbia, MO Columbus, GA-AL
0 0
Columbus, OH Corpus Christi, TX
0 0
Dayton, OH Decatur, Al
0 0
Decatur, IL Deltona-Daytona Beach-Ormond Beach, FL
0 0
Des Moines, IA Detroit-Warren-Livonia, MI
0 0
Dover, DE Duluth, MN-WI
0 0
Durham, NC Eau Claire, WI
0 0
El Centro, CA El Paso, TX
0 0
Erie, PA Eugene-Springfield, OR
0 0
Evansville, IN-KY Fargo, ND-MN
0 0
Farmington, NM Fayetteville-Springdale-Rogers, AR-MO
0 0
Fayetteville, NC Flint, MI
0 0
Florence, AL Fort Collins-Loveland, CO
0 0
Fort Smith, AR-OK Fort Walton Beach-Crestview-Destin, FL
0 0
Fort Wayne, IN Fresno, CA
0 0
Gainesville, FL Grand Rapids-Wyoming, MI
0 0
Greeley, CO Green Bay, WI
0 0
Greensboro-High Point, NC Greenville, SC
0 0
Gulfport-Biloxi, MS Hagerstown-Martinsburg, MD-WV
0 0
Harrisburg-Carlisle, PA Harrisonburg, VA
0 0
Hickory-Morgantown-Lenoir, NC Holland-Grand Haven, MI
0 0
Honolulu, HI Houston-Baytown-Sugar Land, TX
0 0
Huntington-Ashland, WV-KY-OH Huntsville, AL
0 0
Indianapolis, IN Iowa City, IA
0 0
Jackson, MI Jackson, MS
0 0
Jacksonville, NC Janesville, WI
0 0
Johnson City, TN Johnstown, PA
0 0
Joplin, MO Kalamazoo-Portage, MI
0 0
Kankakee-Bradley, IL Killeen-Temple-Fort Hood, TX
0 0
Kingsport-Bristol, TN-VA Kingston, NY
0 0
Knoxville, TN La Crosse, WI
0 0
Lafayette, LA Lake Charles, LA
0 0
Lakeland-Winter Haven, FL Lancaster, PA
0 0
Lansing-East Lansing, MI Laredo, TX
0 0
Las Cruses, NM Las Vegas-Paradise, NV
0 0
Lawrence, KS Lawton, OK
0 0
Lexington-Fayette, KY Little Rock-North Little Rock, AR
0 0
Longview, TX Lubbock, TX
0 0
Lynchburg, VA Macon, GA
0 0
Madera, CA Madison, WI
0 0
McAllen-Edinburg-Pharr, TX Medford, OR
0 0
Memphis, TN-MS-AR Merced, CA
0 0
Michigan City-La Porte, IN Midland, TX
0 0
Milwaukee-Waukesha-West Allis, WI Mobile, AL
0 0
Modesto, CA Monroe, MI
0 0
Muskegon-Norton Shores, MI Myrtle Beach-Conway-North Myrtle Beach, SC
0 0
Napa, CA Naples-Marco Island, FL
0 0
Nashville-Davidson-Murfreesboro, TN New Haven, CT
0 0
New Orleans-Metairie-Kenner, LA Niles-Benton Harbor, MI
0 0
Norwich-New London, CT-RI Ocala, FL
0 0
Ocean City, NJ Ogden-Clearfield, UT
0 0
Oklahoma City, OK Olympia, WA
0 0
Omaha-Council Bluffs, NE-IA Oshkosh-Neenah, WI
0 0
Palm Bay-Melbourne-Titusville, FL Panama City-Lynn Haven, FL
0 0
Peoria, IL Pittsburgh, PA
0 0
Port St. Lucie-Fort Pierce, FL Portland-South Portland, ME
0 0
Portland-Vancouver-Beaverton, OR-WA Poughkeepsie-Newburgh-Middletown, NY
0 0
Prescott, AZ Provo-Orem, UT
0 0
Pueblo, CO Punta Gorda, FL
0 0
Raleigh-Cary, NC Reading, PA
0 0
Reno-Sparks, NV Richmond, VA
0 0
Riverside-San Bernardino, CA Roanoke, VA
0 0
Rochester-Dover, NH-ME Rockford, IL
0 0
Saginaw-Saginaw Township North, MI Salinas, CA
0 0
Salisbury, MD San Antonio, TX
0 0
San Diego-Carlsbad-San Marcos, CA San Luis Obispo-Paso Robles, CA
0 0
Santa-Cruz-Watsonville, CA Santa Barbara-Santa Maria-Goleta, CA
0 0
Santa Fe, NM Santa Rosa-Petaluma, CA
0 0
Sarasota-Bradenton-Venice, FL Savannah, GA
0 0
Scranton-Wilkes Barre, PA Shreveport-Bossier City, LA
0 0
Sioux Falls, SD South Bend-Mishawaka, IN-MI
0 0
Spartanburg, SC Spokane, WA
0 0
Springfield, IL Springfield, MA-CT
0 0
Springfield, MO Springfield, OH
0 0
St. Cloud, MN St. Louis, MO-IL
0 0
Stockton, CA Syracuse, NY
0 0
Tallahassee, FL Toledo, OH
0 0
Topeka, KS Tucson, AZ
0 0
Tulsa, OK Tuscaloosa, AL
0 0
Utica-Rome, NY Valdosta, GA
0 0
Vallejo-Fairfield, CA Vero Beach, FL
0 0
Victoria, TX Vineland-Millville-Bridgeton, NJ
0 0
Visalia-Porterville, CA Waco, TX
0 0
Warner Robins, GA Waterloo-Cedar Falls, IA
0 0
Wausau, WI Winston-Salem, NC
0 0
Worcester, MA-CT Yakima, WA
0 0
York-Hanover, PA Youngstown-Warren-Boardman, OH
0 0
Which metropolitan area has the largest number of interviewees with a country of birth in Somalia?
sort(tapply(CPS$Country == "Somalia", CPS$MetroArea, sum, na.rm = TRUE), decreasing = TRUE)
Minneapolis-St Paul-Bloomington, MN-WI  Phoenix-Mesa-Scottsdale, AZ
17                                      7
Seattle-Tacoma-Bellevue, WA             St. Cloud, MN
7                                       7
Columbus, OH                            Fargo, ND-MN
5                                       5
Burlington-South Burlington, VT         Portland-South Portland, ME
3                                       3
Portland-Vancouver-Beaverton, OR-WA     Houston-Baytown-Sugar Land, TX
3                                       2
Sioux Falls, SD                         Dayton, OH
2                                       1
Richmond, VA                            Akron, OH
1                                       0
Albany-Schenectady-Troy, NY Albany, GA
0 0
Albuquerque, NM Allentown-Bethlehem-Easton, PA-NJ
0 0
Altoona, PA Amarillo, TX
0 0
Anderson, IN Anderson, SC
0 0
Ann Arbor, MI Anniston-Oxford, AL
0 0
Appleton,WI Asheville, NC
0 0
Athens-Clark County, GA Atlanta-Sandy Springs-Marietta, GA
0 0
Atlantic City, NJ Augusta-Richmond County, GA-SC
0 0
Austin-Round Rock, TX Bakersfield, CA
0 0
Baltimore-Towson, MD Bangor, ME
0 0
Barnstable Town, MA Baton Rouge, LA
0 0
Beaumont-Port Author, TX Bellingham, WA
0 0
Bend, OR Billings, MT
0 0
Binghamton, NY Birmingham-Hoover, AL
0 0
Bloomington-Normal IL Bloomington, IN
0 0
Boise City-Nampa, ID Boston-Cambridge-Quincy, MA-NH
0 0
Boulder, CO Bowling Green, KY
0 0
Bremerton-Silverdale, WA Bridgeport-Stamford-Norwalk, CT
0 0
Brownsville-Harlingen, TX Buffalo-Niagara Falls, NY
0 0
Canton-Massillon, OH Cape Coral-Fort Myers, FL
0 0
Cedar Rapids, IA Champaign-Urbana, IL
0 0
Charleston-North Charleston, SC Charleston, WV
0 0
Charlotte-Gastonia-Concord, NC-SC Chattanooga, TN-GA
0 0
Chicago-Naperville-Joliet, IN-IN-WI Chico, CA
0 0
Cincinnati-Middletown, OH-KY-IN Cleveland-Elyria-Mentor, OH
0 0
Coeur d'Alene, ID Colorado Springs, CO
0 0
Columbia, MO Columbia, SC
0 0
Columbus, GA-AL Corpus Christi, TX
0 0
Dallas-Fort Worth-Arlington, TX Danbury, CT
0 0
Davenport-Moline-Rock Island, IA-IL Decatur, Al
0 0
Decatur, IL Deltona-Daytona Beach-Ormond Beach, FL
0 0
Denver-Aurora, CO Des Moines, IA
0 0
Detroit-Warren-Livonia, MI Dover, DE
0 0
Duluth, MN-WI Durham, NC
0 0
Eau Claire, WI El Centro, CA
0 0
El Paso, TX Erie, PA
0 0
Eugene-Springfield, OR Evansville, IN-KY
0 0
Farmington, NM Fayetteville-Springdale-Rogers, AR-MO
0 0
Fayetteville, NC Flint, MI
0 0
Florence, AL Fort Collins-Loveland, CO
0 0
Fort Smith, AR-OK Fort Walton Beach-Crestview-Destin, FL
0 0
Fort Wayne, IN Fresno, CA
0 0
Gainesville, FL Grand Rapids-Wyoming, MI
0 0
Greeley, CO Green Bay, WI
0 0
Greensboro-High Point, NC Greenville, SC
0 0
Gulfport-Biloxi, MS Hagerstown-Martinsburg, MD-WV
0 0
Harrisburg-Carlisle, PA Harrisonburg, VA
0 0
Hartford-West Hartford-East Hartford, CT Hickory-Morgantown-Lenoir, NC
0 0
Holland-Grand Haven, MI Honolulu, HI
0 0
Huntington-Ashland, WV-KY-OH Huntsville, AL
0 0
Indianapolis, IN Iowa City, IA
0 0
Jackson, MI Jackson, MS
0 0
Jacksonville, FL Jacksonville, NC
0 0
Janesville, WI Johnson City, TN
0 0
Johnstown, PA Joplin, MO
0 0
Kalamazoo-Portage, MI Kankakee-Bradley, IL
0 0
Kansas City, MO-KS Killeen-Temple-Fort Hood, TX
0 0
Kingsport-Bristol, TN-VA Kingston, NY
0 0
Knoxville, TN La Crosse, WI
0 0
Lafayette, LA Lake Charles, LA
0 0
Lakeland-Winter Haven, FL Lancaster, PA
0 0
Lansing-East Lansing, MI Laredo, TX
0 0
Las Cruses, NM Las Vegas-Paradise, NV
0 0
Lawrence, KS Lawton, OK
0 0
Leominster-Fitchburg-Gardner, MA Lexington-Fayette, KY
0 0
Little Rock-North Little Rock, AR Longview, TX
0 0
Los Angeles-Long Beach-Santa Ana, CA Louisville, KY-IN
0 0
Lubbock, TX Lynchburg, VA
0 0
Macon, GA Madera, CA
0 0
Madison, WI McAllen-Edinburg-Pharr, TX
0 0
Medford, OR Memphis, TN-MS-AR
0 0
Merced, CA Miami-Fort Lauderdale-Miami Beach, FL
0 0
Michigan City-La Porte, IN Midland, TX
0 0
Milwaukee-Waukesha-West Allis, WI Mobile, AL
0 0
Modesto, CA Monroe, LA
0 0
Monroe, MI Montgomery, AL
0 0
Muskegon-Norton Shores, MI Myrtle Beach-Conway-North Myrtle Beach, SC
0 0
Napa, CA Naples-Marco Island, FL
0 0
Nashville-Davidson-Murfreesboro, TN New Haven, CT
0 0
New Orleans-Metairie-Kenner, LA New York-Northern New Jersey-Long Island, NY-NJ-PA
0 0
Niles-Benton Harbor, MI Norwich-New London, CT-RI
0 0
Ocala, FL Ocean City, NJ
0 0
Ogden-Clearfield, UT Oklahoma City, OK
0 0
Olympia, WA Omaha-Council Bluffs, NE-IA
0 0
Orlando, FL Oshkosh-Neenah, WI
0 0
Oxnard-Thousand Oaks-Ventura, CA Palm Bay-Melbourne-Titusville, FL
0 0
Panama City-Lynn Haven, FL Pensacola-Ferry Pass-Brent, FL
0 0
Peoria, IL Philadelphia-Camden-Wilmington, PA-NJ-DE
0 0
Pittsburgh, PA Port St. Lucie-Fort Pierce, FL
0 0
Poughkeepsie-Newburgh-Middletown, NY Prescott, AZ
0 0
Providence-Fall River-Warwick, MA-RI Provo-Orem, UT
0 0
Pueblo, CO Punta Gorda, FL
0 0
Racine, WI Raleigh-Cary, NC
0 0
Reading, PA Reno-Sparks, NV
0 0
Riverside-San Bernardino, CA Roanoke, VA
0 0
Rochester-Dover, NH-ME Rochester, NY
0 0
Rockford, IL Sacramento-Arden-Arcade-Roseville, CA
0 0
Saginaw-Saginaw Township North, MI Salem, OR
0 0
Salinas, CA Salisbury, MD
0 0
Salt Lake City, UT San Antonio, TX
0 0
San Diego-Carlsbad-San Marcos, CA San Francisco-Oakland-Fremont, CA
0 0
San Jose-Sunnyvale-Santa Clara, CA San Luis Obispo-Paso Robles, CA
0 0
Santa-Cruz-Watsonville, CA Santa Barbara-Santa Maria-Goleta, CA
0 0
Santa Fe, NM Santa Rosa-Petaluma, CA
0 0
Sarasota-Bradenton-Venice, FL Savannah, GA
0 0
Scranton-Wilkes Barre, PA Shreveport-Bossier City, LA
0 0
South Bend-Mishawaka, IN-MI Spartanburg, SC
0 0
Spokane, WA Springfield, IL
0 0
Springfield, MA-CT Springfield, MO
0 0
Springfield, OH St. Louis, MO-IL
0 0
Stockton, CA Syracuse, NY
0 0
Tallahassee, FL Tampa-St. Petersburg-Clearwater, FL
0 0
Toledo, OH Topeka, KS
0 0
Trenton-Ewing, NJ Tucson, AZ
0 0
Tulsa, OK Tuscaloosa, AL
0 0
Utica-Rome, NY Valdosta, GA
0 0
Vallejo-Fairfield, CA Vero Beach, FL
0 0
Victoria, TX Vineland-Millville-Bridgeton, NJ
0 0
Virginia Beach-Norfolk-Newport News, VA-NC Visalia-Porterville, CA
0 0
Waco, TX Warner Robins, GA
0 0
Washington-Arlington-Alexandria, DC-VA-MD-WV Waterbury, CT
0 0
Waterloo-Cedar Falls, IA Wausau, WI
0 0
Wichita, KS Winston-Salem, NC
0 0
Worcester, MA-CT Yakima, WA
0 0
York-Hanover, PA Youngstown-Warren-Boardman, OH
0 0
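Most of the sorted output above is a long run of zeros, so the answer is easier to read if the counts are filtered before printing. The following is a minimal sketch of that idea, not part of the original exercise. It uses a small made-up data frame named toy (a hypothetical stand-in for the merged CPS data frame, with the same MetroArea and Country column names) so that it runs on its own.

```r
# Toy stand-in for the merged CPS data frame.
# These rows are invented for illustration; they are not CPS records.
toy <- data.frame(
  MetroArea = c("Minneapolis-St Paul-Bloomington, MN-WI",
                "Minneapolis-St Paul-Bloomington, MN-WI",
                "St. Cloud, MN",
                "Columbus, OH",
                "Columbus, OH",
                NA),
  Country = c("Somalia", "Somalia", "Somalia", "United States", NA, "Somalia"),
  stringsAsFactors = FALSE
)

# Count Somalia-born interviewees per metropolitan area.
# na.rm = TRUE drops interviewees whose country of birth is missing;
# rows with a missing MetroArea are left out of the grouping by tapply.
counts <- tapply(toy$Country == "Somalia", toy$MetroArea, sum, na.rm = TRUE)

# Keep only areas with at least one match, then sort in decreasing order.
nonzero <- sort(counts[counts > 0], decreasing = TRUE)
print(nonzero)

# The first name in the sorted result is the area with the largest count.
top_area <- names(nonzero)[1]
print(top_area)
```

On the real data, the same filter would be applied to the result of the tapply call on CPS$Country and CPS$MetroArea. Wrapping the sorted result in head() is another option when only the first few areas are of interest.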