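The first data set records a company's shipping expenses in a wide layout that contains many empty columns and missing values.
# melt() with variable.name/value.name matches reshape2, and unite() comes from tidyr (assumed to be the packages used here)
library(reshape2)
library(tidyr)
shipment <- read.csv("https://raw.githubusercontent.com/yli1048/yli1048/refs/heads/607/Shipment.csv", header = TRUE, skip = 2)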
head(shipment)
## Order.Date X X.1 X.2 X.3 X.4 X.5 X.6 X.7 X.8 X.9 X.10 X.11
## 1 2013/3/14 NA NA NA NA NA NA NA NA NA NA NA 91.056
## 2 2013/12/16 NA NA NA NA NA NA 129.44 NA NA NA NA NA
## 3 2013/6/2 NA NA NA NA NA NA NA NA NA 605.47 NA NA
## 4 2013/10/21 NA NA NA NA NA NA NA NA NA NA 788.86 NA
## 5 2013/8/27 NA NA NA NA NA NA 13.36 NA NA NA NA NA
## 6 2013/11/28 NA NA NA NA NA NA NA 542.34 NA NA NA NA
shipment_1 = shipment[c(1, 2, 3, 4)]
names(shipment_1)[1] <- "Order_Date"
names(shipment_1)[2] <- "Consumer"
names(shipment_1)[3] <- "Corporate"
names(shipment_1)[4] <- "Home_Office"
shipment_1 <- melt(shipment_1, id.vars = c("Order_Date"), variable.name = "Segment", value.name = "Amount")
first_class <- shipment_1[!is.na(shipment_1$Amount),]
print(first_class)
## Order_Date Segment Amount
## 12 2013/1/15 Consumer 149.950
## 21 2013/8/15 Consumer 243.600
## 38 2013/7/5 Corporate 242.546
## 41 2013/3/19 Corporate 590.762
## 43 2013/1/6 Corporate 12.780
## 54 2013/5/5 Corporate 47.320
shipment_3 = shipment[c(1, 8, 9, 10)]
names(shipment_3)[1] <- "Order_Date"
names(shipment_3)[2] <- "Consumer"
names(shipment_3)[3] <- "Corporate"
names(shipment_3)[4] <- "Home_Office"
shipment_3 <- melt(shipment_3, id.vars = c("Order_Date"), variable.name = "Segment", value.name = "Amount")
second_class <- shipment_3[!is.na(shipment_3$Amount),]
print(second_class)
## Order_Date Segment Amount
## 2 2013/12/16 Consumer 129.440
## 5 2013/8/27 Consumer 13.360
## 22 2013/1/13 Consumer 545.940
## 25 2013/1/21 Consumer 25.248
## 33 2013/11/28 Corporate 542.340
## 37 2013/4/5 Corporate 4251.920
## 51 2013/10/18 Corporate 2216.800
shipment_4 = shipment[c(1, 11, 12, 13)]
names(shipment_4)[1] <- "Order_Date"
names(shipment_4)[2] <- "Consumer"
names(shipment_4)[3] <- "Corporate"
names(shipment_4)[4] <- "Home_Office"
shipment_4 <- melt(shipment_4, id.vars = c("Order_Date"), variable.name = "Segment", value.name = "Amount")
standard_class <- shipment_4[!is.na(shipment_4$Amount),]
print(standard_class)
## Order_Date Segment Amount
## 3 2013/6/2 Consumer 605.470
## 15 2013/6/27 Consumer 616.140
## 18 2013/12/12 Consumer 23.472
## 23 2013/4/25 Consumer 302.376
## 31 2013/10/21 Corporate 788.860
## 34 2013/3/31 Corporate 1.869
## 35 2013/11/21 Corporate 865.500
## 36 2013/11/1 Corporate 1044.440
## 40 2013/12/2 Corporate 21.190
## 44 2013/5/14 Corporate 310.880
## 46 2013/4/29 Corporate 661.504
## 53 2013/8/17 Corporate 484.790
## 55 2013/3/14 Home_Office 91.056
## 74 2013/10/24 Home_Office 10.368
Splitting the wide shipment data set by shipping method and reshaping each piece to long format makes it easier to read. The resulting data frames could then be combined into one, with an added column identifying the shipping method, so that every row carries its date, segment, shipping method, and amount. Many rows and columns were empty, especially those for same-day delivery.
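The combining step described above could look roughly like the following. This is only a sketch: the three data frames are small stand-ins built from a few rows of the printed output, and the `Ship_Mode` column name and its labels are assumptions introduced for illustration, not names taken from the source file.

```r
# Small stand-ins for the long-format frames (a few rows copied from the output above)
first_class <- data.frame(
  Order_Date = c("2013/1/15", "2013/7/5"),
  Segment = c("Consumer", "Corporate"),
  Amount = c(149.950, 242.546),
  stringsAsFactors = FALSE
)
second_class <- data.frame(
  Order_Date = c("2013/12/16", "2013/11/28"),
  Segment = c("Consumer", "Corporate"),
  Amount = c(129.440, 542.340),
  stringsAsFactors = FALSE
)
standard_class <- data.frame(
  Order_Date = c("2013/6/2", "2013/3/14"),
  Segment = c("Consumer", "Home_Office"),
  Amount = c(605.470, 91.056),
  stringsAsFactors = FALSE
)

# Tag each frame with its shipping method (hypothetical column name and labels)
first_class$Ship_Mode <- "First Class"
second_class$Ship_Mode <- "Second Class"
standard_class$Ship_Mode <- "Standard Class"

# Stack the frames into one long data frame
shipments_long <- rbind(first_class, second_class, standard_class)

# Convert the date strings (year/month/day) into Date values
shipments_long$Order_Date <- as.Date(shipments_long$Order_Date, format = "%Y/%m/%d")

# Total amount per shipping method
totals <- aggregate(Amount ~ Ship_Mode, data = shipments_long, FUN = sum)
print(totals)
```

With the shipping method stored as a column, a single data frame can be filtered or summarized by method and segment without keeping track of three separate objects.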
The second data set comes from the DOHMH New York City Restaurant Inspection Results. Because of its size, only selected, reorganized portions of it are shown.
inspection <- read.csv("https://raw.githubusercontent.com/yli1048/yli1048/refs/heads/607/NYC_Restaurant_Inspection_Results.csv", header = TRUE)
store_info <- inspection[c(1, 2, 7, 8)]
names(store_info)[2] <- "NAME"
head(store_info)
## CAMIS NAME PHONE CUISINE.DESCRIPTION
## 1 50139400 ANDIE'S EATS 9143648113 Bakery Products/Desserts
## 2 41243535 EMPIRE CORNER II 2124105756 Chinese
## 3 41470527 HEAVEN'S HOT BAGEL 2124207566 Bagels/Pretzels
## 4 40373462 VILLA BERULIA 2126891970 Italian
## 5 50128067 GEORGIAN CORNER 3472400940 Eastern European
## 6 50104255 WO HOP NEXT DOOR 9175510233 Chinese
location <- inspection[c(1, 4, 5, 6, 3, 19, 20)]
location_1 = unite(location, address, c(BUILDING, STREET, ZIPCODE, BORO))
location_new = unite(location_1, coordinates, c(Latitude, Longitude))
head(location_new)
## CAMIS address
## 1 50139400 185_BLEECKER STREET_10012_Manhattan
## 2 41243535 1415_5 AVENUE_10029_Manhattan
## 3 41470527 283_EAST HOUSTON STREET_10002_Manhattan
## 4 40373462 107_EAST 34 STREET_10016_Manhattan
## 5 50128067 626_SHEEPSHEAD BAY ROAD_11224_Brooklyn
## 6 50104255 15_MOTT STREET_10013_Manhattan
## coordinates
## 1 40.729078531416_-74.001006642818
## 2 40.800487432208_-73.946572413814
## 3 40.721573256914_-73.984205791876
## 4 40.746865746167_-73.980821742973
## 5 40.578972142427_-73.975261496585
## 6 40.714179932664_-73.998755505932
violations <- inspection[c(1, 9, 10, 11, 12, 13)]
violations = unite(violations, inspection, c(INSPECTION.DATE, ACTION))
violations = unite(violations, violations, c(CRITICAL.FLAG, VIOLATION.CODE, VIOLATION.DESCRIPTION))
head(violations)
## CAMIS
## 1 50139400
## 2 41243535
## 3 41470527
## 4 40373462
## 5 50128067
## 6 50104255
## inspection
## 1 01/11/2024_Violations were cited in the following area(s).
## 2 04/02/2024_Violations were cited in the following area(s).
## 3 01/21/2022_Violations were cited in the following area(s).
## 4 04/13/2022_Violations were cited in the following area(s).
## 5 08/14/2024_Violations were cited in the following area(s).
## 6 08/11/2021_Establishment Closed by DOHMH. Violations were cited in the following area(s) and those requiring immediate action were addressed.
## violations
## 1 Not Critical_10B_Anti-siphonage or back-flow prevention device not provided where required; equipment or floor not properly drained; sewage disposal system in disrepair or not functioning properly. Condensation or liquid waste improperly disposed of.
## 2 Not Critical_10F_Non-food contact surface or equipment made of unacceptable material, not kept clean, or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit.
## 3 Not Critical_10F_Non-food contact surface improperly constructed. Unacceptable material used. Non-food contact surface or equipment improperly maintained and/or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit.
## 4 Critical_06D_Food contact surface not properly washed, rinsed and sanitized after each use and following any activity when contamination may have occurred.
## 5 Not Critical_10F_Non-food contact surface or equipment made of unacceptable material, not kept clean, or not properly sealed, raised, spaced or movable to allow accessibility for cleaning on all sides, above and underneath the unit.
## 6 Critical_04M_Live roaches present in facility's food and/or non-food areas.
inspection_details <- inspection[c(1, 14, 15, 16, 17, 18)]
inspection_details = unite(inspection_details, score, c(RECORD.DATE, SCORE))
inspection_details = unite(inspection_details, grade, c(GRADE.DATE, GRADE))
inspection_details <- melt(inspection_details, id.vars = c("CAMIS"), variable.name = "record", value.name = "Date_and_value")
head(inspection_details)
## CAMIS record Date_and_value
## 1 50139400 score 10/09/2024_70
## 2 41243535 score 10/09/2024_5
## 3 41470527 score 10/09/2024_41
## 4 40373462 score 10/09/2024_11
## 5 50128067 score 10/09/2024_17
## 6 50104255 score 10/09/2024_44
community <- inspection[c(1, 21, 22, 23, 24, 25, 26)]
community <- melt(community, id.vars = c("CAMIS"), variable.name = "community", value.name = "info")
head(community)
## CAMIS community info
## 1 50139400 Community.Board 102
## 2 41243535 Community.Board 111
## 3 41470527 Community.Board 103
## 4 40373462 Community.Board 106
## 5 50128067 Community.Board 313
## 6 50104255 Community.Board 103
The full data set has 27 columns, which can be grouped into smaller data frames by topic: store details, location, violations, inspection results, and community information. CAMIS uniquely identifies each store, so it serves as the key that represents the store in every data frame. Splitting the wide data set this way makes data retrieval easier and keeps the data better organized.
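Because every smaller data frame keeps CAMIS, the pieces can be joined back together when needed. The sketch below uses tiny made-up stand-ins for two of the data frames. It also assumes that the source file can repeat a store across several rows (for example, one row per cited violation), so duplicates are dropped before joining.

```r
# Tiny stand-ins for two of the split data frames (rows are illustrative)
store_info <- data.frame(
  CAMIS = c(50139400, 41243535, 41243535),
  NAME = c("ANDIE'S EATS", "EMPIRE CORNER II", "EMPIRE CORNER II"),
  stringsAsFactors = FALSE
)
location <- data.frame(
  CAMIS = c(50139400, 41243535, 41243535),
  BORO = c("Manhattan", "Manhattan", "Manhattan"),
  ZIPCODE = c(10012, 10029, 10029),
  stringsAsFactors = FALSE
)

# Keep one row per store in each piece
stores <- unique(store_info)
places <- unique(location)

# Join the pieces on the shared identifier
joined <- merge(stores, places, by = "CAMIS")
print(joined)
```

Removing duplicates first keeps the join from multiplying rows when the same CAMIS appears several times in both pieces.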
The third data set, from FiveThirtyEight, covers marriage figures for both sexes, broken down by age group and, within each age group, by conditions such as education, race, region, and family income. It can be split into separate data frames by category so that the information is presented in a more systematic, organized way.
marriages <- read.csv("https://raw.githubusercontent.com/fivethirtyeight/data/refs/heads/master/marriage/both_sexes.csv", header = TRUE)
head(marriages)
## X year date all_2534 HS_2534 SC_2534 BAp_2534 BAo_2534 GD_2534
## 1 1 1960 1960-01-01 0.1233145 0.1095332 0.1522818 0.2389952 0.2389952 NA
## 2 2 1970 1970-01-01 0.1269715 0.1094000 0.1495096 0.2187031 0.2187031 NA
## 3 3 1980 1980-01-01 0.1991767 0.1617313 0.2236916 0.2881646 0.2881646 NA
## 4 4 1990 1990-01-01 0.2968306 0.2777491 0.2780912 0.3612968 0.3656655 0.3474505
## 5 5 2000 2000-01-01 0.3450087 0.3316545 0.3249205 0.3874906 0.3939579 0.3691740
## 6 6 2001 2001-01-01 0.3527767 0.3446069 0.3341101 0.3835686 0.3925148 0.3590304
## White_2534 Black_2534 Hisp_2534 NE_2534 MA_2534 Midwest_2534 South_2534
## 1 0.1164848 0.1621855 0.1393736 0.1504184 0.1628934 0.1121467 0.1090562
## 2 0.1179043 0.1855163 0.1298769 0.1517231 0.1640680 0.1153741 0.1126220
## 3 0.1824126 0.3137500 0.1885440 0.2414327 0.2505925 0.1828339 0.1688435
## 4 0.2639256 0.4838556 0.2962372 0.3500384 0.3623321 0.2755046 0.2639794
## 5 0.3127149 0.5144994 0.3180681 0.4091852 0.4175565 0.3308022 0.3099712
## 6 0.3183506 0.5437985 0.3321214 0.4200581 0.4294281 0.3344332 0.3182688
## Mountain_2534 Pacific_2534 poor_2534 mid_2534 rich_2534 all_3544
## 1 0.09152117 0.1198758 0.1371597 0.07514929 0.2066776 0.07058157
## 2 0.10293602 0.1374964 0.1717202 0.08159207 0.1724093 0.06732520
## 3 0.17434230 0.2334279 0.3100591 0.14825303 0.1851082 0.06883378
## 4 0.25264326 0.3319579 0.4199108 0.24320008 0.2783226 0.11191800
## 5 0.30621032 0.3753061 0.5033676 0.30202036 0.2717386 0.15605881
## 6 0.30980779 0.3844799 0.5178771 0.31716118 0.2532041 0.15642529
## HS_3544 SC_3544 BAp_3544 BAo_3544 GD_3544 White_3544 Black_3544
## 1 0.06860309 0.06663695 0.1326265 0.1326265 NA 0.06825586 0.08836728
## 2 0.06511964 0.06271724 0.1116899 0.1116899 NA 0.06250372 0.10290904
## 3 0.06429102 0.06531333 0.1056102 0.1056102 NA 0.05966739 0.13140081
## 4 0.11210043 0.09699372 0.1285172 0.1258567 0.1328018 0.09611312 0.22010298
## 5 0.16993703 0.13800404 0.1541238 0.1536299 0.1550970 0.13207032 0.30239381
## 6 0.16870156 0.13986044 0.1548151 0.1524923 0.1595169 0.13287455 0.30857796
## Hisp_3544 NE_3544 MA_3544 Midwest_3544 South_3544 Mountain_3544
## 1 0.07307651 0.09194322 0.09347468 0.06863360 0.06026353 0.04739747
## 2 0.07070500 0.08570110 0.09040725 0.06156272 0.05966057 0.04651163
## 3 0.08110790 0.07997323 0.09744428 0.06070641 0.05914089 0.04880077
## 4 0.12194206 0.12785915 0.14354989 0.10157576 0.09637035 0.09189904
## 5 0.15469520 0.17327422 0.18819256 0.14539201 0.14230600 0.13584194
## 6 0.14953050 0.16653497 0.18315109 0.14794407 0.14312592 0.13943820
## Pacific_3544 poor_3544 mid_3544 rich_3544 all_4554 HS_4554 SC_4554
## 1 0.05822486 0.1019749 0.04717272 0.08553870 0.07254649 0.06840792 0.07903755
## 2 0.06347796 0.1117548 0.04566838 0.06499159 0.05968794 0.05833439 0.05443478
## 3 0.07552538 0.1291426 0.05050321 0.04445951 0.05250871 0.05036563 0.04816180
## 4 0.13134638 0.2012208 0.09024739 0.06573916 0.05947824 0.05988244 0.04654087
## 5 0.17480047 0.2813137 0.12815751 0.08622046 0.08804394 0.09442809 0.07558786
## 6 0.17694864 0.2919112 0.13267625 0.06803283 0.08823342 0.09189007 0.07795481
## BAp_4554 BAo_4554 GD_4554 White_4554 Black_4554 Hisp_4554 NE_4554
## 1 0.15360889 0.15360889 NA 0.07246692 0.06913249 0.06636058 0.10236412
## 2 0.10466047 0.10466047 NA 0.05754799 0.07899168 0.05810740 0.08028082
## 3 0.08623774 0.08623774 NA 0.04765354 0.08624602 0.06522951 0.06930253
## 4 0.07301884 0.06416529 0.08394886 0.05092552 0.11617699 0.07613556 0.07047502
## 5 0.09208417 0.09097472 0.09362802 0.07578174 0.17587334 0.09418009 0.10232170
## 6 0.09333365 0.09313480 0.09362876 0.07516912 0.18154531 0.09409896 0.09868408
## MA_4554 Midwest_4554 South_4554 Mountain_4554 Pacific_4554 poor_4554
## 1 0.09264788 0.07285321 0.05977295 0.04754183 0.05996993 0.1030055
## 2 0.07860635 0.05791163 0.05174462 0.03970134 0.04826312 0.1016489
## 3 0.07508466 0.04807290 0.04485348 0.03374438 0.04958992 0.1003011
## 4 0.08373134 0.05398391 0.05043636 0.04459411 0.06461875 0.1148335
## 5 0.11269659 0.08302437 0.07631858 0.07637774 0.09896832 0.1718976
## 6 0.10953635 0.08207629 0.07886513 0.07405971 0.10119511 0.1759369
## mid_4554 rich_4554 nokids_all_2534 kids_all_2534 nokids_HS_2534
## 1 0.05364421 0.07908591 0.4640564 0.002820625 0.4430148
## 2 0.04221637 0.05142867 0.4309043 0.009868596 0.4246779
## 3 0.03830266 0.03311296 0.4464304 0.025285667 0.4319342
## 4 0.04562332 0.03136386 0.5425242 0.060277451 0.5464881
## 5 0.07055672 0.03897342 0.5714531 0.099472713 0.5711395
## 6 0.07407508 0.02857320 0.5852213 0.110178467 0.6045475
## nokids_SC_2534 nokids_BAp_2534 nokids_BAo_2534 nokids_GD_2534 kids_HS_2534
## 1 0.5000402 0.5619099 0.5619099 NA 0.003318886
## 2 0.4333479 0.4554766 0.4554766 NA 0.012465915
## 3 0.4505900 0.4719700 0.4719700 NA 0.031930752
## 4 0.5238446 0.5560765 0.5633301 0.5332628 0.078470444
## 5 0.5700042 0.5729677 0.5862213 0.5367160 0.127193577
## 6 0.5810912 0.5698644 0.5864967 0.5258800 0.141395652
## kids_SC_2534 kids_BAp_2534 kids_BAo_2534 kids_GD_2534 nokids_poor_2534
## 1 0.001150824 0.0005751073 0.0005751073 NA 0.4933061
## 2 0.003699982 0.0014683425 0.0014683425 NA 0.5097742
## 3 0.018135401 0.0062544364 0.0062544364 NA 0.5740402
## 4 0.052032702 0.0171241042 0.0181766027 0.01374234 0.6546908
## 5 0.097625310 0.0370024452 0.0401009875 0.02761467 0.7055451
## 6 0.110030662 0.0399801447 0.0445838012 0.02645041 0.7147334
## nokids_mid_2534 nokids_rich_2534 kids_poor_2534 kids_mid_2534 kids_rich_2534
## 1 0.4100080 0.4921184 0.008722711 0.0007532065 0.0008027331
## 2 0.3764538 0.4288948 0.029974945 0.0033771145 0.0030435661
## 3 0.3998250 0.3848089 0.077926214 0.0102368871 0.0068317224
## 4 0.5186604 0.4750156 0.170763774 0.0274655254 0.0182329127
## 5 0.5690228 0.4458023 0.256281918 0.0597845173 0.0295644698
## 6 0.5864741 0.4461111 0.280146488 0.0677954572 0.0336540502
#By age of 25-34
age2534 <- marriages[c(2, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21)]
#By education
edu_2534 <- age2534[c(1, 3, 4, 5, 6, 7)]
names(edu_2534)[2] <- "High_School"
names(edu_2534)[3] <- "Some_College"
names(edu_2534)[4] <- "Bachelor's"
names(edu_2534)[5] <- "No_Graduate"
names(edu_2534)[6] <- "Graduate"
edu_2534 <- melt(edu_2534, id.vars = c("year"), variable.name = "education", value.name = "rate")
edu_2534 <- edu_2534[!is.na(edu_2534$rate),]
head(edu_2534)
## year education rate
## 1 1960 High_School 0.1095332
## 2 1970 High_School 0.1094000
## 3 1980 High_School 0.1617313
## 4 1990 High_School 0.2777491
## 5 2000 High_School 0.3316545
## 6 2001 High_School 0.3446069
#By race
race_2534 <- age2534[c(1, 8, 9, 10)]
names(race_2534)[2] <- "White"
names(race_2534)[3] <- "Black"
names(race_2534)[4] <- "Hispanic"
race_2534 <- melt(race_2534, id.vars = c("year"), variable.name = "race", value.name = "rate")
head(race_2534)
## year race rate
## 1 1960 White 0.1164848
## 2 1970 White 0.1179043
## 3 1980 White 0.1824126
## 4 1990 White 0.2639256
## 5 2000 White 0.3127149
## 6 2001 White 0.3183506
#By region
region_2534 <- age2534[c(1, 11, 12, 13, 14, 15, 16)]
names(region_2534)[2] <- "New_England"
names(region_2534)[3] <- "Mid_Atlantic"
names(region_2534)[4] <- "Midwest"
names(region_2534)[5] <- "South"
names(region_2534)[6] <- "Mountain_West"
names(region_2534)[7] <- "Pacific"
region_2534 <- melt(region_2534, id.vars = c("year"), variable.name = "region", value.name = "rate")
head(region_2534)
## year region rate
## 1 1960 New_England 0.1504184
## 2 1970 New_England 0.1517231
## 3 1980 New_England 0.2414327
## 4 1990 New_England 0.3500384
## 5 2000 New_England 0.4091852
## 6 2001 New_England 0.4200581
#By family income
income_2534 <- age2534[c(1, 17, 18, 19)]
names(income_2534)[2] <- "Low_25%"
names(income_2534)[3] <- "Middle_50%"
names(income_2534)[4] <- "Top_25%"
income_2534 <- melt(income_2534, id.vars = c("year"), variable.name = "income", value.name = "rate")
head(income_2534)
## year income rate
## 1 1960 Low_25% 0.1371597
## 2 1970 Low_25% 0.1717202
## 3 1980 Low_25% 0.3100591
## 4 1990 Low_25% 0.4199108
## 5 2000 Low_25% 0.5033676
## 6 2001 Low_25% 0.5178771
#By age of 35-44
age3544 <- marriages[c(2, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39)]
#By education
edu_3544 <- age3544[c(1, 3, 4, 5, 6, 7)]
names(edu_3544)[2] <- "High_School"
names(edu_3544)[3] <- "Some_College"
names(edu_3544)[4] <- "Bachelor's"
names(edu_3544)[5] <- "No_Graduate"
names(edu_3544)[6] <- "Graduate"
edu_3544 <- melt(edu_3544, id.vars = c("year"), variable.name = "education", value.name = "rate")
edu_3544 <- edu_3544[!is.na(edu_3544$rate),]
head(edu_3544)
## year education rate
## 1 1960 High_School 0.06860309
## 2 1970 High_School 0.06511964
## 3 1980 High_School 0.06429102
## 4 1990 High_School 0.11210043
## 5 2000 High_School 0.16993703
## 6 2001 High_School 0.16870156
#By race
race_3544 <- age3544[c(1, 8, 9, 10)]
names(race_3544)[2] <- "White"
names(race_3544)[3] <- "Black"
names(race_3544)[4] <- "Hispanic"
race_3544 <- melt(race_3544, id.vars = c("year"), variable.name = "race", value.name = "rate")
head(race_3544)
## year race rate
## 1 1960 White 0.06825586
## 2 1970 White 0.06250372
## 3 1980 White 0.05966739
## 4 1990 White 0.09611312
## 5 2000 White 0.13207032
## 6 2001 White 0.13287455
#By region
region_3544 <- age3544[c(1, 11, 12, 13, 14, 15, 16)]
names(region_3544)[2] <- "New_England"
names(region_3544)[3] <- "Mid_Atlantic"
names(region_3544)[4] <- "Midwest"
names(region_3544)[5] <- "South"
names(region_3544)[6] <- "Mountain_West"
names(region_3544)[7] <- "Pacific"
region_3544 <- melt(region_3544, id.vars = c("year"), variable.name = "region", value.name = "rate")
head(region_3544)
## year region rate
## 1 1960 New_England 0.09194322
## 2 1970 New_England 0.08570110
## 3 1980 New_England 0.07997323
## 4 1990 New_England 0.12785915
## 5 2000 New_England 0.17327422
## 6 2001 New_England 0.16653497
#By family income
income_3544 <- age3544[c(1, 17, 18, 19)]
names(income_3544)[2] <- "Low_25%"
names(income_3544)[3] <- "Middle_50%"
names(income_3544)[4] <- "Top_25%"
income_3544 <- melt(income_3544, id.vars = c("year"), variable.name = "income", value.name = "rate")
head(income_3544)
## year income rate
## 1 1960 Low_25% 0.1019749
## 2 1970 Low_25% 0.1117548
## 3 1980 Low_25% 0.1291426
## 4 1990 Low_25% 0.2012208
## 5 2000 Low_25% 0.2813137
## 6 2001 Low_25% 0.2919112
#By age of 45-54
age4554 <- marriages[c(2, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)]
#By education
edu_4554 <- age4554[c(1, 3, 4, 5, 6, 7)]
names(edu_4554)[2] <- "High_School"
names(edu_4554)[3] <- "Some_College"
names(edu_4554)[4] <- "Bachelor's"
names(edu_4554)[5] <- "No_Graduate"
names(edu_4554)[6] <- "Graduate"
edu_4554 <- melt(edu_4554, id.vars = c("year"), variable.name = "education", value.name = "rate")
edu_4554 <- edu_4554[!is.na(edu_4554$rate),]
head(edu_4554)
## year education rate
## 1 1960 High_School 0.06840792
## 2 1970 High_School 0.05833439
## 3 1980 High_School 0.05036563
## 4 1990 High_School 0.05988244
## 5 2000 High_School 0.09442809
## 6 2001 High_School 0.09189007
#By race
race_4554 <- age4554[c(1, 8, 9, 10)]
names(race_4554)[2] <- "White"
names(race_4554)[3] <- "Black"
names(race_4554)[4] <- "Hispanic"
race_4554 <- melt(race_4554, id.vars = c("year"), variable.name = "race", value.name = "rate")
head(race_4554)
## year race rate
## 1 1960 White 0.07246692
## 2 1970 White 0.05754799
## 3 1980 White 0.04765354
## 4 1990 White 0.05092552
## 5 2000 White 0.07578174
## 6 2001 White 0.07516912
#By region
region_4554 <- age4554[c(1, 11, 12)]
names(region_4554)[2] <- "New_England"
names(region_4554)[3] <- "Mid_Atlantic"
region_4554 <- melt(region_4554, id.vars = c("year"), variable.name = "region", value.name = "rate")
head(region_4554)
## year region rate
## 1 1960 New_England 0.10236412
## 2 1970 New_England 0.08028082
## 3 1980 New_England 0.06930253
## 4 1990 New_England 0.07047502
## 5 2000 New_England 0.10232170
## 6 2001 New_England 0.09868408
The original data set spans several decades and packs many conditions into a single wide table. Converting it into long-format data frames, one per category, makes it easier to manage and to read.
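Selecting columns by position, as above, is easy to get wrong when the table is this wide. An alternative is to derive the category and age group from the column names themselves. The following base R sketch does that on a tiny made-up table; the helper `to_long` and the output column names `group` and `age` are hypothetical, and the approach assumes every value column ends in an underscore followed by the age code (as in `HS_2534`).

```r
# Tiny made-up stand-in for the wide marriage table
wide <- data.frame(
  year = c(1960, 1970),
  HS_2534 = c(0.11, 0.11),
  SC_2534 = c(0.15, 0.15),
  HS_3544 = c(0.07, 0.07),
  SC_3544 = c(0.07, 0.06)
)

# Hypothetical helper: reshape every non-id column to long format and
# split each column name at its last underscore into group and age code
to_long <- function(df, id = "year") {
  value_cols <- setdiff(names(df), id)
  long <- data.frame(
    year = rep(df[[id]], times = length(value_cols)),
    column = rep(value_cols, each = nrow(df)),
    rate = unlist(df[value_cols], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  long$group <- sub("_[^_]*$", "", long$column)  # text before the last underscore
  long$age <- sub("^.*_", "", long$column)       # text after the last underscore
  # Drop missing rates and keep the tidy columns
  long[!is.na(long$rate), c("year", "age", "group", "rate")]
}

long <- to_long(wide)
print(long)
```

Each row of the result carries its own age group and category, so a subset such as `long[long$age == "3544", ]` replaces a hand-written list of column positions.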