library(dslabs)
## Warning: package 'dslabs' was built under R version 4.1.3
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5 v purrr 0.3.4
## v tibble 3.1.2 v dplyr 1.0.7
## v tidyr 1.1.3 v stringr 1.4.0
## v readr 1.4.0 v forcats 0.5.1
## Warning: package 'ggplot2' was built under R version 4.1.3
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
data("nyc_regents_scores")
nyc_regents_scores
## score integrated_algebra global_history living_environment english
## 1 0 56 55 66 165
## 2 1 NA 8 3 69
## 3 2 1 9 2 237
## 4 3 NA 3 1 190
## 5 4 3 15 1 109
## 6 5 2 11 10 122
## 7 6 4 29 3 151
## 8 7 1 37 2 175
## 9 8 24 53 6 197
## 10 9 3 49 3 175
## 11 10 NA 64 8 227
## 12 11 6 83 11 217
## 13 12 23 87 7 28
## 14 13 1 54 16 230
## 15 14 3 145 18 303
## 16 15 58 151 50 192
## 17 16 1 129 13 156
## 18 17 73 214 59 163
## 19 18 26 267 18 210
## 20 19 1 127 100 172
## 21 20 28 271 24 404
## 22 21 185 248 140 237
## 23 22 1 411 45 235
## 24 23 170 433 191 433
## 25 24 98 254 56 246
## 26 25 49 570 250 256
## 27 26 328 657 103 267
## 28 27 73 515 301 298
## 29 28 359 280 401 339
## 30 29 215 868 87 294
## 31 30 147 1063 499 344
## 32 31 670 804 80 402
## 33 32 147 579 562 695
## 34 33 643 1069 777 84
## 35 34 421 1423 10 733
## 36 35 312 222 921 99
## 37 36 1181 1572 174 784
## 38 37 1179 1474 881 2
## 39 38 784 1890 1116 546
## 40 39 548 341 15 586
## 41 40 1608 1999 1242 1010
## 42 41 394 1773 1406 715
## 43 42 2401 2180 21 630
## 44 43 229 356 1540 681
## 45 44 2545 2348 1706 619
## 46 45 518 2095 29 113
## 47 46 2993 1160 1735 1290
## 48 47 328 1714 1910 131
## 49 48 3229 2526 342 1717
## 50 49 4217 2162 1592 159
## 51 50 424 1043 2063 2016
## 52 51 3471 1526 2023 111
## 53 52 585 1779 300 1577
## 54 53 3557 1934 1607 133
## 55 54 2428 243 1534 1493
## 56 55 2240 3765 3589 249
## 57 56 4997 2901 1244 2908
## 58 57 2313 1361 2453 244
## 59 58 4018 2191 1610 3304
## 60 59 2200 2383 2357 2896
## 61 60 2464 1812 1797 114
## 62 61 3114 406 1717 1613
## 63 62 2540 635 1343 83
## 64 63 1389 709 823 1185
## 65 64 102 349 237 1
## 66 65 8451 6392 7978 6084
## 67 66 3959 564 3133 345
## 68 67 3546 5063 2400 6393
## 69 68 3020 4338 2218 182
## 70 69 4773 2611 2178 3918
## 71 70 2625 213 2473 2004
## 72 71 2429 2357 2133 2069
## 73 72 2296 2343 1840 1735
## 74 73 3772 2062 3111 1960
## 75 74 1725 1706 1946 1606
## 76 75 3653 404 1758 2634
## 77 76 2568 1936 2952 4818
## 78 77 2638 1845 1636 2221
## 79 78 3126 1593 1448 2235
## 80 79 2113 237 2583 3699
## 81 80 2641 1799 1438 1653
## 82 81 2475 1728 2481 1838
## 83 82 2226 1511 1118 1374
## 84 83 1439 1268 2139 1515
## 85 84 2313 143 916 1150
## 86 85 612 1610 1320 3145
## 87 86 1636 1390 2019 1521
## 88 87 928 1402 940 2515
## 89 88 884 1212 1841 1090
## 90 89 473 1093 789 1790
## 91 90 744 1116 1718 1069
## 92 91 412 1029 689 871
## 93 92 312 924 690 1331
## 94 93 308 890 1211 1400
## 95 94 248 714 462 827
## 96 95 242 1388 457 521
## 97 96 125 547 403 729
## 98 97 110 1229 446 1071
## 99 98 55 764 87 171
## 100 99 19 499 NA 638
## 101 100 NA NA NA NA
## 102 NA 148 65 95 86
## us_history
## 1 65
## 2 4
## 3 16
## 4 10
## 5 6
## 6 8
## 7 7
## 8 12
## 9 16
## 10 28
## 11 34
## 12 58
## 13 67
## 14 42
## 15 57
## 16 98
## 17 125
## 18 115
## 19 30
## 20 118
## 21 153
## 22 202
## 23 136
## 24 214
## 25 177
## 26 162
## 27 332
## 28 284
## 29 248
## 30 281
## 31 446
## 32 602
## 33 123
## 34 515
## 35 659
## 36 459
## 37 361
## 38 887
## 39 587
## 40 391
## 41 1082
## 42 730
## 43 614
## 44 1021
## 45 845
## 46 1234
## 47 775
## 48 937
## 49 1392
## 50 792
## 51 1030
## 52 1174
## 53 631
## 54 790
## 55 912
## 56 1311
## 57 1752
## 58 2363
## 59 791
## 60 1455
## 61 1289
## 62 335
## 63 544
## 64 566
## 65 101
## 66 4749
## 67 2992
## 68 3717
## 69 1201
## 70 1827
## 71 2295
## 72 479
## 73 2118
## 74 1706
## 75 2023
## 76 417
## 77 1983
## 78 1918
## 79 1554
## 80 1901
## 81 385
## 82 1761
## 83 1651
## 84 1690
## 85 1326
## 86 1713
## 87 1608
## 88 1545
## 89 231
## 90 1382
## 91 1356
## 92 1252
## 93 1251
## 94 1187
## 95 1059
## 96 1083
## 97 972
## 98 3039
## 99 2074
## 100 1710
## 101 NA
## 102 83
nyc_regents_scores %>%
  # Keep only rows with no missing value in the columns used below
  filter(!is.na(score) &
           !is.na(integrated_algebra) &
           !is.na(global_history) &
           !is.na(english) &
           !is.na(us_history)) %>%
  select(Score = score,
         Algebra = integrated_algebra,
         History = global_history,
         English = english,
         US = us_history) %>%
  # Reshape to long format: one row per Score/Subject pair
  gather(Subject, Frequency, Algebra, History, English, US) %>%
  # Score goes on the x-axis and Frequency on the y-axis, matching the axis labels
  ggplot(aes(Score, Frequency, col = Subject)) +
  geom_line(size = 0.5) +
  ylab("Score Frequency") +
  xlab("Scores from 0 to 100") +
  ggtitle("NYC Regents Exam Frequency Plot for Different Subjects") +
  xlim(c(0, 100)) +
  scale_color_manual(values = c("#335c67", "#fff3b0", "#e09f3e", "#9e2a2b")) +
  facet_grid(. ~ Subject) +
  theme_light() +
  theme(strip.background = element_blank(),
        strip.text.x = element_blank(),
        strip.text.y = element_blank())
For my assignment, I chose a dataset from the dslabs package called “nyc_regents_scores”. It lists how often each score occurred on the NYC Regents exams in Algebra, Global History, Living Environment, English, and US History. I wanted to visualize the score frequencies for four of these subjects: Algebra, Global History, English, and US History. For this visualization, I created a line graph with the scores on the x-axis and the frequency of each score on the y-axis. First, I used !is.na to remove the rows with NA values in “score”, “integrated_algebra”, “global_history”, “english”, and “us_history”. Then I selected the subjects I was interested in and renamed the columns, so that “score” became Score. Next, I used scale_color_manual with a list of colors to specify the mapping for the color aesthetic. Last, I used facet_grid to split the lines into a matrix of panels, changed the ggplot theme, and removed the default labels from the panel plots.
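
The filtering and reshaping steps described above can be tried on a much smaller table. The sketch below is only an illustration: `toy` is a hypothetical data frame holding the first three rows of two columns copied from the printed output, and `long` is a name introduced here, not part of the assignment code.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: the first three rows of two subject columns,
# copied from the printed nyc_regents_scores output.
toy <- tibble(
  score              = c(0, 1, 2),
  integrated_algebra = c(56, NA, 1),
  english            = c(165, 69, 237)
)

long <- toy %>%
  # Drop any row that has a missing value in a column we plan to plot.
  filter(!is.na(score) &
           !is.na(integrated_algebra) &
           !is.na(english)) %>%
  # Select and rename the columns in one step.
  select(Score = score,
         Algebra = integrated_algebra,
         English = english) %>%
  # Stack the subject columns into Subject/Frequency pairs (long format).
  gather(Subject, Frequency, Algebra, English)

long
```

The row with score 1 is dropped because its Algebra count is NA, so the two remaining scores each contribute one row per subject, giving four rows with the columns Score, Subject, and Frequency. This long format is what lets ggplot map Subject to color and to the facets. In current versions of tidyr, gather() is superseded, and pivot_longer() is the recommended way to do the same reshaping.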