This is the second part of my Project 2 assignment for DATA607 in the Fall 2023 Term at CUNY SPS. In this assignment I import a wide data set, tidy it, and then analyze it. This second data set contains the number and percentage of research doctorates awarded by US universities in each field, at five-year intervals from 1992 to 2022.
In this code block, I load the necessary libraries and import the data from my GitHub repository.
library(tidyr)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ forcats 1.0.0 ✔ readr 2.1.4
## ✔ ggplot2 3.4.4 ✔ stringr 1.5.0
## ✔ lubridate 1.9.2 ✔ tibble 3.2.1
## ✔ purrr 1.0.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(knitr)
raw_data <- read.csv("https://raw.githubusercontent.com/Marley-Myrianthopoulos/Data607Project2/main/Data607DoctorateFields.csv")
kable(raw_data, format = "pipe", caption = "Initial Doctorate Data", align = "lcccccccccccccccc")
Table.1.3 | X | X.1 | X.2 | X.3 | X.4 | X.5 | X.6 | X.7 | X.8 | X.9 | X.10 | X.11 | X.12 | X.13 | X.14 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Research doctorate recipients, by historical major field of doctorate: Selected years, 1992–2022 | | | | | | | | | | | | | | | NA |
(Number and percent) | | | | | | | | | | | | | | | NA |
Field of doctorate | 1992 | | 1997 | | 2002 | | 2007 | | 2012 | | 2017 | | 2022 | | NA |
 | Number | Percent | Number | Percent | Number | Percent | Number | Percent | Number | Percent | Number | Percent | Number | Percent | NA |
All fields | 38,886 | 100.0 | 42,539 | 100.0 | 40,031 | 100.0 | 48,132 | 100.0 | 50,943 | 100.0 | 54,552 | 100.0 | 57,596 | 100.0 | NA |
Life sciences | 7,172 | 18.4 | 8,421 | 19.8 | 8,478 | 21.2 | 10,702 | 22.2 | 11,964 | 23.5 | 12,554 | 23.0 | 13,211 | 22.9 | NA |
Agricultural sciences and natural resources | 1,261 | 3.2 | 1,212 | 2.8 | 1,129 | 2.8 | 1,321 | 2.7 | 1,255 | 2.5 | 1,493 | 2.7 | 1,434 | 2.5 | NA |
Biological and biomedical sciences | 4,799 | 12.3 | 5,788 | 13.6 | 5,695 | 14.2 | 7,238 | 15.0 | 8,322 | 16.3 | 8,566 | 15.7 | 9,218 | 16.0 | NA |
Health sciences | 1,112 | 2.9 | 1,421 | 3.3 | 1,654 | 4.1 | 2,143 | 4.5 | 2,387 | 4.7 | 2,495 | 4.6 | 2,559 | 4.4 | NA |
Physical sciences and earth sciences | 4,517 | 11.6 | 4,550 | 10.7 | 3,875 | 9.7 | 4,956 | 10.3 | 5,419 | 10.6 | 6,082 | 11.1 | 6,649 | 11.5 | NA |
Chemistry | 2,213 | 5.7 | 2,148 | 5.0 | 1,922 | 4.8 | 2,318 | 4.8 | 2,416 | 4.7 | 2,699 | 4.9 | 3,060 | 5.3 | NA |
Geosciences, atmospheric sciences, and ocean sciences | 767 | 2.0 | 803 | 1.9 | 689 | 1.7 | 875 | 1.8 | 941 | 1.8 | 1,169 | 2.1 | 1,181 | 2.1 | NA |
Physics and astronomy | 1,537 | 4.0 | 1,599 | 3.8 | 1,264 | 3.2 | 1,763 | 3.7 | 2,062 | 4.0 | 2,214 | 4.1 | 2,408 | 4.2 | NA |
Mathematics and computer sciences | 1,927 | 5.0 | 2,032 | 4.8 | 1,729 | 4.3 | 3,042 | 6.3 | 3,496 | 6.9 | 3,842 | 7.0 | 4,854 | 8.4 | NA |
Computer and information sciences | 869 | 2.2 | 909 | 2.1 | 809 | 2.0 | 1,654 | 3.4 | 1,793 | 3.5 | 1,998 | 3.7 | 2,606 | 4.5 | NA |
Mathematics and statistics | 1,058 | 2.7 | 1,123 | 2.6 | 920 | 2.3 | 1,388 | 2.9 | 1,703 | 3.3 | 1,844 | 3.4 | 2,248 | 3.9 | NA |
Psychology and social sciences | 6,562 | 16.9 | 7,369 | 17.3 | 6,925 | 17.3 | 7,309 | 15.2 | 8,498 | 16.7 | 9,034 | 16.6 | 9,235 | 16.0 | NA |
Psychology | 3,262 | 8.4 | 3,557 | 8.4 | 3,207 | 8.0 | 3,276 | 6.8 | 3,599 | 7.1 | 3,925 | 7.2 | 3,990 | 6.9 | NA |
Anthropology | 320 | 0.8 | 434 | 1.0 | 496 | 1.2 | 512 | 1.1 | 547 | 1.1 | 446 | 0.8 | 415 | 0.7 | NA |
Economics | 910 | 2.3 | 1,030 | 2.4 | 908 | 2.3 | 1,004 | 2.1 | 1,243 | 2.4 | 1,239 | 2.3 | 1,287 | 2.2 | NA |
Political science and government | 513 | 1.3 | 665 | 1.6 | 606 | 1.5 | 588 | 1.2 | 724 | 1.4 | 743 | 1.4 | 678 | 1.2 | NA |
Sociology | 495 | 1.3 | 577 | 1.4 | 547 | 1.4 | 576 | 1.2 | 633 | 1.2 | 683 | 1.3 | 611 | 1.1 | NA |
Other social sciences | 1,062 | 2.7 | 1,106 | 2.6 | 1,161 | 2.9 | 1,353 | 2.8 | 1,752 | 3.4 | 1,998 | 3.7 | 2,254 | 3.9 | NA |
Engineering | 5,438 | 14.0 | 6,114 | 14.4 | 5,081 | 12.7 | 7,749 | 16.1 | 8,469 | 16.6 | 9,776 | 17.9 | 11,530 | 20.0 | NA |
Aerospace, aeronautical, and astronautical engineering | 234 | 0.6 | 273 | 0.6 | 209 | 0.5 | 267 | 0.6 | 307 | 0.6 | 379 | 0.7 | 374 | 0.6 | NA |
Bioengineering and biomedical engineering | 147 | 0.4 | 211 | 0.5 | 246 | 0.6 | 637 | 1.3 | 943 | 1.9 | 1,032 | 1.9 | 1,228 | 2.1 | NA |
Chemical engineering | 607 | 1.6 | 662 | 1.6 | 607 | 1.5 | 817 | 1.7 | 840 | 1.6 | 931 | 1.7 | 1,142 | 2.0 | NA |
Civil engineering | 540 | 1.4 | 592 | 1.4 | 540 | 1.3 | 703 | 1.5 | 495 | 1.0 | 713 | 1.3 | 898 | 1.6 | NA |
Electrical, electronics, and communications engineering | 1,278 | 3.3 | 1,460 | 3.4 | 1,212 | 3.0 | 1,967 | 4.1 | 1,938 | 3.8 | 1,879 | 3.4 | 2,193 | 3.8 | NA |
Industrial and manufacturing engineering | 196 | 0.5 | 246 | 0.6 | 230 | 0.6 | 279 | 0.6 | 226 | 0.4 | 249 | 0.5 | 381 | 0.7 | NA |
Materials science engineering | 365 | 0.9 | 483 | 1.1 | 364 | 0.9 | 646 | 1.3 | 743 | 1.5 | 937 | 1.7 | 1,136 | 2.0 | NA |
Mechanical engineering | 855 | 2.2 | 929 | 2.2 | 771 | 1.9 | 1,071 | 2.2 | 1,220 | 2.4 | 1,398 | 2.6 | 1,676 | 2.9 | NA |
Other engineering | 1,216 | 3.1 | 1,258 | 3.0 | 902 | 2.3 | 1,362 | 2.8 | 1,757 | 3.4 | 2,258 | 4.1 | 2,502 | 4.3 | NA |
Education | 6,677 | 17.2 | 6,577 | 15.5 | 6,508 | 16.3 | 6,448 | 13.4 | 4,802 | 9.4 | 4,826 | 8.8 | 4,509 | 7.8 | NA |
Education administration | 1,984 | 5.1 | 2,050 | 4.8 | 2,351 | 5.9 | 2,161 | 4.5 | 1,057 | 2.1 | 922 | 1.7 | 734 | 1.3 | NA |
Education research | 2,503 | 6.4 | 2,695 | 6.3 | 2,776 | 6.9 | 2,671 | 5.5 | 2,516 | 4.9 | 2,373 | 4.3 | 2,289 | 4.0 | NA |
Teacher education | 407 | 1.0 | 291 | 0.7 | 262 | 0.7 | 297 | 0.6 | 156 | 0.3 | 114 | 0.2 | 110 | 0.2 | NA |
Teaching fields | 1,008 | 2.6 | 919 | 2.2 | 686 | 1.7 | 873 | 1.8 | 757 | 1.5 | 925 | 1.7 | 890 | 1.5 | NA |
Other education | 775 | 2.0 | 622 | 1.5 | 433 | 1.1 | 446 | 0.9 | 316 | 0.6 | 492 | 0.9 | 486 | 0.8 | NA |
Humanities and arts | 4,387 | 11.3 | 5,285 | 12.4 | 5,297 | 13.2 | 5,085 | 10.6 | 5,561 | 10.9 | 5,286 | 9.7 | 4,464 | 7.8 | NA |
Foreign languages and literature | 562 | 1.4 | 652 | 1.5 | 627 | 1.6 | 607 | 1.3 | 684 | 1.3 | 618 | 1.1 | 442 | 0.8 | NA |
History | 724 | 1.9 | 965 | 2.3 | 1,031 | 2.6 | 937 | 1.9 | 1,086 | 2.1 | 1,058 | 1.9 | 750 | 1.3 | NA |
Letters | 1,278 | 3.3 | 1,550 | 3.6 | 1,455 | 3.6 | 1,340 | 2.8 | 1,638 | 3.2 | 1,462 | 2.7 | 1,292 | 2.2 | NA |
Other humanities and arts | 1,823 | 4.7 | 2,118 | 5.0 | 2,184 | 5.5 | 2,201 | 4.6 | 2,153 | 4.2 | 2,148 | 3.9 | 1,980 | 3.4 | NA |
Other | 2,206 | 5.7 | 2,191 | 5.2 | 2,138 | 5.3 | 2,841 | 5.9 | 2,734 | 5.4 | 3,152 | 5.8 | 3,144 | 5.5 | NA |
Business management and administration | 1,248 | 3.2 | 1,245 | 2.9 | 1,113 | 2.8 | 1,506 | 3.1 | 1,404 | 2.8 | 1,565 | 2.9 | 1,450 | 2.5 | NA |
Communication | 330 | 0.8 | 331 | 0.8 | 397 | 1.0 | 560 | 1.2 | 595 | 1.2 | 622 | 1.1 | 580 | 1.0 | NA |
Non-science and engineering fields nec | 628 | 1.6 | 615 | 1.4 | 628 | 1.6 | 775 | 1.6 | 735 | 1.4 | 965 | 1.8 | 1,114 | 1.9 | NA |
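As an aside, the title and header rows visible at the top of this table could also be skipped at import time. The sketch below is only an illustration on a made-up miniature of the file layout, not the real CSV. It assumes the file always starts with exactly four non-data lines, and `toy_lines` and the three column names are hypothetical.

```r
# Made-up miniature of the file layout: four title/header lines, then data rows.
toy_lines <- c(
  "Table title,,",
  "(Number and percent),,",
  "Field of doctorate,1992,",
  ",Number,Percent",
  "All fields,\"38,886\",100.0",
  "Life sciences,\"7,172\",18.4"
)

# skip = 4 drops the four non-data lines.
# header = FALSE because no usable header row is left after skipping.
toy <- read.csv(
  text = toy_lines,
  skip = 4,
  header = FALSE,
  col.names = c("Field", "docs1992", "percent1992"),
  stringsAsFactors = FALSE
)

print(toy)
```

The counts still arrive as strings such as "7,172", so the comma clean-up is needed either way.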
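In this code block, I prepare the data for tidying by renaming the columns and removing the first four rows, which hold only the table title and header text rather than data. I supply 15 names for 16 columns, so the empty final column ends up named NA.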
colnames(raw_data) <- c("Field", "docs1992", "percent1992", "docs1997", "percent1997", "docs2002", "percent2002", "docs2007", "percent2007", "docs2012", "percent2012", "docs2017", "percent2017", "docs2022", "percent2022")
prep_data <- raw_data[-c(1:4),]
kable(prep_data, format = "pipe", caption = "Data Prepped for Tidying", align = "lcccccccccccccccc")
Row | Field | docs1992 | percent1992 | docs1997 | percent1997 | docs2002 | percent2002 | docs2007 | percent2007 | docs2012 | percent2012 | docs2017 | percent2017 | docs2022 | percent2022 | NA |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
5 | All fields | 38,886 | 100.0 | 42,539 | 100.0 | 40,031 | 100.0 | 48,132 | 100.0 | 50,943 | 100.0 | 54,552 | 100.0 | 57,596 | 100.0 | NA |
6 | Life sciences | 7,172 | 18.4 | 8,421 | 19.8 | 8,478 | 21.2 | 10,702 | 22.2 | 11,964 | 23.5 | 12,554 | 23.0 | 13,211 | 22.9 | NA |
7 | Agricultural sciences and natural resources | 1,261 | 3.2 | 1,212 | 2.8 | 1,129 | 2.8 | 1,321 | 2.7 | 1,255 | 2.5 | 1,493 | 2.7 | 1,434 | 2.5 | NA |
8 | Biological and biomedical sciences | 4,799 | 12.3 | 5,788 | 13.6 | 5,695 | 14.2 | 7,238 | 15.0 | 8,322 | 16.3 | 8,566 | 15.7 | 9,218 | 16.0 | NA |
9 | Health sciences | 1,112 | 2.9 | 1,421 | 3.3 | 1,654 | 4.1 | 2,143 | 4.5 | 2,387 | 4.7 | 2,495 | 4.6 | 2,559 | 4.4 | NA |
10 | Physical sciences and earth sciences | 4,517 | 11.6 | 4,550 | 10.7 | 3,875 | 9.7 | 4,956 | 10.3 | 5,419 | 10.6 | 6,082 | 11.1 | 6,649 | 11.5 | NA |
11 | Chemistry | 2,213 | 5.7 | 2,148 | 5.0 | 1,922 | 4.8 | 2,318 | 4.8 | 2,416 | 4.7 | 2,699 | 4.9 | 3,060 | 5.3 | NA |
12 | Geosciences, atmospheric sciences, and ocean sciences | 767 | 2.0 | 803 | 1.9 | 689 | 1.7 | 875 | 1.8 | 941 | 1.8 | 1,169 | 2.1 | 1,181 | 2.1 | NA |
13 | Physics and astronomy | 1,537 | 4.0 | 1,599 | 3.8 | 1,264 | 3.2 | 1,763 | 3.7 | 2,062 | 4.0 | 2,214 | 4.1 | 2,408 | 4.2 | NA |
14 | Mathematics and computer sciences | 1,927 | 5.0 | 2,032 | 4.8 | 1,729 | 4.3 | 3,042 | 6.3 | 3,496 | 6.9 | 3,842 | 7.0 | 4,854 | 8.4 | NA |
15 | Computer and information sciences | 869 | 2.2 | 909 | 2.1 | 809 | 2.0 | 1,654 | 3.4 | 1,793 | 3.5 | 1,998 | 3.7 | 2,606 | 4.5 | NA |
16 | Mathematics and statistics | 1,058 | 2.7 | 1,123 | 2.6 | 920 | 2.3 | 1,388 | 2.9 | 1,703 | 3.3 | 1,844 | 3.4 | 2,248 | 3.9 | NA |
17 | Psychology and social sciences | 6,562 | 16.9 | 7,369 | 17.3 | 6,925 | 17.3 | 7,309 | 15.2 | 8,498 | 16.7 | 9,034 | 16.6 | 9,235 | 16.0 | NA |
18 | Psychology | 3,262 | 8.4 | 3,557 | 8.4 | 3,207 | 8.0 | 3,276 | 6.8 | 3,599 | 7.1 | 3,925 | 7.2 | 3,990 | 6.9 | NA |
19 | Anthropology | 320 | 0.8 | 434 | 1.0 | 496 | 1.2 | 512 | 1.1 | 547 | 1.1 | 446 | 0.8 | 415 | 0.7 | NA |
20 | Economics | 910 | 2.3 | 1,030 | 2.4 | 908 | 2.3 | 1,004 | 2.1 | 1,243 | 2.4 | 1,239 | 2.3 | 1,287 | 2.2 | NA |
21 | Political science and government | 513 | 1.3 | 665 | 1.6 | 606 | 1.5 | 588 | 1.2 | 724 | 1.4 | 743 | 1.4 | 678 | 1.2 | NA |
22 | Sociology | 495 | 1.3 | 577 | 1.4 | 547 | 1.4 | 576 | 1.2 | 633 | 1.2 | 683 | 1.3 | 611 | 1.1 | NA |
23 | Other social sciences | 1,062 | 2.7 | 1,106 | 2.6 | 1,161 | 2.9 | 1,353 | 2.8 | 1,752 | 3.4 | 1,998 | 3.7 | 2,254 | 3.9 | NA |
24 | Engineering | 5,438 | 14.0 | 6,114 | 14.4 | 5,081 | 12.7 | 7,749 | 16.1 | 8,469 | 16.6 | 9,776 | 17.9 | 11,530 | 20.0 | NA |
25 | Aerospace, aeronautical, and astronautical engineering | 234 | 0.6 | 273 | 0.6 | 209 | 0.5 | 267 | 0.6 | 307 | 0.6 | 379 | 0.7 | 374 | 0.6 | NA |
26 | Bioengineering and biomedical engineering | 147 | 0.4 | 211 | 0.5 | 246 | 0.6 | 637 | 1.3 | 943 | 1.9 | 1,032 | 1.9 | 1,228 | 2.1 | NA |
27 | Chemical engineering | 607 | 1.6 | 662 | 1.6 | 607 | 1.5 | 817 | 1.7 | 840 | 1.6 | 931 | 1.7 | 1,142 | 2.0 | NA |
28 | Civil engineering | 540 | 1.4 | 592 | 1.4 | 540 | 1.3 | 703 | 1.5 | 495 | 1.0 | 713 | 1.3 | 898 | 1.6 | NA |
29 | Electrical, electronics, and communications engineering | 1,278 | 3.3 | 1,460 | 3.4 | 1,212 | 3.0 | 1,967 | 4.1 | 1,938 | 3.8 | 1,879 | 3.4 | 2,193 | 3.8 | NA |
30 | Industrial and manufacturing engineering | 196 | 0.5 | 246 | 0.6 | 230 | 0.6 | 279 | 0.6 | 226 | 0.4 | 249 | 0.5 | 381 | 0.7 | NA |
31 | Materials science engineering | 365 | 0.9 | 483 | 1.1 | 364 | 0.9 | 646 | 1.3 | 743 | 1.5 | 937 | 1.7 | 1,136 | 2.0 | NA |
32 | Mechanical engineering | 855 | 2.2 | 929 | 2.2 | 771 | 1.9 | 1,071 | 2.2 | 1,220 | 2.4 | 1,398 | 2.6 | 1,676 | 2.9 | NA |
33 | Other engineering | 1,216 | 3.1 | 1,258 | 3.0 | 902 | 2.3 | 1,362 | 2.8 | 1,757 | 3.4 | 2,258 | 4.1 | 2,502 | 4.3 | NA |
34 | Education | 6,677 | 17.2 | 6,577 | 15.5 | 6,508 | 16.3 | 6,448 | 13.4 | 4,802 | 9.4 | 4,826 | 8.8 | 4,509 | 7.8 | NA |
35 | Education administration | 1,984 | 5.1 | 2,050 | 4.8 | 2,351 | 5.9 | 2,161 | 4.5 | 1,057 | 2.1 | 922 | 1.7 | 734 | 1.3 | NA |
36 | Education research | 2,503 | 6.4 | 2,695 | 6.3 | 2,776 | 6.9 | 2,671 | 5.5 | 2,516 | 4.9 | 2,373 | 4.3 | 2,289 | 4.0 | NA |
37 | Teacher education | 407 | 1.0 | 291 | 0.7 | 262 | 0.7 | 297 | 0.6 | 156 | 0.3 | 114 | 0.2 | 110 | 0.2 | NA |
38 | Teaching fields | 1,008 | 2.6 | 919 | 2.2 | 686 | 1.7 | 873 | 1.8 | 757 | 1.5 | 925 | 1.7 | 890 | 1.5 | NA |
39 | Other education | 775 | 2.0 | 622 | 1.5 | 433 | 1.1 | 446 | 0.9 | 316 | 0.6 | 492 | 0.9 | 486 | 0.8 | NA |
40 | Humanities and arts | 4,387 | 11.3 | 5,285 | 12.4 | 5,297 | 13.2 | 5,085 | 10.6 | 5,561 | 10.9 | 5,286 | 9.7 | 4,464 | 7.8 | NA |
41 | Foreign languages and literature | 562 | 1.4 | 652 | 1.5 | 627 | 1.6 | 607 | 1.3 | 684 | 1.3 | 618 | 1.1 | 442 | 0.8 | NA |
42 | History | 724 | 1.9 | 965 | 2.3 | 1,031 | 2.6 | 937 | 1.9 | 1,086 | 2.1 | 1,058 | 1.9 | 750 | 1.3 | NA |
43 | Letters | 1,278 | 3.3 | 1,550 | 3.6 | 1,455 | 3.6 | 1,340 | 2.8 | 1,638 | 3.2 | 1,462 | 2.7 | 1,292 | 2.2 | NA |
44 | Other humanities and arts | 1,823 | 4.7 | 2,118 | 5.0 | 2,184 | 5.5 | 2,201 | 4.6 | 2,153 | 4.2 | 2,148 | 3.9 | 1,980 | 3.4 | NA |
45 | Other | 2,206 | 5.7 | 2,191 | 5.2 | 2,138 | 5.3 | 2,841 | 5.9 | 2,734 | 5.4 | 3,152 | 5.8 | 3,144 | 5.5 | NA |
46 | Business management and administration | 1,248 | 3.2 | 1,245 | 2.9 | 1,113 | 2.8 | 1,506 | 3.1 | 1,404 | 2.8 | 1,565 | 2.9 | 1,450 | 2.5 | NA |
47 | Communication | 330 | 0.8 | 331 | 0.8 | 397 | 1.0 | 560 | 1.2 | 595 | 1.2 | 622 | 1.1 | 580 | 1.0 | NA |
48 | Non-science and engineering fields nec | 628 | 1.6 | 615 | 1.4 | 628 | 1.6 | 775 | 1.6 | 735 | 1.4 | 965 | 1.8 | 1,114 | 1.9 | NA |
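The trailing NA column in the table above is the empty sixteenth column of the raw file. A minimal sketch of one way to drop such a column and to generate the docs/percent names instead of typing them is shown here. It runs on a made-up two-row data frame (`toy`, with placeholder column names `a` to `f`), not on the real data, and only two of the seven years are used to keep it short.

```r
# Made-up stand-in for raw_data: one label column, four value columns,
# and one column that is NA in every row.
toy <- data.frame(
  a = c("Chemistry", "History"),
  b = c("2,213", "724"), c = c("5.7", "1.9"),
  d = c("2,148", "965"), e = c("5.0", "2.3"),
  f = c(NA, NA),
  stringsAsFactors = FALSE
)

# Keep only columns that have at least one non-missing value.
toy <- toy[, colSums(!is.na(toy)) > 0]

# Build "docs1992", "percent1992", "docs1997", "percent1997" from the years.
years <- c(1992, 1997)
new_names <- c(
  "Field",
  paste0(rep(c("docs", "percent"), times = length(years)), rep(years, each = 2))
)
colnames(toy) <- new_names

print(colnames(toy))
```

With the full set of years, `years <- seq(1992, 2022, by = 5)` would produce the same 15 names used above.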
In this code block, I continue to clean the data. The data is organized with broad and specific fields in the same column, so I create a new column for the broad fields and copy each broad field’s name into the row of its first specific field. I then create a new data frame without the rows that contained only broad field totals (and without the “All fields” row), keeping only the doctorate counts and dropping the percent columns. As a final preparatory step, I remove the commas from the counts and convert the resulting strings into numbers so that I can perform calculations on them later. In retrospect, it would have been easier to do this after pivoting the data, since I would only have had to do it to one column. A lesson for next time! The data is now cleaned and ready to be pivoted into a tidy format.
prep_data$broadfield <- ""
prep_data$broadfield[3] <- prep_data$Field[2]
prep_data$broadfield[7] <- prep_data$Field[6]
prep_data$broadfield[11] <- prep_data$Field[10]
prep_data$broadfield[14] <- prep_data$Field[13]
prep_data$broadfield[21] <- prep_data$Field[20]
prep_data$broadfield[31] <- prep_data$Field[30]
prep_data$broadfield[37] <- prep_data$Field[36]
prep_data$broadfield[42] <- prep_data$Field[41]
tidy_data <- prep_data[-c(1,2,6,10,13,20,30,36,41),c(17,1,2,4,6,8,10,12,14)]
tidy_data$docs1992 <- as.numeric(gsub(",","",tidy_data$docs1992))
tidy_data$docs1997 <- as.numeric(gsub(",","",tidy_data$docs1997))
tidy_data$docs2002 <- as.numeric(gsub(",","",tidy_data$docs2002))
tidy_data$docs2007 <- as.numeric(gsub(",","",tidy_data$docs2007))
tidy_data$docs2012 <- as.numeric(gsub(",","",tidy_data$docs2012))
tidy_data$docs2017 <- as.numeric(gsub(",","",tidy_data$docs2017))
tidy_data$docs2022 <- as.numeric(gsub(",","",tidy_data$docs2022))
kable(tidy_data, format = "pipe", caption = "Clean Data", align = "llccccccc")
Row | broadfield | Field | docs1992 | docs1997 | docs2002 | docs2007 | docs2012 | docs2017 | docs2022 |
---|---|---|---|---|---|---|---|---|---|
7 | Life sciences | Agricultural sciences and natural resources | 1261 | 1212 | 1129 | 1321 | 1255 | 1493 | 1434 |
8 | | Biological and biomedical sciences | 4799 | 5788 | 5695 | 7238 | 8322 | 8566 | 9218 |
9 | | Health sciences | 1112 | 1421 | 1654 | 2143 | 2387 | 2495 | 2559 |
11 | Physical sciences and earth sciences | Chemistry | 2213 | 2148 | 1922 | 2318 | 2416 | 2699 | 3060 |
12 | | Geosciences, atmospheric sciences, and ocean sciences | 767 | 803 | 689 | 875 | 941 | 1169 | 1181 |
13 | | Physics and astronomy | 1537 | 1599 | 1264 | 1763 | 2062 | 2214 | 2408 |
15 | Mathematics and computer sciences | Computer and information sciences | 869 | 909 | 809 | 1654 | 1793 | 1998 | 2606 |
16 | | Mathematics and statistics | 1058 | 1123 | 920 | 1388 | 1703 | 1844 | 2248 |
18 | Psychology and social sciences | Psychology | 3262 | 3557 | 3207 | 3276 | 3599 | 3925 | 3990 |
19 | | Anthropology | 320 | 434 | 496 | 512 | 547 | 446 | 415 |
20 | | Economics | 910 | 1030 | 908 | 1004 | 1243 | 1239 | 1287 |
21 | | Political science and government | 513 | 665 | 606 | 588 | 724 | 743 | 678 |
22 | | Sociology | 495 | 577 | 547 | 576 | 633 | 683 | 611 |
23 | | Other social sciences | 1062 | 1106 | 1161 | 1353 | 1752 | 1998 | 2254 |
25 | Engineering | Aerospace, aeronautical, and astronautical engineering | 234 | 273 | 209 | 267 | 307 | 379 | 374 |
26 | | Bioengineering and biomedical engineering | 147 | 211 | 246 | 637 | 943 | 1032 | 1228 |
27 | | Chemical engineering | 607 | 662 | 607 | 817 | 840 | 931 | 1142 |
28 | | Civil engineering | 540 | 592 | 540 | 703 | 495 | 713 | 898 |
29 | | Electrical, electronics, and communications engineering | 1278 | 1460 | 1212 | 1967 | 1938 | 1879 | 2193 |
30 | | Industrial and manufacturing engineering | 196 | 246 | 230 | 279 | 226 | 249 | 381 |
31 | | Materials science engineering | 365 | 483 | 364 | 646 | 743 | 937 | 1136 |
32 | | Mechanical engineering | 855 | 929 | 771 | 1071 | 1220 | 1398 | 1676 |
33 | | Other engineering | 1216 | 1258 | 902 | 1362 | 1757 | 2258 | 2502 |
35 | Education | Education administration | 1984 | 2050 | 2351 | 2161 | 1057 | 922 | 734 |
36 | | Education research | 2503 | 2695 | 2776 | 2671 | 2516 | 2373 | 2289 |
37 | | Teacher education | 407 | 291 | 262 | 297 | 156 | 114 | 110 |
38 | | Teaching fields | 1008 | 919 | 686 | 873 | 757 | 925 | 890 |
39 | | Other education | 775 | 622 | 433 | 446 | 316 | 492 | 486 |
41 | Humanities and arts | Foreign languages and literature | 562 | 652 | 627 | 607 | 684 | 618 | 442 |
42 | | History | 724 | 965 | 1031 | 937 | 1086 | 1058 | 750 |
43 | | Letters | 1278 | 1550 | 1455 | 1340 | 1638 | 1462 | 1292 |
44 | | Other humanities and arts | 1823 | 2118 | 2184 | 2201 | 2153 | 2148 | 1980 |
46 | Other | Business management and administration | 1248 | 1245 | 1113 | 1506 | 1404 | 1565 | 1450 |
47 | | Communication | 330 | 331 | 397 | 560 | 595 | 622 | 580 |
48 | | Non-science and engineering fields nec | 628 | 615 | 628 | 775 | 735 | 965 | 1114 |
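The lesson about converting after pivoting can be sketched roughly as follows. This is a toy illustration on a made-up two-row data frame (`toy` and `toy_long` are hypothetical names), assuming the counts are still stored as comma-formatted strings when the pivot happens.

```r
library(tidyr)
library(dplyr)

# Made-up stand-in for the cleaned table, with counts still stored as strings.
toy <- data.frame(
  Field = c("Chemistry", "History"),
  docs1992 = c("2,213", "724"),
  docs1997 = c("2,148", "965"),
  stringsAsFactors = FALSE
)

toy_long <- toy %>%
  pivot_longer(
    cols = starts_with("docs"),
    names_to = "Year",
    values_to = "Doctorates"
  ) %>%
  # After pivoting there is a single value column, so one
  # gsub()/as.numeric() call covers every year at once.
  mutate(Doctorates = as.numeric(gsub(",", "", Doctorates)))

print(toy_long)
```

If the conversion has to happen before pivoting, `mutate(across(starts_with("docs"), ~ as.numeric(gsub(",", "", .x))))` would be another way to avoid repeating the same line for each year.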
In this code block, I start by filling in the empty cells in the broad field column. I used a for loop for this on the week 5 assignment, but I picked up a better trick from the solution that Molly Siebecker shared for that assignment and wanted to try it here. I first convert all of the empty cells to “NA” values and then use the “fill” function to carry each broad field name down through the rows below it. I then use pivot_longer to reshape the data so that the year is a single variable, rather than each year being its own column. Next, I use group_by and mutate to add a column calculating what percentage of that year’s doctorates each field represents. Finally, I use a regular expression to remove the first four characters of the “Year” column. Since every value in that column is formatted as “docs[YYYY]”, this leaves just the year, which I convert to an integer. The data is now “tidy”.
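The steps just described can be tried on a small made-up table first. The sketch below reuses a few field names from the data, but the counts are invented so that the percentages are easy to check by hand. It also shows two optional variations that are not part of my actual code: `names_prefix` and `names_transform` strip the "docs" prefix during the pivot instead of using a regular expression afterwards, and `ungroup()` removes the grouping once the percentages are computed.

```r
library(tidyr)
library(dplyr)

# Made-up stand-in: the broad field appears only on the first row of each group,
# and the counts are invented so each year sums to 100.
toy <- data.frame(
  broadfield = c("Life sciences", "", "Education", ""),
  Field = c("Health sciences", "Biological and biomedical sciences",
            "Teacher education", "Teaching fields"),
  docs1992 = c(10, 30, 20, 40),
  docs1997 = c(25, 25, 25, 25),
  stringsAsFactors = FALSE
)

toy_tidy <- toy %>%
  mutate(broadfield = na_if(broadfield, "")) %>%  # "" becomes NA
  fill(broadfield) %>%                            # NA takes the last value above it
  pivot_longer(
    cols = starts_with("docs"),
    names_to = "Year",
    names_prefix = "docs",                        # strip the prefix while pivoting
    names_transform = list(Year = as.integer),    # store the year as an integer
    values_to = "Doctorates"
  ) %>%
  group_by(Year) %>%
  mutate(Year_Percent = round(Doctorates / sum(Doctorates) * 100, 1)) %>%
  ungroup()                                       # drop the grouping afterwards

print(toy_tidy)
```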
library(tidyr)
library(dplyr)
library(tidyverse)
tidy_data <- tidy_data %>%
mutate(broadfield = na_if(broadfield, "")) %>%
fill(broadfield) %>%
pivot_longer(
cols = -c("broadfield", "Field"),
names_to = "Year",
values_to = "Doctorates"
) %>%
group_by(Year) %>%
mutate(Year_Percent = round(Doctorates / sum(Doctorates) * 100, 1))
tidy_data$Year <- as.integer(sub("^....","",tidy_data$Year))
kable(tidy_data, format = "pipe", caption = "Tidy Doctorate Data", align = "llccc")
broadfield | Field | Year | Doctorates | Year_Percent |
---|---|---|---|---|
Life sciences | Agricultural sciences and natural resources | 1992 | 1261 | 3.2 |
Life sciences | Agricultural sciences and natural resources | 1997 | 1212 | 2.8 |
Life sciences | Agricultural sciences and natural resources | 2002 | 1129 | 2.8 |
Life sciences | Agricultural sciences and natural resources | 2007 | 1321 | 2.7 |
Life sciences | Agricultural sciences and natural resources | 2012 | 1255 | 2.5 |
Life sciences | Agricultural sciences and natural resources | 2017 | 1493 | 2.7 |
Life sciences | Agricultural sciences and natural resources | 2022 | 1434 | 2.5 |
Life sciences | Biological and biomedical sciences | 1992 | 4799 | 12.3 |
Life sciences | Biological and biomedical sciences | 1997 | 5788 | 13.6 |
Life sciences | Biological and biomedical sciences | 2002 | 5695 | 14.2 |
Life sciences | Biological and biomedical sciences | 2007 | 7238 | 15.0 |
Life sciences | Biological and biomedical sciences | 2012 | 8322 | 16.3 |
Life sciences | Biological and biomedical sciences | 2017 | 8566 | 15.7 |
Life sciences | Biological and biomedical sciences | 2022 | 9218 | 16.0 |
Life sciences | Health sciences | 1992 | 1112 | 2.9 |
Life sciences | Health sciences | 1997 | 1421 | 3.3 |
Life sciences | Health sciences | 2002 | 1654 | 4.1 |
Life sciences | Health sciences | 2007 | 2143 | 4.5 |
Life sciences | Health sciences | 2012 | 2387 | 4.7 |
Life sciences | Health sciences | 2017 | 2495 | 4.6 |
Life sciences | Health sciences | 2022 | 2559 | 4.4 |
Physical sciences and earth sciences | Chemistry | 1992 | 2213 | 5.7 |
Physical sciences and earth sciences | Chemistry | 1997 | 2148 | 5.0 |
Physical sciences and earth sciences | Chemistry | 2002 | 1922 | 4.8 |
Physical sciences and earth sciences | Chemistry | 2007 | 2318 | 4.8 |
Physical sciences and earth sciences | Chemistry | 2012 | 2416 | 4.7 |
Physical sciences and earth sciences | Chemistry | 2017 | 2699 | 4.9 |
Physical sciences and earth sciences | Chemistry | 2022 | 3060 | 5.3 |
Physical sciences and earth sciences | Geosciences, atmospheric sciences, and ocean sciences | 1992 | 767 | 2.0 |
Physical sciences and earth sciences | Geosciences, atmospheric sciences, and ocean sciences | 1997 | 803 | 1.9 |
Physical sciences and earth sciences | Geosciences, atmospheric sciences, and ocean sciences | 2002 | 689 | 1.7 |
Physical sciences and earth sciences | Geosciences, atmospheric sciences, and ocean sciences | 2007 | 875 | 1.8 |
Physical sciences and earth sciences | Geosciences, atmospheric sciences, and ocean sciences | 2012 | 941 | 1.8 |
Physical sciences and earth sciences | Geosciences, atmospheric sciences, and ocean sciences | 2017 | 1169 | 2.1 |
Physical sciences and earth sciences | Geosciences, atmospheric sciences, and ocean sciences | 2022 | 1181 | 2.1 |
Physical sciences and earth sciences | Physics and astronomy | 1992 | 1537 | 4.0 |
Physical sciences and earth sciences | Physics and astronomy | 1997 | 1599 | 3.8 |
Physical sciences and earth sciences | Physics and astronomy | 2002 | 1264 | 3.2 |
Physical sciences and earth sciences | Physics and astronomy | 2007 | 1763 | 3.7 |
Physical sciences and earth sciences | Physics and astronomy | 2012 | 2062 | 4.0 |
Physical sciences and earth sciences | Physics and astronomy | 2017 | 2214 | 4.1 |
Physical sciences and earth sciences | Physics and astronomy | 2022 | 2408 | 4.2 |
Mathematics and computer sciences | Computer and information sciences | 1992 | 869 | 2.2 |
Mathematics and computer sciences | Computer and information sciences | 1997 | 909 | 2.1 |
Mathematics and computer sciences | Computer and information sciences | 2002 | 809 | 2.0 |
Mathematics and computer sciences | Computer and information sciences | 2007 | 1654 | 3.4 |
Mathematics and computer sciences | Computer and information sciences | 2012 | 1793 | 3.5 |
Mathematics and computer sciences | Computer and information sciences | 2017 | 1998 | 3.7 |
Mathematics and computer sciences | Computer and information sciences | 2022 | 2606 | 4.5 |
Mathematics and computer sciences | Mathematics and statistics | 1992 | 1058 | 2.7 |
Mathematics and computer sciences | Mathematics and statistics | 1997 | 1123 | 2.6 |
Mathematics and computer sciences | Mathematics and statistics | 2002 | 920 | 2.3 |
Mathematics and computer sciences | Mathematics and statistics | 2007 | 1388 | 2.9 |
Mathematics and computer sciences | Mathematics and statistics | 2012 | 1703 | 3.3 |
Mathematics and computer sciences | Mathematics and statistics | 2017 | 1844 | 3.4 |
Mathematics and computer sciences | Mathematics and statistics | 2022 | 2248 | 3.9 |
Psychology and social sciences | Psychology | 1992 | 3262 | 8.4 |
Psychology and social sciences | Psychology | 1997 | 3557 | 8.4 |
Psychology and social sciences | Psychology | 2002 | 3207 | 8.0 |
Psychology and social sciences | Psychology | 2007 | 3276 | 6.8 |
Psychology and social sciences | Psychology | 2012 | 3599 | 7.1 |
Psychology and social sciences | Psychology | 2017 | 3925 | 7.2 |
Psychology and social sciences | Psychology | 2022 | 3990 | 6.9 |
Psychology and social sciences | Anthropology | 1992 | 320 | 0.8 |
Psychology and social sciences | Anthropology | 1997 | 434 | 1.0 |
Psychology and social sciences | Anthropology | 2002 | 496 | 1.2 |
Psychology and social sciences | Anthropology | 2007 | 512 | 1.1 |
Psychology and social sciences | Anthropology | 2012 | 547 | 1.1 |
Psychology and social sciences | Anthropology | 2017 | 446 | 0.8 |
Psychology and social sciences | Anthropology | 2022 | 415 | 0.7 |
Psychology and social sciences | Economics | 1992 | 910 | 2.3 |
Psychology and social sciences | Economics | 1997 | 1030 | 2.4 |
Psychology and social sciences | Economics | 2002 | 908 | 2.3 |
Psychology and social sciences | Economics | 2007 | 1004 | 2.1 |
Psychology and social sciences | Economics | 2012 | 1243 | 2.4 |
Psychology and social sciences | Economics | 2017 | 1239 | 2.3 |
Psychology and social sciences | Economics | 2022 | 1287 | 2.2 |
Psychology and social sciences | Political science and government | 1992 | 513 | 1.3 |
Psychology and social sciences | Political science and government | 1997 | 665 | 1.6 |
Psychology and social sciences | Political science and government | 2002 | 606 | 1.5 |
Psychology and social sciences | Political science and government | 2007 | 588 | 1.2 |
Psychology and social sciences | Political science and government | 2012 | 724 | 1.4 |
Psychology and social sciences | Political science and government | 2017 | 743 | 1.4 |
Psychology and social sciences | Political science and government | 2022 | 678 | 1.2 |
Psychology and social sciences | Sociology | 1992 | 495 | 1.3 |
Psychology and social sciences | Sociology | 1997 | 577 | 1.4 |
Psychology and social sciences | Sociology | 2002 | 547 | 1.4 |
Psychology and social sciences | Sociology | 2007 | 576 | 1.2 |
Psychology and social sciences | Sociology | 2012 | 633 | 1.2 |
Psychology and social sciences | Sociology | 2017 | 683 | 1.3 |
Psychology and social sciences | Sociology | 2022 | 611 | 1.1 |
Psychology and social sciences | Other social sciences | 1992 | 1062 | 2.7 |
Psychology and social sciences | Other social sciences | 1997 | 1106 | 2.6 |
Psychology and social sciences | Other social sciences | 2002 | 1161 | 2.9 |
Psychology and social sciences | Other social sciences | 2007 | 1353 | 2.8 |
Psychology and social sciences | Other social sciences | 2012 | 1752 | 3.4 |
Psychology and social sciences | Other social sciences | 2017 | 1998 | 3.7 |
Psychology and social sciences | Other social sciences | 2022 | 2254 | 3.9 |
Engineering | Aerospace, aeronautical, and astronautical engineering | 1992 | 234 | 0.6 |
Engineering | Aerospace, aeronautical, and astronautical engineering | 1997 | 273 | 0.6 |
Engineering | Aerospace, aeronautical, and astronautical engineering | 2002 | 209 | 0.5 |
Engineering | Aerospace, aeronautical, and astronautical engineering | 2007 | 267 | 0.6 |
Engineering | Aerospace, aeronautical, and astronautical engineering | 2012 | 307 | 0.6 |
Engineering | Aerospace, aeronautical, and astronautical engineering | 2017 | 379 | 0.7 |
Engineering | Aerospace, aeronautical, and astronautical engineering | 2022 | 374 | 0.6 |
Engineering | Bioengineering and biomedical engineering | 1992 | 147 | 0.4 |
Engineering | Bioengineering and biomedical engineering | 1997 | 211 | 0.5 |
Engineering | Bioengineering and biomedical engineering | 2002 | 246 | 0.6 |
Engineering | Bioengineering and biomedical engineering | 2007 | 637 | 1.3 |
Engineering | Bioengineering and biomedical engineering | 2012 | 943 | 1.9 |
Engineering | Bioengineering and biomedical engineering | 2017 | 1032 | 1.9 |
Engineering | Bioengineering and biomedical engineering | 2022 | 1228 | 2.1 |
Engineering | Chemical engineering | 1992 | 607 | 1.6 |
Engineering | Chemical engineering | 1997 | 662 | 1.6 |
Engineering | Chemical engineering | 2002 | 607 | 1.5 |
Engineering | Chemical engineering | 2007 | 817 | 1.7 |
Engineering | Chemical engineering | 2012 | 840 | 1.6 |
Engineering | Chemical engineering | 2017 | 931 | 1.7 |
Engineering | Chemical engineering | 2022 | 1142 | 2.0 |
Engineering | Civil engineering | 1992 | 540 | 1.4 |
Engineering | Civil engineering | 1997 | 592 | 1.4 |
Engineering | Civil engineering | 2002 | 540 | 1.3 |
Engineering | Civil engineering | 2007 | 703 | 1.5 |
Engineering | Civil engineering | 2012 | 495 | 1.0 |
Engineering | Civil engineering | 2017 | 713 | 1.3 |
Engineering | Civil engineering | 2022 | 898 | 1.6 |
Engineering | Electrical, electronics, and communications engineering | 1992 | 1278 | 3.3 |
Engineering | Electrical, electronics, and communications engineering | 1997 | 1460 | 3.4 |
Engineering | Electrical, electronics, and communications engineering | 2002 | 1212 | 3.0 |
Engineering | Electrical, electronics, and communications engineering | 2007 | 1967 | 4.1 |
Engineering | Electrical, electronics, and communications engineering | 2012 | 1938 | 3.8 |
Engineering | Electrical, electronics, and communications engineering | 2017 | 1879 | 3.4 |
Engineering | Electrical, electronics, and communications engineering | 2022 | 2193 | 3.8 |
Engineering | Industrial and manufacturing engineering | 1992 | 196 | 0.5 |
Engineering | Industrial and manufacturing engineering | 1997 | 246 | 0.6 |
Engineering | Industrial and manufacturing engineering | 2002 | 230 | 0.6 |
Engineering | Industrial and manufacturing engineering | 2007 | 279 | 0.6 |
Engineering | Industrial and manufacturing engineering | 2012 | 226 | 0.4 |
Engineering | Industrial and manufacturing engineering | 2017 | 249 | 0.5 |
Engineering | Industrial and manufacturing engineering | 2022 | 381 | 0.7 |
Engineering | Materials science engineering | 1992 | 365 | 0.9 |
Engineering | Materials science engineering | 1997 | 483 | 1.1 |
Engineering | Materials science engineering | 2002 | 364 | 0.9 |
Engineering | Materials science engineering | 2007 | 646 | 1.3 |
Engineering | Materials science engineering | 2012 | 743 | 1.5 |
Engineering | Materials science engineering | 2017 | 937 | 1.7 |
Engineering | Materials science engineering | 2022 | 1136 | 2.0 |
Engineering | Mechanical engineering | 1992 | 855 | 2.2 |
Engineering | Mechanical engineering | 1997 | 929 | 2.2 |
Engineering | Mechanical engineering | 2002 | 771 | 1.9 |
Engineering | Mechanical engineering | 2007 | 1071 | 2.2 |
Engineering | Mechanical engineering | 2012 | 1220 | 2.4 |
Engineering | Mechanical engineering | 2017 | 1398 | 2.6 |
Engineering | Mechanical engineering | 2022 | 1676 | 2.9 |
Engineering | Other engineering | 1992 | 1216 | 3.1 |
Engineering | Other engineering | 1997 | 1258 | 3.0 |
Engineering | Other engineering | 2002 | 902 | 2.3 |
Engineering | Other engineering | 2007 | 1362 | 2.8 |
Engineering | Other engineering | 2012 | 1757 | 3.4 |
Engineering | Other engineering | 2017 | 2258 | 4.1 |
Engineering | Other engineering | 2022 | 2502 | 4.3 |
Education | Education administration | 1992 | 1984 | 5.1 |
Education | Education administration | 1997 | 2050 | 4.8 |
Education | Education administration | 2002 | 2351 | 5.9 |
Education | Education administration | 2007 | 2161 | 4.5 |
Education | Education administration | 2012 | 1057 | 2.1 |
Education | Education administration | 2017 | 922 | 1.7 |
Education | Education administration | 2022 | 734 | 1.3 |
Education | Education research | 1992 | 2503 | 6.4 |
Education | Education research | 1997 | 2695 | 6.3 |
Education | Education research | 2002 | 2776 | 6.9 |
Education | Education research | 2007 | 2671 | 5.5 |
Education | Education research | 2012 | 2516 | 4.9 |
Education | Education research | 2017 | 2373 | 4.3 |
Education | Education research | 2022 | 2289 | 4.0 |
Education | Teacher education | 1992 | 407 | 1.0 |
Education | Teacher education | 1997 | 291 | 0.7 |
Education | Teacher education | 2002 | 262 | 0.7 |
Education | Teacher education | 2007 | 297 | 0.6 |
Education | Teacher education | 2012 | 156 | 0.3 |
Education | Teacher education | 2017 | 114 | 0.2 |
Education | Teacher education | 2022 | 110 | 0.2 |
Education | Teaching fields | 1992 | 1008 | 2.6 |
Education | Teaching fields | 1997 | 919 | 2.2 |
Education | Teaching fields | 2002 | 686 | 1.7 |
Education | Teaching fields | 2007 | 873 | 1.8 |
Education | Teaching fields | 2012 | 757 | 1.5 |
Education | Teaching fields | 2017 | 925 | 1.7 |
Education | Teaching fields | 2022 | 890 | 1.5 |
Education | Other education | 1992 | 775 | 2.0 |
Education | Other education | 1997 | 622 | 1.5 |
Education | Other education | 2002 | 433 | 1.1 |
Education | Other education | 2007 | 446 | 0.9 |
Education | Other education | 2012 | 316 | 0.6 |
Education | Other education | 2017 | 492 | 0.9 |
Education | Other education | 2022 | 486 | 0.8 |
Humanities and arts | Foreign languages and literature | 1992 | 562 | 1.4 |
Humanities and arts | Foreign languages and literature | 1997 | 652 | 1.5 |
Humanities and arts | Foreign languages and literature | 2002 | 627 | 1.6 |
Humanities and arts | Foreign languages and literature | 2007 | 607 | 1.3 |
Humanities and arts | Foreign languages and literature | 2012 | 684 | 1.3 |
Humanities and arts | Foreign languages and literature | 2017 | 618 | 1.1 |
Humanities and arts | Foreign languages and literature | 2022 | 442 | 0.8 |
Humanities and arts | History | 1992 | 724 | 1.9 |
Humanities and arts | History | 1997 | 965 | 2.3 |
Humanities and arts | History | 2002 | 1031 | 2.6 |
Humanities and arts | History | 2007 | 937 | 1.9 |
Humanities and arts | History | 2012 | 1086 | 2.1 |
Humanities and arts | History | 2017 | 1058 | 1.9 |
Humanities and arts | History | 2022 | 750 | 1.3 |
Humanities and arts | Letters | 1992 | 1278 | 3.3 |
Humanities and arts | Letters | 1997 | 1550 | 3.6 |
Humanities and arts | Letters | 2002 | 1455 | 3.6 |
Humanities and arts | Letters | 2007 | 1340 | 2.8 |
Humanities and arts | Letters | 2012 | 1638 | 3.2 |
Humanities and arts | Letters | 2017 | 1462 | 2.7 |
Humanities and arts | Letters | 2022 | 1292 | 2.2 |
Humanities and arts | Other humanities and arts | 1992 | 1823 | 4.7 |
Humanities and arts | Other humanities and arts | 1997 | 2118 | 5.0 |
Humanities and arts | Other humanities and arts | 2002 | 2184 | 5.5 |
Humanities and arts | Other humanities and arts | 2007 | 2201 | 4.6 |
Humanities and arts | Other humanities and arts | 2012 | 2153 | 4.2 |
Humanities and arts | Other humanities and arts | 2017 | 2148 | 3.9 |
Humanities and arts | Other humanities and arts | 2022 | 1980 | 3.4 |
Other | Business management and administration | 1992 | 1248 | 3.2 |
Other | Business management and administration | 1997 | 1245 | 2.9 |
Other | Business management and administration | 2002 | 1113 | 2.8 |
Other | Business management and administration | 2007 | 1506 | 3.1 |
Other | Business management and administration | 2012 | 1404 | 2.8 |
Other | Business management and administration | 2017 | 1565 | 2.9 |
Other | Business management and administration | 2022 | 1450 | 2.5 |
Other | Communication | 1992 | 330 | 0.8 |
Other | Communication | 1997 | 331 | 0.8 |
Other | Communication | 2002 | 397 | 1.0 |
Other | Communication | 2007 | 560 | 1.2 |
Other | Communication | 2012 | 595 | 1.2 |
Other | Communication | 2017 | 622 | 1.1 |
Other | Communication | 2022 | 580 | 1.0 |
Other | Non-science and engineering fields nec | 1992 | 628 | 1.6 |
Other | Non-science and engineering fields nec | 1997 | 615 | 1.4 |
Other | Non-science and engineering fields nec | 2002 | 628 | 1.6 |
Other | Non-science and engineering fields nec | 2007 | 775 | 1.6 |
Other | Non-science and engineering fields nec | 2012 | 735 | 1.4 |
Other | Non-science and engineering fields nec | 2017 | 965 | 1.8 |
Other | Non-science and engineering fields nec | 2022 | 1114 | 1.9 |
Jonathan shared this data set and suggested looking at “how the make up of doctorate degrees has changed throughout the years.” Following his advice, I will analyze the data to determine which broad fields have experienced the greatest increase and greatest decrease in their share of doctorate degrees since 1992.
In this code block, I create a new data frame that lists each broad field and the percent of that year's total doctorates it represented.
broadfield_annual <- tidy_data %>%
  group_by(broadfield, Year) %>%
  summarise(Percent = sum(Year_Percent))
## `summarise()` has grouped output by 'broadfield'. You can override using the
## `.groups` argument.
kable(broadfield_annual, format = "pipe", caption = "Broad Field % of Doctorates by Year", align = "lcc")
broadfield | Year | Percent |
---|---|---|
Education | 1992 | 17.1 |
Education | 1997 | 15.5 |
Education | 2002 | 16.3 |
Education | 2007 | 13.3 |
Education | 2012 | 9.4 |
Education | 2017 | 8.8 |
Education | 2022 | 7.8 |
Engineering | 1992 | 14.0 |
Engineering | 1997 | 14.4 |
Engineering | 2002 | 12.6 |
Engineering | 2007 | 16.1 |
Engineering | 2012 | 16.6 |
Engineering | 2017 | 17.9 |
Engineering | 2022 | 20.0 |
Humanities and arts | 1992 | 11.3 |
Humanities and arts | 1997 | 12.4 |
Humanities and arts | 2002 | 13.3 |
Humanities and arts | 2007 | 10.6 |
Humanities and arts | 2012 | 10.8 |
Humanities and arts | 2017 | 9.6 |
Humanities and arts | 2022 | 7.7 |
Life sciences | 1992 | 18.4 |
Life sciences | 1997 | 19.7 |
Life sciences | 2002 | 21.1 |
Life sciences | 2007 | 22.2 |
Life sciences | 2012 | 23.5 |
Life sciences | 2017 | 23.0 |
Life sciences | 2022 | 22.9 |
Mathematics and computer sciences | 1992 | 4.9 |
Mathematics and computer sciences | 1997 | 4.7 |
Mathematics and computer sciences | 2002 | 4.3 |
Mathematics and computer sciences | 2007 | 6.3 |
Mathematics and computer sciences | 2012 | 6.8 |
Mathematics and computer sciences | 2017 | 7.1 |
Mathematics and computer sciences | 2022 | 8.4 |
Other | 1992 | 5.6 |
Other | 1997 | 5.1 |
Other | 2002 | 5.4 |
Other | 2007 | 5.9 |
Other | 2012 | 5.4 |
Other | 2017 | 5.8 |
Other | 2022 | 5.4 |
Physical sciences and earth sciences | 1992 | 11.7 |
Physical sciences and earth sciences | 1997 | 10.7 |
Physical sciences and earth sciences | 2002 | 9.7 |
Physical sciences and earth sciences | 2007 | 10.3 |
Physical sciences and earth sciences | 2012 | 10.5 |
Physical sciences and earth sciences | 2017 | 11.1 |
Physical sciences and earth sciences | 2022 | 11.6 |
Psychology and social sciences | 1992 | 16.8 |
Psychology and social sciences | 1997 | 17.4 |
Psychology and social sciences | 2002 | 17.3 |
Psychology and social sciences | 2007 | 15.2 |
Psychology and social sciences | 2012 | 16.6 |
Psychology and social sciences | 2017 | 16.7 |
Psychology and social sciences | 2022 | 16.0 |
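Because each broad field's percent is a sum of rounded sub-field percents, one quick sanity check is to confirm that the shares within a year add up to roughly 100. The sketch below is not part of the original analysis. It rebuilds the 1992 and 2022 rows of the table above by hand in a hypothetical data frame named `check_data`, and the tolerance of 1 point is an assumption chosen only for illustration.

```r
library(dplyr)

# Hypothetical stand-in for broadfield_annual: the 1992 and 2022 rows
# copied by hand from the table above.
check_data <- data.frame(
  broadfield = rep(c("Education", "Engineering", "Humanities and arts",
                     "Life sciences", "Mathematics and computer sciences",
                     "Other", "Physical sciences and earth sciences",
                     "Psychology and social sciences"), times = 2),
  Year = rep(c(1992, 2022), each = 8),
  Percent = c(17.1, 14.0, 11.3, 18.4, 4.9, 5.6, 11.7, 16.8,
              7.8, 20.0, 7.7, 22.9, 8.4, 5.4, 11.6, 16.0)
)

# Total share per year. Rounding in the source table means the total
# need not be exactly 100.
year_totals <- check_data %>%
  group_by(Year) %>%
  summarise(Total = sum(Percent), .groups = "drop") %>%
  # Assumed tolerance: flag any year more than 1 point away from 100.
  mutate(Within_Tolerance = abs(Total - 100) < 1)

print(year_totals)
```

The same `group_by(Year)` and `summarise()` steps could be run on the full `broadfield_annual` data frame to check all seven years at once.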
In this code block, I use ggplot to graph the data for each broad field as a line graph over time. Visually, it appears that since 1992 Engineering has grown the most as a share of doctorates and Education has declined the most.
library(ggplot2)
ggplot(broadfield_annual, aes(x = Year, y = Percent, color = broadfield)) +
  geom_line() +
  labs(title = "Major Field Share of Doctorates", y = "Percent of Doctorates", color = "Major Field") +
  scale_x_continuous(breaks = seq(1992, 2022, by = 5))
In this code block, I confirm what I observed visually in the graph. I filter the data to the 1992 and 2022 rows, use pivot_wider to place each year's percent in its own column, use mutate to add a column for the change in each broad field's share of doctorates from 1992 to 2022, select only the broad field and the change, and sort by change in descending order. The output confirms my visual observations: Engineering increased its share of doctorates by 6.0 percentage points from 1992 to 2022 (the largest increase), while Education decreased its share by 9.3 percentage points (the largest decrease).
broadfield_changes <- broadfield_annual %>%
  filter(Year %in% c(1992, 2022)) %>%
  pivot_wider(
    names_from = "Year",
    values_from = "Percent"
  ) %>%
  mutate(Change = `2022` - `1992`) %>%
  select(broadfield, Change) %>%
  arrange(desc(Change))
kable(broadfield_changes, format = "pipe", caption = "Change in Broad Field % of Doctorates, 1992-2022", align = "lc")
broadfield | Change |
---|---|
Engineering | 6.0 |
Life sciences | 4.5 |
Mathematics and computer sciences | 3.5 |
Physical sciences and earth sciences | -0.1 |
Other | -0.2 |
Psychology and social sciences | -0.8 |
Humanities and arts | -3.6 |
Education | -9.3 |
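The Change column above is a difference between two percentages, so it is measured in percentage points rather than percent. As a rough illustration (not part of the original analysis), the sketch below contrasts that difference with the relative change in each field's share, using the 1992 and 2022 values for Engineering and Education from the earlier table. The data frame `share_endpoints` and the column names `Point_Change` and `Relative_Change` are hypothetical.

```r
library(dplyr)

# Hypothetical data frame: 1992 and 2022 shares copied by hand from the
# broad field table earlier in this report.
share_endpoints <- data.frame(
  broadfield = c("Engineering", "Education"),
  share_1992 = c(14.0, 17.1),
  share_2022 = c(20.0, 7.8)
)

share_changes <- share_endpoints %>%
  mutate(
    # Difference in share, in percentage points (matches the Change column).
    Point_Change = share_2022 - share_1992,
    # Change relative to the 1992 share, in percent.
    Relative_Change = round(100 * Point_Change / share_1992, 1)
  )

print(share_changes)
```

With these numbers, Engineering's 6.0-point gain is roughly a 43% increase over its 1992 share, and Education's 9.3-point loss is roughly a 54% decrease.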
This is the second consecutive assignment in which I have used pivot_longer to tidy wide data and then, during data analysis, used pivot_wider to return it to a wide format. I think I have benefited from the experience of converting data in both directions, gaining a deeper understanding of the benefits of both formats (as well as the code used to convert between them). There is a connection here to my job as a high school math teacher: the high school math standards in New York emphasize the benefits to students of understanding and converting between “multiple equivalent representations” of functions (such as a graph, equation, and table of input-output pairs). Understanding and converting between multiple representations of data has been similarly helpful to me. Going forward, I’ll continue to look for opportunities to work in “two directions” during data analysis.