Choosing a college major is a decision most students weigh carefully because it can affect future jobs and income. Many people assume that picking a major with a high starting salary will automatically lead to higher earnings later in life, but that is not guaranteed. Career paths differ widely across majors, and salaries in some fields grow more over time than in others. It is therefore worth looking at how salaries change from early career to mid-career across fields. This project asks whether higher starting salaries are related to higher long-term earnings.
The data used in this project comes from salary information collected by PayScale and published by the Wall Street Journal. It includes salary data for different undergraduate majors at both early career and mid-career stages. Each row in the dataset represents one major, and the dataset contains 50 majors in total. Key variables include starting median salary and mid-career median salary, which are measured in US dollars. This data comes from real-world salary reports, so it reflects actual earnings rather than results from an experiment.
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.2.0 ✔ readr 2.2.0
## ✔ forcats 1.0.1 ✔ stringr 1.6.0
## ✔ ggplot2 4.0.2 ✔ tibble 3.3.1
## ✔ lubridate 1.9.5 ✔ tidyr 1.3.2
## ✔ purrr 1.2.1
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
salary <- read_csv("degrees-that-pay-back.csv")
## Rows: 50 Columns: 8
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (7): Undergraduate Major, Starting Median Salary, Mid-Career Median Sala...
## dbl (1): Percent change from Starting to Mid-Career Salary
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# The salary columns are read as text, so convert them to numbers
salary <- salary |>
  mutate(
    start_salary = parse_number(`Starting Median Salary`),
    mid_salary = parse_number(`Mid-Career Median Salary`)
  )
Relationship Between Starting and Mid-Career Salary
# Scatterplot with a fitted straight line (no confidence band)
ggplot(salary, aes(x = start_salary, y = mid_salary)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Starting Salary vs Mid-Career Salary",
    x = "Starting Salary",
    y = "Mid-Career Salary"
  )
## `geom_smooth()` using formula = 'y ~ x'
Caption
This scatterplot shows the relationship between starting salaries and mid-career salaries across different undergraduate majors. Each point represents one major, with the x-axis showing starting salary and the y-axis showing mid-career salary. The line added represents the general trend in the data.
Explanation
The graph shows a strong positive relationship between starting salary and mid-career salary. This means that majors with higher starting salaries also tend to have higher salaries later in a person’s career. Most of the points follow an upward pattern, which suggests that early salary differences usually continue over time. However, there are still some majors that do not follow the exact pattern, showing that other factors may also affect long-term earnings.
Alt text
Scatterplot showing a clear positive relationship between starting salary and mid-career salary across different majors, with most points following an upward trend.
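The strength of the relationship described in the explanation above can also be expressed as a number. The following is a minimal sketch, not part of the original analysis: it uses a small toy data frame with made-up values (an assumption, not PayScale figures) in place of the `start_salary` and `mid_salary` columns, and reports the Pearson correlation along with the slope and R-squared of the same kind of straight-line fit that `geom_smooth(method = "lm")` draws.

```r
# Hypothetical toy values, NOT taken from the PayScale data. They only
# stand in for the start_salary and mid_salary columns created earlier.
toy <- data.frame(
  start_salary = c(35000, 40000, 45000, 55000, 60000),
  mid_salary   = c(60000, 72000, 75000, 95000, 105000)
)

# Pearson correlation: values close to 1 indicate a strong positive
# linear relationship.
r <- cor(toy$start_salary, toy$mid_salary)

# Ordinary least-squares line of mid-career salary on starting salary.
fit <- lm(mid_salary ~ start_salary, data = toy)

# Slope: estimated change in mid-career salary per extra dollar of
# starting salary. R-squared: share of variance explained by the line.
slope <- unname(coef(fit)["start_salary"])
r_squared <- summary(fit)$r.squared

print(c(r = r, slope = slope, r_squared = r_squared))
```

Running the same `cor()` and `lm()` calls on the real `salary` data would show how strong the relationship actually is, which the plot alone only suggests visually.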
Top Majors by Mid-Career Salary
# Keep the 10 majors with the highest mid-career salary and plot them
# as horizontal bars
salary |>
  slice_max(mid_salary, n = 10) |>
  ggplot(aes(x = reorder(`Undergraduate Major`, mid_salary), y = mid_salary)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Top 10 Majors by Mid-Career Salary",
    x = "Major",
    y = "Mid-Career Salary"
  )
Caption
This bar chart displays the 10 undergraduate majors with the highest mid-career salaries. The bars are sorted by salary, with the highest-paying major at the top, and the values are median salaries in US dollars.
Explanation
The chart shows that certain majors have much higher mid-career salaries compared to others. Many of the highest-paying majors are in technical or specialized fields, which suggests that these areas may offer stronger long-term earning potential. The difference between majors is also quite large, showing that the choice of major can have a significant impact on future income. This supports the idea that field of study plays an important role in long-term earnings.
Alt text
Bar chart showing the top 10 majors ranked by mid-career salary, with noticeable differences between majors.
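The explanation above says the differences between majors are large. One way to put a number on that claim is sketched below. The majors and salaries are hypothetical placeholders (an assumption, not values from the dataset), and `head()` on a sorted data frame is used as a base R stand-in similar to `slice_max()`.

```r
# Hypothetical majors and salaries, NOT taken from the PayScale data.
toy <- data.frame(
  major = paste("Major", LETTERS[1:6]),
  mid_salary = c(52000, 64000, 71000, 88000, 97000, 107000),
  stringsAsFactors = FALSE
)

# Keep the n highest mid-career salaries (n = 10 in the report above).
n <- 3
top <- head(toy[order(-toy$mid_salary), ], n)

# Dollar gap and ratio between the highest and lowest major in the top n.
gap <- max(top$mid_salary) - min(top$mid_salary)
ratio <- max(top$mid_salary) / min(top$mid_salary)

print(top)
print(c(gap = gap, ratio = ratio))
```

Applying the same two calculations to the full `salary` data, rather than only the top group, would show how wide the spread is across all 50 majors.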
Salary Growth by Major
# Dollar increase from starting salary to mid-career salary
salary <- salary |>
  mutate(growth = mid_salary - start_salary)
# Plot the 10 majors with the largest dollar increase
salary |>
  slice_max(growth, n = 10) |>
  ggplot(aes(x = reorder(`Undergraduate Major`, growth), y = growth)) +
  geom_col() +
  coord_flip() +
  labs(
    title = "Majors with the Highest Salary Growth",
    x = "Major",
    y = "Salary Growth"
  )
Caption
This bar chart shows the top 10 majors with the largest increase in salary from early career to mid-career. The values represent the difference between mid-career salary and starting salary.
Explanation
This graph shows that some majors experience larger salary growth over time compared to others. Even if a major does not start with the highest salary, it may still have strong long-term growth. This suggests that focusing only on starting salary may not give a complete picture of earning potential. Some fields may offer more opportunities for salary increases as experience grows.
Alt text
Bar chart showing the majors with the largest increase in salary from starting to mid-career, highlighting differences in salary growth.
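Growth here is measured in dollars, which tends to favor majors that already start high. The dataset also includes a percent-change column, and ranking by relative growth can give a different order. The sketch below illustrates the difference with three hypothetical majors (made-up values, not PayScale figures), where `growth_pct` is an assumed name for the increase expressed as a percentage of starting salary.

```r
# Hypothetical majors and salaries, NOT taken from the PayScale data.
toy <- data.frame(
  major = c("Major A", "Major B", "Major C"),
  start_salary = c(40000, 60000, 50000),
  mid_salary   = c(80000, 105000, 70000),
  stringsAsFactors = FALSE
)

# Absolute growth in dollars (the measure used in the chart above).
toy$growth <- toy$mid_salary - toy$start_salary

# Relative growth as a percentage of starting salary.
toy$growth_pct <- 100 * toy$growth / toy$start_salary

# Rank the majors under each measure, largest growth first.
by_dollars <- toy$major[order(-toy$growth)]
by_percent <- toy$major[order(-toy$growth_pct)]

print(toy)
print(by_dollars)
print(by_percent)
```

In this toy example the major with the largest dollar increase is not the one with the largest percentage increase, which is the kind of case the explanation has in mind when it says starting salary alone does not give a complete picture.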