I chose to use a data set that I found on Kaggle (kaggle.com). The data set “is a cleaned data set of US state and federal minimum wages from 1968 to 2020 (including 2020 equivalency values). The data was scraped from the United States Department of Labor’s table of minimum wage by state.” From this data I chose to focus on one state for each visualization. I begin cleaning the data and making it easier to work with by changing all the column names to lowercase and replacing the periods with underscores. I then select only the columns that I’ll be using for my visualizations. After that I filter by state and create separate data frames for Maryland (MD) and California (CA). Even though the data set is listed as “cleaned,” I still use is.na() to be sure there are no NA values present.
Load in the data
I set my working directory after loading the needed libraries, then load in my data.
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.2 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.2 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Rows: 2862 Columns: 15
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (3): State, Department.Of.Labor.Uncleaned.Data, Footnote
dbl (12): Year, State.Minimum.Wage, State.Minimum.Wage.2020.Dollars, Federal...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
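The cleaning steps described in the introduction can be sketched roughly as follows. This is a minimal sketch and not necessarily the exact code used: `raw_wage_data` is a hypothetical name for the loaded table, the toy values are made up so the block runs on its own, the column name `Federal.Minimum.Wage.2020.Dollars` is inferred from the lowercase name used in the plots, and the assumption that states are stored under their full names is mine.

```r
library(dplyr)

# Toy stand-in for the raw table (made-up values, NOT the real data).
# `raw_wage_data` is a hypothetical name for the data frame loaded above.
raw_wage_data <- data.frame(
  Year = c(2019, 2020, 2019, 2020),
  State = c("Maryland", "Maryland", "California", "California"),
  State.Minimum.Wage.2020.Dollars = c(10.2, 11.0, 12.1, 13.0),
  Federal.Minimum.Wage.2020.Dollars = c(7.3, 7.25, 7.3, 7.25),
  Footnote = c(NA, NA, "(b)", "(b)")
)

# Lowercase every column name and turn periods into underscores.
wage_data <- raw_wage_data %>%
  rename_with(~ gsub(".", "_", tolower(.x), fixed = TRUE))

# Keep only the columns the plots need.
wage_data <- wage_data %>%
  select(year, state,
         state_minimum_wage_2020_dollars,
         federal_minimum_wage_2020_dollars)

# One data frame per state (assumes full state names in the `state` column).
cleaned_wage_data  <- filter(wage_data, state == "Maryland")
cleaned_wage_data2 <- filter(wage_data, state == "California")

# Count missing values per column; all zeros means nothing is missing.
colSums(is.na(cleaned_wage_data))
colSums(is.na(cleaned_wage_data2))
```

Using `colSums(is.na(...))` gives one count per column, which makes it easy to see where any missing values are rather than only whether some exist.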
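Bar Graph for Maryland
# Overlay state and federal minimum wages (2020 dollars) for Maryland
plot1 <- ggplot(cleaned_wage_data, aes(x = year)) +
  geom_bar(aes(y = state_minimum_wage_2020_dollars, fill = "State Minimum Wage"), stat = "identity") +
  # Semi-transparent federal bars so the overlap blends to orange
  geom_bar(aes(y = federal_minimum_wage_2020_dollars, fill = "Federal Minimum Wage"), stat = "identity", alpha = 0.5) +
  scale_fill_manual(values = c("State Minimum Wage" = "darkred", "Federal Minimum Wage" = "yellow"), name = "Legend") +
  # Label the y axis every 2 dollars, up to the highest wage in the data
  scale_y_continuous(breaks = seq(0, max(cleaned_wage_data$state_minimum_wage_2020_dollars, cleaned_wage_data$federal_minimum_wage_2020_dollars, na.rm = TRUE), by = 2)) +
  labs(title = "Maryland vs. Federal \nMinimum Wage Comparison", x = "Year", y = "Minimum Wage (in 2020 dollars)") +
  theme_minimal()
plot1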
Bar Graph for California
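# Overlay state and federal minimum wages (2020 dollars) for California
plot2 <- ggplot(cleaned_wage_data2, aes(x = year)) +
  geom_bar(aes(y = state_minimum_wage_2020_dollars, fill = "State Minimum Wage"), stat = "identity") +
  # Semi-transparent federal bars so the overlap blends to orange
  geom_bar(aes(y = federal_minimum_wage_2020_dollars, fill = "Federal Minimum Wage"), stat = "identity", alpha = 0.5) +
  scale_fill_manual(values = c("State Minimum Wage" = "darkred", "Federal Minimum Wage" = "yellow"), name = "Legend") +
  # Label the y axis every 2 dollars, up to the highest wage in the data
  scale_y_continuous(breaks = seq(0, max(cleaned_wage_data2$state_minimum_wage_2020_dollars, cleaned_wage_data2$federal_minimum_wage_2020_dollars, na.rm = TRUE), by = 2)) +
  labs(title = "California vs. Federal \nMinimum Wage Comparison", x = "Year", y = "Minimum Wage (in 2020 dollars)") +
  theme_minimal()
plot2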
What my visualization represents
My visualizations show the relationship between state and federal minimum wages. The x axis is the year and the y axis is the minimum wage in 2020 dollars. I chose dark red (state) and yellow (federal) because they blend to orange where the semi-transparent bars overlap. Most of each graph is therefore orange, and plain red or yellow appears only where the two wages clearly differ. For Maryland, the state and federal minimum wages mostly stayed the same until around 2015, when the state raised its minimum. California is a contrast: the state minimum started higher than the federal, dipped below it, and then effectively matched it. Around 2000 the California minimum began rising steadily and kept doing so for the next 20 years. One interesting thing to note is that since 2010 the federal minimum has been steadily decreasing in 2020 dollars (the nominal rate has been $7.25 since 2009, so inflation erodes its real value), while both states I used saw increases. I am curious, if I were to repeat these graphs for all states, how many states would stay aligned with the federal minimum and how many raised their own over the last ~10 years. A benefit of having this code written is that it can easily be modified to make a graph for any state.
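One way the plotting code could be generalized to any state is to wrap it in a function. The sketch below is only an illustration: `plot_state_vs_federal` and `toy_state` are hypothetical names, the toy values are made up, and the function assumes a data frame that has already been filtered to a single state and uses the same column names as the plots above.

```r
library(ggplot2)

# Hypothetical helper: draws the state-vs-federal bar graph for one state.
# `state_data` must already be filtered to a single state and contain the
# columns year, state_minimum_wage_2020_dollars and
# federal_minimum_wage_2020_dollars.
plot_state_vs_federal <- function(state_data, state_name) {
  # Highest wage in either series, used for the y-axis breaks.
  max_wage <- max(state_data$state_minimum_wage_2020_dollars,
                  state_data$federal_minimum_wage_2020_dollars,
                  na.rm = TRUE)

  ggplot(state_data, aes(x = year)) +
    geom_bar(aes(y = state_minimum_wage_2020_dollars,
                 fill = "State Minimum Wage"), stat = "identity") +
    geom_bar(aes(y = federal_minimum_wage_2020_dollars,
                 fill = "Federal Minimum Wage"), stat = "identity", alpha = 0.5) +
    scale_fill_manual(values = c("State Minimum Wage" = "darkred",
                                 "Federal Minimum Wage" = "yellow"),
                      name = "Legend") +
    scale_y_continuous(breaks = seq(0, max_wage, by = 2)) +
    labs(title = paste(state_name, "vs. Federal \nMinimum Wage Comparison"),
         x = "Year",
         y = "Minimum Wage (in 2020 dollars)") +
    theme_minimal()
}

# Toy data (made-up values) so the example runs on its own.
toy_state <- data.frame(
  year = 2018:2020,
  state_minimum_wage_2020_dollars = c(9.5, 10.2, 11.0),
  federal_minimum_wage_2020_dollars = c(7.5, 7.3, 7.25)
)

toy_plot <- plot_state_vs_federal(toy_state, "Maryland")
toy_plot
```

With a helper like this, a new state only needs its own filtered data frame and a name for the title.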
Anything I could not get to work
Originally I planned to make a heat map that included all states, using the “heat” (color) to represent how each state compares to the federal minimum. After numerous attempts at the heat map, I could not get a visualization that I felt was effective and easy to read. For starters, including all states made the visualization cluttered. I also felt that I couldn’t effectively show the exact values for the state and federal minimum wages, whereas with the bar graphs I could.
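For reference, one possible shape for such a heat map is sketched below. It is not one of the attempts described above, only an illustration under stated assumptions: `all_states` is a hypothetical data frame holding every state with the same cleaned column names, the toy values are made up, and the color encodes the gap between the state and federal minimums rather than the exact wages.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for an all-states table (made-up values, three states only).
all_states <- data.frame(
  year = rep(2018:2020, times = 3),
  state = rep(c("California", "Maryland", "Texas"), each = 3),
  state_minimum_wage_2020_dollars = c(11.5, 12.2, 13.0,
                                      10.4, 10.2, 11.0,
                                      7.5, 7.3, 7.25),
  federal_minimum_wage_2020_dollars = rep(c(7.5, 7.3, 7.25), times = 3)
)

# Gap between state and federal minimum; 0 means the state matches federal.
heat_data <- all_states %>%
  mutate(gap = state_minimum_wage_2020_dollars -
               federal_minimum_wage_2020_dollars)

# One tile per state and year, colored by the gap.
heat_map <- ggplot(heat_data, aes(x = year, y = state, fill = gap)) +
  geom_tile(color = "white") +
  scale_fill_gradient2(low = "yellow", mid = "white", high = "darkred",
                       midpoint = 0,
                       name = "State minus federal\n(2020 dollars)") +
  labs(title = "State vs. Federal Minimum Wage",
       x = "Year", y = "State") +
  theme_minimal()
heat_map
```

This trades the exact values for a single difference per tile, so it does not address the concern about showing both wages, and with all states over 50 years the tiles would still be small.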