Data Science Salaries Visualization

Author: Irtiza Mohammad Karib

In this analysis, we examine how salaries in the data science industry have evolved. We use the Data Science Salaries 2023 dataset and compare experience levels from entry-level to senior positions.

The dataset used in this project is Data Science Salaries 2023.

Visualization 1: Boxplot of Data Science Salaries by Experience Level

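# Load ggplot2; ds_salaries is assumed to be the dataset above, already read into a data frame
library(ggplot2)

ggplot(ds_salaries, aes(x = experience_level, y = salary_in_usd, fill = experience_level)) +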
  geom_boxplot() +
  scale_y_continuous(labels = scales::comma) +
  scale_fill_brewer(palette = "Set3") +
  labs(title = "Boxplot of Data Science Salaries by Experience Level",
       x = "Experience Level",
       y = "Salary in USD") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1, color = "blue"),
        axis.text.y = element_text(color = "green"),
        axis.title.x = element_text(color = "red", size = 12),
        axis.title.y = element_text(color = "purple", size = 12),
        plot.title = element_text(color = "darkorange", size = 16, hjust = 0.5),
        legend.position = "none")

Graph Analysis

The boxplot paints a clear picture: as data scientists rack up experience, their paychecks tend to fatten up. Fresh faces in entry-level roles are on the tighter end of the earning scale, while the seasoned pros at the senior level hit the salary jackpot, with some exceptional standouts pulling in upwards of $400,000. It’s a simple reminder that in data science, experience really does pay.
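The pattern read off the boxplot can also be checked numerically. The following is a minimal sketch, not the author's code: it summarises salary by experience level with dplyr and sorts the groups by median. The data frame `toy_salaries` and its values are made up for illustration (the level labels are hypothetical too); in practice `ds_salaries` would take its place.

```r
library(dplyr)

# Hypothetical toy data standing in for ds_salaries (all values are made up)
toy_salaries <- data.frame(
  experience_level = c("EN", "EN", "MI", "MI", "SE", "SE"),
  salary_in_usd    = c(60000, 80000, 100000, 120000, 150000, 170000)
)

# Summarise salary per experience level, lowest median first
salary_summary <- toy_salaries %>%
  group_by(experience_level) %>%
  summarize(
    n             = n(),
    median_salary = median(salary_in_usd),
    max_salary    = max(salary_in_usd),
    .groups = "drop"
  ) %>%
  arrange(median_salary)

print(salary_summary)
```

A table like this puts numbers next to the medians and upper extremes that the boxplot only shows visually.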

Visualization 2: Interactive Scatter Plot of Salaries Over Time

Graph Analysis

This scatter plot shows how data science salaries vary with experience level across the four years in the dataset. In 2023, mid-level salaries appear to jump, which may point to strong demand for those roles. Entry-level (junior) positions cluster at the lower end, while senior positions scatter widely toward the top, so the pay range widens as experience deepens. Simply put, the chart suggests that in data science, time and expertise can lead to a rewarding financial path.
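A chart along these lines could be built roughly as follows. This is a sketch under assumptions, not the code behind the plot discussed here: it assumes the plotly package is used to make a ggplot2 scatter plot interactive via `ggplotly()`, and it runs on a made-up data frame `toy_salaries` rather than the real dataset.

```r
library(ggplot2)
library(plotly)

# Hypothetical toy data standing in for ds_salaries (all values are made up)
toy_salaries <- data.frame(
  work_year        = c(2020, 2021, 2022, 2023, 2022, 2023),
  experience_level = c("EN", "EN", "MI", "MI", "SE", "SE"),
  salary_in_usd    = c(55000, 65000, 95000, 125000, 150000, 180000)
)

# Static scatter plot: one point per record, jittered horizontally within each year
scatter <- ggplot(toy_salaries,
                  aes(x = work_year, y = salary_in_usd, color = experience_level)) +
  geom_jitter(width = 0.2, height = 0, alpha = 0.6) +
  scale_y_continuous(labels = scales::comma) +
  labs(title = "Data Science Salaries Over Time",
       x = "Year", y = "Salary in USD", color = "Experience Level") +
  theme_minimal()

# Convert the ggplot object into an interactive plotly widget
interactive_scatter <- ggplotly(scatter)
interactive_scatter
```

Jittering only along the x-axis keeps the salary values exact while separating points that share the same year.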

Visualization 3: Animated Bar Chart of Job Counts

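# Load packages for data wrangling, plotting, and animation
library(dplyr)
library(ggplot2)
library(gganimate)

# Count number of positions by year and experience level
job_count_by_year <- ds_salaries %>%
  group_by(work_year, experience_level) %>%
  summarize(job_count = n(), .groups = 'drop')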

# Create an enhanced animated bar chart
p <- ggplot(job_count_by_year, aes(x = reorder(experience_level, job_count), y = job_count, fill = experience_level)) +
  geom_col(show.legend = FALSE) +
  transition_states(work_year, transition_length = 1, state_length = 1) +
  labs(title = 'Data Science Job Growth: {closest_state}',
       x = 'Experience Level', y = 'Number of Positions') +
  scale_fill_brewer(palette = "Dark2") +
  theme_minimal() +
  theme(plot.title = element_text(size = 16, face = "bold", color = "darkblue"),
        axis.title.x = element_text(size = 12, color = "darkgreen"),
        axis.title.y = element_text(size = 12, color = "darkgreen"),
        axis.text.x = element_text(size = 10, color = "black"),
        axis.text.y = element_text(size = 10, color = "black"))

# Animate and save the plot as a GIF
anim <- animate(p, nframes = 100, fps = 10, width = 800, height = 600,
                renderer = gifski_renderer("data_science_salaries_animation.gif"))
anim

Graph Analysis

The third visualization shows how the number of data science positions recorded in the dataset changes over four years, broken down by experience level. Junior roles hold fairly steady, mid-level positions fluctuate, and senior roles grow steadily. This chart provides insight into the evolving job market for data scientists.
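Trends like these are easier to verify from a table of year-over-year changes than from an animation. Here is a minimal sketch, assuming a data frame shaped like `job_count_by_year` (one row per year and experience level). The data frame `toy_counts` and its numbers are made up for illustration and do not come from the dataset.

```r
library(dplyr)

# Hypothetical counts shaped like job_count_by_year (all values are made up)
toy_counts <- data.frame(
  work_year        = c(2021, 2022, 2023, 2021, 2022, 2023),
  experience_level = c("EN", "EN", "EN", "SE", "SE", "SE"),
  job_count        = c(10, 12, 11, 20, 35, 50)
)

# Year-over-year change in job count within each experience level
count_changes <- toy_counts %>%
  group_by(experience_level) %>%
  arrange(work_year, .by_group = TRUE) %>%
  mutate(change = job_count - dplyr::lag(job_count)) %>%
  ungroup()

print(count_changes)
```

The first year of each level has no earlier value to compare against, so its `change` is `NA`.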