I have introduced the term “Data Practitioner” as a generic job descriptor because we have so many different job titles for individuals whose work activities overlap, including Data Scientist, Data Engineer, Data Analyst, Business Analyst, Data Architect, etc. For this story we will answer the question, “How much do we get paid?” Your analysis and data visualizations must address the variation in average salary based on role descriptor and state.

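library(dplyr)    # group_by(), summarise(), left_join()
library(ggplot2)  # plotting; map_data() also needs the maps package installed
library(scales)   # dollar() labels
url <- "https://raw.githubusercontent.com/crystaliquezada/data608_story4/main/Data%20Practitioner%20Salaries%20-%20Sheet1.csv"
data <- read.csv(url)
head(data)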
##              Job State Salary
## 1 Data Scientist    AL  99040
## 2 Data Scientist    AK  91710
## 3 Data Scientist    AZ 112470
## 4 Data Scientist    AR 117250
## 5 Data Scientist    CA 140490
## 6 Data Scientist    CO 120320
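
Before averaging, it can help to confirm that every role covers the same set of states and that no salaries are missing, since gaps would skew the comparisons. The sketch below is one possible check, not part of the original analysis: `check_salary_data` is a hypothetical helper name, and `sample_rows` simply re-enters the six rows printed above so the example runs without a download.

```r
# Hypothetical helper: basic integrity checks for a Job / State / Salary table.
check_salary_data <- function(df) {
  list(
    # Count of missing values in each column
    missing = colSums(is.na(df)),
    # Number of distinct states recorded for each role
    states_per_role = tapply(df$State, df$Job, function(s) length(unique(s))),
    # Number of repeated Job/State combinations
    duplicated_pairs = sum(duplicated(df[, c("Job", "State")]))
  )
}

# The six rows shown by head(data), typed in by hand for illustration
sample_rows <- data.frame(
  Job = rep("Data Scientist", 6),
  State = c("AL", "AK", "AZ", "AR", "CA", "CO"),
  Salary = c(99040, 91710, 112470, 117250, 140490, 120320)
)

checks <- check_salary_data(sample_rows)
print(checks)
```

Running `check_salary_data(data)` on the full table would show whether each role has one salary per state.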

First, let’s define the data practitioner roles:

So, what do we get paid?

role_avg <- data %>%
  group_by(Job) %>%
  summarise(avg_salary = mean(Salary)) %>%
  arrange(avg_salary)

highlight_role <- "Data Architect"  

ggplot(role_avg, aes(x = reorder(Job, avg_salary), y = avg_salary, fill = Job == highlight_role)) +
  geom_col(show.legend = FALSE) +
  geom_text(aes(label = dollar(avg_salary)),
      hjust = -0.05, size = 3.5) +
  coord_flip() +
  scale_fill_manual(values = c("TRUE" = "#2C7BE5", "FALSE" = "gray80")) +
  labs(
    title = "What Data Practitioner Role Gets Paid the Most?",
    subtitle = "Data Architects have the highest average salary",
    x = "",
    y = ""
  ) +
  scale_y_continuous(expand = expansion(mult = c(0, 0.15))) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold", size = 14),
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.text.y = element_text(size = 11)
  )

The average salary of data practitioners ranges from about $65,000 to almost $140,000 across the five roles. Data architects have the highest average salary, which may reflect the importance of data organization within businesses.

Next, we examine how salary varies across the states.

state_avg <- data %>%
  group_by(State) %>%
  summarise(avg_salary = mean(Salary))

state_lookup <- data.frame(
  State = state.abb,
  region = tolower(state.name)
)

state_avg <- state_avg %>%
  left_join(state_lookup, by = "State")
us_map <- map_data("state")

map_data_final <- us_map %>%
  left_join(state_avg, by = "region")

ggplot(map_data_final, aes(x = long, y = lat, group = group, fill = avg_salary)) +
  geom_polygon(color = "white") +
  scale_fill_gradient(
    low = "gray",
    high = "#2C7BE5",
    labels = dollar
  ) +
  labs(
    title = "Average Data Practitioner Salary by State",
    fill = "Salary"
  ) +
  theme_void() +
  theme(plot.title = element_text(face = "bold", size = 14))

Major technology hubs like New York, California, and Washington have higher average salaries across all data practitioner roles (over $110,000). Central states still average over $90,000, but geography is a clear driver of pay.
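
The map relies on matching two-letter state codes to lower-case state names, so any code that is not in R's built-in `state.abb` vector ends up with no region and never reaches the map. As far as we know, the `"state"` map from the maps package also covers only the contiguous states, so Alaska and Hawaii are averaged but not drawn. The sketch below shows one way to list both groups. The helper names are hypothetical, and `sample_codes` is an illustrative vector (the `"DC"` entry is added only to show a non-matching code, not taken from the data).

```r
# Hypothetical helper: codes that have no entry in state.abb,
# and so would get region = NA in the lookup join.
unmatched_codes <- function(codes) {
  setdiff(unique(codes), state.abb)
}

# Hypothetical helper: valid states assumed to be absent from the lower-48 map.
outside_lower48 <- function(codes) {
  intersect(unique(codes), c("AK", "HI"))
}

# Illustrative input; "DC" is an assumed example of a non-state code
sample_codes <- c("AL", "AK", "AZ", "AR", "CA", "CO", "DC")

unmatched_codes(sample_codes)
outside_lower48(sample_codes)
```

Calling both helpers on `data$State` would show which rows, if any, contribute to the averages without appearing on the map.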

Finally, we examine how role and geography work together.

ggplot(data, aes(x = Salary, y = reorder(Job, Salary, FUN = mean))) +
  geom_jitter(alpha = 0.4, color = "gray60", height = 0.2) +
  stat_summary(fun = mean, geom = "point", size = 4, color = "#2C7BE5") +
  labs(
    title = "Salary Variation Across States",
    subtitle = "Each gray point is one state; the blue dot is the role's average",
    x = "",
    y = ""
  ) +
  scale_x_continuous(labels = scales::dollar) +
  theme_minimal() +
  theme(
    # x-axis text is kept so the dollar-formatted salary scale stays readable
    axis.text.y = element_text(size = 11),
    plot.title = element_text(face = "bold")
  )

Data architects and data scientists have two of the highest average salaries among data practitioner roles, and both also show greater salary variability across states. This suggests that pay in the higher-paying roles depends on both position and geography.
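
The variability described above can also be put into numbers rather than read off the jittered points. The sketch below is one possible approach, not the method used for the chart: `salary_spread` is a hypothetical helper that computes the mean, the standard deviation, and the coefficient of variation (standard deviation divided by mean) for each role. The `toy` data frame uses made-up roles and salaries purely to show the output shape, and base R is used so the example has no package dependencies.

```r
# Hypothetical helper: per-role mean, standard deviation, and coefficient of
# variation (sd / mean), sorted from most to least variable.
salary_spread <- function(df) {
  by_role <- split(df$Salary, df$Job)
  out <- data.frame(
    Job = names(by_role),
    avg_salary = unname(vapply(by_role, mean, numeric(1))),
    sd_salary = unname(vapply(by_role, sd, numeric(1)))
  )
  out$cv <- out$sd_salary / out$avg_salary
  out[order(-out$cv), ]
}

# Made-up values for illustration only; not taken from the salary data
toy <- data.frame(
  Job = rep(c("Role A", "Role B"), each = 3),
  State = rep(c("AL", "CA", "CO"), times = 2),
  Salary = c(90000, 140000, 120000, 70000, 75000, 80000)
)

spread <- salary_spread(toy)
print(spread)
```

Applied to the full table with `salary_spread(data)`, a larger `cv` for a role would indicate that its pay depends more heavily on state, relative to its average.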

Conclusion

Data practitioner salaries vary by both technical skill and geographic location, with more technical roles and coastal states commanding higher pay. Ultimately, data architects hold the highest average salary of the five data practitioner roles examined here.

Sources

https://www.bls.gov/oes/2023/may/oes152051.htm#

https://www.zippia.com/salaries/data-engineer/

https://www.zippia.com/advice/data-analyst-salary-by-state/

https://www.ziprecruiter.com/Salaries/What-Is-the-Average-Business-Analyst-Salary-by-State

https://www.ziprecruiter.com/Salaries/What-Is-the-Average-Lead-DATA-Architect-Salary-by-State