Final project

Author

Karen Pesca

The Cost of Learning: Visualizing STEM Tuition Around the World

1. Image

Source: https://www.happyschools.com/benefits-studying-expensive-university/


2. Introduction Essay

The topic of this project is the global cost of higher education in STEM fields, with a focus on tuition and additional expenses such as rent, visa fees, and insurance. This analysis is based on a dataset that compiles international education costs across multiple countries and degree levels. The dataset includes a mix of categorical and numerical variables that help capture both academic and financial aspects of international education. The main variables are:

  • Country: The name of the country where the university is located
  • City: The city where the university is based
  • University: The name of the higher education institution
  • Program: The academic program or major
  • Level: The degree level (e.g., Bachelor or Master)
  • Duration_Years: The number of years the program lasts
  • Tuition_USD: The total tuition cost in U.S. dollars
  • Living_Cost_Index: An indicator of general living expenses in the city
  • Rent_USD: Average monthly rent for a student in U.S. dollars
  • Visa_Fee_USD: One-time visa fee for international students
  • Insurance_USD: Yearly cost of student insurance
  • Exchange_Rate: Used to convert local currencies into U.S. dollars
  • Total_cost: A new variable I created to estimate overall program cost: the sum of tuition, rent, visa fees, and insurance, multiplied by the program's duration in years.

These variables allowed me to go beyond just tuition and explore broader questions about affordability for international students. By including factors like rent and visa costs, the dataset gives a more complete picture of what students can expect to pay while earning a STEM degree abroad.

I found the dataset on Kaggle, which is a data repository platform. However, the dataset does not provide an exact source or official citation. There is no README file or documentation explaining the methodology. Based on the structure and type of information included, it appears the data was collected through web scraping from official sources such as UNESCO, OECD, and university websites. Because no clear methodology is provided, the findings from this dataset are best used for exploratory research rather than for drawing definitive conclusions. It serves as a useful starting point to identify trends, compare costs, and generate questions, but it should not be treated as an exact or official source.

I cleaned the dataset by selecting key variables such as tuition, country, city, and degree level. I also filtered the data to include only programs under $20,000 USD and identified the five most expensive countries by tuition. These steps allowed me to highlight both affordable and high-cost options, making the analysis more relevant for international students. This cleaning process was essential to ensure the dataset was reliable and ready for visual exploration.

This dataset is meaningful because it brings together key cost components that international students must consider when planning to study abroad. It does not only show tuition fees but also includes estimates for rent, visa fees, and insurance, which are often overlooked in basic tuition comparisons. Having this kind of detailed cost breakdown helps create a more realistic picture of what it takes to afford a STEM degree in different parts of the world.

Background Information

The United States is known as a global leader in higher education, offering some of the most prestigious institutions in the world like Harvard, MIT, and Stanford. However, this reputation comes with a high price. According to Niall McCarthy, the U.S. has the highest average tuition fees for public universities at the bachelor’s level, averaging around $8,200 per year. Even though many students receive financial support such as scholarships and loans, tuition at private universities is nearly two and a half times higher. This shows that while the U.S. offers academic excellence and global recognition, affordability remains a big concern for many students and families (McCarthy).

The global distribution of STEM graduates is shifting, creating new dynamics in education and innovation worldwide. While the United States stands out for its top-ranked institutions, its high tuition rates can limit access, especially for students pursuing competitive fields like STEM. According to Brendan Oliss et al., China led the world in 2020 with 3.57 million STEM graduates, followed by India with 2.55 million and the United States with 820,000. New players like Brazil and Mexico have also gained ground, surpassing countries like Iran and Japan in the number of STEM graduates. This shift reflects how countries in Latin America and Asia are expanding their investment in science and technology education (Oliss et al.).

Another key insight is that China also leads in the percentage of total graduates in STEM, with over 40 percent earning degrees in these fields. Russia, Germany, Iran, and India each had more than 30 percent, while the U.S. lagged behind at just 20 percent. Interestingly, both Mexico and France outpaced the U.S. in this metric. This suggests that affordability and access may play a role in helping nations grow their technical workforce and stay competitive globally (Oliss et al.). This is why I wanted to explore these global trends and find a simple, visual way to present this information, so that students around the world can easily see where more accessible and high-impact STEM education opportunities are available.

As an international student, one of the first things I researched before studying abroad was the cost of education. From my experience, this is something most international students are always aware of, because tuition fees can vary a lot depending on the country and your nationality. In the report presented by the OECD in 2024, it is clear that many countries have different policies for foreign students. For example, students from the EU or EEA pay the same fees as national students when studying in countries like Finland and Romania, but those from outside the region often pay much more. In Finland, for instance, non-EU/EEA students pay around $14,000 per year for a master’s program, while in France the difference is about $5,200. On the other hand, countries like Norway, Spain, and Italy charge similar or no fees at all for both local and international students. Even in countries with higher tuition, like Australia and New Zealand, the number of international students remains high, showing that things like education quality, language, and future job opportunities are just as important as cost when deciding where to study (OECD).

Based on my own experience and the results in my analysis, the United States continues to be one of the most expensive countries for STEM programs. For example, I currently pay around $12,000 per year to study data science at the associate level in a community college. Depending on the university, the cost can go even higher. That’s why affordability is such an important factor to consider, especially for international students. Through this project, I wanted to explore ways to show this information clearly and make it easier for students around the world to compare program options and make informed decisions.

3. Load the necessary libraries and upload the dataset

# Load Libraries

library(tidyverse)
library(geosphere)
library(tidygeocoder)
library(leaflet)
library(GGally)
library(ggfortify)
library(purrr)
library(highcharter)

# Upload dataset

setwd("/Users/karenlizethpp/Library/Mobile Documents/com~apple~CloudDocs/Data 110")
eduinternational <- read_csv("International_Education_Costs.csv")
head(eduinternational)
# A tibble: 6 × 12
  Country   City      University        Program Level Duration_Years Tuition_USD
  <chr>     <chr>     <chr>             <chr>   <chr>          <dbl>       <dbl>
1 USA       Cambridge Harvard Universi… Comput… Mast…              2       55400
2 UK        London    Imperial College… Data S… Mast…              1       41200
3 Canada    Toronto   University of To… Busine… Mast…              2       38500
4 Australia Melbourne University of Me… Engine… Mast…              2       42000
5 Germany   Munich    Technical Univer… Mechan… Mast…              2         500
6 Japan     Tokyo     University of To… Inform… Mast…              2        8900
# ℹ 5 more variables: Living_Cost_Index <dbl>, Rent_USD <dbl>,
#   Visa_Fee_USD <dbl>, Insurance_USD <dbl>, Exchange_Rate <dbl>

4. Cleaning process

Overall, my dataset looks pretty clean, but I would like to remove some variables that I will not use, namely Exchange_Rate and Living_Cost_Index. I will also create a new variable with the mutate() function that computes the total cost of each program from tuition, rent, insurance, and visa fees.

#Sum to check if there are any NA values in the entire dataset
sum(is.na(eduinternational))
[1] 0
# Remove the Exchange_Rate and Living_Cost_Index variables
eduinternational2 <- eduinternational %>%
  select(-Exchange_Rate, -Living_Cost_Index)

# Create Total Cost of Study by institution and program
edutotal <- eduinternational2 %>%
  mutate(Total_cost = (Tuition_USD + Rent_USD + Insurance_USD + Visa_Fee_USD)*Duration_Years)
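The mutate() call above multiplies every component by Duration_Years. Because the variable list describes Rent_USD as monthly and Visa_Fee_USD as a one-time fee, a variant that annualizes rent and counts the visa once may be worth comparing. The sketch below is only an illustration on a made-up two-row table: the column name Total_cost_alt and all toy values are assumptions, tuition is treated as a yearly amount exactly as in the report's formula, and none of this is used in the analysis that follows.

```r
library(dplyr)

# Hypothetical toy rows (values invented for illustration only)
toy <- data.frame(
  Tuition_USD    = c(10000, 500),
  Rent_USD       = c(1000, 800),  # monthly
  Insurance_USD  = c(600, 400),   # yearly
  Visa_Fee_USD   = c(200, 100),   # one-time
  Duration_Years = c(2, 3)
)

toy_costs <- toy %>%
  mutate(
    # Formula used in this report
    Total_cost = (Tuition_USD + Rent_USD + Insurance_USD + Visa_Fee_USD) * Duration_Years,
    # Assumed alternative: 12 months of rent per year, visa paid once
    Total_cost_alt = (Tuition_USD + 12 * Rent_USD + Insurance_USD) * Duration_Years + Visa_Fee_USD
  )

toy_costs
```

On these toy rows the alternative estimate is much larger because a full year of rent is counted, which suggests the report's Total_cost is best read as a relative index rather than a budget figure.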

5. Linear regression model

ggpairs(edutotal, columns = c("Tuition_USD", "Rent_USD", "Insurance_USD", "Visa_Fee_USD", "Total_cost"))

Before building a final model, I tested a few regressions where I used tuition, rent, insurance, and visa fees to predict the total cost of education. These variables were all statistically significant, and the adjusted R² was very high (around 0.79), which shows that the model was very good at predicting total cost. However, I realized this result was expected because Total_cost is calculated directly from those same variables. I still ran these models to confirm the relationships and see how each variable contributed, but I understood that this type of regression would not give me new insights, since it was based on a formula I had already defined.
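One way to see why such a regression fits well, but not perfectly, is to simulate it. The sketch below uses invented data (not the Kaggle dataset) and the same Total_cost formula. Because the formula multiplies by Duration_Years, which is not one of the predictors, the R² stays below 1; once the duration is divided out, the four components reproduce the cost exactly. The data frame sim and all ranges are assumptions made for illustration.

```r
set.seed(1)
n <- 200

# Simulated programs (ranges invented for illustration only)
sim <- data.frame(
  Tuition_USD    = runif(n, 0, 50000),
  Rent_USD       = runif(n, 300, 2000),
  Insurance_USD  = runif(n, 200, 1500),
  Visa_Fee_USD   = runif(n, 50, 500),
  Duration_Years = sample(1:4, n, replace = TRUE)
)
sim$Total_cost <- with(
  sim,
  (Tuition_USD + Rent_USD + Insurance_USD + Visa_Fee_USD) * Duration_Years
)

# Regressing Total_cost on its own components: good fit, but not exact,
# because the duration multiplier is left out of the predictors
fit_total <- lm(Total_cost ~ Tuition_USD + Rent_USD + Insurance_USD + Visa_Fee_USD,
                data = sim)
r2_total <- summary(fit_total)$r.squared

# Dividing out the duration makes the response an exact sum of the predictors
sim$Cost_per_year <- sim$Total_cost / sim$Duration_Years
fit_year <- lm(Cost_per_year ~ Tuition_USD + Rent_USD + Insurance_USD + Visa_Fee_USD,
               data = sim)
max_resid <- max(abs(residuals(fit_year)))

r2_total   # below 1
max_resid  # numerically zero
```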

In this model, I used rent, insurance, and visa fees to predict tuition. All three variables are statistically significant and have a positive relationship with tuition, meaning that in places where rent, insurance, or visa costs are higher, tuition also tends to be more expensive. Visa fees have the largest coefficient and t-value, followed closely by rent, which makes sense because living in expensive cities is often linked to higher education costs. The adjusted R² is 0.61, which means the model explains a good portion of the variation in tuition, but there are still other factors that might affect the price, like university reputation or country policies. Overall, this model shows that living-related expenses can be good indicators of how much students might expect to pay for tuition.

model1 <- lm(Tuition_USD ~ Rent_USD + Insurance_USD + Visa_Fee_USD, data = edutotal)

summary(model1)

Call:
lm(formula = Tuition_USD ~ Rent_USD + Insurance_USD + Visa_Fee_USD, 
    data = edutotal)

Residuals:
   Min     1Q Median     3Q    Max 
-34497  -5509   1363   6072  25852 

Coefficients:
                Estimate Std. Error t value Pr(>|t|)    
(Intercept)   -12194.465    929.833 -13.115  < 2e-16 ***
Rent_USD          14.778      1.176  12.566  < 2e-16 ***
Insurance_USD      9.424      1.776   5.308  1.4e-07 ***
Visa_Fee_USD      37.743      2.693  14.014  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 10370 on 903 degrees of freedom
Multiple R-squared:  0.6105,    Adjusted R-squared:  0.6092 
F-statistic: 471.8 on 3 and 903 DF,  p-value: < 2.2e-16
autoplot(model1, 1:4, nrow=2, ncol=2)

Model Equation

Tuition_USD = −12194.47 + 14.78 × Rent_USD + 9.42 × Insurance_USD + 37.74 × Visa_Fee_USD

This equation means that, holding the other predictors constant, each additional dollar in monthly rent is associated with an expected increase of approximately $14.78 in tuition, on average. Similarly, every additional dollar in annual insurance cost is associated with a $9.42 increase in tuition. The largest coefficient belongs to visa fees: each additional dollar in visa cost is associated with a $37.74 increase in tuition. The intercept of -12,194.47 is the expected tuition when all three variables are zero. While this scenario isn’t realistic in practice, the intercept is part of the model’s calculation.
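As a worked example, the coefficients printed by summary(model1), whose response is Tuition_USD, can be plugged in by hand. The city profile below (rent, insurance, and visa values) is hypothetical and chosen only to show the arithmetic.

```r
# Coefficients copied from the summary(model1) output above
coefs <- c(intercept = -12194.465, rent = 14.778, insurance = 9.424, visa = 37.743)

# Hypothetical inputs: $1,000 monthly rent, $800 yearly insurance, $200 visa fee
example <- c(rent = 1000, insurance = 800, visa = 200)

# Intercept plus each coefficient times its matching input
predicted_tuition <- coefs[["intercept"]] +
  sum(coefs[c("rent", "insurance", "visa")] * example)

predicted_tuition  # about 17671
```

With the residual standard error of roughly $10,370 reported above, a single prediction like this one carries a wide margin of uncertainty.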

P-Values and Coefficients

All three predictors, Rent_USD, Insurance_USD, and Visa_Fee_USD, have very small p-values (< 0.005), indicating that they are statistically significant. This confirms that each variable plays an important role in predicting tuition. Among them, visa fees have the largest coefficient and t-value, followed by rent and insurance.
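Raw coefficients depend on the units and spread of each predictor, so ranking effects by coefficient size alone can be misleading. A common complement is to refit the model on standardized variables. The sketch below does this on simulated data (not the report's dataset); the data frame sim and the chosen slopes are assumptions made for illustration.

```r
set.seed(42)
n <- 300

# Simulated predictors and response (invented for illustration only)
sim <- data.frame(
  Rent_USD      = runif(n, 300, 2000),
  Insurance_USD = runif(n, 200, 1500),
  Visa_Fee_USD  = runif(n, 50, 500)
)
sim$Tuition_USD <- 10 * sim$Rent_USD + 5 * sim$Insurance_USD +
  30 * sim$Visa_Fee_USD + rnorm(n, sd = 5000)

# Model on the original scale
raw_fit <- lm(Tuition_USD ~ Rent_USD + Insurance_USD + Visa_Fee_USD, data = sim)

# Same model after centering and scaling every column to unit variance
std_sim <- as.data.frame(scale(sim))
std_fit <- lm(Tuition_USD ~ Rent_USD + Insurance_USD + Visa_Fee_USD, data = std_sim)

raw_beta <- coef(raw_fit)[-1]  # drop the intercept
std_beta <- coef(std_fit)[-1]

# Each standardized slope equals the raw slope times sd(x) / sd(y)
rescaled <- raw_beta * sapply(sim[names(raw_beta)], sd) / sd(sim$Tuition_USD)

round(rbind(raw = raw_beta, standardized = std_beta), 3)
```

A predictor with a large raw coefficient but a narrow range, as visa fees may have, can end up with a smaller standardized effect than one with a wide range, such as rent.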

Adjusted R-squared Value

The model has an Adjusted R-squared of 0.6092, which means that approximately 60.9% of the variation in tuition can be explained by the combination of rent, insurance, and visa fees. For a cost-related model with only three predictors, this represents a moderate to strong level of explanatory power. It shows that these living-related expenses help estimate how expensive a program might be for international students.

Diagnostic Plots

The Residuals vs Fitted plot shows that the residuals are mostly centered around zero, but there is a slight curve, suggesting a possible nonlinear pattern or other variables not included in the model. This may indicate that some parts of tuition are influenced by factors outside of rent, insurance, and visa fees. The Normal Q-Q plot indicates that residuals are mostly normally distributed, with a few deviations at the tails. This suggests that the model assumptions are generally met, though there may be a few outliers. The Scale-Location plot shows a slight upward trend, meaning that variance increases with fitted values, a common issue called heteroscedasticity. It’s not severe but could affect predictions for the most expensive programs.
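The visual impression from a Scale-Location plot can be backed by a rough numeric check: the rank correlation between the absolute residuals and the fitted values. The sketch below runs that check on simulated data whose noise grows with the predictor; it is an informal diagnostic chosen for illustration, not a formal test, and it is not part of the original analysis.

```r
set.seed(7)
n <- 500

# Simulated data where the noise standard deviation grows with x
x <- runif(n, 1, 10)
y <- 2 + 3 * x + rnorm(n, sd = x)

fit <- lm(y ~ x)

# A clearly positive value suggests the residual spread increases
# with the fitted values (heteroscedasticity)
rho <- cor(abs(residuals(fit)), fitted(fit), method = "spearman")
rho
```

Applied to model1, the same two lines with residuals(model1) and fitted(model1) would give one number to read alongside the plot.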

Conclusion

This analysis shows that rent, insurance, and visa fees are meaningful predictors of tuition for international programs. All three variables were statistically significant and contributed to explaining tuition differences between programs. The model explains about 61% of the variation in tuition, making it a reasonable way to estimate how living-related costs relate to a student’s financial burden. Although there are a few outliers and mild variance concerns, the model is still useful for identifying general trends. Schools or countries with higher rent, insurance, and visa fees tend to have higher tuition overall.

6. Visualizations

1. Alluvial Chart:

I decided to use an alluvial graph because I saw one in Professor Saidi’s class, where she explained it clearly and used it to group information in a visual way. That caught my attention and inspired me to try it. I created a dynamic version using the highcharter package in R, which generates interactive alluvial-style charts (technically known as Sankey diagrams in Highcharts). At first, I worked only with data from the United States, but after analyzing the graph, I realized it wasn’t very meaningful. Since the cost of education in the U.S. is generally much higher, it didn’t allow for a fair comparison. For that reason, I decided to expand the chart to include universities from around the world, organized by degree level and total cost, classified as high, medium, or low. This visualization helps present a global perspective, showing key details such as country, city, and total estimated cost, including rent, tuition, insurance, and visa. It offers a way to make more informed and thoughtful decisions when considering international education options.

Prepare the Data for the Alluvial Chart

In this step, I got the data ready to create an alluvial chart, which helps to show the connection between the degree level and the total cost group (low, medium, or high). First, I created a new variable called cost_group by grouping programs based on their total cost: programs costing $40,000 or less were labeled as “Low”, those above $40,000 and up to $80,000 as “Medium”, and anything over $80,000 as “High”. I also renamed each degree level to include a number prefix (like “1_Master”) to control the layout of the flow in the chart.


# Add prefixes to control layout
flow_data <- edutotal %>%
  mutate(cost_group = case_when(
    Total_cost <= 40000 ~ "2_Low",
    Total_cost <= 80000 ~ "1_Medium",
    TRUE ~ "0_High"
  ),
  from = case_when(
    Level == "Bachelor" ~ "2_Bachelor",
    Level == "Master" ~ "1_Master",
    Level == "PhD" ~ "0_PhD",
    TRUE ~ Level
  ),
  to = cost_group,
  tooltip = paste0(
    "<b>University:</b> ", University, "<br>",
    "<b>City:</b> ", City, "<br>",
    "<b>Country:</b> ", Country, "<br>",
    # ChatGPT helped with the tooltip formatting and with ordering the level and cost axes in this Highcharts alluvial chart.
    "<b>Total Cost:</b> $", formatC(Total_cost, format = "f", big.mark = ",", digits = 0)
  )) %>%
  count(from, to, tooltip, name = "weight") %>%
  mutate(custom = purrr::map(tooltip, ~ list(tooltip = .))) %>%
  select(from, to, weight, custom)
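To check how the boundary values are grouped, the same thresholds can be written with base R's cut(). This is an equivalent sketch for inspection only, not the code used for the chart; the totals vector holds invented values.

```r
# Hypothetical totals, including values exactly on the thresholds
totals <- c(25000, 40000, 40001, 80000, 95000)

# right = TRUE closes each interval on the right, matching the `<=`
# comparisons in the case_when() call above
cost_group <- cut(
  totals,
  breaks = c(-Inf, 40000, 80000, Inf),
  labels = c("Low", "Medium", "High"),
  right  = TRUE
)

as.character(cost_group)
# "Low" "Low" "Medium" "Medium" "High"
```

A program costing exactly $40,000 lands in "Low" and one costing exactly $80,000 lands in "Medium".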

Create Custom Labels

In this step, I created custom labels to make the alluvial chart easier to read. Earlier, I added numeric prefixes to the degree levels and cost groups (like “1_Master” or “2_Bachelor”) to help control the chart layout. Now, I removed those prefixes so only the clean label (like “Master”) would appear on the chart.

To do this, I first created a list of all the unique values in the from and to columns of my flow data. Then I used the gsub() function to remove the numbers and underscores at the beginning of each label. Finally, I stored the cleaned-up labels in a new data frame called node_order, which I use to display the final chart with simple, clear names for each category.

This step helped make the final visualization more “user-friendly” by showing just the level and cost group without any extra formatting symbols.

# Create custom labels
node_ids <- unique(c(flow_data$from, flow_data$to))
node_labels <- gsub("^\\d_", "", node_ids)  # Remove the numeric prefix added earlier.

node_order <- data.frame(
  id = node_ids,
  name = node_labels,
  stringsAsFactors = FALSE
)
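A quick check of the prefix-stripping pattern on a few sample ids, as a minimal sketch; the vector node_ids_example is made up to mirror the ids created earlier.

```r
# Sample ids in the same "digit_label" form used for the flow data
node_ids_example <- c("2_Bachelor", "1_Master", "0_PhD", "2_Low", "1_Medium", "0_High")

# "^\\d_" matches one digit followed by an underscore at the start of the string
clean_labels <- gsub("^\\d_", "", node_ids_example)

clean_labels
# "Bachelor" "Master" "PhD" "Low" "Medium" "High"
```

The pattern only removes a single leading digit, so it would need adjusting (for example to "^\\d+_") if more than ten categories were ever prefixed.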

HighCharts Sankey Diagram

I created a Sankey diagram to group and show the flow between university degree levels and cost categories in a dynamic way. It helps visualize how different degrees like Bachelor, Master and PhD are distributed across low, medium, and high total costs, including their locations and cost details, making it easier to compare options across the world.

highchart() %>%
  hc_chart(type = "sankey") %>%
  hc_add_series(
    data = list_parse(flow_data),
    nodes = list_parse(node_order),
    type = "sankey",
    name = "Global Degree Level to Cost Group",
    dataLabels = list(enabled = TRUE),
    colorByPoint = TRUE,
    colors = c("#3247e3", "#F1948A", "#82E0AA", "#e66eeb", "#60f0f0", "#edd147")
  ) %>%
  hc_tooltip(
    useHTML = TRUE,
    formatter = JS("function() {
      return this.point.custom.tooltip;
    }")
  ) %>%
  hc_title(text = "Global Flow of University Degree Levels to Cost Categories with Location and Tuition Information",
           align = "center") %>%
  hc_credits(
    enabled = TRUE,
    text = "Source: Cost of International Education, Comparative Financial Dataset for Global Study",
    href = "https://www.kaggle.com/datasets/adilshamim8/cost-of-international-education",
    style = list(fontSize = "10px", color = "black", fontStyle = "italic")
  ) %>%
  hc_add_theme(hc_theme_economist())

2. Bar Chart:

In this bar chart, I calculated the average tuition for each country and identified the top five with the highest overall tuition costs. Then, I grouped the data by degree level (Bachelor, Master, and PhD) to compare how tuition varies within each of those countries. I used a grouped bar chart in Highcharter to present the results clearly, making it easier to observe the cost differences across academic levels.

# Obtain average tuition for each country
top_countries <- edutotal %>%
  group_by(Country) %>%
  summarise(Avg_Tuition = mean(Tuition_USD, na.rm = TRUE)) %>%
  arrange(desc(Avg_Tuition)) %>%
  slice(1:5) %>%
  pull(Country)

# Filter edutotal for those top countries and obtain average per level
chart_data <- edutotal %>%
  filter(Country %in% top_countries) %>%
  group_by(Country, Level) %>%
  summarise(Tuition = mean(Tuition_USD, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = Level, values_from = Tuition) %>%
  pivot_longer(cols = c("Bachelor", "Master", "PhD"), names_to = "Level", values_to = "Tuition")

# Plot with highcharter
highchart() %>%
  hc_chart(type = "column") %>%
  hc_title(text = "Top 5 Countries with Highest Average Tuition (Grouped by Level)") %>%
  hc_xAxis(categories = unique(chart_data$Country)) %>%
  hc_yAxis(title = list(text = "Average Tuition (USD)")) %>%
  hc_plotOptions(column = list(grouping = TRUE)) %>%
  hc_add_series(name = "Bachelor",
                data = chart_data %>% filter(Level == "Bachelor") %>% pull(Tuition),
                color = "#1f77b4") %>%
  hc_add_series(name = "Master",
                data = chart_data %>% filter(Level == "Master") %>% pull(Tuition),
                color = "#ff7f0e") %>%
  hc_add_series(name = "PhD",
                data = chart_data %>% filter(Level == "PhD") %>% pull(Tuition),
                color = "#2ca02c") %>%
  hc_tooltip(shared = TRUE, valuePrefix = "$", valueDecimals = 0) %>%
  hc_caption(text = "Source: Cost of International Education, Comparative Financial Dataset for Global Study") %>%
  hc_add_theme(hc_theme_flat())
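The pivot_wider() and pivot_longer() round trip above fills in country and level combinations that have no programs, so every series has one value per country. The sketch below shows tidyr::complete() as a more direct way to get the same padding. The toy data frame is invented, and this is an alternative for comparison, not the code used for the chart.

```r
library(dplyr)
library(tidyr)

# Hypothetical programs: country "B" has no Master or PhD rows
toy <- data.frame(
  Country     = c("A", "A", "B"),
  Level       = c("Bachelor", "Master", "Bachelor"),
  Tuition_USD = c(10000, 12000, 8000)
)

toy_avg <- toy %>%
  group_by(Country, Level) %>%
  summarise(Tuition = mean(Tuition_USD), .groups = "drop") %>%
  # Add a row (with NA tuition) for every missing Country x Level pair
  complete(Country, Level = c("Bachelor", "Master", "PhD"))

toy_avg
```

Naming the three levels explicitly keeps the result at one row per country and level even when a level is absent from the filtered data.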

3. Maps

Heatmap-Style Map (Blue Scale by Tuition)

In this map, I used a blue heatmap-style color palette to show tuition costs across different countries. The darker the blue, the cheaper the program. I used the YlGnBu color scale in reverse, so $0 tuition programs appear in dark blue and the most expensive ones, closer to the $15,000 upper end of the scale, appear in the lightest shades. All the bubbles are the same size, so the color is what tells the story. This helped me highlight the countries where international students might find more affordable STEM options.

# Obtain the 300 cheapest programs worldwide (across all levels)
cheapest300 <- edutotal %>%
  arrange(Tuition_USD) %>%
  slice(1:300)
# Geocode: create "location" and get lat/lon
geo_cheapest <- cheapest300 %>%
  mutate(location = paste(City, Country, sep = ", ")) %>%
  geocode(location, method = "osm", lat = latitude, long = longitude)
Passing 240 addresses to the Nominatim single address geocoder
Query completed in: 243.9 seconds
# Define color palette for Tuition ($0 to $15K), reversed so cheaper = darker
tuition_pal <- colorNumeric("YlGnBu", domain = c(0, 15000), reverse = TRUE)

# Create the Leaflet map
tuition_map <- leaflet(geo_cheapest) %>%
  addProviderTiles("Stadia.OSMBright") %>%
  addCircleMarkers(
    lng = ~longitude,
    lat = ~latitude,
    radius = 9,  # Fixed size so all are visible
    color = ~tuition_pal(Tuition_USD),
    stroke = TRUE,
    weight = 1,
    fillOpacity = 0.85,
    popup = ~paste0(
      "<b>University:</b> ", University, "<br>",
      "<b>Program:</b> ", Program, "<br>",
      "<b>Level:</b> ", Level, "<br>",
      "<b>City:</b> ", City, "<br>",
      "<b>Country:</b> ", Country, "<br>",
      "<b>Tuition:</b> $", Tuition_USD
    )
  )

# Create custom HTML legend for $0–$15,000 tuition
legend_html <- "<div style='padding:6px; background:white; font-size:12px;'>
  <b>Tuition Range (USD)</b><br>
  <div style='width:250px; height:20px; background: linear-gradient(to right, #081d58, #225ea8, #41b6c4, #c7e9b4, #ffffd9); border: 1px solid gray;'></div>
  <div style='display: flex; justify-content: space-between;'>
    <span>$0</span><span>$3.75K</span><span>$7.5K</span><span>$11.25K</span><span>$15K</span>
  </div>
</div>"

# Add the legend to the map
tuition_map <- tuition_map %>%
  addControl(html = legend_html, position = "bottomleft")

# Display the map
tuition_map
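One detail worth checking with a fixed palette domain: leaflet's colorNumeric() returns its na.color (gray by default) for values outside the domain, along with a warning. The sketch below illustrates this and one possible workaround, clamping values to the top of the scale before coloring. It assumes some programs might exceed $15,000, which the report does not confirm; the tuition vector is invented.

```r
library(leaflet)

# Same palette definition as in the map code above
pal <- colorNumeric("YlGnBu", domain = c(0, 15000), reverse = TRUE)

# Hypothetical tuition values; the last one lies outside the domain
tuition <- c(0, 7500, 15000, 18000)

# Out-of-domain values are treated as NA and get the default gray na.color
outside <- suppressWarnings(pal(tuition))

# Assumed workaround: cap values at the upper end of the scale first
clamped <- pal(pmin(tuition, 15000))

data.frame(tuition, outside, clamped)
```

Clamping keeps every marker on the color scale, at the cost of showing anything above $15,000 in the same shade as $15,000 itself.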

Color-Coded Map by Degree Level

In this second map, I used different colors to represent each degree level. Blue is for Bachelor’s, orange for Master’s, and green for PhD. All the bubbles are the same size, and this time the color tells you what type of program you’re looking at. This view makes it easier to compare which countries offer more programs at each level, and it matches the color scheme I used in the bar chart for consistency.

# Assign custom color by degree level
geo_cheapest <- geo_cheapest %>%
  mutate(Level_Color = case_when(
    Level == "Bachelor" ~ "#1f77b4",  # Blue
    Level == "Master"   ~ "#ff7f0e",  # Orange
    Level == "PhD"      ~ "#2ca02c",  # Green
    TRUE ~ "#999999"
  ))

# Create the level-coded map
tuition_map_level <- leaflet(geo_cheapest) %>%
  addProviderTiles("Stadia.OSMBright") %>%
  addCircleMarkers(
    lng = ~longitude,
    lat = ~latitude,
    radius = 9,  # Fixed size
    color = ~Level_Color,
    stroke = TRUE,
    weight = 1,
    fillOpacity = 0.85,
    popup = ~paste0(
      "<b>University:</b> ", University, "<br>",
      "<b>Program:</b> ", Program, "<br>",
      "<b>Level:</b> ", Level, "<br>",
      "<b>City:</b> ", City, "<br>",
      "<b>Country:</b> ", Country, "<br>",
      "<b>Tuition:</b> $", Tuition_USD
    )
  ) %>%
  addControl(html = "<div style='padding:6px; background:white; font-size:12px;'>
    <b>Program Level</b><br>
    <div><span style='color:#1f77b4;'>&#9679;</span> Bachelor's</div>
    <div><span style='color:#ff7f0e;'>&#9679;</span> Master's</div>
    <div><span style='color:#2ca02c;'>&#9679;</span> PhD</div>
  </div>", position = "bottomleft")

tuition_map_level

7. Final Essay

This project allowed me to visualize and compare the cost of STEM education programs across different countries, using tuition and other important expenses like rent, visa fees, and insurance. Through the maps and alluvial chart, I was able to identify which countries offer more affordable programs and how degree levels relate to total cost. One interesting pattern was seeing how some emerging economies, like Mexico and Brazil, are becoming more competitive by offering low-cost STEM programs.

As shown in the background research, the United States has the highest average tuition for public institutions, but it also has some of the most prestigious universities in the world, like Harvard, MIT, and Stanford. In general, the U.S. continues to offer high-quality education, even though affordability is a major concern. On the maps, I noticed that most of the free or low-cost programs are concentrated in South America, Europe, and Southeast Asia. These regions appear to offer strong opportunities for international students looking for quality education at a lower price.

One pattern I noticed in the bar chart is the contrast between degree levels. In many cases, bachelor’s programs appear more expensive than master’s or PhDs. This could be because bachelor’s degrees typically last four years, while master’s and PhD programs are often shorter. However, in some cases, especially among the highest tuition programs, we also see that master’s degrees are more expensive, even when they only last one or two years. This shows that cost is not always related to the program’s duration, and that tuition differences may depend more on the country, institution, or other variables.

While the project was successful, there were a few things I wish I could have added. I would have liked to include more program-level details, like university ranking, to go deeper in the analysis. Also, I wanted to highlight specific case studies, such as India, which shows strong growth in the STEM field and could be interesting to compare with developed countries like the United States or Germany. In the future, I would like to focus more on individual countries or specific types of STEM programs, especially in comparing trends between emerging and developed economies.

Works Cited

  • McCarthy, Niall. “The U.S. Leads the World in Tuition Fees [Infographic].” Forbes, 12 Sept. 2017, https://www.forbes.com/sites/niallmccarthy/2017/09/12/the-u-s-leads-the-world-in-tuition-fees-infographic/.

  • Oliss, Brendan, et al. “The Global Distribution of STEM Graduates: Which Countries Lead the Way?” Center for Security and Emerging Technology (CSET), 27 Nov. 2023, https://cset.georgetown.edu/article/the-global-distribution-of-stem-graduates-which-countries-lead-the-way/.

  • OECD. Education at a Glance 2024: OECD Indicators. OECD Publishing, 2024, https://doi.org/10.1787/c00cad36-en.

  • “Data Visualisation: Alluvial Diagram vs. Sankey Diagram.” Analytics Vidhya, 22 June 2022, https://www.analyticsvidhya.com/blog/2022/06/data-visualisation-alluvial-diagram-vs-sankey-diagram/.

  • BlackLabel. “Highcharts Demo.” JSFiddle, https://jsfiddle.net/BlackLabel/romtnqx5/.

  • Techanswers88. “Easy Sankey Diagram in Highcharter Using R.” RPubs, https://rpubs.com/techanswers88/sankey.

  • OpenAI. ChatGPT. Prompt: how to format tooltips and align the axes in my Sankey diagram code.

  • Bringley, Joe. “Highcharts Sankey Diagram in R.” Stack Overflow, 8 May 2018, https://stackoverflow.com/questions/50242318/highcharts-sankey-diagram-in-r.

  • Bajak, Aleszu. “How to Geocode a CSV of Addresses in R.” Storybench, 13 June 2020, https://www.storybench.org/geocode-csv-addresses-r/.

  • Pesca Polanco, Karen. “Geographic Distribution of CGMJCI in Silver Spring, MD.” https://rpubs.com/kpescapo/1300606.