Introduction
I’ve always had a deep appreciation for education, instilled by my
father who consistently reminded me to be grateful for my education and
the importance of educating the youth to create a better society for
each generation to come. My interest in sustainability has grown over
the years as the environmental situation has become more and more dire.
I wanted to find a topic that combined both of these interests and
attempted to uncover relationships between education and environmental
variables across the world.
My dataset features over 3,200 observations and 54 variables, both
categorical and quantitative. The data was obtained from Kaggle and was
compiled exclusively from World Bank and UN sources. It spans 2000 to
2018 and tracks 173 countries against sustainability metrics. My
analysis focuses on 7 educational variables and 7 environmental
variables.
Educational variables:
- Out of School (% Primary Age). Percentage of primary-aged children
not enrolled in school. Higher values indicate barriers to basic
education—linked to lower future awareness of health, environment, and
civic issues.
- Compulsory Education (Years). Legal minimum number of years children
must attend school.
- Primary Completion Rate (% of Age Group). Percentage of students
completing the final grade of primary school. Indicates sustained
engagement in school; a proxy for basic literacy.
- Pre-primary Enrollment (% Gross). Percentage of children enrolled in
early childhood education.
- Primary Enrollment (% Gross). Percentage of children enrolled in
primary school (regardless of age). Measures access to the most basic
level of education.
- Secondary Enrollment (% Gross). Percentage of children enrolled in
secondary education (regardless of age). Reflects more advanced
education, linked to critical thinking, civic engagement, and
environmental literacy.
- Pupil-Teacher Ratio (Primary). Number of students per primary school
teacher. Lower ratios imply higher teaching quality.
Environmental variables:
- Adjusted Net Savings (Excl. Particulate, % of GNI). Net savings
after accounting for physical and natural capital losses, excluding
particulate pollution. Positive values suggest sustainable investment;
negative values signal environmental degradation or unsustainable
economics.
- CO2 Damage (% of GNI). Estimated economic damage from carbon
emissions. Reflects the burden of climate-related costs; high values
imply weak environmental policies or high fossil fuel reliance.
- Resource Depletion (% of GNI). Cost of depleting natural resources
like oil, minerals, or forests. Indicates whether economic growth is
coming at the expense of natural capital; lower is better.
- Particulate Emission Damage (% of GNI). Cost of health and
productivity losses from air pollution.
- Net Forest Depletion (% of GNI). Economic cost of unsustainable
forest harvesting. Reflects deforestation beyond natural regrowth; tied
to logging, agriculture, and regulatory capacity.
- Renewable Energy Consumption (% of Total Final Energy). Share of
renewables in overall energy use. Higher values signal cleaner national
energy portfolios.
- Renewable Electricity Output (% of Total Electricity). Percentage of
electricity generated from renewable sources. Measures green energy
development.
Now let’s dive into some analysis!
1. Secondary School Enrollment v. Renewable Energy Consumption:
Bubble Scatter Plot
Do More Educated Populations Adopt More Renewable Energy?
This bubble scatter plot shows the relationship between secondary
school enrollment and renewable energy consumption, with bubble size
proportional to population. There is an inverse relationship
between secondary school enrollment and renewable energy consumption, as
the downward-sloping trendline shows. This suggests that countries
with higher education levels are associated with a lower use of
renewable energy. Countries with the highest renewable energy
consumption tend to be less populous, while low renewable energy
consumers span all sizes of population. So, do more educated populations
adopt more renewable energy? No, this data suggests that more educated
populations don’t show a strong tendency to consume more renewable
energy. This could potentially be because more industrialized (and
educated) countries still rely heavily on fossil fuels.
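The "downward sloping trendline" reading can be checked numerically by fitting a simple linear model and looking at the sign of the slope. The sketch below is only an illustration: the data frame `toy`, its column names, and every value in it are made up and do not come from the Kaggle dataset.

```r
library(ggplot2)

# Toy data standing in for the real dataset; all values are invented.
toy <- data.frame(
  secondary_enrollment = c(35, 48, 60, 72, 85, 95, 102, 110),
  renewable_share      = c(80, 70, 55, 40, 30, 22, 18, 15),
  population_millions  = c(12, 45, 8, 200, 60, 5, 80, 330)
)

# Slope of the linear trend; a negative value means an inverse relationship.
trend <- lm(renewable_share ~ secondary_enrollment, data = toy)
slope <- unname(coef(trend)["secondary_enrollment"])

# Bubble scatter plot: point size mapped to population, linear trendline on top.
bubble_plot <- ggplot(toy, aes(x = secondary_enrollment, y = renewable_share)) +
  geom_point(aes(size = population_millions), alpha = 0.6) +
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(
    x = "Secondary Enrollment (% Gross)",
    y = "Renewable Energy Consumption (% of Total Final Energy)",
    size = "Population (millions)"
  ) +
  theme_minimal()
```

Passing `formula = y ~ x` explicitly also keeps `geom_smooth()` from printing its "using formula" message when the report is knitted.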
2. Environmental & Educational Variable Correlation Matrix
Heatmap
What environmental and educational variables are highly
correlated?

This is a Lower Triangle Correlation Heatmap that displays the
correlation coefficients between educational and environmental variables
in 2015. Color intensity and numeric labels represent the strength and
direction of relationships. Red signifies a strong positive correlation
while blue signifies a strong negative correlation.
Three key positive relationships are evident. The 0.84 correlation
between particulate damage & pupil-teacher ratio suggests that
countries with more crowded classrooms (lower teaching quality)
experience greater economic damage from air pollution. The 0.71
correlation between particulate damage & out-of-school rate implies
that in places where more children are out of school, the economic cost
of air pollution tends to be higher. The 0.55 correlation between forest
depletion & pupil-teacher ratio suggests that poorer education quality
may be associated with a higher cost of unsustainable forest harvesting.
Five key negative relationships are evident. The -0.84 correlation
between particulate damage & secondary enrollment implies that
countries with higher secondary school enrollment suffer less from
particulate pollution. The -0.81 correlation between particulate damage
& primary completion and the -0.70 correlation between
particulate damage & pre-primary enrollment show a similar trend:
when more students complete primary school (or are enrolled in
pre-primary school), the cost of air pollution damage is lower. The
-0.52 correlation between forest depletion & secondary enrollment
and the -0.52 correlation between forest depletion & primary
completion indicate that higher school participation is associated with
less unsustainable forest use, which negatively impacts savings.
This suggests lack of access to education may reduce environmental
awareness or capacity. Thus, education may lead to cleaner energy use
and better policies.
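For readers who want to see how a lower-triangle correlation matrix like the one behind this heatmap can be built, here is a minimal base R sketch. The column names and values in `toy` are invented for illustration, and the use of pairwise-complete Pearson correlations is an assumption rather than a confirmed detail of the original chunk.

```r
# Toy country-level values for a single year; all numbers are invented.
toy <- data.frame(
  pupil_teacher_ratio  = c(15, 22, 30, 41, 55, NA),
  secondary_enrollment = c(105, 95, 70, 48, 30, 88),
  particulate_damage   = c(0.1, 0.3, 0.8, 1.5, 2.4, 0.4)
)

# Pairwise-complete Pearson correlations, so a missing value in one
# column does not discard that row for every other pair of variables.
corr <- cor(toy, use = "pairwise.complete.obs")

# Keep only the lower triangle: blank the upper triangle and the diagonal.
lower <- corr
lower[upper.tri(lower, diag = TRUE)] <- NA

print(round(lower, 2))
```

The remaining non-missing cells are the ones a lower-triangle heatmap would color and label.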
3. Pupil-Teacher Ratio vs Environmental Investment: Scatterplot
Is quality of education (smaller class sizes) related to
sustainability awareness or investment?

This is a scatterplot that analyzes the relationship between
pupil-teacher ratio and an environmental investment score. The score
takes negative values, and higher (closer to zero) is better because it
signifies less damage. It combines 3 environmental variables: economic
damage from natural resource depletion, CO2 emissions, and particulate
emissions. The downward-sloping trend line shows that higher
pupil-teacher ratios (larger class sizes) are associated with lower
environmental investment. This implies that countries with overcrowded
classrooms tend to underinvest in environmental sustainability. While the
relationship isn’t extremely strong (there’s some spread), the negative
slope and confidence band suggest a statistically meaningful pattern.
So, is quality of education (smaller class sizes) related to
sustainability awareness or investment? Yes, to some extent. This data
supports the idea that better quality education (smaller class sizes) is
positively associated with higher environmental investment.
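One way a composite score of this kind could be assembled is sketched below. It assumes the score is simply the negated sum of the three damage variables (each in % of GNI); the report does not state the exact formula, so treat that definition, the column names, and the toy values as assumptions.

```r
# Toy values; all numbers are invented for illustration.
toy <- data.frame(
  pupil_teacher_ratio = c(12, 18, 25, 33, 40, 52),
  resource_depletion  = c(0.5, 1.0, 2.5, 4.0, 6.0, 9.0),
  co2_damage          = c(1.0, 1.2, 1.5, 2.0, 2.2, 3.0),
  particulate_damage  = c(0.1, 0.2, 0.6, 1.0, 1.6, 2.5)
)

# Assumed definition: negate the total damage so that values closer to
# zero (higher) mean less damage.
toy$env_investment <- -(toy$resource_depletion +
                          toy$co2_damage +
                          toy$particulate_damage)

# Linear fit of the score on class size, with a 95% interval for the slope.
fit      <- lm(env_investment ~ pupil_teacher_ratio, data = toy)
slope    <- unname(coef(fit)["pupil_teacher_ratio"])
slope_ci <- confint(fit)["pupil_teacher_ratio", ]
```

If the whole interval in `slope_ci` sits below zero, the negative association is unlikely to be due to chance alone under the usual linear-model assumptions.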
4. Compulsory Enrollment and Renewable Energy Consumption by Income
Level: Box Plot
How do compulsory education levels and renewable energy consumption
vary across income groups?

These are side-by-side boxplots that show the relationship of
compulsory education and renewable energy consumption by income level.
We see a clear increase in the value of compulsory education as income
level increases with a plateau between upper-middle and high income. We
also can see a clear decrease in renewable energy consumption as income
level increases with low-income countries having a far larger median
compared to lower-middle, upper-middle, and high income countries.
Despite greater wealth and resources, high-income countries rely less on
renewables, likely due to legacy fossil fuel infrastructure. So, how do
compulsory education levels and renewable energy consumption vary across
income groups? Compulsory education tends to increase with income level,
reflecting stronger educational institutions and access. Renewable
energy consumption decreases with income. Low-income nations use more
renewable energy (out of necessity and limited access to fossil
fuels/modern grids), while wealthier countries still depend on
non-renewables despite having the capacity to invest in clean tech.
Thus, wealth brings better education, but not necessarily cleaner energy
use.
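The medians that the boxplots display can also be tabulated directly. The sketch below uses invented values and hypothetical column names; only the four income-group labels follow the text above.

```r
# Income groups as an ordered set of factor levels, lowest to highest.
income_levels <- c("Low", "Lower-middle", "Upper-middle", "High")

# Toy data: three invented countries per income group.
toy <- data.frame(
  income           = factor(rep(income_levels, each = 3), levels = income_levels),
  compulsory_years = c(6, 7, 8, 8, 9, 10, 9, 10, 12, 10, 10, 12),
  renewable_share  = c(85, 78, 90, 45, 50, 38, 20, 25, 30, 10, 18, 22)
)

# Median of each variable within each income group.
medians <- aggregate(
  cbind(compulsory_years, renewable_share) ~ income,
  data = toy,
  FUN  = median
)

print(medians)
```

With the same data, `boxplot(renewable_share ~ income, data = toy)` would draw the corresponding boxplot, and the factor levels keep the groups in income order on the axis.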
5. Enrollment Level vs. Renewable Electricity Over Time: Line Chart
(Time Series)
Are more educated countries expanding renewable power?


These line plots compare renewable electricity output over time by
enrollment group (high vs low secondary school enrollment globally). On
a global level, high enrollment countries (≥80% secondary enrollment)
show a steady increase in renewable electricity output over time, rising
from ~25% in 2000 to 28% in 2015. Low enrollment countries consistently
produce a higher percentage of renewable electricity, averaging around
43–45%. While less-educated countries may currently use more renewable
energy, more-educated countries are either steady in renewable
electricity output or slightly increasing. In Africa, South America, and
North America, low enrollment countries dominate renewable electricity
output, likely due to natural resource availability (hydroelectric in
Africa). It’s important to note that Europe doesn’t have a “low
enrollment” trend because all of its secondary enrollment rates are
greater than 80%. So, are more educated countries expanding renewable
power? Yes, there is a slight upward shift on a global level; however,
renewable electricity output has remained relatively the same across
continents from 2000 to 2015.
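The high/low split and the yearly averages behind line plots like these can be sketched as follows. The 80% threshold comes from the text; the column names and values are invented, and classifying each country-year separately (rather than fixing a country's group once) is an assumption.

```r
# Toy panel: four invented countries observed in two years.
toy <- data.frame(
  country              = rep(c("A", "B", "C", "D"), each = 2),
  year                 = rep(c(2000, 2015), times = 4),
  secondary_enrollment = c(85, 92, 101, 104, 40, 55, 62, 79),
  renewable_elec       = c(20, 24, 30, 32, 50, 48, 38, 40)
)

# Split observations at the 80% secondary enrollment threshold.
toy$enrollment_group <- ifelse(
  toy$secondary_enrollment >= 80,
  "High (>= 80%)",
  "Low (< 80%)"
)

# Average renewable electricity output per group and year;
# this table is what a line chart would plot.
trend <- aggregate(
  renewable_elec ~ enrollment_group + year,
  data = toy,
  FUN  = mean
)

print(trend)
```

Adding a continent column to the formula (`~ enrollment_group + year + continent`) would give the per-continent series in the same way.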
6. CO2 Damage v. Women in Parliament by Secondary Enrollment: Stacked
Bar Chart
Does gender equality in education/policy leadership improve
environmental sustainability?

This visualization is a stacked bar chart that shows average CO2
economic damage broken down by proportion of women in parliament, with
stacks representing girls’ secondary school enrollment levels. As the
proportion of women in parliament increases, the total CO2 damage
generally decreases. Across all categories, countries with girls’
secondary enrollment below 50% or between 50% and 80% contribute the
largest portions of CO2 damage. The 20–30% women in parliament group has
the highest economic CO2 damage. These are likely industrializing
countries that have not yet developed the environmental policy needed to
reduce CO2 economic damage. The 30–50% women in parliament group not
only has the lowest total CO2 damage, but also a more balanced share
across education levels. Countries where girls’ secondary enrollment is
100%+ consistently contribute smaller portions to CO2 damage in each
group. This implies that when female education and political leadership
are both strong, environmental outcomes improve. So, does gender
equality in education/policy leadership improve environmental
sustainability? Yes, the data is consistent with that. It supports the
narrative that investing in girls’ education and empowering women in
governance is associated with more sustainable environmental policies
and outcomes.
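Grouping continuous percentages into the categories a stacked bar chart needs can be done with `cut()`. In the sketch below, only the 20–30% and 30–50% parliament groups and the <50%, 50–80%, and 100%+ enrollment groups are named in the text; the remaining break points, the column names, and all values are assumptions made for illustration.

```r
# Toy data; all values are invented.
toy <- data.frame(
  women_parliament = c(5, 15, 25, 28, 35, 45),
  girls_secondary  = c(40, 65, 90, 75, 102, 110),
  co2_damage       = c(2.0, 2.5, 3.5, 3.0, 1.0, 0.8)
)

# Left-closed bins, e.g. [20, 30) is labeled "20-30%".
toy$parliament_group <- cut(
  toy$women_parliament,
  breaks = c(0, 10, 20, 30, 50, 100),
  labels = c("<10%", "10-20%", "20-30%", "30-50%", "50%+"),
  right = FALSE,
  include.lowest = TRUE
)

# Gross enrollment can exceed 100%, so the last bin is open-ended.
toy$enrollment_group <- cut(
  toy$girls_secondary,
  breaks = c(0, 50, 80, 100, Inf),
  labels = c("<50%", "50-80%", "80-100%", "100%+"),
  right = FALSE
)

# Average CO2 damage per (parliament group, enrollment group) cell;
# each row corresponds to one segment of a stacked bar.
avg <- aggregate(
  co2_damage ~ parliament_group + enrollment_group,
  data = toy,
  FUN  = mean
)
```

Using `right = FALSE` puts a value that falls exactly on a boundary, such as 30%, into the higher group, which is worth stating in the chart caption.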
7. Primary Children Out of School vs. Particulate Damage: Stacked
Area Chart
Do higher out-of-school rates correlate with higher environmental
harm?

This is a stacked area chart that shows the change in average
particulate damage over time, segmented by percent of children out of
school. There is a clear positive relationship: countries with higher
percentages of out-of-school children (especially the 30–50% and 50%+
groups) have substantially higher levels of particulate emission damage.
Over time, as education improves (fewer children out of school), overall
particulate damage declines sharply, especially post-2007. Countries
with fewer than 10% of primary-age children out of school consistently
show the lowest particulate damage, with little fluctuation over time;
these are likely countries with strong infrastructure and effective
environmental and educational systems. Countries with better educational
access for children tend to experience lower environmental harm, as
reflected by lower particulate emission damage. Thus, higher
out-of-school rates correlate with higher environmental degradation.
- Shiny App
- add each country and timeseries or plotly animation
cat('<iframe src="https://kennycodez.shinyapps.io/pshiny1/" width="100%" height="600px"></iframe>')
tags$iframe(
src = "https://kennycodez.shinyapps.io/pshiny1/",
width = "100%",
height = "800px",
frameborder = "0"
)
- Shiny App
- world map choropleth (animation?)
cat('<iframe src="https://yourusername.shinyapps.io/yourappname/" width="100%" height="600px"></iframe>')
Conclusion
We have gained key insights into the relationship between environmental
and educational variables and their implications globally. From this
analysis we have found that more educated populations don’t show a
strong tendency to consume more renewable energy, and we have identified
which educational and environmental variables are highly correlated. We
see that better quality education (smaller class sizes) is positively
associated with higher environmental investment and that wealth brings
better education, but not necessarily cleaner energy use. There is a
global upward shift towards renewable power, but on a continent level
rates have stayed approximately the same. We found that investing in
girls’ education and empowering women in governance is associated with
lower economic CO2 damage. We also see that higher out-of-school rates
correlate with higher environmental degradation.
In creating these visualizations, I kept accessibility in mind and
used color palettes and combinations that are made for color-blind
individuals. I kept Schwabish’s 5 principles of a good data
visualization in mind. 1) Show the data in the clearest/most purposeful
way: I chose visualizations that correctly fit the data I was trying to
interpret. 2) Reduce the clutter: I filtered the data and ensured graphs
were clean. 3) Integrate the graphics and texts: I made sure to analyze
each visualization and draw a key takeaway from the data. 4) Use a
small-multiples approach: I included side-by-side visualizations so that
differences between the graphs are easy to examine. 5) Start everything
with gray: I also kept the viewer in mind and prioritized objective data
interpretation.
In general, we can see that education in many instances can
positively impact the environment and mitigate problems stemming from
lower levels of education. However, education is only one of the
many variables that impact the environment. Sometimes there is a
correlation, sometimes there isn’t. It’s important to remain critical
and aware of the growing environmental crisis and acknowledge the
components we can control to help save the earth we know and love.
- Forest depletion vs. pupil teacher ratio
# Load the packages this chunk relies on
# (geom_hex() also requires the hexbin package to be installed)
library(dplyr)
library(ggplot2)

# Prepare and clean data: drop rows missing either variable, then add
# short aliases for the long column names.
# `sustainability` is assumed to be the data frame already loaded.
hex_data <- sustainability %>%
  filter(
    !is.na(`Pupil.teacher.ratio..primary...SE.PRM.ENRL.TC.ZS`),
    !is.na(`Adjusted.savings..net.forest.depletion....of.GNI....NY.ADJ.DFOR.GN.ZS`)
  ) %>%
  mutate(
    PupilTeacher = `Pupil.teacher.ratio..primary...SE.PRM.ENRL.TC.ZS`,
    Forest_Depletion = `Adjusted.savings..net.forest.depletion....of.GNI....NY.ADJ.DFOR.GN.ZS`
  )

# Create hexbin plot: each hexagon is shaded by the number of rows it contains
ggplot(hex_data, aes(x = PupilTeacher, y = Forest_Depletion)) +
  geom_hex(bins = 30) +
  scale_fill_viridis_c(option = "C") +
  labs(
    title = "Pupil-Teacher Ratio vs Forest Depletion",
    subtitle = "Each hex shows density of countries",
    x = "Pupil-Teacher Ratio (Primary)",
    y = "Forest Depletion (% of GNI)",
    fill = "Country Count"
  ) +
  theme_minimal()
