Electric Energy in African Countries: Production, Consumption, Imports, and Inflation.
Author
Merveille Kuendzong
Published
April 14, 2024
Introduction
Countries across the world vary greatly in numerous aspects, including socioeconomic status, type of government, culture, currency, population diversity, and management of natural energy resources. This work focuses on electric energy in African countries, specifically examining energy production, consumption, and imports, and how they relate to inflation. This topic is meaningful to me because I am originally from Africa and am aware of the energy challenges faced by populations in many African countries.
Data
The dataset I am working with was obtained from web scraping and contains information on all the countries in the world. However, I have focused my work on the analysis of African countries. My original dataset contains 64 variables, but I have used 15 of them for this analysis. Note that the dataset itself spells the production columns "electricty_production_" (without the second "i"); the names below follow the dataset.
“country”: The name of the country.
“region”: The region to which the country belongs.
“latitude”: The latitude of the country.
“longitude”: The longitude of the country.
“inflation”: The inflation rate in the country.
“internet_pct”: The percentage of the population in the country that has access to the internet.
“electricity_access_pct”: The percentage of the population in the country that has access to electricity.
“alternative_nuclear_energy_pct”: The percentage of total electricity consumption in the country that comes from alternative and nuclear energy sources.
“electricty_production_coal_pct”: The percentage of total electricity production in the country that comes from coal power.
“electricty_production_hydroelectric_pct”: The percentage of total electricity production in the country that comes from hydroelectric power.
“electricty_production_gas_pct”: The percentage of total electricity production in the country that comes from natural gas power.
“electricty_production_nuclear_pct”: The percentage of total electricity production in the country that comes from nuclear power.
“electricty_production_oil_pct”: The percentage of total electricity production in the country that comes from oil power.
“electricty_production_renewable_pct”: The percentage of total electricity production in the country that comes from renewable energy sources.
“energy_imports_pct”: The percentage of the country’s total energy that is imported from other countries. A negative percentage indicates that the country is a net exporter of energy.
Load Libraries
library(tidyverse)
Warning: package 'dplyr' was built under R version 4.3.2
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.4
✔ forcats 1.0.0 ✔ stringr 1.5.0
✔ ggplot2 3.4.3 ✔ tibble 3.2.1
✔ lubridate 1.9.2 ✔ tidyr 1.3.0
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(leaflet)
Warning: package 'leaflet' was built under R version 4.3.3
library(viridis)
Loading required package: viridisLite
library(plotly)
Warning: package 'plotly' was built under R version 4.3.2
Attaching package: 'plotly'
The following object is masked from 'package:ggplot2':
last_plot
The following object is masked from 'package:stats':
filter
The following object is masked from 'package:graphics':
layout
Load Data
# set working directory
setwd("C:/Users/kmerv_6exilcx/Dropbox/SPRING 2024/Data 110/week11/project2")
countries <- read_csv('AllCountries.csv')
# display the first six rows
head(countries)
# A tibble: 6 × 64
country country_long currency capital_city region continent demonym latitude
<chr> <chr> <chr> <chr> <chr> <chr> <chr> <dbl>
1 Afghanis… Islamic Sta… Afghan … Kabul South… Asia Afghan 33
2 Albania Republic of… Albania… Tirana South… Europe Albani… 41
3 Algeria People's De… Algeria… Algiers North… Africa Algeri… 28
4 Andorra Principalit… Euro Andorra la … South… Europe Andorr… 42.5
5 Angola People's Re… Angolan… Luanda Middl… Africa Angolan -12.5
6 Antigua … Antigua and… East Ca… Saint John's Carib… Americas Antigu… 17.0
# ℹ 56 more variables: longitude <dbl>, agricultural_land <dbl>,
# forest_area <dbl>, land_area <dbl>, rural_land <dbl>, urban_land <dbl>,
# central_government_debt_pct_gdp <dbl>, expense_pct_gdp <dbl>, gdp <dbl>,
# inflation <dbl>, self_employed_pct <dbl>, tax_revenue_pct_gdp <dbl>,
# unemployment_pct <dbl>, vulnerable_employment_pct <dbl>,
# electricity_access_pct <dbl>, alternative_nuclear_energy_pct <dbl>,
# electricty_production_coal_pct <dbl>, …
Filter data
# filter African countries
af_countries <- countries |>
  filter(continent == "Africa")
# select only the columns I will work with
data <- af_countries[, c(1, 5, 8, 9, 18, 50, 23, 24, 25, 26, 27, 28, 29, 30, 31)]
head(data)
# A tibble: 6 × 15
country region latitude longitude inflation internet_pct
<chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 Algeria Northern Africa 28 3 9.27 70.8
2 Angola Middle Africa -12.5 18.5 25.8 32.6
3 Benin Western Africa 9.5 2.25 1.35 34.0
4 Botswana Southern Africa -22 24 11.7 73.5
5 Burkina Faso Western Africa 13 -2 14.3 21.6
6 Burundi Eastern Africa -3.5 30 18.8 5.80
# ℹ 9 more variables: electricity_access_pct <dbl>,
# alternative_nuclear_energy_pct <dbl>, electricty_production_coal_pct <dbl>,
# electricty_production_hydroelectric_pct <dbl>,
# electricty_production_gas_pct <dbl>,
# electricty_production_nuclear_pct <dbl>,
# electricty_production_oil_pct <dbl>,
# electricty_production_renewable_pct <dbl>, energy_imports_pct <dbl>
Relationship between Electricity Access and Internet Access in African Countries
Scatterplot of Electricity Access to Internet Access
m_plot <- ggplot(data, aes(x = electricity_access_pct, y = internet_pct, color = region,
                           text = paste("country:", country))) +
  theme_minimal(base_size = 12, base_family = "serif") +
  geom_point(size = 3, alpha = 0.5) +
  geom_smooth(method = lm, se = FALSE, lty = 5, linewidth = 0.2) +
  scale_color_brewer(palette = "Set1") +
  labs(x = "Percentage of populations that have access to electricity",
       y = "Percentage of populations that have access to Internet",
       title = "Scatterplot of Electricity Access to Internet Access",
       caption = "Source: Web Scraping")
m_plot <- ggplotly(m_plot)
# fit a simple linear regression of internet access on electricity access
fit <- lm(internet_pct ~ electricity_access_pct, data = data)
summary(fit)
Call:
lm(formula = internet_pct ~ electricity_access_pct, data = data)
Residuals:
Min 1Q Median 3Q Max
-33.776 -4.176 1.334 9.571 25.161
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.98903 4.50272 0.220 0.827
electricity_access_pct 0.68367 0.07178 9.524 5.44e-13 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 14.02 on 52 degrees of freedom
Multiple R-squared: 0.6356, Adjusted R-squared: 0.6286
F-statistic: 90.71 on 1 and 52 DF, p-value: 5.445e-13
The correlation coefficient is 0.7972618, meaning that there is a strong positive linear relationship between the percentage of the population that has access to electricity and the percentage of the population that has access to the Internet. However, correlation does not imply causation, so while the two variables are related, this does not necessarily mean that changes in one directly cause changes in the other.
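The code that produced the correlation is not shown, so the following is only a sketch of one way to compute it and relate it to the model output. The data frame toy contains made-up values (not taken from AllCountries.csv). For a model with a single predictor, the squared Pearson correlation equals the multiple R-squared, which is why the reported correlation is close to the square root of 0.6356.

```r
# Toy data (hypothetical values, NOT taken from AllCountries.csv)
toy <- data.frame(
  electricity_access_pct = c(10, 25, 40, 55, 70, 85, 100),
  internet_pct           = c(5, 12, 30, 33, 52, 60, 71)
)

# Pearson correlation, ignoring rows with missing values
r <- cor(toy$electricity_access_pct, toy$internet_pct, use = "complete.obs")

# For a one-predictor linear model, R-squared equals r squared
toy_fit   <- lm(internet_pct ~ electricity_access_pct, data = toy)
r_squared <- summary(toy_fit)$r.squared
all.equal(r^2, r_squared)

# Applied to the reported output: the square root of the Multiple R-squared
# is close to the correlation quoted in the text
sqrt(0.6356)
```

On the real data, the same cor() call on data$electricity_access_pct and data$internet_pct should give the quoted value, assuming no other preprocessing was applied.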
The model has the equation: internet_pct = 0.68367(electricity_access_pct) + 0.98903
The slope may be interpreted as follows: for each additional percentage point of electricity access (electricity_access_pct), the model predicts an increase of 0.68367 percentage points in Internet access.
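As a small worked example of the fitted equation, the sketch below plugs two electricity-access values into it. The coefficients are copied from the summary(fit) output; the helper predict_internet_pct is a hypothetical name introduced only for this illustration.

```r
# Coefficients copied from the summary(fit) output above
intercept <- 0.98903
slope     <- 0.68367

# Hypothetical helper: predicted internet access (%) for a given electricity access (%)
predict_internet_pct <- function(electricity_access_pct) {
  intercept + slope * electricity_access_pct
}

pred_50 <- predict_internet_pct(50)  # about 35.2
pred_60 <- predict_internet_pct(60)

# Ten more percentage points of electricity access add 10 * slope
pred_60 - pred_50
```

With the fitted model object available, predict(fit, newdata = data.frame(electricity_access_pct = 50)) would give the same prediction up to rounding of the coefficients.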
The p-value for electricity_access_pct is very low (5.44e-13, marked with three asterisks), which suggests that it is a statistically significant predictor of internet_pct. The adjusted R-squared value indicates that about 62.86% of the variation in internet_pct is explained by the model, which is fairly high for a model with a single predictor.
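The adjusted R-squared can be recovered from the other numbers in the output, which is a quick way to check how the two R-squared values relate. The sketch below uses only the rounded values printed by summary(fit), so the result matches the reported figure to four decimal places.

```r
r_squared <- 0.6356           # Multiple R-squared from the output
resid_df  <- 52               # residual degrees of freedom from the output
p         <- 1                # number of predictors in the model
n         <- resid_df + p + 1 # observations used in the fit

# Adjusted R-squared penalises R-squared for the number of predictors
adj_r_squared <- 1 - (1 - r_squared) * (n - 1) / (n - p - 1)
round(adj_r_squared, 4)
```

With only one predictor and 54 observations the penalty is small, which is why the adjusted value is so close to the unadjusted one.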
Diagnostic plots
# autoplot() for lm objects is provided by the ggfortify package
library(ggfortify)
autoplot(fit, 1:4, nrow = 2, ncol = 2)
The non-horizontal trend in the Residuals vs Fitted plot may suggest a violation of the linearity or constant-variance assumption. Both the Residuals vs Fitted and Normal Q-Q plots show that observations 28 and 44 stand out, and both also have high Scale-Location values. These observations correspond to Libya and Somalia, whose internet_pct is much lower than their electricity_access_pct.
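The observation numbers shown on diagnostic plots are row labels, so they can be mapped back to country names. The sketch below illustrates the idea on made-up data (the toy data frame and its country labels are hypothetical), using Cook's distance as one common measure of influence; it is not necessarily how the two countries above were identified.

```r
# Toy data (hypothetical): country "G" has far less internet access than the trend predicts
toy <- data.frame(
  country                = c("A", "B", "C", "D", "E", "F", "G", "H"),
  electricity_access_pct = c(10, 20, 30, 40, 50, 60, 70, 80),
  internet_pct           = c(8, 15, 22, 28, 36, 41, 5, 56)
)

toy_fit <- lm(internet_pct ~ electricity_access_pct, data = toy)

# Cook's distance for each observation; names are the row labels used on the plots
cooks <- cooks.distance(toy_fit)

# Two most influential observations, mapped back to country names
top2          <- head(sort(cooks, decreasing = TRUE), 2)
top_countries <- toy$country[as.integer(names(top2))]
top_countries
```

On the real data, data$country[c(28, 44)] performs the same lookup for the two observations flagged in the plots.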
Regional Distribution of Energy Access and Energy Imports
# mean electricity_access_pct for each region
energy_access <- data |>
  group_by(region) |>
  summarise(mean_electricity_access = mean(electricity_access_pct, na.rm = TRUE))
ggplot(data = energy_access, aes(x = region, y = mean_electricity_access, fill = region)) +
  geom_bar(stat = "identity") +
  labs(x = "Region", y = "Average Electricity Access (%)", fill = "Region",
       title = "Average Electricity Access by Region", caption = "Source: Web Scraping") +
  theme_dark() +
  scale_fill_brewer(palette = "Set1")
This graph shows that in Middle Africa, less than 50% of the population has access to electricity (the lowest percentage among the regions), and there are no regions where 100% of the population has access to electricity.
# mean energy_imports_pct for each region
energy_imports <- data |>
  group_by(region) |>
  summarise(mean_energy_import = mean(energy_imports_pct, na.rm = TRUE))
ggplot(data = energy_imports, aes(x = region, y = mean_energy_import, fill = region)) +
  geom_bar(stat = "identity") +
  labs(x = "Region", y = "Average Energy Imports (%)", fill = "Region",
       title = "Average Energy Imports by Region", caption = "Source: Web Scraping") +
  theme_dark() +
  scale_fill_brewer(palette = "Set1")
This graph shows that average energy imports in Middle Africa are strongly negative (nearly -400%), suggesting that, on average, countries in the region are net exporters of roughly four times the energy they use. This is surprising given the low percentage of the population with access to electricity.
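To make the negative percentages concrete, here is a minimal sketch of how a net-imports percentage behaves. It assumes the variable follows the usual definition of net energy imports (energy use minus production, as a share of energy use); the dataset's exact definition is not stated, and net_imports_pct is a hypothetical helper.

```r
# Net energy imports as a percentage of energy use
# (ASSUMED definition: energy use minus production, relative to energy use)
net_imports_pct <- function(energy_use, energy_production) {
  100 * (energy_use - energy_production) / energy_use
}

# A country producing five times what it uses: net exports are four times its use
exporter <- net_imports_pct(energy_use = 100, energy_production = 500)

# A country producing 60% of what it uses: it imports the remaining 40%
importer <- net_imports_pct(energy_use = 100, energy_production = 60)

c(exporter = exporter, importer = importer)
```

Under this definition a value near -400% corresponds to production of about five times domestic use, which is consistent with reading it as net exports of about four times use.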
Energy Access, Production, and Imports Stats in African Countries
# select the needed columns, keep rows without NA values, and order the data by country name
countr <- data[, c(1, 7, 8, 9, 10, 11, 12, 13, 14, 15)]
countr <- na.omit(countr)
countr <- countr[order(countr$country), ]
countr <- as.data.frame(countr)
# rename columns to give shorter names
countr <- countr |>
  rename(elAcc = electricity_access_pct,
         altNucl = alternative_nuclear_energy_pct,
         pcoal = electricty_production_coal_pct,
         phydro = electricty_production_hydroelectric_pct,
         pgas = electricty_production_gas_pct,
         pnucl = electricty_production_nuclear_pct,
         poil = electricty_production_oil_pct,
         prenew = electricty_production_renewable_pct,
         enImp = energy_imports_pct)
row.names(countr) <- countr$country
countr <- countr[, c(2:10)]
# matrix of data without NAs
countr_matrix <- data.matrix(countr)
# Heatmap of energy stats
heatmap(countr_matrix, Rowv = NA, Colv = NA, col = viridis(30), scale = "column",
        margins = c(5, 10),
        xlab = "Energy access and production Stats",
        ylab = "African countries",
        main = "Energy Stats in African Countries")
We observe that South Sudan has the lowest access to electricity and the lowest energy import percentage (indicating the highest net export relative to its energy use), which is surprising. Namibia has the highest percentage of electricity consumption coming from alternative and nuclear energy sources (altNucl). South Africa has a significant portion of its electricity production coming from nuclear power, in contrast to other countries where it is nonexistent. Kenya leads in the production of electricity from renewable energy sources.
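Because the heatmap is drawn with scale = "column", each column is centered and rescaled before colors are assigned, so a color shows how a country compares with the other countries on that variable rather than the raw percentage. The sketch below reproduces that kind of column scaling with base R's scale() on a made-up matrix (the values and country labels are hypothetical).

```r
# Toy matrix (hypothetical values); matrix() fills column by column
toy_matrix <- matrix(
  c(10, 50, 90,      # elAcc-like column
    -400, 20, 80),   # enImp-like column
  ncol = 2,
  dimnames = list(c("Country A", "Country B", "Country C"), c("elAcc", "enImp"))
)

# Column scaling: subtract each column's mean and divide by its standard deviation
scaled <- scale(toy_matrix)

col_means <- colMeans(scaled)      # approximately 0 for every column
col_sds   <- apply(scaled, 2, sd)  # 1 for every column
scaled
```

One consequence is that colors are comparable within a column but not across columns: the darkest cell in one column and the darkest cell in another can correspond to very different raw percentages.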
Data Map: Inflation Rate in each African Country
# latitude and longitude of a country located in Middle Africa: 'Cameroon'
long <- 12
lat <- 6
# popup text for each circle (assumed: the original definition of `pop` is not shown)
pop <- paste0(data$country, ": inflation ", data$inflation, "%")
# map
my_map <- leaflet() |>
  setView(lng = long, lat = lat, zoom = 4) |>
  addProviderTiles("Esri.NatGeoWorldMap")
palette <- colorFactor(palette = "Set1", domain = data$region)
my_map <- my_map |>
  addCircles(data = data,
             radius = data$inflation * 3500,  # radius is based on percentage of inflation
             color = ~palette(region),
             fillColor = ~palette(region),
             # add popup
             popup = pop)
Assuming "longitude" and "latitude" are longitude and latitude, respectively
# add legend for colors
my_map <- addLegend(my_map, position = "bottomleft", pal = palette,
                    values = data$region, title = "Region")
my_map
We can observe that some countries, such as Angola in Middle Africa, experience high inflation rates (25.8%) despite being large net energy exporters (energy imports of -541%), while electricity access remains low (48%). Sudan, in Northern Africa, likewise experiences a very high inflation rate of 138.8%.
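The circle sizes on the map come from multiplying inflation by 3500, and addCircles() interprets the radius in meters. As a rough sketch of what that means on the ground, the snippet below converts the two inflation rates discussed here into radii; Angola's value is taken as displayed in the filtered table (25.8), and the variable names are introduced only for this illustration.

```r
# Multiplier used in the map code: meters of radius per percentage point of inflation
radius_per_point <- 3500

# Inflation rates (%) for the two countries discussed in the text
inflation <- c(Angola = 25.8, Sudan = 138.8)

# Circle radius in kilometers
radius_km <- inflation * radius_per_point / 1000
radius_km

# Circle AREA grows with the square of the radius, so Sudan's circle covers
# far more than (138.8 / 25.8) times the area of Angola's
area_ratio <- (radius_km[["Sudan"]] / radius_km[["Angola"]])^2
area_ratio
```

Because area grows quadratically with radius, large inflation rates dominate the map visually; scaling the radius by the square root of inflation would be one alternative if proportional areas were preferred.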
Conclusion
This dataset was suitable for analyzing the management of natural energy resources in specific regions of the world. The scatterplot and correlation coefficient showed a strong positive relationship between electricity access and internet access in African countries. The linear model further indicated that the percentage of the population with access to electricity is a statistically significant predictor of the percentage of the population with access to the internet. The bar graphs revealed that Middle Africa, on average, has the lowest electricity access, yet surprisingly, it is also the region with the largest net energy exports relative to its energy use. The heatmap revealed patterns across energy access, production, and imports in African countries. Additionally, the map depicted the variation in inflation rates among these countries.
Extensive cleaning of the dataset was unnecessary, as all variable names were already lowercase and descriptive. However, renaming them for better readability in the heatmap was required. I filtered the data to focus only on African countries and removed NA values before conducting computations and generating the heatmap. Working with this dataset provided an excellent opportunity for practicing visualization and gave me valuable insights.