The aim of this report is to compare youth and adult literacy rates, including any differences between males and females within each age group. The total literacy rate is then compared against national GDP per capita to investigate any association.
The main findings are that, in all six countries, the adult literacy rate is lower than the youth literacy rate, with Nepal showing the largest difference. The association between the total literacy rate and national GDP per capita cannot be determined with certainty from this data; we can only say that an association is possible.
The education data comes from UNICEF and the GDP per capita data comes from the World Bank. We consider the data reliable because both are credible, globally renowned organisations that are established producers of data in various areas.
UNICEF derived its data from enrolment figures based on administrative records and household surveys, and states that more than half of the countries had education data taken from more than one source. However, this does not tell us whether our randomly selected countries are among them. Moreover, although UNICEF states the source of the data, it does not specify sample sizes. Therefore, as the values are presented as percentages, we cannot be sure how representative each percentage is of the population.
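To illustrate why the missing sample size matters, the sketch below computes an approximate 95% margin of error for a reported percentage under two sample sizes. It assumes simple random sampling and the normal approximation, which may not match how UNICEF's sources were actually collected. The function name `margin_of_error` and both sample sizes are hypothetical.

```r
# Approximate 95% margin of error for a proportion p estimated from n respondents.
# Assumes simple random sampling (our assumption, not stated by UNICEF).
margin_of_error <- function(p, n, z = 1.96) {
  z * sqrt(p * (1 - p) / n)
}

p <- 0.60            # a reported literacy rate of 60%
n <- c(100, 10000)   # hypothetical sample sizes, not taken from the source
moe <- margin_of_error(p, n)

# Margin of error in percentage points for each sample size
print(data.frame(SampleSize = n, MarginPct = round(100 * moe, 2)))
```

The same reported percentage is far less precise at the smaller sample size, which is why a percentage alone cannot tell us how representative it is.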
A disparity within our dataset lies in the reference years. Countries were randomly selected after a filter had been applied to the data to only show references from the year 2011. Data was found for all countries from 2011 for youth literacy, but the DRC only had data from 2012 for adult literacy. Although minor, the difference in reference year could slightly alter results. To be consistent, GDP data from 2012 was provided for the adult literacy data for DRC specifically.
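A check for mixed reference years could be sketched as follows. The data frame is a made-up stand-in (the country names and years are hypothetical), and the column name `RefYear` is an assumption; the report's own files use `Ref Year`.

```r
# Hypothetical stand-in data; these rows are NOT from the report's spreadsheets
ref_check <- data.frame(
  Country = c("Country A", "Country B", "Country C"),
  RefYear = c(2011, 2011, 2012)
)

target_year <- 2011

# Keep only the rows whose reference year differs from the target year
mismatch <- ref_check[ref_check$RefYear != target_year, ]
print(mismatch)
```

Any row listed in `mismatch` would need its GDP value taken from the matching year, as was done for the 2012 adult literacy figure.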
library(readxl)
youthdata= read_excel("educationdata.xlsx")
adultdata= read_excel("educationdata1.xlsx")
# Quick look at the first 6 rows of data (the default for head())
head(youthdata)
## # A tibble: 6 x 6
## Country `Ref Year` Total Male Female GDP
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Afghanistan 2011 47 62 32 591.
## 2 Bolivia 2011 99 99 99 2346.
## 3 Chile 2011 99 99 100 14637.
## 4 India 2011 86 90 82 1458.
## 5 Indonesia 2011 99 99 99 3643.
## 6 Nepal 2011 85 90 80 699.
## Size of data
dim(youthdata)
## [1] 6 6
## R's classification of data
class(youthdata)
## [1] "tbl_df" "tbl" "data.frame"
## R's classification of variables
str(youthdata)
## Classes 'tbl_df', 'tbl' and 'data.frame': 6 obs. of 6 variables:
## $ Country : chr "Afghanistan" "Bolivia" "Chile" "India" ...
## $ Ref Year: num 2011 2011 2011 2011 2011 ...
## $ Total : num 47 99 99 86 99 85
## $ Male : num 62 99 99 90 99 90
## $ Female : num 32 99 100 82 99 80
## $ GDP : num 591 2346 14637 1458 3643 ...
sapply(youthdata, class)
## Country Ref Year Total Male Female GDP
## "character" "numeric" "numeric" "numeric" "numeric" "numeric"
head(adultdata)
## # A tibble: 6 x 6
## Country `Ref Year` Total Male Female GDP
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 Afghanistan 2011 31.7 45.4 17.6 591.
## 2 Bolivia 2011 92.2 96.6 88.1 2346.
## 3 Chile 2011 97 97 97 14637.
## 4 India 2011 69.3 78.9 59.3 1458.
## 5 Indonesia 2011 92.8 95.6 90.1 3643.
## 6 Nepal 2011 60 72 49 699.
str(adultdata)
## Classes 'tbl_df', 'tbl' and 'data.frame': 6 obs. of 6 variables:
## $ Country : chr "Afghanistan" "Bolivia" "Chile" "India" ...
## $ Ref Year: num 2011 2011 2011 2011 2011 ...
## $ Total : num 31.7 92.2 97 69.3 92.8 ...
## $ Male : num 45.4 96.6 97 78.9 95.6 ...
## $ Female : num 17.6 88.1 97 59.3 90.1 ...
## $ GDP : num 591 2346 14637 1458 3643 ...
dim(adultdata)
## [1] 6 6
class(adultdata)
## [1] "tbl_df" "tbl" "data.frame"
summary(adultdata$Total)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 31.74 62.33 80.76 73.85 92.67 97.00
summary(youthdata$Total)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 47.00 85.25 92.50 85.83 99.00 99.00
Is there a difference in literacy rates between females and males among youth and adult populations (per country)?
In every country, adult literacy is lower than youth literacy for both sexes, and in five of the six countries the difference is larger for females than for males. In Afghanistan, the male rate is 62% among youth and 45.4% among adults, and the female rate is 32% among youth and 17.6% among adults, so the change for females (14.4 points) is slightly smaller than the change for males (16.6 points). In Bolivia, male literacy is only slightly lower among adults (99% to 96.6%), while female literacy is lower by a wider margin (99% to 88.1%), so the change for females is larger than the change for males. In Chile, the difference between youth and adults is slight for both males (99% to 97%) and females (100% to 97%). In India, the decline is pronounced for females (82% to 59.3%) and smaller for males (90% to 78.9%), so the change for females is larger than that for males. In Indonesia, adult rates are lower than youth rates for both males (99% to 95.6%) and females (99% to 90.1%). In Nepal, male literacy is 90% among youth and 72% among adults, while female literacy drops sharply from 80% to 49%. Nepal shows the largest change and Chile the smallest.
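The comparison above can be reproduced with a short calculation. This is a sketch that re-enters the rounded values printed by `head()` rather than reading the spreadsheets, so the results may differ slightly from the unrounded data; the object and column names (`youth`, `adult`, `gaps`, `MaleDrop`, `FemaleDrop`) are our own.

```r
countries <- c("Afghanistan", "Bolivia", "Chile", "India", "Indonesia", "Nepal")

# Rounded values as printed in the head() output of youthdata and adultdata
youth <- data.frame(Country = countries,
                    Male   = c(62, 99, 99, 90, 99, 90),
                    Female = c(32, 99, 100, 82, 99, 80))
adult <- data.frame(Country = countries,
                    Male   = c(45.4, 96.6, 97, 78.9, 95.6, 72),
                    Female = c(17.6, 88.1, 97, 59.3, 90.1, 49))

# Youth rate minus adult rate, in percentage points, for each sex
gaps <- data.frame(Country    = countries,
                   MaleDrop   = youth$Male - adult$Male,
                   FemaleDrop = youth$Female - adult$Female)

# Flag countries where the female difference exceeds the male difference
gaps$FemaleLarger <- gaps$FemaleDrop > gaps$MaleDrop
print(gaps)
```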
# Grouped bar chart of youth literacy: one male and one female bar per country
Country <- as.factor(youthdata$Country)
Info1 <- cbind(youthdata$Male, youthdata$Female)
barplot(Info1 ~ Country, beside = TRUE, ylab = "Literacy Rate (%)", xlab = "Country", las = 1, ylim = c(0, 100), main = "Youth Literacy Rates", col = c("mediumorchid4", "skyblue2"))
legend(0.5, 100, legend = c("Male", "Female"), pch = 15, col = c("mediumorchid4", "skyblue2"), box.col = "black", cex = 1)
# Same chart for adult literacy
Country <- as.factor(adultdata$Country)
Info2 <- cbind(adultdata$Male, adultdata$Female)
barplot(Info2 ~ Country, beside = TRUE, ylab = "Literacy Rate (%)", xlab = "Country", las = 1, ylim = c(0, 100), main = "Adult Literacy Rates", col = c("mediumorchid4", "skyblue2"))
legend(0.5, 100, legend = c("Male", "Female"), pch = 15, col = c("mediumorchid4", "skyblue2"), box.col = "black", cex = 1)
Is there an association between higher literacy rates and GDP per capita?
Summary:
From the scatter plot, there is no strong relationship between literacy rates and GDP per capita. Many countries with lower GDP per capita have the same literacy rate as more affluent countries. This may reflect modern social and economic structures, in which literacy and basic education are generally accessible and there is a broadly shared global education standard. However, exceptions exist: although there is no strong overall correlation, low GDP per capita does coincide with low literacy in some countries, e.g. Afghanistan and Nepal.
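# Load the data used for the scatter plot (both objects are read from the same file)
youthdata = read_excel("~/Desktop/scatterplotdata.xlsx")
adultdata = read_excel("~/Desktop/scatterplotdata.xlsx")
# Regress GDP per capita on the total literacy rate
L = lm(adultdata$GDP ~ adultdata$Total)
# Residual plot: the y-axis shows the residuals of the fit, not GDP per capita itself
plot(adultdata$Total, L$residuals, xlab = "Literacy Rate", ylab = "Residual (GDP per Capita)", main = "Residuals of GDP per Capita vs Literacy Rate")
abline(h = 0)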
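One way to put a number on the association is a correlation coefficient. The sketch below uses only the six countries and the rounded adult values printed earlier, not the full scatter plot file, so it is an illustration rather than a result for the whole dataset. The data frame name `lit` is our own, and the choice of Pearson and Spearman coefficients is our assumption about a reasonable summary.

```r
# Rounded adult literacy and GDP per capita values as printed by head(adultdata)
lit <- data.frame(
  Country = c("Afghanistan", "Bolivia", "Chile", "India", "Indonesia", "Nepal"),
  Total   = c(31.7, 92.2, 97, 69.3, 92.8, 60),
  GDP     = c(591, 2346, 14637, 1458, 3643, 699)
)

# Pearson measures linear association; Spearman uses ranks and is less
# sensitive to a single very high GDP value such as Chile's
pearson  <- cor(lit$Total, lit$GDP, method = "pearson")
spearman <- cor(lit$Total, lit$GDP, method = "spearman")

print(c(pearson = pearson, spearman = spearman))
```

With only six points, either coefficient should be read with caution; it indicates a possible association rather than establishing one.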
Conclusion
We chose to work with data from the year 2011 so that the dataset would be more uniform. For the bar plots, a smaller subset of countries was used so that differences are clearly visible; this shows the differences in literacy rates between countries more clearly. For the scatter plot, on the other hand, all countries with 2011 data were used, because a smaller amount of data does not show a definite pattern in the relationship between GDP per capita and literacy rate.
Literacy Rates - UNICEF DATA. (2018). Retrieved from https://data.unicef.org/topic/education/literacy/
GDP per capita (current US$) | Data. (2019). Retrieved from https://data.worldbank.org/indicator/NY.GDP.PCAP.CD