library(tidyverse)
## -- Attaching packages ---------------------------------------------------------------------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.1 v purrr 0.3.2
## v tibble 2.1.3 v dplyr 0.8.3
## v tidyr 0.8.3 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.4.0
## -- Conflicts ------------------------------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
library(dplyr)
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
library(plotly)
##
## Attaching package: 'plotly'
## The following object is masked from 'package:ggplot2':
##
## last_plot
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:graphics':
##
## layout
Population Growth Data set
#Original Population Growth dataset (pg)
# Read the original .csv file downloaded from http://data.un.org/
pg_original_file <- read.csv("https://raw.githubusercontent.com/gpadmaperuma/DATA606/master/SYB62_T03_201907_Population%20Growth%2C%20Fertility%20and%20Mortality%20Indicators.csv", header = TRUE, skip = 1)
head(pg_original_file) %>% kable()
Region.Country.Area | X | Year | Series | Value | Footnotes | Source |
---|---|---|---|---|---|---|
1 | Total, all countries or areas | 2005 | Population annual rate of increase (percent) | 1.2570 | Data refers to a 5-year period preceding the reference year. | United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Total fertility rate (children per women) | 2.6513 | Data refers to a 5-year period preceding the reference year. | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Infant mortality for both sexes (per 1,000 live births) | 49.2161 | Data refers to a 5-year period preceding the reference year. | United Nations Statistics Division, New York, “Demographic Yearbook 2015” and the demographic statistics database, last accessed June 2017. |
1 | Total, all countries or areas | 2005 | Maternal mortality ratio (deaths per 100,000 population) | 288.0000 | | World Health Organization (WHO), the United Nations Children’s Fund (UNICEF), the United Nations Population Fund (UNFPA), the World Bank and the United Nations Population Division, “Trends in Maternal Mortality 1990 - 2015.” |
1 | Total, all countries or areas | 2005 | Life expectancy at birth for both sexes (years) | 67.0455 | Data refers to a 5-year period preceding the reference year. | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Life expectancy at birth for males (years) | 64.8082 | Data refers to a 5-year period preceding the reference year. | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
Population Data set
#Original population dataset(p)
p_original <- read.csv("https://raw.githubusercontent.com/gpadmaperuma/DATA606/master/Population%2C%20Surface%20Area%20and%20Density.csv", header = TRUE, skip = 1)
head(p_original) %>% kable()
Region.Country.Area | X | Year | Series | Value | Footnotes | Source |
---|---|---|---|---|---|---|
1 | Total, all countries or areas | 2005 | Population mid-year estimates (millions) | 6541.9070 | | United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Population mid-year estimates for males (millions) | 3296.4853 | | United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Population mid-year estimates for females (millions) | 3245.4217 | | United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Sex ratio (males per 100 females) | 101.5734 | | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Population aged 0 to 14 years old (percentage) | 28.1425 | | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Population aged 60+ years old (percentage) | 10.2516 | | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
dim(pg_original_file)
## [1] 4984 7
dim(p_original)
## [1] 7351 7
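Both read.csv() calls above pass skip = 1, which suggests that each file starts with a title line above the real header row. The following is a minimal sketch of that behavior using a made-up four-line CSV. The title line and the object names `csv_text` and `demo` are illustrative assumptions, not the actual file contents.

```r
# Hypothetical miniature of a UN-style export: a title line, then the header row.
csv_text <- paste(
  "T03,Population growth indicators,,,",
  "Region/Country/Area,,Year,Series,Value",
  "1,\"Total, all countries or areas\",2005,Population annual rate of increase (percent),1.2570",
  "2,Africa,2005,Population annual rate of increase (percent),2.4390",
  sep = "\n"
)

# skip = 1 drops the title line so that the second line supplies the column names.
# stringsAsFactors = FALSE keeps text columns as character vectors instead of factors.
demo <- read.csv(text = csv_text, header = TRUE, skip = 1,
                 stringsAsFactors = FALSE)

names(demo)  # the unnamed second column is given a syntactic name by read.csv()
dim(demo)
```

Reading text columns as character rather than factor also avoids factor-level mismatches when two tables are joined later on.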
In analyzing these data I will try to answer two questions:
(1) Which region has the highest life expectancy? (2) Is there a relationship between population, fertility rate, and infant mortality rate in these regions?
There are 4984 cases in the population growth dataset. Each case represents one indicator of population growth, fertility, or mortality for a region, country, or area in a given year. There are 7351 cases in the population dataset. Each case represents one population measure (for example, the population of males or females, or the share of people aged 60+) for a region, country, or area in a given year.
These data were obtained from the United Nations database UNdata: A world of information.
UNdata is a web-based data service for the global user community. The data are maintained by the Statistics Division of the Department of Economic and Social Affairs (UN DESA) of the UN Secretariat. Most of the data are sourced from UN partner organizations such as UNICEF, UNDP, UNHCR, and WHO.
These data were collected as part of UN research efforts to address economic, health, and other problems around the world. They are observational data gathered in UN studies of those countries or regions.
UNdata: A world of information
United Nations (2019). Population Growth, Fertility and Mortality Indicators. Retrieved from http://data.un.org/
The response variable for this dataset is Value, which is a quantitative variable. It holds all the population, fertility, and mortality figures.
The two qualitative independent variables are Region/Country/Area and Series, and the one quantitative independent variable is Year, the year the data refer to.
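Because Value holds several series with different units, statistics computed over the whole column blend unlike quantities. The sketch below uses a small hand-built data frame (`toy` is a hypothetical name; its four values are the 2005 Africa and Asia figures from this dataset). It checks the column types and then summarizes Value separately within each Series.

```r
# Hand-built long-format sample mirroring the columns X, Year, Series, Value.
toy <- data.frame(
  X = c("Africa", "Asia", "Africa", "Asia"),
  Year = c(2005, 2005, 2005, 2005),
  Series = c("Life expectancy at birth for both sexes (years)",
             "Life expectancy at birth for both sexes (years)",
             "Infant mortality for both sexes (per 1,000 live births)",
             "Infant mortality for both sexes (per 1,000 live births)"),
  Value = c(53.5269, 68.3315, 81.0492, 45.8017),
  stringsAsFactors = FALSE
)

# Character columns are the qualitative variables, numeric ones the quantitative.
col_types <- sapply(toy, class)
col_types

# Mean of Value computed per Series, so that years and rates are not mixed.
by_series <- aggregate(Value ~ Series, data = toy, FUN = mean)
by_series
```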
Summary statistics for each of the variables and appropriate visualizations
#summary of original population growth file
summary(pg_original_file)
## Region.Country.Area X Year
## Min. : 1.0 Afghanistan: 21 Min. :2000
## 1st Qu.:152.0 Albania : 21 1st Qu.:2005
## Median :388.0 Algeria : 21 Median :2010
## Mean :393.4 Angola : 21 Mean :2010
## 3rd Qu.:624.0 Argentina : 21 3rd Qu.:2015
## Max. :894.0 Armenia : 21 Max. :2018
## (Other) :4858
## Series
## Infant mortality for both sexes (per 1,000 live births) :702
## Life expectancy at birth for both sexes (years) :705
## Life expectancy at birth for females (years) :735
## Life expectancy at birth for males (years) :735
## Maternal mortality ratio (deaths per 100,000 population):573
## Population annual rate of increase (percent) :799
## Total fertility rate (children per women) :735
## Value
## Min. : -4.978
## 1st Qu.: 3.074
## Median : 52.536
## Mean : 57.959
## 3rd Qu.: 73.586
## Max. :1986.136
##
## Footnotes
## Data refers to a 5-year period preceding the reference year. :3835
## : 659
## Data refers to a 5-year period preceding the reference year.;For statistical purposes, the data for China do not include those for the Hong Kong Special Administrative Region (Hong Kong SAR), Macao Special Administrative Region (Macao SAR) and Taiwan Province of China.: 18
## Data refers to a 5-year period preceding the reference year.;Including Abkhazia and South Ossetia. : 18
## Data refers to a 5-year period preceding the reference year.;Including Agalega, Rodrigues and Saint Brandon. : 18
## Data refers to a 5-year period preceding the reference year.;Including Åland Islands. : 18
## (Other) : 418
## Source
## United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019. : 799
## United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.:2910
## United Nations Statistics Division, New York, "Demographic Yearbook 2015" and the demographic statistics database, last accessed June 2017. : 702
## World Health Organization (WHO), the United Nations Children's Fund (UNICEF), the United Nations Population Fund (UNFPA), the World Bank and the United Nations Population Division, "Trends in Maternal Mortality 1990 - 2015." : 573
##
##
##
#summary of original population file
summary(p_original)
## Region.Country.Area X Year
## Min. : 1.0 Montserrat : 30 Min. :2000
## 1st Qu.:151.0 Afghanistan: 29 1st Qu.:2006
## Median :384.0 Africa : 29 Median :2016
## Mean :391.3 Albania : 29 Mean :2013
## 3rd Qu.:624.0 Algeria : 29 3rd Qu.:2017
## Max. :894.0 Americas : 29 Max. :2019
## (Other) :7176
## Series
## Population density :1115
## Population mid-year estimates (millions) :1115
## Sex ratio (males per 100 females) :1018
## Population aged 0 to 14 years old (percentage) : 993
## Population aged 60+ years old (percentage) : 993
## Population mid-year estimates for females (millions): 925
## (Other) :1192
## Value Footnotes
## Min. : 0.00 :6373
## 1st Qu.: 5.11 Projected estimate (medium fertility variant).: 88
## Median : 21.73 De jure population. : 55
## Mean : 201.69 Calculated by the UN Statistics Division. : 32
## 3rd Qu.: 94.51 Including Åland Islands. : 29
## Max. :136162.00 Including Svalbard and Jan Mayen Islands. : 29
## (Other) : 745
## Source
## United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019. :4080
## United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.:3004
## United Nations Statistics Division, New York, "Demographic Yearbook 2015" and the demographic statistics database, last accessed June 2017. : 267
##
##
##
##
#Population Growth
describe(pg_original_file)
## vars n mean sd median trimmed mad
## Region.Country.Area 1 4984 393.39 264.59 388.00 385.88 349.89
## X* 2 4984 133.17 76.52 132.00 133.08 99.33
## Year 3 4984 2009.95 4.12 2010.00 2009.97 7.41
## Series* 4 4984 4.03 2.02 4.00 4.03 2.97
## Value 5 4984 57.96 108.95 52.54 41.12 43.83
## Footnotes* 6 4984 10.95 6.20 11.00 10.55 0.00
## Source* 7 4984 2.21 0.85 2.00 2.14 0.00
## min max range skew kurtosis se
## Region.Country.Area 1.00 894.00 893.00 0.16 -1.23 3.75
## X* 1.00 265.00 264.00 0.01 -1.20 1.08
## Year 2000.00 2018.00 18.00 -0.03 -1.42 0.06
## Series* 1.00 7.00 6.00 0.00 -1.28 0.03
## Value -4.98 1986.14 1991.11 6.45 58.89 1.54
## Footnotes* 1.00 42.00 41.00 1.70 6.19 0.09
## Source* 1.00 4.00 3.00 0.72 0.07 0.01
#Population
describe(p_original)
## vars n mean sd median trimmed mad min
## Region.Country.Area 1 7351 391.27 265.72 384.00 383.42 349.89 1
## X* 2 7351 133.16 76.79 132.00 133.03 99.33 1
## Year 3 7351 2012.69 5.62 2016.00 2012.91 4.45 2000
## Series* 4 7351 4.11 2.09 4.00 4.09 2.97 1
## Value 5 7351 201.69 2094.52 21.73 38.65 29.74 0
## Footnotes* 6 7351 6.18 14.86 1.00 1.54 0.00 1
## Source* 7 7351 1.48 0.57 1.00 1.43 0.00 1
## max range skew kurtosis se
## Region.Country.Area 894 893 0.16 -1.23 3.10
## X* 266 265 0.02 -1.20 0.90
## Year 2019 19 -0.29 -1.49 0.07
## Series* 8 7 0.10 -1.14 0.02
## Value 136162 136162 41.93 2475.20 24.43
## Footnotes* 74 73 2.84 6.86 0.17
## Source* 3 2 0.67 -0.57 0.01
#Population Growth
head(pg_original_file)
## Region.Country.Area X Year
## 1 1 Total, all countries or areas 2005
## 2 1 Total, all countries or areas 2005
## 3 1 Total, all countries or areas 2005
## 4 1 Total, all countries or areas 2005
## 5 1 Total, all countries or areas 2005
## 6 1 Total, all countries or areas 2005
## Series Value
## 1 Population annual rate of increase (percent) 1.2570
## 2 Total fertility rate (children per women) 2.6513
## 3 Infant mortality for both sexes (per 1,000 live births) 49.2161
## 4 Maternal mortality ratio (deaths per 100,000 population) 288.0000
## 5 Life expectancy at birth for both sexes (years) 67.0455
## 6 Life expectancy at birth for males (years) 64.8082
## Footnotes
## 1 Data refers to a 5-year period preceding the reference year.
## 2 Data refers to a 5-year period preceding the reference year.
## 3 Data refers to a 5-year period preceding the reference year.
## 4
## 5 Data refers to a 5-year period preceding the reference year.
## 6 Data refers to a 5-year period preceding the reference year.
## Source
## 1 United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019.
## 2 United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.
## 3 United Nations Statistics Division, New York, "Demographic Yearbook 2015" and the demographic statistics database, last accessed June 2017.
## 4 World Health Organization (WHO), the United Nations Children's Fund (UNICEF), the United Nations Population Fund (UNFPA), the World Bank and the United Nations Population Division, "Trends in Maternal Mortality 1990 - 2015."
## 5 United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.
## 6 United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.
#Population
head(p_original)
## Region.Country.Area X Year
## 1 1 Total, all countries or areas 2005
## 2 1 Total, all countries or areas 2005
## 3 1 Total, all countries or areas 2005
## 4 1 Total, all countries or areas 2005
## 5 1 Total, all countries or areas 2005
## 6 1 Total, all countries or areas 2005
## Series Value Footnotes
## 1 Population mid-year estimates (millions) 6541.9070
## 2 Population mid-year estimates for males (millions) 3296.4853
## 3 Population mid-year estimates for females (millions) 3245.4217
## 4 Sex ratio (males per 100 females) 101.5734
## 5 Population aged 0 to 14 years old (percentage) 28.1425
## 6 Population aged 60+ years old (percentage) 10.2516
## Source
## 1 United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019.
## 2 United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019.
## 3 United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019.
## 4 United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.
## 5 United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.
## 6 United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.
The original data include values for both regions and countries. I will create two subsets, one for regions and one for countries, which makes it easier to visualize the data in an organized way.
# Drop unwanted columns from the original file and save the result as a new data frame.
UN_PopulationGrowth <-
select(pg_original_file, -c("Region.Country.Area","Footnotes", "Source"))
head(UN_PopulationGrowth)
## X Year
## 1 Total, all countries or areas 2005
## 2 Total, all countries or areas 2005
## 3 Total, all countries or areas 2005
## 4 Total, all countries or areas 2005
## 5 Total, all countries or areas 2005
## 6 Total, all countries or areas 2005
## Series Value
## 1 Population annual rate of increase (percent) 1.2570
## 2 Total fertility rate (children per women) 2.6513
## 3 Infant mortality for both sexes (per 1,000 live births) 49.2161
## 4 Maternal mortality ratio (deaths per 100,000 population) 288.0000
## 5 Life expectancy at birth for both sexes (years) 67.0455
## 6 Life expectancy at birth for males (years) 64.8082
UN_Population <-
select(p_original, -c("Region.Country.Area","Footnotes", "Source"))
head(UN_Population)
## X Year
## 1 Total, all countries or areas 2005
## 2 Total, all countries or areas 2005
## 3 Total, all countries or areas 2005
## 4 Total, all countries or areas 2005
## 5 Total, all countries or areas 2005
## 6 Total, all countries or areas 2005
## Series Value
## 1 Population mid-year estimates (millions) 6541.9070
## 2 Population mid-year estimates for males (millions) 3296.4853
## 3 Population mid-year estimates for females (millions) 3245.4217
## 4 Sex ratio (males per 100 females) 101.5734
## 5 Population aged 0 to 14 years old (percentage) 28.1425
## 6 Population aged 60+ years old (percentage) 10.2516
#Population Growth by region
PG_Region <- UN_PopulationGrowth %>%
slice(22:564)
names(PG_Region)[names(PG_Region) == "X"] <- "Region"
head(PG_Region)
## Region Year Series
## 1 Africa 2005 Population annual rate of increase (percent)
## 2 Africa 2005 Total fertility rate (children per women)
## 3 Africa 2005 Infant mortality for both sexes (per 1,000 live births)
## 4 Africa 2005 Life expectancy at birth for both sexes (years)
## 5 Africa 2005 Life expectancy at birth for males (years)
## 6 Africa 2005 Life expectancy at birth for females (years)
## Value
## 1 2.4390
## 2 5.0771
## 3 81.0492
## 4 53.5269
## 5 51.9582
## 6 55.1343
#Population by Region
Total_Population_Region <- UN_Population %>%
slice(30:870)
names(Total_Population_Region)[names(Total_Population_Region) == "X"] <- "Region"
head(Total_Population_Region)
## Region Year Series
## 1 Africa 2005 Population mid-year estimates (millions)
## 2 Africa 2005 Population mid-year estimates for males (millions)
## 3 Africa 2005 Population mid-year estimates for females (millions)
## 4 Africa 2005 Sex ratio (males per 100 females)
## 5 Africa 2005 Population aged 0 to 14 years old (percentage)
## 6 Africa 2005 Population aged 60+ years old (percentage)
## Value
## 1 916.1543
## 2 456.6481
## 3 459.5062
## 4 99.3780
## 5 41.8707
## 6 5.1112
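The slice() calls above select regions by row position, which depends on the order of rows in the downloaded file. As a hedged alternative, the rows could be selected by label. The sketch below works on a toy data frame. The names `toy`, `region_names`, and `world_label` are hypothetical, the list of region labels is an assumption that would need to match the real file, and the country value is deliberately left as NA.

```r
library(dplyr)

# Toy stand-in for UN_PopulationGrowth (the country's value is left NA on purpose).
toy <- data.frame(
  X = c("Total, all countries or areas", "Africa", "Asia", "Afghanistan"),
  Year = c(2005, 2005, 2005, 2005),
  Series = "Population annual rate of increase (percent)",
  Value = c(1.2570, 2.4390, 1.2270, NA),
  stringsAsFactors = FALSE
)

# Assumed region labels; extend this vector to match the labels in the real file.
region_names <- c("Africa", "Asia", "Europe", "Americas", "Oceania")
world_label <- "Total, all countries or areas"

# Rows whose label is a region.
regions <- toy %>%
  filter(X %in% region_names) %>%
  rename(Region = X)

# Everything that is neither a region nor the world total is treated as a country.
countries <- toy %>%
  filter(!X %in% c(region_names, world_label)) %>%
  rename(Country = X)
```

Selecting by label keeps working if rows are added or reordered in a later release of the file, at the cost of maintaining the list of labels.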
PG_by_Region <- PG_Region %>%
spread(key = Series, value = Value)
names(PG_by_Region)[names(PG_by_Region) == "Infant mortality for both sexes (per 1,000 live births)"] <- "Infant_Mortality"
names(PG_by_Region)[names(PG_by_Region) == "Life expectancy at birth for both sexes (years)"] <- "Life_Expectancy"
names(PG_by_Region)[names(PG_by_Region) == "Maternal mortality ratio (deaths per 100,000 population)"] <- "Maternal_mortality_ratio"
names(PG_by_Region)[names(PG_by_Region) == "Life expectancy at birth for males (years)"] <- "LifeExpectancy_males"
names(PG_by_Region)[names(PG_by_Region) == "Life expectancy at birth for females (years)"] <- "LifeExpectancy_females"
names(PG_by_Region)[names(PG_by_Region) == "Population annual rate of increase (percent)"] <- "Population_increase_rate"
names(PG_by_Region)[names(PG_by_Region) == "Total fertility rate (children per women)"] <- "Total_fertility_rate"
names(PG_by_Region)[names(PG_by_Region) == "X"] <- "Region"
head(PG_by_Region)
## Region Year Infant_Mortality Life_Expectancy LifeExpectancy_females
## 1 Africa 2005 81.0492 53.5269 55.1343
## 2 Africa 2010 67.7143 56.7825 58.3389
## 3 Africa 2015 55.9325 60.2471 61.9302
## 4 Asia 2005 45.8017 68.3315 70.1224
## 5 Asia 2010 37.1114 70.0293 71.9654
## 6 Asia 2015 29.5012 71.8300 74.0127
## LifeExpectancy_males Maternal_mortality_ratio Population_increase_rate
## 1 51.9582 NA 2.439
## 2 55.2459 NA 2.522
## 3 58.5824 NA 2.581
## 4 66.6483 NA 1.227
## 5 68.2175 NA 1.132
## 6 69.8000 NA 1.036
## Total_fertility_rate
## 1 5.0771
## 2 4.9000
## 3 4.7301
## 4 2.4467
## 5 2.3281
## 6 2.2098
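The reshape-and-rename step above can also be written with a lookup vector instead of one names() assignment per column. This is only a sketch under assumptions: it uses tidyr::pivot_wider(), which requires tidyr 1.0.0 or later (newer than the version attached in this report), and the names `long`, `wide`, and `short_names` are hypothetical. The four values are the 2005 Africa and Asia figures shown above.

```r
library(tidyr)

# Long-format sample: one row per Region, Year, and Series.
long <- data.frame(
  Region = c("Africa", "Africa", "Asia", "Asia"),
  Year = c(2005, 2005, 2005, 2005),
  Series = c("Infant mortality for both sexes (per 1,000 live births)",
             "Total fertility rate (children per women)",
             "Infant mortality for both sexes (per 1,000 live births)",
             "Total fertility rate (children per women)"),
  Value = c(81.0492, 5.0771, 45.8017, 2.4467),
  stringsAsFactors = FALSE
)

# One column per Series, one row per Region and Year.
wide <- pivot_wider(long, names_from = Series, values_from = Value)

# Lookup table: long UN label -> short column name.
short_names <- c(
  "Infant mortality for both sexes (per 1,000 live births)" = "Infant_Mortality",
  "Total fertility rate (children per women)" = "Total_fertility_rate"
)

# Rename only the columns that appear in the lookup table.
hit <- names(wide) %in% names(short_names)
names(wide)[hit] <- unname(short_names[names(wide)[hit]])
names(wide)
```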
P_by_Region <- Total_Population_Region %>%
spread(key = Series, value = Value)
names(P_by_Region)[names(P_by_Region) == "Population mid-year estimates (millions)"] <- "Pop.est.total"
names(P_by_Region)[names(P_by_Region) == "Population mid-year estimates for males (millions)"] <- "Pop.est.males"
names(P_by_Region)[names(P_by_Region) == "Population mid-year estimates for females (millions)"] <- "Pop.est.females"
names(P_by_Region)[names(P_by_Region) == "Sex ratio (males per 100 females)"] <- "m.to.f.ratio"
P_by_Region <-
select(P_by_Region, -c("Population aged 0 to 14 years old (percentage)","Population aged 60+ years old (percentage)", "Surface area (thousand km2)"))
head(P_by_Region)
## Region Year Population density Pop.est.total Pop.est.females
## 1 Africa 2005 30.9005 916.1543 459.5062
## 2 Africa 2010 35.0542 1039.3040 521.0514
## 3 Africa 2017 41.9658 1244.2223 622.8357
## 4 Africa 2019 44.1191 1308.0642 654.5505
## 5 Americas 2005 20.9061 884.7882 448.2641
## 6 Americas 2010 22.0840 934.6398 473.6061
## Pop.est.males m.to.f.ratio
## 1 456.6481 99.3780
## 2 518.2526 99.4629
## 3 621.3865 99.7673
## 4 653.5137 99.8416
## 5 436.5240 97.3810
## 6 461.0337 97.3454
Using indicators such as the infant mortality rate and life expectancy (for both sexes, for males, and for females), I am creating interactive scatter plots to better understand these populations around the world. The data for the major regions cover the years 2005, 2010, and 2015.
# Infant mortality rate by region
g <- ggplot(PG_by_Region, aes(x = Infant_Mortality, y = Region, text = Year))+
geom_point(aes(color=Region))
ggplotly(g)
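The plots in this section map Year to a `text` aesthetic so that it shows up when hovering over a point. A minimal sketch of controlling which fields appear in the hover box through the `tooltip` argument of ggplotly() follows. The data frame `toy` is a hypothetical sample built from the Africa and Asia values shown earlier.

```r
library(ggplot2)
library(plotly)

# Small sample of regional infant mortality values from the tables above.
toy <- data.frame(
  Region = c("Africa", "Africa", "Africa", "Asia", "Asia", "Asia"),
  Year = c(2005, 2010, 2015, 2005, 2010, 2015),
  Infant_Mortality = c(81.0492, 67.7143, 55.9325, 45.8017, 37.1114, 29.5012)
)

# The text aesthetic is not drawn by ggplot2; ggplotly() uses it for the hover label.
g <- ggplot(toy, aes(x = Infant_Mortality, y = Region, color = Region,
                     text = paste("Year:", Year))) +
  geom_point()

# Show only the x value, the y value, and the custom text in the tooltip.
p <- ggplotly(g, tooltip = c("x", "y", "text"))
p
```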
# Life expectancy for both sexes by region (Main Regions)
g<-ggplot(subset(PG_by_Region, Region %in% c("Africa", "Asia", "Australia and New Zealand", "Europe", "Caribbean", "South America", "Northern America")),
aes(x = Life_Expectancy, y = Region, text = Year))+
geom_point(aes(color=Region))
ggplotly(g)
# Life expectancy for Males by region
g<-ggplot(PG_by_Region, aes(x = LifeExpectancy_males, y = Region, fill = Region, text = Year))+
geom_point(aes(color=Region))
ggplotly(g)
# Life expectancy for Females by region
g<-ggplot(PG_by_Region, aes(x = LifeExpectancy_females, y = Region, text = Year))+
geom_point(aes(color=Region))
knitr::opts_chunk$set(fig.width=12, fig.height=8)
ggplotly(g)
g <- ggplot(PG_by_Region, aes(x = Region, y = Life_Expectancy, fill = as.character(Year))) +
geom_bar(stat = "Identity", position = "dodge") +
geom_text(aes(label = round(Life_Expectancy, 0)), hjust=-0.5, color="black", position = position_dodge(1), size = 2)+
scale_fill_brewer(palette = "Paired") +
theme(axis.text.x=element_text(angle = 0, vjust = 1)) +
theme(plot.title = element_text(hjust = 0.5), legend.position = "bottom") +
ggtitle("Life Expectancy by Region") +
xlab("Regions") + ylab ("Age in Years") +
coord_flip()
ggplotly(g)
Next, I compare the population growth and population data to examine the relationship between fertility rate, infant mortality, and population growth rate. The fertility rate, measured in children per woman, clearly declined between 2005 and 2010. This has contributed to a slower rate of population increase. Infant mortality rates have also declined, which is reflected in the lower fertility rates: in an environment with high child mortality, women give birth to more children than they want in order to insure against the loss of children (source: https://ourworldindata.org/fertility-rate).
#Merged population data for 2005 and 2010
merged_population_data <- PG_by_Region %>%
inner_join(P_by_Region, by = c("Region" = "Region", "Year" = "Year"))
## Warning: Column `Region` joining factors with different levels, coercing to
## character vector
head(merged_population_data) %>%
kable() %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) %>%
scroll_box(width="100%",height="400px")
Region | Year | Infant_Mortality | Life_Expectancy | LifeExpectancy_females | LifeExpectancy_males | Maternal_mortality_ratio | Population_increase_rate | Total_fertility_rate | Population density | Pop.est.total | Pop.est.females | Pop.est.males | m.to.f.ratio |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Africa | 2005 | 81.0492 | 53.5269 | 55.1343 | 51.9582 | NA | 2.439 | 5.0771 | 30.9005 | 916.1543 | 459.5062 | 456.6481 | 99.3780 |
Africa | 2010 | 67.7143 | 56.7825 | 58.3389 | 55.2459 | NA | 2.522 | 4.9000 | 35.0542 | 1039.3040 | 521.0514 | 518.2526 | 99.4629 |
Asia | 2005 | 45.8017 | 68.3315 | 70.1224 | 66.6483 | NA | 1.227 | 2.4467 | 128.1851 | 3977.9865 | 1942.7627 | 2035.2237 | 104.7593 |
Asia | 2010 | 37.1114 | 70.0293 | 71.9654 | 68.2175 | NA | 1.132 | 2.3281 | 135.6484 | 4209.5937 | 2054.2814 | 2155.3123 | 104.9181 |
Australia and New Zealand | 2005 | 5.0247 | 80.1206 | 82.5737 | 77.6371 | NA | 1.242 | 1.8043 | 3.0600 | 24.3139 | 12.2181 | 12.0958 | 98.9993 |
Australia and New Zealand | 2010 | 4.5033 | 81.2801 | 83.5156 | 79.0327 | NA | 1.741 | 1.9850 | 3.3383 | 26.5247 | 13.2996 | 13.2251 | 99.4400 |
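The warning printed after the join arises because Region is a factor with different levels in the two tables. The merged table is also limited to 2005 and 2010 because an inner join keeps only the Region and Year pairs present in both tables. A minimal sketch of both points, using hypothetical toy tables `growth` and `pop` with values taken from the outputs above:

```r
library(dplyr)

# Growth indicators are available for 2005, 2010, and 2015 ...
growth <- data.frame(
  Region = factor(c("Africa", "Africa", "Asia")),
  Year = c(2005, 2015, 2005),
  Total_fertility_rate = c(5.0771, 4.7301, 2.4467)
)

# ... while population estimates use 2005, 2010, 2017, and 2019.
pop <- data.frame(
  Region = factor(c("Africa", "Africa", "Asia", "Americas")),
  Year = c(2005, 2017, 2005, 2005),
  Pop.est.total = c(916.1543, 1244.2223, 3977.9865, 884.7882)
)

# Converting the key to character first avoids joining factors with different levels.
growth_chr <- growth %>% mutate(Region = as.character(Region))
pop_chr <- pop %>% mutate(Region = as.character(Region))

# Only Region and Year pairs found in both tables survive the inner join.
merged <- inner_join(growth_chr, pop_chr, by = c("Region", "Year"))
merged
```

In this toy case the 2015 and 2017 rows and the Americas row drop out, leaving the two 2005 rows.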
summary(merged_population_data)
## Region Year Infant_Mortality Life_Expectancy
## Length:56 Min. :2005 Min. : 3.762 Min. :49.54
## Class :character 1st Qu.:2005 1st Qu.: 16.112 1st Qu.:63.64
## Mode :character Median :2008 Median : 31.162 Median :69.75
## Mean :2008 Mean : 36.569 Mean :68.00
## 3rd Qu.:2010 3rd Qu.: 51.484 3rd Qu.:74.52
## Max. :2010 Max. :103.615 Max. :81.28
##
## LifeExpectancy_females LifeExpectancy_males Maternal_mortality_ratio
## Min. :50.51 Min. :48.58 Min. : 36.0
## 1st Qu.:64.72 1st Qu.:61.54 1st Qu.: 83.5
## Median :72.87 Median :67.58 Median :103.0
## Mean :70.48 Mean :65.58 Mean :199.1
## 3rd Qu.:77.26 3rd Qu.:71.41 3rd Qu.:207.2
## Max. :83.52 Max. :79.03 Max. :717.0
## NA's :42
## Population_increase_rate Total_fertility_rate Population density
## Min. :-0.4270 Min. :1.260 Min. : 3.06
## 1st Qu.: 0.8057 1st Qu.:1.990 1st Qu.: 20.53
## Median : 1.3230 Median :2.534 Median : 40.50
## Mean : 1.3953 Mean :2.975 Mean : 71.56
## 3rd Qu.: 1.8373 3rd Qu.:3.216 3rd Qu.:130.78
## Max. : 3.2270 Max. :6.381 Max. :267.58
##
## Pop.est.total Pop.est.females Pop.est.males
## Min. : 0.498 Min. : 0.2467 Min. : 0.2516
## 1st Qu.: 87.932 1st Qu.: 44.8318 1st Qu.: 43.0999
## Median : 250.161 Median : 122.5598 Median : 127.6007
## Mean : 559.181 Mean : 276.6138 Mean : 282.5668
## 3rd Qu.: 630.032 3rd Qu.: 316.6489 3rd Qu.: 311.5643
## Max. :4209.594 Max. :2054.2814 Max. :2155.3123
##
## m.to.f.ratio
## Min. : 88.71
## 1st Qu.: 97.10
## Median : 98.98
## Mean : 99.36
## 3rd Qu.:101.30
## Max. :108.33
##
population_subset <- merged_population_data[c(1:3,8:9)]
head(population_subset)
## Region Year Infant_Mortality Population_increase_rate
## 1 Africa 2005 81.0492 2.439
## 2 Africa 2010 67.7143 2.522
## 3 Asia 2005 45.8017 1.227
## 4 Asia 2010 37.1114 1.132
## 5 Australia and New Zealand 2005 5.0247 1.242
## 6 Australia and New Zealand 2010 4.5033 1.741
## Total_fertility_rate
## 1 5.0771
## 2 4.9000
## 3 2.4467
## 4 2.3281
## 5 1.8043
## 6 1.9850
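The subset above is taken by column position (`c(1:3,8:9)`), which silently changes meaning if the column order of the merged table changes. A small sketch of selecting the same columns by name instead follows. Here `toy_merged` and `keep` are hypothetical names, and the single row reuses the Africa 2005 values.

```r
# One-row stand-in for merged_population_data with a few of its columns.
toy_merged <- data.frame(
  Region = "Africa",
  Year = 2005,
  Infant_Mortality = 81.0492,
  Life_Expectancy = 53.5269,
  Population_increase_rate = 2.439,
  Total_fertility_rate = 5.0771,
  stringsAsFactors = FALSE
)

# Columns wanted for the relationship analysis, listed by name.
keep <- c("Region", "Year", "Infant_Mortality",
          "Population_increase_rate", "Total_fertility_rate")

subset_by_name <- toy_merged[, keep]
names(subset_by_name)
```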
library(ggpubr)
## Loading required package: magrittr
##
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
##
## set_names
## The following object is masked from 'package:tidyr':
##
## extract
Population_Increase <- ggplot(population_subset, aes(y=Region, x=Population_increase_rate, fill = Year))+
geom_point(color = "blue")
Fertility_Rate <- ggplot(population_subset, aes(y=Region, x=Total_fertility_rate, fill = Year))+
geom_point(color = "Red")
Infant_Mortality <- ggplot(population_subset, aes(y=Region, x=Infant_Mortality, fill = Year))+
geom_point(color = "Green")
# ggarrange() has no `scales` argument; an extra string is treated as a plot
# and triggers a "Cannot convert object of class character into a grob" warning.
figure <- ggarrange(Population_Increase, Fertility_Rate, Infant_Mortality,
labels = c("A", "B", "C"),
ncol = 3, nrow = 1)
figure
findCorrelation <- function() {
x = population_subset$Infant_Mortality
y = population_subset$Total_fertility_rate
corr = round(cor(x, y),4)
print (paste0("Correlation = ",corr))
return (corr)
}
c = findCorrelation()
## [1] "Correlation = 0.9209"
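cor() returns only the coefficient. As a hedged complement, cor.test() from base R's stats package also reports a p-value and a confidence interval for the correlation. The sketch below runs it on just the six rows of population_subset printed above, so its numbers will differ from the full-data result of 0.9209.

```r
# The six rows shown by head(population_subset): Africa, Asia, Australia and New Zealand.
infant_mortality <- c(81.0492, 67.7143, 45.8017, 37.1114, 5.0247, 4.5033)
fertility_rate <- c(5.0771, 4.9000, 2.4467, 2.3281, 1.8043, 1.9850)

# Pearson correlation test (the default method).
ct <- cor.test(infant_mortality, fertility_rate)

ct$estimate  # sample correlation coefficient
ct$conf.int  # 95% confidence interval for the correlation
ct$p.value   # p-value for the null hypothesis of zero correlation
```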
findStatsFunction <- function() {
m = lm (Infant_Mortality ~ Total_fertility_rate, data = population_subset)
s = summary(m)
print(s)
slp = round(m$coefficients[2], 4)
int = round(m$coefficients[1], 4)
return (m)
}
m = findStatsFunction()
##
## Call:
## lm(formula = Infant_Mortality ~ Total_fertility_rate, data = population_subset)
##
## Residuals:
## Min 1Q Median 3Q Max
## -16.212 -7.672 -2.871 7.250 23.868
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -15.682 3.328 -4.712 1.76e-05 ***
## Total_fertility_rate 17.564 1.012 17.356 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 10.62 on 54 degrees of freedom
## Multiple R-squared: 0.848, Adjusted R-squared: 0.8452
## F-statistic: 301.2 on 1 and 54 DF, p-value: < 2.2e-16
plot = ggplot(population_subset, aes(Infant_Mortality, Total_fertility_rate)) + geom_point(colour="blue") +
xlab("Infant mortality") + ylab("Fertility Rate") + labs(title = "Infant Mortality vs. Total Fertility Rate")
ggplotly(plot)
\(H_0\): Null hypothesis - There is no relationship between Infant Mortality and Fertility Rate.
\(H_A\): Alternative hypothesis - There is a relationship between Infant Mortality and Fertility Rate.
Here the correlation coefficient (multiple R) is 0.9209, which shows a strong positive correlation between Infant Mortality and Fertility Rate. The R-squared value is 0.848, meaning that about 85% of the variation in Infant Mortality is accounted for by its linear relationship with Fertility Rate. The p-value for the slope is below 2e-16, so we reject the null hypothesis \(H_0\) in favor of the alternative hypothesis \(H_A\).
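In a simple linear regression with one predictor, R-squared equals the square of the correlation coefficient, so the two printed values can be cross-checked against each other. A short sketch follows. The second half reuses the six rows of population_subset shown earlier purely as an illustration, not as the full analysis.

```r
# Correlation reported by findCorrelation() on the full data.
r <- 0.9209
r_squared <- r^2
round(r_squared, 3)  # 0.848, the Multiple R-squared printed by summary()

# The same identity checked on the six rows printed by head(population_subset).
infant_mortality <- c(81.0492, 67.7143, 45.8017, 37.1114, 5.0247, 4.5033)
fertility_rate <- c(5.0771, 4.9000, 2.4467, 2.3281, 1.8043, 1.9850)

fit <- lm(infant_mortality ~ fertility_rate)
same <- all.equal(cor(infant_mortality, fertility_rate)^2,
                  summary(fit)$r.squared)
same
```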
The graph of life expectancy by region clearly shows that the Australia and New Zealand region has the highest life expectancy, at about 82 years. This region has been at the top in every year shown, with a life expectancy above 80.
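The same question can be answered directly from the table rather than read off the chart. The sketch below picks the top region per year in base R. The data frame `life` is a hypothetical sample containing only the three regions whose 2005 and 2010 values are printed above, so it illustrates the approach rather than the full ranking.

```r
# Life expectancy (both sexes) for three regions, copied from the merged table.
life <- data.frame(
  Region = c("Africa", "Asia", "Australia and New Zealand",
             "Africa", "Asia", "Australia and New Zealand"),
  Year = c(2005, 2005, 2005, 2010, 2010, 2010),
  Life_Expectancy = c(53.5269, 68.3315, 80.1206, 56.7825, 70.0293, 81.2801),
  stringsAsFactors = FALSE
)

# Split by year, keep the row with the largest life expectancy, and recombine.
top_by_year <- do.call(
  rbind,
  lapply(split(life, life$Year), function(d) d[which.max(d$Life_Expectancy), ])
)
top_by_year
```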
Region | Year | Infant.Mortality | Pop.increase | Fertility |
---|---|---|---|---|
Australia and New Zealand | 2005 | 5.0247 | 1.242 | 1.8043 |
Australia and New Zealand | 2010 | 4.5033 | 1.741 | 1.9850 |
In this region, the infant mortality rate decreased while the population growth rate and the fertility rate increased between 2005 and 2010. This is a good sign and probably a good indicator of the high life expectancy in that region. Overall, infant mortality has decreased across the regions, and populations are still increasing, though generally at a slower pace than they used to.