library(tidyverse)
## -- Attaching packages -------------------------------------------------------------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.2.1 v purrr 0.3.2
## v tibble 2.1.3 v dplyr 0.8.3
## v tidyr 0.8.3 v stringr 1.4.0
## v readr 1.3.1 v forcats 0.4.0
## -- Conflicts ----------------------------------------------------------------------------------------------------------------------------------- tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
library(dplyr)
library(psych)
##
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
##
## %+%, alpha
# Read the original .csv file downloaded from http://data.un.org/
original_file <- read.csv("https://raw.githubusercontent.com/gpadmaperuma/DATA606/master/SYB62_T03_201907_Population%20Growth%2C%20Fertility%20and%20Mortality%20Indicators.csv", header = TRUE, skip = 1)
head(original_file) %>% kable() %>% kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) %>% scroll_box(width = "100%", height = "400px")
Region.Country.Area | X | Year | Series | Value | Footnotes | Source |
---|---|---|---|---|---|---|
1 | Total, all countries or areas | 2005 | Population annual rate of increase (percent) | 1.2570 | Data refers to a 5-year period preceding the reference year. | United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Total fertility rate (children per women) | 2.6513 | Data refers to a 5-year period preceding the reference year. | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Infant mortality for both sexes (per 1,000 live births) | 49.2161 | Data refers to a 5-year period preceding the reference year. | United Nations Statistics Division, New York, “Demographic Yearbook 2015” and the demographic statistics database, last accessed June 2017. |
1 | Total, all countries or areas | 2005 | Maternal mortality ratio (deaths per 100,000 population) | 288.0000 | | World Health Organization (WHO), the United Nations Children’s Fund (UNICEF), the United Nations Population Fund (UNFPA), the World Bank and the United Nations Population Division, “Trends in Maternal Mortality 1990 - 2015.” |
1 | Total, all countries or areas | 2005 | Life expectancy at birth for both sexes (years) | 67.0455 | Data refers to a 5-year period preceding the reference year. | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
1 | Total, all countries or areas | 2005 | Life expectancy at birth for males (years) | 64.8082 | Data refers to a 5-year period preceding the reference year. | United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019. |
dim(original_file)
## [1] 4984 7
In analysing these data, I will try to answer two questions:
(1) Which country/region/area has the highest fertility rate?
(2) Which country/region/area has the highest mortality rate?
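One way to approach both questions is to keep only the rows of a single series and year and then pick the row with the largest Value. The sketch below is a minimal illustration in base R, not the analysis itself: `top_by_series()` is a hypothetical helper, the toy rows use made-up numbers rather than UN figures, and treating "Infant mortality" as the mortality measure for question (2) is an assumption (the file also contains a maternal mortality series).

```r
# Hypothetical helper: among rows whose Series contains `pattern` and whose
# Year equals `year`, return the row(s) with the largest Value.
top_by_series <- function(data, pattern, year) {
  rows <- data[grepl(pattern, data$Series, fixed = TRUE) & data$Year == year, ]
  rows[rows$Value == max(rows$Value), c("X", "Year", "Series", "Value")]
}

# Toy rows with made-up values, only to show the call; these are NOT UN data.
toy <- data.frame(
  X = c("Area A", "Area B", "Area A", "Area B"),
  Year = c(2010, 2010, 2010, 2010),
  Series = rep(c("Total fertility rate (children per women)",
                 "Infant mortality for both sexes (per 1,000 live births)"),
               each = 2),
  Value = c(2.1, 5.4, 30.0, 12.5),
  stringsAsFactors = FALSE
)

top_fertility <- top_by_series(toy, "fertility rate", 2010)
top_mortality <- top_by_series(toy, "Infant mortality", 2010)
print(top_fertility)
print(top_mortality)
```

On the real data the same call would be `top_by_series(original_file, "fertility rate", 2015)`, with the year chosen from those present in the file. Aggregate rows such as "Total, all countries or areas" are also in the file and may need to be excluded if only individual countries are of interest.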
There are 4984 cases in this dataset. Each case represents one indicator value (a series on population growth, fertility, or mortality) for one country, region, or area in one year.
These data were obtained from the United Nations database called UNdata: A world of information.
UNdata is a web-based data service for the global user community. The data are maintained by the Statistics Division of the Department of Economic and Social Affairs (UN DESA) of the UN Secretariat. Most of the data are sourced from UN partner organizations such as UNICEF, UNDP, UNHCR, and WHO.
These data were collected as part of UN research efforts in order to address economic, health, and other problems worldwide. They are observational data gathered through UN research in those countries or regions.
UNdata: A world of information
United Nations (2019). Population Growth, Fertility and Mortality Indicators. Retrieved from http://data.un.org/
The response variable for this dataset is Value, which is a quantitative variable. It holds all the population growth, fertility, and mortality figures.
The two qualitative independent variables are Region/Country/Area and Series, and the one quantitative independent variable is Year, the year to which the data refer.
Summary statistics for each of the variables and appropriate visualizations
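In the imported file, the column named Region.Country.Area actually holds a numeric code, while the country/region/area names sit in the column that read.csv labelled X (see the head() output above). A minimal sketch of renaming the two columns so the variable roles are easier to follow is given here. `rename_un_columns()` and the new names `Code` and `Area` are illustrative assumptions, not part of the UN file or of the original analysis.

```r
# Hypothetical helper: give the two leading columns clearer names.
# "Code" and "Area" are illustrative names chosen for this sketch.
rename_un_columns <- function(data) {
  names(data)[names(data) == "Region.Country.Area"] <- "Code"
  names(data)[names(data) == "X"] <- "Area"
  data
}

# Two rows copied from the head() output shown earlier in this document.
sample_rows <- data.frame(
  Region.Country.Area = c(1L, 1L),
  X = c("Total, all countries or areas", "Total, all countries or areas"),
  Year = c(2005L, 2005L),
  Series = c("Population annual rate of increase (percent)",
             "Total fertility rate (children per women)"),
  Value = c(1.2570, 2.6513),
  stringsAsFactors = FALSE
)

renamed <- rename_un_columns(sample_rows)
str(renamed)  # shows which columns are numeric and which are character
```

Applied to the full data, this would be `original_file <- rename_un_columns(original_file)`. Later code in this document refers to the original column names, so it would need adjusting if the rename were adopted.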
summary(original_file)
## Region.Country.Area X Year
## Min. : 1.0 Afghanistan: 21 Min. :2000
## 1st Qu.:152.0 Albania : 21 1st Qu.:2005
## Median :388.0 Algeria : 21 Median :2010
## Mean :393.4 Angola : 21 Mean :2010
## 3rd Qu.:624.0 Argentina : 21 3rd Qu.:2015
## Max. :894.0 Armenia : 21 Max. :2018
## (Other) :4858
## Series
## Infant mortality for both sexes (per 1,000 live births) :702
## Life expectancy at birth for both sexes (years) :705
## Life expectancy at birth for females (years) :735
## Life expectancy at birth for males (years) :735
## Maternal mortality ratio (deaths per 100,000 population):573
## Population annual rate of increase (percent) :799
## Total fertility rate (children per women) :735
## Value
## Min. : -4.978
## 1st Qu.: 3.074
## Median : 52.536
## Mean : 57.959
## 3rd Qu.: 73.586
## Max. :1986.136
##
## Footnotes
## Data refers to a 5-year period preceding the reference year. :3835
## : 659
## Data refers to a 5-year period preceding the reference year.;For statistical purposes, the data for China do not include those for the Hong Kong Special Administrative Region (Hong Kong SAR), Macao Special Administrative Region (Macao SAR) and Taiwan Province of China.: 18
## Data refers to a 5-year period preceding the reference year.;Including Abkhazia and South Ossetia. : 18
## Data refers to a 5-year period preceding the reference year.;Including Agalega, Rodrigues and Saint Brandon. : 18
## Data refers to a 5-year period preceding the reference year.;Including Åland Islands. : 18
## (Other) : 418
## Source
## United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2019. : 799
## United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2015 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2019.:2910
## United Nations Statistics Division, New York, "Demographic Yearbook 2015" and the demographic statistics database, last accessed June 2017. : 702
## World Health Organization (WHO), the United Nations Children's Fund (UNICEF), the United Nations Population Fund (UNFPA), the World Bank and the United Nations Population Division, "Trends in Maternal Mortality 1990 - 2015." : 573
##
##
##
describe(original_file)
## vars n mean sd median trimmed mad
## Region.Country.Area 1 4984 393.39 264.59 388.00 385.88 349.89
## X* 2 4984 133.17 76.52 132.00 133.08 99.33
## Year 3 4984 2009.95 4.12 2010.00 2009.97 7.41
## Series* 4 4984 4.03 2.02 4.00 4.03 2.97
## Value 5 4984 57.96 108.95 52.54 41.12 43.83
## Footnotes* 6 4984 10.95 6.20 11.00 10.55 0.00
## Source* 7 4984 2.21 0.85 2.00 2.14 0.00
## min max range skew kurtosis se
## Region.Country.Area 1.00 894.00 893.00 0.16 -1.23 3.75
## X* 1.00 265.00 264.00 0.01 -1.20 1.08
## Year 2000.00 2018.00 18.00 -0.03 -1.42 0.06
## Series* 1.00 7.00 6.00 0.00 -1.28 0.03
## Value -4.98 1986.14 1991.11 6.45 58.89 1.54
## Footnotes* 1.00 42.00 41.00 1.70 6.19 0.09
## Source* 1.00 4.00 3.00 0.72 0.07 0.01
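The pooled statistics for Value above mix series measured in different units (percent, children per woman, deaths per 1,000 live births, years), so the overall mean and skew are hard to interpret. A possible refinement is to summarise Value separately for each Series. The sketch below uses base R `aggregate()`. `summarise_by_series()` is a hypothetical helper and the toy values are made up for illustration, not taken from the UN file.

```r
# Hypothetical helper: count, minimum, median and maximum of Value per Series.
summarise_by_series <- function(data) {
  stats <- aggregate(
    Value ~ Series,
    data = data,
    FUN = function(v) c(n = length(v), min = min(v),
                        median = median(v), max = max(v))
  )
  # aggregate() returns the statistics as one matrix column;
  # flatten it into ordinary columns (Value.n, Value.min, ...).
  do.call(data.frame, stats)
}

# Toy rows with made-up values; these are NOT UN data.
toy <- data.frame(
  Series = c("Series A", "Series A", "Series B", "Series B", "Series B"),
  Value = c(1, 3, 10, 20, 30),
  stringsAsFactors = FALSE
)

by_series <- summarise_by_series(toy)
print(by_series)
```

On the real data the call would be `summarise_by_series(original_file)`, giving one row per series.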
hist(original_file$Value)
ggplot(data = original_file, mapping = aes(x = Year, y = Value)) +
  geom_line()
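Because the line plot above draws every series and every area on a single pair of axes, the line connects values that are not comparable. One possible alternative, sketched here under stated assumptions, is to plot a single area and give each series its own panel with `facet_wrap()`. `plot_by_series()` is a hypothetical helper and the toy values are made up, not UN figures.

```r
library(ggplot2)

# Hypothetical helper: plot Value over Year for one area,
# with one panel (and its own y scale) per Series.
plot_by_series <- function(data, area) {
  one_area <- data[data$X == area, ]
  ggplot(one_area, aes(x = Year, y = Value)) +
    geom_line() +
    geom_point() +
    facet_wrap(~ Series, scales = "free_y")
}

# Toy rows with made-up values; these are NOT UN data.
toy <- data.frame(
  X = "Area A",
  Year = rep(c(2005, 2010, 2015), times = 2),
  Series = rep(c("Total fertility rate (children per women)",
                 "Life expectancy at birth for both sexes (years)"),
               each = 3),
  Value = c(3.0, 2.8, 2.6, 65, 67, 69),
  stringsAsFactors = FALSE
)

p <- plot_by_series(toy, "Area A")
print(p)
```

With the real data, a call such as `plot_by_series(original_file, "Total, all countries or areas")` would show the world-level trend of each indicator in its own panel.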