Required packages

Provide the packages required to reproduce the report. Make sure you fulfilled the minimum requirement #10.

library(mlr)
## Loading required package: ParamHelpers
## Warning: replacing previous import 'BBmisc::isFALSE' by
## 'backports::isFALSE' when loading 'mlr'
library(tidyr)
library(Hmisc)
## Loading required package: lattice
## Loading required package: survival
## Loading required package: Formula
## Loading required package: ggplot2
## 
## Attaching package: 'Hmisc'
## The following object is masked from 'package:mlr':
## 
##     impute
## The following objects are masked from 'package:base':
## 
##     format.pval, units
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:Hmisc':
## 
##     src, summarize
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(gdata)
## gdata: read.xls support for 'XLS' (Excel 97-2004) files ENABLED.
## 
## gdata: read.xls support for 'XLSX' (Excel 2007+) files ENABLED.
## 
## Attaching package: 'gdata'
## The following objects are masked from 'package:dplyr':
## 
##     combine, first, last
## The following object is masked from 'package:mlr':
## 
##     resample
## The following object is masked from 'package:stats':
## 
##     nobs
## The following object is masked from 'package:utils':
## 
##     object.size
## The following object is masked from 'package:base':
## 
##     startsWith
library(editrules)
## Loading required package: igraph
## 
## Attaching package: 'igraph'
## The following objects are masked from 'package:dplyr':
## 
##     as_data_frame, groups, union
## The following object is masked from 'package:tidyr':
## 
##     crossing
## The following objects are masked from 'package:stats':
## 
##     decompose, spectrum
## The following object is masked from 'package:base':
## 
##     union
## 
## Attaching package: 'editrules'
## The following objects are masked from 'package:igraph':
## 
##     blocks, normalize
## The following object is masked from 'package:dplyr':
## 
##     contains
## The following object is masked from 'package:tidyr':
## 
##     separate
## The following object is masked from 'package:ParamHelpers':
## 
##     isFeasible
library(stringr)
library(MVN)
## sROC 0.1-2 loaded
library(forecast)
library(lubridate)
## 
## Attaching package: 'lubridate'
## The following object is masked from 'package:igraph':
## 
##     %--%
## The following object is masked from 'package:base':
## 
##     date
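
Because the report depends on many packages, a quick check that they are all installed can save a failed run partway through. The sketch below uses only base R; the vector `required_pkgs` simply repeats the `library()` calls above, and `is_installed` and `missing_pkgs` are names introduced here for illustration.

```r
# Packages attached by the library() calls in this section.
required_pkgs <- c("mlr", "tidyr", "Hmisc", "dplyr", "gdata", "editrules",
                   "stringr", "MVN", "forecast", "lubridate")

# system.file() returns "" for a package that is not installed,
# so this flags missing packages without attaching anything.
is_installed <- vapply(required_pkgs,
                       function(p) nzchar(system.file(package = p)),
                       logical(1))
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}
```

Several of these packages mask each other's functions (for example, `editrules` masks `contains` from dplyr and `separate` from tidyr), so writing calls with an explicit namespace such as `dplyr::select()` is a safer habit when a function behaves unexpectedly.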

Executive Summary

Data preprocessing is now complete.
A dataset named “north_america”, which contains population information for the North American countries United States and Canada, has been obtained by merging and filtering three datasets containing the world male population, the world female population and the countries' regional information, respectively.

The north_america dataset contains 10 variables: 2 character variables (“Country.Code” and “Indicator.Code”), 4 factor variables (“Country.Name”, “Indicator.Name”, “Region” and “IncomeGroup”), 3 numeric variables (“male_female_population”, “Population.percentage” and “transformation”) and 1 date variable (“year”). All 10 variables have been converted into proper data types. The factor variable “IncomeGroup” has been ordered by its levels: “Low income”, “Lower middle income”, “Upper middle income” and “High income”. The merged dataset has been converted from an untidy format into a tidy format. All variables have been scanned for missing values and inconsistencies, which have been properly dealt with.

The numeric variable “male_female_population”, which contains the male and female population size of each country, has been scanned for outliers, and these were handled with an optimal Box-Cox transformation.

The transformed version of “male_female_population” has been stored in a new variable named “transformed.population”. The transformation also reduces skewness, making the variable closer to normally distributed, so it can be used for further population analysis and forecasting.

Data

This analysis uses three datasets: the total male population and the total female population of countries around the world from 1960 to 2016, and the regional information for these countries.

The two datasets for total male and female population across countries contain the following variables. “Country Name” and “Country Code” record the names of the countries and their abbreviations, respectively. “Indicator Name” and “Indicator Code” record the gender to which the population data refer and the corresponding abbreviation code, respectively. The population size of each country from 1960 to 2016 is recorded in individual columns, one column per year.

The dataset called “Metadata_Country_API_SP.POP.TOTL.FE.IN_DS2_en_csv_v2_9952571”, which contains information on regions and income levels, has also been used. It has five variables: “Country Code” records the country codes, “Region” denotes the region each country belongs to, “IncomeGroup” conveys the income level of each country, “SpecialNotes” records country-specific notes and “TableName” contains the country names.

All three datasets are openly available and can be found at the following websites: https://data.worldbank.org/indicator/SP.POP.TOTL.MA.IN https://data.worldbank.org/indicator/SP.POP.TOTL.FE.IN

A clear description of data sets, their sources, and variable descriptions should be provided. In this section, you must also provide the R codes with outputs (head of data sets) that you used to import/read/scrape the data set. You need to fulfil the minimum requirement #1 and merge at least two data sets to create the one you are going to work on. In addition to the R codes and outputs, you need to explain the steps that you have taken.

#Read the female population dataset.
fe <- read.csv("~/Desktop/population_female.csv", header = TRUE, skip = 4)
#Read the male population dataset.
ma <- read.csv("~/Desktop/population_male.csv", header = TRUE, skip = 4)
#Read the dataset containing the regional and income level information.
region <- read.csv("~/Desktop/pupulation region.csv", header = TRUE)
#Before gathering, the column X2017 has to be deleted from both the "ma" and "fe" datasets because it contains no information.

fe <- select(fe, -X2017)
ma <- select(ma, -X2017)

head(fe)
##   Country.Name Country.Code     Indicator.Name    Indicator.Code    X1960
## 1        Aruba          ABW Population, female SP.POP.TOTL.FE.IN    27637
## 2  Afghanistan          AFG Population, female SP.POP.TOTL.FE.IN  4346990
## 3       Angola          AGO Population, female SP.POP.TOTL.FE.IN  2900597
## 4      Albania          ALB Population, female SP.POP.TOTL.FE.IN   780595
## 5      Andorra          AND Population, female SP.POP.TOTL.FE.IN       NA
## 6   Arab World          ARB Population, female SP.POP.TOTL.FE.IN 45909546
##      X1961    X1962    X1963    X1964    X1965    X1966    X1967    X1968
## 1    28254    28655    28907    29094    29268    29458    29631    29804
## 2  4437679  4532368  4631212  4734371  4842015  4952722  5066254  5185164
## 3  2956543  3014082  3072260  3129657  3185518  3239546  3292896  3348020
## 4   805305   830317   855238   880256   904823   928927   953600   981013
## 5       NA       NA       NA       NA       NA       NA       NA       NA
## 6 47174491 48482890 49837622 51242480 52699695 54215933 55790717 57411074
##      X1969    X1970    X1971    X1972    X1973    X1974    X1975    X1976
## 1    29988    30178    30401    30634    30889    31065    31150    31111
## 2  5313009  5451441  5599707  5753953  5908686  6056668  6191328  6315722
## 3  3408200  3475917  3552328  3637187  3729918  3829313  3934592  4045039
## 4  1009808  1035954  1061521  1088570  1114865  1141014  1167738  1193877
## 5       NA       NA       NA       NA       NA       NA       NA       NA
## 6 59058941 60724221 62398727 64094688 65847194 67704063 69698805 71842855
##      X1977    X1978    X1979    X1980    X1981    X1982    X1983    X1984
## 1    30989    30834    30753    30805    31047    31457    31912    32227
## 2  6427910  6510970  6543069  6511413  6411284  6255675  6072080  5898422
## 3  4161240  4285197  4419637  4566089  4726261  4898495  5076921  5253511
## 4  1220603  1246195  1271298  1297791  1324521  1353597  1383430  1413336
## 5       NA       NA       NA       NA       NA       NA       NA       NA
## 6 74123776 76521599 79005258 81550074 84149697 86805054 89506982 92246331
##      X1985    X1986     X1987     X1988     X1989     X1990     X1991
## 1    32303    32055     31571     31104     30993     31491     32693
## 2  5764972  5674753   5627943   5647569   5760223   5981147   6327356
## 3  5422852  5581981   5733487   5884261   6044247   6220388   6416266
## 4  1442682  1469816   1497724   1525329   1568794   1603298   1604803
## 5       NA       NA        NA        NA        NA        NA        NA
## 6 95015336 97806638 100615528 103440246 106280987 110113892 113029317
##       X1992     X1993     X1994     X1995     X1996     X1997     X1998
## 1     34482     36633     38785     40708     42294     43615     44757
## 2   6785762   7304033   7809329   8250185   8603315   8885666   9134961
## 3   6629273   6852909   7077584   7296962   7508648   7716678   7928906
## 4   1610619   1617772   1621629   1618647   1607089   1588481   1566199
## 5        NA        NA        NA        NA        NA        NA        NA
## 6 115137501 118134381 121111744 124723010 127619185 130509484 133371678
##       X1999     X2000     X2001     X2002     X2003     X2004     X2005
## 1     45856     47012     48252     49506     50707     51712     52456
## 2   9408613   9746770  10162500  10637846  11144964  11642547  12101551
## 3   8156453   8407379   8684468   8985544   9307609   9645607   9995796
## 4   1544994   1528046   1511707   1509000   1507666   1505935   1501423
## 5        NA        NA        NA        NA        NA        NA        NA
## 6 136236749 139113662 141998877 144904104 147871961 150957572 154198970
##       X2006     X2007     X2008     X2009     X2010     X2011     X2012
## 1     52896     53083     53104     53112     53202     53404     53701
## 2  12511524  12884170  13239157  13606016  14005473  14444001  14912657
## 3  10357453  10731777  11119158  11520427  11936016  12366067  12809782
## 4   1493529   1482744   1470945   1460006   1451422   1445814   1441287
## 5        NA        NA        NA        NA        NA        NA        NA
## 6 157610509 161177796 164879183 168667331 172508052 176400834 180343747
##       X2013     X2014     X2015     X2016
## 1     54060     54417     54743     55023
## 2  15398276  15881092  16346869  16791609
## 3  13265577  13731363  14205741  14688058
## 4   1436444   1431651   1426369   1423809
## 5        NA        NA        NA        NA
## 6 184306583 188253495 192160815 196007048
head(ma)
##   Country.Name Country.Code   Indicator.Name    Indicator.Code    X1960
## 1        Aruba          ABW Population, male SP.POP.TOTL.MA.IN    26574
## 2  Afghanistan          AFG Population, male SP.POP.TOTL.MA.IN  4649361
## 3       Angola          AGO Population, male SP.POP.TOTL.MA.IN  2742585
## 4      Albania          ALB Population, male SP.POP.TOTL.MA.IN   828205
## 5      Andorra          AND Population, male SP.POP.TOTL.MA.IN       NA
## 6   Arab World          ARB Population, male SP.POP.TOTL.MA.IN 46581386
##      X1961    X1962    X1963    X1964    X1965    X1966    X1967    X1968
## 1    27184    27570    27788    27938    28092    28257    28424    28582
## 2  4729085  4813500  4902742  4996990  5096399  5199609  5306376  5419182
## 3  2796481  2851979  2908157  2963664  3017781  3070224  3122099  3175771
## 4   854495   881002   907383   933879   959968   985646  1011998  1041259
## 5       NA       NA       NA       NA       NA       NA       NA       NA
## 6 47870006 49199404 50573454 51997422 53475293 55014660 56616215 58269091
##      X1969    X1970    X1971    X1972    X1973    X1974    X1975    X1976
## 1    28738    28885    29039    29206    29354    29463    29507    29475
## 2  5541419  5674682  5818118  5967987  6119136  6264873  6398958  6524577
## 3  3234432  3300464  3374941  3457647  3548042  3645025  3747887  3855958
## 4  1071887  1099525  1126332  1154556  1181887  1209110  1237093  1264649
## 5       NA       NA       NA       NA       NA       NA       NA       NA
## 6 59957601 61674153 63408692 65174687 67016222 68992698 71144493 73489523
##      X1977    X1978    X1979    X1980    X1981    X1982    X1983    X1984
## 1    29377    29269    29227    29291    29520    29888    30289    30609
## 2  6639628  6726764  6763626  6736957  6642670  6493970  6317189  6148693
## 3  3969748  4090950  4221884  4363811  4518246  4683661  4854641  5023810
## 4  1292943  1320071  1346534  1374206  1401535  1430681  1460530  1491093
## 5       NA       NA       NA       NA       NA       NA       NA       NA
## 6 76009278 78662125 81387230 84139416 86902253 89685030 92498845 95364425
##      X1985     X1986     X1987     X1988     X1989     X1990     X1991
## 1    30723     30589     30262     29975     30039     30658     31929
## 2  6018078   5926288   5874818   5893319   6017386   6267967   6666301
## 3  5186190   5339056   5484781   5629707   5782990   5951053   6137180
## 4  1522080   1552819   1585881   1617007   1659149   1683244   1661987
## 5       NA        NA        NA        NA        NA        NA        NA
## 6 98294965 101287129 104327021 107404525 110506415 114621554 117800551
##       X1992     X1993     X1994     X1995     X1996     X1997     X1998
## 1     33753     35871     37915     39616     40906     41836     42520
## 2   7195469   7791066   8363390   8849356   9219569   9495939   9729038
## 3   6339072   6550825   6763717   6972032   7173636   7372303   7575412
## 4   1636420   1609515   1585907   1569137   1560944   1559800   1562331
## 5        NA        NA        NA        NA        NA        NA        NA
## 6 119899678 123151710 126324186 130306661 133224277 136065591 138863468
##       X1999     X2000     X2001     X2002     X2003     X2004     X2005
## 1     43149     43841     44646     45486     46310     47025     47575
## 2   9995063  10346986  10803963  11342077  11919887  12476432  12969247
## 3   7793313   8033545   8298798   8587105   8895760   9220109   9556746
## 4   1563784   1560981   1548466   1542010   1531950   1521004   1510064
## 5        NA        NA        NA        NA        NA        NA        NA
## 6 141726120 144718354 147851480 151122471 154562558 158204457 162065758
##       X2006     X2007     X2008     X2009     X2010     X2011     X2012
## 1     47936     48137     48249     48341     48467     48649     48876
## 2  13381926  13732622  14054874  14398315  14797694  15264598  15784301
## 3   9904946  10265910  10640262  11029120  11433115  11852498  12286368
## 4   1499018   1487273   1476369   1467513   1461599   1459381   1459114
## 5        NA        NA        NA        NA        NA        NA        NA
## 6 166162755 170476001 174946300 179477763 184000856 188495044 192963246
##       X2013     X2014     X2015     X2016
## 1     49127     49378     49598     49799
## 2  16333412  16876928  17389625  17864423
## 3  12732763  13189103  13653564  14125405
## 4   1458648   1457453   1454334   1452292
## 5        NA        NA        NA        NA
## 6 197395503 201789533 206144145 210445642
head(region)
##   Country.Code                    Region         IncomeGroup
## 1          ABW Latin America & Caribbean         High income
## 2          AFG                South Asia          Low income
## 3          AGO        Sub-Saharan Africa Lower middle income
## 4          ALB     Europe & Central Asia Upper middle income
## 5          AND     Europe & Central Asia         High income
## 6          ARB                                              
##                                                                                                                                                                                                                                                                                                             SpecialNotes
## 1                                                                                                                                                                          SNA data for 2000-2011 are updated from official government statistics; 1994-1999 from UN databases. Base year has changed from 1995 to 2000.
## 2 Fiscal year end: March 20; reporting period for national accounts data is calendar year, estimated to insure consistency between national accounts and fiscal data. National accounts data are sourced from the IMF and differ from the Central Statistics Organization numbers due to exclusion of the opium economy.
## 3                                                                                                                                                                                                                                                                                                                       
## 4                                                                                                                                                                                                                                                                                                                       
## 5                                                                                                                                                                                                                                                              WB-3 code changed from ADO to AND to align with ISO code.
## 6                                                                                                                                                                                                                                  Arab World aggregate. Arab World is composed of members of the League of Arab States.
##     TableName
## 1       Aruba
## 2 Afghanistan
## 3      Angola
## 4     Albania
## 5     Andorra
## 6  Arab World
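
The `skip = 4` argument in the import calls above suggests that the downloaded files begin with four preamble lines before the real header row. The sketch below reproduces that layout with a small temporary file and shows that `read.csv()` prefixes the year columns with "X", because syntactic column names cannot start with a digit. The preamble text is made up for illustration; the data values are taken from the output above.

```r
# Hypothetical stand-in for a downloaded file: four preamble lines,
# then the header row and two data rows.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "preamble line 1 (placeholder)",
  "preamble line 2 (placeholder)",
  "preamble line 3 (placeholder)",
  "preamble line 4 (placeholder)",
  "Country Name,Country Code,1960,1961",
  "Aruba,ABW,27637,28254",
  "Afghanistan,AFG,4346990,4437679"
), tmp)

# skip = 4 discards the preamble so that line 5 becomes the header.
demo <- read.csv(tmp, header = TRUE, skip = 4)
names(demo)
unlink(tmp)
```

The "X" prefix is what later forces the extra cleaning step on the year variable.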
#The two datasets are not tidy: year is a single variable and should occupy one column, but it is spread across 57 columns (X1960 to X2016), one per year. Likewise, the population counts, which should also occupy one column, are spread across those same 57 columns. Therefore, the untidy data must first be converted into tidy format.

#Get tidy format datasets, "matidy" and "fetidy".
fetidy <-gather(fe,year,female_population,X1960:X2016) 
head(fetidy)
##   Country.Name Country.Code     Indicator.Name    Indicator.Code  year
## 1        Aruba          ABW Population, female SP.POP.TOTL.FE.IN X1960
## 2  Afghanistan          AFG Population, female SP.POP.TOTL.FE.IN X1960
## 3       Angola          AGO Population, female SP.POP.TOTL.FE.IN X1960
## 4      Albania          ALB Population, female SP.POP.TOTL.FE.IN X1960
## 5      Andorra          AND Population, female SP.POP.TOTL.FE.IN X1960
## 6   Arab World          ARB Population, female SP.POP.TOTL.FE.IN X1960
##   female_population
## 1             27637
## 2           4346990
## 3           2900597
## 4            780595
## 5                NA
## 6          45909546
matidy <-gather(ma,year,male_population,X1960:X2016) 
head(matidy)
##   Country.Name Country.Code   Indicator.Name    Indicator.Code  year
## 1        Aruba          ABW Population, male SP.POP.TOTL.MA.IN X1960
## 2  Afghanistan          AFG Population, male SP.POP.TOTL.MA.IN X1960
## 3       Angola          AGO Population, male SP.POP.TOTL.MA.IN X1960
## 4      Albania          ALB Population, male SP.POP.TOTL.MA.IN X1960
## 5      Andorra          AND Population, male SP.POP.TOTL.MA.IN X1960
## 6   Arab World          ARB Population, male SP.POP.TOTL.MA.IN X1960
##   male_population
## 1           26574
## 2         4649361
## 3         2742585
## 4          828205
## 5              NA
## 6        46581386
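
To make the reshaping step concrete, the sketch below does by hand what `gather()` does to the year columns, using base R only and two rows of the female data (Canada and the United States, 1960 and 1961). The object names `wide`, `year_cols` and `long` are illustrative.

```r
wide <- data.frame(
  Country.Code = c("CAN", "USA"),
  X1960 = c(8851741, 91167688),
  X1961 = c(9040305, 92724776)
)

# Identify the year columns by their "X" + four digits pattern.
year_cols <- grep("^X[0-9]{4}$", names(wide), value = TRUE)

# Stack the year columns: one row per country-year combination.
long <- data.frame(
  Country.Code = rep(wide$Country.Code, times = length(year_cols)),
  year = rep(year_cols, each = nrow(wide)),
  female_population = unlist(wide[year_cols], use.names = FALSE)
)
long
```

In current versions of tidyr, `pivot_longer()` is the recommended replacement for `gather()`, which is still available but no longer under active development.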
#Rename the variables "male_population" and "female_population" in "matidy" and "fetidy" to the common name "male_female_population" so that the two datasets can be stacked.

colnames(fetidy)[6] <- "male_female_population"

colnames(matidy)[6] <- "male_female_population"

#Merge the two datasets "matidy" and "fetidy" into the dataset "mafemale".

mafemale<-bind_rows(fetidy,matidy)
## Warning in bind_rows_(x, .id): Unequal factor levels: coercing to character
## Warning in bind_rows_(x, .id): binding character and factor vector,
## coercing into character vector

## Warning in bind_rows_(x, .id): binding character and factor vector,
## coercing into character vector
## Warning in bind_rows_(x, .id): Unequal factor levels: coercing to character
## Warning in bind_rows_(x, .id): binding character and factor vector,
## coercing into character vector

## Warning in bind_rows_(x, .id): binding character and factor vector,
## coercing into character vector
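
The warnings above arise because the text columns were read in as factors, and the female and male files do not share identical factor levels (the indicator names, for instance, differ). One way to avoid the warnings, assuming it is acceptable to hold these columns as character until the types are fixed later, is to convert factors to character before stacking. This is a minimal sketch in base R; `to_character` is a hypothetical helper, and the toy rows use the Aruba 1960 values shown earlier.

```r
fe_part <- data.frame(Indicator.Name = factor("Population, female"),
                      male_female_population = 27637)
ma_part <- data.frame(Indicator.Name = factor("Population, male"),
                      male_female_population = 26574)

# Convert every factor column to character so both pieces have matching types.
to_character <- function(df) {
  is_fac <- vapply(df, is.factor, logical(1))
  df[is_fac] <- lapply(df[is_fac], as.character)
  df
}

stacked <- rbind(to_character(fe_part), to_character(ma_part))
str(stacked)
```

Passing `stringsAsFactors = FALSE` to `read.csv()` would have a similar effect at import time.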
#Join another dataset, "region", onto "mafemale". First keep only the key and the two columns of interest.
region<-select(region,Country.Code,Region,IncomeGroup)               
head(region)
##   Country.Code                    Region         IncomeGroup
## 1          ABW Latin America & Caribbean         High income
## 2          AFG                South Asia          Low income
## 3          AGO        Sub-Saharan Africa Lower middle income
## 4          ALB     Europe & Central Asia Upper middle income
## 5          AND     Europe & Central Asia         High income
## 6          ARB
#Left join the region dataset into the mafemale dataset.
population<-mafemale%>%left_join(region,by="Country.Code")
## Warning: Column `Country.Code` joining factors with different levels,
## coercing to character vector
head(population)
##   Country.Name Country.Code     Indicator.Name    Indicator.Code  year
## 1        Aruba          ABW Population, female SP.POP.TOTL.FE.IN X1960
## 2  Afghanistan          AFG Population, female SP.POP.TOTL.FE.IN X1960
## 3       Angola          AGO Population, female SP.POP.TOTL.FE.IN X1960
## 4      Albania          ALB Population, female SP.POP.TOTL.FE.IN X1960
## 5      Andorra          AND Population, female SP.POP.TOTL.FE.IN X1960
## 6   Arab World          ARB Population, female SP.POP.TOTL.FE.IN X1960
##   male_female_population                    Region         IncomeGroup
## 1                  27637 Latin America & Caribbean         High income
## 2                4346990                South Asia          Low income
## 3                2900597        Sub-Saharan Africa Lower middle income
## 4                 780595     Europe & Central Asia Upper middle income
## 5                     NA     Europe & Central Asia         High income
## 6               45909546
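
Before a left join it can help to confirm that every key in the left table has a match in the lookup table, since unmatched keys silently produce missing values. The sketch below uses base R `merge()` with `all.x = TRUE`, which behaves like a left join. The code "ZZZ" is a made-up key added only to show the unmatched case, and `pop`, `lookup` and `unmatched` are illustrative names.

```r
pop <- data.frame(Country.Code = c("ABW", "AFG", "ZZZ"),
                  male_female_population = c(27637, 4346990, NA),
                  stringsAsFactors = FALSE)
lookup <- data.frame(Country.Code = c("ABW", "AFG"),
                     Region = c("Latin America & Caribbean", "South Asia"),
                     stringsAsFactors = FALSE)

# Keys present on the left but absent from the lookup table.
unmatched <- setdiff(pop$Country.Code, lookup$Country.Code)
unmatched

# all.x = TRUE keeps every row of pop; Region is NA where there is no match.
joined <- merge(pop, lookup, by = "Country.Code", all.x = TRUE)
joined
```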
#The levels of "Region" show that the data are grouped into "East Asia & Pacific", "Europe & Central Asia", "Latin America & Caribbean", "Middle East & North Africa", "North America", "South Asia" and "Sub-Saharan Africa", plus a blank group for rows not classified into any region of the world.

levels(population$Region)
## [1] ""                           "East Asia & Pacific"       
## [3] "Europe & Central Asia"      "Latin America & Caribbean" 
## [5] "Middle East & North Africa" "North America"             
## [7] "South Asia"                 "Sub-Saharan Africa"
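
The empty-string level belongs to rows such as "Arab World", which are aggregates rather than individual countries. The sketch below shows how such a blank level could be turned into an explicit missing value, assuming that treating it as NA suits the analysis. It uses a toy factor rather than the real column.

```r
region_demo <- factor(c("Latin America & Caribbean", "South Asia", ""))

# Mark blank entries as missing, then drop the now-unused "" level.
region_demo[region_demo == ""] <- NA
region_demo <- droplevels(region_demo)

levels(region_demo)
sum(is.na(region_demo))
```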
# Inspect its structure, the data structure is shown below.
str(population)
## 'data.frame':    30096 obs. of  8 variables:
##  $ Country.Name          : Factor w/ 264 levels "Afghanistan",..: 11 1 6 2 5 8 250 9 10 4 ...
##  $ Country.Code          : chr  "ABW" "AFG" "AGO" "ALB" ...
##  $ Indicator.Name        : chr  "Population, female" "Population, female" "Population, female" "Population, female" ...
##  $ Indicator.Code        : chr  "SP.POP.TOTL.FE.IN" "SP.POP.TOTL.FE.IN" "SP.POP.TOTL.FE.IN" "SP.POP.TOTL.FE.IN" ...
##  $ year                  : chr  "X1960" "X1960" "X1960" "X1960" ...
##  $ male_female_population: num  27637 4346990 2900597 780595 NA ...
##  $ Region                : Factor w/ 8 levels "","East Asia & Pacific",..: 4 7 8 3 3 1 5 4 3 2 ...
##  $ IncomeGroup           : Factor w/ 5 levels "","High income",..: 2 3 4 5 2 1 2 5 4 5 ...
#The dataset can be filtered into subsets by region when a particular region needs to be investigated. In this analysis, the population of North America is analysed.

#Filter the dataset "population" to create a new subset containing only the North America region, named "north_america".
north_america<-population%>%filter(Region=="North America")
head(north_america)
##    Country.Name Country.Code     Indicator.Name    Indicator.Code  year
## 1       Bermuda          BMU Population, female SP.POP.TOTL.FE.IN X1960
## 2        Canada          CAN Population, female SP.POP.TOTL.FE.IN X1960
## 3 United States          USA Population, female SP.POP.TOTL.FE.IN X1960
## 4       Bermuda          BMU Population, female SP.POP.TOTL.FE.IN X1961
## 5        Canada          CAN Population, female SP.POP.TOTL.FE.IN X1961
## 6 United States          USA Population, female SP.POP.TOTL.FE.IN X1961
##   male_female_population        Region IncomeGroup
## 1                     NA North America High income
## 2                8851741 North America High income
## 3               91167688 North America High income
## 4                     NA North America High income
## 5                9040305 North America High income
## 6               92724776 North America High income

Understand

Summarise the types of variables and data structures, check the attributes in the data.

#Check the class of each variable in "population".
sapply(population, class)
##           Country.Name           Country.Code         Indicator.Name 
##               "factor"            "character"            "character" 
##         Indicator.Code                   year male_female_population 
##            "character"            "character"              "numeric" 
##                 Region            IncomeGroup 
##               "factor"               "factor"
#Check the levels of the factor variable "IncomeGroup" and order them from lowest to highest income. The blank level is not listed in the new levels, so those entries become NA.
levels(population$IncomeGroup)
## [1] ""                    "High income"         "Low income"         
## [4] "Lower middle income" "Upper middle income"
population$IncomeGroup<-factor(population$IncomeGroup,levels = c("Low income", "Lower middle income", "Upper middle income","High income"), ordered=TRUE)
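
# An ordered factor supports comparisons between levels, and any value that is not listed in `levels` becomes NA, which is what happens to the blank income group here. A brief illustration on a toy vector (the names `income`, `income_levels` and `income_ord` are illustrative):

```r
income <- c("High income", "Low income", "", "Upper middle income")
income_levels <- c("Low income", "Lower middle income",
                   "Upper middle income", "High income")
income_ord <- factor(income, levels = income_levels, ordered = TRUE)

# Comparisons follow the declared order of the levels.
at_least_upper_middle <- income_ord >= "Upper middle income"
at_least_upper_middle

# The blank entry is not a declared level, so it is NA.
is.na(income_ord)
```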
#The "year" variable is a character variable, which should be changed into a date type.
#It is character because each value begins with the capital letter "X", which needs to be removed by extracting only the digits.
population$year <- str_extract(population$year , "[0-9]+")
#Then convert the character variable "year" into date type.
population$year<-as.Date(as.character(population$year), format ="%Y")
class(population$year)
## [1] "Date"
#Now the variable "year" is of Date type.
head(population$year)
## [1] "1960-06-23" "1960-06-23" "1960-06-23" "1960-06-23" "1960-06-23"
## [6] "1960-06-23"
#The Date-type "year" variable is displayed in year-month-day format; because only a year was supplied, the month and day were taken from the current date.
#However, only the year part is needed, so the year is extracted from the Date variable.
population$year <-format(as.Date(population$year, format="%d/%m/%Y"),"%Y")
head(population$year) 
## [1] "1960" "1960" "1960" "1960" "1960" "1960"
class(population$year)
## [1] "character"
#Now only the year of each observation is shown, stored as a character string.
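
Parsing a bare year with `as.Date(x, format = "%Y")` fills in the month and day from the current date, which is why "1960-06-23" appeared above. If only the year is needed, a shorter route is to strip the leading "X" and convert straight to integer; if a true Date is wanted, pinning it to 1 January gives a reproducible value. This is a minimal sketch with illustrative names, not a replacement for the steps used in the report.

```r
year_raw <- c("X1960", "X1961", "X2016")

# Remove the leading "X" and store the year as an integer.
year_int <- as.integer(sub("^X", "", year_raw))

# Optional: a Date fixed to the first day of each year.
year_date <- as.Date(paste0(year_int, "-01-01"))

year_int
format(year_date, "%Y")
```

An integer year sorts and plots correctly and avoids carrying an arbitrary month and day.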
typeof(population$male_female_population)
## [1] "double"
#The numeric "male_female_population" variable is stored as type double.
#Check the class of each variable in "population" again.
sapply(population, class)
## $Country.Name
## [1] "factor"
## 
## $Country.Code
## [1] "character"
## 
## $Indicator.Name
## [1] "character"
## 
## $Indicator.Code
## [1] "character"
## 
## $year
## [1] "character"
## 
## $male_female_population
## [1] "numeric"
## 
## $Region
## [1] "factor"
## 
## $IncomeGroup
## [1] "ordered" "factor"
#All variables now have appropriate types, apart from "year", which is held as character after the year extraction above.

#The "Indicator.Name" variable denotes whether an observation is the male or the female population, so it should be a factor rather than a character variable. Therefore, it is converted to a factor and its levels are renamed "male" and "female".
population$Indicator.Name<-as.factor(population$Indicator.Name)

levels(population$Indicator.Name)[levels(population$Indicator.Name)=="Population, female"] <- "female"

levels(population$Indicator.Name)[levels(population$Indicator.Name)=="Population, male"] <- "male"

class(population$Indicator.Name)
## [1] "factor"
levels(population$Indicator.Name) 
## [1] "female" "male"
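
The same relabelling can also be done in a single step with the `labels` argument of `factor()`. The sketch below shows this on a toy vector; `indicator` and `sex` are illustrative names.

```r
indicator <- c("Population, female", "Population, male", "Population, female")

# levels gives the values to look for; labels gives their new names, in the same order.
sex <- factor(indicator,
              levels = c("Population, female", "Population, male"),
              labels = c("female", "male"))
table(sex)
```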
#Finally see the structure of "population" dataframe.
str(population)
## 'data.frame':    30096 obs. of  8 variables:
##  $ Country.Name          : Factor w/ 264 levels "Afghanistan",..: 11 1 6 2 5 8 250 9 10 4 ...
##  $ Country.Code          : chr  "ABW" "AFG" "AGO" "ALB" ...
##  $ Indicator.Name        : Factor w/ 2 levels "female","male": 1 1 1 1 1 1 1 1 1 1 ...
##  $ Indicator.Code        : chr  "SP.POP.TOTL.FE.IN" "SP.POP.TOTL.FE.IN" "SP.POP.TOTL.FE.IN" "SP.POP.TOTL.FE.IN" ...
##  $ year                  : chr  "1960" "1960" "1960" "1960" ...
##  $ male_female_population: num  27637 4346990 2900597 780595 NA ...
##  $ Region                : Factor w/ 8 levels "","East Asia & Pacific",..: 4 7 8 3 3 1 5 4 3 2 ...
##  $ IncomeGroup           : Ord.factor w/ 4 levels "Low income"<"Lower middle income"<..: 4 1 2 3 4 NA 4 3 2 3 ...

Tidy & Manipulate Data I

Check if the data conforms to the tidy data principles.

#The target dataset is the North America region. The tidy format issue has already been dealt with: the "population" dataset is in tidy format, and so is its subset "north_america".
north_america<-population%>%filter(Region=="North America")
head(north_america)
##    Country.Name Country.Code Indicator.Name    Indicator.Code year
## 1       Bermuda          BMU         female SP.POP.TOTL.FE.IN 1960
## 2        Canada          CAN         female SP.POP.TOTL.FE.IN 1960
## 3 United States          USA         female SP.POP.TOTL.FE.IN 1960
## 4       Bermuda          BMU         female SP.POP.TOTL.FE.IN 1961
## 5        Canada          CAN         female SP.POP.TOTL.FE.IN 1961
## 6 United States          USA         female SP.POP.TOTL.FE.IN 1961
##   male_female_population        Region IncomeGroup
## 1                     NA North America High income
## 2                8851741 North America High income
## 3               91167688 North America High income
## 4                     NA North America High income
## 5                9040305 North America High income
## 6               92724776 North America High income
sapply(north_america, class)
## $Country.Name
## [1] "factor"
## 
## $Country.Code
## [1] "character"
## 
## $Indicator.Name
## [1] "factor"
## 
## $Indicator.Code
## [1] "character"
## 
## $year
## [1] "character"
## 
## $male_female_population
## [1] "numeric"
## 
## $Region
## [1] "factor"
## 
## $IncomeGroup
## [1] "ordered" "factor"
#Check which countries are included in the subset "north_america".

levels(droplevels(north_america$Country.Name)) 
## [1] "Bermuda"       "Canada"        "United States"
#It can be seen that North America includes 3 countries: "Bermuda", "Canada" and "United States".

#Next, scan for and deal with missing values, inconsistencies and outliers in the dataset.

colSums(is.na(north_america))
##           Country.Name           Country.Code         Indicator.Name 
##                      0                      0                      0 
##         Indicator.Code                   year male_female_population 
##                      0                      0                    114 
##                 Region            IncomeGroup 
##                      0                      0
#It can be seen that no variable has missing values except male_female_population, which has 114 missing values.

#Count the missing population values for each country in North America.

ber<-north_america%>% filter(Country.Name=="Bermuda")

canada<-north_america%>% filter(Country.Name=="Canada")

us<-north_america%>% filter(Country.Name=="United States")

sum(is.na(ber$male_female_population))
## [1] 114
sum(is.na(canada$male_female_population))
## [1] 0
sum(is.na(us$male_female_population))
## [1] 0
#It can be seen that there are no missing values for the United States or Canada; however, the population information for Bermuda is entirely missing.
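
The three per-country checks above can also be run in a single pass. The following is a minimal sketch in base R on a small made-up data frame; `toy` and its values are assumptions for illustration, not the report's data.

```r
# Toy stand-in for north_america (values are invented for illustration)
toy <- data.frame(
  Country.Name = c("Bermuda", "Canada", "United States",
                   "Bermuda", "Canada", "United States"),
  year = c("1960", "1960", "1960", "1961", "1961", "1961"),
  male_female_population = c(NA, 100, 1000, NA, 110, 1010),
  stringsAsFactors = FALSE
)

# Split the NA indicator by country and count the TRUE values in each group
na_by_country <- sapply(
  split(is.na(toy$male_female_population), toy$Country.Name),
  sum
)
print(na_by_country)
```

The result is a named vector with one missing-value count per country, so a country whose data is entirely absent stands out without building a separate subset for each one.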

#The original dataset contains no information on the population of Bermuda, so these missing values cannot be recoded and can only be deleted.

#The Bermuda rows therefore carry no population information and need to be deleted.
#Drop the Bermuda rows
north_america<-na.omit(north_america)
#Check which countries are now in north_america.
levels(droplevels(north_america$Country.Name)) 
## [1] "Canada"        "United States"
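
Note that `na.omit()` removes every row that has an `NA` in any column. A more targeted alternative, sketched here under the assumption that only countries with no population data at all should be removed, is to drop those countries by name and then drop the unused factor levels once, so that `droplevels()` is not needed at every later call. The data frame `toy` and its values are hypothetical.

```r
# Toy stand-in for north_america (values are invented for illustration)
toy <- data.frame(
  Country.Name = factor(c("Bermuda", "Canada", "United States",
                          "Bermuda", "Canada", "United States")),
  male_female_population = c(NA, 100, 1000, NA, 110, 1010)
)

# TRUE for a country only when every one of its population values is NA
all_missing <- sapply(
  split(is.na(toy$male_female_population), toy$Country.Name),
  all
)

# Keep the countries that have at least one observed value
keep <- names(all_missing)[!all_missing]

# Subset the rows, then drop factor levels that no longer occur
toy_clean <- droplevels(toy[toy$Country.Name %in% keep, ])
print(levels(toy_clean$Country.Name))
```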
north_america
##      Country.Name Country.Code Indicator.Name    Indicator.Code year
## 2          Canada          CAN         female SP.POP.TOTL.FE.IN 1960
## 3   United States          USA         female SP.POP.TOTL.FE.IN 1960
## 5          Canada          CAN         female SP.POP.TOTL.FE.IN 1961
## 6   United States          USA         female SP.POP.TOTL.FE.IN 1961
## 8          Canada          CAN         female SP.POP.TOTL.FE.IN 1962
## 9   United States          USA         female SP.POP.TOTL.FE.IN 1962
## 11         Canada          CAN         female SP.POP.TOTL.FE.IN 1963
## 12  United States          USA         female SP.POP.TOTL.FE.IN 1963
## 14         Canada          CAN         female SP.POP.TOTL.FE.IN 1964
## 15  United States          USA         female SP.POP.TOTL.FE.IN 1964
## 17         Canada          CAN         female SP.POP.TOTL.FE.IN 1965
## 18  United States          USA         female SP.POP.TOTL.FE.IN 1965
## 20         Canada          CAN         female SP.POP.TOTL.FE.IN 1966
## 21  United States          USA         female SP.POP.TOTL.FE.IN 1966
## 23         Canada          CAN         female SP.POP.TOTL.FE.IN 1967
## 24  United States          USA         female SP.POP.TOTL.FE.IN 1967
## 26         Canada          CAN         female SP.POP.TOTL.FE.IN 1968
## 27  United States          USA         female SP.POP.TOTL.FE.IN 1968
## 29         Canada          CAN         female SP.POP.TOTL.FE.IN 1969
## 30  United States          USA         female SP.POP.TOTL.FE.IN 1969
## 32         Canada          CAN         female SP.POP.TOTL.FE.IN 1970
## 33  United States          USA         female SP.POP.TOTL.FE.IN 1970
## 35         Canada          CAN         female SP.POP.TOTL.FE.IN 1971
## 36  United States          USA         female SP.POP.TOTL.FE.IN 1971
## 38         Canada          CAN         female SP.POP.TOTL.FE.IN 1972
## 39  United States          USA         female SP.POP.TOTL.FE.IN 1972
## 41         Canada          CAN         female SP.POP.TOTL.FE.IN 1973
## 42  United States          USA         female SP.POP.TOTL.FE.IN 1973
## 44         Canada          CAN         female SP.POP.TOTL.FE.IN 1974
## 45  United States          USA         female SP.POP.TOTL.FE.IN 1974
## 47         Canada          CAN         female SP.POP.TOTL.FE.IN 1975
## 48  United States          USA         female SP.POP.TOTL.FE.IN 1975
## 50         Canada          CAN         female SP.POP.TOTL.FE.IN 1976
## 51  United States          USA         female SP.POP.TOTL.FE.IN 1976
## 53         Canada          CAN         female SP.POP.TOTL.FE.IN 1977
## 54  United States          USA         female SP.POP.TOTL.FE.IN 1977
## 56         Canada          CAN         female SP.POP.TOTL.FE.IN 1978
## 57  United States          USA         female SP.POP.TOTL.FE.IN 1978
## 59         Canada          CAN         female SP.POP.TOTL.FE.IN 1979
## 60  United States          USA         female SP.POP.TOTL.FE.IN 1979
## 62         Canada          CAN         female SP.POP.TOTL.FE.IN 1980
## 63  United States          USA         female SP.POP.TOTL.FE.IN 1980
## 65         Canada          CAN         female SP.POP.TOTL.FE.IN 1981
## 66  United States          USA         female SP.POP.TOTL.FE.IN 1981
## 68         Canada          CAN         female SP.POP.TOTL.FE.IN 1982
## 69  United States          USA         female SP.POP.TOTL.FE.IN 1982
## 71         Canada          CAN         female SP.POP.TOTL.FE.IN 1983
## 72  United States          USA         female SP.POP.TOTL.FE.IN 1983
## 74         Canada          CAN         female SP.POP.TOTL.FE.IN 1984
## 75  United States          USA         female SP.POP.TOTL.FE.IN 1984
## 77         Canada          CAN         female SP.POP.TOTL.FE.IN 1985
## 78  United States          USA         female SP.POP.TOTL.FE.IN 1985
## 80         Canada          CAN         female SP.POP.TOTL.FE.IN 1986
## 81  United States          USA         female SP.POP.TOTL.FE.IN 1986
## 83         Canada          CAN         female SP.POP.TOTL.FE.IN 1987
## 84  United States          USA         female SP.POP.TOTL.FE.IN 1987
## 86         Canada          CAN         female SP.POP.TOTL.FE.IN 1988
## 87  United States          USA         female SP.POP.TOTL.FE.IN 1988
## 89         Canada          CAN         female SP.POP.TOTL.FE.IN 1989
## 90  United States          USA         female SP.POP.TOTL.FE.IN 1989
## 92         Canada          CAN         female SP.POP.TOTL.FE.IN 1990
## 93  United States          USA         female SP.POP.TOTL.FE.IN 1990
## 95         Canada          CAN         female SP.POP.TOTL.FE.IN 1991
## 96  United States          USA         female SP.POP.TOTL.FE.IN 1991
## 98         Canada          CAN         female SP.POP.TOTL.FE.IN 1992
## 99  United States          USA         female SP.POP.TOTL.FE.IN 1992
## 101        Canada          CAN         female SP.POP.TOTL.FE.IN 1993
## 102 United States          USA         female SP.POP.TOTL.FE.IN 1993
## 104        Canada          CAN         female SP.POP.TOTL.FE.IN 1994
## 105 United States          USA         female SP.POP.TOTL.FE.IN 1994
## 107        Canada          CAN         female SP.POP.TOTL.FE.IN 1995
## 108 United States          USA         female SP.POP.TOTL.FE.IN 1995
## 110        Canada          CAN         female SP.POP.TOTL.FE.IN 1996
## 111 United States          USA         female SP.POP.TOTL.FE.IN 1996
## 113        Canada          CAN         female SP.POP.TOTL.FE.IN 1997
## 114 United States          USA         female SP.POP.TOTL.FE.IN 1997
## 116        Canada          CAN         female SP.POP.TOTL.FE.IN 1998
## 117 United States          USA         female SP.POP.TOTL.FE.IN 1998
## 119        Canada          CAN         female SP.POP.TOTL.FE.IN 1999
## 120 United States          USA         female SP.POP.TOTL.FE.IN 1999
## 122        Canada          CAN         female SP.POP.TOTL.FE.IN 2000
## 123 United States          USA         female SP.POP.TOTL.FE.IN 2000
## 125        Canada          CAN         female SP.POP.TOTL.FE.IN 2001
## 126 United States          USA         female SP.POP.TOTL.FE.IN 2001
## 128        Canada          CAN         female SP.POP.TOTL.FE.IN 2002
## 129 United States          USA         female SP.POP.TOTL.FE.IN 2002
## 131        Canada          CAN         female SP.POP.TOTL.FE.IN 2003
## 132 United States          USA         female SP.POP.TOTL.FE.IN 2003
## 134        Canada          CAN         female SP.POP.TOTL.FE.IN 2004
## 135 United States          USA         female SP.POP.TOTL.FE.IN 2004
## 137        Canada          CAN         female SP.POP.TOTL.FE.IN 2005
## 138 United States          USA         female SP.POP.TOTL.FE.IN 2005
## 140        Canada          CAN         female SP.POP.TOTL.FE.IN 2006
## 141 United States          USA         female SP.POP.TOTL.FE.IN 2006
## 143        Canada          CAN         female SP.POP.TOTL.FE.IN 2007
## 144 United States          USA         female SP.POP.TOTL.FE.IN 2007
## 146        Canada          CAN         female SP.POP.TOTL.FE.IN 2008
## 147 United States          USA         female SP.POP.TOTL.FE.IN 2008
## 149        Canada          CAN         female SP.POP.TOTL.FE.IN 2009
## 150 United States          USA         female SP.POP.TOTL.FE.IN 2009
## 152        Canada          CAN         female SP.POP.TOTL.FE.IN 2010
## 153 United States          USA         female SP.POP.TOTL.FE.IN 2010
## 155        Canada          CAN         female SP.POP.TOTL.FE.IN 2011
## 156 United States          USA         female SP.POP.TOTL.FE.IN 2011
## 158        Canada          CAN         female SP.POP.TOTL.FE.IN 2012
## 159 United States          USA         female SP.POP.TOTL.FE.IN 2012
## 161        Canada          CAN         female SP.POP.TOTL.FE.IN 2013
## 162 United States          USA         female SP.POP.TOTL.FE.IN 2013
## 164        Canada          CAN         female SP.POP.TOTL.FE.IN 2014
## 165 United States          USA         female SP.POP.TOTL.FE.IN 2014
## 167        Canada          CAN         female SP.POP.TOTL.FE.IN 2015
## 168 United States          USA         female SP.POP.TOTL.FE.IN 2015
## 170        Canada          CAN         female SP.POP.TOTL.FE.IN 2016
## 171 United States          USA         female SP.POP.TOTL.FE.IN 2016
## 173        Canada          CAN           male SP.POP.TOTL.MA.IN 1960
## 174 United States          USA           male SP.POP.TOTL.MA.IN 1960
## 176        Canada          CAN           male SP.POP.TOTL.MA.IN 1961
## 177 United States          USA           male SP.POP.TOTL.MA.IN 1961
## 179        Canada          CAN           male SP.POP.TOTL.MA.IN 1962
## 180 United States          USA           male SP.POP.TOTL.MA.IN 1962
## 182        Canada          CAN           male SP.POP.TOTL.MA.IN 1963
## 183 United States          USA           male SP.POP.TOTL.MA.IN 1963
## 185        Canada          CAN           male SP.POP.TOTL.MA.IN 1964
## 186 United States          USA           male SP.POP.TOTL.MA.IN 1964
## 188        Canada          CAN           male SP.POP.TOTL.MA.IN 1965
## 189 United States          USA           male SP.POP.TOTL.MA.IN 1965
## 191        Canada          CAN           male SP.POP.TOTL.MA.IN 1966
## 192 United States          USA           male SP.POP.TOTL.MA.IN 1966
## 194        Canada          CAN           male SP.POP.TOTL.MA.IN 1967
## 195 United States          USA           male SP.POP.TOTL.MA.IN 1967
## 197        Canada          CAN           male SP.POP.TOTL.MA.IN 1968
## 198 United States          USA           male SP.POP.TOTL.MA.IN 1968
## 200        Canada          CAN           male SP.POP.TOTL.MA.IN 1969
## 201 United States          USA           male SP.POP.TOTL.MA.IN 1969
## 203        Canada          CAN           male SP.POP.TOTL.MA.IN 1970
## 204 United States          USA           male SP.POP.TOTL.MA.IN 1970
## 206        Canada          CAN           male SP.POP.TOTL.MA.IN 1971
## 207 United States          USA           male SP.POP.TOTL.MA.IN 1971
## 209        Canada          CAN           male SP.POP.TOTL.MA.IN 1972
## 210 United States          USA           male SP.POP.TOTL.MA.IN 1972
## 212        Canada          CAN           male SP.POP.TOTL.MA.IN 1973
## 213 United States          USA           male SP.POP.TOTL.MA.IN 1973
## 215        Canada          CAN           male SP.POP.TOTL.MA.IN 1974
## 216 United States          USA           male SP.POP.TOTL.MA.IN 1974
## 218        Canada          CAN           male SP.POP.TOTL.MA.IN 1975
## 219 United States          USA           male SP.POP.TOTL.MA.IN 1975
## 221        Canada          CAN           male SP.POP.TOTL.MA.IN 1976
## 222 United States          USA           male SP.POP.TOTL.MA.IN 1976
## 224        Canada          CAN           male SP.POP.TOTL.MA.IN 1977
## 225 United States          USA           male SP.POP.TOTL.MA.IN 1977
## 227        Canada          CAN           male SP.POP.TOTL.MA.IN 1978
## 228 United States          USA           male SP.POP.TOTL.MA.IN 1978
## 230        Canada          CAN           male SP.POP.TOTL.MA.IN 1979
## 231 United States          USA           male SP.POP.TOTL.MA.IN 1979
## 233        Canada          CAN           male SP.POP.TOTL.MA.IN 1980
## 234 United States          USA           male SP.POP.TOTL.MA.IN 1980
## 236        Canada          CAN           male SP.POP.TOTL.MA.IN 1981
## 237 United States          USA           male SP.POP.TOTL.MA.IN 1981
## 239        Canada          CAN           male SP.POP.TOTL.MA.IN 1982
## 240 United States          USA           male SP.POP.TOTL.MA.IN 1982
## 242        Canada          CAN           male SP.POP.TOTL.MA.IN 1983
## 243 United States          USA           male SP.POP.TOTL.MA.IN 1983
## 245        Canada          CAN           male SP.POP.TOTL.MA.IN 1984
## 246 United States          USA           male SP.POP.TOTL.MA.IN 1984
## 248        Canada          CAN           male SP.POP.TOTL.MA.IN 1985
## 249 United States          USA           male SP.POP.TOTL.MA.IN 1985
## 251        Canada          CAN           male SP.POP.TOTL.MA.IN 1986
## 252 United States          USA           male SP.POP.TOTL.MA.IN 1986
## 254        Canada          CAN           male SP.POP.TOTL.MA.IN 1987
## 255 United States          USA           male SP.POP.TOTL.MA.IN 1987
## 257        Canada          CAN           male SP.POP.TOTL.MA.IN 1988
## 258 United States          USA           male SP.POP.TOTL.MA.IN 1988
## 260        Canada          CAN           male SP.POP.TOTL.MA.IN 1989
## 261 United States          USA           male SP.POP.TOTL.MA.IN 1989
## 263        Canada          CAN           male SP.POP.TOTL.MA.IN 1990
## 264 United States          USA           male SP.POP.TOTL.MA.IN 1990
## 266        Canada          CAN           male SP.POP.TOTL.MA.IN 1991
## 267 United States          USA           male SP.POP.TOTL.MA.IN 1991
## 269        Canada          CAN           male SP.POP.TOTL.MA.IN 1992
## 270 United States          USA           male SP.POP.TOTL.MA.IN 1992
## 272        Canada          CAN           male SP.POP.TOTL.MA.IN 1993
## 273 United States          USA           male SP.POP.TOTL.MA.IN 1993
## 275        Canada          CAN           male SP.POP.TOTL.MA.IN 1994
## 276 United States          USA           male SP.POP.TOTL.MA.IN 1994
## 278        Canada          CAN           male SP.POP.TOTL.MA.IN 1995
## 279 United States          USA           male SP.POP.TOTL.MA.IN 1995
## 281        Canada          CAN           male SP.POP.TOTL.MA.IN 1996
## 282 United States          USA           male SP.POP.TOTL.MA.IN 1996
## 284        Canada          CAN           male SP.POP.TOTL.MA.IN 1997
## 285 United States          USA           male SP.POP.TOTL.MA.IN 1997
## 287        Canada          CAN           male SP.POP.TOTL.MA.IN 1998
## 288 United States          USA           male SP.POP.TOTL.MA.IN 1998
## 290        Canada          CAN           male SP.POP.TOTL.MA.IN 1999
## 291 United States          USA           male SP.POP.TOTL.MA.IN 1999
## 293        Canada          CAN           male SP.POP.TOTL.MA.IN 2000
## 294 United States          USA           male SP.POP.TOTL.MA.IN 2000
## 296        Canada          CAN           male SP.POP.TOTL.MA.IN 2001
## 297 United States          USA           male SP.POP.TOTL.MA.IN 2001
## 299        Canada          CAN           male SP.POP.TOTL.MA.IN 2002
## 300 United States          USA           male SP.POP.TOTL.MA.IN 2002
## 302        Canada          CAN           male SP.POP.TOTL.MA.IN 2003
## 303 United States          USA           male SP.POP.TOTL.MA.IN 2003
## 305        Canada          CAN           male SP.POP.TOTL.MA.IN 2004
## 306 United States          USA           male SP.POP.TOTL.MA.IN 2004
## 308        Canada          CAN           male SP.POP.TOTL.MA.IN 2005
## 309 United States          USA           male SP.POP.TOTL.MA.IN 2005
## 311        Canada          CAN           male SP.POP.TOTL.MA.IN 2006
## 312 United States          USA           male SP.POP.TOTL.MA.IN 2006
## 314        Canada          CAN           male SP.POP.TOTL.MA.IN 2007
## 315 United States          USA           male SP.POP.TOTL.MA.IN 2007
## 317        Canada          CAN           male SP.POP.TOTL.MA.IN 2008
## 318 United States          USA           male SP.POP.TOTL.MA.IN 2008
## 320        Canada          CAN           male SP.POP.TOTL.MA.IN 2009
## 321 United States          USA           male SP.POP.TOTL.MA.IN 2009
## 323        Canada          CAN           male SP.POP.TOTL.MA.IN 2010
## 324 United States          USA           male SP.POP.TOTL.MA.IN 2010
## 326        Canada          CAN           male SP.POP.TOTL.MA.IN 2011
## 327 United States          USA           male SP.POP.TOTL.MA.IN 2011
## 329        Canada          CAN           male SP.POP.TOTL.MA.IN 2012
## 330 United States          USA           male SP.POP.TOTL.MA.IN 2012
## 332        Canada          CAN           male SP.POP.TOTL.MA.IN 2013
## 333 United States          USA           male SP.POP.TOTL.MA.IN 2013
## 335        Canada          CAN           male SP.POP.TOTL.MA.IN 2014
## 336 United States          USA           male SP.POP.TOTL.MA.IN 2014
## 338        Canada          CAN           male SP.POP.TOTL.MA.IN 2015
## 339 United States          USA           male SP.POP.TOTL.MA.IN 2015
## 341        Canada          CAN           male SP.POP.TOTL.MA.IN 2016
## 342 United States          USA           male SP.POP.TOTL.MA.IN 2016
##     male_female_population        Region IncomeGroup
## 2                  8851741 North America High income
## 3                 91167688 North America High income
## 5                  9040305 North America High income
## 6                 92724776 North America High income
## 8                  9221831 North America High income
## 9                 94190774 North America High income
## 11                 9408033 North America High income
## 12                95586520 North America High income
## 14                 9599279 North America High income
## 15                96963098 North America High income
## 17                 9784951 North America High income
## 18                98236551 North America High income
## 20                 9977130 North America High income
## 21                99448848 North America High income
## 23                10164729 North America High income
## 24               100623425 North America High income
## 26                10335090 North America High income
## 27               101724070 North America High income
## 29                10480782 North America High income
## 30               102805810 North America High income
## 32                10632184 North America High income
## 33               104076122 North America High income
## 35                10795924 North America High income
## 36               105443973 North America High income
## 38                10972537 North America High income
## 39               106604253 North America High income
## 41                11163418 North America High income
## 42               107643874 North America High income
## 44                11370241 North America High income
## 45               108655509 North America High income
## 47                11594419 North America High income
## 48               109771934 North America High income
## 50                11758334 North America High income
## 51               110880498 North America High income
## 53                11908985 North America High income
## 54               112077291 North America High income
## 56                12041878 North America High income
## 57               113350788 North America High income
## 59                12175105 North America High income
## 60               114675212 North America High income
## 62                12344789 North America High income
## 63               115822943 North America High income
## 65                12508542 North America High income
## 66               116977168 North America High income
## 68                12668611 North America High income
## 69               118084774 North America High income
## 71                12803432 North America High income
## 72               119144238 North America High income
## 74                12933351 North America High income
## 75               120160657 North America High income
## 77                13059560 North America High income
## 78               121228156 North America High income
## 80                13195995 North America High income
## 81               122374325 North America High income
## 83                13373841 North America High income
## 84               123510043 North America High income
## 86                13550793 North America High income
## 87               124677136 North America High income
## 89                13798061 North America High income
## 90               125885861 North America High income
## 92                14009770 North America High income
## 93               127314146 North America High income
## 95                14206688 North America High income
## 96               128993474 North America High income
## 98                14387814 North America High income
## 99               130735361 North America High income
## 101               14551739 North America High income
## 102              132392823 North America High income
## 104               14696788 North America High income
## 105              133942111 North America High income
## 107               14821658 North America High income
## 108              135465255 North America High income
## 110               14982770 North America High income
## 111              136974478 North America High income
## 113               15140827 North America High income
## 114              138561200 North America High income
## 116               15270066 North America High income
## 117              140117226 North America High income
## 119               15394068 North America High income
## 120              141669463 North America High income
## 122               15527833 North America High income
## 123              143190463 North America High income
## 125               15682901 North America High income
## 126              144552310 North America High income
## 128               15821913 North America High income
## 129              145840225 North America High income
## 131               15978061 North America High income
## 132              147043924 North America High income
## 134               16136529 North America High income
## 135              148361957 North America High income
## 137               16293676 North America High income
## 138              149693614 North America High income
## 140               16420913 North America High income
## 141              151110004 North America High income
## 143               16577614 North America High income
## 144              152526919 North America High income
## 146               16754869 North America High income
## 147              153952230 North America High income
## 149               16945444 North America High income
## 150              155280807 North America High income
## 152               17134047 North America High income
## 153              156551423 North America High income
## 155               17304245 North America High income
## 156              157681176 North America High income
## 158               17510939 North America High income
## 159              158813852 North America High income
## 161               17715017 North America High income
## 162              159877330 North America High income
## 164               17908981 North America High income
## 165              161017913 North America High income
## 167               18058362 North America High income
## 168              162149165 North America High income
## 170               18274152 North America High income
## 171              163233094 North America High income
## 173                9057268 North America High income
## 174               89503312 North America High income
## 176                9230695 North America High income
## 177               90966224 North America High income
## 179                9392169 North America High income
## 180               92347226 North America High income
## 182                9555967 North America High income
## 183               93655480 North America High income
## 185                9725721 North America High income
## 186               94925902 North America High income
## 188                9893049 North America High income
## 189               96066449 North America High income
## 191               10070870 North America High income
## 192               97111152 North America High income
## 194               10247271 North America High income
## 195               98088575 North America High income
## 197               10408910 North America High income
## 198               98981930 North America High income
## 200               10547218 North America High income
## 201               99871190 North America High income
## 203               10691816 North America High income
## 204              100975878 North America High income
## 206               10849611 North America High income
## 207              102217027 North America High income
## 209               11021094 North America High income
## 210              103291747 North America High income
## 212               11205990 North America High income
## 213              104265126 North America High income
## 215               11403846 North America High income
## 216              105198491 North America High income
## 218               11614581 North America High income
## 219              106201066 North America High income
## 221               11759666 North America High income
## 222              107154502 North America High income
## 224               11887015 North America High income
## 225              108161709 North America High income
## 227               11994122 North America High income
## 228              109234212 North America High income
## 230               12101895 North America High income
## 231              110379788 North America High income
## 233               12248211 North America High income
## 234              111402057 North America High income
## 236               12391458 North America High income
## 237              112488832 North America High income
## 239               12533389 North America High income
## 240              113579226 North America High income
## 242               12652568 North America High income
## 243              114647762 North America High income
## 245               12768649 North America High income
## 246              115664343 North America High income
## 248               12882440 North America High income
## 249              116695844 North America High income
## 251               13008005 North America High income
## 252              117758675 North America High income
## 254               13176159 North America High income
## 255              118778957 North America High income
## 257               13344207 North America High income
## 258              119821864 North America High income
## 260               13580939 North America High income
## 261              120933139 North America High income
## 263               13781230 North America High income
## 264              122308854 North America High income
## 266               13964994 North America High income
## 267              123987526 North America High income
## 269               14131783 North America High income
## 270              125778639 North America High income
## 272               14281671 North America High income
## 273              127526177 North America High income
## 275               14415118 North America High income
## 276              129183889 North America High income
## 278               14532342 North America High income
## 279              130812745 North America High income
## 281               14689130 North America High income
## 282              132419522 North America High income
## 284               14846373 North America High income
## 285              134095800 North America High income
## 287               14977834 North America High income
## 288              135736774 North America High income
## 290               15105132 North America High income
## 291              137370537 North America High income
## 293               15241867 North America High income
## 294              138971948 North America High income
## 296               15398999 North America High income
## 297              140416645 North America High income
## 299               15540087 North America High income
## 300              141784968 North America High income
## 302               15697939 North America High income
## 303              143064009 North America High income
## 305               15858471 North America High income
## 306              144443341 North America High income
## 308               16018324 North America High income
## 309              145822985 North America High income
## 311               16149592 North America High income
## 312              147269908 North America High income
## 314               16310314 North America High income
## 315              148704288 North America High income
## 317               16490904 North America High income
## 318              150141736 North America High income
## 320               16683127 North America High income
## 321              151490722 North America High income
## 323               16871227 North America High income
## 324              152796770 North America High income
## 326               17038535 North America High income
## 327              153982182 North America High income
## 329               17239606 North America High income
## 330              155184527 North America High income
## 332               17437353 North America High income
## 333              156327578 North America High income
## 335               17626367 North America High income
## 336              157545543 North America High income
## 338               17774151 North America High income
## 339              158747453 North America High income
## 341               17990452 North America High income
## 342              159894419 North America High income
#Now the north_america dataset contains only 2 countries, "Canada" and "United States", with no missing values in any variable of any observation.
#Like the tidy population dataset, the north_america dataset is also in tidy format: each variable has its own column, each observation has its own row, and each value has its own cell.
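
The tidy-format claim can be backed by a quick sanity check. This is a minimal sketch, assuming that a country, an indicator and a year together identify one observation; `toy`, `key_cols` and the values are hypothetical, not taken from the report.

```r
# Toy stand-in for the cleaned north_america data (values are invented)
toy <- data.frame(
  Country.Name = c("Canada", "Canada", "United States", "United States"),
  Indicator.Name = c("female", "male", "female", "male"),
  year = c("1960", "1960", "1960", "1960"),
  male_female_population = c(10, 12, 100, 98),
  stringsAsFactors = FALSE
)

# Columns assumed to identify a single observation
key_cols <- c("Country.Name", "Indicator.Name", "year")

# Number of rows that repeat an earlier key combination
n_duplicates <- sum(duplicated(toy[, key_cols]))

# Number of missing cells across the whole data frame
n_missing <- sum(is.na(toy))

print(c(duplicates = n_duplicates, missing = n_missing))
```

Zero duplicates means each observation occupies exactly one row, and zero missing cells confirms that the deletion step left no gaps.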

Tidy & Manipulate Data II

Create/mutate two variables from the existing variables to show the total population and the percentage of male and female population each year for each country.

#More information, such as the total population and the percentage of male and female population each year for both Canada and the United States, can be derived by creating new variables from the existing male and female population information.
#Create a new dataset called "k1" that holds the total population each year for both Canada and the United States.
k1<-aggregate(male_female_population~year+Country.Name,north_america,sum)
#Rename the variable "male_female_population" to "total.population".
colnames(k1)[3] <- "total.population"
head(k1)
##   year Country.Name total.population
## 1 1960       Canada         17909009
## 2 1961       Canada         18271000
## 3 1962       Canada         18614000
## 4 1963       Canada         18964000
## 5 1964       Canada         19325000
## 6 1965       Canada         19678000
#The total population (male plus female) for each year for both Canada and the United States has now been obtained.
#Then left join the dataset "k1" to the dataset "north_america".
north_america<-north_america%>%left_join(k1)
## Joining, by = c("Country.Name", "year")
#The new variable "total.population", which holds the total population of the country and year of each observation, has been created.
head(north_america)
##    Country.Name Country.Code Indicator.Name    Indicator.Code year
## 1        Canada          CAN         female SP.POP.TOTL.FE.IN 1960
## 2 United States          USA         female SP.POP.TOTL.FE.IN 1960
## 3        Canada          CAN         female SP.POP.TOTL.FE.IN 1961
## 4 United States          USA         female SP.POP.TOTL.FE.IN 1961
## 5        Canada          CAN         female SP.POP.TOTL.FE.IN 1962
## 6 United States          USA         female SP.POP.TOTL.FE.IN 1962
##   male_female_population        Region IncomeGroup total.population
## 1                8851741 North America High income         17909009
## 2               91167688 North America High income        180671000
## 3                9040305 North America High income         18271000
## 4               92724776 North America High income        183691000
## 5                9221831 North America High income         18614000
## 6               94190774 North America High income        186538000
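#The aggregate-then-join steps above can also be sketched in a single base R step. This is a minimal sketch, assuming a toy data frame "pop" (a hypothetical name) that holds only the four 1960 rows shown in the output; ave() returns the group sum aligned with the original rows, so no join is needed.

```r
# Toy rows copied from the 1960 values in the output ("pop" is a hypothetical name)
pop <- data.frame(
  Country.Name = c("Canada", "Canada", "United States", "United States"),
  Indicator.Name = c("female", "male", "female", "male"),
  year = "1960",
  male_female_population = c(8851741, 9057268, 91167688, 89503312)
)

# ave() computes sum() within each country/year group and
# returns one value per original row
pop$total.population <- ave(
  pop$male_female_population,
  pop$Country.Name, pop$year,
  FUN = sum
)

pop
```

#The totals agree with the "total.population" column produced by the join for the same rows.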
#Create a new variable "population.percentage", which represents the share of the male or female population in the total population of each country each year.
north_america<- north_america%>% mutate(population.percentage=male_female_population/total.population)
north_america$population.percentage <- paste(round((north_america$population.percentage)*100,digits=2),"%",sep="")
head(north_america)
##    Country.Name Country.Code Indicator.Name    Indicator.Code year
## 1        Canada          CAN         female SP.POP.TOTL.FE.IN 1960
## 2 United States          USA         female SP.POP.TOTL.FE.IN 1960
## 3        Canada          CAN         female SP.POP.TOTL.FE.IN 1961
## 4 United States          USA         female SP.POP.TOTL.FE.IN 1961
## 5        Canada          CAN         female SP.POP.TOTL.FE.IN 1962
## 6 United States          USA         female SP.POP.TOTL.FE.IN 1962
##   male_female_population        Region IncomeGroup total.population
## 1                8851741 North America High income         17909009
## 2               91167688 North America High income        180671000
## 3                9040305 North America High income         18271000
## 4               92724776 North America High income        183691000
## 5                9221831 North America High income         18614000
## 6               94190774 North America High income        186538000
##   population.percentage
## 1                49.43%
## 2                50.46%
## 3                49.48%
## 4                50.48%
## 5                49.54%
## 6                50.49%
#Now the new variables "total.population" and "population.percentage" have been created, representing the total population and the percentage of male and female population each year for both Canada and the United States.

Scan I

Scan and deal with the data for missing values, inconsistencies and obvious errors.

#Next, the consistency of the information in each variable needs to be checked.
#List the type of each variable in the dataset.

sapply(north_america, class)
## $Country.Name
## [1] "factor"
## 
## $Country.Code
## [1] "character"
## 
## $Indicator.Name
## [1] "factor"
## 
## $Indicator.Code
## [1] "character"
## 
## $year
## [1] "character"
## 
## $male_female_population
## [1] "numeric"
## 
## $Region
## [1] "factor"
## 
## $IncomeGroup
## [1] "ordered" "factor" 
## 
## $total.population
## [1] "numeric"
## 
## $population.percentage
## [1] "character"
#First, "year" is a character variable, which should be converted to a more suitable type.
#The year variable is character because each string starts with a capital letter "X", which is removed by extracting only the digits.

north_america$year <- str_extract(north_america$year , "[0-9]+")

#Then convert the character variable "year" into Date type. With format "%Y" only the year is supplied, so as.Date() fills in the current month and day.

north_america$year<-as.Date(as.character(north_america$year), format ="%Y")

class(north_america$year)
## [1] "Date"
#Now the variable "year" is of Date type.
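#A minimal sketch of two alternatives that do not depend on the day the report is run, assuming the raw labels look like "X1960" (the vector "year_raw" and the other names here are hypothetical). An integer year is convenient for range rules, while a Date pinned to 1 January keeps the Date type.

```r
# Raw year labels with the leading "X" (assumed form of the original strings)
year_raw <- c("X1960", "X1961", "X2016")

# Option 1: plain integer year
year_int <- as.integer(sub("^X", "", year_raw))

# Option 2: a Date fixed to 1 January of each year
year_date <- as.Date(paste0(year_int, "-01-01"))

year_int
format(year_date)
```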

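#Convert the character variable "population.percentage" into numeric type. Because every value carries the "%" suffix added earlier, as.numeric() cannot parse it and returns NA with the warning below; the suffix has to be stripped before converting.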

north_america$population.percentage<-as.numeric(north_america$population.percentage)
## Warning: NAs introduced by coercion
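#A minimal sketch of a conversion that avoids the NAs, assuming the values are strings such as "49.43%" (the three values are copied from the output above; "pct_chr" and "pct_num" are hypothetical names). The "%" sign is removed before as.numeric() is called.

```r
# Percentage strings as produced by paste(round(...), "%", sep = "")
pct_chr <- c("49.43%", "50.46%", "49.48%")

# Strip the literal "%" and then convert to numeric
pct_num <- as.numeric(sub("%", "", pct_chr, fixed = TRUE))

pct_num
anyNA(pct_num)
```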
#Make sure "Country.Name" is stored as a factor variable and check its consistency.

north_america$Country.Name<-as.factor(north_america$Country.Name)

levels(droplevels(north_america$Country.Name))
## [1] "Canada"        "United States"
#The variable "Country.Name" is a factor with two levels, "Canada" and "United States", which indicates that all observations of this variable are consistent, with no other values.

#Set rules to check whether the year is restricted to between 1960 and 2016.

Rule1 <- editset(c("year >= 1960", "year <= 2016"))

summary(violatedEdits(Rule1, north_america))
## Edit violations, 228 observations, 0 completely missing (0%):
## 
##  editname freq   rel
##      num2  164 71.9%
##      num1   60 26.3%
## 
## Edit violations per record:
## 
##  errors freq   rel
##       0    4  1.8%
##       1  224 98.2%
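#The summary reports a violation for 224 of the 228 records, but this is an artefact of the Date type: a Date is stored as the number of days since 1970-01-01, so "year" is compared with 1960 and 2016 as day counts rather than calendar years. The rule is only meaningful when it is applied to the numeric calendar year.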
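#A minimal base R sketch of the same range check on the calendar year, assuming "year" is a Date vector (three example dates are used here; "calendar_year" and "in_range" are hypothetical names). as.numeric() shows the underlying day counts that the rule above was compared against.

```r
# Example dates in the same form as the converted "year" variable
year <- as.Date(c("1960-06-23", "1975-06-23", "2016-06-23"))

# Underlying representation: days since 1970-01-01
as.numeric(year)

# Extract the calendar year and check the intended range
calendar_year <- as.integer(format(year, "%Y"))
in_range <- calendar_year >= 1960 & calendar_year <= 2016

all(in_range)
```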

#Check the consistency of the factor variable "Country.Name".

levels(droplevels(north_america$Country.Name))
## [1] "Canada"        "United States"
#The consistency of the Country.Name variable can be checked just by inspecting its levels. All observations are assigned either Canada or United States, without any other levels or inconsistent values.

class(north_america$Indicator.Name)
## [1] "factor"
#The Indicator.Name variable denotes whether the observation is the male or the female population, so it should be a factor variable.

levels(north_america$Indicator.Name) 
## [1] "female" "male"
#The factor variable "Indicator.Name" has two different levels, male and female, and every observation is assigned one of them.
#Check the consistency of the IncomeGroup variable.
levels(droplevels(north_america$IncomeGroup))
## [1] "High income"
#All observations in the north_america dataset are classified as high income, without any inconsistency.
#Count the missing values in each variable.
colSums(is.na(north_america))
##           Country.Name           Country.Code         Indicator.Name 
##                      0                      0                      0 
##         Indicator.Code                   year male_female_population 
##                      0                      0                      0 
##                 Region            IncomeGroup       total.population 
##                      0                      0                      0 
##  population.percentage 
##                    228
#Check the character variables Country.Code and Indicator.Code to see whether all their observations are consistent.
all(north_america$Country.Code %in% c("CAN", "USA"))
## [1] TRUE
#The result is TRUE, which means the Country.Code variable is consistent: every observation is either "CAN" or "USA".
all(north_america$Indicator.Code %in% c("SP.POP.TOTL.FE.IN", "SP.POP.TOTL.MA.IN"))
## [1] TRUE
#The result is TRUE, which means the Indicator.Code variable is consistent: every observation is either "SP.POP.TOTL.FE.IN" or "SP.POP.TOTL.MA.IN".
#After scanning all variables and observations, the north_america dataset has consistent values in every variable. The only missing values are the 228 NAs in population.percentage reported by colSums() above, which come from the numeric conversion of the "%"-suffixed strings.

Scan II

Check the numeric variables for outliers.

#Check the numeric variable "male_female_population" for outliers. It is the only numeric variable in the original dataset (the other 2 numeric variables, "total.population" and "population.percentage", are both derived from it), so univariate outlier detection methods are applied.

#The population variable "male_female_population" is checked for outliers separately for Canada and the United States.
canada<-north_america%>% filter(Country.Name=="Canada")

boxplot(male_female_population~Indicator.Name,data=canada,main="Box Plot of Population for Male and Female of Canada", ylab="Population", col = "grey")

us<-north_america%>% filter(Country.Name=="United States")

boxplot(male_female_population~Indicator.Name,data=us,main="Box Plot of Population for Male and Female of United States", ylab="Population", col = "grey")

#From the boxplots, there are no outliers in the male or female population of either the United States or Canada.
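#The boxplot rule can also be checked numerically. This is a minimal sketch using boxplot.stats(), which applies the same 1.5 x IQR whisker rule as boxplot(); the first eight Canada female values (1960 to 1967) are copied from the final output, and the extra value 5e7 is an artificial point added only to show what a flagged outlier looks like.

```r
# Canada female population, 1960 to 1967, copied from the output
canada_female <- c(8851741, 9040305, 9221831, 9408033,
                   9599279, 9784951, 9977130, 10164729)

# Points beyond the whiskers; an empty result means no outliers
out_real <- boxplot.stats(canada_female)$out

# Same check with one artificial extreme value appended (illustration only)
out_demo <- boxplot.stats(c(canada_female, 5e7))$out

length(out_real)
out_demo
```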

#When the two variables year and population in that year are considered together, 3 outliers are shown in the Chi-Square Q-Q plot.
#However, because the data are census-based population counts for each country rather than a sample, they are relatively accurate and not subject to sampling bias, so the outliers should not be handled.
#These outliers are probably caused by social factors, such as policy reforms on immigration or child-birth social welfare. Handling them would probably move the data further from the real values.
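#The code behind the Chi-Square Q-Q plot is not shown in this report. As a rough sketch of one common way to flag bivariate outliers, squared Mahalanobis distances can be compared with a chi-square quantile. The 0.975 cutoff, the rescaling of population to millions, and the names "X", "d2" and "flagged" are assumptions made for this illustration, and only the first eight Canada female values are used.

```r
# Year and Canada female population (in millions), 1960 to 1967
year <- 1960:1967
pop_millions <- c(8851741, 9040305, 9221831, 9408033,
                  9599279, 9784951, 9977130, 10164729) / 1e6
X <- cbind(year = year, pop = pop_millions)

# Squared Mahalanobis distance of each row from the column means
d2 <- mahalanobis(X, center = colMeans(X), cov = cov(X))

# Rows whose distance exceeds the chi-square quantile (df = number of variables)
cutoff <- qchisq(0.975, df = ncol(X))
flagged <- which(d2 > cutoff)

d2
flagged
```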

Transform

#Plot the pooled distribution of the population variable. group_by() does not affect hist(), so the column is passed to hist() directly.
hist(north_america$male_female_population)

#Split the data into subsets that contain the Canada-female, Canada-male, US-female, and US-male population information.

b<-split(north_america, list(north_america$Country.Name, north_america$Indicator.Name),drop=TRUE)

Canada.female<-b$Canada.female

Canada.male<-b$Canada.male

US.male<-b$`United States.male`

US.female<-b$`United States.female`

#Plot the histograms of these population subsets.
hist(Canada.female$male_female_population)

hist(Canada.male$male_female_population)

hist(US.male$male_female_population)

hist(US.female$male_female_population)

#It can be seen that the male and female populations of both Canada and the United States are clearly not normally distributed.
#Because population is highly correlated with its own past values, it is possible to forecast the future population, and many forecasting methods work better with approximately normally distributed data.
#Hence, a Box-Cox transformed variable is created to bring the population closer to a normal distribution for forecasting.
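#For reference, the Box-Cox transformation of a positive value x is (x^lambda - 1) / lambda when lambda is not 0, and log(x) when lambda is 0. The sketch below writes this out in base R; box_cox() and inv_box_cox() are hypothetical helper names, and lambda = 0.5 is an arbitrary illustrative value, not the value estimated in this report.

```r
# Box-Cox transformation for strictly positive x
box_cox <- function(x, lambda) {
  stopifnot(all(x > 0))
  if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1) / lambda
}

# Inverse transformation, to return to the original scale
inv_box_cox <- function(y, lambda) {
  if (abs(lambda) < 1e-8) exp(y) else (lambda * y + 1)^(1 / lambda)
}

# Canada female population, 1960 to 1962, copied from the output
x <- c(8851741, 9040305, 9221831)
y <- box_cox(x, lambda = 0.5)

# The inverse recovers the original values
all.equal(inv_box_cox(y, lambda = 0.5), x)
```

#Keeping the inverse at hand matters for forecasting, because forecasts made on the transformed scale have to be converted back to population counts.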

#Filter the Canada female data subset and create a new variable named "transformed.population" with a Box-Cox transformation (BoxCox.lambda() and BoxCox() come from the forecast package).
Canada.female<-north_america%>%filter(Indicator.Name=="female",Country.Name=="Canada")
lambda=BoxCox.lambda(Canada.female$male_female_population)
Canada.female<-Canada.female%>%mutate(transformed.population=BoxCox(male_female_population,lambda = lambda))

#Filter the Canada male data subset and create a new variable named "transformed.population" with a Box-Cox transformation.
Canada.male<-north_america%>%filter(Indicator.Name=="male",Country.Name=="Canada")
lambda=BoxCox.lambda(Canada.male$male_female_population)
Canada.male<-Canada.male%>%mutate(transformed.population=BoxCox(male_female_population,lambda = lambda))

#Filter the United States male data subset and create a new variable named "transformed.population" by boxcox transformation.
US.male<-north_america%>%filter(Indicator.Name=="male",Country.Name=="United States")
lambda=BoxCox.lambda(US.male$male_female_population)
US.male<-US.male%>%mutate(transformed.population=BoxCox(male_female_population,lambda = lambda))

#Filter the United States female data subset and create a new variable named "transformed.population" by boxcox transformation.
US.female<-north_america%>%filter(Indicator.Name=="female",Country.Name=="United States")
lambda=BoxCox.lambda(US.female$male_female_population)
US.female<-US.female%>%mutate(transformed.population=BoxCox(male_female_population,lambda = lambda))

#Combine these four data subsets by binding their rows.
north_america<-bind_rows(Canada.male,Canada.female,US.male,US.female)
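#The four filter/transform/bind steps above follow a split-apply-combine pattern, which can be sketched in base R as below. This is only an illustration under stated assumptions: "pop" is a toy data frame with the 1960 and 1961 rows from the output, transform_group() is a hypothetical helper, and log() stands in for the per-group Box-Cox step so that the sketch runs without extra packages.

```r
# Toy data: 1960 and 1961 rows copied from the output
pop <- data.frame(
  Country.Name = rep(c("Canada", "United States"), each = 4),
  Indicator.Name = rep(c("female", "male"), times = 4),
  year = rep(c(1960, 1960, 1961, 1961), times = 2),
  male_female_population = c(8851741, 9057268, 9040305, 9230695,
                             91167688, 89503312, 92724776, 90966224)
)

# Hypothetical helper: add the transformed column to one group
# (log() is a stand-in for the per-group Box-Cox transformation)
transform_group <- function(g, fn = log) {
  g$transformed.population <- fn(g$male_female_population)
  g
}

# Split by country and sex, transform each group, then bind the rows again
groups <- split(pop, list(pop$Country.Name, pop$Indicator.Name), drop = TRUE)
combined <- do.call(rbind, lapply(groups, transform_group))
rownames(combined) <- NULL

names(groups)
head(combined)
```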

#Sort the dataset by ascending year and, within each year, by country name.
north_america<-north_america[order(north_america$year,north_america$Country.Name),]

#Finally the post-processed tidy dataset "north_america" has been obtained. 
north_america
##      Country.Name Country.Code Indicator.Name    Indicator.Code       year
## 1          Canada          CAN           male SP.POP.TOTL.MA.IN 1960-06-23
## 58         Canada          CAN         female SP.POP.TOTL.FE.IN 1960-06-23
## 115 United States          USA           male SP.POP.TOTL.MA.IN 1960-06-23
## 172 United States          USA         female SP.POP.TOTL.FE.IN 1960-06-23
## 2          Canada          CAN           male SP.POP.TOTL.MA.IN 1961-06-23
## 59         Canada          CAN         female SP.POP.TOTL.FE.IN 1961-06-23
## 116 United States          USA           male SP.POP.TOTL.MA.IN 1961-06-23
## 173 United States          USA         female SP.POP.TOTL.FE.IN 1961-06-23
## 3          Canada          CAN           male SP.POP.TOTL.MA.IN 1962-06-23
## 60         Canada          CAN         female SP.POP.TOTL.FE.IN 1962-06-23
## 117 United States          USA           male SP.POP.TOTL.MA.IN 1962-06-23
## 174 United States          USA         female SP.POP.TOTL.FE.IN 1962-06-23
## 4          Canada          CAN           male SP.POP.TOTL.MA.IN 1963-06-23
## 61         Canada          CAN         female SP.POP.TOTL.FE.IN 1963-06-23
## 118 United States          USA           male SP.POP.TOTL.MA.IN 1963-06-23
## 175 United States          USA         female SP.POP.TOTL.FE.IN 1963-06-23
## 5          Canada          CAN           male SP.POP.TOTL.MA.IN 1964-06-23
## 62         Canada          CAN         female SP.POP.TOTL.FE.IN 1964-06-23
## 119 United States          USA           male SP.POP.TOTL.MA.IN 1964-06-23
## 176 United States          USA         female SP.POP.TOTL.FE.IN 1964-06-23
## 6          Canada          CAN           male SP.POP.TOTL.MA.IN 1965-06-23
## 63         Canada          CAN         female SP.POP.TOTL.FE.IN 1965-06-23
## 120 United States          USA           male SP.POP.TOTL.MA.IN 1965-06-23
## 177 United States          USA         female SP.POP.TOTL.FE.IN 1965-06-23
## 7          Canada          CAN           male SP.POP.TOTL.MA.IN 1966-06-23
## 64         Canada          CAN         female SP.POP.TOTL.FE.IN 1966-06-23
## 121 United States          USA           male SP.POP.TOTL.MA.IN 1966-06-23
## 178 United States          USA         female SP.POP.TOTL.FE.IN 1966-06-23
## 8          Canada          CAN           male SP.POP.TOTL.MA.IN 1967-06-23
## 65         Canada          CAN         female SP.POP.TOTL.FE.IN 1967-06-23
## 122 United States          USA           male SP.POP.TOTL.MA.IN 1967-06-23
## 179 United States          USA         female SP.POP.TOTL.FE.IN 1967-06-23
## 9          Canada          CAN           male SP.POP.TOTL.MA.IN 1968-06-23
## 66         Canada          CAN         female SP.POP.TOTL.FE.IN 1968-06-23
## 123 United States          USA           male SP.POP.TOTL.MA.IN 1968-06-23
## 180 United States          USA         female SP.POP.TOTL.FE.IN 1968-06-23
## 10         Canada          CAN           male SP.POP.TOTL.MA.IN 1969-06-23
## 67         Canada          CAN         female SP.POP.TOTL.FE.IN 1969-06-23
## 124 United States          USA           male SP.POP.TOTL.MA.IN 1969-06-23
## 181 United States          USA         female SP.POP.TOTL.FE.IN 1969-06-23
## 11         Canada          CAN           male SP.POP.TOTL.MA.IN 1970-06-23
## 68         Canada          CAN         female SP.POP.TOTL.FE.IN 1970-06-23
## 125 United States          USA           male SP.POP.TOTL.MA.IN 1970-06-23
## 182 United States          USA         female SP.POP.TOTL.FE.IN 1970-06-23
## 12         Canada          CAN           male SP.POP.TOTL.MA.IN 1971-06-23
## 69         Canada          CAN         female SP.POP.TOTL.FE.IN 1971-06-23
## 126 United States          USA           male SP.POP.TOTL.MA.IN 1971-06-23
## 183 United States          USA         female SP.POP.TOTL.FE.IN 1971-06-23
## 13         Canada          CAN           male SP.POP.TOTL.MA.IN 1972-06-23
## 70         Canada          CAN         female SP.POP.TOTL.FE.IN 1972-06-23
## 127 United States          USA           male SP.POP.TOTL.MA.IN 1972-06-23
## 184 United States          USA         female SP.POP.TOTL.FE.IN 1972-06-23
## 14         Canada          CAN           male SP.POP.TOTL.MA.IN 1973-06-23
## 71         Canada          CAN         female SP.POP.TOTL.FE.IN 1973-06-23
## 128 United States          USA           male SP.POP.TOTL.MA.IN 1973-06-23
## 185 United States          USA         female SP.POP.TOTL.FE.IN 1973-06-23
## 15         Canada          CAN           male SP.POP.TOTL.MA.IN 1974-06-23
## 72         Canada          CAN         female SP.POP.TOTL.FE.IN 1974-06-23
## 129 United States          USA           male SP.POP.TOTL.MA.IN 1974-06-23
## 186 United States          USA         female SP.POP.TOTL.FE.IN 1974-06-23
## 16         Canada          CAN           male SP.POP.TOTL.MA.IN 1975-06-23
## 73         Canada          CAN         female SP.POP.TOTL.FE.IN 1975-06-23
## 130 United States          USA           male SP.POP.TOTL.MA.IN 1975-06-23
## 187 United States          USA         female SP.POP.TOTL.FE.IN 1975-06-23
## 17         Canada          CAN           male SP.POP.TOTL.MA.IN 1976-06-23
## 74         Canada          CAN         female SP.POP.TOTL.FE.IN 1976-06-23
## 131 United States          USA           male SP.POP.TOTL.MA.IN 1976-06-23
## 188 United States          USA         female SP.POP.TOTL.FE.IN 1976-06-23
## 18         Canada          CAN           male SP.POP.TOTL.MA.IN 1977-06-23
## 75         Canada          CAN         female SP.POP.TOTL.FE.IN 1977-06-23
## 132 United States          USA           male SP.POP.TOTL.MA.IN 1977-06-23
## 189 United States          USA         female SP.POP.TOTL.FE.IN 1977-06-23
## 19         Canada          CAN           male SP.POP.TOTL.MA.IN 1978-06-23
## 76         Canada          CAN         female SP.POP.TOTL.FE.IN 1978-06-23
## 133 United States          USA           male SP.POP.TOTL.MA.IN 1978-06-23
## 190 United States          USA         female SP.POP.TOTL.FE.IN 1978-06-23
## 20         Canada          CAN           male SP.POP.TOTL.MA.IN 1979-06-23
## 77         Canada          CAN         female SP.POP.TOTL.FE.IN 1979-06-23
## 134 United States          USA           male SP.POP.TOTL.MA.IN 1979-06-23
## 191 United States          USA         female SP.POP.TOTL.FE.IN 1979-06-23
## 21         Canada          CAN           male SP.POP.TOTL.MA.IN 1980-06-23
## 78         Canada          CAN         female SP.POP.TOTL.FE.IN 1980-06-23
## 135 United States          USA           male SP.POP.TOTL.MA.IN 1980-06-23
## 192 United States          USA         female SP.POP.TOTL.FE.IN 1980-06-23
## 22         Canada          CAN           male SP.POP.TOTL.MA.IN 1981-06-23
## 79         Canada          CAN         female SP.POP.TOTL.FE.IN 1981-06-23
## 136 United States          USA           male SP.POP.TOTL.MA.IN 1981-06-23
## 193 United States          USA         female SP.POP.TOTL.FE.IN 1981-06-23
## 23         Canada          CAN           male SP.POP.TOTL.MA.IN 1982-06-23
## 80         Canada          CAN         female SP.POP.TOTL.FE.IN 1982-06-23
## 137 United States          USA           male SP.POP.TOTL.MA.IN 1982-06-23
## 194 United States          USA         female SP.POP.TOTL.FE.IN 1982-06-23
## 24         Canada          CAN           male SP.POP.TOTL.MA.IN 1983-06-23
## 81         Canada          CAN         female SP.POP.TOTL.FE.IN 1983-06-23
## 138 United States          USA           male SP.POP.TOTL.MA.IN 1983-06-23
## 195 United States          USA         female SP.POP.TOTL.FE.IN 1983-06-23
## 25         Canada          CAN           male SP.POP.TOTL.MA.IN 1984-06-23
## 82         Canada          CAN         female SP.POP.TOTL.FE.IN 1984-06-23
## 139 United States          USA           male SP.POP.TOTL.MA.IN 1984-06-23
## 196 United States          USA         female SP.POP.TOTL.FE.IN 1984-06-23
## 26         Canada          CAN           male SP.POP.TOTL.MA.IN 1985-06-23
## 83         Canada          CAN         female SP.POP.TOTL.FE.IN 1985-06-23
## 140 United States          USA           male SP.POP.TOTL.MA.IN 1985-06-23
## 197 United States          USA         female SP.POP.TOTL.FE.IN 1985-06-23
## 27         Canada          CAN           male SP.POP.TOTL.MA.IN 1986-06-23
## 84         Canada          CAN         female SP.POP.TOTL.FE.IN 1986-06-23
## 141 United States          USA           male SP.POP.TOTL.MA.IN 1986-06-23
## 198 United States          USA         female SP.POP.TOTL.FE.IN 1986-06-23
## 28         Canada          CAN           male SP.POP.TOTL.MA.IN 1987-06-23
## 85         Canada          CAN         female SP.POP.TOTL.FE.IN 1987-06-23
## 142 United States          USA           male SP.POP.TOTL.MA.IN 1987-06-23
## 199 United States          USA         female SP.POP.TOTL.FE.IN 1987-06-23
## 29         Canada          CAN           male SP.POP.TOTL.MA.IN 1988-06-23
## 86         Canada          CAN         female SP.POP.TOTL.FE.IN 1988-06-23
## 143 United States          USA           male SP.POP.TOTL.MA.IN 1988-06-23
## 200 United States          USA         female SP.POP.TOTL.FE.IN 1988-06-23
## 30         Canada          CAN           male SP.POP.TOTL.MA.IN 1989-06-23
## 87         Canada          CAN         female SP.POP.TOTL.FE.IN 1989-06-23
## 144 United States          USA           male SP.POP.TOTL.MA.IN 1989-06-23
## 201 United States          USA         female SP.POP.TOTL.FE.IN 1989-06-23
## 31         Canada          CAN           male SP.POP.TOTL.MA.IN 1990-06-23
## 88         Canada          CAN         female SP.POP.TOTL.FE.IN 1990-06-23
## 145 United States          USA           male SP.POP.TOTL.MA.IN 1990-06-23
## 202 United States          USA         female SP.POP.TOTL.FE.IN 1990-06-23
## 32         Canada          CAN           male SP.POP.TOTL.MA.IN 1991-06-23
## 89         Canada          CAN         female SP.POP.TOTL.FE.IN 1991-06-23
## 146 United States          USA           male SP.POP.TOTL.MA.IN 1991-06-23
## 203 United States          USA         female SP.POP.TOTL.FE.IN 1991-06-23
## 33         Canada          CAN           male SP.POP.TOTL.MA.IN 1992-06-23
## 90         Canada          CAN         female SP.POP.TOTL.FE.IN 1992-06-23
## 147 United States          USA           male SP.POP.TOTL.MA.IN 1992-06-23
## 204 United States          USA         female SP.POP.TOTL.FE.IN 1992-06-23
## 34         Canada          CAN           male SP.POP.TOTL.MA.IN 1993-06-23
## 91         Canada          CAN         female SP.POP.TOTL.FE.IN 1993-06-23
## 148 United States          USA           male SP.POP.TOTL.MA.IN 1993-06-23
## 205 United States          USA         female SP.POP.TOTL.FE.IN 1993-06-23
## 35         Canada          CAN           male SP.POP.TOTL.MA.IN 1994-06-23
## 92         Canada          CAN         female SP.POP.TOTL.FE.IN 1994-06-23
## 149 United States          USA           male SP.POP.TOTL.MA.IN 1994-06-23
## 206 United States          USA         female SP.POP.TOTL.FE.IN 1994-06-23
## 36         Canada          CAN           male SP.POP.TOTL.MA.IN 1995-06-23
## 93         Canada          CAN         female SP.POP.TOTL.FE.IN 1995-06-23
## 150 United States          USA           male SP.POP.TOTL.MA.IN 1995-06-23
## 207 United States          USA         female SP.POP.TOTL.FE.IN 1995-06-23
## 37         Canada          CAN           male SP.POP.TOTL.MA.IN 1996-06-23
## 94         Canada          CAN         female SP.POP.TOTL.FE.IN 1996-06-23
## 151 United States          USA           male SP.POP.TOTL.MA.IN 1996-06-23
## 208 United States          USA         female SP.POP.TOTL.FE.IN 1996-06-23
## 38         Canada          CAN           male SP.POP.TOTL.MA.IN 1997-06-23
## 95         Canada          CAN         female SP.POP.TOTL.FE.IN 1997-06-23
## 152 United States          USA           male SP.POP.TOTL.MA.IN 1997-06-23
## 209 United States          USA         female SP.POP.TOTL.FE.IN 1997-06-23
## 39         Canada          CAN           male SP.POP.TOTL.MA.IN 1998-06-23
## 96         Canada          CAN         female SP.POP.TOTL.FE.IN 1998-06-23
## 153 United States          USA           male SP.POP.TOTL.MA.IN 1998-06-23
## 210 United States          USA         female SP.POP.TOTL.FE.IN 1998-06-23
## 40         Canada          CAN           male SP.POP.TOTL.MA.IN 1999-06-23
## 97         Canada          CAN         female SP.POP.TOTL.FE.IN 1999-06-23
## 154 United States          USA           male SP.POP.TOTL.MA.IN 1999-06-23
## 211 United States          USA         female SP.POP.TOTL.FE.IN 1999-06-23
## 41         Canada          CAN           male SP.POP.TOTL.MA.IN 2000-06-23
## 98         Canada          CAN         female SP.POP.TOTL.FE.IN 2000-06-23
## 155 United States          USA           male SP.POP.TOTL.MA.IN 2000-06-23
## 212 United States          USA         female SP.POP.TOTL.FE.IN 2000-06-23
## 42         Canada          CAN           male SP.POP.TOTL.MA.IN 2001-06-23
## 99         Canada          CAN         female SP.POP.TOTL.FE.IN 2001-06-23
## 156 United States          USA           male SP.POP.TOTL.MA.IN 2001-06-23
## 213 United States          USA         female SP.POP.TOTL.FE.IN 2001-06-23
## 43         Canada          CAN           male SP.POP.TOTL.MA.IN 2002-06-23
## 100        Canada          CAN         female SP.POP.TOTL.FE.IN 2002-06-23
## 157 United States          USA           male SP.POP.TOTL.MA.IN 2002-06-23
## 214 United States          USA         female SP.POP.TOTL.FE.IN 2002-06-23
## 44         Canada          CAN           male SP.POP.TOTL.MA.IN 2003-06-23
## 101        Canada          CAN         female SP.POP.TOTL.FE.IN 2003-06-23
## 158 United States          USA           male SP.POP.TOTL.MA.IN 2003-06-23
## 215 United States          USA         female SP.POP.TOTL.FE.IN 2003-06-23
## 45         Canada          CAN           male SP.POP.TOTL.MA.IN 2004-06-23
## 102        Canada          CAN         female SP.POP.TOTL.FE.IN 2004-06-23
## 159 United States          USA           male SP.POP.TOTL.MA.IN 2004-06-23
## 216 United States          USA         female SP.POP.TOTL.FE.IN 2004-06-23
## 46         Canada          CAN           male SP.POP.TOTL.MA.IN 2005-06-23
## 103        Canada          CAN         female SP.POP.TOTL.FE.IN 2005-06-23
## 160 United States          USA           male SP.POP.TOTL.MA.IN 2005-06-23
## 217 United States          USA         female SP.POP.TOTL.FE.IN 2005-06-23
## 47         Canada          CAN           male SP.POP.TOTL.MA.IN 2006-06-23
## 104        Canada          CAN         female SP.POP.TOTL.FE.IN 2006-06-23
## 161 United States          USA           male SP.POP.TOTL.MA.IN 2006-06-23
## 218 United States          USA         female SP.POP.TOTL.FE.IN 2006-06-23
## 48         Canada          CAN           male SP.POP.TOTL.MA.IN 2007-06-23
## 105        Canada          CAN         female SP.POP.TOTL.FE.IN 2007-06-23
## 162 United States          USA           male SP.POP.TOTL.MA.IN 2007-06-23
## 219 United States          USA         female SP.POP.TOTL.FE.IN 2007-06-23
## 49         Canada          CAN           male SP.POP.TOTL.MA.IN 2008-06-23
## 106        Canada          CAN         female SP.POP.TOTL.FE.IN 2008-06-23
## 163 United States          USA           male SP.POP.TOTL.MA.IN 2008-06-23
## 220 United States          USA         female SP.POP.TOTL.FE.IN 2008-06-23
## 50         Canada          CAN           male SP.POP.TOTL.MA.IN 2009-06-23
## 107        Canada          CAN         female SP.POP.TOTL.FE.IN 2009-06-23
## 164 United States          USA           male SP.POP.TOTL.MA.IN 2009-06-23
## 221 United States          USA         female SP.POP.TOTL.FE.IN 2009-06-23
## 51         Canada          CAN           male SP.POP.TOTL.MA.IN 2010-06-23
## 108        Canada          CAN         female SP.POP.TOTL.FE.IN 2010-06-23
## 165 United States          USA           male SP.POP.TOTL.MA.IN 2010-06-23
## 222 United States          USA         female SP.POP.TOTL.FE.IN 2010-06-23
## 52         Canada          CAN           male SP.POP.TOTL.MA.IN 2011-06-23
## 109        Canada          CAN         female SP.POP.TOTL.FE.IN 2011-06-23
## 166 United States          USA           male SP.POP.TOTL.MA.IN 2011-06-23
## 223 United States          USA         female SP.POP.TOTL.FE.IN 2011-06-23
## 53         Canada          CAN           male SP.POP.TOTL.MA.IN 2012-06-23
## 110        Canada          CAN         female SP.POP.TOTL.FE.IN 2012-06-23
## 167 United States          USA           male SP.POP.TOTL.MA.IN 2012-06-23
## 224 United States          USA         female SP.POP.TOTL.FE.IN 2012-06-23
## 54         Canada          CAN           male SP.POP.TOTL.MA.IN 2013-06-23
## 111        Canada          CAN         female SP.POP.TOTL.FE.IN 2013-06-23
## 168 United States          USA           male SP.POP.TOTL.MA.IN 2013-06-23
## 225 United States          USA         female SP.POP.TOTL.FE.IN 2013-06-23
## 55         Canada          CAN           male SP.POP.TOTL.MA.IN 2014-06-23
## 112        Canada          CAN         female SP.POP.TOTL.FE.IN 2014-06-23
## 169 United States          USA           male SP.POP.TOTL.MA.IN 2014-06-23
## 226 United States          USA         female SP.POP.TOTL.FE.IN 2014-06-23
## 56         Canada          CAN           male SP.POP.TOTL.MA.IN 2015-06-23
## 113        Canada          CAN         female SP.POP.TOTL.FE.IN 2015-06-23
## 170 United States          USA           male SP.POP.TOTL.MA.IN 2015-06-23
## 227 United States          USA         female SP.POP.TOTL.FE.IN 2015-06-23
## 57         Canada          CAN           male SP.POP.TOTL.MA.IN 2016-06-23
## 114        Canada          CAN         female SP.POP.TOTL.FE.IN 2016-06-23
## 171 United States          USA           male SP.POP.TOTL.MA.IN 2016-06-23
## 228 United States          USA         female SP.POP.TOTL.FE.IN 2016-06-23
##     male_female_population        Region IncomeGroup total.population
## 1                  9057268 North America High income         17909009
## 58                 8851741 North America High income         17909009
## 115               89503312 North America High income        180671000
## 172               91167688 North America High income        180671000
## 2                  9230695 North America High income         18271000
## 59                 9040305 North America High income         18271000
## 116               90966224 North America High income        183691000
## 173               92724776 North America High income        183691000
## 3                  9392169 North America High income         18614000
## 60                 9221831 North America High income         18614000
## 117               92347226 North America High income        186538000
## 174               94190774 North America High income        186538000
## 4                  9555967 North America High income         18964000
## 61                 9408033 North America High income         18964000
## 118               93655480 North America High income        189242000
## 175               95586520 North America High income        189242000
## 5                  9725721 North America High income         19325000
## 62                 9599279 North America High income         19325000
## 119               94925902 North America High income        191889000
## 176               96963098 North America High income        191889000
## 6                  9893049 North America High income         19678000
## 63                 9784951 North America High income         19678000
## 120               96066449 North America High income        194303000
## 177               98236551 North America High income        194303000
## 7                 10070870 North America High income         20048000
## 64                 9977130 North America High income         20048000
## 121               97111152 North America High income        196560000
## 178               99448848 North America High income        196560000
## 8                 10247271 North America High income         20412000
## 65                10164729 North America High income         20412000
## 122               98088575 North America High income        198712000
## 179              100623425 North America High income        198712000
## 9                 10408910 North America High income         20744000
## 66                10335090 North America High income         20744000
## 123               98981930 North America High income        200706000
## 180              101724070 North America High income        200706000
## 10                10547218 North America High income         21028000
## 67                10480782 North America High income         21028000
## 124               99871190 North America High income        202677000
## 181              102805810 North America High income        202677000
## 11                10691816 North America High income         21324000
## 68                10632184 North America High income         21324000
## 125              100975878 North America High income        205052000
## 182              104076122 North America High income        205052000
## 12                10849611 North America High income         21645535
## 69                10795924 North America High income         21645535
## 126              102217027 North America High income        207661000
## 183              105443973 North America High income        207661000
## 13                11021094 North America High income         21993631
## 70                10972537 North America High income         21993631
## 127              103291747 North America High income        209896000
## 184              106604253 North America High income        209896000
## 14                11205990 North America High income         22369408
## 71                11163418 North America High income         22369408
## 128              104265126 North America High income        211909000
## 185              107643874 North America High income        211909000
## 15                11403846 North America High income         22774087
## 72                11370241 North America High income         22774087
## 129              105198491 North America High income        213854000
## 186              108655509 North America High income        213854000
## 16                11614581 North America High income         23209000
## 73                11594419 North America High income         23209000
## 130              106201066 North America High income        215973000
## 187              109771934 North America High income        215973000
## 17                11759666 North America High income         23518000
## 74                11758334 North America High income         23518000
## 131              107154502 North America High income        218035000
## 188              110880498 North America High income        218035000
## 18                11887015 North America High income         23796000
## 75                11908985 North America High income         23796000
## 132              108161709 North America High income        220239000
## 189              112077291 North America High income        220239000
## 19                11994122 North America High income         24036000
## 76                12041878 North America High income         24036000
## 133              109234212 North America High income        222585000
## 190              113350788 North America High income        222585000
## 20                12101895 North America High income         24277000
## 77                12175105 North America High income         24277000
## 134              110379788 North America High income        225055000
## 191              114675212 North America High income        225055000
## 21                12248211 North America High income         24593000
## 78                12344789 North America High income         24593000
## 135              111402057 North America High income        227225000
## 192              115822943 North America High income        227225000
## 22                12391458 North America High income         24900000
## 79                12508542 North America High income         24900000
## 136              112488832 North America High income        229466000
## 193              116977168 North America High income        229466000
## 23                12533389 North America High income         25202000
## 80                12668611 North America High income         25202000
## 137              113579226 North America High income        231664000
## 194              118084774 North America High income        231664000
## 24                12652568 North America High income         25456000
## 81                12803432 North America High income         25456000
## 138              114647762 North America High income        233792000
## 195              119144238 North America High income        233792000
## 25                12768649 North America High income         25702000
## 82                12933351 North America High income         25702000
## 139              115664343 North America High income        235825000
## 196              120160657 North America High income        235825000
## 26                12882440 North America High income         25942000
## 83                13059560 North America High income         25942000
## 140              116695844 North America High income        237924000
## 197              121228156 North America High income        237924000
## 27                13008005 North America High income         26204000
## 84                13195995 North America High income         26204000
## 141              117758675 North America High income        240133000
## 198              122374325 North America High income        240133000
## 28                13176159 North America High income         26550000
## 85                13373841 North America High income         26550000
## 142              118778957 North America High income        242289000
## 199              123510043 North America High income        242289000
## 29                13344207 North America High income         26895000
## 86                13550793 North America High income         26895000
## 143              119821864 North America High income        244499000
## 200              124677136 North America High income        244499000
## 30                13580939 North America High income         27379000
## 87                13798061 North America High income         27379000
## 144              120933139 North America High income        246819000
## 201              125885861 North America High income        246819000
## 31                13781230 North America High income         27791000
## 88                14009770 North America High income         27791000
## 145              122308854 North America High income        249623000
## 202              127314146 North America High income        249623000
## 32                13964994 North America High income         28171682
## 89                14206688 North America High income         28171682
## 146              123987526 North America High income        252981000
## 203              128993474 North America High income        252981000
## 33                14131783 North America High income         28519597
## 90                14387814 North America High income         28519597
## 147              125778639 North America High income        256514000
## 204              130735361 North America High income        256514000
## 34                14281671 North America High income         28833410
## 91                14551739 North America High income         28833410
## 148              127526177 North America High income        259919000
## 205              132392823 North America High income        259919000
## 35                14415118 North America High income         29111906
## 92                14696788 North America High income         29111906
## 149              129183889 North America High income        263126000
## 206              133942111 North America High income        263126000
## 36                14532342 North America High income         29354000
## 93                14821658 North America High income         29354000
## 150              130812745 North America High income        266278000
## 207              135465255 North America High income        266278000
## 37                14689130 North America High income         29671900
## 94                14982770 North America High income         29671900
## 151              132419522 North America High income        269394000
## 208              136974478 North America High income        269394000
## 38                14846373 North America High income         29987200
## 95                15140827 North America High income         29987200
## 152              134095800 North America High income        272657000
## 209              138561200 North America High income        272657000
## 39                14977834 North America High income         30247900
## 96                15270066 North America High income         30247900
## 153              135736774 North America High income        275854000
## 210              140117226 North America High income        275854000
## 40                15105132 North America High income         30499200
## 97                15394068 North America High income         30499200
## 154              137370537 North America High income        279040000
## 211              141669463 North America High income        279040000
## 41                15241867 North America High income         30769700
## 98                15527833 North America High income         30769700
## 155              138971948 North America High income        282162411
## 212              143190463 North America High income        282162411
## 42                15398999 North America High income         31081900
## 99                15682901 North America High income         31081900
## 156              140416645 North America High income        284968955
## 213              144552310 North America High income        284968955
## 43                15540087 North America High income         31362000
## 100               15821913 North America High income         31362000
## 157              141784968 North America High income        287625193
## 214              145840225 North America High income        287625193
## 44                15697939 North America High income         31676000
## 101               15978061 North America High income         31676000
## 158              143064009 North America High income        290107933
## 215              147043924 North America High income        290107933
## 45                15858471 North America High income         31995000
## 102               16136529 North America High income         31995000
## 159              144443341 North America High income        292805298
## 216              148361957 North America High income        292805298
## 46                16018324 North America High income         32312000
## 103               16293676 North America High income         32312000
## 160              145822985 North America High income        295516599
## 217              149693614 North America High income        295516599
## 47                16149592 North America High income         32570505
## 104               16420913 North America High income         32570505
## 161              147269908 North America High income        298379912
## 218              151110004 North America High income        298379912
## 48                16310314 North America High income         32887928
## 105               16577614 North America High income         32887928
## 162              148704288 North America High income        301231207
## 219              152526919 North America High income        301231207
## 49                16490904 North America High income         33245773
## 106               16754869 North America High income         33245773
## 163              150141736 North America High income        304093966
## 220              153952230 North America High income        304093966
## 50                16683127 North America High income         33628571
## 107               16945444 North America High income         33628571
## 164              151490722 North America High income        306771529
## 221              155280807 North America High income        306771529
## 51                16871227 North America High income         34005274
## 108               17134047 North America High income         34005274
## 165              152796770 North America High income        309348193
## 222              156551423 North America High income        309348193
## 52                17038535 North America High income         34342780
## 109               17304245 North America High income         34342780
## 166              153982182 North America High income        311663358
## 223              157681176 North America High income        311663358
## 53                17239606 North America High income         34750545
## 110               17510939 North America High income         34750545
## 167              155184527 North America High income        313998379
## 224              158813852 North America High income        313998379
## 54                17437353 North America High income         35152370
## 111               17715017 North America High income         35152370
## 168              156327578 North America High income        316204908
## 225              159877330 North America High income        316204908
## 55                17626367 North America High income         35535348
## 112               17908981 North America High income         35535348
## 169              157545543 North America High income        318563456
## 226              161017913 North America High income        318563456
## 56                17774151 North America High income         35832513
## 113               18058362 North America High income         35832513
## 170              158747453 North America High income        320896618
## 227              162149165 North America High income        320896618
## 57                17990452 North America High income         36264604
## 114               18274152 North America High income         36264604
## 171              159894419 North America High income        323127513
## 228              163233094 North America High income        323127513
##     population.percentage transformed.population
## 1                      NA              564047.43
## 58                     NA             6899526.15
## 115                    NA               10246.06
## 172                    NA             5669412.79
## 2                      NA              572821.56
## 59                     NA             7044034.35
## 116                    NA               10323.15
## 173                    NA             5750529.11
## 3                      NA              580963.40
## 60                     NA             7183101.55
## 117                    NA               10395.31
## 174                    NA             5826699.69
## 4                      NA              589195.83
## 61                     NA             7325703.78
## 118                    NA               10463.13
## 175                    NA             5899042.75
## 5                      NA              597699.95
## 62                     NA             7472120.10
## 119                    NA               10528.51
## 176                    NA             5970225.76
## 6                      NA              606055.52
## 63                     NA             7614222.62
## 120                    NA               10586.81
## 177                    NA             6035931.26
## 7                      NA              614906.29
## 64                     NA             7761258.02
## 121                    NA               10639.88
## 178                    NA             6098353.93
## 8                      NA              623657.67
## 65                     NA             7904743.86
## 122                    NA               10689.25
## 179                    NA             6158717.48
## 9                      NA              631652.11
## 66                     NA             8035006.99
## 123                    NA               10734.15
## 180                    NA             6215178.57
## 10                     NA              638474.28
## 67                     NA             8146379.13
## 124                    NA               10778.63
## 181                    NA             6270574.02
## 11                     NA              645588.94
## 68                     NA             8262088.93
## 125                    NA               10833.59
## 182                    NA             6335506.43
## 12                     NA              653332.52
## 69                     NA             8387197.30
## 126                    NA               10894.95
## 183                    NA             6405281.98
## 13                     NA              661724.09
## 70                     NA             8522106.16
## 127                    NA               10947.75
## 184                    NA             6464354.81
## 14                     NA              670744.85
## 71                     NA             8667873.30
## 128                    NA               10995.33
## 185                    NA             6517196.62
## 15                     NA              680367.28
## 72                     NA             8825767.88
## 129                    NA               11040.72
## 186                    NA             6568537.06
## 16                     NA              690581.94
## 73                     NA             8996857.88
## 130                    NA               11089.25
## 187                    NA             6625106.29
## 17                     NA              697594.38
## 74                     NA             9121921.00
## 131                    NA               11135.16
## 188                    NA             6681185.54
## 18                     NA              703736.31
## 75                     NA             9236838.45
## 132                    NA               11183.43
## 189                    NA             6741626.76
## 19                     NA              708892.50
## 76                     NA             9338189.92
## 133                    NA               11234.56
## 190                    NA             6805827.64
## 20                     NA              714072.12
## 77                     NA             9439777.45
## 134                    NA               11288.87
## 191                    NA             6872472.75
## 21                     NA              721090.41
## 78                     NA             9569137.26
## 135                    NA               11337.09
## 192                    NA             6930126.38
## 22                     NA              727946.38
## 79                     NA             9693947.49
## 136                    NA               11388.09
## 193                    NA             6988013.45
## 23                     NA              734724.84
## 80                     NA             9815923.58
## 137                    NA               11438.99
## 194                    NA             7043475.98
## 24                     NA              740405.65
## 81                     NA             9918640.24
## 138                    NA               11488.61
## 195                    NA             7096449.45
## 25                     NA              745929.22
## 82                     NA            10017605.19
## 139                    NA               11535.60
## 196                    NA             7147199.36
## 26                     NA              751334.76
## 83                     NA            10113728.24
## 140                    NA               11583.04
## 197                    NA             7200425.28
## 27                     NA              757289.30
## 84                     NA            10217622.23
## 141                    NA               11631.69
## 198                    NA             7257489.73
## 28                     NA              765246.78
## 85                     NA            10353023.54
## 142                    NA               11678.18
## 199                    NA             7313948.96
## 29                     NA              773180.38
## 86                     NA            10487714.50
## 143                    NA               11725.47
## 200                    NA             7371880.83
## 30                     NA              784325.12
## 87                     NA            10675879.17
## 144                    NA               11775.62
## 201                    NA             7431787.19
## 31                     NA              793726.08
## 88                     NA            10836939.76
## 145                    NA               11837.36
## 202                    NA             7502456.00
## 32                     NA              802328.96
## 89                     NA            10986711.56
## 146                    NA               11912.19
## 203                    NA             7585382.80
## 33                     NA              810118.93
## 90                     NA            11124441.81
## 147                    NA               11991.44
## 204                    NA             7671215.25
## 34                     NA              817104.93
## 91                     NA            11249067.37
## 148                    NA               12068.18
## 205                    NA             7752716.64
## 35                     NA              823313.17
## 92                     NA            11359322.80
## 149                    NA               12140.45
## 206                    NA             7828750.29
## 36                     NA              828757.86
## 93                     NA            11454225.16
## 150                    NA               12210.98
## 207                    NA             7903362.85
## 37                     NA              836027.42
## 94                     NA            11576652.16
## 151                    NA               12280.09
## 208                    NA             7977160.26
## 38                     NA              843303.57
## 95                     NA            11696736.43
## 152                    NA               12351.71
## 209                    NA             8054606.05
## 39                     NA              849375.70
## 96                     NA            11794910.65
## 153                    NA               12421.36
## 210                    NA             8130414.94
## 40                     NA              855246.10
## 97                     NA            11889093.68
## 154                    NA               12490.26
## 211                    NA             8205904.18
## 41                     NA              861541.43
## 98                     NA            11990677.86
## 155                    NA               12557.36
## 212                    NA             8279745.11
## 42                     NA              868762.89
## 99                     NA            12108421.81
## 156                    NA               12617.54
## 213                    NA             8345752.34
## 43                     NA              875235.32
## 100                    NA            12213957.90
## 157                    NA               12674.24
## 214                    NA             8408084.03
## 44                     NA              882463.84
## 101                    NA            12332485.05
## 158                    NA               12726.96
## 215                    NA             8466259.74
## 45                     NA              889801.23
## 102                    NA            12452753.55
## 159                    NA               12783.54
## 216                    NA             8529873.36
## 46                     NA              897093.85
## 103                    NA            12572000.09
## 160                    NA               12839.85
## 217                    NA             8594052.12
## 47                     NA              903072.27
## 104                    NA            12668536.27
## 161                    NA               12898.59
## 218                    NA             8662213.69
## 48                     NA              910379.84
## 105                    NA            12787410.04
## 162                    NA               12956.52
## 219                    NA             8730297.60
## 49                     NA              918574.78
## 106                    NA            12921853.63
## 163                    NA               13014.27
## 220                    NA             8798682.19
## 50                     NA              927279.27
## 107                    NA            13066373.74
## 164                    NA               13068.19
## 221                    NA             8862333.79
## 51                     NA              935779.00
## 108                    NA            13209371.80
## 165                    NA               13120.16
## 222                    NA             8923126.46
## 52                     NA              943324.38
## 109                    NA            13338392.80
## 166                    NA               13167.12
## 223                    NA             8977112.77
## 53                     NA              952374.21
## 110                    NA            13495051.77
## 167                    NA               13214.55
## 224                    NA             9031176.23
## 54                     NA              961255.29
## 111                    NA            13649697.83
## 168                    NA               13259.45
## 225                    NA             9081880.28
## 55                     NA              969726.65
## 112                    NA            13796652.26
## 169                    NA               13307.11
## 226                    NA             9136200.16
## 56                     NA              976338.35
## 113                    NA            13909810.89
## 170                    NA               13353.95
## 227                    NA             9190014.44
## 57                     NA              985997.02
## 114                    NA            14073248.01
## 171                    NA               13398.47
## 228                    NA             9241520.81