gapminder Dataset EDA

Exploratory data analysis in R of global data from Gapminder: a data frame with 1704 rows and 6 columns.

The gapminder dataset records GDP per capita (gdpPercap, Gross Domestic Product per capita), life expectancy (lifeExp), and population (pop) for 142 countries across five continents, observed every five years from 1952 to 2007.
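The overview output in this section can be reproduced with a few base R calls. The following is a minimal sketch, assuming the gapminder and dplyr packages are installed from CRAN; the exact chunk used for the original report is not shown, so the calls below are inferred from the shape of the printed output.

```r
# Assumption: the data frame is the `gapminder` object exported by the
# gapminder package (1704 rows, 6 columns).
library(gapminder)
library(dplyr)

head(gapminder)      # first six rows
names(gapminder)     # column names
summary(gapminder)   # per-column summaries
dim(gapminder)       # number of rows and columns
nrow(gapminder)      # number of observations

# Optional, if the Hmisc package is installed: a longer per-variable report.
# Hmisc::describe(gapminder)
```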

| country | continent | year | lifeExp | pop | gdpPercap |
|---|---|---|---|---|---|
| Afghanistan | Asia | 1952 | 28.801 | 8425333 | 779.4453 |
| Afghanistan | Asia | 1957 | 30.332 | 9240934 | 820.8530 |
| Afghanistan | Asia | 1962 | 31.997 | 10267083 | 853.1007 |
| Afghanistan | Asia | 1967 | 34.020 | 11537966 | 836.1971 |
| Afghanistan | Asia | 1972 | 36.088 | 13079460 | 739.9811 |
| Afghanistan | Asia | 1977 | 38.438 | 14880372 | 786.1134 |

## [1] "country"   "continent" "year"      "lifeExp"   "pop"       "gdpPercap"

| country | continent | year | lifeExp | pop | gdpPercap |
|---|---|---|---|---|---|
| Afghanistan: 12 | Africa :624 | Min. :1952 | Min. :23.60 | Min. :6.001e+04 | Min. : 241.2 |
| Albania : 12 | Americas:300 | 1st Qu.:1966 | 1st Qu.:48.20 | 1st Qu.:2.794e+06 | 1st Qu.: 1202.1 |
| Algeria : 12 | Asia :396 | Median :1980 | Median :60.71 | Median :7.024e+06 | Median : 3531.8 |
| Angola : 12 | Europe :360 | Mean :1980 | Mean :59.47 | Mean :2.960e+07 | Mean : 7215.3 |
| Argentina : 12 | Oceania : 24 | 3rd Qu.:1993 | 3rd Qu.:70.85 | 3rd Qu.:1.959e+07 | 3rd Qu.: 9325.5 |
| Australia : 12 | | Max. :2007 | Max. :82.60 | Max. :1.319e+09 | Max. :113523.1 |
| (Other) :1632 | | | | | |

## [1] 1704    6
## [1] 1704
## gapminder 
## 
##  6  Variables      1704  Observations
## --------------------------------------------------------------------------------
## country 
##        n  missing distinct 
##     1704        0      142 
## 
## lowest : Afghanistan        Albania            Algeria            Angola             Argentina         
## highest: Vietnam            West Bank and Gaza Yemen, Rep.        Zambia             Zimbabwe          
## --------------------------------------------------------------------------------
## continent 
##        n  missing distinct 
##     1704        0        5 
## 
## lowest : Africa   Americas Asia     Europe   Oceania 
## highest: Africa   Americas Asia     Europe   Oceania 
##                                                        
## Value        Africa Americas     Asia   Europe  Oceania
## Frequency       624      300      396      360       24
## Proportion    0.366    0.176    0.232    0.211    0.014
## --------------------------------------------------------------------------------
## year 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     1704        0       12    0.993     1980    19.87     1952     1957 
##      .25      .50      .75      .90      .95 
##     1966     1980     1993     2002     2007 
## 
## lowest : 1952 1957 1962 1967 1972, highest: 1987 1992 1997 2002 2007
##                                                                             
## Value       1952  1957  1962  1967  1972  1977  1982  1987  1992  1997  2002
## Frequency    142   142   142   142   142   142   142   142   142   142   142
## Proportion 0.083 0.083 0.083 0.083 0.083 0.083 0.083 0.083 0.083 0.083 0.083
##                 
## Value       2007
## Frequency    142
## Proportion 0.083
## --------------------------------------------------------------------------------
## lifeExp 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     1704        0     1626        1    59.47    14.82    38.49    41.51 
##      .25      .50      .75      .90      .95 
##    48.20    60.71    70.85    75.10    77.44 
## 
## lowest : 23.599 28.801 30.000 30.015 30.331, highest: 81.701 81.757 82.000 82.208 82.603
## --------------------------------------------------------------------------------
## pop 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     1704        0     1704        1 29601212 46384459   475459   946367 
##      .25      .50      .75      .90      .95 
##  2793664  7023596 19585222 54801370 89822054 
## 
## lowest :      60011      61325      63149      65345      70787
## highest: 1110396331 1164970000 1230075000 1280400000 1318683096
## --------------------------------------------------------------------------------
## gdpPercap 
##        n  missing distinct     Info     Mean      Gmd      .05      .10 
##     1704        0     1704        1     7215     8573    548.0    687.7 
##      .25      .50      .75      .90      .95 
##   1202.1   3531.8   9325.5  19449.1  26608.3 
## 
## lowest :    241.1659    277.5519    298.8462    299.8503    312.1884
## highest:  80894.8833  95458.1118 108382.3529 109347.8670 113523.1329
## --------------------------------------------------------------------------------

For the year 2007: What is the distribution of GDP per capita across all countries?
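A histogram is one way to look at this distribution. The sketch below assumes the gapminder, dplyr, and ggplot2 packages; the object names (`gap_2007`, `p_hist`), the bin count, and the log-scaled axis are illustrative choices and not necessarily those behind the original figure.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Keep only the 2007 observations (one row per country).
gap_2007 <- filter(gapminder, year == 2007)

# Histogram of GDP per capita; the log axis spreads out the long right tail.
p_hist <- ggplot(gap_2007, aes(x = gdpPercap)) +
  geom_histogram(bins = 30, fill = "steelblue", colour = "white") +
  scale_x_log10() +
  labs(
    title = "GDP per capita across countries, 2007",
    x = "GDP per capita (log scale)",
    y = "Number of countries"
  )
print(p_hist)

# Numeric companion to the plot: quartiles of GDP per capita in 2007.
quantile(gap_2007$gdpPercap)
```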

For the year 2007, how do the distributions differ across the different continents?
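Side-by-side boxplots are a common way to compare these distributions. The following is a hedged sketch under the same package assumptions (gapminder, dplyr, ggplot2); `by_continent` and `p_box` are hypothetical names, and the log scale is an illustrative choice.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

gap_2007 <- filter(gapminder, year == 2007)

# One boxplot per continent, on a log scale so poorer regions stay readable.
p_box <- ggplot(gap_2007, aes(x = continent, y = gdpPercap)) +
  geom_boxplot() +
  scale_y_log10() +
  labs(
    title = "GDP per capita by continent, 2007",
    x = "Continent",
    y = "GDP per capita (log scale)"
  )
print(p_box)

# Numeric summary: country count and median GDP per capita per continent.
by_continent <- gap_2007 %>%
  group_by(continent) %>%
  summarise(n = n(), median_gdp = median(gdpPercap))
print(by_continent)
```

Oceania contributes only two countries in this dataset, so its box should be read with caution.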

For the year 2007: What are the top 10 countries with the largest GDP per capita?

| country | gdpPercap |
|---|---|
| Norway | 49357.19 |
| Kuwait | 47306.99 |
| Singapore | 47143.18 |
| United States | 42951.65 |
| Ireland | 40676.00 |
| Hong Kong, China | 39724.98 |
| Switzerland | 37506.42 |
| Netherlands | 36797.93 |
| Canada | 36319.24 |
| Iceland | 36180.79 |
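
A ranking like the one above can be obtained by filtering to 2007 and keeping the ten largest values. This is a minimal sketch assuming gapminder and dplyr (version 1.0.0 or later, for `slice_max()`); `top10` is a hypothetical name.

```r
library(gapminder)
library(dplyr)

# Ten countries with the highest GDP per capita in 2007, largest first.
top10 <- gapminder %>%
  filter(year == 2007) %>%
  slice_max(gdpPercap, n = 10) %>%
  select(country, gdpPercap)

print(top10)
```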

Plot the GDP per capita for your country of origin for all years available.
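
A simple line chart covers this. The sketch below assumes the country of origin is Venezuela, since that is the country this report uses in its growth table; swap in another country name as needed. The names `venezuela` and `p_line` are illustrative.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

# Assumption: Venezuela is the country of interest.
venezuela <- filter(gapminder, country == "Venezuela")

# GDP per capita for every available year (1952 to 2007, five-year steps).
p_line <- ggplot(venezuela, aes(x = year, y = gdpPercap)) +
  geom_line() +
  geom_point() +
  labs(
    title = "Venezuela: GDP per capita, 1952-2007",
    x = "Year",
    y = "GDP per capita"
  )
print(p_line)
```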

What was the percent growth (or decline) in GDP per capita in 2007?

| country | continent | year | lifeExp | pop | gdpPercap | PercentGrowth |
|---|---|---|---|---|---|---|
| Venezuela | Americas | 2007 | 73.747 | 26084662 | 11415.81 | 32.66406 |
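
Because the data are recorded every five years, growth "in 2007" is naturally measured against the previous observation, 2002. The sketch below computes it that way, assuming gapminder and dplyr; the column name `PercentGrowth` follows the table above, while `growth` and `growth_2007` are hypothetical names. Computed this way, the 2007 value agrees with the 32.66 reported in the table.

```r
library(gapminder)
library(dplyr)

# Percent change in GDP per capita relative to the previous observation.
growth <- gapminder %>%
  filter(country == "Venezuela") %>%
  arrange(year) %>%
  mutate(
    PercentGrowth = (gdpPercap - dplyr::lag(gdpPercap)) /
      dplyr::lag(gdpPercap) * 100
  )

# Keep only the 2007 row (growth from 2002 to 2007).
growth_2007 <- filter(growth, year == 2007)
print(growth_2007)
```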

Where are the outliers?
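
The report does not state how outliers were identified. One common convention, and the one boxplots use, is to flag values more than 1.5 times the interquartile range beyond the quartiles. The sketch below applies that rule to 2007 GDP per capita within each continent; the choice of variable, year, and grouping are all assumptions, as are the names `outliers_2007`, `lower`, and `upper`.

```r
library(gapminder)
library(dplyr)

# Flag countries whose 2007 GDP per capita lies outside the
# 1.5 * IQR fences of their own continent (assumed definition).
outliers_2007 <- gapminder %>%
  filter(year == 2007) %>%
  group_by(continent) %>%
  mutate(
    lower = quantile(gdpPercap, 0.25, names = FALSE) - 1.5 * IQR(gdpPercap),
    upper = quantile(gdpPercap, 0.75, names = FALSE) + 1.5 * IQR(gdpPercap)
  ) %>%
  ungroup() %>%
  filter(gdpPercap < lower | gdpPercap > upper) %>%
  select(country, continent, gdpPercap, lower, upper)

print(outliers_2007)
```

Grouping by continent keeps a wealthy country from being flagged only because it is compared against much poorer regions; dropping the `group_by()` step gives the global version of the same rule.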

What has been the historical growth (or decline) in GDP per capita for your country?
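
One way to answer this is to compute the change between consecutive observations for every year, alongside the cumulative change since the first observation. The following is a sketch assuming gapminder, dplyr, and ggplot2, again with Venezuela as the assumed country; `history`, `GrowthSince1952`, and `p_growth` are hypothetical names.

```r
library(gapminder)
library(dplyr)
library(ggplot2)

history <- gapminder %>%
  filter(country == "Venezuela") %>%
  arrange(year) %>%
  mutate(
    # Change relative to the previous five-year observation.
    PercentGrowth = (gdpPercap / dplyr::lag(gdpPercap) - 1) * 100,
    # Cumulative change relative to the first year in the data (1952).
    GrowthSince1952 = (gdpPercap / first(gdpPercap) - 1) * 100
  )

print(select(history, year, gdpPercap, PercentGrowth, GrowthSince1952))

# Bar chart of period-over-period growth; 1952 has no previous value,
# so it is dropped before plotting.
p_growth <- ggplot(
  filter(history, !is.na(PercentGrowth)),
  aes(x = factor(year), y = PercentGrowth)
) +
  geom_col() +
  labs(
    title = "Venezuela: five-year growth in GDP per capita",
    x = "Year",
    y = "Percent change from previous observation"
  )
print(p_growth)
```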


## 
## To cite package 'gapminder' in publications use:
## 
##   Bryan J (2017). _gapminder: Data from Gapminder_. R package version
##   0.3.0, <https://CRAN.R-project.org/package=gapminder>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {gapminder: Data from Gapminder},
##     author = {Jennifer Bryan},
##     year = {2017},
##     note = {R package version 0.3.0},
##     url = {https://CRAN.R-project.org/package=gapminder},
##   }
## R version 4.2.1 (2022-06-23 ucrt)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 22000)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United States.utf8 
## [2] LC_CTYPE=English_United States.utf8   
## [3] LC_MONETARY=English_United States.utf8
## [4] LC_NUMERIC=C                          
## [5] LC_TIME=English_United States.utf8    
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] RColorBrewer_1.1-3 printr_0.2         Hmisc_4.7-1        Formula_1.2-4     
##  [5] survival_3.3-1     lattice_0.20-45    countrycode_1.4.0  rworldmap_1.3-6   
##  [9] sp_1.5-0           ggthemes_4.2.4     gapminder_0.3.0    forcats_0.5.1     
## [13] stringr_1.4.0      dplyr_1.0.9        purrr_0.3.4        readr_2.1.2       
## [17] tidyr_1.2.0        tibble_3.1.8       ggplot2_3.3.6      tidyverse_1.3.2   
## 
## loaded via a namespace (and not attached):
##  [1] fs_1.5.2            lubridate_1.8.0     httr_1.4.3         
##  [4] tools_4.2.1         backports_1.4.1     bslib_0.4.0        
##  [7] utf8_1.2.2          R6_2.5.1            rpart_4.1.16       
## [10] DBI_1.1.3           colorspace_2.0-3    nnet_7.3-17        
## [13] withr_2.5.0         tidyselect_1.1.2    gridExtra_2.3      
## [16] compiler_4.2.1      cli_3.3.0           rvest_1.0.2        
## [19] htmlTable_2.4.1     xml2_1.3.3          labeling_0.4.2     
## [22] bookdown_0.28       sass_0.4.2          checkmate_2.1.0    
## [25] scales_1.2.0        digest_0.6.29       foreign_0.8-82     
## [28] rmarkdown_2.14      base64enc_0.1-3     jpeg_0.1-9         
## [31] pkgconfig_2.0.3     htmltools_0.5.3     highr_0.9          
## [34] dbplyr_2.2.1        fastmap_1.1.0       maps_3.4.0         
## [37] htmlwidgets_1.5.4   rlang_1.0.4         readxl_1.4.0       
## [40] rstudioapi_0.13     farver_2.1.1        jquerylib_0.1.4    
## [43] generics_0.1.3      jsonlite_1.8.0      googlesheets4_1.0.0
## [46] magrittr_2.0.3      dotCall64_1.0-1     interp_1.1-3       
## [49] Matrix_1.4-1        Rcpp_1.0.9          munsell_0.5.0      
## [52] fansi_1.0.3         viridis_0.6.2       lifecycle_1.0.1    
## [55] stringi_1.7.8       yaml_2.3.5          grid_4.2.1         
## [58] maptools_1.1-4      crayon_1.5.1        deldir_1.0-6       
## [61] haven_2.5.0         splines_4.2.1       mapproj_1.2.8      
## [64] hms_1.1.1           knitr_1.39          pillar_1.8.0       
## [67] codetools_0.2-18    reprex_2.0.1        glue_1.6.2         
## [70] evaluate_0.15       latticeExtra_0.6-30 data.table_1.14.2  
## [73] modelr_0.1.8        vctrs_0.4.1         png_0.1-7          
## [76] rmdformats_1.0.4    spam_2.9-0          tzdb_0.3.0         
## [79] cellranger_1.1.0    gtable_0.3.0        assertthat_0.2.1   
## [82] cachem_1.0.6        xfun_0.31           broom_1.0.0        
## [85] googledrive_2.0.0   viridisLite_0.4.0   gargle_1.2.0       
## [88] cluster_2.1.3       fields_14.0         ellipsis_0.3.2