Global Country Information Dataset 2023

Polstat STIS

2025-03-11

1 Dataset Introduction

1.1 Attributes

  • Country: Name of the country.
  • Density (P/Km2): Population density measured in persons per square kilometer.
  • Abbreviation: Abbreviation or code representing the country.
  • Agricultural Land (%): Percentage of land area used for agricultural purposes.
  • Land Area (Km2): Total land area of the country in square kilometers.
  • Armed Forces Size: Size of the armed forces in the country.
  • Birth Rate: Number of births per 1,000 population per year.
  • Calling Code: International calling code for the country.
  • Capital/Major City: Name of the capital or major city.
  • CO2 Emissions: Carbon dioxide emissions in tons.
  • CPI: Consumer Price Index, a measure of inflation and purchasing power.
  • CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
  • Currency_Code: Currency code used in the country.
  • Fertility Rate: Average number of children born to a woman during her lifetime.
  • Forested Area (%): Percentage of land area covered by forests.
  • Gasoline_Price: Price of gasoline per liter in local currency.
  • GDP: Gross Domestic Product, the total value of goods and services produced in the country.
  • Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
  • Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
  • Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
  • Largest City: Name of the country’s largest city.
  • Life Expectancy: Average number of years a newborn is expected to live.
  • Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
  • Minimum Wage: Minimum wage level in local currency.
  • Official Language: Official language(s) spoken in the country.
  • Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
  • Physicians per Thousand: Number of physicians per thousand people.
  • Population: Total population of the country.
  • Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
  • Tax Revenue (%): Tax revenue as a percentage of GDP.
  • Total Tax Rate: Overall tax burden as a percentage of commercial profits.
  • Unemployment Rate: Percentage of the labor force that is unemployed.
  • Urban Population: Number of people living in urban areas (the column holds absolute counts, not percentages).
  • Latitude: Latitude coordinate of the country’s location.
  • Longitude: Longitude coordinate of the country’s location.

2 Importing Libraries

2.0.1 Check Which Packages Are Already Installed

  • installed.packages() returns the list of packages that are already installed.
  • !(my_essential_packages %in% installed.packages()[, "Package"]) flags the packages that are not yet installed.

2.0.2 Install the Missing Packages

  • install.packages(packages_to_install, dependencies = TRUE) installs every missing package together with its dependencies.

2.0.3 Load All Required Packages

  • lapply(..., library, character.only = TRUE, quietly = TRUE) loads the packages with minimal output on the console.
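The three steps above can be sketched as follows. This is a minimal sketch, not the report's original chunk: the helper names find_missing_packages() and ensure_packages() are hypothetical, and the contents of my_essential_packages are an assumption inferred from the functions called later in the report.

```r
# Return the subset of `pkgs` that is not installed yet.
find_missing_packages <- function(pkgs) {
  pkgs[!(pkgs %in% installed.packages()[, "Package"])]
}

# Install whatever is missing, then attach every package in `pkgs`.
ensure_packages <- function(pkgs) {
  packages_to_install <- find_missing_packages(pkgs)
  if (length(packages_to_install) > 0) {
    install.packages(packages_to_install, dependencies = TRUE)
  }
  # invisible() hides the list that lapply() would otherwise print.
  invisible(lapply(pkgs, library, character.only = TRUE, quietly = TRUE))
}

# Assumed package list, inferred from the functions used in later sections.
my_essential_packages <- c("data.table", "stringr", "dplyr",
                           "countrycode", "highcharter", "plotly")

# Report what would be installed, without installing anything.
print(find_missing_packages(my_essential_packages))
```

Calling ensure_packages(my_essential_packages) once at the top of the script would then install and attach everything in a single step.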

3 Data Cleaning

3.1 Reading the Dataset

fread() from data.table is used to read the CSV file; it is faster than read.csv().

path_data = "C:/Users/Hamdani Umar/Downloads/Dataset2023/world-data-2023.csv"
world_data <- fread(path_data)
head(world_data)
##                Country Density\n(P/Km2) Abbreviation Agricultural Land( %)
##                 <char>           <char>       <char>                <char>
## 1:         Afghanistan               60           AF                58.10%
## 2:             Albania              105           AL                43.10%
## 3:             Algeria               18           DZ                17.40%
## 4:             Andorra              164           AD                40.00%
## 5:              Angola               26           AO                47.50%
## 6: Antigua and Barbuda              223           AG                20.50%
##    Land Area(Km2) Armed Forces size Birth Rate Calling Code
##            <char>            <char>      <num>        <int>
## 1:        652,230           323,000      32.49           93
## 2:         28,748             9,000      11.78          355
## 3:      2,381,741           317,000      24.28          213
## 4:            468                         7.20          376
## 5:      1,246,700           117,000      40.73          244
## 6:            443                 0      15.33            1
##        Capital/Major City Co2-Emissions    CPI CPI Change (%) Currency-Code
##                    <char>        <char> <char>         <char>        <char>
## 1:                  Kabul         8,672  149.9          2.30%           AFN
## 2:                 Tirana         4,536 119.05          1.40%           ALL
## 3:                Algiers       150,006 151.36          2.00%           DZD
## 4:       Andorra la Vella           469                                 EUR
## 5:                 Luanda        34,693 261.73         17.10%           AOA
## 6: St. John's, Saint John           557 113.81          1.20%           XCD
##    Fertility Rate Forested Area (%) Gasoline Price               GDP
##             <num>            <char>         <char>            <char>
## 1:           4.47             2.10%          $0.70  $19,101,353,833 
## 2:           1.62            28.10%          $1.36  $15,278,077,447 
## 3:           3.02             0.80%          $0.28 $169,988,236,398 
## 4:           1.27            34.00%          $1.51   $3,154,057,987 
## 5:           5.52            46.30%          $0.97  $94,635,415,870 
## 6:           1.99            22.30%          $0.99   $1,727,759,259 
##    Gross primary education enrollment (%)
##                                    <char>
## 1:                                104.00%
## 2:                                107.00%
## 3:                                109.90%
## 4:                                106.40%
## 5:                                113.50%
## 6:                                105.00%
##    Gross tertiary education enrollment (%) Infant mortality
##                                     <char>            <num>
## 1:                                   9.70%             47.9
## 2:                                  55.00%              7.8
## 3:                                  51.40%             20.1
## 4:                                                      2.7
## 5:                                   9.30%             51.6
## 6:                                  24.80%              5.0
##              Largest city Life expectancy Maternal mortality ratio Minimum wage
##                    <char>           <num>                    <int>       <char>
## 1:                  Kabul            64.5                      638        $0.43
## 2:                 Tirana            78.5                       15        $1.12
## 3:                Algiers            76.7                      112        $0.95
## 4:       Andorra la Vella              NA                       NA        $6.63
## 5:                 Luanda            60.8                      241        $0.71
## 6: St. John's, Saint John            76.9                       42        $3.04
##    Official language Out of pocket health expenditure Physicians per thousand
##               <char>                           <char>                   <num>
## 1:            Pashto                           78.40%                    0.28
## 2:          Albanian                           56.90%                    1.20
## 3:            Arabic                           28.10%                    1.72
## 4:           Catalan                           36.40%                    3.33
## 5:        Portuguese                           33.40%                    0.21
## 6:           English                           24.30%                    2.76
##    Population Population: Labor force participation (%) Tax revenue (%)
##        <char>                                    <char>          <char>
## 1: 38,041,754                                    48.90%           9.30%
## 2:  2,854,191                                    55.70%          18.60%
## 3: 43,053,054                                    41.20%          37.20%
## 4:     77,142                                                          
## 5: 31,825,295                                    77.50%           9.20%
## 6:     97,118                                                    16.50%
##    Total tax rate Unemployment rate Urban_population  Latitude  Longitude
##            <char>            <char>           <char>     <num>      <num>
## 1:         71.40%            11.12%        9,797,273  33.93911  67.709953
## 2:         36.60%            12.33%        1,747,593  41.15333  20.168331
## 3:         66.10%            11.70%       31,510,100  28.03389   1.659626
## 4:                                            67,873  42.50628   1.521801
## 5:         49.10%             6.89%       21,061,025 -11.20269  17.873887
## 6:         43.00%                             23,800  17.06082 -61.796428

3.2 Cleaning Column Names

  • Replace spaces with underscores (_) so the names are easier to use in R.
  • Remove text in parentheses () and any % sign.
  • Remove a trailing underscore left at the end of a column name.
  • Convert uppercase letters to lowercase for consistency.
colnames(world_data) <- gsub(" ", "_", colnames(world_data))
colnames(world_data) <- gsub("\\s*\\(.*?\\)%*", "", colnames(world_data))
colnames(world_data) <- gsub("_$", "", colnames(world_data))
colnames(world_data) <- tolower(colnames(world_data))

colnames(world_data)
##  [1] "country"                              
##  [2] "density"                              
##  [3] "abbreviation"                         
##  [4] "agricultural_land"                    
##  [5] "land_area"                            
##  [6] "armed_forces_size"                    
##  [7] "birth_rate"                           
##  [8] "calling_code"                         
##  [9] "capital/major_city"                   
## [10] "co2-emissions"                        
## [11] "cpi"                                  
## [12] "cpi_change"                           
## [13] "currency-code"                        
## [14] "fertility_rate"                       
## [15] "forested_area"                        
## [16] "gasoline_price"                       
## [17] "gdp"                                  
## [18] "gross_primary_education_enrollment"   
## [19] "gross_tertiary_education_enrollment"  
## [20] "infant_mortality"                     
## [21] "largest_city"                         
## [22] "life_expectancy"                      
## [23] "maternal_mortality_ratio"             
## [24] "minimum_wage"                         
## [25] "official_language"                    
## [26] "out_of_pocket_health_expenditure"     
## [27] "physicians_per_thousand"              
## [28] "population"                           
## [29] "population:_labor_force_participation"
## [30] "tax_revenue"                          
## [31] "total_tax_rate"                       
## [32] "unemployment_rate"                    
## [33] "urban_population"                     
## [34] "latitude"                             
## [35] "longitude"
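
Some of the cleaned names still contain /, - or :, so they have to be wrapped in backticks when used with $. A possible follow-up step, which is not part of the original pipeline, is sketched below on a separate example vector (the variable names cleaned and syntactic are hypothetical).

```r
# Example names copied from the output above.
cleaned <- c("capital/major_city", "co2-emissions", "currency-code",
             "population:_labor_force_participation")

# Replace every run of characters other than lowercase letters,
# digits or underscores with a single underscore.
syntactic <- gsub("[^a-z0-9_]+", "_", cleaned)

# Collapse repeated underscores created by the previous step.
syntactic <- gsub("_+", "_", syntactic)

print(syntactic)
```

Applying this to colnames(world_data) would change the names that later sections refer to, so the later code would need the new names as well.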

3.3 Cleaning Values in Numeric Columns

  • Formatted numbers, such as values with thousands separators (,), percentages (%), and currency symbols ($), are stripped of these characters so they can be converted to a numeric type.

  • str_replace_all() from stringr removes the unwanted characters.

  • A regex check (^[0-9\\.]+$) ensures that only values made up of digits and dots are converted to numeric; any other value becomes NA.

  • NA values in GDP are replaced with 0 to prevent errors in statistical computations, so a GDP of 0 in later output may stand for a missing value.
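
The steps listed above can be condensed into one helper. The sketch below uses base R (gsub() and grepl()) instead of stringr so that it runs without extra packages; the function name clean_numeric() is hypothetical and is not used by the code in this report.

```r
# Strip thousands separators, percent signs and dollar signs, then convert
# values made up only of digits and dots to numeric; everything else is NA.
clean_numeric <- function(x) {
  stripped <- trimws(gsub("[,%$]", "", x))
  ifelse(grepl("^[0-9.]+$", stripped),
         suppressWarnings(as.numeric(stripped)),
         NA_real_)
}

# Sample values in the formats seen in the head() output.
raw <- c("652,230", "58.10%", "$19,101,353,833 ", "", "n/a")
print(clean_numeric(raw))
```

With such a helper, each column could be cleaned in one line, for example world_data$land_area <- clean_numeric(world_data$land_area).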

## Clean Density column
world_data$density <- str_replace_all(world_data$density, ',', '')
world_data$density <- ifelse(str_detect(world_data$density, '^[0-9\\.]+$'), as.numeric(world_data$density), NA)

## Clean Agricultural Land column
world_data$agricultural_land <- str_replace_all(world_data$agricultural_land, '%', '')
world_data$agricultural_land <- ifelse(str_detect(world_data$agricultural_land, '^[0-9\\.]+$'), as.numeric(world_data$agricultural_land), NA)
## Clean Armed Forces size column
world_data$armed_forces_size <- str_replace_all(world_data$armed_forces_size, ',', '')
world_data$armed_forces_size <- ifelse(str_detect(world_data$armed_forces_size, '^[0-9\\.]+$'), as.numeric(world_data$armed_forces_size), NA)

## Clean Land Area column
world_data$land_area <- str_replace_all(world_data$land_area, ',', '')
world_data$land_area <- ifelse(str_detect(world_data$land_area, '^[0-9\\.]+$'), as.numeric(world_data$land_area), NA)

## Clean Co2-Emissions column
world_data$`co2-emissions` <- str_replace_all(world_data$`co2-emissions`, ',', '')
world_data$`co2-emissions` <- ifelse(str_detect(world_data$`co2-emissions`, '^[0-9\\.]+$'), as.numeric(world_data$`co2-emissions`), NA)

## Clean Forested Area column
world_data$forested_area <- str_replace_all(world_data$forested_area, '%', '')
world_data$forested_area <- ifelse(str_detect(world_data$forested_area, '^[0-9\\.]+$'), as.numeric(world_data$forested_area), NA)

## Clean GDP column
world_data$gdp <- str_replace_all(world_data$gdp, '\\$', '')
world_data$gdp <- as.numeric(str_replace_all(world_data$gdp, ',', ''))
world_data$gdp[is.na(world_data$gdp)] <- 0

## Clean Population column
world_data$population <- str_replace_all(world_data$population, ',', '')
world_data$population <- ifelse(str_detect(world_data$population, '^[0-9\\.]+$'), as.numeric(world_data$population), NA)

## Clean Unemployment rate column
world_data$unemployment_rate <- str_replace_all(world_data$unemployment_rate, '%', '')
world_data$unemployment_rate <- ifelse(str_detect(world_data$unemployment_rate, '^[0-9\\.]+$'), as.numeric(world_data$unemployment_rate), NA)

## Clean Urban_population column
world_data$urban_population <- str_replace_all(world_data$urban_population, ',', '')
world_data$urban_population <- ifelse(str_detect(world_data$urban_population, '^[0-9\\.]+$'), as.numeric(world_data$urban_population), NA)

3.4 Renaming Incomplete or Misspelled Country Names

## Rename Country column to name
colnames(world_data)[1]<-"name"
colnames(world_data)
##  [1] "name"                                 
##  [2] "density"                              
##  [3] "abbreviation"                         
##  [4] "agricultural_land"                    
##  [5] "land_area"                            
##  [6] "armed_forces_size"                    
##  [7] "birth_rate"                           
##  [8] "calling_code"                         
##  [9] "capital/major_city"                   
## [10] "co2-emissions"                        
## [11] "cpi"                                  
## [12] "cpi_change"                           
## [13] "currency-code"                        
## [14] "fertility_rate"                       
## [15] "forested_area"                        
## [16] "gasoline_price"                       
## [17] "gdp"                                  
## [18] "gross_primary_education_enrollment"   
## [19] "gross_tertiary_education_enrollment"  
## [20] "infant_mortality"                     
## [21] "largest_city"                         
## [22] "life_expectancy"                      
## [23] "maternal_mortality_ratio"             
## [24] "minimum_wage"                         
## [25] "official_language"                    
## [26] "out_of_pocket_health_expenditure"     
## [27] "physicians_per_thousand"              
## [28] "population"                           
## [29] "population:_labor_force_participation"
## [30] "tax_revenue"                          
## [31] "total_tax_rate"                       
## [32] "unemployment_rate"                    
## [33] "urban_population"                     
## [34] "latitude"                             
## [35] "longitude"
## Rename mis-spelt or incomplete names
world_data$name[world_data$name=="United States"]<-"United States of America"
world_data$name[world_data$name=="Guinea-Bissau"]<-"Guinea Bissau"
world_data$name[world_data$name=="Tanzania"]<-"United Republic of Tanzania"
world_data$name[world_data$name=="Republic of Ireland"]<-"Ireland"
world_data$name[world_data$name=="Republic of the Congo"]<-"Republic of Congo"
world_data$name[world_data$name=="S�����������"]<-"São Tomé and Príncipe"
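
The last rename matches a garbled literal, which is fragile if the file is read with a different encoding. One alternative is to select the row by its country code instead. The sketch below runs on a toy data frame; that the abbreviation column holds "ST" for this row is an assumption, and "S???" is only a placeholder for the garbled name.

```r
# Toy stand-in for world_data (values are illustrative only).
toy <- data.frame(
  name = c("S???", "Republic of Ireland"),
  abbreviation = c("ST", "IE"),
  stringsAsFactors = FALSE
)

# which() ignores rows where the abbreviation is NA.
fix_row <- which(toy$abbreviation == "ST")
toy$name[fix_row] <- "São Tomé and Príncipe"

print(toy$name)
```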

3.5 Adding a Continent Column

## Add continent data
world_data$continent <- countrycode(world_data$name, origin = 'country.name', destination = 'continent')

3.6 Inspecting the Data Structure

Display the structure of the dataset to check each column's data type. The second argument of str(), 3, is passed as max.level (the nesting depth to display), not as a number of rows.

str(world_data, 3)
## Classes 'data.table' and 'data.frame':   195 obs. of  36 variables:
##  $ name                                 : chr  "Afghanistan" "Albania" "Algeria" "Andorra" ...
##  $ density                              : num  60 105 18 164 26 223 17 104 3 109 ...
##  $ abbreviation                         : chr  "AF" "AL" "DZ" "AD" ...
##  $ agricultural_land                    : num  58.1 43.1 17.4 40 47.5 20.5 54.3 58.9 48.2 32.4 ...
##  $ land_area                            : num  652230 28748 2381741 468 1246700 ...
##  $ armed_forces_size                    : num  323000 9000 317000 NA 117000 0 105000 49000 58000 21000 ...
##  $ birth_rate                           : num  32.5 11.8 24.3 7.2 40.7 ...
##  $ calling_code                         : int  93 355 213 376 244 1 54 374 61 43 ...
##  $ capital/major_city                   : chr  "Kabul" "Tirana" "Algiers" "Andorra la Vella" ...
##  $ co2-emissions                        : num  8672 4536 150006 469 34693 ...
##  $ cpi                                  : chr  "149.9" "119.05" "151.36" "" ...
##  $ cpi_change                           : chr  "2.30%" "1.40%" "2.00%" "" ...
##  $ currency-code                        : chr  "AFN" "ALL" "DZD" "EUR" ...
##  $ fertility_rate                       : num  4.47 1.62 3.02 1.27 5.52 1.99 2.26 1.76 1.74 1.47 ...
##  $ forested_area                        : num  2.1 28.1 0.8 34 46.3 22.3 9.8 11.7 16.3 46.9 ...
##  $ gasoline_price                       : chr  "$0.70" "$1.36" "$0.28" "$1.51" ...
##  $ gdp                                  : num  1.91e+10 1.53e+10 1.70e+11 3.15e+09 9.46e+10 ...
##  $ gross_primary_education_enrollment   : chr  "104.00%" "107.00%" "109.90%" "106.40%" ...
##  $ gross_tertiary_education_enrollment  : chr  "9.70%" "55.00%" "51.40%" "" ...
##  $ infant_mortality                     : num  47.9 7.8 20.1 2.7 51.6 5 8.8 11 3.1 2.9 ...
##  $ largest_city                         : chr  "Kabul" "Tirana" "Algiers" "Andorra la Vella" ...
##  $ life_expectancy                      : num  64.5 78.5 76.7 NA 60.8 76.9 76.5 74.9 82.7 81.6 ...
##  $ maternal_mortality_ratio             : int  638 15 112 NA 241 42 39 26 6 5 ...
##  $ minimum_wage                         : chr  "$0.43" "$1.12" "$0.95" "$6.63" ...
##  $ official_language                    : chr  "Pashto" "Albanian" "Arabic" "Catalan" ...
##  $ out_of_pocket_health_expenditure     : chr  "78.40%" "56.90%" "28.10%" "36.40%" ...
##  $ physicians_per_thousand              : num  0.28 1.2 1.72 3.33 0.21 2.76 3.96 4.4 3.68 5.17 ...
##  $ population                           : num  38041754 2854191 43053054 77142 31825295 ...
##  $ population:_labor_force_participation: chr  "48.90%" "55.70%" "41.20%" "" ...
##  $ tax_revenue                          : chr  "9.30%" "18.60%" "37.20%" "" ...
##  $ total_tax_rate                       : chr  "71.40%" "36.60%" "66.10%" "" ...
##  $ unemployment_rate                    : num  11.12 12.33 11.7 NA 6.89 ...
##  $ urban_population                     : num  9797273 1747593 31510100 67873 21061025 ...
##  $ latitude                             : num  33.9 41.2 28 42.5 -11.2 ...
##  $ longitude                            : num  67.71 20.17 1.66 1.52 17.87 ...
##  $ continent                            : chr  "Asia" "Europe" "Africa" "Europe" ...
##  - attr(*, ".internal.selfref")=<externalptr>
summary(world_data)
##      name              density        abbreviation       agricultural_land
##  Length:195         Min.   :    2.0   Length:195         Min.   : 0.60    
##  Class :character   1st Qu.:   35.5   Class :character   1st Qu.:21.70    
##  Mode  :character   Median :   89.0   Mode  :character   Median :39.60    
##                     Mean   :  356.8                      Mean   :39.12    
##                     3rd Qu.:  216.5                      3rd Qu.:55.38    
##                     Max.   :26337.0                      Max.   :82.60    
##                                                          NA's   :7        
##    land_area        armed_forces_size   birth_rate     calling_code   
##  Min.   :       0   Min.   :      0   Min.   : 5.90   Min.   :   1.0  
##  1st Qu.:   23828   1st Qu.:  11000   1st Qu.:11.30   1st Qu.:  82.5  
##  Median :  119511   Median :  31000   Median :17.95   Median : 255.5  
##  Mean   :  689624   Mean   : 159275   Mean   :20.21   Mean   : 360.5  
##  3rd Qu.:  524256   3rd Qu.: 142000   3rd Qu.:28.75   3rd Qu.: 506.8  
##  Max.   :17098240   Max.   :3031000   Max.   :46.08   Max.   :1876.0  
##  NA's   :1          NA's   :24        NA's   :6       NA's   :1       
##  capital/major_city co2-emissions         cpi             cpi_change       
##  Length:195         Min.   :     11   Length:195         Length:195        
##  Class :character   1st Qu.:   2304   Class :character   Class :character  
##  Mode  :character   Median :  12303   Mode  :character   Mode  :character  
##                     Mean   : 177799                                        
##                     3rd Qu.:  63884                                        
##                     Max.   :9893038                                        
##                     NA's   :7                                              
##  currency-code      fertility_rate  forested_area   gasoline_price    
##  Length:195         Min.   :0.980   Min.   : 0.00   Length:195        
##  Class :character   1st Qu.:1.705   1st Qu.:11.00   Class :character  
##  Mode  :character   Median :2.245   Median :32.00   Mode  :character  
##                     Mean   :2.698   Mean   :32.02                     
##                     3rd Qu.:3.598   3rd Qu.:48.17                     
##                     Max.   :6.910   Max.   :98.30                     
##                     NA's   :7       NA's   :7                         
##       gdp            gross_primary_education_enrollment
##  Min.   :0.000e+00   Length:195                        
##  1st Qu.:7.892e+09   Class :character                  
##  Median :3.412e+10   Mode  :character                  
##  Mean   :4.724e+11                                     
##  3rd Qu.:2.305e+11                                     
##  Max.   :2.143e+13                                     
##                                                        
##  gross_tertiary_education_enrollment infant_mortality largest_city      
##  Length:195                          Min.   : 1.40    Length:195        
##  Class :character                    1st Qu.: 6.00    Class :character  
##  Mode  :character                    Median :14.00    Mode  :character  
##                                      Mean   :21.33                      
##                                      3rd Qu.:32.70                      
##                                      Max.   :84.50                      
##                                      NA's   :6                          
##  life_expectancy maternal_mortality_ratio minimum_wage       official_language 
##  Min.   :52.80   Min.   :   2.0           Length:195         Length:195        
##  1st Qu.:67.00   1st Qu.:  13.0           Class :character   Class :character  
##  Median :73.20   Median :  53.0           Mode  :character   Mode  :character  
##  Mean   :72.28   Mean   : 160.4                                                
##  3rd Qu.:77.50   3rd Qu.: 186.0                                                
##  Max.   :85.40   Max.   :1150.0                                                
##  NA's   :8       NA's   :14                                                    
##  out_of_pocket_health_expenditure physicians_per_thousand   population       
##  Length:195                       Min.   :0.0100          Min.   :8.360e+02  
##  Class :character                 1st Qu.:0.3325          1st Qu.:1.963e+06  
##  Mode  :character                 Median :1.4600          Median :8.827e+06  
##                                   Mean   :1.8398          Mean   :3.938e+07  
##                                   3rd Qu.:2.9350          3rd Qu.:2.859e+07  
##                                   Max.   :8.4200          Max.   :1.398e+09  
##                                   NA's   :7               NA's   :1          
##  population:_labor_force_participation tax_revenue        total_tax_rate    
##  Length:195                            Length:195         Length:195        
##  Class :character                      Class :character   Class :character  
##  Mode  :character                      Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  unemployment_rate urban_population       latitude         longitude       
##  Min.   : 0.090    Min.   :     5464   Min.   :-40.901   Min.   :-175.198  
##  1st Qu.: 3.395    1st Qu.:  1152961   1st Qu.:  4.544   1st Qu.:  -7.941  
##  Median : 5.360    Median :  4678104   Median : 17.274   Median :  20.973  
##  Mean   : 6.886    Mean   : 22304543   Mean   : 19.092   Mean   :  20.232  
##  3rd Qu.: 9.490    3rd Qu.: 14903239   3rd Qu.: 40.125   3rd Qu.:  48.282  
##  Max.   :28.180    Max.   :842933962   Max.   : 64.963   Max.   : 178.065  
##  NA's   :19        NA's   :5           NA's   :1         NA's   :1         
##   continent        
##  Length:195        
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
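
The output above shows several columns (for example cpi, cpi_change and tax_revenue) that are still stored as character. A minimal sketch of how such columns could be listed before further cleaning is given below; it uses a toy data frame in place of world_data, and the helper looks_numeric() is hypothetical.

```r
# Toy stand-in for world_data with columns that are still character.
toy <- data.frame(
  cpi = c("149.9", "119.05", ""),
  tax_revenue = c("9.30%", "18.60%", ""),
  official_language = c("Pashto", "Albanian", "Catalan"),
  stringsAsFactors = FALSE
)

# A column "looks numeric" if every non-empty value consists of an optional
# minus sign, an optional $, digits/dots/commas, and an optional trailing %.
looks_numeric <- function(x) {
  filled <- trimws(x[!is.na(x) & nzchar(trimws(x))])
  length(filled) > 0 && all(grepl("^-?[$]?[0-9.,]+%?$", filled))
}

# Restrict the check to character columns.
chr_cols <- names(toy)[vapply(toy, is.character, logical(1))]
candidates <- chr_cols[vapply(toy[chr_cols], looks_numeric, logical(1))]

print(candidates)
```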

4 Exploratory Data Analysis

4.1 Total Population

4.1.1 Total Population - World Map

population<-world_data %>% 
  select(name, population)


highchart() %>% 
  hc_add_series_map(worldgeojson, df=population, value="population", joinBy = "name") %>% 
  hc_colorAxis(stops=color_stops()) %>% 
  hc_title(text="World Population") %>% 
  hc_tooltip(useHTML = TRUE,
             formatter = JS(
               "function(){",
               "  return '<b><u>'+this.point.name+'</u></b><br>'",
               "         +'<b>Population:</b> '+parseInt(this.point.value);",
               "}"
             )
  ) %>% 
  hc_legend(
    enabled = TRUE,
    title = list(text = "Population"),
    layout = "vertical",
    align = "right",
    verticalAlign = "middle"
  )

4.1.2 Top 10 Countries with Highest Population

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Highest Population") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Total Population")) %>%
  hc_add_series(
    data = population %>% 
      arrange(desc(population)) %>% 
      head(10) %>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = population, name = name),
    name = "Population",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.1.3 Top 10 Countries with Lowest Population

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Lowest Population") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Total Population")) %>%
  hc_add_series(
    data = population %>% 
      arrange(population) %>% 
      head(10) %>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = population, name = name),
    name = "Population",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )
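
The two charts above share the same ranking step: sort by a column, keep ten rows, and add a rank. A base R sketch of that step is shown below, assuming a plain data frame; the helper name top_n_rows() is hypothetical and, unlike arrange(), it drops rows with a missing value before ranking.

```r
# Return the n rows with the highest (or lowest) value of `column`,
# with a `rank` column added.
top_n_rows <- function(df, column, n = 10, decreasing = TRUE) {
  df <- df[!is.na(df[[column]]), , drop = FALSE]
  ord <- order(df[[column]], decreasing = decreasing)
  out <- df[head(ord, n), , drop = FALSE]
  out$rank <- seq_len(nrow(out))
  rownames(out) <- NULL
  out
}

# Toy data (illustrative values only).
toy <- data.frame(
  name = c("A", "B", "C", "D"),
  population = c(30, NA, 10, 20),
  stringsAsFactors = FALSE
)

print(top_n_rows(toy, "population", n = 2))                      # highest two
print(top_n_rows(toy, "population", n = 2, decreasing = FALSE))  # lowest two
```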

4.1.4 Population vs Land

population_by_continent <-
  world_data %>%
  group_by(continent) %>%
  summarise(
    # Lowercase name so that `values = ~population` below finds this column;
    # na.rm = TRUE keeps one missing value from turning a continent total into NA
    population = sum(population, na.rm = TRUE))

land_by_continent <-
  world_data %>%
  group_by(continent) %>%
  summarise(
    sum_land = sum(land_area, na.rm = TRUE),
    sum_agri = sum(agricultural_land,  na.rm = TRUE),
    forested_land = sum(forested_area,  na.rm = TRUE)
    ) 

# Pie charts: land_by_continent vs population_by_continent
# Grid rows and columns are zero-indexed, so the single row is row = 0
fig <- plot_ly()
fig <- fig %>% add_pie(data = land_by_continent, labels = ~continent, values = ~sum_land,
                       name = "Land", domain = list(row = 0, column = 0),
                       texttemplate="%{label}<br>(%{percent})",
                       textposition="inside",
                       title = "Land distribution")
fig <- fig %>% add_pie(data = population_by_continent, labels = ~continent, values = ~population,
                       name = "Population", domain = list(row = 0, column = 1),
                       texttemplate="%{label}<br>(%{percent})",
                       textposition="inside",
                       title = "Population distribution")
fig <- fig %>% layout(title = "Land distribution vs Population distribution", showlegend = TRUE,
                      grid=list(rows=1, columns=2),
                      xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
                      yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
fig
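
The shares that the two pies display can also be checked numerically. The sketch below uses base R tapply() on a toy data frame with illustrative values; the column names follow world_data, while the shares data frame is hypothetical.

```r
# Toy stand-in for world_data (values are illustrative only).
toy <- data.frame(
  continent = c("Asia", "Asia", "Europe", "Africa"),
  population = c(100, NA, 50, 50),
  land_area = c(40, 10, 30, 20),
  stringsAsFactors = FALSE
)

# Totals per continent; na.rm = TRUE skips missing values.
pop <- tapply(toy$population, toy$continent, sum, na.rm = TRUE)
land <- tapply(toy$land_area, toy$continent, sum, na.rm = TRUE)

# Share of the overall total held by each continent.
shares <- data.frame(
  continent = names(pop),
  population_share = as.numeric(pop / sum(pop)),
  land_share = as.numeric(land / sum(land)),
  stringsAsFactors = FALSE
)

print(shares)
```

A continent whose population share is well above its land share is more densely populated than the overall average.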

4.2 Population Density

4.2.1 Population Density - World Map

pop_density<-world_data %>% 
  select(name, density)

highchart() %>% 
  hc_add_series_map(worldgeojson, df=pop_density, joinBy = "name", value="density") %>% 
  hc_colorAxis(type = "logarithmic", stops=color_stops(), min=1) %>%
  hc_title(text = "Population Density - World Map") %>% 
  hc_tooltip(useHTML = TRUE,
             formatter = JS(
               "function(){",
               "  return '<b><u>'+this.point.name+'</u></b><br>'",
               "         +'<b>Population Density per Sq.KM:</b> '+parseInt(this.point.value);",
               "}"
             )
  ) %>% 
   hc_legend(
    enabled = TRUE,
    title = list(text = "Population Density"),
    layout = "vertical",
    align = "right",
    verticalAlign = "middle"
  )

4.2.2 Top 10 Countries with Highest Population Density

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Highest Population Density") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Population Density")) %>%
  hc_add_series(
    data = pop_density %>% 
      arrange(desc(density)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = density, name = name),
    name = "Density",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.2.3 Top 10 Countries with Lowest Population Density

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Lowest Population Density") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Population Density")) %>%
  hc_add_series(
    data = pop_density %>% 
      arrange(density) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = density, name = name),
    name = "Density",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )
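
The top-10 charts in this report all repeat the same ranking step (`arrange`, `head`, `mutate(rank = row_number())`). A minimal sketch of how that step could be factored out is shown below. The helper name `rank_countries` and the toy data are hypothetical and not part of the original analysis. Unlike the original pipelines, this sketch drops missing values explicitly before ranking.

```r
library(dplyr)

# Hypothetical helper (an assumption, not from the original report):
# returns the n rows with the highest (or lowest) values of `column`
# and adds a `rank` column starting at 1.
rank_countries <- function(df, column, n = 10, lowest = FALSE) {
  # Drop rows where the ranking column is missing
  df <- filter(df, !is.na(.data[[column]]))
  ordered <- if (lowest) {
    arrange(df, .data[[column]])
  } else {
    arrange(df, desc(.data[[column]]))
  }
  ordered %>%
    head(n) %>%
    mutate(rank = row_number())
}

# Toy data for illustration only; the values are made up.
toy <- data.frame(
  name = c("A", "B", "C", "D"),
  density = c(120, NA, 15, 480),
  stringsAsFactors = FALSE
)

top2 <- rank_countries(toy, "density", n = 2)
low2 <- rank_countries(toy, "density", n = 2, lowest = TRUE)
```

The returned data frame could then be passed as `data` to `hc_add_series()` in the same way as the inline pipelines above.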

4.3 Urban Population

4.3.1 Urban Population - World Map

# Urban Population
urban_total<-world_data %>% 
  select(name,land_area,  `capital/major_city`, population, urban_population, latitude, longitude )

highchart() %>%
  hc_add_series_map(worldgeojson, df=urban_total, value="urban_population", joinBy = "name") %>%
  hc_colorAxis(stops=color_stops()) %>%
  hc_title(text="Urban Population") %>%
  hc_tooltip(useHTML = TRUE,
             formatter = JS(
               "function(){",
               "  return '<b><u>'+this.point.name+'</u></b><br>'",
               "         +'<b>Urban Population:</b> '+parseInt(this.point.value);",
               "}"
             )
  ) %>%
  hc_legend(
    enabled = TRUE,
    title = list(text = "Urban Population"),
    layout = "vertical",
    align = "right",
    verticalAlign = "middle"
  )

4.3.2 Top 10 Countries with Highest Urban Population

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Highest Urban Population") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Urban Population")) %>%
  hc_add_series(
    data = urban_total %>% 
      arrange(desc(urban_population)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = urban_population, name = name),
    name = "Urban Population",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.3.3 Top 10 Countries with Lowest Urban Population

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Lowest Urban Population") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Urban Population")) %>%
  hc_add_series(
    data = urban_total %>% 
      arrange(urban_population) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = urban_population, name = name),
    name = "Urban Population",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.4 Armed Forces

4.4.1 Armed Forces - World Map

armed_forces<-world_data %>% 
  select(name, armed_forces_size)

highchart() %>% 
  hc_add_series_map(worldgeojson, df=armed_forces, joinBy = "name", value="armed_forces_size") %>% 
  hc_colorAxis(stops=color_stops()) %>% 
  hc_title(text = "Armed Forces - World Map") %>% 
  hc_tooltip(useHTML = TRUE,
             formatter = JS(
               "function(){",
               "  return '<b><u>'+this.point.name+'</u></b><br>'",
               "         +'<b>Armed Forces:</b> '+parseInt(this.point.value);",
               "}"
             )
  ) %>% 
   hc_legend(
    enabled = TRUE,
    title = list(text = "Total Armed Forces"),
    layout = "vertical",
    align = "right",
    verticalAlign = "middle"
  )

4.4.2 Top 10 Countries with Highest Armed Forces

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Highest Armed Forces") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Armed Forces")) %>%
  hc_add_series(
    data = armed_forces %>% 
      arrange(desc(armed_forces_size)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = armed_forces_size, name = name),
    name = "Armed Forces",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.4.3 Top 10 Countries with Lowest Armed Forces

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Lowest Armed Forces") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Armed Forces")) %>%
  hc_add_series(
    data = armed_forces %>% 
      arrange((armed_forces_size)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = armed_forces_size, name = name),
    name = "Armed Forces",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.5 Agricultural Land

4.5.1 Agricultural Land - World Map

agriculture_land_percentage<-world_data %>% 
  select(name, agricultural_land)

highchart() %>% 
  hc_add_series_map(worldgeojson, df=agriculture_land_percentage, value="agricultural_land", joinBy = "name") %>% 
  hc_colorAxis(minColor="lightblue", maxColor="blue")%>% 
  hc_title(text="Agricultural Land Percentage") %>% 
  hc_tooltip(useHTML = TRUE,
             formatter = JS(
               "function(){",
               "  return '<b><u>'+this.point.name+'</u></b><br>'",
               "         +'<b>Agricultural Land:</b> '+parseInt(this.point.value)+'%';",
               "}"
             )
  )

4.5.2 Top 10 Countries with Highest Agricultural Land

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Highest Agricultural Land") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Agricultural Land ")) %>%
  hc_add_series(
    data = agriculture_land_percentage %>% 
      arrange(desc(agricultural_land)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = agricultural_land, name = name),
    name = "Agricultural Land",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.5.3 Top 10 Countries with Lowest Agricultural Land

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Lowest Agricultural Land") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Agricultural Land ")) %>%
  hc_add_series(
    data = agriculture_land_percentage %>% 
      arrange((agricultural_land)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = agricultural_land, name = name),
    name = "Agricultural Land",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.5.4 Total Land vs Agricultural Land - Top 20 Highest

land_agri<-world_data %>% 
  select(name, land_area, agricultural_land)

highchart() %>%
  hc_chart(type = "bubble", zoomType = "xy") %>%
  hc_title(text = "Total Land vs Agricultural Land (Top 20 highest Countries by Agricultural Land)") %>%
  hc_xAxis(title = list(text = "Land Area in Sq.KM")) %>%
  hc_yAxis(title = list(text = "Agricultural Land (%)")) %>%
  hc_add_series(
    data = land_agri %>% 
      arrange(desc(agricultural_land)) %>% 
      head(20),
    type = "bubble",
    hcaes(x = land_area, y = agricultural_land, z = agricultural_land, name = name),
    name = "Agricultural Land %",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  ) %>%
  hc_tooltip(
    useHTML = TRUE,
    formatter = JS(
      "function() {",
      "  return '<b><u>' + this.point.options.name + '</u></b><br>'",
      "         + 'Land Area: ' + Highcharts.numberFormat(this.point.options.land_area) + ' Sq.KM<br>'",
      "         + 'Agricultural Land: ' + Highcharts.numberFormat(this.point.options.agricultural_land) + '%';",
      "}"
    )
  )

4.5.5 Total Land vs Agricultural Land - Top 20 Lowest

land_agri<-world_data %>% 
  select(name, land_area, agricultural_land)

highchart() %>%
  hc_chart(type = "bubble", zoomType = "xy") %>%
  hc_title(text = "Total Land vs Agricultural Land (Top 20 Lowest Countries by Agricultural Land)") %>%
  hc_xAxis(title = list(text = "Land Area in Sq.KM")) %>%
  hc_yAxis(title = list(text = "Agricultural Land (%)")) %>%
  hc_add_series(
    data = land_agri %>% 
      arrange((agricultural_land)) %>% 
      head(20),
    type = "bubble",
    hcaes(x = land_area, y = agricultural_land, z = agricultural_land, name = name),
    name = "Agricultural Land %",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  ) %>%
  hc_tooltip(
    useHTML = TRUE,
    formatter = JS(
      "function() {",
      "  return '<b><u>' + this.point.options.name + '</u></b><br>'",
      "         + 'Land Area: ' + Highcharts.numberFormat(this.point.options.land_area) + ' Sq.KM<br>'",
      "         + 'Agricultural Land: ' + Highcharts.numberFormat(this.point.options.agricultural_land) + '%';",
      "}"
    )
  )

4.6 Fertility vs Maternal Mortality vs Birth Rate vs Infant Mortality

4.6.1 Scatter Plot

mortality_fertility<-world_data %>% 
  select(name, population, fertility_rate, birth_rate, maternal_mortality_ratio, infant_mortality)


plot1<-highchart() %>% 
  hc_title(text = "Birth Rate") %>% 
  hc_add_series(data = mortality_fertility, 
                type = "scatter", 
                hcaes(x = birth_rate, y = population)) %>%
  hc_xAxis(title = list(text = "Birth Rate")) %>% 
  hc_yAxis(title = list(text = "Population")) %>%
  hc_add_theme(hc_theme_google()) %>% 
  hc_tooltip(useHTML = TRUE,
             headerFormat = "<b><u>{point.key}</u></b><br>",
             pointFormat = "<b>Birth Rate:</b> {point.x}<br><b>Population:</b> {point.y}")

plot2<-highchart() %>% 
  hc_title(text = "Fertility Rate") %>% 
  hc_add_series(data = mortality_fertility, 
                type = "scatter", 
                hcaes(x = fertility_rate, y = population)) %>%
  hc_xAxis(title = list(text = "Fertility Rate")) %>% 
  hc_yAxis(title = list(text = "Population")) %>%
  hc_add_theme(hc_theme_google()) %>% 
  hc_tooltip(useHTML = TRUE,
             headerFormat = "<b><u>{point.key}</u></b><br>",
             pointFormat = "<b>Fertility Rate:</b> {point.x}<br><b>Population:</b> {point.y}")

plot3<-highchart() %>% 
  hc_title(text = "Maternal Mortality Ratio") %>% 
  hc_add_series(data = mortality_fertility, 
                type = "scatter", 
                hcaes(x = maternal_mortality_ratio, y = population)) %>%
  hc_xAxis(title = list(text = "Maternal Mortality Ratio")) %>% 
  hc_yAxis(title = list(text = "Population")) %>%
  hc_add_theme(hc_theme_google()) %>% 
  hc_tooltip(useHTML = TRUE,
             headerFormat = "<b><u>{point.key}</u></b><br>",
             pointFormat = "<b>Maternal Mortality Ratio:</b> {point.x}<br><b>Population:</b> {point.y}")


plot4<-highchart() %>% 
  hc_title(text = "Infant Mortality") %>% 
  hc_add_series(data = mortality_fertility, 
                type = "scatter", 
                hcaes(x = infant_mortality, y = population)) %>%
  hc_xAxis(title = list(text = "Infant Mortality")) %>% 
  hc_yAxis(title = list(text = "Population")) %>%
  hc_add_theme(hc_theme_google()) %>% 
  hc_tooltip(useHTML = TRUE,
             headerFormat = "<b><u>{point.key}</u></b><br>",
             pointFormat = "<b>Infant Mortality:</b> {point.x}<br><b>Population:</b> {point.y}")

pair1 <- tagList(plot1, plot2)


pair2 <- tagList(plot3, plot4)


browsable(div(
  style = "display: flex; justify-content: space-between;",
  pair1
))
browsable(div(
  style = "display: flex; justify-content: space-between;",
  pair2
))

4.6.2 Population vs Birth rate vs Infant Mortality

  • The x-axis shows Population
  • The y-axis shows Birth Rate
  • The bubble size shows the Infant Mortality value
highchart() %>%
  hc_chart(type = "bubble", zoomType = "xy") %>%
  hc_title(text = "Population vs Birth rate vs Infant Mortality") %>%
  hc_xAxis(title = list(text = "Population")) %>%
  hc_yAxis(title = list(text = "Birth Rate")) %>%
  hc_add_series(
    data = mortality_fertility %>% 
      arrange(desc(fertility_rate)) %>% 
      head(20),
    type = "bubble",
    hcaes(x = population, y = birth_rate, z = infant_mortality , name = name),
    name = "Fertility Rate",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  ) %>%
  hc_tooltip(
    useHTML = TRUE,
    formatter = JS(
      "function() {",
      "  return '<b><u>' + this.point.options.name + '</u></b><br>'",
      "         + 'Population: ' + Highcharts.numberFormat(this.point.options.population) + '<br>'",
      "         + 'Birth Rate: ' + Highcharts.numberFormat(this.point.options.birth_rate) + '<br>'",
      "         + 'Infant Mortality: ' + Highcharts.numberFormat(this.point.options.infant_mortality) + '<br>'",
      "}"
    )
  )

4.7 Life Expectancy

4.7.1 Life Expectancy - World Map

life_expectancy<-world_data %>% 
  select(name, population, life_expectancy)

highchart() %>%
  hc_add_series_map(worldgeojson, df=life_expectancy, value="life_expectancy", joinBy = "name") %>%
  hc_colorAxis(stops=color_stops()) %>%
  hc_title(text="Life Expectancy") %>%
  hc_tooltip(useHTML = TRUE,
             formatter = JS(
               "function(){",
               "  return '<b><u>'+this.point.name+'</u></b><br>'",
               "         +'<b>Life Expectancy:</b> '+parseInt(this.point.value)",
               "}"
             )
  ) %>%
  hc_legend(
    enabled = TRUE,
    title = list(text = "Life Expectancy"),
    layout = "vertical",
    align = "right",
    verticalAlign = "middle"
  )

4.7.2 Top 10 Countries with Highest Life Expectancy

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Highest Life Expectancy") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Life Expectancy")) %>%
  hc_add_series(
    data = life_expectancy %>% 
      arrange(desc(life_expectancy)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = life_expectancy, name = name),
    name = "Life Expectancy",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.7.3 Top 10 Countries with Lowest Life Expectancy

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Lowest Life Expectancy") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "Life Expectancy")) %>%
  hc_add_series(
    data = life_expectancy %>% 
      arrange((life_expectancy)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = life_expectancy, name = name),
    name = "Life Expectancy",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.7.4 Life Expectancy by Continent

plot_ly(world_data, x = ~life_expectancy, color = ~continent, type="box",
        boxpoints = "all") %>%
  layout(title = "Life Expectancy Per Continent + Global",
         yaxis = list(title = 'Continent', standoff = 35)
         ) %>%
  add_trace(x = ~world_data$life_expectancy, name = "Global", color = "yellow") 
## Warning: Ignoring 8 observations
## Warning: Ignoring 8 observations

4.8 Unemployment Rate

4.8.1 Top 20 Countries with Highest Unemployment Rate

unemployment<-world_data %>% 
  select(name, unemployment_rate) %>% 
  arrange(desc(unemployment_rate)) %>% 
  slice(1:20)

highchart() %>% 
  hc_chart(type="column") %>% 
  hc_xAxis(categories=unemployment$name) %>% 
  hc_add_series(data=unemployment$unemployment_rate, name="Unemployment Rate") %>% 
  hc_title(text="Top 20 Countries with Highest Unemployment Rate") %>% 
  hc_add_theme(hc_theme_google()) %>% 
  hc_tooltip(useHTML=TRUE,
             headerFormat="<b><u>{point.key}</u></b><br>",
             pointFormat="<b>Unemployment Rate:</b> {point.y}%")

4.8.2 Top 20 Countries with Lowest Unemployment Rate

unemployment<-world_data %>% 
  select(name, unemployment_rate) %>% 
  arrange(unemployment_rate) %>% 
  slice(1:20)

highchart() %>% 
  hc_chart(type="column") %>% 
  hc_xAxis(categories=unemployment$name) %>% 
  hc_add_series(data=unemployment$unemployment_rate, name="Unemployment Rate") %>% 
  hc_title(text="Top 20 Countries with Lowest Unemployment Rate") %>% 
  hc_add_theme(hc_theme_google()) %>% 
  hc_tooltip(useHTML=TRUE,
             headerFormat="<b><u>{point.key}</u></b><br>",
             pointFormat="<b>Unemployment Rate:</b> {point.y}%")

4.9 CO2 Emissions

4.9.1 CO2 Emissions - World Map

co2_emissions<-world_data %>% 
  select(name, `co2-emissions`)

highchart() %>%
  hc_add_series_map(worldgeojson, df=co2_emissions, value="co2-emissions", joinBy = "name") %>%
  hc_colorAxis(minColor="#4FC978", maxColor="red") %>%
  hc_title(text="CO2 Emissions") %>%
  hc_tooltip(useHTML = TRUE,
             formatter = JS(
               "function(){",
               "  return '<b><u>'+this.point.name+'</u></b><br>'",
               "         +'<b>CO2 Emissions:</b> '+parseInt(this.point.value)",
               "}"
             )
  ) %>%
  hc_legend(
    enabled = TRUE,
    title = list(text = "CO2 Emissions"),
    layout = "vertical",
    align = "right",
    verticalAlign = "middle"
  )

4.9.2 Top 10 Countries with Highest CO2 Emissions

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Highest CO2 Emissions") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "CO2 Emissions")) %>%
  hc_add_series(
    data = co2_emissions %>% 
      arrange(desc(`co2-emissions`)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = `co2-emissions`, name = name),
    name = "CO2 Emissions",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.9.3 Top 10 Countries with Lowest CO2 Emissions

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Lowest CO2 Emissions") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "CO2 Emissions")) %>%
  hc_add_series(
    data = co2_emissions %>% 
      arrange((`co2-emissions`)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = `co2-emissions`, name = name),
    name = "CO2 Emissions",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.10 GDP

4.10.1 GDP - World Map

gdp<-world_data %>% 
  select(name, gdp)

highchart() %>%
  hc_add_series_map(worldgeojson, df=gdp, value="gdp", joinBy = "name") %>%
  hc_colorAxis(stops=color_stops()) %>%
  hc_title(text="GDP") %>%
  hc_tooltip(useHTML = TRUE,
             formatter = JS(
               "function(){",
               "  return '<b><u>'+this.point.name+'</u></b><br>'",
               "         +'<b>GDP:</b> '+parseInt(this.point.value)",
               "}"
             )
  ) %>%
  hc_legend(
    enabled = TRUE,
    title = list(text = "GDP"),
    layout = "vertical",
    align = "right",
    verticalAlign = "middle"
  )

4.10.2 Top 10 Countries with Highest GDP

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Highest GDP") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "GDP")) %>%
  hc_add_series(
    data = gdp %>% 
      arrange(desc(gdp)) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = gdp, name = name),
    name = "GDP",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.10.3 Top 10 Countries with Lowest GDP

highchart() %>%
  hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
  hc_title(text = "Top 10 Countries with Lowest GDP") %>%
  hc_xAxis(title = list(text = "Country")) %>%
  hc_yAxis(title = list(text = "GDP")) %>%
  hc_add_series(
    data = gdp %>% 
      arrange(gdp) %>% 
      head(10)%>% 
      mutate(rank = row_number()),
    type = "column",
    hcaes(x = rank, y = gdp, name = name),
    name = "GDP",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  )

4.10.4 Top 50 countries with the highest GDP by continent

GDP_top <-
  world_data %>%
  select(name, gdp, continent) %>%
  arrange(desc(gdp)) %>%
    head(50)

GDP_top %>%
  plot_ly(x = ~continent, y = ~gdp, color = ~name) %>%
  layout(title = "Top 50 countries with the highest GDP by continent <br> with country included",
         xaxis = list(categoryorder='total descending'),
         barmode = 'stack')
## No trace type specified:
##   Based on info supplied, a 'bar' trace seems appropriate.
##   Read more about this trace type -> https://plotly.com/r/reference/#bar
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors

4.10.5 GDP and Life Expectancy correlation

gdp_le<-world_data %>% 
  select(name, population, gdp, life_expectancy)

highchart() %>%
  hc_chart(type = "bubble", zoomType = "xy") %>%
  hc_title(text = "GDP and Life Expectancy correlation") %>%
  hc_xAxis(title = list(text = "GDP")) %>%
  hc_yAxis(title = list(text = "Life Expectancy")) %>%
  hc_add_series(
    data = gdp_le %>% 
      arrange(desc(gdp)) %>% 
      head(20),
    type = "bubble",
    hcaes(x = gdp, y = life_expectancy, z = gdp , name = name),
    name = "GDP",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  ) %>%
  hc_tooltip(
    useHTML = TRUE,
    formatter = JS(
      "function() {",
      "  return '<b><u>' + this.point.options.name + '</u></b><br>'",
      "         + 'GDP: ' + Highcharts.numberFormat(this.point.options.gdp) + '<br>'",
      "         + 'Life Expectancy : ' + Highcharts.numberFormat(this.point.options.life_expectancy);",
      "}"
    )
  )

4.10.6 GDP and Unemployment Rate correlation

gdp_ur<-world_data %>% 
  select(name, gdp, unemployment_rate)

highchart() %>%
  hc_chart(type = "bubble", zoomType = "xy") %>%
  hc_title(text = "GDP and Unemployment Rate correlation") %>%
  hc_xAxis(title = list(text = "GDP")) %>%
  hc_yAxis(title = list(text = "Unemployment Rate")) %>%
  hc_add_series(
    data = gdp_ur %>% 
      arrange(desc(gdp)) %>% 
      head(20),
    type = "bubble",
    hcaes(x = gdp, y = unemployment_rate, z = gdp , name = name),
    name = "GDP",
    dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
    colorByPoint = TRUE
  ) %>%
  hc_tooltip(
    useHTML = TRUE,
    formatter = JS(
      "function() {",
      "  return '<b><u>' + this.point.options.name + '</u></b><br>'",
      "         + 'GDP: ' + Highcharts.numberFormat(this.point.options.gdp) + '<br>'",
      "         + 'Unemployment Rate : ' + Highcharts.numberFormat(this.point.options.unemployment_rate);",
      "}"
    )
  )

4.10.7 GDP vs Birth Rate

world_data %>%
plot_ly(x = ~gdp, y = ~birth_rate, mode = 'markers', 
        color = ~continent,
        text = ~paste("Country: ", name, "<br>",
                      "Birth Rate: ", birth_rate, "per 1,000")) %>%
  layout(title = 'GDP vs Birth Rate',
         xaxis = list(title = 'GDP (US dollars)'),
         yaxis = list(title = 'Birth Rate (per 1,000 population)'))
## No trace type specified:
##   Based on info supplied, a 'scatter' trace seems appropriate.
##   Read more about this trace type -> https://plotly.com/r/reference/#scatter
## Warning: Ignoring 6 observations
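
The GDP charts in this section plot the total `gdp` column. If a per-capita view is wanted, one possible derivation is sketched below, assuming `gdp` and `population` are numeric columns as in `world_data`. The toy data frame and the column name `gdp_per_capita` are illustrative assumptions, not part of the original analysis.

```r
library(dplyr)

# Toy data for illustration only; the values are made up.
toy <- data.frame(
  name = c("A", "B", "C"),
  gdp = c(1000, 500, NA),
  population = c(10, 0, 20),
  stringsAsFactors = FALSE
)

# GDP per capita = GDP / population.
# Rows with a zero or missing population (or missing GDP) yield NA.
toy_pc <- toy %>%
  mutate(gdp_per_capita = if_else(population > 0, gdp / population, NA_real_))
```

The derived column could then replace `gdp` on the x-axis of the charts above, with the axis title changed accordingly.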

4.11 Languages

4.11.1 Languages across the globe

#Language by population
language_by_population <-
  world_data %>%
  group_by(official_language) %>%
  summarise(count = n(),
            population = sum(population, na.rm = TRUE)) %>%
  arrange(desc(population), desc(count))
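language_by_population <- rename(language_by_population, num_countries = count)
f_lang <- as.data.frame(head(language_by_population, 20))
# Collapse all remaining languages into a single "Other" row.
# A data frame row (rather than c(...)) keeps the numeric columns numeric.
f_lang <- rbind(
  f_lang,
  data.frame(official_language = "Other",
             num_countries = nrow(world_data) - sum(f_lang$num_countries),
             population = sum(world_data$population, na.rm = TRUE) - sum(f_lang$population))
)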

#####Pie language by population

plot_ly(f_lang, labels = ~official_language, 
        values = ~population, type = "pie",
        texttemplate="%{label}<br>(%{percent})",
        textposition="inside") 
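
The "top N languages plus Other" aggregation that feeds the pie chart can be sketched on made-up data as follows. The toy data frame and the object names `by_lang`, `top`, `other` and `lang_summary` are hypothetical. The sketch only illustrates the idea that the "Other" row should hold whatever the top rows do not account for.

```r
library(dplyr)

# Toy data for illustration only; the values are made up.
toy <- data.frame(
  name = c("A", "B", "C", "D", "E"),
  official_language = c("X", "X", "Y", "Z", "W"),
  population = c(10, 20, 5, 3, NA),
  stringsAsFactors = FALSE
)

# Countries and total population per language, largest first
by_lang <- toy %>%
  group_by(official_language) %>%
  summarise(num_countries = n(),
            population = sum(population, na.rm = TRUE)) %>%
  arrange(desc(population), desc(num_countries))

# Keep the top 2 languages and collapse the rest into "Other"
top <- as.data.frame(head(by_lang, 2))
other <- data.frame(
  official_language = "Other",
  num_countries = nrow(toy) - sum(top$num_countries),
  population = sum(toy$population, na.rm = TRUE) - sum(top$population),
  stringsAsFactors = FALSE
)
lang_summary <- rbind(top, other)
```

Building the extra row as a data frame keeps `population` numeric, which matters because the pie chart uses it as `values`.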

4.11.2 Top 10 most spoken languages by population

f_lang <- as.data.frame(head(language_by_population, 10))
df_lang_country <- subset(world_data, official_language %in% f_lang$official_language)

df_lang_country %>%
  plot_ly(x = ~official_language, y = ~population, color = ~name) %>%
  layout(title = "Top 10 most spoken languages by population, with country included",
         yaxis = list(title = 'Population'), barmode = 'stack',
         xaxis = list(categoryorder='total descending', title = "Official Language"))
## No trace type specified:
##   Based on info supplied, a 'bar' trace seems appropriate.
##   Read more about this trace type -> https://plotly.com/r/reference/#bar
## Warning: Ignoring 1 observations
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors

4.11.3 Top 10 languages by number of countries

language_by_countries <-
  language_by_population %>%
  arrange(desc(num_countries ))
f_lang <- as.data.frame(head(language_by_countries, 10))

f_lang %>%
  plot_ly(x = ~official_language, y = ~num_countries, color = ~official_language,
          type = "bar") %>%
  layout(title = "Languages by Number of Countries",
         xaxis = list(categoryorder='total descending', title = "Official Language"
          ),
         yaxis = list(title = "Number of Countries"),
         showlegend = F
  )
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors

4.12 Correlation

world_data %>%
  select(population, gdp, fertility_rate, birth_rate, maternal_mortality_ratio,
         infant_mortality, life_expectancy, physicians_per_thousand) %>%
  na.omit() %>%
  cor() %>%
  hchart()
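
Note that `na.omit()` drops every row that has a missing value in any selected column before `cor()` is computed. A minimal sketch on made-up data, comparing this with base R's pairwise option (`use = "pairwise.complete.obs"`), is shown below. The toy data frame is an assumption for illustration only.

```r
# Toy data for illustration only; the values are made up.
toy <- data.frame(
  x = c(1, 2, 3, 4, NA),
  y = c(2, 4, 6, 8, 10),
  z = c(5, NA, 3, 2, 1)
)

# Listwise deletion: only rows complete in all columns are used
complete_rows <- nrow(na.omit(toy))
complete_cor <- cor(na.omit(toy))

# Pairwise deletion: each pair of columns uses all rows complete for that pair
pairwise_cor <- cor(toy, use = "pairwise.complete.obs")
```

With many columns and scattered missing values, listwise deletion can shrink the sample considerably, so it may be worth checking how many rows remain before interpreting the matrix.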
