1 Dataset Introduction
1.1 Attributes
- Country: Name of the country.
- Density (P/Km2): Population density measured in persons per square kilometer.
- Abbreviation: Abbreviation or code representing the country.
- Agricultural Land (%): Percentage of land area used for agricultural purposes.
- Land Area (Km2): Total land area of the country in square kilometers.
- Armed Forces Size: Size of the armed forces in the country.
- Birth Rate: Number of births per 1,000 population per year.
- Calling Code: International calling code for the country.
- Capital/Major City: Name of the capital or major city.
- CO2 Emissions: Carbon dioxide emissions in tons.
- CPI: Consumer Price Index, a measure of inflation and purchasing power.
- CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
- Currency_Code: Currency code used in the country.
- Fertility Rate: Average number of children born to a woman during her lifetime.
- Forested Area (%): Percentage of land area covered by forests.
- Gasoline_Price: Price of gasoline per liter in local currency.
- GDP: Gross Domestic Product, the total value of goods and services produced in the country.
- Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
- Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
- Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
- Largest City: Name of the country’s largest city.
- Life Expectancy: Average number of years a newborn is expected to live.
- Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
- Minimum Wage: Minimum wage level in local currency.
- Official Language: Official language(s) spoken in the country.
- Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
- Physicians per Thousand: Number of physicians per thousand people.
- Population: Total population of the country.
- Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
- Tax Revenue (%): Tax revenue as a percentage of GDP.
- Total Tax Rate: Overall tax burden as a percentage of commercial profits.
- Unemployment Rate: Percentage of the labor force that is unemployed.
- Urban Population: Number of people living in urban areas (an absolute count, not a percentage).
- Latitude: Latitude coordinate of the country’s location.
- Longitude: Longitude coordinate of the country’s location.
2 Import Libraries
2.0.1 Check Which Packages Are Already Installed
installed.packages()
returns the packages that are already installed.
!(my_essential_packages %in% installed.packages()[, "Package"])
flags the packages that are not yet installed.
2.0.2 Install Missing Packages
install.packages(packages_to_install, dependencies = TRUE)
installs every missing package together with its dependencies.
2.0.3 Load All Required Packages
lapply(..., library, character.only = TRUE, quietly = TRUE)
loads the packages while keeping console output to a minimum.
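The three steps above can be combined into a single helper. The block below is a minimal sketch, not the report's actual setup chunk: find_missing_packages() and ensure_packages() are hypothetical names, and the demonstration call uses the base packages stats and utils so that nothing is downloaded. In the report, the vector passed in would be my_essential_packages.

```r
# Return the packages from `wanted` that are not installed yet.
find_missing_packages <- function(wanted) {
  wanted[!(wanted %in% installed.packages()[, "Package"])]
}

# Hypothetical helper: install whatever is missing, then attach everything.
ensure_packages <- function(wanted) {
  packages_to_install <- find_missing_packages(wanted)
  if (length(packages_to_install) > 0) {
    # Needs network access and a configured CRAN mirror.
    install.packages(packages_to_install, dependencies = TRUE)
  }
  invisible(lapply(wanted, library, character.only = TRUE, quietly = TRUE))
}

# Demonstration with packages that ship with R, so no installation is triggered.
ensure_packages(c("stats", "utils"))
```

Wrapping lapply() in invisible() keeps the list of attached packages from being printed, which quietly = TRUE alone does not do.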
3 Data Cleaning
3.1 Reading the Dataset
fread() from data.table is used to read the CSV file because it is faster than read.csv().
path_data = "C:/Users/Hamdani Umar/Downloads/Dataset2023/world-data-2023.csv"
world_data <- fread(path_data)
head(world_data)
## Country Density\n(P/Km2) Abbreviation Agricultural Land( %)
## <char> <char> <char> <char>
## 1: Afghanistan 60 AF 58.10%
## 2: Albania 105 AL 43.10%
## 3: Algeria 18 DZ 17.40%
## 4: Andorra 164 AD 40.00%
## 5: Angola 26 AO 47.50%
## 6: Antigua and Barbuda 223 AG 20.50%
## Land Area(Km2) Armed Forces size Birth Rate Calling Code
## <char> <char> <num> <int>
## 1: 652,230 323,000 32.49 93
## 2: 28,748 9,000 11.78 355
## 3: 2,381,741 317,000 24.28 213
## 4: 468 7.20 376
## 5: 1,246,700 117,000 40.73 244
## 6: 443 0 15.33 1
## Capital/Major City Co2-Emissions CPI CPI Change (%) Currency-Code
## <char> <char> <char> <char> <char>
## 1: Kabul 8,672 149.9 2.30% AFN
## 2: Tirana 4,536 119.05 1.40% ALL
## 3: Algiers 150,006 151.36 2.00% DZD
## 4: Andorra la Vella 469 EUR
## 5: Luanda 34,693 261.73 17.10% AOA
## 6: St. John's, Saint John 557 113.81 1.20% XCD
## Fertility Rate Forested Area (%) Gasoline Price GDP
## <num> <char> <char> <char>
## 1: 4.47 2.10% $0.70 $19,101,353,833
## 2: 1.62 28.10% $1.36 $15,278,077,447
## 3: 3.02 0.80% $0.28 $169,988,236,398
## 4: 1.27 34.00% $1.51 $3,154,057,987
## 5: 5.52 46.30% $0.97 $94,635,415,870
## 6: 1.99 22.30% $0.99 $1,727,759,259
## Gross primary education enrollment (%)
## <char>
## 1: 104.00%
## 2: 107.00%
## 3: 109.90%
## 4: 106.40%
## 5: 113.50%
## 6: 105.00%
## Gross tertiary education enrollment (%) Infant mortality
## <char> <num>
## 1: 9.70% 47.9
## 2: 55.00% 7.8
## 3: 51.40% 20.1
## 4: 2.7
## 5: 9.30% 51.6
## 6: 24.80% 5.0
## Largest city Life expectancy Maternal mortality ratio Minimum wage
## <char> <num> <int> <char>
## 1: Kabul 64.5 638 $0.43
## 2: Tirana 78.5 15 $1.12
## 3: Algiers 76.7 112 $0.95
## 4: Andorra la Vella NA NA $6.63
## 5: Luanda 60.8 241 $0.71
## 6: St. John's, Saint John 76.9 42 $3.04
## Official language Out of pocket health expenditure Physicians per thousand
## <char> <char> <num>
## 1: Pashto 78.40% 0.28
## 2: Albanian 56.90% 1.20
## 3: Arabic 28.10% 1.72
## 4: Catalan 36.40% 3.33
## 5: Portuguese 33.40% 0.21
## 6: English 24.30% 2.76
## Population Population: Labor force participation (%) Tax revenue (%)
## <char> <char> <char>
## 1: 38,041,754 48.90% 9.30%
## 2: 2,854,191 55.70% 18.60%
## 3: 43,053,054 41.20% 37.20%
## 4: 77,142
## 5: 31,825,295 77.50% 9.20%
## 6: 97,118 16.50%
## Total tax rate Unemployment rate Urban_population Latitude Longitude
## <char> <char> <char> <num> <num>
## 1: 71.40% 11.12% 9,797,273 33.93911 67.709953
## 2: 36.60% 12.33% 1,747,593 41.15333 20.168331
## 3: 66.10% 11.70% 31,510,100 28.03389 1.659626
## 4: 67,873 42.50628 1.521801
## 5: 49.10% 6.89% 21,061,025 -11.20269 17.873887
## 6: 43.00% 23,800 17.06082 -61.796428
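Many columns in the output above are typed <char> because of formatting characters in the raw file: a value such as "652,230" or "58.10%" cannot be parsed as a number, so the whole column is read as text. The sketch below reproduces this on a tiny in-memory sample. It uses base read.csv() rather than fread() so that it runs without data.table; the three rows are copied from the output above, and the simplified column names are assumptions.

```r
# Three rows taken from the head() output; column names simplified for the sketch.
csv_text <- 'Country,Land Area,Agricultural Land,Birth Rate
Afghanistan,"652,230",58.10%,32.49
Albania,"28,748",43.10%,11.78
Algeria,"2,381,741",17.40%,24.28'

sample_data <- read.csv(text = csv_text, check.names = FALSE, stringsAsFactors = FALSE)

# Columns containing separators or symbols come back as character;
# plain decimals such as Birth Rate come back as numeric.
column_types <- sapply(sample_data, class)
print(column_types)
```

This is why the cleaning steps in section 3.3 strip those characters before calling as.numeric().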
3.2 Cleaning Column Names
- Replace spaces with underscores (_) so the names are easier to use in R.
- Remove text in parentheses () together with any trailing %.
- Remove any underscore left at the end of a column name.
- Convert uppercase letters to lowercase for consistency.
colnames(world_data) <- gsub(" ", "_", colnames(world_data))
colnames(world_data) <- gsub("\\s*\\(.*?\\)%*", "", colnames(world_data))
colnames(world_data) <- gsub("_$", "", colnames(world_data))
colnames(world_data) <- tolower(colnames(world_data))
colnames(world_data)
## [1] "country"
## [2] "density"
## [3] "abbreviation"
## [4] "agricultural_land"
## [5] "land_area"
## [6] "armed_forces_size"
## [7] "birth_rate"
## [8] "calling_code"
## [9] "capital/major_city"
## [10] "co2-emissions"
## [11] "cpi"
## [12] "cpi_change"
## [13] "currency-code"
## [14] "fertility_rate"
## [15] "forested_area"
## [16] "gasoline_price"
## [17] "gdp"
## [18] "gross_primary_education_enrollment"
## [19] "gross_tertiary_education_enrollment"
## [20] "infant_mortality"
## [21] "largest_city"
## [22] "life_expectancy"
## [23] "maternal_mortality_ratio"
## [24] "minimum_wage"
## [25] "official_language"
## [26] "out_of_pocket_health_expenditure"
## [27] "physicians_per_thousand"
## [28] "population"
## [29] "population:_labor_force_participation"
## [30] "tax_revenue"
## [31] "total_tax_rate"
## [32] "unemployment_rate"
## [33] "urban_population"
## [34] "latitude"
## [35] "longitude"
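The four substitutions can be traced on a handful of the raw headers. The sketch below wraps the same gsub() and tolower() calls in a function; clean_column_names() is a hypothetical name, and the sample vector holds four headers copied from the head() output in section 3.1.

```r
# Apply the four cleaning steps from this section, in the same order.
clean_column_names <- function(x) {
  x <- gsub(" ", "_", x)                # spaces -> underscores
  x <- gsub("\\s*\\(.*?\\)%*", "", x)   # drop "(...)" and any % that follows
  x <- gsub("_$", "", x)                # drop a trailing underscore
  tolower(x)                            # lowercase everything
}

raw_names <- c("Density\n(P/Km2)", "Agricultural Land( %)", "CPI Change (%)", "Co2-Emissions")
cleaned <- clean_column_names(raw_names)
print(cleaned)
# "density" "agricultural_land" "cpi_change" "co2-emissions"
```

Names such as co2-emissions and capital/major_city are still not syntactic R names, which is why later code refers to them with backticks.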
3.3 Cleaning Values in Numeric Columns
Values formatted with thousands separators (,), percent signs (%), or currency symbols ($) have those characters removed so the columns can be converted to a numeric type.
str_replace_all() from stringr removes the unwanted characters. Each value is then checked against the regex ^[0-9\\.]+$ so that only strings made up of digits and decimal points are converted to numeric; any other value becomes NA.
NA values in GDP are replaced with 0 to prevent errors in statistical calculations.
## Clean Density column
world_data$density <- str_replace_all(world_data$density, ',', '')
world_data$density <- ifelse(str_detect(world_data$density, '^[0-9\\.]+$'), as.numeric(world_data$density), NA)
## Clean Agricultural Land column
world_data$agricultural_land <- str_replace_all(world_data$agricultural_land, '%', '')
world_data$agricultural_land <- ifelse(str_detect(world_data$agricultural_land, '^[0-9\\.]+$'), as.numeric(world_data$agricultural_land), NA)
## Clean Armed Forces size column
world_data$armed_forces_size <- str_replace_all(world_data$armed_forces_size, ',', '')
world_data$armed_forces_size <- ifelse(str_detect(world_data$armed_forces_size, '^[0-9\\.]+$'), as.numeric(world_data$armed_forces_size), NA)
## Clean Land Area column
world_data$land_area <- str_replace_all(world_data$land_area, ',', '')
world_data$land_area <- ifelse(str_detect(world_data$land_area, '^[0-9\\.]+$'), as.numeric(world_data$land_area), NA)
## Clean Co2-Emissions column
world_data$`co2-emissions` <- str_replace_all(world_data$`co2-emissions`, ',', '')
world_data$`co2-emissions` <- ifelse(str_detect(world_data$`co2-emissions`, '^[0-9\\.]+$'), as.numeric(world_data$`co2-emissions`), NA)
## Clean Forested Area column
world_data$forested_area <- str_replace_all(world_data$forested_area, '%', '')
world_data$forested_area <- ifelse(str_detect(world_data$forested_area, '^[0-9\\.]+$'), as.numeric(world_data$forested_area), NA)
## Clean GDP column
world_data$gdp <- str_replace_all(world_data$gdp, '\\$', '')
world_data$gdp <- as.numeric(str_replace_all(world_data$gdp, ',', ''))
world_data$gdp[is.na(world_data$gdp)] <- 0
## Clean Population column
world_data$population <- str_replace_all(world_data$population, ',', '')
world_data$population <- ifelse(str_detect(world_data$population, '^[0-9\\.]+$'), as.numeric(world_data$population), NA)
## Clean Unemployment rate column
world_data$unemployment_rate <- str_replace_all(world_data$unemployment_rate, '%', '')
world_data$unemployment_rate <- ifelse(str_detect(world_data$unemployment_rate, '^[0-9\\.]+$'), as.numeric(world_data$unemployment_rate), NA)
## Clean Urban_population column
world_data$urban_population <- str_replace_all(world_data$urban_population, ',', '')
world_data$urban_population <- ifelse(str_detect(world_data$urban_population, '^[0-9\\.]+$'), as.numeric(world_data$urban_population), NA)
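Every column above follows the same strip-then-validate pattern, so it can be expressed once as a function. The sketch below is one possible formulation, not the report's code: clean_numeric() is a hypothetical name, and base gsub() and grepl() stand in for stringr's str_replace_all() and str_detect() so the block runs without extra packages. The first three sample values are taken from the head() output in section 3.1.

```r
# Strip separators and symbols, then convert only values that look numeric.
clean_numeric <- function(x) {
  x <- gsub("[,%$]", "", x)              # remove , % and $
  is_number <- grepl("^[0-9.]+$", x)     # digits and decimal points only
  out <- rep(NA_real_, length(x))        # everything else stays NA
  out[is_number] <- as.numeric(x[is_number])
  out
}

values <- c("652,230", "58.10%", "$19,101,353,833", "", "n/a")
cleaned_values <- clean_numeric(values)
print(cleaned_values)
```

With such a helper each column needs one line, for example world_data$density <- clean_numeric(world_data$density). Like the regex in the report, this pattern rejects negative numbers, so it would need a leading -? before being applied to a column that can hold them.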
3.4 Renaming Incomplete or Misspelled Country Names
## [1] "name"
## [2] "density"
## [3] "abbreviation"
## [4] "agricultural_land"
## [5] "land_area"
## [6] "armed_forces_size"
## [7] "birth_rate"
## [8] "calling_code"
## [9] "capital/major_city"
## [10] "co2-emissions"
## [11] "cpi"
## [12] "cpi_change"
## [13] "currency-code"
## [14] "fertility_rate"
## [15] "forested_area"
## [16] "gasoline_price"
## [17] "gdp"
## [18] "gross_primary_education_enrollment"
## [19] "gross_tertiary_education_enrollment"
## [20] "infant_mortality"
## [21] "largest_city"
## [22] "life_expectancy"
## [23] "maternal_mortality_ratio"
## [24] "minimum_wage"
## [25] "official_language"
## [26] "out_of_pocket_health_expenditure"
## [27] "physicians_per_thousand"
## [28] "population"
## [29] "population:_labor_force_participation"
## [30] "tax_revenue"
## [31] "total_tax_rate"
## [32] "unemployment_rate"
## [33] "urban_population"
## [34] "latitude"
## [35] "longitude"
## Rename mis-spelt or incomplete names
world_data$name[world_data$name=="United States"]<-"United States of America"
world_data$name[world_data$name=="Guinea-Bissau"]<-"Guinea Bissau"
world_data$name[world_data$name=="Tanzania"]<-"United Republic of Tanzania"
world_data$name[world_data$name=="Republic of Ireland"]<-"Ireland"
world_data$name[world_data$name=="Republic of the Congo"]<-"Republic of Congo"
world_data$name[world_data$name=="S�����������"]<-"São Tomé and Príncipe"
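The one-assignment-per-country pattern above can also be written as a lookup table, which keeps the old and new names side by side. This is a minimal base R sketch under that assumption: rename_map and rename_countries() are hypothetical names, and the entry for São Tomé and Príncipe is left out because its original key is not legible here.

```r
# Old name -> new name, taken from the assignments above.
rename_map <- c(
  "United States"         = "United States of America",
  "Guinea-Bissau"         = "Guinea Bissau",
  "Tanzania"              = "United Republic of Tanzania",
  "Republic of Ireland"   = "Ireland",
  "Republic of the Congo" = "Republic of Congo"
)

# Replace only the names that appear in the map; leave the rest untouched.
rename_countries <- function(country_names, map) {
  needs_rename <- country_names %in% names(map)
  country_names[needs_rename] <- unname(map[country_names[needs_rename]])
  country_names
}

renamed <- rename_countries(c("Tanzania", "France", "United States"), rename_map)
print(renamed)
# "United Republic of Tanzania" "France" "United States of America"
```

In the report this would be applied as world_data$name <- rename_countries(world_data$name, rename_map).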
3.5 Adding a Continent Column
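The code for this step is not shown in the report; the structure output in section 3.6 only confirms that a character column named continent exists. One possible approach, sketched here under that assumption, is to look each country up in a reference table. The four-row continent_lookup table, the choice of abbreviation as the key, and the placeholder row are all hypothetical; a real run would need a table covering every country in the dataset.

```r
# Hypothetical reference table: ISO-style abbreviation -> continent.
continent_lookup <- data.frame(
  abbreviation = c("AF", "AL", "DZ", "AD"),
  continent    = c("Asia", "Europe", "Africa", "Europe"),
  stringsAsFactors = FALSE
)

# Sample rows; the last one is a placeholder with no match in the lookup.
countries <- data.frame(
  name         = c("Afghanistan", "Albania", "Algeria", "Andorra", "Placeholder"),
  abbreviation = c("AF", "AL", "DZ", "AD", "XX"),
  stringsAsFactors = FALSE
)

# match() preserves row order; countries without a match get NA.
idx <- match(countries$abbreviation, continent_lookup$abbreviation)
countries$continent <- continent_lookup$continent[idx]
print(countries)
```

Checking for NA in the new column afterwards shows which countries the reference table failed to cover.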
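3.6 Inspecting the Data Structure
Display the structure of the dataset, together with its first 3 rows, to confirm that the data types are correct. Columns that were not cleaned in section 3.3, such as cpi, tax_revenue, and minimum_wage, are still character.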
## Classes 'data.table' and 'data.frame': 195 obs. of 36 variables:
## $ name : chr "Afghanistan" "Albania" "Algeria" "Andorra" ...
## $ density : num 60 105 18 164 26 223 17 104 3 109 ...
## $ abbreviation : chr "AF" "AL" "DZ" "AD" ...
## $ agricultural_land : num 58.1 43.1 17.4 40 47.5 20.5 54.3 58.9 48.2 32.4 ...
## $ land_area : num 652230 28748 2381741 468 1246700 ...
## $ armed_forces_size : num 323000 9000 317000 NA 117000 0 105000 49000 58000 21000 ...
## $ birth_rate : num 32.5 11.8 24.3 7.2 40.7 ...
## $ calling_code : int 93 355 213 376 244 1 54 374 61 43 ...
## $ capital/major_city : chr "Kabul" "Tirana" "Algiers" "Andorra la Vella" ...
## $ co2-emissions : num 8672 4536 150006 469 34693 ...
## $ cpi : chr "149.9" "119.05" "151.36" "" ...
## $ cpi_change : chr "2.30%" "1.40%" "2.00%" "" ...
## $ currency-code : chr "AFN" "ALL" "DZD" "EUR" ...
## $ fertility_rate : num 4.47 1.62 3.02 1.27 5.52 1.99 2.26 1.76 1.74 1.47 ...
## $ forested_area : num 2.1 28.1 0.8 34 46.3 22.3 9.8 11.7 16.3 46.9 ...
## $ gasoline_price : chr "$0.70" "$1.36" "$0.28" "$1.51" ...
## $ gdp : num 1.91e+10 1.53e+10 1.70e+11 3.15e+09 9.46e+10 ...
## $ gross_primary_education_enrollment : chr "104.00%" "107.00%" "109.90%" "106.40%" ...
## $ gross_tertiary_education_enrollment : chr "9.70%" "55.00%" "51.40%" "" ...
## $ infant_mortality : num 47.9 7.8 20.1 2.7 51.6 5 8.8 11 3.1 2.9 ...
## $ largest_city : chr "Kabul" "Tirana" "Algiers" "Andorra la Vella" ...
## $ life_expectancy : num 64.5 78.5 76.7 NA 60.8 76.9 76.5 74.9 82.7 81.6 ...
## $ maternal_mortality_ratio : int 638 15 112 NA 241 42 39 26 6 5 ...
## $ minimum_wage : chr "$0.43" "$1.12" "$0.95" "$6.63" ...
## $ official_language : chr "Pashto" "Albanian" "Arabic" "Catalan" ...
## $ out_of_pocket_health_expenditure : chr "78.40%" "56.90%" "28.10%" "36.40%" ...
## $ physicians_per_thousand : num 0.28 1.2 1.72 3.33 0.21 2.76 3.96 4.4 3.68 5.17 ...
## $ population : num 38041754 2854191 43053054 77142 31825295 ...
## $ population:_labor_force_participation: chr "48.90%" "55.70%" "41.20%" "" ...
## $ tax_revenue : chr "9.30%" "18.60%" "37.20%" "" ...
## $ total_tax_rate : chr "71.40%" "36.60%" "66.10%" "" ...
## $ unemployment_rate : num 11.12 12.33 11.7 NA 6.89 ...
## $ urban_population : num 9797273 1747593 31510100 67873 21061025 ...
## $ latitude : num 33.9 41.2 28 42.5 -11.2 ...
## $ longitude : num 67.71 20.17 1.66 1.52 17.87 ...
## $ continent : chr "Asia" "Europe" "Africa" "Europe" ...
## - attr(*, ".internal.selfref")=<externalptr>
## name density abbreviation agricultural_land
## Length:195 Min. : 2.0 Length:195 Min. : 0.60
## Class :character 1st Qu.: 35.5 Class :character 1st Qu.:21.70
## Mode :character Median : 89.0 Mode :character Median :39.60
## Mean : 356.8 Mean :39.12
## 3rd Qu.: 216.5 3rd Qu.:55.38
## Max. :26337.0 Max. :82.60
## NA's :7
## land_area armed_forces_size birth_rate calling_code
## Min. : 0 Min. : 0 Min. : 5.90 Min. : 1.0
## 1st Qu.: 23828 1st Qu.: 11000 1st Qu.:11.30 1st Qu.: 82.5
## Median : 119511 Median : 31000 Median :17.95 Median : 255.5
## Mean : 689624 Mean : 159275 Mean :20.21 Mean : 360.5
## 3rd Qu.: 524256 3rd Qu.: 142000 3rd Qu.:28.75 3rd Qu.: 506.8
## Max. :17098240 Max. :3031000 Max. :46.08 Max. :1876.0
## NA's :1 NA's :24 NA's :6 NA's :1
## capital/major_city co2-emissions cpi cpi_change
## Length:195 Min. : 11 Length:195 Length:195
## Class :character 1st Qu.: 2304 Class :character Class :character
## Mode :character Median : 12303 Mode :character Mode :character
## Mean : 177799
## 3rd Qu.: 63884
## Max. :9893038
## NA's :7
## currency-code fertility_rate forested_area gasoline_price
## Length:195 Min. :0.980 Min. : 0.00 Length:195
## Class :character 1st Qu.:1.705 1st Qu.:11.00 Class :character
## Mode :character Median :2.245 Median :32.00 Mode :character
## Mean :2.698 Mean :32.02
## 3rd Qu.:3.598 3rd Qu.:48.17
## Max. :6.910 Max. :98.30
## NA's :7 NA's :7
## gdp gross_primary_education_enrollment
## Min. :0.000e+00 Length:195
## 1st Qu.:7.892e+09 Class :character
## Median :3.412e+10 Mode :character
## Mean :4.724e+11
## 3rd Qu.:2.305e+11
## Max. :2.143e+13
##
## gross_tertiary_education_enrollment infant_mortality largest_city
## Length:195 Min. : 1.40 Length:195
## Class :character 1st Qu.: 6.00 Class :character
## Mode :character Median :14.00 Mode :character
## Mean :21.33
## 3rd Qu.:32.70
## Max. :84.50
## NA's :6
## life_expectancy maternal_mortality_ratio minimum_wage official_language
## Min. :52.80 Min. : 2.0 Length:195 Length:195
## 1st Qu.:67.00 1st Qu.: 13.0 Class :character Class :character
## Median :73.20 Median : 53.0 Mode :character Mode :character
## Mean :72.28 Mean : 160.4
## 3rd Qu.:77.50 3rd Qu.: 186.0
## Max. :85.40 Max. :1150.0
## NA's :8 NA's :14
## out_of_pocket_health_expenditure physicians_per_thousand population
## Length:195 Min. :0.0100 Min. :8.360e+02
## Class :character 1st Qu.:0.3325 1st Qu.:1.963e+06
## Mode :character Median :1.4600 Median :8.827e+06
## Mean :1.8398 Mean :3.938e+07
## 3rd Qu.:2.9350 3rd Qu.:2.859e+07
## Max. :8.4200 Max. :1.398e+09
## NA's :7 NA's :1
## population:_labor_force_participation tax_revenue total_tax_rate
## Length:195 Length:195 Length:195
## Class :character Class :character Class :character
## Mode :character Mode :character Mode :character
##
##
##
##
## unemployment_rate urban_population latitude longitude
## Min. : 0.090 Min. : 5464 Min. :-40.901 Min. :-175.198
## 1st Qu.: 3.395 1st Qu.: 1152961 1st Qu.: 4.544 1st Qu.: -7.941
## Median : 5.360 Median : 4678104 Median : 17.274 Median : 20.973
## Mean : 6.886 Mean : 22304543 Mean : 19.092 Mean : 20.232
## 3rd Qu.: 9.490 3rd Qu.: 14903239 3rd Qu.: 40.125 3rd Qu.: 48.282
## Max. :28.180 Max. :842933962 Max. : 64.963 Max. : 178.065
## NA's :19 NA's :5 NA's :1 NA's :1
## continent
## Length:195
## Class :character
## Mode :character
##
##
##
##
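The summary above still reports several percentage and currency columns as character. A quick way to list such leftovers is to test whether a text column becomes purely numeric once separators and symbols are stripped. The sketch below is a base R illustration; looks_numeric() is a hypothetical helper, and the sample values are copied from the str() output above.

```r
# TRUE when a character column holds only numbers once , % and $ are removed.
looks_numeric <- function(x) {
  if (!is.character(x)) return(FALSE)
  stripped <- gsub("[,%$]", "", x[!is.na(x) & x != ""])
  length(stripped) > 0 && all(grepl("^-?[0-9.]+$", stripped))
}

sample_data <- data.frame(
  name         = c("Afghanistan", "Albania", "Algeria", "Andorra"),
  cpi          = c("149.9", "119.05", "151.36", ""),
  tax_revenue  = c("9.30%", "18.60%", "37.20%", ""),
  minimum_wage = c("$0.43", "$1.12", "$0.95", "$6.63"),
  density      = c(60, 105, 18, 164),
  stringsAsFactors = FALSE
)

# Names of the columns that are stored as text but contain numbers.
still_text <- names(sample_data)[sapply(sample_data, looks_numeric)]
print(still_text)
# "cpi" "tax_revenue" "minimum_wage"
```

Columns flagged this way are candidates for the same cleaning applied in section 3.3.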
4 Exploratory Data Analysis
4.1 Total Population
4.1.1 Total Population - World Map
population<-world_data %>%
select(name, population)
highchart() %>%
hc_add_series_map(worldgeojson, df=population, value="population", joinBy = "name") %>%
hc_colorAxis(stops=color_stops()) %>%
hc_title(text="World Population") %>%
hc_tooltip(useHTML = TRUE,
formatter = JS(
"function(){",
" return '<b><u>'+this.point.name+'</u></b><br>'",
" +'<b>Population:</b> '+parseInt(this.point.value);",
"}"
)
) %>%
hc_legend(
enabled = TRUE,
title = list(text = "Population"),
layout = "vertical",
align = "right",
verticalAlign = "middle"
)
4.1.2 Top 10 Countries with Highest Population
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Highest Population") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Total Population")) %>%
hc_add_series(
data = population %>%
arrange(desc(population)) %>%
head(10) %>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = population, name = name),
name = "Population",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.1.3 Top 10 Countries with Lowest Population
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Lowest Population") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Total Population")) %>%
hc_add_series(
data = population %>%
arrange(population) %>%
head(10) %>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = population, name = name),
name = "Population",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
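The two ranking charts above differ only in sort direction. The data preparation behind them (sort, keep the first rows, add a rank) can be sketched in base R as follows. rank_countries() is a hypothetical helper, order() stands in for the dplyr arrange(), head(), and mutate() chain, and the function assumes a plain data.frame rather than a data.table. The six populations come from the head() output in section 3.1; the last row is a placeholder added to show how missing values are handled.

```r
# Sort by `column`, drop missing values, keep n rows, and add a rank.
rank_countries <- function(df, column, n = 10, lowest = FALSE) {
  row_order <- order(df[[column]], decreasing = !lowest, na.last = NA)
  top <- head(df[row_order, , drop = FALSE], n)
  top$rank <- seq_len(nrow(top))
  top
}

population <- data.frame(
  name = c("Afghanistan", "Albania", "Algeria", "Andorra", "Angola",
           "Antigua and Barbuda", "Placeholder"),
  population = c(38041754, 2854191, 43053054, 77142, 31825295, 97118, NA),
  stringsAsFactors = FALSE
)

highest3 <- rank_countries(population, "population", n = 3)
lowest3  <- rank_countries(population, "population", n = 3, lowest = TRUE)
print(highest3$name)  # "Algeria" "Afghanistan" "Angola"
print(lowest3$name)   # "Andorra" "Antigua and Barbuda" "Albania"
```

Passing na.last = NA to order() removes rows with missing values, so a country without a recorded value never appears in either ranking.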
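4.1.4 Population vs Land
population_by_continent <-
world_data %>%
group_by(continent) %>%
summarise(
# Lowercase name matches values = ~population in the pie chart; na.rm skips countries with no value
population = sum(population, na.rm = TRUE))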
land_by_continent <-
world_data %>%
group_by(continent) %>%
summarise(
sum_land = sum(land_area, na.rm = TRUE),
sum_agri = sum(agricultural_land, na.rm = TRUE),
forested_land = sum(forested_area, na.rm = TRUE)
)
# Side-by-side pies: land area vs population by continent
fig <- plot_ly()
fig <- fig %>% add_pie(data = land_by_continent, labels = ~continent, values = ~sum_land,
name = "Land", domain = list(row = 0, column = 0), # grid rows and columns are zero-indexed
texttemplate="%{label}<br>(%{percent})",
textposition="inside",
title = "Land distribution")
fig <- fig %>% add_pie(data = population_by_continent, labels = ~continent, values = ~population,
name = "Population", domain = list(row = 0, column = 1),
texttemplate="%{label}<br>(%{percent})",
textposition="inside",
title = "Population distribution")
fig <- fig %>% layout(title = "Land distribution vs Population distribution", showlegend = TRUE,
grid=list(rows=1, columns=2),
xaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE),
yaxis = list(showgrid = FALSE, zeroline = FALSE, showticklabels = FALSE))
fig
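Continent totals are sensitive to missing values: sum() returns NA as soon as one country in a group has no value, and an NA total cannot be drawn as a pie slice. The sketch below shows the difference with base tapply(), used here in place of the dplyr group_by() and summarise() chain. The populations are from the head() output in section 3.1; the placeholder row with a missing population is an assumption added for illustration.

```r
sample_data <- data.frame(
  name = c("Afghanistan", "Albania", "Algeria", "Andorra", "Angola", "Placeholder"),
  continent = c("Asia", "Europe", "Africa", "Europe", "Africa", "Europe"),
  population = c(38041754, 2854191, 43053054, 77142, 31825295, NA),
  stringsAsFactors = FALSE
)

# Without na.rm, one missing value turns the whole group total into NA.
naive_totals <- tapply(sample_data$population, sample_data$continent, sum)

# With na.rm = TRUE, missing values are skipped.
safe_totals <- tapply(sample_data$population, sample_data$continent, sum, na.rm = TRUE)

print(naive_totals)
print(safe_totals)
```

The same na.rm = TRUE argument works inside summarise(), as the land-area totals in this section already do.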
4.2 Population Density
4.2.1 Population Density - World Map
pop_density<-world_data %>%
select(name, density)
highchart() %>%
hc_add_series_map(worldgeojson, df=pop_density, joinBy = "name", value="density") %>%
hc_colorAxis(type = "logarithmic", stops=color_stops(), min=1) %>%
hc_title(text = "Population Density - World Map") %>%
hc_tooltip(useHTML = TRUE,
formatter = JS(
"function(){",
" return '<b><u>'+this.point.name+'</u></b><br>'",
" +'<b>Population Density per Sq.KM:</b> '+parseInt(this.point.value);",
"}"
)
) %>%
hc_legend(
enabled = TRUE,
title = list(text = "Population Density"),
layout = "vertical",
align = "right",
verticalAlign = "middle"
)
4.2.2 Top 10 Countries with Highest Population Density
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Highest Population Density") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Population Density")) %>%
hc_add_series(
data = pop_density %>%
arrange(desc(density)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = density, name = name),
name = "Density",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.2.3 Top 10 Countries with Lowest Population Density
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Lowest Population Density") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Population Density")) %>%
hc_add_series(
data = pop_density %>%
arrange(density) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = density, name = name),
name = "Density",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.3 Urban Population
4.3.1 Urban Population - World Map
# Urban Population
urban_total<-world_data %>%
select(name,land_area, `capital/major_city`, population, urban_population, latitude, longitude )
highchart() %>%
hc_add_series_map(worldgeojson, df=urban_total, value="urban_population", joinBy = "name") %>%
hc_colorAxis(stops=color_stops()) %>%
hc_title(text="Urban Population") %>%
hc_tooltip(useHTML = TRUE,
formatter = JS(
"function(){",
" return '<b><u>'+this.point.name+'</u></b><br>'",
" +'<b>Urban Population:</b> '+parseInt(this.point.value);",
"}"
)
) %>%
hc_legend(
enabled = TRUE,
title = list(text = "Urban Population"),
layout = "vertical",
align = "right",
verticalAlign = "middle"
)
4.3.2 Top 10 Countries with Highest Urban Population
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Highest Urban Population") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Urban Population")) %>%
hc_add_series(
data = urban_total %>%
arrange(desc(urban_population)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = urban_population, name = name),
name = "Urban Population",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.3.3 Top 10 Countries with Lowest Urban Population
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Lowest Urban Population") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Urban Population")) %>%
hc_add_series(
data = urban_total %>%
arrange(urban_population) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = urban_population, name = name),
name = "Urban Population",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.4 Armed Forces
4.4.1 Armed Forces - World Map
armed_forces<-world_data %>%
select(name, armed_forces_size)
highchart() %>%
hc_add_series_map(worldgeojson, df=armed_forces, joinBy = "name", value="armed_forces_size") %>%
hc_colorAxis(stops=color_stops()) %>%
hc_title(text = "Armed Forces - World Map") %>%
hc_tooltip(useHTML = TRUE,
formatter = JS(
"function(){",
" return '<b><u>'+this.point.name+'</u></b><br>'",
" +'<b>Armed Forces:</b> '+parseInt(this.point.value);",
"}"
)
) %>%
hc_legend(
enabled = TRUE,
title = list(text = "Total Armed Forces"),
layout = "vertical",
align = "right",
verticalAlign = "middle"
)
4.4.2 Top 10 Countries with Highest Armed Forces
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Highest Armed Forces") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Armed Forces")) %>%
hc_add_series(
data = armed_forces %>%
arrange(desc(armed_forces_size)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = armed_forces_size, name = name),
name = "Armed Forces",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.4.3 Top 10 Countries with Lowest Armed Forces
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Lowest Armed Forces") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Armed Forces")) %>%
hc_add_series(
data = armed_forces %>%
arrange(armed_forces_size) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = armed_forces_size, name = name),
name = "Armed Forces",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.5 Agricultural Land
4.5.1 Agricultural Land - World Map
agriculture_land_percentage<-world_data %>%
select(name, agricultural_land)
highchart() %>%
hc_add_series_map(worldgeojson, df=agriculture_land_percentage, value="agricultural_land", joinBy = "name") %>%
hc_colorAxis(minColor="lightblue", maxColor="blue")%>%
hc_title(text="Agricultural Land Percentage") %>%
hc_tooltip(useHTML = TRUE,
formatter = JS(
"function(){",
" return '<b><u>'+this.point.name+'</u></b><br>'",
" +'<b>Agricultural Land:</b> '+parseInt(this.point.value)+'%';",
"}"
)
)
4.5.2 Top 10 Countries with Highest Agricultural Land
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Highest Agricultural Land") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Agricultural Land ")) %>%
hc_add_series(
data = agriculture_land_percentage %>%
arrange(desc(agricultural_land)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = agricultural_land, name = name),
name = "Agricultural Land",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.5.3 Top 10 Countries with Lowest Agricultural Land
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Lowest Agricultural Land") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Agricultural Land ")) %>%
hc_add_series(
data = agriculture_land_percentage %>%
arrange(agricultural_land) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = agricultural_land, name = name),
name = "Agricultural Land",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.5.4 Total Land vs Agricultural Land Top 20 Highest
land_agri<-world_data %>%
select(name, land_area, agricultural_land)
highchart() %>%
hc_chart(type = "bubble", zoomType = "xy") %>%
hc_title(text = "Total Land vs Agricultural Land (Top 20 highest Countries by Agricultural Land)") %>%
hc_xAxis(title = list(text = "Land Area in Sq.KM")) %>%
hc_yAxis(title = list(text = "Agricultural Land (%)")) %>%
hc_add_series(
data = land_agri %>%
arrange(desc(agricultural_land)) %>%
head(20),
type = "bubble",
hcaes(x = land_area, y = agricultural_land, z = agricultural_land, name = name),
name = "Agricultural Land %",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
) %>%
hc_tooltip(
useHTML = TRUE,
formatter = JS(
"function() {",
" return '<b><u>' + this.point.options.name + '</u></b><br>'",
" + 'Land Area: ' + Highcharts.numberFormat(this.point.options.land_area) + ' Sq.KM<br>'",
" + 'Agricultural Land: ' + Highcharts.numberFormat(this.point.options.agricultural_land) + '%';",
"}"
)
)
4.5.5 Total Land vs Agricultural Land Top 20 Lowest
land_agri<-world_data %>%
select(name, land_area, agricultural_land)
highchart() %>%
hc_chart(type = "bubble", zoomType = "xy") %>%
hc_title(text = "Total Land vs Agricultural Land (Top 20 Lowest Countries by Agricultural Land)") %>%
hc_xAxis(title = list(text = "Land Area in Sq.KM")) %>%
hc_yAxis(title = list(text = "Agricultural Land (%)")) %>%
hc_add_series(
data = land_agri %>%
arrange(agricultural_land) %>%
head(20),
type = "bubble",
hcaes(x = land_area, y = agricultural_land, z = agricultural_land, name = name),
name = "Agricultural Land %",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
) %>%
hc_tooltip(
useHTML = TRUE,
formatter = JS(
"function() {",
" return '<b><u>' + this.point.options.name + '</u></b><br>'",
" + 'Land Area: ' + Highcharts.numberFormat(this.point.options.land_area) + ' Sq.KM<br>'",
" + 'Agricultural Land: ' + Highcharts.numberFormat(this.point.options.agricultural_land) + '%';",
"}"
)
)
4.6 Fertility vs Maternal Mortality vs Birth Rate vs Infant Mortality
4.6.1 Scatter Plot
mortality_fertility<-world_data %>%
select(name, population, fertility_rate, birth_rate, maternal_mortality_ratio, infant_mortality)
plot1<-highchart() %>%
hc_title(text = "Birth Rate") %>%
hc_add_series(data = mortality_fertility,
type = "scatter",
hcaes(x = birth_rate, y = population)) %>%
hc_xAxis(title = list(text = "Birth Rate")) %>%
hc_yAxis(title = list(text = "Population")) %>%
hc_add_theme(hc_theme_google()) %>%
hc_tooltip(useHTML = TRUE,
headerFormat = "<b><u>{point.key}</u></b><br>",
pointFormat = "<b>Birth Rate:</b> {point.x}<br><b>Population:</b> {point.y}")
plot2<-highchart() %>%
hc_title(text = "Fertility Rate") %>%
hc_add_series(data = mortality_fertility,
type = "scatter",
hcaes(x = fertility_rate, y = population)) %>%
hc_xAxis(title = list(text = "Fertility Rate")) %>%
hc_yAxis(title = list(text = "Population")) %>%
hc_add_theme(hc_theme_google()) %>%
hc_tooltip(useHTML = TRUE,
headerFormat = "<b><u>{point.key}</u></b><br>",
pointFormat = "<b>Fertility Rate:</b> {point.x}<br><b>Population:</b> {point.y}")
plot3<-highchart() %>%
hc_title(text = "Maternal Mortality Ratio") %>%
hc_add_series(data = mortality_fertility,
type = "scatter",
hcaes(x = maternal_mortality_ratio, y = population)) %>%
hc_xAxis(title = list(text = "Maternal Mortality Ratio")) %>%
hc_yAxis(title = list(text = "Population")) %>%
hc_add_theme(hc_theme_google()) %>%
hc_tooltip(useHTML = TRUE,
headerFormat = "<b><u>{point.key}</u></b><br>",
pointFormat = "<b>Maternal Mortality Ratio:</b> {point.x}<br><b>Population:</b> {point.y}")
plot4<-highchart() %>%
hc_title(text = "Infant Mortality") %>%
hc_add_series(data = mortality_fertility,
type = "scatter",
hcaes(x = infant_mortality, y = population)) %>%
hc_xAxis(title = list(text = "Infant Mortality")) %>%
hc_yAxis(title = list(text = "Population")) %>%
hc_add_theme(hc_theme_google()) %>%
hc_tooltip(useHTML = TRUE,
headerFormat = "<b><u>{point.key}</u></b><br>",
pointFormat = "<b>Infant Mortality:</b> {point.x}<br><b>Population:</b> {point.y}")
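pair1 <- tagList(plot1, plot2)
pair2 <- tagList(plot3, plot4)
# Render each pair of charts as its own flex row
browsable(tagList(
div(style = "display: flex; justify-content: space-between;", pair1),
div(style = "display: flex; justify-content: space-between;", pair2)
))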
4.6.2 Population vs Birth rate vs Infant Mortality
- The x-axis shows Population
- The y-axis shows Birth Rate
- The bubble size shows the Infant Mortality value
highchart() %>%
hc_chart(type = "bubble", zoomType = "xy") %>%
hc_title(text = "Population vs Birth rate vs Infant Mortality") %>%
hc_xAxis(title = list(text = "Population")) %>%
hc_yAxis(title = list(text = "Birth Rate")) %>%
hc_add_series(
data = mortality_fertility %>%
arrange(desc(fertility_rate)) %>%
head(20),
type = "bubble",
hcaes(x = population, y = birth_rate, z = infant_mortality , name = name),
name = "Fertility Rate",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
) %>%
hc_tooltip(
useHTML = TRUE,
formatter = JS(
"function() {",
" return '<b><u>' + this.point.options.name + '</u></b><br>'",
" + 'Population: ' + Highcharts.numberFormat(this.point.options.population) + '<br>'",
" + 'Birth Rate: ' + Highcharts.numberFormat(this.point.options.birth_rate) + '<br>'",
" + 'Infant Mortality: ' + Highcharts.numberFormat(this.point.options.infant_mortality) + '<br>'",
"}"
)
)
4.7 Life Expectancy
4.7.1 Life Expectancy - World Map
life_expectancy<-world_data %>%
select(name, population, life_expectancy)
highchart() %>%
hc_add_series_map(worldgeojson, df=life_expectancy, value="life_expectancy", joinBy = "name") %>%
hc_colorAxis(stops=color_stops()) %>%
hc_title(text="Life Expectancy") %>%
hc_tooltip(useHTML = TRUE,
formatter = JS(
"function(){",
" return '<b><u>'+this.point.name+'</u></b><br>'",
" +'<b>Life Expectancy:</b> '+parseInt(this.point.value)",
"}"
)
) %>%
hc_legend(
enabled = TRUE,
title = list(text = "Life Expectancy"),
layout = "vertical",
align = "right",
verticalAlign = "middle"
)
4.7.2 Top 10 Countries with Highest Life Expectancy
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Highest Life Expectancy") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Life Expectancy")) %>%
hc_add_series(
data = life_expectancy %>%
arrange(desc(life_expectancy)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = life_expectancy, name = name),
name = "Life Expectancy",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.7.3 Top 10 Countries with Lowest Life Expectancy
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Lowest Life Expectancy") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "Life Expectancy")) %>%
hc_add_series(
data = life_expectancy %>%
arrange((life_expectancy)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = life_expectancy, name = name),
name = "Life Expectancy",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.7.4 Life Expectancy by Continent
plot_ly(world_data, x = ~life_expectancy, color = ~continent, type="box",
boxpoints = "all") %>%
layout(title = "Life Expectancy Per Continent + Global",
yaxis = list(title = 'Continent', standoff = 35)
) %>%
add_trace(x = ~world_data$life_expectancy, name = "Global", color = "yellow")
## Warning: Ignoring 8 observations
## Warning: Ignoring 8 observations
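The "Ignoring ... observations" warnings most likely come from rows whose life_expectancy is missing, which the plot has to drop. A minimal base R sketch for checking this before plotting is given here; the `toy` data frame and its values are made up for illustration and are not taken from world_data.

```r
# Toy data (made-up values); NA marks a country with no reported figure.
toy <- data.frame(
  continent = c("Asia", "Asia", "Europe", "Europe", "Africa"),
  life_expectancy = c(75.1, NA, 81.3, 79.8, NA),
  stringsAsFactors = FALSE
)

# Number of rows a plot would have to drop because the value is missing.
n_missing <- sum(is.na(toy$life_expectancy))

# Per-continent medians computed on the non-missing values only.
# A continent with no usable values yields NA.
medians <- tapply(toy$life_expectancy, toy$continent, median, na.rm = TRUE)

print(n_missing)
print(medians)
```

Running the same two checks on world_data would show how many countries are excluded from the box plot and which continents they belong to.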
4.8 Unemployment Rate
4.8.1 Top 20 Countries with the Highest Unemployment Rate
unemployment<-world_data %>%
select(name, unemployment_rate) %>%
arrange(desc(unemployment_rate)) %>%
slice(1:20)
highchart() %>%
hc_chart(type="column") %>%
hc_xAxis(categories=unemployment$name) %>%
hc_add_series(data=unemployment$unemployment_rate, name="Unemployment Rate") %>%
hc_title(text="Top 20 Countries with the Highest Unemployment Rate") %>%
hc_add_theme(hc_theme_google()) %>%
hc_tooltip(useHTML=TRUE,
headerFormat="<b><u>{point.key}</u></b><br>",
pointFormat="<b>Unemployment Rate:</b> {point.y}%")
4.8.2 Top 20 Countries with the Lowest Unemployment Rate
unemployment<-world_data %>%
select(name, unemployment_rate) %>%
arrange((unemployment_rate)) %>%
slice(1:20)
highchart() %>%
hc_chart(type="column") %>%
hc_xAxis(categories=unemployment$name) %>%
hc_add_series(data=unemployment$unemployment_rate, name="Unemployment Rate") %>%
hc_title(text="Top 20 Countries with the Lowest Unemployment Rate") %>%
hc_add_theme(hc_theme_google()) %>%
hc_tooltip(useHTML=TRUE,
headerFormat="<b><u>{point.key}</u></b><br>",
pointFormat="<b>Unemployment Rate:</b> {point.y}%")
4.9 CO2 Emissions
4.9.1 CO2 Emissions - World Map
co2_emissions<-world_data %>%
select(name, `co2-emissions`)
highchart() %>%
hc_add_series_map(worldgeojson, df=co2_emissions, value="co2-emissions", joinBy = "name") %>%
hc_colorAxis(minColor="#4FC978", maxColor="red") %>%
hc_title(text="CO2 Emissions") %>%
hc_tooltip(useHTML = TRUE,
formatter = JS(
"function(){",
" return '<b><u>'+this.point.name+'</u></b><br>'",
" +'<b>CO2 Emissions:</b> '+parseInt(this.point.value)",
"}"
)
) %>%
hc_legend(
enabled = TRUE,
title = list(text = "CO2 Emissions"),
layout = "vertical",
align = "right",
verticalAlign = "middle"
)
4.9.2 Top 10 Countries with Highest CO2 Emissions
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Highest CO2 Emissions") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "CO2 Emissions")) %>%
hc_add_series(
data = co2_emissions %>%
arrange(desc(`co2-emissions`)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = `co2-emissions`, name = name),
name = "CO2 Emissions",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.9.3 Top 10 Countries with Lowest CO2 Emissions
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Lowest CO2 Emissions") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "CO2 Emissions")) %>%
hc_add_series(
data = co2_emissions %>%
arrange((`co2-emissions`)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = `co2-emissions`, name = name),
name = "CO2 Emissions",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.10 GDP
4.10.1 GDP - World Map
gdp<-world_data %>%
select(name, gdp)
highchart() %>%
hc_add_series_map(worldgeojson, df=gdp, value="gdp", joinBy = "name") %>%
hc_colorAxis(stops=color_stops()) %>%
hc_title(text="GDP") %>%
hc_tooltip(useHTML = TRUE,
formatter = JS(
"function(){",
" return '<b><u>'+this.point.name+'</u></b><br>'",
" +'<b>GDP:</b> '+parseInt(this.point.value)",
"}"
)
) %>%
hc_legend(
enabled = TRUE,
title = list(text = "GDP"),
layout = "vertical",
align = "right",
verticalAlign = "middle"
)
4.10.2 Top 10 Countries with Highest GDP
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Highest GDP") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "GDP")) %>%
hc_add_series(
data = gdp %>%
arrange(desc(gdp)) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = gdp, name = name),
name = "GDP",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
4.10.3 Top 10 Countries with Lowest GDP
highchart() %>%
hc_chart(type = "column", options3d=list(enabled=TRUE, alpha=10, beta=10)) %>%
hc_title(text = "Top 10 Countries with Lowest GDP") %>%
hc_xAxis(title = list(text = "Country")) %>%
hc_yAxis(title = list(text = "GDP")) %>%
hc_add_series(
data = gdp %>%
arrange(gdp) %>%
head(10)%>%
mutate(rank = row_number()),
type = "column",
hcaes(x = rank, y = gdp, name = name),
name = "GDP",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
)
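Selecting the highest or the lowest n rows differs only in the sort direction. The base R sketch below makes that explicit; `rank_extremes` is a hypothetical helper and the `toy` values are made up, so neither comes from the report itself.

```r
# Hypothetical helper: return the n rows with the largest or smallest
# value in `column`, plus a 1-based rank column.
rank_extremes <- function(df, column, n = 10, highest = TRUE) {
  # order() places NA last for both sort directions by default.
  idx <- order(df[[column]], decreasing = highest)
  out <- head(df[idx, , drop = FALSE], n)
  out$rank <- seq_len(nrow(out))
  out
}

# Made-up values for illustration only.
toy <- data.frame(
  name = c("A", "B", "C", "D"),
  gdp = c(40, 10, 30, 20),
  stringsAsFactors = FALSE
)

highest_two <- rank_extremes(toy, "gdp", n = 2, highest = TRUE)
lowest_two <- rank_extremes(toy, "gdp", n = 2, highest = FALSE)

print(highest_two$name)
print(lowest_two$name)
```

With dplyr the same choice is between `arrange(desc(gdp))` and `arrange(gdp)` before `head()`.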
4.10.4 Top 50 countries with the highest GDP by continent
GDP_top <-
world_data %>%
select(name, gdp, continent) %>%
arrange(desc(gdp)) %>%
head(50)
GDP_top %>%
plot_ly(x = ~continent, y = ~gdp, color = ~name) %>%
layout(title = "Top 50 countries with the highest GDP by continent <br> with country included",
xaxis = list(categoryorder='total descending'),
barmode = 'stack')
## No trace type specified:
## Based on info supplied, a 'bar' trace seems appropriate.
## Read more about this trace type -> https://plotly.com/r/reference/#bar
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
4.10.5 GDP and Life Expectancy correlation
gdp_le<-world_data %>%
select(name, population, gdp, life_expectancy)
highchart() %>%
hc_chart(type = "bubble", zoomType = "xy") %>%
hc_title(text = "GDP and Life Expectancy correlation") %>%
hc_xAxis(title = list(text = "GDP")) %>%
hc_yAxis(title = list(text = "Life Expectancy")) %>%
hc_add_series(
data = gdp_le %>%
arrange(desc(gdp)) %>%
head(20),
type = "bubble",
hcaes(x = gdp, y = life_expectancy, z = gdp , name = name),
name = "GDP",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
) %>%
hc_tooltip(
useHTML = TRUE,
formatter = JS(
"function() {",
" return '<b><u>' + this.point.options.name + '</u></b><br>'",
" + 'GDP: ' + Highcharts.numberFormat(this.point.options.gdp) + '<br>'",
" + 'Life Expectancy : ' + Highcharts.numberFormat(this.point.options.life_expectancy);",
"}"
)
)
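If a per-capita view is wanted, total GDP can be divided by population before plotting or correlating. The sketch below assumes gdp is a total in a single currency and population is a head count; the `toy` data frame, its values, and the `gdp_per_capita` column name are made up for illustration.

```r
# Made-up values for illustration only.
toy <- data.frame(
  name = c("A", "B", "C"),
  gdp = c(2e12, 5e11, 3e10),
  population = c(5e7, 1e8, 1e6),
  life_expectancy = c(80, 72, 76),
  stringsAsFactors = FALSE
)

# GDP per capita = total GDP divided by total population.
toy$gdp_per_capita <- toy$gdp / toy$population

# Pearson correlation, using only rows where both values are present.
r <- cor(toy$gdp_per_capita, toy$life_expectancy, use = "complete.obs")

print(toy[, c("name", "gdp_per_capita")])
print(r)
```

The derived column could then replace gdp on the x-axis of the bubble chart, with the axis title adjusted to match.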
4.10.6 GDP and Unemployment Rate correlation
gdp_ur<-world_data %>%
select(name, gdp, unemployment_rate)
highchart() %>%
hc_chart(type = "bubble", zoomType = "xy") %>%
hc_title(text = "GDP and Unemployment Rate correlation") %>%
hc_xAxis(title = list(text = "GDP")) %>%
hc_yAxis(title = list(text = "Unemployment Rate")) %>%
hc_add_series(
data = gdp_ur %>%
arrange(desc(gdp)) %>%
head(20),
type = "bubble",
hcaes(x = gdp, y = unemployment_rate, z = gdp , name = name),
name = "GDP",
dataLabels = list(enabled = TRUE, format = '{point.options.name}'),
colorByPoint = TRUE
) %>%
hc_tooltip(
useHTML = TRUE,
formatter = JS(
"function() {",
" return '<b><u>' + this.point.options.name + '</u></b><br>'",
" + 'GDP: ' + Highcharts.numberFormat(this.point.options.gdp) + '<br>'",
" + 'Unemployment Rate : ' + Highcharts.numberFormat(this.point.options.unemployment_rate);",
"}"
)
)
4.10.7 GDP vs Birth Rate
world_data %>%
plot_ly(x = ~gdp, y = ~birth_rate, mode = 'markers',
color = ~continent,
text = ~paste("Country: ", name, "<br>",
"Birth Rate: ", birth_rate, "per 1,000")) %>%
layout(title = 'GDP vs Birth Rate',
xaxis = list(title = 'GDP (US dollars)'),
yaxis = list(title = 'Birth Rate (per 1,000 population)'))
## No trace type specified:
## Based on info supplied, a 'scatter' trace seems appropriate.
## Read more about this trace type -> https://plotly.com/r/reference/#scatter
## Warning: Ignoring 6 observations
4.11 Languages
4.11.1 Languages across the globe
#Language by population
language_by_population <-
world_data %>%
group_by(official_language) %>%
summarise(count = n(),
population = sum(population, na.rm = TRUE)) %>%
arrange(desc(population), desc(count))
language_by_population <- rename(language_by_population, num_countries = count)
f_lang <- as.data.frame(head(language_by_population, 20))
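# Append an "Other" row for everything outside the top 20.
# rbind() with a data frame keeps the numeric columns numeric.
f_lang <- rbind(f_lang, data.frame(
official_language = "Other",
num_countries = nrow(world_data) - sum(f_lang$num_countries),
population = sum(world_data$population, na.rm = TRUE) - sum(f_lang$population)))
# Pie chart: languages by population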
plot_ly(f_lang, labels = ~official_language,
values = ~population, type = "pie",
texttemplate="%{label}<br>(%{percent})",
textposition="inside")
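The "top N plus Other" grouping used for the pie chart can be sketched in base R as follows. The `toy` table and its values are made up; the column names mirror language_by_population only for readability.

```r
# Made-up summary table for illustration only.
toy <- data.frame(
  official_language = c("X", "Y", "Z", "W"),
  num_countries = c(3, 2, 1, 1),
  population = c(900, 500, 100, 50),
  stringsAsFactors = FALSE
)

top_n <- 2

# Sort by population, keep the top rows, and collapse the remainder.
sorted <- toy[order(toy$population, decreasing = TRUE), ]
top <- head(sorted, top_n)
rest <- sorted[-seq_len(top_n), ]

other <- data.frame(
  official_language = "Other",
  num_countries = sum(rest$num_countries),
  population = sum(rest$population, na.rm = TRUE),
  stringsAsFactors = FALSE
)

# Binding data frames (rather than assigning a c() vector that mixes
# text and numbers) keeps the numeric columns numeric.
pie_data <- rbind(top, other)
print(pie_data)
```

Because the totals of the collapsed rows are summed directly, the slices still add up to the overall population.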
4.11.2 Top 10 most spoken languages by population
f_lang <- as.data.frame(head(language_by_population, 10))
df_lang_country <- subset(world_data, official_language %in% f_lang$official_language)
df_lang_country %>%
plot_ly(x = ~official_language, y = ~population, color = ~name) %>%
layout(title = "Top 10 most spoken languages by population, with country included",
yaxis = list(title = 'Population'), barmode = 'stack',
xaxis = list(categoryorder='total descending', title = "Official Language"))
## No trace type specified:
## Based on info supplied, a 'bar' trace seems appropriate.
## Read more about this trace type -> https://plotly.com/r/reference/#bar
## Warning: Ignoring 1 observations
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
4.11.3 Top 10 languages by number of countries
language_by_countries <-
language_by_population %>%
arrange(desc(num_countries ))
f_lang <- as.data.frame(head(language_by_countries, 10))
f_lang %>%
plot_ly(x = ~official_language, y = ~num_countries, color = ~official_language,
type = "bar") %>%
layout(title = "Languages by Number of Countries",
xaxis = list(categoryorder='total descending', title = "Official Language"
),
yaxis = list(title = "Number of Countries"),
showlegend = FALSE
)
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors