R offers many ways to draw plots, with packages to suit different needs. Here we will build plots with the ggplot2 package and a few supporting packages. ggplot2 is an implementation of the Grammar of Graphics concept for R. The Grammar of Graphics asks us to construct a plot from grammatical rules rather than from a named chart type (for example scatterplot, line chart, or bar chart), as is usually done.
Before starting, make sure ggplot2 is installed and loaded with library(). If it is not installed yet, install.packages("ggplot2") will install it.
ggplot2 has a function for making plots quickly, qplot() (short for quick plot), which is useful for fast, compact plotting. It is also easy to use if you are already familiar with plot().
Here we use the diamonds dataset and draw a scatter plot with carat on the x-axis, price on the y-axis, and clarity mapped to colour.
qplot(x = carat, y = price, colour = clarity, data = diamonds)
However, qplot() is only suited to simple plots (and recent ggplot2 releases, 3.4.0 and later, mark it as deprecated). With ggplot() the code is more verbose, but it can handle much more complex plotting problems.
ggplot(data = diamonds,
mapping = aes(x = carat, y = price, colour = clarity)) +
geom_point()
ggplot2 is a very flexible package, and the same plot can be written in several ways. We will try three different approaches that produce the same result and use summary() to see how each piece of code differs.
library(ggplot2)
# Approach 1
diamonds_c1 <-
ggplot(data = diamonds,
mapping = aes(x = carat, y = price, colour = clarity)) +
geom_point()
summary(diamonds_c1)
## data: carat, cut, color, clarity, depth, table, price, x, y, z
## [53940x10]
## mapping: x = ~carat, y = ~price, colour = ~clarity
## faceting: <ggproto object: Class FacetNull, Facet, gg>
## compute_layout: function
## draw_back: function
## draw_front: function
## draw_labels: function
## draw_panels: function
## finish_data: function
## init_scales: function
## map_data: function
## params: list
## setup_data: function
## setup_params: function
## shrink: TRUE
## train_scales: function
## vars: function
## super: <ggproto object: Class FacetNull, Facet, gg>
## -----------------------------------
## geom_point: na.rm = FALSE
## stat_identity: na.rm = FALSE
## position_identity
# Approach 2
diamonds_c2<-
ggplot(data = diamonds) +
geom_point(mapping = aes (x = carat, y = price, colour = clarity))
summary(diamonds_c2)
## data: carat, cut, color, clarity, depth, table, price, x, y, z
## [53940x10]
## faceting: <ggproto object: Class FacetNull, Facet, gg>
## compute_layout: function
## draw_back: function
## draw_front: function
## draw_labels: function
## draw_panels: function
## finish_data: function
## init_scales: function
## map_data: function
## params: list
## setup_data: function
## setup_params: function
## shrink: TRUE
## train_scales: function
## vars: function
## super: <ggproto object: Class FacetNull, Facet, gg>
## -----------------------------------
## mapping: x = ~carat, y = ~price, colour = ~clarity
## geom_point: na.rm = FALSE
## stat_identity: na.rm = FALSE
## position_identity
# Approach 3
diamonds_c3 <-
ggplot() + geom_point (
data = diamonds, mapping = aes(x = carat, y = price, colour = clarity)
)
summary(diamonds_c3)
## data: [x]
## faceting: <ggproto object: Class FacetNull, Facet, gg>
## compute_layout: function
## draw_back: function
## draw_front: function
## draw_labels: function
## draw_panels: function
## finish_data: function
## init_scales: function
## map_data: function
## params: list
## setup_data: function
## setup_params: function
## shrink: TRUE
## train_scales: function
## vars: function
## super: <ggproto object: Class FacetNull, Facet, gg>
## -----------------------------------
## mapping: x = ~carat, y = ~price, colour = ~clarity
## geom_point: na.rm = FALSE
## stat_identity: na.rm = FALSE
## position_identity
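The three summaries differ only in where the data and the mapping are stored: on the plot object or on the layer. That matters once a second layer is added, because layers inherit whatever is defined in ggplot(). The sketch below is an illustration rather than part of the original walkthrough; the object names p_global, p_layer, and p_global_smooth are made up for this example.

```r
library(ggplot2)

# Plot-level mapping (as in Approach 1): every layer inherits x, y and colour.
p_global <- ggplot(diamonds, aes(x = carat, y = price, colour = clarity)) +
  geom_point()

# Layer-level mapping (as in Approach 2): only geom_point() knows the aesthetics.
p_layer <- ggplot(diamonds) +
  geom_point(aes(x = carat, y = price, colour = clarity))

names(p_global$mapping)             # aesthetics stored on the plot
names(p_layer$mapping)              # nothing stored on the plot
names(p_layer$layers[[1]]$mapping)  # aesthetics stored on the layer

# A second layer can reuse the plot-level mapping without repeating aes();
# with p_layer it would need its own aes().
p_global_smooth <- p_global + geom_smooth()
```

Plot-level mapping tends to be convenient when all layers share the same aesthetics, while layer-level mapping keeps each layer independent.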
In addition to the three basic components touched on above (Data, Mapping, and Geometries), the Grammar of Graphics has five other main components that play an important role in building a plot: Statistics, Scales, Facets, Coordinates, and Theme.
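To make the components concrete, here is a minimal sketch that touches each of them once, using the diamonds data from earlier. The particular choices (a linear smoother, faceting by cut, the axis limits, theme_minimal()) are assumptions picked for illustration only.

```r
library(ggplot2)

p <- ggplot(data = diamonds,                           # Data
            mapping = aes(x = carat, y = price)) +     # Mapping
  geom_point(alpha = 0.2) +                            # Geometries
  stat_smooth(method = "lm") +                         # Statistics
  scale_y_continuous(labels = scales::label_comma()) + # Scales
  facet_wrap(~cut) +                                   # Facets
  coord_cartesian(xlim = c(0, 3)) +                    # Coordinates
  theme_minimal()                                      # Theme

p
```

Each component is added with `+`, so any one of them can be swapped or removed without touching the others.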
But before diving deeper into data visualization with ggplot2, there is a basic concept we need to understand: data transformation, using packages such as dplyr. A solid grasp of data transformation makes visualization much easier for a data analyst.
Data transformation is usually a sequence of more than one step. For that reason, dplyr code often uses the pipe operator (%>%) to connect one function to the next. dplyr is only one of many packages available for data transformation, and which one to use depends on the task. This transformation step is essential for building fairly complex visualizations.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
glimpse(storms)
## Rows: 10,010
## Columns: 13
## $ name <chr> "Amy", "Amy", "Amy", "Amy", "Amy", "Amy", "Amy", "Amy",...
## $ year <dbl> 1975, 1975, 1975, 1975, 1975, 1975, 1975, 1975, 1975, 1...
## $ month <dbl> 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7...
## $ day <int> 27, 27, 27, 27, 28, 28, 28, 28, 29, 29, 29, 29, 30, 30,...
## $ hour <dbl> 0, 6, 12, 18, 0, 6, 12, 18, 0, 6, 12, 18, 0, 6, 12, 18,...
## $ lat <dbl> 27.5, 28.5, 29.5, 30.5, 31.5, 32.4, 33.3, 34.0, 34.4, 3...
## $ long <dbl> -79.0, -79.0, -79.0, -79.0, -78.8, -78.7, -78.0, -77.0,...
## $ status <chr> "tropical depression", "tropical depression", "tropical...
## $ category <ord> -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ wind <int> 25, 25, 25, 25, 25, 25, 25, 30, 35, 40, 45, 50, 50, 55,...
## $ pressure <int> 1013, 1013, 1013, 1013, 1012, 1012, 1011, 1006, 1004, 1...
## $ ts_diameter <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
## $ hu_diameter <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,...
# Without %>%
storms1 <- select(storms, year, month, wind, pressure)
storms2 <- filter(storms1, between(year, 2000, 2015))
storms3 <- mutate(storms2, month = factor(month.name[storms2$month], levels = month.name))
storms4 <- group_by(storms3, month)
storms_nopipe <- summarise(storms4, avg_wind = mean(wind), avg_pressure = mean(pressure))
glimpse(storms_nopipe)
## Rows: 10
## Columns: 3
## $ month <fct> January, April, May, June, July, August, September, Oc...
## $ avg_wind <dbl> 45.65217, 44.61538, 36.76471, 39.03030, 48.21981, 51.9...
## $ avg_pressure <dbl> 999.4348, 996.9231, 1003.4510, 999.5333, 999.1300, 994...
# With %>%
storms_pipe <-
storms %>%
select(year, month, wind, pressure) %>%
filter(between(year, 2000, 2015)) %>%
mutate(month = factor(month.name[month], levels = month.name)) %>%
group_by(month) %>%
summarise(
avg_wind = mean(wind),
avg_pressure = mean(pressure)
)
glimpse(storms_pipe)
## Rows: 10
## Columns: 3
## $ month <fct> January, April, May, June, July, August, September, Oc...
## $ avg_wind <dbl> 45.65217, 44.61538, 36.76471, 39.03030, 48.21981, 51.9...
## $ avg_pressure <dbl> 999.4348, 996.9231, 1003.4510, 999.5333, 999.1300, 994...
# Compare the results without and with the pipe
identical(storms_nopipe, storms_pipe)
## [1] TRUE
The following is a simplified version of the code we ran earlier, without storing the result in an intermediate object:
storms %>%
select(year, month, wind, pressure) %>%
filter(between(year, 2000, 2015)) %>%
mutate(month = factor(month.name[month], levels = month.name)) %>%
group_by(month) %>%
summarise(
avg_wind = mean(wind),
avg_pressure = mean(pressure)
)
## # A tibble: 10 x 3
## month avg_wind avg_pressure
## <fct> <dbl> <dbl>
## 1 January 45.7 999.
## 2 April 44.6 997.
## 3 May 36.8 1003.
## 4 June 39.0 999.
## 5 July 48.2 999.
## 6 August 52.0 994.
## 7 September 58.3 988.
## 8 October 55.7 990.
## 9 November 56.5 990.
## 10 December 46.8 997.
This time we will use the INDODAPOER data provided by DQLab. The readr package must be installed first, because we need its read_tsv() function to read the tsv.gz file.
library(readr)
indodapoer <- read_tsv("https://dqlab-dataset.s3-ap-southeast-1.amazonaws.com/indodapoer.tsv.gz")
## Parsed with column specification:
## cols(
## .default = col_double(),
## area_name = col_character(),
## `Import: Commodities and transaction not elsewhere classified (province Level, in USD)` = col_logical(),
## `Length of National Road: Dirt (in km) (BPS Data, Province only)` = col_logical(),
## `Length of National Road: Other (in km) (BPS Data, Province only)` = col_logical(),
## `Total Natural Resources Revenue Sharing from Geothermal Energy (in IDR, realization value)` = col_logical(),
## `Total Revenue Sharing` = col_logical(),
## `Total Specific Allocation Grant for Village (in IDR Billion)` = col_logical()
## )
## See spec(...) for full column specifications.
## Warning: 232 parsing failures.
## row col expected actual file
## 1008 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 554693 'https://dqlab-dataset.s3-ap-southeast-1.amazonaws.com/indodapoer.tsv.gz'
## 1009 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 1291450 'https://dqlab-dataset.s3-ap-southeast-1.amazonaws.com/indodapoer.tsv.gz'
## 1010 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 365356 'https://dqlab-dataset.s3-ap-southeast-1.amazonaws.com/indodapoer.tsv.gz'
## 1011 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 216478 'https://dqlab-dataset.s3-ap-southeast-1.amazonaws.com/indodapoer.tsv.gz'
## 1012 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 646310 'https://dqlab-dataset.s3-ap-southeast-1.amazonaws.com/indodapoer.tsv.gz'
## .... ..................................................................................... .................. ....... .........................................................................
## See problems(...) for more details.
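The parsing failures in the log come from type guessing: the affected columns were guessed as logical (see the col_logical() entries), and numeric values further down the file then failed to parse. One way to avoid this is to declare the column types explicitly; another is to raise the guess_max argument of read_tsv(). The sketch below shows the first option on a tiny, made-up file (the file contents and the names tmp and demo are hypothetical), so it runs without downloading the real data.

```r
library(readr)

# Hypothetical miniature of the real file: the numeric column starts with
# missing values, which is the situation that misleads type guessing.
tmp <- tempfile(fileext = ".tsv")
writeLines(
  c("area_name\tyear\tvalue",
    "A, Prop.\t2000\tNA",
    "A, Prop.\t2001\tNA",
    "B, Prop.\t2000\t554693"),
  tmp
)

# Declare the types up front: everything is numeric except area_name.
demo <- read_tsv(
  tmp,
  col_types = cols(
    .default = col_double(),
    area_name = col_character()
  )
)

demo
problems(demo)  # lists any rows that still failed to parse
```

The same col_types specification could be passed when reading the real URL, assuming every column other than area_name is meant to be numeric.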
nrow(indodapoer)
## [1] 22468
ncol(indodapoer)
## [1] 222
The data has 22,468 rows and 222 columns. Many of the column names do not follow the rules for “syntactically valid names”, but the janitor package makes cleaning them up easy.
#install.packages("janitor", repos = "http://cran.us.r-project.org")
library(janitor)
## Warning: package 'janitor' was built under R version 3.6.3
##
## Attaching package: 'janitor'
## The following objects are masked from 'package:stats':
##
## chisq.test, fisher.test
head(colnames(indodapoer), 15)
## [1] "area_name"
## [2] "year"
## [3] "Agriculture function expenditure (in IDR)"
## [4] "Average National Exam Score: Junior Secondary Level (out of 100, available only in district level for 2009)"
## [5] "Average National Exam Score: Primary Level (out of 100, available only in district level for 2009)"
## [6] "Average National Exam Score: Senior Secondary Level (out of 100, available only in district level for 2009)"
## [7] "Birth attended by Skilled Health worker (in % of total birth)"
## [8] "BPK Audit Report on Sub-National Budget"
## [9] "Capital expenditure (in IDR)"
## [10] "Consumer Price Index in 42 cities base 1996"
## [11] "Consumer Price Index in 45 cities base 2002"
## [12] "Consumer Price Index in 66 cities base 2007"
## [13] "Economy function expenditure (in IDR)"
## [14] "Education function expenditure (in IDR)"
## [15] "Environment function expenditure (in IDR)"
indodapoer <- clean_names(indodapoer)
head(colnames(indodapoer), 15)
## [1] "area_name"
## [2] "year"
## [3] "agriculture_function_expenditure_in_idr"
## [4] "average_national_exam_score_junior_secondary_level_out_of_100_available_only_in_district_level_for_2009"
## [5] "average_national_exam_score_primary_level_out_of_100_available_only_in_district_level_for_2009"
## [6] "average_national_exam_score_senior_secondary_level_out_of_100_available_only_in_district_level_for_2009"
## [7] "birth_attended_by_skilled_health_worker_in_percent_of_total_birth"
## [8] "bpk_audit_report_on_sub_national_budget"
## [9] "capital_expenditure_in_idr"
## [10] "consumer_price_index_in_42_cities_base_1996"
## [11] "consumer_price_index_in_45_cities_base_2002"
## [12] "consumer_price_index_in_66_cities_base_2007"
## [13] "economy_function_expenditure_in_idr"
## [14] "education_function_expenditure_in_idr"
## [15] "environment_function_expenditure_in_idr"
With that, the “syntactically valid names” problem is solved.
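As a small check of what “syntactically valid” means, the sketch below compares two of the raw column names with their cleaned versions. It uses make_clean_names(), the vector-level counterpart of clean_names() in janitor, and the base R function make.names(); the variable names raw_names and clean are made up for this example.

```r
library(janitor)

raw_names <- c(
  "Capital expenditure (in IDR)",
  "Consumer Price Index in 42 cities base 1996"
)

# A name is syntactically valid when make.names() leaves it unchanged.
raw_names == make.names(raw_names)

# Clean the names and run the same check again.
clean <- make_clean_names(raw_names)
clean
clean == make.names(clean)
```

Valid names can be used directly in code such as indodapoer$capital_expenditure_in_idr, without wrapping them in backticks.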
Next we want to look at the development of the Non-Oil-and-Gas Gross Regional Domestic Product (PDRB Non-Migas) of the provinces on the island of Java. This information is stored in the column total_gdp_excluding_oil_and_gas_in_idr_million_constant_price. Before building the visualization, extract the data into pdrb_pjawa.
library(stringr)
## Warning: package 'stringr' was built under R version 3.6.2
library(dplyr)
pdrb_pjawa <-
indodapoer %>%
filter(
area_name %in% c(
"Banten, Prop.",
"DKI Jakarta, Prop.",
"Jawa Barat, Prop.",
"Jawa Tengah, Prop.",
"DI Yogyakarta, Prop.",
"Jawa Timur, Prop."
)
) %>%
transmute(
provinsi = str_remove(area_name, ", Prop."),
tahun = year,
pdrb_nonmigas = total_gdp_excluding_oil_and_gas_in_idr_million_constant_price
) %>%
filter(!is.na(pdrb_nonmigas))
glimpse(pdrb_pjawa)
## Rows: 164
## Columns: 3
## $ provinsi <chr> "Banten", "Banten", "Banten", "Banten", "Banten", "Ba...
## $ tahun <dbl> 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008,...
## $ pdrb_nonmigas <dbl> 45690559, 47495383, 49449321, 51957458, 54880407, 581...
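One detail about the str_remove() call above: its pattern is interpreted as a regular expression, so the final "." in ", Prop." matches any character rather than a literal full stop. That is harmless for this data, but wrapping the pattern in fixed() states the intent explicitly. A small sketch, with the vector area and the string "Banten, Propx" made up for illustration:

```r
library(stringr)

area <- c("Banten, Prop.", "Jawa Barat, Prop.")

# Literal match: remove exactly the text ", Prop."
str_remove(area, fixed(", Prop."))

# As a regular expression, the trailing "." also matches other characters.
str_remove("Banten, Propx", ", Prop.")

# With fixed(), the same input is left untouched.
str_remove("Banten, Propx", fixed(", Prop."))
```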
Using pdrb_pjawa, we draw the trend of PDRB Non-Migas with the following code:
library(dplyr)
library(ggplot2)
library(forcats)
pdrb_pjawa %>%
mutate(
provinsi = fct_reorder2(provinsi, tahun, pdrb_nonmigas)
) %>%
ggplot(aes(tahun, pdrb_nonmigas, colour = provinsi)) +
geom_line()
By default the legend lists the provinces in alphabetical order, which does not reflect the order of the lines in the plot; fct_reorder2() in the code above reorders the legend to follow the lines at the latest year. Even so, with many more provinces it would still be hard to match the names in the legend to the lines in the plot.
One solution to this problem is direct labeling. It is the recommended approach because one principle of chart design is “wherever possible, design charts that do not need a legend”. We can use geom_dl() from the directlabels package to add direct labels in ggplot2. The aesthetic mapping that geom_dl() requires is label.
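To see how fct_reorder2() affects the legend order, here is a minimal sketch on a made-up data frame (toy and its values are hypothetical). By default the function orders the levels by the y value at the largest x, in descending order.

```r
library(forcats)

toy <- data.frame(
  provinsi = c("A", "A", "B", "B", "C", "C"),
  tahun    = c(2000, 2001, 2000, 2001, 2000, 2001),
  nilai    = c(1, 2, 5, 9, 3, 4)
)

# Default factor order: alphabetical.
levels(factor(toy$provinsi))

# Ordered by nilai in the last tahun (B = 9, C = 4, A = 2).
reordered <- fct_reorder2(toy$provinsi, toy$tahun, toy$nilai)
levels(reordered)
```

Because ggplot2 draws legend entries in factor-level order, the reordered factor puts the legend in the same top-to-bottom order as the line endings.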
library(ggplot2)
library(dplyr)
library(directlabels)
## Warning: package 'directlabels' was built under R version 3.6.3
pdrb_pjawa %>%
ggplot(aes(tahun, pdrb_nonmigas)) +
geom_line(aes(colour = provinsi), show.legend = FALSE) +
geom_dl(
aes(label = provinsi),
method = "last.points",
position = position_nudge(x = 0.3) # so the labels do not overlap the lines
)
Next we finalize the plot with the following code:
library(ggplot2)
library(dplyr)
library(directlabels)
library(hrbrthemes)
## Warning: package 'hrbrthemes' was built under R version 3.6.3
## NOTE: Either Arial Narrow or Roboto Condensed fonts are required to use these themes.
## Please use hrbrthemes::import_roboto_condensed() to install Roboto Condensed and
## if Arial Narrow is not on your system, please see https://bit.ly/arialnarrow
pdrb_pjawa %>%
ggplot(aes(tahun, pdrb_nonmigas/1e6)) +
geom_line(aes(colour = provinsi), show.legend = FALSE) +
geom_dl(
aes(label = provinsi),
method = "last.points",
position = position_nudge(x = 0.3) # so the labels do not overlap the lines
) +
labs(
x = NULL,
y = NULL,
title = "PDRB Non-Migas di Pulau Jawa Hingga Tahun 2011",
subtitle = "PDRB atas dasar harga konstan, dalam satuan triliun",
caption = "Data: INDO-DAPOER, The World Bank"
) +
coord_cartesian(clip = "off") +
theme_ipsum(grid = "Y", ticks = TRUE)
## Warning in grid.Call(C_stringMetric, as.graphicsAnnot(x$label)): font family not
## found in Windows font database
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database
If you see the message “font family not found in Windows font database”, there is no need to worry: it means the font is not available on your machine, so R falls back to its default font.
We will do a lot of data transformation before finally producing an attractive visualization. The task is to examine the condition of road infrastructure across all regencies (kabupaten) and cities (kota) in Indonesia.
library(dplyr)
library(stringr)
jalan_kabkota <-
indodapoer %>%
filter(str_detect(area_name, ", Prop.", negate = TRUE)) %>%
filter(year == 2008) %>%
transmute(
kabkota = area_name,
jalan_rusak_parah = length_of_district_road_bad_damage_in_km_bina_marga_data,
jalan_rusak_ringan = length_of_district_road_light_damage_in_km_bina_marga_data,
jalan_cukup_baik = length_of_district_road_fair_in_km_bina_marga_data,
jalan_sangat_baik = length_of_district_road_good_in_km_bina_marga_data
)
glimpse(jalan_kabkota)
## Rows: 514
## Columns: 5
## $ kabkota <chr> "Aceh Barat, Kab.", "Aceh Barat Daya, Kab.", "Ac...
## $ jalan_rusak_parah <dbl> 64, 1, 97, 112, 21, NA, 130, 8, 168, 76, 25, NA,...
## $ jalan_rusak_ringan <dbl> 191, 15, 101, 321, 36, 553, 183, 207, 174, 35, 7...
## $ jalan_cukup_baik <dbl> 218, 81, 270, 416, 59, 170, 146, 284, 201, 74, 3...
## $ jalan_sangat_baik <dbl> 153, 87, 105, 284, 89, 25, 177, 432, 177, 221, 3...
Next, pivot jalan_kabkota so that it becomes a data frame with three columns: kabkota, kondisi, and panjang_jalan.
library(tidyr)
library(dplyr)
glimpse(jalan_kabkota)
## Rows: 514
## Columns: 5
## $ kabkota <chr> "Aceh Barat, Kab.", "Aceh Barat Daya, Kab.", "Ac...
## $ jalan_rusak_parah <dbl> 64, 1, 97, 112, 21, NA, 130, 8, 168, 76, 25, NA,...
## $ jalan_rusak_ringan <dbl> 191, 15, 101, 321, 36, 553, 183, 207, 174, 35, 7...
## $ jalan_cukup_baik <dbl> 218, 81, 270, 416, 59, 170, 146, 284, 201, 74, 3...
## $ jalan_sangat_baik <dbl> 153, 87, 105, 284, 89, 25, 177, 432, 177, 221, 3...
jalan_kabkota <-
jalan_kabkota %>%
pivot_longer(
cols = starts_with("jalan_"),
names_to = "kondisi",
names_prefix = "jalan_",
values_to = "panjang_jalan"
)
glimpse(jalan_kabkota)
## Rows: 2,056
## Columns: 3
## $ kabkota <chr> "Aceh Barat, Kab.", "Aceh Barat, Kab.", "Aceh Barat, ...
## $ kondisi <chr> "rusak_parah", "rusak_ringan", "cukup_baik", "sangat_...
## $ panjang_jalan <dbl> 64, 191, 218, 153, 1, 15, 81, 87, 97, 101, 270, 105, ...
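The effect of each pivot_longer() argument is easier to see on a tiny table. The sketch below uses a made-up two-row data frame (wide, with invented area names) and the same arguments as above.

```r
library(tidyr)

wide <- data.frame(
  kabkota           = c("X, Kab.", "Y, Kota"),
  jalan_rusak_parah = c(64, 1),
  jalan_sangat_baik = c(153, 87)
)

long <- pivot_longer(
  wide,
  cols         = starts_with("jalan_"), # columns to stack
  names_to     = "kondisi",             # old column names go here
  names_prefix = "jalan_",              # strip this prefix from the names
  values_to    = "panjang_jalan"        # cell values go here
)

long
```

Two rows with two road columns become four rows, one per area and condition, which is the shape ggplot2 needs to map kondisi to an aesthetic.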
The next step is to determine which areas are regencies (kabupaten) and which are cities (kota).
library(dplyr)
library(stringr)
jalan_kabkota <-
jalan_kabkota %>%
mutate(
status = case_when(
str_detect(kabkota, ", Kab") ~ "Kabupaten",
str_detect(kabkota, ", Kota") ~ "Kota",
str_detect(kabkota, "City") ~ "Kota",
TRUE ~ NA_character_
),
kondisi = factor(
kondisi,
levels = c("rusak_parah", "rusak_ringan", "cukup_baik", "sangat_baik"),
labels = c("Rusak parah", "Rusak ringan", "Cukup baik", "Sangat baik")
)
)
glimpse(jalan_kabkota)
## Rows: 2,056
## Columns: 4
## $ kabkota <chr> "Aceh Barat, Kab.", "Aceh Barat, Kab.", "Aceh Barat, ...
## $ kondisi <fct> Rusak parah, Rusak ringan, Cukup baik, Sangat baik, R...
## $ panjang_jalan <dbl> 64, 191, 218, 153, 1, 15, 81, 87, 97, 101, 270, 105, ...
## $ status <chr> "Kabupaten", "Kabupaten", "Kabupaten", "Kabupaten", "...
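case_when() checks its conditions from top to bottom and uses the first one that is TRUE, with TRUE ~ ... acting as the fallback. A minimal sketch on a made-up vector (the names in contoh are hypothetical examples, not rows taken from the data):

```r
library(dplyr)
library(stringr)

contoh <- c("Aceh Barat, Kab.", "Banda Aceh, Kota", "Some Place, City", "Indonesia")

status <- case_when(
  str_detect(contoh, ", Kab")  ~ "Kabupaten",
  str_detect(contoh, ", Kota") ~ "Kota",
  str_detect(contoh, "City")   ~ "Kota",
  TRUE                         ~ NA_character_  # anything else is left missing
)

status
```

Names that match none of the patterns end up as NA, which makes unclassified areas easy to spot with is.na(status).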
Now we can draw a plot showing road length in regencies and cities by road condition.
#install.packages("ggridges",repos = "http://cran.us.r-project.org")
library(ggplot2)
library(dplyr)
library(ggridges)
## Warning: package 'ggridges' was built under R version 3.6.3
jalan_kabkota_plot <-
jalan_kabkota %>%
ggplot(aes(panjang_jalan, kondisi)) +
facet_wrap(~status) +
geom_density_ridges_gradient(
aes(fill = after_stat(x)),
show.legend = FALSE
)
jalan_kabkota_plot
## Picking joint bandwidth of 41
## Picking joint bandwidth of 28.2
### Logarithmic Transformation

We can now easily compare the distribution of regency/city road lengths by condition. However, the plot still needs some improvement. Applying a log transformation to the x-axis gives the following:
#install.packages("ggridges",repos = "http://cran.us.r-project.org")
library(ggplot2)
library(dplyr)
library(ggridges)
jalan_kabkota_plot <-
jalan_kabkota %>%
ggplot(aes(panjang_jalan, kondisi)) +
facet_wrap(~status) +
geom_density_ridges_gradient(
aes(fill = after_stat(x)),
show.legend = FALSE
)
jalan_kabkota_plot +
geom_vline(xintercept = 100, linetype = "dashed", colour = "darkslategray4") +
scale_x_continuous(trans = "log10")
## Picking joint bandwidth of 0.128
## Picking joint bandwidth of 0.17
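As a reminder of what the log10 scale does, equal multiplicative steps become equal distances on the axis. The short sketch below is plain arithmetic on made-up lengths (panjang is a hypothetical vector), not a computation on the road data.

```r
# Road lengths that differ by a factor of ten each time.
panjang <- c(1, 10, 100, 1000)

# On a log10 scale they are evenly spaced.
log10(panjang)

# The dashed reference line at 100 km therefore sits at position 2.
log10(100)
```

This spreads out the many short roads and compresses the few very long ones, which is why the ridgelines become easier to compare. Keep in mind that log10 is only defined for positive values, so zero-length entries cannot be shown on this scale.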
### Finalization
#install.packages("ggridges",repos = "http://cran.us.r-project.org")
library(ggplot2)
library(dplyr)
library(ggridges)
library(hrbrthemes)
jalan_kabkota_plot <-
jalan_kabkota %>%
ggplot(aes(panjang_jalan, kondisi)) +
facet_wrap(~status) +
geom_density_ridges_gradient(
aes(fill = after_stat(x)),
show.legend = FALSE
)
jalan_kabkota_plot +
geom_vline(xintercept = 100, linetype = "dashed", colour = "darkslategray4") +
scale_x_continuous(trans = "log10") +
scale_fill_viridis_c(option = "magma") +
labs(
x = "Panjang jalan (Km)",
y = NULL,
title = "Jalan Kabupaten/Kota Berdasarkan Kondisi",
subtitle = "Berdasarkan data tahun 2008, garis vertikal menunjukan panjang jalan 100 Km",
caption = "Data: INDO-DAPOER, The World Bank"
) +
theme_ipsum(grid = FALSE, ticks = TRUE)
## Picking joint bandwidth of 0.128
## Picking joint bandwidth of 0.17
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database
## Warning in grid.Call.graphics(C_text, as.graphicsAnnot(x$label), x$x, x$y, :
## font family not found in Windows font database
Again, there is no need to worry about “font family not found in Windows font database”: it only shows that the font used by the theme is not installed on the computer.
Now we will make a distinctive chart called a waffle chart.
library(dplyr)
library(ggplot2)
library(tidyr)
library(stringr)
library(forcats)
faskes_kalimantan <-
indodapoer %>%
filter(str_detect(area_name, "Kalimantan")) %>%
filter(year == 2011) %>%
transmute(
provinsi = str_remove(area_name, ", Prop."),
rumahsakit = number_of_hospitals,
polindes = number_of_polindes_poliklinik_desa_village_polyclinic,
puskesmas = number_of_puskesmas_and_its_line_services
) %>%
pivot_longer(
cols = -provinsi,
names_to = "faskes",
values_to = "jumlah"
) %>%
filter(!is.na(jumlah)) %>%
mutate(
provinsi = fct_reorder(provinsi, jumlah, sum),
jumlah = ceiling(jumlah / 10)
)
glimpse(faskes_kalimantan)
## Rows: 12
## Columns: 3
## $ provinsi <fct> Kalimantan Barat, Kalimantan Barat, Kalimantan Barat, Kali...
## $ faskes <chr> "rumahsakit", "polindes", "puskesmas", "rumahsakit", "poli...
## $ jumlah <dbl> 4, 53, 98, 5, 11, 96, 3, 41, 75, 2, 22, 109
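The last mutate() step scales the counts so that one waffle tile stands for up to ten facilities, rounding up so that small counts still get a tile. A short sketch of that arithmetic on made-up counts (jumlah_asli is a hypothetical vector):

```r
# Hypothetical raw facility counts.
jumlah_asli <- c(3, 31, 40, 525)

# Divide by ten and round up: 0.3 -> 1, 3.1 -> 4, 4.0 -> 4, 52.5 -> 53.
petak <- ceiling(jumlah_asli / 10)
petak
```

Because of the rounding, a tile represents roughly ten facilities rather than exactly ten.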
Waffle charts can be made in ggplot2 with help from the waffle package. Its main function, geom_waffle(), has two required aesthetic mappings: fill and values.
#install.packages("waffle", repos = "https://cinc.rud.is")
library(waffle)
library(ggplot2)
library(dplyr)
faskes_kalimantan_plot <-
faskes_kalimantan %>%
ggplot(aes(fill = faskes, values = jumlah)) +
facet_wrap(~provinsi) +
geom_waffle(colour = "white")
faskes_kalimantan_plot
### Setting Colours and Labels

We can set the fill colours with scale_fill_manual() and, in the same function, modify the label text directly through the labels argument.
#install.packages("waffle", repos = "https://cinc.rud.is")
library(waffle)
library(ggplot2)
library(dplyr)
faskes_kalimantan_plot <-
faskes_kalimantan %>%
ggplot(aes(fill = faskes, values = jumlah)) +
facet_wrap(~provinsi) +
geom_waffle(colour = "white")
faskes_kalimantan_plot <-
faskes_kalimantan_plot +
scale_fill_manual(
values = c(
"polindes" = "seagreen3",
"puskesmas" = "steelblue",
"rumahsakit" = "cyan4"
),
labels = c(
"polindes" = "Poliklinik Desa",
"puskesmas" = "Puskesmas",
"rumahsakit" = "Rumah Sakit"
)
) +
labs(
fill = NULL,
title = "Fasilitas Kesehatan di Kalimantan",
subtitle = "Berdasarkan data tahun 2011, satu petak menyatakan ±10 faskes",
caption = "Data: INDO-DAPOER, The World Bank"
)
faskes_kalimantan_plot
### Finalizing the Waffle Chart
#install.packages("waffle", repos = "https://cinc.rud.is")
library(waffle)
library(ggplot2)
library(dplyr)
faskes_kalimantan_plot <-
faskes_kalimantan %>%
ggplot(aes(fill = faskes, values = jumlah)) +
facet_wrap(~provinsi) +
geom_waffle(colour = "white")
faskes_kalimantan_plot <-
faskes_kalimantan_plot +
scale_fill_manual(
values = c(
"polindes" = "seagreen3",
"puskesmas" = "steelblue",
"rumahsakit" = "cyan4"
),
labels = c(
"polindes" = "Poliklinik Desa",
"puskesmas" = "Puskesmas",
"rumahsakit" = "Rumah Sakit"
)
) +
labs(
fill = NULL,
title = "Fasilitas Kesehatan di Kalimantan",
subtitle = "Berdasarkan data tahun 2011, satu petak menyatakan ±10 faskes",
caption = "Data: INDO-DAPOER, The World Bank"
)
faskes_kalimantan_plot +
coord_equal() +
theme(
text = element_text(family = "Arial Narrow"),
plot.title.position = "plot",
plot.title = element_text(face = "bold", size = 18, hjust = 0.5),
plot.subtitle = element_text(face = "plain", size = 12, hjust = 0.5),
plot.caption = element_text(face = "italic", size = 9),
legend.position = "bottom",
panel.background = element_blank(),
panel.grid = element_blank(),
strip.background = element_blank(),
strip.text = element_text(face = "italic", size = 9, hjust = 0),
axis.text.x = element_blank(),
axis.text.y = element_blank(),
axis.ticks = element_blank()
)
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, : font
## family not found in Windows font database
## Warning in grid.Call(C_stringMetric, as.graphicsAnnot(x$label)): font family not
## found in Windows font database
The final result is the plot above.
That is all for this sharing session using data from DQLab. I hope it is useful.