One of the packages used for data visualization in R is “ggplot2”. The package implements the Grammar of Graphics, whose principle is to build a plot from grammar-like rules, so that charts such as scatterplots, line charts, and bar charts are all assembled from the same set of components.

The following steps walk through visualization with the “ggplot2” package in R.

Prepare the “ggplot2” package: install it first if you have never used it before.

Then load ggplot2 with the library function.

#install.packages("ggplot2")
library("ggplot2")

Writing ggplot code

This example uses the “mtcars” dataset, which has the following variables:

  • mpg : Miles/(US) gallon
  • cyl : Number of cylinders
  • disp : Displacement (cu.in.)
  • hp : Gross horsepower
  • drat : Rear axle ratio
  • wt : Weight (1000 lbs)
  • qsec : 1/4 mile time
  • vs : Engine (0 = V-shaped, 1 = straight)
  • am : Transmission (0 = automatic, 1 = manual)
  • gear : Number of forward gears
  • carb : Number of carburetors

ggplot code can be written in any of the following three ways.

First Way

cara1 <- ggplot(data = mtcars, mapping = aes(x=mpg ,  y = cyl, color = drat)) +
  geom_point()
summary(cara1)
data: mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
  [32x11]
mapping:  x = ~mpg, y = ~cyl, colour = ~drat
faceting: <ggproto object: Class FacetNull, Facet, gg>
    compute_layout: function
    draw_back: function
    draw_front: function
    draw_labels: function
    draw_panels: function
    finish_data: function
    init_scales: function
    map_data: function
    params: list
    setup_data: function
    setup_params: function
    shrink: TRUE
    train_scales: function
    vars: function
    super:  <ggproto object: Class FacetNull, Facet, gg>
-----------------------------------
geom_point: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity 

Second Way

cara2 <- ggplot(data = mtcars) + geom_point(mapping = aes(x=mpg ,  y = cyl, color = drat)) 
summary(cara2)
data: mpg, cyl, disp, hp, drat, wt, qsec, vs, am, gear, carb
  [32x11]
faceting: <ggproto object: Class FacetNull, Facet, gg>
    compute_layout: function
    draw_back: function
    draw_front: function
    draw_labels: function
    draw_panels: function
    finish_data: function
    init_scales: function
    map_data: function
    params: list
    setup_data: function
    setup_params: function
    shrink: TRUE
    train_scales: function
    vars: function
    super:  <ggproto object: Class FacetNull, Facet, gg>
-----------------------------------
mapping: x = ~mpg, y = ~cyl, colour = ~drat 
geom_point: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity 

Third Way

cara3 <- ggplot() + 
  geom_point(
    data = mtcars, 
    mapping = aes(x=mpg ,  y = cyl, color = drat)
    ) 
summary(cara3)
data: [x]
faceting: <ggproto object: Class FacetNull, Facet, gg>
    compute_layout: function
    draw_back: function
    draw_front: function
    draw_labels: function
    draw_panels: function
    finish_data: function
    init_scales: function
    map_data: function
    params: list
    setup_data: function
    setup_params: function
    shrink: TRUE
    train_scales: function
    vars: function
    super:  <ggproto object: Class FacetNull, Facet, gg>
-----------------------------------
mapping: x = ~mpg, y = ~cyl, colour = ~drat 
geom_point: na.rm = FALSE
stat_identity: na.rm = FALSE
position_identity 

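All three approaches produce the same plot. However, the summary() output shows how they differ: in the first, both data and mapping are stored at the plot level; in the second, the data is at the plot level while the mapping belongs to the geom_point layer; in the third, both are supplied to the layer, so the plot-level data is empty (data: [x]).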
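
One way to confirm that the three objects draw the same points is to compare the data each layer computes. This is a minimal sketch that assumes the layer_data() helper is available in the installed ggplot2 version; the three plots are rebuilt so the snippet runs on its own, and the names drawn1 to drawn3 and same_points are illustrative.

```r
library(ggplot2)

# Rebuild the three plots from the text so this snippet is self-contained
cara1 <- ggplot(data = mtcars, mapping = aes(x = mpg, y = cyl, color = drat)) +
  geom_point()
cara2 <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = mpg, y = cyl, color = drat))
cara3 <- ggplot() +
  geom_point(data = mtcars, mapping = aes(x = mpg, y = cyl, color = drat))

# layer_data() returns the data frame that the first layer actually draws;
# only position and colour are compared here
cols <- c("x", "y", "colour")
drawn1 <- layer_data(cara1)[cols]
drawn2 <- layer_data(cara2)[cols]
drawn3 <- layer_data(cara3)[cols]

# TRUE when all three layers place the same points with the same colours
same_points <- isTRUE(all.equal(drawn1, drawn2)) &&
  isTRUE(all.equal(drawn2, drawn3))
same_points
```

Where the data and mapping are declared matters mainly once a plot has several layers: plot-level settings are inherited by every layer, while layer-level settings apply to that layer only.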

Basic Components of a Plot

There are 3 basic components in building a plot:

  • data : the data containing the information to be plotted
  • mapping : which variables/columns are displayed in the plot
  • geometries : the visual representation of those variables/columns in the plot
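
To make the three components concrete, the sketch below keeps the data and mapping fixed and swaps only the geometry. The choice of the wt and mpg columns and the object names are illustrative, not taken from the text.

```r
library(ggplot2)

# Data and mapping: which data frame, and which columns go to which aesthetic
base <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg))

# Geometry: the same data and mapping drawn as points, then as a line
as_points <- base + geom_point()
as_line   <- base + geom_line()
```

Printing as_points or as_line draws the plot; both reuse the same base object, which is why only the geometry differs between them.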

Besides ggplot(), there is a simpler function, qplot().

The qplot() function

qplot(x, y, data = , geom = )
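
As an illustration, the template can be filled in with the mtcars columns used earlier. This is a sketch only: recent ggplot2 releases mark qplot() as deprecated, so the call may emit a warning, and the ggplot() form is shown alongside it for comparison.

```r
library(ggplot2)

# Quick plot: x and y first, then the data frame and the geometry name
quick <- qplot(mpg, cyl, data = mtcars, geom = "point")

# The longer ggplot() form that draws the same scatterplot
full <- ggplot(mtcars, aes(x = mpg, y = cyl)) +
  geom_point()
```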

Plot Components

While there are 3 basic components, a complete plot is made up of 8 components in total.

The 8 components are data, mapping, statistics, scales, geometries, facets, coordinates, and theme.
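
The following sketch shows one plot in which each of the eight components is set explicitly. The particular columns, colours, axis limits, and theme are arbitrary choices made for illustration.

```r
library(ggplot2)

all_parts <- ggplot(
  data = mtcars,                                  # 1. data
  mapping = aes(x = wt, y = mpg, color = drat)    # 2. mapping
) +
  geom_point(stat = "identity") +                 # 3. statistic, 5. geometry
  scale_color_gradient(low = "grey70",
                       high = "darkblue") +       # 4. scale
  facet_wrap(vars(cyl)) +                         # 6. facets
  coord_cartesian(xlim = c(1, 6)) +               # 7. coordinates
  theme_minimal()                                 # 8. theme
```

When a component is left out, ggplot2 falls back to a default, which is why the earlier examples only needed data, mapping, and a geometry.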

Data Transformation

Visualizing data with ggplot2 often requires data transformation tools to get the data into the shape the plot needs.

In R, the best-known collection of packages for data transformation is the tidyverse. It includes several very helpful packages, among them dplyr, tidyr, readr, tibble, stringr, forcats, and purrr.

For example, the dplyr package provides the following functions:

  • select()
  • filter()
  • arrange()
  • mutate()
  • summarise()
  • group_by()

When a transformation consists of several steps, it is often written with the pipe operator (%>%).

Load the dplyr package

library(dplyr)

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union
glimpse(mtcars)
Rows: 32
Columns: 11
$ mpg  <dbl> 21.0, 21.0, 22.8, 21.4, 18.7, 18.1, 14.3, 24.4, 22.8, 19.2,~
$ cyl  <dbl> 6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, 4, 4, 4,~
$ disp <dbl> 160.0, 160.0, 108.0, 258.0, 360.0, 225.0, 360.0, 146.7, 140~
$ hp   <dbl> 110, 110, 93, 110, 175, 105, 245, 62, 95, 123, 123, 180, 18~
$ drat <dbl> 3.90, 3.90, 3.85, 3.08, 3.15, 2.76, 3.21, 3.69, 3.92, 3.92,~
$ wt   <dbl> 2.620, 2.875, 2.320, 3.215, 3.440, 3.460, 3.570, 3.190, 3.1~
$ qsec <dbl> 16.46, 17.02, 18.61, 19.44, 17.02, 20.22, 15.84, 20.00, 22.~
$ vs   <dbl> 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, 1,~
$ am   <dbl> 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1,~
$ gear <dbl> 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 4, 4, 4,~
$ carb <dbl> 4, 4, 1, 1, 2, 1, 4, 2, 2, 4, 4, 3, 3, 3, 4, 4, 4, 1, 2, 1,~

Without the pipe (%>%)

cars1 <- select(mtcars, mpg, cyl, qsec, drat, gear)
cars2 <- filter(cars1, between(qsec, 15.00, 18.00))
cars3 <- mutate(cars2, gear_per_second = gear/qsec)
cars4 <- group_by(cars3, cyl)
cars_nopipe <- summarise(cars4, avg_mpg = mean(mpg), gear_per_second=gear_per_second, gear=gear)
`summarise()` has grouped output by 'cyl'. You can override using the `.groups` argument.
glimpse(cars_nopipe)
Rows: 17
Columns: 4
Groups: cyl [3]
$ cyl             <dbl> 4, 4, 6, 6, 6, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8
$ avg_mpg         <dbl> 28.20000, 28.20000, 20.56667, 20.56667, 20.56667~
$ gear_per_second <dbl> 0.2994012, 0.2958580, 0.2430134, 0.2350176, 0.32~
$ gear            <dbl> 5, 5, 4, 4, 5, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3

With the pipe (%>%)

cars <- mtcars %>%
  select(mpg, cyl, qsec, drat, gear) %>%
  filter(between(qsec, 15.00, 18.00)) %>%
  mutate(gear_per_second = gear / qsec) %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg), gear_per_second = gear_per_second, gear = gear)
`summarise()` has grouped output by 'cyl'. You can override using the `.groups` argument.
glimpse(cars)
Rows: 17
Columns: 4
Groups: cyl [3]
$ cyl             <dbl> 4, 4, 6, 6, 6, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8
$ avg_mpg         <dbl> 28.20000, 28.20000, 20.56667, 20.56667, 20.56667~
$ gear_per_second <dbl> 0.2994012, 0.2958580, 0.2430134, 0.2350176, 0.32~
$ gear            <dbl> 5, 5, 4, 4, 5, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3

Both versions return the same result, but the pipe makes the code more concise because the intermediate objects (cars1 to cars4) are no longer needed.
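
A piped transformation can also feed a plot directly. The sketch below repeats the steps above but swaps the final summarise() for a grouped mutate(), because newer dplyr releases warn when summarise() returns more than one row per group. The object names cars_plot_data and cars_plot and the choice of aesthetics are illustrative.

```r
library(dplyr)
library(ggplot2)

# Same steps as the pipeline in the text, but with a grouped mutate() so that
# every car keeps its row and gains the mean mpg of its cylinder group
cars_plot_data <- mtcars %>%
  select(mpg, cyl, qsec, drat, gear) %>%
  filter(between(qsec, 15.00, 18.00)) %>%
  mutate(gear_per_second = gear / qsec) %>%
  group_by(cyl) %>%
  mutate(avg_mpg = mean(mpg)) %>%
  ungroup()

# The transformed data goes into ggplot() as its first argument
cars_plot <- cars_plot_data %>%
  ggplot(aes(x = gear_per_second, y = mpg, color = factor(cyl))) +
  geom_point()
```

Note that the pipeline uses %>% up to the ggplot() call and + afterwards, since ggplot2 layers are added rather than piped.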

Using the Indonesia Database for Policy and Economic Research data

The Indonesia Database for Policy and Economic Research is abbreviated INDO-DAPOER. It contains economic and social indicators at the province and city/district level in Indonesia. The data covers four main categories: fiscal, economic, social-demographic, and infrastructure.

Import data


-- Column specification --------------------------------------------------
cols(
  .default = col_double(),
  area_name = col_character(),
  `Import: Commodities and transaction not elsewhere classified (province Level, in USD)` = col_logical(),
  `Length of National Road: Dirt (in km) (BPS Data, Province only)` = col_logical(),
  `Length of National Road: Other (in km) (BPS Data, Province only)` = col_logical(),
  `Total Natural Resources Revenue Sharing from Geothermal  Energy (in IDR, realization value)` = col_logical(),
  `Total Revenue Sharing` = col_logical(),
  `Total Specific Allocation Grant for Village (in IDR Billion)` = col_logical()
)
i Use `spec()` for the full column specifications.

232 parsing failures.
 row                                                                                   col           expected  actual                                                                                    file
1008 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 554693  'C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/indodapoer.tsv/indodapoer.tsv'
1009 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 1291450 'C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/indodapoer.tsv/indodapoer.tsv'
1010 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 365356  'C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/indodapoer.tsv/indodapoer.tsv'
1011 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 216478  'C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/indodapoer.tsv/indodapoer.tsv'
1012 Import: Commodities and transaction not elsewhere classified (province Level, in USD) 1/0/T/F/TRUE/FALSE 646310  'C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/indodapoer.tsv/indodapoer.tsv'
.... ..................................................................................... .................. ....... .......................................................................................
See problems(...) for more details.
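
The parsing failures above appear because readr guessed a logical type for columns that are empty in the early rows and then met numbers further down. The import call itself is not shown in the text; one plausible remedy is to pass explicit column types to read_tsv(). The sketch below demonstrates the idea on a small temporary file, since the real indodapoer.tsv path is specific to the author's machine. The file contents and the column name some_indicator are made up for illustration.

```r
library(readr)

# Hypothetical miniature of the problem: an indicator column that is empty
# in the early rows and numeric only later on
demo_path <- tempfile(fileext = ".tsv")
write_lines(
  c("area_name\tyear\tsome_indicator",
    "Region A\t2000\t",
    "Region A\t2001\t",
    "Region A\t2002\t100"),
  demo_path
)

# Declaring the types up front: every column is a double except area_name
demo <- read_tsv(
  demo_path,
  col_types = cols(.default = col_double(), area_name = col_character())
)

# problems() lists any values readr could not parse
demo_problems <- problems(demo)
```

For the real file, the same col_types argument could be passed to read_tsv() together with the local path to indodapoer.tsv, assuming every column other than area_name is numeric as the column specification above suggests.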

Check the data

glimpse(indodapoer)
Rows: 22,468
Columns: 222
$ area_name                                                                                                     <chr> ~
$ year                                                                                                          <dbl> ~
$ `Agriculture function expenditure (in IDR)`                                                                   <dbl> ~
$ `Average National Exam Score: Junior Secondary Level (out of 100, available only in district level for 2009)` <dbl> ~
$ `Average National Exam Score: Primary Level (out of 100, available only in district level for 2009)`          <dbl> ~
$ `Average National Exam Score: Senior Secondary Level (out of 100, available only in district level for 2009)` <dbl> ~
$ `Birth attended by Skilled Health worker (in % of total birth)`                                               <dbl> ~
$ `BPK Audit Report on Sub-National Budget`                                                                     <dbl> ~
$ `Capital expenditure (in IDR)`                                                                                <dbl> ~
$ `Consumer Price Index in 42 cities base 1996`                                                                 <dbl> ~
$ `Consumer Price Index in 45 cities base 2002`                                                                 <dbl> ~
$ `Consumer Price Index in 66 cities base 2007`                                                                 <dbl> ~
$ `Economy function expenditure (in IDR)`                                                                       <dbl> ~
$ `Education function expenditure (in IDR)`                                                                     <dbl> ~
$ `Environment function expenditure (in IDR)`                                                                   <dbl> ~
$ `Export: Animals and vegetable oil, fat and waxes (province Level, in USD)`                                   <dbl> ~
$ `Export: Beverages and tobacco (province Level, in USD)`                                                      <dbl> ~
$ `Export: Chemical and related products, nes (province Level, in USD)`                                         <dbl> ~
$ `Export: Commodities and transaction not elsewhere classified (province Level, in USD)`                       <dbl> ~
$ `Export: Crude materials, inedible, except fuels (province Level, in USD)`                                    <dbl> ~
$ `Export: Food and Live Animals (province Level, in USD)`                                                      <dbl> ~
$ `Export: Machinery and transport equipment (province Level, in USD)`                                          <dbl> ~
$ `Export: Manufactured goods, classified chiefly by material (province Level, in USD)`                         <dbl> ~
$ `Export: Mineral fuels, lubricants and related materials (province Level, in USD)`                            <dbl> ~
$ `Export: Miscellaneous manufactures articles (province Level, in USD)`                                        <dbl> ~
$ `GDP expenditure on changes in stock (in IDR Million)`                                                        <dbl> ~
$ `GDP expenditure on exports (in IDR Million)`                                                                 <dbl> ~
$ `GDP expenditure on general government consumption (in IDR Million)`                                          <dbl> ~
$ `GDP expenditure on gross fixed capital formation (in IDR Million)`                                           <dbl> ~
$ `GDP expenditure on imports (in IDR Million)`                                                                 <dbl> ~
$ `GDP expenditure on non profit private institution consumption (in IDR Million)`                              <dbl> ~
$ `GDP expenditure on private consumption (in IDR Million)`                                                     <dbl> ~
$ `GDP on Agriculture Sector (in IDR Million), Constant Price`                                                  <dbl> ~
$ `GDP on Agriculture Sector (in IDR Million), Current Price`                                                   <dbl> ~
$ `GDP on Construction Sector (in IDR Million), Constant Price`                                                 <dbl> ~
$ `GDP on Construction Sector (in IDR Million), Current Price`                                                  <dbl> ~
$ `GDP on Financial Service Sector (in IDR Million), Constant Price`                                            <dbl> ~
$ `GDP on Financial Service Sector (in IDR Million), Current Price`                                             <dbl> ~
$ `GDP on Manufacturing Sector (in IDR Million), Constant Price`                                                <dbl> ~
$ `GDP on Manufacturing Sector (in IDR Million), Current Price`                                                 <dbl> ~
$ `GDP on Mining and Quarrying Sector (in IDR Million), Constant Price`                                         <dbl> ~
$ `GDP on Mining and Quarrying Sector (in IDR Million), Current Price`                                          <dbl> ~
$ `GDP on Other Service Sector (in IDR Million), Constant Price`                                                <dbl> ~
$ `GDP on Other Service Sector (in IDR Million), Current Price`                                                 <dbl> ~
$ `GDP on Trade, Hotel and Restaurant Sector (in IDR Million), Constant Price`                                  <dbl> ~
$ `GDP on Trade, Hotel and Restaurant Sector (in IDR Million), Current Price`                                   <dbl> ~
$ `GDP on Transportation and Telecommunication Sector (in IDR Million), Constant Price`                         <dbl> ~
$ `GDP on Transportation and Telecommunication Sector (in IDR Million), Current Price`                          <dbl> ~
$ `GDP on Utilities Sector (in IDR Million), Constant Price`                                                    <dbl> ~
$ `GDP on Utilities Sector (in IDR Million), Current Price`                                                     <dbl> ~
$ `General administration function expenditure (in IDR)`                                                        <dbl> ~
$ `Goods and services expenditure (in IDR)`                                                                     <dbl> ~
$ `Health function expenditure (in IDR)`                                                                        <dbl> ~
$ `Household Access to Electricity: Total (in % of total household)`                                            <dbl> ~
$ `Household Access to Fixed Line Phone Connection (in % of total Household)`                                   <dbl> ~
$ `Household Access to safe Sanitation (in % of total Household)`                                               <dbl> ~
$ `Household Access to Safe Water (in % of total household)`                                                    <dbl> ~
$ `Household per capita expenditure (in IDR)`                                                                   <dbl> ~
$ `Housing and public facilities function expenditure (in IDR)`                                                 <dbl> ~
$ `Human Development Index`                                                                                     <dbl> ~
$ `Immunization Coverage for Children under 5 years old (in % of children population under 5 years old)`        <dbl> ~
$ `Import: Animals and vegetable oil, fat and waxes (province Level, in USD)`                                   <dbl> ~
$ `Import: Beverages and tobacco (province Level, in USD)`                                                      <dbl> ~
$ `Import: Chemical and related products, nes (province Level, in USD)`                                         <dbl> ~
$ `Import: Commodities and transaction not elsewhere classified (province Level, in USD)`                       <lgl> ~
$ `Import: Crude materials, inedible, except fuels (province Level, in USD)`                                    <dbl> ~
$ `Import: Food and Live Animals (province Level, in USD)`                                                      <dbl> ~
$ `Import: Machinery and transport equipment (province Level, in USD)`                                          <dbl> ~
$ `Import: Manufactured goods, classified chiefly by material (province Level, in USD)`                         <dbl> ~
$ `Import: Mineral fuels, lubricants and related materials (province Level, in USD)`                            <dbl> ~
$ `Import: Miscellaneous manufactures articles (province Level, in USD)`                                        <dbl> ~
$ `Infrastructure function expenditure (in IDR)`                                                                <dbl> ~
$ `Length of District Road: Asphalt (in km) (BPS Data, Province only)`                                          <dbl> ~
$ `Length of District Road: Bad Damage (in km) (Bina Marga Data)`                                               <dbl> ~
$ `Length of District Road: Bad Damage (in km) (BPS Data, Province only)`                                       <dbl> ~
$ `Length of District Road: Dirt (in km) (BPS Data, Province only)`                                             <dbl> ~
$ `Length of District Road: Fair (in km) (Bina Marga Data)`                                                     <dbl> ~
$ `Length of District Road: Fair (in km) (BPS Data, Province only)`                                             <dbl> ~
$ `Length of District Road: Good (in km) (Bina Marga Data)`                                                     <dbl> ~
$ `Length of District Road: Good (in km) (BPS Data, Province only)`                                             <dbl> ~
$ `Length of District Road: Gravel (in km) (BPS Data, Province only)`                                           <dbl> ~
$ `Length of District Road: Light Damage (in km) (Bina Marga Data)`                                             <dbl> ~
$ `Length of District Road: Light Damage (in km) (BPS Data, Province only)`                                     <dbl> ~
$ `Length of District Road: Other (in km) (BPS Data, Province only)`                                            <dbl> ~
$ `Length of National Road: Asphalt (in km) (BPS Data, Province only)`                                          <dbl> ~
$ `Length of National Road: Bad Damage (in km) (BPS Data, Province only)`                                       <dbl> ~
$ `Length of National Road: Dirt (in km) (BPS Data, Province only)`                                             <lgl> ~
$ `Length of National Road: Fair (in km) (BPS Data, Province only)`                                             <dbl> ~
$ `Length of National Road: Good (in km) (BPS Data, Province only)`                                             <dbl> ~
$ `Length of National Road: Gravel (in km) (BPS Data, Province only)`                                           <dbl> ~
$ `Length of National Road: Light Damage (in km) (BPS Data, Province only)`                                     <dbl> ~
$ `Length of National Road: Other (in km) (BPS Data, Province only)`                                            <lgl> ~
$ `Length of Province Road: Asphalt (in km) (BPS Data, Province only)`                                          <dbl> ~
$ `Length of Province Road: Bad Damage (in km) (BPS Data, Province only)`                                       <dbl> ~
$ `Length of Province Road: Dirt (in km) (BPS Data, Province only)`                                             <dbl> ~
$ `Length of Province Road: Fair (in km) (BPS Data, Province only)`                                             <dbl> ~
$ `Length of Province Road: Good (in km) (BPS Data, Province only)`                                             <dbl> ~
$ `Length of Province Road: Gravel (in km) (BPS Data, Province only)`                                           <dbl> ~
$ `Length of Province Road: Light Damage (in km) (BPS Data, Province only)`                                     <dbl> ~
$ `Length of Province Road: Other (in km) (BPS Data, Province only)`                                            <dbl> ~
$ `Literacy Rate for Population age 15 and over (in % of total population)`                                     <dbl> ~
$ `Monthly Per Capita Household Education Expenditure (in IDR)`                                                 <dbl> ~
$ `Monthly Per Capita Household Health Expenditure (in IDR)`                                                    <dbl> ~
$ `Monthly Per Capita TOTAL Household Expenditure for The Poorest 20 percent (in IDR)`                          <dbl> ~
$ `Morbidity Rate (in %)`                                                                                       <dbl> ~
$ `Net Enrollment Ratio: Junior Secondary (in %)`                                                               <dbl> ~
$ `Net Enrollment Ratio: Primary (in %)`                                                                        <dbl> ~
$ `Net Enrollment Ratio: Senior Secondary (in %)`                                                               <dbl> ~
$ `Number of Doctors`                                                                                           <dbl> ~
$ `Number of hospitals`                                                                                         <dbl> ~
$ `Number of Midwives`                                                                                          <dbl> ~
$ `Number of people employed`                                                                                   <dbl> ~
$ `Number of people employed in agriculture, forestry and fishery`                                              <dbl> ~
$ `Number of people employed in construction sector`                                                            <dbl> ~
$ `Number of people employed in electricity and utilities sector`                                               <dbl> ~
$ `Number of people employed in financial services sector`                                                      <dbl> ~
$ `Number of people employed in industrial sector`                                                              <dbl> ~
$ `Number of people employed in mining and quarrying sector`                                                    <dbl> ~
$ `Number of people employed in social services sector`                                                         <dbl> ~
$ `Number of people employed in trade, hotel and restaurant sector`                                             <dbl> ~
$ `Number of people employed in transportation and telecommunication sector`                                    <dbl> ~
$ `Number of people in labor force`                                                                             <dbl> ~
$ `Number of people live below the poverty line (in number of people)`                                          <dbl> ~
$ `Number of people underemployed`                                                                              <dbl> ~
$ `Number of people unemployed`                                                                                 <dbl> ~
$ `Number of Polindes (Poliklinik Desa/Village Polyclinic)`                                                     <dbl> ~
$ `Number of Puskesmas and its line services`                                                                   <dbl> ~
$ `Number of schools at junior secondary level`                                                                 <dbl> ~
$ `Number of schools at primary level`                                                                          <dbl> ~
$ `Number of schools at Senior Secondary level`                                                                 <dbl> ~
$ `Number of Student: Junior Secondary Level (in number of people, 2009 data only)`                             <dbl> ~
$ `Number of Student: Primary Level (in number of people, 2009 data only)`                                      <dbl> ~
$ `Number of Student: Senior Secondary Level (in number of people, 2009 data only)`                             <dbl> ~
$ `Number of Teacher: Junior Secondary Level (in number of people, 2009 data only)`                             <dbl> ~
$ `Number of Teacher: Primary Level (in number of people, 2009 data only)`                                      <dbl> ~
$ `Number of Teacher: Senior Secondary Level (in number of people, 2009 data only)`                             <dbl> ~
$ `Others expenditure (in IDR)`                                                                                 <dbl> ~
$ `Outstanding Deposits of Commercial Banks owned by Regional Government (Province Level, in IDR Million)`      <dbl> ~
$ `Palm Oil Land Area by type of condition: Damaged (in Hectares)`                                              <dbl> ~
$ `Palm Oil Land Area by type of condition: Immature (in Hectares)`                                             <dbl> ~
$ `Palm Oil Land Area by type of condition: Mature (in Hectares)`                                               <dbl> ~
$ `Palm Oil Land Area by type of ownership: Private (in Hectares)`                                              <dbl> ~
$ `Palm Oil Land Area by type of ownership: Smallholder (in Hectares)`                                          <dbl> ~
$ `Palm Oil Land Area by type of ownership: State Owned Enterprise (in Hectares)`                               <dbl> ~
$ `Palm Oil Land Area: Total (in Hectares)`                                                                     <dbl> ~
$ `Palm Oil Yield by type of ownership: Private (in Kg/Ha)`                                                     <dbl> ~
$ `Palm Oil Yield by type of ownership: Smallholder (in Kg/Ha)`                                                 <dbl> ~
$ `Palm Oil Yield by type of ownership: State Owned Enterprise (in Kg/Ha)`                                      <dbl> ~
$ `Palm Production by type of ownership: Private (in Tons)`                                                     <dbl> ~
$ `Palm Production by type of ownership: Smallholder (in Tons)`                                                 <dbl> ~
$ `Palm Production by type of ownership: State Owned Enterprise (in Tons)`                                      <dbl> ~
$ `Palm Production: Total (in Tons)`                                                                            <dbl> ~
$ `Percentage of Population in Rural Areas (only 2005 and 2010) (in % of Total Population)`                     <dbl> ~
$ `Percentage of Population in Urban Areas (only 2005 and 2010) (in % of Total Population)`                     <dbl> ~
$ `Personnel expenditure (in IDR)`                                                                              <dbl> ~
$ `Poverty Gap (index)`                                                                                         <dbl> ~
$ `Poverty Line (in IDR)`                                                                                       <dbl> ~
$ `Poverty Rate (in % of population)`                                                                           <dbl> ~
$ `Public, law and order function expenditure (in IDR)`                                                         <dbl> ~
$ `Religious function expenditure (in IDR)`                                                                     <dbl> ~
$ `Social protection function expenditure (in IDR)`                                                             <dbl> ~
$ `Total Area (in Km²)`                                                                                         <dbl> ~
$ `Total Commercial and Rural Banks Loans Rupiah and Foreign Currency (province level, in IDR Million)`         <dbl> ~
$ `Total Credit by Sector: Agriculture (province level, in IDR Million)`                                        <dbl> ~
$ `Total Credit by Sector: Business (province level, in IDR Million)`                                           <dbl> ~
$ `Total Credit by Sector: Construction (province level, in IDR Million)`                                       <dbl> ~
$ `Total Credit by Sector: Manufacture (province level, in IDR Million)`                                        <dbl> ~
$ `Total Credit by Sector: Mining and Quarrying (province level, in IDR Million)`                               <dbl> ~
$ `Total Credit by Sector: Other Services (province level, in IDR Million)`                                     <dbl> ~
$ `Total Credit by Sector: Social Services (province level, in IDR Million)`                                    <dbl> ~
$ `Total Credit by Sector: Trade (province level, in IDR Million)`                                              <dbl> ~
$ `Total Credit by Sector: Transportation (province level, in IDR Million)`                                     <dbl> ~
$ `Total Credit by Sector: Utilities (province level, in IDR Million)`                                          <dbl> ~
$ `Total Credit by Utilization: Consumption (province level, in IDR Million)`                                   <dbl> ~
$ `Total credit by Utilization: Investment (province level, in IDR Million)`                                    <dbl> ~
$ `Total Credit by Utilization: Working Capital (province level, in IDR Million)`                               <dbl> ~
$ `Total Deposits (province level, in IDR Million)`                                                             <dbl> ~
$ `Total Expenditure (in IDR)`                                                                                  <dbl> ~
$ `Total GDP based on expenditure (in IDR Million)`                                                             <dbl> ~
$ `Total GDP excluding Oil and Gas (in IDR Million), Constant Price`                                            <dbl> ~
$ `Total GDP excluding Oil and Gas (in IDR Million), Current Price`                                             <dbl> ~
$ `Total GDP including Oil and Gas (in IDR Million), Constant Price`                                            <dbl> ~
$ `Total GDP including Oil and Gas (in IDR Million), Current Price`                                             <dbl> ~
$ `Total General Allocation Grant/DAU (in IDR)`                                                                 <dbl> ~
$ `Total Natural Resource Revenue Sharing/DBH SDA (in IDR)`                                                     <dbl> ~
$ `Total Natural Resources Revenue Sharing from Fishery (in IDR, realization value)`                            <dbl> ~
$ `Total Natural Resources Revenue Sharing from Forestry (in IDR, realization value)`                           <dbl> ~
$ `Total Natural Resources Revenue Sharing from Gas (in IDR, realization value)`                                <dbl> ~
$ `Total Natural Resources Revenue Sharing from Geothermal  Energy (in IDR, realization value)`                 <lgl> ~
$ `Total Natural Resources Revenue Sharing from Mining (in IDR, realization value)`                             <dbl> ~
$ `Total Natural Resources Revenue Sharing from Oil (in IDR, realization value)`                                <dbl> ~
$ `Total Other Revenue (in IDR)`                                                                                <dbl> ~
$ `Total Own Source Revenue/PAD (in IDR)`                                                                       <dbl> ~
$ `Total Population (in number of people)`                                                                      <dbl> ~
$ `Total Population for Age 0-14 (only 2005 and 2010) (in number of people)`                                    <dbl> ~
$ `Total Population for Age 15-64 (only 2005 and 2010) (in number of people)`                                   <dbl> ~
$ `Total Population for Age 65 and above (only 2005 and 2010) (in number of people)`                            <dbl> ~
$ `Total Revenue (in IDR)`                                                                                      <dbl> ~
$ `Total Revenue Sharing`                                                                                       <lgl> ~
$ `Total Special Allocation Grant/DAK (in IDR)`                                                                 <dbl> ~
$ `Total Specific Allocation Grant for Agriculture (in IDR Billion)`                                            <dbl> ~
$ `Total Specific Allocation Grant for Demographic (in IDR Billion)`                                            <dbl> ~
$ `Total Specific Allocation Grant for Education (in IDR Billion)`                                              <dbl> ~
$ `Total Specific Allocation Grant for Environment (in IDR Billion)`                                            <dbl> ~
$ `Total Specific Allocation Grant for Fishery (in IDR Billion)`                                                <dbl> ~
$ `Total Specific Allocation Grant for Forestry (in IDR Billion)`                                               <dbl> ~
$ `Total Specific Allocation Grant for Government Sector (in IDR Billion)`                                      <dbl> ~
$ `Total Specific Allocation Grant for Health (in IDR Billion)`                                                 <dbl> ~
$ `Total Specific Allocation Grant for Health Sector (Subsect: Basic Services) (in IDR Billion)`                <dbl> ~
$ `Total Specific Allocation Grant for Health Sector (Subsect: Recommended Services) (in IDR Billion)`          <dbl> ~
$ `Total Specific Allocation Grant for Infrastructure (in IDR Billion)`                                         <dbl> ~
$ `Total Specific Allocation Grant for Infrastructure Sector (Subsect: Irrigation) (in IDR Billion)`            <dbl> ~
$ `Total Specific Allocation Grant for Infrastructure Sector (Subsect: Road) (in IDR Billion)`                  <dbl> ~
$ `Total Specific Allocation Grant for Infrastructure Sector (Subsect: Water) (in IDR Billion)`                 <dbl> ~
$ `Total Specific Allocation Grant for Trade (in IDR Billion)`                                                  <dbl> ~
$ `Total Specific Allocation Grant for Village (in IDR Billion)`                                                <lgl> ~
$ `Total Tax Revenue Sharing/DBH Pajak (in IDR)`                                                                <dbl> ~
$ `Tourism and culture function expenditure (in IDR)`                                                           <dbl> ~
$ `Villages with road: Asphalt (in % of total villages)`                                                        <dbl> ~
$ `Villages with road: Dirt (in % of total villages)`                                                           <dbl> ~
$ `Villages with road: Gravel (in % of total villages)`                                                         <dbl> ~
$ `Villages with road: Other (in % of total villages)`                                                          <dbl> ~
# nrow(indodapoer)
# ncol(indodapoer)

So the INDO-DAPOER data has 22,468 rows and 222 columns.

Wild Names and How to Tame Them

Next, we use the janitor package to rewrite the column names so that they follow R's rules for “syntactically valid names”. Its clean_names() function does this in one call, which makes the columns easier to refer to in analysis and visualization code.

library(janitor)

Attaching package: ‘janitor’

The following objects are masked from ‘package:stats’:

    chisq.test, fisher.test
head(colnames(indodapoer), 15)
 [1] "area_name"                                                                                                  
 [2] "year"                                                                                                       
 [3] "Agriculture function expenditure (in IDR)"                                                                  
 [4] "Average National Exam Score: Junior Secondary Level (out of 100, available only in district level for 2009)"
 [5] "Average National Exam Score: Primary Level (out of 100, available only in district level for 2009)"         
 [6] "Average National Exam Score: Senior Secondary Level (out of 100, available only in district level for 2009)"
 [7] "Birth attended by Skilled Health worker (in % of total birth)"                                              
 [8] "BPK Audit Report on Sub-National Budget"                                                                    
 [9] "Capital expenditure (in IDR)"                                                                               
[10] "Consumer Price Index in 42 cities base 1996"                                                                
[11] "Consumer Price Index in 45 cities base 2002"                                                                
[12] "Consumer Price Index in 66 cities base 2007"                                                                
[13] "Economy function expenditure (in IDR)"                                                                      
[14] "Education function expenditure (in IDR)"                                                                    
[15] "Environment function expenditure (in IDR)"                                                                  
indodapoer <- clean_names(indodapoer)
head(colnames(indodapoer), 15)
 [1] "area_name"                                                                                              
 [2] "year"                                                                                                   
 [3] "agriculture_function_expenditure_in_idr"                                                                
 [4] "average_national_exam_score_junior_secondary_level_out_of_100_available_only_in_district_level_for_2009"
 [5] "average_national_exam_score_primary_level_out_of_100_available_only_in_district_level_for_2009"         
 [6] "average_national_exam_score_senior_secondary_level_out_of_100_available_only_in_district_level_for_2009"
 [7] "birth_attended_by_skilled_health_worker_in_percent_of_total_birth"                                      
 [8] "bpk_audit_report_on_sub_national_budget"                                                                
 [9] "capital_expenditure_in_idr"                                                                             
[10] "consumer_price_index_in_42_cities_base_1996"                                                            
[11] "consumer_price_index_in_45_cities_base_2002"                                                            
[12] "consumer_price_index_in_66_cities_base_2007"                                                            
[13] "economy_function_expenditure_in_idr"                                                                    
[14] "education_function_expenditure_in_idr"                                                                  
[15] "environment_function_expenditure_in_idr"                                                                

Gross Regional Domestic Product

This section shows the gross regional domestic product (PDRB) of the provinces on the island of Java. Each province name in the data ends with “, Prop.”, so that suffix is removed with str_remove() from the stringr package.

library(stringr)

pdrb_pjawa <- 
  indodapoer %>%
  filter(
    area_name %in% c(
      "Banten, Prop.",
      "DKI Jakarta, Prop.",
      "Jawa Barat, Prop.",
      "Jawa Tengah, Prop.",
      "DI Yogyakarta, Prop.",
      "Jawa Timur, Prop."
    )
  )%>%
  transmute(
    provinsi = str_remove(area_name, ", Prop."),
    tahun = year,
    pdrb_nonmigas = total_gdp_excluding_oil_and_gas_in_idr_million_constant_price) %>% 
    filter(!is.na(pdrb_nonmigas))

glimpse(pdrb_pjawa)
Rows: 164
Columns: 3
$ provinsi      <chr> "Banten", "Banten", "Banten", "Banten", "Banten", ~
$ tahun         <dbl> 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 20~
$ pdrb_nonmigas <dbl> 45690559, 47495383, 49449321, 51957458, 54880407, ~

Non-Oil-and-Gas PDRB Chart

First version

pdrb_pjawa%>%
  ggplot(aes(tahun, pdrb_nonmigas, colour = provinsi)) +
  geom_line()

Next, fct_reorder2() from the forcats package orders the provinces by their non-oil-and-gas PDRB in the last available year.

library(forcats)

pdrb_pjawa%>%
  mutate(
    provinsi = fct_reorder2(provinsi, tahun, pdrb_nonmigas)
  ) %>%
  ggplot(aes(tahun, pdrb_nonmigas, colour = provinsi)) +
  geom_line()

Notice the difference: the legend now follows the order in which the lines end at the right edge of the chart.

Direct Labeling

Direct labeling places each province name next to its line, which makes every line easier to identify.

library(directlabels)

pdrb_pjawa %>% 
  ggplot(aes(tahun, pdrb_nonmigas)) +
  geom_line(aes(colour = provinsi), show.legend = FALSE) +
  geom_dl(
    aes(label = provinsi), 
    method = "last.points",
    position = position_nudge(x = 0.3) # keep the text from touching the line
  )

Finalizing the Chart

library(hrbrthemes)

pdrb_pjawa %>% 
  ggplot(aes(tahun, pdrb_nonmigas/1e6)) +
  geom_line(aes(colour = provinsi), show.legend = FALSE) +
  geom_dl(
    aes(label = provinsi), 
    method = "last.points",
    position = position_nudge(x = 0.3) # keep the text from touching the line
  ) +
  labs(
    x = NULL,
    y = NULL,
    title = "PDRB Non-Migas di Pulau Jawa Hingga Tahun 2011",
    subtitle = "PDRB atas dasar harga konstan, dalam satuan triliun",
    caption = "Data: INDO-DAPOER, The World Bank"
  ) +
  coord_cartesian(clip = "off") +
  theme_ipsum(grid = "Y", ticks = TRUE)

How Large?

luas_provinsi <- 
  indodapoer %>% 
  filter(str_detect(area_name, "Prop")) %>% 
  filter(year==2009)%>%
  transmute(
    provinsi = str_remove(area_name, ", Prop."),
    luas_wilayah = total_area_in_km2
  )
glimpse(luas_provinsi)
Rows: 34
Columns: 2
$ provinsi     <chr> "Nanggroe Aceh Darussalam", "Bali", "Kepulauan Bang~
$ luas_wilayah <dbl> 57956.00, 5780.06, 16424.06, 19919.33, 9662.92, 112~

Comparing Province Areas

The next chart is a treemap, drawn with the “treemapify” package.

First, let us check whether the data is clean. The profile_missing() function from the DataExplorer package reports the missing values in each column.

library(DataExplorer)
Registered S3 method overwritten by 'data.table':
  method           from
  print.data.table     
luas_provinsi
profile_missing(luas_provinsi)
luas_provinsi <- na.omit(luas_provinsi)

The luas_wilayah column has 1 missing value (num_missing), so that row is removed with na.omit().

library(treemapify)

luas_provinsi %>% 
  ggplot(aes(area = luas_wilayah)) +
  geom_treemap() +
  geom_treemap_text(aes(label = provinsi))

Modifying the Chart

library(scales)

Attaching package: ‘scales’

The following object is masked from ‘package:readr’:

    col_factor
luas_provinsi %>% 
  ggplot(aes(
    area = luas_wilayah, 
    fill = luas_wilayah)
  ) +
  geom_treemap() +
  geom_treemap_text(
    aes(label = provinsi), 
    family = "Arial Narrow",
    colour = "white",
    reflow = TRUE,
    grow = TRUE
  ) +
  scale_fill_viridis_c(
    guide = guide_colourbar(
      barwidth = 30,
      barheight = 0.8
    ),
    labels = label_number(
      big.mark = ".", 
      decimal.mark = ",", 
      suffix = " km2")
  ) +
  labs(
    fill = "Luas\nwilayah",
    title = "Perbandingan Luas 33 Provinsi di Indonesia",
    subtitle = "Berdasarkan data tahun 2009, sehingga Kalimantan Utara tidak tercantum dalam grafik",
    caption = "Data: INDO-DAPOER, The World Bank"
  ) +
  theme_ipsum() +
  theme(legend.position = "bottom")

This Journey

This visualization shows the condition of road infrastructure in all districts (kabupaten) and cities (kota) in Indonesia.

Prepare the data

jalan_kabkota <- 
  indodapoer %>% 
  filter(str_detect(area_name, ", Prop.", negate = TRUE)) %>% 
  filter(year == 2008) %>%
  transmute(
    kabkota = area_name,
    jalan_rusak_parah = length_of_district_road_bad_damage_in_km_bina_marga_data,
    jalan_rusak_ringan = length_of_district_road_light_damage_in_km_bina_marga_data,
    jalan_cukup_baik =length_of_district_road_fair_in_km_bina_marga_data,
    jalan_sangat_baik =length_of_district_road_good_in_km_bina_marga_data)
glimpse(jalan_kabkota)
Rows: 514
Columns: 5
$ kabkota            <chr> "Aceh Barat, Kab.", "Aceh Barat Daya, Kab.", ~
$ jalan_rusak_parah  <dbl> 64, 1, 97, 112, 21, NA, 130, 8, 168, 76, 25, ~
$ jalan_rusak_ringan <dbl> 191, 15, 101, 321, 36, 553, 183, 207, 174, 35~
$ jalan_cukup_baik   <dbl> 218, 81, 270, 416, 59, 170, 146, 284, 201, 74~
$ jalan_sangat_baik  <dbl> 153, 87, 105, 284, 89, 25, 177, 432, 177, 221~

Pivot

Reshape the data to long format with pivot_longer() from the tidyr package.

library(tidyr)

glimpse(jalan_kabkota)
Rows: 514
Columns: 5
$ kabkota            <chr> "Aceh Barat, Kab.", "Aceh Barat Daya, Kab.", ~
$ jalan_rusak_parah  <dbl> 64, 1, 97, 112, 21, NA, 130, 8, 168, 76, 25, ~
$ jalan_rusak_ringan <dbl> 191, 15, 101, 321, 36, 553, 183, 207, 174, 35~
$ jalan_cukup_baik   <dbl> 218, 81, 270, 416, 59, 170, 146, 284, 201, 74~
$ jalan_sangat_baik  <dbl> 153, 87, 105, 284, 89, 25, 177, 432, 177, 221~
jalan_kabkota <- 
  jalan_kabkota %>% 
  pivot_longer(
    cols = starts_with("jalan_"),
    names_to = "kondisi",
    names_prefix = "jalan_",
    values_to = "panjang_jalan"
  )

glimpse(jalan_kabkota)
Rows: 2,056
Columns: 3
$ kabkota       <chr> "Aceh Barat, Kab.", "Aceh Barat, Kab.", "Aceh Bara~
$ kondisi       <chr> "rusak_parah", "rusak_ringan", "cukup_baik", "sang~
$ panjang_jalan <dbl> 64, 191, 218, 153, 1, 15, 81, 87, 97, 101, 270, 10~
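
The reshaping step can be tried on a much smaller table. The sketch below is only an illustration: the two districts and their road lengths are hypothetical, and only two of the four condition columns are included.

```r
library(tidyr)

# Hypothetical road lengths (km) for two districts, in wide format
wide <- data.frame(
  kabkota = c("X, Kab.", "Y, Kota"),
  jalan_rusak_parah = c(10, 2),
  jalan_sangat_baik = c(50, 30)
)

# One row per district and condition; the "jalan_" prefix is dropped
long <- pivot_longer(
  wide,
  cols = starts_with("jalan_"),
  names_to = "kondisi",
  names_prefix = "jalan_",
  values_to = "panjang_jalan"
)
long
```

Each district now occupies one row per road condition, which is the shape ggplot2 needs to map kondisi to an axis or a facet.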

The Next Step

jalan_kabkota <-
  jalan_kabkota %>%
  mutate(
    status = case_when(
      str_detect(kabkota, ", Kab") ~ "Kabupaten",
      str_detect(kabkota, ", Kota") ~ "Kota",
      str_detect(kabkota, "City") ~ "Kota",
      TRUE ~ NA_character_
    ),
    kondisi = factor(
      kondisi,
      levels = c("rusak_parah", "rusak_ringan", "cukup_baik", "sangat_baik"),
      labels = c("Rusak parah", "Rusak ringan", "Cukup baik", "Sangat baik")
    )
  )
glimpse(jalan_kabkota)
Rows: 2,056
Columns: 4
$ kabkota       <chr> "Aceh Barat, Kab.", "Aceh Barat, Kab.", "Aceh Bara~
$ kondisi       <fct> Rusak parah, Rusak ringan, Cukup baik, Sangat baik~
$ panjang_jalan <dbl> 64, 191, 218, 153, 1, 15, 81, 87, 97, 101, 270, 10~
$ status        <chr> "Kabupaten", "Kabupaten", "Kabupaten", "Kabupaten"~
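
The two new columns come from two separate techniques: case_when() assigns a status from the suffix of the area name, and factor() fixes the order and the display labels of the road conditions. A minimal sketch on hypothetical input values (the third area name is made up to show the fallback to NA):

```r
library(dplyr)
library(stringr)

# Hypothetical area names; the last one matches none of the patterns
kabkota <- c("Aceh Barat, Kab.", "Banda Aceh, Kota", "Hypothetical Region")

# The first matching condition wins; TRUE acts as the fallback
status <- case_when(
  str_detect(kabkota, ", Kab")  ~ "Kabupaten",
  str_detect(kabkota, ", Kota") ~ "Kota",
  TRUE ~ NA_character_
)
status

# levels sets the order from worst to best, labels sets the text shown in the chart
kondisi <- factor(
  c("sangat_baik", "rusak_parah"),
  levels = c("rusak_parah", "rusak_ringan", "cukup_baik", "sangat_baik"),
  labels = c("Rusak parah", "Rusak ringan", "Cukup baik", "Sangat baik")
)
levels(kondisi)
```

Because the levels are ordered explicitly, a chart that uses kondisi on an axis shows the conditions from worst to best instead of alphabetically.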

Road Condition Chart

Here we use a ridgeline plot, which is well suited to showing how the distribution of a numeric variable changes across groups.

library(ggridges)

jalan_kabkota_plot <- 
  jalan_kabkota %>% 
  ggplot(aes(panjang_jalan, kondisi)) +
  facet_wrap(~status) +
  geom_density_ridges_gradient(
    aes(fill = after_stat(x)), 
    show.legend = FALSE
  )
jalan_kabkota_plot

Logarithmic Transformation

To make the chart easier to read, the x-axis is transformed to a log10 scale. The result is as follows:

jalan_kabkota_plot <-
  jalan_kabkota %>%
  ggplot(aes(panjang_jalan, kondisi)) +
  facet_wrap(~status) +
  geom_density_ridges_gradient(
    aes(fill = after_stat(x)),
    show.legend = FALSE
  )
jalan_kabkota_plot +
  geom_vline(xintercept = 100, linetype = "dashed", colour = "darkslategray4") +
  scale_x_continuous(trans = "log10")
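
Why the log10 axis helps can be seen with plain arithmetic. The values below are hypothetical road lengths chosen to span several orders of magnitude.

```r
# Hypothetical road lengths spanning several orders of magnitude (km)
panjang <- c(1, 10, 100, 1000)

# On a linear axis the gaps between neighbours are 9, 90 and 900
gap_linear <- diff(panjang)
gap_linear

# On a log10 axis every tenfold step has the same width
gap_log <- diff(log10(panjang))
gap_log

# A length of 0 has no finite logarithm
log10(0)
```

Short and long roads therefore get comparable space on the axis. The last line is a caveat rather than a feature: values of 0 become -Inf after the transformation, so they are not expected to appear in the chart.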

Finalizing

jalan_kabkota_plot <-
  jalan_kabkota %>%
  ggplot(aes(panjang_jalan, kondisi)) +
  facet_wrap(~status) +
  geom_density_ridges_gradient(
    aes(fill = after_stat(x)),
    show.legend = FALSE
  )
jalan_kabkota_plot +
  geom_vline(xintercept = 100, linetype = "dashed", colour = "darkslategray4") +
  scale_x_continuous(trans = "log10") +
  scale_fill_viridis_c(option = "magma") +
  labs(
    x = "Panjang jalan (Km)",
    y = NULL,
    title = "Jalan Kabupaten/Kota Berdasarkan Kondisi",
    subtitle =  "Berdasarkan data tahun 2008, garis vertikal menunjukan panjang jalan 100 Km",
    caption = "Data: INDO-DAPOER, The World Bank"
  ) +
  theme_ipsum(grid = FALSE,ticks = TRUE)

Health Facilities in Kalimantan

faskes_kalimantan <-
  indodapoer %>%
  filter(str_detect(area_name, "Kalimantan")) %>%
  filter(year == 2011) %>%
  transmute(
    provinsi = str_remove(area_name, ", Prop."),
    rumahsakit = number_of_hospitals,
    polindes = number_of_polindes_poliklinik_desa_village_polyclinic,
    puskesmas = number_of_puskesmas_and_its_line_services
  ) %>%
  pivot_longer(
    cols = -provinsi,
    names_to = "faskes",
    values_to = "jumlah"
  ) %>%
  filter(!is.na(jumlah)) %>%
  mutate(
    provinsi = fct_reorder(provinsi, jumlah, sum),
    jumlah = ceiling(jumlah / 10)
  )
glimpse(faskes_kalimantan)
Rows: 12
Columns: 3
$ provinsi <fct> Kalimantan Barat, Kalimantan Barat, Kalimantan Barat, K~
$ faskes   <chr> "rumahsakit", "polindes", "puskesmas", "rumahsakit", "p~
$ jumlah   <dbl> 4, 53, 98, 5, 11, 96, 3, 41, 75, 2, 22, 109

Con(text)

Here we build a word cloud with the ggwordcloud package and use a different font with the help of showtext.

The font is “Lacquer”, loaded from Google Fonts with font_add_google().

library(ggwordcloud)
library(showtext)

-- Column specification --------------------------------------------------
cols(
  hashtags = col_character(),
  count = col_double(),
  contains_data_word = col_logical()
)
glimpse(hashtags)
Rows: 60
Columns: 3
$ hashtags           <chr> "bigdata", "data", "datascience", "datascient~
$ count              <dbl> 489, 489, 489, 489, 489, 489, 489, 489, 489, ~
$ contains_data_word <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, T~
font_add_google("Lacquer")
showtext_auto()

hashtags %>%
  ggplot(
    aes(
      label = hashtags,
      size = count,
      colour = contains_data_word
    )
  ) +
  geom_text_wordcloud_area(family = "Lacquer", shape = "square") +
  scale_size_area(max_size = 20) +
  scale_colour_manual(values = c("#009AB3", "#B0E601")) +
  theme_void() +
  theme(plot.background = element_rect(fill = "#1E1E1E"))

Average Number of Likes per Day


-- Column specification --------------------------------------------------
cols(
  day = col_character(),
  is_weekend = col_logical(),
  nposts = col_double(),
  nlikes = col_double(),
  avglikes = col_double()
)
library(ggtext)

glimpse(igstats)
Rows: 7
Columns: 5
$ day        <chr> "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday~
$ is_weekend <lgl> TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, TRUE
$ nposts     <dbl> 10, 103, 78, 93, 80, 108, 23
$ nlikes     <dbl> 1237, 16055, 10496, 12803, 10474, 25595, 3828
$ avglikes   <dbl> 123.7000, 155.8738, 134.5641, 137.6667, 130.9250, 236~
igstats_plot <- 
  igstats %>%
  mutate(day = fct_reorder(day, avglikes)) %>%
  ggplot() +
  geom_segment(aes(
    x = 0,
    xend = avglikes,
    y = day,
    yend = day
  ),
  colour = "white",
  linetype = "longdash"
  ) +
  geom_point(
    aes(avglikes, day, fill = is_weekend),
    shape = "circle filled",
    size = 18,
    colour = "white",
    show.legend = FALSE
  ) +
  geom_text(
    aes(avglikes, day, label = round(avglikes)),
    colour = "white",
    family = "Lacquer",
    size = 7
  ) +
  geom_text(
    aes(x = 0, day, label = day),
    colour = "white",
    nudge_y = 0.15,
    hjust = "left",
    family = "Lacquer"
  ) +
  geom_curve(
    aes(
      x = 185,
      xend = 174,
      y = 6.3,
      yend = 6
    ),
    colour = "white",
    curvature = -0.3,
    arrow = arrow(length = unit(0.1, "inches"), type = "closed")
  ) +
  geom_curve(
    aes(
      x = 185,
      xend = 230,
      y = 6.8,
      yend = 7.2
    ),
    colour = "white",
    curvature = -0.25,
    arrow = arrow(length = unit(0.1, "inches"), type = "closed")
  ) +
  annotate(
    geom = "richtext",
    x = 200,
    y = 6.5,
    label = "<span style='color:Blue'>Blue</span> is weekday,<br><span style='color:Green'>green</span> is weekend",
    fill = NA,
    label.colour = NA,
    colour = "white",
    family = "Lacquer",
    size = 4
  ) +
  annotate(
    geom = "text",
    x = 200,
    y = 3,
    label = "How many\nlikes did \nI get?",
    colour = "white",
    hjust = "center",
    family = "Lacquer",
    size = 15
  ) +
  scale_fill_manual(values = c("Blue", "Green")) +
  theme_void() +
  theme(plot.background = element_rect(fill = "Black"))

igstats_plot

Identity

library(cowplot)

ggdraw(igstats_plot) +
   draw_image(
     image = "https://storage.googleapis.com/dqlab-dataset/assets/images/logo-dqlab.png",
     x = 0.425,
     y = -0.44,
     scale = 0.1
   )
Package `magick` is required to draw images. Image not drawn.

library(gganimate)
igcomments <- read.csv("C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/igcomments.csv")

glimpse(igcomments)
Rows: 495
Columns: 4
$ date       <chr> "2020-08-15", "2020-08-15", "2020-08-14", "2020-08-14~
$ hour       <int> 19, 14, 19, 14, 14, 13, 19, 14, 19, 19, 19, 14, 19, 1~
$ is_video   <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALS~
$ n_comments <int> 3, 3, 4, 6, 0, 22, 6, 18, 0, 8, 3, 8, 5, 3, 0, 28, 2,~
font_add_google(name = "Roboto Condensed")
showtext_auto()
igcomments_plot <- 
  igcomments %>%
  sample_frac() %>%
  mutate(
    frame = row_number(),
    label = format(date, format = "%e %b %y")
  ) %>%
  ggplot(aes(frame, hour, colour = is_video, size = n_comments)) +
  geom_jitter(alpha = 0.8, show.legend = FALSE) +
  scale_colour_manual(values = c("#009AB3", "#B0E601")) +
  scale_size_area(max_size = 12) +
  theme_modern_rc(
    base_family = "Roboto Condensed",
    plot_title_size = 13,
    plot_title_face = "plain",
    subtitle_size = 35,
    subtitle_face = "bold",
    caption_face = "italic"
  ) +
  theme(
    plot.title.position = "plot",
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    plot.caption = element_text(hjust = 0.5),
    axis.text = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
  ) +
  coord_polar()
igcomments_anim <-
  igcomments_plot +
  labs(
    title = "Constellation of Instagram contents!",
    subtitle = "{current_frame}",
    caption = "More comment make the star bigger\nGreen stars are video contents"
  ) +
  transition_manual(label, cumulative = TRUE) +
  enter_appear() +
  ease_aes("cubic-in-out")

Here is the result!

---
title: "ADVANCED DATA VISUALIZATION WITH 'ggplot2' USING R"
subtitle: 'by : DQLAB'
output:
  html_notebook:
    toc: yes
    toc_depth: 2
    toc_float:
      collapsed: no
      smooth_scroll: no
  html_document:
    toc: yes
    toc_depth: '2'
    df_print: paged
---

One of the packages used for data visualization in R is "ggplot2". It implements the _Grammar of Graphics_, whose principle is to construct a chart from a set of grammar-like rules. The same rules can produce scatter plots, line charts, bar charts, and more.

The following is the visualization workflow with the "ggplot2" package in R.

Prepare the "ggplot2" package. Install it first if you have never used it before.

After that, load ggplot2 with the library() function.

```{r}
#install.packages("ggplot2")
library("ggplot2")
```

# Writing ggplot Code

This case uses the "mtcars" data, which has the following variables:

- mpg : Miles/(US) gallon
- cyl : Number of cylinders
- disp : Displacement (cu.in.)
- hp : Gross horsepower
- drat : Rear axle ratio
- wt : Weight (1000 lbs)
- qsec : 1/4 mile time
- vs : Engine (0 = V-shaped, 1 = straight)
- am : Transmission (0 = automatic, 1 = manual)
- gear : Number of forward gears
- carb : Number of carburetors

The call to ggplot() can be written in several ways.

## First Way

```{r}
cara1 <- ggplot(data = mtcars, mapping = aes(x=mpg ,  y = cyl, color = drat)) +
  geom_point()
summary(cara1)
```

## Second Way

```{r}
cara2 <- ggplot(data = mtcars) + geom_point(mapping = aes(x=mpg ,  y = cyl, color = drat)) 
summary(cara2)
```

## Third Way

```{r}
cara3 <- ggplot() + 
  geom_point(
    data = mtcars, 
    mapping = aes(x=mpg ,  y = cyl, color = drat)
    ) 
summary(cara3)
```

All three ways draw the same chart. The output of summary(), however, differs between them, because each way stores the data and the mapping in a different place (in the plot itself or in the layer).
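
One way to check that the three forms draw the same chart is to compare the data that ggplot2 computes for the first layer. The sketch below assumes the `layer_data()` helper from ggplot2 and rebuilds the three objects so that it can run on its own.

```r
library(ggplot2)

# The three ways of writing the same scatter plot
cara1 <- ggplot(data = mtcars, mapping = aes(x = mpg, y = cyl, color = drat)) +
  geom_point()
cara2 <- ggplot(data = mtcars) +
  geom_point(mapping = aes(x = mpg, y = cyl, color = drat))
cara3 <- ggplot() +
  geom_point(data = mtcars, mapping = aes(x = mpg, y = cyl, color = drat))

# layer_data() returns the data frame that the first layer actually draws
drawn1 <- layer_data(cara1, 1)
drawn2 <- layer_data(cara2, 1)
drawn3 <- layer_data(cara3, 1)

# Compare the drawn data of the three plots
same_12 <- isTRUE(all.equal(drawn1, drawn2))
same_13 <- isTRUE(all.equal(drawn1, drawn3))
c(same_12, same_13)
```

If both comparisons are TRUE, the differences reported by summary() concern only where the data and the mapping are stored, not what is drawn.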

# Basic Components of a Chart

A chart has 3 basic components:

- Data: the information that will be drawn.
- Mapping: the choice of variables/columns that will be shown in the chart.
- Geometries: the visual representation of those variables/columns in the chart.

Besides ggplot(), there is a simpler function, qplot().

## The qplot() Function

qplot(<MAPPING>, data = <DATA>, geom = <GEOM>)

# Components of a Chart

The basic concept has 3 components, but a complete chart has 8.

The 8 components are data, mapping, statistic, scales, geometries, facets, coordinates, and theme.
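
As a rough illustration, the eight components can all be written explicitly in one call. The choice of variables, scale, facet, and theme below is arbitrary and not taken from the original notebook.

```r
library(ggplot2)

p <- ggplot(
  data = mtcars,                                 # 1. data
  mapping = aes(x = wt, y = mpg, colour = hp)    # 2. mapping
) +
  geom_point(stat = "identity") +                # 3. statistic and 5. geometry
  scale_colour_viridis_c() +                     # 4. scale
  facet_wrap(~am) +                              # 6. facets
  coord_cartesian(xlim = c(1, 6)) +              # 7. coordinates
  theme_minimal()                                # 8. theme

# Number of points that the layer will draw (one per car)
n_points <- nrow(layer_data(p, 1))
n_points
```

Printing `p` draws the chart. Most of these components have defaults, which is why the earlier examples only needed data, a mapping, and a geometry.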

# Data Transformation

Visualizing data with ggplot2 often requires data transformation tools to get the data into the shape the chart needs.

In R, the best-known collection of packages for data transformation is the tidyverse. It includes several packages that are very helpful here: dplyr, tidyr, readr, tibble, stringr, forcats, and purrr.

For example, the dplyr package provides the following functions:

- select()
- filter()
- arrange()
- mutate()
- summarise()
- group_by()

When a transformation consists of several steps, the steps are often chained with the pipe operator (%>%).

## Load the dplyr package

```{r}
library(dplyr)
glimpse(mtcars)
```

# Usage without the pipe (%>%)
```{r}
cars1 <- select(mtcars, mpg, cyl, qsec, drat, gear)
cars2 <- filter(cars1, between(qsec, 15.00, 18.00))
cars3 <- mutate(cars2, gear_per_second = gear/qsec)
cars4 <- group_by(cars3, cyl)
cars_nopipe <- summarise(cars4, avg_mpg = mean(mpg), gear_per_second=gear_per_second, gear=gear)

glimpse(cars_nopipe)
```

# Usage with the pipe (%>%)
```{r}
cars = select(mtcars, mpg, cyl, qsec, drat, gear) %>%
  filter(between(qsec, 15.00, 18.00)) %>%
  mutate(gear_per_second = gear/qsec) %>%
  group_by( cyl) %>%
  summarise(avg_mpg = mean(mpg), gear_per_second=gear_per_second, gear=gear)

glimpse(cars)
```

With the pipe, the same steps are written as one chain that reads from top to bottom, without the intermediate objects cars1 to cars4.
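
The pipe only changes how the calls are written, not what they compute. A small sketch of that equivalence follows; it uses a simplified summary (one mean per group), which is an assumption made for brevity and not the exact summary used above.

```r
library(dplyr)

# Nested function calls: read from the inside out
nested <- summarise(
  group_by(filter(mtcars, between(qsec, 15, 18)), cyl),
  avg_mpg = mean(mpg)
)

# The same steps with the pipe: read from top to bottom
piped <- mtcars %>%
  filter(between(qsec, 15, 18)) %>%
  group_by(cyl) %>%
  summarise(avg_mpg = mean(mpg))

# Both forms give the same table
same_result <- isTRUE(all.equal(nested, piped))
same_result
```

Here summarise() returns one row per group. Recent dplyr releases (1.1.0 and later) warn when summarise() returns more than one row per group and point to reframe() for that case.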

# Using the Indonesia Database for Policy and Economic Research

The Indonesia Database for Policy and Economic Research is abbreviated as INDO-DAPOER. It contains economic and social indicators at the province and city/district level in Indonesia. The data covers four main categories: fiscal, economic, social-demographic, and infrastructure.

## Import data
```{r, echo=FALSE}
library(readr)
indodapoer<- read_tsv("C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/indodapoer.tsv/indodapoer.tsv")
```

## Check the data
```{r}
glimpse(indodapoer)
# nrow(indodapoer)
# ncol(indodapoer)
```
So the INDO-DAPOER data has 22,468 rows and 222 columns.

# Wild Names and How to Tame Them

Next, we use the janitor package to rewrite the column names so that they follow R's rules for "syntactically valid names". Its clean_names() function does this in one call, which makes the columns easier to refer to in analysis and visualization code.

```{r}
library(janitor)

head(colnames(indodapoer), 15)
indodapoer <- clean_names(indodapoer)
head(colnames(indodapoer), 15)
```
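
The effect of clean_names() is easier to see on a tiny table. The data frame below is hypothetical. It reuses two of the original INDO-DAPOER column names, with made-up values.

```r
library(janitor)

# Hypothetical data frame that reuses two INDO-DAPOER column names
example_df <- data.frame(
  `Agriculture function expenditure (in IDR)` = c(1, 2),
  `Birth attended by Skilled Health worker (in % of total birth)` = c(90, 95),
  check.names = FALSE
)

# Before cleaning, a column can only be reached with backticks
example_df$`Agriculture function expenditure (in IDR)`

# After cleaning, the names are lower case, use underscores, and need no quoting
cleaned <- clean_names(example_df)
colnames(cleaned)
cleaned$agriculture_function_expenditure_in_idr
```

The cleaned names match the ones printed by the notebook for the full data set, including the replacement of "%" with "percent".
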
# Gross Regional Domestic Product

This section shows the gross regional domestic product (PDRB) of the provinces on the island of Java. Each province name in the data ends with ", Prop.", so that suffix is removed with _str_remove()_ from the stringr package.

```{r}
library(stringr)

pdrb_pjawa <- 
  indodapoer %>%
  filter(
    area_name %in% c(
      "Banten, Prop.",
      "DKI Jakarta, Prop.",
      "Jawa Barat, Prop.",
      "Jawa Tengah, Prop.",
      "DI Yogyakarta, Prop.",
      "Jawa Timur, Prop."
    )
  )%>%
  transmute(
    provinsi = str_remove(area_name, ", Prop."),
    tahun = year,
    pdrb_nonmigas = total_gdp_excluding_oil_and_gas_in_idr_million_constant_price) %>% 
    filter(!is.na(pdrb_nonmigas))

glimpse(pdrb_pjawa)
```
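
One detail of str_remove() is worth knowing: the pattern is treated as a regular expression, so the "." in ", Prop." matches any character. For these province names the result is the same either way. The sketch below, with one made-up string, shows where the two readings differ and how `fixed()` forces a literal match.

```r
library(stringr)

area <- c("Jawa Barat, Prop.", "DKI Jakarta, Prop.")

# Regular expression (the default): "." matches any single character
with_regex <- str_remove(area, ", Prop.")

# fixed() matches the text literally, including the final full stop
with_fixed <- str_remove(area, fixed(", Prop."))

with_regex
with_fixed

# Hypothetical string: the regex version also removes ", Propx"
odd_regex <- str_remove("Jawa Barat, Propx", ", Prop.")
odd_fixed <- str_remove("Jawa Barat, Propx", fixed(", Prop."))
c(odd_regex, odd_fixed)
```

Using fixed() is a defensive choice when the text to remove contains characters that have a special meaning in regular expressions.
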
# Non-Oil-and-Gas PDRB Chart

First version

```{r}
pdrb_pjawa%>%
  ggplot(aes(tahun, pdrb_nonmigas, colour = provinsi)) +
  geom_line()
```
Next, fct_reorder2() from the forcats package orders the provinces by their non-oil-and-gas PDRB in the last available year.

```{r}
library(forcats)

pdrb_pjawa%>%
  mutate(
    provinsi = fct_reorder2(provinsi, tahun, pdrb_nonmigas)
  ) %>%
  ggplot(aes(tahun, pdrb_nonmigas, colour = provinsi)) +
  geom_line()

```
Notice the difference: the legend now follows the order in which the lines end at the right edge of the chart.
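
What fct_reorder2() does to the factor levels can be checked on a small table. The series names and values below are hypothetical.

```r
library(forcats)

# Hypothetical values for three series observed in two years
toy <- data.frame(
  series = c("A", "A", "B", "B", "C", "C"),
  year   = c(2000, 2001, 2000, 2001, 2000, 2001),
  value  = c(1, 2, 5, 10, 3, 4)
)

# Without reordering, the levels are alphabetical
levels(factor(toy$series))

# Ordered by the value at the largest year, largest first
reordered <- fct_reorder2(toy$series, toy$year, toy$value)
levels(reordered)
```

In 2001 the values are 10 for B, 4 for C, and 2 for A, so the levels become B, C, A. ggplot2 uses this level order for the legend.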

# Direct Labeling

Direct labeling places each province name next to its line, which makes every line easier to identify.

```{r,fig.height=10, fig.width=25}
library(directlabels)

pdrb_pjawa %>% 
  ggplot(aes(tahun, pdrb_nonmigas)) +
  geom_line(aes(colour = provinsi), show.legend = FALSE) +
  geom_dl(
    aes(label = provinsi), 
    method = "last.points",
    position = position_nudge(x = 0.3) # keep the text from touching the line
  )
```
# Finalizing the Chart



```{r, warning=FALSE, fig.height=10, fig.width=15}
library(hrbrthemes)

pdrb_pjawa %>% 
  ggplot(aes(tahun, pdrb_nonmigas/1e6)) +
  geom_line(aes(colour = provinsi), show.legend = FALSE) +
  geom_dl(
    aes(label = provinsi), 
    method = "last.points",
    position = position_nudge(x = 0.3) # keep the text from touching the line
  ) +
  labs(
    x = NULL,
    y = NULL,
    title = "PDRB Non-Migas di Pulau Jawa Hingga Tahun 2011",
    subtitle = "PDRB atas dasar harga konstan, dalam satuan triliun",
    caption = "Data: INDO-DAPOER, The World Bank"
  ) +
  coord_cartesian(clip = "off") +
  theme_ipsum(grid = "Y", ticks = TRUE)
```
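
The division by 1e6 in the chart above is a unit conversion: the source column is in millions of rupiah, and one million millions is one trillion. A minimal sketch with made-up values (pdrb_juta and pdrb_triliun are hypothetical names):

```{r}
# Hypothetical GRDP values in IDR million
pdrb_juta <- c(2.5e8, 1.2e9)

# IDR million / 1e6 = IDR trillion
pdrb_triliun <- pdrb_juta / 1e6

pdrb_triliun
```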

# How Large?

```{r}
luas_provinsi <- 
  indodapoer %>% 
  filter(str_detect(area_name, "Prop")) %>% 
  filter(year == 2009) %>%
  transmute(
    provinsi = str_remove(area_name, ", Prop."),
    luas_wilayah = total_area_in_km2
  )
glimpse(luas_provinsi)
```
# Comparing Province Areas

The next visualization is a treemap, which we can build with the "treemapify" package.

First, let's check whether the data is clean. We can do this with the profile_missing() function from the DataExplorer package, which reports the number of missing values in each column.

```{r}
library(DataExplorer)
luas_provinsi
profile_missing(luas_provinsi)
luas_provinsi <- na.omit(luas_provinsi)
```

profile_missing() reports one missing value (num_missing = 1) in luas_wilayah, so that row is dropped with na.omit().
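
The same check can be sketched in base R, without DataExplorer. The toy data frame below is made up for illustration.

```{r}
# Hypothetical data with one missing area value
toy <- data.frame(
  provinsi     = c("A", "B", "C"),
  luas_wilayah = c(100, NA, 250)
)

# Count missing values per column
n_missing <- colSums(is.na(toy))

# Drop every row that contains at least one NA
toy_clean <- na.omit(toy)

n_missing
nrow(toy_clean)
```

na.omit() drops whole rows, so it fits here because a province without an area cannot be drawn in a treemap anyway.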


```{r}
library(treemapify)

luas_provinsi %>% 
  ggplot(aes(area = luas_wilayah)) +
  geom_treemap() +
  geom_treemap_text(aes(label = provinsi))
```

# Customizing the Chart

```{r}
library(scales)

luas_provinsi %>% 
  ggplot(aes(
    area = luas_wilayah, 
    fill = luas_wilayah)
  ) +
  geom_treemap() +
  geom_treemap_text(
    aes(label = provinsi), 
    family = "Arial Narrow",
    colour = "white",
    reflow = TRUE,
    grow = TRUE
  ) +
  scale_fill_viridis_c(
    guide = guide_colourbar(
      barwidth = 30,
      barheight = 0.8
    ),
    labels = label_number(
      big.mark = ".", 
      decimal.mark = ",", 
      suffix = " km2")
  ) +
  labs(
    fill = "Luas\nwilayah",
    title = "Perbandingan Luas 33 Provinsi di Indonesia",
    subtitle = "Berdasarkan data tahun 2009, sehingga Kalimantan Utara tidak tercantum dalam grafik",
    caption = "Data: INDO-DAPOER, The World Bank"
  ) +
  theme_ipsum() +
  theme(legend.position = "bottom")
```
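
To preview what the legend labels look like, label_number() can be called on its own: it returns a function that formats numbers. In this sketch the input values are made up, and accuracy = 1 is added (it is not in the chart code above) so that the output is rounded to whole numbers.

```{r}
library(scales)

# Formatter with Indonesian-style separators; accuracy = 1 is an
# assumption added here to force whole-number output
format_km2 <- label_number(
  accuracy = 1,
  big.mark = ".",
  decimal.mark = ",",
  suffix = " km2"
)

# Hypothetical areas in square kilometres
km2_labels <- format_km2(c(5000, 1250000))

km2_labels
```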


# This Journey

This visualization shows the condition of road infrastructure across all regencies (kabupaten) and cities (kota) in Indonesia.

## Prepare the data

```{r}
jalan_kabkota <- 
  indodapoer %>% 
  filter(str_detect(area_name, ", Prop.", negate = TRUE)) %>% 
  filter(year == 2008) %>%
  transmute(
    kabkota = area_name,
    jalan_rusak_parah = length_of_district_road_bad_damage_in_km_bina_marga_data,
    jalan_rusak_ringan = length_of_district_road_light_damage_in_km_bina_marga_data,
    jalan_cukup_baik = length_of_district_road_fair_in_km_bina_marga_data,
    jalan_sangat_baik = length_of_district_road_good_in_km_bina_marga_data
  )
glimpse(jalan_kabkota)
```
# Pivot

Reshape the data from wide to long format with the pivot_longer() function from the tidyr package, so that each row holds one road condition and its length.

```{r}
library(tidyr)

glimpse(jalan_kabkota)
jalan_kabkota <- 
  jalan_kabkota %>% 
  pivot_longer(
    cols = starts_with("jalan_"),
    names_to = "kondisi",
    names_prefix = "jalan_",
    values_to = "panjang_jalan"
  )

glimpse(jalan_kabkota)
```
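
A minimal sketch of the same reshaping on a made-up two-row data frame shows what names_prefix does: the "jalan_" prefix is stripped from the column names before they become values of kondisi. The area names and lengths are hypothetical.

```{r}
library(tidyr)

# Hypothetical wide data: one column per road condition
toy <- data.frame(
  kabkota           = c("X, Kab.", "Y, Kota"),
  jalan_rusak_parah = c(10, 2),
  jalan_sangat_baik = c(50, 80)
)

# One row per (kabkota, kondisi) pair
toy_long <- pivot_longer(
  toy,
  cols = starts_with("jalan_"),
  names_to = "kondisi",
  names_prefix = "jalan_",
  values_to = "panjang_jalan"
)

toy_long
```

Two rows and two condition columns become four rows, and the total road length is unchanged.
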
# The Next Step

```{r}
jalan_kabkota <-
  jalan_kabkota %>%
  mutate(
    status = case_when(
      str_detect(kabkota, ", Kab") ~ "Kabupaten",
      str_detect(kabkota, ", Kota") ~ "Kota",
      str_detect(kabkota, "City") ~ "Kota",
      TRUE ~ NA_character_
    ),
    kondisi = factor(
      kondisi,
      levels = c("rusak_parah", "rusak_ringan", "cukup_baik", "sangat_baik"),
      labels = c("Rusak parah", "Rusak ringan", "Cukup baik", "Sangat baik")
    )
  )
glimpse(jalan_kabkota)
```
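
The classification step can be tried on its own. In this sketch the area names are made up to cover each branch: case_when() checks the conditions from top to bottom and uses the first one that matches, and anything that matches none of them becomes NA.

```{r}
library(dplyr)
library(stringr)

# Hypothetical area names, one per branch
kabkota <- c("Bogor, Kab.", "Bogor, Kota", "Batam City", "Indonesia")

status <- case_when(
  str_detect(kabkota, ", Kab") ~ "Kabupaten",
  str_detect(kabkota, ", Kota") ~ "Kota",
  str_detect(kabkota, "City") ~ "Kota",
  TRUE ~ NA_character_
)

status
```
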
# Road Condition Chart

Here we use a ridgeline plot, which is well suited to showing how the distribution of a numeric variable changes across groups.

```{r}
library(ggridges)

jalan_kabkota_plot <- 
  jalan_kabkota %>% 
  ggplot(aes(panjang_jalan, kondisi)) +
  facet_wrap(~status) +
  geom_density_ridges_gradient(
    aes(fill = after_stat(x)), 
    show.legend = FALSE
  )
jalan_kabkota_plot
```

# Logarithmic Transformation

To make the plot easier to read, we apply a log10 transformation to the x-axis. The result is as follows:

```{r}
jalan_kabkota_plot <-
  jalan_kabkota %>%
  ggplot(aes(panjang_jalan, kondisi)) +
  facet_wrap(~status) +
  geom_density_ridges_gradient(
    aes(fill = after_stat(x)),
    show.legend = FALSE
  )
jalan_kabkota_plot +
  geom_vline(xintercept = 100, linetype = "dashed", colour = "darkslategray4") +
  scale_x_continuous(trans = "log10")
```
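
As a rough illustration of why a log10 axis helps, the sketch below uses made-up road lengths that span several orders of magnitude. On a log10 scale they become evenly spaced, and the 100 km reference line sits at 2. A length of zero maps to -Inf, so such rows cannot be placed on a log axis.

```{r}
# Hypothetical road lengths in km
panjang <- c(1, 10, 100, 1000)

# Position of each value on a log10 scale
panjang_log <- log10(panjang)

# A zero length has no finite position on a log10 axis
zero_log <- log10(0)

panjang_log
zero_log
```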

# Finalization

```{r}
jalan_kabkota_plot <-
  jalan_kabkota %>%
  ggplot(aes(panjang_jalan, kondisi)) +
  facet_wrap(~status) +
  geom_density_ridges_gradient(
    aes(fill = after_stat(x)),
    show.legend = FALSE
  )
jalan_kabkota_plot +
  geom_vline(xintercept = 100, linetype = "dashed", colour = "darkslategray4") +
  scale_x_continuous(trans = "log10") +
  scale_fill_viridis_c(option = "magma") +
  labs(
    x = "Panjang jalan (Km)",
    y = NULL,
    title = "Jalan Kabupaten/Kota Berdasarkan Kondisi",
    subtitle =  "Berdasarkan data tahun 2008, garis vertikal menunjukan panjang jalan 100 Km",
    caption = "Data: INDO-DAPOER, The World Bank"
  ) +
  theme_ipsum(grid = FALSE,ticks = TRUE)
```

# Health Facilities in Kalimantan

```{r}
faskes_kalimantan <-
  indodapoer %>%
  filter(str_detect(area_name, "Kalimantan")) %>%
  filter(year == 2011) %>%
  transmute(
    provinsi = str_remove(area_name, ", Prop."),
    rumahsakit = number_of_hospitals,
    polindes = number_of_polindes_poliklinik_desa_village_polyclinic,
    puskesmas = number_of_puskesmas_and_its_line_services
  ) %>%
  pivot_longer(
    cols = -provinsi,
    names_to = "faskes",
    values_to = "jumlah"
  ) %>%
  filter(!is.na(jumlah)) %>%
  mutate(
    provinsi = fct_reorder(provinsi, jumlah, sum),
    jumlah = ceiling(jumlah / 10)
  )
glimpse(faskes_kalimantan)
```
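
The last mutate() does two things: it orders the provinces by their total number of facilities, and it rescales the counts into units of ten, rounded up. A minimal sketch on made-up numbers (toy, prov, and jumlah_per10 are hypothetical names):

```{r}
library(forcats)

# Hypothetical facility counts for two provinces
toy <- data.frame(
  provinsi = c("A", "A", "B", "B"),
  jumlah   = c(120, 35, 8, 11)
)

# Order levels by the sum of jumlah per province, smallest first
prov <- fct_reorder(toy$provinsi, toy$jumlah, .fun = sum)

# One unit per ten facilities, rounded up
jumlah_per10 <- ceiling(toy$jumlah / 10)

levels(prov)
jumlah_per10
```

Province B has the smaller total (19 against 155), so it comes first in the level order.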

# Con(text)

Here we build a word cloud with the ggwordcloud package. We also use a different font with the help of the showtext package.

The font is "Lacquer", loaded from Google Fonts with the font_add_google() function.

```{r}
library(ggwordcloud)
library(showtext)
```

```{r, echo=FALSE}
hashtags <- read_csv("C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/hashtags.csv")

```

```{r}
glimpse(hashtags)
```


```{r, warning=FALSE}
font_add_google("Lacquer")
showtext_auto()

hashtags %>%
  ggplot(
    aes(
      label = hashtags,
      size = count,
      colour = contains_data_word
    )
  ) +
  geom_text_wordcloud_area(family = "Lacquer", shape = "square") +
  scale_size_area(max_size = 20) +
  scale_colour_manual(values = c("#009AB3", "#B0E601")) +
  theme_void() +
  theme(plot.background = element_rect(fill = "#1E1E1E"))
```
# Average Number of Likes per Day

```{r, echo=FALSE}
igstats <- read_csv("C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/igstats.csv")
```



```{r, fig.height=5, fig.width=15}
library(ggtext)

glimpse(igstats)
igstats_plot <- 
  igstats %>%
  mutate(day = fct_reorder(day, avglikes)) %>%
  ggplot() +
  geom_segment(aes(
    x = 0,
    xend = avglikes,
    y = day,
    yend = day
  ),
  colour = "white",
  linetype = "longdash"
  ) +
  geom_point(
    aes(avglikes, day, fill = is_weekend),
    shape = "circle filled",
    size = 18,
    colour = "white",
    show.legend = FALSE
  ) +
  geom_text(
    aes(avglikes, day, label = round(avglikes)),
    colour = "white",
    family = "Lacquer",
    size = 7
  ) +
  geom_text(
    aes(x = 0, day, label = day),
    colour = "white",
    nudge_y = 0.15,
    hjust = "left",
    family = "Lacquer"
  ) +
  geom_curve(
    aes(
      x = 185,
      xend = 174,
      y = 6.3,
      yend = 6
    ),
    colour = "white",
    curvature = -0.3,
    arrow = arrow(length = unit(0.1, "inches"), type = "closed")
  ) +
  geom_curve(
    aes(
      x = 185,
      xend = 230,
      y = 6.8,
      yend = 7.2
    ),
    colour = "white",
    curvature = -0.25,
    arrow = arrow(length = unit(0.1, "inches"), type = "closed")
  ) +
  annotate(
    geom = "richtext",
    x = 200,
    y = 6.5,
    label = "<span style='color:Blue'>Blue</span> is weekday,<br><span style='color:Green'>green</span> is weekend",
    fill = NA,
    label.colour = NA,
    colour = "white",
    family = "Lacquer",
    size = 4
  ) +
  annotate(
    geom = "text",
    x = 200,
    y = 3,
    label = "How many\nlikes did \nI get?",
    colour = "white",
    hjust = "center",
    family = "Lacquer",
    size = 15
  ) +
  scale_fill_manual(values = c("Blue", "Green")) +
  theme_void() +
  theme(plot.background = element_rect(fill = "Black"))

igstats_plot
```
# Identity

```{r, fig.height=5, fig.width=15}
library(cowplot)

ggdraw(igstats_plot) +
   draw_image(
     image = "https://storage.googleapis.com/dqlab-dataset/assets/images/logo-dqlab.png",
     x = 0.425,
     y = -0.44,
     scale = 0.1
   )
```

```{r}
library(gganimate)
igcomments <- read.csv("C:/Users/ACER/Documents/BRANDING/DQLAB/Porto/New folder/igcomments.csv")

glimpse(igcomments)
```
```{r}
font_add_google(name = "Roboto Condensed")
showtext_auto()
```


```{r}
igcomments_plot <- 
  igcomments %>%
  sample_frac() %>%
  mutate(
    frame = row_number(),
    label = format(date, format = "%e %b %y")
  ) %>%
  ggplot(aes(frame, hour, colour = is_video, size = n_comments)) +
  geom_jitter(alpha = 0.8, show.legend = FALSE) +
  scale_colour_manual(values = c("#009AB3", "#B0E601")) +
  scale_size_area(max_size = 12) +
  theme_modern_rc(
    base_family = "Roboto Condensed",
    plot_title_size = 13,
    plot_title_face = "plain",
    subtitle_size = 35,
    subtitle_face = "bold",
    caption_face = "italic"
  ) +
  theme(
    plot.title.position = "plot",
    plot.title = element_text(hjust = 0.5),
    plot.subtitle = element_text(hjust = 0.5),
    plot.caption = element_text(hjust = 0.5),
    axis.text = element_blank(),
    axis.text.x = element_blank(),
    axis.text.y = element_blank(),
    axis.title.x = element_blank(),
    axis.title.y = element_blank()
  ) +
  coord_polar()
```

```{r}
igcomments_anim <-
  igcomments_plot +
  labs(
    title = "Constellation of Instagram contents!",
    subtitle = "{current_frame}",
    caption = "More comments make the star bigger\nGreen stars are video contents"
  ) +
  transition_manual(label, cumulative = TRUE) +
  enter_appear() +
  ease_aes("cubic-in-out")
```

```{r, echo=FALSE}
animate(igcomments_anim, renderer = gifski_renderer("IG Comments.gif"))
```
![Here is the result!](C:/Users/ACER/Documents/IG Comments.gif)









