LBB: Data Visualization

Syahvan Alviansyah Diva Ritonga

2023-06-12

1. Introduction

Alfamart and Indomaret are the two largest minimarket chains in Indonesia. Established in 1999, Alfamart is part of the Alfamidi Group, while Indomaret, founded in 1988, is owned by the Salim Group. Both chains sell everyday necessities through extensive store networks across Indonesia and compete on service and price. This dataset describes the Alfamart and Indomaret stores in South Jakarta. For each store it provides the coordinates (latitude and longitude), the Google Maps rating, the number of users who rated it, and address details. The raw data was collected from the Google Maps API and is published on Kaggle: https://www.kaggle.com/datasets/ilhammukti/data-alfamart-indomaret

2. Data Preprocessing

2.1 Import library and dataset

library(dplyr) # for data transformation
library(plotly) # for making plots interactive (also attaches ggplot2)
library(glue) # for custom tooltip text in interactive plots
library(scales) # for customizing axis labels and similar annotations
library(tidyr) # for reshaping and tidying data
library(ggpubr) # for exporting plots

minimarket <- read.csv("Data_Alfamart Indomaret_South Jakarta.csv")
dim(minimarket)
#> [1] 675  12

The output above shows that the dataset consists of 675 rows and 12 columns.

2.2 Convert columns to appropriate data types

glimpse(minimarket)
#> Rows: 675
#> Columns: 12
#> $ nama_tempat        <chr> "indomaret", "indomaret", "indomaret bdn raya", "in…
#> $ rating_tempat      <dbl> 4.3, 0.0, 4.2, 4.7, 4.3, 0.0, 4.8, 4.2, 0.0, 4.3, 4…
#> $ user_ratings_total <int> 104, 0, 66, 3, 108, 0, 8, 22, 0, 6, 26, 6, 4, 0, 10…
#> $ latitude           <dbl> -6.302203, -6.307003, -6.279392, -6.278223, -6.3013…
#> $ longitude          <dbl> 106.7919, 106.7937, 106.7984, 106.7971, 106.7805, 1…
#> $ alamat_tempat      <chr> "jl. lb. bulus iii no.40, rt.9/rw.7, cilandak bar.,…
#> $ place_id           <chr> "ChIJaxj8TyDuaS4RmJ8rDkpB2GA", "ChIJPz4W5BPvaS4RoUn…
#> $ store              <chr> "Indomaret", "Indomaret", "Indomaret", "Indomaret",…
#> $ place_id.1         <chr> "ChIJaxj8TyDuaS4RmJ8rDkpB2GA", "ChIJPz4W5BPvaS4RoUn…
#> $ nama_kelurahan     <chr> "Cilandak Barat", "Pondok Labu", "Cilandak Barat", …
#> $ nama_kecamatan     <chr> "Cilandak", "Cilandak", "Cilandak", "Cilandak", "Ci…
#> $ nama_kota          <chr> "Kota Jakarta Selatan", "Kota Jakarta Selatan", "Ko…

Several columns need to be converted into factors, namely store, nama_kelurahan, and nama_kecamatan. In addition, the place_id.1 column is removed because it duplicates place_id, and the nama_kota column is removed because every store is located in the same city, South Jakarta.

minimarket <- minimarket %>% 
  select(-place_id.1, -nama_kota) %>% 
  mutate(nama_kecamatan = as.factor(nama_kecamatan),
         nama_kelurahan = as.factor(nama_kelurahan),
         store = as.factor(store))
glimpse(minimarket)
#> Rows: 675
#> Columns: 10
#> $ nama_tempat        <chr> "indomaret", "indomaret", "indomaret bdn raya", "in…
#> $ rating_tempat      <dbl> 4.3, 0.0, 4.2, 4.7, 4.3, 0.0, 4.8, 4.2, 0.0, 4.3, 4…
#> $ user_ratings_total <int> 104, 0, 66, 3, 108, 0, 8, 22, 0, 6, 26, 6, 4, 0, 10…
#> $ latitude           <dbl> -6.302203, -6.307003, -6.279392, -6.278223, -6.3013…
#> $ longitude          <dbl> 106.7919, 106.7937, 106.7984, 106.7971, 106.7805, 1…
#> $ alamat_tempat      <chr> "jl. lb. bulus iii no.40, rt.9/rw.7, cilandak bar.,…
#> $ place_id           <chr> "ChIJaxj8TyDuaS4RmJ8rDkpB2GA", "ChIJPz4W5BPvaS4RoUn…
#> $ store              <fct> Indomaret, Indomaret, Indomaret, Indomaret, Indomar…
#> $ nama_kelurahan     <fct> Cilandak Barat, Pondok Labu, Cilandak Barat, Gandar…
#> $ nama_kecamatan     <fct> Cilandak, Cilandak, Cilandak, Cilandak, Cilandak, C…
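When several columns get the same conversion, dplyr's across() can express it in one call. The following is a minimal sketch on a small made-up data frame (toy and toy_clean are hypothetical names, and the values are illustrative only), not a replacement for the code above.

```r
library(dplyr)

# Toy stand-in for the minimarket data (made-up rows, for illustration only)
toy <- data.frame(
  store = c("Indomaret", "Alfamart", "Indomaret"),
  nama_kelurahan = c("Cilandak Barat", "Pondok Labu", "Cilandak Barat"),
  nama_kecamatan = c("Cilandak", "Cilandak", "Cilandak"),
  nama_kota = "Kota Jakarta Selatan",
  stringsAsFactors = FALSE
)

# Confirm that nama_kota holds a single value before dropping it
n_kota <- n_distinct(toy$nama_kota)

# Drop the constant column and convert the three columns to factors at once
toy_clean <- toy %>%
  select(-nama_kota) %>%
  mutate(across(c(store, nama_kelurahan, nama_kecamatan), as.factor))

glimpse(toy_clean)
```

Checking n_distinct() first makes the reason for dropping a column verifiable rather than assumed.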

2.3 Check for missing values in the data frame

minimarket %>% 
  is.na %>% 
  colSums()
#>        nama_tempat      rating_tempat user_ratings_total           latitude 
#>                  0                  0                  0                  0 
#>          longitude      alamat_tempat           place_id              store 
#>                  0                  0                  0                  0 
#>     nama_kelurahan     nama_kecamatan 
#>                  0                  0

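There are no missing values in the dataset. Note, however, that some stores have a rating_tempat of 0.0 together with a user_ratings_total of 0, which indicates stores that have not been rated yet rather than stores that were rated zero.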
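An NA check does not catch values that are recorded as 0 but really mean "no data". The sketch below, on a made-up data frame (toy and toy_na are hypothetical names), counts stores with no user ratings and shows one way to mark their rating as NA. Treating a zero rating count as a missing rating is an assumption about how the Google Maps data was exported.

```r
library(dplyr)

# Toy stand-in with two unrated stores (made-up values)
toy <- data.frame(
  rating_tempat = c(4.3, 0.0, 4.2, 0.0),
  user_ratings_total = c(104L, 0L, 66L, 0L)
)

# Number of stores that nobody has rated
n_unrated <- toy %>%
  filter(user_ratings_total == 0) %>%
  nrow()

# Assumption: a store with zero ratings has an unknown rating, not a rating of 0
toy_na <- toy %>%
  mutate(rating_tempat = if_else(user_ratings_total == 0, NA_real_, rating_tempat))

colSums(is.na(toy_na))
```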

3. Data Visualization

Alfamart vs Indomaret: Who has the most stores in South Jakarta?

# Define a custom theme that is reused by all plots
my_theme <- theme(legend.position = "right",
                  legend.direction = "vertical",
                  legend.key = element_rect(fill="transparent"),
                  legend.background = element_rect(fill="transparent"),
                  legend.title = element_text(size=6, color="black", family = "Montserrat"),
                  plot.subtitle = element_text(size=6, color="black", family = "Montserrat"),
                  panel.background = element_rect(fill="transparent"),
                  panel.border = element_rect(fill=NA),
                  panel.grid = element_line(alpha(colour = "#C3C3C3",alpha = 0.4), linetype=2),
                  plot.background = element_rect(fill="transparent"),
                  text = element_text(color="black", family = "Montserrat"),
                  axis.text = element_text(color="black", family = "Montserrat")
)

# Count stores and compute the mean rating for each chain
jumlah_toko <- minimarket %>% 
  group_by(store) %>% 
  summarise(jumlah = n(), rating = mean(rating_tempat))

plot_toko <- jumlah_toko %>% 
  ggplot(mapping=aes(x = jumlah,
                     y = store)) +
  geom_col(aes(fill = store), position = "dodge") +
  labs(title = "Number of Alfamart and Indomaret Stores in South Jakarta",
       x = NULL,
       y = "Store") +
  my_theme

plot_toko

Insight: Indomaret has more stores than Alfamart in South Jakarta.
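The plotly and glue packages loaded earlier can turn a static bar chart like this one into an interactive chart with a custom tooltip. The following is a minimal sketch with hypothetical counts (toy_counts, toy_plot, and toy_interactive are illustrative names, and the numbers are not the real store counts).

```r
library(dplyr)
library(ggplot2)
library(plotly)
library(glue)

# Hypothetical counts, for illustration only (not the real dataset values)
toy_counts <- data.frame(
  store = c("Alfamart", "Indomaret"),
  jumlah = c(10L, 15L),
  stringsAsFactors = FALSE
)

# Build the tooltip text with glue, then map it to the `text` aesthetic
toy_plot <- toy_counts %>%
  mutate(label = glue("Store: {store}<br>Number of stores: {jumlah}")) %>%
  ggplot(aes(x = jumlah, y = store, text = label)) +
  geom_col(aes(fill = store)) +
  labs(x = NULL, y = "Store")

# tooltip = "text" makes the hover box show only the custom label
toy_interactive <- ggplotly(toy_plot, tooltip = "text")
toy_interactive
```

The same pattern should carry over to the other bar charts in this report by swapping in their summary tables.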

Store counts by kecamatan (district)

kecamatan <- minimarket %>% 
  group_by(store, nama_kecamatan) %>% 
  summarise(.groups = 'drop', jumlah = n()) %>% 
  arrange(desc(jumlah))

plot_kecamatan <- kecamatan %>% 
  ggplot(mapping=aes(x = jumlah,
                     y = nama_kecamatan)) +
  geom_col(aes(fill = store), position = "dodge") +
  labs(title = "Number of Alfamart and Indomaret Stores by Kecamatan",
       x = NULL,
       y = "Kecamatan") +
  my_theme

plot_kecamatan

Insight:

  • Alfamart has the most stores in Jagakarsa District.
  • Indomaret has the most stores in Kebayoran Lama District.
  • Jagakarsa has the highest combined number of Alfamart and Indomaret stores of any district. One possible explanation is that Jagakarsa is largely residential.
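These three observations can also be read from a table rather than from the chart. The sketch below reshapes per-district counts with tidyr's pivot_wider() and picks the leading district for each chain and for the combined total. The data frame and its numbers are made up for illustration (toy_kecamatan, toy_wide, and the top_* variables are hypothetical names).

```r
library(dplyr)
library(tidyr)

# Made-up per-district counts in the same shape as the kecamatan table
toy_kecamatan <- data.frame(
  store = c("Alfamart", "Indomaret", "Alfamart", "Indomaret"),
  nama_kecamatan = c("Jagakarsa", "Jagakarsa", "Kebayoran Lama", "Kebayoran Lama"),
  jumlah = c(8L, 6L, 3L, 7L),
  stringsAsFactors = FALSE
)

# One row per district, one column per chain, plus a combined total
toy_wide <- toy_kecamatan %>%
  pivot_wider(names_from = store, values_from = jumlah, values_fill = 0) %>%
  mutate(total = Alfamart + Indomaret) %>%
  arrange(desc(total))

# District leading each chain, and the district with the largest combined total
top_alfamart  <- toy_wide$nama_kecamatan[which.max(toy_wide$Alfamart)]
top_indomaret <- toy_wide$nama_kecamatan[which.max(toy_wide$Indomaret)]
top_combined  <- toy_wide$nama_kecamatan[1]

toy_wide
```

The wide layout makes it clear that a district can lead the combined total without leading both chains.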

Alfamart vs Indomaret: Who has the best-rated stores in South Jakarta?

plot_rating <- jumlah_toko %>% 
  ggplot(mapping=aes(x = rating,
                     y = store)) +
  geom_col(aes(fill = store), position = "dodge") +
  labs(title = "Average Rating of Alfamart and Indomaret in South Jakarta",
       x = NULL,
       y = "Store") +
  my_theme

plot_rating

Insight:

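  • Indomaret has a higher average Google Maps rating than Alfamart.
  • One possible explanation is that Indomaret offers a more complete product range and better service, although this dataset cannot confirm it. Indomaret also has more stores than Alfamart.
  • The average includes stores with a rating of 0.0 and no user ratings, which pulls the mean down for any chain that has unrated stores.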
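Because unrated stores are stored with a rating of 0.0, a plain mean can understate a chain's rating. The sketch below compares three summaries on made-up data: the plain mean, the mean over rated stores only, and a mean weighted by the number of user ratings. All values and the names toy and toy_rating are hypothetical, and which summary is most appropriate is a judgment call.

```r
library(dplyr)

# Made-up stores: one Alfamart store has no ratings and is recorded as 0.0
toy <- data.frame(
  store = c("Alfamart", "Alfamart", "Indomaret", "Indomaret"),
  rating_tempat = c(4.0, 0.0, 4.5, 3.5),
  user_ratings_total = c(10L, 0L, 30L, 10L),
  stringsAsFactors = FALSE
)

toy_rating <- toy %>%
  group_by(store) %>%
  summarise(
    # Plain mean, as used in the report (unrated stores count as 0)
    rating_all = mean(rating_tempat),
    # Mean over stores that have at least one user rating
    rating_rated = mean(rating_tempat[user_ratings_total > 0]),
    # Mean weighted by how many users rated each store
    rating_weighted = weighted.mean(rating_tempat, user_ratings_total),
    .groups = "drop"
  )

toy_rating
```

In this toy example the plain mean separates the two chains widely, while the mean over rated stores is identical, so the choice of summary can change the conclusion.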

4. Conclusion

In South Jakarta, Indomaret leads Alfamart on both measures examined here: it has more stores and a higher average Google Maps rating.