1 Background

PT Mitsubishi Motors Krama Yudha Indonesia (PT.MMKI) is an automotive company that owns an assembly plant located in Cikarang, West Java. This plant is known as the Mitsubishi Motors Indonesia Plant.

This analysis offers users the opportunity to explore comprehensive visualizations of production data for Mitsubishi vehicles, specifically the Pajero, Expander, and L300 models provided by PT. MMKI and intended for distribution in Indonesia. By utilizing the visualizations and insights generated by this analysis, PT.MMKI can make informed business decisions and formulate strategies that are aligned with market demands and production trends. This knowledge can help PT.MMKI optimize production planning, identify areas for improvement, and stay competitive in the Indonesian automotive market.

2 Data Preprocessing

2.1 Checking Data Structure

This dataset provides vehicle production data from April 2017 to July 2023, with a focus on domestic sales. The dataset includes detailed information about the number of vehicles produced during this period.Let’s create a DataFrame from a CSV file and display the first six rows to get a general overview of the data

handovers <- read.csv("data_input/handover.csv")
head(handovers)

The glimpse() function is a function in the dplyr package in R. It is used to provide a concise overview of the data structure, displaying information about each column in a data frame.Let’s take a look at the data structure using the glimpse() function:

glimpse(handovers)
#> Rows: 109,667
#> Columns: 7
#> $ Variant.Name <chr> "DAKAR Ultimate", "DAKAR Ultimate", "DAKAR Ultimate RLESS…
#> $ Variant      <chr> "KR1WGUFFPRV", "KR1WGUFFPRV5", "KR1WGUBFPRV", "KR1WGUBFPR…
#> $ Date         <chr> "2017-04-01", "2017-04-01", "2017-04-01", "2017-04-01", "…
#> $ Qty.Prod     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, …
#> $ Year         <dbl> 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 2017, 201…
#> $ Month        <dbl> 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, …
#> $ Model        <chr> "QX", "QX", "QX", "QX", "QX", "QX", "QX", "QX", "QX", "QX…

Based on the provided data structure, here is the summary:

  • Rows: 109,667
  • Columns: 7

The data frame has the following columns:

  • Variant.Name: Represents the name of the vehicle model variation.
  • Variant: Indicates the code or identifier for the specific variation of the vehicle.
  • Date: Specifies the date of production when the vehicle is ready for distribution.
  • Qty.Prod: Indicates the quantity of vehicles produced.
  • Year: Represents the production year of the vehicles.
  • Month: Indicates the production month of the vehicles.
  • Model: Represents the specific model of the vehicles produced at PT.MMKI.

The descriptions above provide a clearer understanding of each column and the type of information it holds in the dataset.

2.2 Changing Data Types

Understanding the data types of each column is crucial for performing accurate data analysis and applying appropriate data manipulation techniques.By observing the data, we can change the Date column to the date data type. This is useful for easy manipulation and analysis of dates.

handovers$Date <- as.Date(handovers$Date, format = "%Y-%m-%d")

2.3 Handling NA (Not Available) Values in the Dataset

we need to handle the NA (Not Available) values in the dataset. NA values indicate missing or unavailable data.

colSums(is.na(handovers))
#> Variant.Name      Variant         Date     Qty.Prod         Year        Month 
#>            0            0           97            0           97           97 
#>        Model 
#>            0
subset(handovers, is.na(Year) | is.na(Month))

Based on the provided data, it appears that the number of missing values is relatively small, with a total of 97 instances. In this case, if the number of missing values is very small and randomly distributed, one approach that can be taken is to remove the rows containing those missing values. In this context, functions such as na.omit() can be used for this purpose.

handovers <- na.omit(handovers)
colSums(is.na(handovers))
#> Variant.Name      Variant         Date     Qty.Prod         Year        Month 
#>            0            0            0            0            0            0 
#>        Model 
#>            0

2.4 Creating the Fiscal Year Column

It is important to note that PT.MMKI follows a fiscal year schedule, which starts in April and ends in March of the following year. The fiscal year is used for financial planning and reporting purposes.

Additionally, the Calendar Year is also used as a reference for annual date data commonly used in Indonesia, following the calendar year from January to December.

By creating the fiscal year column, we can analyze and track data based on the fiscal year period, enabling better financial insights and comparisons across different fiscal years. This column helps in aligning the data with the organization’s fiscal reporting structure and facilitates relevant financial calculations and reporting.

start_year <- 2016
end_year <- 2023

for (year in start_year:end_year) {
  start_date <- ymd(paste0(year, "-04-01"))
  end_date <- ymd(paste0(year + 1, "-03-31"))
  column_name <- paste("Fiscal", substring(as.character(year), 3, 4), sep = "")
  
  handovers[column_name] <- ifelse(handovers$Date >= start_date & handovers$Date <= end_date, 1, 0)
}

# Mengambil kolom FiscalXX yang ada
fiscal_cols <- grep("^Fiscal", names(handovers), value = TRUE)

# Membuat kolom Fiscal.Year baru dengan nilai awal NA
handovers$Fiscal.Year <- NA

# Mengisi nilai kolom Fiscal.Year berdasarkan tahun dari kolom FiscalXX yang sesuai
for (col in fiscal_cols) {
  year <- as.integer(substring(col, 7))
  handovers$Fiscal.Year[handovers[[col]] == 1] <- year
}

# Menghapus kolom FiscalXX yang tidak diperlukan lagi
handovers <- handovers[, -which(names(handovers) %in% fiscal_cols)]

head(handovers)

3 Visualization

Data visualization is one of the effective ways to understand and analyze patterns, trends, and relationships within a dataset. Let’s explore the valuable insights obtained from visualizing MMKI production data.

3.1 Exploring the Overall Total Production Data from 2017 to 2023

In data analysis, it is important to gain an overview of a variable or phenomenon as a whole. In this case, we want to examine the overall production quantity from 2017 to 2023.

## Tahapan pengolahan data
handovers_fiscal <- handovers %>%
  group_by(Model, Fiscal.Year) %>%
  summarise(Qty = sum(Qty.Prod))

handovers_fiscal <- handovers_fiscal %>% 
  mutate(label = glue("Fiscal Year : {Fiscal.Year}\nTotal Qty : {comma(Qty)}"))
## Mendefinisikan skala warna gradient
colors <- c("red", "black", "gray")

## Tahapan pembuatan ggplot
plotFiscal <- handovers_fiscal %>%
  ggplot(aes(x = Fiscal.Year,
             y = Qty,
             fill = Model,
             text = label)) +
  geom_bar(stat = "identity") +
  labs(y = "") +
  scale_x_continuous(breaks = handovers_fiscal$Fiscal.Year)+
  scale_y_continuous(labels = comma)+
  scale_fill_manual(values = colors)+
  ggtitle("Total Production of Cars")+
  theme(plot.title = element_text(hjust = 0.5))

## Tahapan pembuatan plot interaktif atau plot dengan tooltip
ggplotly(plotFiscal, tooltip = "text")      
  • It is apparent from the first fiscal year of PT.MMKI in 2017 that there were two vehicle models produced: Pajero (QX) and Expander (RN).

  • In the year 2018, L300 was produced at PT.MMKI, and during that year, it demonstrated high total production numbers. Specifically, PAJERO (QX) had a production total of 20,120 units, Expander(RN) reached 75,784 units, and L300 (SL) amounted to 23,780 units.

  • In the year 2020, there was a decline in production due to the COVID-19 pandemic. However, the production gradually recovered in the following year, 2021

  • In the year 2022, there was a decrease in L300 production. This was due to a factory expansion project that resulted in L300 production being temporarily shifted to Mitsubishi Philippines and directly exported to Indonesia.

  • And in the year 2023, the production is still ongoing.

The points mentioned above provide an overview of the trends and changes in vehicle production over the past few years at PT.MMKI.

3.2 Time Series Analysis of Vehicle Production at PT.MMKI”

Next, we will employ time series analysis to gain deeper insights into the patterns and trends of vehicle production at PT.MMKI over time.

handovers_series <- handovers %>%
  group_by(Month,Year) %>%
  summarise(Qty = sum(Qty.Prod))

handovers_series <- handovers_series %>%
  mutate(Date = as.Date(paste(Year, Month, "01", sep = "-")))

handovers_series <-handovers_series %>% 
  mutate(label = glue("Date : {month.name[Month]}-{Year}\nTotal Qty : {comma(Qty)}"))
plot_series <- handovers_series %>%
      ggplot(aes(x = Date, 
                 y = Qty)) +
      geom_line() +
      geom_point(aes(text = label),
                 size = 3, color = "red") +
      labs(title = "Production Series (Total of All Model)",
           x = "Year",
           y = "Quantity") +
      scale_x_date(date_labels = "%Y", date_breaks = "1 year") +
      scale_y_continuous(labels = comma)+
      theme(axis.text.x = element_text(angle = 45, hjust = 1),
            plot.title = element_text(hjust = 0.5))
    
    ggplotly(plot_series, tooltip = "text")

By observing the above chart, it is evident that production commenced in 2017 and showed a steady increase until early 2020. However, starting from February 2020, there was a significant decline in production, reaching zero in May and June 2020, primarily due to the impact of the COVID-19 pandemic. However, from July 2020 onwards, there was a gradual recovery and an upward trend in production, indicating PT.MMKI’s resilience and improvement.Let’s proceed to examine the time series of production for each vehicle model.

handovers_Mseries <- handovers %>%
  group_by(Model,Month,Year) %>%
  summarise(Qty = sum(Qty.Prod))

handovers_Mseries <- handovers_Mseries %>%
  group_by(Month,Year) %>%
  mutate(Date = as.Date(paste(Year, Month, "01", sep = "-")),
         MonthName = month(Date, label = TRUE))

handovers_Mseries  <-handovers_Mseries  %>% 
  mutate(label = glue("Date : {MonthName}-{Year}\nTotal Qty : {comma(Qty)}"))

plot_QX <- handovers_Mseries %>%
      filter(Model == "QX") %>%
      ggplot(aes(x = Date, y = Qty)) +
      geom_point(aes(text = label), color = "#FF0000", size = 1) +
      geom_area(fill = "#FF0000", alpha = 0.5) +
      geom_line(color = "#FF0000") +
      labs(title = "Total Production of Model QX",
           x = "Year",
           y = "Quantity") +
      scale_x_date(date_labels = "%Y", date_breaks = "1 year") +
      scale_y_continuous(labels = comma) +
      theme(axis.text.x = element_text(angle = 45, hjust = 1),
            plot.title = element_text(hjust = 0.5))
    

    
plot_RN <- handovers_Mseries %>%
      filter(Model == "RN") %>%
      ggplot(aes(x = Date, y = Qty)) +
      geom_area(fill = "#000000", alpha = 0.5) +
      geom_line(color = "#000000") +
      geom_point(aes(text = label), color = "#FF0000", size = 1) +
      labs(title = "Total Production of Model RN",
           x = "Year",
           y = "Quantity") +
      scale_x_date(date_labels = "%Y", date_breaks = "1 year") +
      scale_y_continuous(labels = comma) +
      theme(axis.text.x = element_text(angle = 45, hjust = 1),
            plot.title = element_text(hjust = 0.5))
    

    
plot_SL <- handovers_Mseries %>%
      filter(Model == "SL") %>%
      ggplot(aes(x = Date, y = Qty)) +
      geom_area(fill = "#808080", alpha = 0.5) +
      geom_line(color = "#808080") +
      geom_point(aes(text = label), color = "#FF0000", size = 1) +
      labs(title = "Total Production of Model SL",
           x = "Year",
           y = "Quantitiy") +
      scale_x_date(date_labels = "%Y", date_breaks = "1 year") +
      scale_y_continuous(labels = comma) +
      theme(axis.text.x = element_text(angle = 45, hjust = 1),
            plot.title = element_text(hjust = 0.5))
    



  plot_QX %>% ggplotly(tooltip = "text")
  plot_RN %>% ggplotly(tooltip = "text")
  plot_SL %>% ggplotly(tooltip = "text")

Regarding the QX (Pajero) and RN (Expander) models, the overall trend is similar to the previous explanation. However, in the case of the L300 model, a notable observation can be made for the fiscal year 2022 (April 2022 to March 2023). In April 2022, there were only 2000 units produced, which marked the last production of the old L300 model. After that, the production ceased due to a factory expansion project, and L300 units were imported from the Philippines. The production resumed in Indonesia in April 2023.

By looking at this data, we can gain insights into production trends during that period and understand the changes that occurred over time.

3.3 Analyzing the Proportions of Total Production for Each Model at PT.MMKI

The analysis aims to examine the proportions of total production for each vehicle model at PT.MMKI. By delving into the dataset, we can gain insights into the distribution of production across different models and understand the relative contribution of each model to the overall production volume. This analysis is essential in understanding the market demand and the popularity of specific vehicle models manufactured by PT.MMKI. By identifying the proportions of production

handovers_count <- handovers %>%
  group_by(Model) %>%
  summarise(Qty = sum(Qty.Prod))

# Compute percentagesh
handovers_count$fraction <- handovers_count$Qty / sum(handovers_count$Qty)

# Compute percentages
handovers_count$percentage <- handovers_count$Qty / sum(handovers_count$Qty) * 100

# Compute the cumulative percentages (top of each rectangle)
handovers_count$ymax = cumsum(handovers_count$fraction)

# Compute the bottom of each rectangle
handovers_count$ymin = c(0, head(handovers_count$ymax, n=-1))

# Compute label position
handovers_count$labelPosition <- (handovers_count$ymax + handovers_count$ymin) / 2

handovers_count <- handovers_count %>% 
  mutate(label = glue("Model : {Model}\nTotal Qty : {comma(Qty)}"))


colors <- c("red", "black", "gray")

plotPie <- plot_ly(handovers_count, labels = ~paste(Model, paste0("(", round(percentage, 2), "%)")), values = ~Qty, type = "pie", 
                   text = ~label, hoverinfo = "text", textinfo = "label") %>%
  layout(showlegend = TRUE, title = "Proportion of Production", 
         legend = list(title = "Model")) %>%
  add_trace(marker = list(colors = colors))

plotPie

Based on the Pie Chart above, it can be seen that the RN (Expander) model has the highest production proportion, accounting for 60.38%. Furthermore, there are two models that have nearly balanced proportions, with SL (L300) at 19.96% and QX (Pajero) at 19.66%.

3.4 Ranking the Total Production of Expander Variants

Considering the highest proportion, which is Expander, we are interested in determining the best-selling variants of Expander in terms of production. Here is the Production Ranking for the Variants of Expander.

filtered_handovers <- handovers %>%
  filter( Model == "RN") %>% 
  group_by(Variant.Name) %>%
  summarise(Qty = sum(Qty.Prod))

filtered_handovers <- filtered_handovers %>% 
  mutate(label = glue("Variant : {Variant.Name}\nTotal Qty : {comma(Qty)}"))

    
    plotVariantName <- filtered_handovers  %>%
      ggplot(aes(x = Qty, y = reorder(Variant.Name, Qty), fill = Qty, text = label)) +
      geom_bar(stat = "identity")+
      labs(x = "Quantity",y = "")+
      theme(legend.position = "none")+
      scale_fill_gradient(low = "red", high = "black") +
      scale_x_continuous(labels = comma)+
      ggtitle("Total Production of Expander Variants") +
      theme(plot.title = element_text(hjust = 0.5))
      
    
ggplotly(plotVariantName, tooltip = "text")

Based on the Bar Chart above, the highest ranking is held by the Ultimate variant, and interestingly, this variant also happens to be the highest-priced variant of Expander. Here is a list of the Expander model prices:

variant <- c("GLS MT", "GLS CVT", "Exceed MT", "Exceed CVT", "Sport MT", "Sport CVT", "Ultimate CVT")
price <- c("Rp 258.200.000", "Rp 267.400.000", "Rp 271.500.000", "Rp 280.700.000", "Rp 294.850.000", "Rp 309.500.000", "Rp 312.900.000")

price_df <- data.frame(Variant = variant, Price = price)
price_df

Besides the Ultimate variant, the second position is held by the Exceed MT variant, which is a manual transmission option for the Expander. This suggests that the Exceed MT variant is highly preferred by consumers compared to the Sport MT variant. The rankings provided above serve as valuable insights to evaluate and consider improvements for future Expander variants, aiming to develop competitive products that can effectively compete with other market competitors.”

4 Final Conclusion

Based on the results of this analysis and visualization, we can draw the following conclusions:

  1. MMKI started production in 2017 for the Xpander and Pajero models.

  2. In the years following 2017, there was a significant increase in production, specifically in 2018 and 2019.

  3. In 2020, production drastically declined, even reaching zero. This was due to the impact of the global COVID-19 pandemic, which halted component supplies and sales.

  4. Post the Corona disaster, MMKI began to rise and demonstrated gradual improvement in sales.

  5. However, the SL model experienced a decline in Fiscal Year 2022. This was caused by a factory expansion, leading to a temporary halt in SL model production while sales continued with a reliance on exported units from the Philippines.

  6. Expander has the highest production proportion among all the vehicle models analyzed.The Ultimate variant of Expander ranks the highest in terms of production, indicating its popularity among consumers. It is also noteworthy that the Ultimate variant is priced higher than other variants.