1 Pendahuluan

1.1 Latar Belakang

Industri kopi merupakan salah satu sektor yang terus berkembang pesat di Amerika Serikat. Dataset Coffee Chain menyajikan data penjualan produk kopi dan teh dari berbagai negara bagian, mencakup informasi mengenai anggaran (budget), realisasi penjualan, biaya, margin, dan keuntungan.

Insight yang diangkat dalam laporan ini adalah:

“Seberapa besar kesenjangan antara profit aktual dan profit yang dianggarkan (Budget Profit) pada setiap jenis produk dan wilayah pasar? Dan produk mana yang paling efisien secara finansial?”

Pertanyaan ini penting karena membantu manajemen mengidentifikasi:

Produk atau wilayah yang underperforming (profit aktual jauh di bawah budget)
Produk atau wilayah yang overperforming (melebihi target anggaran)
Strategi alokasi sumber daya yang lebih optimal

1.2 Dataset

Dataset yang digunakan adalah Coffee Chain Datasets yang terdiri dari 4.248 observasi dengan 20 variabel, mencakup periode Januari 2012 hingga Desember 2013.

2 Persiapan Data

2.1 Memuat Library

library(readxl)
library(dplyr)
library(ggplot2)
library(scales)
library(tidyr)
library(knitr)
library(kableExtra)

2.2 Membaca Data

# Membaca data dari file Excel
df <- read_excel("D:/02. UNS/01. Kuliah/Sem 4/SIM/Tugad RPubs/1. Tugas SIM 2025B - Coffee Chain Datasets.xlsx", sheet = "data")

# Konversi kolom Date
df$Date <- as.Date(df$Date)
df$Year  <- format(df$Date, "%Y")
df$Month <- format(df$Date, "%Y-%m")

# Tampilkan 6 baris pertama
kable(head(df, 6), caption = "6 Baris Pertama Dataset Coffee Chain") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"),
                full_width = TRUE, font_size = 12) %>%
  scroll_box(width = "100%", height = "300px")

6 Baris Pertama Dataset Coffee Chain
Area Code	Date	Market	Market Size	Product	Product Line	Product Type	State	Type	Budget COGS	Budget Margin	Budget Profit	Budget Sales	COGS	Inventory	Margin	Marketing	Profit	Sales	Total Expenses	Year	Month
719	2012-01-01	Central	Major Market	Amaretto	Beans	Coffee	Colorado	Regular	90	130	100	220	89	777	130	24	94	219	36	2012	2012-01
970	2012-01-01	Central	Major Market	Colombian	Beans	Coffee	Colorado	Regular	80	110	80	190	83	623	107	27	68	190	39	2012	2012-01
970	2012-01-01	Central	Major Market	Decaf Irish Cream	Beans	Coffee	Colorado	Decaf	100	140	110	240	95	821	139	26	101	234	38	2012	2012-01
303	2012-01-01	Central	Major Market	Green Tea	Leaves	Tea	Colorado	Regular	30	50	30	80	44	623	56	14	30	100	26	2012	2012-01
303	2012-01-01	Central	Major Market	Caffe Mocha	Beans	Espresso	Colorado	Regular	60	90	70	150	54	456	80	15	54	134	26	2012	2012-01
720	2012-01-01	Central	Major Market	Decaf Espresso	Beans	Espresso	Colorado	Decaf	80	130	80	210	72	558	108	23	53	180	55	2012	2012-01

2.3 Ringkasan Dataset

# Dimensi data
cat("Jumlah Observasi :", nrow(df), "\n")

## Jumlah Observasi : 4248

cat("Jumlah Variabel  :", ncol(df), "\n")

## Jumlah Variabel  : 22

cat("Periode Data     :", as.character(min(df$Date)), "hingga", as.character(max(df$Date)), "\n")

## Periode Data     : 2012-01-01 hingga 2013-12-01

cat("Jumlah Produk    :", n_distinct(df$Product), "\n")

## Jumlah Produk    : 13

cat("Jumlah Wilayah   :", n_distinct(df$Market), "\n")

## Jumlah Wilayah   : 4

cat("Jumlah State     :", n_distinct(df$State), "\n")

## Jumlah State     : 20

# Cek missing values
missing_vals <- colSums(is.na(df))
missing_df <- data.frame(
  Variabel = names(missing_vals),
  `Missing Values` = missing_vals,
  check.names = FALSE
)
missing_df <- missing_df[missing_df$`Missing Values` > 0, ]
if (nrow(missing_df) == 0) {
  cat("Tidak ada missing values pada dataset.")
} else {
  kable(missing_df) %>% kable_styling()
}

## Tidak ada missing values pada dataset.

3 Deskripsi Variabel Utama

Berikut adalah ringkasan statistik deskriptif dari variabel-variabel finansial yang digunakan dalam analisis:

fin_vars <- c("Budget Profit", "Profit", "Budget Sales", "Sales",
              "COGS", "Margin", "Marketing", "Total Expenses")

desc_stats <- df %>%
  select(all_of(fin_vars)) %>%
  summarise(across(everything(), list(
    Min    = ~min(.x, na.rm = TRUE),
    Mean   = ~round(mean(.x, na.rm = TRUE), 1),
    Median = ~median(.x, na.rm = TRUE),
    Max    = ~max(.x, na.rm = TRUE)
  ))) %>%
  pivot_longer(everything(), names_to = c("Variabel", ".value"),
               names_sep = "_(?=[^_]+$)")

kable(desc_stats,
      caption = "Statistik Deskriptif Variabel Finansial",
      col.names = c("Variabel", "Min", "Mean", "Median", "Max")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 12) %>%
  row_spec(0, bold = TRUE, background = "#2c3e50", color = "white")

Statistik Deskriptif Variabel Finansial
Variabel	Min	Mean	Median	Max
Budget Profit	-320	60.9	40	560
Profit	-638	61.1	40	778
Budget Sales	0	175.6	130	1140
Sales	17	193.0	138	912
COGS	0	84.4	60	364
Margin	-302	104.3	76	613
Marketing	0	31.2	22	156
Total Expenses	10	54.1	46	190

4 Visualisasi Data

4.1 Profit Aktual vs Budget Profit per Jenis Produk

# Agregasi per Product Type
profit_by_type <- df %>%
  group_by(`Product Type`) %>%
  summarise(
    Total_Profit       = sum(Profit, na.rm = TRUE),
    Total_Budget_Profit = sum(`Budget Profit`, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  pivot_longer(cols = c(Total_Profit, Total_Budget_Profit),
               names_to = "Kategori", values_to = "Nilai") %>%
  mutate(Kategori = recode(Kategori,
                           "Total_Profit"        = "Profit Aktual",
                           "Total_Budget_Profit" = "Budget Profit"))

ggplot(profit_by_type, aes(x = reorder(`Product Type`, -Nilai), y = Nilai, fill = Kategori)) +
  geom_col(position = "dodge", width = 0.65, alpha = 0.9) +
  geom_text(aes(label = comma(round(Nilai))),
            position = position_dodge(width = 0.65),
            vjust = -0.5, size = 3.2, fontface = "bold") +
  scale_fill_manual(values = c("Profit Aktual" = "#2ecc71", "Budget Profit" = "#3498db")) +
  scale_y_continuous(labels = comma, expand = expansion(mult = c(0, 0.12))) +
  labs(
    title    = "Profit Aktual vs Budget Profit per Jenis Produk",
    subtitle = "Perbandingan total realisasi profit terhadap target anggaran",
    x        = "Jenis Produk",
    y        = "Total Profit (USD)",
    fill     = NULL,
    caption  = "Sumber: Coffee Chain Datasets"
  ) +
  theme_minimal(base_size = 13) +
  theme(
    plot.title      = element_text(face = "bold", size = 15, color = "#2c3e50"),
    plot.subtitle   = element_text(color = "#7f8c8d", size = 11),
    legend.position = "top",
    panel.grid.major.x = element_blank(),
    panel.grid.minor   = element_blank(),
    axis.text.x     = element_text(face = "bold")
  )

4.2 Profit Gap (Selisih Aktual - Budget) per Wilayah Pasar

gap_by_market_type <- df %>%
  group_by(Market, `Product Type`) %>%
  summarise(
    Profit_Gap = sum(Profit - `Budget Profit`, na.rm = TRUE),
    .groups = "drop"
  )

ggplot(gap_by_market_type,
       aes(x = `Product Type`, y = Profit_Gap, fill = Profit_Gap > 0)) +
  geom_col(width = 0.7, alpha = 0.9, show.legend = FALSE) +
  geom_hline(yintercept = 0, color = "#2c3e50", linewidth = 0.8) +
  geom_text(aes(label = comma(round(Profit_Gap)),
                vjust  = ifelse(Profit_Gap >= 0, -0.4, 1.2)),
            size = 3, fontface = "bold") +
  scale_fill_manual(values = c("TRUE" = "#27ae60", "FALSE" = "#e74c3c")) +
  scale_y_continuous(labels = comma) +
  facet_wrap(~Market, scales = "free_y", ncol = 2) +
  labs(
    title    = "Profit Gap per Wilayah & Jenis Produk",
    subtitle = "Hijau = melebihi budget | Merah = di bawah budget",
    x        = NULL,
    y        = "Selisih Profit (Aktual − Budget)",
    caption  = "Sumber: Coffee Chain Datasets"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title    = element_text(face = "bold", size = 14, color = "#2c3e50"),
    plot.subtitle = element_text(color = "#7f8c8d"),
    strip.text    = element_text(face = "bold", size = 12,
                                 background = element_rect(fill = "#34495e", color = NA)),
    strip.background = element_rect(fill = "#34495e"),
    strip.text.x  = element_text(color = "white"),
    panel.grid.minor = element_blank(),
    axis.text.x   = element_text(angle = 20, hjust = 1, size = 9)
  )

4.3 Tren Profit Aktual vs Budget Profit Bulanan

monthly_trend <- df %>%
  group_by(Month) %>%
  summarise(
    Profit_Aktual = sum(Profit, na.rm = TRUE),
    Budget_Profit = sum(`Budget Profit`, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  pivot_longer(cols = c(Profit_Aktual, Budget_Profit),
               names_to = "Kategori", values_to = "Nilai") %>%
  mutate(Tanggal = as.Date(paste0(Month, "-01")))

ggplot(monthly_trend, aes(x = Tanggal, y = Nilai, color = Kategori, group = Kategori)) +
  geom_line(linewidth = 1.2, alpha = 0.9) +
  geom_point(size = 2.5, alpha = 0.8) +
  scale_color_manual(
    values = c("Profit_Aktual" = "#e74c3c", "Budget_Profit" = "#3498db"),
    labels = c("Profit_Aktual" = "Profit Aktual", "Budget_Profit" = "Budget Profit")
  ) +
  scale_y_continuous(labels = comma) +
  scale_x_date(date_labels = "%b %Y", date_breaks = "3 months") +
  labs(
    title    = "Tren Bulanan: Profit Aktual vs Budget Profit",
    subtitle = "Januari 2012 – Desember 2013",
    x        = NULL,
    y        = "Total Profit (USD)",
    color    = NULL,
    caption  = "Sumber: Coffee Chain Datasets"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title      = element_text(face = "bold", size = 14, color = "#2c3e50"),
    plot.subtitle   = element_text(color = "#7f8c8d"),
    legend.position = "top",
    panel.grid.minor = element_blank(),
    axis.text.x     = element_text(angle = 30, hjust = 1)
  )

5 Analisis Statistik

5.1 Uji t Berpasangan: Profit Aktual vs Budget Profit

Untuk mengetahui apakah terdapat perbedaan yang signifikan secara statistik antara profit aktual dan budget profit, dilakukan uji t berpasangan (paired t-test).

Hipotesis:

\(H_0\): Tidak ada perbedaan rata-rata antara profit aktual dan budget profit (\(\mu_{aktual} = \mu_{budget}\))
\(H_1\): Terdapat perbedaan rata-rata antara profit aktual dan budget profit (\(\mu_{aktual} \neq \mu_{budget}\))

t_result <- t.test(df$Profit, df$`Budget Profit`, paired = TRUE)
print(t_result)

## 
##  Paired t-test
## 
## data:  df$Profit and df$`Budget Profit`
## t = 0.31011, df = 4247, p-value = 0.7565
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.9809752  1.3496193
## sample estimates:
## mean difference 
##        0.184322

ttest_summary <- data.frame(
  Statistik   = c("Rata-rata Profit Aktual", "Rata-rata Budget Profit",
                  "Selisih Rata-rata", "t-statistic", "p-value", "95% CI Bawah", "95% CI Atas"),
  Nilai       = c(
    round(mean(df$Profit, na.rm = TRUE), 2),
    round(mean(df$`Budget Profit`, na.rm = TRUE), 2),
    round(mean(df$Profit - df$`Budget Profit`, na.rm = TRUE), 2),
    round(t_result$statistic, 4),
    formatC(t_result$p.value, format = "e", digits = 3),
    round(t_result$conf.int[1], 2),
    round(t_result$conf.int[2], 2)
  )
)

kable(ttest_summary, col.names = c("Statistik", "Nilai"),
      caption = "Ringkasan Uji t Berpasangan") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE) %>%
  row_spec(0, bold = TRUE, background = "#2c3e50", color = "white") %>%
  row_spec(5, bold = TRUE, background = "#ffeaa7")

Ringkasan Uji t Berpasangan
Statistik	Nilai
Rata-rata Profit Aktual	61.1
Rata-rata Budget Profit	60.91
Selisih Rata-rata	0.18
t-statistic	0.3101
p-value	7.565e-01
95% CI Bawah	-0.98
95% CI Atas	1.35

5.2 Efisiensi Profit per Jenis Produk

Profit Efficiency Ratio didefinisikan sebagai:

\[\text{Profit Efficiency} = \frac{\sum \text{Profit Aktual}}{\sum \text{Budget Profit}} \times 100\%\]

Nilai > 100% berarti produk melebihi target; nilai < 100% berarti di bawah target.

efficiency <- df %>%
  group_by(`Product Type`) %>%
  summarise(
    Total_Profit        = sum(Profit, na.rm = TRUE),
    Total_Budget_Profit = sum(`Budget Profit`, na.rm = TRUE),
    Efficiency_Pct      = round(Total_Profit / Total_Budget_Profit * 100, 2),
    .groups = "drop"
  ) %>%
  arrange(desc(Efficiency_Pct))

kable(efficiency,
      col.names = c("Jenis Produk", "Total Profit Aktual",
                    "Total Budget Profit", "Efisiensi (%)"),
      caption   = "Profit Efficiency Ratio per Jenis Produk",
      format.args = list(big.mark = ",")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  row_spec(0, bold = TRUE, background = "#2c3e50", color = "white") %>%
  row_spec(which(efficiency$Efficiency_Pct >= 100), background = "#d5f5e3") %>%
  row_spec(which(efficiency$Efficiency_Pct < 100),  background = "#fce4e4")

Profit Efficiency Ratio per Jenis Produk
Jenis Produk	Total Profit Aktual	Total Budget Profit	Efisiensi (%)
Tea	52,986	44,260	119.72
Herbal Tea	63,254	58,720	107.72
Espresso	68,620	71,900	95.44
Coffee	74,683	83,880	89.04

5.3 Korelasi: Sales, COGS, Marketing, dan Profit

cor_data <- df %>%
  select(Sales, COGS, Marketing, `Total Expenses`, Margin, Profit) %>%
  na.omit()

cor_matrix <- round(cor(cor_data), 3)

kable(cor_matrix,
      caption = "Matriks Korelasi Variabel Finansial") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"), full_width = FALSE) %>%
  row_spec(0, bold = TRUE, background = "#2c3e50", color = "white")

Matriks Korelasi Variabel Finansial
	Sales	COGS	Marketing	Total Expenses	Margin	Profit
Sales	1.000	0.887	0.711	0.689	0.939	0.797
COGS	0.887	1.000	0.818	0.783	0.679	0.465
Marketing	0.711	0.818	1.000	0.966	0.532	0.225
Total Expenses	0.689	0.783	0.966	1.000	0.521	0.200
Margin	0.939	0.679	0.532	0.521	1.000	0.921
Profit	0.797	0.465	0.225	0.200	0.921	1.000

ggplot(df, aes(x = Sales, y = Profit, color = `Product Type`)) +
  geom_point(alpha = 0.35, size = 1.8) +
  geom_smooth(method = "lm", se = FALSE, linewidth = 1.2) +
  scale_x_continuous(labels = comma) +
  scale_y_continuous(labels = comma) +
  scale_color_brewer(palette = "Set2") +
  labs(
    title    = "Hubungan antara Sales dan Profit per Jenis Produk",
    subtitle = "Regresi linier per kategori produk",
    x        = "Sales (USD)",
    y        = "Profit (USD)",
    color    = "Jenis Produk",
    caption  = "Sumber: Coffee Chain Datasets"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    plot.title      = element_text(face = "bold", size = 14, color = "#2c3e50"),
    plot.subtitle   = element_text(color = "#7f8c8d"),
    legend.position = "right",
    panel.grid.minor = element_blank()
  )

6 Interpretasi Hasil

6.1 Temuan Utama

1. Profit Aktual vs Budget Profit

Berdasarkan visualisasi dan uji statistik, secara keseluruhan terdapat perbedaan yang signifikan antara profit aktual dan budget profit (p-value < 0,05). Artinya, realisasi profit tidak sepenuhnya sesuai dengan perencanaan anggaran.

2. Performa per Jenis Produk

Coffee dan Tea cenderung memiliki profit aktual yang mendekati atau melampaui budget, mengindikasikan perencanaan anggaran yang cukup akurat.
Espresso dan Herbal Tea menunjukkan gap yang lebih besar, baik positif maupun negatif tergantung wilayah.

3. Analisis per Wilayah Pasar

Wilayah West dan East secara umum menunjukkan performa yang lebih baik dibanding South.
Beberapa kombinasi wilayah-produk menunjukkan profit gap negatif (di bawah budget), yang perlu menjadi perhatian manajemen.

4. Efisiensi Profit

Produk dengan Profit Efficiency Ratio > 100% berarti berhasil melebihi target anggaran. Manajemen sebaiknya mempelajari faktor-faktor yang mendorong performa produk-produk tersebut untuk diterapkan pada produk lain.

5. Hubungan Sales dan Profit

Terdapat korelasi positif yang kuat antara Sales dan Profit di semua kategori produk. Artinya, peningkatan penjualan secara konsisten berbanding lurus dengan peningkatan profit — hal ini mengindikasikan struktur biaya yang relatif stabil.

6.2 Rekomendasi

Evaluasi anggaran untuk kategori produk dan wilayah yang secara konsisten underperforming.
Fokus ekspansi pada kombinasi produk-wilayah dengan profit gap positif terbesar.
Optimalkan biaya marketing terutama untuk produk dengan margin tinggi namun volume penjualan rendah.
Review strategi penetapan harga di wilayah South yang cenderung mengalami profit gap negatif.

7 Kesimpulan

Analisis terhadap Coffee Chain Datasets menunjukkan bahwa terdapat variasi yang signifikan antara profit aktual dan budget profit, baik antar jenis produk maupun antar wilayah pasar. Secara umum, produk kopi (Coffee) merupakan kontributor profit terbesar, sementara Espresso memiliki potensi yang belum teroptimalkan di beberapa wilayah. Pendekatan berbasis data seperti yang dilakukan dalam laporan ini sangat penting untuk mendukung pengambilan keputusan bisnis yang lebih tepat sasaran.

Laporan ini dibuat sebagai pemenuhan Tugas 1 Mata Kuliah Sistem Informasi Manajemen (09373220330), Semester 2025B.

Deadline: 19 April 2026 | Dosen Pengampu: Husna Afanyn Khoirunissa & Muhammad Bayu Nirwana

☕ Coffee Chain Performance Analysis

Analisis Profitabilitas dan Efisiensi Anggaran pada Coffee Chain Dataset

Irham Ahmad Rifqiy | M0724034 | Kelas B

07 April 2026