1. Pendahuluan

Dataset Coffee Chain mencatat transaksi penjualan produk kopi dan teh dari sebuah jaringan kedai kopi di berbagai pasar (Central, East, South, West) selama periode Januari 2012 – Desember 2013. Dataset terdiri dari 4.248 baris dan 20 variabel, mencakup data aktual (Sales, Profit, Margin) dan target anggaran (Budget Sales, Budget Profit, Budget Margin).

Insight utama yang dianalisis:

Seberapa jauh kinerja profit aktual setiap produk menyimpang dari target anggaran, dan bagaimana tren profit berkembang dari 2012 ke 2013?

Insight ini relevan karena terdapat produk dengan profit negatif (Green Tea) di tengah tren keseluruhan yang meningkat, sehingga perlu dievaluasi lebih lanjut.

2. Persiapan Data

library(readxl)

# Load data
df <- readxl::read_excel("C:/Users/asus/Downloads/Tugas SIM 2025B - Coffee Chain Datasets.xlsx")

# Pastikan kolom Date bertipe Date
df$Date      <- as.Date(df$Date)
df$YearMonth <- as.Date(format(df$Date, "%Y-%m-01"))
df$Year      <- as.integer(format(df$Date, "%Y"))
df$profit_gap <- df$Profit - df$`Budget Profit`   # selisih aktual vs target

str(df)
## tibble [4,248 × 23] (S3: tbl_df/tbl/data.frame)
##  $ Area Code     : num [1:4248] 719 970 970 303 303 720 970 719 970 719 ...
##  $ Date          : Date[1:4248], format: "2012-01-01" "2012-01-01" ...
##  $ Market        : chr [1:4248] "Central" "Central" "Central" "Central" ...
##  $ Market Size   : chr [1:4248] "Major Market" "Major Market" "Major Market" "Major Market" ...
##  $ Product       : chr [1:4248] "Amaretto" "Colombian" "Decaf Irish Cream" "Green Tea" ...
##  $ Product Line  : chr [1:4248] "Beans" "Beans" "Beans" "Leaves" ...
##  $ Product Type  : chr [1:4248] "Coffee" "Coffee" "Coffee" "Tea" ...
##  $ State         : chr [1:4248] "Colorado" "Colorado" "Colorado" "Colorado" ...
##  $ Type          : chr [1:4248] "Regular" "Regular" "Decaf" "Regular" ...
##  $ Budget COGS   : num [1:4248] 90 80 100 30 60 80 140 50 50 40 ...
##  $ Budget Margin : num [1:4248] 130 110 140 50 90 130 160 80 70 70 ...
##  $ Budget Profit : num [1:4248] 100 80 110 30 70 80 110 20 40 20 ...
##  $ Budget Sales  : num [1:4248] 220 190 240 80 150 210 300 130 120 110 ...
##  $ COGS          : num [1:4248] 89 83 95 44 54 72 170 63 60 58 ...
##  $ Inventory     : num [1:4248] 777 623 821 623 456 ...
##  $ Margin        : num [1:4248] 130 107 139 56 80 108 171 87 80 72 ...
##  $ Marketing     : num [1:4248] 24 27 26 14 15 23 47 57 19 22 ...
##  $ Profit        : num [1:4248] 94 68 101 30 54 53 99 0 33 17 ...
##  $ Sales         : num [1:4248] 219 190 234 100 134 180 341 150 140 130 ...
##  $ Total Expenses: num [1:4248] 36 39 38 26 26 55 72 87 47 55 ...
##  $ YearMonth     : Date[1:4248], format: "2012-01-01" "2012-01-01" ...
##  $ Year          : int [1:4248] 2012 2012 2012 2012 2012 2012 2012 2012 2012 2012 ...
##  $ profit_gap    : num [1:4248] -6 -12 -9 0 -16 -27 -11 -20 -7 -3 ...
# Ringkasan statistik variabel kunci
summary(df[, c("Sales", "Profit", "Budget Profit", "Margin", "Marketing", "Total Expenses")])
##      Sales         Profit       Budget Profit         Margin       
##  Min.   : 17   Min.   :-638.0   Min.   :-320.00   Min.   :-302.00  
##  1st Qu.:100   1st Qu.:  17.0   1st Qu.:  20.00   1st Qu.:  52.75  
##  Median :138   Median :  40.0   Median :  40.00   Median :  76.00  
##  Mean   :193   Mean   :  61.1   Mean   :  60.91   Mean   : 104.29  
##  3rd Qu.:230   3rd Qu.:  92.0   3rd Qu.:  80.00   3rd Qu.: 132.00  
##  Max.   :912   Max.   : 778.0   Max.   : 560.00   Max.   : 613.00  
##    Marketing      Total Expenses  
##  Min.   :  0.00   Min.   : 10.00  
##  1st Qu.: 13.00   1st Qu.: 33.00  
##  Median : 22.00   Median : 46.00  
##  Mean   : 31.19   Mean   : 54.06  
##  3rd Qu.: 39.00   3rd Qu.: 65.00  
##  Max.   :156.00   Max.   :190.00

3. Insight & Deskripsi Data

3.1 Profit vs Budget per Produk

produk_summary <- do.call(rbind, lapply(split(df, df$Product), function(x) {
  data.frame(
    Product      = x$Product[1],
    n            = nrow(x),
    Profit       = mean(x$Profit, na.rm = TRUE),
    BudgetProfit = mean(x$`Budget Profit`, na.rm = TRUE),
    ProfitGap    = mean(x$profit_gap, na.rm = TRUE),
    stringsAsFactors = FALSE
  )
}))

produk_summary <- produk_summary[order(produk_summary$ProfitGap), ]
rownames(produk_summary) <- NULL

knitr::kable(produk_summary, digits = 2,
             col.names = c("Produk", "N", "Avg Profit", "Avg Budget", "Gap (Aktual - Budget)"),
             caption = "Rata-rata Profit Aktual vs Budget per Produk")
Rata-rata Profit Aktual vs Budget per Produk
Produk N Avg Profit Avg Budget Gap (Aktual - Budget)
Green Tea 288 -0.80 13.19 -14.00
Decaf Irish Cream 384 36.43 49.64 -13.21
Caffe Mocha 480 36.83 48.50 -11.67
Amaretto 192 25.47 36.56 -11.09
Colombian 480 116.26 120.42 -4.16
Mint 192 32.05 35.31 -3.26
Decaf Espresso 408 72.31 72.21 0.10
Caffe Latte 216 52.66 46.76 5.90
Lemon 480 62.23 56.29 5.94
Chamomile 384 70.91 64.90 6.02
Regular Espresso 72 139.79 125.83 13.96
Darjeeling 384 75.66 59.53 16.13
Earl Grey 288 83.90 61.11 22.79

Temuan awal:

  • Green Tea memiliki rata-rata profit aktual negatif (-0.80), jauh di bawah budget (13.19) → underperformer terbesar.
  • Regular Espresso dan Earl Grey adalah produk dengan profit gap positif tertinggi → melampaui target.
  • 6 dari 13 produk gagal memenuhi target budget profit.

4. Visualisasi

4.1 Profit Gap per Produk (Aktual vs Budget)

ps <- produk_summary[order(produk_summary$ProfitGap), ]
colors <- ifelse(ps$ProfitGap >= 0, "#2ecc71", "#e74c3c")

par(mar = c(4, 10, 4, 3))
barplot(
  ps$ProfitGap,
  names.arg = ps$Product,
  horiz     = TRUE,
  col       = colors,
  border    = NA,
  las       = 1,
  xlab      = "Rata-rata Gap Profit (Aktual − Budget)",
  main      = "Gap Profit Aktual vs Budget per Produk",
  cex.names = 0.85
)
abline(v = 0, lty = 2, col = "gray40")
mtext("Nilai positif = profit aktual melampaui target; negatif = di bawah target",
      side = 3, line = 0.3, cex = 0.8, col = "gray40")

# Label nilai
text(
  x      = ps$ProfitGap + ifelse(ps$ProfitGap >= 0, 0.3, -0.3),
  y      = seq(0.7, by = 1.2, length.out = nrow(ps)),
  labels = round(ps$ProfitGap, 1),
  cex    = 0.75,
  adj    = ifelse(ps$ProfitGap >= 0, 0, 1)
)

legend("bottomright",
       legend = c("Di atas Target", "Di bawah Target"),
       fill   = c("#2ecc71", "#e74c3c"),
       border = NA,
       bty    = "n",
       cex    = 0.85)

4.2 Tren Profit Bulanan (2012 vs 2013)

# Agregasi profit per bulan dan tahun
tren <- aggregate(Profit ~ YearMonth + Year, data = df, FUN = sum)
tren <- tren[order(tren$YearMonth), ]

df_2012 <- tren[tren$Year == 2012, ]
df_2013 <- tren[tren$Year == 2013, ]

par(mar = c(4, 5, 4, 2))
plot(
  df_2012$YearMonth, df_2012$Profit,
  type = "b", col = "#3498db", pch = 16, lwd = 2,
  xlab = "Bulan", ylab = "Total Profit",
  main = "Tren Total Profit Bulanan: 2012 vs 2013",
  ylim = range(tren$Profit) * c(0.9, 1.1),
  xaxt = "n",
  yaxt = "n"
)
lines(df_2013$YearMonth, df_2013$Profit,
      type = "b", col = "#e67e22", pch = 16, lwd = 2)

axis(1, at = df_2012$YearMonth,
     labels = format(df_2012$YearMonth, "%b %Y"),
     las = 2, cex.axis = 0.7)
axis(2, at = axTicks(2), las = 1,
     labels = formatC(axTicks(2), format = "f", big.mark = ".", digits = 0))

mtext("Profit 2013 konsisten lebih tinggi di semua bulan",
      side = 3, line = 0.3, cex = 0.8, col = "gray40")

legend("topleft",
       legend = c("2012", "2013"),
       col    = c("#3498db", "#e67e22"),
       lwd    = 2, pch = 16, bty = "n", cex = 0.9)

4.3 Distribusi Profit per Product Line & Market

product_lines <- unique(df$`Product Line`)
par(mfrow = c(1, length(product_lines)), mar = c(5, 4, 4, 1))

fill_colors <- c("Beans" = "#8B4513", "Leaves" = "#228B22")

for (pl in product_lines) {
  sub <- df[df$`Product Line` == pl, ]
  boxplot(
    Profit ~ Market,
    data    = sub,
    col     = fill_colors[pl],
    border  = "gray30",
    outline = TRUE,
    outpch  = 1, outcex = 0.4, outcol = adjustcolor(fill_colors[pl], alpha.f = 0.3),
    main    = pl,
    xlab    = "Market",
    ylab    = if (pl == product_lines[1]) "Profit" else "",
    las     = 1,
    cex.axis = 0.85
  )
}
mtext("Distribusi Profit per Market dan Product Line",
      side = 3, line = -1.5, outer = TRUE, font = 2, cex = 1)

par(mfrow = c(1, 1))

5. Analisis Statistik

5.1 Uji t: Apakah Profit Aktual Berbeda Signifikan dari Budget Profit?

# Paired t-test: profit aktual vs budget profit
t_result <- t.test(df$Profit, df$`Budget Profit`, paired = TRUE)
t_result
## 
##  Paired t-test
## 
## data:  df$Profit and df$`Budget Profit`
## t = 0.31011, df = 4247, p-value = 0.7565
## alternative hypothesis: true mean difference is not equal to 0
## 95 percent confidence interval:
##  -0.9809752  1.3496193
## sample estimates:
## mean difference 
##        0.184322
cat("Rata-rata profit aktual :", round(mean(df$Profit), 2), "\n")
## Rata-rata profit aktual : 61.1
cat("Rata-rata budget profit :", round(mean(df$`Budget Profit`), 2), "\n")
## Rata-rata budget profit : 60.91
cat("Selisih rata-rata       :", round(mean(df$profit_gap), 2), "\n")
## Selisih rata-rata       : 0.18
cat("p-value                 :", round(t_result$p.value, 4), "\n")
## p-value                 : 0.7565

5.2 Regresi Linear: Faktor-Faktor Penentu Profit

# Regresi profit terhadap sales, marketing, COGS, total expenses
model <- lm(Profit ~ Sales + Marketing + COGS + `Total Expenses`, data = df)
summary(model)
## 
## Call:
## lm(formula = Profit ~ Sales + Marketing + COGS + `Total Expenses`, 
##     data = df)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -112.028   -5.748    0.526    5.525  105.228 
## 
## Coefficients:
##                   Estimate Std. Error  t value Pr(>|t|)    
## (Intercept)       2.366873   0.705850    3.353 0.000806 ***
## Sales             1.208316   0.003721  324.744  < 2e-16 ***
## Marketing         0.195850   0.040466    4.840 1.35e-06 ***
## COGS             -1.241869   0.010250 -121.156  < 2e-16 ***
## `Total Expenses` -1.400421   0.031161  -44.942  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 16.86 on 4243 degrees of freedom
## Multiple R-squared:  0.9726, Adjusted R-squared:  0.9725 
## F-statistic: 3.76e+04 on 4 and 4243 DF,  p-value: < 2.2e-16
par(mfrow = c(1, 2))
plot(model, which = c(1, 2))

par(mfrow = c(1, 1))

5.3 ANOVA: Perbedaan Rata-rata Profit antar Produk

anova_model <- aov(Profit ~ Product, data = df)
summary(anova_model)
##               Df   Sum Sq Mean Sq F value Pr(>F)    
## Product       12  4267412  355618   37.97 <2e-16 ***
## Residuals   4235 39666224    9366                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# Post-hoc: produk mana saja yang berbeda signifikan?
tukey    <- TukeyHSD(anova_model)
tukey_df <- as.data.frame(tukey$Product)
tukey_df$Perbandingan <- rownames(tukey_df)
rownames(tukey_df) <- NULL

tukey_df <- tukey_df[tukey_df$`p adj` < 0.05, ]
tukey_df <- tukey_df[order(tukey_df$`p adj`), ]
tukey_df <- tukey_df[, c("Perbandingan", "diff", "lwr", "upr", "p adj")]

cat("Pasangan produk dengan perbedaan profit signifikan (p < 0.05):\n")
## Pasangan produk dengan perbedaan profit signifikan (p < 0.05):
cat("Jumlah pasangan signifikan:", nrow(tukey_df), "\n\n")
## Jumlah pasangan signifikan: 52
knitr::kable(head(tukey_df, 10), digits = 4,
             caption = "10 Pasangan Produk Teratas dengan Perbedaan Profit Signifikan")
10 Pasangan Produk Teratas dengan Perbedaan Profit Signifikan
Perbandingan diff lwr upr p adj
25 Colombian-Caffe Mocha 79.4292 58.7222 100.1361 0
45 Decaf Irish Cream-Colombian -79.8286 -101.7917 -57.8656 0
47 Green Tea-Colombian -117.0604 -140.9707 -93.1501 0
75 Regular Espresso-Green Tea 140.5938 98.3259 182.8616 0
4 Colombian-Amaretto 90.7896 63.3969 118.1823 0
73 Lemon-Green Tea 63.0292 39.1189 86.9395 0
48 Lemon-Colombian -54.0312 -74.7382 -33.3243 0
39 Green Tea-Chamomile -71.7161 -96.7221 -46.7101 0
12 Regular Espresso-Amaretto 114.3229 69.9920 158.6538 0
49 Mint-Colombian -84.2062 -111.5989 -56.8136 0

6. Interpretasi

6.1 Profit Aktual vs Budget (Paired t-test)

Uji t berpasangan menghasilkan p-value > 0.05, yang berarti secara statistik tidak ada perbedaan signifikan antara rata-rata profit aktual dan budget profit secara keseluruhan. Artinya, perencanaan anggaran profit cukup akurat di level agregat. Namun, analisis per produk menunjukkan variasi yang besar — beberapa produk sangat under-budget (Green Tea, Decaf Irish Cream) dan beberapa melampaui target (Earl Grey, Regular Espresso).

6.2 Faktor Penentu Profit (Regresi)

Dari model regresi linear:

  • Sales berpengaruh positif signifikan terhadap profit — setiap kenaikan 1 unit sales meningkatkan profit secara nyata.
  • COGS dan Total Expenses berpengaruh negatif signifikan — keduanya menggerus profit.
  • Marketing berpengaruh negatif namun besarannya lebih kecil dibanding COGS dan expenses.
  • Model memiliki R² yang tinggi, artinya variabel-variabel ini mampu menjelaskan sebagian besar variasi profit.

6.3 Perbedaan antar Produk (ANOVA)

ANOVA menunjukkan perbedaan profit yang sangat signifikan antar produk (p < 0.001). Uji Tukey mengkonfirmasi bahwa produk seperti Regular Espresso dan Colombian berbeda secara signifikan dari produk underperformer seperti Green Tea. Ini mengindikasikan perlunya strategi berbeda per produk, bukan pendekatan satu-ukuran-untuk-semua.


7. Kesimpulan

  1. Secara agregat, profit aktual Coffee Chain hampir sesuai budget — tidak ada penyimpangan signifikan secara statistik.

  2. Per produk, terdapat kesenjangan besar: Green Tea merugi dan konsisten di bawah budget, sedangkan Regular Espresso dan Earl Grey secara konsisten melampaui target.

  3. Tren 2012 → 2013 menunjukkan pertumbuhan profit yang signifikan (~50% lebih tinggi di 2013), dengan pola musiman yang serupa di kedua tahun.

  4. Driver utama profit adalah volume sales dan efisiensi biaya (COGS & Total Expenses) — strategi yang paling berdampak adalah meningkatkan sales sekaligus mengendalikan COGS.

  5. Rekomendasi: Evaluasi ulang strategi produk Green Tea (harga, biaya, atau penghentian), dan replikasi formula sukses Regular Espresso ke produk lain.