Dataset Coffee Chain mencatat data transaksi penjualan produk minuman berbasis kopi dan teh pada jaringan kedai kopi di berbagai wilayah Amerika Serikat selama periode Januari 2012 hingga Desember 2013, dengan total 4.248 observasi dan 20 variabel.
Insight yang diangkat dalam laporan ini adalah:
“Bagaimana profitabilitas dan kinerja penjualan Coffee Chain berbeda-beda berdasarkan jenis produk dan wilayah pasar, serta sejauh mana realisasi profit menyimpang dari target anggaran?”
Insight ini relevan karena membantu manajemen memahami produk dan pasar mana yang paling menguntungkan, serta mengidentifikasi kesenjangan antara target (budget) dan realisasi profit, sehingga dapat menjadi dasar pengambilan keputusan strategis.
Variabel yang digunakan dalam analisis ini adalah:
library(readxl)
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(ggplot2)
## Warning: package 'ggplot2' was built under R version 4.5.2
library(tidyr)
library(scales)
df <- read_excel("D:/Pancar/KULIAH UNS/SEMESTER 4/Sistem Informasi Manajemen/1. Tugas SIM 2025B - Coffee Chain Datasets/1. Tugas SIM 2025B - Coffee Chain Datasets.xlsx", sheet = "data")
cat("Dimensi Data:", nrow(df), "baris x", ncol(df), "kolom\n")
## Dimensi Data: 4248 baris x 20 kolom
str(df)
## tibble [4,248 × 20] (S3: tbl_df/tbl/data.frame)
## $ Area Code : num [1:4248] 719 970 970 303 303 720 970 719 970 719 ...
## $ Date : POSIXct[1:4248], format: "2012-01-01" "2012-01-01" ...
## $ Market : chr [1:4248] "Central" "Central" "Central" "Central" ...
## $ Market Size : chr [1:4248] "Major Market" "Major Market" "Major Market" "Major Market" ...
## $ Product : chr [1:4248] "Amaretto" "Colombian" "Decaf Irish Cream" "Green Tea" ...
## $ Product Line : chr [1:4248] "Beans" "Beans" "Beans" "Leaves" ...
## $ Product Type : chr [1:4248] "Coffee" "Coffee" "Coffee" "Tea" ...
## $ State : chr [1:4248] "Colorado" "Colorado" "Colorado" "Colorado" ...
## $ Type : chr [1:4248] "Regular" "Regular" "Decaf" "Regular" ...
## $ Budget COGS : num [1:4248] 90 80 100 30 60 80 140 50 50 40 ...
## $ Budget Margin : num [1:4248] 130 110 140 50 90 130 160 80 70 70 ...
## $ Budget Profit : num [1:4248] 100 80 110 30 70 80 110 20 40 20 ...
## $ Budget Sales : num [1:4248] 220 190 240 80 150 210 300 130 120 110 ...
## $ COGS : num [1:4248] 89 83 95 44 54 72 170 63 60 58 ...
## $ Inventory : num [1:4248] 777 623 821 623 456 ...
## $ Margin : num [1:4248] 130 107 139 56 80 108 171 87 80 72 ...
## $ Marketing : num [1:4248] 24 27 26 14 15 23 47 57 19 22 ...
## $ Profit : num [1:4248] 94 68 101 30 54 53 99 0 33 17 ...
## $ Sales : num [1:4248] 219 190 234 100 134 180 341 150 140 130 ...
## $ Total Expenses: num [1:4248] 36 39 38 26 26 55 72 87 47 55 ...
head(df)
## # A tibble: 6 × 20
## `Area Code` Date Market `Market Size` Product `Product Line`
## <dbl> <dttm> <chr> <chr> <chr> <chr>
## 1 719 2012-01-01 00:00:00 Central Major Market Amaretto Beans
## 2 970 2012-01-01 00:00:00 Central Major Market Colombian Beans
## 3 970 2012-01-01 00:00:00 Central Major Market Decaf Ir… Beans
## 4 303 2012-01-01 00:00:00 Central Major Market Green Tea Leaves
## 5 303 2012-01-01 00:00:00 Central Major Market Caffe Mo… Beans
## 6 720 2012-01-01 00:00:00 Central Major Market Decaf Es… Beans
## # ℹ 14 more variables: `Product Type` <chr>, State <chr>, Type <chr>,
## # `Budget COGS` <dbl>, `Budget Margin` <dbl>, `Budget Profit` <dbl>,
## # `Budget Sales` <dbl>, COGS <dbl>, Inventory <dbl>, Margin <dbl>,
## # Marketing <dbl>, Profit <dbl>, Sales <dbl>, `Total Expenses` <dbl>
df %>%
select(Sales, Profit, `Budget Profit`, COGS, Marketing) %>%
summary()
## Sales Profit Budget Profit COGS
## Min. : 17 Min. :-638.0 Min. :-320.00 Min. : 0.00
## 1st Qu.:100 1st Qu.: 17.0 1st Qu.: 20.00 1st Qu.: 43.00
## Median :138 Median : 40.0 Median : 40.00 Median : 60.00
## Mean :193 Mean : 61.1 Mean : 60.91 Mean : 84.43
## 3rd Qu.:230 3rd Qu.: 92.0 3rd Qu.: 80.00 3rd Qu.:100.00
## Max. :912 Max. : 778.0 Max. : 560.00 Max. :364.00
## Marketing
## Min. : 0.00
## 1st Qu.: 13.00
## Median : 22.00
## Mean : 31.19
## 3rd Qu.: 39.00
## Max. :156.00
# Agregasi data per jenis produk
profit_produk <- df %>%
group_by(`Product Type`) %>%
summarise(
Total_Sales = sum(Sales),
Total_Profit = sum(Profit),
Profit_Margin = round(sum(Profit) / sum(Sales) * 100, 2)
) %>%
arrange(desc(Total_Profit))
# Mengubah ke format panjang untuk plotting grouped bar
profit_long <- profit_produk %>%
select(`Product Type`, Total_Sales, Total_Profit) %>%
pivot_longer(cols = c(Total_Sales, Total_Profit),
names_to = "Kategori", values_to = "Nilai")
ggplot(profit_long, aes(x = reorder(`Product Type`, -Nilai), y = Nilai, fill = Kategori)) +
geom_bar(stat = "identity", position = "dodge") +
geom_text(aes(label = comma(Nilai)),
position = position_dodge(width = 0.9),
vjust = -0.4, size = 3.2) +
scale_fill_manual(values = c("Total_Sales" = "#6F4E37", "Total_Profit" = "#C8A96E"),
labels = c("Total Profit", "Total Sales")) +
scale_y_continuous(labels = comma) +
labs(
title = "Total Sales dan Profit berdasarkan Jenis Produk",
subtitle = "Periode Januari 2012 - Desember 2013",
x = "Jenis Produk",
y = "Nilai",
fill = "Keterangan"
) +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 13))
# Agregasi data per wilayah pasar
profit_market <- df %>%
group_by(Market) %>%
summarise(
Total_Sales = sum(Sales),
Total_Profit = sum(Profit),
Profit_Margin = round(sum(Profit) / sum(Sales) * 100, 2)
) %>%
arrange(desc(Profit_Margin))
ggplot(profit_market, aes(x = reorder(Market, -Profit_Margin), y = Profit_Margin, fill = Market)) +
geom_bar(stat = "identity", width = 0.6) +
geom_text(aes(label = paste0(Profit_Margin, "%")), vjust = -0.5, size = 4, fontface = "bold") +
scale_fill_manual(values = c("Central" = "#4E342E", "East" = "#795548",
"South" = "#A1887F", "West" = "#D7CCC8")) +
labs(
title = "Profit Margin (%) berdasarkan Wilayah Pasar",
x = "Wilayah Pasar",
y = "Profit Margin (%)",
fill = "Wilayah"
) +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 13),
legend.position = "none")
# Perbandingan realisasi vs anggaran
budget_vs_actual <- df %>%
group_by(`Product Type`) %>%
summarise(
Budget_Profit = sum(`Budget Profit`),
Actual_Profit = sum(Profit),
Selisih = sum(Profit) - sum(`Budget Profit`)
) %>%
pivot_longer(cols = c(Budget_Profit, Actual_Profit),
names_to = "Tipe", values_to = "Nilai")
ggplot(budget_vs_actual, aes(x = `Product Type`, y = Nilai, fill = Tipe)) +
geom_bar(stat = "identity", position = "dodge") +
geom_hline(yintercept = 0, linetype = "dashed", color = "gray40") +
geom_text(aes(label = comma(Nilai)),
position = position_dodge(width = 0.9),
vjust = -0.4, size = 3.2) +
scale_fill_manual(values = c("Budget_Profit" = "#90A4AE", "Actual_Profit" = "#37474F"),
labels = c("Profit Aktual", "Budget Profit")) +
scale_y_continuous(labels = comma) +
labs(
title = "Perbandingan Budget Profit vs Profit Aktual per Jenis Produk",
x = "Jenis Produk",
y = "Profit",
fill = "Keterangan"
) +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 13))
# Heatmap: Sales per Product Type x Market Size
heatmap_data <- df %>%
group_by(`Market Size`, `Product Type`) %>%
summarise(Total_Sales = sum(Sales), .groups = "drop")
ggplot(heatmap_data, aes(x = `Product Type`, y = `Market Size`, fill = Total_Sales)) +
geom_tile(color = "white", linewidth = 1) +
geom_text(aes(label = comma(Total_Sales)), size = 4, fontface = "bold", color = "white") +
scale_fill_gradient(low = "#BCAAA4", high = "#3E2723", labels = comma) +
labs(
title = "Heatmap Total Sales berdasarkan Jenis Produk dan Ukuran Pasar",
x = "Jenis Produk",
y = "Ukuran Pasar",
fill = "Total Sales"
) +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 13))
analisis_produk <- df %>%
group_by(`Product Type`) %>%
summarise(
Jumlah_Transaksi = n(),
Total_Sales = sum(Sales),
Total_COGS = sum(COGS),
Total_Marketing = sum(Marketing),
Total_Profit = sum(Profit),
Profit_Margin = round(sum(Profit) / sum(Sales) * 100, 2),
Budget_Profit = sum(`Budget Profit`),
Selisih_Profit = sum(Profit) - sum(`Budget Profit`),
Pct_vs_Budget = round((sum(Profit) - sum(`Budget Profit`)) / sum(`Budget Profit`) * 100, 2)
) %>%
arrange(desc(Total_Profit))
print(analisis_produk)
## # A tibble: 4 × 10
## `Product Type` Jumlah_Transaksi Total_Sales Total_COGS Total_Marketing
## <chr> <int> <dbl> <dbl> <dbl>
## 1 Coffee 1056 216828 90696 33366
## 2 Espresso 1176 222996 97000 38216
## 3 Herbal Tea 1056 207214 92810 34154
## 4 Tea 960 172773 78166 26738
## # ℹ 5 more variables: Total_Profit <dbl>, Profit_Margin <dbl>,
## # Budget_Profit <dbl>, Selisih_Profit <dbl>, Pct_vs_Budget <dbl>
analisis_market <- df %>%
group_by(Market, `Market Size`) %>%
summarise(
Total_Sales = sum(Sales),
Total_Profit = sum(Profit),
Profit_Margin = round(sum(Profit) / sum(Sales) * 100, 2),
Avg_Marketing = round(mean(Marketing), 2),
.groups = "drop"
) %>%
arrange(desc(Profit_Margin))
print(analisis_market)
## # A tibble: 8 × 6
## Market `Market Size` Total_Sales Total_Profit Profit_Margin Avg_Marketing
## <chr> <chr> <dbl> <dbl> <dbl> <dbl>
## 1 South Major Market 37410 15766 42.1 28.2
## 2 Central Major Market 152579 59337 38.9 30.1
## 3 East Major Market 138260 48848 35.3 41.2
## 4 West Major Market 96892 31785 32.8 55.8
## 5 Central Small Market 112466 34515 30.7 28.5
## 6 East Small Market 40316 10369 25.7 19.9
## 7 South Small Market 66516 16712 25.1 23.4
## 8 West Small Market 175372 42211 24.1 29.4
# Uji korelasi Pearson antar variabel numerik utama
cor_data <- df %>%
select(Sales, COGS, Marketing, Profit) %>%
cor(method = "pearson")
round(cor_data, 3)
## Sales COGS Marketing Profit
## Sales 1.000 0.887 0.711 0.797
## COGS 0.887 1.000 0.818 0.465
## Marketing 0.711 0.818 1.000 0.225
## Profit 0.797 0.465 0.225 1.000
# Uji signifikansi korelasi Sales dan Profit
cor_test <- cor.test(df$Sales, df$Profit, method = "pearson")
cat("Korelasi Sales vs Profit:", round(cor_test$estimate, 4), "\n")
## Korelasi Sales vs Profit: 0.7973
cat("P-value:", format(cor_test$p.value, scientific = TRUE), "\n")
## P-value: 0e+00
cat("Kesimpulan: Korelasi", ifelse(cor_test$p.value < 0.05, "SIGNIFIKAN", "tidak signifikan"),
"pada α = 0.05\n")
## Kesimpulan: Korelasi SIGNIFIKAN pada α = 0.05
# Model regresi linear sederhana
model_regresi <- lm(Profit ~ Sales, data = df)
summary(model_regresi)
##
## Call:
## lm(formula = Profit ~ Sales, data = df)
##
## Residuals:
## Min 1Q Median 3Q Max
## -606.28 -9.15 11.77 28.31 466.85
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -42.456004 1.527850 -27.79 <2e-16 ***
## Sales 0.536582 0.006233 86.08 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 61.39 on 4246 degrees of freedom
## Multiple R-squared: 0.6357, Adjusted R-squared: 0.6357
## F-statistic: 7410 on 1 and 4246 DF, p-value: < 2.2e-16
# Scatter plot dengan garis regresi
ggplot(df, aes(x = Sales, y = Profit, color = `Product Type`)) +
geom_point(alpha = 0.4, size = 1.5) +
geom_smooth(method = "lm", se = TRUE, color = "black", linewidth = 1.2) +
scale_color_manual(values = c("Coffee" = "#6F4E37", "Espresso" = "#3E2723",
"Tea" = "#A1887F", "Herbal Tea" = "#BCAAA4")) +
labs(
title = "Hubungan Sales dan Profit (Regresi Linear)",
x = "Sales",
y = "Profit",
color = "Jenis Produk"
) +
theme_minimal() +
theme(plot.title = element_text(face = "bold", size = 13))
## `geom_smooth()` using formula = 'y ~ x'
Berdasarkan analisis pada Visualisasi 1 dan Tabel 4.1, dapat disimpulkan sebagai berikut:
Berdasarkan Visualisasi 2 dan Tabel 4.2:
Dari Visualisasi 4 (Heatmap):
Dari Analisis 4.3 dan 4.4:
summary(model)). Model ini menjelaskan cukup besar variasi
dalam Profit, namun perlu diingat bahwa hubungannya tidak sepenuhnya
linear karena faktor lain seperti Marketing dan Market juga
berperan.