Syntax and Control Flow

Practice: Conditional Statements and Loops

1 Setup Library

# Install any package that is not yet available
packages <- c("ggplot2", "dplyr", "tidyr", "scales",
              "gridExtra", "knitr", "kableExtra", "RColorBrewer")

for (pkg in packages) {
  if (!require(pkg, character.only = TRUE, quietly = TRUE)) {
    install.packages(pkg, quiet = TRUE)
    library(pkg, character.only = TRUE)
  }
}

## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union

## 
## Attaching package: 'gridExtra'

## The following object is masked from 'package:dplyr':
## 
##     combine

## 
## Attaching package: 'kableExtra'

## The following object is masked from 'package:dplyr':
## 
##     group_rows

set.seed(42)
cat("✅ All libraries loaded successfully!\n")

## ✅ All libraries loaded successfully!

2 Task 1: Dynamic Multi-Formula Function

Description: Build a function compute_formula(x, formula) that evaluates linear, quadratic, cubic, and exponential formulas, then plot all four formulas on one chart for x = 1:20.

2.1 Function & Computation

# ============================================================
# TASK 1: Dynamic Multi-Formula Function
# ============================================================

compute_formula <- function(x, formula) {
  # Validate the formula name
  valid_formulas <- c("linear", "quadratic", "cubic", "exponential")
  if (!(formula %in% valid_formulas)) {
    stop(paste("❌ Invalid formula:", formula,
               "\nChoose from:", paste(valid_formulas, collapse = ", ")))
  }
  
  # Compute according to the formula type
  result <- switch(formula,
    "linear"      = 2 * x + 1,              # f(x) = 2x + 1
    "quadratic"   = x^2 + 3*x + 2,          # f(x) = x² + 3x + 2
    "cubic"       = x^3 - 2*x^2 + x,        # f(x) = x³ - 2x² + x
    "exponential" = exp(0.3 * x)             # f(x) = e^(0.3x)
  )
  return(result)
}

# ── Nested loop: evaluate every formula for x = 1:20 ──
x_values <- 1:20
formulas <- c("linear", "quadratic", "cubic", "exponential")

# Build the data frame with a nested loop
df_formulas <- data.frame(x = x_values)

for (formula in formulas) {
  values <- c()
  for (x in x_values) {
    values <- c(values, compute_formula(x, formula))  # nested loop
  }
  df_formulas[[formula]] <- values
}

# Display the results table
df_formulas %>%
  mutate(across(where(is.numeric), ~ round(., 2))) %>%
  kable(caption = "📊 Multi-Formula Computation Results (x = 1 to 20)",
        align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  column_spec(1, bold = TRUE, color = "#2c3e50")

📊 Multi-Formula Computation Results (x = 1 to 20)
x	linear	quadratic	cubic	exponential
1	3	6	0	1.35
2	5	12	2	1.82
3	7	20	12	2.46
4	9	30	36	3.32
5	11	42	80	4.48
6	13	56	150	6.05
7	15	72	252	8.17
8	17	90	392	11.02
9	19	110	576	14.88
10	21	132	810	20.09
11	23	156	1100	27.11
12	25	182	1452	36.60
13	27	210	1872	49.40
14	29	240	2366	66.69
15	31	272	2940	90.02
16	33	306	3600	121.51
17	35	342	4352	164.02
18	37	380	5202	221.41
19	39	420	6156	298.87
20	41	462	7220	403.43

2.2 Plot

library(plotly)

## 
## Attaching package: 'plotly'

## The following object is masked from 'package:ggplot2':
## 
##     last_plot

## The following object is masked from 'package:stats':
## 
##     filter

## The following object is masked from 'package:graphics':
## 
##     layout

# ── Reshape to long format ──
df_long <- df_formulas %>%
  pivot_longer(cols = -x, names_to = "formula", values_to = "y") %>%
  mutate(formula = factor(formula,
                          levels = c("linear","quadratic","cubic","exponential"),
                          labels = c("Linear: f(x)=2x+1",
                                     "Quadratic: f(x)=x²+3x+2",
                                     "Cubic: f(x)=x³-2x²+x",
                                     "Exponential: f(x)=e^(0.3x)")))

# ── ggplot chart ──
p <- ggplot(df_long, aes(x = x, y = y, color = formula, group = formula,
                         text = paste("x:", x, "<br>y:", round(y,2)))) +
  geom_line(linewidth = 1.2) +
  geom_point(size = 2.5, alpha = 0.8) +
  scale_color_manual(values = c("#2196F3","#4CAF50","#FF5722","#9C27B0")) +
  scale_x_continuous(breaks = 1:20) +
  labs(
    title    = "📈 Task 1: Multi-Formula Comparison",
    x        = "x",
    y        = "f(x)",
    color    = "Formula"
  ) +
  theme_minimal()

ggplotly(p, tooltip = "text")

✅ Task 1 complete! compute_formula() validates its input and evaluates the four formula types inside a nested loop.
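
The arithmetic inside compute_formula() is already vectorized, so the inner loop over x is not strictly needed. The sketch below shows one way to get the same table with one call per formula. The name compute_formula_vec is hypothetical and is used only so the block runs on its own; the four formulas are copied from the function above.

```r
# Hypothetical vectorized counterpart of compute_formula(); same four formulas.
compute_formula_vec <- function(x, formula) {
  valid_formulas <- c("linear", "quadratic", "cubic", "exponential")
  if (!(formula %in% valid_formulas)) {
    stop("Invalid formula: ", formula)
  }
  switch(formula,
    linear      = 2 * x + 1,
    quadratic   = x^2 + 3 * x + 2,
    cubic       = x^3 - 2 * x^2 + x,
    exponential = exp(0.3 * x)
  )
}

x_values <- 1:20
formulas <- c("linear", "quadratic", "cubic", "exponential")

# One vectorized call per formula: no inner loop over x and no growing vector.
# sapply() returns a 20 x 4 matrix whose columns are named after the formulas.
df_vec <- data.frame(
  x = x_values,
  sapply(formulas, function(f) compute_formula_vec(x_values, f))
)
head(df_vec, 3)
```

The nested loop stays useful as a control-flow exercise; the vectorized form mainly avoids re-allocating `values` on every iteration.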

3 Task 2: Nested Simulation — Multi-Sales & Discounts

Description: Simulate sales data with simulate_sales(n_salesperson, days), applying a conditional discount and computing cumulative sales per salesperson.

3.1 Function & Simulation

# ============================================================
# TASK 2: Nested Simulation — Multi-Sales & Discounts
# ============================================================

# Helper function: determine the discount rate from the sales amount
apply_discount <- function(sales_amount) {
  if (sales_amount >= 900) return(0.15)
  else if (sales_amount >= 700) return(0.10)
  else if (sales_amount >= 500) return(0.07)
  else if (sales_amount >= 300) return(0.05)
  else return(0.00)
}

# Main sales simulation function
simulate_sales <- function(n_salesperson, days) {
  records <- list()
  idx     <- 1
  
  # Nested loop: per salesperson, per day
  for (sp_id in 1:n_salesperson) {
    for (day in 1:days) {
      sales_amount  <- round(runif(1, min = 100, max = 1000), 2)
      discount_rate <- apply_discount(sales_amount)  # nested function call
      net_sales     <- round(sales_amount * (1 - discount_rate), 2)
      
      records[[idx]] <- data.frame(
        sales_id      = sp_id,
        day           = day,
        sales_amount  = sales_amount,
        discount_rate = discount_rate,
        net_sales     = net_sales
      )
      idx <- idx + 1
    }
  }
  return(do.call(rbind, records))
}

# ── Simulation: 5 salespeople × 30 days ──
set.seed(42)
df_sales <- simulate_sales(n_salesperson = 5, days = 30)

# ── Compute cumulative sales per salesperson ──
df_sales <- df_sales %>%
  group_by(sales_id) %>%
  mutate(cumulative_sales = cumsum(net_sales)) %>%
  ungroup()

# ── Summary statistics ──
summary_sales <- df_sales %>%
  group_by(sales_id) %>%
  summarise(
    Total_Sales   = round(sum(sales_amount), 2),
    Total_Net     = round(sum(net_sales), 2),
    Avg_Discount  = paste0(round(mean(discount_rate) * 100, 1), "%"),
    Avg_Daily_Sales = round(mean(sales_amount), 2),
    .groups = "drop"
  ) %>%
  rename("Salesperson ID" = sales_id)

summary_sales %>%
  kable(caption = "📊 Summary Statistics per Salesperson (30 Days)",
        align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  row_spec(0, background = "#3498db", color = "white")

📊 Summary Statistics per Salesperson (30 Days)
Salesperson ID	Total_Sales	Total_Net	Avg_Discount	Avg_Daily_Sales
1	19575.11	17486.71	8.8%	652.50
2	17220.40	15659.25	6.9%	574.01
3	14295.31	13237.04	5.2%	476.51
4	17910.00	16274.47	7.3%	597.00
5	18565.48	16815.22	7.6%	618.85

3.2 Plot

# ── Plot 1: Cumulative Sales per Salesperson ──
p1 <- ggplot(df_sales,
             aes(x = day, y = cumulative_sales,
                 color = factor(sales_id), group = factor(sales_id))) +
  geom_line(linewidth = 1.1) +
  geom_point(size = 1.5, alpha = 0.7) +
  scale_color_brewer(palette = "Set1") +
  labs(
    title  = "📈 Cumulative Net Sales per Salesperson",
    x      = "Day", y = "Cumulative Net Sales",
    color  = "Salesperson"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face = "bold"),
        legend.position = "bottom")

# ── Plot 2: Total Net Sales per Salesperson ──
total_net <- df_sales %>%
  group_by(sales_id) %>%
  summarise(total_net = sum(net_sales), .groups = "drop")

p2 <- ggplot(total_net, aes(x = factor(sales_id), y = total_net,
                             fill = factor(sales_id))) +
  geom_col(width = 0.65, show.legend = FALSE) +
  geom_text(aes(label = paste0("$", format(round(total_net), big.mark=","))),
            vjust = -0.4, size = 3.8, fontface = "bold") +
  scale_fill_brewer(palette = "Set1") +
  scale_x_discrete(labels = paste0("SP ", 1:5)) +
  labs(
    title = "💰 Total Net Sales per Salesperson",
    x     = "Salesperson", y = "Total Net Sales"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face = "bold"),
        panel.grid.major.x = element_blank())

grid.arrange(p1, p2, ncol = 2)

✅ Task 2 complete! simulate_sales() and its helper apply_discount() simulate sales with a conditional discount.
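
The if/else ladder in apply_discount() handles one amount at a time. As a rough sketch, the same tiers can be looked up for a whole vector with findInterval(). The helper name apply_discount_vec and the sample amounts are assumptions made for illustration; the thresholds and rates are copied from apply_discount().

```r
# Tier lower bounds and rates copied from apply_discount().
breaks <- c(300, 500, 700, 900)
rates  <- c(0.00, 0.05, 0.07, 0.10, 0.15)

# Hypothetical vectorized helper. findInterval() counts how many breaks are
# <= each amount (intervals are closed on the left), so adding 1 gives the
# index of the matching rate.
apply_discount_vec <- function(sales_amount) {
  rates[findInterval(sales_amount, breaks) + 1]
}

# Illustrative amounts chosen to sit on and around each threshold.
amounts <- c(100, 299.99, 300, 650, 700, 899.99, 900, 1000)
data.frame(sales_amount  = amounts,
           discount_rate = apply_discount_vec(amounts))
```

With a helper like this, the discount column could be filled in one step after drawing all sales amounts, instead of once per loop iteration.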

4 Task 3: Multi-Level Performance Categorization

Description: A function categorize_performance() with 5 categories: Excellent, Very Good, Good, Average, and Poor.

4.1 Function & Categorization

# ============================================================
# TASK 3: Multi-Level Performance Categorization
# ============================================================

categorize_performance <- function(sales_amount) {
  # Categorize by sales amount
  if      (sales_amount >= 850) return("Excellent")
  else if (sales_amount >= 700) return("Very Good")
  else if (sales_amount >= 500) return("Good")
  else if (sales_amount >= 300) return("Average")
  else                          return("Poor")
}

# ── Loop through the vector to categorize ──
set.seed(99)
sales_vector <- runif(200, min = 100, max = 1000)

categories <- c()
for (amount in sales_vector) {              # loop through vector
  categories <- c(categories, categorize_performance(amount))
}

df_perf <- data.frame(sales_amount = sales_vector, category = categories)

# ── Compute the percentage per category ──
cat_order   <- c("Excellent", "Very Good", "Good", "Average", "Poor")
cat_summary <- df_perf %>%
  group_by(category) %>%
  summarise(Count = n(), .groups = "drop") %>%
  mutate(
    category   = factor(category, levels = cat_order),
    Percentage = round(Count / sum(Count) * 100, 2),
    Label      = paste0(Count, " (", Percentage, "%)")
  ) %>%
  arrange(category)

cat_summary %>%
  select(Category = category, Count, Percentage) %>%
  mutate(Percentage = paste0(Percentage, "%")) %>%
  kable(caption = "📊 Performance Category Distribution",
        align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE) %>%
  row_spec(1, background = "#d5f5e3") %>%
  row_spec(2, background = "#a9dfbf") %>%
  row_spec(3, background = "#fdebd0") %>%
  row_spec(4, background = "#fad7a0") %>%
  row_spec(5, background = "#f5b7b1")

📊 Performance Category Distribution
Category	Count	Percentage
Excellent	30	15%
Very Good	37	18.5%
Good	55	27.5%
Average	36	18%
Poor	42	21%

4.2 Plot

colors_perf <- c("Excellent" = "#1a9641",
                 "Very Good" = "#a6d96a",
                 "Good"      = "#ffffbf",
                 "Average"   = "#fdae61",
                 "Poor"      = "#d7191c")

# ── Bar Chart ──
p_bar <- ggplot(cat_summary,
                aes(x = category, y = Count, fill = category)) +
  geom_col(width = 0.65, show.legend = FALSE) +
  geom_text(aes(label = paste0(Count, "\n(", Percentage, "%)")),
            vjust = -0.3, size = 3.8, fontface = "bold") +
  scale_fill_manual(values = colors_perf) +
  scale_x_discrete(limits = cat_order) +
  labs(title = "📊 Bar Chart: Performance Category Distribution",
       x = "Category", y = "Count") +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face = "bold"),
        panel.grid.major.x = element_blank())

# ── Pie Chart ──
p_pie <- ggplot(cat_summary,
                aes(x = "", y = Count, fill = category)) +
  geom_col(width = 1, color = "white", linewidth = 0.8) +
  coord_polar("y", start = 0) +
  geom_text(aes(label = paste0(Percentage, "%")),
            position = position_stack(vjust = 0.5),
            size = 4, fontface = "bold") +
  scale_fill_manual(values = colors_perf,
                    limits = cat_order) +
  labs(title = "🥧 Pie Chart: Performance Category Distribution",
       fill = "Category") +
  theme_void(base_size = 12) +
  theme(plot.title = element_text(face = "bold", hjust = 0.5),
        legend.position = "right")

grid.arrange(p_bar, p_pie, ncol = 2)

✅ Task 3 complete! categorize_performance() sorts 200 sales values into 5 performance tiers.
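
For comparison, the same five tiers can be assigned without an explicit loop by using cut(). This is a sketch under the assumption that the thresholds are exactly those in categorize_performance(); the name categorize_performance_vec is hypothetical.

```r
cat_order <- c("Excellent", "Very Good", "Good", "Average", "Poor")

# Hypothetical vectorized variant of categorize_performance().
# right = FALSE makes every interval closed on the left, matching the >= tests:
# [-Inf, 300) Poor, [300, 500) Average, [500, 700) Good,
# [700, 850) Very Good, [850, Inf) Excellent.
categorize_performance_vec <- function(sales_amount) {
  cut(sales_amount,
      breaks = c(-Inf, 300, 500, 700, 850, Inf),
      labels = rev(cat_order),
      right  = FALSE)
}

set.seed(99)
sales_vector <- runif(200, min = 100, max = 1000)

# Re-level so the tiers print from best to worst.
category <- factor(categorize_performance_vec(sales_vector), levels = cat_order)

# Counts and percentages per category.
counts <- table(category)
round(100 * prop.table(counts), 2)
```

cut() returns a factor directly, so the separate factor() step in the summary pipeline is only needed to reorder the levels.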

5 Task 4: Multi-Company Dataset Simulation

Description: A function generate_company_data() with a nested loop per company and employee, plus conditional logic to flag top performers (KPI > 90).

5.1 Function & Simulation

# ============================================================
# TASK 4: Multi-Company Dataset Simulation
# ============================================================

departments <- c("Engineering", "Marketing", "Finance", "HR", "Operations")

generate_company_data <- function(n_company, n_employees) {
  records <- list()
  emp_id  <- 1
  
  # Nested loop: per company, per employee
  for (c_id in 1:n_company) {
    for (e in 1:n_employees) {
      salary            <- max(2000, round(rnorm(1, mean = 5000, sd = 1500), 2))
      performance_score <- round(runif(1, 50, 100), 2)
      kpi_score         <- round(runif(1, 50, 100), 2)
      department        <- sample(departments, 1)
      
      # Conditional logic: top performers (KPI > 90)
      is_top <- kpi_score > 90
      
      records[[emp_id]] <- data.frame(
        company_id        = paste0("C", sprintf("%02d", c_id)),
        employee_id       = paste0("E", sprintf("%04d", emp_id)),
        salary            = salary,
        department        = department,
        performance_score = performance_score,
        KPI_score         = kpi_score,
        top_performer     = is_top
      )
      emp_id <- emp_id + 1
    }
  }
  return(do.call(rbind, records))
}

# ── Generate: 4 companies × 50 employees ──
set.seed(7)
df_company <- generate_company_data(n_company = 4, n_employees = 50)

# ── Summary per company ──
company_summary <- df_company %>%
  group_by(company_id) %>%
  summarise(
    Avg_Salary      = round(mean(salary), 2),
    Avg_Performance = round(mean(performance_score), 2),
    Max_KPI         = round(max(KPI_score), 2),
    Top_Performers  = sum(top_performer),
    Total_Employees = n(),
    .groups = "drop"
  )

company_summary %>%
  kable(caption = "📊 Summary per Company",
        align = "c",
        col.names = c("Company", "Avg Salary", "Avg Performance",
                      "Max KPI", "Top Performers", "Total Employees")) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE) %>%
  row_spec(0, background = "#2c3e50", color = "white")

📊 Summary per Company
Company	Avg Salary	Avg Performance	Max KPI	Top Performers	Total Employees
C01	5385.29	77.46	97.45	7	50
C02	4541.99	69.13	99.29	12	50
C03	4939.54	75.80	99.25	7	50
C04	4964.16	76.17	98.51	9	50

5.2 Plot

pal4 <- brewer.pal(4, "Set2")

# 1. Avg Salary per Company
p4a <- ggplot(company_summary,
              aes(x = company_id, y = Avg_Salary, fill = company_id)) +
  geom_col(show.legend = FALSE, width = 0.6) +
  geom_text(aes(label = paste0("$", format(Avg_Salary, big.mark=","))),
            vjust = -0.3, size = 4, fontface = "bold") +
  scale_fill_manual(values = pal4) +
  labs(title = "💵 Avg Salary per Company", x = "Company", y = "Avg Salary") +
  theme_minimal() + theme(plot.title = element_text(face="bold"),
                          panel.grid.major.x = element_blank())

# 2. Avg Performance Score
p4b <- ggplot(company_summary,
              aes(x = company_id, y = Avg_Performance, fill = company_id)) +
  geom_col(show.legend = FALSE, width = 0.6) +
  geom_text(aes(label = round(Avg_Performance, 1)),
            vjust = -0.3, size = 4, fontface = "bold") +
  scale_fill_manual(values = pal4) +
  labs(title = "🏆 Avg Performance Score", x = "Company", y = "Score") +
  theme_minimal() + theme(plot.title = element_text(face="bold"),
                          panel.grid.major.x = element_blank())

# 3. KPI Score — Boxplot
p4c <- ggplot(df_company,
              aes(x = company_id, y = KPI_score, fill = company_id)) +
  geom_boxplot(show.legend = FALSE, alpha = 0.85, outlier.color = "red") +
  scale_fill_manual(values = pal4) +
  labs(title = "📦 KPI Score Distribution (Boxplot)",
       x = "Company", y = "KPI Score") +
  theme_minimal() + theme(plot.title = element_text(face="bold"))

# 4. Top Performers Count
p4d <- ggplot(company_summary,
              aes(x = company_id, y = Top_Performers, fill = company_id)) +
  geom_col(show.legend = FALSE, width = 0.6) +
  geom_text(aes(label = Top_Performers), vjust = -0.3, size = 4.5, fontface = "bold") +
  scale_fill_manual(values = pal4) +
  labs(title = "⭐ Top Performers (KPI > 90)", x = "Company", y = "Count") +
  theme_minimal() + theme(plot.title = element_text(face="bold"),
                          panel.grid.major.x = element_blank())

grid.arrange(p4a, p4b, p4c, p4d, ncol = 2,
             top = "Task 4: Multi-Company Dataset Analysis")

✅ Task 4 complete! generate_company_data() uses a nested loop and conditional logic to identify top performers.
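
As a sanity check on the Top Performers column, the expected count follows from the simulation settings: KPI scores are drawn from Uniform(50, 100), so P(KPI > 90) = 10/50 = 0.2, and the count among 50 employees is approximately Binomial(50, 0.2). The sketch below works this out and compares it with a small simulation; the seed and the number of replications are arbitrary choices, and rounding the scores to two decimals is ignored.

```r
n_employees <- 50
p_top       <- (100 - 90) / (100 - 50)   # P(KPI > 90) under Uniform(50, 100)

# Mean and standard deviation of a Binomial(n_employees, p_top) count.
expected_top <- n_employees * p_top
sd_top       <- sqrt(n_employees * p_top * (1 - p_top))

# Simulated check (hypothetical seed): top-performer counts in 2000 companies.
set.seed(1)
sim_counts <- replicate(2000, sum(runif(n_employees, 50, 100) > 90))

c(expected = expected_top, sd = sd_top, simulated_mean = mean(sim_counts))
```

The expected count is 10 with a standard deviation of about 2.8, so the observed counts of 7, 12, 7, and 9 in the summary table are ordinary sampling variation.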

6 Task 5: Monte Carlo Simulation — Pi & Probability

Description: Estimate π with a Monte Carlo simulation, compute the probability that a point falls in a sub-square, and visualize the points inside vs. outside the circle.

6.1 Function & Results

# ============================================================
# TASK 5: Monte Carlo Simulation — Pi & Probability
# ============================================================

monte_carlo_pi <- function(n_points, seed = 42) {
  set.seed(seed)
  inside_circle  <- 0
  in_subsquare   <- 0
  
  x_in <- c(); y_in <- c()
  x_out <- c(); y_out <- c()
  
  # Iterate over the points
  for (i in 1:n_points) {
    x <- runif(1, -1, 1)
    y <- runif(1, -1, 1)
    
    # Check whether the point lies inside the unit circle
    if (x^2 + y^2 <= 1) {
      inside_circle <- inside_circle + 1
      x_in <- c(x_in, x); y_in <- c(y_in, y)
    } else {
      x_out <- c(x_out, x); y_out <- c(y_out, y)
    }
    
    # Check the sub-square [0, 0.5] × [0, 0.5]
    if (x >= 0 && x <= 0.5 && y >= 0 && y <= 0.5) {
      in_subsquare <- in_subsquare + 1
    }
  }
  
  pi_estimate    <- 4 * inside_circle / n_points
  subsquare_prob <- in_subsquare / n_points
  
  list(
    pi_estimate    = pi_estimate,
    inside_circle  = inside_circle,
    outside_circle = n_points - inside_circle,
    subsquare_prob = subsquare_prob,
    x_in = x_in, y_in = y_in,
    x_out = x_out, y_out = y_out
  )
}

# ── Test with several values of n_points ──
n_list    <- c(100, 500, 1000, 5000, 10000)
pi_results <- data.frame(
  n_points    = n_list,
  pi_estimate = sapply(n_list, function(n) monte_carlo_pi(n)$pi_estimate)
) %>%
  mutate(error = round(abs(pi_estimate - pi), 6),
         pi_estimate = round(pi_estimate, 6))

pi_results %>%
  kable(caption = paste("🎯 Convergence of the π Estimate (true π =", round(pi, 6), ")"),
        align = "c",
        col.names = c("n Points", "π Estimate", "Absolute Error")) %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE) %>%
  row_spec(5, bold = TRUE, background = "#d5f5e3")

🎯 Convergence of the π Estimate (true π = 3.141593 )
n Points	π Estimate	Absolute Error
100	3.0400	0.101593
500	3.0880	0.053593
1000	3.0920	0.049593
5000	3.0992	0.042393
10000	3.1208	0.020793

final <- monte_carlo_pi(10000)
cat(sprintf("\n🎯 π estimate (n = 10,000): %.5f\n", final$pi_estimate))

## 
## 🎯 π estimate (n = 10,000): 3.12080

cat(sprintf("📦 Sub-square probability: %.4f  (expected: 0.0625)\n",
            final$subsquare_prob))

## 📦 Sub-square probability: 0.0625  (expected: 0.0625)

6.2 Plot

# ── Plot 1: Scatter Inside vs Outside Circle ──
df_circle <- rbind(
  data.frame(x = final$x_in,  y = final$y_in,  status = "Inside Circle"),
  data.frame(x = final$x_out, y = final$y_out, status = "Outside Circle")
)

# Build the circle & sub-square outline data
theta  <- seq(0, 2*pi, length.out = 300)
circle_df  <- data.frame(x = cos(theta), y = sin(theta))
square_df  <- data.frame(
  x = c(0, 0.5, 0.5, 0, 0),
  y = c(0, 0, 0.5, 0.5, 0)
)

p5a <- ggplot(df_circle, aes(x = x, y = y, color = status)) +
  geom_point(size = 0.3, alpha = 0.4) +
  geom_path(data = circle_df, aes(x, y),
            color = "navy", linewidth = 1.2, inherit.aes = FALSE) +
  geom_path(data = square_df, aes(x, y),
            color = "orange", linewidth = 1.2, linetype = "dashed",
            inherit.aes = FALSE) +
  scale_color_manual(values = c("Inside Circle" = "#27ae60",
                                "Outside Circle" = "#e74c3c")) +
  coord_fixed() +
  labs(
    title = sprintf("🎯 Monte Carlo π ≈ %.5f (n = 10,000)", final$pi_estimate),
    color = NULL
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold"),
        legend.position = "bottom")

# ── Plot 2: Convergence of the π estimate ──
p5b <- ggplot(pi_results, aes(x = n_points, y = pi_estimate)) +
  geom_line(color = "#2196F3", linewidth = 1.2) +
  geom_point(color = "#2196F3", size = 3) +
  geom_hline(yintercept = pi, color = "red",
             linetype = "dashed", linewidth = 1) +
  annotate("text", x = max(n_list)*0.6, y = pi + 0.015,
           label = paste0("True π = ", round(pi, 5)),
           color = "red", size = 3.8) +
  scale_x_continuous(labels = scales::comma) +
  labs(
    title = "📉 Convergence of the π Estimate",
    x     = "Number of Points",
    y     = "π Estimate"
  ) +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold"))

grid.arrange(p5a, p5b, ncol = 2)

✅ Task 5 complete! monte_carlo_pi() estimates π with an iterative loop and also computes the sub-square probability.
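
The slow convergence in the table is expected: the estimate is 4 times a sample proportion, so its standard error is 4·sqrt(p(1−p)/n) with p = π/4, which shrinks only like 1/sqrt(n). The sketch below is a vectorized variant that also reports this standard error. The name monte_carlo_pi_vec is hypothetical, and because it draws all x values before all y values, it consumes the random stream in a different order than the loop, so its estimates will not match the table for the same seed.

```r
# Hypothetical vectorized variant of monte_carlo_pi(); draws all points at once.
monte_carlo_pi_vec <- function(n_points, seed = 42) {
  set.seed(seed)
  x <- runif(n_points, -1, 1)
  y <- runif(n_points, -1, 1)

  inside <- x^2 + y^2 <= 1     # TRUE for points inside the unit circle
  p_hat  <- mean(inside)       # estimated P(inside) = pi / 4

  list(
    pi_estimate    = 4 * p_hat,
    # Standard error of 4 * p_hat for a binomial proportion.
    std_error      = 4 * sqrt(p_hat * (1 - p_hat) / n_points),
    subsquare_prob = mean(x >= 0 & x <= 0.5 & y >= 0 & y <= 0.5)
  )
}

res <- monte_carlo_pi_vec(10000)
res$pi_estimate
res$std_error
```

For n = 10,000 the standard error is roughly 0.016, so the absolute error of 0.0208 reported in the table is a little over one standard error and well within the expected range.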

7 Task 6: Advanced Data Transformation & Feature Engineering

Description: Loop-based functions normalize_columns() and z_score(), feature engineering, and a before-and-after comparison of the distributions.

7.1 Function & Transformation

# ============================================================
# TASK 6: Advanced Data Transformation & Feature Engineering
# ============================================================

normalize_columns <- function(df, cols) {
  # Min-max normalization using a loop
  for (col in cols) {
    col_min <- min(df[[col]], na.rm = TRUE)
    col_max <- max(df[[col]], na.rm = TRUE)
    df[[paste0(col, "_norm")]] <- (df[[col]] - col_min) / (col_max - col_min)
  }
  return(df)
}

z_score <- function(df, cols) {
  # Z-score standardization using a loop
  for (col in cols) {
    mu    <- mean(df[[col]], na.rm = TRUE)
    sigma <- sd(df[[col]], na.rm = TRUE)
    df[[paste0(col, "_zscore")]] <- (df[[col]] - mu) / sigma
  }
  return(df)
}

# ── Use df_company from Task 4 ──
df_t6 <- df_company

# ── Feature Engineering ──
# 1. performance_category
df_t6$performance_category <- ifelse(
  df_t6$performance_score >= 80, "High",
  ifelse(df_t6$performance_score >= 65, "Mid", "Low")
)

# 2. salary_bracket
df_t6$salary_bracket <- cut(
  df_t6$salary,
  breaks = c(0, 3000, 5000, 7000, Inf),
  labels = c("Low", "Medium", "High", "Very High"),
  include.lowest = TRUE
)

# ── Apply the transformations ──
num_cols <- c("salary", "performance_score", "KPI_score")
df_t6    <- normalize_columns(df_t6, num_cols)
df_t6    <- z_score(df_t6, num_cols)

# Show a preview
df_t6 %>%
  select(employee_id, salary, salary_norm, salary_zscore,
         performance_score, performance_score_norm,
         performance_category, salary_bracket) %>%
  head(8) %>%
  mutate(across(where(is.numeric), ~round(., 3))) %>%
  kable(caption = "📊 Data Preview After Transformation & Feature Engineering",
        align = "c") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = TRUE, font_size = 12) %>%
  row_spec(0, background = "#8e44ad", color = "white")

📊 Data Preview After Transformation & Feature Engineering
employee_id	salary	salary_norm	salary_zscore	performance_score	performance_score_norm	performance_category	salary_bracket
E0001	8430.87	0.886	2.314	55.78	0.107	Low	Very High
E0002	4381.56	0.328	-0.384	58.29	0.158	Low	Medium
E0003	6122.21	0.568	0.776	72.67	0.449	Mid	High
E0004	8284.97	0.866	2.217	81.97	0.638	High	Very High
E0005	8422.18	0.885	2.308	81.35	0.625	High	Very High
E0006	5701.52	0.510	0.496	59.29	0.178	Low	High
E0007	6535.63	0.625	1.051	89.53	0.791	High	High
E0008	4549.43	0.351	-0.272	71.84	0.433	Mid	Medium

7.2 Plot — Before & After

# ── Histograms: before & after for the 3 columns ──
plot_list <- list()
col_labels <- c("Salary", "Performance Score", "KPI Score")

for (i in seq_along(num_cols)) {
  col <- num_cols[i]
  lbl <- col_labels[i]
  
  # aes_string() is deprecated. .data[[!!name]] injects the column name right
  # away, so each plot keeps its own column after the loop variable changes.
  norm_col <- paste0(col, "_norm")
  z_col    <- paste0(col, "_zscore")
  
  # Original
  p_orig <- ggplot(df_t6, aes(x = .data[[!!col]])) +
    geom_histogram(bins = 15, fill = "#3498db", color = "white", alpha = 0.85) +
    labs(title = paste(lbl, "- Original"), x = col, y = "Freq") +
    theme_minimal(base_size = 11) +
    theme(plot.title = element_text(face = "bold", size = 10))
  
  # Normalized
  p_norm <- ggplot(df_t6, aes(x = .data[[!!norm_col]])) +
    geom_histogram(bins = 15, fill = "#2ecc71", color = "white", alpha = 0.85) +
    labs(title = paste(lbl, "- Min-Max Norm"), x = "Normalized", y = "Freq") +
    theme_minimal(base_size = 11) +
    theme(plot.title = element_text(face = "bold", size = 10))
  
  # Z-Score
  p_z <- ggplot(df_t6, aes(x = .data[[!!z_col]])) +
    geom_histogram(bins = 15, fill = "#e74c3c", color = "white", alpha = 0.85) +
    labs(title = paste(lbl, "- Z-Score"), x = "Z-Score", y = "Freq") +
    theme_minimal(base_size = 11) +
    theme(plot.title = element_text(face = "bold", size = 10))
  
  plot_list <- c(plot_list, list(p_orig, p_norm, p_z))
}

do.call(grid.arrange, c(plot_list, ncol = 3,
        top = "Task 6: Distributions Before & After Transformation"))

7.3 Plot — Boxplot

ggplot(df_t6, aes(x = salary_bracket, y = KPI_score, fill = salary_bracket)) +
  geom_boxplot(alpha = 0.8, outlier.color = "red", show.legend = FALSE) +
  scale_fill_brewer(palette = "Set3") +
  scale_x_discrete(limits = c("Low", "Medium", "High", "Very High")) +
  labs(
    title = "📦 KPI Score by Salary Bracket",
    x     = "Salary Bracket",
    y     = "KPI Score"
  ) +
  theme_minimal(base_size = 13) +
  theme(plot.title = element_text(face = "bold"))

✅ Task 6 complete! Loop-based normalization and z-score functions, feature engineering, and before-vs-after visualizations are in place.
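
One edge case worth noting: min-max normalization divides by (max − min), which is zero for a constant column and yields NaN. The sketch below shows a guarded per-vector version and checks the basic properties of both transforms on a toy data frame. The helper names min_max and z_std, the toy values, and the choice to map a constant column to 0 are all assumptions for illustration.

```r
# Hypothetical guarded min-max scaling for a single numeric vector.
min_max <- function(v) {
  rng <- range(v, na.rm = TRUE)
  # Constant column: max == min would divide by zero, so return zeros.
  # v * 0 keeps any NA values as NA.
  if (rng[1] == rng[2]) return(v * 0)
  (v - rng[1]) / (rng[2] - rng[1])
}

# Z-score standardization for a single numeric vector.
z_std <- function(v) (v - mean(v, na.rm = TRUE)) / sd(v, na.rm = TRUE)

# Toy data: one varying column and one constant column.
toy <- data.frame(salary = c(2000, 3500, 5000, 8000), flat = c(7, 7, 7, 7))

range(min_max(toy$salary))   # 0 and 1
min_max(toy$flat)            # all 0 instead of NaN
c(mean = mean(z_std(toy$salary)), sd = sd(z_std(toy$salary)))   # approx. 0 and 1
```

The z-score has the same issue when sd() is zero; whether to return 0, NA, or raise an error in that case depends on how the column is used downstream.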

8 Task 7: Mini Project — Company KPI Dashboard

Description: A full dataset of 7 companies × 80 employees with KPI tiers, grouped bar charts, a scatter plot with regression lines, and a department analysis.

8.1 Data & Summary

# ============================================================
# TASK 7: Mini Project — Company KPI Dashboard
# ============================================================

set.seed(2024)
df_dash <- generate_company_data(n_company = 7, n_employees = 80)

# ── KPI Tier Categorization (loop-based) ──
kpi_tier <- function(score) {
  if      (score >= 90) return("Platinum")
  else if (score >= 75) return("Gold")
  else if (score >= 60) return("Silver")
  else                  return("Bronze")
}

kpi_tiers <- c()
for (score in df_dash$KPI_score) {   # loop per employee
  kpi_tiers <- c(kpi_tiers, kpi_tier(score))
}
df_dash$KPI_tier <- factor(kpi_tiers,
                            levels = c("Platinum","Gold","Silver","Bronze"))

# ── Summary per Company ──
dash_summary <- df_dash %>%
  group_by(company_id) %>%
  summarise(
    Avg_Salary    = round(mean(salary), 0),
    Avg_KPI       = round(mean(KPI_score), 2),
    Top_Performers = sum(top_performer),
    Total_Emp     = n(),
    .groups = "drop"
  )

dash_summary %>%
  kable(caption = "📋 Company KPI Summary (7 Companies × 80 Employees)",
        align   = "c",
        col.names = c("Company","Avg Salary","Avg KPI","Top Performers","Total Emp")) %>%
  kable_styling(bootstrap_options = c("striped","hover","condensed"),
                full_width = FALSE) %>%
  row_spec(0, background = "#2c3e50", color = "white") %>%
  column_spec(3, bold = TRUE, color = "#2980b9")

📋 Company KPI Summary (7 Companies × 80 Employees)
Company	Avg Salary	Avg KPI	Top Performers	Total Emp
C01	4925	72.50	11	80
C02	5022	76.43	15	80
C03	4990	74.02	14	80
C04	4860	73.43	12	80
C05	4759	73.37	12	80
C06	5102	76.33	15	80
C07	5448	75.66	17	80

8.2 KPI Tier & Top Performers

tier_colors <- c("Platinum" = "#8e44ad",
                 "Gold"     = "#f39c12",
                 "Silver"   = "#95a5a6",
                 "Bronze"   = "#a04000")

# ── Grouped Bar: KPI Tier per Company ──
tier_counts <- df_dash %>%
  group_by(company_id, KPI_tier) %>%
  summarise(count = n(), .groups = "drop")

p7a <- ggplot(tier_counts,
              aes(x = company_id, y = count, fill = KPI_tier)) +
  geom_col(position = "dodge", width = 0.75) +
  scale_fill_manual(values = tier_colors) +
  labs(title = "🏅 KPI Tier Distribution per Company",
       x = "Company", y = "Count", fill = "KPI Tier") +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face="bold"),
        legend.position = "bottom")

# ── Top Performers per Company ──
p7b <- ggplot(dash_summary,
              aes(x = company_id, y = Top_Performers, fill = company_id)) +
  geom_col(show.legend = FALSE, width = 0.6) +
  geom_text(aes(label = Top_Performers), vjust = -0.3,
            size = 4.5, fontface = "bold") +
  scale_fill_brewer(palette = "Set1") +
  labs(title = "⭐ Top Performers per Company",
       x = "Company", y = "Count") +
  theme_minimal(base_size = 12) +
  theme(plot.title = element_text(face="bold"),
        panel.grid.major.x = element_blank())

grid.arrange(p7a, p7b, ncol = 2)

8.3 Scatter + Regression

ggplot(df_dash, aes(x = salary, y = KPI_score,
                    color = company_id)) +
  geom_point(size = 1.8, alpha = 0.55) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 1,
              aes(fill = company_id), alpha = 0.1) +
  scale_color_brewer(palette = "Set1") +
  scale_fill_brewer(palette = "Set1") +
  labs(
    title    = "📊 Salary vs KPI Score with Regression Lines per Company",
    subtitle = "Each color represents one company; lines are linear regressions",
    x        = "Salary",
    y        = "KPI Score",
    color    = "Company",
    fill     = "Company"
  ) +
  theme_minimal(base_size = 12) +
  theme(plot.title    = element_text(face = "bold"),
        legend.position = "bottom") +
  guides(color = guide_legend(nrow = 1))

## `geom_smooth()` using formula = 'y ~ x'

8.4 Salary & Department Analysis

# ── Boxplot Salary per Company ──
p_sal <- ggplot(df_dash,
                aes(x = company_id, y = salary, fill = company_id)) +
  geom_boxplot(alpha = 0.85, show.legend = FALSE,
               outlier.color = "red", outlier.size = 1.5) +
  scale_fill_brewer(palette = "Set1") +
  labs(title = "💰 Salary Distribution per Company",
       x = "Company", y = "Salary") +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold"))

# ── Department: Avg KPI per Company (Grouped Bar) ──
dept_kpi <- df_dash %>%
  group_by(department, company_id) %>%
  summarise(avg_kpi = round(mean(KPI_score), 2), .groups = "drop")

p_dept <- ggplot(dept_kpi,
                 aes(x = department, y = avg_kpi, fill = company_id)) +
  geom_col(position = "dodge", width = 0.75) +
  scale_fill_brewer(palette = "Set1") +
  labs(title = "🏢 Avg KPI per Department per Company",
       x = "Department", y = "Avg KPI Score",
       fill = "Company") +
  theme_minimal() +
  theme(plot.title = element_text(face = "bold"),
        axis.text.x = element_text(angle = 15, hjust = 1),
        legend.position = "bottom")

grid.arrange(p_sal, p_dept, nrow = 2,
             top = "Task 7: Company KPI Dashboard — Salary & Department Analysis")

✅ Task 7 complete! The dashboard includes a grouped bar chart, a scatter plot with regression lines, a boxplot, and a department analysis.

9 Task 8 (BONUS): Automated Report Generation

Description: Use a function and a loop to generate an automated summary report for each company, complete with tables, statistics, and plots.
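
The function-plus-loop pattern can be sketched in a few lines of base R before looking at the full implementation. The data frame `toy` and the helper `summarise_company()` are hypothetical stand-ins and are far simpler than the practicum's own function.

```r
# Hypothetical toy data standing in for the practicum's employee table
toy <- data.frame(
  company_id = c("C01", "C01", "C02", "C02", "C02"),
  salary     = c(3000, 5000, 4000, 6000, 8000),
  KPI_score  = c(70, 90, 60, 80, 100)
)

# Summarise one company; returns a one-row data frame
summarise_company <- function(df, c_id) {
  sub <- df[df$company_id == c_id, ]
  data.frame(
    company    = c_id,
    n_emp      = nrow(sub),
    avg_salary = mean(sub$salary),
    avg_kpi    = mean(sub$KPI_score)
  )
}

# Loop over the company ids and collect one summary row per company
ids     <- sort(unique(toy$company_id))
reports <- vector("list", length(ids))
for (i in seq_along(ids)) {
  reports[[i]] <- summarise_company(toy, ids[i])
}
report_tbl <- do.call(rbind, reports)
print(report_tbl)
```

Keeping the per-company logic in one function means the loop body stays short, and adding a new statistic only requires touching the function.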

9.1 Report Function

# ============================================================
# TASK 8 (BONUS): Automated Report Generation
# ============================================================

# Function: build a summary for one company automatically
generate_company_report <- function(df, c_id) {
  df_sub <- df %>% filter(company_id == c_id)
  
  # Main statistics
  stats <- list(
    company      = c_id,
    n_emp        = nrow(df_sub),
    avg_salary   = round(mean(df_sub$salary), 0),
    avg_kpi      = round(mean(df_sub$KPI_score), 2),
    avg_perf     = round(mean(df_sub$performance_score), 2),
    n_top        = sum(df_sub$top_performer),
    max_kpi      = round(max(df_sub$KPI_score), 2),
    min_salary   = round(min(df_sub$salary), 0),
    max_salary   = round(max(df_sub$salary), 0)
  )
  
  # Top 5 performers
  top5 <- df_sub %>%
    arrange(desc(KPI_score)) %>%
    select(employee_id, department, salary, performance_score, KPI_score) %>%
    head(5) %>%
    mutate(across(where(is.numeric), ~round(., 2)))
  
  # KPI Tier distribution
  tier_dist <- df_sub %>%
    group_by(KPI_tier) %>%
    summarise(count = n(), pct = round(n()/nrow(df_sub)*100, 1), .groups="drop") %>%
    arrange(KPI_tier)
  
  # Department summary
  dept_sum <- df_sub %>%
    group_by(department) %>%
    summarise(
      Avg_KPI    = round(mean(KPI_score), 2),
      Avg_Salary = round(mean(salary), 0),
      Count      = n(),
      .groups = "drop"
    ) %>%
    arrange(desc(Avg_KPI))
  
  return(list(stats = stats, top5 = top5,
              tier_dist = tier_dist, dept_sum = dept_sum,
              data = df_sub))
}

cat("✅ Function generate_company_report() is ready to use.\n")

## ✅ Function generate_company_report() is ready to use.

cat("✅ The loop below will generate a report for every company.\n")

## ✅ The loop below will generate a report for every company.

9.2 Loop — Auto Report

company_ids <- sort(unique(df_dash$company_id))
pal_tier    <- c("Platinum"="#8e44ad","Gold"="#f39c12","Silver"="#95a5a6","Bronze"="#a04000")

# ── LOOP: Generate a report for each company ──
for (c_id in company_ids) {
  
  rpt <- generate_company_report(df_dash, c_id)
  s   <- rpt$stats
  
  # ── Section Header ──
  cat(sprintf("\n\n### 📌 %s — Automated Report\n\n", c_id))
  
  # ── KPI Stat Cards (in a small table) ──
  stats_df <- data.frame(
    Metric = c("Total Employees", "Avg Salary", "Avg KPI Score",
               "Avg Performance", "Top Performers", "Max KPI",
               "Min Salary", "Max Salary"),
    Value  = c(s$n_emp,
               paste0("$", format(s$avg_salary, big.mark=",")),
               s$avg_kpi, s$avg_perf, s$n_top, s$max_kpi,
               paste0("$", format(s$min_salary, big.mark=",")),
               paste0("$", format(s$max_salary, big.mark=",")))
  )
  
  print(
    stats_df %>%
      kable(align = "c",
            caption = paste("📊 Key Metrics —", c_id)) %>%
      kable_styling(bootstrap_options = c("striped","hover"),
                    full_width = FALSE, font_size = 12) %>%
      row_spec(0, background = "#2c3e50", color = "white") %>%
      column_spec(2, bold = TRUE, color = "#2980b9")
  )
  
  # ── Top 5 Performers ──
  cat("\n**🥇 Top 5 Performers:**\n\n")
  print(
    rpt$top5 %>%
      kable(align = "c",
            col.names = c("Employee ID","Department","Salary","Performance","KPI Score")) %>%
      kable_styling(bootstrap_options = c("striped","hover","condensed"),
                    full_width = FALSE, font_size = 12) %>%
      row_spec(1, bold = TRUE, background = "#d5f5e3")
  )
  
  # ── KPI Tier Distribution ──
  cat("\n**🏷️ KPI Tier Breakdown:**\n\n")
  print(
    rpt$tier_dist %>%
      mutate(label = paste0(count, " employees (", pct, "%)")) %>%
      select(KPI_tier, label) %>%
      kable(align = "c", col.names = c("KPI Tier","Count & Percentage")) %>%
      kable_styling(bootstrap_options = c("striped"),
                    full_width = FALSE, font_size = 12)
  )
  
  # ── Department Summary ──
  cat("\n**🏢 Department Summary:**\n\n")
  print(
    rpt$dept_sum %>%
      kable(align = "c",
            col.names = c("Department","Avg KPI","Avg Salary","Employee Count")) %>%
      kable_styling(bootstrap_options = c("striped","hover"),
                    full_width = FALSE, font_size = 12)
  )
  
  # ── Plot for this company ──
  df_sub <- rpt$data
  
  p_hist <- ggplot(df_sub, aes(x = KPI_score)) +
    geom_histogram(bins = 15, fill = "#3498db", color = "white", alpha = 0.85) +
    geom_vline(xintercept = mean(df_sub$KPI_score),
               color = "red", linetype = "dashed", linewidth = 1) +
    annotate("text", x = mean(df_sub$KPI_score) + 1.5, y = Inf,
             label = paste0("Mean: ", round(mean(df_sub$KPI_score),1)),
             vjust = 1.5, color = "red", size = 3.5) +
    labs(title = paste(c_id, "— KPI Score Distribution"),
         x = "KPI Score", y = "Count") +
    theme_minimal(base_size = 11) +
    theme(plot.title = element_text(face = "bold"))
  
  p_dept2 <- ggplot(rpt$dept_sum,
                    aes(x = reorder(department, Avg_KPI), y = Avg_KPI,
                        fill = Avg_KPI)) +
    geom_col(show.legend = FALSE) +
    geom_text(aes(label = Avg_KPI), hjust = -0.2, size = 3.5, fontface = "bold") +
    scale_fill_gradient(low = "#f39c12", high = "#27ae60") +
    coord_flip() +
    labs(title = paste(c_id, "— Avg KPI per Department"),
         x = "Department", y = "Avg KPI") +
    theme_minimal(base_size = 11) +
    theme(plot.title = element_text(face = "bold"))
  
  grid.arrange(p_hist, p_dept2, ncol = 2)
  
  # Blank lines around the rule keep Markdown from treating it as a heading underline
  cat("\n\n---\n\n")
}
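
Headings written with `cat()` and tables printed from inside a `for` loop only render as formatted output when the chunk is knitted with `results = 'asis'`. The chunk header is not shown in this report, so that option is an assumption here. A minimal sketch of the pattern, using a made-up data frame `toy` and a hypothetical helper `render_group_sections()`:

```r
library(knitr)

# Hypothetical toy data; not the practicum's df_dash
toy <- data.frame(group = c("A", "A", "B"), value = c(1, 3, 10))

# Emit one Markdown section per group: heading, table, horizontal rule
render_group_sections <- function(df) {
  for (g in unique(df$group)) {
    cat(sprintf("\n\n### Group %s\n\n", g))            # Markdown heading
    sub <- df[df$group == g, ]
    # Inside a loop nothing is auto-printed, so print() must be explicit
    print(kable(sub, format = "pipe", row.names = FALSE))
    cat("\n\n---\n\n")                                 # rule isolated by blank lines
  }
}

render_group_sections(toy)
```

In an R Markdown document this would sit in a chunk declared as `{r, results='asis'}` so that the emitted Markdown is passed through instead of being shown as console output.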

9.2.1 📌 C01 — Automated Report

📊 Key Metrics — C01
Metric	Value
Total Employees	80
Avg Salary	$4,925
Avg KPI Score	72.5
Avg Performance	76.41
Top Performers	11
Max KPI	99.71
Min Salary	$2,000
Max Salary	$8,897

🥇 Top 5 Performers:

Employee ID	Department	Salary	Performance	KPI Score
E0034	Operations	4218.15	62.02	99.71
E0074	Operations	3307.80	93.34	99.46
E0019	HR	3645.48	75.34	98.69
E0044	Marketing	4602.81	90.89	96.84
E0075	HR	2280.26	75.04	94.30

🏷️ KPI Tier Breakdown:

KPI Tier	Count & Percentage
Platinum	11 employees (13.8%)
Gold	22 employees (27.5%)
Silver	26 employees (32.5%)
Bronze	21 employees (26.2%)

🏢 Department Summary:

Department	Avg KPI	Avg Salary	Employee Count
Operations	76.87	5027	13
HR	74.28	4913	18
Marketing	73.30	4707	17
Finance	72.36	4911	16
Engineering	66.26	5101	16

9.3.1 📌 C02 — Automated Report

📊 Key Metrics — C02
Metric	Value
Total Employees	80
Avg Salary	$5,022
Avg KPI Score	76.43
Avg Performance	73.31
Top Performers	15
Max KPI	97.79
Min Salary	$2,000
Max Salary	$8,249

🥇 Top 5 Performers:

Employee ID	Department	Salary	Performance	KPI Score
E0095	Marketing	5828.61	56.24	97.79
E0142	Finance	6809.20	75.29	97.77
E0112	Finance	5873.98	63.51	97.22
E0155	Marketing	5687.39	56.00	96.39
E0157	Operations	6650.24	93.63	96.29

🏷️ KPI Tier Breakdown:

KPI Tier	Count & Percentage
Platinum	15 employees (18.8%)
Gold	31 employees (38.8%)
Silver	23 employees (28.7%)
Bronze	11 employees (13.8%)

🏢 Department Summary:

Department	Avg KPI	Avg Salary	Employee Count
Operations	79.69	4229	11
Finance	77.09	5477	19
Engineering	75.92	5440	16
HR	75.90	4853	15
Marketing	74.74	4810	19

9.4.1 📌 C03 — Automated Report

📊 Key Metrics — C03
Metric	Value
Total Employees	80
Avg Salary	$4,990
Avg KPI Score	74.02
Avg Performance	74.35
Top Performers	14
Max KPI	98.94
Min Salary	$2,215
Max Salary	$8,350

🥇 Top 5 Performers:

Employee ID	Department	Salary	Performance	KPI Score
E0194	Operations	3271.21	97.97	98.94
E0203	HR	4868.60	79.44	98.76
E0219	Operations	5195.30	79.28	97.33
E0239	Operations	4527.36	71.29	96.66
E0216	HR	4342.96	68.31	96.58

🏷️ KPI Tier Breakdown:

KPI Tier	Count & Percentage
Platinum	14 employees (17.5%)
Gold	19 employees (23.8%)
Silver	32 employees (40%)
Bronze	15 employees (18.8%)

🏢 Department Summary:

Department	Avg KPI	Avg Salary	Employee Count
Operations	80.54	5183	18
Engineering	74.18	5314	18
HR	73.70	4623	23
Finance	69.42	5098	9
Marketing	68.07	4837	12

9.5.1 📌 C04 — Automated Report

📊 Key Metrics — C04
Metric	Value
Total Employees	80
Avg Salary	$4,860
Avg KPI Score	73.43
Avg Performance	75.24
Top Performers	12
Max KPI	99.22
Min Salary	$2,000
Max Salary	$8,284

🥇 Top 5 Performers:

Employee ID	Department	Salary	Performance	KPI Score
E0305	Finance	6471.22	77.83	99.22
E0308	HR	5725.36	99.56	99.08
E0273	Operations	3487.90	74.52	98.65
E0284	Finance	7078.38	71.67	97.44
E0248	Engineering	5757.39	80.87	96.58

🏷️ KPI Tier Breakdown:

KPI Tier	Count & Percentage
Platinum	12 employees (15%)
Gold	24 employees (30%)
Silver	27 employees (33.8%)
Bronze	17 employees (21.2%)

🏢 Department Summary:

Department	Avg KPI	Avg Salary	Employee Count
Finance	78.06	5155	13
HR	73.94	5060	12
Engineering	73.71	4864	13
Marketing	72.70	4826	22
Operations	70.73	4582	20

9.6.1 📌 C05 — Automated Report

📊 Key Metrics — C05
Metric	Value
Total Employees	80
Avg Salary	$4,759
Avg KPI Score	73.37
Avg Performance	74.44
Top Performers	12
Max KPI	99.92
Min Salary	$2,000
Max Salary	$8,772

🥇 Top 5 Performers:

Employee ID	Department	Salary	Performance	KPI Score
E0323	Finance	6395.96	67.19	99.92
E0336	Engineering	2520.98	89.65	99.79
E0344	Engineering	5202.98	61.86	97.65
E0349	Finance	2121.47	65.93	95.82
E0339	Engineering	5385.21	95.54	95.59

🏷️ KPI Tier Breakdown:

KPI Tier	Count & Percentage
Platinum	12 employees (15%)
Gold	24 employees (30%)
Silver	26 employees (32.5%)
Bronze	18 employees (22.5%)

🏢 Department Summary:

Department	Avg KPI	Avg Salary	Employee Count
Engineering	79.69	4578	14
Finance	74.59	4622	16
Operations	71.65	4578	17
Marketing	71.29	5489	15
HR	70.71	4583	18

9.7.1 📌 C06 — Automated Report

📊 Key Metrics — C06
Metric	Value
Total Employees	80
Avg Salary	$5,102
Avg KPI Score	76.33
Avg Performance	75.51
Top Performers	15
Max KPI	98.38
Min Salary	$2,000
Max Salary	$9,073

🥇 Top 5 Performers:

Employee ID	Department	Salary	Performance	KPI Score
E0477	Marketing	3000.72	61.14	98.38
E0431	Operations	4932.59	72.28	98.05
E0479	HR	4738.47	70.08	98.05
E0447	Marketing	6179.19	83.02	97.77
E0421	Finance	4926.45	50.29	97.14

🏷️ KPI Tier Breakdown:

KPI Tier	Count & Percentage
Platinum	15 employees (18.8%)
Gold	26 employees (32.5%)
Silver	26 employees (32.5%)
Bronze	13 employees (16.2%)

🏢 Department Summary:

Department	Avg KPI	Avg Salary	Employee Count
Marketing	81.20	5607	15
Operations	77.73	4927	16
Engineering	76.06	6087	15
HR	75.53	4376	17
Finance	71.73	4677	17

9.8.1 📌 C07 — Automated Report

📊 Key Metrics — C07
Metric	Value
Total Employees	80
Avg Salary	$5,448
Avg KPI Score	75.66
Avg Performance	76.54
Top Performers	17
Max KPI	99.58
Min Salary	$2,239
Max Salary	$8,809

🥇 Top 5 Performers:

Employee ID	Department	Salary	Performance	KPI Score
E0556	HR	7311.07	74.50	99.58
E0523	HR	5941.86	70.68	99.30
E0489	Engineering	3911.32	54.13	98.98
E0499	HR	4040.96	70.72	97.72
E0514	Finance	4785.85	69.57	96.49

🏷️ KPI Tier Breakdown:

KPI Tier	Count & Percentage
Platinum	17 employees (21.2%)
Gold	26 employees (32.5%)
Silver	20 employees (25%)
Bronze	17 employees (21.2%)

🏢 Department Summary:

Department	Avg KPI	Avg Salary	Employee Count
HR	82.29	5212	14
Marketing	78.63	5757	9
Engineering	76.35	5703	22
Finance	74.90	5608	16
Operations	69.21	5049	19

✅ Task 8 (BONUS) complete! The loop generated an automated report for every company, with a statistics table, top performers, KPI tiers, a department summary, and visualizations.
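
The percentages in each KPI tier table come from `round(n()/nrow(df_sub)*100, 1)`. With 80 employees per company, many shares fall exactly halfway between two one-decimal values (11 of 80 is 13.75%), and which neighbour `round()` returns then depends on floating-point representation and R's tie handling. That is why the C01 table shows 13.8% for 11 employees but 26.2% for 21 (26.25%). The sketch below repeats the calculation with the C01 counts copied from the report; the variable names are illustrative.

```r
# Tier counts for C01 as shown in the report above (80 employees in total)
counts <- c(Platinum = 11, Gold = 22, Silver = 26, Bronze = 21)
total  <- sum(counts)

# Same order of operations as n() / nrow(df_sub) * 100 in the report function
pct_raw <- counts / total * 100
pct     <- round(pct_raw, 1)

# Show the unrounded values with extra digits next to the rounded ones
print(data.frame(
  count = counts,
  raw   = sprintf("%.15f", pct_raw),
  pct   = pct
))
```

If a fixed tie rule matters for reporting, one option is to format the shares from integer counts with an explicit rule instead of relying on `round()` applied to a floating-point product.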

10 Summary of All Tasks

📋 Summary of All Practicum Tasks
No	Task	Key Concepts	Status
1	Dynamic Multi-Formula Function	Nested loop, input validation, multi-plot	✅ Done
2	Nested Simulation: Multi-Sales & Discounts	Nested function, conditional discount, cumulative stats	✅ Done
3	Multi-Level Performance Categorization	Vector loop, percentages, bar + pie chart	✅ Done
4	Multi-Company Dataset Simulation	Nested loop, conditional KPI flag, summary table	✅ Done
5	Monte Carlo Simulation: Pi & Probability	Iterative loop, π estimation, sub-square probability	✅ Done
6	Advanced Data Transformation & Feature Engineering	Normalization loop, z-score, feature engineering	✅ Done
7	Mini Project: Company KPI Dashboard	Complete dashboard, grouped bar, scatter + regression	✅ Done
8	Automated Report Generation (BONUS)	Automated loop, table + plot per company	✅ Done

Submitted for Data Science Programming — Instructor: Bakti Siregar, M.Sc
07 April 2026, 00:41 WIB
