Advanced Practicum: Functions & Loops + Data Science

Assignment Week-5

April 06, 2026

KAYLA APRILIA

Data Science Student at ITSB

NIM: 52250057

Email: kaylaaprilia2142@gmail.com

R Programming | Data Science | Statistics

1 Dynamic Multi-Formula Function

This project develops a dynamic multi-formula function in R that evaluates linear, quadratic, cubic, and exponential formulas over a vector of x values. The implementation uses nested loops (over formulas, then over x values), conditional logic to select each formula, and input validation that rejects unknown formula names.

Additionally, the workflow demonstrates a complete data science process:

  • Data computation
  • Data transformation
  • Data visualization

library(ggplot2)
library(dplyr)
library(reshape2)
library(plotly)
library(htmltools)

# Define function
compute_formula <- function(x, formulas) {
  
  # Input validation
  valid_formulas <- c("linear", "quadratic", "cubic", "exponential")
  
  if (!all(formulas %in% valid_formulas)) {
    stop("Invalid formula detected. Allowed: linear, quadratic, cubic, exponential")
  }
  
  results <- list()
  
  # Outer loop (formulas)
  for (f in formulas) {
    
    y_values <- c()
    
    # Inner loop (x values)
    for (i in x) {
      
      if (f == "linear") {
        y <- 2*i + 3
      } else if (f == "quadratic") {
        y <- i^2 + 2*i + 1
      } else if (f == "cubic") {
        y <- i^3 - i^2 + 2
      } else if (f == "exponential") {
        y <- 2^i
      }
      
      y_values <- c(y_values, y)
    }
    
    results[[f]] <- y_values
  }
  
  return(results)
}

# Generate data
x <- 1:20
formulas <- c("linear", "quadratic", "cubic", "exponential")

results <- compute_formula(x, formulas)

# Convert to dataframe
df <- data.frame(x = x)

for (name in names(results)) {
  df[[name]] <- results[[name]]
}

# Reshape data for plotting
df_long <- melt(df, id.vars = "x")

# Plot visualization
p <- ggplot(df_long, aes(x = x, y = value, color = variable)) +
  geom_line(linewidth = 1) +  # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
  labs(
    title = "Comparison of Multiple Mathematical Functions",
    x = "X values",
    y = "Y values",
    color = "Function Type"
  ) +
  theme_minimal()

ggplotly(p, width = 700, height = 450)

Visualization Interpretation

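The graph compares four mathematical functions over the same range of x values (1–20):

  • The linear function (2x + 3) increases at a constant rate of 2 per unit of x, forming a straight line that reaches 43 at x = 20.
  • The quadratic function (x^2 + 2x + 1) grows faster and curves upward, reaching 441 at x = 20.
  • The cubic function (x^3 - x^2 + 2) rises more steeply still, reaching 7,602 at x = 20.
  • The exponential function (2^x) is among the smallest at x = 1, overtakes the cubic from x = 10 onward, and reaches 1,048,576 at x = 20. On a shared linear y-axis it dominates the plot, so the other three lines appear almost flat.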
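
As a complement to the loop-based implementation above, the same four formulas can be evaluated without explicit loops, because R arithmetic is vectorized. The sketch below is only an illustration: `formula_table` and `compute_formula_vec` are hypothetical names that do not appear in the assignment, and the formulas are copied from `compute_formula()`.

```r
# Hypothetical vectorized counterpart of compute_formula().
# Each entry maps a formula name to a function of the whole x vector.
formula_table <- list(
  linear      = function(x) 2 * x + 3,
  quadratic   = function(x) x^2 + 2 * x + 1,
  cubic       = function(x) x^3 - x^2 + 2,
  exponential = function(x) 2^x
)

compute_formula_vec <- function(x, formulas) {
  # Input validation: reject any name that is not in the table
  unknown <- setdiff(formulas, names(formula_table))
  if (length(unknown) > 0) {
    stop("Invalid formula detected: ", paste(unknown, collapse = ", "))
  }
  # Apply each selected function to the full vector at once
  lapply(formula_table[formulas], function(f) f(x))
}

vec_results <- compute_formula_vec(1:20, names(formula_table))

# Value of each formula at x = 20 (the 20th element of each result)
at_20 <- sapply(vec_results, function(y) y[20])
print(at_20)
```

At x = 20 the four values are 43, 441, 7602, and 1048576, which is why the exponential curve dominates a plot with a linear y-axis.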

2 Nested Simulation: Multi-Sales & Discounts

This project simulates a multi-salesperson environment using a nested function approach in R.
The function generates sales data including sales ID, day, sales amount, and discount rate.

This demonstrates a complete workflow: simulation → transformation → analysis → visualization.
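
The simulation assigns a discount of 20% to sales above 800, 10% to sales above 500, and 5% otherwise. As a minimal sketch, the same tiers can be applied to a whole vector with `cut()`. The helper name `discount_for` is hypothetical; the thresholds and rates are taken from the simulation code.

```r
# Hypothetical helper: tiered discount for a vector of sales amounts.
# cut() uses right-closed intervals by default, so the tiers are
# (-Inf, 500], (500, 800], (800, Inf), matching the `>` comparisons.
discount_for <- function(sales_amount) {
  rates <- c(0.05, 0.10, 0.20)
  tier <- cut(sales_amount, breaks = c(-Inf, 500, 800, Inf), labels = FALSE)
  rates[tier]
}

# Boundary values land in the lower tier, as in the if/else chain
print(discount_for(c(100, 500, 500.01, 800, 800.01, 1000)))
```

Amounts of exactly 500 and 800 fall into the lower tier, which matches the strict `>` tests in the loop.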


# Main simulation function
simulate_sales <- function(n_salesperson, days) {
  
  all_data <- data.frame(
    salesperson = character(),
    sales_id = character(),
    day = integer(),
    sales_amount = numeric(),
    discount_rate = numeric(),
    cumulative_sales = numeric()
  )
  
  # Loop per salesperson
  for (s in 1:n_salesperson) {
    
    sales_amounts <- c()
    
    # Nested loop per day
    for (d in 1:days) {
      
      # Generate random sales amount
      sales_amount <- round(runif(1, 100, 1000), 2)
      
      # Conditional discount logic
      if (sales_amount > 800) {
        discount_rate <- 0.20
      } else if (sales_amount > 500) {
        discount_rate <- 0.10
      } else {
        discount_rate <- 0.05
      }
      
      sales_amounts <- c(sales_amounts, sales_amount)
      
      # Create row
      temp <- data.frame(
        salesperson = paste0("SP", s),
        sales_id = paste0("S", s, "_", d),
        day = d,
        sales_amount = sales_amount,
        discount_rate = discount_rate,
        cumulative_sales = NA
      )
      
      all_data <- dplyr::bind_rows(all_data, temp)
    }
    
    # Nested function for cumulative sales
    cumulative_sales <- function(x) {
      return(cumsum(x))
    }
    
    # Apply cumulative calculation
    idx <- all_data$salesperson == paste0("SP", s)
    all_data$cumulative_sales[idx] <- cumulative_sales(sales_amounts)
  }
  
  return(all_data)
}

# Run simulation
set.seed(123)
sales_data <- simulate_sales(n_salesperson = 5, days = 30)

# Summary statistics
summary_stats <- sales_data %>%
  group_by(salesperson) %>%
  summarise(
    total_sales = sum(sales_amount),
    avg_sales = mean(sales_amount),
    max_sales = max(sales_amount),
    min_sales = min(sales_amount)
  )

# Display table
knitr::kable(summary_stats,
             caption = "Sales Summary per Salesperson",
             align = "c")

Sales Summary per Salesperson

salesperson  total_sales  avg_sales  max_sales  min_sales
SP1          18454.79     615.1597   994.84     137.85
SP2          14843.30     494.7767   966.72     122.15
SP3          16840.52     561.3507   986.46     100.56
SP4          17019.89     567.3297   959.03     154.65
SP5          15904.32     530.1440   985.80     109.42

# Plot cumulative sales
p <- ggplot(sales_data, aes(x = day, y = cumulative_sales, color = salesperson)) +
  geom_line(linewidth = 1) +  # linewidth replaces the deprecated size argument for lines (ggplot2 >= 3.4.0)
  labs(
    title = "Cumulative Sales per Salesperson",
    x = "Day",
    y = "Cumulative Sales"
  ) +
  theme_minimal()

ggplotly(p, width = 700, height = 450)

Visualization Interpretation

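The graph shows each salesperson's cumulative sales over the 30 simulated days.

Each line represents one salesperson’s running total. Every line rises on every day because each simulated daily sale is positive (between 100 and 1,000), so the upward trend reflects accumulation, not growth in daily sales. Differences in slope reflect performance:

  • A steeper segment means larger daily sales over that stretch.
  • A flatter segment means smaller daily sales.

By day 30, SP1 has the highest total (18,454.79) and SP2 the lowest (14,843.30).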
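
The per-salesperson running total can also be computed in one step with base R's `ave()`, which applies a function within groups. This is a sketch on a small toy data set, not the simulation output: the data frame `toy`, its seed, and its size are assumptions chosen for illustration.

```r
# Toy data (assumed for illustration): 2 salespeople, 5 days each
set.seed(1)
toy <- data.frame(
  salesperson  = rep(c("SP1", "SP2"), each = 5),
  day          = rep(1:5, times = 2),
  sales_amount = round(runif(10, 100, 1000), 2)
)

# Running total within each salesperson, in row order
toy$cumulative_sales <- ave(toy$sales_amount, toy$salesperson, FUN = cumsum)

print(toy)
```

Because every daily amount is positive, each group's cumulative column is strictly increasing, and its last value equals that salesperson's total.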

3 Multi-Level Performance Categorization

This project develops a function to categorize sales performance into five levels: Excellent, Very Good, Good, Average, and Poor.

The implementation includes:

  • Looping through a vector of sales data
  • Applying conditional logic for categorization
  • Calculating percentage distribution for each category
  • Visualizing results using a bar plot and pie chart

This demonstrates data classification, transformation, and visualization in a structured workflow.


library(ggplot2)
library(plotly)
library(dplyr)
# Function to categorize performance
categorize_performance <- function(sales_amount) {
  
  categories <- c()
  
  # Loop through sales values
  for (i in sales_amount) {
    
    if (i > 800) {
      category <- "Excellent"
    } else if (i > 600) {
      category <- "Very Good"
    } else if (i > 400) {
      category <- "Good"
    } else if (i > 200) {
      category <- "Average"
    } else {
      category <- "Poor"
    }
    
    categories <- c(categories, category)
  }
  
  return(categories)
}

# Example data (can reuse from previous task)
set.seed(123)
sales_amount <- round(runif(100, 100, 1000), 2)

# Apply categorization
performance <- categorize_performance(sales_amount)

# Create dataframe
df <- data.frame(
  sales_amount = sales_amount,
  performance = performance
)

# Calculate percentages
category_counts <- table(df$performance)
category_percent <- prop.table(category_counts) * 100

# Convert to dataframe
df_summary <- data.frame(
  category = names(category_counts),
  count = as.numeric(category_counts),
  percentage = as.numeric(category_percent)
)

knitr::kable(df_summary,
             caption = "Performance Category Distribution",
             align = "c")

Performance Category Distribution

category   count  percentage
Average    24     24
Excellent  24     24
Good       24     24
Poor       9      9
Very Good  19     19

# Bar plot
p1 <- ggplot(df_summary, aes(x = category, y = percentage, fill = category)) +
  geom_bar(stat = "identity") +
  labs(
    title = "Performance Distribution (Bar Plot)",
    x = "Category",
    y = "Percentage (%)"
  ) +
  theme_minimal()

ggplotly(p1, width = 700, height = 450)
# Pie chart
plot_ly(df_summary,
        labels = ~category,
        values = ~percentage,
        type = 'pie') %>%
  layout(
    title = "Performance Distribution (Pie Chart)",
    width = 700,
    height = 450
  )

Visualization Interpretation

The bar plot and pie chart show the distribution of sales performance across five categories.

  • The bar plot clearly compares percentages between categories.
  • The pie chart highlights the proportion of each category in the dataset.

Key observations:

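  • Because the amounts are drawn uniformly from 100 to 1,000, each category's expected share is proportional to the width of its range: about 22% each for Average, Good, Very Good, and Excellent (200 units wide) and about 11% for Poor (100 units wide).
  • The observed shares (Average 24%, Good 24%, Very Good 19%, Excellent 24%, Poor 9%) are close to these expectations; the gaps reflect sampling variation in 100 draws.
  • The bar plot lists the categories alphabetically, not from Poor to Excellent, because category is stored as a character column.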
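
The categorization loop can be expressed with `cut()`, which also yields an ordered factor so that tables and plots follow the Poor-to-Excellent order. The sketch below additionally computes the share each category would be expected to have under a uniform distribution on 100 to 1,000. The names `performance_levels`, `categorize_performance_cut`, `bounds`, `expected_pct`, and `observed_pct` are hypothetical; the thresholds are copied from `categorize_performance()`.

```r
# Category labels in increasing order of performance
performance_levels <- c("Poor", "Average", "Good", "Very Good", "Excellent")

# Hypothetical cut()-based counterpart of categorize_performance()
categorize_performance_cut <- function(sales_amount) {
  cut(sales_amount,
      breaks = c(-Inf, 200, 400, 600, 800, Inf),
      labels = performance_levels,
      right = TRUE,            # intervals are (lower, upper], matching the `>` tests
      ordered_result = TRUE)   # keeps Poor < ... < Excellent
}

# Expected share per category when amounts are uniform on [100, 1000]:
# each share is the width of the category's range divided by 900
bounds <- c(100, 200, 400, 600, 800, 1000)
expected_pct <- diff(bounds) / (1000 - 100) * 100
names(expected_pct) <- performance_levels

# Same seed and draw as the example data in this section
set.seed(123)
sales_amount <- round(runif(100, 100, 1000), 2)
observed_pct <- prop.table(table(categorize_performance_cut(sales_amount))) * 100

print(round(expected_pct, 1))
print(observed_pct)
```

Using an ordered factor is a design choice: the levels carry the ranking, so downstream tables and ggplot axes no longer fall back to alphabetical order.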

4 Multi-Company Dataset Simulation

This project simulates a multi-company dataset using nested loops in R.
The function generates employee-level data including:

  • Company ID
  • Employee ID
  • Salary
  • Department
  • Performance Score
  • KPI Score

This demonstrates a full workflow: simulation → aggregation → analysis → visualization.


# Function to generate company data
generate_company_data <- function(n_company, n_employees) {
  
  all_data <- data.frame()
  departments <- c("HR", "Finance", "IT", "Marketing", "Operations")
  
  # Loop per company
  for (c in 1:n_company) {
    
    # Loop per employee
    for (e in 1:n_employees) {
      
      salary <- round(runif(1, 3000, 10000), 2)
      performance_score <- round(runif(1, 60, 100), 2)
      KPI_score <- round(runif(1, 50, 100), 2)
      
      # Conditional logic: top performer
      top_performer <- ifelse(KPI_score > 90, "Yes", "No")
      
      temp <- data.frame(
        company_id = paste0("C", c),
        employee_id = paste0("E", c, "_", e),
        salary = salary,
        department = sample(departments, 1),
        performance_score = performance_score,
        KPI_score = KPI_score,
        top_performer = top_performer
      )
      
      all_data <- rbind(all_data, temp)
    }
  }
  
  return(all_data)
}

# Generate data
set.seed(123)
company_data <- generate_company_data(n_company = 3, n_employees = 50)

# Summary per company
summary_table <- company_data %>%
  group_by(company_id) %>%
  summarise(
    avg_salary = mean(salary),
    avg_performance = mean(performance_score),
    max_KPI = max(KPI_score)
  )

knitr::kable(summary_table,
             caption = "Company Performance Summary",
             align = "c")

Company Performance Summary

company_id  avg_salary  avg_performance  max_KPI
C1          5951.905    84.2378          96.87
C2          6468.252    79.2874          99.97
C3          6639.389    79.2358          99.45

# Plot: Average Salary per Company
p1 <- ggplot(summary_table, aes(x = company_id, y = avg_salary, fill = company_id)) +
  geom_bar(stat = "identity") +
  labs(
    title = "Average Salary per Company",
    x = "Company",
    y = "Average Salary"
  ) +
  theme_minimal()

ggplotly(p1, width = 700, height = 450)
# Plot: Average Performance per Company
p2 <- ggplot(summary_table, aes(x = company_id, y = avg_performance, fill = company_id)) +
  geom_bar(stat = "identity") +
  labs(
    title = "Average Performance Score per Company",
    x = "Company",
    y = "Average Performance"
  ) +
  theme_minimal()

ggplotly(p2, width = 700, height = 450)
# Plot: KPI Distribution
p3 <- ggplot(company_data, aes(x = KPI_score, fill = company_id)) +
  geom_histogram(bins = 20, alpha = 0.6, position = "identity") +
  labs(
    title = "KPI Score Distribution",
    x = "KPI Score",
    y = "Frequency"
  ) +
  theme_minimal()

ggplotly(p3, width = 700, height = 450)

Visualization Interpretation

The visualizations summarize the simulated company data:

  • Average Salary Plot: shows how mean compensation differs between companies. C3 has the highest average (about 6,639) and C1 the lowest (about 5,952).

  • Average Performance Plot: displays the mean performance score per company. C1 averages about 84.2, while C2 and C3 average about 79.3 and 79.2.

  • KPI Distribution Histogram: illustrates how KPI scores are spread across employees.

    • Scores are drawn uniformly from 50 to 100, so the histogram should look roughly flat, with no systematic concentration at high values.
    • Values above 90 correspond to employees flagged as top performers.

Because every company is generated from the same distributions, the differences between companies reflect random sampling variation, not structural differences.
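
Growing a data frame with `rbind()` inside a nested loop copies the accumulated data on every iteration. As a sketch of an alternative, all rows can be generated in one pass with vectorized calls. `generate_company_data_vec` is a hypothetical name; the columns and ranges follow `generate_company_data()`, but the random draws are made in a different order, so the values will not match the tables above even with the same seed.

```r
# Hypothetical vectorized counterpart of generate_company_data()
generate_company_data_vec <- function(n_company, n_employees) {
  n <- n_company * n_employees
  departments <- c("HR", "Finance", "IT", "Marketing", "Operations")

  # Company and employee indices for every row
  company  <- rep(seq_len(n_company), each = n_employees)
  employee <- rep(seq_len(n_employees), times = n_company)

  KPI_score <- round(runif(n, 50, 100), 2)

  data.frame(
    company_id        = paste0("C", company),
    employee_id       = paste0("E", company, "_", employee),
    salary            = round(runif(n, 3000, 10000), 2),
    department        = sample(departments, n, replace = TRUE),
    performance_score = round(runif(n, 60, 100), 2),
    KPI_score         = KPI_score,
    top_performer     = ifelse(KPI_score > 90, "Yes", "No"),
    stringsAsFactors  = FALSE
  )
}

set.seed(123)
company_data_vec <- generate_company_data_vec(n_company = 3, n_employees = 50)
str(company_data_vec)
```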

5 Monte Carlo Simulation: Pi & Probability

This project implements a Monte Carlo simulation that estimates the value of π (pi) from random points in a square and estimates the probability that a point falls in a smaller central sub-square.

This demonstrates simulation-based estimation and probabilistic modeling in data science.


# Monte Carlo function
monte_carlo_pi <- function(n_points) {
  
  inside_circle <- 0
  inside_square_small <- 0
  
  x_vals <- c()
  y_vals <- c()
  inside_flag <- c()
  
  # Loop for simulation
  for (i in 1:n_points) {
    
    x <- runif(1, -1, 1)
    y <- runif(1, -1, 1)
    
    x_vals <- c(x_vals, x)
    y_vals <- c(y_vals, y)
    
    # Check if inside circle
    if (x^2 + y^2 <= 1) {
      inside_circle <- inside_circle + 1
      inside_flag <- c(inside_flag, "Inside")
    } else {
      inside_flag <- c(inside_flag, "Outside")
    }
    
    # Probability for sub-square (-0.5 to 0.5)
    if (x >= -0.5 && x <= 0.5 && y >= -0.5 && y <= 0.5) {
      inside_square_small <- inside_square_small + 1
    }
  }
  
  # Estimate Pi
  pi_estimate <- 4 * (inside_circle / n_points)
  
  # Probability of falling in sub-square
  prob_subsquare <- inside_square_small / n_points
  
  # Create dataframe
  df <- data.frame(
    x = x_vals,
    y = y_vals,
    position = inside_flag
  )
  
  return(list(
    pi_estimate = pi_estimate,
    prob_subsquare = prob_subsquare,
    data = df
  ))
}

# Run simulation
set.seed(123)
result <- monte_carlo_pi(5000)

# Print results
result$pi_estimate
## [1] 3.1552
result$prob_subsquare
## [1] 0.2624
# Plot points
p <- ggplot(result$data, aes(x = x, y = y, color = position)) +
  geom_point(alpha = 0.6) +
  labs(
    title = "Monte Carlo Simulation of Pi",
    x = "X",
    y = "Y",
    color = "Position"
  ) +
  theme_minimal()

ggplotly(p, width = 700, height = 450)

Visualization Interpretation

The scatter plot displays randomly generated points:

  • Points inside the circle form a circular shape centered at (0,0).
  • Points outside the circle fill the remaining square area.

Key observations:

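  • The circular boundary becomes clearer as the number of points increases.
  • The fraction of points inside the circle approximates π/4, the ratio of the circle's area (π) to the square's area (4). Multiplying that fraction by 4 gives the estimate of π, here 3.1552 with 5,000 points.
  • The sub-square probability estimates the chance that a point lands in the region from -0.5 to 0.5 on both axes. That region covers 1 of the square's 4 units of area, so the expected value is 0.25; the simulation gives 0.2624.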
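
The same estimate can be computed without a loop, and the binomial standard error gives a rough sense of how far the estimate may sit from π. This is a sketch under stated assumptions: `monte_carlo_pi_vec` is a hypothetical name, and the seed and sample size are chosen only for illustration.

```r
# Hypothetical vectorized counterpart of monte_carlo_pi()
monte_carlo_pi_vec <- function(n_points) {
  x <- runif(n_points, -1, 1)
  y <- runif(n_points, -1, 1)

  inside <- x^2 + y^2 <= 1   # TRUE for points inside the unit circle
  p_hat  <- mean(inside)     # estimates pi / 4

  list(
    pi_estimate    = 4 * p_hat,
    # Standard error of a binomial proportion, scaled by 4
    std_error      = 4 * sqrt(p_hat * (1 - p_hat) / n_points),
    # Share of points in the central sub-square [-0.5, 0.5] x [-0.5, 0.5]
    prob_subsquare = mean(abs(x) <= 0.5 & abs(y) <= 0.5)
  )
}

set.seed(42)
est <- monte_carlo_pi_vec(100000)
print(est)
```

The standard error shrinks in proportion to 1 / sqrt(n_points), so each additional decimal digit of accuracy needs roughly 100 times more points.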

6 Advanced Data Transformation & Feature Engineering

This project focuses on data transformation and feature engineering techniques in R.

This demonstrates how raw data can be transformed into more meaningful and analysis-ready features.


# Example dataset
set.seed(123)
df <- data.frame(
  salary = runif(100, 3000, 10000),
  performance_score = runif(100, 60, 100),
  KPI_score = runif(100, 50, 100)
)

# Function: Min-Max Normalization
normalize_columns <- function(df) {
  
  df_norm <- df
  
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      min_val <- min(df[[col]])
      max_val <- max(df[[col]])
      
      df_norm[[col]] <- (df[[col]] - min_val) / (max_val - min_val)
    }
  }
  
  return(df_norm)
}

# Function: Z-score Standardization
z_score <- function(df) {
  
  df_z <- df
  
  for (col in names(df)) {
    if (is.numeric(df[[col]])) {
      mean_val <- mean(df[[col]])
      sd_val <- sd(df[[col]])
      
      df_z[[col]] <- (df[[col]] - mean_val) / sd_val
    }
  }
  
  return(df_z)
}

# Apply transformations
df_normalized <- normalize_columns(df)
df_zscore <- z_score(df)

# Feature Engineering
df$performance_category <- cut(
  df$performance_score,
  breaks = c(-Inf, 70, 80, 90, Inf),
  labels = c("Poor", "Average", "Good", "Excellent")
)

df$salary_bracket <- cut(
  df$salary,
  breaks = c(-Inf, 4000, 7000, Inf),
  labels = c("Low", "Medium", "High")
)

# Histogram before & after normalization
p1 <- ggplot(df, aes(x = salary, fill = salary_bracket)) +
  geom_histogram(bins = 20, alpha = 0.7) + 
  labs(title = "Salary Distribution (Original)") +
  theme_minimal()

ggplotly(p1, width = 700, height = 450)
p2 <- ggplot(df_normalized, aes(x = salary)) +
  geom_histogram(bins = 20, fill = "steelblue", alpha = 0.7) +
  labs(title = "Salary Distribution (Normalized)") +
  theme_minimal()

ggplotly(p2, width = 700, height = 450)
# Boxplot comparison
p3 <- ggplot(df, aes(x = salary_bracket, y = salary, fill = salary_bracket)) +
  geom_boxplot() +
  labs(title = "Boxplot (Original Salary)") +
  theme_minimal()

ggplotly(p3, width = 700, height = 450)
p4 <- ggplot(df_zscore, aes(y = salary)) +
  geom_boxplot(fill = "tomato", alpha = 0.7) +
  labs(title = "Boxplot (Z-Score Salary)") +
  theme_minimal()

ggplotly(p4, width = 700, height = 450)

Visualization Interpretation

The visualizations compare data distributions before and after transformation:

  • Histogram (Original vs Normalized)
    • The original histogram shows salaries on their raw scale (3,000 to 10,000), colored by salary bracket.
    • The normalized histogram shows the same values rescaled to the range 0 to 1, which makes them easier to compare with other variables.
  • Boxplot (Original vs Z-score)
    • The original boxplot shows the raw spread of salary within each bracket.
    • The z-score boxplot shows salary centered at 0 with a standard deviation of 1, so values read as distances from the mean in standard deviations.

Key observations:

  • Min-max normalization and z-score standardization are both linear rescalings: they change the scale but preserve the shape of the distribution.
  • Z-score standardization puts variables with different units on a common scale, which helps models that are sensitive to feature scale.
  • Feature engineering (performance_category and salary_bracket) simplifies analysis by grouping continuous values into labeled categories.
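
The properties described above can be checked directly on a single column. The sketch below uses hypothetical single-vector helpers, `min_max` and `z_std`, and adds a guard for constant columns, where the min-max denominator would otherwise be zero. Mapping a constant column to 0 is an assumption for illustration, not something the original functions do.

```r
# Hypothetical single-vector helpers
min_max <- function(v) {
  rng <- range(v)
  if (rng[1] == rng[2]) {
    return(rep(0, length(v)))  # assumption: a constant column maps to 0
  }
  (v - rng[1]) / (rng[2] - rng[1])
}

z_std <- function(v) (v - mean(v)) / sd(v)

set.seed(123)
salary <- runif(100, 3000, 10000)

salary_mm <- min_max(salary)
salary_z  <- z_std(salary)

print(range(salary_mm))                 # minimum and maximum after rescaling
print(c(mean(salary_z), sd(salary_z)))  # centre and spread after standardizing
print(cor(salary, salary_mm))           # linear rescaling keeps correlation at 1
print(min_max(c(5, 5, 5)))              # constant input handled by the guard
```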

7 Mini Project: Company KPI Dashboard & Simulation

This mini project simulates a company KPI dashboard using synthetic data.

The dataset includes:

  • Employee ID
  • Company ID
  • Salary
  • Performance Score
  • KPI Score
  • Department

This represents a complete data science pipeline from simulation to dashboard-style insights.


# Function to generate dataset
generate_kpi_data <- function(n_company) {
  
  all_data <- data.frame()
  departments <- c("HR", "Finance", "IT", "Marketing", "Operations")
  
  for (c in 1:n_company) {
    
    n_employees <- sample(50:200, 1)
    
    for (e in 1:n_employees) {
      
      salary <- round(runif(1, 3000, 10000), 2)
      performance_score <- round(runif(1, 60, 100), 2)
      KPI_score <- round(runif(1, 50, 100), 2)
      
      temp <- data.frame(
        employee_id = paste0("E", c, "_", e),
        company_id = paste0("C", c),
        salary = salary,
        performance_score = performance_score,
        KPI_score = KPI_score,
        department = sample(departments, 1)
      )
      
      all_data <- dplyr::bind_rows(all_data, temp)
    }
  }
  
  return(all_data)
}

# Generate data
set.seed(123)
df <- generate_kpi_data(5)

# KPI Categorization
df$KPI_tier <- NA

for (i in 1:nrow(df)) {
  
  if (df$KPI_score[i] > 90) {
    df$KPI_tier[i] <- "Top Performer"
  } else if (df$KPI_score[i] > 75) {
    df$KPI_tier[i] <- "High"
  } else if (df$KPI_score[i] > 60) {
    df$KPI_tier[i] <- "Medium"
  } else {
    df$KPI_tier[i] <- "Low"
  }
}

# Summary per company
summary_company <- df %>%
  group_by(company_id) %>%
  summarise(
    avg_salary = mean(salary),
    avg_KPI = mean(KPI_score),
    top_performers = sum(KPI_score > 90)
  )

knitr::kable(summary_company,
             caption = "Company KPI Summary",
             align = "c")

Company KPI Summary

company_id  avg_salary  avg_KPI   top_performers
C1          6082.763    73.68413  12
C2          6612.674    75.56000  17
C3          6324.746    73.57075  23
C4          6683.919    75.73927  17
C5          6333.176    75.43160  35

# Department analysis
dept_summary <- df %>%
  group_by(company_id, department) %>%
  summarise(avg_KPI = mean(KPI_score), .groups = "drop")

# Salary distribution plot
p1 <- ggplot(df, aes(x = salary, fill = company_id)) +
  geom_histogram(bins = 20, alpha = 0.6, position = "identity") +
  labs(title = "Salary Distribution by Company") +
  theme_minimal()

ggplotly(p1, width = 700, height = 450)
# Grouped bar chart (department KPI)
p2 <- ggplot(dept_summary, aes(x = department, y = avg_KPI, fill = company_id)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(
    title = "Average KPI by Department and Company",
    x = "Department",
    y = "Average KPI"
  ) +
  theme_minimal()

ggplotly(p2, width = 700, height = 450)
# Scatter plot with regression line
p3 <- ggplot(df, aes(x = salary, y = KPI_score, color = company_id)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", se = FALSE) +
  labs(
    title = "Salary vs KPI Score",
    x = "Salary",
    y = "KPI Score"
  ) +
  theme_minimal()

ggplotly(p3, width = 700, height = 450)
# Top performers table
top_performers <- df %>%
  filter(KPI_score > 90) %>%
  arrange(desc(KPI_score))

knitr::kable(head(top_performers),
             caption = "Top Performers (KPI > 90)",
             align = "c")

Top Performers (KPI > 90)

employee_id  company_id  salary   performance_score  KPI_score  department  KPI_tier
E5_121       C5          5099.34  71.58              99.96      Operations  Top Performer
E3_122       C3          6795.32  76.66              99.94      Marketing   Top Performer
E5_7         C5          6046.03  67.89              99.73      Marketing   Top Performer
E5_34        C5          7044.63  83.18              99.68      Operations  Top Performer
E4_28        C4          3425.87  69.40              99.65      Operations  Top Performer
E3_99        C3          8171.24  90.28              99.62      IT          Top Performer

Visualization Interpretation

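The dashboard visualizations provide several insights:

  • Salary Distribution: shows how salaries vary across companies. Overlapping distributions indicate similar pay structures, while shifts would suggest differences. Here every company draws salaries from the same range (3,000 to 10,000), so the histograms overlap heavily; bar heights differ mainly because the companies have different head counts.
  • Grouped Bar Chart (Department KPI): compares average KPI across departments and companies.
    • Departments are assigned at random, so gaps between bars reflect sampling variation, not consistently stronger departments.
    • In real data, persistent differences between companies would highlight organizational performance gaps.
  • Scatter Plot with Regression Line: displays the relationship between salary and KPI score.
    • Salary and KPI score are generated independently, so the fitted lines should be close to flat; a small positive or negative slope is noise, not evidence that pay and performance are related.
    • The spread of points shows variability among employees.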
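
To illustrate how to read the regression line on synthetic data, the sketch below fits a linear model to salary and KPI values drawn independently from the same ranges used in the simulation. The sample size, seed, and variable names are assumptions for illustration; this is not the dashboard data itself.

```r
# Independent draws from the ranges used by generate_kpi_data()
set.seed(123)
n      <- 500   # assumed sample size for illustration
salary <- runif(n, 3000, 10000)
kpi    <- runif(n, 50, 100)

# Simple linear regression of KPI score on salary
fit <- lm(kpi ~ salary)

slope     <- unname(coef(fit)["salary"])
r_squared <- summary(fit)$r.squared
slope_ci  <- confint(fit)["salary", ]

print(slope)      # change in KPI per unit of salary
print(slope_ci)   # 95% confidence interval for the slope
print(r_squared)  # share of KPI variance explained by salary
```

With independent inputs the slope and R-squared should both be close to zero, which is the baseline against which a slope in real data would need to be judged.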

8 Automated Company Report Generation

This project automates the creation of company-level reports using functions and loops in R.

This demonstrates how automation can streamline reporting workflows in data science.


# Load libraries
library(dplyr)
library(ggplot2)

set.seed(123)

generate_kpi_data <- function(n_company) {
  all_data <- data.frame()
  departments <- c("HR", "Finance", "IT", "Marketing", "Operations")
  
  for (c in 1:n_company) {
    n_employees <- sample(50:200, 1)
    
    for (e in 1:n_employees) {
      temp <- data.frame(
        employee_id = paste0("E", c, "_", e),
        company_id = paste0("C", c),
        salary = runif(1, 3000, 10000),
        performance_score = runif(1, 60, 100),
        KPI_score = runif(1, 50, 100),
        department = sample(departments, 1)
      )
      all_data <- rbind(all_data, temp)
    }
  }
  
  return(all_data)
}

df <- generate_kpi_data(5)

# Function to generate report per company
generate_company_report <- function(data, company_name) {
  
  cat("====================================\n")
  cat("Report for Company:", company_name, "\n")
  cat("====================================\n")
  
  df_company <- data %>% filter(company_id == company_name)
  
  # Summary table
  summary <- df_company %>%
    summarise(
      avg_salary = mean(salary),
      avg_KPI = mean(KPI_score),
      max_KPI = max(KPI_score)
    )
  
  print(summary)
  
  # Plot salary distribution
  p1 <- ggplot(df_company, aes(x = salary)) +
    geom_histogram(bins = 20, fill = "skyblue") +
    labs(title = paste("Salary Distribution -", company_name)) +
    theme_minimal()
  
  # Plot KPI distribution
  p2 <- ggplot(df_company, aes(x = KPI_score)) +
    geom_histogram(bins = 20, fill = "orange") +
    labs(title = paste("KPI Distribution -", company_name)) +
    theme_minimal()
  
  # Show both plots side by side
  gridExtra::grid.arrange(p1, p2, ncol = 2, widths = c(1, 1))
  
  # Export CSV
  write.csv(df_company, paste0("report_", company_name, ".csv"), row.names = FALSE)
}

# Loop through companies
companies <- unique(df$company_id)

for (c in companies) {
  generate_company_report(df, c)
}
## ====================================
## Report for Company: C1 
## ====================================
##   avg_salary  avg_KPI  max_KPI
## 1   6082.763 73.68449 98.36992

## ====================================
## Report for Company: C2 
## ====================================
##   avg_salary  avg_KPI  max_KPI
## 1   6612.675 75.55986 99.45391

## ====================================
## Report for Company: C3 
## ====================================
##   avg_salary  avg_KPI  max_KPI
## 1   6324.746 73.57111 99.94413

## ====================================
## Report for Company: C4 
## ====================================
##   avg_salary  avg_KPI  max_KPI
## 1   6683.919 75.73943 99.65078

## ====================================
## Report for Company: C5 
## ====================================
##   avg_salary  avg_KPI  max_KPI
## 1   6333.177 75.43162 99.96369


Visualization Interpretation

Each company report includes:

  • Summary Table: displays average salary, average KPI, and maximum KPI, giving a quick overview of each company.
  • Salary Distribution Plot: shows how employee salaries are spread within the company. A wider spread indicates higher variability in compensation.
  • KPI Distribution Plot: shows how KPI scores are spread within the company. Concentration at higher values would indicate stronger overall performance.

Each call also writes the company's rows to a CSV file named report_<company>.csv in the working directory.
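
The reporting loop can also be written with `split()` and `lapply()`, collecting one summary row per company while each CSV is written. This is a minimal sketch on toy data: `toy`, `out_dir`, and `report_one` are hypothetical names, and the files go to a temporary directory as an assumption so that the working directory is left untouched.

```r
# Toy data (assumed for illustration): 3 companies, 4 employees each
set.seed(123)
toy <- data.frame(
  company_id = rep(c("C1", "C2", "C3"), each = 4),
  salary     = runif(12, 3000, 10000),
  KPI_score  = runif(12, 50, 100),
  stringsAsFactors = FALSE
)

out_dir <- tempdir()  # assumption: write reports to a temporary directory

# Hypothetical helper: write one company's CSV and return its summary row
report_one <- function(d) {
  company <- d$company_id[1]
  path <- file.path(out_dir, paste0("report_", company, ".csv"))
  write.csv(d, path, row.names = FALSE)
  data.frame(
    company_id = company,
    avg_salary = mean(d$salary),
    avg_KPI    = mean(d$KPI_score),
    max_KPI    = max(d$KPI_score),
    file       = path,
    stringsAsFactors = FALSE
  )
}

# One data frame per company, one summary row per data frame
reports <- do.call(rbind, lapply(split(toy, toy$company_id), report_one))
print(reports[, c("company_id", "avg_salary", "avg_KPI", "max_KPI")])
```

Returning the summary rows, and not only printing them, keeps the results available for a combined table after all reports have been written.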