Assignment Week 5
Vanessa Ziba Ardelia
Data Science 25 – ITSB
Lecturer: Bakti Siregar, M.Sc., CDS.
Course: Statistika Dasar (Basic Statistics)
R Programming, Data Science, Statistics
Introduction
This assignment aims to practice the implementation of functions, nested loops, and conditional statements in R. Students are required to create a dynamic mathematical function, simulate sales data with discount logic, and categorize sales performance into multiple levels. The results are then analyzed and visualized using appropriate graphs.
Overall, this task reflects a structured data analysis workflow from computation to interpretation.
Task 1 – Dynamic Multi-Formula Function (R)
Objective
The objective of this program is to compare the growth patterns of several mathematical functions (linear, quadratic, cubic, and exponential) using loops and visualize them on a single graph. This approach helps illustrate how different mathematical functions grow at different rates.
Visualisation
Interpretation
The graph compares the growth patterns of four functions: linear, quadratic, cubic, and exponential for $x = 1$ to $20$. The linear function increases at a constant rate, while the quadratic and cubic functions grow faster due to their higher polynomial degrees. The exponential function shows the fastest growth and increases dramatically compared to the other functions. This comparison highlights how different mathematical models produce different growth behaviors.
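The comparison described here can be sketched in R roughly as follows. This is a minimal sketch, not the original assignment code: the helper name `compute_formula`, the linear coefficients (2x + 1), and the exponential base (2) are assumptions chosen for illustration.

```r
# Hypothetical helper: evaluates one formula family at a single x.
# The coefficients and the exponential base are assumed values.
compute_formula <- function(x, formula = c("linear", "quadratic", "cubic", "exponential")) {
  formula <- match.arg(formula)
  switch(formula,
    linear      = 2 * x + 1,
    quadratic   = x^2,
    cubic       = x^3,
    exponential = 2^x
  )
}

x_values <- 1:20
formulas <- c("linear", "quadratic", "cubic", "exponential")

# Pre-allocate a matrix: one row per x value, one column per formula.
results <- matrix(NA_real_,
                  nrow = length(x_values), ncol = length(formulas),
                  dimnames = list(NULL, formulas))

# Outer loop over formulas, inner loop over x values.
for (f in formulas) {
  for (i in seq_along(x_values)) {
    results[i, f] <- compute_formula(x_values[i], f)
  }
}

# Draw all four curves on one graph; the log-scaled y-axis keeps the
# slower curves visible next to the exponential one.
matplot(x_values, results, type = "l", lty = 1, col = 1:4, log = "y",
        xlab = "x", ylab = "f(x)", main = "Growth of four functions")
legend("topleft", legend = formulas, col = 1:4, lty = 1)
```

On a linear y-axis the exponential curve would flatten the other three against the x-axis, which is why the sketch uses a log scale; dropping `log = "y"` shows the raw difference in magnitude instead.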
Task 2 – Nested Simulation: Multi Sales & Discounts
Objective
This task simulates sales generated by multiple salespersons across several days using nested loops and conditional discount rules. The simulation calculates final sales after discounts and tracks cumulative sales growth over time.
Visualisation
Sales Summary
Interpretation
Nested loops allow the simulation of multiple salespersons over several days: the outer loop iterates over salespersons and the inner loop over days. Conditional statements apply different discount rates depending on the sales amount. The visualization shows cumulative sales growth for each salesperson over time. A steeper line indicates a salesperson with stronger sales performance during the simulation period.
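A minimal sketch of this kind of nested simulation is given below. The number of salespersons and days, the sales range, the seed, and the discount thresholds and rates in `discount_rate` are all assumptions made for illustration; the assignment's actual values are not stated in the text.

```r
set.seed(42)  # assumed seed, only to make the sketch reproducible

n_salespersons <- 3  # assumed sizes
n_days <- 5

# Hypothetical discount rule: thresholds and rates are illustrative.
discount_rate <- function(amount) {
  if (amount > 800) {
    0.15
  } else if (amount > 500) {
    0.10
  } else {
    0
  }
}

# Collect one row per (salesperson, day) in a pre-allocated list.
records <- vector("list", n_salespersons * n_days)
idx <- 1

for (sp in seq_len(n_salespersons)) {
  cumulative <- 0  # running total restarts for each salesperson
  for (day in seq_len(n_days)) {
    amount <- round(runif(1, min = 100, max = 1000), 2)
    rate <- discount_rate(amount)
    final_sales <- amount * (1 - rate)
    cumulative <- cumulative + final_sales
    records[[idx]] <- data.frame(
      salesperson = sp, day = day, amount = amount,
      discount = rate, final_sales = final_sales, cumulative = cumulative
    )
    idx <- idx + 1
  }
}

sales <- do.call(rbind, records)
head(sales)
```

Storing rows in a list and binding them once at the end avoids repeatedly growing a data frame inside the loop, which becomes slow as the simulation gets larger.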
Task 3 – Multi-Level Performance Categorization
Objective
This task classifies sales performance into five categories based on the sales amount. The categorization helps evaluate the distribution of performance levels among sales activities.
Visualisation
Interpretation
The charts show the distribution of sales performance categories. Most sales tend to fall within the middle categories such as “Good” and “Very Good”, while “Excellent” represents high-performing sales. These visualizations help summarize team performance and make it easier to evaluate overall sales effectiveness.
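The five-level classification can be sketched with a chain of conditional statements inside a loop. The text names only "Good", "Very Good", and "Excellent"; the labels "Poor" and "Average", the thresholds, and the sample amounts below are assumptions for illustration.

```r
# Hypothetical five-level rule; labels "Poor"/"Average" and all
# thresholds are assumed, not taken from the assignment.
categorize_sales <- function(amount) {
  if (amount >= 800) {
    "Excellent"
  } else if (amount >= 600) {
    "Very Good"
  } else if (amount >= 400) {
    "Good"
  } else if (amount >= 200) {
    "Average"
  } else {
    "Poor"
  }
}

sales_amounts <- c(120, 340, 560, 710, 950, 480)  # illustrative data

# Classify each sale one at a time with a loop.
categories <- character(length(sales_amounts))
for (i in seq_along(sales_amounts)) {
  categories[i] <- categorize_sales(sales_amounts[i])
}

# Count sales per category, keeping the levels in performance order.
level_order <- c("Poor", "Average", "Good", "Very Good", "Excellent")
category_counts <- table(factor(categories, levels = level_order))
print(category_counts)
```

Converting to a factor with explicit levels keeps empty categories in the count table and fixes the bar order if the counts are later passed to `barplot()`.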
Task 4 – Multi-Company Dataset Simulation
Objective
This task simulates a dataset containing employees from multiple companies using nested loops. Each employee has attributes such as salary, department, performance score, and KPI score. The dataset is then analyzed to identify top performers and generate summary statistics for each company.
Visualisation
Company Performance Bar Chart
KPI Comparison Chart
Analysis
The simulation generates employee data across multiple companies using nested loops. Each employee is assigned a random salary, department, performance score, and KPI score. Employees with a KPI score above 90 are categorized as top performers.
After generating the dataset, summary statistics are calculated for each company, including the average salary, average performance score, and maximum KPI score. These metrics help evaluate the overall performance of each company.
The visualizations provide a clear comparison between companies, making it easier to identify which company has higher employee salaries or better KPI achievements. This approach demonstrates how simulated data can be used to perform organizational performance analysis.
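The generation and summary steps might look like the sketch below. The company and employee counts, department names, value ranges, and column names are assumptions; only the rule that a KPI score above 90 marks a top performer comes from the text.

```r
set.seed(123)  # assumed seed

n_companies <- 3   # assumed sizes
n_employees <- 50
departments <- c("HR", "Finance", "IT", "Marketing")  # assumed names

rows <- vector("list", n_companies * n_employees)
k <- 1

# Outer loop over companies, inner loop over employees.
for (company in seq_len(n_companies)) {
  for (emp in seq_len(n_employees)) {
    kpi <- round(runif(1, 50, 100), 1)  # assumed score range
    rows[[k]] <- data.frame(
      company_id = company,
      employee_id = emp,
      salary = round(runif(1, 3000, 10000)),  # assumed salary range
      department = sample(departments, 1),
      performance_score = round(runif(1, 50, 100), 1),
      KPI_score = kpi,
      top_performer = kpi > 90  # rule stated in the text
    )
    k <- k + 1
  }
}
employees <- do.call(rbind, rows)

# Per-company summary: average salary, average performance, maximum KPI.
company_summary <- data.frame(
  company_id = sort(unique(employees$company_id)),
  avg_salary = as.numeric(tapply(employees$salary, employees$company_id, mean)),
  avg_performance = as.numeric(tapply(employees$performance_score,
                                      employees$company_id, mean)),
  max_KPI = as.numeric(tapply(employees$KPI_score, employees$company_id, max))
)
print(company_summary)
```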
Task 5 – Monte Carlo Simulation: Pi & Probability
Objective
This task estimates the value of π using the Monte Carlo method. Random points are generated within a square, and the proportion of points that fall inside a circle is used to approximate the value of π.
Visualisation
The scatter plot displays randomly generated points within a square. Points inside the circle are shown in blue, while points outside the circle are shown in red.
Monte Carlo Simulation
## [1] 3.1552
Interpretation
The simulation estimates the value of π by generating random points inside a square and checking whether the points fall inside the circle. The visualization shows two groups of points: points inside the circle and points outside the circle.
The estimated value of π is obtained by multiplying the ratio of points inside the circle to the total number of generated points by 4, because the circle covers π/4 of the square's area. This run produced 3.1552, which is close to the actual value of π (approximately 3.1416).
In the scatter plot, the points that satisfy the condition $x^2 + y^2 \le 1$ fall inside the circular region. As the number of generated points increases, the points cover the square more evenly and the estimated value of π tends to move closer to the true value.
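The estimate can be reproduced in a few lines. The sketch below assumes the square $[-1, 1] \times [-1, 1]$ with a unit circle at the origin, 10,000 points, and an arbitrary seed; the original run's settings are not given, so the result will differ from 3.1552.

```r
set.seed(2024)     # assumed seed
n_points <- 10000  # assumed number of points

# Uniform random points in the square [-1, 1] x [-1, 1].
x <- runif(n_points, min = -1, max = 1)
y <- runif(n_points, min = -1, max = 1)

# A point lies inside the unit circle when x^2 + y^2 <= 1.
inside <- x^2 + y^2 <= 1

# The circle covers pi/4 of the square, so pi is about 4 * (share inside).
pi_estimate <- 4 * mean(inside)
print(pi_estimate)

# Blue points are inside the circle, red points are outside.
plot(x, y, col = ifelse(inside, "blue", "red"), pch = 20, cex = 0.3,
     asp = 1, main = "Monte Carlo estimate of pi")
```

The standard error of this estimate shrinks in proportion to $1/\sqrt{n}$, so gaining one more correct decimal digit takes roughly 100 times as many points.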
Task 6 – Advanced Data Transformation & Feature Engineering
Objective
This task performs data transformation and feature engineering by applying normalization and z-score standardization. These techniques help understand the distribution of data before and after transformation.
Visualisation
Dataset
Normalization Function
Z-Score Transformation
Original Data Distribution
Normalized Data Distribution
Interpretation
Data transformation helps adjust the scale of variables so that they can be analyzed and compared more easily. Normalization converts the data values into a range between 0 and 1 based on the minimum and maximum values.
The z-score transformation measures how far each value is from the mean in terms of standard deviations. By comparing the histograms before and after normalization, we can observe how the transformation changes the scale of the data while maintaining the overall distribution pattern. These transformations are commonly used in data science to improve the performance of analytical models and ensure consistent data scaling.
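Both transformations are short enough to write as functions. In the sketch below the function names `normalize` and `z_score` and the simulated `salary` vector are assumptions for illustration.

```r
# Min-max normalization: rescales values to the range [0, 1].
normalize <- function(x) {
  (x - min(x)) / (max(x) - min(x))
}

# Z-score: distance from the mean in units of standard deviation.
z_score <- function(x) {
  (x - mean(x)) / sd(x)
}

set.seed(1)  # assumed seed
salary <- round(rnorm(100, mean = 7000, sd = 1500))  # illustrative data

salary_norm <- normalize(salary)
salary_z <- z_score(salary)

# Compare the scale of the data before and after each transformation.
print(summary(salary))
print(summary(salary_norm))
print(round(c(mean = mean(salary_z), sd = sd(salary_z)), 4))
```

Both functions divide by a spread measure, so a constant vector (where `max(x) == min(x)` or `sd(x) == 0`) produces `NaN`; a production version would need to handle that case explicitly.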
Task 7 – Mini Project: Company KPI Dashboard & Simulation
Objective
This mini project simulates a dataset of multiple companies with a large number of employees. The analysis focuses on employee KPI performance, department performance, and several visualizations to create a simple analytical dashboard.
Visualisation
Generate Dataset
KPI Categorization
KPI Category Distribution
Company KPI Summary
Company Performance Chart
Department KPI Performance
Salary Distribution
Performance vs KPI Relationship
Interpretation
The simulated dataset represents employees from multiple companies with attributes such as salary, department, performance score, and KPI score. Employees are categorized into different KPI levels such as Top Performer, Good, Average, and Low.
The analysis calculates the average salary, average KPI score, and average performance score for each company. The bar chart visualization allows comparisons between companies, highlighting differences in overall employee performance.
Department analysis shows how KPI performance varies across departments, indicating which departments tend to have higher KPI scores. The salary distribution provides insight into the spread of employee salaries, while the scatter plot between performance score and KPI score helps illustrate the relationship between these two variables. Overall, the dashboard visualizations provide a clear overview of company and employee performance within the simulated dataset.
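The categorization and department-level analysis can be sketched with vectorized functions instead of explicit loops. The dataset size, department names, value ranges, and the cut-offs of 60 and 75 are assumptions; the cut-off of 90 follows the "above 90" top-performer rule from Task 4.

```r
set.seed(7)  # assumed seed
n <- 200     # assumed dataset size

employees <- data.frame(
  company_id = sample(1:5, n, replace = TRUE),
  department = sample(c("HR", "Finance", "IT", "Marketing"), n,
                      replace = TRUE),  # assumed department names
  salary = round(runif(n, 3000, 10000)),
  performance_score = round(runif(n, 50, 100), 1),
  KPI_score = round(runif(n, 50, 100), 1)
)

# Intervals are right-closed: (-Inf, 60], (60, 75], (75, 90], (90, Inf).
# The 60 and 75 thresholds are assumed.
employees$KPI_category <- cut(
  employees$KPI_score,
  breaks = c(-Inf, 60, 75, 90, Inf),
  labels = c("Low", "Average", "Good", "Top Performer")
)

# Distribution of KPI categories.
print(table(employees$KPI_category))

# Average KPI score per department.
dept_kpi <- aggregate(KPI_score ~ department, data = employees, FUN = mean)
print(dept_kpi)

# Strength of the linear relationship between performance and KPI.
perf_kpi_cor <- cor(employees$performance_score, employees$KPI_score)
print(perf_kpi_cor)
```

Because the two scores here are drawn independently, the correlation will be near zero; a scatter plot of real data would only show a trend if the scores are actually related.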
Task 8 – Automated Report Generation (Bonus)
Objective
This task generates an automated report summarizing KPI analysis for each company using functions and loops. The report includes average salary, average KPI score, and the number of top-performing employees in each company.
Visualisation
Automated Company Report
## Company: 1
## Average Salary: 6863.12
## Average KPI: 74.88
## Top Performers: 23
## ----------------------------
## Company: 2
## Average Salary: 6870.13
## Average KPI: 73.1
## Top Performers: 14
## ----------------------------
## Company: 3
## Average Salary: 7155.98
## Average KPI: 75.59
## Top Performers: 23
## ----------------------------
## Company: 4
## Average Salary: 6772.41
## Average KPI: 75.09
## Top Performers: 20
## ----------------------------
## Company: 5
## Average Salary: 7203.16
## Average KPI: 74.55
## Top Performers: 20
## ----------------------------
Export Company Summary
Interpretation
The automated report summarizes key performance indicators for each company in a consistent format. By using functions and loops, the system automatically calculates important metrics such as average salary, average KPI score, and the number of top-performing employees.
This approach allows analysts to quickly generate structured reports for multiple companies without manually calculating each metric. Additionally, exporting the results to a CSV file enables the data to be shared, stored, or used for further analysis and reporting.
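A report of this shape could be produced along the following lines. The function name `generate_report`, the tiny hand-written dataset, and the output file name are assumptions for illustration; the printed layout mirrors the report format shown in this task.

```r
# Small illustrative dataset (assumed values).
employees <- data.frame(
  company_id = c(1, 1, 1, 2, 2),
  salary     = c(5000, 6000, 7000, 8000, 9000),
  KPI_score  = c(95, 80, 91, 70, 92)
)

# Hypothetical helper: one summary row for a single company.
generate_report <- function(data, company) {
  rows <- data[data$company_id == company, ]
  data.frame(
    company_id     = company,
    avg_salary     = round(mean(rows$salary), 2),
    avg_KPI        = round(mean(rows$KPI_score), 2),
    top_performers = sum(rows$KPI_score > 90)  # "above 90" rule
  )
}

# Loop over companies: print each report and keep its summary row.
reports <- list()
for (company in sort(unique(employees$company_id))) {
  r <- generate_report(employees, company)
  cat("Company:", r$company_id, "\n")
  cat("Average Salary:", r$avg_salary, "\n")
  cat("Average KPI:", r$avg_KPI, "\n")
  cat("Top Performers:", r$top_performers, "\n")
  cat("----------------------------\n")
  reports[[length(reports) + 1]] <- r
}

# Combine the rows and export them; the file name is an assumption and the
# file is written to a temporary directory to avoid cluttering the project.
company_summary <- do.call(rbind, reports)
out_file <- file.path(tempdir(), "company_summary.csv")
write.csv(company_summary, out_file, row.names = FALSE)
```

Returning a one-row data frame from the helper keeps the printing and the export in sync, since both read from the same computed values.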