Assignment 5  ·  Python Programming

Dynamic
Loops & Functions

with numpy, pandas & matplotlib ✦

8 tasks · simulations · data transformation · visualization

Inside this Assignment

01 · Multi-Formula Function
02 · Sales & Discounts
03 · Performance Categorization
04 · Multi-Company Dataset
05 · Monte Carlo π
06 · Feature Engineering
07 · KPI Dashboard
08 · Automated Reports

01
Dynamic Functions · Loops · Visualization
Dynamic Multi-Formula Function
Linear: 2x + 3
Quadratic: x² + 2x + 1
Cubic: x³
Exponential: eˣ
compute_formula( )
import numpy as np

def compute_formula(x, formula):
    if formula == "linear":
        return 2*x + 3
    elif formula == "quadratic":
        return x**2 + 2*x + 1
    elif formula == "cubic":
        return x**3
    elif formula == "exponential":
        return np.exp(x)
    else:
        raise ValueError("Formula not recognized")
Looping & Plotting
import matplotlib.pyplot as plt

x = np.arange(1, 21)
formulas = ["linear", "quadratic", "cubic", "exponential"]
results = {}

for formula in formulas:
    y_values = []
    for val in x:
        y = compute_formula(val, formula)
        y_values.append(y)
    results[formula] = y_values

# Plot all formulas in one graph
for formula in formulas:
    plt.plot(x, results[formula], label=formula)
plt.title("Comparison of Mathematical Formulas")
plt.legend()
plt.show()
This program evaluates four kinds of formula: linear, quadratic, cubic, and exponential. The compute_formula function uses conditional statements to select the formula, the loops compute its values for x from 1 to 20, and all four curves are drawn on a single graph for comparison.
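Each branch of compute_formula uses only arithmetic and np.exp, so the same function can also be applied to a whole NumPy array at once, which makes the inner loop optional. A minimal sketch under that assumption; the FORMULAS lookup table and the dictionary-dispatch layout are illustrative choices, not part of the assignment code:

```python
import numpy as np

# Hypothetical lookup table: formula name -> callable
# (an illustrative alternative to the if/elif chain)
FORMULAS = {
    "linear": lambda x: 2 * x + 3,
    "quadratic": lambda x: x**2 + 2 * x + 1,
    "cubic": lambda x: x**3,
    "exponential": np.exp,
}

def compute_formula(x, formula):
    """Evaluate the named formula on a scalar or a NumPy array."""
    try:
        return FORMULAS[formula](x)
    except KeyError:
        raise ValueError("Formula not recognized") from None

x = np.arange(1, 21)

# One vectorized call per formula replaces the value-by-value inner loop
results = {name: compute_formula(x, name) for name in FORMULAS}

print(results["linear"][:3].tolist())     # [5, 7, 9]
print(results["quadratic"][:3].tolist())  # [4, 9, 16]
```

The results dictionary has the same keys as in the loop version, so the plotting loop can stay unchanged.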

02
Nested Loops · Conditional Logic
Nested Simulation: Multi-Sales & Discounts

Discount logic: ≥ 100 → 20% off  |  ≥ 50 → 10% off  |  < 50 → no discount. Simulated over 5 days × 3 products.

apply_discount( ) + nested loop
def apply_discount(amount):
    if amount >= 100:
        return amount * 0.8    # 20% off
    elif amount >= 50:
        return amount * 0.9    # 10% off
    else:
        return amount           # no discount

days, products = 5, 3
sales_data = []

for day in range(1, days+1):
    daily_sales = []
    for product in range(1, products+1):
        sale = np.random.randint(20, 120)
        final_sale = apply_discount(sale)
        daily_sales.append(final_sale)
    sales_data.append(daily_sales)
Sample Output
Sales Data (after discount):
Day 1: [21, 37, 49]
Day 2: [20, 89.6, 77.4]
Day 3: [31, 25, 38]
Day 4: [73.8, 84.8, 47.7]
Day 5: [71.1, 63.0, 86.4]
This simulation models sales for multiple products over several days using nested loops. A discount function applies conditional logic based on the sales amount. Total daily sales are calculated and displayed in a line chart to show the sales trend.
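The daily totals mentioned above are not shown in the listing. One way they might be computed is sketched here, assuming the same discount tiers; the seeded random generator and the list-comprehension layout are assumptions made so the example is reproducible:

```python
import numpy as np

def apply_discount(amount):
    """Same tiers as in the listing: >= 100 -> 20% off, >= 50 -> 10% off."""
    if amount >= 100:
        return amount * 0.8
    elif amount >= 50:
        return amount * 0.9
    return amount

rng = np.random.default_rng(seed=0)  # assumption: fixed seed for reproducibility
days, products = 5, 3

# Outer list: one entry per day; inner list: one discounted sale per product
sales_data = [
    [apply_discount(int(rng.integers(20, 120))) for _ in range(products)]
    for _ in range(days)
]

# One total per day: the values a line chart of the sales trend would plot
daily_totals = [sum(day) for day in sales_data]
for day, total in enumerate(daily_totals, start=1):
    print(f"Day {day}: total = {total:.1f}")
```

Passing range(1, days + 1) and daily_totals to plt.plot would then draw the trend line.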

03
Categorization · Counter · Bar & Pie Chart
Multi-Level Performance Categorization
16.7% Excellent (≥90)
25.0% Very Good (≥75)
25.0% Good (≥60)
25.0% Average (≥40)
8.3% Poor (<40)
categorize_performance( ) + Counter
def categorize_performance(sales_amount):
    if   sales_amount >= 90: return "Excellent"
    elif sales_amount >= 75: return "Very Good"
    elif sales_amount >= 60: return "Good"
    elif sales_amount >= 40: return "Average"
    else:                    return "Poor"

sales = [95, 82, 67, 45, 30, 88, 76, 54, 61, 40, 92, 70]

categories = []
for amount in sales:
    categories.append(categorize_performance(amount))

from collections import Counter
category_counts = Counter(categories)

# Calculate percentages, rounded to two decimals to match the output below
total = len(categories)
percentages = {cat: round(count / total * 100, 2) for cat, count in category_counts.items()}
Counter Output
Counter({'Very Good': 3, 'Good': 3, 'Average': 3, 'Excellent': 2, 'Poor': 1})
Percentages: {'Excellent': 16.67, 'Very Good': 25.0, 'Good': 25.0, 'Average': 25.0, 'Poor': 8.33}
Categorizes 12 sales values into 5 performance levels. Distribution is analyzed by calculating percentages and visualized using both a bar chart and a pie chart.
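The charting step is not part of the listing. A possible sketch follows, assuming a side-by-side layout; the fixed category order, the axis titles, and the headless Agg backend are all assumptions made for illustration:

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def categorize_performance(sales_amount):
    if sales_amount >= 90:
        return "Excellent"
    elif sales_amount >= 75:
        return "Very Good"
    elif sales_amount >= 60:
        return "Good"
    elif sales_amount >= 40:
        return "Average"
    return "Poor"

sales = [95, 82, 67, 45, 30, 88, 76, 54, 61, 40, 92, 70]
counts = Counter(categorize_performance(s) for s in sales)

# Fix the display order from best to worst instead of relying on Counter order
order = ["Excellent", "Very Good", "Good", "Average", "Poor"]
values = [counts[cat] for cat in order]

fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(10, 4))
ax_bar.bar(order, values)
ax_bar.set_title("Sales values per category")
ax_bar.set_ylabel("Count")
ax_pie.pie(values, labels=order, autopct="%1.1f%%")  # autopct prints the share on each wedge
ax_pie.set_title("Share per category")
fig.tight_layout()
```

In an interactive session, dropping the matplotlib.use line and calling plt.show() displays both charts.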

04
Pandas · GroupBy · Scatter Plot
Multi-Company Dataset Simulation
generate_company_data( ) — nested loops
import numpy as np
import pandas as pd

def generate_company_data(n_company, n_employees):
    data = []
    for company in range(1, n_company + 1):
        for emp in range(1, n_employees + 1):
            salary           = np.random.randint(3000, 10000)
            performance_score = np.random.randint(50, 100)
            kpi_score        = np.random.randint(60, 100)
            department       = np.random.choice(["HR","Sales","IT","Finance"])
            data.append([company, emp, salary, department,
                          performance_score, kpi_score])
    return pd.DataFrame(data, columns=["company_id","employee_id",
        "salary","department","performance_score","kpi_score"])

df = generate_company_data(5, 50)
df["top_performer"] = df["kpi_score"] > 90
Summary per Company
            avg_salary  avg_performance  max_kpi
company_id
1              6369.76            75.28       98
2              6430.80            71.30       99
3              6436.70            73.10       99
4              6657.22            76.30       99
5           (generated)
Top Performers per Company
company_id
1    10
2     9
3    16
4    14
5    19
Name: top_performer, dtype: int64
Simulates employee data across 5 companies using nested loops. Each employee has a salary, department, performance score, and KPI score. Top performers (KPI > 90) are identified. Results are visualized with bar charts and scatter plots.
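The listing stops before the two summary tables are built. A sketch of how they could be produced with groupby is given below; the vectorized data generation, the seed, and the use of named aggregation are assumptions, while the column names follow the tables above:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=42)  # assumption: fixed seed for reproducibility
n_company, n_employees = 5, 50
n_rows = n_company * n_employees

# Vectorized stand-in for generate_company_data (same value ranges, upper bounds exclusive)
df = pd.DataFrame({
    "company_id": np.repeat(np.arange(1, n_company + 1), n_employees),
    "employee_id": np.tile(np.arange(1, n_employees + 1), n_company),
    "salary": rng.integers(3000, 10000, size=n_rows),
    "performance_score": rng.integers(50, 100, size=n_rows),
    "kpi_score": rng.integers(60, 100, size=n_rows),
})
df["top_performer"] = df["kpi_score"] > 90

# Named aggregation: output column = (input column, aggregation function)
summary = df.groupby("company_id").agg(
    avg_salary=("salary", "mean"),
    avg_performance=("performance_score", "mean"),
    max_kpi=("kpi_score", "max"),
).round(2)

# Summing a boolean column counts the True values per company
top_per_company = df.groupby("company_id")["top_performer"].sum()

print(summary)
print(top_per_company)
```

The printed numbers depend on the seed and will not match the sample tables above.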

05
Monte Carlo · Probability · Scatter
Monte Carlo Simulation (π & Probability)

Estimated π ≈ 3.1216 using 5,000 random points. Probability inside sub-square: 0.243. The fraction of points falling inside the quarter circle (x² + y² ≤ 1) approximates π/4, so multiplying that fraction by 4 gives the estimate of π.

monte_carlo_pi( n_points )
def monte_carlo_pi(n_points):
    inside_circle = 0
    x_inside, y_inside = [], []
    x_outside, y_outside = [], []
    sub_square_count = 0  # the sub-square test that updates this counter is not shown in this excerpt

    for i in range(n_points):
        x = np.random.rand()
        y = np.random.rand()

        if x**2 + y**2 <= 1:
            inside_circle += 1
            x_inside.append(x); y_inside.append(y)
        else:
            x_outside.append(x); y_outside.append(y)

    pi_estimate = 4 * inside_circle / n_points
    return pi_estimate, sub_square_count/n_points, ...
Simulation Result (n = 5000)
Estimated Pi: 3.1216
Probability in sub-square: 0.243
Estimates π using the Monte Carlo method by randomly placing 5,000 points in a unit square and checking how many fall inside the quarter circle. Visualized by coloring inside vs outside points in a scatter plot.
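The quarter circle of radius 1 has area π/4 and the unit square has area 1, so the share of uniform random points inside the circle estimates π/4. The sketch below vectorizes that idea. The excerpt does not define the sub-square, so the region [0, 0.5) × [0, 0.5) used here (expected probability 0.25) is purely an assumption, as are the sub_side and seed parameters:

```python
import numpy as np

def monte_carlo_pi(n_points, sub_side=0.5, seed=None):
    """Estimate pi and the probability of landing in [0, sub_side) x [0, sub_side).

    sub_side is hypothetical: the original excerpt does not define the sub-square.
    """
    rng = np.random.default_rng(seed)
    x = rng.random(n_points)  # uniform in [0, 1)
    y = rng.random(n_points)

    inside = x**2 + y**2 <= 1                 # boolean mask: quarter circle
    in_sub = (x < sub_side) & (y < sub_side)  # boolean mask: assumed sub-square

    pi_estimate = 4 * inside.mean()  # fraction inside approximates pi/4
    return pi_estimate, in_sub.mean(), inside

pi_est, p_sub, inside = monte_carlo_pi(5000, seed=0)
print(f"Estimated pi: {pi_est:.4f}")
print(f"Probability in sub-square: {p_sub:.3f}")
```

The returned mask can drive the scatter plot: the points selected by inside get one color and the points selected by ~inside get another.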

06
Feature Engineering · Normalization · Z-Score
Advanced Data Transformation & Feature Engineering
normalize_columns( ) — min-max scaling
def normalize_columns(df):
    df_norm = df.copy()
    cols = ["salary", "performance_score", "kpi_score"]
    for col in cols:
        min_val = df[col].min()
        max_val = df[col].max()
        df_norm[col] = (df[col] - min_val) / (max_val - min_val)
    return df_norm
z_score( ) — standardization
def z_score(df):
    df_z = df.copy()
    cols = ["salary", "performance_score", "kpi_score"]
    for col in cols:
        mean = df[col].mean()
        std  = df[col].std()
        df_z[col] = (df[col] - mean) / std
    return df_z
New Feature Columns
# Performance category from score
def performance_category(score):
    if   score >= 90: return "Excellent"
    elif score >= 75: return "Very Good"
    elif score >= 60: return "Good"
    else:             return "Needs Improvement"

# Salary bracket
def salary_bracket(salary):
    if   salary >= 8000: return "High"
    elif salary >= 5000: return "Medium"
    else:                 return "Low"

df["performance_category"] = df["performance_score"].apply(performance_category)
df["salary_bracket"]       = df["salary"].apply(salary_bracket)
Performs min-max normalization and z-score standardization using loop-based calculations. New features (performance category, salary bracket) are engineered with apply(). Salary distributions before and after normalization are compared with histograms and boxplots.
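A small worked check can make the two scalings concrete. The four salary values below are toy numbers chosen for easy arithmetic, not the simulated data:

```python
import pandas as pd

df = pd.DataFrame({"salary": [3000, 5000, 8000, 10000]})  # toy values (assumption)
col = df["salary"]

# Min-max scaling: (x - min) / (max - min) maps the column onto [0, 1]
min_max = (col - col.min()) / (col.max() - col.min())

# Z-score: (x - mean) / std; pandas std() is the sample standard deviation (ddof=1)
z = (col - col.mean()) / col.std()

print(min_max.round(3).tolist())  # [0.0, 0.286, 0.714, 1.0]
print(z.round(3).tolist())
```

After min-max scaling the smallest value is 0 and the largest is 1; after standardization the column has mean 0 and sample standard deviation 1, up to floating-point error.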

07
Mini Project · Dashboard · Multi-Company
Company KPI Dashboard & Simulation
kpi_tier( ) — loop-based classification
def kpi_tier(score):
    if   score >= 90: return "Top Performer"
    elif score >= 75: return "High"
    elif score >= 60: return "Medium"
    else:             return "Low"

tiers = []
for score in df["kpi_score"]:
    tiers.append(kpi_tier(score))
df["kpi_tier"] = tiers
Company ID   Avg Salary   Avg KPI   Avg KPI Tier
1            6,295.34     79.60     High
2            6,603.98     80.53     High
3            6,300.22     77.88     High
4            6,436.36     78.23     High
5            6,028.11     80.13     High
6            6,395.55     79.86     High
7            6,331.54               Medium
8                                   Medium
Department KPI Mean
Finance   79.26
HR        79.76
IT        78.65
Sales     79.63
Mini project simulating 8 companies × 120 employees. KPI tiers are classified with a loop. Analysis includes avg salary/KPI per company, top performer identification, department-level KPI, salary distribution histogram, grouped bar chart by dept & company, and performance vs KPI scatter with regression line.
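The loop-based tier classification could also be written with pandas binning. This is an alternative sketch rather than the assignment's method; the bin edges are taken from the kpi_tier thresholds, and the sample scores are boundary values chosen for illustration:

```python
import numpy as np
import pandas as pd

def kpi_tier(score):
    if score >= 90:
        return "Top Performer"
    elif score >= 75:
        return "High"
    elif score >= 60:
        return "Medium"
    return "Low"

scores = pd.Series([59, 60, 74, 75, 89, 90, 99], name="kpi_score")  # boundary cases

# right=False makes every bin closed on the left: [60, 75), [75, 90), [90, inf)
tiers_cut = pd.cut(
    scores,
    bins=[-np.inf, 60, 75, 90, np.inf],
    labels=["Low", "Medium", "High", "Top Performer"],
    right=False,
)

# Cross-check against the loop version
tiers_loop = [kpi_tier(s) for s in scores]
print(tiers_cut.astype(str).tolist() == tiers_loop)  # True
```

pd.cut returns an ordered categorical, which keeps the tiers in a sensible order when they are used as a groupby key or a chart axis.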

08
Bonus / Advanced · Automated Reporting · CSV Export
Automated Report Generation
company_report( df )
def company_report(df):
    companies = df["company_id"].unique()
    for c in companies:
        company_df = df[df["company_id"] == c]
        avg_salary  = company_df["salary"].mean()
        avg_kpi     = company_df["kpi_score"].mean()
        top         = len(company_df[company_df["kpi_score"] > 90])
        print(f"------ Company {c} ------")
        print(f"Employees:      {len(company_df)}")
        print(f"Average Salary: {round(avg_salary, 2)}")
        print(f"Average KPI:    {round(avg_kpi, 2)}")
        print(f"Top Performers: {top}")
Printed Report Output
------ Company 1 ------
Employees:      120
Average Salary: 6295.34
Average KPI:    79.6
Top Performers: 27
------ Company 2 ------
Employees:      120
Average Salary: 6603.98
Average KPI:    80.53
Top Performers: 29
...
GroupBy Summary + CSV Export
report_summary = df.groupby("company_id").agg({
    "salary":      "mean",
    "kpi_score":   "mean",
    "employee_id": "count"
}).rename(columns={
    "salary":      "avg_salary",
    "kpi_score":   "avg_kpi",
    "employee_id": "num_employees"
})

report_summary.to_csv("company_summary_report.csv")
Generates automated per-company reports by looping through unique company IDs and computing summary statistics. The final summary table — avg salary, avg KPI, employee count — is exported as a CSV file for a simple automated reporting pipeline.
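To confirm that an exported summary reads back intact, a round trip through CSV can be sketched as follows. The five toy rows and the in-memory buffer (used here instead of writing company_summary_report.csv to disk) are assumptions, and named aggregation stands in for the agg-then-rename pattern:

```python
import io

import pandas as pd

# Toy rows (assumption), not the simulated dataset
df = pd.DataFrame({
    "company_id": [1, 1, 2, 2, 2],
    "employee_id": [1, 2, 1, 2, 3],
    "salary": [4000, 6000, 5000, 7000, 9000],
    "kpi_score": [80, 92, 70, 85, 91],
})

# Named aggregation yields the final column names in one step
report_summary = df.groupby("company_id").agg(
    avg_salary=("salary", "mean"),
    avg_kpi=("kpi_score", "mean"),
    num_employees=("employee_id", "count"),
)

buffer = io.StringIO()  # in-memory stand-in for the CSV file
report_summary.to_csv(buffer)
buffer.seek(0)

# index_col restores company_id as the index on the way back in
restored = pd.read_csv(buffer, index_col="company_id")
print(restored)
```

Replacing the buffer with a file path gives the same result on disk.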