Member


Project Team

GROUP 8


Joans Henky Servatius Simanullang (52240017)


Isnaini Nur Hasanah (52240005)


Data Science Study Program, Faculty of Digital, Design, and Business, Institut Teknologi Sains Bandung

Objectives

A. Dataset Understanding & Exploratory Data Analysis (EDA)

(Weight: ±25%)

Students are required to:

  • Describe the dataset context and analytical objectives.
  • Explain the data structure and variable types.
  • Present key descriptive statistics.
  • Identify and discuss:
    • missing values,
    • outliers,
    • data distributions.
  • Provide at least five (5) relevant data visualizations.
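
The missing-value and outlier checks listed above can be sketched as follows. This is a minimal illustration using only the Python standard library. The `income` values are hypothetical and not taken from the project dataset, and the 1.5 × IQR rule is one common convention, assumed here for illustration rather than confirmed as the project's method.

```python
from statistics import quantiles


def missing_count(values):
    """Count entries that are None (treated here as missing)."""
    return sum(1 for v in values if v is None)


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring missing entries."""
    clean = sorted(v for v in values if v is not None)
    q1, _, q3 = quantiles(clean, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in clean if v < low or v > high]


# Hypothetical income values, not taken from the project dataset.
income = [42, 45, None, 47, 50, 51, 53, None, 55, 300]

print("missing:", missing_count(income))
print("outliers:", iqr_outliers(income))
```

Flagged values are candidates for review, not automatic deletions; whether to keep, cap, or drop them depends on the dataset context.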

B. Relationship and Pattern Analysis

(Weight: ±20%)

Students are required to:

  • Analyze relationships among key variables.
  • Apply appropriate analytical techniques (e.g., correlation, regression, cross-tabulation).
  • Identify potential data issues (e.g., multicollinearity, heterogeneity).
  • Interpret analytical results clearly and logically.

C. Advanced Analysis (Context-Dependent)

(Weight: ±20%)

Students are required to apply an advanced analytical approach that is appropriate to the dataset, such as:

  • Time series analysis (if time-related variables exist),
  • Clustering or segmentation,
  • Risk or anomaly detection,
  • Classification or forecasting.

D. Analytical / Predictive Modeling

(Weight: ±25%)

Students are required to:

  • Develop at least one analytical or predictive model.
  • Explain model selection and underlying assumptions.
  • Evaluate model performance using appropriate metrics.
  • Discuss model limitations and potential improvements.

E. Insights, Conclusions, and Recommendations

(Weight: ±10%)

Students are required to:

  • Summarize key findings from the analysis.
  • Present data-driven insights.
  • Provide logical and actionable recommendations aligned with the dataset context.

Dataset

Table

Cleaned Data Table

EDA

Correlation Heatmap

Scatter: Income vs Loan

Boxplot: Balance Distribution

Line Chart: Loan Trend

Histogram: Income Distribution

Regression

Model Fit (R-Squared)

6.4%

Root Mean Squared Error (RMSE)

251

Mean Absolute Error (MAE)

194

Observations

7,361
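
For readers who want to see how the three regression metrics above relate to each other, here is a minimal sketch using their standard definitions. The `actual` and `predicted` values are hypothetical, and the function name `regression_metrics` is introduced only for illustration; the dashboard does not state how its own figures were computed.

```python
import math


def regression_metrics(actual, predicted):
    """Compute R-squared, RMSE, and MAE from paired actual/predicted values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return {"r2": r2, "rmse": rmse, "mae": mae}


# Hypothetical values, not taken from the project dataset.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 320.0, 380.0]

metrics = regression_metrics(actual, predicted)
print(metrics)
```

Under these definitions, R-squared is the share of variance in the target that the model explains, and RMSE is never smaller than MAE because squaring weights large errors more heavily.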

Best Model Visualization: Actual vs. Predicted (Validation)

Significant Factors (Model Coefficients)


Classification

Accuracy

87.2%

Sensitivity (Recall)

90.6%

Risk Rate (Test Data)

19.8%
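
Accuracy and sensitivity are both derived from the four cells of a confusion matrix. The sketch below shows the standard formulas. The counts are hypothetical, and treating the "risky" class as the positive class is an assumption; the dashboard does not state which class it uses.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, sensitivity (recall), and specificity from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,      # share of all predictions that are correct
        "sensitivity": tp / (tp + fn),      # share of actual positives that are detected
        "specificity": tn / (tn + fp),      # share of actual negatives that are detected
    }


# Hypothetical counts, not taken from the project's confusion matrix.
result = classification_metrics(tp=45, fp=10, fn=5, tn=140)
print(result)
```

When the positive class is a minority, accuracy alone can look strong even if many positives are missed, which is why sensitivity is reported alongside it.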

Confusion Matrix (Prediction Evaluation)

Risk Determinants (Odds Ratio)


Clustering

Number of Clusters (Optimal K)

3

Segmentation Method

K-Means Algorithm

Dominant Cluster

4,751
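
As a rough illustration of the K-Means segmentation named above, the sketch below implements the basic Lloyd iteration (assign each point to its nearest centroid, then recompute centroids) with K = 3. The points and starting centroids are hypothetical, and the helper names `assign` and `kmeans` are introduced only for this example. The project's actual features, scaling, and initialization are not stated in the dashboard.

```python
import math
from collections import Counter


def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean distance)."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, centroids, max_iter=100):
    """Run Lloyd's algorithm from the given starting centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                # New centroid is the coordinate-wise mean of the cluster members.
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid unchanged
        if updated == centroids:
            break  # converged: centroids no longer move
        centroids = updated
    return assign(points, centroids), centroids


# Hypothetical 2-D points forming three loose groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9), (1, 9), (2, 9), (1, 8)]
labels, centroids = kmeans(points, centroids=[(1, 1), (8, 8), (1, 9)])

print(labels)
print(Counter(labels))  # cluster sizes; the largest identifies the dominant cluster
```

Counting labels per cluster, as in the last line, is one way to obtain a dominant-cluster size like the one reported above.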

Segmentation Scatter Visualization (PCA Biplot)

Interpretation & Business Implications (Profiling Table)


Time Series

Forecast Error (MAPE)

15.84%

Root Mean Squared Error (RMSE)

1,430

Time Series Model

ts_data
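
MAPE expresses forecast error as an average percentage of the actual values, which makes it easier to read than RMSE when the scale of the series is unfamiliar. A minimal sketch of the standard formula follows. The `actual` and `forecast` values are hypothetical, and skipping zero actuals is one possible convention assumed here, since the percentage error is undefined at zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent, skipping zero actual values."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100 * sum(abs((a - f) / a) for a, f in pairs) / len(pairs)


# Hypothetical monthly values, not taken from the project dataset.
actual = [100, 200, 400]
forecast = [90, 220, 400]

error_pct = mape(actual, forecast)
print(round(error_pct, 2))
```

A lower MAPE means a closer forecast, so the figure is better read as an error rate than as an accuracy score.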

Forecast: Actual vs. Predicted Data (Next 6 Months)

Time Series Decomposition (Trend, Seasonal, Noise)


Insights

Students are required to: