______________________________________________________

Scenario

______________________________________________________

This exercise shows how to apply Multiple Linear Regression in R. The objective is to predict students’ final exam scores from three independent variables: Hours Studied, Attendance Rate, and Assignment Score.

We use the ‘Student Performance Dataset’ to predict ‘Final Exam Score’ (Final_Exam) from Hr_Studied, Attendance, and Assignment. Multiple Linear Regression estimates how each of these variables relates to the final exam result while the other variables are held fixed.

______________________________________________________

Step-1: Create the Dataset

______________________________________________________

# Creating the student dataset
# ----------------------------
student_ds <- data.frame(
  Hr_Studied = c(5,8,2,7,6,9,4,10),
  Attendance = c(80,90,60,85,75,95,70,98),
  Assignment = c(70,85,55,80,78,92,60,96),
  Final_Exam = c(65,88,50,82,76,94,62,98)
)

# Display the student dataset
# ---------------------------
print(student_ds)
##   Hr_Studied Attendance Assignment Final_Exam
## 1          5         80         70         65
## 2          8         90         85         88
## 3          2         60         55         50
## 4          7         85         80         82
## 5          6         75         78         76
## 6          9         95         92         94
## 7          4         70         60         62
## 8         10         98         96         98

______________________________________________________

Step-2: Explore the Dataset

______________________________________________________

# Structure of the dataset
# ------------------------
str(student_ds)
## 'data.frame':    8 obs. of  4 variables:
##  $ Hr_Studied: num  5 8 2 7 6 9 4 10
##  $ Attendance: num  80 90 60 85 75 95 70 98
##  $ Assignment: num  70 85 55 80 78 92 60 96
##  $ Final_Exam: num  65 88 50 82 76 94 62 98
# Summary statistics
# ------------------
summary(student_ds)
##    Hr_Studied       Attendance      Assignment      Final_Exam   
##  Min.   : 2.000   Min.   :60.00   Min.   :55.00   Min.   :50.00  
##  1st Qu.: 4.750   1st Qu.:73.75   1st Qu.:67.50   1st Qu.:64.25  
##  Median : 6.500   Median :82.50   Median :79.00   Median :79.00  
##  Mean   : 6.375   Mean   :81.62   Mean   :77.00   Mean   :76.88  
##  3rd Qu.: 8.250   3rd Qu.:91.25   3rd Qu.:86.75   3rd Qu.:89.50  
##  Max.   :10.000   Max.   :98.00   Max.   :96.00   Max.   :98.00
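
Before fitting a model it can help to check how strongly the variables move together, because predictors that are highly correlated with each other make individual regression coefficients harder to interpret. The following is a minimal sketch of that check; the dataset is rebuilt only so the block runs on its own, and the name cor_matrix is an illustrative choice, not part of the original exercise.

```r
# Rebuild the dataset from Step-1 so this block is self-contained
student_ds <- data.frame(
  Hr_Studied = c(5, 8, 2, 7, 6, 9, 4, 10),
  Attendance = c(80, 90, 60, 85, 75, 95, 70, 98),
  Assignment = c(70, 85, 55, 80, 78, 92, 60, 96),
  Final_Exam = c(65, 88, 50, 82, 76, 94, 62, 98)
)

# Pairwise Pearson correlations between all four variables
cor_matrix <- cor(student_ds)
print(round(cor_matrix, 3))

# Scatter-plot matrix for a visual check of the same relationships
pairs(student_ds, pch = 19, main = "Pairwise Relationships")
```

Correlations close to 1 between two predictors suggest that the model may struggle to separate their individual effects.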

______________________________________________________

Step-3: Fit Multiple Linear Regression Model

______________________________________________________

# Building the regression model
# -----------------------------
model <- lm(Final_Exam ~ Hr_Studied + Attendance + Assignment, data = student_ds)

# Display model results
# ---------------------
summary(model)
## 
## Call:
## lm(formula = Final_Exam ~ Hr_Studied + Attendance + Assignment, 
##     data = student_ds)
## 
## Residuals:
##       1       2       3       4       5       6       7       8 
## -1.2306  1.1975  0.2227  1.3683 -0.4524  1.0717 -0.2318 -1.9454 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) 59.04408   19.98987   2.954   0.0418 *
## Hr_Studied   8.34302    2.22418   3.751   0.0199 *
## Attendance  -0.41186    0.23404  -1.760   0.1533  
## Assignment  -0.02256    0.29796  -0.076   0.9433  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.586 on 4 degrees of freedom
## Multiple R-squared:  0.9949, Adjusted R-squared:  0.9911 
## F-statistic: 260.4 on 3 and 4 DF,  p-value: 4.859e-05
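
The coefficient table above can also be pulled out of the model object and paired with confidence intervals, which makes the uncertainty of each estimate easier to see than p-values alone. This is a sketch under the same model specification; coef_table and ci are illustrative names, and the dataset is rebuilt only so the block runs on its own.

```r
# Rebuild the dataset and model from Steps 1 and 3
student_ds <- data.frame(
  Hr_Studied = c(5, 8, 2, 7, 6, 9, 4, 10),
  Attendance = c(80, 90, 60, 85, 75, 95, 70, 98),
  Assignment = c(70, 85, 55, 80, 78, 92, 60, 96),
  Final_Exam = c(65, 88, 50, 82, 76, 94, 62, 98)
)
model <- lm(Final_Exam ~ Hr_Studied + Attendance + Assignment, data = student_ds)

# Coefficient matrix: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(model))
print(round(coef_table, 4))

# 95% confidence intervals for the intercept and each slope
ci <- confint(model, level = 0.95)
print(round(ci, 3))
```

An interval that contains zero means the data are compatible with that predictor having no effect once the other predictors are in the model.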

______________________________________________________

Step-4: Interpretation of the Results

______________________________________________________

Regression Equation

—————————

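The model follows this equation:

\[ Y = \beta_0 + \beta_1X_1 + \beta_2X_2 + \beta_3X_3 \]

Where:
- \(Y\) = Final Exam Score
- \(X_1\) = Hours Studied
- \(X_2\) = Attendance Rate
- \(X_3\) = Assignment Score
- \(\beta_0\) = intercept, and \(\beta_1, \beta_2, \beta_3\) = the change in \(Y\) for a one-unit increase in the corresponding variable, holding the others fixed

Substituting the estimates from the Step-3 output gives the fitted equation:

\[ \hat{Y} = 59.04 + 8.34X_1 - 0.41X_2 - 0.02X_3 \]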
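
To make the regression equation concrete, the sketch below plugs the estimated coefficients into it by hand for the first student in the dataset and compares the result with the fitted value stored in the model. The names b and y_hat_1 are illustrative; the dataset and model are rebuilt only so the block runs on its own.

```r
# Rebuild the dataset and model from Steps 1 and 3
student_ds <- data.frame(
  Hr_Studied = c(5, 8, 2, 7, 6, 9, 4, 10),
  Attendance = c(80, 90, 60, 85, 75, 95, 70, 98),
  Assignment = c(70, 85, 55, 80, 78, 92, 60, 96),
  Final_Exam = c(65, 88, 50, 82, 76, 94, 62, 98)
)
model <- lm(Final_Exam ~ Hr_Studied + Attendance + Assignment, data = student_ds)

# Estimated coefficients: beta_0 (intercept) and the three slopes
b <- coef(model)

# Predictor values of the first student
x1 <- student_ds$Hr_Studied[1]
x2 <- student_ds$Attendance[1]
x3 <- student_ds$Assignment[1]

# Y-hat = beta_0 + beta_1*X1 + beta_2*X2 + beta_3*X3
y_hat_1 <- unname(b["(Intercept)"] + b["Hr_Studied"] * x1 +
                  b["Attendance"] * x2 + b["Assignment"] * x3)
print(y_hat_1)

# The same value as computed by lm()
print(unname(fitted(model)[1]))
```

Both printed values should agree, which confirms that a prediction is nothing more than the equation evaluated at the estimated coefficients.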

______________________________________________________

Step-5: Compare Actual vs Predicted Values

______________________________________________________

# Predict final exam scores
# -------------------------
predicted_scores <- predict(model)

# Create comparison dataset
# -------------------------
results <- data.frame(
  Actual = student_ds$Final_Exam,
  Predicted = predicted_scores
)
print(results)
##   Actual Predicted
## 1     65  66.23058
## 2     88  86.80253
## 3     50  49.77727
## 4     82  80.63165
## 5     76  76.45240
## 6     94  92.92828
## 7     62  62.23184
## 8     98  99.94545
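
The same predict() call can score a student who is not in the dataset by passing newdata. The sketch below does this for a hypothetical student (6 hours studied, attendance 82, assignment score 78; these values are made up for illustration) and also requests a 95% prediction interval.

```r
# Rebuild the dataset and model from Steps 1 and 3
student_ds <- data.frame(
  Hr_Studied = c(5, 8, 2, 7, 6, 9, 4, 10),
  Attendance = c(80, 90, 60, 85, 75, 95, 70, 98),
  Assignment = c(70, 85, 55, 80, 78, 92, 60, 96),
  Final_Exam = c(65, 88, 50, 82, 76, 94, 62, 98)
)
model <- lm(Final_Exam ~ Hr_Studied + Attendance + Assignment, data = student_ds)

# Hypothetical new student (illustrative values, not from the dataset)
new_student <- data.frame(Hr_Studied = 6, Attendance = 82, Assignment = 78)

# Point prediction (fit) with lower and upper 95% prediction bounds
new_pred <- predict(model, newdata = new_student,
                    interval = "prediction", level = 0.95)
print(new_pred)
```

The column names in newdata must match the predictor names used in the model formula. With only 8 observations the interval may be fairly wide.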

______________________________________________________

Step-6: Scatter Plot of Actual vs Predicted Scores

______________________________________________________

# Plot actual vs predicted scores
# -------------------------------
plot(results$Actual, results$Predicted, main = "Actual vs Predicted Final Exam Scores",
     xlab = "Actual Scores", ylab = "Predicted Scores", pch = 19)

# Adding the y = x reference line (perfect predictions fall on it)
# ----------------------------------------------------------------
abline(a = 0, b = 1, col = "yellow", lwd = 2)

## Visualization Explanation
## -------------------------
## - Points close to the yellow line indicate good predictions.
## - Large distances from the line indicate prediction errors.
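
The distances from the line described above are the model residuals. As a complementary view, the sketch below plots residuals against fitted values and summarizes their typical size with the root mean squared error (RMSE); res and rmse are illustrative names, and the dataset and model are rebuilt only so the block runs on its own.

```r
# Rebuild the dataset and model from Steps 1 and 3
student_ds <- data.frame(
  Hr_Studied = c(5, 8, 2, 7, 6, 9, 4, 10),
  Attendance = c(80, 90, 60, 85, 75, 95, 70, 98),
  Assignment = c(70, 85, 55, 80, 78, 92, 60, 96),
  Final_Exam = c(65, 88, 50, 82, 76, 94, 62, 98)
)
model <- lm(Final_Exam ~ Hr_Studied + Attendance + Assignment, data = student_ds)

# Residual = actual score minus predicted score
res <- residuals(model)

# Residuals vs fitted values; points should scatter around zero without a pattern
plot(fitted(model), res, main = "Residuals vs Fitted Values",
     xlab = "Predicted Scores", ylab = "Residuals", pch = 19)
abline(h = 0, lty = 2)

# Root mean squared error, in exam-score points
rmse <- sqrt(mean(res^2))
print(rmse)
```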

______________________________________________________

Step-7: Model Performance

______________________________________________________

# Calculate R-squared
# -------------------
R2 <- cor(results$Actual, results$Predicted)^2
print(R2)
## [1] 0.994905
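# Interpretation
# --------------
# R-squared is the proportion of the variation in final exam scores that is explained by the independent variables.
# Here R² = 0.9949, so about 99.5% of the variation in final exam scores is explained by the model.
# With only 8 observations and 3 predictors, such a high value should be read with caution (Adjusted R² = 0.9911).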
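
For a least-squares model with an intercept, the squared correlation used above equals R-squared computed from its definition. The sketch below computes R-squared and adjusted R-squared from the sums of squares and checks them against the values reported by summary(); sse, sst, r2, and adj_r2 are illustrative names.

```r
# Rebuild the dataset and model from Steps 1 and 3
student_ds <- data.frame(
  Hr_Studied = c(5, 8, 2, 7, 6, 9, 4, 10),
  Attendance = c(80, 90, 60, 85, 75, 95, 70, 98),
  Assignment = c(70, 85, 55, 80, 78, 92, 60, 96),
  Final_Exam = c(65, 88, 50, 82, 76, 94, 62, 98)
)
model <- lm(Final_Exam ~ Hr_Studied + Attendance + Assignment, data = student_ds)

# Residual sum of squares and total sum of squares
sse <- sum(residuals(model)^2)
sst <- sum((student_ds$Final_Exam - mean(student_ds$Final_Exam))^2)

# R-squared = 1 - SSE / SST
r2 <- 1 - sse / sst

# Adjusted R-squared penalizes the number of predictors p
n <- nrow(student_ds)
p <- 3
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(c(R2 = r2, Adjusted_R2 = adj_r2))

# Values reported by summary(), for comparison
print(c(summary(model)$r.squared, summary(model)$adj.r.squared))
```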

______________________________________________________

Conclusion and Analysis

______________________________________________________

This exercise demonstrated the application of Multiple Linear Regression using R programming.

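The analysis showed that:
- Hours studied is the only statistically significant predictor at the 5% level (estimate 8.34, p = 0.0199): students who study more tend to perform better.
- Attendance has a negative, non-significant coefficient (-0.41, p = 0.1533), so this model does not show that higher attendance improves final exam performance.
- Assignment score has a coefficient close to zero (-0.02, p = 0.9433) and adds little once the other variables are in the model.
- The overall fit is very strong (R-squared = 0.9949, F-test p-value = 4.859e-05), but with only 8 observations and predictors that rise and fall together, the individual coefficients should be interpreted with caution.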