Ordinal Logistic Regression

Author

Takafumi Kubota

Published

June 7, 2024

Abstract

This study compares multinomial and ordinal logistic regression models using custom penalized error metrics. Larger penalties are applied to badly misordered predictions, and each model's performance is assessed through penalized Mean Absolute Error (MAE) and Mean Squared Error (MSE), highlighting the benefit of accounting for category order when judging predictive accuracy.

Keywords

Multinomial Logistic Regression, Ordinal Logistic Regression, Penalized Error Metrics

1 Difference Between Multinomial Logistic Regression and Ordinal Logistic Regression

1.1 R Code

Code
#install.packages("MASS")
#install.packages("caret")

library(MASS)
library(caret)
Loading required package: ggplot2
Loading required package: lattice
Code
# Create the data
set.seed(42)
data <- data.frame(
  study_hours = rnorm(100, mean = 5, sd = 2),
  attendance = rnorm(100, mean = 80, sd = 10),
  past_grades = rnorm(100, mean = 75, sd = 15),
  grade = factor(sample(c("Fail", "Pass", "Good", "Excellent"), 100, replace = TRUE), ordered = TRUE)
)

# Split into training and test data
set.seed(42)
trainIndex <- createDataPartition(data$grade, p = .7, list = FALSE)
trainData <- data[trainIndex, ]
testData <- data[-trainIndex, ]

# Fit the model
model <- polr(grade ~ study_hours + attendance + past_grades, data = trainData, method = "logistic")

# Summary of the model
summary(model)

Re-fitting to get Hessian
Call:
polr(formula = grade ~ study_hours + attendance + past_grades, 
    data = trainData, method = "logistic")

Coefficients:
                Value Std. Error t value
study_hours -0.100706    0.10773 -0.9348
attendance   0.003411    0.02697  0.1265
past_grades  0.007496    0.01353  0.5539

Intercepts:
               Value   Std. Error t value
Excellent|Fail -0.7294  2.3099    -0.3158
Fail|Good       0.3684  2.3033     0.1599
Good|Pass       1.4210  2.3180     0.6130

Residual Deviance: 198.2922 
AIC: 210.2922 
Code
# Predictions
predictions <- predict(model, testData)

# Calculate MAE
mae <- mean(abs(as.numeric(testData$grade) - as.numeric(predictions)))
print(paste("Mean Absolute Error: ", mae))
[1] "Mean Absolute Error:  1.75"
Code
# Calculate MSE
mse <- mean((as.numeric(testData$grade) - as.numeric(predictions))^2)
print(paste("Mean Squared Error: ", mse))
[1] "Mean Squared Error:  4.25"

1.2 Code Explanation

  1. Install and Load Necessary Libraries:

    • Install and load the MASS and caret packages.

    • The MASS package is used to fit the ordinal logistic regression model, and the caret package is used to split the data.

  2. Prepare the Dataset:

    • Create a hypothetical dataset where student study hours, attendance, and past grades are the features, and the grade is categorized into four ordered categories: “Fail”, “Pass”, “Good”, “Excellent”. Note that factor() orders the levels alphabetically unless a levels argument is supplied, which is why the model intercepts above are labelled Excellent|Fail, Fail|Good, and Good|Pass; the sketch after this list shows how to enforce the intended order explicitly.

    • Use the createDataPartition function to split the dataset into 70% training data and 30% test data.

  3. Fit the Ordinal Logistic Regression Model:

    • Use the polr function from the MASS package to fit an ordinal logistic regression model using the logistic method.

    • Display the summary of the fitted model using the summary function.

  4. Prediction and Evaluation:

    • Use the predict function to make predictions on the test data.

    • Calculate the Mean Absolute Error (MAE) and Mean Squared Error (MSE) from the absolute and squared differences between the integer codes of the predicted and actual grade categories, and display these values to evaluate the model’s performance (the sketch after this list shows how the ordered factor maps to those codes).
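
The error metrics above operate on the integer codes that R assigns to the levels of an ordered factor. The following is a minimal sketch, not the code used to produce the output above: it shows how the intended ordering Fail < Pass < Good < Excellent can be enforced with an explicit levels argument (the results shown earlier were produced with the default alphabetical ordering), and how as.numeric() recovers the level index used in the MAE/MSE calculations.

Code
# Sketch only: make the intended ordering explicit instead of relying on the
# alphabetical default ("Excellent" < "Fail" < "Good" < "Pass").
grade_levels <- c("Fail", "Pass", "Good", "Excellent")
grade <- factor(sample(grade_levels, 100, replace = TRUE),
                levels = grade_levels, ordered = TRUE)

# as.numeric() on an ordered factor returns the level index, which is what the
# MAE/MSE calculations compare:
as.numeric(factor("Good", levels = grade_levels, ordered = TRUE))  # 3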

2 Comparing Multinomial and Ordinal Logistic Regression

Code
#install.packages("nnet")
#install.packages("MASS")
#install.packages("caret")

library(nnet)
library(MASS)
library(caret)

# Create the data
set.seed(42)
data <- data.frame(
  study_hours = rnorm(100, mean = 5, sd = 2),
  attendance = rnorm(100, mean = 80, sd = 10),
  past_grades = rnorm(100, mean = 75, sd = 15),
  grade = factor(sample(c("Fail", "Pass", "Good", "Excellent"), 100, replace = TRUE), ordered = TRUE)
)

# Split into training and test data
set.seed(42)
trainIndex <- createDataPartition(data$grade, p = .7, list = FALSE)
trainData <- data[trainIndex, ]
testData <- data[-trainIndex, ]

# Multinomial Logistic Regression
multinom_model <- multinom(grade ~ study_hours + attendance + past_grades, data = trainData)
# weights:  20 (12 variable)
initial  value 99.813194 
iter  10 value 98.256239
iter  20 value 96.716208
final  value 96.716069 
converged
Code
# Ordinal Logistic Regression
ordinal_model <- polr(grade ~ study_hours + attendance + past_grades, data = trainData, method = "logistic")

# Predictions
multinom_predictions <- predict(multinom_model, testData)
ordinal_predictions <- predict(ordinal_model, testData)

# Calculate MAE
multinom_mae <- mean(abs(as.numeric(testData$grade) - as.numeric(multinom_predictions)))
ordinal_mae <- mean(abs(as.numeric(testData$grade) - as.numeric(ordinal_predictions)))
print(paste("Multinomial Logistic Regression MAE: ", multinom_mae))
[1] "Multinomial Logistic Regression MAE:  1.64285714285714"
Code
print(paste("Ordinal Logistic Regression MAE: ", ordinal_mae))
[1] "Ordinal Logistic Regression MAE:  1.75"
Code
# Calculate MSE
multinom_mse <- mean((as.numeric(testData$grade) - as.numeric(multinom_predictions))^2)
ordinal_mse <- mean((as.numeric(testData$grade) - as.numeric(ordinal_predictions))^2)
print(paste("Multinomial Logistic Regression MSE: ", multinom_mse))
[1] "Multinomial Logistic Regression MSE:  3.57142857142857"
Code
print(paste("Ordinal Logistic Regression MSE: ", ordinal_mse))
[1] "Ordinal Logistic Regression MSE:  4.25"

2.1 Explanation

  1. Install and Load Necessary Libraries:

    • Install and load the nnet, MASS, and caret packages.

    • The nnet package is used for multinomial logistic regression, the MASS package is used for ordinal logistic regression, and the caret package is used to split the data.

  2. Prepare the Dataset:

    • Create a hypothetical dataset where student study hours, attendance, and past grades are the features, and the grade is categorized into four ordered categories: “Fail”, “Pass”, “Good”, “Excellent”.

    • Use the createDataPartition function to split the dataset into 70% training data and 30% test data.

  3. Fit the Models:

    • Fit a multinomial logistic regression model using the multinom function from the nnet package.

    • Fit an ordinal logistic regression model using the polr function from the MASS package (the sketch after this list contrasts how the two models are parameterized).

  4. Prediction and Evaluation:

    • Use the predict function to make predictions on the test data for both models.

    • Calculate the Mean Absolute Error (MAE) for both models by taking the absolute differences between the predicted and actual values.

    • Calculate the Mean Squared Error (MSE) for both models by taking the squared differences between the predicted and actual values.

    • Print the MAE and MSE for both models to evaluate their performance.
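
To see how the two fitted models differ structurally, one can inspect their coefficients. This is a quick sketch, assuming the model objects created above are still in the workspace: multinom() estimates a separate coefficient vector for each non-reference category, whereas polr() estimates a single coefficient vector plus ordered cut-points between adjacent categories.

Code
# Multinomial model: one row of coefficients per non-reference grade category
coef(multinom_model)

# Ordinal (proportional-odds) model: one slope per predictor, plus the
# ordered cut-points (intercepts) between adjacent categories
coef(ordinal_model)
ordinal_model$zeta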

2.2 Comparison and Interpretation

  • The MAE and MSE are calculated for both the multinomial logistic regression model and the ordinal logistic regression model.

  • MAE measures the average magnitude of the errors without considering their direction, while MSE gives more weight to larger errors.

  • By comparing these metrics, you can determine which model performs better in terms of predictive accuracy.

  • Because ordinal logistic regression takes the order of the categories into account, it is expected to perform better when that order carries real information, which would show up as lower MAE and MSE. In this simulated example, however, the grades are drawn at random, independently of the predictors, so neither model has a genuine advantage; here the multinomial model happens to achieve slightly lower MAE and MSE. The cross-tabulation sketched below makes the error pattern concrete.
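
Beyond single-number summaries, a cross-tabulation of actual versus predicted grades shows where each model's errors fall along the ordering. A minimal sketch, assuming the prediction objects created above:

Code
# Off-diagonal cells far from the diagonal correspond to badly misordered
# predictions; cells next to the diagonal are off by only one category.
table(actual = testData$grade, predicted = multinom_predictions)
table(actual = testData$grade, predicted = ordinal_predictions)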

3 Full R Code with Custom Penalized Error Metrics

Code
#install.packages("nnet")
#install.packages("MASS")
#install.packages("caret")

library(nnet)
library(MASS)
library(caret)

# Create the data
set.seed(42)
data <- data.frame(
  study_hours = rnorm(100, mean = 5, sd = 2),
  attendance = rnorm(100, mean = 80, sd = 10),
  past_grades = rnorm(100, mean = 75, sd = 15),
  grade = factor(sample(c("Fail", "Pass", "Good", "Excellent"), 100, replace = TRUE), ordered = TRUE)
)

# Split into training and test data
set.seed(42)
trainIndex <- createDataPartition(data$grade, p = .7, list = FALSE)
trainData <- data[trainIndex, ]
testData <- data[-trainIndex, ]

# Multinomial Logistic Regression
multinom_model <- multinom(grade ~ study_hours + attendance + past_grades, data = trainData)
# weights:  20 (12 variable)
initial  value 99.813194 
iter  10 value 98.256239
iter  20 value 96.716208
final  value 96.716069 
converged
Code
# Ordinal Logistic Regression
ordinal_model <- polr(grade ~ study_hours + attendance + past_grades, data = trainData, method = "logistic")

# Predictions
multinom_predictions <- predict(multinom_model, testData)
ordinal_predictions <- predict(ordinal_model, testData)

# Define a custom penalized absolute error function
penalized_absolute_error <- function(actual, predicted) {
  abs_diff <- abs(as.numeric(actual) - as.numeric(predicted))
  return(abs_diff)
}

# Define a custom penalized squared error function
penalized_squared_error <- function(actual, predicted) {
  squared_diff <- (as.numeric(actual) - as.numeric(predicted))^2
  return(squared_diff)
}

# Calculate Penalized MAE
multinom_penalized_mae <- mean(penalized_absolute_error(testData$grade, multinom_predictions))
ordinal_penalized_mae <- mean(penalized_absolute_error(testData$grade, ordinal_predictions))
print(paste("Multinomial Logistic Regression Penalized MAE: ", multinom_penalized_mae))
[1] "Multinomial Logistic Regression Penalized MAE:  1.64285714285714"
Code
print(paste("Ordinal Logistic Regression Penalized MAE: ", ordinal_penalized_mae))
[1] "Ordinal Logistic Regression Penalized MAE:  1.75"
Code
# Calculate Penalized MSE
multinom_penalized_mse <- mean(penalized_squared_error(testData$grade, multinom_predictions))
ordinal_penalized_mse <- mean(penalized_squared_error(testData$grade, ordinal_predictions))
print(paste("Multinomial Logistic Regression Penalized MSE: ", multinom_penalized_mse))
[1] "Multinomial Logistic Regression Penalized MSE:  3.57142857142857"
Code
print(paste("Ordinal Logistic Regression Penalized MSE: ", ordinal_penalized_mse))
[1] "Ordinal Logistic Regression Penalized MSE:  4.25"

3.1 Explanation

  1. Install and Load Necessary Libraries:

    • Install and load the nnet, MASS, and caret packages.

    • The nnet package is used for multinomial logistic regression, the MASS package is used for ordinal logistic regression, and the caret package is used to split the data.

  2. Prepare the Dataset:

    • Create a hypothetical dataset where student study hours, attendance, and past grades are the features, and the grade is categorized into four ordered categories: “Fail”, “Pass”, “Good”, “Excellent”.

    • Use the createDataPartition function to split the dataset into 70% training data and 30% test data.

  3. Fit the Models:

    • Fit a multinomial logistic regression model using the multinom function from the nnet package.

    • Fit an ordinal logistic regression model using the polr function from the MASS package.

  4. Prediction and Evaluation:

    • Use the predict function to make predictions on the test data for both models.

  5. Custom Penalized Error Metrics:

    • Define custom functions for penalized absolute error and penalized squared error. In this version they apply no extra weighting, so they reproduce the plain MAE and MSE from the previous section (a check is sketched after this list); the larger penalties are added in Section 4.

    • The penalized_absolute_error function returns the absolute difference between the actual and predicted category codes.

    • The penalized_squared_error function returns the squared difference between the actual and predicted category codes.

  6. Calculate Penalized MAE and MSE:

    • Use the custom functions to calculate the penalized Mean Absolute Error (MAE) and Mean Squared Error (MSE) for both models.

    • Print the penalized MAE and MSE to compare the performance of the multinomial and ordinal logistic regression models.
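
Because no extra weighting is applied in this section, the penalized metrics should match the plain MAE and MSE computed in the previous section (about 1.64 vs. 1.75 for MAE and 3.57 vs. 4.25 for MSE). A quick sketch of the check, assuming the objects from Sections 2 and 3 are both in the workspace:

Code
# With no extra weighting, the "penalized" metrics equal the plain ones
all.equal(multinom_penalized_mae, multinom_mae)
all.equal(ordinal_penalized_mae, ordinal_mae)
all.equal(multinom_penalized_mse, multinom_mse)
all.equal(ordinal_penalized_mse, ordinal_mse)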

4 Custom Penalized Error Metrics with Larger Penalties

Code
#install.packages("nnet")
#install.packages("MASS")
#install.packages("caret")

library(nnet)
library(MASS)
library(caret)

# Create the data
set.seed(42)
data <- data.frame(
  study_hours = rnorm(100, mean = 5, sd = 2),
  attendance = rnorm(100, mean = 80, sd = 10),
  past_grades = rnorm(100, mean = 75, sd = 15),
  grade = factor(sample(c("Fail", "Pass", "Good", "Excellent"), 100, replace = TRUE), ordered = TRUE)
)

# Split into training and test data
set.seed(42)
trainIndex <- createDataPartition(data$grade, p = .7, list = FALSE)
trainData <- data[trainIndex, ]
testData <- data[-trainIndex, ]

# Multinomial Logistic Regression
multinom_model <- multinom(grade ~ study_hours + attendance + past_grades, data = trainData)
# weights:  20 (12 variable)
initial  value 99.813194 
iter  10 value 98.256239
iter  20 value 96.716208
final  value 96.716069 
converged
Code
# Ordinal Logistic Regression
ordinal_model <- polr(grade ~ study_hours + attendance + past_grades, data = trainData, method = "logistic")

# Predictions
multinom_predictions <- predict(multinom_model, testData)
ordinal_predictions <- predict(ordinal_model, testData)

# Define a custom penalized absolute error function
penalized_absolute_error <- function(actual, predicted) {
  abs_diff <- abs(as.numeric(actual) - as.numeric(predicted))
  # Larger penalty for misordering
  penalty <- abs_diff^2
  return(penalty)
}

# Define a custom penalized squared error function
penalized_squared_error <- function(actual, predicted) {
  squared_diff <- (as.numeric(actual) - as.numeric(predicted))^2
  # Larger penalty for misordering
  penalty <- squared_diff^2
  return(penalty)
}

# Calculate Penalized MAE
multinom_penalized_mae <- mean(penalized_absolute_error(testData$grade, multinom_predictions))
ordinal_penalized_mae <- mean(penalized_absolute_error(testData$grade, ordinal_predictions))
print(paste("Multinomial Logistic Regression Penalized MAE: ", multinom_penalized_mae))
[1] "Multinomial Logistic Regression Penalized MAE:  3.57142857142857"
Code
print(paste("Ordinal Logistic Regression Penalized MAE: ", ordinal_penalized_mae))
[1] "Ordinal Logistic Regression Penalized MAE:  4.25"
Code
# Calculate Penalized MSE
multinom_penalized_mse <- mean(penalized_squared_error(testData$grade, multinom_predictions))
ordinal_penalized_mse <- mean(penalized_squared_error(testData$grade, ordinal_predictions))
print(paste("Multinomial Logistic Regression Penalized MSE: ", multinom_penalized_mse))
[1] "Multinomial Logistic Regression Penalized MSE:  24.1428571428571"
Code
print(paste("Ordinal Logistic Regression Penalized MSE: ", ordinal_penalized_mse))
[1] "Ordinal Logistic Regression Penalized MSE:  32.1071428571429"

4.1 Explanation

  1. Install and Load Necessary Libraries:

    • Install and load the nnet, MASS, and caret packages.

    • The nnet package is used for multinomial logistic regression, the MASS package is used for ordinal logistic regression, and the caret package is used to split the data.

  2. Prepare the Dataset:

    • Create a hypothetical dataset where student study hours, attendance, and past grades are the features, and the grade is categorized into four ordered categories: “Fail”, “Pass”, “Good”, “Excellent”.

    • Use the createDataPartition function to split the dataset into 70% training data and 30% test data.

  3. Fit the Models:

    • Fit a multinomial logistic regression model using the multinom function from the nnet package.

    • Fit an ordinal logistic regression model using the polr function from the MASS package.

  4. Prediction and Evaluation:

    • Use the predict function to make predictions on the test data for both models.

  5. Custom Penalized Error Metrics:

    • Define custom functions for penalized absolute error and penalized squared error that account for the order of the categories.

    • In the penalized_absolute_error function, calculate the absolute difference between the actual and predicted category codes and apply a larger penalty by squaring it; this “penalized MAE” therefore coincides with the plain MSE from Section 2.

    • In the penalized_squared_error function, calculate the squared difference between the actual and predicted category codes and square it again, giving a fourth-power penalty (a generalized version with a tunable exponent is sketched after this list).

  6. Calculate Penalized MAE and MSE:

    • Use the custom functions to calculate the penalized Mean Absolute Error (MAE) and Mean Squared Error (MSE) for both models.

    • Print the penalized MAE and MSE to compare the performance of the multinomial and ordinal logistic regression models.
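
The penalty used above is hard-coded (the error is squared). A natural generalization is to expose the penalty strength as a parameter. The function below is a hypothetical illustration, not part of the original analysis; penalized_error and its power argument are made-up names. With power = 1 it gives the plain absolute error, and with power = 2 it reproduces the larger penalty applied above.

Code
# Hypothetical generalization (illustrative only): a single error function
# with a tunable exponent controlling how hard misordering is penalized.
penalized_error <- function(actual, predicted, power = 2) {
  abs_diff <- abs(as.numeric(actual) - as.numeric(predicted))
  abs_diff^power  # power = 1: plain absolute error; power = 2: squared penalty
}

# Example usage with the ordinal model's predictions from above:
mean(penalized_error(testData$grade, ordinal_predictions, power = 2))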

By using these custom penalized error metrics with larger penalties, predictions that are far off in the category order receive a substantially higher penalty: under the squared penalty, for example, an error of one category contributes 1 while an error of three categories contributes 9. This gives a more informative assessment of model performance when the outcome categories are ordered.