SaiTeja_Assignment

library(tidyverse)

## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.3     ✔ readr     2.1.4
## ✔ forcats   1.0.0     ✔ stringr   1.5.0
## ✔ ggplot2   3.4.3     ✔ tibble    3.2.1
## ✔ lubridate 1.9.2     ✔ tidyr     1.3.0
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(readr)
library(knitr)
library(kableExtra)

## 
## Attaching package: 'kableExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     group_rows

library(nnet)  
library(ggplot2)
library(ggpubr)
library(modelr)
library(car)

## Loading required package: carData
## 
## Attaching package: 'car'
## 
## The following object is masked from 'package:dplyr':
## 
##     recode
## 
## The following object is masked from 'package:purrr':
## 
##     some

library(broom)

## 
## Attaching package: 'broom'
## 
## The following object is masked from 'package:modelr':
## 
##     bootstrap

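set.seed(123)  # fix the random seed so results are reproducible
# Note: an absolute path ties the script to one machine
setwd("/Users/saitejaravulapalli/Documents/IUPUI_SEM 01/Intro to Statistic in R/DATA SET")
# The file is semicolon-delimited, so sep must be set explicitly
data <- read.csv("student dropout.csv", sep = ";", header = TRUE)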


response_var <- "Target"

explanatory_vars <- c(
  "Marital.status",
  "Application.mode",
  "Previous.qualification"
)

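# Build Target ~ Marital.status + Application.mode + Previous.qualification
formula <- as.formula(paste(response_var, "~", paste(explanatory_vars, collapse = "+")))
# Fit the multinomial logit; the first level of Target ("Dropout") is the reference category
multinom_model <- multinom(formula, data = data)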

## # weights:  15 (8 variable)
## initial  value 4860.260765 
## iter  10 value 4396.891934
## final  value 4394.414358 
## converged

summary(multinom_model)

## Call:
## multinom(formula = formula, data = data)
## 
## Coefficients:
##          (Intercept) Marital.status Application.mode Previous.qualification
## Enrolled  -0.1164986     -0.1913456      -0.01173285            0.003841907
## Graduate   1.1094640     -0.1098765      -0.03030552            0.009830611
## 
## Std. Errors:
##          (Intercept) Marital.status Application.mode Previous.qualification
## Enrolled  0.10492233     0.07819043      0.002859901            0.004583197
## Graduate  0.07858255     0.05732652      0.002287985            0.003688167
## 
## Residual Deviance: 8788.829 
## AIC: 8804.829

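# Predicted probabilities: one column per Target category
data$Predicted <- predict(multinom_model, newdata = data, type = "probs")

# Assign each student the category with the highest predicted probability
data$Predicted_Class <- apply(data$Predicted, 1, function(x) names(which.max(x)))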

# Rows are predicted classes, columns are observed classes
confusion_matrix <- table(data$Predicted_Class, data$Target)
print(confusion_matrix)

##           
##            Dropout Enrolled Graduate
##   Dropout      553      224      410
##   Graduate     868      570     1799

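# The model never predicts "Enrolled", so the table is 2 x 3 and diag() would
# pair the "Graduate" row with the "Enrolled" column. Compare labels directly instead.
accuracy <- mean(data$Predicted_Class == data$Target)
print(paste("Accuracy:", accuracy))

## [1] "Accuracy: 0.531645569620253"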
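
The confusion matrix printed above has two rows and three columns because no student is predicted as “Enrolled”. On a non-square table, diag() matches cells by position rather than by class name. The following is a minimal sketch of that pitfall, assuming only the counts shown in the printed table; the object names (confusion, shared, naive_accuracy) are illustrative and not part of the original analysis.

```r
# Counts copied from the printed confusion matrix (rows = predicted, columns = observed).
confusion <- matrix(
  c(553, 224,  410,
    868, 570, 1799),
  nrow = 2, byrow = TRUE,
  dimnames = list(Predicted = c("Dropout", "Graduate"),
                  Actual    = c("Dropout", "Enrolled", "Graduate"))
)

# diag() on a 2 x 3 table uses cells [1,1] and [2,2]; the second one is
# predicted "Graduate" / observed "Enrolled", which is not a correct prediction.
naive_accuracy <- sum(diag(confusion)) / sum(confusion)

# Match cells by class name instead of by position.
shared   <- intersect(rownames(confusion), colnames(confusion))
correct  <- sum(confusion[cbind(shared, shared)])
accuracy <- correct / sum(confusion)

print(round(c(naive = naive_accuracy, by_name = accuracy), 4))
```

Matching by name gives (553 + 1799) / 4424, while the positional version counts 553 + 570. An alternative with the same effect is to convert both label vectors to factors with identical levels before calling table(), which yields a square table.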

I’ll interpret the coefficient for the “Marital.status” variable for the “Graduate” category. The coefficient is approximately -0.1099, and the standard error is approximately 0.0573.

Coefficient Interpretation:

The coefficient for “Marital.status” in the “Graduate” equation is approximately -0.1099. It is the estimated change in the log-odds of a student being in the “Graduate” category rather than the reference category for a one-unit increase in “Marital.status”, holding the other predictors constant. The reference category is the first level of the response variable, which here is “Dropout”; that is why the output lists coefficients only for “Enrolled” and “Graduate”.
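
To make the log-odds reading concrete, the sketch below rebuilds the two fitted equations from the coefficients printed in the summary and turns them into class probabilities. The predictor values in x are hypothetical and chosen only for illustration; they do not describe any particular student in the data.

```r
# Coefficients copied from the summary(multinom_model) output; "Dropout" is the reference.
coefs <- rbind(
  Enrolled = c(-0.1164986, -0.1913456, -0.01173285, 0.003841907),
  Graduate = c( 1.1094640, -0.1098765, -0.03030552, 0.009830611)
)
colnames(coefs) <- c("(Intercept)", "Marital.status",
                     "Application.mode", "Previous.qualification")

# Hypothetical student (illustrative values only); the leading 1 multiplies the intercept.
x <- c(1, 1, 1, 1)

# Log-odds of each category versus Dropout, then probabilities via the softmax form.
log_odds <- drop(coefs %*% x)
probs <- c(Dropout = 1, exp(log_odds))
probs <- probs / sum(probs)
print(round(probs, 3))

# Raising Marital.status by one unit shifts the Graduate log-odds by its coefficient.
x_plus <- x
x_plus[2] <- x[2] + 1
shift <- drop(coefs %*% x_plus) - log_odds
print(shift["Graduate"])
```

The reference category gets a log-odds of zero (exp(0) = 1), which is why it needs no coefficients of its own.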

Interpretation of Sign: Since the coefficient is negative, an increase in “Marital.status” is associated with a decrease in the log-odds of a student being in the “Graduate” category relative to “Dropout”. Note that the model treats “Marital.status” as a numeric variable (it has a single coefficient per equation), so this reading assumes that a one-unit step in its coded values is meaningful.

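Magnitude: The coefficient is on the log-odds scale, so it is easier to read after exponentiating: exp(-0.1099) ≈ 0.896, meaning that a one-unit increase in “Marital.status” multiplies the odds of “Graduate” versus “Dropout” by about 0.90, a decrease of roughly 10%. This is a relatively small effect per unit. Coefficients closer to zero (odds ratios closer to 1) indicate weaker effects, although the size also depends on the scale of the predictor.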
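
The conversion from log-odds to an odds ratio can be sketched in a few lines of R. The coefficient is copied from the summary output; the variable names are illustrative.

```r
# Coefficient for Marital.status in the Graduate equation, from the summary output.
beta_graduate_marital <- -0.1098765

# Exponentiating a log-odds coefficient gives the multiplicative change in the odds.
odds_ratio <- exp(beta_graduate_marital)

# Express the same quantity as a percentage change in the odds.
pct_change <- (odds_ratio - 1) * 100

print(round(odds_ratio, 3))
print(round(pct_change, 1))
```

An odds ratio below 1 corresponds to a negative coefficient, and a value of exactly 1 would mean no association.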

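Standard Error: The standard error (approximately 0.0573) measures the uncertainty in the coefficient estimate; smaller standard errors mean more precise estimates. Here the standard error is about half the size of the coefficient: the Wald statistic is z = -0.1099 / 0.0573 ≈ -1.92, which corresponds to a two-sided p-value of about 0.055. The approximate 95% confidence interval for the coefficient, -0.1099 ± 1.96 × 0.0573 ≈ (-0.222, 0.002), just includes zero, so the effect is not statistically significant at the 5% level, although it is close.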
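
summary() for a multinom fit reports coefficients and standard errors but no test statistics, so a Wald test has to be computed by hand. The sketch below does this for the one coefficient discussed here, assuming the usual large-sample normal approximation; the values are copied from the summary output.

```r
# Estimate and standard error for Marital.status in the Graduate equation.
beta <- -0.1098765
se   <- 0.05732652

# Wald z statistic and two-sided p-value under the normal approximation.
z       <- beta / se
p_value <- 2 * pnorm(-abs(z))

# Approximate 95% confidence interval on the log-odds scale and on the odds-ratio scale.
ci_log_odds   <- beta + c(-1, 1) * qnorm(0.975) * se
ci_odds_ratio <- exp(ci_log_odds)

print(round(c(z = z, p = p_value), 4))
print(round(ci_log_odds, 4))
print(round(ci_odds_ratio, 4))
```

If the interval on the log-odds scale contains zero, the interval for the odds ratio contains 1, and the two-sided p-value is above 0.05.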

This interpretation is specific to the “Marital.status” coefficient in the “Graduate” equation of the multinomial logistic regression model. The coefficients for the other variables and for the “Enrolled” equation can be interpreted in the same way, as effects on the log-odds of being in that category relative to “Dropout”.
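
The same calculations can be applied to every coefficient at once. The sketch below hard-codes the coefficient and standard-error matrices from the printed summary so that it runs on its own. With the fitted object available, the same matrices should be obtainable from coef(multinom_model) and summary(multinom_model)$standard.errors.

```r
vars <- c("(Intercept)", "Marital.status",
          "Application.mode", "Previous.qualification")

# Coefficients and standard errors copied from the summary output.
coefs <- rbind(
  Enrolled = c(-0.1164986, -0.1913456, -0.01173285, 0.003841907),
  Graduate = c( 1.1094640, -0.1098765, -0.03030552, 0.009830611)
)
ses <- rbind(
  Enrolled = c(0.10492233, 0.07819043, 0.002859901, 0.004583197),
  Graduate = c(0.07858255, 0.05732652, 0.002287985, 0.003688167)
)
colnames(coefs) <- colnames(ses) <- vars

# Element-wise Wald statistics, two-sided p-values, and odds ratios.
z_values    <- coefs / ses
p_values    <- 2 * pnorm(-abs(z_values))
odds_ratios <- exp(coefs)

print(round(z_values, 2))
print(signif(p_values, 3))
print(round(odds_ratios, 3))
```

Each row is one equation (category versus “Dropout”) and each column is one predictor, mirroring the layout of the summary output.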

SaiTeja_Assignment_11

2023-11-07
