Loading and Summarizing the Data

##    Loan_ID             Gender            Married           Dependents       
##  Length:381         Length:381         Length:381         Length:381        
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##   Education         Self_Employed      ApplicantIncome CoapplicantIncome
##  Length:381         Length:381         Min.   : 150    Min.   :    0    
##  Class :character   Class :character   1st Qu.:2600    1st Qu.:    0    
##  Mode  :character   Mode  :character   Median :3333    Median :  983    
##                                        Mean   :3580    Mean   : 1277    
##                                        3rd Qu.:4288    3rd Qu.: 2016    
##                                        Max.   :9703    Max.   :33837    
##                                                                         
##    LoanAmount  Loan_Amount_Term Credit_History   Property_Area     
##  Min.   :  9   Min.   : 12.0    Min.   :0.0000   Length:381        
##  1st Qu.: 90   1st Qu.:360.0    1st Qu.:1.0000   Class :character  
##  Median :110   Median :360.0    Median :1.0000   Mode  :character  
##  Mean   :105   Mean   :340.9    Mean   :0.8376                     
##  3rd Qu.:127   3rd Qu.:360.0    3rd Qu.:1.0000                     
##  Max.   :150   Max.   :480.0    Max.   :1.0000                     
##                NA's   :11       NA's   :30                         
##  Loan_Status       
##  Length:381        
##  Class :character  
##  Mode  :character  
##                    
##                    
##                    
## 
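
The output above has the shape of what `summary()` prints for a data frame whose text columns were read as character vectors. A minimal sketch of that step follows. The report does not show its input path, so the CSV text and its rows here are made-up stand-ins that let the snippet run on its own, and only a few of the columns are included.

```r
# Made-up rows standing in for the real file; the report's actual path is not shown.
csv_text <- "Loan_ID,Gender,ApplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Loan_Status
A1,Male,4000,120,360,1,N
A2,Female,3000,70,360,1,Y
A3,Male,2500,110,,1,Y
A4,Female,6000,140,360,,Y"

# Assumption: the report reads a CSV with strings kept as characters.
# With a real file this would be read.csv("<path>", stringsAsFactors = FALSE).
loan_data <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Numeric columns get min/quartiles/mean/max (plus an NA count when any are missing);
# character columns get Length/Class/Mode.
summary(loan_data)
```

Empty fields in numeric columns are read as `NA`, which is why `summary()` reports `NA's` counts for `Loan_Amount_Term` and `Credit_History`.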

Data Cleaning

## [1] "Missing Values"
##           Loan_ID            Gender           Married        Dependents 
##                 0                 0                 0                 0 
##         Education     Self_Employed   ApplicantIncome CoapplicantIncome 
##                 0                 0                 0                 0 
##        LoanAmount  Loan_Amount_Term    Credit_History     Property_Area 
##                 0                11                30                 0 
##       Loan_Status 
##                 0
## [1] "Data Types"
##           Loan_ID            Gender           Married        Dependents 
##       "character"       "character"       "character"       "character" 
##         Education     Self_Employed   ApplicantIncome CoapplicantIncome 
##       "character"       "character"         "integer"         "numeric" 
##        LoanAmount  Loan_Amount_Term    Credit_History     Property_Area 
##         "numeric"         "numeric"         "numeric"       "character" 
##       Loan_Status 
##       "character"
## [1] "Check for missing values: "
## [1] 0
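
The checks above can be reproduced with a few base R calls. This is a sketch on a toy data frame with illustrative values. How the report removed the missing values is not shown, so dropping incomplete rows with `na.omit()` is an assumption; imputing the median or mode would give the same final count of zero.

```r
# Toy data frame standing in for loan_data (values are illustrative only)
loan_data <- data.frame(
  Loan_Amount_Term = c(360, NA, 180, 360),
  Credit_History   = c(1, 1, NA, 0),
  Loan_Status      = c("Y", "N", "Y", "N"),
  stringsAsFactors = FALSE
)

# Count missing values per column
print("Missing Values")
print(colSums(is.na(loan_data)))

# Report the class of each column
print("Data Types")
print(sapply(loan_data, class))

# Assumption: rows containing any NA are dropped; the report may have imputed instead
loan_data <- na.omit(loan_data)

# Confirm that no missing values remain
print("Check for missing values: ")
print(sum(is.na(loan_data)))
```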

Distribution of Applicant Income

Distribution of Self Employed Individuals (Pie Chart)

Loan Approval Based on Self Employed Individuals

Correlation of Applicant Income and Self Employment

Loan Amount vs. Applicant Income (Scatter Plot Code)

library(ggplot2)
library(plotly)

# Scatter plot of loan amount against applicant income, colored by loan status
p <- ggplot(loan_data, aes(x = ApplicantIncome, y = LoanAmount)) +
  geom_point(aes(color = Loan_Status), alpha = 0.5) +
  # One linear trend line for all points (color is mapped only in geom_point)
  geom_smooth(method = "lm", formula = y ~ x, color = "blue", se = FALSE) +
  labs(title = "Loan Amount vs. Applicant Income",
       x = "Applicant Income",
       y = "Loan Amount") +
  scale_color_manual(values = c("Y" = "#76CD26",
                                "N" = "#CD2626")) +
  theme_minimal()

# Convert the ggplot object to a plotly object to make the plot interactive
ggplotly(p)

Loan Amount vs. Applicant Income (Scatter Plot)

Credit History vs. Loan Status (Bar Plot)
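
The code for this plot is not included in the report. A minimal sketch of one way to draw such a bar plot with ggplot2 follows, reusing the approval colors from the scatter plot code. The toy data, the object name `p_credit`, and the dodged-bar layout are assumptions, not the report's actual choices.

```r
library(ggplot2)

# Toy data standing in for loan_data (illustrative values only)
toy <- data.frame(
  Credit_History = c(1, 1, 1, 0, 0, 1),
  Loan_Status    = c("Y", "Y", "N", "N", "N", "Y")
)

# Credit_History is coded 0/1, so it is treated as a category on the x axis
p_credit <- ggplot(toy, aes(x = factor(Credit_History), fill = Loan_Status)) +
  geom_bar(position = "dodge") +  # side-by-side counts per loan status
  labs(title = "Credit History vs. Loan Status",
       x = "Credit History",
       y = "Count",
       fill = "Loan Status") +
  scale_fill_manual(values = c("Y" = "#76CD26", "N" = "#CD2626")) +
  theme_minimal()

print(p_credit)
```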

Property Area Impact on Loan Status (Bar Plot)

Coapplicant Income Effect (Density Plot)

Loan Status vs. Number of Dependents (Bar Plot)

Loan Status by Gender vs. Marital Status (Bar Plot)

Interactive Scatter Plot Code

library(plotly)

# Combine loan status and education into one label, e.g. "Y Graduate"
loan_data$Status_Education <- with(loan_data, paste(Loan_Status, Education))

# Interactive scatter plot with one color per status/education combination
plot_ly(loan_data, x = ~ApplicantIncome, y = ~LoanAmount,
        type = 'scatter',
        mode = 'markers',
        color = ~Status_Education,
        colors = c('Y Graduate' = '#3498db',
                   'N Graduate' = '#2ecc71',
                   'Y Not Graduate' = '#f0948d',
                   'N Not Graduate' = '#7b59b6'),
        marker = list(size = 10, opacity = 0.5)) %>%
  layout(
    title = 'Applicant Income vs. Loan Amount by Education and Loan Status',
    xaxis = list(title = 'Applicant Income'),
    yaxis = list(title = 'Loan Amount'),
    legend = list(title = list(text = 'Combined Status')))

Applicant Income vs. Loan Amount by Education and Loan Status

Applicant Income vs. Coapplicant Income Effect on Loan Status