1. Find two numeric variables that are highly correlated by checking the correlation coefficient. Then create a graph to illustrate that.

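library(dplyr)    # select()
library(ggplot2)  # ggplot() and related layers
library(GGally)   # ggpairs()

# Keep only the numeric columns, then inspect all pairwise correlations
bank_data_1 <- select(BankChurners, Months_on_book, Total_Relationship_Count,
                      Months_Inactive_12_mon, Contacts_Count_12_mon, Credit_Limit,
                      Total_Revolving_Bal, Avg_Open_To_Buy, Total_Amt_Chng_Q4_Q1,
                      Total_Trans_Amt, Total_Trans_Ct, Total_Ct_Chng_Q4_Q1,
                      Avg_Utilization_Ratio)
ggpairs(bank_data_1)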

cor(BankChurners$Credit_Limit, BankChurners$Avg_Open_To_Buy)
## [1] 0.9959805
ggplot(BankChurners, aes(Avg_Open_To_Buy, Credit_Limit)) +
  geom_point() +
  geom_smooth() +
  scale_x_continuous(labels = scales::dollar) +
  scale_y_continuous(labels = scales::dollar) +
  labs(title = "Correlation Plot of Customers' Avg_Open_To_Buy and Credit_Limit") +
  theme(plot.title = element_text(hjust = 0.5, size = rel(1.5)))
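
The most correlated pair can also be read off the correlation matrix directly instead of scanning the ggpairs() panels by eye. The following is a minimal sketch, not part of the original analysis: `top_cor_pair` is a hypothetical helper name, and it is demonstrated on the built-in mtcars data so that it runs on its own. With the course data the call would presumably be `top_cor_pair(bank_data_1)`.

```r
# Hypothetical helper: return the pair of numeric columns with the
# largest absolute Pearson correlation.
top_cor_pair <- function(df) {
  num <- df[vapply(df, is.numeric, logical(1))]   # keep numeric columns only
  cm  <- cor(num, use = "pairwise.complete.obs")  # full correlation matrix
  cm[lower.tri(cm, diag = TRUE)] <- NA            # keep each pair once, drop the diagonal
  # Position of the largest absolute correlation (first one if there are ties)
  idx <- unname(which(abs(cm) == max(abs(cm), na.rm = TRUE), arr.ind = TRUE)[1, ])
  data.frame(var1 = rownames(cm)[idx[1]],
             var2 = colnames(cm)[idx[2]],
             r    = cm[idx[1], idx[2]])
}

# Illustration on a built-in data set (an assumption, not the course data)
top_cor_pair(mtcars)
```

The helper reports only the single strongest pair; sorting all entries of the upper triangle would give a ranked list instead.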

2. Find two categorical variables (other than Attrition_Flag) that are strongly dependent on each other. Then create a graph to illustrate that.

chisq.test(BankChurners$Card_Category, BankChurners$Gender)
## 
##  Pearson's Chi-squared test
## 
## data:  BankChurners$Card_Category and BankChurners$Gender
## X-squared = 75.01, df = 3, p-value = 3.605e-16
ggplot(BankChurners) +
  geom_bar(aes(x = Card_Category, fill = Gender), position = "fill")
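
A small p-value shows evidence of dependence, but in a large sample it says little about how strong that dependence is. One common effect size for two categorical variables is Cramér's V, which rescales the chi-squared statistic to the range 0 to 1. The sketch below is an addition rather than part of the original answer: `cramers_v` is a hypothetical helper, and it is shown on the built-in mtcars data. With the course data the call would presumably be `cramers_v(BankChurners$Card_Category, BankChurners$Gender)`.

```r
# Hypothetical helper: Cramér's V for two categorical vectors.
# V = sqrt(chi^2 / (n * (min(rows, cols) - 1)))
cramers_v <- function(x, y) {
  tab  <- table(x, y)                               # contingency table
  # Uncorrected Pearson statistic; warnings about small expected counts are silenced
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  n    <- sum(tab)                                  # number of observations
  k    <- min(nrow(tab), ncol(tab)) - 1             # smaller table dimension minus one
  unname(sqrt(chi2 / (n * k)))
}

# Illustration on a built-in data set (an assumption, not the course data)
cramers_v(mtcars$cyl, mtcars$am)
```

Values near 0 indicate a weak association and values near 1 a strong one, so V complements the p-value reported by chisq.test().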

3. Find at least 4 variables that have non-negligible correlation or dependence with Attrition_Flag. Show how you find them.

data_1 <- select(BankChurners, Attrition_Flag, Customer_Age, Gender, Dependent_count, Education_Level, Marital_Status)
data_2 <- select(BankChurners, Attrition_Flag, Income_Category, Card_Category, Months_on_book, Total_Relationship_Count, Months_Inactive_12_mon)
data_3 <- select(BankChurners, Attrition_Flag, Contacts_Count_12_mon, Credit_Limit, Total_Revolving_Bal, Avg_Open_To_Buy)
data_4 <- select(BankChurners, Attrition_Flag, Total_Amt_Chng_Q4_Q1, Total_Trans_Amt, Total_Trans_Ct, Total_Ct_Chng_Q4_Q1, Avg_Utilization_Ratio)

ggpairs(data_1) +
  theme(axis.text.x = element_text(angle = 45, hjust = 0.7))

Comments: None of these variables shows a strong association with Attrition_Flag.

ggpairs(data_2) +
  theme(axis.text.x = element_text(angle = 45, hjust = 0.7))

Comments: Total_Relationship_Count and Months_Inactive_12_mon show an association with Attrition_Flag.

ggpairs(data_3) +
  theme(axis.text.x = element_text(angle = 45, hjust = 0.7))

Comments: Contacts_Count_12_mon and Total_Revolving_Bal show an association with Attrition_Flag.

ggpairs(data_4) +
  theme(axis.text.x = element_text(angle = 45, hjust = 0.7))

Comments: Total_Trans_Ct and Avg_Utilization_Ratio have a rather strong association with Attrition_Flag compared with the other variables.
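
The visual comparison can be backed by a number for each numeric variable. One option is the point-biserial correlation, which is the ordinary Pearson correlation between a numeric column and a 0/1 indicator of the flag. The sketch below is an addition, not the method used above: `flag_correlations` is a hypothetical helper, and the level name "Attrited Customer" in the commented call is an assumption about how Attrition_Flag is coded. It is demonstrated on the built-in mtcars data so that it runs on its own.

```r
# Hypothetical helper: correlate every numeric column with a binary flag
# and sort the results by absolute strength.
flag_correlations <- function(df, flag, positive) {
  is_pos <- as.numeric(df[[flag]] == positive)      # 1 for the chosen level, 0 otherwise
  num <- df[vapply(df, is.numeric, logical(1))]     # numeric columns only
  num <- num[setdiff(names(num), flag)]             # drop the flag itself if it is numeric
  r <- vapply(num,
              function(col) cor(col, is_pos, use = "complete.obs"),
              numeric(1))
  r[order(abs(r), decreasing = TRUE)]               # strongest association first
}

# Illustration on a built-in data set (an assumption, not the course data)
flag_correlations(mtcars, flag = "am", positive = 1)

# With the course data the call might look like this (level name assumed):
# flag_correlations(BankChurners, "Attrition_Flag", "Attrited Customer")
```

For the categorical predictors, a chisq.test() of each variable against Attrition_Flag plays the same role.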