library(datasetsICR)
library(ggplot2)
Warning: package 'ggplot2' was built under R version 4.3.2
data(package = "datasetsICR")
data("german")
head(german, 10)
   Age Gender Housing Saving accounts Checking account Credit amount Duration
1   67   male     own            <NA>           little          1169        6
2   22 female     own          little         moderate          5951       48
3   49   male     own          little             <NA>          2096       12
4   45   male    free          little           little          7882       42
5   53   male    free          little           little          4870       24
6   35   male    free            <NA>             <NA>          9055       36
7   53   male     own      quite rich             <NA>          2835       24
8   35   male    rent          little         moderate          6948       36
9   61   male     own            rich             <NA>          3059       12
10  28   male     own          little         moderate          5234       30
               Purpose Class Risk
1             radio/TV          1
2             radio/TV          2
3            education          1
4  furniture/equipment          1
5                  car          2
6            education          1
7  furniture/equipment          1
8                  car          1
9             radio/TV          1
10                 car          2
tail(german, 10)
     Age Gender Housing Saving accounts Checking account Credit amount Duration
991   37   male     own            <NA>             <NA>          3565       12
992   34   male     own        moderate             <NA>          1569       15
993   23   male    rent            <NA>           little          1936       18
994   30   male     own          little           little          3959       36
995   50   male     own            <NA>             <NA>          2390       12
996   31 female     own          little             <NA>          1736       12
997   40   male     own          little           little          3857       30
998   38   male     own          little             <NA>           804       12
999   23   male    free          little           little          1845       45
1000  27   male     own        moderate         moderate          4576       45
                 Purpose Class Risk
991            education          1
992             radio/TV          1
993             radio/TV          1
994  furniture/equipment          1
995                  car          1
996  furniture/equipment          1
997                  car          1
998             radio/TV          1
999             radio/TV          2
1000                 car          1
str(german)
'data.frame': 1000 obs. of 9 variables:
 $ Age             : num 67 22 49 45 53 35 53 35 61 28 ...
 $ Gender          : chr "male" "female" "male" "male" ...
 $ Housing         : chr "own" "own" "own" "free" ...
 $ Saving accounts : chr NA "little" "little" "little" ...
 $ Checking account: chr "little" "moderate" NA "little" ...
 $ Credit amount   : num 1169 5951 2096 7882 4870 ...
 $ Duration        : num 6 48 12 42 24 36 24 36 12 30 ...
 $ Purpose         : chr "radio/TV" "radio/TV" "education" "furniture/equipment" ...
 $ Class Risk      : num 1 2 1 1 2 1 1 1 1 2 ...
summary_stats <- summary(german[c("Credit amount", "Duration")])
print(summary_stats)
 Credit amount      Duration
 Min.   :  250   Min.   : 4.0
 1st Qu.: 1366   1st Qu.:12.0
 Median : 2320   Median :18.0
 Mean   : 3271   Mean   :20.9
 3rd Qu.: 3972   3rd Qu.:24.0
 Max.   :18424   Max.   :72.0
# Convert "Gender" to factor
german$Gender <- factor(german$Gender)
table(german$Gender)
female   male
   310    690
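
The counts above can be turned into shares and a ratio, which makes the gender imbalance easier to state precisely. This is a minimal sketch, not part of the original analysis: the counts are re-entered by hand from the table output, and the names `gender_counts`, `gender_share`, and `male_to_female` are introduced here purely for illustration.

```r
# Counts copied by hand from the table(german$Gender) output above.
# With the data loaded, table(german$Gender) could be used instead.
gender_counts <- c(female = 310, male = 690)

# Share of each group among all applicants.
gender_share <- prop.table(gender_counts)

# Ratio of male to female applicants (illustrative name).
male_to_female <- unname(gender_counts["male"] / gender_counts["female"])

print(round(gender_share, 2))
print(round(male_to_female, 2))
```

A ratio above 2 means that men submitted more than twice as many requests as women.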
# Scatter plot of credit amount against duration, colored by gender
ggplot(german, aes(x = `Credit amount`, y = Duration, color = Gender)) +
  geom_point() +
  labs(
    title = "Scatter Plot of Credit Amount and Duration",
    subtitle = "Relationship between credit amount and loan duration, colored by gender",
    x = "Credit Amount",
    y = "Duration",
    caption = "Data Source: datasetsICR"
  ) +
  theme_minimal() +
  theme(legend.position = "right")

For your final chart, interpret the findings from the chart in text. Full sentences required.
Most credit requests are for amounts between 0 and 5,000; the median is 2,320 and the third quartile is 3,972.
Men submitted more than twice as many credit requests as women (690 versus 310).
More than half of the loans have a repayment duration of 20 months or less; the median duration is 18 months.
There are a few outliers, which may be influenced by the applicant's age and the purpose of the credit. Younger people may request credit more often because they are starting to settle down, while older people often already have a retirement fund.
There appears to be a positive correlation between the credit amount requested and the repayment duration. Smaller credits tend to be paid back faster, and larger credits tend to take longer.
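
The correlation described above can be checked numerically rather than read off the chart. The sketch below is an assumption-laden illustration, not the original author's method: `credit_duration_cor` is a hypothetical helper, and `sample_rows` re-enters only the ten rows printed by `head(german, 10)` so that the example runs without the datasetsICR package. A result from ten rows says little about the full data set.

```r
# Hypothetical helper (not part of the original analysis): Pearson correlation
# between credit amount and duration for a data frame shaped like `german`.
credit_duration_cor <- function(df) {
  cor(df[["Credit amount"]], df[["Duration"]], use = "complete.obs")
}

# The ten rows printed by head(german, 10), re-entered by hand.
# check.names = FALSE keeps the space in the column name "Credit amount".
sample_rows <- data.frame(
  `Credit amount` = c(1169, 5951, 2096, 7882, 4870, 9055, 2835, 6948, 3059, 5234),
  Duration = c(6, 48, 12, 42, 24, 36, 24, 36, 12, 30),
  check.names = FALSE
)

r_sample <- credit_duration_cor(sample_rows)
print(r_sample)

# With the package loaded, the same call on the full data would be:
# credit_duration_cor(german)
```

A value close to 1 would support the reading that larger credits go with longer durations, while a value near 0 would suggest the pattern in the chart is weak.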