INTRODUCTION
I loaded the “german” dataset and examined the credit amount, age, and gender variables. I then created visualizations to explore the relationship between age and credit amount, with gender mapped to color. I produced a polished scatterplot and published the analysis to RPubs as a revealjs presentation. The exercise built skills in data wrangling, visualization, and presentation, and covered an end-to-end workflow: loading the data, exploring it, plotting, and sharing the results.
Looked at the first 10 and last 10 rows
   Age Credit amount Gender
1   67          1169   male
2   22          5951 female
3   49          2096   male
4   45          7882   male
5   53          4870   male
6   35          9055   male
7   53          2835   male
8   35          6948   male
9   61          3059   male
10  28          5234   male
     Age Credit amount Gender
991   37          3565   male
992   34          1569   male
993   23          1936   male
994   30          3959   male
995   50          2390   male
996   31          1736 female
997   40          3857   male
998   38           804   male
999   23          1845   male
1000  27          4576   male
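The two previews above are the kind of output `head()` and `tail()` produce. The following is a minimal sketch, not the original script: `german_sample` is a placeholder name and holds only the first ten rows printed above, since the full data file is not part of this write-up.

```r
# Placeholder data: the first ten rows printed above.
# check.names = FALSE keeps the space in "Credit amount".
german_sample <- data.frame(
  Age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28),
  `Credit amount` = c(1169, 5951, 2096, 7882, 4870, 9055, 2835, 6948, 3059, 5234),
  Gender = factor(c("male", "female", rep("male", 8))),
  check.names = FALSE
)

# Keep only the three variables of interest
vars <- german_sample[, c("Age", "Credit amount", "Gender")]

# n = 3 suits the 10-row placeholder; use n = 10 on the full 1000-row data
first_rows <- head(vars, n = 3)
last_rows  <- tail(vars, n = 3)

print(first_rows)
print(last_rows)
```

Selecting the columns by name before previewing keeps the output narrow even though the full data frame has nine variables.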
Computed summary statistics
 Credit amount        Age
 Min.   :  250   Min.   :19.00
 1st Qu.: 1366   1st Qu.:27.00
 Median : 2320   Median :33.00
 Mean   : 3271   Mean   :35.55
 3rd Qu.: 3972   3rd Qu.:42.00
 Max.   :18424   Max.   :75.00
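A six-number summary like the one above is what `summary()` prints for numeric columns. The sketch below runs it on a ten-row placeholder built from the first rows printed earlier (`german_sample` is a hypothetical name), so its numbers will differ from the full-data figures shown above.

```r
# Placeholder data: the first ten rows printed earlier in this report.
german_sample <- data.frame(
  Age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28),
  `Credit amount` = c(1169, 5951, 2096, 7882, 4870, 9055, 2835, 6948, 3059, 5234),
  check.names = FALSE
)

# Min, quartiles, median, mean, and max for each numeric column
stats <- summary(german_sample[, c("Credit amount", "Age")])
print(stats)

# Individual statistics are also available directly
age_median <- median(german_sample$Age)
credit_min <- min(german_sample$`Credit amount`)
```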
Created a factor for gender and checked the data structure
'data.frame': 1000 obs. of 9 variables:
$ Age : num 67 22 49 45 53 35 53 35 61 28 ...
$ Gender : Factor w/ 2 levels "female","male": 2 1 2 2 2 2 2 2 2 2 ...
$ Housing : chr "own" "own" "own" "free" ...
$ Saving accounts : chr NA "little" "little" "little" ...
$ Checking account: chr "little" "moderate" NA "little" ...
$ Credit amount : num 1169 5951 2096 7882 4870 ...
$ Duration : num 6 48 12 42 24 36 24 36 12 30 ...
$ Purpose : chr "radio/TV" "radio/TV" "education" "furniture/equipment" ...
$ Class Risk : num 1 2 1 1 2 1 1 1 1 2 ...
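The listing above is the kind of output `str()` gives once `Gender` is stored as a factor. The sketch below assumes Gender started out as a character column like the other text variables (an assumption; the original import code is not shown) and uses a hypothetical ten-row `german_sample`. `table()` then counts the rows at each factor level.

```r
# Placeholder data: the first ten rows printed earlier, Gender as plain text.
german_sample <- data.frame(
  Age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28),
  Gender = c("male", "female", rep("male", 8)),
  stringsAsFactors = FALSE
)

# Convert the character column to a factor; levels are sorted
# alphabetically, so "female" is level 1 and "male" is level 2
german_sample$Gender <- factor(german_sample$Gender)

# Inspect column types, then count rows per level
str(german_sample)
gender_counts <- table(german_sample$Gender)
print(gender_counts)
```

The alphabetical level order matches the codes in the listing above, where the first row (male) is coded 2 and the second (female) is coded 1.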
Tried a few versions of the chart, following the protocols in Chapter 3.
Chart 1
Chart 2
Chart 3
Selected my final chart and interpreted its findings in text.
Chart 2
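A scatterplot of this kind, with age on the x-axis, credit amount on the y-axis, and gender mapped to color, could be built with ggplot2 along the following lines. This is a sketch rather than the code behind Chart 2: the choice of ggplot2, the title, the point transparency, and the theme are all assumptions, and `german_sample` is a ten-row placeholder taken from the first rows printed earlier.

```r
library(ggplot2)

# Placeholder data: the first ten rows printed earlier in this report.
german_sample <- data.frame(
  Age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28),
  `Credit amount` = c(1169, 5951, 2096, 7882, 4870, 9055, 2835, 6948, 3059, 5234),
  Gender = factor(c("male", "female", rep("male", 8))),
  check.names = FALSE
)

# Backticks are needed because the column name contains a space
p <- ggplot(german_sample,
            aes(x = Age, y = `Credit amount`, color = Gender)) +
  geom_point(alpha = 0.7) +            # assumed transparency to reduce overplotting
  labs(
    title = "Credit amount by age",    # assumed title
    x = "Age",
    y = "Credit amount",
    color = "Gender"
  ) +
  theme_minimal()                      # assumed theme
```

Calling `print(p)` renders the chart. Alternative versions can be produced by swapping the theme or adding layers such as `geom_smooth(method = "lm")`.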
Interpretation of the findings
Based on the scatter plot of credit amount vs. age, there appears to be a weak positive correlation between age and credit amount overall. The relationship looks different, however, when males and females are considered separately. For males, there appears to be a moderate positive correlation: older males tend to have higher credit amounts on average, and the trend looks roughly linear, with credit amount increasing steadily with age.
For females, the relationship is more complex, with no clear linear correlation. Credit amounts vary widely at every age. Females in their 20s and 30s have some of the highest credit amounts, higher than those of older females, but there are also young females with very low amounts. Overall, the points are more scattered across age for females.
In summary, age and credit amount appear positively correlated for males but not for females in this dataset. Older males tend to have higher credit amounts, but that relationship does not hold for females. This gender difference in the age-credit amount relationship is an interesting finding from the data.
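These impressions come from reading the scatter plot. One way to check them numerically is to compute Pearson's correlation coefficient overall and within each gender. The sketch below is an addition, not part of the original analysis: `cor_by_gender` is a hypothetical helper, and `german_sample` is a ten-row placeholder taken from the first rows printed earlier, so its output says nothing about the full dataset.

```r
# Placeholder data: the first ten rows printed earlier in this report.
german_sample <- data.frame(
  Age = c(67, 22, 49, 45, 53, 35, 53, 35, 61, 28),
  `Credit amount` = c(1169, 5951, 2096, 7882, 4870, 9055, 2835, 6948, 3059, 5234),
  Gender = factor(c("male", "female", rep("male", 8))),
  check.names = FALSE
)

# Pearson correlation between Age and Credit amount within each gender.
# Groups with fewer than `min_n` rows return NA, since a correlation
# based on one or two points is not informative.
cor_by_gender <- function(df, min_n = 3) {
  sapply(split(df, df$Gender), function(g) {
    if (nrow(g) < min_n) return(NA_real_)
    cor(g$Age, g$`Credit amount`)
  })
}

overall_r   <- cor(german_sample$Age, german_sample$`Credit amount`)
by_gender_r <- cor_by_gender(german_sample)

print(overall_r)
print(by_gender_r)
```

On the full 1000-row data, a clearly larger coefficient for males than for females would support the reading above. In the placeholder, the single female row yields NA.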