# A tibble: 25 × 5
# Groups: marital [5]
   marital degree             N   freq   pct
   <fct>   <fct>          <int>  <dbl> <dbl>
 1 Married Lt High School   110 0.0909     9
 2 Married High School      555 0.459     46
 3 Married Junior College    91 0.0752     8
 4 Married Bachelor         279 0.231     23
 5 Married Graduate         175 0.145     14
 6 Widowed Lt High School    53 0.213     21
 7 Widowed High School      128 0.514     51
 8 Widowed Junior College    13 0.0522     5
 9 Widowed Bachelor          28 0.112     11
10 Widowed Graduate          27 0.108     11
# ℹ 15 more rows
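The `freq` and `pct` columns can be checked by hand from the counts. The sketch below recomputes them for the ten printed rows, assuming that `freq` is each count divided by the total for its marital status and that `pct` is that share rounded to a whole percent. The name `pip1_check` is hypothetical, and this is not necessarily how the original `pip1` was built.

```r
library(dplyr)

# Counts copied from the ten rows printed in the tibble output.
counts <- tibble(
  marital = rep(c("Married", "Widowed"), each = 5),
  degree  = rep(c("Lt High School", "High School", "Junior College",
                  "Bachelor", "Graduate"), times = 2),
  N       = c(110, 555, 91, 279, 175, 53, 128, 13, 28, 27)
)

# Assumed definitions: share within each marital status, then a whole percent.
pip1_check <- counts %>%
  group_by(marital) %>%
  mutate(
    freq = N / sum(N),
    pct  = round(freq * 100, 0)
  ) %>%
  ungroup()

pip1_check
```

For the Married group the counts sum to 1,210, so 110 / 1210 is about 0.0909, which rounds to 9%, matching the first printed row.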
Code
# Create plot
p <- ggplot(data = pip1, aes(x = marital, y = pct, fill = degree))

p +
  geom_bar(stat = "identity") +
  labs(
    title = "Distribution of Marital Status by Degree",
    subtitle = "Married people are more likely than others to have a Graduate degree",
    caption = "gss_sm Dataset",
    x = "Marital Status",
    y = "Percentage"
  ) +
  # Place each percentage label in the middle of its stacked segment
  geom_text(aes(label = pct), position = position_stack(vjust = 0.5)) +
  theme_minimal()
Interpretation
These are the insights that can be understood from the chart:
Married people tend to have higher education levels, with 23% having a Bachelor’s degree and 14% having a Graduate degree.
On the other hand, separated individuals stand out for having lower education levels. Only 10% have a Bachelor’s degree, 5% have a Graduate degree, and a significant 24% have less than a high school education.
For never-married and divorced individuals, a large portion (56%) have a high school degree. Additionally, 9% of divorced individuals and 12% of never-married individuals have an education level below high school.
Widowed individuals show a different pattern: junior college is their smallest category (5%), and Bachelor’s and Graduate degrees are evenly represented at 11% each.
In summary, the chart shows how education levels vary across marital status groups. Married individuals more often hold Bachelor’s or Graduate degrees, while separated individuals tend to have lower educational levels. Never-married and divorced individuals most often have a high school education, and widowed individuals are split evenly between Bachelor’s and Graduate degrees.
PART 2: Create stacked and dodged bar charts: Two Categorical Variables
Code
# Stacked Bar Chart
p +
  geom_col(position = "stack") +
  # White percentage labels centered in each stacked segment
  geom_text(
    aes(label = sprintf("%.1f%%", pct), group = degree),
    position = position_stack(vjust = 0.5),
    color = "white", size = 3
  ) +
  labs(
    x = "Marital Status", y = "Percent", fill = "Degree",
    title = "Stacked Bar: Marital Status and Degree",
    subtitle = "Married people are more likely than others to have a Graduate degree",
    caption = "gss_sm Dataset"
  ) +
  theme_minimal()
Code
# Dodged Bar Chart with labels
p +
  geom_col(position = "dodge2") +
  # Percentage labels just above each dodged bar
  geom_text(
    aes(label = sprintf("%.1f%%", pct), group = degree),
    position = position_dodge(width = 0.9),
    vjust = -0.5, color = "black", size = 3
  ) +
  labs(
    x = "Marital Status", y = "Percent", fill = "Degree",
    title = "Dodged Bar: Marital Status and Degree",
    subtitle = "Married people are more likely than others to have a Graduate degree",
    caption = "gss_sm Dataset"
  ) +
  # theme() must come after theme_minimal(), which would otherwise
  # replace the legend position setting
  theme_minimal() +
  theme(legend.position = "top")
Code
# Dodged, faceted horizontal bar chart with no legend
p +
  geom_col(position = "dodge2") +
  geom_text(
    aes(label = sprintf("%.1f%%", pct), group = degree),
    position = position_dodge(width = 0.9),
    vjust = -0.5, color = "black", size = 3
  ) +
  labs(
    x = NULL, y = "Percent", fill = "Degree",
    title = "Dodged Bar with Changes: Marital Status and Degree",
    subtitle = "Married people are more likely than others to have a Graduate degree",
    caption = "gss_sm Dataset"
  ) +
  # "none" replaces the deprecated guides(fill = FALSE)
  guides(fill = "none") +
  coord_flip() +
  facet_grid(~ degree) +
  theme_minimal()
PART 3: Practice using pipes (dplyr) to summarize data: Two Continuous Variables and One Categorical
Code
cat("Categorical Variable: Marital \nContinuous Variables: Age and Childs")
Categorical Variable: Marital
Continuous Variables: Age and Childs
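Part 3 names the variables but does not show the pipe that builds `pip1new`, the data frame used by the scatterplots that follow. Below is a minimal sketch of one way such a summary could be produced. The toy `survey` data frame and its values are hypothetical stand-ins for `gss_sm`; the column names `mean_age` and `mean_childs` are taken from the plotting code, but grouping by `marital` alone is an assumption.

```r
library(dplyr)

# Hypothetical toy data standing in for gss_sm (values are made up).
survey <- tibble(
  marital = c("Married", "Married", "Widowed", "Widowed", "Never Married"),
  age     = c(40, 50, 70, NA, 25),
  childs  = c(2, 3, 3, 4, 0)
)

# Assumed summary: one row per marital status holding the mean of each
# continuous variable, with missing values ignored.
pip1new_sketch <- survey %>%
  group_by(marital) %>%
  summarize(
    mean_age    = mean(age, na.rm = TRUE),
    mean_childs = mean(childs, na.rm = TRUE)
  )

pip1new_sketch
```

Setting `na.rm = TRUE` matters here: without it, a single missing age would make the whole group mean `NA`.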
PART 4: Create a scatterplot: Two Continuous Variables and One Categorical
Code
library(ggplot2)

scatter_plot <- ggplot(pip1new, aes(x = mean_age, y = mean_childs, color = marital)) +
  geom_point(size = 5) +
  labs(
    title = "Scatterplot of Mean Age vs. Mean Children",
    subtitle = "Widowed People tend to be older and have more children, \n while never married people tend to be younger and have no kids at all ",
    x = "Mean Age",
    y = "Mean Children",
    color = "Marital Status"
  ) +
  theme_minimal()

# Add a text note and a shaded rectangle
scatter_plot +
  annotate(geom = "text", x = 60, y = 2.9,
           label = "\n Widowed people tend \n to be older \n and have more children") +
  annotate(geom = "rect", xmin = 70, xmax = 75, ymin = 2.5, ymax = 3,
           fill = "red", alpha = 0.2)
PART 5: Legends and guides
Code
library(ggplot2)

scatter_plot <- ggplot(pip1new, aes(x = mean_age, y = mean_childs, color = marital)) +
  geom_point(size = 5) +
  labs(
    title = "Scatterplot of Mean Age vs. Mean Children",
    subtitle = "Widowed People tend to be older and have more children, \nwhile never married people tend to be younger and have no kids at all ",
    x = "Mean Age",
    y = "Mean Children",
    color = "Marital Status"
  ) +
  theme_minimal() +
  # Legend styling: bold title, bottom placement, gray background, larger keys
  theme(
    legend.title = element_text(face = "bold", size = 12, color = "black"),
    legend.position = "bottom",
    legend.box.background = element_rect(color = "lightgray"),
    legend.background = element_rect(fill = "lightgray"),
    legend.key.size = unit(1.5, "lines")
  )

scatter_plot +
  annotate(geom = "text", x = 60, y = 2.9,
           label = "\n Widowed people tend \n to be older \n and have more children") +
  annotate(geom = "rect", xmin = 70, xmax = 75, ymin = 2.5, ymax = 3,
           fill = "red", alpha = 0.2)
PART 6: Data Labels
Code
library(ggplot2)
library(ggrepel)

scatter_plot <- ggplot(pip1new, aes(x = mean_age, y = mean_childs, color = marital)) +
  geom_point(size = 5) +
  labs(
    title = "Scatterplot of Mean Age vs. Mean Children",
    subtitle = "Widowed People tend to be older and have more children, \nwhile never married people tend to be younger and have no kids at all ",
    x = "Mean Age",
    y = "Mean Children",
    color = "Marital Status"
  ) +
  theme_minimal() +
  # The legend is dropped because each point is labeled directly
  theme(legend.position = "none")

scatter_plot +
  geom_text_repel(aes(label = marital), box.padding = 0.5, point.padding = 0.5) +
  annotate(geom = "text", x = 60, y = 2.9,
           label = "\n Widowed people tend \n to be older \n and have more children") +
  annotate(geom = "rect", xmin = 70, xmax = 75, ymin = 2.5, ymax = 3,
           fill = "red", alpha = 0.2)
PART 7: Interpretation
-Widowed People: This group sits in the top right corner of the chart, which shows its members are older on average and have more children. The points being close together suggests a stable trend for widowed people. Possible explanations are that they started families young, had children throughout their marriage, or had children late in life with a younger partner.
-Never Married People: In contrast, people who have never married cluster in the bottom left corner. This suggests never-married people tend to be younger and, on average, have no children. This may reflect personal choices, economic concerns, or the lack of a long-term partner.
-Married People: Married people fall between the widowed and never-married groups in age and number of children. The more spread-out dots for this group show more variety, meaning married people are diverse in age and number of children.
-Separated People: These people tend to be younger than married and divorced people and, on average, have more children than either of those groups.
-Divorced People: This group is older than the married and separated groups but, on average, has fewer children than separated people.