Figure 4.5: Faceting on two categorical variables – age and number of children by religion and polviews
This uses the gss_sm dataset from the socviz package.
First, let’s look at the relationship between religion and polviews.
Code
library(ggplot2)
library(socviz)

# Omit NA values for specific variables
gss_sm <- gss_sm[complete.cases(gss_sm$religion, gss_sm$polviews, gss_sm$happy, gss_sm$marital), ]

p <- ggplot(gss_sm, aes(x = age, y = childs)) +
  geom_point(alpha = 0.2) +
  geom_smooth() +
  labs(title = "Number of Children by Age",
       subtitle = "Data from the GSS") +
  xlab("Respondent's Age") +
  ylab("Number of Children") +
  labs(caption = "Plot created on February 6, 2024")

p
Religion and Polviews
Let’s facet by religion alone. The layout is readable and easy to understand when we facet by religion or by polviews individually.
Code
p + facet_grid(~ religion)
Or we can facet by just polviews:
Code
p + facet_grid(~ polviews)
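Facet using facet_grid and by both religion and polviews
With both variables, the graphs no longer look good. The panels are small and hard to read, and the y-axis even extends to negative numbers, which makes no sense because nobody can have a negative number of children. Since the points themselves cannot be negative, the negative range has to come from the geom_smooth() layer, whose fitted curve or confidence band dips below zero in some panels.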
This is using facet_grid(religion ~ polviews)
Code
p + facet_grid(religion ~ polviews)
This is using facet_grid(~religion + polviews)
Code
p + facet_grid(~ religion + polviews)
The main difference between facet_grid(religion ~ polviews) and facet_grid(~religion + polviews) is how the faceting variables are assigned to rows and columns:
facet_grid(religion ~ polviews) creates a grid with religion on the rows and polviews on the columns. In the ~ formula syntax, the variable before the ~ goes on the rows and the variable after it goes on the columns.
facet_grid(~religion + polviews) puts both variables on the columns, so you get a single row with one panel for each combination of religion and polviews. With nothing before the ~, no variable is assigned to the rows; the + simply crosses the two variables within the column dimension.
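The rows-versus-columns distinction can be made explicit with the rows and cols arguments of facet_grid() together with vars(). The following is only a sketch: the object names (gss_cc, base, grid_rc, grid_cols) are made up for illustration, and reading the panel layout out of ggplot_build() relies on ggplot2 internals that may differ between versions.

```r
library(ggplot2)
library(socviz)

# Filter locally so this sketch runs on its own (illustrative object names)
gss_cc <- gss_sm[complete.cases(gss_sm$religion, gss_sm$polviews), ]
base <- ggplot(gss_cc, aes(x = age, y = childs)) +
  geom_point(alpha = 0.2)

# Same layout as facet_grid(religion ~ polviews): religion on rows, polviews on columns
grid_rc <- base + facet_grid(rows = vars(religion), cols = vars(polviews))

# Same layout as facet_grid(~ religion + polviews): every combination on the columns
grid_cols <- base + facet_grid(cols = vars(religion, polviews))

# Inspect the panel layout without drawing the plots
layout_rc <- ggplot_build(grid_rc)$layout$layout
layout_cols <- ggplot_build(grid_cols)$layout$layout

c(rows = max(layout_rc$ROW), cols = max(layout_rc$COL))
c(rows = max(layout_cols$ROW), cols = max(layout_cols$COL))
```

The second layout should report a single row, which is why those panels become so narrow.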
Facet using facet_wrap and by both religion and polviews
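With both variables, the graphs again do not look good, because each panel gets very little space on screen.
This is using facet_wrap(religion ~ polviews) and facet_wrap(~religion + polviews). Both produce the same result, because facet_wrap treats every variable in the formula as a faceting variable and wraps the panels into rows and columns regardless of which side of the ~ it appears on.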
Code
p + facet_wrap(religion ~ polviews)
Code
p + facet_wrap(~ religion + polviews)
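To check that the two facet_wrap() calls really give the same arrangement, we can compare their panel layouts. We can also try the ncol argument to control how the panels wrap. This is a sketch under assumptions: the object names are illustrative, ncol = 7 is an arbitrary choice, and the layout table comes from ggplot_build(), whose internals may change between ggplot2 versions.

```r
library(ggplot2)
library(socviz)

# Filter locally so this sketch runs on its own (illustrative object names)
gss_cc <- gss_sm[complete.cases(gss_sm$religion, gss_sm$polviews), ]
base <- ggplot(gss_cc, aes(x = age, y = childs)) +
  geom_point(alpha = 0.2)

wrap_two_sided <- base + facet_wrap(religion ~ polviews)
wrap_one_sided <- base + facet_wrap(~ religion + polviews)

# Compare the panel positions of the two versions
lay_a <- ggplot_build(wrap_two_sided)$layout$layout
lay_b <- ggplot_build(wrap_one_sided)$layout$layout
same_layout <- nrow(lay_a) == nrow(lay_b) &&
  all(lay_a$ROW == lay_b$ROW) &&
  all(lay_a$COL == lay_b$COL)
same_layout

# ncol fixes the number of columns; 7 is an arbitrary choice for this sketch
wrap_ncol <- base + facet_wrap(~ religion + polviews, ncol = 7)
lay_c <- ggplot_build(wrap_ncol)$layout$layout
max(lay_c$COL)
```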
Best version of Religion and Polviews
Let’s just use a cleaner aesthetic – minimal lines and shading using theme_minimal().
We restrict the y-axis to the range 0 to 8 with ylim(0, 8), so that no negative numbers of children appear on the scale.
Code
p + facet_grid(religion ~ polviews) +
  labs(title = "Figure 4.5 Faceting on two categoricals") +
  theme_minimal() +
  ylim(0, 8)
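One thing to keep in mind: ylim() sets limits on the scale, so values outside 0 to 8 (for example, parts of the smoother's confidence band that fall below zero) are treated as missing instead of just being hidden. A possible alternative, shown here only as a sketch with illustrative object names, is coord_cartesian(), which zooms the view without changing the underlying values.

```r
library(ggplot2)
library(socviz)

# Filter locally so this sketch runs on its own (illustrative object names)
gss_cc <- gss_sm[complete.cases(gss_sm$religion, gss_sm$polviews), ]

p_zoom <- ggplot(gss_cc, aes(x = age, y = childs)) +
  geom_point(alpha = 0.2) +
  geom_smooth() +
  facet_grid(religion ~ polviews) +
  labs(title = "Figure 4.5 Faceting on two categoricals") +
  theme_minimal() +
  # Zoom the view to 0-8 without discarding smoother values outside that range
  coord_cartesian(ylim = c(0, 8))

p_zoom
```

Whether this looks better than ylim(0, 8) depends on the panel; in sparse panels the confidence band may simply be clipped at the bottom edge.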
Conclusions
Religion vs Political Views:
Protestants lean conservative, while those with no religion or Jewish lean liberal. Catholics are more evenly split between liberal and conservative.
The very conservative are mostly Protestant, Catholic or from other religions. The extremely liberal group includes more people with no religion.
Age vs Political Views:
Older people tend to be more conservative. The oldest groups (70s, 80s) are predominantly conservative.
Younger people in their 20s and 30s lean more liberal.
Age vs Number of Children:
Older people tend to have more children on average. People in their 70s and 80s have the highest average number of children.
Younger people in their 20s and 30s unsurprisingly have fewer or no children so far.
Number of Children vs Religion:
Catholics and Protestants have more children on average than those with no religion.
Jewish people have an average number of children in between Catholics/Protestants and no religion.
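Since the faceted scatterplots only show these patterns indirectly, one way to check them is with simple summary tables. The sketch below uses base R only; the object names (relig_pol, age_by_pol, kids_by_relig) are illustrative, and it uses the full gss_sm data with missing values dropped per table, so the numbers may differ slightly from the filtered data used in the plots.

```r
library(socviz)

# Share of each political view within each religion (each row sums to 1)
relig_pol <- prop.table(table(gss_sm$religion, gss_sm$polviews), margin = 1)
round(relig_pol, 2)

# Mean age within each political view
age_by_pol <- tapply(gss_sm$age, gss_sm$polviews, mean, na.rm = TRUE)
round(age_by_pol, 1)

# Mean number of children within each religion
kids_by_relig <- tapply(gss_sm$childs, gss_sm$religion, mean, na.rm = TRUE)
round(kids_by_relig, 2)
```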
Happy and Marital
Code
library(ggplot2)
library(socviz)

# Omit NA values for specific variables
gss_sm1 <- gss_sm[complete.cases(gss_sm$happy, gss_sm$marital), ]

p2 <- ggplot(gss_sm1, aes(x = age, y = childs)) +
  geom_point(alpha = 0.2) +
  geom_smooth()

p2
Let’s facet by happy alone. As with religion and polviews, the graphs are readable and easy to understand when we facet by happy or by marital individually.
Code
p2 + facet_grid(~ happy)
Or we can facet by just Marital:
Code
p2 + facet_grid(~ marital)
Facet using facet_grid and by both happy and marital
The same problem appears here as in the first set of graphs when faceting by both variables: the panels are small and hard to read, and the scale even includes negative numbers, reaching -3.
This is using facet_grid(happy ~ marital)
Code
p2 + facet_grid(happy ~ marital)
This is using facet_grid(~ happy + marital)
Code
p2 + facet_grid(~ happy + marital)
Facet using facet_wrap and by both happy and marital
The results are slightly different: in this case the panels are easier to read, but the scale still shows values down to -3 children.
This is using facet_wrap(happy ~ marital) and facet_wrap(~ happy + marital). The two calls produce the same layout and the same values.
Using facet_wrap(happy ~ marital)
Code
p2 + facet_wrap(happy ~ marital)
Using facet_wrap(~ happy + marital)
Code
p2 + facet_wrap(~ happy + marital)
Best version of Happy and Marital
Let’s just use a cleaner aesthetic – minimal lines and shading using theme_minimal().
We restrict the y-axis to the range 0 to 8 with ylim(0, 8), so that no negative numbers of children appear on the scale.
Code
p + facet_grid(happy ~ marital) +
  labs(title = "Figure 4.5 Faceting on two categoricals") +
  theme_minimal() +
  ylim(0, 8)
Conclusions
Age vs Happiness:
Happiness levels follow a U-shaped curve with age. The youngest (20s) and oldest (70s-80s) are happiest, while middle-aged groups are less happy.
The very happy tend to be younger or older, while the not too happy are concentrated more in middle age groups.
Age vs Marital Status:
Younger people are mostly never married. Married people make up the majority in the middle age groups.
Divorced status peaks in the 40-50s. Widowed rises sharply in the 70+ groups as spouses pass away.
Children vs Happiness:
People with 0-2 kids are happiest. Happiness levels drop off with 3+ kids.
The very happy have 0-2 kids on average. People with 4+ kids tend to be less happy.
Marital Status vs Happiness:
Married people are happiest overall. Widowed and divorced groups have lower happiness.
The very happy are mostly married. The not too happy group includes more divorced, widowed and never married people.