---
title: "Will gender and interest affect time spent on different subjects?"
output:
  flexdashboard::flex_dashboard:
    theme:
      version: 4
      bootswatch: pulse
    source_code: embed
---
```{r setup, include=FALSE}
library(flexdashboard)
library(tidyverse)
library(janitor)
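# Read the course data and standardise column names to snake_case
course_text <- read_csv("data/course-text.csv") |>
  clean_names()

# Keep only the columns used in the plots
data_to_viz <- course_text |>
  select(course_id,
         time_spent_hours,
         gender,
         int) |>
  # Split course_id into three parts (separate() splits on non-alphanumeric characters by default)
  separate(course_id, c("subject", "semester", "section")) |>
  # Replace subject codes with readable names
  mutate(subject = recode(subject,
                          "AnPhA" = "Anatomy",
                          "BioA" = "Biology",
                          "FrScA" = "Forensics",
                          "OcnA" = "Oceanography",
                          "PhysA" = "Physics"))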
```
## Inputs {.sidebar}
Overall, within each subject, time spent does not differ noticeably between male and female students, and the relationship between interest level and time spent is unclear.
From "Overview of Time Spent on Different Subjects", we can see that, regardless of subject, most students spend 10 to 50 hours.
From "Distribution of Time Spent on Each Subject by Gender", male and female students have quite similar distributions of time spent in every subject except biology, where the distribution for female students is much wider than for male students.
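
One way to put numbers behind the boxplot comparison is to compute the median and interquartile range of time spent for each subject and gender. The following is only a minimal sketch: `toy` is a made-up stand-in for `data_to_viz`, the gender codes `"F"` and `"M"` are an assumption, and the plain `r` fence is display-only, so it is not run when the dashboard is knit.

```r
library(dplyr)

# Hypothetical toy data standing in for data_to_viz; all values are invented
toy <- tibble(
  subject = rep(c("Biology", "Physics"), each = 4),
  gender = rep(c("F", "F", "M", "M"), times = 2),
  time_spent_hours = c(10, 50, 20, 30, 15, 25, 18, 22)
)

# Median and spread (IQR) of time spent per subject and gender
gender_summary <- toy |>
  group_by(subject, gender) |>
  summarise(
    n = n(),
    median_hours = median(time_spent_hours, na.rm = TRUE),
    iqr_hours = IQR(time_spent_hours, na.rm = TRUE),
    .groups = "drop"
  )

gender_summary
```

Swapping `toy` for `data_to_viz` would give the same table for the real data. A much larger IQR for one gender within a subject would correspond to the wider box seen in the plot.
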
From "Relationships between Interest and Time Spent on Each Subject", there is no clear pattern. This raises a question: if students are more interested in a subject, do they spend more time on it because they are eager to dig deeper, or less time because the subject comes easily to them? That makes me think about the complexity of data collection and interpretation, such as the authenticity of the reported interest level and the integrity of the time-spent records.
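
To check the "no clear pattern" impression numerically, a rank correlation between interest and time spent could be computed for each subject. This is a rough sketch under assumptions: `toy` is invented data standing in for `data_to_viz`, `int` is assumed to be a numeric interest score from 1 to 5, and Spearman's rho is my choice of measure here, not something the dashboard itself computes. The plain `r` fence is display-only and is not run when the dashboard is knit.

```r
library(dplyr)

# Hypothetical toy data; deliberately perfectly monotone so the result is easy to verify
toy <- tibble(
  subject = rep(c("Biology", "Physics"), each = 5),
  int = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5),
  time_spent_hours = c(10, 20, 30, 40, 50, 50, 40, 30, 20, 10)
)

# Spearman rank correlation between interest and time spent, per subject
interest_cor <- toy |>
  group_by(subject) |>
  summarise(
    n = n(),
    spearman_rho = cor(int, time_spent_hours,
                       method = "spearman",
                       use = "complete.obs"),
    .groups = "drop"
  )

interest_cor
```

With the real data, values of `spearman_rho` close to zero in every subject would be consistent with the flat, patternless smooth lines in the plot. The toy values say nothing about the actual courses.
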
## Column {data-width="600"}
### Overview of Time Spent on Different Subjects
```{r}
data_to_viz %>%
  ggplot() +
  # Frequency polygon of time spent, one line per subject, in 25-hour bins starting at 0
  geom_freqpoly(mapping = aes(x = time_spent_hours, color = subject),
                binwidth = 25,
                boundary = 0) +
  labs(title = "How long do most students spend on each subject?") +
  theme_grey()
```
## Column {data-width="400"}
### Distribution of Time Spent on Each Subject by Gender
```{r}
data_to_viz %>%
  ggplot() +
  geom_boxplot(mapping = aes(x = gender, y = time_spent_hours, color = gender),
               outlier.fill = "white",
               outlier.stroke = 0.25) +
  # Zoom the y-axis to 0-100 hours without dropping data from the boxplot statistics
  coord_cartesian(ylim = c(0, 100)) +
  facet_grid(. ~ subject) +
  labs(title = "Time Spent for Each Subject by Gender",
       caption = "Is there a gender difference in time spent for each subject?") +
  theme_bw() +
  theme(legend.position = "none")
```
### Relationships between Interest and Time Spent on Each Subject
```{r}
data_to_viz %>%
  ggplot() +
  geom_point(mapping = aes(x = int,
                           y = time_spent_hours,
                           color = subject),
             alpha = .5) +
  # Loess trend line for each subject panel
  geom_smooth(mapping = aes(x = int,
                            y = time_spent_hours),
              color = "gray",
              method = "loess",
              se = FALSE) +
  # Note: ylim()/xlim() discard points outside these ranges before the smoother is fit
  ylim(0, 100) +
  xlim(1, 5) +
  facet_wrap(~subject) +
  labs(title = "Is there a clear relationship between interest and time spent?",
       y = "Time Spent",
       x = "Interest") +
  theme_bw() +
  theme(legend.position = "none",
        panel.grid.minor = element_blank()) +
  scale_color_brewer(palette = "Set1",
                     name = "Subject")
```