---
title: "ANLY 512 - Final Project: A Snapshot of the Life of Gen Z New Yorkers"
author: "Shirong Liu"
date: "`r Sys.Date()`"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
    source_code: embed
---
```{r setup, include=FALSE}
# Only the packages this dashboard actually uses
library(flexdashboard)
library(ggplot2)
library(dplyr)
```
```{r load-data, include=FALSE}
# Read the survey responses and treat the grouping variables as factors
friends <- read.csv("friends.csv") %>%
  mutate(
    Gender = as.factor(Gender),
    Industry = as.factor(Industry),
    Single = as.factor(Single)
  )
```
# Summary
Young people living and working in New York experience both the conveniences and the demands of fast-paced city life. Many want a healthy work-life balance and a fulfilling lifestyle, but time does not always allow for both.

I am personally interested in how the friends around me schedule their family time, exercise hours, and vacation plans, so I ran a short survey among them. I made a point of sampling across industries: finance, consulting, law, and PhD students in graduate school. People from different walks of life spend their time very differently, and New Yorkers can never be summarized in a single diagram.
Here's a summary of the data I collected:

- Name: Name of the subject
- Gender: Male or Female
- Age: Age of the subject
- Weight: Current weight of the subject (lb)
- Height: Current height of the subject
- Single: 1 if single; 0 if not single
- Industry: Industry that the subject works in
- ExercisePerWeek: Average time of exercise per week, in minutes
- NumberOfSports: Number of sports that the subject enjoys (seasonal sports like skiing and surfing count too!)
- VacationDays: Number of vacation days taken in 2022
- NumberOfDestinations: Number of vacation destinations that the subject visited in 2022
- PartyPerWeek: Average number of parties the subject goes to per week
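The data dictionary above can be turned into a quick sanity check before plotting. The sketch below is an assumption on my part rather than part of the original analysis: `check_friends()` is a hypothetical helper, and the toy data frame is invented purely to stand in for `friends.csv`.

```r
# Column names taken from the data dictionary above.
expected_cols <- c(
  "Name", "Gender", "Age", "Weight", "Height", "Single", "Industry",
  "ExercisePerWeek", "NumberOfSports", "VacationDays",
  "NumberOfDestinations", "PartyPerWeek"
)

# Hypothetical helper: stop with an error if the data frame does not
# match the dictionary; return TRUE invisibly otherwise.
check_friends <- function(df) {
  missing <- setdiff(expected_cols, names(df))
  if (length(missing) > 0) {
    stop("Missing columns: ", paste(missing, collapse = ", "))
  }
  # Single is documented as a 0/1 indicator.
  if (!all(df$Single %in% c(0, 1))) {
    stop("Single must be coded as 0 or 1")
  }
  invisible(TRUE)
}

# Toy rows, invented for illustration only (not survey data).
toy <- data.frame(
  Name = c("A", "B"), Gender = c("Female", "Male"), Age = c(24, 26),
  Weight = c(130, 170), Height = c(65, 70), Single = c(1, 0),
  Industry = c("Finance", "Law"), ExercisePerWeek = c(120, 60),
  NumberOfSports = c(3, 1), VacationDays = c(10, 15),
  NumberOfDestinations = c(2, 4), PartyPerWeek = c(1, 2)
)

check_friends(toy)
```

Running a check like this right after `read.csv()` would surface a renamed or missing column before any plot silently fails.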
My data visualizations revolve around the following questions:

1. How do VacationDays and NumberOfDestinations differ by Industry?
2. How does ExercisePerWeek correlate with Weight and with Age by Gender?
3. How does Single affect PartyPerWeek?
4. How does NumberOfSports affect ExercisePerWeek by Gender?
5. How does Single affect VacationDays and NumberOfDestinations?
To answer these questions, I present the data in box plots and scatter plots, as appropriate to each question. In all plots, color separates groups such as Gender, Industry, and Single; in the scatter plots, point shape also distinguishes the groups. I also add linear regression lines to highlight the relationship between the independent and dependent variables.
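When color is mapped to a group, `geom_smooth(method = 'lm')` fits a separate straight line within each group. As a rough sketch of what those lines represent, the slopes can be computed with base R's `lm()`. The data frame below is invented for illustration only, and `slope_by_group()` is a hypothetical helper, not something from the dashboard itself.

```r
# Toy data, invented for illustration (not the survey responses).
toy <- data.frame(
  Gender = rep(c("Female", "Male"), each = 4),
  NumberOfSports = c(1, 2, 3, 4, 1, 2, 3, 4),
  ExercisePerWeek = c(60, 90, 120, 150, 30, 90, 150, 210)
)

# Hypothetical helper: fit y ~ x separately within each level of `group`
# and return the slope of each fitted line as a named vector.
slope_by_group <- function(df, x, y, group) {
  sapply(split(df, df[[group]]), function(d) {
    fit <- lm(reformulate(x, response = y), data = d)
    unname(coef(fit)[2])
  })
}

slopes <- slope_by_group(toy, "NumberOfSports", "ExercisePerWeek", "Gender")
print(slopes)
```

In this toy example, each slope reads as the extra minutes of weekly exercise associated with one additional sport, within that gender.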
# Q1
How do VacationDays and NumberOfDestinations differ by Industry?
## Column {data-width=350}
```{r, fig.width=7, fig.height=6}
ggplot(friends, aes(x = Industry, y = VacationDays, fill = Industry)) +
  theme_bw() +
  theme(legend.position = 'none') +
  scale_fill_brewer(palette = 'Dark2') +
  geom_boxplot() +
  labs(title = "Vacation Days Taken vs Industry", x = "Industry", y = "Number of Vacation Days Taken")
```
## Column {data-width=350}
```{r, fig.width=7, fig.height=6}
ggplot(friends, aes(x = Industry, y = NumberOfDestinations, fill = Industry)) +
  theme_bw() +
  theme(legend.position = 'none') +
  scale_fill_brewer(palette = 'Dark2') +
  geom_boxplot() +
  labs(title = "Number of Vacation Destinations vs Industry", x = "Industry", y = "Number of Vacation Destinations")
```
# Q2
How does ExercisePerWeek correlate with Weight and with Age by Gender?
## Column {data-width=350}
```{r, fig.width=7, fig.height=6}
ggplot(friends, aes(x = ExercisePerWeek, y = Weight, color = Gender)) +
  geom_point(aes(shape = Gender), size = 2) +
  geom_smooth(method = 'lm', se = FALSE) +
  theme_bw() +
  scale_color_brewer(palette = 'Dark2') +
  labs(title = "ExercisePerWeek vs Weight", x = "Average Time of Exercise per Week (min)", y = "Weight (lb)")
```
## Column {data-width=350}
```{r, fig.width=7, fig.height=6}
ggplot(friends, aes(x = ExercisePerWeek, y = Age, color = Gender)) +
  geom_point(aes(shape = Gender), size = 2) +
  geom_smooth(method = 'lm', se = FALSE) +
  theme_bw() +
  scale_color_brewer(palette = 'Dark2') +
  # Title corrected: this plot shows Age, not Weight, on the y-axis
  labs(title = "ExercisePerWeek vs Age", x = "Average Time of Exercise per Week (min)", y = "Age")
```
# Q3
How does Single affect PartyPerWeek?
## Column {data-width=350}
```{r, fig.width=8, fig.height=6, fig.align='center'}
ggplot(friends, aes(x = Single, y = PartyPerWeek, fill = Single)) +
  theme_bw() +
  theme(legend.position = 'none') +
  scale_fill_brewer(palette = 'Dark2') +
  geom_boxplot() +
  labs(title = "Single vs PartyPerWeek", x = "Single (1 = single, 0 = not single)", y = "Average Number of Parties per Week")
```
# Q4
How does NumberOfSports affect ExercisePerWeek by Gender?
## Column {data-width=350}
```{r, fig.width=8, fig.height=6, fig.align='center'}
ggplot(friends, aes(x = NumberOfSports, y = ExercisePerWeek, color = Gender)) +
  geom_point(aes(shape = Gender), size = 2) +
  geom_smooth(method = 'lm', se = FALSE) +
  theme_bw() +
  scale_color_brewer(palette = 'Dark2') +
  labs(title = "NumberOfSports vs ExercisePerWeek", x = "Number of Sports", y = "Average Time of Exercise per Week (min)")
```
# Q5
How does Single affect VacationDays and NumberOfDestinations?
## Column {data-width=350}
```{r, fig.width=7, fig.height=6}
ggplot(friends, aes(x = Single, y = VacationDays, fill = Single)) +
  theme_bw() +
  theme(legend.position = 'none') +
  scale_fill_brewer(palette = 'Dark2') +
  geom_boxplot() +
  labs(title = "Vacation Days Taken vs Single", x = "Single (1 = single, 0 = not single)", y = "Number of Vacation Days Taken")
```
## Column {data-width=350}
```{r, fig.width=7, fig.height=6}
ggplot(friends, aes(x = Single, y = NumberOfDestinations, fill = Single)) +
  theme_bw() +
  theme(legend.position = 'none') +
  scale_fill_brewer(palette = 'Dark2') +
  geom_boxplot() +
  labs(title = "Number of Vacation Destinations vs Single", x = "Single (1 = single, 0 = not single)", y = "Number of Vacation Destinations")
```