The activities data set has the daily totals for a
Strava user’s cycling activities. The columns are:
date: The date in YYYY-MM-DD format (will need to be converted to a date type object)
trips: The number of activities recorded for that day
distance: The total distance traveled that day, in kilometers
moving_time: The time spent moving that day, in seconds
max_speed: The fastest recorded speed that day, in kph
elevation_gain: The total distance climbed uphill that day, in meters
dirt_dist: The total distance traveled on gravel or dirt paths that day, in meters
tibble(activities)
## # A tibble: 153 × 7
## date trips distance moving_time max_speed elevation_gain dirt_dist
## <chr> <int> <dbl> <int> <dbl> <dbl> <dbl>
## 1 2023-06-25 1 5.01 1343 6.82 36.1 0
## 2 2023-06-27 2 9.26 2141 9.75 46.3 102.
## 3 2023-06-28 1 14.9 3599 15.4 28.5 833.
## 4 2023-06-29 1 7.41 1912 6.81 20.0 187.
## 5 2023-06-30 2 15.7 3912 29.4 49.9 287
## 6 2023-07-01 1 3.52 797 9.00 24.9 0
## 7 2023-07-02 2 11.8 2739 8.05 47.5 466.
## 8 2023-07-04 1 18.0 4019 7.69 41.7 3690.
## 9 2023-07-14 1 25.0 4676 23.7 157. 892.
## 10 2023-07-15 2 19.0 3992 7.71 34.0 236.
## # ℹ 143 more rows
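The date column prints as <chr>, so it has to be converted before it can be joined to a Date column or filtered by month. A minimal sketch of that conversion using base R's as.Date(); the toy vector is illustrative and is not the real data:

```r
# Toy character dates in the same YYYY-MM-DD format as activities$date
date_chr <- c("2023-06-25", "2023-06-27", "2023-07-14")

# as.Date() parses ISO-formatted strings without a format argument
date_obj <- as.Date(date_chr)

class(date_obj)          # "Date"
format(date_obj, "%m")   # two-digit month strings: "06" "06" "07"
```

lubridate's date() or ymd() would work here as well; as.Date() is used only to keep the sketch free of package dependencies.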
This homework assignment has only one question: create the graph shown in Brightspace.
Important Notes:
You’ll need to use both activities and
all_days data frames to form the data set to create the
graph
kilometers (data) have been changed to miles (graph), using 1 km = 0.6 mi
meters (data) have been changed to feet (graph), using 1 meter = 3.3 feet
seconds (data) have been changed to hours (graph), using 1 hour = 3600 seconds
Although it is not always appropriate, the missing values
should be replaced with 0 here: an NA indicates no activities
for that day, and no activities means 0 time, 0 distance,
etc.
One column you’ll need to create is
average_speed = distance/moving_time. On days with no
moving time, this divides 0 by 0 and returns
NaN. Any day without activities
(trips = 0, distance = 0, and
moving_time = 0) should record an
average_speed of 0.
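The notes above can be sketched on a toy example. The all_days_toy and rides_toy tables are hypothetical stand-ins (the real all_days is assumed to hold one row per calendar day in a date column), and coalesce() is only one of several ways to swap NA for 0:

```r
library(dplyr)

# Hypothetical stand-ins for all_days and activities (not the real data)
all_days_toy <- tibble(date = as.Date("2024-07-01") + 0:2)
rides_toy <- tibble(
  date        = as.Date(c("2024-07-01", "2024-07-03")),
  distance    = c(5.01, 18.0),   # kilometers
  moving_time = c(1343L, 4019L)  # seconds
)

toy <- left_join(all_days_toy, rides_toy, by = "date") |>
  mutate(
    # Days missing from rides_toy come back as NA; treat them as 0
    distance    = coalesce(distance, 0) * 0.6,        # km -> miles
    moving_time = coalesce(moving_time, 0L) / 3600,   # seconds -> hours
    # 0 / 0 is NaN in R, so guard the division on days with no riding
    naive_speed   = distance / moving_time,
    average_speed = if_else(moving_time > 0, distance / moving_time, 0)
  )

toy
```

The middle row (2024-07-02) has no ride, so naive_speed is NaN there while average_speed is 0, which is the behavior the notes ask for.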
Create the data set needed for the graph in the code chunk below:
library(tidyverse)
library(lubridate)  # date(), year(), month()
daily_activities <-
  left_join(
    x = all_days,
    # Convert the character date column to a Date before joining
    y = activities |> mutate(date = date(date)),
    by = 'date'
  ) |>
  mutate(
    # Days with no recorded activity have NA trips after the join
    trips = if_else(is.na(trips), 0L, trips),
    # Replacing missing values with 0 and converting km to mi
    distance = if_else(trips == 0, 0, distance * 0.6),
    # Replacing missing values with 0 and converting seconds to hours
    moving_time = if_else(trips == 0, 0, moving_time / 3600),
    # Replacing missing values with 0 and converting kph to mph
    max_speed = if_else(trips == 0, 0, max_speed * 0.6),
    # Replacing missing values with 0 and converting meters to feet
    elevation_gain = if_else(trips == 0, 0, elevation_gain * 3.3),
    # Replacing missing values with 0 and converting meters to feet
    dirt_dist = if_else(trips == 0, 0, dirt_dist * 3.3)
  ) |>
  # Keeping only the days between April and October 2024
  filter(
    year(date) == 2024,
    month(date) %in% 4:10
  ) |>
  mutate(
    # Days with no distance would give 0 / 0 = NaN, so record 0 instead
    average_speed = if_else(distance > 0, distance / moving_time, 0)
  ) |>
  dplyr::select(date, distance, moving_time, average_speed) |>
  # One row per date and attribute, so each attribute can get its own facet
  pivot_longer(
    cols = -date,
    names_to = 'attribute',
    values_to = 'value'
  ) |>
  mutate(
    attribute = case_when(
      attribute == 'distance' ~ 'Distance (miles)',
      attribute == 'moving_time' ~ 'Elapsed Time (hours)',
      attribute == 'average_speed' ~ 'Speed (mph)'
    ),
    # as_factor() keeps the levels in order of first appearance
    attribute = as_factor(attribute)
  )
tibble(daily_activities)
## # A tibble: 642 × 3
## date attribute value
## <date> <fct> <dbl>
## 1 2024-04-01 Distance (miles) 0
## 2 2024-04-01 Elapsed Time (hours) 0
## 3 2024-04-01 Speed (mph) 0
## 4 2024-04-02 Distance (miles) 0
## 5 2024-04-02 Elapsed Time (hours) 0
## 6 2024-04-02 Speed (mph) 0
## 7 2024-04-03 Distance (miles) 0
## 8 2024-04-03 Elapsed Time (hours) 0
## 9 2024-04-03 Speed (mph) 0
## 10 2024-04-04 Distance (miles) 0
## # ℹ 632 more rows
Create the graph in the code chunk below. geom_area() draws
both the line and the shaded area below it, and it is the
only geom you’ll need to add.
In facet_wrap(), include
strip.position = 'left' to place the label for each graph
on the left side of the plot. Make sure to include the
theme() and scale_x_date() code in the code
chunk so your graph matches the expected result!
ggplot(
  data = daily_activities,
  mapping = aes(
    x = date,
    y = value
  )
) +
  geom_area(
    fill = 'steelblue',
    alpha = 0.5,
    color = 'black'
  ) +
  # geom_text(
  #   data = data.frame(
  #     date = as.Date(c('2024-10-06', '2024-07-19', '2024-09-28')),
  #     value = c(84, 54, 44),
  #     attribute = 'distance',
  #     trip = c('Lamoille Trail', 'Missisquoi Trail', 'Fall Fundo')
  #   ),
  #   mapping = aes(label = trip),
  #   nudge_y = 3
  # ) +
  facet_wrap(
    facets = vars(attribute),
    scales = 'free',
    ncol = 1,
    strip.position = 'left'
  ) +
  labs(
    x = 'Date',
    y = NULL,
    title = 'Daily Bike Activity for 2024'
  ) +
  scale_y_continuous(expand = c(0, 0, 0.05, 0)) +
  theme_bw() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 16),
    strip.background = element_blank(),
    strip.placement = 'outside',
    strip.text = element_text(size = 12)
  ) +
  scale_x_date(
    date_breaks = '1 month',
    expand = c(0, 1),
    date_labels = '%B',
    limits = c(date('2024-04-01'), date('2024-10-31'))
  )