library(tidyverse)
library(gt)
me_workforce <- read_csv("https://jsuleiman.com/datasets/maine_workforce_data.csv")
Lab 2
Overview
To complete this lab, you will need:
- a quarto.pub account (free)
- a GitHub Copilot account, approved and enabled in posit.cloud
Go to the shared posit.cloud workspace for this class and open the lab02_assign02 project. Open the lab02.qmd file and complete the exercises. Below is an annotated guide to assist you. There is also a video in the Brightspace Todo section for this module.
We will be using a subset of data from the Maine Center for Workforce Information for this lab. Let’s start by loading the tidyverse family of packages and gt (for making pretty tables), then reading in the data. We’ll use the message: false option to suppress the output messages from loading tidyverse and gt.
Exercises
There are eight exercises in this lab. Grading is shown in Section 4 at the end of the document. We will be attempting to recreate this table from the data.
Exercise 1
It is always helpful to glimpse our data. Details are available on the maine.gov website, but the ones we need are provided here.
glimpse(me_workforce)
Rows: 144
Columns: 7
$ year <dbl> 2023, 2023, 2023, 2023, 2023, 2023, 2022, 2022, 202…
$ age_group <chr> "16-24", "25-34", "35-44", "45-54", "55-64", "65+",…
$ number <dbl> NA, 164, 182, 159, 200, 328, 128, 159, 184, 156, 19…
$ in_labor_force <dbl> NA, 139, 152, 119, 136, 61, 81, 129, 151, 124, 125,…
$ employed <dbl> NA, 135, 149, 116, 134, 59, 73, 125, 148, 121, 123,…
$ unemployed <dbl> NA, 4, 3, 3, 2, 2, 8, 4, 4, 3, 2, 3, NA, 7, 7, 3, 7…
$ not_in_labor_force <dbl> NA, 25, 30, 40, 64, 267, 47, 30, 33, 32, 73, 266, N…
We can see the following columns:
- year
- age_group
- number - the number of people in that age group (in thousands); all numbers below are also in thousands.
- in_labor_force - the number of people participating in the labor force (i.e., people employed + people unemployed and looking for work).
- employed - number employed
- unemployed - number unemployed
- not_in_labor_force - not actively seeking employment
Looking at the table we want to recreate, and since the participation rate is defined as the number in the labor force divided by the total number of people, we know we will need the following columns: year, age_group, number, in_labor_force.
Create a tibble named participation_force as a subset of me_workforce that contains only the data we need. We’ll start you out with a partial statement to complete.
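As a quick check of that definition, the sketch below computes the participation rate for a single row by hand. It uses the 2023 values for the 25-34 age group from the glimpse output above (164 thousand people, 139 thousand in the labor force). The standalone variable names are for illustration only.

```r
# Values (in thousands) from the 2023 row for ages 25-34 in the glimpse output.
number <- 164
in_labor_force <- 139

# Participation rate = people in the labor force / total people.
participation_rate <- in_labor_force / number
round(participation_rate, 3)  # 0.848
```

This agrees with the 0.8475610 that appears for 2023 in the 25-34 column later in the lab.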
participation_force <- me_workforce |>
  select(year, age_group, number, in_labor_force) # complete this select.
Exercise 2
Use the mutate function to add a column to participation_force called participation_rate which is in_labor_force / number.
participation_force <- participation_force |>
  mutate(participation_rate = in_labor_force / number) # complete this mutate.
Exercise 3
Before we pivot_wider to create our table, we want to eliminate any data we don’t need for our chart. Now that we have calculated participation_rate, create a new tibble called participation_force_chart that contains year, age_group, and participation_rate from participation_force.
participation_force_chart <- participation_force |>
  select(year, age_group, participation_rate) # complete this select.
Exercise 4
Now we can use pivot_wider to make the tibble contain similar data to our chart.
Hints
- In pivot_wider, use the parameter names_from = age_group to create a column for each age_group.
- In pivot_wider, use the parameter values_from = participation_rate to fill the age_group categories with the participation rates we calculated earlier.
- glimpse the participation_force_chart to see where you are at.
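To see what these two parameters do, here is a minimal sketch on a four-row tibble. The rates are taken from this lab's data for 2022 and 2023. The names rates_long and rates_wide are hypothetical and not part of the assignment.

```r
library(tibble)
library(tidyr)

# A tiny long-format tibble: one row per year and age group.
rates_long <- tribble(
  ~year, ~age_group, ~participation_rate,
  2023,  "25-34",    0.8475610,
  2023,  "35-44",    0.8351648,
  2022,  "25-34",    0.8113208,
  2022,  "35-44",    0.8206522
)

# names_from supplies the new column names; values_from supplies the cell values.
# The result has one row per year and one column per age group.
rates_wide <- rates_long |>
  pivot_wider(
    names_from = age_group,
    values_from = participation_rate
  )

rates_wide
```

The four long rows collapse into two wide rows, one per year, with columns year, 25-34, and 35-44.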
# insert code here
participation_force_chart <- participation_force_chart |>
  pivot_wider(
    names_from = age_group,
    values_from = participation_rate
  )
glimpse(participation_force_chart)
Rows: 24
Columns: 7
$ year <dbl> 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 20…
$ `16-24` <dbl> NA, 0.6328125, NA, NA, 0.6758621, 0.6453901, 0.6126761, 0.6229…
$ `25-34` <dbl> 0.8475610, 0.8113208, 0.8343949, 0.8496732, 0.8684211, 0.83950…
$ `35-44` <dbl> 0.8351648, 0.8206522, 0.8430233, 0.8060606, 0.8421053, 0.85314…
$ `45-54` <dbl> 0.7484277, 0.7948718, 0.8053691, 0.8322148, 0.8322981, 0.83798…
$ `55-64` <dbl> 0.6800000, 0.6313131, 0.6698565, 0.6805556, 0.6759259, 0.69953…
$ `65+` <dbl> 0.1859756, 0.1914894, 0.2000000, 0.2040134, 0.1795775, 0.17735…
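The NA entries in the 16-24 column above are not a pivoting error. In the original glimpse output, the 2023 row for ages 16-24 has NA for both number and in_labor_force, and in R any arithmetic involving NA returns NA. A minimal sketch using the first two 2023 rows:

```r
# Counts (in thousands) for 2023: ages 16-24 (missing) and ages 25-34.
number <- c(NA, 164)
in_labor_force <- c(NA, 139)

# Division propagates NA: a missing count produces a missing rate.
rate <- in_labor_force / number
is.na(rate)  # TRUE FALSE
```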
Exercise 5
You might still need to filter the data to restrict it to the last five years of data. If you haven’t already done so, do it in the code chunk below.
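One way to express "the last five years" without hard-coding a year is to compute the cutoff from the data. The sketch below assumes the most recent year in the data is the one to count back from. The tibble year_demo is a hypothetical stand-in that holds only a year column.

```r
library(tibble)
library(dplyr)

# Hypothetical stand-in for the chart tibble: one row per year, newest first.
year_demo <- tibble(year = 2023:2014)

# Keep the five most recent years: the latest year and the four before it.
last_five <- year_demo |>
  filter(year >= max(year) - 4)

last_five$year
```

When the latest year is 2023, as in this lab's data, the computed cutoff is 2019.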
# insert code here
participation_force_chart <- participation_force_chart |>
  filter(year >= 2019)
Exercise 6
We haven’t gone into depth on using gt for pretty tables, so let’s take a look at what the default formatting offers. Note: this table won’t have data until you complete the prior exercises.
participation_force_chart |> gt()
| year | 16-24 | 25-34 | 35-44 | 45-54 | 55-64 | 65+ |
|---|---|---|---|---|---|---|
| 2023 | NA | 0.8475610 | 0.8351648 | 0.7484277 | 0.6800000 | 0.1859756 |
| 2022 | 0.6328125 | 0.8113208 | 0.8206522 | 0.7948718 | 0.6313131 | 0.1914894 |
| 2021 | NA | 0.8343949 | 0.8430233 | 0.8053691 | 0.6698565 | 0.2000000 |
| 2020 | NA | 0.8496732 | 0.8060606 | 0.8322148 | 0.6805556 | 0.2040134 |
| 2019 | 0.6758621 | 0.8684211 | 0.8421053 | 0.8322981 | 0.6759259 | 0.1795775 |
There are three formats we need to apply.
- The participation-rate values should be percentages with one decimal place.
- NA values should appear as blanks.
- The table should be captioned.
Exercise 7
The caption must be created within the function gt(caption = "insert caption name here"). The other two functions can be piped (e.g., gt() |> function_name()).
For percent formatting, use
fmt_percent(columns = c("16-24", ..., "list all applicable columns here"), decimals = 1)
For replacing NA with blank, use
sub_missing(missing_text = "")
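As an alternative to typing out every column name, the columns argument also accepts select helpers. The sketch below is one possible shortcut, not the required answer. It relies on the age-group column names all containing a digit while year does not. The tibble demo holds two rows copied from the Exercise 6 table, and the caption text is a placeholder.

```r
library(tibble)
library(gt)

# Two rows from the Exercise 6 table, trimmed to two age-group columns.
demo <- tibble(
  year = c(2023, 2022),
  `16-24` = c(NA, 0.6328125),
  `25-34` = c(0.8475610, 0.8113208)
)

demo_tbl <- demo |>
  gt(caption = "Demo caption") |>
  # matches("[0-9]") selects columns whose names contain a digit,
  # which here means the age-group columns but not year.
  fmt_percent(columns = matches("[0-9]"), decimals = 1) |>
  # Show missing values as empty cells instead of NA.
  sub_missing(missing_text = "")

demo_tbl
```

Listing the columns explicitly, as in the hint above, is more verbose but makes it obvious which columns are formatted.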
# complete the statements below.
participation_force_chart |>
  gt(caption = "Maine Workforce Participation by Age Group") |>
  fmt_percent(columns = c("16-24", "25-34", "35-44", "45-54", "55-64", "65+"),
              decimals = 1) |>
  sub_missing(missing_text = "")
| year | 16-24 | 25-34 | 35-44 | 45-54 | 55-64 | 65+ |
|---|---|---|---|---|---|---|
| 2023 | | 84.8% | 83.5% | 74.8% | 68.0% | 18.6% |
| 2022 | 63.3% | 81.1% | 82.1% | 79.5% | 63.1% | 19.1% |
| 2021 | | 83.4% | 84.3% | 80.5% | 67.0% | 20.0% |
| 2020 | | 85.0% | 80.6% | 83.2% | 68.1% | 20.4% |
| 2019 | 67.6% | 86.8% | 84.2% | 83.2% | 67.6% | 18.0% |
Exercise 8
To submit your lab:
- Change the author name to your name in the YAML portion at the top of this document.
- Render your document to HTML and publish it to RPubs.
- Submit the link to your RPubs document in the Brightspace comments section for this lab.
- Click on the “Add a File” button and upload your .qmd file for this assignment to Brightspace.
Grading
| Exercise | Points |
|---|---|
| Exercise 1 | 10 |
| Exercise 2 | 10 |
| Exercise 3 | 10 |
| Exercise 4 | 10 |
| Exercise 5 | 10 |
| Exercise 6 | 10 |
| Exercise 7 | 10 |
| Exercise 8 | 30 |
| Total | 100 |