03/14/2023

Introduction

In this presentation, I use simple linear regression on the “economics” dataset from ggplot2 to identify trends in the median duration of unemployment in the United States. The dataset has six columns: date, personal consumption expenditures (pce), total population (pop), personal savings rate (psavert), median duration of unemployment in weeks (uempmed), and number of unemployed in thousands (unemploy).

library(ggplot2)  # provides the economics dataset
head(economics)
## # A tibble: 6 × 6
##   date         pce    pop psavert uempmed unemploy
##   <date>     <dbl>  <dbl>   <dbl>   <dbl>    <dbl>
## 1 1967-07-01  507. 198712    12.6     4.5     2944
## 2 1967-08-01  510. 198911    12.6     4.7     2945
## 3 1967-09-01  516. 199113    11.9     4.6     2958
## 4 1967-10-01  512. 199311    12.9     4.9     3143
## 5 1967-11-01  517. 199498    12.8     4.7     3066
## 6 1967-12-01  525. 199657    11.8     4.8     3018
economics
## # A tibble: 574 × 6
##    date         pce    pop psavert uempmed unemploy
##    <date>     <dbl>  <dbl>   <dbl>   <dbl>    <dbl>
##  1 1967-07-01  507. 198712    12.6     4.5     2944
##  2 1967-08-01  510. 198911    12.6     4.7     2945
##  3 1967-09-01  516. 199113    11.9     4.6     2958
##  4 1967-10-01  512. 199311    12.9     4.9     3143
##  5 1967-11-01  517. 199498    12.8     4.7     3066
##  6 1967-12-01  525. 199657    11.8     4.8     3018
##  7 1968-01-01  531. 199808    11.7     5.1     2878
##  8 1968-02-01  534. 199920    12.3     4.5     3001
##  9 1968-03-01  544. 200056    11.7     4.1     2877
## 10 1968-04-01  544  200208    12.3     4.6     2709
## # … with 564 more rows

Median unemployment by time

First, we plot the median duration of unemployment (in weeks) over time.
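The plot for this slide can be reproduced with something like the following. This is a minimal sketch rather than the code behind the original figure; the object name `p_time` and the axis labels are assumptions chosen for illustration.

```r
library(ggplot2)  # provides the economics dataset

# Median duration of unemployment (weeks) plotted against date
p_time <- ggplot(economics, aes(x = date, y = uempmed)) +
  geom_line(color = "darkblue") +
  labs(x = "Date", y = "Median duration of unemployment (weeks)")
p_time
```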

Median unemployment with lin reg 1

Next, we redraw the previous graph with a linear regression line for predicting future trends in median unemployment time. We use ggplot2’s built-in linear smoother, geom_smooth(method = "lm"), and the ggpubr package to display the regression equation and \(R^2\) value.

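library(ggplot2)
library(ggpubr)  # provides stat_regline_equation()

p <- ggplot(economics, aes(x = date, y = uempmed)) +
  geom_line(color = "darkblue") +
  # after_stat() replaces the deprecated ..name.. notation
  stat_regline_equation(label.y = 20, aes(label = after_stat(eq.label))) +
  stat_regline_equation(label.y = 18, aes(label = after_stat(rr.label))) +
  geom_smooth(method = "lm", formula = y ~ x, color = "red")
#p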
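To inspect the fitted line outside the plot, the same regression can be run directly with lm(). The sketch below assumes the regression is on the numeric value of the date (days since 1970-01-01), which is how ggplot2 represents a date axis internally. The names `econ`, `days`, and `fit` are illustrative.

```r
library(ggplot2)  # provides the economics dataset

# Work on a plain data frame and add the date as a number of days
# since 1970-01-01, the scale on which the regression line is fitted.
econ <- as.data.frame(economics)
econ$days <- as.numeric(econ$date)

fit <- lm(uempmed ~ days, data = econ)
coef(fit)               # intercept (weeks) and slope (weeks per day)
summary(fit)$r.squared  # R^2 of the fit
```

The coefficients printed here are unrounded, so they may differ slightly from the values displayed on the plot.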

Median unemployment with lin reg 2

Median unemployment with lin reg 3

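With \(y = 4.3 + 0.00056x\) as the fitted line, we can calculate predictions past April 2015, the last month in the dataset. Because the x-axis is a date, \(x\) is the number of days since 1970-01-01, not the calendar year. Below are predictions for January 1 of each year, using the rounded coefficients from the plot:

\(2016: 4.3 + 0.00056(16801) \approx 13.7\ weeks\)
\(2017: 4.3 + 0.00056(17167) \approx 13.9\ weeks\)
\(2018: 4.3 + 0.00056(17532) \approx 14.1\ weeks\)
\(2019: 4.3 + 0.00056(17897) \approx 14.3\ weeks\)
\(2020: 4.3 + 0.00056(18262) \approx 14.5\ weeks\)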
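Predictions like these can also be computed from a fitted model with predict(). The sketch below refits the line with lm() on the numeric date (days since 1970-01-01, the scale ggplot2 uses for a date axis) and evaluates it on January 1 of each year. The names `econ`, `days`, `fit`, `future`, and `predicted_weeks` are illustrative, and the choice of January 1 is an assumption. Because the model keeps unrounded coefficients, its output can differ from hand calculations with the rounded equation.

```r
library(ggplot2)  # provides the economics dataset

# Refit the regression on days since 1970-01-01
econ <- as.data.frame(economics)
econ$days <- as.numeric(econ$date)
fit <- lm(uempmed ~ days, data = econ)

# Dates to predict for: January 1 of 2016 through 2020
future <- data.frame(date = as.Date(paste0(2016:2020, "-01-01")))
future$days <- as.numeric(future$date)

# Predicted median duration of unemployment, in weeks
future$predicted_weeks <- predict(fit, newdata = future)
future[, c("date", "predicted_weeks")]
```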

Median unemployment with lin reg 4

The previous results can be visualized by extending the original graph.
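One way to draw such an extended graph is to widen the x-axis and let the regression line span the full plot range. This is a sketch under assumptions, not necessarily how the original figure was made: the end date of 2020-12-31 and the name `p_ext` are illustrative.

```r
library(ggplot2)  # provides the economics dataset

p_ext <- ggplot(economics, aes(x = date, y = uempmed)) +
  geom_line(color = "darkblue") +
  # fullrange = TRUE extends the fitted line to the axis limits
  geom_smooth(method = "lm", formula = y ~ x, color = "red",
              se = FALSE, fullrange = TRUE) +
  # widen the x-axis beyond the last observation (April 2015)
  scale_x_date(limits = as.Date(c("1967-07-01", "2020-12-31")))
p_ext
```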

Median unemployment vs number of unemployed 1

As a bonus, we can plot the median duration of unemployment together with the number of unemployed (the unemploy column, in thousands). This can be useful for comparing the trends of the two statistics. We will use a 3D line graph from the “plotly” library.

library(plotly)

# 3D line: date (x), median duration in weeks (y), number of unemployed (z)
p <- plot_ly(economics, x = ~date, y = ~uempmed,
             z = ~unemploy, type = 'scatter3d',
             mode = 'lines', opacity = .9,
             line = list(width = 6, color = ~uempmed,
                         colorscale = 'viridis'))
#p

Median unemployment vs number of unemployed 2

Median unemployment conclusion

Based on our simple linear regression, we can predict that the median duration of unemployment will generally rise over the years. If the trend continues, people who become unemployed will likely stay unemployed for longer in the future. There are many possible reasons. One is that people who lose their jobs simply have a harder time re-entering the job market: application processes may have grown longer over the years with the addition of background checks and more interview steps. Another is that the availability of unemployment benefits may lead people to take more time when searching for new jobs. Whatever the cause, people who lose their jobs are experiencing longer periods of unemployment as time progresses, which could adversely affect their financial security.