Issue Description:

The thought of borrowing money, incurring debt, and paying off loans for a post-secondary education terrifies most students. With the cost of tuition and living expenses rising and wages remaining relatively stagnant, there seems to be legitimate cause for concern. Students are wary of pursuing a bachelor’s degree at a private college or university because of the higher sticker price, which many immediately associate with insurmountable debt. The fear of debt and loans prevents many students and families from even considering higher education as an option, so they forgo one of the biggest personal investments in life. Forbes magazine claims that student loan debt is the second largest debt category after mortgages (Friedman, 2018). Bloomberg warns “that the next generation of graduates could default on their loans at even higher rates than in the immediate wake of the financial crisis” (Griffin, 2018). With the Great Recession a recent memory for most adults, these claims seem particularly daunting. What should graduating high school seniors and their families think when considering the cost of higher education and debt? Should we as a country be on alert for a looming financial crisis due to defaulting student loans? These are pressing questions that concern educators, students, parents, and others. The issue is far-reaching and a topic of popular conversation.

Questions:

I plan to use student loan data on loan statuses, loan types, and delinquency rates to investigate the state of student loans. I want to know which loan statuses have the highest growth rates, which loan types have the largest growth, and what the growth is for the average student loan amount.

Data

https://studentaid.ed.gov/sa/sites/default/files/fsawg/datacenter/library/PortfoliobyLoanStatus.xls

library(tidyverse)
## -- Attaching packages --------------------------------------------------------------------------------------- tidyverse 1.2.1 --
## v ggplot2 3.0.0     v purrr   0.2.5
## v tibble  1.4.2     v dplyr   0.7.6
## v tidyr   0.8.1     v stringr 1.3.1
## v readr   1.1.1     v forcats 0.3.0
## -- Conflicts ------------------------------------------------------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library(dplyr)
library(readr)

LoanStatusp <- read_csv("LoanStatus.csv")
## Parsed with column specification:
## cols(
##   Year = col_integer(),
##   QRT = col_character(),
##   InSchool = col_character(),
##   InSchoolRecip = col_double(),
##   Grace = col_character(),
##   GraceRecip = col_double(),
##   Repayment = col_character(),
##   RepaymentRecip = col_double(),
##   Deferment = col_character(),
##   DefermentRecip = col_double(),
##   Forbearance = col_character(),
##   ForbearanceRecip = col_double(),
##   CumDefault = col_character(),
##   `Recipients     (in millions)` = col_double(),
##   Other = col_character(),
##   OtherRecip = col_double()
## )

Formatting the Data

My first order of business was to get all of the data into the proper format. Each dollar amount is converted to numeric, and the year and quarter are combined into a chronological data type, yearqtr.

library(zoo)
## 
## Attaching package: 'zoo'
## The following objects are masked from 'package:base':
## 
##     as.Date, as.Date.numeric
# Convert all dollar variables to numeric; add a new variable yrqrt that combines Year and QRT into a yearqtr

LoanStatus3 <- LoanStatusp %>%
  mutate(InSchool = gsub('\\$', '', InSchool),
         Grace = gsub('\\$', '', Grace),
         Repayment = gsub('\\$', '', Repayment),
         Deferment = gsub('\\$', '', Deferment),
         Forbearance = gsub('\\$', '', Forbearance),
         CumDefault = gsub('\\$', '', CumDefault),
         Other = gsub('\\$', '', Other)) %>%
  mutate(InSchool = as.numeric(InSchool),
         Grace = as.numeric(Grace),
         Repayment = as.numeric(Repayment),
         Deferment = as.numeric(Deferment),
         Forbearance = as.numeric(Forbearance),
         CumDefault = as.numeric(CumDefault),
         Other = as.numeric(Other)) %>%
  mutate(Year = as.character(Year))%>%
  mutate(yrqrt = paste(Year, QRT, sep = '/'),  yrqrt = as.yearqtr(yrqrt, format = "%Y/Q%q"))
  #mutate(yrqrt = as.Date(as.yearqtr(yrqrt, format = "%Y%Q"))

str(LoanStatus3)
## Classes 'tbl_df', 'tbl' and 'data.frame':    21 obs. of  17 variables:
##  $ Year                        : chr  "2013" "2013" "2014" "2014" ...
##  $ QRT                         : chr  "Q3" "Q4" "Q1" "Q2" ...
##  $ InSchool                    : num  134 153 147 160 136 ...
##  $ InSchoolRecip               : num  7.9 8.8 8.7 8.5 7.6 8.5 8.3 8.1 7.2 8.2 ...
##  $ Grace                       : num  40.4 47.6 27 28.9 42.8 49.8 28.7 28.6 42.2 48.5 ...
##  $ GraceRecip                  : num  1.9 2.2 1.4 1.7 1.8 2.1 1.4 1.6 1.8 1.9 ...
##  $ Repayment                   : num  237 236 272 276 300 ...
##  $ RepaymentRecip              : num  10.8 10.6 11.7 11.5 12.3 12.1 13.4 13.7 14.4 14.3 ...
##  $ Deferment                   : num  75.6 81.8 77.1 91 89.3 95.1 86.5 98.1 93.3 99.2 ...
##  $ DefermentRecip              : num  3.2 3.4 3.3 3.6 3.4 3.6 3.4 3.5 3.2 3.4 ...
##  $ Forbearance                 : num  48.3 53.7 63.6 72.1 74.9 80.7 86.7 86 87.4 89.5 ...
##  $ ForbearanceRecip            : num  1.8 1.9 2.2 2.5 2.5 2.7 2.6 2.6 2.6 2.7 ...
##  $ CumDefault                  : num  30.5 33.8 35.2 36.6 37.4 40.1 42.5 44.7 47.9 50.8 ...
##  $ Recipients     (in millions): num  2.1 2.2 2.4 2.5 2.6 2.7 2.9 3 3.2 3.3 ...
##  $ Other                       : num  3.2 2.9 4.3 4.4 4.9 5 4.7 5.1 5.2 5.4 ...
##  $ OtherRecip                  : num  0.1 0.1 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 ...
##  $ yrqrt                       : 'yearqtr' num  2013 Q3 2013 Q4 2014 Q1 2014 Q2 ...
summary(LoanStatus3)
##      Year               QRT               InSchool     InSchoolRecip  
##  Length:21          Length:21          Min.   :118.0   Min.   :6.400  
##  Class :character   Class :character   1st Qu.:133.8   1st Qu.:7.400  
##  Mode  :character   Mode  :character   Median :142.1   Median :7.700  
##                                        Mean   :141.3   Mean   :7.733  
##                                        3rd Qu.:150.1   3rd Qu.:8.200  
##                                        Max.   :160.0   Max.   :8.800  
##      Grace         GraceRecip      Repayment     RepaymentRecip 
##  Min.   :25.10   Min.   :1.200   Min.   :236.4   Min.   :10.60  
##  1st Qu.:27.00   1st Qu.:1.400   1st Qu.:301.8   1st Qu.:12.30  
##  Median :39.40   Median :1.700   Median :440.1   Median :15.00  
##  Mean   :36.45   Mean   :1.652   Mean   :426.4   Mean   :14.63  
##  3rd Qu.:42.80   3rd Qu.:1.900   3rd Qu.:535.6   3rd Qu.:16.60  
##  Max.   :50.10   Max.   :2.200   Max.   :630.2   Max.   :18.30  
##    Deferment      DefermentRecip   Forbearance     ForbearanceRecip
##  Min.   : 75.60   Min.   :3.100   Min.   : 48.30   Min.   :1.800   
##  1st Qu.: 89.30   1st Qu.:3.300   1st Qu.: 80.70   1st Qu.:2.500   
##  Median : 98.10   Median :3.400   Median : 96.20   Median :2.600   
##  Mean   : 98.27   Mean   :3.429   Mean   : 89.93   Mean   :2.543   
##  3rd Qu.:107.30   3rd Qu.:3.600   3rd Qu.:104.20   3rd Qu.:2.700   
##  Max.   :121.00   Max.   :3.700   Max.   :112.50   Max.   :2.900   
##    CumDefault    Recipients     (in millions)     Other      
##  Min.   :30.50   Min.   :2.100                Min.   :2.900  
##  1st Qu.:40.10   1st Qu.:2.700                1st Qu.:4.900  
##  Median :55.50   Median :3.600                Median :6.000  
##  Mean   :58.71   Mean   :3.533                Mean   :5.952  
##  3rd Qu.:74.90   3rd Qu.:4.300                3rd Qu.:7.200  
##  Max.   :97.00   Max.   :5.000                Max.   :8.400  
##    OtherRecip         yrqrt     
##  Min.   :0.1000   Min.   :2014  
##  1st Qu.:0.2000   1st Qu.:2015  
##  Median :0.2000   Median :2016  
##  Mean   :0.1905   Mean   :2016  
##  3rd Qu.:0.2000   3rd Qu.:2017  
##  Max.   :0.2000   Max.   :2018
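
The cleaning steps above can be reproduced on a tiny example using only base R. This is a minimal sketch, not the pipeline itself: the three rows are illustrative values rather than the contents of LoanStatus.csv, and `yrqrt_num` is a hypothetical column name. It also shows the number that a yearqtr value stores internally (the year plus a quarter fraction), which is probably why summary() prints the earliest quarter, 2013 Q3, as 2014 after rounding.

```r
# Illustrative rows only; the real file has one row per quarter.
raw <- data.frame(
  Year = c(2013L, 2013L, 2014L),
  QRT = c("Q3", "Q4", "Q1"),
  InSchool = c("$133.8", "$152.9", "$146.8"),
  stringsAsFactors = FALSE
)

# Strip the dollar sign, then convert the remaining text to numeric.
raw$InSchool <- as.numeric(gsub("\\$", "", raw$InSchool))

# zoo stores a quarter as year + (quarter - 1) / 4; this mimics that in base R.
quarter_number <- as.integer(sub("Q", "", raw$QRT))
raw$yrqrt_num <- raw$Year + (quarter_number - 1) / 4

print(raw)
```

Note that gsub() here removes only the dollar sign, so a value with thousands separators such as "$1,200.5" would still turn into NA and would need the commas stripped as well.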

How do the balances of these loan statuses change over time?

library(ggplot2)
ggplot(LoanStatus3, aes(x=yrqrt, y  = InSchool)) +
  geom_point() +
  geom_smooth()+
  labs(y = "Total in billions of dollars")+
  ggtitle("\"In School\" Loan status 2013 - 2018")
## Don't know how to automatically pick scale for object of type yearqtr. Defaulting to continuous.
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

I want to see what other loan status balances are doing over time. Mostly, I am curious about balances in Default.

LoanStatus3 %>%
  ggplot(aes(x=yrqrt, y  = CumDefault)) +
  geom_point() +
  geom_smooth()+
  labs(y = "Total in billions of dollars")+
  ggtitle("Cumulative \"Default\" Loan statuses 2013 - 2018")
## Don't know how to automatically pick scale for object of type yearqtr. Defaulting to continuous.
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

More formatting

I want to reshape this data so I can more easily manipulate it with the ggplot2 package.

isln = LoanStatus3%>%
  select(yrqrt,InSchool,InSchoolRecip)%>%
  mutate(loanstatus = "In School")%>%
  rename(Dollars = InSchool,Recipients = InSchoolRecip)

grln = LoanStatus3%>%
  select(yrqrt,Grace ,GraceRecip)%>%
  mutate(loanstatus = "Grace")%>%
  rename(Dollars = Grace,Recipients = GraceRecip)

rpln = LoanStatus3%>%
  select(yrqrt,Repayment ,RepaymentRecip)%>%
  mutate(loanstatus = "Repayment")%>%
  rename(Dollars = Repayment,Recipients = RepaymentRecip)

dfln = LoanStatus3%>%
  select(yrqrt,Deferment ,DefermentRecip)%>%
  mutate(loanstatus = "Deferment")%>%
  rename(Dollars = Deferment,Recipients = DefermentRecip)

fbln = LoanStatus3%>%
  select(yrqrt,Forbearance ,ForbearanceRecip)%>%
  mutate(loanstatus = "Forbearance")%>%
  rename(Dollars = Forbearance,Recipients = ForbearanceRecip)

cdln = LoanStatus3%>%
  select(yrqrt,CumDefault ,`Recipients     (in millions)`)%>%
  mutate(loanstatus = "Cummulative Default")%>%
  rename(Dollars = CumDefault,Recipients = `Recipients     (in millions)`)

orln = LoanStatus3%>%
  select(yrqrt,Other ,OtherRecip)%>%
  mutate(loanstatus = "Other")%>%
  rename(Dollars = Other,Recipients = OtherRecip)

all = rbind(isln, grln,rpln, dfln,fbln,cdln,orln)

# Check that everything looks the way it should
summary (all)
##      yrqrt         Dollars         Recipients      loanstatus       
##  Min.   :2014   Min.   :  2.90   Min.   : 0.100   Length:147        
##  1st Qu.:2015   1st Qu.: 39.75   1st Qu.: 1.850   Class :character  
##  Median :2016   Median : 88.30   Median : 3.200   Mode  :character  
##  Mean   :2016   Mean   :122.43   Mean   : 4.816                     
##  3rd Qu.:2017   3rd Qu.:133.65   3rd Qu.: 7.300                     
##  Max.   :2018   Max.   :630.20   Max.   :18.300
str(isln)
## Classes 'tbl_df', 'tbl' and 'data.frame':    21 obs. of  4 variables:
##  $ yrqrt     : 'yearqtr' num  2013 Q3 2013 Q4 2014 Q1 2014 Q2 ...
##  $ Dollars   : num  134 153 147 160 136 ...
##  $ Recipients: num  7.9 8.8 8.7 8.5 7.6 8.5 8.3 8.1 7.2 8.2 ...
##  $ loanstatus: chr  "In School" "In School" "In School" "In School" ...
str (grln)
## Classes 'tbl_df', 'tbl' and 'data.frame':    21 obs. of  4 variables:
##  $ yrqrt     : 'yearqtr' num  2013 Q3 2013 Q4 2014 Q1 2014 Q2 ...
##  $ Dollars   : num  40.4 47.6 27 28.9 42.8 49.8 28.7 28.6 42.2 48.5 ...
##  $ Recipients: num  1.9 2.2 1.4 1.7 1.8 2.1 1.4 1.6 1.8 1.9 ...
##  $ loanstatus: chr  "Grace" "Grace" "Grace" "Grace" ...
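
The seven near-identical blocks above all follow one pattern: pick a balance column and its matching recipient column, then label the rows. A minimal base R sketch of that pattern as a single helper is shown below. The helper name `stack_status` and the two-status data frame `wide` are hypothetical, and the sketch assumes every recipient column is named with a `Recip` suffix, which is not true of the default column in the real file (`Recipients     (in millions)`), so that column would need renaming first.

```r
# Hypothetical two-quarter, two-status extract in the wide layout.
wide <- data.frame(
  yrqrt = c(2013.50, 2013.75),
  Grace = c(40.4, 47.6),
  GraceRecip = c(1.9, 2.2),
  Deferment = c(75.6, 81.8),
  DefermentRecip = c(3.2, 3.4)
)

statuses <- c("Grace", "Deferment")

# Build the long-format rows for one status.
stack_status <- function(status) {
  data.frame(
    yrqrt = wide$yrqrt,
    Dollars = wide[[status]],
    Recipients = wide[[paste0(status, "Recip")]],
    loanstatus = status,
    stringsAsFactors = FALSE
  )
}

# Stack all statuses into one long data frame.
long <- do.call(rbind, lapply(statuses, stack_status))
print(long)
```

The result has one row per quarter and status, which is the same shape as `all`.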

Comparing Loan status growth

How are the loan statuses changing over time compared to each other?

all$loanstatus <- factor(all$loanstatus, c("In School", "Grace", "Repayment", "Deferment", "Forbearance", "Cummulative Default", "Other"))

all%>%
  ggplot(aes(x = yrqrt,y = Dollars, color = factor (loanstatus)))+
  geom_point()+
  labs( y = "Dollars in billions")+
  facet_wrap(~loanstatus)+
  theme(axis.text.x = element_text(angle=45)) +
  theme(strip.text.y = element_text(angle = 0))+
   scale_x_yearqtr(limits = c(min(all$yrqrt), max(all$yrqrt)),format = "%YQ%q")+
  ggtitle("Outstanding principal and interest balances for Federal Direct Loans 2013 - 2018")

Comparing growth in number of recipients

Loan amounts in many statuses are going up, but what is happening with the number of recipients?

all%>%
  ggplot(aes(x = yrqrt,y = Recipients, color = factor (loanstatus)))+
  geom_point()+
  labs( y = "Recipients in millions")+
  facet_wrap(~loanstatus)+
  theme(axis.text.x = element_text(angle=45)) +
  theme(strip.text.y = element_text(angle = 0)) +
  scale_x_yearqtr(limits = c(min(all$yrqrt), max(all$yrqrt)),format = "%YQ%q")+
  ggtitle("Loan status of Recipients of Federal Direct Loans 2013 - 2018")

Distribution

What is happening to the overall balance for these loans over time?

all%>%
ggplot(aes(x = factor (yrqrt),y=Dollars)) + geom_boxplot() + coord_flip()+
  labs(y = "Dollars in billions")+
  labs(x = "Year and Quarter") +
  ggtitle("Distribution of Federal Direct Loan balances 2013 - 2018")

What is happening to the distribution of the recipients over time?

all%>%
ggplot(aes(x = factor (yrqrt),y=Recipients)) + geom_boxplot() + coord_flip()+
  labs(y = "Recipients in millions")+
  labs(x = "Year and Quarter") +
  ggtitle("Distribution of recipients of Federal Direct Loans 2013 - 2018")

The box plots allow us to see the change in distribution over time for recipients and loan amounts. Let's see what this could mean for an individual. I add the average loan amount per recipient, converting billions of dollars per million recipients into dollars per recipient.

all1 = all %>%
  mutate(avgLoan = (Dollars/Recipients)*1000)
str (all1)
## Classes 'tbl_df', 'tbl' and 'data.frame':    147 obs. of  5 variables:
##  $ yrqrt     : 'yearqtr' num  2013 Q3 2013 Q4 2014 Q1 2014 Q2 ...
##  $ Dollars   : num  134 153 147 160 136 ...
##  $ Recipients: num  7.9 8.8 8.7 8.5 7.6 8.5 8.3 8.1 7.2 8.2 ...
##  $ loanstatus: Factor w/ 7 levels "In School","Grace",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ avgLoan   : num  16937 17375 16874 18824 17908 ...
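
The factor of 1000 in avgLoan comes from the units: balances are in billions of dollars and recipients are in millions, so the ratio is in thousands of dollars per recipient. The short check below is a sketch using the 2014 Q2 "In School" figures as displayed (rounded) in the output above; the variable names are hypothetical.

```r
# Figures as displayed for the 2014 Q2 "In School" row.
dollars_billions <- 160
recipients_millions <- 8.5

# Full conversion: dollars divided by people.
avg_full <- (dollars_billions * 1e9) / (recipients_millions * 1e6)

# Shortcut used in the analysis: 1e9 / 1e6 reduces to 1000.
avg_short <- (dollars_billions / recipients_millions) * 1000

print(c(full = avg_full, shortcut = avg_short))
```

Both expressions give the same value, which rounds to the 18824 shown in the fourth avgLoan entry.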

How is the distribution of the average loan balance changing over time?

all1%>%
ggplot(aes(x = factor (yrqrt),y=avgLoan)) + geom_boxplot() + coord_flip()+
  labs(y="Average loan amount per recipient")+
  labs(x = "Year and Quarter") +
  ggtitle("Average Loan amount per Recipient of Federal Direct Loans 2013 - 2018")

We can see that the median average loan amount per recipient is increasing with time, which is consistent with the steady rise of tuition. I want to see the distribution of the balance for each loan status.

all%>%
ggplot(aes(x=loanstatus,y= Dollars)) + geom_boxplot()+
   labs(y ="Balance in billions of dollars")+
  labs(x = "Loan status") +
  ggtitle("Balances Federal Direct Loan Status 2013 - 2018")

We can take a look at the overall distribution of the recipients in each loan status.

all%>%
ggplot(aes(x=loanstatus,y= Recipients)) + geom_boxplot()+
   labs(y ="Recipients in millions")+
  labs(x = "Loan status") +
  ggtitle("Recipients for various Federal Direct Loan Status 2013 - 2018")

Finally, we can see the distribution of the average loan balance for each loan status.

all1%>%
ggplot(aes(x=loanstatus,y= avgLoan)) + geom_boxplot()+
   labs(y ="Average Loan Amounts per recipient")+
  labs(x = "Loan status") +
  ggtitle("Distribution of average balance for various Direct Loan Status 2013 - 2018")

Interestingly, the largest average loan amounts are in Forbearance and Other, which is a non-default bankruptcy status. It makes sense that people with large loan amounts would be facing some sort of financial hardship. Although borrowers in default have the third largest median number of recipients, their median loan amount is quite low. It would seem that the people who are in default are defaulting on small loan amounts.

Question

What does the growth of each of the loan status balances look like over time?

all1$loanstatus <- factor(all$loanstatus, c("In School", "Grace", "Repayment", "Deferment", "Forbearance", "Cummulative Default", "Other"))

all1%>%
ggplot(aes(x= yrqrt, y = Dollars,color = loanstatus, group=loanstatus)) + geom_point() + geom_smooth()+
  labs (y = "Dollars in billions")+
  labs(x = "Year") +
  scale_x_yearqtr(limits = c(min(all$yrqrt), max(all$yrqrt)),format = "%YQ%q")+
  ggtitle("Outstanding balance for Federal Direct Loan Statuses 2013 - 2018")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
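
The smoothed curves show the direction of growth but not a rate. One way to put a number on it is sketched below for the cumulative default balance, using only the first ten quarterly values printed in the str() output earlier (2013 Q3 onward), so it covers part of the period rather than the full series. The variable names are hypothetical, and the compound quarterly rate assumes steady growth, which is a simplification.

```r
# First ten quarterly CumDefault balances (billions of dollars) from the str() output.
cum_default <- c(30.5, 33.8, 35.2, 36.6, 37.4, 40.1, 42.5, 44.7, 47.9, 50.8)

n_quarters <- length(cum_default) - 1

# Total growth from the first to the last of these quarters.
total_growth <- cum_default[length(cum_default)] / cum_default[1] - 1

# Compound growth rate per quarter implied by that total.
quarterly_rate <- (1 + total_growth)^(1 / n_quarters) - 1

print(round(c(total = total_growth, per_quarter = quarterly_rate), 4))
```

Applying the same calculation to each loan status would give comparable growth rates for ranking the statuses.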

What does the growth in the number of recipients in each loan status look like over time?

all1%>%
ggplot(aes(x= yrqrt, y = Recipients,color = loanstatus, group=loanstatus)) + geom_point() + geom_smooth()+
  labs(y ="Recipients in millions")+
  labs(x = "Year") +
  scale_x_yearqtr(limits = c(min(all$yrqrt), max(all$yrqrt)),format = "%YQ%q")+
  ggtitle("Recipients in various Federal Direct Loan Status 2013 - 2018")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

What does the growth of the average balance for each loan status look like over time?

all1%>%
ggplot(aes(x= yrqrt, y = avgLoan,color = loanstatus, group=loanstatus)) + geom_point() + geom_smooth()+
   labs(y ="Average loan amount per recipient")+
  labs(x = "Year") +
  scale_x_yearqtr(limits = c(min(all$yrqrt), max(all$yrqrt)),format = "%YQ%q")+
  ggtitle("Average loan amounts Federal Direct Loan Statuses 2013 - 2018")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'

One of the articles I read claimed that 1 million people default on their loans each year. Let's take a look. I add the quarterly change in the number of recipients as a variable and plot it over time.

trychange = all1 %>%
  group_by(loanstatus) %>% 
  # Recipients are in millions, so scale the quarterly difference to individual borrowers
  mutate(recipient_change = (Recipients - lag(Recipients)) * 1000000)
head(trychange)
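
The quarter-over-quarter change can be checked by hand with base R's diff(), which matches subtracting the lagged value. The sketch below uses the first five cumulative default recipient counts from the str() output earlier and assumes they are in millions, as the column name suggests; the variable names are hypothetical. Because this is the net change in a cumulative total, it counts borrowers entering default minus any leaving it, so it is only a rough proxy for new defaults.

```r
# Cumulative default recipients in millions, 2013 Q3 through 2014 Q3.
recipients_millions <- c(2.1, 2.2, 2.4, 2.5, 2.6)

# Change from the previous quarter, scaled from millions to people.
# The first quarter has no predecessor, so its change is NA.
change_people <- round(c(NA, diff(recipients_millions)) * 1e6)

# Net change over the four quarter-to-quarter steps (one year).
net_change_4q <- sum(change_people, na.rm = TRUE)

print(change_people)
print(net_change_4q)
```

For these five quarters the net increase over one year is 500,000 recipients.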

Is the rate at which recipients enter the Default loan status increasing over time?

trychange %>%
  filter(loanstatus == "Cummulative Default")%>%
  ggplot(aes(x= yrqrt, y = recipient_change)) + geom_point() + geom_smooth()+
   labs(y ="Recipients")+
  labs(x = "Year") +
  scale_x_yearqtr(limits = c(min(all$yrqrt), max(all$yrqrt)),format = "%YQ%q")+
  ggtitle("Change in Recipients for Cumulative Default Federal Direct Loan Status 2013 - 2018")
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
## Warning: Removed 1 rows containing non-finite values (stat_smooth).
## Warning: Removed 1 rows containing missing values (geom_point).

It appears to stay roughly flat over time, except for a jump in the first quarter of 2016.

Comparing loan amounts by state

Data

https://studentaid.ed.gov/sa/sites/default/files/fsawg/datacenter/library/DLPortfolio-by-Location.xls

LoanLocation <- read_csv("DLPortfolio-by-Location.csv")
## Parsed with column specification:
## cols(
##   Location = col_character(),
##   `Balance (in billions)` = col_character(),
##   `Borrowers (in thousands)` = col_double()
## )
head(LoanLocation)
loanloc <- LoanLocation %>%
  rename(Borrowers = 'Borrowers (in thousands)', Balance = 'Balance (in billions)',NAME = Location)%>%
  mutate(Balance = gsub('\\$', '', Balance))%>%
  mutate(Balance = as.numeric(Balance))%>%
  mutate(avgbalance = (Balance*1000000000)/(Borrowers*1000))

Let's look at the loan balance per state.

theme_dotplot <- theme_bw(14) +
    theme(axis.text.y = element_text(size = rel(.60)),
        axis.ticks.y = element_blank(),
        axis.title.x = element_text(size = rel(.75)),
        panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_line(size = 0.5),
        panel.grid.minor.x = element_blank())

loanloc%>%
ggplot(aes(x = Balance, y = reorder(NAME, Balance))) + 
  geom_point()+
  theme_dotplot+
  geom_point(color = "blue") +
  geom_vline(xintercept = 28.0, color = "red") +
  labs(x = "Loan Balance in billions", y = "State")+
  ggtitle("Direct Loan balance by state")

This is likely just a reflection of each state's population, so I would like to see what the number of borrowers looks like for each state.

loanloc%>%
ggplot(aes(x = Borrowers, y = reorder(NAME, Borrowers))) + 
  geom_point()+
  theme_dotplot+
  geom_point(color = "blue") +
  geom_vline(xintercept = 800.0, color = "red") +
  labs(x = "Borrowers (in thousands)", y = "State")+
  ggtitle("Direct Loan borrowers by state")

Now I want to consider the average loan balance for borrowers in each state.

loanloc%>%
ggplot(aes(x = avgbalance, y = reorder(NAME, avgbalance))) + 
  geom_point()+
  theme_dotplot+
  labs(x = "Average balance per borrower", y = "State")+
  ggtitle("Average borrower balance by state")

library(tidycensus)
library(tidyverse)
options(tigris_use_cache = TRUE)
#census_api_key("YOUR_CENSUS_API_KEY", install = TRUE)

acslocation <- get_acs(geography = "state", variables = "B19013_001", year = 2016, geometry = TRUE)
## Getting data from the 2012-2016 5-year ACS
head(acslocation)
both = loanloc%>%
  select(NAME, Balance, Borrowers,avgbalance)%>%
  full_join(acslocation)
## Joining, by = "NAME"
glimpse(both)
## Observations: 52
## Variables: 9
## $ NAME       <chr> "Alabama", "Alaska", "Arizona", "Arkansas", "Califo...
## $ Balance    <dbl> 16.9, 1.8, 22.4, 9.1, 108.4, 21.4, 12.8, 3.3, 5.2, ...
## $ Borrowers  <dbl> 525.9, 59.6, 719.9, 317.1, 3321.7, 663.7, 417.2, 10...
## $ avgbalance <dbl> 32135.39, 30201.34, 31115.43, 28697.57, 32633.89, 3...
## $ GEOID      <chr> "01", "02", "04", "05", "06", "08", "09", "10", "11...
## $ variable   <chr> "B19013_001", "B19013_001", "B19013_001", "B19013_0...
## $ estimate   <dbl> 44758, 74444, 51340, 42336, 63783, 62520, 71755, 61...
## $ moe        <dbl> 314, 809, 231, 234, 188, 287, 473, 723, 1164, 200, ...
## $ geometry   <MULTIPOLYGON [°]> MULTIPOLYGON (((-88.05338 3..., MULTI...
head(both)

I want to take a look at the average balance for each state on a map.

library(viridis)
## Loading required package: viridisLite
both%>%
  ggplot(aes(fill = avgbalance, color = avgbalance)) +
  geom_sf() +
   coord_sf(crs = 26911) + 
  scale_fill_viridis(option = "magma") + 
  scale_color_viridis(option = "magma")

Now I want to see how the average balance might compare with the median income in that state.

both%>%
  ggplot(aes(fill = estimate, color = estimate)) +
  geom_sf() +
   coord_sf(crs = 26911) + 
  scale_fill_viridis(option = "magma") + 
  scale_color_viridis(option = "magma")
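
Comparing two maps by eye is hard, so a single ratio per state may help. The sketch below divides the average balance by the ACS estimate (variable B19013_001, median household income) for the five states visible in the glimpse() output above. The data frame `states` and the column `balance_to_income` are hypothetical names, and five states are far too few to draw conclusions from; this only illustrates the calculation.

```r
# Values for the first five states as shown in the glimpse() output.
states <- data.frame(
  NAME = c("Alabama", "Alaska", "Arizona", "Arkansas", "California"),
  avgbalance = c(32135.39, 30201.34, 31115.43, 28697.57, 32633.89),
  estimate = c(44758, 74444, 51340, 42336, 63783),
  stringsAsFactors = FALSE
)

# Average borrower balance as a share of median household income.
states$balance_to_income <- states$avgbalance / states$estimate

# Sort from the highest ratio to the lowest.
ranked <- states[order(-states$balance_to_income), ]
print(ranked[, c("NAME", "balance_to_income")])
```

Among these five states, Alabama has the highest ratio and Alaska the lowest, even though their average balances are similar.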

