Data: US States Production - a panel of 48 states observed annually from 1970 to 1986

References, source, and metadata: http://vincentarelbundock.github.io/Rdatasets/doc/plm/Produc.html

The following R code and graphs explore the relationship between each state's ratio of private to public spending and its unemployment rate from 1970 to 1986. The exploration concludes with an analysis of the states whose ratio had the smallest and the greatest range over this period, and an attempt to find a relationship between that ratio and unemployment.

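1. We load the data and create a private versus public spending ratio for each state and year. Note that the code computes this ratio as pc / gsp. In the Produc documentation, pc is the private capital stock and gsp is the gross state product, while the public capital stock is pcap, so the ratio is strictly private capital relative to gross state product.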

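# Load the Produc panel (one row per state and year)
df <- read.csv("http://vincentarelbundock.github.io/Rdatasets/csv/plm/Produc.csv", header = TRUE, sep = ",")
options(warn = -1)  # suppress warnings for the rest of the session
# Ratio used throughout: pc divided by gsp
df$priv_over_pub <- with(df, pc / gsp)
# Keep only the columns needed below
df_sub <- subset(df, select = c(state, year, unemp, priv_over_pub))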

4. To reduce the data set, we calculate, for each state, the range between its lowest and highest ratio values across all years.

# Lowest and highest ratio per state across all years
min_vals <- aggregate(priv_over_pub ~ state, df_sub, min)
max_vals <- aggregate(priv_over_pub ~ state, df_sub, max)
names(min_vals)[1:2] <- c("state", "min")
names(max_vals)[1:2] <- c("state", "max")
maxmin <- merge(min_vals, max_vals, by = "state")
maxmin$range <- maxmin$max - maxmin$min
# List states from smallest to largest range
maxmin[order(maxmin$range), ]
##             state       min       max     range
## 47      WISCONSIN 0.8408928 0.9446676 0.1037748
## 18       MARYLAND 0.6775637 0.7957770 0.1182133
## 44       VIRGINIA 0.7045575 0.8358151 0.1312577
## 28     NEW_JERSEY 0.6665610 0.7979046 0.1313436
## 19  MASSACHUSETTS 0.5885171 0.7257024 0.1371852
## 31 NORTH_CAROLINA 0.8600003 0.9991763 0.1391760
## 22    MISSISSIPPI 1.1753641 1.3199585 0.1445944
## 6     CONNECTICUT 0.6147361 0.7601229 0.1453868
## 4      CALIFORNIA 0.6501949 0.7994933 0.1492984
## 42           UTAH 0.9722568 1.1240468 0.1517900
## 9         GEORGIA 0.8863980 1.0386686 0.1522707
## 40       TENNESSE 1.0015547 1.1592838 0.1577291
## 23       MISSOURI 0.8731555 1.0320287 0.1588732
## 30       NEW_YORK 0.5628860 0.7220925 0.1592065
## 36   PENNSYLVANIA 0.8135829 0.9759466 0.1623637
## 33           OHIO 0.8608755 1.0238819 0.1630064
## 37   RHODE_ISLAND 0.5534691 0.7208513 0.1673822
## 29     NEW_MEXICO 1.3293738 1.5013790 0.1720052
## 41          TEXAS 1.2222095 1.3958253 0.1736158
## 45     WASHINGTON 0.9124592 1.0869291 0.1744699
## 10          IDAHO 1.1662234 1.3439705 0.1777470
## 5        COLORADO 0.8448481 1.0388615 0.1940134
## 38 SOUTH_CAROLINA 1.1326606 1.3303239 0.1976633
## 11       ILLINOIS 0.7878404 0.9857199 0.1978795
## 3        ARKANSAS 1.1502799 1.3642801 0.2140002
## 8         FLORIDA 0.7308522 0.9586563 0.2278041
## 12        INDIANA 1.0549468 1.2860535 0.2311067
## 1         ALABAMA 1.1990431 1.4387445 0.2397015
## 15       KENTUCKY 0.9124670 1.1652944 0.2528274
## 21      MINNESOTA 0.8885572 1.1454508 0.2568937
## 17          MAINE 0.9002072 1.1605149 0.2603077
## 35         OREGON 0.9380360 1.1988304 0.2607944
## 14         KANSAS 1.2269931 1.4891513 0.2621582
## 24        MONTANA 1.6908443 1.9567083 0.2658640
## 43        VERMONT 0.8403514 1.1081802 0.2678288
## 25       NEBRASKA 1.1629348 1.4314748 0.2685400
## 13           IOWA 1.0688918 1.3560258 0.2871340
## 20       MICHIGAN 0.7667430 1.0670583 0.3003153
## 7        DELAWARE 0.8010332 1.1058352 0.3048020
## 2         ARIZONA 0.9053479 1.2228323 0.3174845
## 39   SOUTH_DAKOTA 1.2905422 1.6113060 0.3207637
## 34       OKLAHOMA 1.1509080 1.4805259 0.3296179
## 27  NEW_HAMPSHIRE 0.6471302 0.9810807 0.3339505
## 46  WEST_VIRGINIA 1.3458595 1.7257098 0.3798503
## 26         NEVADA 1.1625266 1.5481512 0.3856246
## 16      LOUISIANA 1.4756784 1.9650905 0.4894121
## 32   NORTH_DAKOTA 1.5610037 2.2324990 0.6714953
## 48        WYOMING 1.7308232 2.4940672 0.7632439
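
The per-state range can also be computed in a single aggregate() call. The following is a minimal sketch of that idea, not the code used for the table above; the data frame toy and its values are made up so that the example runs without downloading anything.

```r
# Hypothetical toy data standing in for df_sub (values are made up).
toy <- data.frame(
  state = rep(c("A", "B", "C"), each = 3),
  priv_over_pub = c(0.80, 0.90, 0.85, 1.20, 1.70, 1.40, 1.00, 1.05, 1.02)
)

# diff(range(x)) equals max(x) - min(x), so one aggregate() call is enough.
ranges <- aggregate(priv_over_pub ~ state, toy, function(x) diff(range(x)))
names(ranges)[2] <- "range"

# Sort from smallest to largest range, as in the table above.
ranges <- ranges[order(ranges$range), ]
print(ranges)
```

This drops the separate min and max columns, so it is only a shortcut when the range itself is all that is needed.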

5. We can then select the five states with the greatest ranges and the five states with the smallest ranges, and attach each state's yearly data to them.

# Keep states with a range above 0.35 or below 0.138 (five states at each extreme)
maxmin2 <- subset(maxmin, range > 0.35 | range < 0.138, select = c(state, range))
require(plyr)
## Loading required package: plyr
# Attach each selected state's yearly rows
combinedData <- join(maxmin2, df, by = "state", type = "left", match = "all")
minstate <- subset(combinedData, range < 0.138, select = c(state, year, range, unemp, priv_over_pub))
maxstate <- subset(combinedData, range > 0.35, select = c(state, year, range, unemp, priv_over_pub))
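
The cutoffs 0.35 and 0.138 were read off the sorted table by hand. One possible alternative, sketched here under the assumption that we simply want the n smallest and n largest ranges, is to sort once and take the head and tail. The data frame toy_ranges and its values are hypothetical.

```r
# Hypothetical per-state ranges (made-up values), shaped like maxmin.
toy_ranges <- data.frame(
  state = c("A", "B", "C", "D", "E", "F"),
  range = c(0.30, 0.10, 0.75, 0.15, 0.40, 0.12)
)

n <- 2  # states to keep at each extreme; the analysis above uses 5
sorted <- toy_ranges[order(toy_ranges$range), ]

lowest  <- head(sorted, n)  # the n smallest ranges
highest <- tail(sorted, n)  # the n largest ranges

print(lowest)
print(highest)
```

This avoids thresholds that would need to be re-chosen if the data changed, though it makes no special provision for ties at the cutoff.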

6. A plot of the states with the smallest, or most “consistent”, ranges [Wisconsin, Maryland, Virginia, New Jersey, Massachusetts] shows a range in unemployment.
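
The plot itself is not reproduced here. As a rough sketch of how such a plot could be drawn in base R, the example below draws one line per state with the ratio on the x-axis and unemployment on the y-axis. The choice of axes is an assumption, and the data frame toy is made-up data standing in for minstate.

```r
# Hypothetical toy data standing in for minstate (values are made up).
toy <- data.frame(
  state = rep(c("A", "B"), each = 4),
  year = rep(1970:1973, times = 2),
  unemp = c(4.1, 5.0, 6.2, 5.5, 3.9, 4.4, 7.1, 6.0),
  priv_over_pub = c(0.85, 0.88, 0.92, 0.90, 0.70, 0.72, 0.78, 0.75)
)

states <- unique(toy$state)
cols <- seq_along(states)  # one base-graphics colour index per state

# Empty frame sized to the full data, then one line per state.
plot(range(toy$priv_over_pub), range(toy$unemp), type = "n",
     xlab = "Ratio (pc / gsp)", ylab = "Unemployment rate")
for (i in seq_along(states)) {
  s <- toy[toy$state == states[i], ]
  lines(s$priv_over_pub, s$unemp, type = "b", col = cols[i])
}
legend("topleft", legend = states, col = cols, lty = 1, pch = 1)
```

The same code would apply to maxstate by swapping the input data frame.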

7. A plot of the states with the greatest, or most “inconsistent”, ranges [West Virginia, Nevada, Louisiana, North Dakota, Wyoming] shows a range in unemployment and a trend that differs from the earlier graph.
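
One way to put a number on the relationship in each group, offered here only as a sketch and not as part of the original analysis, is a per-state Pearson correlation between the ratio and unemployment. The data frame toy is hypothetical. Its values are chosen so that state A is perfectly positively correlated and state B perfectly negatively correlated.

```r
# Hypothetical toy data standing in for minstate or maxstate (values are made up).
toy <- data.frame(
  state = rep(c("A", "B"), each = 4),
  unemp = c(4, 5, 6, 7, 8, 6, 5, 3),
  priv_over_pub = c(0.8, 0.9, 1.0, 1.1, 1.0, 1.2, 1.3, 1.5)
)

# Pearson correlation between the ratio and unemployment, one value per state.
by_state <- split(toy, toy$state)
cors <- sapply(by_state, function(s) cor(s$priv_over_pub, s$unemp))
print(round(cors, 3))
```

With only 17 yearly observations per state in the real data, such correlations would be a rough indicator at best.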