Regression of our 36 variables

famE <- read.csv("/Users/hannahpeterson/Documents/R stuff/FamilyEdited1.csv")
table(famE$FSLAST)
## 
##     1     2     3 
##  1723  4530 39290
m1 <- lm(famE$FSLAST ~ ., data = famE)
summary(m1)
## 
## Call:
## lm(formula = famE$FSLAST ~ ., data = famE)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.65047 -0.00521  0.00202  0.01342  1.67849 
## 
## Coefficients: (1 not defined because of singularities)
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.676e-01  6.229e-02   7.507 6.23e-14 ***
## CURWRKN     -5.506e-03  2.553e-03  -2.157 0.031031 *  
## TELCELN             NA         NA      NA       NA    
## WRKCELN      3.623e-04  2.326e-04   1.557 0.119438    
## FLNGINTV     2.858e-04  2.864e-03   0.100 0.920507    
## FM_SIZE     -4.728e-03  2.662e-03  -1.776 0.075728 .  
## FM_KIDS      1.930e-03  3.069e-03   0.629 0.529547    
## FM_ELDR      1.237e-02  4.347e-03   2.845 0.004450 ** 
## FM_TYPE      3.021e-03  9.155e-03   0.330 0.741404    
## FM_STRCP     8.278e-03  5.733e-03   1.444 0.148755    
## FM_STRP     -8.485e-03  5.737e-03  -1.479 0.139181    
## FM_EDUC1     2.796e-04  2.189e-04   1.277 0.201627    
## FLAADLYN     1.760e-02  2.545e-02   0.692 0.489217    
## FLAADLCT    -3.675e-03  2.528e-02  -0.145 0.884436    
## FLIADLYN    -3.886e-02  2.160e-02  -1.799 0.072011 .  
## FLIADLCT    -7.477e-03  2.069e-02  -0.361 0.717836    
## FWKLIMYN     9.267e-03  1.013e-02   0.915 0.360174    
## FWKLIMCT     2.802e-03  9.573e-03   0.293 0.769744    
## FANYLYN      2.409e-02  8.091e-03   2.977 0.002912 ** 
## FANYLCT      2.063e-03  6.368e-03   0.324 0.745976    
## FHSTATPR    -2.700e-02  7.279e-03  -3.710 0.000208 ***
## FSRUNOUT     5.511e-01  3.696e-03 149.092  < 2e-16 ***
## FSBALANC     2.789e-01  4.147e-03  67.265  < 2e-16 ***
## FHICOVYN    -2.121e-03  2.918e-03  -0.727 0.467299    
## FMEDBILL     9.638e-03  2.665e-03   3.616 0.000299 ***
## FDGLWCT1     7.231e-03  3.382e-03   2.138 0.032547 *  
## FWRKLWCT     3.253e-03  2.372e-03   1.371 0.170317    
## FSALYN      -7.613e-03  2.821e-03  -2.699 0.006965 ** 
## FSALCT      -8.539e-04  2.787e-03  -0.306 0.759328    
## FSSRRYN     -8.459e-03  3.894e-03  -2.172 0.029846 *  
## FSSRRCT     -1.406e-02  5.062e-03  -2.778 0.005480 ** 
## FTANFYN      4.893e-03  4.199e-03   1.165 0.243974    
## FCHSPYN      1.631e-03  3.561e-03   0.458 0.646900    
## INCGRP4     -4.968e-05  7.284e-05  -0.682 0.495242    
## RAT_CAT4    -5.087e-06  9.275e-05  -0.055 0.956258    
## FSNAP        6.664e-03  2.768e-03   2.407 0.016080 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2239 on 30104 degrees of freedom
##   (15404 observations deleted due to missingness)
## Multiple R-squared:  0.7092, Adjusted R-squared:  0.7089 
## F-statistic:  2159 on 34 and 30104 DF,  p-value: < 2.2e-16

After looking at all of the data, we narrowed the variables to 36 that we felt best fit our three main reasons for food insecurity: economic resources, social resources, and functional limitations (disabilities). For our dependent variable we chose FSLAST, because we wanted a variable that directly represents food insecurity: it records whether a family's food did not last and there was not enough money to buy more. Together, our variables explain about 71% of the variance in FSLAST (multiple R-squared = 0.7092).
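The R-squared figure quoted above is the "Multiple R-squared" line that summary() prints. As a minimal sketch, the same number, along with the count of rows actually used in the fit, can be pulled out of a fitted model programmatically. The data frame `toy` and its columns are made up for illustration and are not the survey data.

```r
# Hypothetical toy data (not the survey file): y depends on x1, x2 is noise
set.seed(1)
toy <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
toy$y <- 1 + 2 * toy$x1 + rnorm(200)
toy$x2[1:20] <- NA            # lm() drops rows that contain missing values

fit <- lm(y ~ ., data = toy)  # "." means every other column in toy

r2     <- summary(fit)$r.squared  # share of the variance in y explained by the model
n_used <- nobs(fit)               # rows left after dropping incomplete cases
n_lost <- nrow(toy) - n_used      # the "observations deleted due to missingness" count

print(c(r_squared = r2, used = n_used, dropped = n_lost))
```

Checking the number of dropped rows matters here because the full model above was fit on roughly two thirds of the families, the rest having at least one missing value.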

Correlation of FSLAST and FSRUNOUT

m2 <- lm(famE$FSLAST ~ famE$FSRUNOUT, data = famE)
summary(m2)
## 
## Call:
## lm(formula = famE$FSLAST ~ famE$FSRUNOUT, data = famE)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -1.98435  0.01565  0.01565  0.01565  1.53282 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   0.708598   0.006711   105.6   <2e-16 ***
## famE$FSRUNOUT 0.758586   0.002366   320.7   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2599 on 45541 degrees of freedom
## Multiple R-squared:  0.6931, Adjusted R-squared:  0.693 
## F-statistic: 1.028e+05 on 1 and 45541 DF,  p-value: < 2.2e-16

After further analyzing the data, we realized that FSLAST and FSRUNOUT measure nearly the same thing. FSLAST records the tangible outcome, that a family did in fact run out of food due to a shortage of money, whereas FSRUNOUT records the worry or fear of running out of food before the week is up. This regression shows that the two variables are highly correlated: FSRUNOUT alone gives an R-squared of 0.69. From there, we decided to leave FSRUNOUT out and see how FSLAST relates to our remaining variables.
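With a single predictor, R-squared is the square of the Pearson correlation, so the R-squared of 0.6931 reported above corresponds to a correlation of roughly 0.83 between the two items. The sketch below illustrates that identity on made-up 1-to-3 responses; the vectors `worry` and `last` are hypothetical stand-ins, not the survey columns.

```r
# Hypothetical ordinal responses coded 1-3 (not the survey data)
set.seed(42)
worry <- sample(1:3, 500, replace = TRUE, prob = c(0.1, 0.2, 0.7))
# The second item agrees with the first most of the time, otherwise shifts by one
shift <- sample(c(-1, 0, 1), 500, replace = TRUE, prob = c(0.1, 0.8, 0.1))
last  <- pmin(pmax(worry + shift, 1), 3)   # keep values inside 1..3
toy   <- data.frame(last = last, worry = worry)

fit <- lm(last ~ worry, data = toy)
r   <- cor(toy$last, toy$worry)            # Pearson correlation

# With one predictor, the model's R-squared equals the squared correlation
print(c(r = r, r_squared = r^2, lm_r_squared = summary(fit)$r.squared))
```

Calling cor() directly is a quicker way to answer the "how correlated are these two variables" question than fitting a regression, and it gives the sign of the relationship as well.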

Regression without FSRUNOUT

m3 <- lm(famE$FSLAST ~ . - famE$FSRUNOUT, data = famE)
summary(m3)
## 
## Call:
## lm(formula = famE$FSLAST ~ . - famE$FSRUNOUT, data = famE)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -2.65047 -0.00521  0.00202  0.01342  1.67849 
## 
## Coefficients: (1 not defined because of singularities)
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  4.676e-01  6.229e-02   7.507 6.23e-14 ***
## CURWRKN     -5.506e-03  2.553e-03  -2.157 0.031031 *  
## TELCELN             NA         NA      NA       NA    
## WRKCELN      3.623e-04  2.326e-04   1.557 0.119438    
## FLNGINTV     2.858e-04  2.864e-03   0.100 0.920507    
## FM_SIZE     -4.728e-03  2.662e-03  -1.776 0.075728 .  
## FM_KIDS      1.930e-03  3.069e-03   0.629 0.529547    
## FM_ELDR      1.237e-02  4.347e-03   2.845 0.004450 ** 
## FM_TYPE      3.021e-03  9.155e-03   0.330 0.741404    
## FM_STRCP     8.278e-03  5.733e-03   1.444 0.148755    
## FM_STRP     -8.485e-03  5.737e-03  -1.479 0.139181    
## FM_EDUC1     2.796e-04  2.189e-04   1.277 0.201627    
## FLAADLYN     1.760e-02  2.545e-02   0.692 0.489217    
## FLAADLCT    -3.675e-03  2.528e-02  -0.145 0.884436    
## FLIADLYN    -3.886e-02  2.160e-02  -1.799 0.072011 .  
## FLIADLCT    -7.477e-03  2.069e-02  -0.361 0.717836    
## FWKLIMYN     9.267e-03  1.013e-02   0.915 0.360174    
## FWKLIMCT     2.802e-03  9.573e-03   0.293 0.769744    
## FANYLYN      2.409e-02  8.091e-03   2.977 0.002912 ** 
## FANYLCT      2.063e-03  6.368e-03   0.324 0.745976    
## FHSTATPR    -2.700e-02  7.279e-03  -3.710 0.000208 ***
## FSRUNOUT     5.511e-01  3.696e-03 149.092  < 2e-16 ***
## FSBALANC     2.789e-01  4.147e-03  67.265  < 2e-16 ***
## FHICOVYN    -2.121e-03  2.918e-03  -0.727 0.467299    
## FMEDBILL     9.638e-03  2.665e-03   3.616 0.000299 ***
## FDGLWCT1     7.231e-03  3.382e-03   2.138 0.032547 *  
## FWRKLWCT     3.253e-03  2.372e-03   1.371 0.170317    
## FSALYN      -7.613e-03  2.821e-03  -2.699 0.006965 ** 
## FSALCT      -8.539e-04  2.787e-03  -0.306 0.759328    
## FSSRRYN     -8.459e-03  3.894e-03  -2.172 0.029846 *  
## FSSRRCT     -1.406e-02  5.062e-03  -2.778 0.005480 ** 
## FTANFYN      4.893e-03  4.199e-03   1.165 0.243974    
## FCHSPYN      1.631e-03  3.561e-03   0.458 0.646900    
## INCGRP4     -4.968e-05  7.284e-05  -0.682 0.495242    
## RAT_CAT4    -5.087e-06  9.275e-05  -0.055 0.956258    
## FSNAP        6.664e-03  2.768e-03   2.407 0.016080 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2239 on 30104 degrees of freedom
##   (15404 observations deleted due to missingness)
## Multiple R-squared:  0.7092, Adjusted R-squared:  0.7089 
## F-statistic:  2159 on 34 and 30104 DF,  p-value: < 2.2e-16

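In this regression, our R-squared dropped substantially, from about 0.70 to about 0.50. Simply removing FSRUNOUT from the model shows how strong its relationship with our dependent variable is. Note that the summary printed above is identical to the first model's and still lists FSRUNOUT: the term famE$FSRUNOUT in the formula does not match the column name FSRUNOUT, so the variable has to be removed as - FSRUNOUT for it to actually be dropped.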
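In a formula that uses `.`, a predictor is dropped only when the subtracted term is written as the bare column name. The following is a minimal sketch on made-up data comparing the two spellings; the data frame `toy` and its columns `a`, `b`, and `y` are hypothetical.

```r
# Hypothetical toy data (not the survey file)
set.seed(7)
toy <- data.frame(a = rnorm(100), b = rnorm(100))
toy$y <- 3 * toy$a + 0.5 * toy$b + rnorm(100)

full      <- lm(y ~ ., data = toy)          # predictors: a and b
no_effect <- lm(y ~ . - toy$a, data = toy)  # toy$a matches no term, so a stays in
dropped   <- lm(y ~ . - a, data = toy)      # the bare name removes a

print(names(coef(full)))       # "(Intercept)" "a" "b"
print(names(coef(no_effect)))  # same terms as the full model
print(names(coef(dropped)))    # "(Intercept)" "b"

# Compare how much variance each model explains
print(c(full = summary(full)$r.squared, dropped = summary(dropped)$r.squared))
```

Comparing the coefficient names before reading the R-squared is a cheap check that the intended variable really left the model.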

Bar Chart

library(lattice)
fslast <- table(famE$FSLAST)
fslast
## 
##     1     2     3 
##  1723  4530 39290
barchart(fslast, ylab = "The food did not last and there wasn't enough money to buy more.")

This bar chart shows the breakdown of responses for the dependent variable: 1 = Often true, 2 = Sometimes true, 3 = Never true.
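The numeric codes can also be replaced with their meanings before plotting, so the chart is readable on its own. This is a minimal sketch that reuses the three counts from the table output above rather than reading the survey file; the object names `counts` and `shares` are made up for illustration.

```r
library(lattice)

# Counts copied from the table(famE$FSLAST) output above, in code order 1, 2, 3
counts <- c(1723, 4530, 39290)

# Attach the response wording to each code
names(counts) <- c("Often true", "Sometimes true", "Never true")

# Percentage of responses in each category
shares <- round(100 * counts / sum(counts), 1)
print(shares)

# Same kind of chart as before, now with labelled bars
p <- barchart(as.table(counts), xlab = "Count",
              main = "The food did not last and there wasn't enough money to buy more")
print(p)
```

Reporting the percentages alongside the chart makes the imbalance between the three responses easier to state in the text.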