## id year divorce unemployed femlab marriage birth military
## 1 1 1920 8.0 5.2 22.70 92.0 117.9 3.2247
## 2 2 1921 7.2 11.7 22.79 83.0 119.8 3.5614
## 3 3 1922 6.6 6.7 22.88 79.7 111.2 2.4553
## 4 4 1923 7.1 2.4 22.97 85.2 110.5 2.2065
## 5 5 1924 7.2 5.0 23.06 80.3 110.9 2.2889
## 6 6 1925 7.2 3.2 23.15 79.2 106.6 2.1735
## [1] 77 8
## [1] "id" "year" "divorce" "unemployed" "femlab"
## [6] "marriage" "birth" "military"
Year: the year from 1920-1996
Divorce: divorce per 1000 women aged 15 or more (response variable)
Unemployed: unemployment rate (explanatory variable)
Femlab: percent female participation in labor force aged 16+ (explanatory variable)
Marriage: marriages per 1000 unmarried women aged 16+ (explanatory variable)
Birth: births per 1000 women aged 15-44 (explanatory variable)
Military: military personnel per 1000 population (explanatory variable)
##
## Call:
## lm(formula = divorce ~ . - id, data = divusa)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.9087 -0.9212 -0.0935 0.7447 3.4689
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 380.14761 99.20371 3.832 0.000274 ***
## year -0.20312 0.05333 -3.809 0.000297 ***
## unemployed -0.04933 0.05378 -0.917 0.362171
## femlab 0.80793 0.11487 7.033 1.09e-09 ***
## marriage 0.14977 0.02382 6.287 2.42e-08 ***
## birth -0.11695 0.01470 -7.957 2.19e-11 ***
## military -0.04276 0.01372 -3.117 0.002652 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.513 on 70 degrees of freedom
## Multiple R-squared: 0.9344, Adjusted R-squared: 0.9288
## F-statistic: 166.2 on 6 and 70 DF, p-value: < 2.2e-16
Based on the residual plot, there appears to be some pattern as the residuals are completely random/scattered. All variables except ‘unemployed’ are expected to be significant under this model.
##
## Durbin-Watson test
##
## data: divorce ~ . - id
## DW = 0.37429, p-value < 2.2e-16
## alternative hypothesis: true autocorrelation is greater than 0
##
## Call:
## lm(formula = divorce ~ . - id, data = divusa_auto)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.07659 -0.38330 -0.02652 0.30530 3.06192
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -54.060448 61.056815 -0.885 0.3791
## year 0.030428 0.032833 0.927 0.3573
## unemployed -0.053001 0.027008 -1.962 0.0538 .
## femlab -0.007163 0.083131 -0.086 0.9316
## marriage 0.026072 0.014850 1.756 0.0836 .
## birth -0.048301 0.008798 -5.490 6.45e-07 ***
## military 0.014675 0.008103 1.811 0.0746 .
## LAST_YEAR_DIV 0.808416 0.055674 14.521 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7579 on 68 degrees of freedom
## (1 observation deleted due to missingness)
## Multiple R-squared: 0.9838, Adjusted R-squared: 0.9822
## F-statistic: 590.7 on 7 and 68 DF, p-value: < 2.2e-16
This second residual plot features smaller residuals and a weaker pattern especially in the lower half of the dataset. Now only the ‘birth’ variable is significant.
## Generalized least squares fit by maximum likelihood
## Model: divorce ~ . - id
## Data: divusa
## AIC BIC logLik
## 175.7777 196.872 -78.88886
##
## Correlation Structure: AR(1)
## Formula: ~year
## Parameter estimate(s):
## Phi
## 0.9682691
##
## Coefficients:
## Value Std.Error t-value p-value
## (Intercept) -453.4103 184.44766 -2.458206 0.0164
## year 0.2339 0.09671 2.418909 0.0182
## unemployed 0.0955 0.04479 2.132264 0.0365
## femlab -0.0167 0.16636 -0.100121 0.9205
## marriage 0.1573 0.02238 7.025867 0.0000
## birth -0.0312 0.02274 -1.373247 0.1741
## military 0.0444 0.01777 2.497031 0.0149
##
## Correlation:
## (Intr) year unmply femlab marrig birth
## year -1.000
## unemployed 0.102 -0.114
## femlab 0.830 -0.843 0.226
## marriage 0.119 -0.134 0.612 0.279
## birth -0.362 0.352 -0.001 -0.256 -0.134
## military -0.637 0.636 0.266 -0.657 0.326 0.312
##
## Standardized residuals:
## Min Q1 Med Q3 Max
## -1.9802473 -1.3946446 -0.4085795 0.5871811 1.9777468
##
## Residual standard error: 2.649147
## Degrees of freedom: 77 total; 70 residual
## Approximate 95% confidence intervals
##
## Correlation structure:
## lower est. upper
## Phi 0.8692724 0.9682691 0.9925951
## attr(,"label")
## [1] "Correlation structure:"
##
## Residual standard error:
## lower est. upper
## 1.291057 2.649147 5.435842
The gls function detects significant autocorrelation as indicated by Phi = 0.968. The interval does not come close to overlaping with a value of zero. Also, the significant variables have also changed relative to a linear model ignoring autocorrelation. All but the two variables ‘femlab’ and ‘birth’ are significant, with ‘marriage’ being the most significant.
Correlation in the errors may be due to the overall larger trends that dwarf the impact of the explanatory variables. In the initial model, there were patterns in the residuals in the latter half of the 20th century, owing to the significant changes starting in the late 1960’s. The dataset does mention marriage rate, but since these are annual statistics it does not capture the time between marriage and divorce.