We have used the swiss data before, but I couldn’t remember all of its variables, so the first thing I did was review its structure:

str(swiss)
## 'data.frame':    47 obs. of  6 variables:
##  $ Fertility       : num  80.2 83.1 92.5 85.8 76.9 76.1 83.8 92.4 82.4 82.9 ...
##  $ Agriculture     : num  17 45.1 39.7 36.5 43.5 35.3 70.2 67.8 53.3 45.2 ...
##  $ Examination     : int  15 6 5 12 17 9 16 14 12 16 ...
##  $ Education       : int  12 9 5 7 15 7 7 8 7 13 ...
##  $ Catholic        : num  9.96 84.84 93.4 33.77 5.16 ...
##  $ Infant.Mortality: num  22.2 22.2 20.2 20.3 20.6 26.6 23.6 24.9 21 24.4 ...

Then I looked at the correlations between all pairs of variables to get a sense of which two would be best to analyze:

cor(swiss)
##                   Fertility Agriculture Examination   Education   Catholic
## Fertility         1.0000000  0.35307918  -0.6458827 -0.66378886  0.4636847
## Agriculture       0.3530792  1.00000000  -0.6865422 -0.63952252  0.4010951
## Examination      -0.6458827 -0.68654221   1.0000000  0.69841530 -0.5727418
## Education        -0.6637889 -0.63952252   0.6984153  1.00000000 -0.1538589
## Catholic          0.4636847  0.40109505  -0.5727418 -0.15385892  1.0000000
## Infant.Mortality  0.4165560 -0.06085861  -0.1140216 -0.09932185  0.1754959
##                  Infant.Mortality
## Fertility              0.41655603
## Agriculture           -0.06085861
## Examination           -0.11402160
## Education             -0.09932185
## Catholic               0.17549591
## Infant.Mortality       1.00000000
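
Since the full matrix wraps across two blocks, it can be easier to pull out only the Fertility row and sort it by strength. This is just a minimal sketch; the names fert_cor and ranked are my own and are not part of the data set.

```r
# Correlations of every variable with Fertility (swiss ships with base R).
fert_cor <- cor(swiss)["Fertility", ]

# Drop Fertility's correlation with itself, which is always 1.
fert_cor <- fert_cor[names(fert_cor) != "Fertility"]

# Sort by absolute value so strong negative correlations rank high too.
ranked <- fert_cor[order(abs(fert_cor), decreasing = TRUE)]
print(round(ranked, 3))
```

Sorting on abs() matters here because the two strongest relationships with Fertility are both negative.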

The description of the data set (see ?swiss) tells us that Switzerland, in 1888, was entering a period known as the demographic transition; i.e., its fertility was beginning to fall from the high level typical of underdeveloped countries. Therefore, it seems best to focus on the relationship between Fertility and one other variable. The two strongest correlations with Fertility are Education (-0.66) and Examination (-0.65), both negative. My next step was to check whether either relationship appeared to be linear.

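# Scatterplot of Fertility against Examination with the least-squares line
plot(Fertility ~ Examination, data = swiss)
abline(lm(Fertility ~ Examination, data = swiss))

# Same plot for Fertility against Education
plot(Fertility ~ Education, data = swiss)
abline(lm(Fertility ~ Education, data = swiss))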

The Fertility and Examination plot appears to be much more linear than the Fertility and Education plot, so I will use Fertility and Examination in my regression analysis.
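
Judging linearity by eye is subjective, so one possible check is to overlay a lowess smoother on each scatterplot and see how closely it follows the fitted line. This is only a sketch of that idea, not part of the original analysis; the helper name check_linearity is hypothetical.

```r
# Hypothetical helper: scatterplot, least-squares line (solid),
# and lowess smoother (dashed) for Fertility against one predictor.
check_linearity <- function(predictor) {
  x <- swiss[[predictor]]
  y <- swiss$Fertility
  plot(x, y, xlab = predictor, ylab = "Fertility")
  abline(lm(y ~ x))
  smooth <- lowess(x, y)
  lines(smooth, lty = 2)
  invisible(smooth)  # return the smoothed points without printing them
}

exam_smooth <- check_linearity("Examination")
educ_smooth <- check_linearity("Education")
```

If the dashed curve bends noticeably away from the solid line, a straight-line model is a weaker description of that relationship.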

Discussion7 <- lm(swiss$Fertility ~ swiss$Examination)
summary(Discussion7)
## 
## Call:
## lm(formula = swiss$Fertility ~ swiss$Examination)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -25.9375  -6.0044  -0.3393   7.9239  19.7399 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)        86.8185     3.2576  26.651  < 2e-16 ***
## swiss$Examination  -1.0113     0.1782  -5.675 9.45e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 9.642 on 45 degrees of freedom
## Multiple R-squared:  0.4172, Adjusted R-squared:  0.4042 
## F-statistic: 32.21 on 1 and 45 DF,  p-value: 9.45e-07

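The p-value for the Examination slope (9.45e-07) is well below 0.05, so the relationship between the two variables is statistically significant. The estimated slope of -1.0113 means that each one-unit increase in Examination is associated with a decrease of about 1.01 in Fertility. However, the R-squared is only 0.4172, which means that only about 42% of the variability in Fertility is explained by Examination.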
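
With a single predictor, the R-squared is simply the square of the correlation from the matrix above: (-0.6458827)^2 is roughly 0.4172. The sketch below checks this and also asks for a 95% confidence interval on the slope, which was not part of the original output; the names fit, r and slope_ci are my own.

```r
# Refit the same model using the data argument for shorter coefficient names.
fit <- lm(Fertility ~ Examination, data = swiss)

# In simple linear regression, R-squared equals the squared correlation.
r <- cor(swiss$Fertility, swiss$Examination)
r_squared <- summary(fit)$r.squared
print(all.equal(r^2, r_squared))

# 95% confidence interval for the Examination slope.
slope_ci <- confint(fit, "Examination", level = 0.95)
print(slope_ci)
```

An interval that excludes zero tells the same story as the small p-value, while also showing how precisely the slope is estimated.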