2025-03-24

Swiss - Data Description

The “swiss” dataset contains a standardized fertility measure and socioeconomic indicators for 47 French-speaking provinces of Switzerland. The time frame for this data is an average from 1887 to 1889. The data frame has 47 observations, one for every province, on 6 variables, each of which is a percentage measured from 0 to 100. All variables but Fertility give proportions of the population.
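As a quick check of the description above, the following sketch inspects the data frame directly. It assumes only base R, where swiss ships with the datasets package; the variable names dims and value_range are introduced here for illustration.

```r
# swiss ships with base R's datasets package, so no download is needed
data(swiss)

# Number of rows (provinces) and columns (variables)
dims <- dim(swiss)
print(dims)

# Column names and types
str(swiss)

# Smallest and largest value across all variables,
# to confirm everything sits inside the percent range
value_range <- range(as.matrix(swiss))
print(value_range)
```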

Each Variable of the swiss data frame and its description

  • [1] Fertility ‘common standardized fertility measure’.
  • [2] Agriculture % of males involved in agriculture as occupation.
  • [3] Examination % draftees receiving highest mark on army examination.
  • [4] Education % education beyond primary school for draftees.
  • [5] Catholic % ‘catholic’ (as opposed to ‘protestant’).
  • [6] Infant.Mortality % of live births who live less than 1 year.

Examination Score vs Education with a regression line

We can see there is a positive correlation between the Examination and Education of the draftees. Most of the data is grouped close to the line, with only one major outlier.
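The code behind this plot is not part of the slides, so the following is only a sketch of how such a figure could be produced with base R graphics. The axis labels, colors, and the names fit_edu and slope_edu are assumptions made for illustration.

```r
data(swiss)

# Simple linear regression of Education on Examination
fit_edu <- lm(Education ~ Examination, data = swiss)

# Scatter plot with the fitted line drawn on top
plot(Education ~ Examination, data = swiss,
     xlab = "Examination (% draftees with highest mark)",
     ylab = "Education (% beyond primary school)",
     main = "Examination Score vs Education")
abline(fit_edu, col = "red", lwd = 2)

# Slope of the fitted line; a positive value matches the upward trend
slope_edu <- unname(coef(fit_edu)["Examination"])
print(slope_edu)
```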

Examination Score vs Fertility with a regression line

We can see there is a negative correlation between the Examination scores of the draftees and the Fertility of their provinces. Most of the data is grouped somewhat close to the line, which has a downward trajectory.
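To put a number on the direction and strength of both relationships, one option is Pearson's correlation coefficient together with the fitted slope. This is a sketch using base R only; the names r_edu, r_fert, fit_fert, and slope_fert are illustrative.

```r
data(swiss)

# Pearson correlation of Examination with each of the two other variables
r_edu  <- cor(swiss$Examination, swiss$Education)
r_fert <- cor(swiss$Examination, swiss$Fertility)

# Regression of Fertility on Examination; the sign of the slope
# gives the direction of the trend line
fit_fert   <- lm(Fertility ~ Examination, data = swiss)
slope_fert <- unname(coef(fit_fert)["Examination"])

print(round(c(r_edu = r_edu, r_fert = r_fert, slope_fert = slope_fert), 3))
```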

Plotting the relationship of Examination (x), Education (y) and Fertility (z)

This 3-D plot shows four variables: Examination on the x-axis, Education on the y-axis, Fertility on the z-axis, and Catholic as the color of each point. Redder points mark more Catholic provinces and bluer points mark less Catholic ones.
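The slides do not show how the 3-D figure was built. One possible way, sketched here under the assumption that the plotly package is installed, maps Catholic to a blue-to-red color scale. The guard around the call and the name fig are illustrative choices, not the author's code.

```r
data(swiss)

# Build the figure only if plotly is available (assumption: it may not be installed)
fig <- NULL
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::plot_ly(
    data   = swiss,
    x      = ~Examination,
    y      = ~Education,
    z      = ~Fertility,
    color  = ~Catholic,                 # fourth variable shown as point color
    colors = c("#0000FF", "#FF0000"),   # blue = less Catholic, red = more Catholic
    type   = "scatter3d",
    mode   = "markers"
  )
}

# Printing the object renders the interactive plot in a slide or viewer
fig
```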

Pie chart showing how Catholic the provinces are

This pie chart shows the percentage of provinces that were not very Catholic, somewhat Catholic, and very Catholic. What you can see in the data is that most provinces were either not very Catholic or very Catholic, with only a small percentage of provinces in between, where the population was split.
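The cut points used to define the three groups are not given in the slides. The sketch below assumes thresholds of 33% and 66% purely for illustration and uses base R's cut() and pie(); the names catholic_level, level_counts, and level_pct are hypothetical.

```r
data(swiss)

# Bin the Catholic percentage into three levels.
# The 33 / 66 thresholds are an assumption, not taken from the slides.
catholic_level <- cut(
  swiss$Catholic,
  breaks = c(0, 33, 66, 100),
  labels = c("Not very Catholic", "Somewhat Catholic", "Very Catholic"),
  include.lowest = TRUE
)

# Number and percentage of provinces in each level
level_counts <- table(catholic_level)
level_pct    <- round(100 * prop.table(level_counts), 1)

# Pie chart labelled with the share of provinces per level
pie(as.vector(level_counts),
    labels = paste0(names(level_counts), " (", level_pct, "%)"),
    main   = "Provinces by share of Catholic population")
```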

Statistical Analysis

Below is the code for a summary of all 6 variables. Note that summary() reports the mean alongside the five numbers (minimum, first quartile, median, third quartile, maximum). This gives us insight into the people of the provinces. The summary supports the graphs by communicating the range and general grouping of the data for each variable. The five number summary will be on the next slide.

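# knitr provides kable() for rendering tables in the slides
library(knitr)
# summary() returns min, quartiles, median, mean and max for every column
summary_table <- summary(swiss)
kable(summary_table,
      caption = "Five Number Summary Statistics of Swiss Dataset")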
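If only the five numbers are wanted, without the mean, base R's fivenum() is an alternative. This is a sketch; fivenum() reports Tukey's hinges, which can differ slightly from the quartiles that summary() prints, and the row labels and the name five_num are added here for readability.

```r
data(swiss)

# fivenum() returns minimum, lower hinge, median, upper hinge, maximum
five_num <- sapply(swiss, fivenum)
rownames(five_num) <- c("Min", "Lower hinge", "Median", "Upper hinge", "Max")

print(five_num)
```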

Five Number Summary

Five Number Summary Statistics of Swiss Dataset
Fertility       Agriculture     Examination     Education       Catholic          Infant.Mortality
Min.   :35.00   Min.   : 1.20   Min.   : 3.00   Min.   : 1.00   Min.   :  2.150   Min.   :10.80
1st Qu.:64.70   1st Qu.:35.90   1st Qu.:12.00   1st Qu.: 6.00   1st Qu.:  5.195   1st Qu.:18.15
Median :70.40   Median :54.10   Median :16.00   Median : 8.00   Median : 15.140   Median :20.00
Mean   :70.14   Mean   :50.66   Mean   :16.49   Mean   :10.98   Mean   : 41.144   Mean   :19.94
3rd Qu.:78.45   3rd Qu.:67.65   3rd Qu.:22.00   3rd Qu.:12.00   3rd Qu.: 93.125   3rd Qu.:21.70
Max.   :92.50   Max.   :89.70   Max.   :37.00   Max.   :53.00   Max.   :100.000   Max.   :26.60

Concluding thoughts

To conclude the slides, graphs, and data, we can leave with a few observations. When comparing Examination to Education, we saw a moderate positive linear correlation. When looking at Examination and Fertility, we saw another moderate correlation, this one with a negative slope. After those two graphs comes a third graph with four variables in play: Examination, Education, and Fertility are the axes, and the fourth variable, Catholic, dictates the color of the points in the 3-D graph. In this graph we can see a small upward trend and a trend of the points shifting from blue to red as you move up in Fertility along the z-axis. The next graph illustrates the proportion of provinces that have a certain percentage of people who are Catholic. Lastly, we have a five number summary showing the range of each variable and how the values are grouped, based on the first and third quartiles as well as the mean.