Topic 2: Descriptive Statistics


These are the solutions for Computer Lab 3.


1 Descriptive statistics and plots for a single variable

1.1

No solution required.

1.2

1.3

  1. The average happiness score for countries in 2019 was approximately 54.671. This is much lower than Australia’s average happiness score of 72.2.
  2. The median happiness score in 2019 was 55.1. This means that roughly half of the countries had a happiness score below 55.1 in 2019, while roughly half had a happiness score above 55.1.
  3. The minimum and maximum average happiness scores in 2019 were 25.7 and 78.1, respectively. The \(25\%\), \(50\%\) and \(75\%\) quantiles were 47.0, 55.1, and 62.3 respectively. Note that the \(50\%\) quantile is equivalent to the median value. Based on these results, we can see that Australia falls in the top quartile, i.e. Australia’s average happiness score in 2019 is greater than that of (at least) \(75\%\) of the countries surveyed. We can also see that the scores are bunched up around the median value.
  4. We can calculate the IQR by hand using \(\text{IQR} = Q_3 - Q_1 = 62.3 - 47.0 = 15.3\). Alternatively, we can read the IQR value of 15.3 from the table in jamovi. Just as we noticed in (3) above, the IQR is rather narrow, with the middle \(50\%\) of the values bunched up around the median value.
  5. For this question, it is unlikely you will find the exact percentile for Australia (let us know if you think you have!). However, by trial and error, we find that, for example, the \(90^{th}\) and \(95^{th}\) percentiles are 70.9 and 72.95 respectively. Remembering that Australia’s 2019 score was 72.2, we can see that this falls somewhere between the \(90^{th}\) and \(95^{th}\) percentiles. It may be somewhat surprising to find that the average happiness level in Australia in 2019 was higher than in over \(90\%\) of the world’s countries.
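The calculations in this list can also be sketched outside jamovi. The following is a minimal Python illustration, not the method used in the lab: the ten scores are made-up values (not the lab data set), `percent_below` is a hypothetical helper, and the quantile interpolation rule is an assumption that may differ slightly from the one jamovi uses.

```python
import statistics

# Hypothetical happiness scores for ten countries (NOT the lab data set).
scores = [25.7, 41.0, 47.0, 52.3, 55.1, 57.8, 62.3, 66.0, 72.2, 78.1]

mean_score = statistics.mean(scores)
median_score = statistics.median(scores)

# Quartiles: the three cut points that split the data into four parts.
# method="inclusive" is an assumption; other software may interpolate differently.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

# Interquartile range: the width of the middle 50% of the data.
iqr = q3 - q1

def percent_below(values, x):
    """Hypothetical helper: percentage of values strictly below x."""
    return 100 * sum(v < x for v in values) / len(values)

print("mean:", round(mean_score, 3))
print("median:", median_score)
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", round(iqr, 3))
print("percent of scores below 72.2:", percent_below(scores, 72.2))
```

In this sketch the second quartile coincides with the median, mirroring the note in (3), and `percent_below` gives a rough percentile position for a single score without trial and error.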

1.4

1.5

Histograms

Boxplots

1.6

The GDP per capita data is clearly positively skewed, with some large positive outliers. In comparison, the happiness score data looks almost symmetrical. Based on these observations, we conclude the following.

For happiness: Since the distribution of the data appears to be (roughly) symmetric, it makes more sense to use the mean and standard deviation as appropriate measures of location and spread respectively, rather than the median and IQR.

For income: Since the distribution of the data appears to be highly skewed, it makes sense to use the median and IQR as appropriate measures of location and spread respectively.
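To see why skewness matters for this choice, the sketch below compares the summary measures for a small data set before and after one large value is added. The numbers are made up for illustration (they are not the GDP per capita data), and `summarise` is a hypothetical helper.

```python
import statistics

# Hypothetical incomes in arbitrary units (NOT the lab data set).
incomes = [2, 3, 3, 4, 5, 6, 8]
# The same values plus one large positive outlier, giving a positive skew.
incomes_with_outlier = incomes + [120]

def summarise(values):
    """Hypothetical helper: location and spread measures for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),       # sample standard deviation
        "median": statistics.median(values),
        "iqr": q3 - q1,
    }

base = summarise(incomes)
skewed = summarise(incomes_with_outlier)

for name in ("mean", "sd", "median", "iqr"):
    print(f"{name}: {base[name]:.2f} -> {skewed[name]:.2f}")
```

With these made-up numbers, the single outlier shifts the mean and standard deviation substantially, while the median and IQR barely move. This is the sense in which the median and IQR are the safer summaries for skewed data.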

2 Descriptive statistics and plots to assess the relationship between two variables

2.1

The covariance between average income and average happiness for 2019 is 157188. The size of this number is hard to interpret, because it depends on the units in which the two variables are measured. Its sign, however, tells us that the relationship between income and happiness score is positive (which is not too surprising!).
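As a rough illustration of why the magnitude of a covariance is hard to interpret, the sketch below computes a sample covariance for the same made-up incomes expressed in dollars and in thousands of dollars. The data are hypothetical, `sample_covariance` is an illustrative helper, and the \(n - 1\) divisor is an assumption about what the software reports.

```python
def sample_covariance(x, y):
    """Sample covariance with an n - 1 divisor (assumed convention)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)

# Hypothetical values for five countries (NOT the lab data set).
income_dollars = [1000, 5000, 12000, 30000, 55000]
happiness = [40.0, 48.0, 55.0, 65.0, 72.0]

# The same incomes, re-expressed in thousands of dollars.
income_thousands = [v / 1000 for v in income_dollars]

cov_dollars = sample_covariance(income_dollars, happiness)
cov_thousands = sample_covariance(income_thousands, happiness)

print("covariance (dollars):  ", cov_dollars)
print("covariance (thousands):", cov_thousands)
```

Changing the units rescales the covariance by the same factor of 1000, yet the sign stays positive. Only the sign carries over between the two versions.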

2.2

The correlation coefficient between average income and average happiness for 2019 is roughly 0.739. We could describe this as a moderate to strong positive correlation, which intuitively makes sense - countries with higher average income tend to have higher average happiness scores.
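Unlike the covariance, the correlation coefficient does not depend on the units of measurement and always lies between -1 and 1. The sketch below illustrates this with made-up data (not the lab data set); `pearson_r` is a hypothetical helper that computes the usual Pearson correlation coefficient.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation scaled by the spread of each variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical values for five countries (NOT the lab data set).
income_dollars = [1000, 5000, 12000, 30000, 55000]
happiness = [40.0, 48.0, 55.0, 65.0, 72.0]
income_thousands = [v / 1000 for v in income_dollars]

r_dollars = pearson_r(income_dollars, happiness)
r_thousands = pearson_r(income_thousands, happiness)

print("r (income in dollars):  ", round(r_dollars, 3))
print("r (income in thousands):", round(r_thousands, 3))
```

Both versions give the same value of r, which is why a figure such as 0.739 can be judged on a fixed scale (weak, moderate, strong) regardless of how income is measured.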

2.3

It may be surprising that, at quite low values of average income per person, the average happiness scores vary greatly, with some values even above the mean and median happiness scores.

It is also worth noting that an increase in income appears to offer diminishing returns with respect to happiness once a certain income level is reached - for example, some of the points in the top right of the graph lie below those to their left.

3 Extension

Check with your lab demonstrator if you would like to discuss your results for this question.


That’s everything for now! If there were any parts you were unsure about, take a look back over the relevant sections of the Topic 2 material.



These notes have been prepared by Amanda Shaker and Rupert Kuveke. The copyright for the material in these notes resides with the authors named above, with the Department of Mathematical and Physical Sciences and with La Trobe University. Copyright in this work is vested in La Trobe University including all La Trobe University branding and naming. Unless otherwise stated, material within this work is licensed under a Creative Commons Attribution-Non Commercial-Non Derivatives License BY-NC-ND.