Introduction

This worksheet gives you an opportunity to show your understanding of basic data wrangling and visualisation using the World Values Survey (WVS) data. Your aim is to wrangle and graph data for “life satisfaction” across the countries in the WVS. Ultimately, we are interested to know: Which countries have the highest and lowest life satisfaction?

Please write your answers and code in an R notebook and submit it as either a .pdf or .rmd (notebook) file in the Moodle submission point.

Task 1

Use your coding skills to load the necessary packages: “tidyverse”, “mosaic”, “ggplot2”, “moments”, “PB130”, and “sjlabelled”.

Task 2

Now write the code that loads the dataframe into RStudio. As always, the WVS data can be accessed here: https://github.com/thomcurran/PB130/raw/master/MT3/Workshop/WV6_Data_R_v20180912.rds. Download the file and save it to a logical place where it can be easily located. I would recommend that you save it in your PBS folder under a subfolder called Worksheets. Now load the file by clicking on the dataframe name in the files window where you saved the WVS data, change the name to WVS, and click OK. The code will appear in the console. Just copy and paste that code into your R chunk on the worksheet. Don’t forget to convert the country variable (V2A) to its character form. For example:

WVS <- readRDS("C:/Documents/PB130/MT4/Workshop/WV6_Data_R_v20180912.rds")

WVS$V2A <- as_character(WVS$V2A)

Task 3

Select the necessary variables from the WVS dataframe. In this case, these are the country variable and the variable for life satisfaction (“Satisfaction with your life”; V23). See the WVS codebook for the specific names of these variables here: https://github.com/thomcurran/PB130/blob/master/MT3/Workshop/F00007761-WV6_Codebook_v20180912.pdf
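As a rough illustration of column selection, the sketch below uses a small made-up dataframe rather than the real WVS file. The object names toy_wvs and toy_selected, the extra column V4, and all the values are hypothetical; only V2A and V23 come from the task description.

```r
library(dplyr)

# Hypothetical stand-in for the WVS dataframe (invented values, not survey data)
toy_wvs <- data.frame(
  V2A = c("Country A", "Country A", "Country B", "Country B"),
  V23 = c(7, -1, 9, 4),
  V4  = c(1, 2, 1, 3),
  stringsAsFactors = FALSE
)

# Keep only the country and life-satisfaction columns
toy_selected <- toy_wvs %>%
  select(V2A, V23)

names(toy_selected)
```

The same select() call should carry over to the real dataframe once it has been loaded as WVS.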

Task 4

Filter the selected variables for values of relevance (i.e., only those values that include responses to the life satisfaction scale). See the WVS codebook for the specific values of the variables.
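One possible shape for this step is sketched below on made-up data. It assumes that valid life-satisfaction answers run from 1 to 10 and that other codes (here negative numbers) mark non-responses; that range is an assumption to confirm against the codebook, and the names toy_selected and toy_valid are hypothetical.

```r
library(dplyr)

# Hypothetical selected data (invented values, not survey data)
toy_selected <- data.frame(
  V2A = c("Country A", "Country A", "Country B", "Country B", "Country B"),
  V23 = c(7, -1, 9, 4, -2),
  stringsAsFactors = FALSE
)

# Assumption: valid responses lie between 1 and 10; check the codebook
toy_valid <- toy_selected %>%
  filter(V23 >= 1, V23 <= 10)

toy_valid
```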

Task 5

Find the median, mean, standard deviation, skewness, and kurtosis of life satisfaction. Then, comment on the distributional properties of life satisfaction from this information.
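A minimal sketch of these summaries, computed on an invented vector of scores rather than the WVS data, might look like this. The vector life_sat and the object toy_summary are hypothetical. Note that kurtosis() from the moments package reports Pearson kurtosis, for which a normal distribution scores 3, not excess kurtosis.

```r
library(moments)

# Hypothetical life-satisfaction scores (invented, not survey data)
life_sat <- c(1, 3, 5, 6, 7, 7, 8, 8, 8, 9, 10, 10)

toy_summary <- data.frame(
  median   = median(life_sat),
  mean     = mean(life_sat),
  sd       = sd(life_sat),
  skewness = skewness(life_sat),  # negative values indicate a longer left tail
  kurtosis = kurtosis(life_sat)   # Pearson kurtosis; normal distribution = 3
)

toy_summary
```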

Task 6

Graph the distribution of life satisfaction in the whole dataset using a histogram. Add to this histogram a density curve and a vertical line to depict the mean value. Don’t forget to adjust the bin width if necessary. Then, after graphing the distribution, provide a commentary on the distributional properties of life satisfaction.
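The layering described here could be sketched as follows, again on invented scores. The dataframe toy, the bin width of 1, and the colours are illustrative assumptions, not requirements. The histogram is drawn on the density scale so that the density curve shares its y-axis.

```r
library(ggplot2)

# Hypothetical life-satisfaction scores (invented, not survey data)
toy <- data.frame(V23 = c(2, 4, 5, 6, 6, 7, 7, 7, 8, 8, 8, 8, 9, 9, 10))

p <- ggplot(toy, aes(x = V23)) +
  # Histogram scaled to density so it is comparable with the curve
  geom_histogram(aes(y = after_stat(density)), binwidth = 1,
                 colour = "black", fill = "grey80") +
  # Smoothed density curve
  geom_density(colour = "blue") +
  # Dashed vertical line at the mean
  geom_vline(xintercept = mean(toy$V23), colour = "red", linetype = "dashed") +
  labs(x = "Life satisfaction", y = "Density")

p
```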

Task 7

Graph the distribution of life satisfaction for one country of your choosing using a box plot. Add to the box plot the data points and a red dot to depict the mean. After graphing the distribution, provide a commentary on the distributional properties of life satisfaction.
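One way to combine a box plot, the raw points, and a mean marker is sketched below. The dataframe toy_country and its values are hypothetical; with real data you would first filter the dataframe down to your chosen country. Jittering the points sideways is a design choice that reduces overplotting on a 1 to 10 scale.

```r
library(ggplot2)

# Hypothetical scores for a single country (invented, not survey data)
toy_country <- data.frame(
  V2A = "Country A",
  V23 = c(3, 5, 6, 6, 7, 7, 8, 8, 9, 10)
)

p <- ggplot(toy_country, aes(x = V2A, y = V23)) +
  # Hide the box plot's own outlier markers; the jittered points show all data
  geom_boxplot(outlier.shape = NA) +
  # Individual responses, spread horizontally only
  geom_jitter(width = 0.1, height = 0, alpha = 0.5) +
  # Red dot at the mean
  stat_summary(fun = mean, geom = "point", colour = "red", size = 3) +
  labs(x = "Country", y = "Life satisfaction")

p
```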

Task 8

Going back to the all-country WVS dataframe, make a plot that ranks countries on mean life satisfaction from low to high, and then use the plot to answer the following questions:

Which country has the highest life satisfaction?

Which country has the lowest life satisfaction?

When doing this, attempt to make the plot as visually appealing as possible using customizations of your choosing.
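A bare-bones version of such a ranking plot, using three invented countries, might be built along these lines. The names toy_valid, country_means, and mean_sat are hypothetical, and the horizontal bar layout is only one of many reasonable choices.

```r
library(dplyr)
library(ggplot2)

# Hypothetical filtered data (invented values, not survey data)
toy_valid <- data.frame(
  V2A = rep(c("Country A", "Country B", "Country C"), each = 3),
  V23 = c(6, 7, 8, 3, 4, 5, 8, 9, 10),
  stringsAsFactors = FALSE
)

# Mean life satisfaction per country, sorted from low to high
country_means <- toy_valid %>%
  group_by(V2A) %>%
  summarise(mean_sat = mean(V23)) %>%
  arrange(mean_sat)

# reorder() makes the axis follow the ranking rather than the alphabet
p <- ggplot(country_means, aes(x = reorder(V2A, mean_sat), y = mean_sat)) +
  geom_col(fill = "steelblue") +
  coord_flip() +
  labs(x = "Country", y = "Mean life satisfaction")

p
```

In this toy example the first row of country_means is the lowest-ranked country and the last row is the highest.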