Take the WGI data set we discussed in class and do the tasks below.
Using dplyr:
Calculate how many free, partly free and not free countries there are in the data set.
Calculate the average value of Control of Corruption for free, partly free and not free countries.
Sort the rows of the data set by Political Stability in ascending order and report a) the 10 most stable countries; b) the 10 least stable countries.
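The three dplyr tasks above could be approached roughly as in the sketch below. It is only an illustration on a toy data frame: the names wgi_toy, fh_status, cc and ps are hypothetical, so substitute the actual object and column names from the WGI data set used in class.

```r
library(dplyr)

# Toy stand-in for the WGI data; all names and values here are hypothetical.
wgi_toy <- data.frame(
  country   = c("A", "B", "C", "D", "E", "F"),
  fh_status = c("free", "free", "partly free", "partly free",
                "not free", "not free"),
  cc        = c(1.5, 1.1, 0.2, -0.4, -1.0, -1.4),  # Control of Corruption
  ps        = c(1.2, 0.8, -0.1, 0.3, -1.5, -0.9)   # Political Stability
)

# Number of countries in each freedom category
status_counts <- wgi_toy %>% count(fh_status)

# Mean Control of Corruption within each category
cc_means <- wgi_toy %>%
  group_by(fh_status) %>%
  summarise(mean_cc = mean(cc, na.rm = TRUE))

# Ascending sort: the least stable countries come first, the most stable last
by_stability <- wgi_toy %>% arrange(ps)
least_stable <- head(by_stability, 2)  # use 10 with the real data
most_stable  <- tail(by_stability, 2)  # use 10 with the real data

print(status_counts)
print(cc_means)
print(least_stable)
print(most_stable)
```

Because the sort is ascending, head() returns the least stable countries and tail() the most stable ones; arrange(desc(ps)) would reverse that.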
Now evaluate the association between Regulatory Quality and Rule of Law. First, create a simple scatter plot of these variables. Add informative labels to the axes. Add a title. Interpret the graph: state the direction of the association between the variables (positive or negative) and its strength.
Add colors to this plot that correspond to free, partly free and not free countries. Use the code we discussed in class. Judging by this plot, decide for which type of countries the association between Regulatory Quality and Rule of Law is stronger.
Run the following line of code after the plot() function and look at the results.
# replace x and y with the names of the variables you plotted
# cex scales the text size (the default is 1)
text(x, y, labels = wgi$cnt_code, cex = 0.7)
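For orientation, here is one possible way to put the scatter plot, the colors and the text() call together in base R. It is a sketch on made-up data and not necessarily the code from class: wgi_toy, fh_status, rq and rl are hypothetical names, and the color choices are arbitrary.

```r
# Toy stand-in for the WGI data; column names and values are hypothetical.
wgi_toy <- data.frame(
  cnt_code  = c("AAA", "BBB", "CCC", "DDD", "EEE", "FFF"),
  fh_status = c("free", "free", "partly free", "partly free",
                "not free", "not free"),
  rq        = c(1.4, 0.9, 0.1, -0.2, -1.1, -1.6),  # Regulatory Quality
  rl        = c(1.6, 1.0, -0.1, -0.3, -0.9, -1.5)  # Rule of Law
)

# One color per freedom category, looked up by name for every row
status_cols <- c("free" = "darkgreen", "partly free" = "orange",
                 "not free" = "red")
point_cols  <- status_cols[as.character(wgi_toy$fh_status)]

# Scatter plot with axis labels, a title and colored points
plot(wgi_toy$rq, wgi_toy$rl,
     col = point_cols, pch = 19,
     xlab = "Regulatory Quality", ylab = "Rule of Law",
     main = "Rule of Law vs. Regulatory Quality")

# Country codes next to the points; pos = 3 places each label above its point
text(wgi_toy$rq, wgi_toy$rl, labels = wgi_toy$cnt_code, cex = 0.7, pos = 3)
legend("topleft", legend = names(status_cols), col = status_cols, pch = 19)

# Pearson correlation as a numeric check on direction and strength
r_all <- cor(wgi_toy$rq, wgi_toy$rl, use = "complete.obs")
print(r_all)
```

The correlation coefficient is only a complement to the visual interpretation: its sign gives the direction of the association and its absolute value gives a rough idea of the strength.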
Load the volunteers data set from Homework 4 using this link.
Create a contingency table for sex and volunteer. Can you tell from this table whether participation in volunteering depends on a person’s gender?
Test whether sex and volunteer are associated using a chi-squared test. State your conclusions.
Create a mosaic plot to visualize the contingency table. Is the plot clear and easy to interpret?
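The contingency table, the chi-squared test and the mosaic plot can be sketched in base R as follows. The data frame vol_toy and its counts are invented for illustration; only the variable names sex and volunteer come from the assignment, and the real data may code them differently.

```r
# Toy stand-in for the volunteers data; the counts are made up.
vol_toy <- data.frame(
  sex       = rep(c("female", "male"), times = c(60, 40)),
  volunteer = c(rep(c("yes", "no"), times = c(30, 30)),
                rep(c("yes", "no"), times = c(10, 30)))
)

# Contingency table: rows are sex, columns are volunteer
tab <- table(vol_toy$sex, vol_toy$volunteer, dnn = c("sex", "volunteer"))
print(tab)

# Row proportions make the two groups easier to compare than raw counts
print(prop.table(tab, margin = 1))

# Chi-squared test of independence
# (for a 2x2 table R applies Yates' continuity correction by default)
test <- chisq.test(tab)
print(test)
print(test$expected)  # expected counts under independence

# Mosaic plot of the same table
mosaicplot(tab, main = "Volunteering by sex", color = TRUE)
```

A small p-value suggests that the two variables are associated in these data; comparing the observed counts with test$expected shows where the table departs from independence.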