Variables:

http://www.parlgov.org/documentation/codebook/#party:

  • left/right: Castles/Mair 1983 (left/right), Huber/Inglehart 1995 (left/right), Benoit/Laver 2006 (left/right), CHES 2010 (lrgen 1999 and 2002 and 2006)

  • state/market: Benoit/Laver 2006 (taxes/spending), CHES 2010 (lrecon 1999 and 2002 and 2006)

  • liberty/authority: Benoit/Laver 2006 (social), CHES 2010 (galtan 1999 and 2002 and 2006)

  • EU anti/pro: Ray 1999 (pos96), Benoit/Laver 2006 (euauthority or eulargerstronger or eujoining), CHES 2010 (position 1999 and 2002 and 2006)

This data set describes political parties in governments across the European Union and associated democracies. You are working with a cleaned, pre-processed dataset.

Solve the problems below by answering the listed questions. You can solve the problems in any order, using any correct method.

Write your answer as an Rmd script that knits to an HTML file. Knit the solution with your comments in it and submit the HTML.

Problems solved

  • Problem 1. (3 points) - fully done, but I’m not sure
  • Problem 2. (2 points) - seems to be done, but again I’m not sure
  • Problem 3. (1 point) - fully done
  • Problem 4. (4 points) - fully done

Problem 1. (3 points)

All the parties are classified into families by their position in an economic (state/market) and a cultural (liberty/authority) left/right dimension. The classification leads to eight party family categories: Communist/Socialist, Green/Ecologist, Social democracy, Liberal, Christian democracy, Agrarian, Conservative, Right-wing.

Compare whether the party families (variable ‘family_name_short’) in these data differ on the left-right scale (variable ‘left_right’).

library(readr)
df <- read_csv("sem6_parlgov.csv")
summary(df)
##        X1         country_name_short country_name       party_name_short  
##  Min.   :   1.0   Length:1034        Length:1034        Length:1034       
##  1st Qu.: 259.2   Class :character   Class :character   Class :character  
##  Median : 517.5   Mode  :character   Mode  :character   Mode  :character  
##  Mean   : 517.5                                                           
##  3rd Qu.: 775.8                                                           
##  Max.   :1034.0                                                           
##  party_name_english family_name_short  family_name          left_right   
##  Length:1034        Length:1034        Length:1034        Min.   :0.000  
##  Class :character   Class :character   Class :character   1st Qu.:3.300  
##  Mode  :character   Mode  :character   Mode  :character   Median :6.000  
##                                                           Mean   :5.359  
##                                                           3rd Qu.:7.400  
##                                                           Max.   :9.825  
##   state_market    liberty_authority  eu_anti_pro       country_id   
##  Min.   :0.2143   Min.   :0.3338    Min.   : 0.000   Min.   : 1.00  
##  1st Qu.:3.5000   1st Qu.:3.5000    1st Qu.: 3.300   1st Qu.:23.00  
##  Median :5.7000   Median :4.5056    Median : 7.900   Median :41.00  
##  Mean   :4.9005   Mean   :5.2013    Mean   : 6.414   Mean   :40.41  
##  3rd Qu.:6.4000   3rd Qu.:7.0000    3rd Qu.: 8.300   3rd Qu.:60.00  
##  Max.   :9.4737   Max.   :9.7895    Max.   :10.000   Max.   :75.00  
##     party_id        family_id     EU_memb2000       
##  Min.   :   2.0   Min.   : 2.00   Length:1034       
##  1st Qu.: 657.2   1st Qu.: 6.00   Class :character  
##  Median :1425.0   Median :14.00   Mode  :character  
##  Mean   :1463.4   Mean   :16.23                     
##  3rd Qu.:2312.5   3rd Qu.:26.00                     
##  Max.   :2804.0   Max.   :40.00

1. Run a formal overall test to compare group means. Use a parametric test for this.

oneway.test(left_right ~ family_name_short, data = df, var.equal = TRUE)
## 
##  One-way analysis of means
## 
## data:  left_right and family_name_short
## F = 1472, num df = 7, denom df = 1026, p-value < 2.2e-16
aov <- aov(df$left_right ~ df$family_name_short)
summary(aov)
##                        Df Sum Sq Mean Sq F value Pr(>F)    
## df$family_name_short    7   5641   805.9    1472 <2e-16 ***
## Residuals            1026    562     0.5                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(aov)

Tukey <- TukeyHSD(aov)
plot(Tukey, las = 2)

2. Are there any pairs of party families that do not differ on the left-right dimension (p = 0.05)? If yes, name them.

No, every pair of party families differs in its mean on the left-right dimension: none of the Tukey HSD confidence intervals on the graph crosses the zero line.

3. What are the maximal and minimal mean values for a party family on the left-right scale? Name the party families, report the means and standard deviations (round to two digits after the point).

  • Liberal vs Christian democracy shows the smallest difference in means. Social democracy vs Ecologist and Liberal vs Agrarian are also close to the minimum.
  • Right-wing vs Social democracy and Right-wing vs Ecologist show the largest differences in means.
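
Question 3 asks for the mean and standard deviation of each family, not only for pairwise differences. A minimal sketch of how these could be tabulated is shown below. The helper `family_summary` and the toy data frame are illustrative assumptions, not part of the original solution; with the real data the call would be `family_summary(df)`.

```r
# Hypothetical helper: mean and SD of left_right per party family,
# rounded to two digits and sorted from the lowest to the highest mean.
family_summary <- function(data) {
  means <- tapply(data$left_right, data$family_name_short, mean, na.rm = TRUE)
  sds   <- tapply(data$left_right, data$family_name_short, sd, na.rm = TRUE)
  out <- data.frame(
    family = names(means),
    mean   = round(as.numeric(means), 2),
    sd     = round(as.numeric(sds), 2)
  )
  out[order(out$mean), ]
}

# Toy data for illustration only (not the ParlGov values)
toy <- data.frame(
  family_name_short = c("a", "a", "b", "b", "c", "c"),
  left_right        = c(1, 3, 5, 7, 8, 10)
)
res <- family_summary(toy)
res
```

The first and last rows of the result are the families with the minimal and maximal mean.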

Problem 2. (2 points)

There are several scales that differentiate between political parties in the data set: the left/right scale, the state/market scale, the liberty/authority, and the EU anti/pro scale. Do these scales measure similar or unrelated features?

1. Evaluate whether all the four scales are close to normal distribution. Use the values of skew and kurtosis, and make sure the distributions are bell-shaped to be considered normal. Report your decisions on all four scales.

library(ggplot2)
g1 <- ggplot(df, aes(x = left_right)) + 
  geom_density() +
  theme_bw()

g2 <- ggplot(df, aes(x = state_market)) + 
  geom_density() +
  labs(y="") +
  theme_bw()

g3 <- ggplot(df, aes(x = liberty_authority)) + 
  geom_density() +
  theme_bw()

g4 <- ggplot(df, aes(x = eu_anti_pro)) + 
  geom_density() +
  labs(y="") +
  theme_bw()

library(gridExtra)
grid.arrange(g1, g2, g3, g4, ncol=2, nrow = 2)

The variables are not normally distributed, so we should use Spearman’s correlation.
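
The question also asks for skew and kurtosis values, which the density plots alone do not provide. The sketch below computes moment-based skewness and excess kurtosis in base R; values close to 0 for both are what a normal distribution would give. The helper names and the toy vector are assumptions made for illustration.

```r
# Hypothetical helpers: moment-based skewness and excess kurtosis
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

excess_kurtosis <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2 - 3
}

# Toy vector for illustration only
x <- c(1, 2, 3, 4, 5)
c(skew = skewness(x), kurtosis = excess_kurtosis(x))

# With the real data (assuming df is already loaded):
# sapply(df[c("left_right", "state_market", "liberty_authority", "eu_anti_pro")],
#        function(v) c(skew = skewness(v), kurtosis = excess_kurtosis(v)))
```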

2. Use the proper method to calculate a correlation matrix between the four scales. Report the statistically significant correlations (p = 0.05). Name the direction and magnitude of relationships.

library(dplyr)
cor_data <- df %>% select("left_right", "state_market", "liberty_authority", "eu_anti_pro") %>% na.omit()
cor(cor_data, method = "spearman")
##                    left_right state_market liberty_authority eu_anti_pro
## left_right         1.00000000    0.7108134         0.8382975 -0.04270487
## state_market       0.71081336    1.0000000         0.4458473  0.42967947
## liberty_authority  0.83829748    0.4458473         1.0000000 -0.20591361
## eu_anti_pro       -0.04270487    0.4296795        -0.2059136  1.00000000
#df %>% select(left_right, state_market, liberty_authority, eu_anti_pro) %>% cor.test(method = "spearman")
#cor.test(cor_data, method = "spearman")
#library(corrplot)
#corrplot(cor_data, method="number")

#library(sjPlot)
#tab_corr(cor_data)

library(ggcorrplot)
cor(cor_data, method = "spearman") %>%
  round(2) %>%
  ggcorrplot(hc.order = TRUE, type = "upper", ggtheme = ggplot2::theme_bw, colors =c("darkturquoise", "white", "#E46726"))

We can see:

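  • a strong positive correlation of left/right with liberty/authority (0.84) and with state/market (0.71);
  • moderate positive correlations of state/market with liberty/authority (0.45) and with EU anti/pro (0.43);
  • a weak negative correlation between liberty/authority and EU anti/pro (-0.21);
  • practically no correlation between left/right and EU anti/pro (-0.04).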
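
The matrix from `cor()` gives no p-values, so it cannot show which correlations are significant at p = 0.05. One possible way to get them is to run `cor.test()` on every pair of scales, as sketched below. The helper `pairwise_spearman` and the toy data are assumptions for illustration; with the real data the call would be `pairwise_spearman(cor_data)`.

```r
# Hypothetical helper: Spearman's rho and p-value for every pair of columns
pairwise_spearman <- function(data) {
  pairs <- combn(names(data), 2, simplify = FALSE)
  rows <- lapply(pairs, function(p) {
    # exact = FALSE uses the asymptotic p-value, which also works with ties
    test <- cor.test(data[[p[1]]], data[[p[2]]],
                     method = "spearman", exact = FALSE)
    data.frame(var1 = p[1], var2 = p[2],
               rho = round(unname(test$estimate), 2),
               p_value = signif(test$p.value, 3))
  })
  do.call(rbind, rows)
}

# Toy data for illustration only
toy <- data.frame(
  x = 1:10,
  y = c(2, 1, 4, 3, 6, 5, 8, 7, 10, 9),
  z = c(5, 3, 8, 1, 10, 2, 7, 4, 9, 6)
)
res <- pairwise_spearman(toy)
res[res$p_value < 0.05, ]  # keep only the significant pairs
```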

Problem 3. (1 point)

There is a variable indicating whether the country was a member of the European Union in 2000 or not (‘EU_memb2000’).

6. Were the party families (variable ‘family_name’) equally represented in the EU members and non-members? Report a formal test to answer that question. If yes, report which party family was less likely to occur in the non-EU countries? If not, name the party family which was represented in the two groups in the least balanced way.

data_chi <- df %>% select("family_name", "EU_memb2000") %>% na.omit()
data_chi$EU_memb2000 <- as.factor(data_chi$EU_memb2000)
data_chi$family_name <- as.factor(data_chi$family_name)

chisq.test(table(data_chi$EU_memb2000, data_chi$family_name))
## 
##  Pearson's Chi-squared test
## 
## data:  table(data_chi$EU_memb2000, data_chi$family_name)
## X-squared = 17.267, df = 7, p-value = 0.01575
residuals(chisq.test(table(data_chi$EU_memb2000, data_chi$family_name)))
##      
##          Agrarian Christian democracy Communist/Socialist Conservative
##   No   0.76861787         -1.21931170         -2.37851961   0.40563696
##   Yes -0.66414172          1.05357395          2.05521385  -0.35049982
##      
##       Green/Ecologist     Liberal  Right-wing Social democracy
##   No      -0.04005708  0.71551578  0.71049194       0.98408811
##   Yes      0.03461223 -0.61825765 -0.61391668      -0.85032367

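The chi-squared test is significant (X-squared = 17.27, df = 7, p = 0.016), so the party families were not equally represented among EU members and non-members. The family represented in the least balanced way is Communist/Socialist: it has the largest absolute residuals (-2.38 for non-members and 2.06 for members), which means it occurs less often than expected in non-EU countries.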
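
The residuals printed above are Pearson residuals. As a complementary check, the `stdres` component of the `chisq.test()` result holds standardized residuals, which are approximately standard normal under independence. The sketch below uses a made-up table, not the ParlGov counts; for the real data the equivalent would be `chisq.test(table(data_chi$EU_memb2000, data_chi$family_name))$stdres`.

```r
# Toy contingency table for illustration only (not the ParlGov counts)
tab <- matrix(c(30, 10, 20,
                20, 45, 30),
              nrow = 2, byrow = TRUE,
              dimnames = list(EU = c("No", "Yes"), family = c("A", "B", "C")))

test <- chisq.test(tab)
std  <- test$stdres  # standardized residuals, same shape as the table

# Column (family) with the largest absolute standardized residual
worst <- colnames(std)[which.max(apply(abs(std), 2, max))]
std
worst
```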

Problem 4. (4 points)

You need to predict the party’s position on the state-market scale (variable ‘state_market’) where 0 means “the state should regulate the economy” and 10 means “the state should be minimal in the economy” (from Benoit/Laver 2006 and CHES 2010). Use the party’s left-right position and the stance towards the EU for this (variable ‘eu_anti_pro’, 0 means ‘totally against’ and 10 means ‘totally pro-EU’).

library(dplyr)
df1 <- df %>% select("state_market", "eu_anti_pro", "left_right")
df1_nona <- df1 %>% na.omit()

7. How much variance (in percent) in state_market can be explained with the party’s left-right position and the stance towards the EU in the model? (round to 0 digits after the point)

model1 <- lm(state_market ~ eu_anti_pro + left_right, data = df1)
summary(model1)
## 
## Call:
## lm(formula = state_market ~ eu_anti_pro + left_right, data = df1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.6273 -0.4513 -0.2293  0.3891  4.2486 
## 
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)    
## (Intercept) -0.397890   0.087677  -4.538 6.34e-06 ***
## eu_anti_pro  0.263986   0.009851  26.797  < 2e-16 ***
## left_right   0.672724   0.010472  64.238  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8247 on 1031 degrees of freedom
## Multiple R-squared:  0.8262, Adjusted R-squared:  0.8258 
## F-statistic:  2450 on 2 and 1031 DF,  p-value: < 2.2e-16

The model explains about 83% of the variance in state_market (multiple R-squared = 0.8262).

8. Is the relationship between the party’s left-right position and its state-market position dependent on its pro- or anti-EU position? Compare this model to the previous one. Is it significantly better?

model2 <- lm(state_market ~ left_right * eu_anti_pro, data = df1)
summary(model2)
## 
## Call:
## lm(formula = state_market ~ left_right * eu_anti_pro, data = df1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -4.5341 -0.3636 -0.0993  0.3198  4.3642 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            0.224848   0.138769   1.620    0.105    
## left_right             0.569351   0.020773  27.408  < 2e-16 ***
## eu_anti_pro            0.129336   0.025413   5.089 4.27e-07 ***
## left_right:eu_anti_pro 0.023057   0.004022   5.733 1.30e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8122 on 1030 degrees of freedom
## Multiple R-squared:  0.8316, Adjusted R-squared:  0.8311 
## F-statistic:  1695 on 3 and 1030 DF,  p-value: < 2.2e-16
anova(model1, model2)

Yes. The interaction term left_right:eu_anti_pro is significant (p = 1.30e-08), so the effect of the left-right position on state_market depends on the party’s stance towards the EU. Because the two models differ by a single term, the F test in anova(model1, model2) is equivalent to this t test, so model2 fits significantly better, although R-squared rises only from 0.826 to 0.832.

9. What are the standardized regression coefficients of the left-right and the eu_anti_pro predictors in the largest model? Interpret the bigger of the two coefficients.

library(lm.beta)
lm.beta(model2)
## 
## Call:
## lm(formula = state_market ~ left_right * eu_anti_pro, data = df1)
## 
## Standardized Coefficients::
##            (Intercept)             left_right            eu_anti_pro 
##              0.0000000              0.7060123              0.1704892 
## left_right:eu_anti_pro 
##              0.2324470

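The larger standardized coefficient belongs to the ‘left_right’ predictor (0.71, against 0.17 for ‘eu_anti_pro’), so it has the stronger effect on the outcome: a one-standard-deviation increase in ‘left_right’ is associated with an increase of about 0.71 standard deviations in ‘state_market’. Because the model contains an interaction term, this is the effect for a party with ‘eu_anti_pro’ equal to 0.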

10. Draw the interaction plot for the largest model.

  • What is the predicted score of a party on the ‘state-market’ scale if the party is strictly anti-EU and totally on the political left?

It is about 0.22, i.e. close to zero (red line, its left end): with both predictors at 0, the prediction equals the intercept of model2.

  • Which leftist parties, pro- or anti-EU, have a more relaxed view about the state regulation of economy?

Look at the left side of the graph and compare the red and blue lines. The blue (pro-EU) line is higher, so leftist pro-EU parties score higher on ‘state_market’ and are therefore more relaxed about state regulation of the economy than leftist anti-EU parties.

  • Looking at the graph, describe a party which would promote the largest market regulation in economy.

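Since 0 on the scale means “the state should regulate the economy”, the party promoting the most regulation is at the lowest point of the graph: the left end of the red line, i.e. a far-left, strictly anti-EU party. The highest point (right end of the blue line, a right-wing pro-EU party) corresponds to the least regulation.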

library(sjPlot)
#plot_model(model2)
plot_model(model2, type = "int")
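
The answers read off the interaction plot can be cross-checked numerically. The sketch below plugs the coefficients reported in summary(model2) into the regression equation for the four corner cases of the two predictors. The helper `predict_state_market` is an assumption made for illustration; with the fitted model in memory, `predict(model2, newdata = corners)` should give the same values up to rounding.

```r
# Coefficients copied from the summary(model2) output
b <- c(intercept = 0.224848, left_right = 0.569351,
       eu_anti_pro = 0.129336, interaction = 0.023057)

# Hypothetical helper: predicted state_market for given predictor values
predict_state_market <- function(left_right, eu_anti_pro) {
  unname(b["intercept"] +
           b["left_right"] * left_right +
           b["eu_anti_pro"] * eu_anti_pro +
           b["interaction"] * left_right * eu_anti_pro)
}

# Four corner cases: far left / far right crossed with anti-EU / pro-EU
corners <- expand.grid(left_right = c(0, 10), eu_anti_pro = c(0, 10))
corners$predicted <- predict_state_market(corners$left_right,
                                          corners$eu_anti_pro)
corners
```

The row with the lowest predicted value is the far-left, anti-EU case, and the row with the highest is the far-right, pro-EU case.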