Homework 1

Evaluating Modernization Theory

Author

Mateus Voltolini

Published

February 13, 2026

Overview

For this assignment, you are going to evaluate modernization theory as laid out in Seymour Martin Lipset’s classic article entitled “Some Social Requisites of Democracy: Economic Development and Political Legitimacy.” How classic is this article? According to Google Scholar, this piece has been cited more than 11.5 thousand times!

We are going to use data from V-Dem and modern data viz tools to explore Lipset’s hypothesis that economic modernization is highly correlated with democracy. We have already done this to some extent by looking at the relationship between wealth and the polyarchy score. But we are going to broaden things out by looking at other measures of modernization and democracy contained in the V-Dem dataset.

Before starting on this assignment, you will want to have a look at the V-Dem codebook. Look through the sections titled “V-Dem Democracy Indices” (section 2 of the codebook) and “Background Factors (E).” There are five democracy indicators, one of which is the polyarchy index. There are a number of background factors, many of which pertain to economic modernization. We are going to be looking at the relationship between these two sets of variables. In the codebook, you will also find a list of country names and codes, which will be useful for downloading and filtering the data.

Now have a look at “Some Social Requisites of Democracy” and in particular pay attention to the indicators in Table II and the discussion surrounding them. Think of each indicator (e.g. urbanization, education, etc.) as a sub-hypothesis of his theory. Which of these sub-hypotheses about modernization do you think is most compelling? Which would you like to test?

You have the option of doing this assignment in Posit Cloud or downloading the project folder and working locally. Either way you must submit your zipped project folder with the rendered HTML file included (see submission instructions at the end of the document) to Blackboard.

Step 1: Gather your data (20 pts)

Insert a code chunk below this paragraph and label it. Use the vdemdata package to download data for your analysis. Since we already looked at the polyarchy score and wealth in class, you need to use a different measure of democracy and a different background factor for your analysis. Use a select() verb to include country, year, region (e_regionpol_6C), at least one of the other four measures of democracy, and one background factor that is not per capita GDP. Store your data in an object called dem_data. Pipe in a mutate() verb and use case_match() to label the regions. Review module 3.1 if you are confused on how to do this.

library(vdemdata)
library(dplyr)


Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

dem_data <- vdem |>
  select(
    country = country_name,
    vdem_country_id = country_id,
    year,
    delibdem = v2x_delibdem,
    peaveduc = e_peaveduc,
    region = e_regionpol_6C
  ) |>
  mutate(
    region = case_match(
      region,
      1 ~ "Eastern Europe",
      2 ~ "Latin America",
      3 ~ "Middle East",
      4 ~ "Africa",
      5 ~ "The West",
      6 ~ "Asia"
    )
  )
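A quick check on a small made-up data frame can show how case_match() treats codes that are not listed in the mapping. This is only a sketch: the `toy` object and its values are hypothetical and are not V-Dem data. Codes without a match come back as NA, which is worth knowing before grouping by region later on.

```r
library(dplyr)

# Hypothetical region codes; 7 is deliberately outside the 1-6 range
toy <- data.frame(region_code = c(1, 5, 6, 7))

toy_labeled <- toy |>
  mutate(
    # codes that match no formula fall through to NA
    region = case_match(
      region_code,
      1 ~ "Eastern Europe",
      5 ~ "The West",
      6 ~ "Asia"
    )
  )

toy_labeled
```

Running count(dem_data, region) on the real data is one way to confirm that no unexpected NA labels were produced.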

Step 2: Make a line chart showing country trends (20 pts)

a) Insert a code chunk below this paragraph and label it. Filter your dem_data to include three or four countries at various levels of economic development and create a line chart of your democracy indicator. See the World Bank country classifications by income level to make your selections. Save the data as a new data frame called dem_data_line.

library(ggplot2)

dem_data_line <- dem_data |>
  filter(
    country %in% c(
      "United States of America",
      "Brazil",
      "India",
      "Ethiopia"
    ),
    year >= 1950
  )

glimpse(dem_data_line)

Rows: 300
Columns: 6
$ country         <chr> "Brazil", "Brazil", "Brazil", "Brazil", "Brazil", "Bra…
$ vdem_country_id <dbl> 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19, 19…
$ year            <dbl> 1950, 1951, 1952, 1953, 1954, 1955, 1956, 1957, 1958, …
$ delibdem        <dbl> 0.277, 0.274, 0.274, 0.275, 0.273, 0.279, 0.281, 0.282…
$ peaveduc        <dbl> 2.354, 2.427, 2.499, 2.572, 2.644, 2.717, 2.790, 2.862…
$ region          <chr> "Latin America", "Latin America", "Latin America", "La…

Note: From here on out I will expect you to know to add a code chunk and label it. So I won’t keep repeating that portion of the instructions.

b) Make a line chart using ggplot2. Be sure to specify x =, y = and color = in your aes() call and use geom_line() to create the chart. Add a colorblind-friendly color map using viridis. Now add appropriate axis labels, a title and a caption.

ggplot(dem_data_line, aes(x = year, y = delibdem, color = country)) +
  geom_line(linewidth = 1) +
  labs(
    x = "Year",
    y = "Deliberative Democracy",
    title = "Deliberative Democracy Across Countries",
    caption = "Source: V-Dem Dataset"
  ) +
  scale_color_viridis_d(option = "viridis") +
  theme_minimal()

c) In a few sentences, interpret your chart. Have the more developed countries achieved a higher level of democracy? Put your answer right below this line in markdown text.

Answer: The line chart shows that more economically developed countries tend to reach higher and more stable levels of deliberative democracy. The United States consistently scores above the other cases, while Brazil’s score rises steadily after democratization and approaches the U.S. level. By contrast, India and Ethiopia exhibit lower levels of democracy, with India showing a considerable drop since the early 2010s. Ethiopia has improved in recent years, which may reflect its considerable economic growth, but it still scores far below what Western democracies tend to score. The results are consistent with Lipset’s hypothesis, while also indicating that democratic performance can fluctuate even in relatively developed countries.

Step 3: Make a bar chart comparing regional levels (20 pts)

a) Going back to your original dem_data data frame, filter the data for a single year and then group by region and summarize your democracy indicator by mean. Save the new data in an object called bar_chart_data.

bar_chart_data <- dem_data |>
  filter(
    year == 2017
  ) |>
  group_by(region) |>
  summarize(
    delibdem = mean(delibdem, na.rm = TRUE)) |>
  mutate(
    region = reorder(region, delibdem)
  ) |>
  arrange(desc(delibdem))
bar_chart_data

# A tibble: 6 × 2
  region         delibdem
  <fct>             <dbl>
1 The West          0.785
2 Latin America     0.484
3 Eastern Europe    0.384
4 Asia              0.340
5 Africa            0.317
6 Middle East       0.190

b) Use ggplot() and geom_col() to create a bar chart showing levels of democracy across the regions with bar_chart_data. Make sure to add appropriate axis labels, a title and a caption. Change the fill color and add a theme to spruce it up a bit.

library(ggplot2)

ggplot(bar_chart_data, aes(x = region, y = delibdem)) +
  geom_col(fill = "cadetblue") +
  labs(
    x = "Region",
    y = "Mean Deliberative Democracy Index",
    title = "Average Levels of Deliberative Democracy by Region (2017)",
    caption = "Source: V-Dem Dataset"
  ) +
  theme_minimal()

c) Interpret your bar chart. Do you see evidence that more developed regions have higher levels of democracy?

Answer: The chart shows the regional gaps in average deliberative democracy in 2017. The West scores highest, followed by Latin America. Eastern Europe sits in the middle, with Asia and Africa a little below. The Middle East scores the lowest by a wide margin. This pattern is consistent with Lipset’s framework, in which regions associated with higher levels of economic development, such as The West, should exhibit higher democracy scores, while less modernized regions should score lower on average.

Step 4: Make a scatter plot (20 pts)

a) Start with the dem_data data frame again. Now take an average of multiple years using group_by() and summarize() to analyze. If you choose a recent period, make sure that the data are available. Some of the background variables in V-Dem are not entirely up to date. You can check the availability of the data by looking at the V-Dem codebook or using glimpse() or View() to look at your data. Save your data in a new object called dem_data_scatter.

dem_data_scatter <- dem_data |>
  filter(
    year >= 1950
  ) |>
  group_by(country, region) |>
  summarize(
    delibdem  = mean(delibdem,  na.rm = TRUE),
    peaveduc  = mean(peaveduc,  na.rm = TRUE)
  ) |>
  arrange(peaveduc)

`summarise()` has grouped output by 'country'. You can override using the
`.groups` argument.
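Besides glimpse() or View(), availability can also be checked programmatically by finding the most recent year with a non-missing value. The sketch below uses a made-up data frame; the `toy` object, its values and the `last_year` name are hypothetical stand-ins for a V-Dem background variable.

```r
library(dplyr)

# Made-up values standing in for a V-Dem background variable
toy <- data.frame(
  year     = c(2008, 2009, 2010, 2011, 2012),
  peaveduc = c(7.1, 7.2, 7.3, NA, NA)
)

# Most recent year that still has a non-missing value
last_year <- toy |>
  filter(!is.na(peaveduc)) |>
  summarize(last_year = max(year)) |>
  pull(last_year)

last_year
```

The same pipeline applied to dem_data would indicate where to end the averaging window for the chosen background factor.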

b) Now build a scatter plot with ggplot2. Put your modernization-related variable (background variable) on the x-axis and your measure of democracy on the y-axis and color the points by region. Add a trend line with geom_smooth(). Add appropriate labels and a viridis color map. Change the theme to theme_minimal.

library(ggplot2)

ggplot(dem_data_scatter, aes(x = peaveduc, y = delibdem)) +
  geom_point(aes(color = region)) +
  scale_x_log10() +
  geom_smooth(method = "lm", linewidth = 1) +
  labs(
    x = "Average Years of Education",
    y = "Democracy Score",
    title = "Average Years of Education Among Ages 15 and Older (1950-Present)",
    caption = "Source: V-Dem Institute",
    color = "Region"
  ) +
  scale_color_viridis_d(option = "viridis", end = .7) +
  theme_minimal()

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 51 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 51 rows containing missing values or values outside the scale range
(`geom_point()`).
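One plausible reason for the removed rows is that some countries have no education data at all in the period. This is an assumption about the cause, not something confirmed from the output. When every value in a group is missing, mean() with na.rm = TRUE returns NaN, and ggplot2 drops such points. A minimal illustration with made-up values:

```r
# Averaging a vector that contains only missing values
all_missing <- c(NA_real_, NA_real_)
avg <- mean(all_missing, na.rm = TRUE)

avg          # not a number, because nothing is left to average
is.nan(avg)  # confirms the result is NaN rather than NA
```

Filtering with !is.nan(peaveduc) before plotting would remove these rows explicitly and silence the warnings.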

Step 5: Conclusion (20 pts)

Render your document and write a brief conclusion to your analysis. What did you find? Did you find support for Lipset’s theory? What are the limitations of your analysis?

Conclusion: This analysis provides support for Lipset’s modernization theory. Countries and regions with higher levels of economic modernization (in this case, measured by average years of education) display higher levels of deliberative democracy. The country-level trends show that more developed countries tend to achieve higher and more stable democracy scores, while less developed ones exhibit lower and even inconsistent scores. Regional averages reinforce this pattern, with The West scoring highest and less developed regions, particularly Africa and the Middle East, scoring substantially lower. The scatter plot further confirms a positive association between education and democracy across countries.

The analysis has several limitations. The results cannot establish causality. Averaging values over long periods may obscure important temporal dynamics, such as democratic transitions or breakdowns. In addition, level of education captures only one dimension of modernization, while other factors emphasized by Lipset (such as income, urbanization, and economic equality) are not directly tested. Overall, the findings are consistent with Lipset’s theory but suggest that modernization alone does not fully explain democratic variation.
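The strength of the association could also be summarized with a correlation coefficient. The sketch below is illustrative only: the `toy` values are invented, not V-Dem results. It computes a Pearson correlation on the same log10 scale used for the x-axis, skipping rows with missing values.

```r
# Hypothetical country averages (not V-Dem values)
toy <- data.frame(
  peaveduc = c(2, 4, 6, 8, NA),
  delibdem = c(0.20, 0.35, 0.50, 0.70, 0.40)
)

# Pearson correlation on the log10 scale used in the plot,
# using only rows where both values are present
r <- cor(log10(toy$peaveduc), toy$delibdem, use = "complete.obs")

r
```

A coefficient like this describes association only and does not address the causality limitation noted above.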

Step 6: Bonus questions (one point each)

a) Facet wrap your scatter plot by region.

library(ggplot2)

ggplot(dem_data_scatter, aes(x = peaveduc, y = delibdem)) +
  geom_point(aes(color = region)) +
  facet_wrap(~region) +
  scale_x_log10() +
  geom_smooth(method = "lm", linewidth = 1) +
  labs(
    x = "Average Years of Education",
    y = "Democracy Score",
    title = "Average Years of Education Among Ages 15 and Older (1950-Present)",
    caption = "Source: V-Dem Institute",
    color = "Region"
  ) +
  scale_color_viridis_d(option = "viridis", end = .7) +
  theme_minimal()

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 51 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 51 rows containing missing values or values outside the scale range
(`geom_point()`).

b) Remove the facet_wrap() call and display the relationship for one region and use geom_text() to label your points.

library(ggplot2)
ggplot(subset(dem_data_scatter, region == "Latin America"),
  aes(x = peaveduc, y = delibdem)) +
  geom_point(color = "cadetblue", size = 1)+
  geom_text(
    aes(label = country),
    vjust = -.7,
    check_overlap = TRUE
  ) +
  geom_smooth(method = "lm", se = FALSE, linewidth = 1) +
  scale_x_log10() +
  labs(
    x = "Average Years of Education",
    y = "Democracy Score",
    title = "Education and Democracy in Latin America (1950–Present)",
    caption = "Source: V-Dem Institute",
  ) +
  theme_light()

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 1 row containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_point()`).

Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_text()`).

c) Remove the text labels and make your scatter plot interactive using ggplotly(). Make sure that your tooltip includes the information that you want to display to the user.

library(ggplot2)
library(plotly)


Attaching package: 'plotly'

The following object is masked from 'package:ggplot2':

    last_plot

The following object is masked from 'package:stats':

    filter

The following object is masked from 'package:graphics':

    layout

p <- ggplot(dem_data_scatter, aes(x = peaveduc, y = delibdem,
    # tooltip text: both values rounded to two decimals
    text = paste("Country:", country,
      "<br>Region:", region,
      "<br>Education:", round(peaveduc, 2),
      "<br>Democracy Score:", round(delibdem, 2)))) +
  geom_point(aes(color = region)) +
  scale_x_log10() +
  # group = 1 fits a single trend line; otherwise the discrete text
  # aesthetic splits the data into one group per country
  geom_smooth(aes(group = 1), method = "lm", linewidth = 1) +
  labs(x = NULL, y = NULL, title = NULL, caption = NULL, color = "Region") +
  scale_color_viridis_d(option = "viridis", end = .7) +
  theme_minimal()
interactive_plot <- ggplotly(p, tooltip = "text")

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 51 rows containing non-finite outside the scale range
(`stat_smooth()`).

interactive_plot

d) Add annotation to your plot using geom_hline() or geom_vline() to highlight a significant point in your data, like the mean or some other significant value.

library(ggplot2)

x_median <- median(dem_data_scatter$peaveduc, na.rm = TRUE)
ggplot(dem_data_scatter, aes(x = peaveduc, y = delibdem)) +
  geom_point(aes(color = region)) +
  scale_x_log10() +
  geom_smooth(method = "lm", linewidth = 1) +
  geom_vline(xintercept = x_median, linetype = "dashed", linewidth = 1, color = "steelblue") +
  labs(
    x = "Average Years of Education",
    y = "Democracy Score",
    title = "Average Years of Education Among Ages 15 and Older (1950-Present)",
    caption = "Source: V-Dem Institute",
    color = "Region"
  ) +
  scale_color_viridis_d(option = "viridis", end = .7) +
  theme_minimal()

`geom_smooth()` using formula = 'y ~ x'

Warning: Removed 51 rows containing non-finite outside the scale range
(`stat_smooth()`).

Warning: Removed 51 rows containing missing values or values outside the scale range
(`geom_point()`).
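A horizontal reference line works the same way as the vertical one. As a hedged sketch on made-up points (the `toy` data and the `y_mean` name are hypothetical), geom_hline() can mark the mean of the democracy score:

```r
library(ggplot2)

# Hypothetical points, not V-Dem values
toy <- data.frame(
  peaveduc = c(2, 5, 9),
  delibdem = c(0.2, 0.4, 0.6)
)

# Mean of the y variable, used as the reference level
y_mean <- mean(toy$delibdem, na.rm = TRUE)

p <- ggplot(toy, aes(x = peaveduc, y = delibdem)) +
  geom_point() +
  geom_hline(yintercept = y_mean, linetype = "dashed") +
  theme_minimal()

p
```

Combining both lines splits a scatter plot into four quadrants, which can make it easier to point out countries that are above average on one variable and below on the other.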

e) Upload your rendered HTML file to Quarto Pub and share the link with the class on Discord.

Submission Instructions

Head over to Blackboard and go to the Homework 1 assignment. Click “Create Submission” and write a brief statement saying that you have submitted the assignment and that all of the work is your own.

From there, upload a compressed (zipped) version of your project folder, including the rendered HTML file. To compress your project folder, right-click the project folder and choose Compress (Mac) or Send to → Compressed (zipped) folder (Windows). Then, upload the resulting .zip file.

If you like, you can also upload your rendered HTML as a webpage to Quarto Pub and share the link below your submission statement.