Module 5: Inferential Statistics
Lab 5: Testing the Fukushima Effect on Japanese Elections
Background and Instructions
Background: On March 11, 2011, a magnitude 9.1 earthquake struck just off the coast of the Tohoku region of Japan. This quake triggered a tsunami with waves as high as 40 meters (132 feet), leading to 19,000 deaths, especially in Fukushima, Miyagi, and Iwate prefectures. The tsunami also triggered an explosion at the Fukushima Daiichi Nuclear Power Plant, irradiating nearby communities. The government evacuated 9 municipalities and barred entrance to this exclusion zone for the next 30 years. Citizens rallied to protest nuclear power, demand compensation for victims, hold executives responsible, and demand transparency from the state. Anecdotal accounts suggest that the disaster caused voters to turn out at higher rates, leading to a change in the ruling party in 2012. National elections are subject to many factors, so a better test would be to examine local elections.
The Lab: You have been commissioned to study the effect of the 2011 disaster on municipal voter turnout in prefectural elections. You are aware that many things might affect voter turnout, so you zoomed into neighboring sets of municipalities as similar as possible in terms of population, gender, age, income, unemployment, governance capacity, and support for the popular Liberal Democratic Party. The only difference between your municipalities is that one set was directly affected by an aspect of the disaster (treatment group), while the other set was not (control group).
You will test the effects of three key differences between towns by zooming into the 2011 prefectural elections.
A. The Fukushima Effect: Did voters in cities in Fukushima Prefecture turn out more than peers just outside of Fukushima Prefecture? The treatment here is “Fukushima”, while the control is “Outside Fukushima”.
B. The Exclusion Zone Effect: Did voters from cities in the Exclusion Zone turn out more than peers just outside the Exclusion Zone? (Exclusion Zone residents were relocated to empty high schools in Tokyo suburbs, and still voted as a community.) The treatment here is “Exclusion Zone”, while the control is “Outside Exclusion Zone”.
C. The Tsunami Effect: Did voters in cities struck by the tsunami turn out more than peers not directly affected by the tsunami? The treatment here is “Hit”, while the control is “Not Hit”.
Mapping our Matching Experiments
Look at the map above.
Notice these 3 highlighted regions, where some cities are our treatment group while others are our control group. All other cities are “Other” and must be removed from your analysis.
Notice the bar plots below. They show that average traits are fairly balanced across the treatment and control groups, so these traits probably did not cause differences in voter turnout. This means we can safely test the disaster’s effect.
Together as a group, please complete this lab and respond to the questions below.
Task 1: Load and filter your data.
Please load the following packages and dataset using the link below.
# Load your packages
library(tidyverse) # for dplyr, ggplot2, and pivoting
library(viridis) # for colors
library(infer) # for stats
# Import your data
cities <- read_csv("https://bit.ly/japanese_muni_elections") %>%
  filter(year == 2011)
Take a look at your data, using glimpse(). You will need the following variables: year, pref, muni, turnout_rate, by_border, by_exclusion_zone, and by_tsunami. I strongly suggest that you select just these variables.
year: year of election
pref: prefecture that municipality is in
muni: municipality where election took place
turnout_rate: percentage of eligible voters in the city who turned out to vote, represented as a decimal (0 to 1).
by_border: is that city on the Fukushima side of the border (“Fukushima”), on the other side (“Outside Fukushima”), or some other municipality (“Other”)?
by_exclusion_zone: is that city in the Fukushima exclusion zone (“Exclusion Zone”), just outside the Fukushima exclusion zone (“Outside exclusion zone”), or some other municipality (“Other”)?
by_tsunami: was that city struck by the tsunami (“Hit”), not hit but just next door (“Not Hit”), or some other municipality (“Other”)?
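As a minimal sketch of the selection step, the snippet below keeps only the seven variables listed above and inspects the result with glimpse(). So that it runs without a download, it uses a tiny made-up tibble named toy in place of cities. The rows, the municipality names, and the extra pop column are hypothetical and exist only to show a column being dropped.

```r
library(dplyr) # also loaded by library(tidyverse)

# Toy stand-in for `cities`: values are invented for illustration only.
# `pop` is a hypothetical extra column that we do not need.
toy <- tibble(
  year = c(2011, 2011),
  pref = c("Fukushima", "Miyagi"),
  muni = c("Town A", "Town B"),
  turnout_rate = c(0.52, 0.47),
  by_border = c("Fukushima", "Outside Fukushima"),
  by_exclusion_zone = c("Other", "Other"),
  by_tsunami = c("Other", "Not Hit"),
  pop = c(10000, 20000)
)

# Keep just the variables needed for the lab
slim <- toy %>%
  select(year, pref, muni, turnout_rate,
         by_border, by_exclusion_zone, by_tsunami)

# Print column names, types, and the first few values
glimpse(slim)
```

With the real data, the same select() call would be applied to cities instead of toy.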
Task 2: Disaster Effects in 2011
In Tasks 2.1-2.3, please test the effects of the disaster on voter turnout in 2011 elections.
You will need an inferential statistical test that works for a numeric outcome and a binary explanatory variable. (There’s only one test that fits this definition.)
You will need to zoom into just the cities in the treatment and control groups, excluding other cities.
Please follow the directions below.
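One way to zoom into a single experiment is to filter on its grouping variable. The sketch below does this for by_border, again on a small made-up tibble named toy, whose values are hypothetical and are not taken from the real dataset. Listing the two groups explicitly with %in% also drops any rows where the grouping variable is missing.

```r
library(dplyr) # also loaded by library(tidyverse)

# Toy stand-in for `cities`: values are invented for illustration only
toy <- tibble(
  muni = c("Town A", "Town B", "Town C", "Town D", "Town E"),
  turnout_rate = c(0.52, 0.47, 0.60, 0.55, 0.49),
  by_border = c("Fukushima", "Outside Fukushima", "Other",
                "Fukushima", "Other")
)

# Keep only treatment ("Fukushima") and control ("Outside Fukushima") cities
border <- toy %>%
  filter(by_border %in% c("Fukushima", "Outside Fukushima"))

# Check how many cities remain in each group
border %>% count(by_border)
```

The same pattern should carry over to by_exclusion_zone and by_tsunami, using the labels for those variables described in Task 1.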
Task 2.1: The Fukushima Effect
What percentage of residents turned out to vote in the average city on the Fukushima side of the border? How much more/less was that than the average city on the other side of the border? Using the infer package, please calculate the appropriate inferential statistic to measure how extreme this difference is.
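The sketch below shows one possible shape for this analysis, assuming a two-sample t-test is the intended test for a numeric outcome and a two-group explanatory variable. It computes group means, the observed difference in means, and the test itself with the infer functions specify(), calculate(), and t_test(). The tibble toy is a made-up stand-in for the filtered data. Its turnout values are hypothetical, so the output says nothing about the real Fukushima effect.

```r
library(dplyr) # also loaded by library(tidyverse)
library(infer)

# Toy stand-in for the filtered `cities` data: values are invented
# for illustration and are NOT results from the real dataset.
toy <- tibble(
  turnout_rate = c(0.52, 0.55, 0.58, 0.61, 0.47, 0.50, 0.53, 0.49),
  by_border = rep(c("Fukushima", "Outside Fukushima"), each = 4)
)

# Average turnout in each group
group_means <- toy %>%
  group_by(by_border) %>%
  summarize(mean_turnout = mean(turnout_rate), n = n())

# Observed difference in means: treatment minus control
obs_diff <- toy %>%
  specify(turnout_rate ~ by_border) %>%
  calculate(stat = "diff in means",
            order = c("Fukushima", "Outside Fukushima"))

# Two-sample t-test: t statistic, degrees of freedom, and p-value
result <- toy %>%
  t_test(formula = turnout_rate ~ by_border,
         order = c("Fukushima", "Outside Fukushima"))

group_means
obs_diff
result
```

The order argument controls the direction of subtraction, so putting the treatment group first makes a positive difference mean higher turnout in the treatment group.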
Task 2.2: The Exclusion Zone Effect
What percentage of residents turned out to vote from the average city located in the exclusion zone? How much more/less was that than the average city just outside the exclusion zone? Please calculate the appropriate inferential statistic to measure how extreme this difference is.
Task 2.3: The Tsunami Effect
What percentage of residents turned out to vote from the average city hit by the tsunami? How much more/less was that than the average city not hit but just next door? Please calculate the appropriate inferential statistic to measure how extreme this difference is.
Task 3: Disaster Effects in 2015
Next, please repeat your analysis from Task 2, but this time examine these communities in 2015!
Which treatment had the greatest effect in 2011? How about in 2015? Why do you think that is?
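Note that the import in Task 1 keeps only 2011 rows, so 2015 rows would need a fresh import with year == 2015 (or with no year filter at all). The sketch below assumes both years are available in one data frame and lays out the treatment and control means side by side for each year, using pivot_wider() from tidyr. The tibble toy_all and its turnout values are hypothetical and are used only to make the example runnable.

```r
library(dplyr) # also loaded by library(tidyverse)
library(tidyr) # for pivot_wider()

# Toy stand-in for the unfiltered data: values are invented
# for illustration and are NOT results from the real dataset.
toy_all <- tibble(
  year = rep(c(2011, 2015), each = 5),
  by_tsunami = rep(c("Hit", "Hit", "Not Hit", "Not Hit", "Other"), times = 2),
  turnout_rate = c(0.60, 0.62, 0.50, 0.54, 0.70,
                   0.48, 0.52, 0.47, 0.49, 0.65)
)

effects <- toy_all %>%
  # Drop cities outside this matching experiment
  filter(by_tsunami %in% c("Hit", "Not Hit")) %>%
  # Mean turnout per year and group
  group_by(year, by_tsunami) %>%
  summarize(mean_turnout = mean(turnout_rate), .groups = "drop") %>%
  # One row per year, one column per group
  pivot_wider(names_from = by_tsunami, values_from = mean_turnout) %>%
  # Treatment minus control
  mutate(difference = Hit - `Not Hit`)

effects
```

A table like this only compares the sizes of the differences. The inferential test from Task 2 would still need to be run separately for each year.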
Task 4: Visualize!
Please visualize 2 of your most interesting effects. Please use a different visualization style for each. At least 1 visual should show entire distributions (e.g. jitter, violin, histogram, density, boxplots), not just the average (bar, line, point). Please add colors, labels, and themes to your graphs.
Write a short summary describing what they show.
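As one possible starting point for a distribution-style visual, the sketch below overlays jittered points on violins, then adds a viridis color scale, labels, and a theme. It uses ggplot2's built-in scale_color_viridis_d(), and the viridis package loaded in Task 1 offers similar scales. The tibble toy and its turnout values are hypothetical, so the picture itself is not a finding.

```r
library(dplyr)   # also loaded by library(tidyverse)
library(ggplot2) # also loaded by library(tidyverse)

# Toy stand-in for the filtered `cities` data: values are invented
# for illustration and are NOT results from the real dataset.
toy <- tibble(
  by_tsunami = rep(c("Hit", "Not Hit"), each = 6),
  turnout_rate = c(0.58, 0.61, 0.55, 0.63, 0.60, 0.57,
                   0.50, 0.53, 0.49, 0.55, 0.52, 0.47)
)

plot <- ggplot(toy, aes(x = by_tsunami, y = turnout_rate,
                        color = by_tsunami)) +
  # Violin outlines show the shape of each group's distribution
  geom_violin() +
  # Jittered points show each city; height = 0 leaves turnout unchanged
  geom_jitter(width = 0.1, height = 0) +
  scale_color_viridis_d(option = "plasma", end = 0.8) +
  labs(x = "Tsunami exposure", y = "Turnout rate (0 to 1)",
       color = "Group", title = "Voter turnout by group (toy data)") +
  theme_bw()

plot
```

For the second visual, a different style such as a histogram or boxplot would satisfy the request for two distinct approaches.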
Congrats! You’re done! You just did three matching experiments, over two time periods, on real world data!