library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.4 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(readxl)
# Read the survey extract and keep only the three registration counts
EAVS <- read.csv("data.csv")
eavs2 <- EAVS %>% select(A1a, A1b, A1c)
eavs2$A1a<-as.numeric(eavs2$A1a)
## Warning: NAs introduced by coercion
eavs2$A1b<-as.numeric(eavs2$A1b)
## Warning: NAs introduced by coercion
eavs2$A1c<-as.numeric(eavs2$A1c)
## Warning: NAs introduced by coercion
eavs3 <- eavs2 %>% na.omit()
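The coercion warnings above mean that some entries could not be parsed as numbers, and na.omit() then drops every row containing one of them. A minimal sketch for listing which raw strings were affected, using a small made-up data frame (the column contents and the helper name `non_numeric_values` are hypothetical, not taken from the EAVS file):

```r
# Hypothetical stand-in for the raw columns; the real file may use other codes.
raw <- data.frame(
  A1a = c("1200", "Does not apply", "530", "-99"),
  A1b = c("1000", "Data not available", "500", "80"),
  stringsAsFactors = FALSE
)

# Convert quietly, then return the distinct raw strings that became NA.
non_numeric_values <- function(x) {
  converted <- suppressWarnings(as.numeric(x))
  unique(x[is.na(converted) & !is.na(x)])
}

bad <- lapply(raw, non_numeric_values)
print(bad)
```

Note that a numeric placeholder such as the made-up "-99" converts without a warning, so this check would not flag it; negative counts would need a separate filter.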
hist(eavs3$A1a)
hist(eavs3$A1b)
hist(eavs3$A1c)
# Add a square-root version of each count in a single mutate() call
eavs3 <- eavs3 %>%
  mutate(sqrt_A1a = sqrt(A1a), sqrt_A1b = sqrt(A1b), sqrt_A1c = sqrt(A1c))
hist(eavs3$sqrt_A1a)
hist(eavs3$sqrt_A1b)
hist(eavs3$sqrt_A1c)
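Histograms show skew visually; a skewness coefficient can put a number on it. The sketch below is an illustration on simulated data, not a result from the EAVS file: `skewness` is a hand-written helper (base R has no built-in one), and the log-normal sample is only an assumed stand-in for a right-skewed count column.

```r
# Sample skewness: third central moment divided by the 3/2 power of the
# second central moment (both moments divide by n).
skewness <- function(x) {
  x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / (mean((x - m)^2))^(3 / 2)
}

set.seed(1)
# Simulated right-skewed counts standing in for a registration column.
counts <- round(rlnorm(500, meanlog = 8, sdlog = 1.5))

skew_raw  <- skewness(counts)
skew_sqrt <- skewness(sqrt(counts))
cat("raw:", skew_raw, " sqrt:", skew_sqrt, "\n")
```

A value near zero indicates a roughly symmetric distribution; a positive value indicates a long right tail. Applying the same helper to eavs3$A1a and eavs3$sqrt_A1a would show how much the square root reduces the skew in the real data.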
My modified dataset, based on 2022_EAVS_for_Public_Release_V1.1.xlsx,
focuses on voter registration data and contains six variables. Three
come from the survey: the total number of registered voters (A1a), the
number of active voters (A1b), and the number of inactive voters (A1c).
The other three are the square roots of these variables, which I
computed in an effort to better assess distribution patterns. The data
is skewed across all variables, as shown in the histograms. Even so, the
visual analysis suggests that the number of active voters is
considerably higher than the number of inactive voters.
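The comparison between active and inactive voters can also be checked numerically rather than only by eye. A minimal sketch on made-up rows (the values in `toy` are hypothetical, chosen only to show the calculation), comparing medians, which are less sensitive to skew than means, and the share of rows where the active count exceeds the inactive count:

```r
# Hypothetical rows; A1b = active voters, A1c = inactive voters.
toy <- data.frame(
  A1b = c(900, 4200, 150, 30000),
  A1c = c(100,  800, 200,  2500)
)

median_active   <- median(toy$A1b)
median_inactive <- median(toy$A1c)

# Proportion of rows in which active voters outnumber inactive voters.
share_active_larger <- mean(toy$A1b > toy$A1c)

cat("median active:", median_active,
    " median inactive:", median_inactive,
    " share active > inactive:", share_active_larger, "\n")
```

Replacing `toy` with eavs3 would apply the same summary to the cleaned dataset.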