Project1

Author

Zachary Rodavich

This dataset compiles polls asking whether President Donald Trump should be impeached and removed from office during his first term (2017-2021). News stations, newspapers, and other organizations conducted these polls during 2018 and 2019 to gauge public opinion on the question. The variables used include the percentage of respondents who wanted Trump impeached and removed from office, the percentage who wanted Congress to begin impeachment proceedings, and the large difference in opinion among Republicans, Democrats, and Independents. The dataset was created by Aaron Bycoffe, Ella Koeze, and Nathaniel Rakich, and was sourced from the DATA 110 Datasets Google Drive folder.

Step 1: Adding the required libraries

library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.2.0     ✔ readr     2.1.6
✔ forcats   1.0.1     ✔ stringr   1.6.0
✔ ggplot2   4.0.2     ✔ tibble    3.3.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.2
✔ purrr     1.2.1     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

library(dplyr)
library(ggfortify)
#Adding tidyverse, dplyr, and ggfortify for this dataset (dplyr is already attached by tidyverse)

Step 2: Setting up the Working Directory

getwd()

[1] "/Users/zacharyrodavich/Downloads"

setwd("/Users/zacharyrodavich/Downloads")
#Getting and Setting the Working Directories

Step 3: Importing the CSV file into R, and defining Impeachment

impeachment <- readr::read_csv("impeachmentpolls.csv")

Rows: 416 Columns: 24
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (10): Start, End, Pollster, Sponsor, Pop, Text, Category, Include?, URL,...
dbl (13): SampleSize, Yes, No, Unsure, Rep Sample, Rep Yes, Rep No, Dem Samp...
lgl  (1): tracking

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

#Importing the CSV file for the dataset into R and defining "impeachment"
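The import message above suggests `spec()` and `show_col_types = FALSE`. A minimal sketch of both, assuming a small stand-in file (the temporary CSV and its two rows are made up for illustration; only the column names are taken from the real data):

```r
library(readr)

# Hypothetical stand-in for impeachmentpolls.csv, written to a temporary file
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "Pollster,Category,Rep Yes,Dem Yes,Ind Yes",
  "Pollster A,begin_proceedings,10,70,35",
  "Pollster B,impeach_and_remove,8,75,40"
), tmp)

# show_col_types = FALSE quiets the column-specification message
polls <- read_csv(tmp, show_col_types = FALSE)

# spec() prints the full column specification that read_csv() guessed
spec(polls)

# Confirm that the party columns were read as numbers before working with them
col_classes <- sapply(polls, class)
col_classes
```

Checking the column classes right after import makes it easier to notice later if a percentage column has stopped being numeric.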

Showing the dataset using the “head” command

head(impeachment)

# A tibble: 6 × 24
  Start     End       Pollster  Sponsor SampleSize Pop   tracking Text  Category
  <chr>     <chr>     <chr>     <chr>        <dbl> <chr> <lgl>    <chr> <chr>   
1 6/28/2019 7/1/2019  ABC News… <NA>          1008 a     NA       Base… begin_p…
2 4/22/2019 4/25/2019 ABC News… <NA>          1001 a     NA       Base… begin_p…
3 1/21/2019 1/24/2019 ABC News… <NA>          1001 a     NA       Base… begin_p…
4 8/26/2018 8/29/2018 ABC News… <NA>          1003 a     NA       Base… begin_p…
5 6/8/2019  6/12/2019 Civiqs    <NA>          1559 rv    NA       Do y… begin_i…
6 5/28/2019 5/31/2019 CNN/SSRS  <NA>          1006 a     NA       Base… impeach…
# ℹ 15 more variables: `Include?` <chr>, Yes <dbl>, No <dbl>, Unsure <dbl>,
#   `Rep Sample` <dbl>, `Rep Yes` <dbl>, `Rep No` <dbl>, `Dem Sample` <dbl>,
#   `Dem Yes` <dbl>, `Dem No` <dbl>, `Ind Sample` <dbl>, `Ind Yes` <dbl>,
#   `Ind No` <dbl>, URL <chr>, Notes <chr>

#Showing the dataset

Step 4: Setting up variables for those who want Trump Impeached or for Congress to begin Impeachment Proceedings

impeachment1 <- impeachment |>
  filter(Category == "impeach")
#Filtering to the polls that asked whether Trump should be impeached

impeachment2 <- impeachment |>
  filter(Category == "begin_proceedings")
#Creating a second filter for the polls that asked whether Congress should begin impeachment proceedings
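Because `==` matches the whole label exactly, it can help to list the category labels before filtering. A small sketch, assuming a made-up tibble (the `polls` object and its values are hypothetical; the three category labels are the ones that appear in this project's code):

```r
library(dplyr)

# Hypothetical miniature of the polls data
polls <- tibble(
  Category = c("begin_proceedings", "impeach_and_remove", "impeach",
               "impeach", "begin_proceedings", "impeach_and_remove"),
  Yes = c(37, 41, 45, 44, 39, 43)
)

# Count how many polls fall under each category label
category_counts <- polls |> count(Category, sort = TRUE)
category_counts

# "impeach" does not match "impeach_and_remove", because == compares full strings
impeach_only <- polls |> filter(Category == "impeach")
nrow(impeach_only)
```

If the goal were to combine both impeachment questions, `Category %in% c("impeach", "impeach_and_remove")` would be one way to do it.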

Step 5: Filtering via the “Yes” columns for Democrats, Republicans and Independents, and defining Impeachment and Proceedings

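impeachment$Category[impeachment$Category == "impeach_and_remove"] <- "Impeachment"
#Using the "$" symbol to select a specific column and relabel its matching values

impeachment$Category[impeachment$Category == "begin_proceedings"] <- "Begin_Proceedings"
#These two relabels change "impeachment" only; impeachment1 and impeachment2 were already created in Step 4

impeachment1$`Rep Yes`[impeachment1$`Rep Yes` > 1] <- "Impeach_Republicans"

impeachment1$`Dem Yes`[impeachment1$`Dem Yes` > 1] <- "Impeach_Democrats"

impeachment1$`Ind Yes`[impeachment1$`Ind Yes` > 1] <- "Impeach_Independents"
#Assigning a text label into a numeric column converts the whole column to character

impeachment2$`Rep Yes`[impeachment2$`Rep Yes` > 1] <- "Proceedings_Republicans"

impeachment2$`Dem Yes`[impeachment2$`Dem Yes` > 1] <- "Proceedings_Democrats"

impeachment2$`Ind Yes`[impeachment2$`Ind Yes`] <- "Proceedings_Independents"
#Unlike the lines above, this index has no "> 1" condition, so the values themselves are used as row positions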
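One thing to watch in the assignments above: putting a text label into a numeric column turns the whole column into character, and the replaced percentages are lost. A minimal sketch of an alternative, assuming a small made-up tibble (`polls`, its values, and the names `polls_long`, `Party`, and `Pct_Yes` are hypothetical and not part of the project), which keeps the percentages numeric and records the party in its own column:

```r
library(dplyr)
library(tidyr)

# What happens when a label is assigned into a numeric vector
x <- c(8, 10, 12)
x[x > 1] <- "Impeach_Republicans"
class(x)  # the vector is now character

# Hypothetical miniature of the polls data
polls <- tibble(
  Category  = c("impeach", "impeach", "begin_proceedings"),
  `Rep Yes` = c(8, 10, 12),
  `Dem Yes` = c(75, 70, 68),
  `Ind Yes` = c(40, NA, 35)
)

# Lookup from column name to party label
party_names <- c("Rep Yes" = "Republican",
                 "Dem Yes" = "Democrat",
                 "Ind Yes" = "Independent")

# Reshape to one row per poll and party; the percentages stay numeric
polls_long <- polls |>
  pivot_longer(
    cols      = c(`Rep Yes`, `Dem Yes`, `Ind Yes`),
    names_to  = "Party",
    values_to = "Pct_Yes"
  ) |>
  mutate(Party = unname(party_names[Party]))

polls_long
```

With the data in this shape, no conversion back to numeric is needed, and the party label can be mapped directly to a plot aesthetic.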

Step 6: Cleaning the data sets, setting the percentage values to numeric, and creating the first histogram, for those wanting Impeachment

impeachment_clean <- impeachment1 |>
  mutate(Rep_Numeric = parse_number(as.character(`Rep Yes`)),
        Dem_Numeric = parse_number(as.character(`Dem Yes`)),
        Ind_Numeric = parse_number(as.character(`Ind Yes`)))

Warning: There were 3 warnings in `mutate()`.
The first warning was:
ℹ In argument: `Rep_Numeric = parse_number(as.character(`Rep Yes`))`.
Caused by warning:
! 131 parsing failures.
row col expected              actual
  1  -- a number Impeach_Republicans
  2  -- a number Impeach_Republicans
  3  -- a number Impeach_Republicans
  4  -- a number Impeach_Republicans
  5  -- a number Impeach_Republicans
... ... ........ ...................
See problems(...) for more details.
ℹ Run `dplyr::last_dplyr_warnings()` to see the 2 remaining warnings.
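The parsing failures reported above are what `parse_number()` does when a value contains no digits. A short sketch with made-up values (the `values` vector is hypothetical) that shows which inputs survive the conversion:

```r
library(readr)

# Hypothetical values: two that contain digits and one label with none
values <- c("45", "45%", "Impeach_Republicans")

# parse_number() extracts the first number it finds; text without digits becomes NA
parsed <- suppressWarnings(parse_number(values))
parsed

# Counting the NAs before plotting shows how many rows a histogram would drop
n_missing <- sum(is.na(parsed))
n_missing
```

Comparing this count with the number of rows gives a quick check on whether any percentages are left to plot.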

p1 <- ggplot(impeachment_clean) +
  geom_histogram(aes(x = Rep_Numeric, fill = "Republican"), position = "identity", alpha = 1, binwidth = 2) +
  geom_histogram(aes(x = Dem_Numeric, fill = "Democrat"), position = "identity", alpha = 1, binwidth = 2) +
  geom_histogram(aes(x = Ind_Numeric, fill = "Independent"), position = "identity", alpha = 1, binwidth = 2) +
  labs(x = "Percent of Participants in Favor of Impeaching Trump", y = "Number of Polls Conducted", title = "Poll Respondents in Favor of Impeaching President Trump", caption = "Authors: Aaron Bycoffe, Ella Koeze, Nathaniel Rakich") +
  scale_fill_manual(name = "Party",
                    values = c("Republican" = "maroon", "Democrat" = "navy", "Independent" = "grey")) +
  theme_minimal()
p1

Warning: Removed 139 rows containing non-finite outside the scale range
(`stat_bin()`).

Warning: Removed 139 rows containing non-finite outside the scale range (`stat_bin()`).
Removed 139 rows containing non-finite outside the scale range (`stat_bin()`).

Warning: No shared levels found between `names(values)` of the manual scale and the
data's fill values.

#Loading the Histogram for those wanting Trump Impeached
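The "No shared levels" warning above most likely appears because every row was dropped as non-finite, leaving the fill scale with no party values to match. A hedged sketch of one way to draw the three party histograms from numeric data in long form (the `polls_long` data frame and its values are invented for illustration, and the 0.6 transparency is an assumption chosen so overlapping bars stay visible):

```r
library(ggplot2)

# Hypothetical long-format data: one row per poll and party, percentages kept numeric
polls_long <- data.frame(
  Party   = rep(c("Republican", "Democrat", "Independent"), each = 4),
  Pct_Yes = c(6, 8, 10, 12, 68, 72, 75, 79, 33, 36, 40, 44)
)

# A single layer with fill mapped to the Party column replaces three separate layers
p_sketch <- ggplot(polls_long, aes(x = Pct_Yes, fill = Party)) +
  geom_histogram(position = "identity", alpha = 0.6, binwidth = 2) +
  scale_fill_manual(
    name   = "Party",
    values = c("Republican" = "maroon", "Democrat" = "navy", "Independent" = "grey")
  ) +
  labs(x = "Percent in Favor", y = "Number of Polls") +
  theme_minimal()
p_sketch

# Total bar height should equal the number of rows, meaning nothing was dropped
plotted_rows <- sum(layer_data(p_sketch, 1)$count)
plotted_rows
```

Because the fill values come from the data, the names in `scale_fill_manual()` have actual levels to match.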

Step 7: Repeating the same step for the second histogram, for those who want Congress to begin impeachment proceedings

impeachment_clean <- impeachment2 |>
  mutate(Rep_Numeric = parse_number(as.character(`Rep Yes`)),
        Dem_Numeric = parse_number(as.character(`Dem Yes`)),
        Ind_Numeric = parse_number(as.character(`Ind Yes`)))

Warning: There were 3 warnings in `mutate()`.
The first warning was:
ℹ In argument: `Rep_Numeric = parse_number(as.character(`Rep Yes`))`.
Caused by warning:
! 61 parsing failures.
row col expected                  actual
  1  -- a number Proceedings_Republicans
  2  -- a number Proceedings_Republicans
  3  -- a number Proceedings_Republicans
  4  -- a number Proceedings_Republicans
  5  -- a number Proceedings_Republicans
... ... ........ .......................
See problems(...) for more details.
ℹ Run `dplyr::last_dplyr_warnings()` to see the 2 remaining warnings.

p2 <- ggplot(impeachment_clean) +
  geom_histogram(aes(x = Rep_Numeric, fill = "Republican"), position = "identity", alpha = 1, binwidth = 2) +
  geom_histogram(aes(x = Dem_Numeric, fill = "Democrat"), position = "identity", alpha = 1, binwidth = 2) +
  geom_histogram(aes(x = Ind_Numeric, fill = "Independent"), position = "identity", alpha = 1, binwidth = 2) +
  labs(x = "Percent of Participants Demanding Congress Begin Proceedings", y = "Number of Polls Conducted", title = "Poll Respondents Demanding Congress Begin Impeachment Proceedings", caption = "Authors: Aaron Bycoffe, Ella Koeze, Nathaniel Rakich") +
  scale_fill_manual(name = "Party",
                    values = c("Republican" = "maroon", "Democrat" = "navy", "Independent" = "grey")) +
  theme_minimal()
p2

Warning: Removed 66 rows containing non-finite outside the scale range
(`stat_bin()`).

Warning: Removed 66 rows containing non-finite outside the scale range
(`stat_bin()`).

Warning: Removed 40 rows containing non-finite outside the scale range
(`stat_bin()`).

#Creating a second histogram for those wanting Congress to start proceedings

Looking at the visualizations created here, it is evident that most Democrats want either Trump removed from office or Congress to begin formal impeachment proceedings, given his questionable conduct as president. Most Republicans want their president to remain in office regardless of any misconduct, since impeachment or hearings would damage the reputation of the Republican Party and give Democrats an easy advantage in the next midterm or presidential election. Independents are divided: most lean in Trump’s favor, but a small portion want him removed or want Congress to begin proceedings.

For data cleaning, the global environment was cleared before running any code, and the percentage columns are converted to numeric values in the “impeachment_clean” step for each subset of the impeachment polls dataset.

I wish I had been able to include a bar chart or a box plot, because that would have been more interesting and fun, and I would also have analyzed the polls asking whether Congress should begin an inquiry into Trump’s questionable actions. This dataset clearly illustrates the deep division in American politics that has developed over the past several decades and will likely continue into the near future.
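The box plot mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `polls_long` tibble and every number in it are made up, so the medians shown are not results from the impeachment polls dataset.

```r
library(dplyr)
library(ggplot2)

# Hypothetical long-format data: one row per poll and party
polls_long <- tibble(
  Party   = rep(c("Republican", "Democrat", "Independent"), each = 4),
  Pct_Yes = c(6, 8, 10, 12, 68, 72, 75, 79, 33, 36, 40, 44)
)

# Median support and number of polls per party, as a numeric check on the plot
party_summary <- polls_long |>
  group_by(Party) |>
  summarise(median_yes = median(Pct_Yes), n_polls = n())
party_summary

# One box per party, using the same colors as the histograms
p_box <- ggplot(polls_long, aes(x = Party, y = Pct_Yes, fill = Party)) +
  geom_boxplot() +
  scale_fill_manual(
    values = c("Republican" = "maroon", "Democrat" = "navy", "Independent" = "grey")
  ) +
  labs(x = "Party", y = "Percent in Favor") +
  theme_minimal()
p_box
```

A summary table like `party_summary` puts a number behind statements such as "most Democrats", which a histogram alone only suggests.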

AI Use Attribution Statement

Field	Value
Title	DATA 110 Impeachment Polls Project
Creator	Zachary Rodavich
Context	DATA 110
Document Type	Student assignment
AI Permission	`AI-NO`
AI Categories	`Coding`

AI Tools Used

ChatGPT 5.3 — 2026-03-26 — Debugging
ChatGPT 5.3 — 2026-03-29 — Debugging
Google Gemini 3 — 2026-03-29 — Debugging+Explaining Labels

AI Prompt

Explain this particular error in question, including if there are things not defined correctly, and suggest possible fixes to resolve all errors, Explain how to add labels to a ggplot graph.

Human Role

I edited the code in my project with suggestions provided by the listed A.I. programs, in order to fix any problems that I found in my code.

Notes

All code is written by me and ME ONLY, using suggestions from the aforementioned A.I. programs to resolve errors. I have carefully read and fully understood the A.I. use policies for this course. A.I. programs were used only for the tasks stated above, primarily to find, explain, and correct errors in my code.

Generated with AI Attribution Generator