Introductory Information

Define the Business Task

In this case study we will be analyzing local pollutant levels to locate where in our community the biggest environmental issues are. From there, we will suggest positive environmental changes.

Describe the Data Sources Used

Link HERE

This dataset is from the Department of Ecology in the State of Washington and it shows which companies are responsible for polluting in our community. This data also has information that is not related to negative impacts on the environment The data set shows what the toxin or pollutant incident was and where the incident occurred. This is helpful to locate the most vulnerable areas in our community. The dataset can be found by clicking on the “All Data” section of the above link

Link HERE

This Kaggle dataset is specific to the Washington DOL and lists electric vehicles by city, county, and zip code. This will be helpful in determining the amount of hybrid/electric vehicles that are registered in our community and surrounding cities versus the total population for those corresponding cities.

Link HERE This information is from the United State Census Bureau and was used to find the population of Poulsbo, WA. Other city populations will also be looked up from this website.

Documentation of Cleaning & Manipulation of Data

Load up Tidyverse

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.0     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

 

Upload Data to R

Pollution dataset, pollution pivot table for 98370 zip code, and Washington DOL dataset

 

View(Pollution_WA)

> Pollution_WA <- read_csv("Pollution - WA.csv")
Rows: 242253 Columns: 29                                                                                                                
── Column specification ─────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (21): OriginalName, Line1Address, Line2Address, CityName, StateCode, ZIPCode, TribalLand, Reg...
dbl  (7): FacilitySiteId, GISLatitudeNumber, GISLongitudeNumber, WRIANumber, LegislativeDistrictN...
lgl  (1): ProgramFacilityName

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

 

View(Pollution_Pivot_Table)

Screenshot included in ‘Visualizations’
> Pollution_Pivot_Table <- read_csv("Pollution Pivot Table.csv")
Rows: 8 Columns: 10                                                                                                                     
── Column specification ─────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (1): Row Labels
dbl (9): AIRQUAL, HAZWASTE, SEA, SOLIDWASTE, SPILLS, TOXICS, WATQUAL, WATRES, Grand Total

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
> View(Pollution_Pivot_Table)

 

View(US_Car_Data_WA_Electric)

> US_Car_Data_WA_Electric <- read_csv("US Car Data - WA - Electric.csv")
Rows: 134474 Columns: 17                                                                                                                
── Column specification ─────────────────────────────────────────────────────────────────────────────
Delimiter: ","
chr (8): County, City, State, Make, Model, Electric Vehicle Type, CAFV, Electric Utility
dbl (9): Postal Code, Model Year, Electric Range, Base MSRP, Legislative District, DOL Vehicle ID...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
> View(US_Car_Data_WA_Electric)

 

Start Cleaning

There is some data in the Pollution_WA dataset that has a date of 01/01/1753. This is obviously input incorrectly and will be removed from the dataset for the purposes of this case study.

 

Columns were removed from the original dataset that did not serve our purposes such as:

  • Facility Site ID - The company name is listed in the data so there is no need for the site ID at this time.
  • Line 1 Address - We are focusing our attention on the general areas or zipcodes, so the specificity of the address of the facility is not needed.
  • WRIA Number - This section related to the Water Resource Inventory Area and was not needed for our study.
  • Horizontal Accuracy Code - This section’s definition could not be located so it was left out of the analysis.
  • Interaction Status Code - This section’s definition could not be located so it was left out of the analysis.
  • Alternate Name - This section does not have an entry for each facility so it was left out of the analysis.
  • ECY Contact phone number and email - These entries were not to be used in the analysis based on relevance so they were removed.

 

A Search & replace was performed on the City Names to correct them all to be UPPERCASE. When first anaylzing and visualizing the data, there were two cities within the same zip code, “POULSBO” and “Poulsbo”, for example.I wanted there to be consistency so I made sure that each city within my analysis had the same case.

 

Pivot Tables were created to analyze the 98370 zip code of Poulsbo as well as the surrounding cities for the number and different types of environmental instances that occurred or had a Start Date over the years of 2014-2024. The data set had points that were dated as far back as 1753 and then 1900. The most current data is what I was most interested in and I imagine that it is among the more credible information.

Summary of Analysis

 

Through the analysis it can be found that HAZWASTE is 41.12% of total instances in the dataset and TOXICS is 40.10%. From that information we can also see that out of Bainbridge Island, Bremerton, Hansville, Kingston, Poulsbo, and Silverdale that Bremerton makes up close to 50% of total events.

 

HAZWASTE has several sub-classifications or Interaction types but the main ones found in the data set are for TRI and HWG.

TRI is “Facilities in specific industries that manufacture, process or use more than the threshold amount of one or more of 600 listed toxic chemicals.”
HWG is “Facilities that generate any quantity of a dangerous waste.”

 

TOXICS sub-classifications or Interaction types that occur most in the local data figures are for SCS and INDPNDNT.

SCS is when “A site is being cleaned up under state regulations” due to toxic instances.
INDPNDNT is “Any remedial action without department oversight or approval and not under an order or decree.”

 

Since Bremerton has been found to have the most HAZWASTE and TOXICS instances in our general area I was interested if this could be caused by the larger population of the city, as well as the large Naval Base that is in the city. However, there are 1028 HAZWASTE data points for Bremerton and only 65 of those came from the US Navy. There were also 918 TOXICS data points for Bremerton and only 41 of those came from the US Navy.
It would appear that further analysis would need to be done in order to locate the company(ies) that are the most responsible for the negative environmental instances in the data set.

 

It was a pleasant surprise for me that the data had a lot of points that were not negative environmental instances. Many of the codes or Interaction Types correspond to clean-up efforts or oversight to ensure that toxins are disposed of properly and with regulation. If more time permitted, it would be interesting to find out what the break down is of the data for positive versus negative environmental data points and how those positive rulings and regulations are benefitting the areas surrounding them. Many Interaction Types are meant for cleaning up and monitoring waterways so a study on how these regulations have made a positive impact would be interesting. As we will see below, many instances in our area are very close to the water in the Puget Sound which has a litany of species living beneath its waters as well as around them.

 

Several SIC and/or NAICS descriptions for the data points in our set are for Gasoline related businesses as well as construction.These sections cause the most issues when looking at our area and a few surrounding towns, specifically. Further analysis would need to be done to find out what regulations are currently in place and how these companies are falling short of those regulations.

 

Since Gasoline related businesses and construction - which uses a large amount of gasoline to run its machinery - are some of the largest contributors to the negative environmental data points, I wanted to see how many people in our community have switched to a hybird or electric vehicle.
The Department of Licensing data set from the State of Washington was helpful with this. There were only 530 electric or hybrid vehicles registered in Poulsbo Washington. According to the US Census, Poulsbo has around 12,000 people living in it or 12,039 which is only about 4.40% of the population that own electric or hyrid vehicles. Bremerton, which had the most negative pollution points, only has 917 registered vehicles out of their 45,415 licensed vehicles which is only 2.02%.

 

Visualizations & Key Findings

 

98370 Pivot Table
98370 Pivot Table

 

This Pivot Table shows that Poulsbo has the highest occurrence of events, however, the other cities within the zip code that are showing have their own zip codes.
This showed that filtering by City Name instead of Zip Code would be a more reliable route. The above Pivot Table shows that the town of Bremerton has hardly any incidents of pollutants.
However, if you view the Pivot Table below organized by City Name in the columns and by Pollutant Category in the rows, we get a clearer picture of the events happening in each surrounding city.

 

Pollutants By City 2014-2024
Pollutants By City 2014-2024

 

The below Pie Chart shows the percentage of each pollution type for the following cities: Bainbridge Island, Bremerton, Hansville, Kingston, Poulsbo, and Silverdale. This will give us a clearer picture on which pollutants are the biggest issue in our community.

 

Percentage of Pollutants 2014-2024
Percentage of Pollutants 2014-2024

 

Here is an analysis of instances as a percent of the total. This Pivot Table and the Pie Chart above highlight that the top two areas to be concerned with are HAZWASTE and TOXICS

Percent of Total 2014-2024
Percent of Total 2014-2024

 

Originally, it was believed that Poulsbo had the majority of instances in our surrounding area, but now we can see that Bremerton has the most instances from the dataset. This is highlighted more in the below Grouped Column Chart:

Pollutants by Local Cities 2014-2024
Pollutants by Local Cities 2014-2024

 

The below visualization shows a geographical map that points out the number of HAZWASTE and TOXICS from 2014-2024 and each circle represents the number of events that took place at that specific geographical location.

 

Geographic HAZWASTE & TOXINS 2014-2024 A link to the above interactive Tableau Visualization can be found at this link HERE.

 

Top Recommendations

 

#1 Look into existing regulations and laws surrounding HAZWASTE and TOXICS with the Department of Ecology in Washington. Knowing what clean up projects are already in the works and which ones are not yet authorized will help us to determine where the most help is needed. Based on the data, it would appear that our waterways and the species that live in them and the wildlife that live around them are the most at risk.
This means we will want to focus concentrated efforts on keeping beaches, water run-offs, and the areas that feed the larger Puget Sound and the ocean as healthy as possible.The original data set also has congressional and legislative information, so it would be wise to use that information along with the pollution findings to rally our local representatives towards positive change.

 

#2 Many environmental instances in our area are due to Gasoline-related business and construction. We can look into what regulations are already in place and campaign for stricter regulations and/or more public transparency about when a negative ecological situation occurs in our community.
The everyday person can also help to offset these negative effects on our environment by going green with our own vehicles. Our small town of Poulsbo has just over 12,000 people that call it home and only just over 4% of those individuals are driving a hybrid or electric vehicle. When we go to Bremerton, which was the highest negative contributor to the environment in our area, they are even lower at just over 2% of the population.
A push can be made toward greener transportation such as electric vehicles, or walking, or riding a bike. Since many in this area commute for work, I believe a bigger push should be made toward electric vehicles. This could include campaigns to highlight the positive effects of hybrid vehicles or some kind of reimbursement or credit for going green.A push toward TESLA vehicles could be effective since they have a stronghold in this hybrid/electric vehicle area currently, as seen below:
Top 3 Electric Brands

 

#3 Organizing local clean-up events is a great idea to help ensure that those in the community can see first-hand what their efforts are doing. It is one thing to read about an amount of trash that was picked up or how different an area looks after a team has come in and cleaned up. It is something else entirely to be part of the crew that cleans up an area. This is especially true of long-term projects that will have larger effects as more time goes on and more effort is put in. Getting those in the community involved ensures that we have local advocates keeping a close eye on our natural areas.