library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
texas_env <- read.csv("Texas_Commission_on_Environmental_Quality_-_Supplemental_Environmental_Projects_20240925.csv")
# Just displaying the first few rows of the data
head(texas_env)
## Program Case.No.
## 1 AIR QUALITY 49255
## 2 AIR QUALITY 49681
## 3 INDUSTRIAL AND HAZARDOUS WASTE 48085
## 4 WATER QUALITY 47475
## 5 PETROLEUM STORAGE TANKS 49644
## 6 WATER QUALITY 48316
## Customer.Name Order.Date
## 1 TOTALENERGIES PETROCHEMICALS & REFINING USA, INC. Sep 15, 2015
## 2 THE PREMCOR REFINING GROUP INC. Sep 15, 2015
## 3 J. M. HOLM & CO., INC. Sep 15, 2015
## 4 CITY OF HEARNE Sep 15, 2015
## 5 DUPRE LOGISTICS L L C Sep 20, 2015
## 6 NORTHWEST HARRIS COUNTY MUNICIPAL UTILITY DISTRICT 21 Sep 20, 2015
## Penalty.Assessed Penalty.Deferred Payable.Amount SEP.Costs.Total
## 1 55000 0 27500 27500
## 2 35438 7087 14176 14175
## 3 35000 7000 14000 14000
## 4 40500 8100 0 32400
## 5 9453 1890 3782 3781
## 6 68250 13650 0 54600
## SEP.Offset.Total Type.1
## 1 27500 POLLUTION PREVENTION
## 2 14175 POLLUTION PREVENTION
## 3 14000 POLLUTION PREVENTION
## 4 32400 POLLUTION PREVENTION
## 5 3781 POLLUTION PREVENTION
## 6 54600 POLLUTION PREVENTION
## SEP.Project.1
## 1 PERFORMING PARTY SHALL OPERATE, MAINTAIN, AND POTENTIALLY EXPAND THE EXISTING SOUTHEAST TEXAS REGIONAL AIR MONITORING NETWORK WHICH INCLUDES NINE MONITORING STATIONS CURRENTLY AT THE FOLLOWING LOCATIONS: 1) BEAUMONT CAM#2; 2) COVE SCHOOL CAM #C695; 3) MAURICEVILLE CAM#642; 4) PORT ARTHUR (MOTIVA) INDUSTRIAL SITE CAM #C628; 5) PORT ARTHUR MEMORIAL HIGH SCHOOL CAMPUS CAM #C689; 6) PORT NECHES CAM #136; 7) SABINE PASS CAM #C640; 8) SOUTHEAST TEXAS REGIONAL AIRPORT CAM #C643; 9) WEST ORANGE CAM #C9.
## 2 PERFORMING PARTY SHALL OPERATE, MAINTAIN, AND POTENTIALLY EXPAND THE EXISTING SOUTHEAST TEXAS REGIONAL AIR MONITORING NETWORK WHICH INCLUDES NINE MONITORING STATIONS CURRENTLY AT THE FOLLOWING LOCATIONS: 1) BEAUMONT CAM#2; 2) COVE SCHOOL CAM #C695; 3) MAURICEVILLE CAM#642; 4) PORT ARTHUR (MOTIVA) INDUSTRIAL SITE CAM #C628; 5) PORT ARTHUR MEMORIAL HIGH SCHOOL CAMPUS CAM #C689; 6) PORT NECHES CAM #136; 7) SABINE PASS CAM #C640; 8) SOUTHEAST TEXAS REGIONAL AIRPORT CAM #C643; 9) WEST ORANGE CAM #C9.
## 3 CONTRIBUTIONS WILL BE USED TO PAY FOR LABOR AND MATERIALS COSTS ASSOCIATED WITH IMPLEMENTING THE MARSH MANIA PROJECT, WHICH INCLUDE RESTORING SHORELINE ELEVATIONS, GROWING PLANTS FOR SHORELINE RESTORATION, AND PLANTING NEW HABITAT. HISTORIC SUBSIDENCE AND EROSION HAVE RESULTED IN SHORELINE ELEVATIONS WHICH ARE TOO LOW TO SUPPORT VEGETATION.
## 4 PROJECT SHALL COORDINATE COLLECTION EVENTS FOR LOCAL RESIDENTS TO BRING IN HOUSEHOLD HAZARDOUS WASTE SUCH AS PAINT, THINNERS, PESTICIDES, OIL AND GAS, CORROSIVE CLEANERS, AND FERTILIZERS FOR PROPER DISPOSAL. WHERE AVAILABLE, THE PROJECTS MAY ALSO OFFER ELECTRONICS COLLECTION AND RECYCLING. SEP FUNDS SHALL BE USED FOR COLLECTION, RECYCLING AND DISPOSAL.
## 5 CONTRIBUTIONS WILL BE USED TO PAY FOR LABOR AND MATERIALS COSTS ASSOCIATED WITH IMPLEMENTING THE MARSH MANIA PROJECT, WHICH INCLUDE RESTORING SHORELINE ELEVATIONS, GROWING PLNATS FOR SHORELINE RESTORATION, AND PLANTING NEW HABITAT. HISTORIC SUBSIDENCE AND EROSION HAVE RESULTED IN SHORELINE ELEVATIONS WHICH ARE TOO LOW TO SUPPORT VEGETATION.
## 6 BAYOU LAND CONSERVANCY - LAKE HOUSTON WATERSHED - WESTERN WATERSHED PROTECTION PROJECT: BLC HAS IDENTIFIED APPROXIMATELY 600 ACRES ALONG THE WEST FORK OF THE SAN JACINTO RIVER, SPRING CREEK, CYPRESS CREEK, AND LAKE CREEK FOR ACQUISITION OF PERPETUAL CONSERVATION EASEMENTS IN ACCORDANCE WITH SUBCHAPTER A, CHAPTER 183, TEXAS NATURAL RESOURCES CODE.
## SEP.Cost.1 SEP.Offset.1 Type.2
## 1 27500 27500
## 2 14175 14175
## 3 7000 7000 POLLUTION PREVENTION
## 4 10800 10800 POLLUTION PREVENTION
## 5 1891 1891 POLLUTION PREVENTION
## 6 18200 18200 POLLUTION PREVENTION
## SEP.Project.2
## 1
## 2
## 3 CONTRIBUTIONS WILL BE USED IN ACCORDANCE WITH THE SUPPLEMENTAL ENVIRONMENTAL PROJECT AGREEMENT BETWEEN THE ARMAND BAYOU NATURE CENTER AND THE TEXAS COMMISSION ON ENVIRONMENTAL QUALITY. THE TALLGRASS PRAIRIE WAS ONCE A COMMON ECOSYSTEM IN TEXAS AND THE UNITED STATES. TODAY MORE THAN 99% OF THIS HABITAT HAS BEEN LOST AND THE REMAINDER IS HIGHLY FRAGMENTED AND SEVERELY THREATENED BY EXOTIC SPECIES AND DEVELOPMENT. PRESCRIBED BURNING IS ONE STEWARDSHIP TOOL USED TO MAINTAIN A TALLGRASS PRAIRIE ECOSYSTEM.
## 4 PROJECT WILL REPAIR OR REPLACE FAILING WATER SYSTMES OR ON-SITE WASTEWATER SYSTEMS FOR LOW-INCOME HOMEOWNERS. SEP FUNDS WILL BE USED TO PAY FOR THE LABOR AND MATERIAL COSTS RELATED TO REPAIRING OR REPLACING THE FAILING SYSTEMS. THE RECIPIENTS WILL NOT BE CHARGED FOR THE COST OF REPLACING OR REPAIRING THE FAILING SYSTEM.
## 5 THIRD-PARTY ADMINISTRATOR MANAGES A SYSTEM OF ISLAND SANCTUARIES ALONG THE TEXAS COAST. THESE ISLANDS CONSTITUTE MORE THAN 4,000 ACRES THAT ARE HOME TO TWENTY-FIVE SPECIES OF COLONIAL WATERBIRDS, SEVERAL OF WHICH ARE CONSIDERED ENDANGERED OR THREATENED. MANY OF THE SPECIES OF WATERBIRDS NEST ONLY ON ISLANDS OWNED OR LEASED BY THIRD-PARTY ADMINISTRATOR.
## 6 CONTRIBUTIONS WILL BE USED IN ACCORDANCE WITH THE SUPPLEMENTAL ENVIRONMENTAL PROJECT AGREEMENT BETWEEN THE ARMAND BAYOU NATURE CENTER AND THE TEXAS COMMISSION ON ENVIRONMENTAL QUALITY. THE TALLGRASS PRAIRIE WAS ONCE A COMMON ECOSYSTEM IN TEXAS AND THE UNITED STATES. TODAY MORE THAN 99% OF THIS HABITAT HAS BEEN LOST AND THE REMAINDER IS HIGHLY FRAGMENTED AND SEVERELY THREATENED BY EXOTIC SPECIES AND DEVELOPMENT. PRESCRIBED BURNING IS ONE STEWARDSHIP TOOL USED TO MAINTAIN A TALLGRASS PRAIRIE ECOSYSTEM.
## SEP.Cost.2 SEP.Offset.2 Type.3
## 1 NA NA
## 2 NA NA
## 3 7000 7000
## 4 10800 10800 POLLUTION PREVENTION
## 5 1890 1890
## 6 18200 18200 POLLUTION PREVENTION
## SEP.Project.3
## 1
## 2
## 3
## 4 RC&D WILL COORDINATE WITH LOCAL CITY AND COUNTY GOVERNMENT OFFICIALS TO CLEAN-UP SITES WHERE TIRES HAVE BEEN DISPOSED OF ILLEGALLY. CONTRIBUTIONS WILL BE USED TO CLEAN UP ILLEGAL TIRE SITES. ELIGIBLE SITES WILL BE LIMITED TO AREAS WHERE A RESPONSIBLE PARTY CAN NOT BE IDENTIFIED AND WHERE THERE IS NO PREEXISTING OBLIGATION TO CLEAN UP THE SITE BY THE OWNER OR GOVERNMENT. SEP MONIES WILL BE USED FOR THE DIRECT COST OF COLLECTION AND DISPOSAL OF DEBRIS AND TIRES.
## 5
## 6 CONTRIBUTIONS WILL BE USED TO PAY FOR LABOR AND MATERIALS COSTS ASSOCIATED WITH IMPLEMENTING THE MARSH MANIA PROJECT, WHICH INCLUDE RESTORING SHORELINE ELEVATIONS, GROWING PLNATS FOR SHORELINE RESTORATION, AND PLANTING NEW HABITAT. HISTORIC SUBSIDENCE AND EROSION HAVE RESULTED IN SHORELINE ELEVATIONS WHICH ARE TOO LOW TO SUPPORT VEGETATION.
## SEP.Cost.3 SEP.Offset.3 Type.4 SEP.Project.4 SEP.Cost.4 SEP.Offset.4 Type.5
## 1 NA NA NA NA NA NA NA
## 2 NA NA NA NA NA NA NA
## 3 NA NA NA NA NA NA NA
## 4 10800 10800 NA NA NA NA NA
## 5 NA NA NA NA NA NA NA
## 6 18200 18200 NA NA NA NA NA
## SEP.Project.5 SEP.Cost.5 SEP.Offset.5
## 1 NA NA NA
## 2 NA NA NA
## 3 NA NA NA
## 4 NA NA NA
## 5 NA NA NA
## 6 NA NA NA
str(texas_env)
## 'data.frame': 1283 obs. of 29 variables:
## $ Program : chr "AIR QUALITY" "AIR QUALITY" "INDUSTRIAL AND HAZARDOUS WASTE" "WATER QUALITY" ...
## $ Case.No. : int 49255 49681 48085 47475 49644 48316 50162 49756 50129 48836 ...
## $ Customer.Name : chr "TOTALENERGIES PETROCHEMICALS & REFINING USA, INC." "THE PREMCOR REFINING GROUP INC." "J. M. HOLM & CO., INC." "CITY OF HEARNE" ...
## $ Order.Date : chr "Sep 15, 2015" "Sep 15, 2015" "Sep 15, 2015" "Sep 15, 2015" ...
## $ Penalty.Assessed: int 55000 35438 35000 40500 9453 68250 1125 4373 5000 72905 ...
## $ Penalty.Deferred: int 0 7087 7000 8100 1890 13650 225 874 1000 14581 ...
## $ Payable.Amount : int 27500 14176 14000 0 3782 0 450 0 2000 29162 ...
## $ SEP.Costs.Total : int 27500 14175 14000 32400 3781 54600 450 3499 2000 29162 ...
## $ SEP.Offset.Total: int 27500 14175 14000 32400 3781 54600 450 3499 2000 29162 ...
## $ Type.1 : chr "POLLUTION PREVENTION" "POLLUTION PREVENTION" "POLLUTION PREVENTION" "POLLUTION PREVENTION" ...
## $ SEP.Project.1 : chr "PERFORMING PARTY SHALL OPERATE, MAINTAIN, AND POTENTIALLY EXPAND THE EXISTING SOUTHEAST TEXAS REGIONAL AIR MONI"| __truncated__ "PERFORMING PARTY SHALL OPERATE, MAINTAIN, AND POTENTIALLY EXPAND THE EXISTING SOUTHEAST TEXAS REGIONAL AIR MONI"| __truncated__ "CONTRIBUTIONS WILL BE USED TO PAY FOR LABOR AND MATERIALS COSTS ASSOCIATED WITH IMPLEMENTING THE MARSH MANIA PR"| __truncated__ "PROJECT SHALL COORDINATE COLLECTION EVENTS FOR LOCAL RESIDENTS TO BRING IN HOUSEHOLD HAZARDOUS WASTE SUCH AS PA"| __truncated__ ...
## $ SEP.Cost.1 : int 27500 14175 7000 10800 1891 18200 450 3499 2000 29162 ...
## $ SEP.Offset.1 : int 27500 14175 7000 10800 1891 18200 450 3499 2000 29162 ...
## $ Type.2 : chr "" "" "POLLUTION PREVENTION" "POLLUTION PREVENTION" ...
## $ SEP.Project.2 : chr "" "" "CONTRIBUTIONS WILL BE USED IN ACCORDANCE WITH THE SUPPLEMENTAL ENVIRONMENTAL PROJECT AGREEMENT BETWEEN THE ARMA"| __truncated__ "PROJECT WILL REPAIR OR REPLACE FAILING WATER SYSTMES OR ON-SITE WASTEWATER SYSTEMS FOR LOW-INCOME HOMEOWNERS. "| __truncated__ ...
## $ SEP.Cost.2 : int NA NA 7000 10800 1890 18200 NA NA NA NA ...
## $ SEP.Offset.2 : int NA NA 7000 10800 1890 18200 NA NA NA NA ...
## $ Type.3 : chr "" "" "" "POLLUTION PREVENTION" ...
## $ SEP.Project.3 : chr "" "" "" "RC&D WILL COORDINATE WITH LOCAL CITY AND COUNTY GOVERNMENT OFFICIALS TO CLEAN-UP SITES WHERE TIRES HAVE BEEN DI"| __truncated__ ...
## $ SEP.Cost.3 : int NA NA NA 10800 NA 18200 NA NA NA NA ...
## $ SEP.Offset.3 : int NA NA NA 10800 NA 18200 NA NA NA NA ...
## $ Type.4 : logi NA NA NA NA NA NA ...
## $ SEP.Project.4 : logi NA NA NA NA NA NA ...
## $ SEP.Cost.4 : logi NA NA NA NA NA NA ...
## $ SEP.Offset.4 : logi NA NA NA NA NA NA ...
## $ Type.5 : logi NA NA NA NA NA NA ...
## $ SEP.Project.5 : logi NA NA NA NA NA NA ...
## $ SEP.Cost.5 : logi NA NA NA NA NA NA ...
## $ SEP.Offset.5 : logi NA NA NA NA NA NA ...
Dependent variable: Payable.Amount (the amount that remains payable once SEP costs and deferred penalties are taken into account).
Independent variables: Penalty.Assessed, SEP.Costs.Total, and Penalty.Deferred.
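Before fitting a model, it may be worth checking whether Payable.Amount is simply arithmetic on the other three columns. The sketch below is a rough check, not a statement of how the agency computes the figure: it re-enters the first six rows printed by head() above and assumes the identity payable = assessed - SEP costs - deferred. The names `first_six`, `implied_payable`, and `gap` are introduced here for illustration only.

```r
# First six rows of the four amount columns, copied from the head() output above
first_six <- data.frame(
  Penalty.Assessed = c(55000, 35438, 35000, 40500, 9453, 68250),
  Penalty.Deferred = c(0, 7087, 7000, 8100, 1890, 13650),
  Payable.Amount   = c(27500, 14176, 14000, 0, 3782, 0),
  SEP.Costs.Total  = c(27500, 14175, 14000, 32400, 3781, 54600)
)

# Assumed identity (not documented in the source data):
# payable = assessed - SEP costs - deferred
implied_payable <- with(
  first_six,
  Penalty.Assessed - SEP.Costs.Total - Penalty.Deferred
)

# Recorded amount minus implied amount; zero means an exact match
gap <- first_six$Payable.Amount - implied_payable
print(gap)
```

For these six rows every gap is zero. Running the same subtraction on the full texas_env data frame and keeping the rows with a non-zero gap would show which cases, if any, depart from the assumed identity.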
# Linear model with Payable.Amount as the dependent variable
model <- lm(Payable.Amount ~ Penalty.Assessed + SEP.Costs.Total + Penalty.Deferred, data = texas_env)
summary(model)
##
## Call:
## lm(formula = Payable.Amount ~ Penalty.Assessed + SEP.Costs.Total +
## Penalty.Deferred, data = texas_env)
##
## Residuals:
## Min 1Q Median 3Q Max
## -6075.8 -9.1 -7.9 -4.5 6742.1
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.8000517 9.8869601 0.991 0.322
## Penalty.Assessed 1.0002789 0.0004447 2249.183 <2e-16 ***
## SEP.Costs.Total -1.0005624 0.0008273 -1209.437 <2e-16 ***
## Penalty.Deferred -1.0007731 0.0010284 -973.096 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 302.9 on 1279 degrees of freedom
## Multiple R-squared: 1, Adjusted R-squared: 1
## F-statistic: 8.82e+06 on 3 and 1279 DF, p-value: < 2.2e-16
My Interpretation of the Results:
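R-squared (1): The reported R-squared of 1 (after rounding) means the model explains essentially all of the variance in the dependent variable, Payable.Amount. This is very unusual in real-world datasets. With only three predictors and 1,283 observations, overfitting is an unlikely cause. The coefficients of roughly 1, -1, and -1 suggest instead that Payable.Amount is close to an accounting identity (Penalty.Assessed minus SEP.Costs.Total minus Penalty.Deferred), so the regression is largely recovering how the column was calculated.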
p-value (< 2.2e-16): This is the p-value of the overall F-test, whose null hypothesis is that all three slope coefficients are zero. Such a small value means that an F-statistic this large would be extremely unlikely if that null hypothesis were true, so the model as a whole is statistically significant. Each individual predictor is also highly significant (p < 2e-16).
# Linearity assumption check
plot(model, which=1)
The visible pattern in the residuals-versus-fitted plot suggests that the linearity assumption may be violated, so a plain linear model may not be the best fit.
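The residual summary above reports quartiles within about 10 of zero but extremes near ±6,000, so a small number of rows may account for much of what the plot shows. A minimal sketch of how the largest residuals could be listed follows. It uses hypothetical toy data (`toy`, `toy_model`, and the two altered rows are assumptions made for illustration), not the Texas dataset.

```r
# Hypothetical toy data: y equals x1 - x2 exactly, except for two altered rows
i <- 1:20
toy <- data.frame(x1 = 1000 * i, x2 = 10 * i^2)
toy$y <- toy$x1 - toy$x2
toy$y[5]  <- toy$y[5]  + 3000
toy$y[15] <- toy$y[15] - 3000

toy_model <- lm(y ~ x1 + x2, data = toy)

# Row indices ordered from largest to smallest absolute residual
by_size <- order(abs(resid(toy_model)), decreasing = TRUE)

# Show the two rows with the largest residuals alongside the residual values
worst <- by_size[1:2]
print(cbind(toy[worst, ], residual = resid(toy_model)[worst]))
```

The same `order(abs(resid(...)))` call applied to `model` would point to the cases in texas_env that deserve a closer look before concluding that the relationship itself is non-linear.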