library(readxl)
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.3     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
texas_env <- read.csv("Texas_Commission_on_Environmental_Quality_-_Supplemental_Environmental_Projects_20240925.csv")
# Display the first six rows of the data
head(texas_env)
##                          Program Case.No.
## 1                    AIR QUALITY    49255
## 2                    AIR QUALITY    49681
## 3 INDUSTRIAL AND HAZARDOUS WASTE    48085
## 4                  WATER QUALITY    47475
## 5        PETROLEUM STORAGE TANKS    49644
## 6                  WATER QUALITY    48316
##                                           Customer.Name   Order.Date
## 1     TOTALENERGIES PETROCHEMICALS & REFINING USA, INC. Sep 15, 2015
## 2                       THE PREMCOR REFINING GROUP INC. Sep 15, 2015
## 3                                J. M. HOLM & CO., INC. Sep 15, 2015
## 4                                        CITY OF HEARNE Sep 15, 2015
## 5                                 DUPRE LOGISTICS L L C Sep 20, 2015
## 6 NORTHWEST HARRIS COUNTY MUNICIPAL UTILITY DISTRICT 21 Sep 20, 2015
##   Penalty.Assessed Penalty.Deferred Payable.Amount SEP.Costs.Total
## 1            55000                0          27500           27500
## 2            35438             7087          14176           14175
## 3            35000             7000          14000           14000
## 4            40500             8100              0           32400
## 5             9453             1890           3782            3781
## 6            68250            13650              0           54600
##   SEP.Offset.Total               Type.1
## 1            27500 POLLUTION PREVENTION
## 2            14175 POLLUTION PREVENTION
## 3            14000 POLLUTION PREVENTION
## 4            32400 POLLUTION PREVENTION
## 5             3781 POLLUTION PREVENTION
## 6            54600 POLLUTION PREVENTION
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          SEP.Project.1
## 1 PERFORMING PARTY SHALL OPERATE, MAINTAIN, AND POTENTIALLY EXPAND THE EXISTING SOUTHEAST TEXAS REGIONAL AIR MONITORING NETWORK WHICH INCLUDES NINE MONITORING STATIONS CURRENTLY AT THE FOLLOWING LOCATIONS: 1) BEAUMONT CAM#2; 2) COVE SCHOOL CAM #C695; 3) MAURICEVILLE CAM#642; 4) PORT ARTHUR (MOTIVA) INDUSTRIAL SITE CAM #C628; 5) PORT ARTHUR MEMORIAL HIGH SCHOOL CAMPUS CAM #C689; 6) PORT NECHES CAM #136; 7) SABINE PASS CAM #C640; 8) SOUTHEAST TEXAS REGIONAL AIRPORT CAM #C643; 9) WEST ORANGE CAM #C9.
## 2 PERFORMING PARTY SHALL OPERATE, MAINTAIN, AND POTENTIALLY EXPAND THE EXISTING SOUTHEAST TEXAS REGIONAL AIR MONITORING NETWORK WHICH INCLUDES NINE MONITORING STATIONS CURRENTLY AT THE FOLLOWING LOCATIONS: 1) BEAUMONT CAM#2; 2) COVE SCHOOL CAM #C695; 3) MAURICEVILLE CAM#642; 4) PORT ARTHUR (MOTIVA) INDUSTRIAL SITE CAM #C628; 5) PORT ARTHUR MEMORIAL HIGH SCHOOL CAMPUS CAM #C689; 6) PORT NECHES CAM #136; 7) SABINE PASS CAM #C640; 8) SOUTHEAST TEXAS REGIONAL AIRPORT CAM #C643; 9) WEST ORANGE CAM #C9.
## 3                                                                                                                                                              CONTRIBUTIONS WILL BE USED TO PAY FOR LABOR AND MATERIALS COSTS ASSOCIATED WITH IMPLEMENTING THE MARSH MANIA PROJECT, WHICH INCLUDE RESTORING SHORELINE ELEVATIONS, GROWING PLANTS FOR SHORELINE RESTORATION, AND PLANTING NEW HABITAT.  HISTORIC SUBSIDENCE AND EROSION HAVE RESULTED IN SHORELINE ELEVATIONS WHICH ARE TOO LOW TO SUPPORT VEGETATION.
## 4                                                                                                                                                 PROJECT SHALL COORDINATE COLLECTION EVENTS FOR LOCAL RESIDENTS TO BRING IN HOUSEHOLD HAZARDOUS WASTE SUCH AS PAINT, THINNERS, PESTICIDES, OIL AND GAS, CORROSIVE CLEANERS, AND FERTILIZERS FOR PROPER DISPOSAL.  WHERE AVAILABLE, THE PROJECTS MAY ALSO OFFER ELECTRONICS COLLECTION AND RECYCLING.  SEP FUNDS SHALL BE USED FOR COLLECTION, RECYCLING AND DISPOSAL.
## 5                                                                                                                                                              CONTRIBUTIONS WILL BE USED TO PAY FOR LABOR AND MATERIALS COSTS ASSOCIATED WITH IMPLEMENTING THE MARSH MANIA PROJECT, WHICH INCLUDE RESTORING SHORELINE ELEVATIONS, GROWING PLNATS FOR SHORELINE RESTORATION, AND PLANTING NEW HABITAT.  HISTORIC SUBSIDENCE AND EROSION HAVE RESULTED IN SHORELINE ELEVATIONS WHICH ARE TOO LOW TO SUPPORT VEGETATION.
## 6                                                                                                                                                         BAYOU LAND CONSERVANCY - LAKE HOUSTON WATERSHED - WESTERN WATERSHED PROTECTION PROJECT: BLC HAS IDENTIFIED APPROXIMATELY 600 ACRES ALONG THE WEST FORK OF THE SAN JACINTO RIVER, SPRING CREEK, CYPRESS CREEK, AND LAKE CREEK FOR ACQUISITION OF PERPETUAL CONSERVATION EASEMENTS IN ACCORDANCE WITH SUBCHAPTER A, CHAPTER 183, TEXAS NATURAL RESOURCES CODE.
##   SEP.Cost.1 SEP.Offset.1               Type.2
## 1      27500        27500                     
## 2      14175        14175                     
## 3       7000         7000 POLLUTION PREVENTION
## 4      10800        10800 POLLUTION PREVENTION
## 5       1891         1891 POLLUTION PREVENTION
## 6      18200        18200 POLLUTION PREVENTION
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                SEP.Project.2
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
## 3 CONTRIBUTIONS WILL BE USED IN ACCORDANCE WITH THE SUPPLEMENTAL ENVIRONMENTAL PROJECT AGREEMENT BETWEEN THE ARMAND BAYOU NATURE CENTER AND THE TEXAS COMMISSION ON ENVIRONMENTAL QUALITY. THE TALLGRASS PRAIRIE WAS ONCE A COMMON ECOSYSTEM IN TEXAS AND THE UNITED STATES.  TODAY MORE THAN 99% OF THIS HABITAT HAS BEEN LOST AND THE REMAINDER IS HIGHLY FRAGMENTED AND SEVERELY THREATENED BY EXOTIC SPECIES AND DEVELOPMENT. PRESCRIBED BURNING IS ONE STEWARDSHIP TOOL USED TO MAINTAIN A TALLGRASS PRAIRIE ECOSYSTEM.
## 4                                                                                                                                                                                        PROJECT WILL REPAIR OR REPLACE FAILING WATER SYSTMES OR ON-SITE WASTEWATER SYSTEMS FOR LOW-INCOME HOMEOWNERS.  SEP FUNDS WILL BE USED TO PAY FOR THE LABOR AND MATERIAL COSTS RELATED TO REPAIRING OR REPLACING THE FAILING SYSTEMS.  THE RECIPIENTS WILL NOT BE CHARGED FOR THE COST OF REPLACING OR REPAIRING THE FAILING SYSTEM.
## 5                                                                                                                                                         THIRD-PARTY ADMINISTRATOR MANAGES A SYSTEM OF ISLAND SANCTUARIES ALONG THE TEXAS COAST. THESE ISLANDS CONSTITUTE MORE THAN 4,000 ACRES THAT ARE HOME TO TWENTY-FIVE SPECIES OF COLONIAL WATERBIRDS, SEVERAL OF WHICH ARE CONSIDERED ENDANGERED OR THREATENED. MANY OF THE SPECIES OF WATERBIRDS NEST ONLY ON ISLANDS OWNED OR LEASED BY THIRD-PARTY ADMINISTRATOR.
## 6 CONTRIBUTIONS WILL BE USED IN ACCORDANCE WITH THE SUPPLEMENTAL ENVIRONMENTAL PROJECT AGREEMENT BETWEEN THE ARMAND BAYOU NATURE CENTER AND THE TEXAS COMMISSION ON ENVIRONMENTAL QUALITY. THE TALLGRASS PRAIRIE WAS ONCE A COMMON ECOSYSTEM IN TEXAS AND THE UNITED STATES.  TODAY MORE THAN 99% OF THIS HABITAT HAS BEEN LOST AND THE REMAINDER IS HIGHLY FRAGMENTED AND SEVERELY THREATENED BY EXOTIC SPECIES AND DEVELOPMENT. PRESCRIBED BURNING IS ONE STEWARDSHIP TOOL USED TO MAINTAIN A TALLGRASS PRAIRIE ECOSYSTEM.
##   SEP.Cost.2 SEP.Offset.2               Type.3
## 1         NA           NA                     
## 2         NA           NA                     
## 3       7000         7000                     
## 4      10800        10800 POLLUTION PREVENTION
## 5       1890         1890                     
## 6      18200        18200 POLLUTION PREVENTION
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                         SEP.Project.3
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## 3                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## 4 RC&D WILL COORDINATE WITH LOCAL CITY AND COUNTY GOVERNMENT OFFICIALS TO CLEAN-UP SITES WHERE TIRES HAVE BEEN DISPOSED OF ILLEGALLY.  CONTRIBUTIONS WILL BE USED TO CLEAN UP ILLEGAL TIRE SITES.  ELIGIBLE SITES WILL BE LIMITED TO AREAS WHERE A RESPONSIBLE PARTY CAN NOT BE IDENTIFIED AND WHERE THERE IS NO PREEXISTING OBLIGATION TO CLEAN UP THE SITE BY THE OWNER OR GOVERNMENT.  SEP MONIES WILL BE USED FOR THE DIRECT COST OF COLLECTION AND DISPOSAL OF DEBRIS AND TIRES.
## 5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
## 6                                                                                                                             CONTRIBUTIONS WILL BE USED TO PAY FOR LABOR AND MATERIALS COSTS ASSOCIATED WITH IMPLEMENTING THE MARSH MANIA PROJECT, WHICH INCLUDE RESTORING SHORELINE ELEVATIONS, GROWING PLNATS FOR SHORELINE RESTORATION, AND PLANTING NEW HABITAT.  HISTORIC SUBSIDENCE AND EROSION HAVE RESULTED IN SHORELINE ELEVATIONS WHICH ARE TOO LOW TO SUPPORT VEGETATION.
##   SEP.Cost.3 SEP.Offset.3 Type.4 SEP.Project.4 SEP.Cost.4 SEP.Offset.4 Type.5
## 1         NA           NA     NA            NA         NA           NA     NA
## 2         NA           NA     NA            NA         NA           NA     NA
## 3         NA           NA     NA            NA         NA           NA     NA
## 4      10800        10800     NA            NA         NA           NA     NA
## 5         NA           NA     NA            NA         NA           NA     NA
## 6      18200        18200     NA            NA         NA           NA     NA
##   SEP.Project.5 SEP.Cost.5 SEP.Offset.5
## 1            NA         NA           NA
## 2            NA         NA           NA
## 3            NA         NA           NA
## 4            NA         NA           NA
## 5            NA         NA           NA
## 6            NA         NA           NA
str(texas_env)
## 'data.frame':    1283 obs. of  29 variables:
##  $ Program         : chr  "AIR QUALITY" "AIR QUALITY" "INDUSTRIAL AND HAZARDOUS WASTE" "WATER QUALITY" ...
##  $ Case.No.        : int  49255 49681 48085 47475 49644 48316 50162 49756 50129 48836 ...
##  $ Customer.Name   : chr  "TOTALENERGIES PETROCHEMICALS & REFINING USA, INC." "THE PREMCOR REFINING GROUP INC." "J. M. HOLM & CO., INC." "CITY OF HEARNE" ...
##  $ Order.Date      : chr  "Sep 15, 2015" "Sep 15, 2015" "Sep 15, 2015" "Sep 15, 2015" ...
##  $ Penalty.Assessed: int  55000 35438 35000 40500 9453 68250 1125 4373 5000 72905 ...
##  $ Penalty.Deferred: int  0 7087 7000 8100 1890 13650 225 874 1000 14581 ...
##  $ Payable.Amount  : int  27500 14176 14000 0 3782 0 450 0 2000 29162 ...
##  $ SEP.Costs.Total : int  27500 14175 14000 32400 3781 54600 450 3499 2000 29162 ...
##  $ SEP.Offset.Total: int  27500 14175 14000 32400 3781 54600 450 3499 2000 29162 ...
##  $ Type.1          : chr  "POLLUTION PREVENTION" "POLLUTION PREVENTION" "POLLUTION PREVENTION" "POLLUTION PREVENTION" ...
##  $ SEP.Project.1   : chr  "PERFORMING PARTY SHALL OPERATE, MAINTAIN, AND POTENTIALLY EXPAND THE EXISTING SOUTHEAST TEXAS REGIONAL AIR MONI"| __truncated__ "PERFORMING PARTY SHALL OPERATE, MAINTAIN, AND POTENTIALLY EXPAND THE EXISTING SOUTHEAST TEXAS REGIONAL AIR MONI"| __truncated__ "CONTRIBUTIONS WILL BE USED TO PAY FOR LABOR AND MATERIALS COSTS ASSOCIATED WITH IMPLEMENTING THE MARSH MANIA PR"| __truncated__ "PROJECT SHALL COORDINATE COLLECTION EVENTS FOR LOCAL RESIDENTS TO BRING IN HOUSEHOLD HAZARDOUS WASTE SUCH AS PA"| __truncated__ ...
##  $ SEP.Cost.1      : int  27500 14175 7000 10800 1891 18200 450 3499 2000 29162 ...
##  $ SEP.Offset.1    : int  27500 14175 7000 10800 1891 18200 450 3499 2000 29162 ...
##  $ Type.2          : chr  "" "" "POLLUTION PREVENTION" "POLLUTION PREVENTION" ...
##  $ SEP.Project.2   : chr  "" "" "CONTRIBUTIONS WILL BE USED IN ACCORDANCE WITH THE SUPPLEMENTAL ENVIRONMENTAL PROJECT AGREEMENT BETWEEN THE ARMA"| __truncated__ "PROJECT WILL REPAIR OR REPLACE FAILING WATER SYSTMES OR ON-SITE WASTEWATER SYSTEMS FOR LOW-INCOME HOMEOWNERS.  "| __truncated__ ...
##  $ SEP.Cost.2      : int  NA NA 7000 10800 1890 18200 NA NA NA NA ...
##  $ SEP.Offset.2    : int  NA NA 7000 10800 1890 18200 NA NA NA NA ...
##  $ Type.3          : chr  "" "" "" "POLLUTION PREVENTION" ...
##  $ SEP.Project.3   : chr  "" "" "" "RC&D WILL COORDINATE WITH LOCAL CITY AND COUNTY GOVERNMENT OFFICIALS TO CLEAN-UP SITES WHERE TIRES HAVE BEEN DI"| __truncated__ ...
##  $ SEP.Cost.3      : int  NA NA NA 10800 NA 18200 NA NA NA NA ...
##  $ SEP.Offset.3    : int  NA NA NA 10800 NA 18200 NA NA NA NA ...
##  $ Type.4          : logi  NA NA NA NA NA NA ...
##  $ SEP.Project.4   : logi  NA NA NA NA NA NA ...
##  $ SEP.Cost.4      : logi  NA NA NA NA NA NA ...
##  $ SEP.Offset.4    : logi  NA NA NA NA NA NA ...
##  $ Type.5          : logi  NA NA NA NA NA NA ...
##  $ SEP.Project.5   : logi  NA NA NA NA NA NA ...
##  $ SEP.Cost.5      : logi  NA NA NA NA NA NA ...
##  $ SEP.Offset.5    : logi  NA NA NA NA NA NA ...

Dependent Variable: Payable.Amount (the amount that remains payable once SEP costs and deferred penalties have been taken into account).

Independent Variables: Penalty.Assessed, SEP.Costs.Total, and Penalty.Deferred.

# Linear model with Payable.Amount as the dependent variable
model <- lm(Payable.Amount ~ Penalty.Assessed + SEP.Costs.Total + Penalty.Deferred, data = texas_env)
summary(model)
## 
## Call:
## lm(formula = Payable.Amount ~ Penalty.Assessed + SEP.Costs.Total + 
##     Penalty.Deferred, data = texas_env)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -6075.8    -9.1    -7.9    -4.5  6742.1 
## 
## Coefficients:
##                    Estimate Std. Error   t value Pr(>|t|)    
## (Intercept)       9.8000517  9.8869601     0.991    0.322    
## Penalty.Assessed  1.0002789  0.0004447  2249.183   <2e-16 ***
## SEP.Costs.Total  -1.0005624  0.0008273 -1209.437   <2e-16 ***
## Penalty.Deferred -1.0007731  0.0010284  -973.096   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 302.9 on 1279 degrees of freedom
## Multiple R-squared:      1,  Adjusted R-squared:      1 
## F-statistic: 8.82e+06 on 3 and 1279 DF,  p-value: < 2.2e-16

My Interpretation of the Results:

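R-squared (1): The reported R-squared of 1 (a rounded value) means the model explains essentially all of the variance in the dependent variable, Payable.Amount. This is very unusual in real-world datasets, but overfitting is an unlikely explanation here: the model has only three predictors and 1279 residual degrees of freedom. The estimated coefficients are almost exactly 1, -1, and -1, which suggests that Payable.Amount is close to an accounting identity (Penalty.Assessed minus SEP.Costs.Total minus Penalty.Deferred). The regression is therefore largely recovering arithmetic that is built into the data rather than estimating an uncertain relationship.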
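
One way to probe why R-squared is so high is to check whether the response is simply an arithmetic combination of the predictors. The sketch below does this for the six rows printed by head(texas_env) above; `first_rows` and `identity_gap()` are hypothetical names introduced for illustration, not part of the original analysis.

```r
# Values copied from the head(texas_env) output above (first six rows only)
first_rows <- data.frame(
  Penalty.Assessed = c(55000, 35438, 35000, 40500, 9453, 68250),
  Penalty.Deferred = c(0, 7087, 7000, 8100, 1890, 13650),
  SEP.Costs.Total  = c(27500, 14175, 14000, 32400, 3781, 54600),
  Payable.Amount   = c(27500, 14176, 14000, 0, 3782, 0)
)

# Hypothetical helper: difference between the reported payable amount and
# assessed - deferred - SEP costs (zero means the identity holds exactly)
identity_gap <- function(df) {
  df$Payable.Amount -
    (df$Penalty.Assessed - df$Penalty.Deferred - df$SEP.Costs.Total)
}

gap <- identity_gap(first_rows)
all(gap == 0)  # TRUE for these six rows
```

Applying the same helper to the full data, for example with `table(identity_gap(texas_env) == 0, useNA = "ifany")`, would show how many of the 1283 records depart from the identity; that result is not shown here because it has not been run.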

p-value (< 2.2e-16): This is the p-value of the overall F-test (F = 8.82e+06 on 3 and 1279 degrees of freedom). It means that if none of the independent variables were related to Payable.Amount, an F-statistic this large would be extremely unlikely, so the model as a whole is statistically significant. Each individual coefficient is also significant at the 0.001 level, while the intercept (p = 0.322) is not distinguishable from zero.
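
As a rough illustration of where the "< 2.2e-16" comes from, the overall p-value is the upper tail of an F distribution evaluated at the reported F-statistic. The sketch below plugs in the numbers printed by summary(model); the variable names are assumptions made for this example.

```r
# Values copied from the summary(model) output above
f_stat <- 8.82e6   # F-statistic
df_num <- 3        # numerator degrees of freedom
df_den <- 1279     # denominator (residual) degrees of freedom

# Upper-tail probability of the F distribution
p_value <- pf(f_stat, df_num, df_den, lower.tail = FALSE)

# The same probability on the log scale, which avoids underflow
log_p_value <- pf(f_stat, df_num, df_den, lower.tail = FALSE, log.p = TRUE)

# R prints p-values below machine epsilon as "< 2.2e-16"
p_value < .Machine$double.eps
```

The printed "< 2.2e-16" is therefore a display floor rather than the p-value itself; the log-scale value gives a sense of how far below that floor it lies.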

# Linearity assumption check: residuals vs. fitted values
plot(model, which = 1)

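The visible pattern in the residuals-vs-fitted plot suggests that the linearity assumption may be violated, so a plain linear model may not be the best fit. The residual summary adds a caveat: the quartiles all lie within about 10 of zero while the extremes are near -6076 and 6742, so the pattern may be driven by a small number of records with large residuals rather than by curvature across the whole range.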
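
A possible follow-up to the plot is to list the observations with the largest absolute residuals and inspect them directly. The sketch below uses synthetic data, not the TCEQ file; `largest_residuals()`, `toy`, and its column names are hypothetical and exist only to make the example self-contained.

```r
# Hypothetical helper: rows of the model frame with the largest |residual|
largest_residuals <- function(fit, n = 5) {
  res <- residuals(fit)
  mf  <- model.frame(fit)  # only the rows actually used in the fit
  idx <- order(abs(res), decreasing = TRUE)[seq_len(min(n, length(res)))]
  cbind(mf[idx, , drop = FALSE], residual = res[idx])
}

# Synthetic stand-in: an exact identity with two deliberately altered rows
set.seed(42)
toy <- data.frame(
  assessed = runif(50, 10000, 50000),
  deferred = runif(50, 0, 5000),
  sep      = runif(50, 0, 5000)
)
toy$payable <- toy$assessed - toy$deferred - toy$sep
toy$payable[c(7, 23)] <- toy$payable[c(7, 23)] + c(6000, -6000)

fit <- lm(payable ~ assessed + deferred + sep, data = toy)
top <- largest_residuals(fit, n = 2)
rownames(top)  # the two altered rows, 7 and 23, in some order
```

With the real model, `largest_residuals(model, n = 10)` would return the ten records that the fitted line misses by the most, which is a reasonable starting point for deciding whether the pattern reflects nonlinearity or a few unusual cases.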