## Class "Spatial" [package "sp"]
##
## Slots:
##
## Name: bbox proj4string
## Class: matrix CRS
##
## Known Subclasses:
## Class "SpatialPoints", directly
## Class "SpatialMultiPoints", directly
## Class "SpatialGrid", directly
## Class "SpatialLines", directly
## Class "SpatialPolygons", directly
## Class "SpatialPointsDataFrame", by class "SpatialPoints", distance 2
## Class "SpatialPixels", by class "SpatialPoints", distance 2
## Class "SpatialMultiPointsDataFrame", by class "SpatialMultiPoints", distance 2
## Class "SpatialGridDataFrame", by class "SpatialGrid", distance 2
## Class "SpatialLinesDataFrame", by class "SpatialLines", distance 2
## Class "SpatialPixelsDataFrame", by class "SpatialPoints", distance 3
## Class "SpatialPolygonsDataFrame", by class "SpatialPolygons", distance 2
The purpose of this study is to analyze wildfire trends from 1984 to
2021. Because the global climate is closely tied to the natural
environment, it has become increasingly important to track trends in
natural hazards so that economic and human resources can be adjusted to
mitigate their negative impacts.
Data for Analysis
1. Utah Surface Management Agency (SMA)
a. This dataset was collected by the BLM and retrieved from the
Department of Interior’s website. The layers in this dataset are polygon
features depicting the surface area of Utah and which management agency
oversees them.
c. Because this dataset was gathered and is managed by government
agencies, its information on private land ownership is very basic and
unlikely to provide detail. However, as this study is only concerned
with government lands, a generalization of private land ownership is
acceptable.
2. United States Wildfires
a. This dataset was collected by the US Forest Service and retrieved
from the Department of Agriculture’s website. The layers within this
vector point shapefile depict points where wildfires have occurred since
1984.
c. A boundary shapefile is also available for this dataset; however,
its file size is very large, so the vector point file will be used, as
this study is only interested in the general location and counts of
wildfires. For those interested in exporting this data as KML/KMZ
output, the file size and complexity are too great.
d. This dataset inventories wildfires greater than 500 acres in the
eastern U.S. and 1,000 acres in the western U.S. The dataset has some
limitations: wildfires are confirmed using satellite imagery, and
smaller fires are omitted. Even so, it is extensive and accurate for
recent historical wildfires. Because its records begin in 1984, it does
not depict longer-term wildfire behavior.
3. Land Cover
a. This dataset was collected by the SWReGAP project, a
multi-institutional group. The layers within this raster file depict
land cover over a five-state region to assess biodiversity.
c. The generalizability of this dataset is outstanding: it expresses
numerous land cover types that were validated and assessed across
multiple agency groups. It is interesting to note that the dataset is
limited to 5 states in the intermountain west and southern U.S.;
including northern states in the intermountain region would reduce bias
toward land cover types dominant in southern climates.
Load and Clean Data:
1. Surface Management (SMA) Shapefile
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\joshc\Documents\USU\2023\GEOG 4870\FInal Project\GEOG4870 Final\Final Data\SMA\utah_sma.shp", layer: "utah_sma"
## with 12009 features
## It has 9 fields
Data Cleaned
Drop column 4 as it was all NA
Drop missing values within rows
Remove duplicate rows
2. Wildfire Shapefile
## OGR data source with driver: ESRI Shapefile
## Source: "C:\Users\joshc\Documents\USU\2023\GEOG 4870\FInal Project\GEOG4870 Final\Final Data\US Wildfires\S_USA.MTBS_FIRE_OCCURRENCE_PT.shp", layer: "S_USA.MTBS_FIRE_OCCURRENCE_PT"
## with 29926 features
## It has 22 fields
## Integer64 fields read as strings: MAP_ID DNBR_OFFST DNBR_STDDV
Data Cleaned
Drop columns with no data: wildfires <- wildfires[, -c(14,17,20)]
Drop missing values within rows: wildfires <- wildfires %>% na.omit()
Remove duplicate rows: wildfires <- wildfires[!duplicated(wildfires$FIRE_ID), ]
## class : SpatialPointsDataFrame
## features : 29926
## extent : -166.0911, -65.35063, 17.95656, 70.13926 (xmin, xmax, ymin, ymax)
## variables : 19
## # A tibble: 29,926 × 19
## FIRE_ID FIRE_NAME ASMNT_TYPE FIRE_TYPE NODATA_THR GREENNESS_ LOW_THRESH
## <chr> <chr> <chr> <chr> <dbl> <dbl> <dbl>
## 1 AZ3359910999… MUD SPRI… Initial Wildfire -970 -150 100
## 2 TX3129409771… OWL CREEK Initial Wildfire -970 -150 110
## 3 FL2686508190… UNNAMED Initial (… Unknown 9999 9999 400
## 4 CO4001210858… SPRING C… Extended Wildfire -970 -150 80
## 5 AZ3586611185… ANDERSON Extended Wildfire -970 -150 20
## 6 OR4334712281… WILLIAMS… Extended Wildfire -970 -150 110
## 7 FL2685008188… UNNAMED Initial (… Unknown 9999 9999 400
## 8 FL2763008118… UNNAMED Initial (… Unknown 9999 9999 350
## 9 UT3779310991… COOPER S… Extended Wildfire -970 -150 120
## 10 WA4649412121… DISCOVERY Extended Wildfire -970 -150 100
## # ℹ 29,916 more rows
## # ℹ 12 more variables: MODERATE_T <dbl>, HIGH_THRES <dbl>, IG_DATE <chr>,
## # LATITUDE <dbl>, LONGITUDE <dbl>, ACRES <dbl>, MAP_ID <chr>, MAP_PROG <chr>,
## # DNBR_OFFST <chr>, DNBR_STDDV <chr>, PRE_ID <chr>, POST_ID <chr>
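The three cleaning steps listed for the wildfire shapefile can be sketched on a small stand-in table. This is only an illustration under assumptions: the `fires` data frame, its values, and the `EMPTY` column are made up, and it is a plain data frame rather than the `SpatialPointsDataFrame` used in the study. Here the empty column is detected rather than dropped by position.

```r
# Toy stand-in for the wildfire attribute table (all values are made up).
fires <- data.frame(
  FIRE_ID = c("A1", "A2", "A2", "A3", "A4"),
  ACRES   = c(1200, 5400, 5400, NA, 800),
  EMPTY   = NA,                      # hypothetical column holding no data
  stringsAsFactors = FALSE
)

# 1. Drop columns that contain only NA.
all_na <- vapply(fires, function(col) all(is.na(col)), logical(1))
fires  <- fires[, !all_na, drop = FALSE]

# 2. Drop rows with any remaining missing value.
fires <- na.omit(fires)

# 3. Keep the first record for each FIRE_ID.
fires <- fires[!duplicated(fires$FIRE_ID), ]

print(fires)
```

Dropping the all-NA columns first matters: if `na.omit()` ran before that step, every row would be removed because each row has a missing value in the empty column.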
3. Land Cover
Data Preparation
Research questions and necessary data.
1. Which areas in the U.S. are at most risk of wildfires?
a. us.cities data
b. Fire point locations within the US
2. Has there been an increase in large wildfires since 1984?
a. Wildfire Shapefile
b. Key variables within the dataset are the year in which each
wildfire occurred and its size, measured in the “ACRES” field.
4. Brainstorming
1. Frequency of large wildfires:
a. Filter wildfires that are well above the average size and
visualize any change in the frequency of large fires over time.
2. General wildfire trends.
a. Separate the entire dataset into multiple periods of time, then
compare wildfire change on line charts throughout each time period.
DATA EXPLORATION
5. Summarize key variables of interest.
Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 29926 | 6918 | 25127 | 500 | 1168 | 1930 | 4213 | 1068802 |
| year | 29926 | 2006 | 10 | 1984 | 1999 | 2008 | 2014 | 2021 |
| GREENNESS_ | 29926 | 2984 | 4691 | -650 | -150 | -150 | 9999 | 9999 |
| LOW_THRESH | 29926 | 14 | 1148 | -9999 | 40 | 75 | 164 | 9999 |
| MODERATE_T | 29926 | 23 | 6612 | -9999 | -120 | 277 | 400 | 9999 |
| HIGH_THRES | 29926 | 1048 | 8058 | -9999 | -9999 | 550 | 9999 | 9999 |
Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 846 | 8078 | 26515 | 502 | 1184 | 2050 | 4606 | 333571 |
| year | 846 | 2008 | 8.6 | 1990 | 2000 | 2010 | 2015 | 2020 |
| GREENNESS_ | 846 | 2464 | 4442 | -320 | -150 | -150 | 9999 | 9999 |
| LOW_THRESH | 846 | 64 | 936 | -9999 | 46 | 75 | 150 | 740 |
| MODERATE_T | 846 | 848 | 6121 | -9999 | 150 | 292 | 449 | 9999 |
| HIGH_THRES | 846 | 1622 | 7727 | -9999 | -87 | 580 | 9999 | 9999 |
Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 29926 | 6918 | 25127 | 500 | 1168 | 1930 | 4213 | 1068802 |
| year | 29926 | 2006 | 10 | 1984 | 1999 | 2008 | 2014 | 2021 |
Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 10056 | 6211 | 19009 | 500 | 1193 | 1994 | 4465 | 563527 |
| year | 10056 | 1994 | 5.6 | 1984 | 1989 | 1994 | 1999 | 2002 |
| GREENNESS_ | 10056 | 3581 | 4895 | -500 | -150 | -150 | 9999 | 9999 |
| LOW_THRESH | 10056 | 55 | 1062 | -9999 | 50 | 100 | 220 | 1100 |
| MODERATE_T | 10056 | -1285 | 6060 | -9999 | -9999 | 223 | 335 | 9999 |
| HIGH_THRES | 10056 | -696 | 7518 | -9999 | -9999 | 435 | 710 | 9999 |
Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 19870 | 7276 | 27706 | 500 | 1153 | 1899 | 4083 | 1068802 |
| year | 19870 | 2012 | 5 | 2003 | 2008 | 2011 | 2016 | 2021 |
| GREENNESS_ | 19870 | 2682 | 4555 | -650 | -150 | -150 | 9999 | 9999 |
| LOW_THRESH | 19870 | -6 | 1188 | -9999 | 40 | 65 | 150 | 9999 |
| MODERATE_T | 19870 | 685 | 6779 | -9999 | 89 | 300 | 9999 | 9999 |
| HIGH_THRES | 19870 | 1931 | 8177 | -9999 | -9999 | 600 | 9999 | 9999 |
Summary Findings:
1. Data collected from 1990-2020 includes wildfires ranging from
approximately 500 to 333,571 acres in size. The median fire size of
2,050 acres depicts the typical wildfire more accurately than the mean
of 8,078 acres, which is likely skewed by outlying fires much larger in
scope than the majority of wildfires.
2. While the minimum and maximum year values were determined by the
study period, the concentration of records in later years (median year
2010) is an indicator of an increasing frequency of wildfires toward
the 2nd half of the study period.
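The gap between the mean and the median in finding 1 can be reproduced with a few numbers. The sketch below uses made-up fire sizes (not values from the dataset) in which one very large fire pulls the mean far above the median.

```r
# Made-up fire sizes in acres; the last value is a single very large fire.
acres <- c(500, 800, 1200, 1900, 2100, 4500, 333000)

mean_acres   <- mean(acres)    # pulled upward by the one large fire
median_acres <- median(acres)  # middle value, unaffected by the extreme

cat("mean:", round(mean_acres), " median:", median_acres, "\n")
```

In a right-skewed variable such as fire size, the median stays close to what most fires look like, which is why it is the better summary of a typical fire.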
6. Create a boxplot to identify outliers.


Outlier Findings:
As clearly depicted by the boxplots, there are numerous outliers of
large wildfires that skew the data each year. The majority of outlier
data will be kept, as large-scale wildfires are important to overall
wildfire trends. However, summary statistics were also analyzed with
the outliers removed.
When comparing the first half of the study period with the 2nd, the
total number of wildfire incidents increased from 10,056 (1984-2002) to
19,870 (2003-2021).
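Boxplots in R flag points that lie more than 1.5 times the box length beyond the hinges. A minimal sketch of pulling those points out and summarizing the data with and without them follows; the values are made up, and whether the study removed outliers with this exact rule is an assumption.

```r
# Made-up fire sizes in acres (not taken from the dataset).
acres <- c(500, 800, 1200, 1900, 2100, 4500, 333000)

# boxplot.stats() applies the same 1.5 * box-length rule that boxplot() draws.
flagged <- boxplot.stats(acres)$out
trimmed <- acres[!(acres %in% flagged)]

summary(acres)    # with outliers
summary(trimmed)  # with outliers removed
```

Comparing the two summaries shows how much of the mean is driven by the flagged fires while the quartiles move far less.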
7. Compute and visualize statistics for acres burned and year.
Summary statistics for the two most important variables in measuring
wildfire size over time: acres and year. The study period was separated
into two halves and 5-year increments to represent wildfire history
from the 1st half and the 2nd half of the study period.
Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 846 | 8078 | 26515 | 502 | 1184 | 2050 | 4606 | 333571 |
| year | 846 | 2008 | 8.6 | 1990 | 2000 | 2010 | 2015 | 2020 |
Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 10056 | 6211 | 19009 | 500 | 1193 | 1994 | 4465 | 563527 |
| year | 10056 | 1994 | 5.6 | 1984 | 1989 | 1994 | 1999 | 2002 |
Summary Statistics

| Variable | N | Mean | Std. Dev. | Min | Pctl. 25 | Pctl. 50 | Pctl. 75 | Max |
|---|---|---|---|---|---|---|---|---|
| ACRES | 19870 | 7276 | 27706 | 500 | 1153 | 1899 | 4083 | 1068802 |
| year | 19870 | 2012 | 5 | 2003 | 2008 | 2011 | 2016 | 2021 |
8. Compute and visualize correlation.
## [1] 0.02212699
## [1] 0.01879452
Correlation between size and time separated into two time periods:
1984-2002 and 2003-2021.
## [1] 0.02876973
## [1] 0.002651568
The correlation analysis showed little correlation between the size
of wildfires and the year in which they burned. Only a slight positive
correlation was found over the years.
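A correlation between fire size and year, computed overall and then separately for each half of the study period, can be sketched as below. The data frame and its values are made up for illustration, and the use of Pearson correlation (the default of `cor()`) is an assumption about the original analysis.

```r
# Made-up records: one row per fire, with ignition year and size in acres.
fires <- data.frame(
  year  = c(1984, 1990, 1996, 2002, 2003, 2009, 2015, 2021),
  ACRES = c(1500, 900, 4200, 2600, 3100, 700, 5200, 1800)
)

# Label each fire with the half of the study period it falls in.
fires$period <- ifelse(fires$year <= 2002, "1984-2002", "2003-2021")

# Pearson correlation for the whole record, then within each period.
overall   <- cor(fires$year, fires$ACRES)
by_period <- sapply(split(fires, fires$period),
                    function(d) cor(d$year, d$ACRES))

print(overall)
print(by_period)
```

Values near zero, like those reported above, mean that the year alone explains almost none of the variation in individual fire size, even if counts of fires are rising.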
9. Create an exploratory and multivariate visualization.


The line chart of average wildfire size since 1984 indicates a
gradually increasing trend, with a spike in acres burned from
2002-2005.

The line chart comparing prescribed burns vs wildfires indicates
prescribed burns have consistently averaged approximately 1,000 acres
since 1984, while wildfires averaged approximately 15,000 acres in
1990, increasing to an approximate average of 20,000 acres in 2021.
The first half of the study period (1984-2002) experienced an increase
in average wildfire size from 5,700 to 8,800 acres, approximately a 54%
increase. In the 2nd half of the study period (2003-2021), the average
decreased from 9,200 acres to approximately 7,500 acres in 2021, a
decrease of nearly 18%. While there seems to be a general increase in
size over time, a spike in large wildfires between 2003-2005, followed
by a sudden decrease in the following years, shows unpredictability in
wildfire trends.
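Percentage change is measured against the starting value. A short sketch using the rounded averages quoted in the paragraph above; `pct_change` is a helper name introduced here for illustration.

```r
# Percent change relative to the starting value.
pct_change <- function(start, end) (end - start) / start * 100

first_half  <- pct_change(5700, 8800)  # 1984-2002 average sizes
second_half <- pct_change(9200, 7500)  # 2003-2021 average sizes

round(first_half, 1)   # 54.4
round(second_half, 1)  # -18.5
```

Dividing by the ending value instead (3,100 / 8,800) gives about 35%, which understates the first-half increase.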
Spatial Integration and Visualization
Data Preparation
1. Check Projections
## [1] TRUE
2. Crop wildfires and cities to Utah. Merge the two datasets
together.
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
## [1] "TX" "OH" "CA" "GA" "NY" "OR" "NM" "LA" "VA" "PA" "FL" "IA" "AK" "IN" "MA"
## [16] "MI" "MD" "MN" "WI" "IL" "CO" "NC" "NJ" "AL" "WA" "ME" "AZ" "TN" "NE" "MT"
## [31] "MS" "ND" "MO" "ID" "UT" "KY" "CT" "OK" "NV" "WY" "SC" "WV" "NH" "AR" "RI"
## [46] "DE" "HI" "KS" "VT" "SD" "DC"
## [1] "UT"
## [1] "character"
## [1] "character"
3. Filter and create a new dataset of wildfires over 150,000 acres.
4. Large Wildfire Distribution (> 150,000 acres: United States)

5. Spatial Visualization of Wildfires Greater Than 150,000
Acres.


The line and point plots depict an increasing trend in the frequency
of large wildfires over 150,000 acres. While the initial data collected
in the 1980s showed more incidents than the 1990s, the point plot shows
much greater frequency of large wildfires after 2005.
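The filter-and-count step behind these plots can be sketched as follows. The data frame and its values are made up; only the 150,000-acre threshold comes from the text.

```r
# Made-up records: ignition year and size in acres.
fires <- data.frame(
  year  = c(1988, 1988, 1996, 2006, 2007, 2007, 2012, 2020),
  ACRES = c(200000, 9000, 160000, 480000, 1500, 310000, 150000, 1000000)
)

# Keep only fires strictly larger than 150,000 acres.
large <- fires[fires$ACRES > 150000, ]

# Count large fires per year; these counts are what a frequency plot shows.
per_year <- table(large$year)
print(per_year)
```

Because the comparison is strict, a fire of exactly 150,000 acres is excluded; using `>=` would include it.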
6. Create a spatial visualization of your data using tmap. Describe
the visualization as well as any interesting patterns or trends.

## Error in wk_handle.wk_wkb(wkb, s2_geography_writer(oriented = oriented, :
## Loop 0 is not valid: Edge 344 crosses edge 346
The tmap shows that the majority of large wildfires occur in the
western part of the United States. The majority of fires greater than
800,000 acres occurred in California and Oregon, with a few of the
largest wildfires recorded near the Oklahoma-Kansas border.