Introduction

A wildfire is an unplanned fire that burns in a natural area such as a forest, grassland, or prairie. Wildfires in the United States and throughout the world are caused by human activity or by natural phenomena such as lightning. According to the World Health Organization, climate change has made forest fires more frequent by creating extremely dry conditions and strong winds. These fires are highly damaging to human life as well as to flora and fauna: they cause loss of life, property, crops, and animals, and a severe deterioration of air quality.

Forest fires can increase mortality and morbidity; the impact depends on the size of the fire, its speed, and how close it is to people. Smoke from forest fires is a mixture of air pollutants. Inhaling this polluted air can affect people’s speech, hearing, and even motor skills. It also irritates the eyes, nose, throat, and lungs. Lung function can decrease, with symptoms that include coughing and wheezing, and cardiovascular diseases such as heart failure can be aggravated.

Forest fires affect the whole population, but babies, young children, pregnant women, and the elderly are especially susceptible to the related health problems. Fires can also disrupt transportation, communications, energy, gas, and water services. The National Park Service reports that nearly 85% of wildland fires in the United States are caused by humans. The vast majority of these fires are unintentional; they start from unattended campfires, burning debris, misused or malfunctioning machinery, and discarded lit cigarettes. This is why the data shown below examine the causes of forest fires in two specific regions.

Oregon’s forests are divided into three types: (1) wet forests, (2) southwestern Oregon forests, and (3) dry forests. Wildfires occur naturally in these forests; in the dry forests, periodic burns contribute to the health of the forest ecosystem. Climate change may be making wildfires more frequent and wildfire seasons longer.

Although natural causes (lightning, dry brush, etc.) are common, human activity is one of the main causes of wildfires in Oregon. Activities such as campfires, fireworks, and burning debris piles can all start wildfires.

Although small wildfires can contribute to a forest’s overall health, wildfires also have negative effects. These include the cost of extinguishing them (water, fuel for fire trucks, and other resources), damage to buildings, and reduced air and water quality, among others.

Dry forests

In the dry ponderosa pine forests of central and eastern Oregon, fire historically burned through any given area every two to 25 years. But the fires generally were not intense. Understory plants were burned off, but large trees usually survived.

Wet forests

In the wet Douglas-fir forests on the west side of the Cascades and in the Coast Range, fire in any given stand is much less frequent, once every 200 to several hundred years. The historic record shows numerous instances of large, intense fires that killed most of the forest.

Southwestern Oregon forests

Interior southwestern Oregon forests experience some of the dryness of east-side forests, but with productivity more like west-side forests. They have intermediate fire behavior, and historically burned with mixed severity every 25 to 50 years.

Data cleaning

## Rows: 23,490
## Columns: 38
## $ Serial                 <int> 58256, 59312, 61657, 63735, 68019, 68067, 68224…
## $ FireCategory           <chr> "STAT", "STAT", "STAT", "STAT", "STAT", "STAT",…
## $ FireYear               <int> 2000, 2000, 2001, 2002, 2003, 2003, 2003, 2005,…
## $ Area                   <chr> "EOA", "EOA", "SOA", "NOA", "NOA", "EOA", "EOA"…
## $ DistrictName           <chr> "Central Oregon", "Northeast Oregon", "Southwes…
## $ UnitName               <chr> "John Day", "La Grande", "Grants Pass", "Philom…
## $ FullFireNumber         <chr> "00-952011-01", "00-971024-01", "01-712133-02",…
## $ FireName               <chr> "Slick Ear #2", "Woodley", "QUEENS BRANCH", "WR…
## $ Size_class             <chr> "B", "C", "A", "A", "A", "A", "A", "A", "A", "B…
## $ EstTotalAcres          <dbl> 0.75, 80.00, 0.10, 0.01, 0.01, 0.01, 0.00, 0.01…
## $ Protected_Acres        <dbl> 0.75, 80.00, 0.10, 0.01, 0.01, 0.01, 0.00, 0.01…
## $ HumanOrLightning       <chr> "Lightning", "Lightning", "Human", "Human", "Li…
## $ CauseBy                <chr> "Lightning", "Lightning", "Motorist", "Motorist…
## $ GeneralCause           <chr> "Lightning", "Lightning", "Smoking", "Recreatio…
## $ SpecificCause          <chr> "Lightning", "Lightning", "Other - Smoker Relat…
## $ Cause_Comments         <chr> "", "", "", "", "", "", "", "", "", "", "", "Du…
## $ Lat_DD                 <dbl> 44.91519, 45.08509, 42.53671, 44.58709, 44.7402…
## $ Long_DD                <dbl> -119.2886, -118.3344, -123.2121, -123.4278, -12…
## $ LatLongDD              <chr> "POINT (-119.28863 44.91519)", "POINT (-118.334…
## $ FO_LandOwnType         <chr> "BLM", "Other Private", "BLM", "State", "Indust…
## $ Twn                    <chr> "07S", "05S", "35S", "11S", "09S", "02N", "24S"…
## $ Rng                    <chr> "29E", "36E", "04W", "06W", "07W", "43E", "06E"…
## $ Sec                    <int> 31, 32, 7, 28, 36, 23, 1, 17, 9, 21, 17, 28, 12…
## $ Subdiv                 <chr> "NESW", "NESW", "SESE", "SENW", "SWSW", "NWSE",…
## $ LandmarkLocation       <chr> "11 MI SE Ritter LO", "Woodley C.G", "7 N ROGUE…
## $ County                 <chr> "Grant", "Union", "Jackson", "Benton", "Polk", …
## $ RegUseZone             <chr> "EC2", "NE3", "SW3", "W01", "WO1", "NE2", "WC2"…
## $ RegUseRestriction      <chr> "Reg Use Closure", "Reg Use Closure", "Reg Use …
## $ Industrial_Restriction <chr> "Does Not Apply - Eastern OR", "Does Not Apply …
## $ Ign_DateTime           <chr> "07/18/2000 07:00:00 PM", "08/24/2000 05:30:00 …
## $ ReportDateTime         <chr> "07/19/2000 01:20:00 PM", "08/24/2000 01:07:00 …
## $ Discover_DateTime      <chr> "07/19/2000 01:15:00 PM", "08/24/2000 01:07:00 …
## $ Control_DateTime       <chr> "07/20/2000 12:50:00 AM", "09/01/2000 09:30:00 …
## $ CreationDate           <chr> "07/20/2000 09:13:00 AM", "08/29/2000 03:59:00 …
## $ ModifiedDate           <chr> "11/14/2000 09:16:00 AM", "12/21/2000 04:22:00 …
## $ DistrictCode           <int> 95, 97, 71, 55, 55, 97, 99, 52, 98, 52, 98, 71,…
## $ UnitCode               <int> 952, 971, 712, 551, 552, 974, 991, 521, 981, 52…
## $ DistFireNumber         <chr> "011", "024", "133", "001", "013", "016", "228"…

Summary Statistics

First, let us look at the number of wildfires by year.

## # A tibble: 23 × 2
## # Groups:   Year [23]
##    Year  `Total Fires`
##    <fct>         <int>
##  1 2000            920
##  2 2001           1289
##  3 2002           1174
##  4 2003           1173
##  5 2004            920
##  6 2005            827
##  7 2006           1340
##  8 2007           1238
##  9 2008           1088
## 10 2009            990
## 11 2010            693
## 12 2011            697
## 13 2012            687
## 14 2013           1180
## 15 2014           1115
## 16 2015           1071
## 17 2016            829
## 18 2017           1059
## 19 2018           1105
## 20 2019           1011
## 21 2020            976
## 22 2021           1127
## 23 2022            859

Here we can see that the years with the fewest fires were 2010 through 2012. Now let us see what the main cause of these wildfires is:

Humans are the cause of most wildfires in Oregon. This should alert the fire department in Oregon, which needs to take measures to combat this problem.

Stratified sampling

Methodology: We will partition our data set into two groups that are heterogeneous between each other but homogeneous within. The groups will be Human and Lightning. Then, we take a random sample from each of these partitions. The objective is to improve the precision of the estimates by reducing sampling error.

We will create two strata: human-caused wildfires and lightning-caused wildfires.

## Rows: 17,124
## Columns: 6
## $ FireYear         <fct> 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000,…
## $ EstTotalAcres    <dbl> 0.00, 0.75, 0.25, 0.01, 0.01, 1.00, 100.00, 0.10, 18.…
## $ Protected_Acres  <dbl> 0.00, 0.75, 0.25, 0.01, 0.01, 1.00, 25.00, 0.10, 18.0…
## $ HumanOrLightning <fct> Human, Human, Human, Human, Human, Human, Human, Huma…
## $ Lat_DD           <dbl> 43.57131, 44.50671, 45.60894, 44.49021, 42.32296, 45.…
## $ Long_DD          <dbl> -121.6013, -123.3406, -117.6312, -120.1054, -123.3151…
## Rows: 6,244
## Columns: 6
## $ FireYear         <fct> 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000, 2000,…
## $ EstTotalAcres    <dbl> 0.75, 80.00, 0.01, 0.01, 0.01, 0.10, 0.01, 0.10, 0.01…
## $ Protected_Acres  <dbl> 0.75, 80.00, 0.01, 0.01, 0.01, 0.10, 0.01, 0.10, 0.01…
## $ HumanOrLightning <fct> Lightning, Lightning, Lightning, Lightning, Lightning…
## $ Lat_DD           <dbl> 44.91519, 45.08509, 44.94196, 43.47754, 44.60979, 45.…
## $ Long_DD          <dbl> -119.2886, -118.3344, -119.9228, -121.6512, -122.4130…

The human-caused stratum has 17,124 observations and the lightning-caused stratum has 6,244; the human stratum contains over 10,000 more observations than the lightning stratum.
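The partition-then-sample step described in the methodology can be sketched as follows. This is a minimal Python illustration, not the code that produced the output above; the toy records, the function name `stratified_sample`, and the per-stratum sample sizes are assumptions made for the example. Only the column names `HumanOrLightning` and `EstTotalAcres` come from the data set.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, sizes, seed=0):
    """Split records into strata by `key`, then draw a simple random
    sample without replacement of the requested size from each stratum."""
    strata = defaultdict(list)
    for rec in records:
        strata[rec[key]].append(rec)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return {name: rng.sample(rows, sizes[name]) for name, rows in strata.items()}

# Toy records (hypothetical values, not taken from the data set).
records = (
    [{"HumanOrLightning": "Human", "EstTotalAcres": 0.1 * i} for i in range(30)]
    + [{"HumanOrLightning": "Lightning", "EstTotalAcres": 5.0 * i} for i in range(10)]
)

sample = stratified_sample(records, "HumanOrLightning", {"Human": 6, "Lightning": 4})
print({name: len(rows) for name, rows in sample.items()})
```

Every sampled record stays inside its own stratum, so the two causes never mix within a sample.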

Summary Statistics

Let us take the total acres burned by human-caused fires and the total acres burned by lightning-caused fires:

##   Estimated Acres Burned by Humans Estimated Acres Burned by Lightning
## 1                          1182515                             4572161

The total area burned by human-caused fires is 1,182,515 acres, against 4,572,161 acres for lightning-caused fires. Even though humans cause more wildfires, lightning-caused wildfires are more devastating, as shown by the total acres burned. We created a graph of the total acres burned by each cause to check whether this hypothesis holds. What explains the difference? One possibility is that lightning strikes in dense forest and therefore produces a more powerful wildfire.

Here we can see that the acres burned vary a great deal from year to year. This will have an effect when we do stratified sampling with the optimal allocation method. Notice that lightning-caused wildfires burn more acres than human-caused fires. Why is that? Is it because they occur in remote places? We will create a map to see where most of them happen.

Also note that 2006 was the year with the most human-caused wildfires (953), yet the area burned was so small that it is the second lowest of all the years.

Biscuit fire (2002)

Map of location and spread of wildfires colored by their causes

Human-caused wildfires are concentrated in the west, while lightning-caused wildfires are more spread out. The Oregon fire department should have stations close to the areas that are most prone to fires. Let us assume that Oregon has the resources to build a total of 8 fire stations. We will find the optimal number of fire stations and compare it with our chosen number, 8.

We will suggest the following locations:

These locations should minimize time of arrival and, hence, the time to control the fire. The circle is the coverage area of each station.

Parameters and estimation of our parameters

Now we will create a table showing the parameters of our population: the total, mean, variance, and proportion. Later, we will compare this table with the table containing the parameters and estimators for our strata.

## [1] "Data Parameters"
##   Total_Acres_Burned Mean_of_Acres_Burned Average_of_Acres_Burned_by_Year
## 1            5754676             246.2631                        250203.3
##   Proportion_Human_Caused
## 1                0.732797
## [1] "Population parameters by Stratum"
##   Total_Acres_Burned_Humans Total_Acres_Burned_Lightning Means_Humans
## 1                   1182515                      4572161     69.05599
##   Means_Lightning Variance_Humans Variance_Lightning CV_Humans CV_Lightning
## 1        732.2487         6082732          148030624  35.71478     16.61564
## [1] "Population Parameters"
##   Total_Parameter Mean_Parameter
## 1         5754676       246.2631
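As a cross-check on the tables above, the stratum means and coefficients of variation can be recomputed from the reported totals, stratum sizes, and variances. The sketch below is in Python and is an illustration only, not the code that generated the output; it assumes the CV is the standard deviation divided by the mean, expressed as a ratio rather than a percentage, which is consistent with the printed values.

```python
import math

# Values copied from the tables above (stratum sizes, totals, variances).
strata = {
    "Human":     {"N": 17124, "total": 1182515, "variance": 6082732},
    "Lightning": {"N": 6244,  "total": 4572161, "variance": 148030624},
}

summary = {}
for name, s in strata.items():
    mean = s["total"] / s["N"]             # mean acres burned per fire
    cv = math.sqrt(s["variance"]) / mean   # coefficient of variation (ratio)
    summary[name] = {"mean": mean, "cv": cv}

N = sum(s["N"] for s in strata.values())
overall_mean = sum(s["total"] for s in strata.values()) / N
prop_human = strata["Human"]["N"] / N

for name, vals in summary.items():
    print(f"{name}: mean = {vals['mean']:.2f}, CV = {vals['cv']:.2f}")
print(f"overall mean = {overall_mean:.2f}, proportion human = {prop_human:.4f}")
```

The recomputed figures agree with the table: a lightning-caused fire burns roughly ten times as many acres on average as a human-caused one, while the human stratum is relatively more variable around its much smaller mean.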

We will take a sample from each stratum and compute the corresponding estimates. We will sample 2,000 fires from each stratum.

## [1] "Estimators and the standard errors of the estimators"
##   Estimated_Total Estimated_Mean SE_Total  SE_Mean
## 1         5282235       226.0456  1658130 168.7057
## [1] "Estimated standard errors of the estimators"
##   Estimated_Total_SE Estimated_Mean_SE
## 1            1584068          67.78789

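We can see that our estimates are close to the population parameters: the estimated total is 5,282,235 acres against a true total of 5,754,676, and the estimated mean is 226.05 acres against 246.26.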
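The standard error of the estimated total can be approximately reproduced from the population variances reported earlier. For simple random sampling without replacement within each stratum, the usual textbook variance of the stratified total is the sum over strata of N_h² (1 − n_h/N_h) S_h²/n_h. The Python sketch below applies that formula; it is an illustration under the assumption that this is the formula behind `SE_Total`, and the function name is hypothetical.

```python
import math

# Stratum sizes and population variances copied from the tables above;
# n = 2000 fires are sampled from each stratum.
strata = [
    {"name": "Human",     "N": 17124, "n": 2000, "S2": 6082732},
    {"name": "Lightning", "N": 6244,  "n": 2000, "S2": 148030624},
]

def var_stratified_total(strata):
    """Variance of the stratified estimator of the population total,
    with a finite population correction in each stratum."""
    return sum(
        s["N"] ** 2 * (1 - s["n"] / s["N"]) * s["S2"] / s["n"]
        for s in strata
    )

se_total = math.sqrt(var_stratified_total(strata))
print(f"SE of the estimated total: {se_total:,.0f}")
```

The result is about 1.66 million acres, within a small fraction of a percent of the reported `SE_Total`. Most of that variance comes from the lightning stratum, whose variance is far larger.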

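Now we will look at the design effect, the ratio of the variance of the estimator under the stratified design to its variance under simple random sampling with the same total sample size. A value below 1 means that stratification improves precision: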

## [1] "The design effect is: "
## [1] 0.551064
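The design effect can be approximated from the population figures reported earlier. The Python sketch below is an illustration under two assumptions that are not stated in the report: the comparison is against a simple random sample of the same total size (n = 4,000), and the overall variance is rebuilt from the within-stratum and between-stratum parts.

```python
# Population figures copied from the tables above.
strata = [
    {"N": 17124, "n": 2000, "mean": 69.05599, "S2": 6082732},
    {"N": 6244,  "n": 2000, "mean": 732.2487, "S2": 148030624},
]
N = sum(s["N"] for s in strata)
n = sum(s["n"] for s in strata)
overall_mean = sum(s["N"] * s["mean"] for s in strata) / N

# Overall variance = within-stratum part + between-stratum part
# (an approximation that treats N and N - 1 as interchangeable).
S2 = sum(
    (s["N"] / N) * (s["S2"] + (s["mean"] - overall_mean) ** 2)
    for s in strata
)

# Variance of the estimated total under each design.
var_strat = sum(
    s["N"] ** 2 * (1 - s["n"] / s["N"]) * s["S2"] / s["n"] for s in strata
)
var_srs = N ** 2 * (1 - n / N) * S2 / n

deff = var_strat / var_srs
print(f"design effect: {deff:.3f}")
```

Under these assumptions the ratio comes out at roughly 0.55, in line with the value above: the stratified design needs only a little over half the variance of a simple random sample of the same size.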

Allocation methods

In this section we look at the three allocation methods (labelled "fixation" in the output) and compare them:

## [1] "Sizes by fixation method: "
##   Human_Caused_Proportional_Fix_Size Lightning_Caused_Proportional_Fix_Size
## 1                           1465.594                                534.406
##     Optimal Size for Human Optimal size for lightning 
##                   202.5559                  1797.4441

As stated before, we can see that the optimal sample size for the lightning-caused stratum is much greater than for the human-caused stratum.
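The allocation sizes can be recomputed from the stratum sizes and variances. The Python sketch below is an illustration, not the code behind the output, and the helper `allocate` is a hypothetical name. Proportional allocation weights each stratum by N_h. The "optimal" sizes printed above are reproduced when the weights are N_h times the stratum variance; the textbook Neyman allocation uses N_h times the stratum standard deviation instead and is shown for comparison.

```python
import math

n = 2000  # total sample size to split between the strata
strata = {
    "Human":     {"N": 17124, "S2": 6082732},
    "Lightning": {"N": 6244,  "S2": 148030624},
}

def allocate(n, weights):
    """Split n across strata in proportion to the given weights."""
    total = sum(weights.values())
    return {name: n * w / total for name, w in weights.items()}

# Proportional allocation: weights are the stratum sizes N_h.
proportional = allocate(n, {k: s["N"] for k, s in strata.items()})
# Weights N_h * S_h^2 (variance): these match the "optimal" sizes in the output.
variance_weighted = allocate(n, {k: s["N"] * s["S2"] for k, s in strata.items()})
# Textbook Neyman allocation: weights N_h * S_h (standard deviation).
neyman = allocate(n, {k: s["N"] * math.sqrt(s["S2"]) for k, s in strata.items()})

for label, sizes in [("proportional", proportional),
                     ("variance-weighted", variance_weighted),
                     ("Neyman", neyman)]:
    print(label, {k: round(v, 1) for k, v in sizes.items()})
```

All three rules move sample toward the lightning stratum relative to its share of fires only when variability is taken into account; the standard-deviation weighting gives a less extreme split than the variance weighting.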

Let us now estimate our parameters using the proportional and optimal allocation sizes:

## [1] "Proportional fixation parameters:"
##   Estimated_Total_Proportional Estimated_Mean_Proportional
## 1                      4434616                     189.773
##   Estimated_Total_Variance_Proportional Estimated_Mean_Variance_Proportional
## 1                          4.305053e+12                             7883.796
## [1] "Optimal Fixation Parameters:"
##   Estimated Total Estimated Mean Estimated Total SE Estimated Mean SE
## 1         5777121       247.2236            2050847          87.76307
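To compare the designs on a common scale, the relative error of each estimated total can be computed against the known population total. The Python sketch below uses only figures reported above; the dictionary labels are ours. Each estimate comes from a single random draw, so the comparison is indicative rather than conclusive, and the equal allocation uses 4,000 observations in total while the other two use 2,000.

```python
true_total = 5754676  # population total of acres burned, from the table above

# Estimated totals reported for each allocation of the sample.
estimates = {
    "equal (2,000 per stratum)": 5282235,
    "proportional": 4434616,
    "optimal": 5777121,
}

# Signed relative error of each estimate with respect to the true total.
relative_error = {
    name: (est - true_total) / true_total for name, est in estimates.items()
}
for name, err in relative_error.items():
    print(f"{name}: {err:+.1%}")
```

The optimal allocation lands within half a percent of the true total, while the proportional allocation underestimates it by more than a fifth.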

Conclusion

Stratified sampling is a method that partitions the population into internally homogeneous groups. From each of these groups we take a simple random sample of size n. We can choose the n we desire, or we can apply an allocation rule to choose the best n. Here, we first selected n = 2,000 for each stratum, and we also used two allocation methods to help us select a proper n: proportional and optimal allocation. When comparing these two allocation methods, we can see that the optimal method (which is based on the variance) gave us better estimates of the population parameters than the proportional method. To finish, our estimates were close to the population parameters: total and mean.

References

  1. Fire. (n.d.). OregonForests. https://oregonforests.org/fire

  2. Short, K. C. (2017). Spatial wildfire occurrence data for the United States, 1992-2015 [FPA_FOD_20170508] (4th Edition) [Data set]. In Forest Service Research Data Archive. Forest Service Research Data Archive.

  3. Wikipedia contributors. (2023a). Cluster sampling. Wikipedia. https://en.wikipedia.org/wiki/Cluster_sampling

  4. Wikipedia contributors. (2023b). Stratified sampling. Wikipedia. https://en.wikipedia.org/wiki/Stratified_sampling

  5. Wildfire causes and evaluations (U.S. National Park Service). (2022). Nps.gov. https://www.nps.gov/articles/wildfire-causes-and-evaluation.htm