We will be studying a forest dataset that was collected by the United States Forest Service. The dataset includes four wilderness areas located in the Roosevelt National Forest of northern Colorado, where each observation is a 30m x 30m patch. Our goal is to predict the forest cover type (the predominant kind of tree cover) of each patch using strictly cartographic variables (instead of remotely sensed data).
Map of Roosevelt National Forest Wilderness Areas
The variables in this data set are:
| Variable Name | Description | Description of Values |
|---|---|---|
| Elevation | Elevation in meters | |
| Aspect | Aspect in degrees azimuth | |
| Slope | Slope in degrees | |
| Horizontal Distance To Hydrology | Horizontal distance to nearest surface water feature | |
| Vertical Distance To Hydrology | Vertical distance to nearest surface water feature | |
| Horizontal Distance To Roadways | Horizontal distance to nearest roadway | |
| Hillshade 9am | Hillshade index at 9am, summer solstice | (0 to 255 index) |
| Hillshade Noon | Hillshade index at noon, summer solstice | (0 to 255 index) |
| Hillshade 3pm | Hillshade index at 3pm, summer solstice | (0 to 255 index) |
| Horizontal Distance To Fire Points | Horizontal distance to nearest wildfire ignition point | |
| Wilderness Area | Wilderness area designation | (4 binary columns, 0 = absence or 1 = presence) |
| Soil Type | Soil Type designation | (40 binary columns, 0 = absence or 1 = presence) |
| Cover Type | Forest Cover Type designation | (7 types, integers 1 to 7) |

The wilderness area codes are:

| Code | Name |
|---|---|
| 1 | Rawah Wilderness Area |
| 2 | Neota Wilderness Area |
| 3 | Comanche Peak Wilderness Area |
| 4 | Cache la Poudre Wilderness Area |

The soil type codes are:

| Code | Name | Code | Name |
|---|---|---|---|
| 1 | Cathedral family - Rock outcrop complex, extremely stony. | 21 | Typic Cryaquolls - Leighcan family, till substratum complex. |
| 2 | Vanet - Ratake families complex, very stony. | 22 | Leighcan family, till substratum, extremely bouldery. |
| 3 | Haploborolis - Rock outcrop complex, rubbly. | 23 | Leighcan family, till substratum - Typic Cryaquolls complex. |
| 4 | Ratake family - Rock outcrop complex, rubbly. | 24 | Leighcan family, extremely stony. |
| 5 | Vanet family - Rock outcrop complex, rubbly. | 25 | Leighcan family, warm, extremely stony. |
| 6 | Vanet - Wetmore families - Rock outcrop complex, stony. | 26 | Granile - Catamount families complex, very stony. |
| 7 | Gothic family. | 27 | Leighcan family, warm - Rock outcrop complex, extremely stony. |
| 8 | Supervisor - Limber families complex. | 28 | Leighcan family - Rock outcrop complex, extremely stony. |
| 9 | Troutville family, very stony. | 29 | Como - Legault families complex, extremely stony. |
| 10 | Bullwark - Catamount families - Rock outcrop complex, rubbly. | 30 | Como family - Rock land - Legault family complex, extremely stony. |
| 11 | Bullwark - Catamount families - Rock land complex, rubbly. | 31 | Leighcan - Catamount families complex, extremely stony. |
| 12 | Legault family - Rock land complex, stony. | 32 | Catamount family - Rock outcrop - Leighcan family complex, extremely stony. |
| 13 | Catamount family - Rock land - Bullwark family complex, rubbly. | 33 | Leighcan - Catamount families - Rock outcrop complex, extremely stony. |
| 14 | Pachic Argiborolis - Aquolis complex. | 34 | Cryorthents - Rock land complex, extremely stony. |
| 15 | unspecified in the USFS Soil and ELU Survey. | 35 | Cryumbrepts - Rock outcrop - Cryaquepts complex. |
| 16 | Cryaquolis - Cryoborolis complex. | 36 | Bross family - Rock land - Cryumbrepts complex, extremely stony. |
| 17 | Gateview family - Cryaquolis complex. | 37 | Rock outcrop - Cryumbrepts - Cryorthents complex, extremely stony. |
| 18 | Rogert family, very stony. | 38 | Leighcan - Moran families - Cryaquolls complex, extremely stony. |
| 19 | Typic Cryaquolis - Borohemists complex. | 39 | Moran family - Cryorthents - Leighcan family complex, extremely stony. |
| 20 | Typic Cryaquepts - Typic Cryaquolls complex. | 40 | Moran family - Cryorthents - Rock land complex, extremely stony. |

The forest cover type codes are:

| Code | Name |
|---|---|
| 1 | Spruce/Fir |
| 2 | Lodgepole Pine |
| 3 | Ponderosa Pine |
| 4 | Cottonwood/Willow |
| 5 | Aspen |
| 6 | Douglas-fir |
| 7 | Krummholz |
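The wilderness area and soil type variables are stored as sets of binary indicator columns, so it can be convenient to collapse each set back into a single integer code that matches the tables above. The following is a minimal sketch of that step, not the approach used in this project. It assumes column names of the form `Wilderness_Area1` to `Wilderness_Area4` and `Soil_Type1` to `Soil_Type40`; the helper `decode_one_hot` and the sample row are hypothetical and exist only for illustration.

```python
# Lookup tables taken from the code tables above.
WILDERNESS_AREAS = {1: "Rawah", 2: "Neota", 3: "Comanche Peak", 4: "Cache la Poudre"}
COVER_TYPES = {
    1: "Spruce/Fir", 2: "Lodgepole Pine", 3: "Ponderosa Pine",
    4: "Cottonwood/Willow", 5: "Aspen", 6: "Douglas-fir", 7: "Krummholz",
}


def decode_one_hot(row, prefix, n_columns):
    """Return the 1-based code of the single indicator column equal to 1.

    Hypothetical helper: assumes columns are named f"{prefix}{i}".
    Columns missing from the row are treated as 0.
    """
    active = [i for i in range(1, n_columns + 1) if row.get(f"{prefix}{i}", 0) == 1]
    if len(active) != 1:
        raise ValueError(f"expected exactly one active {prefix} column, found {len(active)}")
    return active[0]


# Hypothetical observation; the values are invented for illustration.
row = {
    "Elevation": 2596,
    "Wilderness_Area1": 1, "Wilderness_Area2": 0,
    "Wilderness_Area3": 0, "Wilderness_Area4": 0,
    "Soil_Type29": 1,
    "Cover_Type": 5,
}

area_code = decode_one_hot(row, "Wilderness_Area", 4)
soil_code = decode_one_hot(row, "Soil_Type", 40)
print(WILDERNESS_AREAS[area_code], soil_code, COVER_TYPES[row["Cover_Type"]])
```

Raising an error when zero or several indicators are set is a design choice in this sketch: it surfaces malformed rows early instead of silently assigning them a code.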
Trees bring incredible value to life on Earth. They have several important functions, including:
- Trees produce the oxygen we breathe by converting carbon dioxide to oxygen through photosynthesis.
- Trees absorb carbon dioxide, a greenhouse gas that contributes to climate change, and store the carbon, which helps slow the gas’s buildup in the atmosphere.
- Trees provide homes to wildlife.
- Trees moderate the local climate by shading us from the hot sun, screening us from harsh wind, and sheltering us from rain.
- Trees absorb and store water, which reduces flooding and limits the transport of chemicals into streams.

We need to help trees help us. Accurately predicting forest cover types supports more efficient and effective forest management, so that trees can continue to nourish the planet and support our lives. Accurate prediction can help with tasks including:

- Conservation measures
- Urban planning as Americans expand and continue to build
- Conservation of plant diversity
- Monitoring forest health and forest management
Sample size varies between wilderness areas: it is lowest for Neota and highest for Comanche Peak.
Cover types vary by wilderness area, and some cover types are absent from certain areas. The number of cover types varies between 3 and 6 per wilderness area.
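The two summaries above (observations per wilderness area, and distinct cover types per area) can be tabulated with a simple group-and-count. The sketch below assumes each observation has already been reduced to a `(wilderness_area_code, cover_type_code)` pair. The `observations` list is invented toy data, so the printed counts do not reproduce the project's actual results.

```python
from collections import Counter, defaultdict

# Hypothetical (wilderness_area_code, cover_type_code) pairs, invented for illustration.
observations = [
    (1, 1), (1, 2), (1, 2),
    (2, 1),
    (3, 1), (3, 2), (3, 6), (3, 6),
    (4, 3), (4, 4), (4, 6),
]

# Number of observations in each wilderness area.
samples_per_area = Counter(area for area, _ in observations)

# Set of distinct cover types observed in each wilderness area.
cover_types_per_area = defaultdict(set)
for area, cover in observations:
    cover_types_per_area[area].add(cover)

for area in sorted(samples_per_area):
    covers = sorted(cover_types_per_area[area])
    print(f"area {area}: n={samples_per_area[area]}, {len(covers)} cover types {covers}")
```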
We also explored the correlations between all numeric features in the training set.
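As a rough illustration of a pairwise correlation analysis, the sketch below computes a Pearson correlation matrix over a few numeric features. The text does not say which correlation coefficient was used, so Pearson is an assumption here. The `pearson` helper is hypothetical, and the feature values are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical feature columns; the values are invented for illustration.
features = {
    "Elevation": [2500, 2700, 2900, 3100, 3300],
    "Slope": [20, 15, 12, 8, 5],
    "Hillshade_Noon": [210, 225, 220, 235, 230],
}

names = list(features)
# Full correlation matrix stored as a dict keyed by (row_name, column_name).
corr = {(a, b): pearson(features[a], features[b]) for a in names for b in names}

for a in names:
    print(a, [round(corr[(a, b)], 2) for b in names])
```

Only the continuous features are meaningful inputs here; the binary wilderness area and soil type indicators, and the cover type label itself, are categorical and are better compared with other tools.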
The box plots show that cover type is most strongly correlated with elevation. Horizontal distance to roadways, horizontal distance to hydrology, and horizontal distance to fire points also show some possible correlation with cover type.
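A box plot of a feature grouped by cover type is essentially a picture of per-group quartiles. The sketch below computes that five-number summary for elevation within each cover type, which is one way to check numerically whether the groups separate. The `box_summary` helper is hypothetical and the `samples` are invented, so the output does not reflect the real data.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical (cover_type_code, elevation_m) pairs, invented for illustration.
samples = [
    (1, 3100), (1, 3150), (1, 3200), (1, 3250), (1, 3300),
    (3, 2300), (3, 2350), (3, 2400), (3, 2450), (3, 2500),
    (7, 3350), (7, 3400), (7, 3450), (7, 3500), (7, 3550),
]

# Group elevation values by cover type.
elevation_by_cover = defaultdict(list)
for cover, elevation in samples:
    elevation_by_cover[cover].append(elevation)


def box_summary(values):
    """Five-number summary: the core quantities a box plot draws (outlier rules aside)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median, "q3": q3, "max": max(values)}


summaries = {cover: box_summary(vals) for cover, vals in sorted(elevation_by_cover.items())}
for cover, summary in summaries.items():
    print(cover, summary)
```

When the interquartile ranges of two cover types do not overlap, as in this toy data, the feature is likely to be useful for telling those types apart.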
We then looked more closely at the four features mentioned above that show possible correlation with cover type.
[Embedded figure: elevation plot (elevation.html)]
Link to dataset: https://www.kaggle.com/c/forest-cover-type-prediction