DATA110Project1

How is land preserved in Maryland?

There are two major ways that land in our state is protected from reckless development. One way is public ownership, meaning the land is under the control of the state government, local governments, or the federal government. Another is to negotiate easements. Easements are agreements about land that bind all future landowners to certain conditions; the owner gives up some property rights, such as the right to develop the land. In the following dataset, we examine the different regions of Maryland and what percentage of their protected land is protected through easements versus public ownership. The Maryland Department of Natural Resources collects and regularly updates this data and makes it available on OpenData. Maryland has a goal of preserving 30% of the state’s total acreage.

Land counts as public land if it is under Federal, State, or Local (such as towns and cities) protection. The nine types of agreements and easements recorded in this dataset are:

  1. Maryland Program Open Space Stateside Conservation Easements
  2. Maryland Rural Legacy Program Easements
  3. Easements Held by Private Conservation Organizations
  4. Maryland Environmental Trust (MET) Easements
  5. Maryland Agricultural Land Preservation Foundation Easements
  6. County Protected Cluster/Subdivision Remainders
  7. County Purchase of Development Rights/Transfer of Development Rights
  8. Forest Legacy/ISTEA/CREP/FRPP/ACEP
  9. Next Gen Farmland Acquisition Program (MARBIDCO)

#Install the helpful packages that will make cleaning up and visualizing the data easier

install.packages("tidyverse")
Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.6'
(as 'lib' is unspecified)
install.packages("ggplot2")
Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.6'
(as 'lib' is unspecified)
install.packages("readr")
Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.6'
(as 'lib' is unspecified)
install.packages("dslabs")
Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.6'
(as 'lib' is unspecified)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
library(ggplot2)
library(readr)
library(dslabs)
#Set the working directory to the folder that contains the Total Acres Preserved csv from the Maryland Department of Natural Resources. The csv is read into an object with a shorter name that is used throughout the rest of the script.
setwd("/cloud/project")
PreservedLand <- read_csv("AcresPreservedMaryland.csv")
Rows: 25 Columns: 19
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (4): Maryland County, Jurisdiction Code, Maryland Region, Best Availabl...
num (15): County PDR/TDR, Cluster/subdv Remainder, MALPF, MET, Private Cons....

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
tibble(PreservedLand)
# A tibble: 25 × 19
   `Maryland County` `Jurisdiction Code` `Maryland Region`  
   <chr>             <chr>               <chr>              
 1 Allegany          ALLE                Western Maryland   
 2 Anne Arundel      ANNE                Baltimore          
 3 Baltimore City    BACI                Baltimore          
 4 Baltimore County  BACO                Baltimore          
 5 Calvert           CALV                Southern Maryland  
 6 Caroline          CARO                Upper Eastern Shore
 7 Carroll           CARR                Baltimore          
 8 Cecil             CECI                Upper Eastern Shore
 9 Charles           CHAR                Southern Maryland  
10 Dorchester        DORC                Lower Eastern Shore
# ℹ 15 more rows
# ℹ 16 more variables: `Best Available Data as of` <chr>,
#   `County PDR/TDR` <dbl>, `Cluster/subdv Remainder` <dbl>, MALPF <dbl>,
#   MET <dbl>, `Private Cons. Orgs.` <dbl>, `Rural Legacy` <dbl>,
#   `POS Stateside Conservation Easements` <dbl>,
#   `Next Gen. Farmland Acquisition Program (MARBIDCO)` <dbl>,
#   `ISTEA/ Forest Legacy/  CREP/ FRPP-ACEP` <dbl>, …
#Clean the data. First I want to make the column names easier to type in R by changing every " " to "_". I will rename them to something shorter later.
datacleaning <- as_tibble(PreservedLand)
colnames(datacleaning)
 [1] "Maryland County"                                        
 [2] "Jurisdiction Code"                                      
 [3] "Maryland Region"                                        
 [4] "Best Available Data as of"                              
 [5] "County PDR/TDR"                                         
 [6] "Cluster/subdv Remainder"                                
 [7] "MALPF"                                                  
 [8] "MET"                                                    
 [9] "Private Cons. Orgs."                                    
[10] "Rural Legacy"                                           
[11] "POS Stateside Conservation Easements"                   
[12] "Next Gen. Farmland Acquisition Program (MARBIDCO)"      
[13] "ISTEA/ Forest Legacy/  CREP/ FRPP-ACEP"                 
[14] "TOTAL Under Easement"                                   
[15] "LPPRP Local Parks and Recreation (incl. Local-Side POS)"
[16] "DNR State Land Inventory"                               
[17] "LPPRP Federal Park and Conservation Lands"              
[18] "TOTAL Publicly Owned"                                   
[19] "GRAND TOTAL Preserved"                                  
datacleaning <- datacleaning %>% rename_with(~ gsub(" ","_", .x), contains(" "))

#There is a row in this dataset where the Maryland County column says "TOTAL", the categorical columns like Maryland Region and Jurisdiction Code are N/A, and each numeric column holds the total number of acres. I'm going to separate this row into a different data table so it can be used in a visualization later.

PreservedLandTotal <- datacleaning %>% filter(Maryland_County == "TOTAL")
PreservedLandNoTotal <- datacleaning %>% filter_out(Maryland_County == "TOTAL")
#After removing the extra row, we're going to remove the TOTAL columns from the datasets as well.

PreservedLandNoTotal <- PreservedLandNoTotal %>% subset(select =-c(TOTAL_Publicly_Owned,TOTAL_Under_Easement,GRAND_TOTAL_Preserved))
#Finally, I don't need the Jurisdiction Code or the date the dataset was last updated, and since I will be summarizing by region I don't need the county names either, so I am removing those columns too, using the subset function and selecting out the columns I want to remove.
PreservedLandNoTotal <- PreservedLandNoTotal %>% subset(select =-c(Jurisdiction_Code,Best_Available_Data_as_of,Maryland_County))
#In order to consolidate my data, I want to get all the rows with the Maryland regions and condense them using the group_by function, targeting the Regions, and summarizing across all the Preservation Types.
MergedRegions <- PreservedLandNoTotal %>% group_by(Maryland_Region) %>%
  summarize(across(c(`County_PDR/TDR`, 
                     `Cluster/subdv_Remainder`,
                     MALPF,
                     MET,
                     Private_Cons._Orgs.,
                     Rural_Legacy,
                     POS_Stateside_Conservation_Easements,
                     `Next_Gen._Farmland_Acquisition_Program_(MARBIDCO)`,
                     `ISTEA/_Forest_Legacy/__CREP/_FRPP-ACEP`,
                     `LPPRP_Local_Parks_and_Recreation_(incl._Local-Side_POS)`,
                     LPPRP_Federal_Park_and_Conservation_Lands,
                     DNR_State_Land_Inventory),
                   sum))
head(MergedRegions)
# A tibble: 6 × 13
  Maryland_Region     `County_PDR/TDR` `Cluster/subdv_Remainder`   MALPF    MET
  <chr>                          <dbl>                     <dbl>   <dbl>  <dbl>
1 Baltimore                     95916                     16626. 103013. 27701.
2 Lower Eastern Shore             782                      1407   45668. 29910.
3 Southern Maryland             36963.                     2316   37225. 13700 
4 Suburban Washington          103978.                     7000   36163. 11484.
5 Upper Eastern Shore           15569.                    15491  135567. 51731 
6 Western Maryland               2187                       214   27293.  8400.
# ℹ 8 more variables: Private_Cons._Orgs. <dbl>, Rural_Legacy <dbl>,
#   POS_Stateside_Conservation_Easements <dbl>,
#   `Next_Gen._Farmland_Acquisition_Program_(MARBIDCO)` <dbl>,
#   `ISTEA/_Forest_Legacy/__CREP/_FRPP-ACEP` <dbl>,
#   `LPPRP_Local_Parks_and_Recreation_(incl._Local-Side_POS)` <dbl>,
#   LPPRP_Federal_Park_and_Conservation_Lands <dbl>,
#   DNR_State_Land_Inventory <dbl>
colnames(MergedRegions)
 [1] "Maryland_Region"                                        
 [2] "County_PDR/TDR"                                         
 [3] "Cluster/subdv_Remainder"                                
 [4] "MALPF"                                                  
 [5] "MET"                                                    
 [6] "Private_Cons._Orgs."                                    
 [7] "Rural_Legacy"                                           
 [8] "POS_Stateside_Conservation_Easements"                   
 [9] "Next_Gen._Farmland_Acquisition_Program_(MARBIDCO)"      
[10] "ISTEA/_Forest_Legacy/__CREP/_FRPP-ACEP"                 
[11] "LPPRP_Local_Parks_and_Recreation_(incl._Local-Side_POS)"
[12] "LPPRP_Federal_Park_and_Conservation_Lands"              
[13] "DNR_State_Land_Inventory"                               
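The group_by and summarize step above can be tried on a small example first. The sketch below uses made-up numbers and a hypothetical `toy` table in place of PreservedLandNoTotal, and it selects the columns with `where(is.numeric)` instead of typing each name, which is an alternative to the explicit list and only works if every numeric column should be summed.

```r
library(dplyr)

# Toy data with made-up acreages, standing in for PreservedLandNoTotal
toy <- data.frame(
  Maryland_Region = c("East", "East", "West"),
  MALPF = c(100, 50, 30),
  MET = c(10, 20, 5)
)

# Sum every numeric column within each region
toy_by_region <- toy %>%
  group_by(Maryland_Region) %>%
  summarize(across(where(is.numeric), sum), .groups = "drop")

toy_by_region
```

Each region ends up as a single row, with the county rows added together column by column.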
#I'm still finding it hard to type out the column names so I'm going to rename them to more brief names.
MergedRegions <- rename(MergedRegions,
       Region ="Maryland_Region",                                        
       PDR_TDR = "County_PDR/TDR",                                        
       Cluster = "Cluster/subdv_Remainder",                                
       PrivateOrg = "Private_Cons._Orgs." ,                                   
      RuralLegacy = "Rural_Legacy",                                           
       POS = "POS_Stateside_Conservation_Easements",                   
      MARBIDCO = "Next_Gen._Farmland_Acquisition_Program_(MARBIDCO)",     
       ISTEA_Forest_CREP_FRPP_ACEP = "ISTEA/_Forest_Legacy/__CREP/_FRPP-ACEP",                 
       Local = "LPPRP_Local_Parks_and_Recreation_(incl._Local-Side_POS)",
       Federal = "LPPRP_Federal_Park_and_Conservation_Lands",              
       State = "DNR_State_Land_Inventory"                               
)
print(MergedRegions)
# A tibble: 6 × 13
  Region    PDR_TDR Cluster  MALPF    MET PrivateOrg RuralLegacy    POS MARBIDCO
  <chr>       <dbl>   <dbl>  <dbl>  <dbl>      <dbl>       <dbl>  <dbl>    <dbl>
1 Baltimore  95916   16626. 1.03e5 27701.      9497.      24923.  1752.       0 
2 Lower Ea…    782    1407  4.57e4 29910.     21110       35080. 28640.     266.
3 Southern…  36963.   2316  3.72e4 13700       3327.      17044.  3943.     203.
4 Suburban… 103978.   7000  3.62e4 11484.      2985       16050.  4901.     518.
5 Upper Ea…  15569.  15491  1.36e5 51731       9818.      23278.  6863.     845.
6 Western …   2187     214  2.73e4  8400.      6134.      13930.  2591.    1015.
# ℹ 4 more variables: ISTEA_Forest_CREP_FRPP_ACEP <dbl>, Local <dbl>,
#   Federal <dbl>, State <dbl>
#My goal is to make a stacked bar chart visualization, and for that I need to further wrangle my data. I'm going to use pivot functions so I'm installing tidyr.
install.packages("tidyr")
Installing package into '/cloud/lib/x86_64-pc-linux-gnu-library/4.6'
(as 'lib' is unspecified)
library(tidyr)
#I'm going to create a new column called "PreservationType" so I can get 12 columns into one. This column will contain all the different ways, through easements, land agreements, and government property, that land is conserved in Maryland. Then I will put the values into "Acres", since this is all measured in acres.
MergedRegionsFinal <- MergedRegions %>% pivot_longer(cols = PDR_TDR:State,names_to = "PreservationType", values_to = "Acres")
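To see what the pivot does to the shape of the table, here is a minimal sketch on a hypothetical two-region, two-column table with made-up numbers. The names `toy_wide` and `toy_long` are only for illustration.

```r
library(tidyr)

# Wide table: one row per region, one column per preservation type (made-up numbers)
toy_wide <- data.frame(
  Region = c("East", "West"),
  MALPF = c(150, 30),
  State = c(200, 400)
)

# Long table: one row per region and preservation type
toy_long <- pivot_longer(
  toy_wide,
  cols = MALPF:State,
  names_to = "PreservationType",
  values_to = "Acres"
)

toy_long
```

The two type columns become four rows, and the total acreage is unchanged. This long shape is what ggplot needs in order to map the preservation type to the fill color.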
#Finally, I'm going to do a visualization. My goal is to do a percentage stacked bar chart. I want to show my viewer how each region of Maryland preserves its land, and what percentage of the region's preserved land is protected under different tactics. In order to differentiate between the types of preservation, I used a brewer palette to bring in some lively colors.

MarylandRegions1 <- ggplot(MergedRegionsFinal, aes(fill=PreservationType, y=Acres, x=Region)) + 
  geom_bar(position="fill", stat="identity") + 
  scale_fill_brewer(palette="Set3") +
  coord_flip()+
  labs(x = "Region",
       y = "Percent of Total Acres Preserved per Region",
       title = "How Do the Different Regions in Maryland Preserve Land?",
       subtitle = "Percentage of Land Preservation Types Per Region",
       caption = "Source: Maryland Department of Natural Resources")
MarylandRegions1
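With position="fill", each bar is rescaled so that its segments add up to 1. The same shares can be computed directly, which is a useful way to check the chart. This is a rough sketch on made-up numbers, with a hypothetical `toy_long` table standing in for MergedRegionsFinal.

```r
library(dplyr)

# Made-up long-format data, standing in for MergedRegionsFinal
toy_long <- data.frame(
  Region = c("East", "East", "West", "West"),
  PreservationType = c("MALPF", "State", "MALPF", "State"),
  Acres = c(150, 50, 100, 300)
)

# Share of each region's preserved acres held by each preservation type
toy_share <- toy_long %>%
  group_by(Region) %>%
  mutate(Share = Acres / sum(Acres)) %>%
  ungroup()

toy_share
```

If the axis should read as percentages rather than proportions, adding `scale_y_continuous(labels = scales::label_percent())` to the plot is one option.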

#For my next visualization, I want the total acreage, across the different regions, that is currently preserved under each type. I'm going to do another stacked bar chart, but this time not equalized for height. In order to do this, I change the position from "fill" to "stack" in the geom_bar function.

MarylandRegions2 <- ggplot(MergedRegionsFinal, aes(fill=PreservationType, y=Acres, x=Region)) + 
  geom_bar(position="stack", stat="identity") + 
  scale_fill_brewer(palette="Set3") +
  coord_flip()+
  labs(x = "Region",
       y = "Acres Preserved per Region",
       title = "How Do the Different Regions in Maryland Preserve Land?",
       subtitle = "Total Acreage of Land Preservation Types Per Region",
       caption = "Source: Maryland Department of Natural Resources")

MarylandRegions2

  1. How you cleaned the dataset up (be detailed and specific, using proper terminology where appropriate). 

I cleaned my dataset by using different functions to isolate the data I cared about. For example, I used filter_out to remove the TOTAL row, which would have made my final visualization more confusing, and subset to remove the columns that weren’t necessary (the totals), because the bar graph shows total acreage by stacking the individual categories rather than flattening the data.

  2. What the visualization represents, any interesting patterns or surprises that arise within the visualization. 

The visualization represents the way that the 6 major regions of Maryland protect their lands. I was very interested to see that the Baltimore region actually has the most acreage protected, because I would have expected the Western Maryland region (where the Appalachian Mountains are) to have more protected. I was also impressed to see that MALPF (Maryland Agricultural Land Preservation Foundation) is second only to the state as a protector of land across all of the regions, by total number of acres.
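Rankings like these can also be read from a table instead of from the chart. The sketch below shows one way to do that, using made-up numbers and a hypothetical `toy_long` table in place of MergedRegionsFinal, so the output does not reflect the real data.

```r
library(dplyr)

# Made-up long-format data, standing in for MergedRegionsFinal
toy_long <- data.frame(
  Region = c("East", "East", "West", "West"),
  PreservationType = c("MALPF", "State", "MALPF", "State"),
  Acres = c(150, 50, 100, 300)
)

# Total preserved acres per region, largest first
region_totals <- toy_long %>%
  group_by(Region) %>%
  summarize(TotalAcres = sum(Acres), .groups = "drop") %>%
  arrange(desc(TotalAcres))

# Total preserved acres per preservation type, largest first
type_totals <- toy_long %>%
  group_by(PreservationType) %>%
  summarize(TotalAcres = sum(Acres), .groups = "drop") %>%
  arrange(desc(TotalAcres))

region_totals
type_totals
```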

  3. Anything that you might have shown that you could not get to work or that you wished you could have included.

I wish that the Department of Natural Resources also included the total acreage of each region, regardless of how the land is protected. Perhaps I would be less impressed with Baltimore’s 400,000 acres preserved compared to Western Maryland’s if I knew that the regions’ total sizes were wildly different. I am also curious as to how long some of these lands have been preserved, and how many are in the process of being restored.
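If a table of total acreage per region were available, the comparison could look something like the sketch below. Everything here is an assumption: the `region_area` table is hypothetical, and all of the numbers are made up for illustration. The 30% threshold comes from the state goal mentioned in the introduction.

```r
library(dplyr)

# Made-up preserved acreage per region
preserved <- data.frame(
  Region = c("East", "West"),
  PreservedAcres = c(200, 400)
)

# Hypothetical table of total land area per region (not part of the dataset)
region_area <- data.frame(
  Region = c("East", "West"),
  TotalAcres = c(1000, 1000)
)

# Percent of each region that is preserved, compared against the 30% goal
pct_preserved <- preserved %>%
  left_join(region_area, by = "Region") %>%
  mutate(
    PctPreserved = 100 * PreservedAcres / TotalAcres,
    MeetsGoal = PctPreserved >= 30
  )

pct_preserved
```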