Oil and natural gas play significant roles in meeting global energy needs and promoting sustainable development. However, effective environmental monitoring of oil and natural gas production in the oil sand regions of Alberta, Canada, addresses not only scientific or management needs to assess impacts on the receiving environment but also societal needs to ensuring public trust in resource management (Dube et al., 2022).
This research aims to analyze three datasets—Alberta Oil and Natural Gas Production by Municipality, and the Well_Count datasets—to gain a comprehensive understanding of oil and natural gas production indices across Alberta’s production fields, with respect to the wellbores (well-count). Additionally, it seeks to create an interactive maps for users in order to understand oil and gas production index across Alberta.
Data Source: The dataset used for this analysis were obtained from the Alberta Government’s website under the Department of Economic Development and Trade. The links to the data repositories are; Oil Production by Municipality URL: https://open.alberta.ca/opendata/oil-production-by-municipality, Natural Gas Production by Municipality URL: Natural gas production by municipality - Open Government (alberta.ca), Well count by Municipality URL: Well count by municipality - Open Government (alberta.ca).
The municipalities coordinates of the data might not be 100-percent precise. The original data gotten from the Alberta Government website does not contain the coordinates information of the hydrocarbon producing municipalities and the Well counts. Therefore, additional research was conducted to manually compile the coordinates of the various locations together.
Several approaches were adopted to analyze the datasets; using the “R” basic functions to clean and sort the data by removing missing values, duplicates, etc., and also arrange the data by Period (Year), and group by the CSD. As earlier stated,the original data didn’t contain the coordinates information, but i think using a map to visualize the data across Municipality would be great. Hence, I was able to generate and compile additional coordinate data containing the geographic locations of the oil and natural gas producing fields in Alberta.
Visualization of the data was made on map using the leaflet function, and locations of each CSD gives the production value and year once clicked. Hence, this will facilitate an interactive communication and collaboration with the audience to understand oil and gas production index across Alberta.
#install.packages("tidyverse")
#install.packages("leaflet")
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.3.3
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.0 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(leaflet)
## Warning: package 'leaflet' was built under R version 4.3.3
library(ggplot2)
library(dplyr)
Oil_production_by_Municipality <- read_csv("Oil_production_by_Municipality.csv")
## Rows: 1618 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): CSD, IndicatorSummaryDescription, UnitOfMeasure
## dbl (3): CSDUID, Period, OriginalValue
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Natural_gas_production_by_municipality <- read_csv("Natural_gas_production_by_Municipality.csv")
## Rows: 1647 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): CSD, IndicatorSummaryDescription, UnitOfMeasure
## dbl (3): CSDUID, Period, OriginalValue
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Well_count_by_municipality <- read_csv("Well_count_by_Municipality.csv")
## Rows: 1459 Columns: 6
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): CSD, IndicatorSummaryDescription
## dbl (3): CSDUID, Period, OriginalValue
## lgl (1): UnitOfMeasure
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Oil_Coordinates <- read_csv("Oil_Coordinates.csv") #Coordinates data for oil production by Municipality
## Rows: 1618 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): CSD
## dbl (2): Latitude, Longitude
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Gas_Coordinates <- read_csv("Gas_Coordinates.csv") #Coordinates data for natural gas production by Municipality
## Rows: 1647 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): CSD
## dbl (2): Latitude, Longitude
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Well_Coordinates <- read_csv("Well_Coordinates.csv") #Coordinates data for the Well count by Municipality
## Rows: 1459 Columns: 3
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): CSD
## dbl (2): Latitude, Longitude
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
glimpse(Oil_production_by_Municipality) #A quick summary of the Oil production by Municipality data
## Rows: 1,618
## Columns: 6
## $ CSDUID <dbl> 4805026, 4805026, 4805026, 4805026, 480502…
## $ CSD <chr> "Drumheller", "Drumheller", "Drumheller", …
## $ Period <dbl> 2003, 2004, 2005, 2006, 2007, 2008, 2009, …
## $ IndicatorSummaryDescription <chr> "Oil Production", "Oil Production", "Oil P…
## $ UnitOfMeasure <chr> "m3", "m3", "m3", "m3", "m3", "m3", "m3", …
## $ OriginalValue <dbl> 3138.0, 3291.5, 5311.0, 5141.0, 5477.0, 49…
colnames(Oil_production_by_Municipality) #To display the columns of the Oil production by Municipality data
## [1] "CSDUID" "CSD"
## [3] "Period" "IndicatorSummaryDescription"
## [5] "UnitOfMeasure" "OriginalValue"
The Oil_production_by_municipal_district data contains 6 columns and 1618 rows, with column names; CSD, IndicatorSummaryDescription, UnitOfMeasure, CSDUID, Period, OriginalValue.
glimpse(Natural_gas_production_by_municipality) #A quick summary of the Natural gas production by Municipality data
## Rows: 1,647
## Columns: 6
## $ CSDUID <dbl> 4805026, 4805026, 4805026, 4805026, 480502…
## $ CSD <chr> "Drumheller", "Drumheller", "Drumheller", …
## $ Period <dbl> 2003, 2004, 2005, 2006, 2007, 2008, 2009, …
## $ IndicatorSummaryDescription <chr> "Natural Gas Production", "Natural Gas Pro…
## $ UnitOfMeasure <chr> "m3", "m3", "m3", "m3", "m3", "m3", "m3", …
## $ OriginalValue <dbl> 104493.2, 105486.4, 130930.0, 128564.0, 12…
colnames(Natural_gas_production_by_municipality) #To display the columns of the Natural gas production by Municipality data data
## [1] "CSDUID" "CSD"
## [3] "Period" "IndicatorSummaryDescription"
## [5] "UnitOfMeasure" "OriginalValue"
The Natural_gas_production_by_municipality data contains 6 columns and 1647 rows, with column names; CSD, IndicatorSummaryDescription, UnitOfMeasure, CSDUID, Period, OriginalValue.
glimpse(Well_count_by_municipality) #A quick summary of the Well Count by Municipality data
## Rows: 1,459
## Columns: 6
## $ CSDUID <dbl> 4805026, 4805026, 4805026, 4805026, 480502…
## $ CSD <chr> "Drumheller", "Drumheller", "Drumheller", …
## $ Period <dbl> 2003, 2004, 2005, 2006, 2007, 2008, 2009, …
## $ IndicatorSummaryDescription <chr> "Well Count", "Well Count", "Well Count", …
## $ UnitOfMeasure <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
## $ OriginalValue <dbl> 10, 35, 21, 17, 11, 11, 2, 6, 2, 1, 0, 0, …
colnames(Well_count_by_municipality) #To display the columns of the Well Count by Municipality data
## [1] "CSDUID" "CSD"
## [3] "Period" "IndicatorSummaryDescription"
## [5] "UnitOfMeasure" "OriginalValue"
The Well_count_by_municipality data contains 6 columns and 1459 rows, with column names; CSD, IndicatorSummaryDescription, UnitOfMeasure, CSDUID, Period, OriginalValue.
Checking for missing variables(NA), duplicates, leading, trailing and repeated spaces in the data
sapply(Oil_production_by_Municipality, function(x) sum(is.na(x))) # columns has no NA values
## CSDUID CSD
## 0 0
## Period IndicatorSummaryDescription
## 0 0
## UnitOfMeasure OriginalValue
## 0 0
sapply(Natural_gas_production_by_municipality, function(x) sum(is.na(x))) # columns has no NA values
## CSDUID CSD
## 0 0
## Period IndicatorSummaryDescription
## 0 0
## UnitOfMeasure OriginalValue
## 0 0
sapply(Well_count_by_municipality, function(x) sum(is.na(x))) # UnitOfMeasure column has 1459 NA values
## CSDUID CSD
## 0 0
## Period IndicatorSummaryDescription
## 0 0
## UnitOfMeasure OriginalValue
## 1459 0
Combining the Municipality data with the generated coordinate data for each Municipality.
selected_oil_production_columns <- Oil_production_by_Municipality %>%
select(CSD, CSDUID, Period, OriginalValue)
selected_coordinate_columns1 <- Oil_Coordinates %>%
select(Latitude, Longitude)
combined_oil_data <- bind_cols(selected_oil_production_columns, selected_coordinate_columns1) #combining the Oil_production_by_Municipality data with the coordinates data
selected_gas_production_columns <- Natural_gas_production_by_municipality %>%
select(CSD, CSDUID, Period, OriginalValue)
selected_coordinate_columns2 <- Gas_Coordinates %>%
select(Latitude, Longitude)
combined_gas_data <- bind_cols(selected_gas_production_columns, selected_coordinate_columns2) #combining the Natural_gas_production_by_municipality data with the coordinates data
selected_production_well_columns <- Well_count_by_municipality %>%
select(CSD, CSDUID, Period, OriginalValue)
selected_coordinate_columns3 <- Well_Coordinates %>%
select(Latitude, Longitude)
combined_well_data <- bind_cols(selected_production_well_columns, selected_coordinate_columns3) #combining the Well Count data with the coordinates data
Removing the missing variables of each data by using na.omit function
Oil_production_cleaned <- na.omit(combined_oil_data)
Natural_gas_production_cleaned <- na.omit(combined_gas_data)
Well_count_cleaned <- na.omit(combined_well_data)
For this part, I want to sort the data according to production, and create a new column that convert the oil production values from the “cubic meter to oil barrels (bbl)”, and also convert the natural gas production value to “barrel of oil equivalents (boe)”.
Oil_production_region <- Oil_production_cleaned %>%
select(CSD, CSDUID, Period, OriginalValue, Latitude, Longitude) %>%
mutate(OriginalValue_bbl = OriginalValue*6.28) %>% #conversion from cubic meter to barrels
arrange(Period)
view(Oil_production_region)
Natural_gas_production_region <- Natural_gas_production_cleaned %>%
select(CSD, CSDUID, Period, OriginalValue, Latitude, Longitude) %>%
mutate(OriginalValue_boe = OriginalValue*6.28) %>% #conversion from cubic meter to barrels of oil equivalents
arrange(Period)
view(Natural_gas_production_region)
Well_count_by_region <- Well_count_cleaned %>%
select(CSD, CSDUID, Period, OriginalValue, Latitude, Longitude) %>%
arrange(Period)
Max_Production_Oil <- Oil_production_region %>%
select(CSD, Period,OriginalValue_bbl) %>%
filter(OriginalValue_bbl == max(OriginalValue_bbl))
view(Max_Production_Oil)
Max_Production_Gas <- Natural_gas_production_region %>%
select(CSD, Period,OriginalValue_boe) %>%
filter(OriginalValue_boe == max(OriginalValue_boe))
view(Max_Production_Gas)
Max_Production_Well <- Well_count_by_region %>%
select(CSD, Period,OriginalValue) %>%
filter(OriginalValue == max(OriginalValue))
view(Max_Production_Well)
The three datasets are visualized on maps using the leaflet functions respectively. The size of each CSD circle is indicated on the map in the order of their original production values. However, Rmarkdown is not knitting the Maps produced by the “leaflet function” to PDF or Word documents, so I think this is the reason why the maps seemed to be hidden. Whereas, the maps can be viewed from the html i knitted it to.
leaflet(data = Oil_production_region) %>% #visualizing the locations of oil production region on the map
addTiles() %>%
addCircles(lng = ~Longitude, lat = ~Latitude, color = "green", radius = ~OriginalValue_bbl / 7000, popup = ~paste("OriginalValue (bbl):", OriginalValue_bbl, CSD))
leaflet(data = Natural_gas_production_region) %>% #visualizing the locations of the natural gas production region on the map
addTiles() %>%
addCircles(lng = ~Longitude, lat = ~Latitude, color = "red", radius = ~OriginalValue_boe / 7000, popup = ~paste("OriginalValue (boe):", OriginalValue_boe, CSD))
leaflet(data = Well_count_by_region) %>% #visualizing the locations of well count on the map
addTiles() %>%
addCircles(lng = ~Longitude, lat = ~Latitude, color = "black", radius = ~OriginalValue / 7000, popup = ~paste("OriginalValue:", OriginalValue, CSD))
Analysis of the three data as indicated on the dashboard, provide information on the quest to understand various oil and natural gas producing fields across Alberta Municipalities, with respect to the number of the well-bores.
However, the highest oil producing field is “Wood Buffalo” in the year 2023, which had its highest oil production in the year 2023, with the production value of approximately 1.5 billion (bbl).
The highest natural gas producting field is “Greenview No 16” with highest production in the year 2023, and production value of approximately 248 million (boe).
Also, production field with the highest number of well-count is “Wood Buffalo” with 3172 Wellbores drilled in the year 2008.
Conclusively, we can deduce that the Wood Buffalo Municipality generates more money to the government of Alberta than the rest of the oil production field relative to its higher number of wellbores. Also with the Greenview No 16, as the highest producer of natural gas.
I express my gratitude to the Department of Economic Development and Trade under the Government of Alberta, for making these data publicly accessible on their website.
Dubé, M. G., Dunlop, J. M., Davidson, C., Beausoleil, D. L., Hazewinkel, R. R., & Wyatt, F. (2022). History, overview, and governance of environmental monitoring in the oil sands region of Alberta, Canada. Integrated Environmental Assessment and Management, 18(2), 319-332.