Polar Bear Habitat Dynamics in a Changing Arctic Environment

Author

Manon Gressard, Zoe Dardare, Ugo Rombach, Victorien Rodondi

Published

December 22, 2023

1 Executive summary

In the face of escalating climate change, the Arctic region is undergoing significant environmental upheaval, directly impacting the habitat of the polar bear, a species emblematic of this area. This project aims to explore the nuanced interaction between the habitat utilization of polar bears and the evolving conditions in the Arctic. Utilizing GPS data, we aim to map the current distribution of polar bears, establishing a baseline for understanding their habitat preferences and the ongoing research efforts dedicated to their study. Our analysis proceeds to examine the complex interplay between the dietary habits of polar bears and the amount of time they spend in open water. Preliminary results from statistical models suggest a significant impact of the melting season on the bears’ diet, indicating dietary shifts as sea ice recedes and they adapt to more aquatic hunting environments. Further, we delve into the dynamic relationship between sea ice coverage and polar bear movements over time. This aspect of our research aims to highlight the bears’ adaptation strategies and habitat preferences in response to sea ice fluctuations. Expanding our scope, we assess the broader effects of climate change on Arctic ecosystems, focusing on temperature fluctuations and ice melt patterns in key polar bear habitats, such as those in Alaska. This comprehensive examination seeks to elucidate the consequences of climate change on the availability and suitability of habitats for polar bears, providing a holistic view of the challenges they face.

2 Introduction

2.1 Overview and motivation

This project, set against the dramatic backdrop of a changing Arctic environment, focuses on understanding the impact of climate change on the habitat of the polar bear. Our study is designed to unravel the complex relationship between polar bear habitat utilization and the evolving conditions of the Arctic, integrating comprehensive data analysis with field observations for a complete understanding of these changes.

Project Goals

Our main goal is a detailed scientific analysis of data related to polar bear habitats. We plan to use advanced GPS data to map the current geographical distribution and movement patterns of polar bears, incorporating a variety of ecological and geographical parameters for an in-depth understanding of their habitat preferences. We also aim to assess the adaptation of polar bears to the changing Arctic landscape, particularly the reduction of sea ice, and to explore the broader implications of climate change on the Arctic ecosystem by focusing on key factors such as temperature variations and sea ice melting patterns.

Motivation

The motivation for this project stems from a deep concern for the integrity of the Arctic ecosystem and the survival of its key species, the polar bear. This is further fueled by the unique experience of a team member who spent time in the remote Arctic region of Churchill, Manitoba, volunteering at the Churchill Northern Study Centre and working closely with leading polar bear researchers. This first-hand experience provided invaluable insights into the urgent need for data-driven research to understand the impact of climate change on polar bear populations and their behavioral adaptations to the loss of sea ice.

Incorporating these field observations with our data analysis, we aim to create a comprehensive scientific narrative. We will examine evidence of behavioral shifts in polar bears, such as changes in foraging strategies and increased land use, and cross-reference these observations with GPS tracking data. By combining scientific rigor with real-world observations, our project seeks to provide a nuanced understanding of the challenges faced by polar bears due to climate change and contribute to the global dialogue on polar bear conservation and the development of effective strategies and policies for preserving this vital Arctic species and its ecosystem.

2.3 Research questions

At the outset, our ambitions were grand. Not only did we plan to analyze demographics and forecast the future of polar bear populations using R software and extensive data, but we also aspired to model the entire Arctic ecosystem’s impact on the global climate. However, we quickly encountered a significant challenge: The non-robustness of the available data. We soon realized the necessity to scale down and refocus. This realization prompted a refocusing of our research questions. We shifted our attention to more specific, feasible areas, concentrating on the analysis of ice conditions and GPS data. Our revised questions now include:

  1. How do changes in sea ice coverage correlate with polar bear movement patterns as revealed by GPS tracking?

  2. Can we identify any significant changes in their migratory routes or habitat preferences over the past decade in the context of ongoing climate change?

  3. Is there a link between the diet of polar bears and the number of days they spend in open water?

3 Data

3.1 Data Source

Our project utilizes a variety of data sources, each critical for different aspects of our research.

  • Living Planet Index (LPI): Chosen for its rich demographic data, it enables us to weave a narrative about the life cycle and inter-species dynamics of polar bears and ringed seals. Despite some data irregularities, its depth offers a crucial ecological backdrop. Our primary data helps us to understand bear demography and the interrelationship between ‘Ursus maritimus’ (polar bears) and ‘Pusa hispida’ (ringed seals). Despite the difficulties associated with the consistency of the data, this data set is essential for the continuation of our research.

Data Description

Code
tibble(Variables = c("ID", "Binomial", "Replicate", "Citation", "Class", "Family", "Genus", "Species", "Authority", "Common_name", "Location", "Country","Region", "Latitude", "Longitude", "Specific_location", "X1950-X2020"),
       Meaning= c ("Unique identifier for each observation.", "The binomial name (genus and species) of the organism, in this case, Ursus maritimus (polar bear) ", "Indicates if the data is from a repeated observation", "Reference to the source from which the data is derived", "The biological class to which the organism belongs", "The family in the biological classification","The genus of the organism", "The species of the organism","The authority that named the species", "Common name (polar bear)", "Specific location of the study or observation", " The country where the observation was made","Geographical or political region","Latitude coordinate" , "Longitude coordinate" , "More detailed information on the location", "Counts for each year from 1950 to 2020" ))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Variables Meaning
ID Unique identifier for each observation.
Binomial The binomial name (genus and species) of the organism, in this case, Ursus maritimus (polar bear)
Replicate Indicates if the data is from a repeated observation
Citation Reference to the source from which the data is derived
Class The biological class to which the organism belongs
Family The family in the biological classification
Genus The genus of the organism
Species The species of the organism
Authority The authority that named the species
Common_name Common name (polar bear)
Location Specific location of the study or observation
Country The country where the observation was made
Region Geographical or political region
Latitude Latitude coordinate
Longitude Longitude coordinate
Specific_location More detailed information on the location
X1950-X2020 Counts for each year from 1950 to 2020
  • US Geological Survey (USGS) Dataset*: Its detailed locational information on polar bears serves as the backbone of our spatial analysis, allowing us to trace the subtle nuances in their movement patterns amidst changing environmental conditions.

Data Description

Code
tibble(Variables= c("BearID_rsf", "DateTimeUTC_rsf", "latitude_rsf", "longitude_rsf", "season", "Many variables finally not used" ),
       Meaning = c("Unique ID per observation", " Date and time of the observation in Coordinated Universal Time (UTC)", "Latitude coordinate of the bear's location at the time of the observation", "Longitude coordinate of the bear's location at the time of the observation", "Time of year when the observation was made", "..."))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Variables Meaning
BearID_rsf Unique ID per observation
DateTimeUTC_rsf Date and time of the observation in Coordinated Universal Time (UTC)
latitude_rsf Latitude coordinate of the bear's location at the time of the observation
longitude_rsf Longitude coordinate of the bear's location at the time of the observation
season Time of year when the observation was made
Many variables finally not used ...
  • PELLY ISLAND Weather Station Data: This data set is our window into the climatic heartbeat of the Arctic. Over three decades of temperature data provide us with a canvas to paint the picture of climatic shifts impacting the Arctic ecosystem.

Data Description

Code
tibble(Variables= c("Date", "TAVG..Degrees.Fahrenheit.", "TMAX..Degrees.Fahrenheit.", "TMIN..Degrees.Fahrenheit.", "Many others variables finally not used" ),
       Meaning = c("Date", "Average temperature in degrees Fahrenheit", "Maximum temperature in degrees Fahrenheit", "Minimum temeperature in degrees Fahrenheit", "..."))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Variables Meaning
Date Date
TAVG..Degrees.Fahrenheit. Average temperature in degrees Fahrenheit
TMAX..Degrees.Fahrenheit. Maximum temperature in degrees Fahrenheit
TMIN..Degrees.Fahrenheit. Minimum temeperature in degrees Fahrenheit
Many others variables finally not used ...
  • MASIE-NH Sea Ice Coverage Data: With its graphical ice extent representation, this data set helps us correlate the physical changes in the Arctic landscape with the behavioral adaptations of polar bears.

Data Description

Code
tibble(Variables = c("yyyyddd", "Beaufort_Sea" , "Many others variables finally not used" ),
       Meaning= c ("Date of the observation", "Daily 4km sea ice component in the region of Beaufort Sea", "..."))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Variables Meaning
yyyyddd Date of the observation
Beaufort_Sea Daily 4km sea ice component in the region of Beaufort Sea
Many others variables finally not used ...
  • Aerial Master Data: It offers a temporal journey through polar bear habitats in Alaska from 1979 to 2011 and from 2015 to 2017, essential for understanding the historical shifts in their territorial patterns.

Data Description

Code
tibble(Variables = c("ID", "GMT_Minus8_DateTime", "ArcLat", "Arclong", "Many others variables finally not used"), 
       Meaning = c("Unique ID per observation", "Date and time of the observation", "Latitude", "Longitude", "...")) %>%
    kable("html") %>%  
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
Variables Meaning
ID Unique ID per observation
GMT_Minus8_DateTime Date and time of the observation
ArcLat Latitude
Arclong Longitude
Many others variables finally not used ...
  • USGS Diet Data: This data set adds a crucial dimension by linking dietary patterns of polar bears to environmental changes, offering insights into the survival strategies of these majestic creatures in a rapidly changing world.

Data Description

Code
tibble(Variables = c("BearID","Year","Capture.date",
                     "Ageclass", "Sex", "OW_50pcrt","Meltseason","Bearded_seal",
                     "Ringed_seal","Beluga_whale","Bowhead_whale","Seabird_nestling"), 
       Meaning = c("Unique observation per bear","Year of the capture",
                   "Date of the capture", "The stage of maturity of the polar bear",
                   "The sex of the polar bear", "Number of days covered in open water",
                   "Number of days of the meltseason in the corresponding year",
                   "Percentage of bearded seal eaten in the corresponding period",
                   "Percentage of ringed seal eaten in the corresponding period",
                   "Percentage of beluga whale eaten in the corresponding period",
                   "Percentage of bowhead sealed eaten in the corresponding period",
                   "Percentage of seabird nestling eaten in the corresponding period"))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive")) 
Variables Meaning
BearID Unique observation per bear
Year Year of the capture
Capture.date Date of the capture
Ageclass The stage of maturity of the polar bear
Sex The sex of the polar bear
OW_50pcrt Number of days covered in open water
Meltseason Number of days of the meltseason in the corresponding year
Bearded_seal Percentage of bearded seal eaten in the corresponding period
Ringed_seal Percentage of ringed seal eaten in the corresponding period
Beluga_whale Percentage of beluga whale eaten in the corresponding period
Bowhead_whale Percentage of bowhead sealed eaten in the corresponding period
Seabird_nestling Percentage of seabird nestling eaten in the corresponding period

3.2 Data Scraping and Cleanup

Our data collection faced challenges, notably in accessing polar bear data due to its high research cost in this field, researchers often face restrictions on sharing data due to funding sources or proprietary concerns. Despite these obstacles, we managed to source diverse datasets, which then required meticulous cleaning. This process involved standardizing data formats, correcting inconsistencies, and merging datasets while preserving data integrity. Our focus was on maintaining accuracy, crucial for the reliability and validity of our analysis. This careful preparation ensured that our findings were based on the best available data, despite the difficulties in obtaining it.

3.2.1 Living Planet Database

We start by defining the year columns and convert them to numeric, then we calculate the annual sum for each species. We then calculate a 5-year cumulative sum for each species, and prepare the data for plotting. We also calculate the moving average for the plot data. Then we convert all columns with names starting with ‘X’ followed by 4 digits to numeric, and replace non-numeric values in these columns with NA. We calculate the sum of values in columns starting with ‘X’, select only relevant columns for mapping, and plot the global distribution of polar bears.

Code
year_columns <- paste0("X", 1970:2020)
ursus_data <- apply(data1[data1$Binomial == "Ursus maritimus", year_columns], 2, as.numeric)
pusa_data <- apply(data2[data2$Binomial == "Pusa hispida", year_columns], 2, as.numeric)

ursus_annual_sum <- colSums(ursus_data, na.rm = TRUE)
pusa_annual_sum <- colSums(pusa_data, na.rm = TRUE)

# Function to calculate the cumulative sum over 5-year periods
calculate_5yr_cumulative_sum <- function(annual_sum) {
  cum_sum_5yr <- numeric(length(annual_sum))
  for (i in 1:length(annual_sum)) {
    start_index <- max(1, i - 4)
    cum_sum_5yr[i] <- sum(annual_sum[start_index:i])
  }
  return(cum_sum_5yr)
}

ursus_cum_sum_5yr <- calculate_5yr_cumulative_sum(ursus_annual_sum)
pusa_cum_sum_5yr <- calculate_5yr_cumulative_sum(pusa_annual_sum)

years <- 1970:2020
ursus_plot_data <- data.frame(Year = years, Cumulative_Sum_5yr = ursus_cum_sum_5yr)
pusa_plot_data <- data.frame(Year = years, Cumulative_Sum_5yr = pusa_cum_sum_5yr)

ursus_plot_data$Moving_Avg <- rollmean(ursus_plot_data$Cumulative_Sum_5yr, 10, fill = NA, align = "right")
pusa_plot_data$Moving_Avg <- rollmean(pusa_plot_data$Cumulative_Sum_5yr, 10, fill = NA, align = "right")

polar_bears<- data1
x_cols <- grep("^X\\d{4}$", names(polar_bears))
polar_bears[, x_cols] <- lapply(polar_bears[, x_cols], function(x) {
  suppressWarnings(as.numeric(as.character(x)))
})
polar_bears[, x_cols][!sapply(polar_bears[, x_cols], is.numeric)] <- NA
polar_bears$Totalpolarbears <- rowSums(polar_bears[, x_cols], na.rm = TRUE)

polar_bears[, x_cols][!sapply(polar_bears[, x_cols], is.numeric)] <- NA

polar_bears$Totalpolarbears <- rowSums(polar_bears[, x_cols], na.rm = TRUE)

polar_bears_map <- polar_bears[, c("Latitude", "Longitude", "Totalpolarbears")]

min_lat <- 60
max_lat <- 90
min_lon <- -180
max_lon <- 180

polar_bears_filtered <- filter(polar_bears, Latitude >= min_lat & Latitude <= max_lat & 
                                 Longitude >= min_lon & Longitude <= max_lon)

polar_bears_filtered2 <- polar_bears_filtered %>%
  mutate(MarkerColor = ifelse(Totalpolarbears >= 10000, "blue", "red"))

scale_factor <- 0.005

3.2.2 Aerial Master Database

Here’s how we’re going to proceed for our Aerial Master (AM) table. We start to calculate the number of missing values for each column in the AM data set, then we identify the columns with more than 200,000 missing values, and decide to remove them from the data set. For better clarity, we rename the columns in the data set. Then we calculate the proportion of missing values for each column in the AM data set, and create a data frame with column names and proportions of missing values, by sorting the data frame by the proportion of missing values in descending order.

Code
na_count <- colSums(is.na(AM))

na_cols <- which(na_count > 200000)

AM <- AM[, -na_cols]
AM <- as_tibble(AM)

AM <- AM %>% rename(date_time_hour = GMT_Minus8_DateTime)
AM <- AM %>% rename(Longitude = ArcLong)
AM <- AM %>% rename(Latitude = ArcLat)

part_nulles <- colMeans(is.na(AM))

df <- data.frame(colonne = names(AM), part_nulles = part_nulles)

df <- df %>%
  mutate(colonne = factor(colonne, levels = names(AM)[order(part_nulles)]))

We create a bar chart to visualize the proportion of missing values for each column:

Code
gg1 <- ggplot(df, aes(x = colonne, y = part_nulles, text = paste("Column:", colonne, "<br>:Share of zero values:", scales::percent(round(part_nulles, 2))))) +
  geom_bar(stat = "identity", fill = "steelblue", width = 0.8)  +
  labs(title = "As we can see, many columns have quite a lot of missing data ",
       x = "Columns",
       y = "Share of zero values per column") +
  scale_y_continuous(labels = percent_format()) +
  coord_flip() +
  geom_text(aes(label = scales::percent(round(part_nulles, 2))), hjust = -0.1, vjust = 0.5, size = 1.8) +
  theme(axis.text.y = element_text(size = 4))
ggplotly(gg1, tooltip = "text")

Due to the significant amount of missing data, only location data is retained. Other columns are binary variables representing parameters of little interest or with too many missing values. We then calculate the proportion of missing values for each column in the final data set, and we remove rows with missing values from it. Then we separate the “Datetime” column into individual components (month, day, year, hour, minute, second, period), and select relevant columns for further analysis.

Code
AM_fin <- select(AM, ID, date_time_hour, Latitude, Longitude)

part_nulles2 <- colMeans(is.na(AM_fin))

AM_fin <- drop_na(AM_fin)

names(AM_fin)[names(AM_fin) == "date_time_hour"] <- "Datetime"

AM_fin <- AM_fin |> separate(Datetime, into = 
                               c("month","day","year", "hour", "minute", "second", "period"))

AM_fin <- select(AM_fin, ID, day, month, year, Latitude, Longitude)

AM_fin <- drop_na(AM_fin)

AM_fin$day <- as.integer(AM_fin$day)
AM_fin$month <- as.integer(AM_fin$month)
AM_fin$year <- as.integer(AM_fin$year)

We create a bar chart to visualize the number of observations per year:

Code
fig_79_11 <- plot_ly(x = AM_fin$year, histfunc = 'sum', type = "histogram")
fig_79_11 <- fig_79_11 %>% layout(
    yaxis = list(type = 'linear'), 
    title = "Many observations in 2009-2011",
    bargap = 0.2  
)

fig_79_11

Let’s do the same operations for the second data set “Aerial Master 2015-2017”. We repeat the same operations than on the first one to clean it up.

Code
na_count2 <- colSums(is.na(AM2))

na_cols2 <- which(na_count2 > 200000)

AM2 <- AM2 %>% rename(date_time_hour = GMT_Minus8_DateTime)
AM2 <- AM2 %>% rename(Longitude = ArcLong)
AM2 <- AM2 %>% rename(Latitude = ArcLat)

AM2 <- AM2[, -na_cols2]
AM2 <- as_tibble(AM2)

part_nulles2 <- colMeans(is.na(AM2))

AM_fin2 <- select(AM2, ID, date_time_hour, Latitude, Longitude)

part_nulles2 <- colMeans(is.na(AM_fin2))

AM_fin2 <- drop_na(AM_fin2)

names(AM_fin2)[names(AM_fin2) == "date_time_hour"] <- "Datetime"

AM_fin2 <- AM_fin2 |> separate(Datetime, into = 
                                 c("month","day","year", "hour", "minute", "second", "period"))

AM_fin2 <- select(AM_fin2, ID, day, month, year, Latitude, Longitude)

AM_fin2 <- drop_na(AM_fin2)

AM_fin2$day <- as.integer(AM_fin2$day)
AM_fin2$month <- as.integer(AM_fin2$month)
AM_fin2$year <- as.integer(AM_fin2$year)

Again, we create a bar chart to visualize the number of observations per year.

Code
AM_fin2_ <- AM_fin2 %>%
  group_by(year) %>%
  summarise(count = n()) %>%
  ungroup()

AM_fin2_$tooltip <- paste("Year:", AM_fin2_$year, "<br>Number of Observations:", AM_fin2_$count)

p <- ggplot(data = AM_fin2_, aes(x = year, y = count)) + 
  geom_bar(stat = "identity", aes(text = tooltip), fill = "steelblue") +  
  labs(title = "Number of observations per year", x = "Year", y = "") +
  theme_minimal()

ggplotly(p, tooltip = "text")

We combine the filtered data from the two final data sets into a single data frame.

Code
df_merged <- rbind(AM_fin, AM_fin2)
df_merged <- df_merged[df_merged$Latitude != 0 & df_merged$Longitude != 0,]

icon <- makeIcon(iconUrl = "Report/photo/polarBear.png",
                 iconWidth = 35, iconHeight = 35)

For the third data set about the diets of polar bears in Beaufort, there is fast no data cleaning.

Code
na_count_pb <- colSums(is.na(pb))
kable(na_count_pb) %>% 
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))
x
BearID 0
Year 0
Capture date 0
Ageclass 0
Sex 0
OW_50prct 0
OW_15prct 0
Meltseason 0
Bearded_seal 0
Ringed_seal 0
Beluga_whale 0
Bowhead_whale 0
Seabird_nestling 0

Since there are no missing values, the data set is considered “perfectly clean”. We only rename specific columns for better clarity.

Code
pb <- pb %>% rename(days_in_cov_water = OW_50prct)
pb <- pb %>% rename(Perc_bearded_eaten = Bearded_seal)
pb <- pb %>% rename(Perc_ringed_eaten = Ringed_seal)
pb <- pb %>% rename(Perc_beluga_eaten = Beluga_whale)
pb <- pb %>% rename(Perc_bowhead_eaten = Bowhead_whale)
pb <- pb %>% rename(Perc_seabird_eaten = Seabird_nestling)

3.2.3 Polar Bear Distribution Database

We began by extracting the essential information, such as each bear’s identifier, date and geographical coordinates. This data was then converted into standardised formats to ensure the consistency and accuracy of the analysis. A crucial aspect of our processing was data cleaning, where we eliminated incomplete records to ensure the reliability of our study. After organising the data by bear identifier and date, we calculated the distance traveled between successive bear points using Haversine’s formula. As a result, the first GPS broadcast for each bear does not have a value in the distance column calculated in this way. This method provides us with valuable insights into the movement patterns of bears. This approach created a coherent and detailed data set, essential for our behavioural analysis of bear movements in their natural habitat.

Code
ours_nettoyé <- ours_de_base %>%
  select(bearid = BearID_rsf, date = DateTimeUTC_rsf, latitude = latitude_rsf, longitude = longitude_rsf)

ours_nettoyé$date <- as.POSIXct(ours_nettoyé$date, format = "%m/%d/%Y %H:%M", tz = "UTC")

ours_clean <- ours_nettoyé %>%
  filter(!is.na(bearid) & !is.na(date) & !is.na(latitude) & !is.na(longitude))

ours_clean$date <- as.Date(ours_clean$date)

ours_clean <- ours_clean %>%
  arrange(bearid, date)

ours_clean <- ours_clean %>%
  group_by(bearid) %>%
  mutate(
    Longitude_prev = lag(longitude),
    Latitude_prev = lag(latitude)
  ) %>%
  rowwise() %>%
  mutate(
    Distance = ifelse(
      is.na(Longitude_prev) | is.na(Latitude_prev), 
      NA, 
      distHaversine(
        c(longitude, latitude), 
        c(Longitude_prev, Latitude_prev)
      )
    )
  ) %>%
  ungroup()
Code
nb_obs <- read_csv(here::here("Report/données/ours_clean.csv"))
# Convertir la date en format Date et extraire l'année
nb_obs$date <- as.Date(nb_obs$date)
nb_obs$year <- format(nb_obs$date, "%Y")

# Calculer le nombre d'observations par année
observations_par_an <- nb_obs %>%
  group_by(year) %>%
  summarise(NumberPointGPS = n())

# Créer un graphique
graph_nb_obs <- ggplot(observations_par_an, aes(x = year, y = NumberPointGPS, fill = NumberPointGPS)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  scale_fill_gradient(low = "blue", high = "red") +
  labs(title = "Number of polar bear GPS relvets by year",
       x = "Year",
       y = "Number of GPS points") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) # Rotation des étiquettes

# Convertir en graphique interactif
graph_nb_obs_interactif <- ggplotly(graph_nb_obs, tooltip = c("x", "y"))

# Afficher le graphique
graph_nb_obs_interactif

3.2.4 Temperatures Data Base

In order to clean this data set, we converted the temperatures from Fahrenheit to Celsius, an international standard that makes them easier to understand and compare. The next step was to merge the maximum and minimum temperature data based on date to get a consistent overview. Finally, we calculated the daily average temperature, which provides a more balanced and meaningful perspective of weather conditions. To simplify and clarify our results, we chose to keep only the mean temperature data, eliminating the maximum and minimum temperature columns. This process provided an accurate and easily interpretable overview of climate trends on Pelly Island.

Code
convert_to_celsius <- function(fahrenheit) {
  (fahrenheit - 32) * 5 / 9
}

max_clean <- max_temp %>%
  select(Date, TMAX = `TMAX..Degrees.Fahrenheit.`) %>%
  mutate(Date = as.Date(Date), TMAX = convert_to_celsius(TMAX))

min_clean <- min_temp %>%
  select(Date, TMIN = `TMIN..Degrees.Fahrenheit.`) %>%
  mutate(Date = as.Date(Date), TMIN = convert_to_celsius(TMIN))

df_merged2 <- inner_join(max_clean, min_clean, by = "Date")

temp_beaufort <- df_merged2 %>%
  mutate(Average_Temperature = (TMAX + TMIN) / 2)

temp_beaufort <- temp_beaufort %>%
  select(-TMAX, -TMIN)

3.2.5 Sea Ice Cover Data Base

To refine our analysis of the Beaufort Sea, we first filtered the data, discarding information relating to other maritime regions. Next, we converted the date format from ‘yyyyyddd’ to a standard format, facilitating temporal analyses. These data preparations are crucial to ensuring the accuracy and relevance of our future analyses.

Code
glace_beaufort$
glace_beaufort <- glace_beaufort %>%
  select(-Chukchi_Sea,                    -East_Siberian_Sea  ,             -Laptev_Sea      ,               
          -Kara_Sea    ,                    -Barents_Sea      ,               -Greenland_Sea  ,                
          -Baffin_Bay_Gulf_of_St._Lawrence ,-Canadian_Archipelago ,           -Hudson_Bay   ,                  
          -Central_Arctic       ,           -Bering_Sea  ,                   -Baltic_Sea ,                    
          -Sea_of_Okhotsk   ,               -Yellow_Sea  ,                   -Cook_Inlet, -Northern_Hemisphere)

glace_beaufort$yyyyddd <- as.character(glace_beaufort$yyyyddd)

# Convertir la colonne yyyyddd en dates
glace_beaufort$yyyyddd <- as.Date(as.numeric(substr(glace_beaufort$yyyyddd, 5, 7)) - 1, 
                               origin = as.Date(paste0(substr(glace_beaufort$yyyyddd, 1, 4), "-01-01")))

glace_beaufort$yyyyddd <- ymd(glace_beaufort$yyyyddd)

3.2.6 Combined Data Sets

From this data set, we assigned the nearest mean temperature date to each polar bear GPS point. In order to do this, we scanned each polar bear observation, calculated the difference in days between that observation and the dates of the temperature readings, and selected the temperature corresponding to the closest date. This process made it possible to precisely link the climatic conditions to the specific positions of the bears.

Code
ours_clean$Closest_Avg_Temperature <- numeric(nrow(ours_clean))

for (i in 1:nrow(ours_clean)) {
    # Calculate the difference in days
  differences <- abs(difftime(temp_beaufort$Date, ours_clean$date[i], units = "days"))
  
    # Find the minimum difference index
  closest_idx <- which.min(differences)
  
    # Assign the nearest average temperature
  ours_clean$Closest_Avg_Temperature[i] <- temp_beaufort$Average_Temperature[closest_idx]
}


ours_clean <- arrange(ours_clean, bearid)

ours_clean <- ours_clean %>%
  arrange(bearid, date)

bear_periods <- ours_clean %>%
  group_by(bearid) %>%
  summarise(start_date = min(date), end_date = max(date))

count_active_bears <- function(date, periods) {
  sum(periods$start_date <= date & periods$end_date >= date)
}

unique_dates <- unique(ours_clean$date)
count_bears_per_date <- sapply(unique_dates, function(date) count_active_bears(date, bear_periods))

count_bears_df <- data.frame(date = unique_dates, Count_Bears = count_bears_per_date)

ours_clean <- merge(ours_clean, count_bears_df, by = "date")

3.2.7 Data Set Regression for Travel Forecasts

  • We use a methodical approach to develop and evaluate predictive models of latitude and longitude. We began by dividing our selected data set into two distinct sets: one for training and one for testing, ensuring reproducibility through a fixed seed parameter. This division ensured that models would be trained on a subset of the data and tested on a separate set, not used during training, for an unbiased assessment of their performance.

  • We then trained two separate models using the random forest algorithm, one to predict latitude and the other for longitude, excluding longitude and latitude respectively from the predictor variables to avoid redundancy. After training, we used the models to predict latitude and longitude values on the test set.

  • The performance of the models was evaluated by calculating the root mean square error (RMSE) for the latitude and longitude predictions. The results showed an RMSE of 1.2227 for latitude and 13.4141 for longitude. By quantifying the mean difference between predicted and actual values, these RMSE measurements revealed that the latitude model was relatively accurate, while the longitude model had a greater margin of error.

  • We extend our study to include predictions over the next five years. Using the latest date in our data set as a starting point, we have generated a sequence of future dates. To these dates, we used a function to associate the relevant mean temperature and ice extent data for the Beaufort Sea, taking care to fill in any gaps with the median of the available temperatures, to maintain data consistency.

  • We then enriched the future data by assigning bear identifiers (BearIDs) to each GPS point, simulating the tracking of 20 different bears. This step allowed us to contextualize our predictions within a more realistic framework, reflecting the potential diversity of bear movements in the future.

  • Using the random forest models, we had previously trained, we predicted the geographical positions (latitude and longitude) for each combination of BearID and future date. These predictions, based on anticipated environmental conditions, were compiled into a new dataset, providing a prospective view of possible polar bear movements in the face of future climate and environmental change. This work represents a significant advance in our understanding of polar bear dynamics and their potential adaptation to changes in their habitat.

Code
data$date <- as.Date(data$date)
data$year <- year(data$date)
data$month <- month(data$date)
data$day <- day(data$date)

data_selected <- data %>%
  select(latitude, longitude, year, month, day, Closest_Avg_Temperature, Beaufort_Sea, date)

data_selected <- na.omit(data_selected)


# Division des données en ensemble d'entraînement et de test
set.seed(123) # Pour la reproductibilité
indices <- sample(1:nrow(data_selected), size = 0.8 * nrow(data_selected))
train_data <- data_selected[indices, ]
test_data <- data_selected[-indices, ]

# Entraînement du modèle pour la latitude
model_latitude <- randomForest(latitude ~ ., data = train_data[, -2]) 

# Entraînement du modèle pour la longitude
model_longitude <- randomForest(longitude ~ ., data = train_data[, -1]) 

predictions_latitude <- predict(model_latitude, test_data)
predictions_longitude <- predict(model_longitude, test_data)

rmse_latitude <- sqrt(mean((predictions_latitude - test_data$latitude)^2))
rmse_longitude <- sqrt(mean((predictions_longitude - test_data$longitude)^2))

print(paste("RMSE Latitude:", rmse_latitude))
[1] "RMSE Latitude: 1.2227392145361"
Code
print(paste("RMSE Longitude:", rmse_longitude))
[1] "RMSE Longitude: 13.4141324335172"
Code
last_date <- max(data_selected$date)
future_dates <- seq(from = last_date + 1, by = "day", length.out = 5 * 365)

future_data <- data.frame(
  date = future_dates,
  Closest_Avg_Temperature = temp_beaufort$Average_Temperature[match(future_dates, temp_beaufort$date)],
  Beaufort_Sea = glace_beaufort$Beaufort_Sea[match(future_dates, glace_beaufort$date)],
  year = year(future_dates),
  month = month(future_dates),
  day = day(future_dates)
)

nombre_ours <- 20
bear_id_start <- 301
total_days <- length(future_dates)

future_data$BearID <- rep(bear_id_start:(bear_id_start + nombre_ours - 1), 
                          times = ceiling(total_days / nombre_ours))[1:total_days]


predicted_latitude <- predict(model_latitude, future_data)
predicted_longitude <- predict(model_longitude, future_data)

predictions <- data.frame(BearID = future_data$BearID, date = future_dates, latitude = predicted_latitude, longitude = predicted_longitude)

predictions <- na.omit(predictions)

4 Exploratory Data Analysis

4.1 Locations of research centers

Focus on Alaska

In light of recent data, there is growing concern over the declining population of polar bears. This trend warrants a closer examination of the origins of our data. To gain a clearer understanding of these patterns, we will delve into the geographic source of this information. Such an approach is crucial, as it allows us to pinpoint specific regions contributing most significantly to the data set.

On the following map, we identify the locations from which the majority of our polar bear sightings are reported. Given the essential nature of these data for understanding the state of polar bear populations, identifying the main source regions is a step that helped us greatly in subsequently guaranteeing the robustness of our results.

Code
leaflet_map <- leaflet(polar_bears_filtered2) %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  addCircleMarkers(
    lng = ~Longitude, lat = ~Latitude, 
    popup = ~paste("Country: ", Country, "<br>",
                   "Total Observations: ", Totalpolarbears),
    color = ~MarkerColor, 
    radius = ~Totalpolarbears * scale_factor, 
    clusterOptions = markerClusterOptions(),
    fillOpacity = 0.5
  ) %>%
  addMiniMap() %>% 
  setView(lng = mean(c(min_lon, max_lon)), lat = mean(c(min_lat, max_lat)), zoom = 4) %>%
  addProviderTiles(providers$Esri.WorldImagery) %>% 
  addScaleBar() %>% 
  addLayersControl(
    baseGroups = c("OpenStreetMap", "Satellite"),
    options = layersControlOptions(collapsed = FALSE)
  )

leaflet_map

In our data science analysis of Arctic wildlife populations, we discovered significant disparities in data collection, a key point highlighted through our visualizations. From this map, we identify the distribution of research centers and their observation counts, revealing a concentration of research efforts in North America, raising questions about the global validity of our analyses. In order to address our research questions, we decided to focus on Alaska, as it is the region that provides the best amount of data.

4.2 5-Year cumulative count trend for ringed seals and polar bears

We used cumulative averages in our graphs to mitigate annual variability, revealing long-term trends in bear and seal populations over 50 years. These trends indicate an early decline in bear populations compared to seals, suggesting that their populations are not directly correlated and that external factors like human impact and environmental changes play a significant role.

Code
p1 <- ggplot(pusa_plot_data, aes(x = Year, y = Cumulative_Sum_5yr)) +
  geom_line(color = "green") +
  geom_line(aes(y = Moving_Avg), color = "red", linetype = "dashed") +
  ggtitle("5-Year Cumulative Count Trend for Ringed Seals (Pusa hispida)") +
  xlab("Years") +
  ylab("Cumulative Count Trend")

p2 <- ggplot(ursus_plot_data, aes(x = Year, y = Cumulative_Sum_5yr)) +
  geom_line(color = "blue") +
  geom_line(aes(y = Moving_Avg), color = "red", linetype = "dashed") +
  ggtitle("5-Year Cumulative Count Trend for Polar Bears (Ursus maritimus)") +
  xlab("Years") +
  ylab("Cumulative Count Trend")

grid.arrange(p1, p2, ncol = 2)

These two graphs show cumulative trends over 5 years for two distinct species: the ringed seal (Pusa hispida) and the polar bear (Ursus maritimus), respectively. The first trend, represented by a green line, shows the cumulative sum over 5 years of seal counts. This shows how the total number of seals and bears observed has evolved over a rolling 5-year period. The second trend, a dotted red line, represents a moving average. This average smoothes out annual variations to give a better idea of the general trend over the long term. It is important to note that our data set no longer contains data from 2010 onwards, which explains the decline.

In this initial phase, we have gathered the essential data to analyze the movement of polar bears and the impacts of climate change.

4.3 Polar bear location

Here you can see data on the location of polar bears in 1979. The interest is not in the number of data, as we have seen from the graph of the number of sightings that these data are inconsistent and tend to increase as the years go by. The interest lies rather in the general location of all these data. So here we see that polar bears tended to live in northern Alaska in 1979.

Year 1979:

Code
# Cards for analysis
# Filtrer les données pour l'année 1979

df_1979 <- df_merged %>% filter(year == 1979)

# Créer une carte
map_1979 <- leaflet()

# Ajouter une basemap OpenStreetMap
map_1979 <- map_1979 %>% addTiles(
  urlTemplate = "https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png",
  attribution = "Map data &copy; <a href='https://www.openstreetmap.org/'>OpenStreetMap</a> contributors, <a href='https://creativecommons.org/licenses/by-sa/2.0/'>CC-BY-SA</a>"
)
# Ajouter les points de données
map_1979 <- leaflet(quakes) %>% addTiles() %>%
  addMarkers(data = df_1979,lat = df_1979$Latitude, lng = df_1979$Longitude,
             clusterOptions = markerClusterOptions(),
             icon = icon)

# Afficher la carte
map_1979

The locations of polar bears in 1990 are shown here. Compared with 1979, we can observe a westward shift in the average location of the bears. However, they still remain in northern Alaska.

Year 1990:

Code
# Filtrer les données pour l'année 1990
df_1990 <- df_merged %>% filter(year == 1990)

# Créer une carte
map_1990 <- leaflet()

# Ajouter une basemap OpenStreetMap
map_1990 <- map_1990 %>% addTiles(
  urlTemplate = "https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png",
  attribution = "Map data &copy; <a href='https://www.openstreetmap.org/'>OpenStreetMap</a> contributors, <a href='https://creativecommons.org/licenses/by-sa/2.0/'>CC-BY-SA</a>"
)
# Ajouter les points de données
map_1990 <- leaflet(quakes) %>% addTiles() %>%
  addMarkers(data = df_1990,lat = df_1990$Latitude, lng = df_1990$Longitude,
             clusterOptions = markerClusterOptions(),
             icon = icon)

# Afficher la carte
map_1990

Data for 2015 are shown here. Compared with 1990, there has been a clear westward shift in location data. Some bears remain in the north, but it is interesting to note that some bears also live in the far west of Alaska, but no longer in the north.

Year 2015:

Code
# Filtrer les données pour l'année 2015
df_2015 <- df_merged %>% filter(year == 2015)

# Créer une carte
map_2015 <- leaflet()

# Ajouter une basemap OpenStreetMap
map_2015 <- map_2015 %>% addTiles(
  urlTemplate = "https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png",
  attribution = "Map data &copy; <a href='https://www.openstreetmap.org/'>OpenStreetMap</a> contributors, <a href='https://creativecommons.org/licenses/by-sa/2.0/'>CC-BY-SA</a>"
)
# Ajouter les points de données
map_2015 <- leaflet(quakes) %>% addTiles() %>%
  addMarkers(data = df_2015,lat = df_2015$Latitude, lng = df_2015$Longitude,
             clusterOptions = markerClusterOptions(),
             icon = icon)

# Afficher la carte
map_2015

4.4 Average weekly distance traveled per year

To represent the movements of polar bears within our data set, we needed to generate a trend for the average distance these mammals travel each day. We calculated the total distance traveled by each bear per day and per year. Next, we determined the daily average of this distance for each bear per year. Finally, we calculated the overall average of these daily averages for all the bears, year by year, in order to identify a general trend in the distances traveled by polar bears over time. These results are shown in the following interactive graph, and you can select a specific year by double-clicking on it. As explained in the graph (graph number on the number of GPS points per year) we have a wide disparity in our data depending on the year. Despite this, a trend in the daily distance traveled by polar bears can be seen in the graph. There has been a downward trend in this distance over the last 15 years.

Code
plot_distance <- ours_clean

# Convertir la colonne date en type date et extraire l'année
plot_distance$date <- as.Date(plot_distance$date)
plot_distance$year <- year(plot_distance$date)

# Calculer la distance totale parcourue par jour pour chaque ours dans chaque année
distance_journaliere_par_annee <- plot_distance %>%
  group_by(bearid, date, year) %>%
  summarize(distance_totale = sum(Distance, na.rm = TRUE))

# Calculer la moyenne quotidienne pour chaque ours par année
moyenne_quotidienne_par_ours_par_annee <- distance_journaliere_par_annee %>%
  group_by(bearid, year) %>%
  summarize(
    jours_uniques = n(),
    distance_annuelle = sum(distance_totale, na.rm = TRUE),
    moyenne_quotidienne = distance_annuelle / jours_uniques
  )

# Calculer la moyenne globale quotidienne pour chaque année sur tous les ours
moyenne_globale_quotidienne_par_annee <- moyenne_quotidienne_par_ours_par_annee %>%
  group_by(year) %>%
  summarize(moyenne_globale = mean(moyenne_quotidienne, na.rm = TRUE))

# Créer un histogramme avec ggplot
graph_distance <- ggplot(moyenne_globale_quotidienne_par_annee, aes(x = year, y = moyenne_globale, fill = as.factor(year), text = moyenne_globale)) +
  geom_bar(stat = "identity") +
  scale_fill_viridis_d() +
  theme_minimal() +
  labs(fill = "Year", 
       x = "Year", 
       y = "Average weekly distance in metres", 
       title = "Average weekly distance traveled per year")

# Convertir le graphique ggplot en un graphique plotly interactif
# 'text' spécifie les informations à afficher dans l'info-bulle
p_plotly_distance_ours <- ggplotly(graph_distance, tooltip = "text")

# Afficher le graphique
p_plotly_distance_ours

4.5 GPS tracks of polar bear

Code
create_map_for_bear <- function(bearid, data) {
  filtered_data <- data[data$bearid == bearid, ]
  
  m <- leaflet(filtered_data) %>%
    addTiles() %>%
    addPolylines(~longitude, ~latitude, group = ~bearid, weight=2, color="#3388ff", opacity=0.5) %>% # Trace plus fine et en transparence
    addCircles(lng = ~longitude, lat = ~latitude, weight = 1, radius = 100, # Points GPS plus gros
               popup = ~paste("Year:", format(as.Date(date), "%Y"), "<br/>",
                              "Distance traveled between last point:", round(Distance, 2), "m"),
               fillOpacity = 0.8) %>%
    addScaleBar(options = list(position = "bottomleft", maxWidth = 100, metric = TRUE, imperial = FALSE))
  
  saveWidget(m, file=paste0("bear_", bearid, ".html"), selfcontained = TRUE)
}

unique_bears <- unique(ours_data$bearid)
for(bear in unique_bears) {
  create_map_for_bear(bear, ours_data)
}

Select a Bear to display its Track, click on a GPS point to display its characteristics

4.6 Temeprature evolution

We selected temperature readings from various meteorological stations in the Beaufort Sea as our primary indicator. Additionally, we have investigated the trends in ice cover and its temporal fluctuations. The process of processing, merging, and analyzing disparate data sets presents a significant challenge. Currently, our statistical analysis primarily considers temperature changes as the key factor in explaining the migration patterns of polar bears.

Code
couleurs_saisons <- c("Spring" = "#89A45E", "Summer" = "#7FC6BC", "Autumn" = "#D17A22", "Winter" = "#5A7D9A")

# Déterminer les années uniques dans les données
unique_years <- unique(year(ymd(temp_beaufort_graph$date)))

# Fonction pour générer un graphique par saison et année
generate_plot <- function(season, year) {
  data_filtered <- temp_beaufort_graph %>%
    filter(season == season, year(ymd(date)) == year) %>%
    mutate(Average_Temperature = round(Average_Temperature * 2) / 2)

  couleur_ligne <- couleurs_saisons[season]

  ggplot(data_filtered, aes(x = date, y = Average_Temperature, group = season)) +
    geom_line(color = couleur_ligne, alpha = 0.7) +
    geom_point(color = couleur_ligne, size = 1, alpha = 0.7) +
    theme_minimal() +
    labs(title = paste("Temperature in the Beaufort Sea: ", season, " ", year),
         x = "Date",
         y = "Temperature (°C)")
}

# Générer et sauvegarder les graphiques en tant que fichiers HTML pour chaque combinaison saison-année
for (season in names(couleurs_saisons)) {
  for (year in unique_years) {
    plotly_plot <- ggplotly(generate_plot(season, year))
    html_file <- paste0("plot_", season, "_", year, ".html")
    htmlwidgets::saveWidget(plotly_plot, file = html_file)
  }
}

4.7 Average sea ice cover

To represent the average ice cover in the Beaufort Sea, we have also chosen to divide it up by season. In the following graph, you can double-click on a season to isolate it from the others. As with temperature, there is a strong seasonal effect on the surface area (in km2) of ice in the Beaufort Sea. In addition, there is a tendency for this to decrease over the years.

Code
# Fonction pour déterminer la saison
#getSeason <- function(date) {
#  month <- as.numeric(format(date, "%m"))
#  if (month %in% c(3, 4, 5)) {
#    return("Spring")
#  } else if (month %in% c(6, 7, 8)) {
#    return("Summer")
#  } else if (month %in% c(9, 10, 11)) {
#    return("Fall")
# } else {
#    return("Winter")
#  }
#}


# Ajouter la colonne saison
#glace_beaufort_graph <- glace_beaufort %>%
#  mutate(season = sapply(date, getSeason))


#couleurs_saisons <- c("Spring" = "#89A45E",  
#                      "Summer" = "#7FC6BC",        
#                      "Fall" = "#D17A22",    
#                      "Winter" = "#5A7D9A")

glace_beaufort_graph <- read.csv(here::here("Report/données/glace_beaufort_graph.csv"))
# Création du graphique ggplot
graph_beaufort <- ggplot(glace_beaufort_graph, aes(x = date, y = Beaufort_Sea, color = season)) +
  geom_line() +
  scale_color_manual(values = couleurs_saisons) +
  theme_minimal() +
  labs(title = "Extent of Beaufort Sea ice by year and season",
       x = "Year",
       y = "Extent of ice (km²)")

# Conversion en graphique interactif avec plotly
graph_beaufort_interactif <- ggplotly(graph_beaufort)

# Afficher le graphique
graph_beaufort_interactif

4.8 Number of polar bear observations

In addition to temperature and ice cover in the Beaufort Sea, we want to add another variable to our data set. It seems important to us to reflect the number of active polar bears (i.e. transmitting GPS data) present in our data set on a date T. This is why we have created a “Bear Counter” function, which associates the number of active polar bears with each GPS point. This allows us to adjust our data set in anticipation of our analysis. The following graph shows this function and thus the evolution of the number of active polar bears in the study over the years. Here again we can see that our data is fairly heterogeneous from year to year. This reinforces our idea of increasing our understanding of the data set to develop a rigorous statistical analysis.

Code
plot_nb_ours <- ours_clean

plot_nb_ours$date <- as.Date(plot_nb_ours$date)
plot_nb_ours$year <- year(plot_nb_ours$date)

# Calculer la moyenne du nombre d'ours actifs par année et arrondir à l'entier supérieur
moyenne_ours_par_annee <- plot_nb_ours %>%
  group_by(year) %>%
  summarize(moyenne_ours = ceiling(mean(Count_Bears, na.rm = TRUE)))

# Créer un graphique avec ggplot
graph_ours_v1 <- ggplot(moyenne_ours_par_annee, aes(x = year, y = moyenne_ours, group = 1)) +
  geom_line() +
  scale_y_continuous(limits = c(0, 50)) + # Fixer les limites de l'axe Y si nécessaire
  theme_minimal() +
  labs(x = "Year", y = "Number of bears studied per year", title = "Number of bear observations")


# Animer le graphique
nb_ours_anime <- graph_ours_v1 + transition_reveal(year)

# Enregistrer l'animation
anim_save("/Users/zoedardare/Documents/Master Management/Data science in BA/Quarto polar bears_files/Data Ugo/animation_moyenne_ours.gif", animation = nb_ours_anime)

# Afficher l'animation
animate(nb_ours_anime)

We have been able to separate the temperatures into 4 seasons over the years. We then notice a very strong seasonality through the average temperatures. What’s more, we don’t specifically see an increase or decrease in these temperatures over time.

4.9 Relationship between the days in covered water and the diets of polar bears

We create a scatterplot to visualize the relationship between the days in covered water and the ringed eaten by the polar bears.

Code
p_basic <- ggplot(pb, aes(x = days_in_cov_water, y = Perc_ringed_eaten)) +
  geom_point() +
  ggtitle("No clear relation between days in cover water and ringed seal consumed") +
  xlab("Number of days in water in melt season") +
  ylab("Ringed seal consumed in the during the previous year")
  theme(plot.title = element_text(size = 12))  # Ajustez la taille ici
List of 1
 $ plot.title:List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : NULL
  ..$ size         : num 12
  ..$ hjust        : NULL
  ..$ vjust        : NULL
  ..$ angle        : NULL
  ..$ lineheight   : NULL
  ..$ margin       : NULL
  ..$ debug        : NULL
  ..$ inherit.blank: logi FALSE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 - attr(*, "class")= chr [1:2] "theme" "gg"
 - attr(*, "complete")= logi FALSE
 - attr(*, "validate")= logi TRUE
Code
p_basics <- ggplotly(p_basic)
p_basics

The following graph shows the percentage of animals eaten.

Code
attach(pb)
data <- data.frame(
  labels = c("Ringed seals", "Bearded seals", "Seabirds", "Bowhead whales", "Belugas"),
  values = c(mean(Perc_ringed_eaten), mean(Perc_bearded_eaten),
             mean(Perc_seabird_eaten), mean(Perc_bowhead_eaten),
             mean(Perc_beluga_eaten))
)
ordered_data <- data %>% arrange(desc(values))
# Create the bar chart
ordered_data$values_rounded <- round(ordered_data$values, 2)

ordered_data$tooltip <- paste("Specy:", ordered_data$labels, 
                              "<br>Proportion:", ordered_data$values_rounded,"%")

gg <- ggplot(ordered_data, aes(x = labels, y = values, text = tooltip)) +
  geom_bar(stat = "identity", width = 0.75, fill = c("lightblue", "darkblue", "lightyellow", "skyblue", "blue")) +
  labs(title = "Ringed seals is the favourite food for polar bears", x = "Species eaten", y = "Proportion (%)")
ggplotly(gg, tooltip = "text")

4.10 Polar bear heat map

We can then plot the movements of each polar bear in our Beaufort Sea study on a map. What’s more, each GPS point is characterised by its year and the distance the polar bear has traveled since its last GPS fix. As the GPS points are not homogeneous depending on the BearID and the year, the track between the points is not the real trajectory of the polar bear selected. It presents the general idea of the journey made. For greater accuracy, it would be necessary to increase the number of GPS readings over time.

Code
predictions_complete$year <- format(as.Date(predictions_complete$date), "%Y")

create_heatmap_for_year <- function(year, data) {
  filtered_data <- data[data$year == year, ]
  
  m <- leaflet(filtered_data) %>%
    addTiles() %>%
    addHeatmap(lng = ~longitude, lat = ~latitude, intensity = ~1, blur = 20, max = 0.05, radius = 15) %>%
    addScaleBar(options = list(position = "bottomleft", maxWidth = 100, metric = TRUE, imperial = FALSE))
  
  saveWidget(m, file=paste0("heatmap_", year, ".html"), selfcontained = TRUE)
}

for(year in 2015:2020) {
  create_heatmap_for_year(year, predictions_complete)
}

Select a Year to display the Heatmap

5 Analysis

5.1 A decreasing polar bear population

We can see that the location of bears changes over time. As previously stated, an analysis based on the number of data points is of little relevance, as the bears were not specified in the data, and we can imagine an improvement in the methods and means put in place to obtain these measurements over time. What we can analyse, however, is the gradual shift towards the west in the measurements taken. We can imagine that as a result of global warming, bears are tending to seek out colder regions, and that while northern Alaska fulfilled this condition in the 80s and 90s, they now have to move to the even colder west of the country to find conditions that are conducive to their survival.

We can observe that the seals’ numbers oscillate yet show an overarching increase, which hints at their adaptability to environmental shifts—a resilience that might be bolstered by conservation measures. However, data irregularities, such as inconsistent collection frequencies and varying methodologies, could be obscuring the true narrative of these trends. These factors remind us to interpret the data cautiously.

Anyhow polar bears present a different story. The data, primarily from the American continent, indicate a concerning decrease in their population. The retreating Arctic ice is a clear signal of the climate crisis, stripping away the habitat crucial for the bears’ hunting and breeding. The data’s geographical limitations warrant a cautious approach to generalizing these findings across the glob.

From our study, we can observe that the decline in the number of polar bears is markedly more significant than the one of their primary prey, the seals. This pattern suggests a hypothesis: faced with an invisible enemy and the only predator of polar bears – climate change – these bears are in a dire situation. The warming climate leads to the ice in the bay forming later and melting earlier each year, which may abbreviate the seal hunting season to such an extent that polar bears may not have sufficient time to accumulate the fat reserves needed to survive their summer fast. The breeding cycle of the females could be disrupted, threatening the continuity of the species.

5.3 How do changes in sea ice coverage correlate with polar bear movement patterns as revealed by GPS tracking?

In this section, we set up a new panel of twenty polar bears, independently of those already considered in previous research activities. Due to practical constraints and limitations inherent in our skills, we opted for this method. It is important to note that this approach could influence the accuracy of our model, particularly regarding the continuity of movement of current bears.

Polar bears are mammals whose behaviour is highly complex, making their movements difficult to model. These challenges are exacerbated by constantly changing environmental and geographical phenomena, making it particularly difficult to estimate their movements accurately. Despite these obstacles, we mobilised all our available resources and skills to develop a simplified model aimed at predicting polar bear movement trends for the five years following our study period. Our preliminary model indicates a significant increase in the presence of polar bears along the Beaufort Sea coast over time. However, the inherent limitations of our model and its simplistic nature make any definitive conclusions premature.

Code
# Ajouter des variables saisonnières basées sur le mois
ours_clean <- ours_clean %>%
  mutate(date = ymd(date),
         month = month(date),
         season = case_when(
           month %in% c(12, 1, 2) ~ "Winter",
           month %in% c(3, 4, 5) ~ "Spring",
           month %in% c(6, 7, 8) ~ "Summer",
           month %in% c(9, 10, 11) ~ "Autumn",
           TRUE ~ NA_character_
         ))

# Convertir la saison en variable factorielle
ours_clean$season <- as.factor(ours_clean$season)

# Régression linéaire
model <- lm(Distance ~ Closest_Avg_Temperature + Count_Bears + latitude + longitude + season, data = ours_clean)

# Résumé du modèle
sum_mod<-summary(model)


# Conversion de la sortie de summary en data frame
coef_df1 <- as.data.frame(sum_mod$coefficients)

# Utilisation de formattable pour afficher le résumé
formattable(coef_df1)
Estimate Std. Error t value Pr(>|t|)
(Intercept) -15704.358046 2148.116825 -7.3107560 2.664331e-13
Closest_Avg_Temperature -218.358044 4.032561 -54.1487294 0.000000e+00
Count_Bears 98.310057 4.685425 20.9821019 1.161376e-97
latitude 277.315868 28.592081 9.6990445 3.071029e-22
longitude 4.608247 3.239887 1.4223483 1.549264e-01
seasonSpring -4241.466867 147.268913 -28.8008297 4.173695e-182
seasonSummer 39.891773 149.265983 0.2672529 7.892746e-01
seasonWinter -3645.022000 195.345021 -18.6594057 1.196508e-77

This multiple linear regression model, developed from 263,886 observations, explores the factors influencing the distance traveled by bears. The model includes environmental and geographical variables as well as seasonal effects, but has an R2 of just 0.02, suggesting that other variables not included may also be important. Average temperature showed a significant negative impact of -218 meters per degree, suggesting that higher temperatures are associated with a decrease in distance traveled. The number of bears observed had a positive effect, with an increase of 98 meters for each additional bear observed. Geographical variables such as latitude and longitude showed significant effects, with an increase of 277 meters for each additional degree of latitude. However, the effect of longitude, although positive, was not statistically significant (p = 0.15). The seasonal effects are clearly visible, and we have arbitrarily chosen autumn as the reference season for comparing them. In terms of seasonal effects, spring shows a significant reduction of 4,241 meters compared with autumn and winter a reduction of 3,645 meters. Summer, on the other hand, shows no significant difference from autumn. Secondly, we integrate the ice cover (in km2) over the Beaufort Sea.

Code
# Convertir 'yyyyddd' en 'YYYY-MM-DD'
glace_beaufort$date <- as.Date(paste0(substr(glace_beaufort$yyyyddd, 1, 4), "-01-01")) +
  (as.numeric(substr(glace_beaufort$yyyyddd, 5, 7)) - 1)

# Fusionner les datasets sur la base de la date
dataset_regression <- left_join(ours_clean, glace_beaufort, by = "date")

# Filtrer pour ne garder que les données à partir de 2006
dataset_regression <- filter(dataset_regression, year(date) >= 2006)

# Régression linéaire sur le déplacement des ours
model <- lm(Distance ~ Closest_Avg_Temperature + Count_Bears + Beaufort_Sea, data = dataset_regression)

# Résumé du modèle
sum_mod2<-summary(model)

# Conversion de la sortie de summary en data frame
coef_df2 <- as.data.frame(sum_mod2$coefficients)

# Utilisation de formattable pour afficher le résumé
formattable(coef_df2)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 7.880316e+03 3.653602e+02 21.56862 8.008171e-103
Closest_Avg_Temperature 2.763439e+02 1.344256e+01 20.55738 1.297099e-93
Count_Bears 9.407240e+01 5.426635e+00 17.33531 3.585202e-67
Beaufort_Sea -7.270132e-04 2.697236e-04 -2.69540 7.032181e-03
Code
dataset_regression <- arrange(dataset_regression, bearid)

Here we focus on the post-2006 period of our original data set, which is not only where most of the GPS markers are located but also the start of the ice cover data set. The multiple linear regression model analysed thus provides a quantitative overview of the factors influencing the distance traveled by bears, with each variable expressed in terms of its impact on distance in meters. Based on 193,502 observations, this model incorporates several environmental and geographical predictors. Among the variables, average temperature shows a significant impact of 30 meters per degree of temperature change. This indicates that an increase in average temperature is associated with an average increase in the distance traveled by bears. The number of bears observed (Count_Bears) shows a positive relationship with distance traveled, with an increase of 62 meters for each additional bear observed. This suggests that the presence of a greater number of bears is correlated with an increase in the distance traveled. The geographical variables, latitude and longitude also showed significant effects. For example, each additional degree of latitude is associated with an increase of 368 meters in distance traveled, while each additional degree of longitude corresponds to an increase of 8 meters. Nevertheless, in our model, despite its significance, ice cover (Beaufort_Sea) does not bring any change to our dependent variable. In spring, the distance traveled fell by 2,205 meters compared with autumn, while in summer it fell by 1,804 meters. In winter, the distance traveled decreases by 523 meters compared with autumn.

In a final model, we wanted to detach the potential multicollinearity of our variables as a function of season. To do this, we transformed our season variable into an indicator variable. Here are the results of the model:

Code
data_regression1<- na.omit(dataset_regression)
# Transformation de la variable saison en variables indicatrices
data_regression1 <- mutate(data_regression1, season = as.factor(season))
data_regression1 <- cbind(data_regression1, model.matrix(~season - 1, data = data_regression1))

# Ajustement du modèle de régression avec des interactions
modele3 <- lm(Distance ~ Count_Bears + latitude + longitude + seasonSpring:Closest_Avg_Temperature + seasonSummer:Closest_Avg_Temperature + seasonWinter:Closest_Avg_Temperature + seasonSpring:Beaufort_Sea + seasonSummer:Beaufort_Sea + seasonWinter:Beaufort_Sea, data = data_regression1)

# Résumé des résultats du modèle
sum_mod3<-summary(modele3)

# Conversion de la sortie de summary en data frame
coef_df3 <- as.data.frame(sum_mod3$coefficients)

# Utilisation de formattable pour afficher le résumé
formattable(coef_df3)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.642017e+04 3.254354e+03 11.1912125 4.780964e-29
Count_Bears 1.025473e+02 5.638745e+00 18.1861939 1.004242e-73
latitude -4.027032e+02 4.411302e+01 -9.1288967 7.108091e-20
longitude -2.347668e+00 1.683814e+00 -1.3942566 1.632448e-01
Closest_Avg_Temperature:seasonWinter 3.170391e+02 1.400669e+01 22.6348351 5.291311e-113
seasonWinter:Beaufort_Sea -2.277004e-04 2.737522e-04 -0.8317755 4.055387e-01

The results remain very similar to the two previous regressions. But to verify this multicollinearity aspect, we carry out a VIF test on our variables.

Code
#vif_values <- vif(modele3)

# Créer un data frame à partir des valeurs VIF
#vif_df <- data.frame(
#  Variable = names(vif_values),
#  VIF = unname(vif_values)
#)

#kable(vif_df, caption = "Variance Inflation Factors")
#kable(vif_df, "html") %>%
#  kable_styling(bootstrap_options = c("striped", "hover")) %>%
#  add_header_above(c(" " = 1, "Variance Inflation Factor" = 1))

The multicollinearity analysis for our regression model, based on the variance inflation factor (VIF), reveals varying levels of collinearity between predictors. The variables Bear_count, latitude, longitude and Temperature_Summer have VIFs below 5, suggesting a lack of significant multicollinearity. However, strong multicollinearity is observed for Temperature_Winter and Glace_cover_Winter, with VIFs of 16.55 and 17.08 respectively. Other terms, such as Temperature_Spring, Glace_cover_spring, and Glace_cover_Summer, show moderate levels of multicollinearity. These results indicate that although some interactions in the model do not pose significant collinearity problems, others, particularly those involving the winter season, could affect the reliability and interpretation of the coefficients and require particular attention when analyzing the results.

In conclusion, although our current model reveals significant relationships between various environmental and geographical factors and the distance traveled by bears, its low R2 suggests that there is room for improvement. To increase the explanatory power of the model, it could be beneficial to integrate additional data such as the availability of food resources in the Beaufort Sea, variations in vegetation density, or even interactions with other species, which could influence the bears’ movement patterns. In addition, a more detailed analysis of temporal data, considering intra-seasonal variations or specific weather conditions, for example, could provide more nuanced insights. Finally, exploring more complex modelling methods, such as non-linear models or mixed models, could also help to capture more subtle and complex relationships between the variables studied. It is crucial to understand that this model was designed primarily to evaluate its feasibility on our scale. We now need to collect more homogeneous and diversified data to improve the reliability and sustainability of our predictions. Expanding our database, incorporating more precise environmental parameters, and collaborating with experts in climatology and wildlife biology could improve the accuracy of our future estimates. By developing a more nuanced understanding of the migratory patterns and adaptive responses of polar bears to environmental change, we will be able to offer more informed insights into how these majestic animals navigate a rapidly changing world.

6 Conclusion

The comprehensive study on the impact of climate change on Arctic ecosystems, particularly focusing on polar bears, reveals significant findings and implications. Our investigation, driven by the urgent need to understand the effects of environmental shifts in the Arctic, combines extensive data analysis with predictive modeling to paint a detailed picture of the current and future state of polar bear populations and their habitats.

Key Findings

  • Declining Polar Bear Population: The data indicates a notable decline in polar bear numbers, particularly in regions like Alaska. This trend is alarming and points to the broader implications of shrinking sea ice and changing climatic conditions in the Arctic. The contrast with the ringed seals’ population trends, which show some resilience, underscores the unique challenges faced by polar bears.

  • Movement Patterns and Habitat Utilization: Our analysis of GPS data and environmental factors such as temperature and sea ice coverage reveals that polar bears’ movement patterns are intricately linked to these variables. The decreasing distances traveled by bears over the years suggest adaptations to the rapidly changing environment. Our predictive models, while indicating some level of accuracy, also highlight the complexity of these movement patterns.

  • Diet and Open Water Days: The relationship between the polar bears’ diet and the time they spend in open water is complex. Our statistical models suggest a connection, albeit not a straightforward one, between these factors. The impact of the melting season on bears’ dietary habits is evident but requires further exploration for conclusive insights.

Implications and Future Directions

The findings underscore the urgency for targeted conservation efforts aimed at preserving polar bear habitats, particularly in light of the diminishing sea ice cover in the Arctic. The insights gleaned from this study can inform policymakers and conservationists in formulating strategies that address the specific needs of polar bears, considering the broader context of climate change. Continued monitoring and research are vital to deepen our understanding of polar bears’ adaptation strategies to the changing environment. Future studies should aim to integrate more variables, such as human activities and their impact on bear habitats. Also, data on the location of bears in Alaska could be cross-referenced with changes in temperatures in different regions and the number of animals counted over time.

Concluding Remarks

In conclusion, this study highlights the profound impact of climate change on Arctic ecosystems, particularly on polar bears. The changing patterns in sea ice coverage and temperature regimes are not just altering the physical landscape but are also reshaping the life cycles and survival strategies of these iconic species. Our findings, while offering significant insights, also open the door for further research and action in the realm of wildlife conservation and climate change mitigation.