Polar Bear Habitat Dynamics in a Changing Arctic Environment

Author

Manon Gressard, Zoe Dardare, Ugo Rombach, Victorien Rodondi

Published

December 22, 2023

1 Executive summary

In the face of escalating climate change, the Arctic region is undergoing significant environmental upheaval, directly impacting the habitat of the polar bear, a species emblematic of this area. This project aims to explore the nuanced interaction between the habitat utilization of polar bears and the evolving conditions in the Arctic. Utilizing GPS data, we aim to map the current distribution of polar bears, establishing a baseline for understanding their habitat preferences and the ongoing research efforts dedicated to their study. Our analysis proceeds to examine the complex interplay between the dietary habits of polar bears and the amount of time they spend in open water. Preliminary results from statistical models suggest a significant impact of the melting season on the bears’ diet, indicating dietary shifts as sea ice recedes and they adapt to more aquatic hunting environments. Further, we delve into the dynamic relationship between sea ice coverage and polar bear movements over time. This aspect of our research aims to highlight the bears’ adaptation strategies and habitat preferences in response to sea ice fluctuations. Expanding our scope, we assess the broader effects of climate change on Arctic ecosystems, focusing on temperature fluctuations and ice melt patterns in key polar bear habitats, such as those in Alaska. This comprehensive examination seeks to elucidate the consequences of climate change on the availability and suitability of habitats for polar bears, providing a holistic view of the challenges they face.

2 Introduction

2.1 Overview and motivation

This project, set against the dramatic backdrop of a changing Arctic environment, focuses on understanding the impact of climate change on the habitat of the polar bear. Our study is designed to unravel the complex relationship between polar bear habitat utilization and the evolving conditions of the Arctic, integrating comprehensive data analysis with field observations for a complete understanding of these changes.

Project Goals

Our main goal is a detailed scientific analysis of data related to polar bear habitats. We plan to use advanced GPS data to map the current geographical distribution and movement patterns of polar bears, incorporating a variety of ecological and geographical parameters for an in-depth understanding of their habitat preferences. We also aim to assess the adaptation of polar bears to the changing Arctic landscape, particularly the reduction of sea ice, and to explore the broader implications of climate change on the Arctic ecosystem by focusing on key factors such as temperature variations and sea ice melting patterns.

Motivation

The motivation for this project stems from a deep concern for the integrity of the Arctic ecosystem and the survival of its key species, the polar bear. This is further fueled by the unique experience of a team member who spent time in the remote Arctic region of Churchill, Manitoba, volunteering at the Churchill Northern Study Centre and working closely with leading polar bear researchers. This first-hand experience provided invaluable insights into the urgent need for data-driven research to understand the impact of climate change on polar bear populations and their behavioral adaptations to the loss of sea ice.

Incorporating these field observations with our data analysis, we aim to create a comprehensive scientific narrative. We will examine evidence of behavioral shifts in polar bears, such as changes in foraging strategies and increased land use, and cross-reference these observations with GPS tracking data. By combining scientific rigor with real-world observations, our project seeks to provide a nuanced understanding of the challenges faced by polar bears due to climate change and contribute to the global dialogue on polar bear conservation and the development of effective strategies and policies for preserving this vital Arctic species and its ecosystem.

2.3 Research questions

At the outset, our ambitions were grand. Not only did we plan to analyze demographics and forecast the future of polar bear populations using R software and extensive data, but we also aspired to model the entire Arctic ecosystem’s impact on the global climate. However, we quickly encountered a significant challenge: The non-robustness of the available data. We soon realized the necessity to scale down and refocus. This realization prompted a refocusing of our research questions. We shifted our attention to more specific, feasible areas, concentrating on the analysis of ice conditions and GPS data. Our revised questions now include:

How do changes in sea ice coverage correlate with polar bear movement patterns as revealed by GPS tracking?
Can we identify any significant changes in their migratory routes or habitat preferences over the past decade in the context of ongoing climate change?
Is there a link between the diet of polar bears and the number of days they spend in open water?

3 Data

3.1 Data Source

Our project utilizes a variety of data sources, each critical for different aspects of our research.

Living Planet Index (LPI): Chosen for its rich demographic data, it enables us to weave a narrative about the life cycle and inter-species dynamics of polar bears and ringed seals. Despite some data irregularities, its depth offers a crucial ecological backdrop. Our primary data helps us to understand bear demography and the interrelationship between ‘Ursus maritimus’ (polar bears) and ‘Pusa hispida’ (ringed seals). Despite the difficulties associated with the consistency of the data, this data set is essential for the continuation of our research.

Data Description

Code

tibble(Variables = c("ID", "Binomial", "Replicate", "Citation", "Class", "Family", "Genus", "Species", "Authority", "Common_name", "Location", "Country","Region", "Latitude", "Longitude", "Specific_location", "X1950-X2020"),
       Meaning= c ("Unique identifier for each observation.", "The binomial name (genus and species) of the organism, in this case, Ursus maritimus (polar bear) ", "Indicates if the data is from a repeated observation", "Reference to the source from which the data is derived", "The biological class to which the organism belongs", "The family in the biological classification","The genus of the organism", "The species of the organism","The authority that named the species", "Common name (polar bear)", "Specific location of the study or observation", " The country where the observation was made","Geographical or political region","Latitude coordinate" , "Longitude coordinate" , "More detailed information on the location", "Counts for each year from 1950 to 2020" ))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Variables	Meaning
ID	Unique identifier for each observation.
Binomial	The binomial name (genus and species) of the organism, in this case, Ursus maritimus (polar bear)
Replicate	Indicates if the data is from a repeated observation
Citation	Reference to the source from which the data is derived
Class	The biological class to which the organism belongs
Family	The family in the biological classification
Genus	The genus of the organism
Species	The species of the organism
Authority	The authority that named the species
Common_name	Common name (polar bear)
Location	Specific location of the study or observation
Country	The country where the observation was made
Region	Geographical or political region
Latitude	Latitude coordinate
Longitude	Longitude coordinate
Specific_location	More detailed information on the location
X1950-X2020	Counts for each year from 1950 to 2020

US Geological Survey (USGS) Dataset*: Its detailed locational information on polar bears serves as the backbone of our spatial analysis, allowing us to trace the subtle nuances in their movement patterns amidst changing environmental conditions.

Data Description

Code

tibble(Variables= c("BearID_rsf", "DateTimeUTC_rsf", "latitude_rsf", "longitude_rsf", "season", "Many variables finally not used" ),
       Meaning = c("Unique ID per observation", " Date and time of the observation in Coordinated Universal Time (UTC)", "Latitude coordinate of the bear's location at the time of the observation", "Longitude coordinate of the bear's location at the time of the observation", "Time of year when the observation was made", "..."))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Variables	Meaning
BearID_rsf	Unique ID per observation
DateTimeUTC_rsf	Date and time of the observation in Coordinated Universal Time (UTC)
latitude_rsf	Latitude coordinate of the bear's location at the time of the observation
longitude_rsf	Longitude coordinate of the bear's location at the time of the observation
season	Time of year when the observation was made
Many variables finally not used	...

PELLY ISLAND Weather Station Data: This data set is our window into the climatic heartbeat of the Arctic. Over three decades of temperature data provide us with a canvas to paint the picture of climatic shifts impacting the Arctic ecosystem.

Data Description

Code

tibble(Variables= c("Date", "TAVG..Degrees.Fahrenheit.", "TMAX..Degrees.Fahrenheit.", "TMIN..Degrees.Fahrenheit.", "Many others variables finally not used" ),
       Meaning = c("Date", "Average temperature in degrees Fahrenheit", "Maximum temperature in degrees Fahrenheit", "Minimum temeperature in degrees Fahrenheit", "..."))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Variables	Meaning
Date	Date
TAVG..Degrees.Fahrenheit.	Average temperature in degrees Fahrenheit
TMAX..Degrees.Fahrenheit.	Maximum temperature in degrees Fahrenheit
TMIN..Degrees.Fahrenheit.	Minimum temeperature in degrees Fahrenheit
Many others variables finally not used	...

MASIE-NH Sea Ice Coverage Data: With its graphical ice extent representation, this data set helps us correlate the physical changes in the Arctic landscape with the behavioral adaptations of polar bears.

Data Description

Code

tibble(Variables = c("yyyyddd", "Beaufort_Sea" , "Many others variables finally not used" ),
       Meaning= c ("Date of the observation", "Daily 4km sea ice component in the region of Beaufort Sea", "..."))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Variables	Meaning
yyyyddd	Date of the observation
Beaufort_Sea	Daily 4km sea ice component in the region of Beaufort Sea
Many others variables finally not used	...

Aerial Master Data: It offers a temporal journey through polar bear habitats in Alaska from 1979 to 2011 and from 2015 to 2017, essential for understanding the historical shifts in their territorial patterns.

Data Description

Code

tibble(Variables = c("ID", "GMT_Minus8_DateTime", "ArcLat", "Arclong", "Many others variables finally not used"), 
       Meaning = c("Unique ID per observation", "Date and time of the observation", "Latitude", "Longitude", "...")) %>%
    kable("html") %>%  
    kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Variables	Meaning
ID	Unique ID per observation
GMT_Minus8_DateTime	Date and time of the observation
ArcLat	Latitude
Arclong	Longitude
Many others variables finally not used	...

USGS Diet Data: This data set adds a crucial dimension by linking dietary patterns of polar bears to environmental changes, offering insights into the survival strategies of these majestic creatures in a rapidly changing world.

Data Description

Code

tibble(Variables = c("BearID","Year","Capture.date",
                     "Ageclass", "Sex", "OW_50pcrt","Meltseason","Bearded_seal",
                     "Ringed_seal","Beluga_whale","Bowhead_whale","Seabird_nestling"), 
       Meaning = c("Unique observation per bear","Year of the capture",
                   "Date of the capture", "The stage of maturity of the polar bear",
                   "The sex of the polar bear", "Number of days covered in open water",
                   "Number of days of the meltseason in the corresponding year",
                   "Percentage of bearded seal eaten in the corresponding period",
                   "Percentage of ringed seal eaten in the corresponding period",
                   "Percentage of beluga whale eaten in the corresponding period",
                   "Percentage of bowhead sealed eaten in the corresponding period",
                   "Percentage of seabird nestling eaten in the corresponding period"))%>%
  kable()%>%
          kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

Variables	Meaning
BearID	Unique observation per bear
Year	Year of the capture
Capture.date	Date of the capture
Ageclass	The stage of maturity of the polar bear
Sex	The sex of the polar bear
OW_50pcrt	Number of days covered in open water
Meltseason	Number of days of the meltseason in the corresponding year
Bearded_seal	Percentage of bearded seal eaten in the corresponding period
Ringed_seal	Percentage of ringed seal eaten in the corresponding period
Beluga_whale	Percentage of beluga whale eaten in the corresponding period
Bowhead_whale	Percentage of bowhead sealed eaten in the corresponding period
Seabird_nestling	Percentage of seabird nestling eaten in the corresponding period

3.2 Data Scraping and Cleanup

Our data collection faced challenges, notably in accessing polar bear data due to its high research cost in this field, researchers often face restrictions on sharing data due to funding sources or proprietary concerns. Despite these obstacles, we managed to source diverse datasets, which then required meticulous cleaning. This process involved standardizing data formats, correcting inconsistencies, and merging datasets while preserving data integrity. Our focus was on maintaining accuracy, crucial for the reliability and validity of our analysis. This careful preparation ensured that our findings were based on the best available data, despite the difficulties in obtaining it.

3.2.1 Living Planet Database

We start by defining the year columns and convert them to numeric, then we calculate the annual sum for each species. We then calculate a 5-year cumulative sum for each species, and prepare the data for plotting. We also calculate the moving average for the plot data. Then we convert all columns with names starting with ‘X’ followed by 4 digits to numeric, and replace non-numeric values in these columns with NA. We calculate the sum of values in columns starting with ‘X’, select only relevant columns for mapping, and plot the global distribution of polar bears.

Code

year_columns <- paste0("X", 1970:2020)
ursus_data <- apply(data1[data1$Binomial == "Ursus maritimus", year_columns], 2, as.numeric)
pusa_data <- apply(data2[data2$Binomial == "Pusa hispida", year_columns], 2, as.numeric)

ursus_annual_sum <- colSums(ursus_data, na.rm = TRUE)
pusa_annual_sum <- colSums(pusa_data, na.rm = TRUE)

# Function to calculate the cumulative sum over 5-year periods
calculate_5yr_cumulative_sum <- function(annual_sum) {
  cum_sum_5yr <- numeric(length(annual_sum))
  for (i in 1:length(annual_sum)) {
    start_index <- max(1, i - 4)
    cum_sum_5yr[i] <- sum(annual_sum[start_index:i])
  }
  return(cum_sum_5yr)
}

ursus_cum_sum_5yr <- calculate_5yr_cumulative_sum(ursus_annual_sum)
pusa_cum_sum_5yr <- calculate_5yr_cumulative_sum(pusa_annual_sum)

years <- 1970:2020
ursus_plot_data <- data.frame(Year = years, Cumulative_Sum_5yr = ursus_cum_sum_5yr)
pusa_plot_data <- data.frame(Year = years, Cumulative_Sum_5yr = pusa_cum_sum_5yr)

ursus_plot_data$Moving_Avg <- rollmean(ursus_plot_data$Cumulative_Sum_5yr, 10, fill = NA, align = "right")
pusa_plot_data$Moving_Avg <- rollmean(pusa_plot_data$Cumulative_Sum_5yr, 10, fill = NA, align = "right")

polar_bears<- data1
x_cols <- grep("^X\\d{4}$", names(polar_bears))
polar_bears[, x_cols] <- lapply(polar_bears[, x_cols], function(x) {
  suppressWarnings(as.numeric(as.character(x)))
})
polar_bears[, x_cols][!sapply(polar_bears[, x_cols], is.numeric)] <- NA
polar_bears$Totalpolarbears <- rowSums(polar_bears[, x_cols], na.rm = TRUE)

polar_bears[, x_cols][!sapply(polar_bears[, x_cols], is.numeric)] <- NA

polar_bears$Totalpolarbears <- rowSums(polar_bears[, x_cols], na.rm = TRUE)

polar_bears_map <- polar_bears[, c("Latitude", "Longitude", "Totalpolarbears")]

min_lat <- 60
max_lat <- 90
min_lon <- -180
max_lon <- 180

polar_bears_filtered <- filter(polar_bears, Latitude >= min_lat & Latitude <= max_lat & 
                                 Longitude >= min_lon & Longitude <= max_lon)

polar_bears_filtered2 <- polar_bears_filtered %>%
  mutate(MarkerColor = ifelse(Totalpolarbears >= 10000, "blue", "red"))

scale_factor <- 0.005

3.2.2 Aerial Master Database

Here’s how we’re going to proceed for our Aerial Master (AM) table. We start to calculate the number of missing values for each column in the AM data set, then we identify the columns with more than 200,000 missing values, and decide to remove them from the data set. For better clarity, we rename the columns in the data set. Then we calculate the proportion of missing values for each column in the AM data set, and create a data frame with column names and proportions of missing values, by sorting the data frame by the proportion of missing values in descending order.

Code

na_count <- colSums(is.na(AM))

na_cols <- which(na_count > 200000)

AM <- AM[, -na_cols]
AM <- as_tibble(AM)

AM <- AM %>% rename(date_time_hour = GMT_Minus8_DateTime)
AM <- AM %>% rename(Longitude = ArcLong)
AM <- AM %>% rename(Latitude = ArcLat)

part_nulles <- colMeans(is.na(AM))

df <- data.frame(colonne = names(AM), part_nulles = part_nulles)

df <- df %>%
  mutate(colonne = factor(colonne, levels = names(AM)[order(part_nulles)]))

We create a bar chart to visualize the proportion of missing values for each column:

Code

gg1 <- ggplot(df, aes(x = colonne, y = part_nulles, text = paste("Column:", colonne, "<br>:Share of zero values:", scales::percent(round(part_nulles, 2))))) +
  geom_bar(stat = "identity", fill = "steelblue", width = 0.8)  +
  labs(title = "As we can see, many columns have quite a lot of missing data ",
       x = "Columns",
       y = "Share of zero values per column") +
  scale_y_continuous(labels = percent_format()) +
  coord_flip() +
  geom_text(aes(label = scales::percent(round(part_nulles, 2))), hjust = -0.1, vjust = 0.5, size = 1.8) +
  theme(axis.text.y = element_text(size = 4))
ggplotly(gg1, tooltip = "text")

Due to the significant amount of missing data, only location data is retained. Other columns are binary variables representing parameters of little interest or with too many missing values. We then calculate the proportion of missing values for each column in the final data set, and we remove rows with missing values from it. Then we separate the “Datetime” column into individual components (month, day, year, hour, minute, second, period), and select relevant columns for further analysis.

Code

AM_fin <- select(AM, ID, date_time_hour, Latitude, Longitude)

part_nulles2 <- colMeans(is.na(AM_fin))

AM_fin <- drop_na(AM_fin)

names(AM_fin)[names(AM_fin) == "date_time_hour"] <- "Datetime"

AM_fin <- AM_fin |> separate(Datetime, into = 
                               c("month","day","year", "hour", "minute", "second", "period"))

AM_fin <- select(AM_fin, ID, day, month, year, Latitude, Longitude)

AM_fin <- drop_na(AM_fin)

AM_fin$day <- as.integer(AM_fin$day)
AM_fin$month <- as.integer(AM_fin$month)
AM_fin$year <- as.integer(AM_fin$year)

We create a bar chart to visualize the number of observations per year:

Code

fig_79_11 <- plot_ly(x = AM_fin$year, histfunc = 'sum', type = "histogram")
fig_79_11 <- fig_79_11 %>% layout(
    yaxis = list(type = 'linear'), 
    title = "Many observations in 2009-2011",
    bargap = 0.2  
)

fig_79_11

Let’s do the same operations for the second data set “Aerial Master 2015-2017”. We repeat the same operations than on the first one to clean it up.

Code

na_count2 <- colSums(is.na(AM2))

na_cols2 <- which(na_count2 > 200000)

AM2 <- AM2 %>% rename(date_time_hour = GMT_Minus8_DateTime)
AM2 <- AM2 %>% rename(Longitude = ArcLong)
AM2 <- AM2 %>% rename(Latitude = ArcLat)

AM2 <- AM2[, -na_cols2]
AM2 <- as_tibble(AM2)

part_nulles2 <- colMeans(is.na(AM2))

AM_fin2 <- select(AM2, ID, date_time_hour, Latitude, Longitude)

part_nulles2 <- colMeans(is.na(AM_fin2))

AM_fin2 <- drop_na(AM_fin2)

names(AM_fin2)[names(AM_fin2) == "date_time_hour"] <- "Datetime"

AM_fin2 <- AM_fin2 |> separate(Datetime, into = 
                                 c("month","day","year", "hour", "minute", "second", "period"))

AM_fin2 <- select(AM_fin2, ID, day, month, year, Latitude, Longitude)

AM_fin2 <- drop_na(AM_fin2)

AM_fin2$day <- as.integer(AM_fin2$day)
AM_fin2$month <- as.integer(AM_fin2$month)
AM_fin2$year <- as.integer(AM_fin2$year)

Again, we create a bar chart to visualize the number of observations per year.

Code

AM_fin2_ <- AM_fin2 %>%
  group_by(year) %>%
  summarise(count = n()) %>%
  ungroup()

AM_fin2_$tooltip <- paste("Year:", AM_fin2_$year, "<br>Number of Observations:", AM_fin2_$count)

p <- ggplot(data = AM_fin2_, aes(x = year, y = count)) + 
  geom_bar(stat = "identity", aes(text = tooltip), fill = "steelblue") +  
  labs(title = "Number of observations per year", x = "Year", y = "") +
  theme_minimal()

ggplotly(p, tooltip = "text")

We combine the filtered data from the two final data sets into a single data frame.

Code

df_merged <- rbind(AM_fin, AM_fin2)
df_merged <- df_merged[df_merged$Latitude != 0 & df_merged$Longitude != 0,]

icon <- makeIcon(iconUrl = "Report/photo/polarBear.png",
                 iconWidth = 35, iconHeight = 35)

For the third data set about the diets of polar bears in Beaufort, there is fast no data cleaning.

Code

na_count_pb <- colSums(is.na(pb))
kable(na_count_pb) %>% 
  kable_styling(bootstrap_options = c("striped", "hover", "condensed", "responsive"))

	x
BearID	0
Year	0
Capture date	0
Ageclass	0
Sex	0
OW_50prct	0
OW_15prct	0
Meltseason	0
Bearded_seal	0
Ringed_seal	0
Beluga_whale	0
Bowhead_whale	0
Seabird_nestling	0

Since there are no missing values, the data set is considered “perfectly clean”. We only rename specific columns for better clarity.

Code

pb <- pb %>% rename(days_in_cov_water = OW_50prct)
pb <- pb %>% rename(Perc_bearded_eaten = Bearded_seal)
pb <- pb %>% rename(Perc_ringed_eaten = Ringed_seal)
pb <- pb %>% rename(Perc_beluga_eaten = Beluga_whale)
pb <- pb %>% rename(Perc_bowhead_eaten = Bowhead_whale)
pb <- pb %>% rename(Perc_seabird_eaten = Seabird_nestling)

3.2.3 Polar Bear Distribution Database

We began by extracting the essential information, such as each bear’s identifier, date and geographical coordinates. This data was then converted into standardised formats to ensure the consistency and accuracy of the analysis. A crucial aspect of our processing was data cleaning, where we eliminated incomplete records to ensure the reliability of our study. After organising the data by bear identifier and date, we calculated the distance traveled between successive bear points using Haversine’s formula. As a result, the first GPS broadcast for each bear does not have a value in the distance column calculated in this way. This method provides us with valuable insights into the movement patterns of bears. This approach created a coherent and detailed data set, essential for our behavioural analysis of bear movements in their natural habitat.

Code

ours_nettoyé <- ours_de_base %>%
  select(bearid = BearID_rsf, date = DateTimeUTC_rsf, latitude = latitude_rsf, longitude = longitude_rsf)

ours_nettoyé$date <- as.POSIXct(ours_nettoyé$date, format = "%m/%d/%Y %H:%M", tz = "UTC")

ours_clean <- ours_nettoyé %>%
  filter(!is.na(bearid) & !is.na(date) & !is.na(latitude) & !is.na(longitude))

ours_clean$date <- as.Date(ours_clean$date)

ours_clean <- ours_clean %>%
  arrange(bearid, date)

ours_clean <- ours_clean %>%
  group_by(bearid) %>%
  mutate(
    Longitude_prev = lag(longitude),
    Latitude_prev = lag(latitude)
  ) %>%
  rowwise() %>%
  mutate(
    Distance = ifelse(
      is.na(Longitude_prev) | is.na(Latitude_prev), 
      NA, 
      distHaversine(
        c(longitude, latitude), 
        c(Longitude_prev, Latitude_prev)
      )
    )
  ) %>%
  ungroup()

Code

nb_obs <- read_csv(here::here("Report/données/ours_clean.csv"))
# Convertir la date en format Date et extraire l'année
nb_obs$date <- as.Date(nb_obs$date)
nb_obs$year <- format(nb_obs$date, "%Y")

# Calculer le nombre d'observations par année
observations_par_an <- nb_obs %>%
  group_by(year) %>%
  summarise(NumberPointGPS = n())

# Créer un graphique
graph_nb_obs <- ggplot(observations_par_an, aes(x = year, y = NumberPointGPS, fill = NumberPointGPS)) +
  geom_bar(stat = "identity") +
  theme_minimal() +
  scale_fill_gradient(low = "blue", high = "red") +
  labs(title = "Number of polar bear GPS relvets by year",
       x = "Year",
       y = "Number of GPS points") +
  theme(axis.text.x = element_text(angle = 90, hjust = 1)) # Rotation des étiquettes

# Convertir en graphique interactif
graph_nb_obs_interactif <- ggplotly(graph_nb_obs, tooltip = c("x", "y"))

# Afficher le graphique
graph_nb_obs_interactif

3.2.4 Temperatures Data Base

In order to clean this data set, we converted the temperatures from Fahrenheit to Celsius, an international standard that makes them easier to understand and compare. The next step was to merge the maximum and minimum temperature data based on date to get a consistent overview. Finally, we calculated the daily average temperature, which provides a more balanced and meaningful perspective of weather conditions. To simplify and clarify our results, we chose to keep only the mean temperature data, eliminating the maximum and minimum temperature columns. This process provided an accurate and easily interpretable overview of climate trends on Pelly Island.

Code

convert_to_celsius <- function(fahrenheit) {
  (fahrenheit - 32) * 5 / 9
}

max_clean <- max_temp %>%
  select(Date, TMAX = `TMAX..Degrees.Fahrenheit.`) %>%
  mutate(Date = as.Date(Date), TMAX = convert_to_celsius(TMAX))

min_clean <- min_temp %>%
  select(Date, TMIN = `TMIN..Degrees.Fahrenheit.`) %>%
  mutate(Date = as.Date(Date), TMIN = convert_to_celsius(TMIN))

df_merged2 <- inner_join(max_clean, min_clean, by = "Date")

temp_beaufort <- df_merged2 %>%
  mutate(Average_Temperature = (TMAX + TMIN) / 2)

temp_beaufort <- temp_beaufort %>%
  select(-TMAX, -TMIN)

3.2.5 Sea Ice Cover Data Base

To refine our analysis of the Beaufort Sea, we first filtered the data, discarding information relating to other maritime regions. Next, we converted the date format from ‘yyyyyddd’ to a standard format, facilitating temporal analyses. These data preparations are crucial to ensuring the accuracy and relevance of our future analyses.

Code

glace_beaufort$
glace_beaufort <- glace_beaufort %>%
  select(-Chukchi_Sea,                    -East_Siberian_Sea  ,             -Laptev_Sea      ,               
          -Kara_Sea    ,                    -Barents_Sea      ,               -Greenland_Sea  ,                
          -Baffin_Bay_Gulf_of_St._Lawrence ,-Canadian_Archipelago ,           -Hudson_Bay   ,                  
          -Central_Arctic       ,           -Bering_Sea  ,                   -Baltic_Sea ,                    
          -Sea_of_Okhotsk   ,               -Yellow_Sea  ,                   -Cook_Inlet, -Northern_Hemisphere)

glace_beaufort$yyyyddd <- as.character(glace_beaufort$yyyyddd)

# Convertir la colonne yyyyddd en dates
glace_beaufort$yyyyddd <- as.Date(as.numeric(substr(glace_beaufort$yyyyddd, 5, 7)) - 1, 
                               origin = as.Date(paste0(substr(glace_beaufort$yyyyddd, 1, 4), "-01-01")))

glace_beaufort$yyyyddd <- ymd(glace_beaufort$yyyyddd)

3.2.6 Combined Data Sets

From this data set, we assigned the nearest mean temperature date to each polar bear GPS point. In order to do this, we scanned each polar bear observation, calculated the difference in days between that observation and the dates of the temperature readings, and selected the temperature corresponding to the closest date. This process made it possible to precisely link the climatic conditions to the specific positions of the bears.

Code

ours_clean$Closest_Avg_Temperature <- numeric(nrow(ours_clean))

for (i in 1:nrow(ours_clean)) {
    # Calculate the difference in days
  differences <- abs(difftime(temp_beaufort$Date, ours_clean$date[i], units = "days"))
  
    # Find the minimum difference index
  closest_idx <- which.min(differences)
  
    # Assign the nearest average temperature
  ours_clean$Closest_Avg_Temperature[i] <- temp_beaufort$Average_Temperature[closest_idx]
}


ours_clean <- arrange(ours_clean, bearid)

ours_clean <- ours_clean %>%
  arrange(bearid, date)

bear_periods <- ours_clean %>%
  group_by(bearid) %>%
  summarise(start_date = min(date), end_date = max(date))

count_active_bears <- function(date, periods) {
  sum(periods$start_date <= date & periods$end_date >= date)
}

unique_dates <- unique(ours_clean$date)
count_bears_per_date <- sapply(unique_dates, function(date) count_active_bears(date, bear_periods))

count_bears_df <- data.frame(date = unique_dates, Count_Bears = count_bears_per_date)

ours_clean <- merge(ours_clean, count_bears_df, by = "date")

3.2.7 Data Set Regression for Travel Forecasts

We use a methodical approach to develop and evaluate predictive models of latitude and longitude. We began by dividing our selected data set into two distinct sets: one for training and one for testing, ensuring reproducibility through a fixed seed parameter. This division ensured that models would be trained on a subset of the data and tested on a separate set, not used during training, for an unbiased assessment of their performance.
We then trained two separate models using the random forest algorithm, one to predict latitude and the other for longitude, excluding longitude and latitude respectively from the predictor variables to avoid redundancy. After training, we used the models to predict latitude and longitude values on the test set.
The performance of the models was evaluated by calculating the root mean square error (RMSE) for the latitude and longitude predictions. The results showed an RMSE of 1.2227 for latitude and 13.4141 for longitude. By quantifying the mean difference between predicted and actual values, these RMSE measurements revealed that the latitude model was relatively accurate, while the longitude model had a greater margin of error.
We extend our study to include predictions over the next five years. Using the latest date in our data set as a starting point, we have generated a sequence of future dates. To these dates, we used a function to associate the relevant mean temperature and ice extent data for the Beaufort Sea, taking care to fill in any gaps with the median of the available temperatures, to maintain data consistency.
We then enriched the future data by assigning bear identifiers (BearIDs) to each GPS point, simulating the tracking of 20 different bears. This step allowed us to contextualize our predictions within a more realistic framework, reflecting the potential diversity of bear movements in the future.
Using the random forest models, we had previously trained, we predicted the geographical positions (latitude and longitude) for each combination of BearID and future date. These predictions, based on anticipated environmental conditions, were compiled into a new dataset, providing a prospective view of possible polar bear movements in the face of future climate and environmental change. This work represents a significant advance in our understanding of polar bear dynamics and their potential adaptation to changes in their habitat.

Code

data$date <- as.Date(data$date)
data$year <- year(data$date)
data$month <- month(data$date)
data$day <- day(data$date)

data_selected <- data %>%
  select(latitude, longitude, year, month, day, Closest_Avg_Temperature, Beaufort_Sea, date)

data_selected <- na.omit(data_selected)


# Division des données en ensemble d'entraînement et de test
set.seed(123) # Pour la reproductibilité
indices <- sample(1:nrow(data_selected), size = 0.8 * nrow(data_selected))
train_data <- data_selected[indices, ]
test_data <- data_selected[-indices, ]

# Entraînement du modèle pour la latitude
model_latitude <- randomForest(latitude ~ ., data = train_data[, -2]) 

# Entraînement du modèle pour la longitude
model_longitude <- randomForest(longitude ~ ., data = train_data[, -1]) 

predictions_latitude <- predict(model_latitude, test_data)
predictions_longitude <- predict(model_longitude, test_data)

rmse_latitude <- sqrt(mean((predictions_latitude - test_data$latitude)^2))
rmse_longitude <- sqrt(mean((predictions_longitude - test_data$longitude)^2))

print(paste("RMSE Latitude:", rmse_latitude))

[1] "RMSE Latitude: 1.2227392145361"

Code

print(paste("RMSE Longitude:", rmse_longitude))

[1] "RMSE Longitude: 13.4141324335172"

Code

last_date <- max(data_selected$date)
future_dates <- seq(from = last_date + 1, by = "day", length.out = 5 * 365)

future_data <- data.frame(
  date = future_dates,
  Closest_Avg_Temperature = temp_beaufort$Average_Temperature[match(future_dates, temp_beaufort$date)],
  Beaufort_Sea = glace_beaufort$Beaufort_Sea[match(future_dates, glace_beaufort$date)],
  year = year(future_dates),
  month = month(future_dates),
  day = day(future_dates)
)

nombre_ours <- 20
bear_id_start <- 301
total_days <- length(future_dates)

future_data$BearID <- rep(bear_id_start:(bear_id_start + nombre_ours - 1), 
                          times = ceiling(total_days / nombre_ours))[1:total_days]


predicted_latitude <- predict(model_latitude, future_data)
predicted_longitude <- predict(model_longitude, future_data)

predictions <- data.frame(BearID = future_data$BearID, date = future_dates, latitude = predicted_latitude, longitude = predicted_longitude)

predictions <- na.omit(predictions)

4 Exploratory Data Analysis

4.1 Locations of research centers

Focus on Alaska

In light of recent data, there is growing concern over the declining population of polar bears. This trend warrants a closer examination of the origins of our data. To gain a clearer understanding of these patterns, we will delve into the geographic source of this information. Such an approach is crucial, as it allows us to pinpoint specific regions contributing most significantly to the data set.

On the following map, we identify the locations from which the majority of our polar bear sightings are reported. Given the essential nature of these data for understanding the state of polar bear populations, identifying the main source regions is a step that helped us greatly in subsequently guaranteeing the robustness of our results.

Code

leaflet_map <- leaflet(polar_bears_filtered2) %>%
  addTiles() %>%  # Add default OpenStreetMap map tiles
  addCircleMarkers(
    lng = ~Longitude, lat = ~Latitude, 
    popup = ~paste("Country: ", Country, "<br>",
                   "Total Observations: ", Totalpolarbears),
    color = ~MarkerColor, 
    radius = ~Totalpolarbears * scale_factor, 
    clusterOptions = markerClusterOptions(),
    fillOpacity = 0.5
  ) %>%
  addMiniMap() %>% 
  setView(lng = mean(c(min_lon, max_lon)), lat = mean(c(min_lat, max_lat)), zoom = 4) %>%
  addProviderTiles(providers$Esri.WorldImagery) %>% 
  addScaleBar() %>% 
  addLayersControl(
    baseGroups = c("OpenStreetMap", "Satellite"),
    options = layersControlOptions(collapsed = FALSE)
  )

leaflet_map

In our data science analysis of Arctic wildlife populations, we discovered significant disparities in data collection, a key point highlighted through our visualizations. From this map, we identify the distribution of research centers and their observation counts, revealing a concentration of research efforts in North America, raising questions about the global validity of our analyses. In order to address our research questions, we decided to focus on Alaska, as it is the region that provides the best amount of data.

4.2 5-Year cumulative count trend for ringed seals and polar bears

We used cumulative averages in our graphs to mitigate annual variability, revealing long-term trends in bear and seal populations over 50 years. These trends indicate an early decline in bear populations compared to seals, suggesting that their populations are not directly correlated and that external factors like human impact and environmental changes play a significant role.

Code

p1 <- ggplot(pusa_plot_data, aes(x = Year, y = Cumulative_Sum_5yr)) +
  geom_line(color = "green") +
  geom_line(aes(y = Moving_Avg), color = "red", linetype = "dashed") +
  ggtitle("5-Year Cumulative Count Trend for Ringed Seals (Pusa hispida)") +
  xlab("Years") +
  ylab("Cumulative Count Trend")

p2 <- ggplot(ursus_plot_data, aes(x = Year, y = Cumulative_Sum_5yr)) +
  geom_line(color = "blue") +
  geom_line(aes(y = Moving_Avg), color = "red", linetype = "dashed") +
  ggtitle("5-Year Cumulative Count Trend for Polar Bears (Ursus maritimus)") +
  xlab("Years") +
  ylab("Cumulative Count Trend")

grid.arrange(p1, p2, ncol = 2)

These two graphs show cumulative trends over 5 years for two distinct species: the ringed seal (Pusa hispida) and the polar bear (Ursus maritimus), respectively. The first trend, represented by a green line, shows the cumulative sum over 5 years of seal counts. This shows how the total number of seals and bears observed has evolved over a rolling 5-year period. The second trend, a dotted red line, represents a moving average. This average smoothes out annual variations to give a better idea of the general trend over the long term. It is important to note that our data set no longer contains data from 2010 onwards, which explains the decline.

In this initial phase, we have gathered the essential data to analyze the movement of polar bears and the impacts of climate change.

4.3 Polar bear location

Here you can see data on the location of polar bears in 1979. The interest is not in the number of data, as we have seen from the graph of the number of sightings that these data are inconsistent and tend to increase as the years go by. The interest lies rather in the general location of all these data. So here we see that polar bears tended to live in northern Alaska in 1979.

Year 1979:

Code

# Cards for analysis
# Filtrer les données pour l'année 1979

df_1979 <- df_merged %>% filter(year == 1979)

# Créer une carte
map_1979 <- leaflet()

# Ajouter une basemap OpenStreetMap
map_1979 <- map_1979 %>% addTiles(
  urlTemplate = "https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png",
  attribution = "Map data &copy; <a href='https://www.openstreetmap.org/'>OpenStreetMap</a> contributors, <a href='https://creativecommons.org/licenses/by-sa/2.0/'>CC-BY-SA</a>"
)
# Ajouter les points de données
map_1979 <- leaflet(quakes) %>% addTiles() %>%
  addMarkers(data = df_1979,lat = df_1979$Latitude, lng = df_1979$Longitude,
             clusterOptions = markerClusterOptions(),
             icon = icon)

# Afficher la carte
map_1979

The locations of polar bears in 1990 are shown here. Compared with 1979, we can observe a westward shift in the average location of the bears. However, they still remain in northern Alaska.

Year 1990:

Code

# Filtrer les données pour l'année 1990
df_1990 <- df_merged %>% filter(year == 1990)

# Créer une carte
map_1990 <- leaflet()

# Ajouter une basemap OpenStreetMap
map_1990 <- map_1990 %>% addTiles(
  urlTemplate = "https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png",
  attribution = "Map data &copy; <a href='https://www.openstreetmap.org/'>OpenStreetMap</a> contributors, <a href='https://creativecommons.org/licenses/by-sa/2.0/'>CC-BY-SA</a>"
)
# Ajouter les points de données
map_1990 <- leaflet(quakes) %>% addTiles() %>%
  addMarkers(data = df_1990,lat = df_1990$Latitude, lng = df_1990$Longitude,
             clusterOptions = markerClusterOptions(),
             icon = icon)

# Afficher la carte
map_1990

Data for 2015 are shown here. Compared with 1990, there has been a clear westward shift in location data. Some bears remain in the north, but it is interesting to note that some bears also live in the far west of Alaska, but no longer in the north.

Year 2015:

Code

# Filtrer les données pour l'année 2015
df_2015 <- df_merged %>% filter(year == 2015)

# Créer une carte
map_2015 <- leaflet()

# Ajouter une basemap OpenStreetMap
map_2015 <- map_2015 %>% addTiles(
  urlTemplate = "https://{s}.tile.openstreetmap.org/{z}/{x}/{y}.png",
  attribution = "Map data &copy; <a href='https://www.openstreetmap.org/'>OpenStreetMap</a> contributors, <a href='https://creativecommons.org/licenses/by-sa/2.0/'>CC-BY-SA</a>"
)
# Ajouter les points de données
map_2015 <- leaflet(quakes) %>% addTiles() %>%
  addMarkers(data = df_2015,lat = df_2015$Latitude, lng = df_2015$Longitude,
             clusterOptions = markerClusterOptions(),
             icon = icon)

# Afficher la carte
map_2015

4.4 Average weekly distance traveled per year

To represent the movements of polar bears within our data set, we needed to generate a trend for the average distance these mammals travel each day. We calculated the total distance traveled by each bear per day and per year. Next, we determined the daily average of this distance for each bear per year. Finally, we calculated the overall average of these daily averages for all the bears, year by year, in order to identify a general trend in the distances traveled by polar bears over time. These results are shown in the following interactive graph, and you can select a specific year by double-clicking on it. As explained in the graph (graph number on the number of GPS points per year) we have a wide disparity in our data depending on the year. Despite this, a trend in the daily distance traveled by polar bears can be seen in the graph. There has been a downward trend in this distance over the last 15 years.

Code

plot_distance <- ours_clean

# Convertir la colonne date en type date et extraire l'année
plot_distance$date <- as.Date(plot_distance$date)
plot_distance$year <- year(plot_distance$date)

# Calculer la distance totale parcourue par jour pour chaque ours dans chaque année
distance_journaliere_par_annee <- plot_distance %>%
  group_by(bearid, date, year) %>%
  summarize(distance_totale = sum(Distance, na.rm = TRUE))

# Calculer la moyenne quotidienne pour chaque ours par année
moyenne_quotidienne_par_ours_par_annee <- distance_journaliere_par_annee %>%
  group_by(bearid, year) %>%
  summarize(
    jours_uniques = n(),
    distance_annuelle = sum(distance_totale, na.rm = TRUE),
    moyenne_quotidienne = distance_annuelle / jours_uniques
  )

# Calculer la moyenne globale quotidienne pour chaque année sur tous les ours
moyenne_globale_quotidienne_par_annee <- moyenne_quotidienne_par_ours_par_annee %>%
  group_by(year) %>%
  summarize(moyenne_globale = mean(moyenne_quotidienne, na.rm = TRUE))

# Créer un histogramme avec ggplot
graph_distance <- ggplot(moyenne_globale_quotidienne_par_annee, aes(x = year, y = moyenne_globale, fill = as.factor(year), text = moyenne_globale)) +
  geom_bar(stat = "identity") +
  scale_fill_viridis_d() +
  theme_minimal() +
  labs(fill = "Year", 
       x = "Year", 
       y = "Average weekly distance in metres", 
       title = "Average weekly distance traveled per year")

# Convertir le graphique ggplot en un graphique plotly interactif
# 'text' spécifie les informations à afficher dans l'info-bulle
p_plotly_distance_ours <- ggplotly(graph_distance, tooltip = "text")

# Afficher le graphique
p_plotly_distance_ours

4.5 GPS tracks of polar bear

Code

create_map_for_bear <- function(bearid, data) {
  filtered_data <- data[data$bearid == bearid, ]
  
  m <- leaflet(filtered_data) %>%
    addTiles() %>%
    addPolylines(~longitude, ~latitude, group = ~bearid, weight=2, color="#3388ff", opacity=0.5) %>% # Trace plus fine et en transparence
    addCircles(lng = ~longitude, lat = ~latitude, weight = 1, radius = 100, # Points GPS plus gros
               popup = ~paste("Year:", format(as.Date(date), "%Y"), "<br/>",
                              "Distance traveled between last point:", round(Distance, 2), "m"),
               fillOpacity = 0.8) %>%
    addScaleBar(options = list(position = "bottomleft", maxWidth = 100, metric = TRUE, imperial = FALSE))
  
  saveWidget(m, file=paste0("bear_", bearid, ".html"), selfcontained = TRUE)
}

unique_bears <- unique(ours_data$bearid)
for(bear in unique_bears) {
  create_map_for_bear(bear, ours_data)
}

Select a Bear to display its Track, click on a GPS point to display its characteristics

4.6 Temeprature evolution

We selected temperature readings from various meteorological stations in the Beaufort Sea as our primary indicator. Additionally, we have investigated the trends in ice cover and its temporal fluctuations. The process of processing, merging, and analyzing disparate data sets presents a significant challenge. Currently, our statistical analysis primarily considers temperature changes as the key factor in explaining the migration patterns of polar bears.

Code

couleurs_saisons <- c("Spring" = "#89A45E", "Summer" = "#7FC6BC", "Autumn" = "#D17A22", "Winter" = "#5A7D9A")

# Déterminer les années uniques dans les données
unique_years <- unique(year(ymd(temp_beaufort_graph$date)))

# Fonction pour générer un graphique par saison et année
generate_plot <- function(season, year) {
  data_filtered <- temp_beaufort_graph %>%
    filter(season == season, year(ymd(date)) == year) %>%
    mutate(Average_Temperature = round(Average_Temperature * 2) / 2)

  couleur_ligne <- couleurs_saisons[season]

  ggplot(data_filtered, aes(x = date, y = Average_Temperature, group = season)) +
    geom_line(color = couleur_ligne, alpha = 0.7) +
    geom_point(color = couleur_ligne, size = 1, alpha = 0.7) +
    theme_minimal() +
    labs(title = paste("Temperature in the Beaufort Sea: ", season, " ", year),
         x = "Date",
         y = "Temperature (°C)")
}

# Générer et sauvegarder les graphiques en tant que fichiers HTML pour chaque combinaison saison-année
for (season in names(couleurs_saisons)) {
  for (year in unique_years) {
    plotly_plot <- ggplotly(generate_plot(season, year))
    html_file <- paste0("plot_", season, "_", year, ".html")
    htmlwidgets::saveWidget(plotly_plot, file = html_file)
  }
}

4.7 Average sea ice cover

To represent the average ice cover in the Beaufort Sea, we have also chosen to divide it up by season. In the following graph, you can double-click on a season to isolate it from the others. As with temperature, there is a strong seasonal effect on the surface area (in km2) of ice in the Beaufort Sea. In addition, there is a tendency for this to decrease over the years.

Code

# Fonction pour déterminer la saison
#getSeason <- function(date) {
#  month <- as.numeric(format(date, "%m"))
#  if (month %in% c(3, 4, 5)) {
#    return("Spring")
#  } else if (month %in% c(6, 7, 8)) {
#    return("Summer")
#  } else if (month %in% c(9, 10, 11)) {
#    return("Fall")
# } else {
#    return("Winter")
#  }
#}


# Ajouter la colonne saison
#glace_beaufort_graph <- glace_beaufort %>%
#  mutate(season = sapply(date, getSeason))


#couleurs_saisons <- c("Spring" = "#89A45E",  
#                      "Summer" = "#7FC6BC",        
#                      "Fall" = "#D17A22",    
#                      "Winter" = "#5A7D9A")

glace_beaufort_graph <- read.csv(here::here("Report/données/glace_beaufort_graph.csv"))
# Création du graphique ggplot
graph_beaufort <- ggplot(glace_beaufort_graph, aes(x = date, y = Beaufort_Sea, color = season)) +
  geom_line() +
  scale_color_manual(values = couleurs_saisons) +
  theme_minimal() +
  labs(title = "Extent of Beaufort Sea ice by year and season",
       x = "Year",
       y = "Extent of ice (km²)")

# Conversion en graphique interactif avec plotly
graph_beaufort_interactif <- ggplotly(graph_beaufort)

# Afficher le graphique
graph_beaufort_interactif

4.8 Number of polar bear observations

In addition to temperature and ice cover in the Beaufort Sea, we want to add another variable to our data set. It seems important to us to reflect the number of active polar bears (i.e. transmitting GPS data) present in our data set on a date T. This is why we have created a “Bear Counter” function, which associates the number of active polar bears with each GPS point. This allows us to adjust our data set in anticipation of our analysis. The following graph shows this function and thus the evolution of the number of active polar bears in the study over the years. Here again we can see that our data is fairly heterogeneous from year to year. This reinforces our idea of increasing our understanding of the data set to develop a rigorous statistical analysis.

Code

plot_nb_ours <- ours_clean

plot_nb_ours$date <- as.Date(plot_nb_ours$date)
plot_nb_ours$year <- year(plot_nb_ours$date)

# Calculer la moyenne du nombre d'ours actifs par année et arrondir à l'entier supérieur
moyenne_ours_par_annee <- plot_nb_ours %>%
  group_by(year) %>%
  summarize(moyenne_ours = ceiling(mean(Count_Bears, na.rm = TRUE)))

# Créer un graphique avec ggplot
graph_ours_v1 <- ggplot(moyenne_ours_par_annee, aes(x = year, y = moyenne_ours, group = 1)) +
  geom_line() +
  scale_y_continuous(limits = c(0, 50)) + # Fixer les limites de l'axe Y si nécessaire
  theme_minimal() +
  labs(x = "Year", y = "Number of bears studied per year", title = "Number of bear observations")


# Animer le graphique
nb_ours_anime <- graph_ours_v1 + transition_reveal(year)

# Enregistrer l'animation
anim_save("/Users/zoedardare/Documents/Master Management/Data science in BA/Quarto polar bears_files/Data Ugo/animation_moyenne_ours.gif", animation = nb_ours_anime)

# Afficher l'animation
animate(nb_ours_anime)

We have been able to separate the temperatures into 4 seasons over the years. We then notice a very strong seasonality through the average temperatures. What’s more, we don’t specifically see an increase or decrease in these temperatures over time.

4.9 Relationship between the days in covered water and the diets of polar bears

We create a scatterplot to visualize the relationship between the days in covered water and the ringed eaten by the polar bears.

Code

p_basic <- ggplot(pb, aes(x = days_in_cov_water, y = Perc_ringed_eaten)) +
  geom_point() +
  ggtitle("No clear relation between days in cover water and ringed seal consumed") +
  xlab("Number of days in water in melt season") +
  ylab("Ringed seal consumed in the during the previous year")
  theme(plot.title = element_text(size = 12))  # Ajustez la taille ici

List of 1
 $ plot.title:List of 11
  ..$ family       : NULL
  ..$ face         : NULL
  ..$ colour       : NULL
  ..$ size         : num 12
  ..$ hjust        : NULL
  ..$ vjust        : NULL
  ..$ angle        : NULL
  ..$ lineheight   : NULL
  ..$ margin       : NULL
  ..$ debug        : NULL
  ..$ inherit.blank: logi FALSE
  ..- attr(*, "class")= chr [1:2] "element_text" "element"
 - attr(*, "class")= chr [1:2] "theme" "gg"
 - attr(*, "complete")= logi FALSE
 - attr(*, "validate")= logi TRUE

Code

p_basics <- ggplotly(p_basic)
p_basics

The following graph shows the percentage of animals eaten.

Code

attach(pb)
data <- data.frame(
  labels = c("Ringed seals", "Bearded seals", "Seabirds", "Bowhead whales", "Belugas"),
  values = c(mean(Perc_ringed_eaten), mean(Perc_bearded_eaten),
             mean(Perc_seabird_eaten), mean(Perc_bowhead_eaten),
             mean(Perc_beluga_eaten))
)
ordered_data <- data %>% arrange(desc(values))
# Create the bar chart
ordered_data$values_rounded <- round(ordered_data$values, 2)

ordered_data$tooltip <- paste("Specy:", ordered_data$labels, 
                              "<br>Proportion:", ordered_data$values_rounded,"%")

gg <- ggplot(ordered_data, aes(x = labels, y = values, text = tooltip)) +
  geom_bar(stat = "identity", width = 0.75, fill = c("lightblue", "darkblue", "lightyellow", "skyblue", "blue")) +
  labs(title = "Ringed seals is the favourite food for polar bears", x = "Species eaten", y = "Proportion (%)")
ggplotly(gg, tooltip = "text")

4.10 Polar bear heat map

We can then plot the movements of each polar bear in our Beaufort Sea study on a map. What’s more, each GPS point is characterised by its year and the distance the polar bear has traveled since its last GPS fix. As the GPS points are not homogeneous depending on the BearID and the year, the track between the points is not the real trajectory of the polar bear selected. It presents the general idea of the journey made. For greater accuracy, it would be necessary to increase the number of GPS readings over time.

Code

predictions_complete$year <- format(as.Date(predictions_complete$date), "%Y")

create_heatmap_for_year <- function(year, data) {
  filtered_data <- data[data$year == year, ]
  
  m <- leaflet(filtered_data) %>%
    addTiles() %>%
    addHeatmap(lng = ~longitude, lat = ~latitude, intensity = ~1, blur = 20, max = 0.05, radius = 15) %>%
    addScaleBar(options = list(position = "bottomleft", maxWidth = 100, metric = TRUE, imperial = FALSE))
  
  saveWidget(m, file=paste0("heatmap_", year, ".html"), selfcontained = TRUE)
}

for(year in 2015:2020) {
  create_heatmap_for_year(year, predictions_complete)
}

Select a Year to display the Heatmap

5 Analysis

5.1 A decreasing polar bear population

We can see that the location of bears changes over time. As previously stated, an analysis based on the number of data points is of little relevance, as the bears were not specified in the data, and we can imagine an improvement in the methods and means put in place to obtain these measurements over time. What we can analyse, however, is the gradual shift towards the west in the measurements taken. We can imagine that as a result of global warming, bears are tending to seek out colder regions, and that while northern Alaska fulfilled this condition in the 80s and 90s, they now have to move to the even colder west of the country to find conditions that are conducive to their survival.

We can observe that the seals’ numbers oscillate yet show an overarching increase, which hints at their adaptability to environmental shifts—a resilience that might be bolstered by conservation measures. However, data irregularities, such as inconsistent collection frequencies and varying methodologies, could be obscuring the true narrative of these trends. These factors remind us to interpret the data cautiously.

Anyhow polar bears present a different story. The data, primarily from the American continent, indicate a concerning decrease in their population. The retreating Arctic ice is a clear signal of the climate crisis, stripping away the habitat crucial for the bears’ hunting and breeding. The data’s geographical limitations warrant a cautious approach to generalizing these findings across the glob.

From our study, we can observe that the decline in the number of polar bears is markedly more significant than the one of their primary prey, the seals. This pattern suggests a hypothesis: faced with an invisible enemy and the only predator of polar bears – climate change – these bears are in a dire situation. The warming climate leads to the ice in the bay forming later and melting earlier each year, which may abbreviate the seal hunting season to such an extent that polar bears may not have sufficient time to accumulate the fat reserves needed to survive their summer fast. The breeding cycle of the females could be disrupted, threatening the continuity of the species.

5.2 Is there a link between the diet of polar bears and the number of days they spend in open water?

We select the relevant columns for correlation analysis and create then a scatterplot matrix to visualize the relationships between all variables.

Code

attach(pb)
pb_cor <- select(pb, days_in_cov_water, OW_15prct, Meltseason, Perc_bearded_eaten, 
                 Perc_ringed_eaten, Perc_beluga_eaten, Perc_bowhead_eaten,
                 Perc_seabird_eaten)

suppressWarnings({scatterplotMatrix(pb_cor)})

We identify a potential multicollinearity issue between OW_15prct and the days in covered water due to high correlation, therefore we choose to retain the days in covered water for further analysis.

Code

pb_co <- cor(pb_cor)
my_colors <- brewer.pal(5, "Spectral")
my_colors <- colorRampPalette(my_colors)(100)
ord <- order(pb_co[1, ])
data_ord <- pb_co[ord, ord]
plotcorr(data_ord , col=my_colors[data_ord*50+50] , mar=c(0.1,0.1,0.1,0.1))

Here we see the link between the different variables in the data set. There is no clear link between the variables, except between “OW_15pcrt” and “days_in_covered_water”. However, due to the multicollinearity between these variables, OW_15prct was subsequently removed from the analysis. We can see a link, which will prove interesting later on, between days spent in open water and meltseason.

For further analysis, we select a subset of variables, and create a scatterplot matrix for the selected variables.

Code

pb_cor_multi <- pb %>%
  select(
    Days_water = days_in_cov_water,
    Meltseason,
    Bearded = Perc_bearded_eaten,
    Ringed = Perc_ringed_eaten,
    Beluga = Perc_beluga_eaten,
    Bowhead = Perc_bowhead_eaten,
    Seabird = Perc_seabird_eaten
  )

# Create a scatterplot matrix for the selected variables

p <- ggpairs(pb_cor_multi, title="No significant correlation between the variables")

# Utilisez la fonction rotate_text pour ajuster l'orientation des étiquettes
p <- p + theme(text= element_text(size=6.5), axis.text.y = element_text(size = 5),
               axis.text.y.right = element_text(size = 5),
               axis.text.x = element_text(size = 5, angle = 90, hjust = 1, vjust = 1))

p_interactive <- ggplotly(p)

p_interactive

We fitted a simple linear regression model with the days in covered water as the dependent variable and we summarized the results of the simple linear regression model. We also fitted a multiple linear regression model with the days in covered water as the dependent variable, the melt season, and the percentage of bearded eaten by the bears as independent variables.

Then we were able to calculate the variance inflation factors (VIFs) for the multiple linear regression model:

Code

vif_values <- vif(mod1)

# Créer un data frame à partir des valeurs VIF
vif_df <- data.frame(
  Variable = names(vif_values),
  VIF = unname(vif_values)
)

kable(vif_df, caption = "Variance Inflation Factors")

Variance Inflation Factors
Variable	VIF
pb_cor_multi$Meltseason	1.003628
pb_cor_multi$Bearded	1.003628

Code

kable(vif_df, "html") %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  add_header_above(c(" " = 1, "Variance Inflation Factor" = 1))

	Variance Inflation Factor
Variable	VIF
pb_cor_multi$Meltseason	1.003628
pb_cor_multi$Bearded	1.003628

No problems were detected, as the coefficients were well below 5.

We fit a multiple linear regression model with the days in covered water as the dependent variable and all other variables as independent variables.

Days_water = β0 + β1∗Meltseason + β2∗Bearded + β3∗Ringed + β4∗Beluga + β5∗Bowhead + β6∗Seabird + ε

Code

mod2 <- lm(pb_cor_multi$Days_water~pb_cor_multi$Meltseason+pb_cor_multi$Bearded
           +pb_cor_multi$Ringed+pb_cor_multi$Beluga+
             pb_cor_multi$Bowhead+pb_cor_multi$Seabird)

summary_mod2<-summary(mod2)

The R-Square is not very good. Relating to the variance inflation factor of the first model, considering all the parameters is a problem. The melt season seems to be interesting, and 1-2 other variables of food. We take the AIC model to find the best model:

Code

selected.AIC <- step(lm(pb_cor_multi$Days_water~pb_cor_multi$Meltseason+pb_cor_multi$Bearded
                        +pb_cor_multi$Ringed+pb_cor_multi$Beluga+
                          pb_cor_multi$Bowhead+pb_cor_multi$Seabird),direction="backward")

Start:  AIC=2606.08
pb_cor_multi$Days_water ~ pb_cor_multi$Meltseason + pb_cor_multi$Bearded + 
    pb_cor_multi$Ringed + pb_cor_multi$Beluga + pb_cor_multi$Bowhead + 
    pb_cor_multi$Seabird

                          Df Sum of Sq    RSS    AIC
- pb_cor_multi$Bowhead     1       6.8  85602 2604.1
- pb_cor_multi$Seabird     1       6.9  85602 2604.1
- pb_cor_multi$Ringed      1       6.9  85602 2604.1
- pb_cor_multi$Bearded     1       6.9  85602 2604.1
- pb_cor_multi$Beluga      1       6.9  85602 2604.1
<none>                                  85595 2606.1
- pb_cor_multi$Meltseason  1   17286.5 102882 2697.0

Step:  AIC=2604.12
pb_cor_multi$Days_water ~ pb_cor_multi$Meltseason + pb_cor_multi$Bearded + 
    pb_cor_multi$Ringed + pb_cor_multi$Beluga + pb_cor_multi$Seabird

                          Df Sum of Sq    RSS    AIC
<none>                                  85602 2604.1
- pb_cor_multi$Seabird     1     371.1  85973 2604.3
- pb_cor_multi$Beluga      1    1145.3  86747 2608.8
- pb_cor_multi$Ringed      1    1424.6  87027 2610.4
- pb_cor_multi$Bearded     1    2130.4  87732 2614.5
- pb_cor_multi$Meltseason  1   17338.4 102940 2695.3

Code

summary(selected.AIC)


Call:
lm(formula = pb_cor_multi$Days_water ~ pb_cor_multi$Meltseason + 
    pb_cor_multi$Bearded + pb_cor_multi$Ringed + pb_cor_multi$Beluga + 
    pb_cor_multi$Seabird)

Residuals:
    Min      1Q  Median      3Q     Max 
-21.098 -10.468   1.183   6.012  32.971 

Coefficients:
                        Estimate Std. Error t value Pr(>|t|)    
(Intercept)             71.56829    5.87533  12.181  < 2e-16 ***
pb_cor_multi$Meltseason  0.32006    0.03184  10.053  < 2e-16 ***
pb_cor_multi$Bearded     0.14169    0.04021   3.524 0.000464 ***
pb_cor_multi$Ringed      0.11061    0.03838   2.882 0.004125 ** 
pb_cor_multi$Beluga      0.15110    0.05848   2.584 0.010054 *  
pb_cor_multi$Seabird     0.08457    0.05750   1.471 0.141994    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 13.1 on 499 degrees of freedom
Multiple R-squared:  0.1858,    Adjusted R-squared:  0.1776 
F-statistic: 22.77 on 5 and 499 DF,  p-value: < 2.2e-16

The better model seems to consider the melt season and the number of bearded sealed, but not the three other variables.

Code

mod_fin <- lm(pb_cor_multi$Days_water~pb_cor_multi$Meltseason+pb_cor_multi$Bearded)

sum_modfin<- summary(mod_fin)
# Conversion de la sortie de summary en data frame
coef_dffin <- as.data.frame(sum_modfin$coefficients)

# Utilisation de formattable pour afficher le résumé
formattable(coef_dffin)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	81.79382895	4.75672094	17.195423	1.917344e-52
pb_cor_multi$Meltseason </td> <td style="text-align:right;"> 0.30852876 </td> <td style="text-align:right;"> 0.03184609 </td> <td style="text-align:right;"> 9.688122 </td> <td style="text-align:right;"> 1.835253e-20 </td> </tr> <tr> <td style="text-align:left;"> pb_cor_multi$Bearded	0.05797567	0.02539129	2.283290	2.283034e-02

The coefficients are significant, but the model is badly predicted… We try with a simple linear regression, only with the melt season.

Code

mod_simp <- lm(pb_cor_multi$Days_water~pb_cor_multi$Meltseason)

sum_modsimp<-summary(mod_simp)

# Conversion de la sortie de summary en data frame
coef_dfsimp <- as.data.frame(sum_modsimp$coefficients)

# Utilisation de formattable pour afficher le résumé
formattable(coef_dfsimp)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	82.3690274	4.76989854	17.26851	8.268680e-53
pb_cor_multi$Meltseason	0.3129008	0.03192133	9.80225	7.081637e-21

We can represent the data as a linear regression.

Code

p_regr <- ggplot(pb_cor_multi, aes(x = Meltseason, y = days_in_cov_water)) + 
  geom_point() +
  stat_smooth(method = "lm", col = "red") +
  ggtitle("We see a small positive link, but with variability")
p_reg <- ggplotly(p_regr)
p_reg

The R-Squared is fast the same, but this model is simpler. But we can also keep the final model, with a link with the food for polar bears. We can have a look on the variances, with the analysis of variances (anova):

Code

library(formattable)
df_anova<-anova(mod_fin)
formattable_table <- formattable(df_anova)
formattable_table

	Df	Sum Sq	Mean Sq	F value	Pr(>F)
pb_cor_multi$Meltseason </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 16861.3443 </td> <td style="text-align:right;"> 16861.3443 </td> <td style="text-align:right;"> 96.888963 </td> <td style="text-align:right;"> 5.072960e-21 </td> </tr> <tr> <td style="text-align:left;"> pb_cor_multi$Bearded	1	907.2771	907.2771	5.213412	2.283034e-02
Residuals	502	87361.8063	174.0275	NA	NA

Let’s have a look on the residues:

Code

qqPlot(mod_simp$residuals, distribution = "norm")

[1] 13 31

Code

mean(mod_simp$residuals)

[1] -1.479786e-15

Code

detach(pb)

They are fast equal to zero, so we cannot reject that the residues are normally distributed.

5.3 How do changes in sea ice coverage correlate with polar bear movement patterns as revealed by GPS tracking?

In this section, we set up a new panel of twenty polar bears, independently of those already considered in previous research activities. Due to practical constraints and limitations inherent in our skills, we opted for this method. It is important to note that this approach could influence the accuracy of our model, particularly regarding the continuity of movement of current bears.

Polar bears are mammals whose behaviour is highly complex, making their movements difficult to model. These challenges are exacerbated by constantly changing environmental and geographical phenomena, making it particularly difficult to estimate their movements accurately. Despite these obstacles, we mobilised all our available resources and skills to develop a simplified model aimed at predicting polar bear movement trends for the five years following our study period. Our preliminary model indicates a significant increase in the presence of polar bears along the Beaufort Sea coast over time. However, the inherent limitations of our model and its simplistic nature make any definitive conclusions premature.

Code

# Ajouter des variables saisonnières basées sur le mois
ours_clean <- ours_clean %>%
  mutate(date = ymd(date),
         month = month(date),
         season = case_when(
           month %in% c(12, 1, 2) ~ "Winter",
           month %in% c(3, 4, 5) ~ "Spring",
           month %in% c(6, 7, 8) ~ "Summer",
           month %in% c(9, 10, 11) ~ "Autumn",
           TRUE ~ NA_character_
         ))

# Convertir la saison en variable factorielle
ours_clean$season <- as.factor(ours_clean$season)

# Régression linéaire
model <- lm(Distance ~ Closest_Avg_Temperature + Count_Bears + latitude + longitude + season, data = ours_clean)

# Résumé du modèle
sum_mod<-summary(model)


# Conversion de la sortie de summary en data frame
coef_df1 <- as.data.frame(sum_mod$coefficients)

# Utilisation de formattable pour afficher le résumé
formattable(coef_df1)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	-15704.358046	2148.116825	-7.3107560	2.664331e-13
Closest_Avg_Temperature	-218.358044	4.032561	-54.1487294	0.000000e+00
Count_Bears	98.310057	4.685425	20.9821019	1.161376e-97
latitude	277.315868	28.592081	9.6990445	3.071029e-22
longitude	4.608247	3.239887	1.4223483	1.549264e-01
seasonSpring	-4241.466867	147.268913	-28.8008297	4.173695e-182
seasonSummer	39.891773	149.265983	0.2672529	7.892746e-01
seasonWinter	-3645.022000	195.345021	-18.6594057	1.196508e-77

This multiple linear regression model, developed from 263,886 observations, explores the factors influencing the distance traveled by bears. The model includes environmental and geographical variables as well as seasonal effects, but has an R2 of just 0.02, suggesting that other variables not included may also be important. Average temperature showed a significant negative impact of -218 meters per degree, suggesting that higher temperatures are associated with a decrease in distance traveled. The number of bears observed had a positive effect, with an increase of 98 meters for each additional bear observed. Geographical variables such as latitude and longitude showed significant effects, with an increase of 277 meters for each additional degree of latitude. However, the effect of longitude, although positive, was not statistically significant (p = 0.15). The seasonal effects are clearly visible, and we have arbitrarily chosen autumn as the reference season for comparing them. In terms of seasonal effects, spring shows a significant reduction of 4,241 meters compared with autumn and winter a reduction of 3,645 meters. Summer, on the other hand, shows no significant difference from autumn. Secondly, we integrate the ice cover (in km2) over the Beaufort Sea.

Code

# Convertir 'yyyyddd' en 'YYYY-MM-DD'
glace_beaufort$date <- as.Date(paste0(substr(glace_beaufort$yyyyddd, 1, 4), "-01-01")) +
  (as.numeric(substr(glace_beaufort$yyyyddd, 5, 7)) - 1)

# Fusionner les datasets sur la base de la date
dataset_regression <- left_join(ours_clean, glace_beaufort, by = "date")

# Filtrer pour ne garder que les données à partir de 2006
dataset_regression <- filter(dataset_regression, year(date) >= 2006)

# Régression linéaire sur le déplacement des ours
model <- lm(Distance ~ Closest_Avg_Temperature + Count_Bears + Beaufort_Sea, data = dataset_regression)

# Résumé du modèle
sum_mod2<-summary(model)

# Conversion de la sortie de summary en data frame
coef_df2 <- as.data.frame(sum_mod2$coefficients)

# Utilisation de formattable pour afficher le résumé
formattable(coef_df2)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	7.880316e+03	3.653602e+02	21.56862	8.008171e-103
Closest_Avg_Temperature	2.763439e+02	1.344256e+01	20.55738	1.297099e-93
Count_Bears	9.407240e+01	5.426635e+00	17.33531	3.585202e-67
Beaufort_Sea	-7.270132e-04	2.697236e-04	-2.69540	7.032181e-03

Code

dataset_regression <- arrange(dataset_regression, bearid)

Here we focus on the post-2006 period of our original data set, which is not only where most of the GPS markers are located but also the start of the ice cover data set. The multiple linear regression model analysed thus provides a quantitative overview of the factors influencing the distance traveled by bears, with each variable expressed in terms of its impact on distance in meters. Based on 193,502 observations, this model incorporates several environmental and geographical predictors. Among the variables, average temperature shows a significant impact of 30 meters per degree of temperature change. This indicates that an increase in average temperature is associated with an average increase in the distance traveled by bears. The number of bears observed (Count_Bears) shows a positive relationship with distance traveled, with an increase of 62 meters for each additional bear observed. This suggests that the presence of a greater number of bears is correlated with an increase in the distance traveled. The geographical variables, latitude and longitude also showed significant effects. For example, each additional degree of latitude is associated with an increase of 368 meters in distance traveled, while each additional degree of longitude corresponds to an increase of 8 meters. Nevertheless, in our model, despite its significance, ice cover (Beaufort_Sea) does not bring any change to our dependent variable. In spring, the distance traveled fell by 2,205 meters compared with autumn, while in summer it fell by 1,804 meters. In winter, the distance traveled decreases by 523 meters compared with autumn.

In a final model, we wanted to detach the potential multicollinearity of our variables as a function of season. To do this, we transformed our season variable into an indicator variable. Here are the results of the model:

Code

data_regression1<- na.omit(dataset_regression)
# Transformation de la variable saison en variables indicatrices
data_regression1 <- mutate(data_regression1, season = as.factor(season))
data_regression1 <- cbind(data_regression1, model.matrix(~season - 1, data = data_regression1))

# Ajustement du modèle de régression avec des interactions
modele3 <- lm(Distance ~ Count_Bears + latitude + longitude + seasonSpring:Closest_Avg_Temperature + seasonSummer:Closest_Avg_Temperature + seasonWinter:Closest_Avg_Temperature + seasonSpring:Beaufort_Sea + seasonSummer:Beaufort_Sea + seasonWinter:Beaufort_Sea, data = data_regression1)

# Résumé des résultats du modèle
sum_mod3<-summary(modele3)

# Conversion de la sortie de summary en data frame
coef_df3 <- as.data.frame(sum_mod3$coefficients)

# Utilisation de formattable pour afficher le résumé
formattable(coef_df3)

	Estimate	Std. Error	t value	Pr(>\|t\|)
(Intercept)	3.642017e+04	3.254354e+03	11.1912125	4.780964e-29
Count_Bears	1.025473e+02	5.638745e+00	18.1861939	1.004242e-73
latitude	-4.027032e+02	4.411302e+01	-9.1288967	7.108091e-20
longitude	-2.347668e+00	1.683814e+00	-1.3942566	1.632448e-01
Closest_Avg_Temperature:seasonWinter	3.170391e+02	1.400669e+01	22.6348351	5.291311e-113
seasonWinter:Beaufort_Sea	-2.277004e-04	2.737522e-04	-0.8317755	4.055387e-01

The results remain very similar to the two previous regressions. But to verify this multicollinearity aspect, we carry out a VIF test on our variables.

Code

#vif_values <- vif(modele3)

# Créer un data frame à partir des valeurs VIF
#vif_df <- data.frame(
#  Variable = names(vif_values),
#  VIF = unname(vif_values)
#)

#kable(vif_df, caption = "Variance Inflation Factors")
#kable(vif_df, "html") %>%
#  kable_styling(bootstrap_options = c("striped", "hover")) %>%
#  add_header_above(c(" " = 1, "Variance Inflation Factor" = 1))

The multicollinearity analysis for our regression model, based on the variance inflation factor (VIF), reveals varying levels of collinearity between predictors. The variables Bear_count, latitude, longitude and Temperature_Summer have VIFs below 5, suggesting a lack of significant multicollinearity. However, strong multicollinearity is observed for Temperature_Winter and Glace_cover_Winter, with VIFs of 16.55 and 17.08 respectively. Other terms, such as Temperature_Spring, Glace_cover_spring, and Glace_cover_Summer, show moderate levels of multicollinearity. These results indicate that although some interactions in the model do not pose significant collinearity problems, others, particularly those involving the winter season, could affect the reliability and interpretation of the coefficients and require particular attention when analyzing the results.

In conclusion, although our current model reveals significant relationships between various environmental and geographical factors and the distance traveled by bears, its low R2 suggests that there is room for improvement. To increase the explanatory power of the model, it could be beneficial to integrate additional data such as the availability of food resources in the Beaufort Sea, variations in vegetation density, or even interactions with other species, which could influence the bears’ movement patterns. In addition, a more detailed analysis of temporal data, considering intra-seasonal variations or specific weather conditions, for example, could provide more nuanced insights. Finally, exploring more complex modelling methods, such as non-linear models or mixed models, could also help to capture more subtle and complex relationships between the variables studied. It is crucial to understand that this model was designed primarily to evaluate its feasibility on our scale. We now need to collect more homogeneous and diversified data to improve the reliability and sustainability of our predictions. Expanding our database, incorporating more precise environmental parameters, and collaborating with experts in climatology and wildlife biology could improve the accuracy of our future estimates. By developing a more nuanced understanding of the migratory patterns and adaptive responses of polar bears to environmental change, we will be able to offer more informed insights into how these majestic animals navigate a rapidly changing world.

6 Conclusion

The comprehensive study on the impact of climate change on Arctic ecosystems, particularly focusing on polar bears, reveals significant findings and implications. Our investigation, driven by the urgent need to understand the effects of environmental shifts in the Arctic, combines extensive data analysis with predictive modeling to paint a detailed picture of the current and future state of polar bear populations and their habitats.

Key Findings

Declining Polar Bear Population: The data indicates a notable decline in polar bear numbers, particularly in regions like Alaska. This trend is alarming and points to the broader implications of shrinking sea ice and changing climatic conditions in the Arctic. The contrast with the ringed seals’ population trends, which show some resilience, underscores the unique challenges faced by polar bears.
Movement Patterns and Habitat Utilization: Our analysis of GPS data and environmental factors such as temperature and sea ice coverage reveals that polar bears’ movement patterns are intricately linked to these variables. The decreasing distances traveled by bears over the years suggest adaptations to the rapidly changing environment. Our predictive models, while indicating some level of accuracy, also highlight the complexity of these movement patterns.
Diet and Open Water Days: The relationship between the polar bears’ diet and the time they spend in open water is complex. Our statistical models suggest a connection, albeit not a straightforward one, between these factors. The impact of the melting season on bears’ dietary habits is evident but requires further exploration for conclusive insights.

Implications and Future Directions

The findings underscore the urgency for targeted conservation efforts aimed at preserving polar bear habitats, particularly in light of the diminishing sea ice cover in the Arctic. The insights gleaned from this study can inform policymakers and conservationists in formulating strategies that address the specific needs of polar bears, considering the broader context of climate change. Continued monitoring and research are vital to deepen our understanding of polar bears’ adaptation strategies to the changing environment. Future studies should aim to integrate more variables, such as human activities and their impact on bear habitats. Also, data on the location of bears in Alaska could be cross-referenced with changes in temperatures in different regions and the number of animals counted over time.

Concluding Remarks

In conclusion, this study highlights the profound impact of climate change on Arctic ecosystems, particularly on polar bears. The changing patterns in sea ice coverage and temperature regimes are not just altering the physical landscape but are also reshaping the life cycles and survival strategies of these iconic species. Our findings, while offering significant insights, also open the door for further research and action in the realm of wildlife conservation and climate change mitigation.

	Df	Sum Sq	Mean Sq	F value	Pr(>F)
pb_cor_multi\(Meltseason </td> <td style="text-align:right;"> 1 </td> <td style="text-align:right;"> 16861.3443 </td> <td style="text-align:right;"> 16861.3443 </td> <td style="text-align:right;"> 96.888963 </td> <td style="text-align:right;"> 5.072960e-21 </td> </tr> <tr> <td style="text-align:left;"> pb_cor_multi\)Bearded	1	907.2771	907.2771	5.213412	2.283034e-02
Residuals	502	87361.8063	174.0275	NA	NA

1 Executive summary

2 Introduction

2.1 Overview and motivation

2.2 Related work

2.3 Research questions

3 Data

3.1 Data Source

3.2 Data Scraping and Cleanup

3.2.1 Living Planet Database

3.2.2 Aerial Master Database

3.2.3 Polar Bear Distribution Database

3.2.4 Temperatures Data Base

3.2.5 Sea Ice Cover Data Base

3.2.6 Combined Data Sets

3.2.7 Data Set Regression for Travel Forecasts

4 Exploratory Data Analysis

4.1 Locations of research centers

4.2 5-Year cumulative count trend for ringed seals and polar bears

4.3 Polar bear location

4.4 Average weekly distance traveled per year

4.5 GPS tracks of polar bear

4.6 Temeprature evolution

4.7 Average sea ice cover

4.8 Number of polar bear observations

4.9 Relationship between the days in covered water and the diets of polar bears

4.10 Polar bear heat map

5 Analysis

5.1 A decreasing polar bear population

5.2 Is there a link between the diet of polar bears and the number of days they spend in open water?

5.3 How do changes in sea ice coverage correlate with polar bear movement patterns as revealed by GPS tracking?

6 Conclusion