Introduction

This analysis investigates the Mars Weather data set, focusing on temperature, pressure, and seasonal patterns. The goal is to explore trends, relationships, and anomalies in the data to better understand the Martian climate.

Load library

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.1.4     ✔ readr     2.1.5
## ✔ forcats   1.0.0     ✔ stringr   1.5.1
## ✔ ggplot2   3.5.1     ✔ tibble    3.2.1
## ✔ lubridate 1.9.4     ✔ tidyr     1.3.1
## ✔ purrr     1.0.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggplot2)
library(readr)

Read Mars weather data

mars_data <- read_csv("C:/Users/rbada/Downloads/Mars-weather.csv")
## Rows: 1894 Columns: 10
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (2): month, atmo_opacity
## dbl  (7): id, sol, ls, min_temp, max_temp, pressure, wind_speed
## date (1): terrestrial_date
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Numeric and Categorical Summaries

Numeric summary for min_temp and pressure

library(tidyverse)
summary_for_numeric_cols <- mars_data %>%
  summarise(
    Min_MinTemp = min(min_temp, na.rm = TRUE),
    Max_MinTemp = max(min_temp, na.rm = TRUE),
    Mean_MinTemp = mean(min_temp, na.rm = TRUE),
    Median_MinTemp = median(min_temp, na.rm = TRUE),
    Q1_MinTemp = quantile(min_temp, 0.25, na.rm = TRUE),
    Q3_MinTemp = quantile(min_temp, 0.75, na.rm = TRUE),
    Min_Pressure = min(pressure, na.rm = TRUE),
    Max_Pressure = max(pressure, na.rm = TRUE),
    Mean_Pressure = mean(pressure, na.rm = TRUE),
    Median_Pressure = median(pressure, na.rm = TRUE),
    Q1_Pressure = quantile(pressure, 0.25, na.rm = TRUE),
    Q3_Pressure = quantile(pressure, 0.75, na.rm = TRUE)
  )

# Print the numeric summary
print(summary_for_numeric_cols)
## # A tibble: 1 × 12
##   Min_MinTemp Max_MinTemp Mean_MinTemp Median_MinTemp Q1_MinTemp Q3_MinTemp
##         <dbl>       <dbl>        <dbl>          <dbl>      <dbl>      <dbl>
## 1         -90         -62        -76.1            -76        -80        -72
## # ℹ 6 more variables: Min_Pressure <dbl>, Max_Pressure <dbl>,
## #   Mean_Pressure <dbl>, Median_Pressure <dbl>, Q1_Pressure <dbl>,
## #   Q3_Pressure <dbl>

Insight: The numeric summary shows that daily minimum temperatures on Mars range from -90°C to -62°C, with a mean of about -76°C. Atmospheric pressure ranges from roughly 727 Pa to 925 Pa, with a mean of about 841 Pa and a median of 853 Pa. These values highlight the extreme cold and thin atmosphere on Mars, which are critical for planning future missions.
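The printed tibble above truncates the six pressure columns. One way to see every statistic at once is to compute the summary with across() and reshape it to long form. The sketch below is illustrative only: `toy` is a hypothetical stand-in for mars_data with made-up values, and the `__` separator in the column names is an arbitrary choice.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for mars_data; the values are illustrative only.
toy <- tibble(
  min_temp = c(-90, -80, -76, -72, -62, NA),
  pressure = c(727, 800, 853, 883, 925, NA)
)

summary_long <- toy %>%
  summarise(across(
    c(min_temp, pressure),
    list(
      min    = ~ min(.x, na.rm = TRUE),
      q1     = ~ unname(quantile(.x, 0.25, na.rm = TRUE)),
      median = ~ median(.x, na.rm = TRUE),
      mean   = ~ mean(.x, na.rm = TRUE),
      q3     = ~ unname(quantile(.x, 0.75, na.rm = TRUE)),
      max    = ~ max(.x, na.rm = TRUE)
    ),
    .names = "{.col}__{.fn}"
  )) %>%
  # One row per variable/statistic pair, so nothing is hidden when printing
  pivot_longer(
    everything(),
    names_to = c("variable", "statistic"),
    names_sep = "__",
    values_to = "value"
  )

print(summary_long, n = Inf)
```

With the real data, replacing `toy` with mars_data should list all twelve statistics in a two-variable, six-statistic layout.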

Categorical summary for unique values and counts:

categorical_summary <- mars_data %>%
  summarise(
    Unique_Months = n_distinct(month),
    Unique_Atmo_Opacity = n_distinct(atmo_opacity)
  )
print(categorical_summary)
## # A tibble: 1 × 2
##   Unique_Months Unique_Atmo_Opacity
##           <int>               <int>
## 1            12                   2

List distinct categories for categorical variables

distinct_categories <- list(
  Unique_Months = unique(mars_data$month),
  Unique_Opacities = unique(mars_data$atmo_opacity)
)
print(distinct_categories)
## $Unique_Months
##  [1] "Month 5"  "Month 4"  "Month 3"  "Month 2"  "Month 1"  "Month 12"
##  [7] "Month 11" "Month 10" "Month 9"  "Month 8"  "Month 7"  "Month 6" 
## 
## $Unique_Opacities
## [1] "Sunny" "--"

Insight: The data set includes 12 unique Martian months. Atmospheric opacity is almost always reported as Sunny; only 3 of the 1,894 rows carry the unclear value "--". This suggests relatively stable atmospheric conditions on Mars, which is promising for solar-powered systems. However, further investigation into the rare "--" values might show whether they reflect unusual weather events or simply unrecorded readings.
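If "--" turns out to be a placeholder for an unrecorded reading (an assumption; the data set does not say), it can be converted to NA at import time through the `na` argument of read_csv(). The sketch below uses a hypothetical three-row extract passed as literal text rather than the real file.

```r
library(readr)

# Hypothetical three-row extract in the same layout as two of the columns.
csv_text <- "month,atmo_opacity\nMonth 5,Sunny\nMonth 5,--\nMonth 6,Sunny\n"

# I() tells readr that the string is literal data rather than a file path.
toy <- read_csv(
  I(csv_text),
  na = c("", "NA", "--"),   # treat "--" as missing, in addition to the defaults
  show_col_types = FALSE
)

# Number of opacity readings now treated as missing
sum(is.na(toy$atmo_opacity))
```

Treating "--" as NA would also reduce the number of distinct opacity categories reported by n_distinct() unless `na.rm = TRUE` is left at its default of FALSE, so the choice should be made deliberately.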

Frequency counts for month and atmospheric opacity

month_counts <- mars_data %>% count(month, sort = TRUE)
opacity_counts <- mars_data %>% count(atmo_opacity, sort = TRUE)
print(month_counts)
## # A tibble: 12 × 2
##    month        n
##    <chr>    <int>
##  1 Month 3    194
##  2 Month 4    194
##  3 Month 2    182
##  4 Month 1    176
##  5 Month 12   166
##  6 Month 6    153
##  7 Month 5    149
##  8 Month 11   145
##  9 Month 7    142
## 10 Month 8    141
## 11 Month 9    136
## 12 Month 10   116
print(opacity_counts)
## # A tibble: 2 × 2
##   atmo_opacity     n
##   <chr>        <int>
## 1 Sunny         1891
## 2 --               3

Novel Questions to Investigate

  1. How do minimum and maximum temperatures vary across different Martian months, and how do these temperatures correlate with pressure?

  2. What is the relationship between atmospheric pressure and temperature on Mars?

  3. Are there significant differences in atmospheric opacity during specific Martian months or seasons?

Column Summaries

Numeric columns

  1. min_temp: Represents the minimum daily temperature on Mars (°C) and helps analyze the coldest weather conditions to understand extreme environments.

  2. max_temp: Indicates the maximum daily temperature on Mars (°C), providing insights into the hottest conditions and daily temperature ranges.

  3. pressure: Measures the atmospheric pressure on Mars (Pa), allowing for the study of stability and variability in Martian atmospheric conditions.

  4. ls: Represents the solar longitude (0°–360°), which corresponds to Mars’s position in its orbit and helps analyze seasonal changes affecting weather patterns.

  5. sol: Tracks the Martian solar day (count of days since the mission began) and enables the analysis of trends or changes in weather.

Categorical Columns

  1. month: Specifies the Martian month (from Month 1 to Month 12), allowing seasonal trends and variations in Martian weather to be studied throughout the year.

  2. atmo_opacity: Describes the clarity of the Martian atmosphere (e.g., Sunny), providing insights into atmospheric conditions such as clear skies or dusty weather.

Date Column

  1. terrestrial_date: Refers to the Earth-based date corresponding to the Martian weather data, enabling chronological analysis and linking Mars observations to Earth’s timeline.

Other Columns

  1. id: Unique identifier for each observation, ensuring each row can be tracked and distinguished for analysis purposes.

  2. wind_speed: Records the wind speed on Mars, which would describe wind dynamics and weather patterns. In the cleaned data used for the visual summaries, however, every wind_speed value is missing, so the column cannot be analyzed here.
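Before relying on any column, it can help to count missing values per column. The sketch below is a minimal illustration on a hypothetical four-row table (`toy`, with made-up values), not the real data.

```r
library(dplyr)

# Hypothetical stand-in for mars_data; the values are illustrative only.
toy <- tibble(
  id         = 1:4,
  min_temp   = c(-77, NA, -76, -78),
  pressure   = c(727, 728, NA, 730),
  wind_speed = c(NA_real_, NA_real_, NA_real_, NA_real_)
)

# Count of missing values in each column
missing_counts <- toy %>%
  summarise(across(everything(), ~ sum(is.na(.x))))
print(missing_counts)

# Columns that are missing in every row carry no usable information
all_missing <- names(toy)[colSums(is.na(toy)) == nrow(toy)]
print(all_missing)
```

Applied to mars_data, the same two steps would show at a glance which columns can support the questions listed above.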

Data set Documentation:

This data set contains weather observations from Mars, collected over a specific period. It includes both numerical and categorical data, such as temperature, atmospheric pressure, and seasonal attributes, offering insights into the Martian climate. The data was likely obtained from Mars weather missions, such as NASA’s Curiosity rover or similar projects, and serves as a valuable resource for studying Martian environmental conditions.

Project Goals and Purpose

The goal of this project is to analyze the Mars weather data set to better understand the Martian climate. This includes studying temperature patterns, atmospheric pressure, and seasonal changes across Martian months. The project aims to identify trends, explore relationships between variables like temperature and pressure, and gain insights into how Martian weather behaves over time. These findings can provide valuable context for future Mars missions and research.

Addressing a Question Using Aggregation:

How do minimum and maximum temperatures vary across different Martian months, and how do these temperatures correlate with pressure?

# Aggregate mean min_temp and max_temp by Martian month
library(dplyr)

temp_by_month <- mars_data %>%
  group_by(month) %>%
  summarise(
    Mean_MinTemp = mean(min_temp, na.rm = TRUE),
    Mean_MaxTemp = mean(max_temp, na.rm = TRUE),
    .groups = 'drop'
  )
print(temp_by_month)
## # A tibble: 12 × 3
##    month    Mean_MinTemp Mean_MaxTemp
##    <chr>           <dbl>        <dbl>
##  1 Month 1         -77.2     -15.4   
##  2 Month 10        -72.0      -3.33  
##  3 Month 11        -72.0      -4.16  
##  4 Month 12        -74.5      -7.93  
##  5 Month 2         -79.9     -23.4   
##  6 Month 3         -83.3     -27.7   
##  7 Month 4         -82.7     -25.3   
##  8 Month 5         -79.3     -16.7   
##  9 Month 6         -75.3      -7.77  
## 10 Month 7         -72.3      -0.0141
## 11 Month 8         -68.4      -0.631 
## 12 Month 9         -69.2      -2.23
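# Order months numerically so the M1 to M12 axis labels match the data
# (as plain text, "Month 10" would sort before "Month 2").
ggplot(temp_by_month, aes(x = factor(month, levels = paste("Month", 1:12)))) +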
  geom_line(aes(y = Mean_MinTemp, color = "Min Temp"), group = 1, linewidth = 1.2) +
  geom_line(aes(y = Mean_MaxTemp, color = "Max Temp"), group = 1, linewidth = 1.2) +
  scale_x_discrete(labels = paste0("M", 1:12)) +
  scale_color_manual(values = c("Min Temp" = "blue", "Max Temp" = "red")) +
  labs(title = "Mean Temperatures by Martian Month", x = "Martian Month", y = "Temperature (°C)") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

library(dplyr)
temp_pressure_by_month <- mars_data %>%
  group_by(month) %>%
  summarise(
    Mean_MinTemp = mean(min_temp, na.rm = TRUE),
    Mean_MaxTemp = mean(max_temp, na.rm = TRUE),
    Mean_Pressure = mean(pressure, na.rm = TRUE),
    .groups = 'drop'
  )
# Plot both temperature and pressure
# Note: pressure (Pa) and temperature (°C) share one y-axis here, so the
# temperature lines appear compressed relative to pressure.
# Months are ordered numerically so the M1 to M12 labels match the data.
ggplot(temp_pressure_by_month, aes(x = factor(month, levels = paste("Month", 1:12)))) +
  geom_line(aes(y = Mean_MinTemp, color = "Min Temp"), group = 1, linewidth = 1.2) +
  geom_line(aes(y = Mean_MaxTemp, color = "Max Temp"), group = 1, linewidth = 1.2) +
  geom_line(aes(y = Mean_Pressure, color = "Pressure"), group = 1, linewidth = 1.2, linetype = "dashed") +
  scale_x_discrete(labels = paste0("M", 1:12)) +
  scale_color_manual(values = c("Min Temp" = "blue", "Max Temp" = "red", "Pressure" = "green")) +
  labs(title = "Mean Temperatures and Pressure by Martian Month",
       x = "Martian Month",
       y = "Temperature (°C) / Pressure (Pa)",
       color = "Variable") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

# Calculate correlation between monthly mean minimum temperature and monthly mean pressure
correlation <- cor(temp_pressure_by_month$Mean_MinTemp, temp_pressure_by_month$Mean_Pressure)
print(paste("Correlation between minimum temperature and pressure:", round(correlation, 2)))
## [1] "Correlation between minimum temperature and pressure: 0.27"

The plots show the mean minimum and maximum temperatures on Mars, along with the mean pressure, across Martian months. The weak positive correlation (0.27) between monthly mean minimum temperature and monthly mean pressure indicates that pressure tends to be slightly higher in warmer months, but the relationship is not strong. Because it is computed from only 12 monthly means, the estimate should be read as indicative rather than conclusive. Understanding these patterns is essential for planning and predicting environmental conditions on Mars.
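A correlation of monthly means can differ from the correlation across individual sols, because averaging removes day-to-day scatter. The sketch below compares the two on hypothetical readings (`toy`, made-up values) and uses cor.test() from base R to attach a confidence interval to the daily estimate. It is an illustration of the comparison, not a result from the Mars data.

```r
library(dplyr)

# Hypothetical daily readings; illustrative values only.
toy <- tibble(
  month    = rep(c("Month 1", "Month 2", "Month 3"), each = 4),
  min_temp = c(-77, -78, -76, NA, -80, -79, -81, -80, -83, -84, -82, -83),
  pressure = c(860, 862, 865, 861, 888, NA, 890, 889, 876, 878, 877, 875)
)

# Correlation across individual sols, dropping incomplete pairs
daily_cor <- cor(toy$min_temp, toy$pressure, use = "complete.obs")

# Correlation across monthly means, mirroring the aggregation used above
monthly <- toy %>%
  group_by(month) %>%
  summarise(
    Mean_MinTemp  = mean(min_temp, na.rm = TRUE),
    Mean_Pressure = mean(pressure, na.rm = TRUE),
    .groups = "drop"
  )
monthly_cor <- cor(monthly$Mean_MinTemp, monthly$Mean_Pressure)

# cor.test() adds a confidence interval and p-value for the daily estimate
daily_test <- cor.test(toy$min_temp, toy$pressure)

print(c(daily = daily_cor, monthly = monthly_cor))
print(daily_test$conf.int)
```

Reporting both figures, together with the interval, makes it clearer how much weight the 12-point monthly correlation can bear.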

Aggregation: What is the relationship between atmospheric pressure and temperature?

# Aggregate mean pressure and temperature by Martian month
library(dplyr)

pressure_temp_by_month <- mars_data %>%
  group_by(month) %>%
  summarise(
    Mean_Pressure = mean(pressure, na.rm = TRUE),
    Mean_Min_Temp = mean(min_temp, na.rm = TRUE)
  )

# Print the result
print(pressure_temp_by_month)
## # A tibble: 12 × 3
##    month    Mean_Pressure Mean_Min_Temp
##    <chr>            <dbl>         <dbl>
##  1 Month 1           862.         -77.2
##  2 Month 10          887.         -72.0
##  3 Month 11          857.         -72.0
##  4 Month 12          842.         -74.5
##  5 Month 2           889.         -79.9
##  6 Month 3           877.         -83.3
##  7 Month 4           806.         -82.7
##  8 Month 5           749.         -79.3
##  9 Month 6           745.         -75.3
## 10 Month 7           795.         -72.3
## 11 Month 8           874.         -68.4
## 12 Month 9           913.         -69.2
pressure_temp_by_month <- pressure_temp_by_month %>%
  mutate(
    Norm_Pressure = (Mean_Pressure - min(Mean_Pressure)) / (max(Mean_Pressure) - min(Mean_Pressure)),
    Norm_Min_Temp = (Mean_Min_Temp - min(Mean_Min_Temp)) / (max(Mean_Min_Temp) - min(Mean_Min_Temp))
  )

# Order months numerically so the M1 to M12 axis labels match the data
ggplot(pressure_temp_by_month, aes(x = factor(month, levels = paste("Month", 1:12)))) +
  geom_line(aes(y = Norm_Pressure, color = "Pressure"), group = 1, linewidth = 1.2) +
  geom_line(aes(y = Norm_Min_Temp, color = "Min Temp"), group = 1, linewidth = 1.2) +
  scale_x_discrete(labels = paste0("M", 1:12)) +
  scale_color_manual(values = c("Pressure" = "green", "Min Temp" = "blue")) +
  labs(title = "Relationship Between Atmospheric Pressure and Temperature",
       x = "Martian Month", y = "Normalized Values", color = "Variable") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 30, hjust = 1))

Insight: Aggregating minimum and maximum temperatures by Martian month reveals seasonal trends. Month 3 shows the coldest mean temperatures (about -83°C minimum and -28°C maximum), while Month 8 has the warmest mean minimum (about -68°C). These findings help us understand how temperatures change with Martian seasons, which can inform planning for seasonal activities or missions on Mars.
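Because month is stored as text, it sorts alphabetically, which is why the grouped tables above list Month 10 directly after Month 1. The sketch below shows two common ways to recover calendar order. The vector `months_chr` is a hypothetical subset of the month labels.

```r
library(readr)

# Hypothetical subset of month labels, in the alphabetical order group_by() reports
months_chr <- c("Month 1", "Month 10", "Month 11", "Month 12", "Month 2", "Month 3")

# Option 1: extract the month number and sort numerically
month_num <- parse_number(months_chr)
months_chr[order(month_num)]

# Option 2: a factor with explicit levels keeps tables and plot axes in calendar order
month_fct <- factor(months_chr, levels = paste("Month", 1:12))
levels(droplevels(month_fct))
```

The factor approach is usually the more convenient of the two for ggplot2, since discrete axes follow the level order.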

Visual Summaries

library(dplyr)
# Clean data set by removing rows with missing values
cleaned_data <- mars_data %>%
  filter(!is.na(min_temp) & !is.na(pressure) & !is.na(atmo_opacity))

# Display summary statistics
summary(cleaned_data)
##        id         terrestrial_date          sol               ls       
##  Min.   :   2.0   Min.   :2012-08-16   Min.   :  10.0   Min.   :  0.0  
##  1st Qu.: 489.5   1st Qu.:2014-02-18   1st Qu.: 546.5   1st Qu.: 78.0  
##  Median : 959.0   Median :2015-06-28   Median :1028.0   Median :160.0  
##  Mean   : 955.6   Mean   :2015-06-15   Mean   :1015.7   Mean   :168.9  
##  3rd Qu.:1425.5   3rd Qu.:2016-10-30   3rd Qu.:1505.5   3rd Qu.:257.5  
##  Max.   :1895.0   Max.   :2018-02-27   Max.   :1977.0   Max.   :359.0  
##                                                                        
##     month              min_temp         max_temp         pressure    
##  Length:1867        Min.   :-90.00   Min.   :-35.00   Min.   :727.0  
##  Class :character   1st Qu.:-80.00   1st Qu.:-23.00   1st Qu.:800.0  
##  Mode  :character   Median :-76.00   Median :-11.00   Median :853.0  
##                     Mean   :-76.12   Mean   :-12.51   Mean   :841.1  
##                     3rd Qu.:-72.00   3rd Qu.: -3.00   3rd Qu.:883.0  
##                     Max.   :-62.00   Max.   : 11.00   Max.   :925.0  
##                                                                      
##    wind_speed   atmo_opacity      
##  Min.   : NA    Length:1867       
##  1st Qu.: NA    Class :character  
##  Median : NA    Mode  :character  
##  Mean   :NaN                      
##  3rd Qu.: NA                      
##  Max.   : NA                      
##  NA's   :1867
# Print the first few rows
print(head(cleaned_data))
## # A tibble: 6 × 10
##      id terrestrial_date   sol    ls month min_temp max_temp pressure wind_speed
##   <dbl> <date>           <dbl> <dbl> <chr>    <dbl>    <dbl>    <dbl>      <dbl>
## 1  1895 2018-02-27        1977   135 Mont…      -77      -10      727        NaN
## 2  1893 2018-02-26        1976   135 Mont…      -77      -10      728        NaN
## 3  1894 2018-02-25        1975   134 Mont…      -76      -16      729        NaN
## 4  1892 2018-02-24        1974   134 Mont…      -77      -13      729        NaN
## 5  1889 2018-02-23        1973   133 Mont…      -78      -18      730        NaN
## 6  1891 2018-02-22        1972   133 Mont…      -78      -14      730        NaN
## # ℹ 1 more variable: atmo_opacity <chr>

Box plot

Box plot: Minimum Temperatures by Martian Month

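library(ggplot2)  # Ensure ggplot2 is loaded

# Order months numerically; as plain text they would sort alphabetically
ggplot(cleaned_data) +
  geom_boxplot(aes(x = factor(month, levels = paste("Month", 1:12)), y = min_temp,
                   fill = factor(month, levels = paste("Month", 1:12)))) +
  ggtitle("Minimum Temperatures by Martian Month") +
  labs(x = "Martian Month", y = "Minimum Temperature (°C)", fill = "Martian Month") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))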

Insight: The box plot shows that minimum temperatures vary noticeably across months. For example, Month 3 experiences the coldest temperatures, while Month 4 has greater variability. This variability could indicate transitional weather patterns, which are crucial to consider for long-term exploration missions.
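Statements about variability read from a box plot can be checked numerically by computing a spread measure per month. The sketch below uses the interquartile range and standard deviation on hypothetical readings for two months (`toy`, made-up values), so the numbers it produces say nothing about the real data.

```r
library(dplyr)

# Hypothetical readings for two months; illustrative values only.
toy <- tibble(
  month    = rep(c("Month 3", "Month 4"), each = 5),
  min_temp = c(-84, -83, -83, -82, -84, -88, -80, -85, -78, -82)
)

spread_by_month <- toy %>%
  group_by(month) %>%
  summarise(
    Median_MinTemp = median(min_temp, na.rm = TRUE),
    IQR_MinTemp    = IQR(min_temp, na.rm = TRUE),   # height of the box
    SD_MinTemp     = sd(min_temp, na.rm = TRUE),
    .groups = "drop"
  ) %>%
  arrange(desc(IQR_MinTemp))   # most variable month first

print(spread_by_month)
```

Running the same summary on cleaned_data would show whether Month 4 really has the widest spread or whether another month does.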

Line Plot

Line Plot: Atmospheric Pressure Over Time

library(ggplot2)  # Load ggplot2

ggplot(cleaned_data, aes(x = terrestrial_date, y = pressure)) +
  geom_line(color = "blue") +
  labs(
    title = "Atmospheric Pressure on Mars Over Time",
    x = "Earth Date",
    y = "Atmospheric Pressure (Pa)"
  ) +
  theme_minimal()  # Apply the minimal theme

Insight: The line plot shows how atmospheric pressure changes over time on Mars. It highlights periods of high and low pressure, which could be linked to Martian seasons or other patterns. This helps in understanding how pressure varies and its impact on Mars missions.
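To test whether the pressure swings follow the Martian seasons, pressure can be averaged by solar longitude rather than by Earth date. The sketch below bins ls into 30° spans, on the assumption that these line up with the twelve Martian months. The table `toy` holds hypothetical values, and the bin width is a choice rather than something taken from the data set.

```r
library(dplyr)

# Hypothetical readings; illustrative values only.
toy <- tibble(
  ls       = c(5, 20, 95, 110, 135, 200, 250, 359),
  pressure = c(860, 864, 806, 800, 745, 874, 910, 850)
)

pressure_by_ls <- toy %>%
  mutate(
    # 30-degree bins: [0,30), [30,60), ..., [330,360)
    ls_bin = cut(ls, breaks = seq(0, 360, by = 30), right = FALSE)
  ) %>%
  group_by(ls_bin) %>%
  summarise(
    n = n(),
    Mean_Pressure = mean(pressure, na.rm = TRUE),
    .groups = "drop"
  )

print(pressure_by_ls)
```

If pressure is seasonal, readings from different Martian years should land in the same bins with similar means, which a date-based line plot cannot show directly.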

Scatter plot

Scatter Plot: Pressure vs Minimum Temperature

library(ggplot2)

# Order the month legend numerically rather than alphabetically
ggplot(cleaned_data, aes(x = pressure, y = min_temp,
                         color = factor(month, levels = paste("Month", 1:12)))) +
  geom_point(alpha = 0.7, size = 2) +
  labs(
    title = "Scatter Plot: Atmospheric Pressure vs Minimum Temperature",
    x = "Atmospheric Pressure (Pa)",
    y = "Minimum Temperature (\u00B0C)",  # \u00B0 is the degree sign
    color = "Martian Month"
  ) +
  theme_minimal()

Insight: The scatter plot suggests a positive association between atmospheric pressure and minimum temperature, with higher pressures generally accompanying warmer temperatures. The association is not strong, however: the correlation between the monthly means computed earlier is only 0.27. Further analysis of outliers in this relationship could reveal extreme weather events or anomalies.
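One simple way to start the outlier analysis is to fit a straight line and flag points with large standardized residuals. The sketch below does this with base R on hypothetical readings (`toy`, made-up values with one deliberately unusual point). The cut-off of 2 is a common rule of thumb, not a threshold taken from this analysis.

```r
# Hypothetical readings; the sixth point is deliberately unusual.
toy <- data.frame(
  pressure = c(730, 750, 800, 820, 850, 870, 900, 920),
  min_temp = c(-80, -79, -78, -77, -76, -62, -74, -73)
)

# Linear fit of minimum temperature on pressure
fit <- lm(min_temp ~ pressure, data = toy)

# Standardized residuals; |value| > 2 is a rule of thumb, not a fixed threshold
toy$std_resid <- rstandard(fit)
outliers <- toy[abs(toy$std_resid) > 2, ]

print(coef(fit))
print(outliers)
```

On the real data, the flagged rows could then be matched against terrestrial_date and sol to see whether they cluster around particular periods.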