Formative 1

Author

Weronika Staniak

Published

October 23, 2024

Data Exploration

Aim: To produce an annotated R script to explore the mosquito data set.

Proposed Question:

Is there a significant difference in wing length between male and female mosquitoes?

Graphical Visualization:

To visualize these data, we can use either:

  • Box plot: Useful for comparing the median, spread, and outliers of wing length between the sexes.

  • Density plot: Provides a clearer view of the shape of each distribution.

Tests that can be used:

  • T-test: A two-sample t-test can be used to compare the mean wing length of males and females and to test whether the difference is statistically significant.

    Assumptions: The data in both groups should be approximately normally distributed, and the variances of the two groups should be equal. If the variances differ, Welch's t-test (the default in R's t.test()) can be used instead.
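The assumption checks and the test itself can be sketched in base R as follows. This is a minimal illustration, not part of the original script: the helper name check_and_test, the 0.05 cut-off for choosing between the pooled and Welch versions, and the simulated demo_data are all assumptions made so that the example runs on its own.

```r
# Hypothetical helper (not part of the original script): checks the two
# t-test assumptions and then runs the test on a data frame that has the
# columns `wing` (numeric) and `sex` (two groups, e.g. "f" and "m").
check_and_test <- function(data) {
  # Shapiro-Wilk normality test within each sex
  normality <- tapply(data$wing, data$sex,
                      function(x) shapiro.test(x)$p.value)

  # F test for equality of the two group variances
  variance <- var.test(wing ~ sex, data = data)

  # Pooled t-test if the variances look equal, Welch's t-test otherwise
  # (the 0.05 cut-off is a conventional choice, assumed here)
  equal_var <- variance$p.value > 0.05
  result <- t.test(wing ~ sex, data = data, var.equal = equal_var)

  list(normality_p = normality,
       variance_p  = variance$p.value,
       equal_var   = equal_var,
       t_test      = result)
}

# Simulated stand-in data so the sketch runs on its own;
# these are NOT the real mosquito measurements
set.seed(1)
demo_data <- data.frame(
  wing = c(rnorm(50, mean = 47, sd = 10), rnorm(50, mean = 50, sd = 10)),
  sex  = rep(c("f", "m"), each = 50)
)

demo_result <- check_and_test(demo_data)
demo_result$t_test
```

With the real data, the same helper could be called as check_and_test(mosquito_data) once the file has been read in.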

Data and Code

Loading Data Set

# Load packages for plotting (ggplot2) and data manipulation (dplyr)
library(ggplot2)
library(dplyr)

Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union
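# Read the tab-separated data file; the first row holds the column names
mosquito_data <- read.table("mosquitos.txt", sep = "\t", header = TRUE)

# Preview the first six rows
head(mosquito_data)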
  ID     wing sex
1  1 37.83925   f
2  2 50.63106   f
3  3 39.25539   f
4  4 38.05383   f
5  5 25.15835   f
6  6 57.95632   f
# Summary statistics for wing length by sex
mosquito_data %>%
  group_by(sex) %>%
  summarise(
    count = n(),
    mean_wing = mean(wing),
    median_wing = median(wing),
    sd_wing = sd(wing)
  )
# A tibble: 2 × 5
  sex   count mean_wing median_wing sd_wing
  <chr> <int>     <dbl>       <dbl>   <dbl>
1 f        50      47.2        46.4    9.99
2 m        50      50.4        52.0    9.19
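As a rough worked example, the pooled two-sample t statistic can be computed by hand from the summary table above. This sketch uses the rounded table values, so it only approximates what t.test() would return on the raw data, and it assumes equal variances; the variable names are illustrative.

```r
# Rounded summary values copied from the table above
n_f <- 50; mean_f <- 47.2; sd_f <- 9.99
n_m <- 50; mean_m <- 50.4; sd_m <- 9.19

# Pooled standard deviation (assumes equal variances in both groups)
sd_pooled <- sqrt(((n_f - 1) * sd_f^2 + (n_m - 1) * sd_m^2) /
                    (n_f + n_m - 2))

# Standard error of the difference in means
se_diff <- sd_pooled * sqrt(1 / n_f + 1 / n_m)

# t statistic, degrees of freedom, and two-sided p-value
t_stat   <- (mean_m - mean_f) / se_diff
deg_free <- n_f + n_m - 2
p_value  <- 2 * pt(-abs(t_stat), deg_free)

round(c(t = t_stat, df = deg_free, p = p_value), 3)
```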

Code for graphical visualization

Box plot:

# Boxplot for wing lengths by sex
ggplot(mosquito_data, aes(x = sex, y = wing, fill = sex)) +
  geom_boxplot() +
  labs(title = "Wing Length by Sex", x = "Sex", y = "Wing Length (mm)") +
  theme_minimal()

Density Plot:

# Density plot for wing lengths by sex
ggplot(mosquito_data, aes(x = wing, fill = sex)) +
  geom_density(alpha = 0.5) +
  labs(title = "Density Plot of Wing Length by Sex", x = "Wing Length (mm)") +
  theme_minimal()
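Because the t-test assumes approximately normal data within each group, a normal Q-Q plot per sex is one possible visual check to sit alongside the plots above. The sketch below is an addition rather than part of the original script, and it uses simulated stand-in data (demo_data) so that it runs on its own; with the real data, mosquito_data would take its place.

```r
library(ggplot2)

# Simulated stand-in data; these are NOT the real mosquito measurements
set.seed(1)
demo_data <- data.frame(
  wing = c(rnorm(50, mean = 47, sd = 10), rnorm(50, mean = 50, sd = 10)),
  sex  = rep(c("f", "m"), each = 50)
)

# Normal Q-Q plot of wing length, one panel per sex;
# points close to the reference line suggest approximate normality
qq_plot <- ggplot(demo_data, aes(sample = wing, colour = sex)) +
  stat_qq() +
  stat_qq_line() +
  facet_wrap(~ sex) +
  labs(title = "Normal Q-Q Plot of Wing Length by Sex",
       x = "Theoretical Quantiles", y = "Wing Length (mm)") +
  theme_minimal()

qq_plot
```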