RMDA Group Formative

By Eleanor Salisbury, Joss Sibbering, Katie Prange, Renad Elrayes, Becca Hageman, Rory McCloskey.

Setting up

Load packages

library(knitr)
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Load data

mosquitos <- read_tsv("mosquitos.txt")
Rows: 100 Columns: 3
── Column specification ────────────────────────────────────────────────────────
Delimiter: "\t"
chr (1): sex
dbl (2): ID, wing

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Kable presentation

kable(head(mosquitos))
ID wing sex
1 37.83925 f
2 50.63106 f
3 39.25539 f
4 38.05383 f
5 25.15835 f
6 57.95632 f

Research Question:
How does the sex of mosquitoes influence their wingspan?

Exploring average wingspan by sex

mosquitos %>% # Start with the mosquitos dataset and pipe it into the next step
  group_by(sex) %>% # Group the mosquito data by the 'sex' column to perform operations separately for males and females
  summarise(wingmean=mean(wing)) %>% # Calculate the average wing length for each group and create a new column called 'wingmean'
  ungroup()
# A tibble: 2 × 2
  sex   wingmean
  <chr>    <dbl>
1 f         47.2
2 m         50.4
  • The means suggest that males have larger wings than females.

Exploring range of wingspan by sex

mosquitos %>% # Start with the mosquitos dataset and pipe it into the next step
  group_by(sex) %>% # Group the mosquito data by the 'sex' column to perform operations separately for males and females
  summarise(wingdifference=max(wing)-min(wing)) %>%  # Calculate the difference between the maximum and minimum wing length for each sex and create a new column called 'wingdifference'
  ungroup()
# A tibble: 2 × 2
  sex   wingdifference
  <chr>          <dbl>
1 f               44.7
2 m               38.7
  • Females appear to have a bigger range than males.
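
The mean and range summaries above can also be produced in a single pass, together with the group size and standard deviation. The following is a minimal sketch, not part of the original analysis: the helper name `summarise_wing_by_sex` and the columns `n`, `wingsd` and `wingrange` are hypothetical names chosen for illustration, and the function assumes a data frame with `sex` and `wing` columns like `mosquitos`.

```r
library(dplyr)

# Hypothetical helper (not part of the original analysis): summarise the
# numeric column `wing` for each level of `sex` in one pass.
summarise_wing_by_sex <- function(data) {
  data %>%
    group_by(sex) %>%
    summarise(
      n         = n(),                    # number of mosquitos per sex
      wingmean  = mean(wing),             # average wingspan
      wingsd    = sd(wing),               # standard deviation of wingspan
      wingrange = max(wing) - min(wing),  # max minus min, as in 'wingdifference'
      .groups   = "drop"                  # return an ungrouped tibble
    )
}

# Usage with the data loaded earlier:
# summarise_wing_by_sex(mosquitos)
```

The range depends only on the two most extreme values in each group, so reporting the standard deviation alongside it gives a measure of spread that is less sensitive to a single unusual mosquito.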

Exploring data to visualise the wing size distribution

mosquitos %>% # Start with the mosquitos dataset and pipe it into ggplot
  ggplot(aes(x=wing, # Set the x-axis to the 'wing' variable
             fill=sex))+ # Use 'sex' to fill the density curves with different colours
  geom_density(alpha=0.7)+ # Add a density plot layer with a transparency level of 0.7
  labs(x="Wingspan/mm",
       y="Density",
       fill="Sex") # Relabel the axes and the legend; the legend belongs to the 'fill' aesthetic

Fig. 1: A density plot showing how the wingspan of mosquitos is distributed for males and females.

Exploring median and dispersion of data by creating a boxplot

mosquitos %>% # Start with the mosquitos dataset and pipe it into ggplot
  ggplot(aes(x=sex, # Set the x-axis to the 'sex' variable
             y=wing, # Set the y-axis to the 'wing' variable
             color=sex))+ # Colour the boxplot outlines by 'sex'
  geom_boxplot()+ # Add a boxplot layer to visualise the distribution of wing sizes
  labs(y="Wingspan/mm",
       x="Sex",
       color="Sex") # Relabel the axes and the legend; the legend belongs to the 'color' aesthetic

Fig. 2: A box plot showing how the wingspan of mosquitos is distributed for males and females.

Summarising the data

summary(lm(wing~sex, data=mosquitos)) # Fit a linear model to examine the relationship between wing size and sex

Call:
lm(formula = wing ~ sex, data = mosquitos)

Residuals:
     Min       1Q   Median       3Q      Max 
-22.9515  -7.2651  -0.0677   6.2607  22.6409 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   47.177      1.357  34.771   <2e-16 ***
sexm           3.202      1.919   1.669   0.0984 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 9.594 on 98 degrees of freedom
Multiple R-squared:  0.02762,   Adjusted R-squared:  0.0177 
F-statistic: 2.784 on 1 and 98 DF,  p-value: 0.09839

Results

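The linear model, which for a single two-level predictor is equivalent to a two-sample t-test assuming equal variances, estimates that males have a mean wingspan 3.2 mm larger than females (t = 1.669, 98 df, p = 0.098). The difference is therefore not statistically significant at the 0.05 level.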
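
The test statistic and p-value can be rechecked by hand from the `sexm` row of the coefficient table in the model summary. The sketch below uses only base R and the rounded values printed there; the variable names are illustrative, and the 95% confidence interval is an addition that the original output does not report.

```r
# Values copied from the `sexm` row of the lm() coefficient table
estimate  <- 3.202   # estimated male minus female difference in mean wingspan
std_error <- 1.919   # standard error of that difference
df_resid  <- 98      # residual degrees of freedom

# t statistic: the estimate divided by its standard error
t_value <- estimate / std_error

# Two-sided p-value from the t distribution with 98 degrees of freedom
p_value <- 2 * pt(abs(t_value), df = df_resid, lower.tail = FALSE)

# 95% confidence interval for the difference (not shown in the original output)
ci <- estimate + c(-1, 1) * qt(0.975, df = df_resid) * std_error

round(c(t = t_value, p = p_value, lower = ci[1], upper = ci[2]), 3)
```

Because the inputs are rounded, the results should agree with the printed t value and p-value only to about three decimal places. A confidence interval that includes zero is consistent with the non-significant result.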

Conclusion

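We found no statistically significant effect of sex on wingspan. The means suggest that males have slightly larger wings (50.4 vs 47.2), whereas females show a wider range of wingspans (44.7 vs 38.7), but this sample does not provide enough evidence to conclude that the sexes differ.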