Mosquitoes

This workbook consists of work done by Qiuyang Zhang, Eboni Lucian Patreace Senior, Chase Lilly and Jal Dipamkumar Vashi.

Mosquitoes (family Culicidae) comprise 41 known genera and around 3,500 species. They act as vectors of the pathogens that cause diseases such as malaria, yellow fever and dengue (Foster and Walker, 2019). Wing lengths of male mosquitoes have been reported to range between 2.11 mm and 2.48 mm (Hatala et al., 2018).

Libraries used

# vtable is used to generate a formatted table describing the variables in a given data set.
library(vtable)
Loading required package: kableExtra
# ggplot2 is used to create plots like scatter plots, line plots and many more based on a given data set.
library(ggplot2)

# dplyr is used to transform data frames, chaining steps together with the pipe operator (%>%).
library(dplyr)

Attaching package: 'dplyr'
The following object is masked from 'package:kableExtra':

    group_rows
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Importing data into RStudio

# The data set must be saved in .csv format so that read.csv() can import it.
mosquito_df <- read.csv("~/Desktop/MRes/Research Methods/Formative/mosquitos.csv",
                        header = TRUE)
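
Because `sex` is read in as a character column, it can help to check the import and convert that column to a factor before grouping or plotting. The sketch below is a minimal illustration on a small made-up data frame: `toy_df` and its values are hypothetical and are not the workbook's data. It only assumes the same three columns, ID, wing and sex.

```r
# Hypothetical stand-in for the imported data (values invented for illustration)
toy_df <- data.frame(
  ID   = 1:6,
  wing = c(37.8, 50.6, NA, 38.1, 52.0, 46.4),
  sex  = c("f", "f", "f", "m", "m", "m"),
  stringsAsFactors = FALSE
)

# Count the missing values in each column
missing_per_column <- colSums(is.na(toy_df))
print(missing_per_column)

# Treat sex as a categorical variable with explicit levels
toy_df$sex <- factor(toy_df$sex, levels = c("f", "m"))

# Number of observations in each group
sex_counts <- table(toy_df$sex)
print(sex_counts)
```

A check like this shows early whether any rows would be dropped by a later `na.omit()` and whether the groups are balanced.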

Finding the summary

# summary() summarises each column, reporting statistics such as the minimum, quartiles, median, mean and maximum.
summary(mosquito_df)
       ID              wing           sex           
 Min.   :  1.00   Min.   :25.16   Length:100        
 1st Qu.: 25.75   1st Qu.:41.42   Class :character  
 Median : 50.50   Median :48.42   Mode  :character  
 Mean   : 50.50   Mean   :48.78                     
 3rd Qu.: 75.25   3rd Qu.:56.24                     
 Max.   :100.00   Max.   :69.82                     
# vtable() produces a formatted table listing each variable's name, class and range of values.
vtable(mosquito_df)
mosquito_df
Name   Class      Values
ID     integer    Num: 1 to 100
wing   numeric    Num: 25.158 to 69.818
sex    character
# summary_stats groups the data frame by sex and calculates statistics such as the mean, standard deviation and median of wing for each group.
summary_stats <- mosquito_df %>%
  group_by(sex) %>%
  summarise(
    mean_wing = mean(wing),
    sd_wing = sd(wing),
    median_wing = median(wing),  
    var_wing = var(wing),        
    min_wing = min(wing),
    max_wing = max(wing)
  )

print(summary_stats)
# A tibble: 2 × 7
  sex   mean_wing sd_wing median_wing var_wing min_wing max_wing
  <chr>     <dbl>   <dbl>       <dbl>    <dbl>    <dbl>    <dbl>
1 f          47.2    9.99        46.4     99.7     25.2     69.8
2 m          50.4    9.19        52.0     84.4     27.4     66.1
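
When comparing group means, the number of observations and the standard error of each mean are useful additions to a summary like the one above. The following is a minimal sketch, assuming dplyr is installed; `toy_df` and its values are invented for illustration, and the column names `n_obs` and `se_wing` are hypothetical additions rather than part of the original summary.

```r
library(dplyr)

# Hypothetical data with the same column names as mosquito_df
toy_df <- data.frame(
  wing = c(40, 44, 48, 52, 46, 50, 54, 58),
  sex  = rep(c("f", "m"), each = 4)
)

group_summary <- toy_df %>%
  group_by(sex) %>%
  summarise(
    n_obs     = n(),                     # observations per group
    mean_wing = mean(wing),
    sd_wing   = sd(wing),
    se_wing   = sd_wing / sqrt(n_obs),   # standard error of the mean
    .groups   = "drop"
  )

print(group_summary)
```

The standard error shrinks as the group size grows, so it gives a rough sense of how precisely each group mean is estimated.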

Finding the data type

# sapply() applies class() to every column to identify its data type.
sapply(mosquito_df, class)
         ID        wing         sex 
  "integer"   "numeric" "character" 

Finding the structure of the data

# str() shows the structure of the data: the number of rows and columns, and the type and first values of each column.
str(mosquito_df)
'data.frame':   100 obs. of  3 variables:
 $ ID  : int  1 2 3 4 5 6 7 8 9 10 ...
 $ wing: num  37.8 50.6 39.3 38.1 25.2 ...
 $ sex : chr  "f" "f" "f" "f" ...

Representing the data as a bar chart

# A bar chart is used to show the number of observations in each group of a data frame.
# mosquito_df %>% passes the mosquito data frame to ggplot() as its data argument.

mosquito_df %>%
  ggplot(aes(x = sex, color = sex, fill = sex)) +
  geom_bar(alpha = 0.5) +
  labs(x = "Sex", 
       y = "Count", 
       fill = "Sex", 
       color = "Sex", 
       title = "Mosquito count by sex") +
  theme_minimal() +
  theme(axis.text = element_text(size = 14),
        axis.title = element_text(size = 14))

Representing the data as a Boxplot

# Box plots show the distribution of a numerical variable and allow it to be compared between groups.

mosquito_df %>%
  na.omit() %>%
  ggplot(aes(x = sex, y = wing)) +
  # One box per sex; colours are fixed rather than mapped to the continuous wing variable
  geom_boxplot(color = "blue",
               fill = "lightblue",
               alpha = 0.7) +
  geom_jitter() +
  labs(x = "Sex",
       y = "Wing",
       title = "Wing span depending on Sex") +
  theme(axis.text = element_text(size = 14),
        axis.title = element_text(size = 14))
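
To read a box plot, it helps to know what the box and whiskers represent. By default, `geom_boxplot()` draws the box from the first to the third quartile and extends the whiskers to the most extreme points within 1.5 times the interquartile range (IQR) of the box. The sketch below reproduces those quantities in base R; the `wing` vector is invented for illustration and is not taken from the workbook's data.

```r
# Hypothetical wing measurements for one group (invented values)
wing <- c(25, 38, 41, 44, 46, 48, 50, 53, 56, 90)

# Quartiles and interquartile range
q   <- quantile(wing, probs = c(0.25, 0.5, 0.75))
iqr <- IQR(wing)

# Points further than 1.5 * IQR from the box are drawn individually as outliers
lower_fence <- q[["25%"]] - 1.5 * iqr
upper_fence <- q[["75%"]] + 1.5 * iqr
outliers <- wing[wing < lower_fence | wing > upper_fence]

print(q)
print(iqr)
print(outliers)
```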

Representing the data as a Histogram

# A histogram summarises continuous data by counting the observations that fall into intervals (bins) of equal width.

mosquito_df %>%
  ggplot(aes(x = wing, color = sex, fill = sex)) +
  geom_histogram(alpha = 0.5, bins = 30) +
  labs(x = "Wing Length", 
       y = "Count", 
       fill = "Sex", 
       color = "Sex", 
       title = "Wing Length Distribution by Sex") +
  theme_minimal() +  
  theme(axis.text = element_text(size = 14),
        axis.title = element_text(size = 14))
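
The histogram above fixes `bins = 30`. With 100 observations split across two groups, that many bins can leave individual bars sparse. As a rough guide, base R provides rule-of-thumb bin counts through `nclass.Sturges()` and `nclass.FD()`. The sketch below applies them to simulated values; the `wing` vector here is generated with `rnorm()` purely for illustration and is not the workbook's data.

```r
# Simulated wing values (not real data), 100 observations
set.seed(1)
wing <- rnorm(100, mean = 48, sd = 10)

# Sturges' rule: ceiling(log2(n) + 1)
bins_sturges <- nclass.Sturges(wing)

# Freedman-Diaconis rule: bin width based on the IQR and sample size
bins_fd <- nclass.FD(wing)

print(bins_sturges)
print(bins_fd)
```

Either value could be passed to the `bins` argument of `geom_histogram()` as a starting point and then adjusted by eye.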

Representing the data as a Density plot

# A density plot shows a smoothed estimate of the distribution of a numerical variable in a data set.

mosquito_df %>%
  ggplot(aes(x = wing, 
             color = sex,
             fill = sex)) + 
  geom_density(trim = FALSE, 
               color = "darkgrey",
               alpha = 0.5) +
  labs(x = "Wing",
       y = "Density",
       title = "Density plot of wing size measures by sex") +
  theme_minimal()

Formulation of a Question

After examining the graphs above, a question was formulated: is there a link between the sex of adult mosquitoes and their wing span? To explore this, wing span was plotted against sex.

Representing the correlation between wing span and sex

# Correlation is used to assess the strength of a linear relationship between a pair of variables.

mosquito_df %>%
  ggplot(aes(x = sex, y = wing)) +
  geom_point(color = "blue") +
  labs(x = "Sex",
       y = "Wing",
       title = "Wing span depending on Sex") +
  theme_minimal() +
  theme(axis.text = element_text(size = 14),
        axis.title = element_text(size = 14))

As seen from the graph, there is no clear relationship between wing span and sex. The wing span of both females and males ranges from roughly 25 mm to 70 mm, and the group means (47.2 for females, 50.4 for males) differ by much less than the standard deviation within each group (about 10 and 9).
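
A visual impression like this can be checked numerically. One possible approach, sketched below, is a Welch two-sample t-test on the group means together with a point-biserial correlation (Pearson's r with sex coded as 0 and 1). This is an illustration under stated assumptions, not an analysis performed in the workbook: `toy_df` is simulated with `rnorm()`, using means and standard deviations loosely based on the summary table, and no result from it should be read as a finding about the real data.

```r
# Simulated data (not the workbook's measurements)
set.seed(42)
toy_df <- data.frame(
  sex  = rep(c("f", "m"), each = 50),
  wing = c(rnorm(50, mean = 47, sd = 10),
           rnorm(50, mean = 50, sd = 9))
)

# Welch two-sample t-test: do the mean wing values differ between sexes?
welch <- t.test(wing ~ sex, data = toy_df)

# Point-biserial correlation: Pearson's r with sex coded 0 (f) and 1 (m)
sex_code <- as.numeric(toy_df$sex == "m")
r_pb <- cor(sex_code, toy_df$wing)

print(welch$p.value)
print(r_pb)
```

Replacing `toy_df` with `mosquito_df` would apply the same checks to the imported data.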

References

  • Foster, W.A. and Walker, E.D., 2019. Mosquitoes (Culicidae). In Medical and Veterinary Entomology (pp. 261-325). Academic Press.
  • Hatala, A.J., Harrington, L.C. and Degner, E.C., 2018. Age and body size influence sperm quantity in male Aedes albopictus (Diptera: Culicidae) mosquitoes. Journal of Medical Entomology, 55(4), pp.1051-1054.