NAME: Adriana Cortazar-Sosa

Load in the required package

library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr     1.2.1     ✔ readr     2.2.0
## ✔ forcats   1.0.1     ✔ stringr   1.6.0
## ✔ ggplot2   4.0.3     ✔ tibble    3.3.1
## ✔ lubridate 1.9.5     ✔ tidyr     1.3.2
## ✔ purrr     1.2.2     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors

Load in the file titled “LabAssignment1_Data”

## file.choose("C:\\Users\\acort\\Downloads\\LabAssignment1_Data.csv")
lab1<-read.csv("C:\\Users\\acort\\Downloads\\LabAssignment1_Data.csv", stringsAsFactors = TRUE)
lab1

For this assignment, you will be exploring pipefish characteristics (length and number of fin rays) across season and pipefish color.

Q1 (3pts)

Compare the length of the pipefish across the two seasons. Report the mean, standard deviation, standard error, and sample size for each season below. Then answer the question.

lab1<-read.csv("C:\\Users\\acort\\Downloads\\LabAssignment1_Data.csv", stringsAsFactors = TRUE)
lab1

#SPRING
Spring_Pipefish<-filter(lab1, Season=="spring")

#mean
mean_spring<-mean(Spring_Pipefish$Pipefish.Length)
#STDV
stdv_spring<-sd(Spring_Pipefish$Pipefish.Length)
#sample size
n_spring<-length(Spring_Pipefish$Pipefish.Length)
#standard error (sd divided by the square root of the sample size)
se_spring<-stdv_spring/sqrt(n_spring)
#WINTER

Winter_Pipefish<-filter(lab1, Season=="winter")
#mean
mean_winter<-mean(Winter_Pipefish$Pipefish.Length)
#STDV
stdv_winter<-sd(Winter_Pipefish$Pipefish.Length)
#sample size
n_winter<-length(Winter_Pipefish$Pipefish.Length)
#standard error (sd divided by the square root of the sample size)
se_winter<-stdv_winter/sqrt(n_winter)
  1. Provide the following summary statistics below (2pts)

Spring
mean = 29.19 sd = 10.78 n = 54 se = 1.47

Winter
mean = 67.00 sd = 15.39 n = 46 se = 2.27
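
As a quick check on the standard errors reported above, the formula SE = sd / sqrt(n) can be applied directly to the reported values. This is only a sketch that starts from the rounded standard deviations in this answer, so it assumes rounding does not change the second decimal place.

```r
# Rounded summary statistics reported above
sd_spring <- 10.78
n_spring <- 54
sd_winter <- 15.39
n_winter <- 46

# Standard error = standard deviation / square root of sample size
se_spring <- sd_spring / sqrt(n_spring)
se_winter <- sd_winter / sqrt(n_winter)

# Round to two decimals to compare with the reported se values
round(c(spring = se_spring, winter = se_winter), 2)
```

Both rounded values agree with the se values listed above (1.47 and 2.27).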

  1. Using your knowledge on location and spread statistics, describe and compare your findings. (1pt)

Winter pipefish are longer on average (mean = 67.00) than spring pipefish (mean = 29.19). The standard deviation is also larger in winter (15.39) than in spring (10.78), so winter lengths are more variable and more spread out around their mean.
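
The per-season statistics could also be computed in one step with a grouped summary, which avoids typing the sample sizes by hand. The sketch below assumes the column names `Season` and `Pipefish.Length` from the lab data, but it runs on a small hypothetical data frame (`toy`), not on the real measurements.

```r
library(dplyr)

# Hypothetical toy data for illustration only; these are not the lab values
toy <- data.frame(
  Season = c("spring", "spring", "spring", "winter", "winter", "winter"),
  Pipefish.Length = c(20, 30, 40, 60, 70, 80)
)

# One row per season with mean, sd, sample size, and standard error
length_summary <- toy %>%
  group_by(Season) %>%
  summarise(
    mean_length = mean(Pipefish.Length),
    sd_length = sd(Pipefish.Length),
    n = n(),
    se_length = sd_length / sqrt(n)
  )

length_summary
```

Replacing `toy` with `lab1` should give the same four statistics for each season as the step-by-step code above.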

Q2 (3pts)

There is speculation that the green and brown pipefish may be genetically different. The number of fin rays could indicate this potential difference. Explore the number of fin rays across the green and brown pipefish, and report the median, mean and sample size below.

lab1<-read.csv("C:\\Users\\acort\\Downloads\\LabAssignment1_Data.csv", stringsAsFactors = TRUE)
lab1
#GREEN
finrays_green<-filter(lab1, Color=="green")
#G-median
median_greenfinrays<-median(finrays_green$Fin.Rays)
#G-mean
mean_greenfinrays<-mean(finrays_green$Fin.Rays)
#G-sample size
n_greenfinrays<-length(finrays_green$Fin.Rays)
#BROWN
finrays_brown<-filter(lab1, Color=="brown")
#B-median 
median_brownfinrays<-median(finrays_brown$Fin.Rays)
#B-mean
mean_brownfinrays<-mean(finrays_brown$Fin.Rays)
#B-sample size
n_brownfinrays<-length(finrays_brown$Fin.Rays)
  1. Provide the following summary statistics below: (2pt)

Green
median: 5 mean: 4.94 n: 64

Brown
median: 8 mean: 7.86 n: 36

  1. Compare the number of finrays of brown and green pipefish. Does there seem to be evidence of genetic differentiation? Also address whether the means and medians seem to match, or whether there seems to be a skew for either color pipefish. (1pt)

Brown pipefish have more fin rays on average (mean = 7.86) than green pipefish (mean = 4.94), which seems to be evidence of genetic differentiation between the two colors. Within each color the mean and median nearly match (brown: 7.86 vs. 8; green: 4.94 vs. 5), so there is close to no skew for either color.
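
One simple way to look at skew is to subtract the median from the mean for each color. The sketch below uses only the rounded values reported in this answer; the data frame name `fin_summary` is made up for illustration.

```r
# Reported summary statistics for fin rays (rounded values from above)
fin_summary <- data.frame(
  Color = c("green", "brown"),
  mean = c(4.94, 7.86),
  median = c(5, 8)
)

# A difference near zero means the mean and median agree (little skew);
# a negative difference means the mean sits slightly below the median
fin_summary$mean_minus_median <- round(fin_summary$mean - fin_summary$median, 2)

fin_summary
```

Both differences are small and negative, which fits the conclusion that neither color shows much skew.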

Q3 (6pts)

You’re studying a population of pipefish across spring and winter seasons. Pipefish are either observed with brown or green coloration. They live in eelgrass habitat, which is an aquatic plant that has long, green blades. Explore the dataset below.

  1. What proportion of pipefish in the study were green? (1pt)
summary(finrays_green)
##    Color       Season   Pipefish.Length    Fin.Rays    
##  brown: 0   spring:43   Min.   :13.64   Min.   :4.000  
##  green:64   winter:21   1st Qu.:24.09   1st Qu.:4.000  
##                         Median :41.82   Median :5.000  
##                         Mean   :41.24   Mean   :4.938  
##                         3rd Qu.:50.91   3rd Qu.:6.000  
##                         Max.   :89.09   Max.   :6.000
p_pipefishgreen<-64/100
  1. What proportion of pipefish were found in the spring? (1pt)
# summary(Spring_Pipefish)
p_springpipefish<-54/100
  1. What proportion of pipefish in the entire study are green and found in the spring? (1pt)
summary(finrays_green$Season)
## spring winter 
##     43     21
p_greeninspring<-43/100
  1. Is pipefish color independent of the season they are found in? Show your work. (3pts)
summary(lab1$Season)
## spring winter 
##     54     46
summary(lab1$Color)
## brown green 
##    36    64
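#proportions
p_greeninspring<-43/100
p_pipefishgreen<-64/100
p_springpipefish<-54/100

#if color and season were independent, P(green and spring) would equal P(green)*P(spring)
p_pipefishgreen*p_springpipefish
## [1] 0.3456
#the observed P(green and spring) is 0.43, not 0.3456, which suggests color is not independent of season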
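
The same comparison can be sketched from a 2x2 table of counts. The counts below come from the summaries in this document (43 green and 11 brown in spring; 21 green in winter), and the 25 brown winter fish is derived as 36 brown in total minus 11 in spring. The object names are made up for illustration.

```r
# Counts of pipefish by color and season, taken from the summaries above
counts <- matrix(
  c(11, 25,
    43, 21),
  nrow = 2, byrow = TRUE,
  dimnames = list(Color = c("brown", "green"),
                  Season = c("spring", "winter"))
)

# Joint proportions: each cell divided by the total of 100 fish
joint <- prop.table(counts)

p_green <- sum(joint["green", ])                # proportion green
p_spring <- sum(joint[, "spring"])              # proportion found in spring
p_green_and_spring <- joint["green", "spring"]  # observed joint proportion

# Joint proportion expected if color and season were independent
expected_if_independent <- p_green * p_spring

c(observed = p_green_and_spring, expected = expected_if_independent)
```

The observed joint proportion is larger than the value expected under independence, so green pipefish appear to be over-represented in spring in this sample.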
  1. Given you are sampling in the spring time, what proportion of brown pipefish might you expect to catch? (2pt)
#summary(Spring_Pipefish$Color)
p_brownpipefish<-11/54
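
This is a conditional proportion: the share of brown fish among spring fish only. As a sketch, it can be read from a table of counts by dividing each column by its own total. The counts come from the summaries in this document, with the 25 brown winter fish derived as 36 minus 11.

```r
# Counts of pipefish by color and season, taken from the summaries above
counts <- matrix(
  c(11, 25,
    43, 21),
  nrow = 2, byrow = TRUE,
  dimnames = list(Color = c("brown", "green"),
                  Season = c("spring", "winter"))
)

# margin = 2 divides each column by its column total,
# giving the proportion of each color within a season
by_season <- prop.table(counts, margin = 2)

p_brown_given_spring <- by_season["brown", "spring"]
p_brown_given_spring
```

The result equals 11/54, or roughly 0.20, so about one in five pipefish caught in spring would be expected to be brown.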

Data Visualization (2pts)

Provide a data visualization and figure caption for the number of green and brown pipefish found in the spring. (2pts)

ggplot(Spring_Pipefish, aes(x=Color, fill=Color))+
  geom_bar()+theme_classic(16)+
  xlab("Color of Pipefish Found in Spring")+
  ylab("Frequency of Pipefish")

Figure 1: Number of brown (11) and green (43) pipefish found in the spring.
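
If the counts should appear on the figure itself, one option is to add a text layer that prints each bar's count. The sketch below rebuilds the spring colors from the counts in the caption (11 brown, 43 green) in a hypothetical data frame called `spring_fish`; with the real data, `Spring_Pipefish` would take its place.

```r
library(ggplot2)

# Hypothetical stand-in for Spring_Pipefish, built from the caption counts
spring_fish <- data.frame(Color = rep(c("brown", "green"), times = c(11, 43)))

spring_plot <- ggplot(spring_fish, aes(x = Color, fill = Color)) +
  geom_bar() +
  # Print the count above each bar, using the same count statistic as geom_bar()
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = -0.5) +
  theme_classic(16) +
  xlab("Color of Pipefish Found in Spring") +
  ylab("Frequency of Pipefish")

spring_plot
```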