282 Lab 2

Heidi Nydam

Preparing for lab:

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.1     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.1
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(ggsci) 
library(patchwork) 
library(palmerpenguins)
library(ggplot2)

Lab 2

Question 1:

irisdata<-iris
head(irisdata)
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa

Question 2: Make a histogram of Sepal.Length that compares distributions for all 3 species in the same graph. Note that color= changes the color of lines and fill= changes the color of the fill!

ggplot(data=irisdata, aes(Sepal.Length))+
    geom_histogram(aes(color=Species, fill=Species))
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

#At first I was not sure why color= and fill= weren’t changing anything here. I tried to get the colors assigned automatically by species with color=species, but that didn’t work. Update: I figured it out!
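One way the three distributions might be easier to compare is to overlay them instead of stacking them. This is only a sketch: the name lab2graph2 is made up here, and binwidth = 0.2 and alpha = 0.5 are values picked for illustration, not values given in the lab.

```r
library(ggplot2)

# Overlay the three species instead of stacking them.
# binwidth = 0.2 and alpha = 0.5 are illustrative choices, not lab requirements.
lab2graph2 <- ggplot(data = iris, aes(x = Sepal.Length)) +
  geom_histogram(aes(color = Species, fill = Species),
                 binwidth = 0.2, position = "identity", alpha = 0.5)
lab2graph2
```

Setting binwidth explicitly should also stop the `stat_bin()` message about picking a better value.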

Question 3: Make a boxplot that shows how Sepal.Length differs by Species. Remove the gray background (there are many ways to do that– any way you want is fine).

lab2graph3 <-
ggplot(data=irisdata, aes(x=Species, y=Sepal.Length)) +
  geom_boxplot(aes(fill=Species, color=Species)) +
  theme_bw()
lab2graph3

Question 4: Make a bar graph that shows Sepal.Length by species. Is this a good graph or no? Consider the aspects of good vs bad graphs in the tutorial.

ggplot(data=irisdata, aes(x=Sepal.Length))+
  geom_bar(aes(fill=Species, color=Species))

I don’t think this is a great graph because stacking the species on top of each other at each sepal length makes it difficult to compare sepal length across species. It is very similar to the histogram from Question 2, so I am not sure that I actually made the graph you were looking for.
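Another possible reading of Question 4 is one bar per species, with the bar height showing the mean sepal length. A sketch under that assumption (the name lab2graph4_mean is made up here, and using the mean is my guess at what was intended):

```r
library(ggplot2)

# One bar per species, with bar height equal to the mean Sepal.Length.
# stat = "summary" with fun = "mean" computes the mean of y within each x group.
lab2graph4_mean <- ggplot(data = iris, aes(x = Species, y = Sepal.Length, fill = Species)) +
  geom_bar(stat = "summary", fun = "mean") +
  theme_bw()
lab2graph4_mean
```

A bar of means is easy to read across species, but it hides the spread of values within each species.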

Question 5: Make a scatter plot that shows Sepal.Length by species. Compare this to your bar graph. Which is more useful and why?

ggplot(data=irisdata, aes(x=Species, y=Sepal.Length)) +
  geom_point()

This chart, in my opinion, is far more useful than the initial bar graph because it is easier both to see the distribution of sepal lengths within each species and to compare the min, max, and perceived average across species.

Question 6: Make a line graph comparing Sepal.Length and Sepal.Width by species. What do you see? This is often the kind of graph we pair with a linear regression, so thinking about what it shows us is important.

ggplot(data=irisdata, aes(x=Sepal.Length, y=Sepal.Width, color=Species)) +
  geom_line() +
  theme_bw()

Generally, as sepal length increases, so does sepal width, but this is a very irregular relationship with plenty of variability. The increase in width is much greater (relative to length) for the setosa species than for the other species, as shown by the steeper slope of the red line.
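Since this kind of graph is often paired with a linear regression, the slope comparison can be checked by fitting a separate line for each species. This is a rough sketch, not part of the lab instructions: the names slopes and lab2graph6_lm are made up here, and a simple lm() of Sepal.Width on Sepal.Length is assumed.

```r
library(ggplot2)

# Fit Sepal.Width ~ Sepal.Length separately for each species and keep the slope.
slopes <- sapply(split(iris, iris$Species), function(d) {
  coef(lm(Sepal.Width ~ Sepal.Length, data = d))[["Sepal.Length"]]
})
slopes

# Points plus one straight fitted line per species instead of a jagged geom_line().
lab2graph6_lm <- ggplot(data = iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  theme_bw()
lab2graph6_lm
```

The printed slopes give a number to go with the visual impression of which line is steepest.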

Question 7:

ggplot(data=irisdata, aes(x=Species, y=Sepal.Length, color=Species)) +
  geom_point() +
  theme_minimal()

I changed the background color by changing the theme and then set each species to be a different color, instead of simply black.

Question 8:

ggplot(data=irisdata, aes(x=Species, y=Sepal.Length, color=Species)) +
  geom_point(size=5, shape=8) +
  theme_minimal()

I changed the size and shape of the points in the graph by setting size= and shape= as fixed values inside geom_point(), outside of aes().
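To show the difference between a fixed setting and a mapped one, here is a small sketch (the name fixed_vs_mapped is made up for illustration):

```r
library(ggplot2)

# size and shape are fixed values, so they sit outside aes() and apply to every point.
# color is mapped to a column, so it sits inside aes() and varies by species.
fixed_vs_mapped <- ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  geom_point(aes(color = Species), size = 5, shape = 8)
fixed_vs_mapped
```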

Question 9:

lab2graph9 <- 
  ggplot(data=irisdata, aes(x=Species, y=Sepal.Length, color=Species)) +
  geom_point() +
  theme_minimal()+
  #what do I put at the start of this to make the plot change the title etc.? In your example you use "labs", which I thought was maybe referring to the name of your product or dataframe?
  labs(x = 'Species', y='Sepal Length', title='Flower Sepal Length by Species')+
  theme(text=element_text(size=18))
lab2graph9

Question 10: Take the graph from 6 and facet_wrap() it by species

lab2graph10 <-
ggplot(data=irisdata, aes(x=Sepal.Length, y=Sepal.Width, color=Species)) +
  geom_line() +
  theme_bw() +
  facet_wrap(~Species)
lab2graph10

Question 11: Using the patchwork package, take any three of your graphs and panel them so that they all fit together on one page

#I was struggling to name the plots and then call them back so that I could merge them with patchwork. Update: I figured it out!

#I was also wondering when it is better to use the %>% pipe to continue onto a new line of code, compared to the +?
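My current understanding, which may be incomplete, is that %>% hands a data frame from one step to the next, while + adds layers to a plot after ggplot() has been called. A sketch of both in one chain (the name no_setosa_plot and the filter on setosa are made up for illustration):

```r
library(dplyr)
library(ggplot2)

# %>% passes a data frame from one step to the next (here: filter, then plot).
# + adds layers to a ggplot once ggplot() has been called.
no_setosa_plot <- iris %>%
  filter(Species != "setosa") %>%
  ggplot(aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()
no_setosa_plot
```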

#Additionally, is there a reason that every time I reopen the Quarto document/script, the graphs that I’d previously made disappear?

Graphs 3, 9, 10:

library(patchwork)
lab2graph3/lab2graph9/lab2graph10 
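Besides stacking with /, patchwork also allows plots to sit side by side with |. A sketch with three stand-in plots (the names p_box, p_point, p_hist, and combined are made up, and binwidth = 0.2 is an illustrative value):

```r
library(ggplot2)
library(patchwork)

# Three small stand-in plots (hypothetical names p_box, p_point, p_hist).
p_box   <- ggplot(iris, aes(x = Species, y = Sepal.Length)) + geom_boxplot()
p_point <- ggplot(iris, aes(x = Species, y = Sepal.Length)) + geom_point()
p_hist  <- ggplot(iris, aes(x = Sepal.Length)) + geom_histogram(binwidth = 0.2)

# | places plots side by side, / stacks them; parentheses group a row.
combined <- (p_box | p_point) / p_hist
combined
```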

#I kind of tried to mess with bin width to make this a bit prettier but was not having much success