ONLY install a package the first time you use it! FYI - you still have to load packages with library() every time you start a new R session.
install.packages("tidyverse")
##
## The downloaded binary packages are in
## /var/folders/8n/yt7q563d0kq2_9rbr216z8mc0000gn/T//Rtmp0YgZtr/downloaded_packages
library(tidyverse)
## ── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
## ✔ dplyr 1.1.4 ✔ readr 2.1.5
## ✔ forcats 1.0.0 ✔ stringr 1.5.1
## ✔ ggplot2 3.5.1 ✔ tibble 3.2.1
## ✔ lubridate 1.9.3 ✔ tidyr 1.3.1
## ✔ purrr 1.0.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
FYI:
The white boxes whose text starts with "##" are output (results, messages, warnings, errors, etc.).
The gray boxes are R code.
At this point in my training I don't fully understand the conflicts listed when I load tidyverse. The message says that dplyr's filter() and lag() now take precedence over the stats functions with the same names. I am not worried about them right now because those functions aren't needed for Ch. 1 (I am interested in ggplot2 right now).
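To make the conflict message a bit less mysterious, here is a small sketch of what "masks" means. It is not from the book: it uses the built-in mtcars data set (my choice, just for illustration) and the `package::function()` form to say explicitly which filter() is meant.

```r
library(dplyr)

# dplyr::filter() keeps the rows of a data frame that match a condition.
four_cyl <- dplyr::filter(mtcars, cyl == 4)
nrow(four_cyl)

# stats::filter() is a different function with the same name: it applies a
# linear filter (here a 3-point moving average) to a numeric series.
smoothed <- stats::filter(1:10, rep(1 / 3, 3))
smoothed
```

After library(tidyverse), a plain filter() call means the dplyr version; the stats version is still reachable as stats::filter().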
Same rule for these two packages: install them once, then load them every session.
install.packages("palmerpenguins")
##
## The downloaded binary packages are in
## /var/folders/8n/yt7q563d0kq2_9rbr216z8mc0000gn/T//Rtmp0YgZtr/downloaded_packages
install.packages("ggthemes")
##
## The downloaded binary packages are in
## /var/folders/8n/yt7q563d0kq2_9rbr216z8mc0000gn/T//Rtmp0YgZtr/downloaded_packages
library(palmerpenguins)
library(ggthemes)
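If you rerun these notes often, one way to avoid reinstalling is to check which packages are missing first. This is only a sketch: `missing_packages()` and `install_if_missing()` are hypothetical helper names I made up, built on base R's `requireNamespace()` and `install.packages()`.

```r
# Hypothetical helpers (not from the book): install only what is missing.

# Return the subset of `pkgs` that is not installed yet.
# Note: requireNamespace() loads a package's namespace as a side effect,
# but it does not attach it, so library() is still needed afterwards.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Install the missing packages, if any, and return their names invisibly.
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}
```

Calling install_if_missing(c("tidyverse", "palmerpenguins", "ggthemes")) once at the top of a script would then install only what is missing. The library() calls are still needed every session.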
view(penguins) opens the data in a separate tab. If you only want a preview of the data, type "penguins" in the console and press Enter, and R will print the first rows.
view(penguins)
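A few other ways to peek at the data without opening a tab, as a quick sketch. It assumes tidyverse and palmerpenguins are installed; glimpse() comes with the tidyverse, while dim() and head() are base R.

```r
library(tidyverse)
library(palmerpenguins)

dim(penguins)      # number of rows and columns
glimpse(penguins)  # one line per column: name, type, first values
head(penguins, 5)  # first five rows
```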
Ultimate goal with the penguin data in Ch. 1: create a visual representation of the relationship between body
mass and flipper length, taking into account the penguin species present in the data set (pg. 4).
ggplot(data = penguins)
ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g))
ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) + geom_point()
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
It looks like ggplot2 automatically removed the 2 rows with missing data that it was unable to plot. You will see this warning pop up again as we layer the code; feel free to ignore it.
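To double-check where the "Removed 2 rows" warning comes from, you can count the missing values yourself. This is a sketch rather than something the chapter asks for; `n_missing` and `penguins_complete` are names I picked, and drop_na() is the tidyr function loaded with the tidyverse.

```r
library(tidyverse)
library(palmerpenguins)

# Rows where either plotted variable is missing
n_missing <- sum(is.na(penguins$flipper_length_mm) | is.na(penguins$body_mass_g))
n_missing

# Optional: drop those rows yourself before plotting, so no warning is raised
penguins_complete <- penguins |>
  drop_na(flipper_length_mm, body_mass_g)

ggplot(data = penguins_complete, mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point()
```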
Alright, let's carry on!
ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point()
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point() +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(mapping = aes(color = species)) +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
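The last two plots differ only in where color = species is mapped: inside ggplot() it applies to every layer, inside geom_point() it applies to the points only. As a rough check (my own sketch, not from the book), layer_data() from ggplot2 can show how many groups the smoothing layer was fitted to. The object names below are hypothetical.

```r
library(tidyverse)
library(palmerpenguins)

# color mapped globally: every layer, including geom_smooth(), is split by species
p_global <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g, color = species)) +
  geom_point(na.rm = TRUE) +
  geom_smooth(method = "lm", formula = y ~ x, na.rm = TRUE)

# color mapped only inside geom_point(): geom_smooth() sees a single group
p_local <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(mapping = aes(color = species), na.rm = TRUE) +
  geom_smooth(method = "lm", formula = y ~ x, na.rm = TRUE)

# layer_data(plot, 2) returns the data computed for the second layer (the smooth)
n_lines_global <- length(unique(layer_data(p_global, 2)$group))
n_lines_local  <- length(unique(layer_data(p_local, 2)$group))
c(global = n_lines_global, local = n_lines_local)
```

So the global mapping gives one regression line per species, while the local mapping gives one line for all penguins.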
ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(mapping = aes(color = species, shape = species)) +
  geom_smooth(method = "lm")
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggplot(data = penguins, mapping = aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(mapping = aes(color = species, shape = species)) +
  geom_smooth(method = "lm") +
  labs(
    title = "Body Mass and Flipper Length",
    subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
    x = "Flipper Length (mm)", y = "Body Mass (g)",
    color = "Species", shape = "Species"
  ) +
  scale_color_colorblind()
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
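(Look at the difference in code between this piped version and plot #8 above: the data set is piped into ggplot() with |> instead of being passed as data = penguins.)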
penguins |>
  ggplot(aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(mapping = aes(color = species, shape = species)) +
  geom_smooth(method = "lm") +
  labs(
    title = "Body Mass and Flipper Length",
    subtitle = "Dimensions for Adelie, Chinstrap, and Gentoo Penguins",
    x = "Flipper Length (mm)", y = "Body Mass (g)",
    color = "Species", shape = "Species"
  ) +
  scale_color_colorblind()
## `geom_smooth()` using formula = 'y ~ x'
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_smooth()`).
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
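A tiny sketch of what the pipe does, outside of plotting: x |> f(y) is just another way of writing f(x, y). This assumes R 4.1 or newer (where the native |> pipe exists), and the object names are mine.

```r
library(palmerpenguins)

# The piped call and the nested call give the same result
identical(penguins |> nrow(), nrow(penguins))

# The same holds when the function takes extra arguments
first_rows_piped  <- penguins |> head(3)
first_rows_nested <- head(penguins, 3)
identical(first_rows_piped, first_rows_nested)
```

That is why penguins |> ggplot(aes(...)) behaves like ggplot(data = penguins, mapping = aes(...)): the piped data set fills the first argument.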
Categorical Variable
penguins |>
ggplot(aes(x = species)) + geom_bar()
Categorical Variable: Ordered Levels
penguins |>
ggplot(aes(x = fct_infreq(species))) + geom_bar()
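To see what fct_infreq() changes, it can help to look at the factor levels before and after. A short sketch, assuming tidyverse (which provides forcats and dplyr) and palmerpenguins are loaded:

```r
library(tidyverse)
library(palmerpenguins)

levels(penguins$species)               # original order of the factor levels
count(penguins, species, sort = TRUE)  # how many penguins of each species
levels(fct_infreq(penguins$species))   # levels reordered from most to least frequent
```

The bars are drawn in level order, so reordering the levels by frequency is what sorts the bar chart.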
Numerical Variable: Histogram
penguins |>
ggplot(aes(x = body_mass_g)) + geom_histogram(binwidth = 200)
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_bin()`).
Numerical Variable: Histogram - Exploring Width
penguins |>
ggplot(aes(x = body_mass_g)) + geom_histogram(binwidth = 20)
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_bin()`).
penguins |>
ggplot(aes(x = body_mass_g)) + geom_histogram(binwidth = 2000)
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_bin()`).
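One rough way to reason about the three binwidths is to compare them with the spread of the data. This is my own back-of-the-envelope sketch, not from the book: the bin counts are approximate, because the exact number of bars depends on how ggplot2 aligns the bin boundaries.

```r
library(palmerpenguins)

# Spread of body mass, ignoring missing values
mass_range <- range(penguins$body_mass_g, na.rm = TRUE)
mass_span <- diff(mass_range)

# Approximate number of bins each binwidth needs to cover that spread
binwidths <- c(20, 200, 2000)
approx_bins <- ceiling(mass_span / binwidths)
data.frame(binwidth = binwidths, approx_bins = approx_bins)
```

A very small binwidth gives many nearly empty bars, and a very large one lumps almost everything into a couple of bars, which matches what the three histograms show.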
Numerical Variable: Density Plot
penguins |>
ggplot(aes(x = body_mass_g)) + geom_density()
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_density()`).
Boxplot
penguins |>
ggplot(aes(x = species, y = body_mass_g)) + geom_boxplot()
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_boxplot()`).
Density Plot
penguins |>
ggplot(aes(x = body_mass_g, color = species)) + geom_density(linewidth = 0.75)
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_density()`).
Density Plot: Add Transparency to Density Curves
penguins |>
ggplot(aes(x = body_mass_g, color = species, fill = species)) + geom_density(alpha = 0.5)
## Warning: Removed 2 rows containing non-finite outside the scale range
## (`stat_density()`).
Stacked Bar Plot
penguins |>
ggplot(aes(x = island, fill = species)) + geom_bar()
Relative Frequency Plot (%)
penguins |>
ggplot(aes(x = island, fill = species)) + geom_bar(position = "fill")
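The numbers behind the relative frequency plot can be computed by hand, which is a useful sanity check. A sketch using dplyr verbs from the tidyverse; `species_by_island` and `prop` are names I chose.

```r
library(tidyverse)
library(palmerpenguins)

species_by_island <- penguins |>
  count(island, species) |>      # penguins per island and species
  group_by(island) |>
  mutate(prop = n / sum(n)) |>   # share of each species within its island
  ungroup()

species_by_island
```

Within each island the proportions add up to 1, which is why every bar in the position = "fill" plot has the same height.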
Scatterplot
penguins |>
ggplot(aes(x = flipper_length_mm, y = body_mass_g)) + geom_point()
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
Scatterplot: Add Aesthetics & Layers
penguins |>
ggplot(aes(x = flipper_length_mm, y = body_mass_g)) + geom_point(aes(color = species, shape = island))
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
Scatterplot: Facets
penguins |>
ggplot(aes(x = flipper_length_mm, y = body_mass_g)) + geom_point(aes(color = species, shape = species)) + facet_wrap(~island)
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
Saving plots as an image (.png)
penguins |>
ggplot(aes(x = flipper_length_mm, y = body_mass_g)) + geom_point(aes(color = species, shape = species)) + facet_wrap(~island)
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
ggsave(filename = "penguin-plot.png")
## Saving 7 x 5 in image
## Warning: Removed 2 rows containing missing values or values outside the scale range
## (`geom_point()`).
Run ggsave(filename = "ADD-TITLE.png") right after the code of each plot that you want to export; by default it saves the most recent plot. The .png file is saved in your working directory (usually the folder your script or project was opened from). I suggest creating a folder for projects so that everything is organized in one location.
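If you want more control, ggsave() also accepts the plot object and the image size explicitly. The sketch below stores the plot in `p` and writes to a temporary folder via tempdir(); both are my choices for illustration, so swap in your own project folder and file name.

```r
library(tidyverse)
library(palmerpenguins)

# Store the plot in an object instead of relying on "the last plot"
p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(aes(color = species, shape = species), na.rm = TRUE) +
  facet_wrap(~island)

# Save it with an explicit size; tempdir() is only a placeholder location
out_file <- file.path(tempdir(), "penguin-plot.png")
ggsave(filename = out_file, plot = p, width = 7, height = 5, units = "in", dpi = 300)

file.exists(out_file)
```

Passing plot = p makes it clear which plot is saved, even if other plots were drawn in between.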
-Make sure every (parenthesis) and every "quotation mark" is paired with a closing one
-When writing code, the "+" needs to come at the end of the line, not the start
-You can get help by running ?function_name (e.g., ?ggsave) in the console, or by highlighting the function name and pressing F1 in RStudio
-Carefully read error messages, and google them if you can't figure them out! :)
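The "+" tip is easy to test for yourself. In this sketch the failing line is wrapped in tryCatch() so the script keeps running; the `result` name and the "error" placeholder string are mine, and the real error text depends on your ggplot2 version.

```r
library(ggplot2)
library(palmerpenguins)

# Works: the first line ends with "+", so R knows the expression continues.
p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(na.rm = TRUE)

# Fails: a line that starts with "+" is run as its own expression.
# tryCatch() turns the error into the string "error" instead of stopping.
result <- tryCatch(
  eval(parse(text = "+ geom_point()")),
  error = function(e) "error"
)
result
```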
END OF CHAPTER 1.