DS Labs Homework

Author

Oliver Kronen

Load the necessary packages

library(dslabs)
library(ggplot2)
library(dplyr)

library(tidyverse)

Look for data sets to work with

data(package = "dslabs") # list the data sets that ship with dslabs
list.files(system.file("script", package = "dslabs")) # list the scripts dslabs uses to build those data sets
 [1] "make-admissions.R"                   
 [2] "make-brca.R"                         
 [3] "make-brexit_polls.R"                 
 [4] "make-calificaciones.R"               
 [5] "make-death_prob.R"                   
 [6] "make-divorce_margarine.R"            
 [7] "make-gapminder-rdas.R"               
 [8] "make-greenhouse_gases.R"             
 [9] "make-historic_co2.R"                 
[10] "make-mice_weights.R"                 
[11] "make-mnist_127.R"                    
[12] "make-mnist_27.R"                     
[13] "make-movielens.R"                    
[14] "make-murders-rda.R"                  
[15] "make-na_example-rda.R"               
[16] "make-nyc_regents_scores.R"           
[17] "make-olive.R"                        
[18] "make-outlier_example.R"              
[19] "make-polls_2008.R"                   
[20] "make-polls_us_election_2016.R"       
[21] "make-pr_death_counts.R"              
[22] "make-reported_heights-rda.R"         
[23] "make-research_funding_rates.R"       
[24] "make-results_us_election_2012.R"     
[25] "make-stars.R"                        
[26] "make-temp_carbon.R"                  
[27] "make-tissue-gene-expression.R"       
[28] "make-trump_tweets.R"                 
[29] "make-weekly_us_contagious_diseases.R"
[30] "save-gapminder-example-csv.R"        
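The call to data(package = "dslabs") produces no printed output above. As a minimal sketch, assuming the object it returns has the usual `results` matrix with an "Item" column, the data set names can be pulled out and printed directly. The names `dslabs_info` and `dataset_names` are made up for this example.

```r
# Sketch: extract the data set names from the object returned by data()
dslabs_info <- data(package = "dslabs")

# The "Item" column of the results matrix holds the data set names
dataset_names <- dslabs_info$results[, "Item"]

head(dataset_names)               # first few data set names
any(grepl("^olive", dataset_names)) # check that the olive data set is listed
```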

I picked the data set on olives because my name is Oliver.

data("olive") # load the olive data set into the environment so its variables can be inspected
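Loading the data set does not print anything by itself. One way to look at the variables, sketched here with base R functions only, is:

```r
library(dslabs)
data("olive")

str(olive)            # column names, types, and the first few values
names(olive)          # just the column names
unique(olive$region)  # the distinct regions, used later for the colour aesthetic
```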

I check for any NA values to see if I need to clean the data set.

sum(is.na(olive)) # count the total number of NA values in the data
[1] 0

There are no NA values, so the data set does not need cleaning.
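The single total hides which column a missing value would come from. A small sketch of a per-column check, using only base R (the name `na_per_column` is made up for this example):

```r
library(dslabs)
data("olive")

# Count the NA values in each column separately
na_per_column <- colSums(is.na(olive))
na_per_column

# Quick yes/no check for the whole data frame
anyNA(olive)
```

If a column did contain NA values, its count here would show where cleaning is needed.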

Now I will make the graph.

graph <- ggplot(olive, aes(x = palmitic, y = palmitoleic, color = region)) + # create a graph variable and set the aesthetics
  geom_point() + # make a scatter plot
  theme_minimal() + # change the theme to minimal
  scale_color_manual(values = c("#ed0e6b", "#0eedd3", "#0eed3f")) + # manually set the colours used for the three regions
  labs(
    x = "Amount of Palmitic Acid",
    y = "Amount of Palmitoleic Acid",
    title = "Palmitic vs. Palmitoleic Acid in Olives Across Different Italian Regions",
    color = "Italian Regions"
  ) # label the axes, title, and legend
graph
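With an unnamed colour vector, scale_color_manual() assigns the colours to the regions in level order. A possible variation, sketched below, names the vector so each region is tied to a specific colour. The names `region_levels`, `region_colours`, and `graph_named` are made up for this example, and the region names are read from the data rather than typed in.

```r
library(dslabs)
library(ggplot2)
data("olive")

# Read the region names from the data and pair each one with a colour
region_levels <- levels(factor(olive$region))
region_colours <- setNames(c("#ed0e6b", "#0eedd3", "#0eed3f"), region_levels)
region_colours # shows which colour goes with which region

# Same plot as before, but with the named colour vector
graph_named <- ggplot(olive, aes(x = palmitic, y = palmitoleic, color = region)) +
  geom_point() +
  theme_minimal() +
  scale_color_manual(values = region_colours) +
  labs(
    x = "Amount of Palmitic Acid",
    y = "Amount of Palmitoleic Acid",
    color = "Italian Regions"
  )
graph_named
```

Naming the vector keeps the colours attached to the same regions even if the order of the levels changes.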

Paragraph

The data set I chose to work with is the one on olives. It records the levels of several fatty acids found in olives from different Italian regions and areas. For the assignment, I looked at only two of these acids: palmitic and palmitoleic. I also used a third variable, the Italian region. I made a scatter plot because it is the plot type I am most comfortable with. I mapped the two acids to the x and y axes and used the colour aesthetic for the region variable, so the graph shows how the levels of the two acids differ between the regions' olives. One takeaway from the graph is that Southern Italian olives appear to have the highest levels of both palmitic and palmitoleic acid, while Northern Italian olives appear to have the lowest. Because the graph covers only these two fatty acids, it describes their levels rather than the overall acidity of the olives.
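The takeaway from the graph could be checked with numbers. A minimal sketch, assuming dplyr is available and R is at least version 4.1 for the native pipe, computes the average of each acid per region (the name `region_means` and the summary column names are made up for this example):

```r
library(dslabs)
library(dplyr)
data("olive")

# Average palmitic and palmitoleic acid per region, highest palmitic first
region_means <- olive |>
  group_by(region) |>
  summarise(
    mean_palmitic = mean(palmitic),
    mean_palmitoleic = mean(palmitoleic),
    n = n() # number of samples in each region
  ) |>
  arrange(desc(mean_palmitic))

region_means
```

Comparing the rows of this table shows whether the ordering seen in the scatter plot also holds for the regional averages.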