HW week 8 data 110

Author

Dajana R

Load the libraries needed

### Load dslabs, ggplot2, and highcharter
library(dslabs)
Warning: package 'dslabs' was built under R version 4.3.3
library(ggplot2)
library(highcharter)
Warning: package 'highcharter' was built under R version 4.3.3
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 

Attaching package: 'highcharter'
The following object is masked from 'package:dslabs':

    stars

Load and check the data

### Check the head and tail of the data 
data(olive)
head(olive)
          region         area palmitic palmitoleic stearic oleic linoleic
1 Southern Italy North-Apulia    10.75        0.75    2.26 78.23     6.72
2 Southern Italy North-Apulia    10.88        0.73    2.24 77.09     7.81
3 Southern Italy North-Apulia     9.11        0.54    2.46 81.13     5.49
4 Southern Italy North-Apulia     9.66        0.57    2.40 79.52     6.19
5 Southern Italy North-Apulia    10.51        0.67    2.59 77.71     6.72
6 Southern Italy North-Apulia     9.11        0.49    2.68 79.24     6.78
  linolenic arachidic eicosenoic
1      0.36      0.60       0.29
2      0.31      0.61       0.29
3      0.31      0.63       0.29
4      0.50      0.78       0.35
5      0.50      0.80       0.46
6      0.51      0.70       0.44
tail(olive)
            region         area palmitic palmitoleic stearic oleic linoleic
567 Northern Italy West-Liguria     10.7         1.0     2.2  77.3      8.7
568 Northern Italy West-Liguria     12.8         1.1     2.9  74.9      7.9
569 Northern Italy West-Liguria     10.6         1.0     2.7  77.4      8.1
570 Northern Italy West-Liguria     10.1         0.9     2.1  77.2      9.7
571 Northern Italy West-Liguria      9.9         1.2     2.5  77.5      8.7
572 Northern Italy West-Liguria      9.6         0.8     2.4  79.5      7.4
    linolenic arachidic eicosenoic
567       0.1       0.1       0.02
568       0.1       0.1       0.02
569       0.1       0.1       0.03
570       0.0       0.0       0.02
571       0.1       0.1       0.02
572       0.1       0.2       0.02

Check for missing data

### Checking for missing data 
sum(is.na(olive))
[1] 0
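The single total above confirms there are no missing values anywhere. As a small optional sketch (not part of the original assignment), the same check can be broken down by column, which would show where gaps are if a data set did have any. The name `na_by_column` is just an illustrative variable.

```r
library(dslabs)

data(olive)

# Count missing values separately for each column
# (na_by_column is an illustrative name, not from the original code)
na_by_column <- colSums(is.na(olive))
na_by_column

# Number of rows and columns, plus the type of each variable
dim(olive)
str(olive)
```

If every entry of `na_by_column` is 0, this agrees with the `sum(is.na(olive))` result above.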

Create the 1st Visualization

### Creating a ggplot graph
ggplot(olive, aes(x = palmitic, y = oleic, color = region)) +
  geom_point(alpha = 0.7) +  # Adding transparency to points
  labs(title = "Fatty Acid Composition of Olive Oils Across Italy",
       x = "Percentage of Palmitic Acid",
       y = "Percentage of Oleic Acid",
       color = "Region",
       caption = "Source: DSLABS Data Set Olive") +
  theme_minimal()

2nd Visualization

Making a highcharter graph

*** ChatGPT was used to help make the highcharter graph and to find and fix errors

hc <- highchart() |>
  hc_chart(type = "scatter") |>
  hc_add_series(data = olive[olive$region == "Northern Italy", ], 
                type = "scatter", 
                hcaes(x = palmitic, y = oleic), 
                color = "red", 
                name = "Northern Italy") |>
  hc_add_series(data = olive[olive$region == "Sardinia", ], 
                type = "scatter", 
                hcaes(x = palmitic, y = oleic), 
                color = "green", 
                name = "Sardinia") |>
  hc_add_series(data = olive[olive$region == "Southern Italy", ], 
                type = "scatter", 
                hcaes(x = palmitic, y = oleic), 
                color = "blue", 
                name = "Southern Italy") |>
  hc_legend(layout = "horizontal", align = "center", verticalAlign = "bottom") |>
  hc_title(text = "Fatty Acid Composition of Olive Oils Across Italy") |>
  hc_xAxis(title = list(text = "Percentage of Palmitic Acid")) |>
  hc_yAxis(title = list(text = "Percentage of Oleic Acid")) |>
  hc_plotOptions(
    scatter = list(
      marker = list(
        symbol = "circle",
        radius = 5
      )
    )
  ) |>
  hc_tooltip(pointFormat = "<br>Percentage of Palmitic Acid: {point.x:.2f}%<br>Percentage of Oleic Acid: {point.y:.2f}%<br>Area: {point.area}") |>
  hc_subtitle(text = "Source: DSLABS Data Set Olive")

hc
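The chart above adds one series per region by hand. A shorter alternative, shown here only as a sketch, is to let highcharter split the data itself by passing `group = region` inside `hcaes()`. The object name `hc_grouped` is illustrative, and the series colors here are whatever the default theme assigns, not the red, green, and blue chosen above.

```r
library(dslabs)
library(highcharter)

data(olive)

# One call builds a scatter series for each region via group = region
# (hc_grouped is an illustrative name; colors are the theme defaults)
hc_grouped <- hchart(olive, "scatter",
                     hcaes(x = palmitic, y = oleic, group = region)) |>
  hc_title(text = "Fatty Acid Composition of Olive Oils Across Italy") |>
  hc_subtitle(text = "Source: DSLABS Data Set Olive") |>
  hc_xAxis(title = list(text = "Percentage of Palmitic Acid")) |>
  hc_yAxis(title = list(text = "Percentage of Oleic Acid")) |>
  hc_tooltip(pointFormat = "Palmitic: {point.x:.2f}%<br>Oleic: {point.y:.2f}%<br>Area: {point.area}")

hc_grouped
```

The hand-built version is still useful when each region needs its own explicit color or name; the grouped version is mainly less code to maintain.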

The data set I used is called olive, and it comes from the dslabs library. The variables I used were region, area, palmitic, and oleic. I created the graph using both highcharter and ggplot2. The highcharter graph uses the variable area (in its tooltip), while the ggplot2 graph does not. The variable palmitic is on the x-axis and the variable oleic is on the y-axis; the color of the dots represents the region the oil sample came from. I chose palmitic and oleic because they are two of the main fatty acids found in olive oils. The percentage of each fatty acid varies by region, so plotting these two variables and coloring the points by region gives us insight into the composition of olive oils across Italy.
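To put numbers on the regional differences seen in the plots, one possible follow-up (a sketch, not part of the original analysis) is to compute the mean percentage of each fatty acid per region and the overall correlation between the two. The name `region_means` is illustrative.

```r
library(dslabs)

data(olive)

# Mean palmitic and oleic percentage for each region
# (region_means is an illustrative name)
region_means <- aggregate(cbind(palmitic, oleic) ~ region,
                          data = olive, FUN = mean)
region_means

# Overall linear correlation between the two fatty acids
cor(olive$palmitic, olive$oleic)
```

The table has one row per region, so it can be read side by side with the colored clusters in the scatter plots.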

** ChatGPT was used to look for errors and fix them.

** ChatGPT was used to help make the highcharter graph.