DSLabs Olive Viz

Author

Dormowa Sherman

Published

October 24, 2023

Loading packages.

library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.3     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2     
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(dslabs)
library(RColorBrewer)
library(dplyr)
library(highcharter)
Registered S3 method overwritten by 'quantmod':
  method            from
  as.zoo.data.frame zoo 

Attaching package: 'highcharter'

The following object is masked from 'package:dslabs':

    stars

Browsing the list of available datasets in the dslabs package.

# List only the datasets that ship with dslabs; a bare data() lists every attached package.
data(package = "dslabs")

Taking a look at the column names and the first six rows of the olive dataset.

head(olive)
          region         area palmitic palmitoleic stearic oleic linoleic
1 Southern Italy North-Apulia    10.75        0.75    2.26 78.23     6.72
2 Southern Italy North-Apulia    10.88        0.73    2.24 77.09     7.81
3 Southern Italy North-Apulia     9.11        0.54    2.46 81.13     5.49
4 Southern Italy North-Apulia     9.66        0.57    2.40 79.52     6.19
5 Southern Italy North-Apulia    10.51        0.67    2.59 77.71     6.72
6 Southern Italy North-Apulia     9.11        0.49    2.68 79.24     6.78
  linolenic arachidic eicosenoic
1      0.36      0.60       0.29
2      0.31      0.61       0.29
3      0.31      0.63       0.29
4      0.50      0.78       0.35
5      0.50      0.80       0.46
6      0.51      0.70       0.44

Seeing what regions are included.

unique(olive$region)
[1] Southern Italy Sardinia       Northern Italy
Levels: Northern Italy Sardinia Southern Italy

Creating a new data frame that keeps only the eight fatty acid columns (columns 3 through 10) of the olive dataset.

olive1 <- olive |>
  select(3:10)
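Selecting by position works here, but it depends on the column order never changing. As a minimal sketch, the same eight columns can be selected by name with a tidy-select range. To keep the example self-contained, it uses a small stand-in data frame (the hypothetical name olive_head) built from the first three rows printed by head(olive) above, not the full dataset.

```r
library(dplyr)

# Stand-in for `olive`: the first three rows shown in the head(olive) output.
# `olive_head` is a hypothetical name used only for this illustration.
olive_head <- data.frame(
  region      = "Southern Italy",
  area        = "North-Apulia",
  palmitic    = c(10.75, 10.88, 9.11),
  palmitoleic = c(0.75, 0.73, 0.54),
  stearic     = c(2.26, 2.24, 2.46),
  oleic       = c(78.23, 77.09, 81.13),
  linoleic    = c(6.72, 7.81, 5.49),
  linolenic   = c(0.36, 0.31, 0.31),
  arachidic   = c(0.60, 0.61, 0.63),
  eicosenoic  = c(0.29, 0.29, 0.29)
)

# Positional selection, as in the original code.
by_position <- olive_head |> select(3:10)

# Name-based selection: every column from palmitic through eicosenoic.
by_name <- olive_head |> select(palmitic:eicosenoic)

# Both approaches return the same eight columns.
identical(by_position, by_name)
```

On the real dataset the equivalent call would be olive |> select(palmitic:eicosenoic), assuming the column order shown in the head(olive) output.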

Using highcharter to graph how the fatty acids correlate (or do not) with one another.

hchart(cor(olive1))
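The chart is built from the matrix returned by cor(), which computes Pearson correlations by default. As a rough sketch of what a single cell in that matrix means, the code below computes the correlation between oleic and linoleic acid twice, once with cor() and once from the definition. It uses only the six rows printed by head(olive) above, which all come from one area, so the value is an illustration and not the value for the full dataset.

```r
# Oleic and linoleic percentages from the six rows shown by head(olive).
oleic    <- c(78.23, 77.09, 81.13, 79.52, 77.71, 79.24)
linoleic <- c(6.72, 7.81, 5.49, 6.19, 6.72, 6.78)

# Pearson correlation via the built-in function (method = "pearson" is the default).
r <- cor(oleic, linoleic)

# The same quantity from the definition: the sum of products of deviations,
# divided by the square root of the product of the summed squared deviations.
dx <- oleic - mean(oleic)
dy <- linoleic - mean(linoleic)
r_manual <- sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))

# The two values agree up to floating-point tolerance.
all.equal(r, r_manual)
```

In these six samples the correlation is negative: samples with more oleic acid tend to have less linoleic acid.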

Creating a scatterplot of oleic and linoleic acid, with a graph title, x- and y-axis titles, custom colors, a simple theme, and a centered legend.

olive |>
  hchart('scatter', hcaes(x = oleic, y = linoleic, group = region)) |>
  hc_colors(c("#9989c6", "#00cc99", "#d2691e")) |>
  hc_title(text = "Oleic and Linoleic Fatty Acids in Italian Olives", align = "center") |>
  hc_xAxis(title = list(text = "Oleic Acid (% of sample)")) |>
  hc_yAxis(title = list(text = "Linoleic Acid (% of sample)")) |>
  hc_legend(align = "center") |>
  hc_add_theme(hc_theme_smpl())
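A numeric summary can make the regional pattern in the scatterplot easier to describe. The following is a minimal sketch, assuming the dslabs package is installed; region_summary is a hypothetical name, and the choice of the mean as the summary statistic is mine and not part of the original analysis.

```r
library(dplyr)
library(dslabs)

# One row per region: sample count and mean percentage of each of the two acids.
# `region_summary` is a hypothetical name used only for this illustration.
region_summary <- olive |>
  group_by(region) |>
  summarise(
    n             = n(),
    mean_oleic    = mean(oleic),
    mean_linoleic = mean(linoleic)
  )

region_summary
```

Reading the table alongside the chart shows how many points each color represents and where each region's cluster is centered.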

About this viz

I created this visualization using the olive dataset from the dslabs package. According to the package documentation on CRAN, this dataset includes the “composition in percentage of eight fatty acids found in the lipid fraction of 572 Italian olive oils”. Using the highcharter package, I created a scatterplot that shows the percentages of oleic and linoleic acid in each sample, grouped by region. I chose these two fatty acids because linoleic acid is an essential fatty acid and oleic acid, by far the most abundant fatty acid in these samples, is also common in the Western diet, which is often considered unhealthy. I grouped by region because I thought it would be interesting to see how Sardinia, home to a Blue Zone, compares to the other regions.