Taking a look at DSLabs Datasets / Loading libraries

#install.packages("dslabs")
library("dslabs")
## Warning: package 'dslabs' was built under R version 4.1.3
data(package="dslabs")
#list.files(system.file("script", package = "dslabs"))
library(tidyverse)
## -- Attaching packages --------------------------------------- tidyverse 1.3.1 --
## v ggplot2 3.3.5     v purrr   0.3.4
## v tibble  3.1.6     v dplyr   1.0.8
## v tidyr   1.2.0     v stringr 1.4.0
## v readr   2.1.2     v forcats 0.5.1
## -- Conflicts ------------------------------------------ tidyverse_conflicts() --
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
library("RColorBrewer")
library(dplyr)
library(ggplot2)
#install.packages("highcharter")
library(highcharter)
## Warning: package 'highcharter' was built under R version 4.1.3
## Registered S3 method overwritten by 'quantmod':
##   method            from
##   as.zoo.data.frame zoo
## 
## Attaching package: 'highcharter'
## The following object is masked from 'package:dslabs':
## 
##     stars
I have chosen the Italian olive dataset. I’m Italian so I felt like I just had to!

Italian Olive Dataset

This dataset reports the percentage composition of eight fatty acids found in the lipid fraction of 572 Italian olive oils.

Exploring the dataset…

data("olive")
head(olive)
##           region         area palmitic palmitoleic stearic oleic linoleic
## 1 Southern Italy North-Apulia    10.75        0.75    2.26 78.23     6.72
## 2 Southern Italy North-Apulia    10.88        0.73    2.24 77.09     7.81
## 3 Southern Italy North-Apulia     9.11        0.54    2.46 81.13     5.49
## 4 Southern Italy North-Apulia     9.66        0.57    2.40 79.52     6.19
## 5 Southern Italy North-Apulia    10.51        0.67    2.59 77.71     6.72
## 6 Southern Italy North-Apulia     9.11        0.49    2.68 79.24     6.78
##   linolenic arachidic eicosenoic
## 1      0.36      0.60       0.29
## 2      0.31      0.61       0.29
## 3      0.31      0.63       0.29
## 4      0.50      0.78       0.35
## 5      0.50      0.80       0.46
## 6      0.51      0.70       0.44
tail(olive)
##             region         area palmitic palmitoleic stearic oleic linoleic
## 567 Northern Italy West-Liguria     10.7         1.0     2.2  77.3      8.7
## 568 Northern Italy West-Liguria     12.8         1.1     2.9  74.9      7.9
## 569 Northern Italy West-Liguria     10.6         1.0     2.7  77.4      8.1
## 570 Northern Italy West-Liguria     10.1         0.9     2.1  77.2      9.7
## 571 Northern Italy West-Liguria      9.9         1.2     2.5  77.5      8.7
## 572 Northern Italy West-Liguria      9.6         0.8     2.4  79.5      7.4
##     linolenic arachidic eicosenoic
## 567       0.1       0.1       0.02
## 568       0.1       0.1       0.02
## 569       0.1       0.1       0.03
## 570       0.0       0.0       0.02
## 571       0.1       0.1       0.02
## 572       0.1       0.2       0.02
dim(olive)
## [1] 572  10
summary(olive)
##             region                 area        palmitic      palmitoleic    
##  Northern Italy:151   South-Apulia   :206   Min.   : 6.10   Min.   :0.1500  
##  Sardinia      : 98   Inland-Sardinia: 65   1st Qu.:10.95   1st Qu.:0.8775  
##  Southern Italy:323   Calabria       : 56   Median :12.01   Median :1.1000  
##                       Umbria         : 51   Mean   :12.32   Mean   :1.2609  
##                       East-Liguria   : 50   3rd Qu.:13.60   3rd Qu.:1.6925  
##                       West-Liguria   : 50   Max.   :17.53   Max.   :2.8000  
##                       (Other)        : 94                                   
##     stearic          oleic          linoleic        linolenic     
##  Min.   :1.520   Min.   :63.00   Min.   : 4.480   Min.   :0.0000  
##  1st Qu.:2.050   1st Qu.:70.00   1st Qu.: 7.707   1st Qu.:0.2600  
##  Median :2.230   Median :73.03   Median :10.300   Median :0.3300  
##  Mean   :2.289   Mean   :73.12   Mean   : 9.805   Mean   :0.3189  
##  3rd Qu.:2.490   3rd Qu.:76.80   3rd Qu.:11.807   3rd Qu.:0.4025  
##  Max.   :3.750   Max.   :84.10   Max.   :14.700   Max.   :0.7400  
##                                                                   
##    arachidic       eicosenoic    
##  Min.   :0.000   Min.   :0.0100  
##  1st Qu.:0.500   1st Qu.:0.0200  
##  Median :0.610   Median :0.1700  
##  Mean   :0.581   Mean   :0.1628  
##  3rd Qu.:0.700   3rd Qu.:0.2800  
##  Max.   :1.050   Max.   :0.5800  
## 
table(olive$area)
## 
##        Calabria  Coast-Sardinia    East-Liguria Inland-Sardinia    North-Apulia 
##              56              33              50              65              25 
##          Sicily    South-Apulia          Umbria    West-Liguria 
##              36             206              51              50
table(olive$region)
## 
## Northern Italy       Sardinia Southern Italy 
##            151             98            323

* This dataset has 572 observations of 10 variables.

* The areas covered in this dataset are: Calabria, Coast-Sardinia, East-Liguria, Inland-Sardinia, North-Apulia, Sicily, South-Apulia, Umbria, and West-Liguria.

* The regions covered in this dataset are: Northern Italy, Sardinia, and Southern Italy.
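The two tables above list areas and regions separately, so here is a small sketch of how one might check which areas belong to which region. It assumes `dslabs` and `dplyr` are installed; the names `area_by_region` and `n_oils` are my own and only used for illustration.

```r
library(dslabs)
library(dplyr)

data("olive")

# Count the oils in each area, grouped by the region the area belongs to.
# Combinations that never occur in the data are dropped by default.
area_by_region <- olive %>%
  count(region, area, name = "n_oils")

area_by_region
```

Each area should appear under exactly one region, and the counts should add up to the 572 oils in the dataset.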

Researching…

Since I wasn’t familiar with most of these fatty acids, I listed them and provided definitions.

1. Palmitic acid: a solid saturated fatty acid obtained from palm oil and other vegetable and animal fats.

2. Palmitoleic acid: a non-essential omega-7 monounsaturated free fatty acid.

3. Stearic acid: a solid saturated fatty acid obtained from animal or vegetable fats.

4. Oleic acid: a monounsaturated omega-9 fatty acid that occurs naturally in various animal and vegetable fats and oils.

5. Linoleic acid: a polyunsaturated fatty acid present as a glyceride in linseed oil and other oils and essential in the human diet.

6. Linolenic acid: a polyunsaturated fatty acid (with one more double bond than linoleic acid) present as a glyceride in linseed and other oils and essential in the human diet.

7. Arachidic acid: also known as icosanoic acid, a saturated fatty acid with a 20-carbon chain.

8. Eicosenoic acid: a monounsaturated omega-9 fatty acid found in a variety of plant oils and nuts, including jojoba oil. It is one of a number of eicosenoic acids.

According to the Mayo Clinic, “studies show that eating foods rich in unsaturated fat instead of saturated fat improve blood cholesterol levels, which can decrease your risk of heart attack and stroke. One type in particular omega-3 fatty acid appears to boost heart health by improving cholesterol levels, reducing blood clotting, reducing irregular heartbeats, and slightly lowering blood pressure.”

There are two main types of unsaturated fat:

* Monounsaturated fat

* Polyunsaturated fat
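To connect the definitions above to the data, here is a rough sketch that labels each of the eight fatty acids as saturated, monounsaturated, or polyunsaturated and averages the totals by region. The classification is taken from the definitions listed earlier; the lookup table `fat_class` and the names `oil_id`, `percent`, and `class_by_region` are my own, not part of `dslabs`.

```r
library(dslabs)
library(dplyr)
library(tidyr)

data("olive")

# Lookup table built by hand from the definitions above (an assumption,
# not something stored in the dataset).
fat_class <- data.frame(
  acid  = c("palmitic", "stearic", "arachidic",
            "palmitoleic", "oleic", "eicosenoic",
            "linoleic", "linolenic"),
  class = c(rep("saturated", 3),
            rep("monounsaturated", 3),
            rep("polyunsaturated", 2))
)

class_by_region <- olive %>%
  mutate(oil_id = row_number()) %>%                  # one id per oil sample
  pivot_longer(palmitic:eicosenoic,
               names_to = "acid", values_to = "percent") %>%
  left_join(fat_class, by = "acid") %>%
  group_by(region, oil_id, class) %>%
  summarise(percent = sum(percent), .groups = "drop") %>%  # total per class within each oil
  group_by(region, class) %>%
  summarise(mean_percent = mean(percent), .groups = "drop")

class_by_region
```

This gives one row per region and fat class, which makes it easier to compare regions than reading eight separate columns.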

I want to focus on areas from Southern Italy only.

italysouth <- olive %>%
  filter(region == "Southern Italy")
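A quick way to sanity-check the filter is sketched below. Because `area` is a factor, the filtered data still carries the levels of areas from the other regions, so this version adds an optional `droplevels()` call. That extra step is my own suggestion and is not required for the plot that follows.

```r
library(dslabs)
library(dplyr)

data("olive")

italysouth <- olive %>%
  filter(region == "Southern Italy") %>%
  droplevels()  # optional: drop factor levels for areas outside Southern Italy

# How many oils remain, and from which areas?
nrow(italysouth)
count(italysouth, area)
```

The row count should match the Southern Italy total shown by `table(olive$region)` earlier.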

Plotting the content of one saturated fat and one unsaturated fat in oils from the areas of Southern Italy using Highcharter.

p1 <- italysouth %>%
  hchart("scatter", hcaes(x = linoleic, y = palmitic, group = area)) %>%
  hc_colors(c("#00bfff", "#ed9121", "#d70a53", "#00cc99")) %>%
  hc_xAxis(title = list(text = "linoleic acid")) %>%
  hc_yAxis(title = list(text = "palmitic acid")) %>%
  hc_title(text = "Palmitic and Linoleic acid contents in olive oils from Southern Italy") %>%
  hc_subtitle(text = "Source: Olive data set") %>%
  hc_add_theme(hc_theme_smpl())

p1
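As a numeric companion to the scatterplot, the sketch below summarises the two acids per area. It is only an illustration of one way to read the plot; the names `acid_summary`, `mean_linoleic`, `mean_palmitic`, and `correlation` are my own.

```r
library(dslabs)
library(dplyr)

data("olive")

acid_summary <- olive %>%
  filter(region == "Southern Italy") %>%
  group_by(area) %>%
  summarise(
    n_oils        = n(),
    mean_linoleic = mean(linoleic),
    mean_palmitic = mean(palmitic),
    # Pearson correlation between the two acids within each area
    correlation   = cor(linoleic, palmitic),
    .groups = "drop"
  )

acid_summary
```

A positive `correlation` for an area would mean that its oils with more linoleic acid also tend to have more palmitic acid, which is the pattern to look for in each coloured cluster of the plot.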

I chose the “olive” dataset from the dslabs package. It reports the percentage composition of eight fatty acids found in the lipid fraction of 572 Italian olive oils. The oils come from three regions of Italy (Northern Italy, Sardinia, and Southern Italy), which are divided into nine areas: Calabria, Coast-Sardinia, East-Liguria, Inland-Sardinia, North-Apulia, Sicily, South-Apulia, Umbria, and West-Liguria. First, I did some basic exploring of the dataset, looking at its dimensions, summary statistics, and the number of oils per area and region. Since I wasn’t familiar with all eight fatty acids, I then researched them and listed their definitions, and I added information from the Mayo Clinic on which fats are considered healthy and unhealthy. Lastly, I used dplyr to filter the data to Southern Italy, which is where I’m from, and created a scatterplot with Highcharter to compare the content of one saturated fat, palmitic acid, with one polyunsaturated fat, linoleic acid, in olive oils from that region.