Team members: Priyanka Veeranki, Sneha Vanguri, Amod Ashok Panchal, Sai Sudha Upadrasta

#Loading necessary Packages

library(ggplot2)
library(ggcorrplot)
## Warning: package 'ggcorrplot' was built under R version 4.2.1
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.2.1
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ tibble  3.1.7     ✔ dplyr   1.0.9
## ✔ tidyr   1.2.0     ✔ stringr 1.4.0
## ✔ readr   2.1.2     ✔ forcats 0.5.1
## ✔ purrr   0.3.4     
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

#Reading Data

First we need to read the data. We are provided the file Final_Nutrition2.xlsx; the code below reads it from the CSV file Final_Nutrition2.csv.

Nut.df <- read_csv("C:/Users/Public/Final_Nutrition2.csv") %>%
 as.data.frame()
## Rows: 8789 Columns: 42
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr  (5): Shrt_Desc, Long_Desc, FdGrp_Desc, GmWt_Desc1, GmWt_Desc2
## dbl (37): NDB_No, Water_g, Energ_Kcal, Protein_g, Lipid_Tot_g, Fiber_TD_g, S...
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
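The message above mentions `show_col_types = FALSE`. As a minimal sketch of that option, the block below writes a tiny stand-in CSV to a temporary file and reads it back quietly. The temporary file and the object name `toy` are assumptions for illustration; the two rows use the first values displayed for NDB_No and Energ_Kcal in the `str()` output of this report.

```r
library(readr)

# Write a tiny stand-in CSV to a temporary file (not the real data file)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(NDB_No = c(1001, 1002), Energ_Kcal = c(717, 718)),
          tmp, row.names = FALSE)

# show_col_types = FALSE suppresses the column specification message
toy <- as.data.frame(read_csv(tmp, show_col_types = FALSE))

str(toy)
unlink(tmp)  # remove the temporary file
```

The same argument could be added to the `read_csv()` call on the real file if the column specification message is not wanted in the report.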

#Merging

After loading the data, we check whether there are different data sets to merge. In this case study we are provided just one data set, Final_Nutrition2.csv, so no merging is needed. Now we check the structure of the data before doing EDA.

Nut.df %>% str()
## 'data.frame':    8789 obs. of  42 variables:
##  $ NDB_No         : num  1001 1002 1003 1004 1005 ...
##  $ Shrt_Desc      : chr  "BUTTER,WITH SALT" "BUTTER,WHIPPED,W/ SALT" "BUTTER OIL,ANHYDROUS" "CHEESE,BLUE" ...
##  $ Long_Desc      : chr  "Butter, salted" "Butter, whipped, with salt" "Butter oil, anhydrous" "Cheese, blue" ...
##  $ FdGrp_Desc     : chr  "Dairy and Egg Products" "Dairy and Egg Products" "Dairy and Egg Products" "Dairy and Egg Products" ...
##  $ Water_g        : num  15.87 16.72 0.24 42.41 41.11 ...
##  $ Energ_Kcal     : num  717 718 876 353 371 334 300 376 404 387 ...
##  $ Protein_g      : num  0.85 0.49 0.28 21.4 23.24 ...
##  $ Lipid_Tot_g    : num  81.1 78.3 99.5 28.7 29.7 ...
##  $ Fiber_TD_g     : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Sugar_Tot_g    : num  0.06 0.06 0 0.5 0.51 0.45 0.46 NA 0.48 NA ...
##  $ Calcium_mg     : num  24 23 4 528 674 184 388 673 710 643 ...
##  $ Iron_mg        : num  0.02 0.05 0 0.31 0.43 0.5 0.33 0.64 0.14 0.21 ...
##  $ Magnesium_mg   : num  2 1 0 23 24 20 20 22 27 21 ...
##  $ Phosphorus_mg  : num  24 24 3 387 451 188 347 490 455 464 ...
##  $ Potassium_mg   : num  24 41 5 256 136 152 187 93 76 95 ...
##  $ Sodium_mg      : num  643 583 2 1146 560 ...
##  $ Zinc_mg        : num  0.09 0.05 0.01 2.66 2.6 2.38 2.38 2.94 3.64 2.79 ...
##  $ Copper_mg      : num  0 0.01 0.001 0.04 0.024 0.019 0.021 0.024 0.03 0.042 ...
##  $ Manganese_mg   : num  0 0.001 0 0.009 0.012 0.034 0.038 0.021 0.027 0.012 ...
##  $ Selenium_g     : num  1 0 0 14.5 14.5 14.5 14.5 14.5 28.5 14.5 ...
##  $ Vit_C_mg       : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Thiamin_mg     : num  0.005 0.007 0.001 0.029 0.014 0.07 0.028 0.031 0.029 0.046 ...
##  $ Riboflavin_mg  : num  0.034 0.064 0.005 0.382 0.351 0.52 0.488 0.45 0.428 0.293 ...
##  $ Niacin_mg      : num  0.042 0.022 0.003 1.016 0.118 ...
##  $ Vit_B6_mg      : num  0.003 0.008 0.001 0.166 0.065 0.235 0.227 0.074 0.066 0.074 ...
##  $ Folate_Tot_g   : num  3 4 0 36 20 65 62 18 27 18 ...
##  $ Folic_Acid_g   : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Food_Folate_g  : num  3 4 0 36 20 65 62 18 27 18 ...
##  $ Choline_Tot__mg: num  18.8 18.8 22.3 15.4 15.4 15.4 15.4 NA 16.5 NA ...
##  $ Vit_B12_g      : num  0.17 0.07 0.01 1.22 1.26 1.65 1.3 0.27 1.1 0.83 ...
##  $ Lycopene_g     : num  0 0 0 0 0 0 0 NA 0 NA ...
##  $ Vit_E_mg       : num  2.32 1.37 2.8 0.25 0.26 0.24 0.21 NA 0.71 NA ...
##  $ Vit_D_g        : num  0 0 0 0.5 0.5 0.5 0.4 NA 0.6 NA ...
##  $ Vit_K_g        : num  7 4.6 8.6 2.4 2.5 2.3 2 NA 2.4 NA ...
##  $ FA_Sat_g       : num  51.4 45.4 61.9 18.7 18.8 ...
##  $ FA_Mono_g      : num  21.02 19.87 28.73 7.78 8.6 ...
##  $ FA_Poly_g      : num  3.043 3.331 3.694 0.8 0.784 ...
##  $ Cholestrl_mg   : num  215 225 256 75 94 100 72 93 99 103 ...
##  $ GmWt_1         : num  5 3.8 12.8 28.4 132 ...
##  $ GmWt_Desc1     : chr  "1 pat,  (1\" sq, 1/3\" high)" "1 pat,  (1\" sq, 1/3\" high)" "1 tbsp" "1 oz" ...
##  $ GmWt_2         : num  14.2 9.4 205 17 113 144 246 NA 244 NA ...
##  $ GmWt_Desc2     : chr  "1 tbsp" "1 tbsp" "1 cup" "1 cubic inch" ...

From the structure above, we can see that the nutrition data set contains 42 variables and 8,789 rows. The variable we are going to predict is “Energ_Kcal”, which gives the calories of the different foods. Next, let’s look at the basic summary statistics of the nutrition data.

#Now we choose a few columns along with the response variable

Nut.df <- Nut.df %>%
 select(Energ_Kcal, Protein_g, Lipid_Tot_g,
 Fiber_TD_g, Potassium_mg, Sodium_mg,
 Manganese_mg )

#Calculate summary statistics of the subset of data

summary(Nut.df)
##    Energ_Kcal      Protein_g      Lipid_Tot_g       Fiber_TD_g    
##  Min.   :  0.0   Min.   : 0.00   Min.   :  0.00   Min.   : 0.000  
##  1st Qu.: 91.0   1st Qu.: 2.38   1st Qu.:  0.95   1st Qu.: 0.000  
##  Median :191.0   Median : 8.02   Median :  5.13   Median : 0.700  
##  Mean   :226.3   Mean   :11.35   Mean   : 10.55   Mean   : 2.188  
##  3rd Qu.:337.0   3rd Qu.:19.88   3rd Qu.: 13.72   3rd Qu.: 2.600  
##  Max.   :902.0   Max.   :88.32   Max.   :100.00   Max.   :79.000  
##                                                   NA's   :594     
##   Potassium_mg       Sodium_mg        Manganese_mg     
##  Min.   :    0.0   Min.   :    0.0   Min.   :  0.0000  
##  1st Qu.:  127.0   1st Qu.:   41.0   1st Qu.:  0.0150  
##  Median :  229.0   Median :   88.0   Median :  0.0820  
##  Mean   :  279.4   Mean   :  312.6   Mean   :  0.6582  
##  3rd Qu.:  335.5   3rd Qu.:  405.0   3rd Qu.:  0.3260  
##  Max.   :16500.0   Max.   :38758.0   Max.   :328.0000  
##  NA's   :426       NA's   :83        NA's   :2160

The summary statistics give us a quick insight into the data. The variables in the nutrition data are on very different scales from one another: for example, Manganese_mg has a mean of 0.6582 while Sodium_mg has a mean of 312.6. Several columns also contain missing values (NA's), which need to be handled before further analysis.

#Checking Missing values count

apply(Nut.df, 2, FUN = function(x) sum(is.na(x)))
##   Energ_Kcal    Protein_g  Lipid_Tot_g   Fiber_TD_g Potassium_mg    Sodium_mg 
##            0            0            0          594          426           83 
## Manganese_mg 
##         2160
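As a cross-check on the count above, the same per-column totals can be obtained with `colSums(is.na(...))`. The sketch below uses a small made-up data frame (`toy`, with invented values) rather than the real Nut.df, so the numbers are purely illustrative.

```r
# Toy data frame with invented values (not taken from Final_Nutrition2.csv)
toy <- data.frame(
  Energ_Kcal   = c(717, 353, 300, 120),
  Fiber_TD_g   = c(0, NA, 1.2, NA),
  Manganese_mg = c(NA, 0.009, 0.038, NA)
)

# is.na() gives a logical matrix; colSums() counts the TRUEs per column
na_counts <- colSums(is.na(toy))

# Same counts via the apply() form used in this report
na_counts_apply <- apply(toy, 2, FUN = function(x) sum(is.na(x)))

print(na_counts)
```

Both forms return a named vector with one count per column; `colSums()` is simply shorter to write.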

From the output above, we can see that the data contain many missing values. The missing values are in Fiber_TD_g (594), Potassium_mg (426), Sodium_mg (83) and Manganese_mg (2,160).

#Removing the NA records

Nut.df <- Nut.df %>% na.omit()

#Check the dimensions of the dataset again after removing the missing values

dim(Nut.df)
## [1] 6209    7

We can see that, after removing the NA records, 6209 rows remain.
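To put the loss from `na.omit()` in perspective, the row counts reported in this document (8789 before, 6209 after) can be compared directly. The sketch below only does arithmetic on those reported numbers and on the NA counts printed earlier; the variable names are ours.

```r
rows_before <- 8789   # from the str() output
rows_after  <- 6209   # from dim() after na.omit()

rows_dropped <- rows_before - rows_after
pct_dropped  <- round(100 * rows_dropped / rows_before, 1)

# NA counts per column as printed earlier in the report
na_counts <- c(Fiber_TD_g = 594, Potassium_mg = 426,
               Sodium_mg = 83, Manganese_mg = 2160)

# na.omit() drops a row if ANY column is NA, so the number of rows dropped
# can be smaller than the sum of the per-column NA counts (a row with
# several NAs is only dropped once)
cat("Rows dropped:", rows_dropped, "(", pct_dropped, "% )\n")
cat("Sum of per-column NA counts:", sum(na_counts), "\n")
```

Every row with a missing Manganese_mg value is dropped, so that single column accounts for 2160 of the dropped rows.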

#Let’s check the distribution of the response variable

ggplot(data = Nut.df,
 mapping = aes(Energ_Kcal)) +
 # binwidth takes precedence over bins, so only binwidth is set
 geom_histogram(binwidth = 50, fill = "coral") +
 theme_minimal()

From the graph above we can see the distribution of our response variable, calories. We can see that a large share of the foods are low in calories.
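In `geom_histogram()`, `binwidth` and `bins` are two ways of specifying the bins, and according to the ggplot2 documentation `bins` is overridden by `binwidth`. As a hedged sketch, the block below sets only `binwidth = 50` on invented, evenly spaced calorie values (the `toy` data frame is an assumption, not the real Energ_Kcal column) and inspects the bins that ggplot2 computes.

```r
library(ggplot2)

# Toy calorie values (invented, evenly spaced; not the real Energ_Kcal column)
toy <- data.frame(Energ_Kcal = seq(0, 900, by = 25))

p <- ggplot(toy, aes(Energ_Kcal)) +
  geom_histogram(binwidth = 50, fill = "coral") +
  theme_minimal()

# layer_data() returns the computed bins; xmin/xmax are the bin edges
bins_df <- layer_data(p)
bin_widths <- bins_df$xmax - bins_df$xmin
print(unique(round(bin_widths, 6)))
```

Checking the computed bin edges this way is a quick way to confirm which of the two arguments is actually in effect.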

#Box plots of the independent variables

# Reshape the predictors (all columns except Energ_Kcal) to long format
gather(Nut.df[, -1], "Variable", "Value") %>%
 ggplot(mapping = aes(y=Value, fill = Variable)) +
 # One panel per variable, each with its own y scale
 facet_wrap(~ Variable, scales = "free_y") +
 geom_boxplot(color = "green") +
 theme_minimal() +
 theme(legend.position = "none")

From the above plots, we can see that there are many extreme points (outliers). This makes it very difficult to get accurate values.
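The points drawn beyond the whiskers of a box plot are, by default in `geom_boxplot()`, the values lying more than 1.5 times the interquartile range outside the quartiles. A minimal sketch of counting such points is given below; `count_outliers` is a hypothetical helper written for this illustration, and the vector `x` contains invented values.

```r
# Hypothetical helper: count values outside the 1.5 * IQR fences
count_outliers <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE)
  iqr <- q[[2]] - q[[1]]
  lower_fence <- q[[1]] - 1.5 * iqr
  upper_fence <- q[[2]] + 1.5 * iqr
  sum(x < lower_fence | x > upper_fence, na.rm = TRUE)
}

# Toy vector with one obviously extreme value (invented for illustration)
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

n_outliers <- count_outliers(x)
print(n_outliers)
```

On the real data the helper could be applied to each predictor with `sapply(Nut.df[, -1], count_outliers)`.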

#Scatter plot with linear fitted lines

gather(Nut.df, "Variable", "Value", -Energ_Kcal) %>%
 ggplot(mapping = aes(Value, Energ_Kcal)) +
 facet_wrap(~ Variable, scales = "free_x") +
 geom_point(color = "green") +
 geom_smooth(method = "lm", col = "coral") +
 theme_minimal()
## `geom_smooth()` using formula 'y ~ x'

From the above plots, the independent variables are positively correlated with the response variable.
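Each fitted line drawn by `geom_smooth(method = "lm")` is a simple linear regression of the form y ~ x. As a rough sketch of the same kind of fit, the block below regresses Energ_Kcal on Lipid_Tot_g using only the first five rows as displayed (rounded) in the `str()` output. These five rows are all dairy products and come from the data before NA removal, so the coefficients are illustrative only and are not the fit on the full data.

```r
# First five rows as displayed (rounded) in the str() output of this report
first_rows <- data.frame(
  Energ_Kcal  = c(717, 718, 876, 353, 371),
  Lipid_Tot_g = c(81.1, 78.3, 99.5, 28.7, 29.7)
)

# Simple linear regression: the same kind of y ~ x fit that
# geom_smooth(method = "lm") draws in each panel
fit <- lm(Energ_Kcal ~ Lipid_Tot_g, data = first_rows)

slope <- coef(fit)[["Lipid_Tot_g"]]
print(coef(fit))
```

A positive slope corresponds to an upward-sloping fitted line in the scatter plot.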

#Correlation matrix

cm <- round(cor(Nut.df), 2)
cm
##              Energ_Kcal Protein_g Lipid_Tot_g Fiber_TD_g Potassium_mg Sodium_mg
## Energ_Kcal         1.00      0.17        0.80       0.22         0.12      0.05
## Protein_g          0.17      1.00        0.13      -0.10         0.21     -0.01
## Lipid_Tot_g        0.80      0.13        1.00       0.01        -0.01      0.02
## Fiber_TD_g         0.22     -0.10        0.01       1.00         0.36     -0.01
## Potassium_mg       0.12      0.21       -0.01       0.36         1.00      0.00
## Sodium_mg          0.05     -0.01        0.02      -0.01         0.00      1.00
## Manganese_mg       0.05      0.03        0.00       0.12         0.15      0.00
##              Manganese_mg
## Energ_Kcal           0.05
## Protein_g            0.03
## Lipid_Tot_g          0.00
## Fiber_TD_g           0.12
## Potassium_mg         0.15
## Sodium_mg            0.00
## Manganese_mg         1.00

#Plot the resulting matrix

ggcorrplot(corr = cm,lab = T, type= "lower")

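The heatmap and correlation matrix show that most pairs of variables are only weakly related. The one strong relationship is between Energ_Kcal and Lipid_Tot_g (0.80); among the independent variables themselves the largest correlation is 0.36 (Fiber_TD_g and Potassium_mg).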
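To see which pairs stand out, the correlation matrix can be flattened into a list of variable pairs and sorted by absolute value. The sketch below re-enters the rounded matrix exactly as printed in this report; the object name `cor_pairs` is ours.

```r
vars <- c("Energ_Kcal", "Protein_g", "Lipid_Tot_g", "Fiber_TD_g",
          "Potassium_mg", "Sodium_mg", "Manganese_mg")

# Rounded correlation matrix as printed in this report
cm <- matrix(c(
  1.00,  0.17,  0.80,  0.22,  0.12,  0.05, 0.05,
  0.17,  1.00,  0.13, -0.10,  0.21, -0.01, 0.03,
  0.80,  0.13,  1.00,  0.01, -0.01,  0.02, 0.00,
  0.22, -0.10,  0.01,  1.00,  0.36, -0.01, 0.12,
  0.12,  0.21, -0.01,  0.36,  1.00,  0.00, 0.15,
  0.05, -0.01,  0.02, -0.01,  0.00,  1.00, 0.00,
  0.05,  0.03,  0.00,  0.12,  0.15,  0.00, 1.00
), nrow = 7, byrow = TRUE, dimnames = list(vars, vars))

# Keep each pair once (upper triangle), then sort by absolute correlation
idx <- which(upper.tri(cm), arr.ind = TRUE)
cor_pairs <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[upper.tri(cm)]
)
cor_pairs <- cor_pairs[order(-abs(cor_pairs$r)), ]

print(head(cor_pairs, 3))
```

With the real data, `cm` would come straight from `round(cor(Nut.df), 2)` instead of being typed in.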

We did some data preprocessing and identified where the missing values occur, then removed the records with missing values. Once we had clean data, we performed an EDA on the nutrition data and plotted the distribution of the response variable, calories. We then performed a correlation analysis and found no strong correlation among the independent variables.