Import Nutrition Data

Data downloaded from Kaggle.

https://www.kaggle.com/datasets/utsavdey1410/food-nutrition-dataset

Clear the workspace and read the help file.

remove(list = ls()) # delete every object in the current workspace

?read.csv # open the help file

My original import commands. You have to change the file path so that it points to the file on your own machine.

FOOD.DATA.GROUP1 <- read.csv("~/Desktop/FOOD-DATA-GROUP1.csv") # you need to identify the key argument in the command

FOOD.DATA.GROUP1 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv") # better coding practice is to specify the key argument

FOOD.DATA.GROUP2 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv",
                             header = TRUE) # explicitly specifying the default argument does not change anything, but can be good practice when you are new

FOOD.DATA.GROUP2 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv", 
                             header = FALSE)
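
To see what the header argument changes, here is a minimal sketch on a tiny made-up data set. The column names and values are hypothetical and are passed in through the text argument, so no file is needed.

```r
# Hypothetical two-column CSV, supplied inline instead of from a file
csv_text <- "food,Caloric.Value\napple,52\nbanana,89"

# header = TRUE: the first line supplies the column names
with_header <- read.csv(text = csv_text, header = TRUE)

# header = FALSE: the first line is treated as data, columns are named V1, V2, ...
without_header <- read.csv(text = csv_text, header = FALSE)

names(with_header)
names(without_header)
nrow(with_header)
nrow(without_header)
```

With header = FALSE the column names end up inside the data, so a numeric column such as Caloric.Value is read as text. That is usually a sign that the header setting is wrong for the file.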

This version is better because you do not have to change the file path: you can run the code as-is, as long as you keep the original folder structure (do not delete the sub folder FINAL FOOD DATASET).

FOOD.DATA.GROUP1 <- read.csv("~/Desktop/FOOD-DATA-GROUP1.csv") # you need to identify the key argument in the command

FOOD.DATA.GROUP2 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv", 
                             header = FALSE)
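
For code to run without edits on another machine, the path has to be relative to the working directory rather than to ~/Desktop. A minimal sketch, assuming the CSV sits inside the FINAL FOOD DATASET sub folder of the working directory (that location is an assumption, not something confirmed above):

```r
# Assumption: the working directory contains the sub folder "FINAL FOOD DATASET"
# and the CSV file is stored inside it.
csv_path <- file.path("FINAL FOOD DATASET", "FOOD-DATA-GROUP1.csv")

if (file.exists(csv_path)) {
  FOOD.DATA.GROUP1 <- read.csv(file = csv_path, header = TRUE)
} else {
  # Report where R looked, which helps when the working directory is wrong
  message("File not found at: ", csv_path, " (working directory: ", getwd(), ")")
}
```

file.path() builds the path with forward slashes, and the file.exists() check gives a readable message instead of an error when the folder has been moved.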

Sub Setting Data

We will split the foods into a healthy and an unhealthy panel, based on their Nutrition.Density values.

?remove
remove(FOOD.DATA.GROUP2) # remove the second data set

FOOD.DATA.GROUP1$X <- NULL # drop the X column

FOOD.DATA.GROUP1$healthy <- FOOD.DATA.GROUP1$Nutrition.Density > 10 # TRUE/FALSE flag: a food counts as healthy when its Nutrition.Density is above 10 (our own cutoff)

df_healthy_food <- FOOD.DATA.GROUP1[FOOD.DATA.GROUP1$healthy, ] # healthy food data set
df_unhealthy_food <- FOOD.DATA.GROUP1[!FOOD.DATA.GROUP1$healthy, ] # the complement: unhealthy food data set
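
The same logical-flag subsetting can be tried on a small made-up data frame first. This is only a sketch: the toy values are hypothetical, and the cutoff of 10 is the one used above.

```r
# Hypothetical toy data with the same column name as the real data set
toy <- data.frame(food = c("a", "b", "c"),
                  Nutrition.Density = c(5, 12, 30))

# Logical flag: TRUE when the density is above the cutoff
toy$healthy <- toy$Nutrition.Density > 10

# Rows where the flag is TRUE, and the complement
healthy_toy <- toy[toy$healthy, ]
unhealthy_toy <- toy[!toy$healthy, ]

# Sanity check: the two subsets together account for every row
nrow(healthy_toy) + nrow(unhealthy_toy) == nrow(toy)
```

A quick row count like this confirms that no food was dropped or counted twice, which can happen when the flag column contains missing values.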

Sub Setting Large data

Use the nrows argument to import only the first few rows of the original data.

FOOD.DATA.GROUP2 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv", 
                             header = TRUE,
                             nrows = 10)
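
A quick way to confirm what nrows does is to check the dimensions of the result. The sketch below uses a hypothetical 20-row data set passed in through the text argument, so it runs without the CSV file.

```r
# Hypothetical CSV: one header line followed by 20 data rows
csv_text <- paste(c("food,Caloric.Value",
                    paste0("item", 1:20, ",", 1:20)),
                  collapse = "\n")

# Read only the first 10 data rows; the header line is not counted
preview <- read.csv(text = csv_text, header = TRUE, nrows = 10)

dim(preview)  # rows, then columns
str(preview)  # column types of the preview
```

Reading a small preview first is a cheap way to check column names and types before importing a large file in full.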

The read.csv help file includes examples of how to use the command.

?read.csv # open the help file

## using count.fields to handle unknown maximum number of fields
## when fill = TRUE
test1 <- c(1:5, "6,7", "8,9,10")
tf <- tempfile()
writeLines(test1, tf)

read.csv(tf, fill = TRUE) # 1 column
ncol <- max(count.fields(tf, sep = ","))
read.csv(tf, fill = TRUE, header = FALSE,
         col.names = paste0("V", seq_len(ncol)))
unlink(tf)

## "Inline" data set, using text=
## Notice that leading and trailing empty lines are auto-trimmed

read.table(header = TRUE, text = "
a b
1 2
3 4
")

Summary Stats

We will use the psych package. The stargazer package is an alternative if you prefer it.

Make sure the package is installed and loaded.

# install.packages("psych") # installation - only once
library(psych)

Now you can use the package.

describe(df_healthy_food)

## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning Inf

## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning -Inf

##                      vars   n   mean     sd median trimmed    mad   min    max
## Unnamed..0              1 518 271.13 157.42 268.50  270.20 199.41  1.00  550.0
## food*                   2 518 259.50 149.68 259.50  259.50 192.00  1.00  518.0
## Caloric.Value           3 518 249.51 198.62 196.50  222.81 165.31  8.00 1578.0
## Fat                     4 518  11.35  12.71   7.35    9.01   7.71  0.00   87.5
## Saturated.Fats          5 518   3.83   5.23   2.10    2.78   2.52  0.00   43.5
## Monounsaturated.Fats    6 518   4.05   5.30   2.40    3.02   2.82  0.00   48.0
## Polyunsaturated.Fats    7 518   2.28   3.54   1.20    1.58   1.33  0.00   40.1
## Carbohydrates           8 518  16.77  20.56   7.20   13.54  10.67  0.00  128.3
## Sugars                  9 518   2.87   7.66   0.00    1.20   0.00  0.00   70.8
## Protein                10 518  19.46  19.05  12.85   15.90  12.08  0.00   86.9
## Dietary.Fiber          11 518   1.17   2.21   0.00    0.66   0.00  0.00   17.5
## Cholesterol            12 518  64.93  70.93  39.00   52.61  49.22  0.00  352.5
## Sodium                 13 518   0.59   0.60   0.40    0.51   0.48  0.00    6.1
## Water                  14 518 107.55  88.01  88.75   99.16  93.18  0.00  535.8
## Vitamin.A              15 518   0.08   0.17   0.04    0.05   0.06  0.00    2.1
## Vitamin.B1             16 518   0.16   0.21   0.09    0.12   0.11  0.00    1.9
## Vitamin.B11            17 518   0.07   0.09   0.05    0.05   0.05  0.00    1.3
## Vitamin.B12            18 518   0.04   0.04   0.04    0.04   0.04  0.00    0.4
## Vitamin.B2             19 518   0.21   0.30   0.10    0.17   0.15  0.00    3.8
## Vitamin.B3             20 518   3.88   6.08   1.90    2.53   2.52  0.00   57.8
## Vitamin.B5             21 518   1.15   2.50   0.50    0.71   0.59  0.00   31.4
## Vitamin.B6             22 518   0.33   0.50   0.10    0.22   0.15  0.00    4.3
## Vitamin.C              23 518   2.23   5.20   0.10    0.92   0.15  0.00   42.9
## Vitamin.D              24 518   0.18   1.45   0.00    0.01   0.00  0.00   29.3
## Vitamin.E              25 518   0.58   1.26   0.04    0.30   0.07  0.00   14.1
## Vitamin.K              26 518   0.45   7.32   0.01    0.03   0.02  0.00  166.4
## Calcium                27 518 100.79 174.24  42.80   62.88  52.78  0.00 1283.5
## Copper                 28 518   8.91  45.68   0.10    0.19   0.14  0.00  668.6
## Iron                   29 518   1.62   1.91   1.20    1.31   1.33  0.00   21.1
## Magnesium              30 518  36.56  45.98  23.50   28.44  25.95  0.00  376.2
## Manganese              31 518   3.73  12.96   0.20    0.27   0.26  0.00  111.6
## Phosphorus             32 518 232.27 245.80 152.70  189.75 167.09  0.00 1385.1
## Potassium              33 518 367.27 383.28 246.80  297.51 246.33  0.00 3198.4
## Selenium               34 518  36.91 157.62   0.05    0.05   0.04  0.00 1679.0
## Zinc                   35 518   1.74   6.71   0.90    1.09   1.04  0.00  147.3
## Nutrition.Density      36 518 153.46 189.54  93.81  114.95  83.96 10.15 1337.0
## healthy                37 518    NaN     NA     NA     NaN     NA   Inf   -Inf
##                        range  skew kurtosis    se
## Unnamed..0            549.00  0.04    -1.20  6.92
## food*                 517.00  0.00    -1.21  6.58
## Caloric.Value        1570.00  1.95     6.90  8.73
## Fat                    87.50  2.25     6.65  0.56
## Saturated.Fats         43.50  3.19    15.19  0.23
## Monounsaturated.Fats   48.00  3.17    15.43  0.23
## Polyunsaturated.Fats   40.10  5.18    42.82  0.16
## Carbohydrates         128.30  1.55     3.41  0.90
## Sugars                 70.80  5.35    34.01  0.34
## Protein                86.90  1.67     2.43  0.84
## Dietary.Fiber          17.50  3.00    12.19  0.10
## Cholesterol           352.50  1.45     1.52  3.12
## Sodium                  6.10  2.23    12.76  0.03
## Water                 535.80  1.04     1.39  3.87
## Vitamin.A               2.10  6.56    58.35  0.01
## Vitamin.B1              1.90  3.61    20.60  0.01
## Vitamin.B11             1.30  6.33    63.47  0.00
## Vitamin.B12             0.40  2.17    15.88  0.00
## Vitamin.B2              3.80  6.45    62.42  0.01
## Vitamin.B3             57.80  3.61    18.04  0.27
## Vitamin.B5             31.40  7.42    70.05  0.11
## Vitamin.B6              4.30  3.39    15.66  0.02
## Vitamin.C              42.90  4.08    20.89  0.23
## Vitamin.D              29.30 16.53   317.27  0.06
## Vitamin.E              14.10  4.92    35.26  0.06
## Vitamin.K             166.40 22.50   507.11  0.32
## Calcium              1283.50  3.66    14.95  7.66
## Copper                668.60  9.01   103.82  2.01
## Iron                   21.10  3.97    27.61  0.08
## Magnesium             376.20  3.45    17.59  2.02
## Manganese             111.60  4.61    23.85  0.57
## Phosphorus           1385.10  1.76     3.45 10.80
## Potassium            3198.40  2.17     7.19 16.84
## Selenium             1679.00  6.01    43.30  6.93
## Zinc                  147.30 19.81   424.90  0.29
## Nutrition.Density    1326.85  3.17    11.64  8.33
## healthy                 -Inf    NA       NA    NA
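
The two warnings line up with the healthy row, the only one showing NaN, Inf and -Inf, which suggests they come from the logical healthy column. One possible way around this, sketched here on a hypothetical toy data frame, is to keep only the numeric columns before summarising.

```r
# Hypothetical toy data with a character, a numeric and a logical column
toy <- data.frame(food = c("a", "b", "c"),
                  Nutrition.Density = c(12, 30, 45),
                  healthy = c(TRUE, TRUE, TRUE))

# is.numeric() is FALSE for character and logical columns
numeric_cols <- sapply(toy, is.numeric)
toy_numeric <- toy[, numeric_cols, drop = FALSE]

names(toy_numeric)

# The same idea applied to the real data, with psych loaded (not run here):
# describe(df_healthy_food[, sapply(df_healthy_food, is.numeric)])
```

This filter also drops the food column, which describe() marks with an asterisk because it is not numeric.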

Importing Dataset

Hannah Robinson

2024-07-23
