Data downloaded from Kaggle.
https://www.kaggle.com/datasets/utsavdey1410/food-nutrition-dataset
Clear the workspace and read the help file.
remove(list = ls()) # remove all objects from the workspace
?read.csv # open the help file
My original import commands. You have to change the file path to match where the file is stored on your computer.
FOOD.DATA.GROUP1 <- read.csv("~/Desktop/FOOD-DATA-GROUP1.csv") # the key argument (file) is matched by position here; you need to identify it yourself
FOOD.DATA.GROUP1 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv") # better coding practice is to name the key argument
FOOD.DATA.GROUP2 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv",
header = TRUE) # explicitly specifying the default argument does not change anything, but might be good practice when you are new
FOOD.DATA.GROUP2 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv",
header = FALSE)
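To see what the header argument changes, here is a minimal sketch on a small temporary file. The file contents and the names toy_csv, with_header and no_header are made up for illustration; they are not part of the Kaggle data.

```r
# Write a tiny CSV to a temporary file (toy data, not the Kaggle file)
toy_csv <- tempfile(fileext = ".csv")
writeLines(c("food,Fat", "apple,0.2", "cheese,33.1"), toy_csv)

with_header <- read.csv(file = toy_csv, header = TRUE)
no_header   <- read.csv(file = toy_csv, header = FALSE)

names(with_header)      # column names are taken from the first line
names(no_header)        # default names; the first line becomes a data row
nrow(with_header)       # two data rows
nrow(no_header)         # three rows, because the header line is counted as data
is.numeric(no_header$V2) # FALSE: the text "Fat" prevents a numeric column

unlink(toy_csv)         # delete the temporary file
```

With header = FALSE the column names end up inside the data, so numeric columns are no longer read as numbers.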
This piece of code is better because you do not have to change the file path: you can run it as is, as long as you keep the original folder structure (do not delete the subfolder FINAL FOOD DATASET).
FOOD.DATA.GROUP1 <- read.csv("~/Desktop/FOOD-DATA-GROUP1.csv") # you need to identify the key argument in the command
FOOD.DATA.GROUP2 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv",
header = FALSE)
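A path that does not depend on one particular computer could be sketched as follows. The folder layout is an assumption: the code expects the working directory to contain a subfolder named FINAL FOOD DATASET with FOOD-DATA-GROUP1.csv inside it. The name csv_path is hypothetical.

```r
# Assumed layout: <working directory>/FINAL FOOD DATASET/FOOD-DATA-GROUP1.csv
# Adjust the folder name if your copy of the data is organised differently.
csv_path <- file.path("FINAL FOOD DATASET", "FOOD-DATA-GROUP1.csv")

if (file.exists(csv_path)) {
  FOOD.DATA.GROUP1 <- read.csv(file = csv_path)
} else {
  # Report where R looked, instead of failing with a less helpful error
  message("File not found: ", csv_path,
          " (working directory is ", getwd(), ")")
}
```

Because the path is relative, the same lines can run on any machine where the working directory is set to the project folder.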
We will create a healthy and an unhealthy panel of foods, based on the Nutrition.Density values.
?remove
remove(FOOD.DATA.GROUP2) # remove the second data set
FOOD.DATA.GROUP1$X <- NULL # drop the column X
FOOD.DATA.GROUP1$healthy <- FOOD.DATA.GROUP1$Nutrition.Density > 10 # our own definition of healthy: TRUE when Nutrition.Density is above 10, FALSE otherwise
df_healthy_food <- FOOD.DATA.GROUP1[FOOD.DATA.GROUP1$healthy, ] # healthy food data set
df_unhealthy_food <- FOOD.DATA.GROUP1[!FOOD.DATA.GROUP1$healthy, ] # the opposite: unhealthy food data set
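The same split can be checked on a toy data frame. The values below are made up for illustration and the names toy, toy_healthy and toy_unhealthy are hypothetical.

```r
# Toy data with made-up values (not taken from the Kaggle file)
toy <- data.frame(food = c("a", "b", "c", "d"),
                  Nutrition.Density = c(4.2, 10, 15.7, 120.3))

# Strictly greater than 10, so a value of exactly 10 is flagged FALSE
toy$healthy <- toy$Nutrition.Density > 10

toy_healthy   <- toy[toy$healthy, ]   # rows where healthy is TRUE
toy_unhealthy <- toy[!toy$healthy, ]  # rows where healthy is FALSE

table(toy$healthy)                    # number of rows in each group
# Sanity check: every row lands in exactly one of the two groups
nrow(toy_healthy) + nrow(toy_unhealthy) == nrow(toy)
```

If Nutrition.Density contained missing values, the logical index would contain NA as well and both subsets would pick up rows filled with NA, so it may be worth checking for missing values first.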
You can play with the nrows argument to import only a small subset of the original data.
FOOD.DATA.GROUP2 <- read.csv(file = "~/Desktop/FOOD-DATA-GROUP1.csv",
header = TRUE,
nrows = 10)
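A quick way to confirm what nrows does is to try it on a temporary file of known size. This is a sketch on generated data; big_csv and first_rows are hypothetical names.

```r
# Create a temporary CSV with 100 data rows and a header line
big_csv <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:100, value = (1:100) / 2),
          big_csv, row.names = FALSE)

# Read only the first 10 data rows; the header line is not counted in nrows
first_rows <- read.csv(file = big_csv, header = TRUE, nrows = 10)

dim(first_rows)  # 10 rows and 2 columns
str(first_rows)  # column types are inferred from the rows that were read

unlink(big_csv)  # delete the temporary file
```

Importing a few rows first is a cheap way to inspect column names and types before reading a large file in full.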
The ‘read.csv’ help file gives us some examples of how to use the command.
?read.csv # open the help file
## using count.fields to handle unknown maximum number of fields
## when fill = TRUE
test1 <- c(1:5, "6,7", "8,9,10")
tf <- tempfile()
writeLines(test1, tf)
read.csv(tf, fill = TRUE) # 1 column
ncol <- max(count.fields(tf, sep = ","))
read.csv(tf, fill = TRUE, header = FALSE,
col.names = paste0("V", seq_len(ncol)))
unlink(tf)
## "Inline" data set, using text=
## Notice that leading and trailing empty lines are auto-trimmed
read.table(header = TRUE, text = "
a b
1 2
3 4
")
We will use the psych package. Use the stargazer package instead if you wish.
# install.packages("psych") # installation - only once
library(psych)
Now you can use the package.
describe(df_healthy_food)
## Warning in FUN(newX[, i], ...): no non-missing arguments to min; returning Inf
## Warning in FUN(newX[, i], ...): no non-missing arguments to max; returning -Inf
## vars n mean sd median trimmed mad min max
## Unnamed..0 1 518 271.13 157.42 268.50 270.20 199.41 1.00 550.0
## food* 2 518 259.50 149.68 259.50 259.50 192.00 1.00 518.0
## Caloric.Value 3 518 249.51 198.62 196.50 222.81 165.31 8.00 1578.0
## Fat 4 518 11.35 12.71 7.35 9.01 7.71 0.00 87.5
## Saturated.Fats 5 518 3.83 5.23 2.10 2.78 2.52 0.00 43.5
## Monounsaturated.Fats 6 518 4.05 5.30 2.40 3.02 2.82 0.00 48.0
## Polyunsaturated.Fats 7 518 2.28 3.54 1.20 1.58 1.33 0.00 40.1
## Carbohydrates 8 518 16.77 20.56 7.20 13.54 10.67 0.00 128.3
## Sugars 9 518 2.87 7.66 0.00 1.20 0.00 0.00 70.8
## Protein 10 518 19.46 19.05 12.85 15.90 12.08 0.00 86.9
## Dietary.Fiber 11 518 1.17 2.21 0.00 0.66 0.00 0.00 17.5
## Cholesterol 12 518 64.93 70.93 39.00 52.61 49.22 0.00 352.5
## Sodium 13 518 0.59 0.60 0.40 0.51 0.48 0.00 6.1
## Water 14 518 107.55 88.01 88.75 99.16 93.18 0.00 535.8
## Vitamin.A 15 518 0.08 0.17 0.04 0.05 0.06 0.00 2.1
## Vitamin.B1 16 518 0.16 0.21 0.09 0.12 0.11 0.00 1.9
## Vitamin.B11 17 518 0.07 0.09 0.05 0.05 0.05 0.00 1.3
## Vitamin.B12 18 518 0.04 0.04 0.04 0.04 0.04 0.00 0.4
## Vitamin.B2 19 518 0.21 0.30 0.10 0.17 0.15 0.00 3.8
## Vitamin.B3 20 518 3.88 6.08 1.90 2.53 2.52 0.00 57.8
## Vitamin.B5 21 518 1.15 2.50 0.50 0.71 0.59 0.00 31.4
## Vitamin.B6 22 518 0.33 0.50 0.10 0.22 0.15 0.00 4.3
## Vitamin.C 23 518 2.23 5.20 0.10 0.92 0.15 0.00 42.9
## Vitamin.D 24 518 0.18 1.45 0.00 0.01 0.00 0.00 29.3
## Vitamin.E 25 518 0.58 1.26 0.04 0.30 0.07 0.00 14.1
## Vitamin.K 26 518 0.45 7.32 0.01 0.03 0.02 0.00 166.4
## Calcium 27 518 100.79 174.24 42.80 62.88 52.78 0.00 1283.5
## Copper 28 518 8.91 45.68 0.10 0.19 0.14 0.00 668.6
## Iron 29 518 1.62 1.91 1.20 1.31 1.33 0.00 21.1
## Magnesium 30 518 36.56 45.98 23.50 28.44 25.95 0.00 376.2
## Manganese 31 518 3.73 12.96 0.20 0.27 0.26 0.00 111.6
## Phosphorus 32 518 232.27 245.80 152.70 189.75 167.09 0.00 1385.1
## Potassium 33 518 367.27 383.28 246.80 297.51 246.33 0.00 3198.4
## Selenium 34 518 36.91 157.62 0.05 0.05 0.04 0.00 1679.0
## Zinc 35 518 1.74 6.71 0.90 1.09 1.04 0.00 147.3
## Nutrition.Density 36 518 153.46 189.54 93.81 114.95 83.96 10.15 1337.0
## healthy 37 518 NaN NA NA NaN NA Inf -Inf
## range skew kurtosis se
## Unnamed..0 549.00 0.04 -1.20 6.92
## food* 517.00 0.00 -1.21 6.58
## Caloric.Value 1570.00 1.95 6.90 8.73
## Fat 87.50 2.25 6.65 0.56
## Saturated.Fats 43.50 3.19 15.19 0.23
## Monounsaturated.Fats 48.00 3.17 15.43 0.23
## Polyunsaturated.Fats 40.10 5.18 42.82 0.16
## Carbohydrates 128.30 1.55 3.41 0.90
## Sugars 70.80 5.35 34.01 0.34
## Protein 86.90 1.67 2.43 0.84
## Dietary.Fiber 17.50 3.00 12.19 0.10
## Cholesterol 352.50 1.45 1.52 3.12
## Sodium 6.10 2.23 12.76 0.03
## Water 535.80 1.04 1.39 3.87
## Vitamin.A 2.10 6.56 58.35 0.01
## Vitamin.B1 1.90 3.61 20.60 0.01
## Vitamin.B11 1.30 6.33 63.47 0.00
## Vitamin.B12 0.40 2.17 15.88 0.00
## Vitamin.B2 3.80 6.45 62.42 0.01
## Vitamin.B3 57.80 3.61 18.04 0.27
## Vitamin.B5 31.40 7.42 70.05 0.11
## Vitamin.B6 4.30 3.39 15.66 0.02
## Vitamin.C 42.90 4.08 20.89 0.23
## Vitamin.D 29.30 16.53 317.27 0.06
## Vitamin.E 14.10 4.92 35.26 0.06
## Vitamin.K 166.40 22.50 507.11 0.32
## Calcium 1283.50 3.66 14.95 7.66
## Copper 668.60 9.01 103.82 2.01
## Iron 21.10 3.97 27.61 0.08
## Magnesium 376.20 3.45 17.59 2.02
## Manganese 111.60 4.61 23.85 0.57
## Phosphorus 1385.10 1.76 3.45 10.80
## Potassium 3198.40 2.17 7.19 16.84
## Selenium 1679.00 6.01 43.30 6.93
## Zinc 147.30 19.81 424.90 0.29
## Nutrition.Density 1326.85 3.17 11.64 8.33
## healthy -Inf NA NA NA
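The two warnings and the NaN, Inf and -Inf entries in the last row appear to come from the logical healthy column, for which min and max find no numeric values. One possible way to avoid them is to keep only the numeric columns before summarising. The sketch below uses a toy data frame with made-up values and base R's summary so that it runs without extra packages; the names toy, is_num and toy_numeric are hypothetical.

```r
# Toy data frame with one logical column, mimicking the healthy flag
toy <- data.frame(Fat = c(0.2, 33.1, 7.4),
                  Protein = c(0.3, 25.0, 12.9),
                  healthy = c(FALSE, TRUE, TRUE))

# is.numeric() is TRUE for numeric columns and FALSE for logical ones
is_num <- sapply(toy, is.numeric)
toy_numeric <- toy[, is_num, drop = FALSE]  # drop = FALSE keeps a data frame

names(toy_numeric)    # the healthy column is gone
summary(toy_numeric)  # base R summary of the remaining columns
# With psych loaded, describe(toy_numeric) would be the analogous call
```

Applied to the real data, the same selection could be used on df_healthy_food before calling describe. Note that the food column would still be reported, since describe handles it separately and marks it with an asterisk in the output above.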