setwd("C:/Users/avaan/OneDrive/Desktop")
protein <- read.csv(file = "protein.csv")
str(protein)

## 'data.frame':    170 obs. of  32 variables:
##  $ Country                     : chr  "Afghanistan" "Albania" "Algeria" "Angola" ...
##  $ Alcoholic.Beverages         : num  0 0.184 0.0323 0.6285 0.1535 ...
##  $ Animal.Products             : num  9.75 27.75 13.84 15.23 33.19 ...
##  $ Animal.fats                 : num  0.0277 0.0711 0.0054 0.0277 0.1289 ...
##  $ Aquatic.Products..Other     : num  0 0 0 0 0 0 0 0.0046 0 0 ...
##  $ Cereals...Excluding.Beer    : num  36 14.2 26.6 20.4 10.5 ...
##  $ Eggs                        : num  0.407 1.807 1.292 0.176 0.485 ...
##  $ Fish..Seafood               : num  0.0647 0.6274 0.635 5.4436 8.2146 ...
##  $ Fruits...Excluding.Wine     : num  0.582 1.276 1.162 1.275 1.259 ...
##  $ Meat                        : num  3.13 7.66 3.51 7.62 16.07 ...
##  $ Milk...Excluding.Butter     : num  5.53 16.48 8.06 1.15 7.43 ...
##  $ Offals                      : num  0.592 1.108 0.328 0.813 0.853 ...
##  $ Oilcrops                    : num  0.203 0.372 0.183 2.153 0.767 ...
##  $ Pulses                      : num  1.248 1.456 2.551 4.085 0.884 ...
##  $ Spices                      : num  0.166 0 0.178 0 0.344 ...
##  $ Starchy.Roots               : num  0.194 0.887 1.464 5.194 0.467 ...
##  $ Stimulants                  : num  0.555 0.264 0.463 0.102 0.411 ...
##  $ Sugar.Crops                 : num  0 0 0 0 0 0 0 0 0 0 ...
##  $ Sugar...Sweeteners          : num  0 0.0042 0 0.0092 0 0.0049 0.0051 0 0.0139 0 ...
##  $ Treenuts                    : num  0.1387 0.2677 0.2745 0.0092 0.0737 ...
##  $ Vegetal.Products            : num  40.2 22.3 36.2 34.8 16.8 ...
##  $ Vegetable.Oils              : num  0 0.0084 0.0269 0.0092 0.043 0 0.0205 0.0463 0.0647 0.0217 ...
##  $ Vegetables                  : num  1.137 3.246 3.127 0.813 1.602 ...
##  $ Miscellaneous               : num  0.0462 0.0544 0.1399 0.0924 0.2947 ...
##  $ Obesity                     : num  4.5 22.3 26.6 6.8 19.1 28.5 20.9 30.4 21.9 19.9 ...
##  $ Undernourished              : chr  "29.8" "6.2" "3.9" "25" ...
##  $ Confirmed                   : num  0.1421 2.9673 0.2449 0.0617 0.2939 ...
##  $ Deaths                      : num  0.00619 0.05095 0.00656 0.00146 0.00714 ...
##  $ Recovered                   : num  0.1234 1.7926 0.1676 0.0568 0.1908 ...
##  $ Active                      : num  0.01257 1.12371 0.07077 0.00342 0.09592 ...
##  $ Population                  : num  38928000 2838000 44357000 32522000 98000 ...
##  $ Unit..all.except.Population.: chr  "%" "%" "%" "%" ...

library(MASS)

Question 2

The variable that I am going to use in my analysis is called ‘Animal.Products’. This variable describes the percentage of each country's protein intake that comes from animal products.

Question 3

result <- t.test(protein$Animal.Products)
print(result$conf.int)

## [1] 20.03275 22.43156
## attr(,"conf.level")
## [1] 0.95

I used the one-sample t-test, the method covered in previous statistics courses, to construct a confidence interval for the mean of Animal.Products. It gave a 95% CI of [20.03, 22.43].
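To make the t-interval less of a black box, here is a minimal sketch of the calculation that `t.test()` performs, mean plus or minus the t critical value times the standard error. The helper name `t_ci` is hypothetical, and the vector `x` holds only the first five rounded values of Animal.Products shown in the `str()` output, not the full data set, so its interval will not match the one reported above.

```r
# First five rounded Animal.Products values from the str() output above
# (a stand-in for the full column, used only to keep the sketch self-contained)
x <- c(9.75, 27.75, 13.84, 15.23, 33.19)

# Hypothetical helper: two-sided t interval for the mean
t_ci <- function(x, conf = 0.95) {
  n <- length(x)
  se <- sd(x) / sqrt(n)                          # standard error of the mean
  t_crit <- qt(1 - (1 - conf) / 2, df = n - 1)   # critical value with n - 1 df
  mean(x) + c(-1, 1) * t_crit * se
}

manual  <- t_ci(x)
builtin <- as.numeric(t.test(x)$conf.int)        # drop the conf.level attribute

print(manual)
print(all.equal(manual, builtin))                # checks that both routes agree
```

Replacing `x` with `protein$Animal.Products` should reproduce the interval from `t.test()` above, assuming the column has no missing values.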

Question 4

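n = dim(protein)[1]                           # number of countries in the data set
# shuffle the observed values; sampling without replacement keeps each value exactly once
original.sample = sample(protein$Animal.Products, n, replace = FALSE)

bt.sample.mean.vec = numeric(1000)            # storage for the 1000 bootstrap sample means
for(i in 1:1000){
  # draw n values with replacement from the original sample
  ith.bt.sample = sample(x = original.sample,
                         size = n,
                         replace = TRUE)
  bt.sample.mean.vec[i] = mean(ith.bt.sample) # record the mean of this bootstrap sample
}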

CI = quantile(bt.sample.mean.vec, c(0.025, 0.975))
CI

##     2.5%    97.5% 
## 20.13919 22.45177

Question 5

hist(bt.sample.mean.vec,
     breaks = 20,                          # suggested number of bins (vertical bars)
     xlab = "Bootstrap sample means",      # label of the x-axis
     main = "Bootstrap Sampling Distribution \n of Sample Means")  # title of the histogram

Figure: Bootstrap sampling distribution of sample means

Question 6

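The confidence intervals from the t-test and the bootstrap are extremely similar: [20.03, 22.43] versus [20.14, 22.45]. They differ by about 0.1 at the lower bound and 0.02 at the upper bound. This is likely because both methods use the same sample data to estimate the mean, and with 170 observations the sampling distribution of the sample mean is close to normal. Both intervals account for sampling uncertainty, but in different ways: the t-interval relies on the t-distribution, which assumes the sample mean is approximately normally distributed, while the bootstrap approximates the sampling distribution directly by resampling the data with replacement.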
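The agreement between the two methods can be checked on data where nothing depends on my particular file. The sketch below is only an illustration under stated assumptions: `x` is simulated (170 normal draws with an arbitrary mean of 21 and standard deviation of 8), the seed is arbitrary, and `boot_ci` is a hypothetical helper name, not a function from any package.

```r
set.seed(321)  # arbitrary seed, used only so the sketch is reproducible

# Simulated stand-in for the 170 country values (NOT the real data)
x <- rnorm(170, mean = 21, sd = 8)

# Hypothetical helper: percentile bootstrap interval for the mean
boot_ci <- function(x, B = 1000, conf = 0.95) {
  # B bootstrap sample means, each from n draws with replacement
  boot_means <- replicate(B, mean(sample(x, size = length(x), replace = TRUE)))
  alpha <- 1 - conf
  quantile(boot_means, c(alpha / 2, 1 - alpha / 2))
}

t_int  <- as.numeric(t.test(x)$conf.int)   # t-based interval
bt_int <- as.numeric(boot_ci(x))           # percentile bootstrap interval

# Side-by-side comparison of the lower and upper bounds
print(rbind(t_interval = t_int, bootstrap = bt_int))
```

With a sample this large the two rows should be close to each other, which is the same pattern seen in Questions 3 and 4. With a small or strongly skewed sample the two intervals could differ more.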

STA 321 Assignment 1

Ava Destefano

2024-09-08
