Question 1(a)


Plant Growth Data set

This data set records the dried weight of plants grown under a control condition and two treatment conditions, so it describes how plant weight relates to group.

The current time is r time

PlantGrowth Data

The data come from R's built-in datasets package and are loaded with the data() function.

data()
data("PlantGrowth")
PlantGrowth
##    weight group
## 1    4.17  ctrl
## 2    5.58  ctrl
## 3    5.18  ctrl
## 4    6.11  ctrl
## 5    4.50  ctrl
## 6    4.61  ctrl
## 7    5.17  ctrl
## 8    4.53  ctrl
## 9    5.33  ctrl
## 10   5.14  ctrl
## 11   4.81  trt1
## 12   4.17  trt1
## 13   4.41  trt1
## 14   3.59  trt1
## 15   5.87  trt1
## 16   3.83  trt1
## 17   6.03  trt1
## 18   4.89  trt1
## 19   4.32  trt1
## 20   4.69  trt1
## 21   6.31  trt2
## 22   5.12  trt2
## 23   5.54  trt2
## 24   5.50  trt2
## 25   5.37  trt2
## 26   5.29  trt2
## 27   4.92  trt2
## 28   6.15  trt2
## 29   5.80  trt2
## 30   5.26  trt2

EDA of the PlantGrowth dataset

Mean, median, standard deviation, minimum, interquartile range, and variance of the plant weights

mean(PlantGrowth$weight)
## [1] 5.073
median(PlantGrowth$weight)
## [1] 5.155
sd(PlantGrowth$weight)
## [1] 0.7011918
min(PlantGrowth$weight)
## [1] 3.59
IQR(PlantGrowth$weight)
## [1] 0.98
var(PlantGrowth$weight)
## [1] 0.49167

Exploratory Data Analysis

Summary of the PlantGrowth dataset

##      weight       group   
##  Min.   :3.590   ctrl:10  
##  1st Qu.:4.550   trt1:10  
##  Median :5.155   trt2:10  
##  Mean   :5.073            
##  3rd Qu.:5.530            
##  Max.   :6.310
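
The table above has the shape of what summary() prints for a data frame, although the call itself is not shown in the report. A minimal sketch that should produce it is given here, together with per-group means. The per-group step is an addition that is not part of the original analysis; it is included only because the stated topic is how weight relates to group.

```r
# Load the built-in data set (ships with base R in the datasets package).
data("PlantGrowth")

# Five-number summary plus mean for weight, and counts per level for group.
overall <- summary(PlantGrowth)
print(overall)

# Assumption: tapply() is used here for illustration only; the original
# report does not show how (or whether) group means were computed.
group_means <- tapply(PlantGrowth$weight, PlantGrowth$group, mean)
print(group_means)
```

Comparing the three group means with the overall mean of 5.073 gives a first impression of whether the treatments shift plant weight.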

Head and Tail of the data

head() returns the first 6 rows of the dataset

tail() returns the last 6 rows of the dataset

##   weight group
## 1   4.17  ctrl
## 2   5.58  ctrl
## 3   5.18  ctrl
## 4   6.11  ctrl
## 5   4.50  ctrl
## 6   4.61  ctrl
##    weight group
## 25   5.37  trt2
## 26   5.29  trt2
## 27   4.92  trt2
## 28   6.15  trt2
## 29   5.80  trt2
## 30   5.26  trt2
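
The two listings above match the default behaviour of head() and tail(). The calls are not echoed in the report, so the following is a sketch of what was presumably run; the `n = 3` variant is an extra illustration and not from the original.

```r
data("PlantGrowth")

# With no second argument, both functions return 6 rows.
first_rows <- head(PlantGrowth)
last_rows  <- tail(PlantGrowth)
print(first_rows)
print(last_rows)

# Illustration only: the n argument changes how many rows come back.
top3 <- head(PlantGrowth, n = 3)
print(top3)
```

Note that tail() keeps the original row names (25 to 30), which is why the second listing does not start at 1.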

Data visualization

CODEBOOK

## ================================================================================
## 
##    weight
## 
## --------------------------------------------------------------------------------
## 
##    Storage mode: double
## 
##         Min:  3.590
##         Max:  6.310
##        Mean:  5.073
##    Std.Dev.:  0.689
##    Skewness: -0.153
##    Kurtosis: -0.659
## 
## ================================================================================
## 
##    group
## 
## --------------------------------------------------------------------------------
## 
##    Storage mode: integer
##    Factor with 3 levels
## 
##    Levels and labels     N Valid
##                                 
##    1 'ctrl'             10  33.3
##    2 'trt1'             10  33.3
##    3 'trt2'             10  33.3
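
The codebook reports a standard deviation of 0.689 for weight, while sd() earlier returned 0.7011918. The gap is consistent with the codebook dividing by n rather than n - 1; this is an inference from the numbers, not something the report states. A short check under that assumption:

```r
data("PlantGrowth")
w <- PlantGrowth$weight
n <- length(w)

# sd() in base R uses the sample formula (divides by n - 1).
sd_sample <- sd(w)

# Assumption: the codebook value uses the population formula (divides by n).
sd_population <- sqrt(sum((w - mean(w))^2) / n)

print(round(sd_sample, 7))
print(round(sd_population, 3))
```

With 30 observations the two formulas differ by a factor of sqrt(29/30), which is enough to change the third decimal place.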

Question 1(b)


Data taken from an online source

Marvel

##                                        Movie Year IMDB_Score RunTime.min.
## 1                     Spider-Man: Homecoming 2017        7.4          133
## 2                    Avengers: Age of Ultron 2015        7.3          141
## 3          The Falcon and the Winter Soldier 2021        7.5           50
## 4                                WandaVision 2021        8.1          350
## 5                    Spider-Man: No Way Home 2021        7.0          154
## 6                                Black Widow 2021        6.8          133
## 7                          Avengers: Endgame 2019        8.4          181
## 8                    Guardians of the Galaxy 2014        8.0          121
## 9  Shang-Chi and the Legend of the Ten Rings 2021        8.9          124
## 10                 Spider-Man: Far from Home 2019        7.5          129
## 11                            Thor: Ragnarok 2017        7.9          130
## 12                    Avengers: Infinity War 2018        8.4          149
## 13                             Black Panther 2018        7.3          134
## 14                            Captain Marvel 2019        6.9          123
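
A first column that comes back named `ï..Movie` instead of `Movie` is typical of a CSV file saved with a UTF-8 byte-order mark and read without declaring the encoding. The sketch below shows one way to avoid it with the `fileEncoding` argument of read.csv(). The temporary file, its header text, and its single data row are hypothetical stand-ins, since the report does not show the original file or the import call.

```r
# Hypothetical stand-in for the downloaded file: a small CSV that starts
# with the three-byte UTF-8 byte-order mark (EF BB BF).
csv_path <- tempfile(fileext = ".csv")
bom  <- as.raw(c(0xEF, 0xBB, 0xBF))
body <- charToRaw("Movie,Year,IMDB_Score,RunTime(min)\nThor: Ragnarok,2017,7.9,130\n")
writeBin(c(bom, body), csv_path)

# "UTF-8-BOM" tells R to strip the mark before parsing the header line.
marvel <- read.csv(csv_path, fileEncoding = "UTF-8-BOM")
print(names(marvel))
```

The trailing dot in `RunTime.min.` is a separate effect: read.csv() rewrites characters such as parentheses into dots to make syntactically valid column names.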

Functions of dplyr for data manipulation

filter – keeps the rows that satisfy a condition

select – picks the columns of interest from a data set

arrange – sorts the rows in ascending or descending order

mutate – creates new variables from existing variables

summarise (with group_by) – computes summary statistics such as min, max, mean, and count, optionally per group
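
The five verbs can be sketched on the Marvel table as follows. The calls in the report are not echoed, so the exact arguments here are assumptions: the `> 7.5` threshold, the choice of `Year` and `Movie` for select, and the grouped summary by year are illustrative. The data frame is retyped from the printed table, with the first column named `Movie`.

```r
library(dplyr)

# Data retyped from the table printed in this report.
marvel <- data.frame(
  Movie = c("Spider-Man: Homecoming", "Avengers: Age of Ultron",
            "The Falcon and the Winter Soldier", "WandaVision",
            "Spider-Man: No Way Home", "Black Widow", "Avengers: Endgame",
            "Guardians of the Galaxy",
            "Shang-Chi and the Legend of the Ten Rings",
            "Spider-Man: Far from Home", "Thor: Ragnarok",
            "Avengers: Infinity War", "Black Panther", "Captain Marvel"),
  Year = c(2017, 2015, 2021, 2021, 2021, 2021, 2019,
           2014, 2021, 2019, 2017, 2018, 2018, 2019),
  IMDB_Score = c(7.4, 7.3, 7.5, 8.1, 7.0, 6.8, 8.4,
                 8.0, 8.9, 7.5, 7.9, 8.4, 7.3, 6.9),
  RunTime.min. = c(133, 141, 50, 350, 154, 133, 181,
                   121, 124, 129, 130, 149, 134, 123)
)

# filter: keep rows meeting a condition (threshold is an assumption).
high_rated <- filter(marvel, IMDB_Score > 7.5)

# arrange: sort rows, here from highest to lowest score.
by_score <- arrange(marvel, desc(IMDB_Score))

# mutate: derive a new column, run time in hours.
with_hours <- mutate(marvel, runtime_h = RunTime.min. / 60)

# select: keep only the named columns, in the order given.
year_title <- select(marvel, Year, Movie)

# summarise: collapse to one row; with group_by, one row per group.
overall  <- summarise(marvel, mean_score = mean(IMDB_Score))
per_year <- marvel %>%
  group_by(Year) %>%
  summarise(mean_score = mean(IMDB_Score), n = n())

print(high_rated)
print(per_year)
```

Each verb returns a new data frame and leaves `marvel` unchanged, which is why the results are stored under separate names.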


## Warning: package 'dplyr' was built under R version 4.1.2
##                                       Movie Year IMDB_Score RunTime.min.
## 1                               WandaVision 2021        8.1          350
## 2                         Avengers: Endgame 2019        8.4          181
## 3                   Guardians of the Galaxy 2014        8.0          121
## 4 Shang-Chi and the Legend of the Ten Rings 2021        8.9          124
## 5                            Thor: Ragnarok 2017        7.9          130
## 6                    Avengers: Infinity War 2018        8.4          149
##                                        Movie Year IMDB_Score RunTime.min.
## 1  Shang-Chi and the Legend of the Ten Rings 2021        8.9          124
## 2                          Avengers: Endgame 2019        8.4          181
## 3                     Avengers: Infinity War 2018        8.4          149
## 4                                WandaVision 2021        8.1          350
## 5                    Guardians of the Galaxy 2014        8.0          121
## 6                             Thor: Ragnarok 2017        7.9          130
## 7          The Falcon and the Winter Soldier 2021        7.5           50
## 8                  Spider-Man: Far from Home 2019        7.5          129
## 9                     Spider-Man: Homecoming 2017        7.4          133
## 10                   Avengers: Age of Ultron 2015        7.3          141
## 11                             Black Panther 2018        7.3          134
## 12                   Spider-Man: No Way Home 2021        7.0          154
## 13                            Captain Marvel 2019        6.9          123
## 14                               Black Widow 2021        6.8          133
##                                        Movie Year IMDB_Score RunTime.min.
## 1                     Spider-Man: Homecoming 2017        7.4          133
## 2                    Avengers: Age of Ultron 2015        7.3          141
## 3          The Falcon and the Winter Soldier 2021        7.5           50
## 4                                WandaVision 2021        8.1          350
## 5                    Spider-Man: No Way Home 2021        7.0          154
## 6                                Black Widow 2021        6.8          133
## 7                          Avengers: Endgame 2019        8.4          181
## 8                    Guardians of the Galaxy 2014        8.0          121
## 9  Shang-Chi and the Legend of the Ten Rings 2021        8.9          124
## 10                 Spider-Man: Far from Home 2019        7.5          129
## 11                            Thor: Ragnarok 2017        7.9          130
## 12                    Avengers: Infinity War 2018        8.4          149
## 13                             Black Panther 2018        7.3          134
## 14                            Captain Marvel 2019        6.9          123
##    runtime_h
## 1  2.2166667
## 2  2.3500000
## 3  0.8333333
## 4  5.8333333
## 5  2.5666667
## 6  2.2166667
## 7  3.0166667
## 8  2.0166667
## 9  2.0666667
## 10 2.1500000
## 11 2.1666667
## 12 2.4833333
## 13 2.2333333
## 14 2.0500000
##    Year                                     Movie
## 1  2017                    Spider-Man: Homecoming
## 2  2015                   Avengers: Age of Ultron
## 3  2021         The Falcon and the Winter Soldier
## 4  2021                               WandaVision
## 5  2021                   Spider-Man: No Way Home
## 6  2021                               Black Widow
## 7  2019                         Avengers: Endgame
## 8  2014                   Guardians of the Galaxy
## 9  2021 Shang-Chi and the Legend of the Ten Rings
## 10 2019                 Spider-Man: Far from Home
## 11 2017                            Thor: Ragnarok
## 12 2018                    Avengers: Infinity War
## 13 2018                             Black Panther
## 14 2019                            Captain Marvel
##   mean(IMDB_Score)
## 1         7.671429