Practical Task 8

Tidying the data

# Loading necessary libraries
library(ggplot2)
library(tidyr)
library(readr)
library(dplyr)
## 
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## 
##     filter, lag
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(effects)
## Warning: package 'effects' was built under R version 4.3.2
## Loading required package: carData
## lattice theme set by effectsTheme()
## See ?effectsTheme for details.
# Loading the dataset
Mustard <- read_csv("c:/users/dell/downloads/Mustard.csv")
## Rows: 108 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): light, medium
## dbl (2): weight, watering
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
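
The output above only loads the file, so the tidying step itself is not shown. A minimal sketch of what it might look like, assuming the goal is to treat `light` and `medium` as categorical and to drop the three rows with a missing `weight`. The helper name `tidy_mustard` is hypothetical and not part of the original report.

```r
library(dplyr)

# Hypothetical helper (not from the original report): convert the two
# character columns to factors and drop rows where weight is missing.
tidy_mustard <- function(df) {
  df %>%
    mutate(light = factor(light), medium = factor(medium)) %>%
    filter(!is.na(weight))
}
```

With the data loaded as above, `Mustard_tidy <- tidy_mustard(Mustard)` would give a data frame without missing weights. Dropping the rows up front is a design choice; the plotting and ANOVA functions also drop them on their own, which is where the "Removed 3 rows" and "3 observations deleted" messages come from.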

Viewing the summary statistics

# Displaying summary statistics
summary(Mustard)
##      weight         light              watering    medium         
##  Min.   :0.060   Length:108         Min.   :1   Length:108        
##  1st Qu.:0.980   Class :character   1st Qu.:1   Class :character  
##  Median :1.780   Mode  :character   Median :3   Mode  :character  
##  Mean   :1.879                      Mean   :3                     
##  3rd Qu.:2.400                      3rd Qu.:5                     
##  Max.   :5.600                      Max.   :5                     
##  NA's   :3

EDA

Growth Medium EDA

## Warning: Removed 3 rows containing non-finite values (`stat_boxplot()`).

The boxplot shows noticeable differences in mustard yield across the growth media. In particular, soil and cotton wool have higher median yields than newspaper and sawdust. Outliers indicate that individual plants responded unusually within some media. Together these patterns suggest that the choice of medium has a substantial effect on yield, which the ANOVA tests formally.
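
The plotting code is not shown in the report. A minimal sketch of how such a boxplot could be produced with ggplot2, assuming the column names from the loaded data (`weight`, `medium`, `light`). The helper name `plot_weight_by` is hypothetical.

```r
library(ggplot2)

# Hypothetical helper: boxplot of weight for each level of one grouping column.
plot_weight_by <- function(df, group_col) {
  ggplot(df, aes(x = factor(.data[[group_col]]), y = weight)) +
    # na.rm = TRUE drops missing weights silently instead of with a warning
    geom_boxplot(na.rm = TRUE) +
    labs(x = group_col, y = "weight")
}
```

For example, `plot_weight_by(Mustard, "medium")` would draw one box per growth medium, and `plot_weight_by(Mustard, "light")` one box per light type.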

Watering Frequency EDA

## Warning: Continuous x aesthetic
## ℹ did you forget `aes(group = ...)`?
## Warning: Removed 3 rows containing non-finite values (`stat_boxplot()`).

The boxplot shows median yield rising as the watering frequency increases from 1 to 5 times over 14 days. Outliers indicate that individual plants responded differently to the same watering schedule. The pattern suggests that watering frequency influences mustard yield.
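
The "Continuous x aesthetic" warning above appears because `watering` is stored as a number, so ggplot2 does not know how to split it into separate boxes. One possible fix, sketched here under the assumption that one box per watering level is wanted, is to set `group = watering` explicitly. The helper name `plot_weight_by_watering` is hypothetical.

```r
library(ggplot2)

# Hypothetical helper: one box per watering level, keeping a numeric x axis.
plot_weight_by_watering <- function(df) {
  ggplot(df, aes(x = watering, y = weight, group = watering)) +
    geom_boxplot(na.rm = TRUE) +
    # place axis ticks only at the watering levels that occur in the data
    scale_x_continuous(breaks = sort(unique(df$watering))) +
    labs(x = "watering (times per 14 days)", y = "weight")
}
```

Mapping `x = factor(watering)` instead would also silence the warning, at the cost of treating the levels as evenly spaced categories rather than numbers.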

Light Type EDA

## Warning: Removed 3 rows containing non-finite values (`stat_boxplot()`).

Different types of light show potential variations in mustard yield, as indicated by the boxplot. Compared to other factors, the overall impact of light type seems less pronounced, despite some differences. Certain mustard plants exhibit unique responses to specific light types, as suggested by outliers, but confirming the significance of these differences requires further investigation.

ANOVA TEST

Growth Medium ANOVA

##              Df Sum Sq Mean Sq F value Pr(>F)    
## medium        3  85.48  28.494   41.16 <2e-16 ***
## Residuals   101  69.93   0.692                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 3 observations deleted due to missingness

Mustard yield is significantly influenced by the factor “medium.” The F value of 41.16 and an extremely small p-value (Pr(>F) < 2e-16) suggest that the observed differences in mustard yield among the growth mediums are unlikely to be attributed to random chance.
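
The F value is the ratio of the two mean squares in the table: 28.494 / 0.692 is about 41.2, which agrees with the reported 41.16 up to rounding of the printed values. The model call is not shown in the report; a minimal sketch of a one-way ANOVA that would produce a table of this shape, assuming base R's `aov()` was used. The helper name `one_way_anova` is hypothetical.

```r
# Hypothetical helper: one-way ANOVA of weight on a single grouping column.
one_way_anova <- function(df, group_col) {
  df[[group_col]] <- factor(df[[group_col]])   # treat the column as categorical
  f <- reformulate(group_col, response = "weight")  # builds weight ~ <group_col>
  aov(f, data = df)   # rows with a missing weight are dropped by default
}
```

With the data loaded as above, `summary(one_way_anova(Mustard, "medium"))` would print the ANOVA table for growth medium.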

Watering Frequency ANOVA

##              Df Sum Sq Mean Sq F value   Pr(>F)    
## watering      1  38.76   38.76   34.22 5.87e-08 ***
## Residuals   103 116.66    1.13                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 3 observations deleted due to missingness

Mustard yield is significantly affected by the factor “watering.” The F value of 34.22, along with an extremely small p-value (5.87e-08) for Pr(>F), suggests that the observed differences in mustard yield related to watering frequency are highly unlikely to be attributed to random chance.
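
Note that `watering` has only 1 degree of freedom in the table, which indicates it entered the model as a numeric variable, so the test is for a linear trend in yield with watering frequency rather than for differences among separate watering groups. A minimal sketch for comparing the two treatments, assuming `aov()` was used. The helper name `fit_watering_both_ways` is hypothetical.

```r
# Hypothetical helper: fit watering as a numeric trend and as a factor.
fit_watering_both_ways <- function(df) {
  list(
    linear    = aov(weight ~ watering, data = df),          # 1 df: linear trend
    as_factor = aov(weight ~ factor(watering), data = df)   # (levels - 1) df
  )
}
```

Calling `lapply(fit_watering_both_ways(Mustard), summary)` would print both tables side by side. Which version is appropriate depends on whether a straight-line relationship with watering frequency is a reasonable assumption.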

Light Type ANOVA

##              Df Sum Sq Mean Sq F value Pr(>F)  
## light         2   7.76   3.878   2.679 0.0735 .
## Residuals   102 147.66   1.448                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 3 observations deleted due to missingness

The ANOVA does not show a statistically significant effect of “light” on mustard yield at the commonly used 0.05 level. The F value is 2.679 and the p-value (Pr(>F)) is 0.0735, which is above 0.05, although it is below 0.1 (marked “.” in the output). The data therefore provide at most weak evidence that light type affects yield.
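
The significance decisions can be checked by recomputing each p-value from its F statistic and degrees of freedom and comparing it with 0.05. The sketch below copies the F values and degrees of freedom from the three tables above; the data frame name `anova_results` and the threshold variable `alpha` are illustrative choices.

```r
# F statistics and degrees of freedom copied from the three ANOVA tables.
anova_results <- data.frame(
  term    = c("medium", "watering", "light"),
  f_value = c(41.16, 34.22, 2.679),
  df1     = c(3, 1, 2),        # degrees of freedom of the factor
  df2     = c(101, 103, 102)   # residual degrees of freedom
)

alpha <- 0.05  # conventional significance level

# Upper-tail probability of the F distribution gives Pr(>F).
anova_results$p_value <- pf(anova_results$f_value,
                            anova_results$df1,
                            anova_results$df2,
                            lower.tail = FALSE)
anova_results$significant <- anova_results$p_value < alpha

print(anova_results)
```

The recomputed p-value for light should come out close to the 0.0735 in the table, which is above 0.05, so only medium and watering are flagged as significant.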

General interpretation

Mustard yield is significantly influenced by two of the three factors: growth medium and watering frequency, both with very small p-values (< 2e-16 and 5.87e-08, respectively). The effect of light type is not significant at the 0.05 level (p = 0.0735), so these one-way analyses provide only weak evidence that light type affects yield.