Relationship between No. of leaves and No. of fruits

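# Scatter plot of average fruits against average leaves, one panel per region,
# with a linear regression line fitted in each panel.
ggplot(GarlicMustard, aes(x = AvgNLeaves, y = AvgNFruits)) +
  geom_point() +
  geom_smooth(method = "lm", linetype = 1) +
  facet_wrap(~Region) +
  labs(title = "Relationship between No. of leaves and No. of fruits",
       x = "Average No. of leaves",
       y = "Average No. of fruits")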

HTML REPORT

INTRODUCTION

Names: Aaryan and Faizan

Our dependent variable is TotalDens, the mean number of adult and juvenile plants per square meter in the experimental plot. Two factors that we believe might affect TotalDens are bio1 (Annual Mean Temperature) and bio10 (Mean Temperature of Warmest Quarter). Since Garlic Mustard plants reach optimal growth levels in warmer temperatures, the question we aim to answer is how TotalDens relates to bio1 and bio10 in each of the two regions, North America and Europe.

DATA VISUALIZATION

BOXPLOT

ggplot(GarlicMustard,aes(x=Region,y=TotalDens))+
  geom_boxplot()

INTERPRETATION

From the boxplot above, we see that North America has a higher median plant density than Europe, and that its outliers span a wider range. (Code written by Faizan)
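The comparison read off the boxplot can also be checked numerically. The sketch below is only an illustration: `toy` is a small invented data frame standing in for GarlicMustard, and its values are not taken from the real data. It computes the median and interquartile range per region and lists the points that fall beyond 1.5 times the box height.

```r
# Invented toy data standing in for GarlicMustard (illustration only).
toy <- data.frame(
  Region    = rep(c("Europe", "NorthAmerica"), each = 6),
  TotalDens = c(5, 10, 12, 15, 20, 90,      # Europe
                10, 20, 30, 40, 60, 400)    # North America
)

# Median and interquartile range per region (the line and the box height).
region_median <- tapply(toy$TotalDens, toy$Region, median)
region_iqr    <- tapply(toy$TotalDens, toy$Region, IQR)

# Points more than 1.5 box heights beyond the hinges. This is roughly what
# geom_boxplot() draws as dots (its hinge definition differs slightly).
region_outliers <- tapply(toy$TotalDens, toy$Region,
                          function(x) boxplot.stats(x)$out)

print(region_median)
print(region_outliers)
```

With the real data, replacing `toy` by GarlicMustard should give the numbers behind the boxplot.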

HISTOGRAMS
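The histograms in this section use NorthAmericaData and EuropeData, whose construction is not shown in the report. One plausible way to build them is to subset by Region. In the sketch below, `toy` is an invented stand-in for GarlicMustard, and the region labels "NorthAmerica" and "Europe" are an assumption about how Region is coded.

```r
# Invented toy data standing in for GarlicMustard (illustration only).
toy <- data.frame(
  Region    = c("Europe", "NorthAmerica", "Europe", "NorthAmerica"),
  TotalDens = c(12, 150, 30, 8)
)

# Keep only the rows belonging to each region.
# The label strings are assumed; check unique(GarlicMustard$Region) first.
NorthAmericaData <- subset(toy, Region == "NorthAmerica")
EuropeData       <- subset(toy, Region == "Europe")
```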

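# Histogram of total density for the North American plots.
# bins = 30 is the ggplot2 default, stated explicitly here.
ggplot(NorthAmericaData, mapping = aes(x = TotalDens)) +
  geom_histogram(bins = 30, color = "black", fill = "skyblue") +
  labs(title = "Total Density of plants in North America",
       x = "Total Density",
       y = "Count")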

INTERPRETATION

The histogram illustrates the total density of plants in North America. Most observations are concentrated at the lower end of the density range, with a large number near zero. The distribution is heavily right-skewed: most plant populations have low densities, while only a few have much higher densities, extending beyond 500. There are outliers in the data, with some populations showing very high densities, approaching 1000. (Code written by Aaryan)

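# Histogram of total density for the European plots.
# bins = 30 is the ggplot2 default, stated explicitly here.
ggplot(EuropeData, mapping = aes(x = TotalDens)) +
  geom_histogram(bins = 30, color = "black", fill = "skyblue") +
  labs(title = "Total Density of plants in Europe",
       x = "Total Density",
       y = "Count")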

INTERPRETATION

This histogram presents the total density of plants in Europe. Like the North American distribution, the European plant density is right-skewed. However, the skewness appears less pronounced in Europe: most data points still cluster at the lower end of the density range, but the tail tapers more gradually. European plant densities reach up to around 200, whereas North American densities reach around 1000. This suggests that North America has a broader range of extremely dense plant populations than Europe. Europe has some higher-density outliers, reaching densities of around 200, but fewer than North America. (Code written by Aaryan)
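The claim that one distribution is more right-skewed than the other can be backed by a number. The sketch below uses the moment-based skewness coefficient, which is positive for a right-skewed sample and zero for a symmetric one. Both `skewness_moment` and the toy vectors are hypothetical and are not taken from the report's data.

```r
# Hypothetical helper: moment-based sample skewness.
# Positive values indicate a longer right tail.
skewness_moment <- function(x) {
  x <- x[!is.na(x)]
  centred <- x - mean(x)
  mean(centred^3) / mean(centred^2)^1.5
}

# Invented densities (illustration only).
europe_toy        <- c(5, 10, 12, 15, 20, 90)
north_america_toy <- c(10, 20, 30, 40, 60, 400)

skew_europe        <- skewness_moment(europe_toy)
skew_north_america <- skewness_moment(north_america_toy)

print(c(Europe = skew_europe, NorthAmerica = skew_north_america))
```

Applied to EuropeData$TotalDens and NorthAmericaData$TotalDens, a larger value for North America would support the reading of the two histograms.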

SCATTERPLOTS

[Figures: scatter plots of TotalDens against bio10 (Mean Temperature of Warmest Quarter) for North America and for Europe, each with a linear regression line]

INTERPRETATION

The North America scatter plot shows the relationship between the total density of North American plants and the mean temperature of the warmest quarter. The downward-sloping regression line indicates a negative relationship: areas with higher temperatures in the warmest quarter tend to have lower plant densities. A few outliers have high plant densities (above 500) even at relatively low to moderate temperatures. These could represent unique ecological conditions or areas with specific vegetation characteristics. Most data points cluster between 18 °C and 24 °C, with plant densities mainly below 200. Outside this range the data are sparser and more scattered, but the negative trend persists. Overall, the plot suggests that plant density generally decreases as temperature increases, with a few exceptions.

The Europe scatter plot shows the relationship between the mean temperature of the warmest quarter and the total density of European plants. There appears to be a weak positive correlation: as the temperature of the warmest quarter increases, plant density tends to increase slightly. However, the correlation is not strong, as there is considerable scatter around the regression line. The data points cluster between about 15 °C and 20 °C, with a fairly large number of outliers. In practical terms, the regression line is close to flat, so total density changes little as temperature increases. (Code written by Faizan)
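The code behind these two scatter plots is not shown in the report. The sketch below is one way such a plot might be produced, following the style of the other ggplot2 calls in this report. The helper `density_vs_bio10` and the toy data frame are hypothetical, and the data values are invented.

```r
library(ggplot2)

# Invented rows standing in for one region's subset of GarlicMustard.
toy_region <- data.frame(
  bio10     = c(17, 18, 19, 20, 21, 22, 23, 24),
  TotalDens = c(520, 180, 150, 160, 90, 110, 60, 40)
)

# Hypothetical helper: scatter plot of TotalDens against bio10
# with a linear regression line and its confidence band.
density_vs_bio10 <- function(data, region_name) {
  ggplot(data, aes(x = bio10, y = TotalDens)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x) +
    labs(title = paste("Total density vs. warmest-quarter temperature in",
                       region_name),
         x = "Mean temperature of warmest quarter (bio10)",
         y = "Total density")
}

p <- density_vs_bio10(toy_region, "North America")
```

Calling the helper once with NorthAmericaData and once with EuropeData would give one plot per region. Passing `formula = y ~ x` explicitly also avoids the `geom_smooth()` message about the formula being used.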

[Figures: scatter plots of TotalDens against bio1 (Annual Mean Temperature) for North America and for Europe, each with a linear regression line]

INTERPRETATION

These scatter plots present the relationship between the annual mean temperature and the total density of North American and European plants. The shaded area around each regression line is the 95% confidence band for the fitted line (the geom_smooth() default). It widens where data points are sparse, showing that the fitted plant density is less certain at those temperatures. While the plots hint at a negative correlation between temperature and plant density, the relationship is not particularly strong. There is a considerable amount of scatter around the regression line, suggesting that factors other than temperature may also influence plant density. Outliers may also affect the overall trend: a few data points deviate substantially from the general pattern, which could be due to unique environmental conditions or other factors.

Both plots suggest a negative relationship between annual mean temperature and plant density. However, the strength and clarity of this relationship differ between the two regions. The wider range of temperatures and the shallower regression line in the Europe plot suggest a weaker and less well-defined relationship than in the North America plot. This could be due to the inclusion of data points from regions with more extreme temperatures, or to variation in other environmental factors that influence plant density. (Code written by Aaryan)
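To see why the shaded band around a regression line is wider where there are few data points, the band can be computed directly with `predict()`. The sketch below fits a linear model to invented toy data (dense near 8 to 10.5, with a single point at 14) and compares the width of the 95% confidence interval at a well-covered temperature and at the sparse edge.

```r
# Invented toy data: many points at low bio1, one isolated point at 14.
toy <- data.frame(
  bio1      = c(8, 8.5, 9, 9.5, 10, 10.5, 14),
  TotalDens = c(120, 90, 110, 70, 95, 60, 40)
)

fit <- lm(TotalDens ~ bio1, data = toy)

# 95% confidence interval for the fitted mean density at two temperatures:
# one in the middle of the data and one near its sparse edge.
new_temps <- data.frame(bio1 = c(9.5, 14))
band <- predict(fit, newdata = new_temps,
                interval = "confidence", level = 0.95)

# Width of the band (upper limit minus lower limit) at each temperature.
band_width <- unname(band[, "upr"] - band[, "lwr"])
print(band_width)
```

The second width is larger than the first, which is the same widening that appears in the shaded area of a `geom_smooth(method = "lm")` plot.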
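The comparison of slopes between the two regions can be made explicit with a single linear model that includes a temperature-by-region interaction. This is a sketch on invented toy data, not the analysis performed in the report. The coefficient name `bio1:RegionNorthAmerica` assumes Region is coded with the labels "Europe" and "NorthAmerica".

```r
# Invented toy data (illustration only): a shallow trend in Europe
# and a steep one in North America.
toy <- data.frame(
  Region    = rep(c("Europe", "NorthAmerica"), each = 5),
  bio1      = c(6, 8, 10, 12, 14,   6, 8, 10, 12, 14),
  TotalDens = c(60, 58, 52, 50, 47,   300, 240, 210, 150, 90)
)

# The interaction term lets each region have its own slope.
fit <- lm(TotalDens ~ bio1 * Region, data = toy)
coefs <- coef(fit)

# Europe is the reference level, so its slope is the bio1 coefficient.
# North America's slope adds the interaction coefficient.
slope_europe        <- coefs[["bio1"]]
slope_north_america <- coefs[["bio1"]] + coefs[["bio1:RegionNorthAmerica"]]

print(c(Europe = slope_europe, NorthAmerica = slope_north_america))
```

Running `summary(fit)` on the real data would also report whether the difference between the two slopes is distinguishable from zero.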

CONCLUSION

North America has a broader range of plant densities than Europe, with values reaching as high as 1000, while Europe's densities peak around 200. Both regions exhibit right-skewed distributions, with most areas having low plant densities. However, the skewness is more pronounced in North America, where a few populations show exceptionally high densities, while Europe has fewer and less extreme outliers. Additionally, a negative correlation between plant density and temperature is observed in North America, where higher temperatures in the warmest quarter correspond to lower plant densities. This temperature-dependent trend, combined with the broader density range, highlights key ecological differences between the two regions. These observations show the prevalence of Garlic Mustard in North America and its higher density compared to Europe. Because the observed trend is negative, cooler areas tend to have denser populations, so lowering average temperatures would not be expected to reduce density, and changing regional temperatures is not feasible in any case given the current climate crisis. The relationships are also weak and only correlational, so factors other than temperature would need to be considered in any effort to reduce Garlic Mustard density.