The data set in this project will be used to answer the question, “What affects the size of a frog’s egg?” The data come from a 2013 study by W. Chen, Z. H. Tang, X. G. Fan, Y. Wang, and D. A. Pike of frog populations living 2035 to 3494 m above sea level on the eastern Tibetan Plateau. The data they collected include the altitude and latitude of the frogs they studied, the body length of the mother frog that laid the egg clutch (in cm), the number of eggs in a clutch (clutch size), the volume of the egg clutch (in mm^3), and the average diameter of an egg (in mm). Several of these variables refer to an egg clutch, which is a group of eggs laid by a reptile, amphibian, or similar animal during a single nesting period.
Statistic  altitude  latitude  clutch_size  clutch_volume  egg_size
Min.           2035     32.78        158.5          151.4     1.622
1st Qu.        3189     34.30        549.5          609.6     1.950
Median         3462     34.30        707.9          831.8     2.089
Mean           3276     34.35        721.3          882.5     2.114
3rd Qu.        3493     34.82        851.1         1096.5     2.291
Max.           3493     34.96       1698.2         2630.3     2.630
As shown in the “Data Structure & Checking for NAs” chunk, body.size had 302 NA values even though the data set has only 431 observations. I chose to remove the column, since it would not be useful with the majority of its values missing. The correlation matrix above shows that clutch volume has the highest correlation with egg size, at 0.6463. This is a positive correlation, so as clutch volume increases, egg size should increase as well. Because the correlation is only 0.6463, though, I plotted the two variables against each other to see what the relationship looks like.
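The NA check, column removal, and correlation step described above could be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not the code from the original chunk: the data frame `frog_demo` and every value in it are made up purely for illustration, and the 50% missingness cutoff is an assumption rather than a rule taken from the analysis.

```r
# Toy data for illustration only; these are NOT values from the frog data set.
frog_demo <- data.frame(
  altitude      = c(2035, 3189, 3462, 3493, 3493, 3300),
  clutch_size   = c(400, 550, 700, 850, 900, 650),
  clutch_volume = c(300, 600, 800, 1100, 1300, 750),
  egg_size      = c(1.7, 1.9, 2.1, 2.3, 2.4, 2.0),
  body.size     = c(NA, 4.1, NA, NA, 4.6, NA)
)

# Count NA values per column and express them as a share of all rows.
na_counts <- colSums(is.na(frog_demo))
na_share  <- na_counts / nrow(frog_demo)

# Keep only columns where at most half of the values are missing
# (the 50% cutoff is an assumption made for this sketch).
keep <- na_share <= 0.5
frog_clean <- frog_demo[, keep, drop = FALSE]

# Pearson correlation matrix of the remaining numeric columns.
cor_matrix <- cor(frog_clean)

# Correlation of each remaining variable with egg_size,
# ordered by absolute value, strongest first.
egg_cor <- cor_matrix[, "egg_size"]
egg_cor <- egg_cor[names(egg_cor) != "egg_size"]
print(egg_cor[order(abs(egg_cor), decreasing = TRUE)])
```

Applied to the real data, the same check would flag body.size, since 302 of its 431 values (about 70%) are NA.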
The plot above looks somewhat linear: as clutch volume increases, egg size also increases. However, because many different clutch volumes occur at any given egg size, clutch volume is only a moderately useful predictor of egg size. Egg size in the data set is the average diameter of an egg, while clutch volume is the volume of a whole clutch containing hundreds of eggs, so the correlation is lower than one might expect given how closely diameter and volume are related. In the correlation matrix before the plot, all of the correlation coefficients other than the one for clutch volume were close to 0, the furthest from zero being -0.2232. This suggests that the data collected in the 2013 study would not be effective for predicting the egg sizes of frogs. In addition, because body size had over 300 NA values out of only 431 total observations, it was not really possible to check its correlation with egg size. Overall, this data set cannot be used to predict the egg size of a frog. For future data sets to make that possible, it would help to include different kinds of variables and to have significantly fewer NA values in each column.
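One way to quantify how well a single variable predicts egg size is to square its correlation coefficient. In a simple linear regression with one predictor, the squared coefficient is the share of the variance in egg size that the predictor accounts for. The sketch below uses only the two coefficients reported above; the variable names are made up for illustration.

```r
# Correlation coefficients as reported in the text above.
r_clutch_volume  <- 0.6463   # clutch volume vs. egg size
r_furthest_other <- -0.2232  # the other coefficient furthest from zero

# Squaring a correlation gives the proportion of variance explained
# by a one-predictor linear model.
r2_clutch_volume  <- r_clutch_volume^2
r2_furthest_other <- r_furthest_other^2

print(round(c(clutch_volume  = r2_clutch_volume,
              furthest_other = r2_furthest_other), 3))
```

By this measure, clutch volume accounts for roughly 42% of the variation in egg size, and the next strongest variable for about 5%.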
Source(s):
Chen, W., et al. Maternal investment increases with altitude in a frog on the Tibetan Plateau. Journal of Evolutionary Biology 26.12 (2013): 2710-2715. Data accessible from Dryad Digital Repository: http://dx.doi.org/10.5061/dryad.6v0c1
Link to Dataset: https://www.openintro.org/data/index.php?data=frog