# A tibble: 90 × 7
   Family          Genus   CommonName Species mean_BodyMass_kg Mean_brain_mass_g
   <chr>           <chr>   <chr>      <chr>              <dbl>             <dbl>
 1 Cercopithecidae Alleno… Allen_s_S… Alleno…             5.3               58.0
 2 Daubentoniidae  Dauben… Aye_aye    Dauben…             2.54              45.2
 3 Pitheciidae     Cacajao Bald_Uaca… Cacaja…             3.17              74.3
 4 Cercopithecidae Macaca  Barbary_M… Macaca…            13.5               87.7
 5 Cercopithecidae Semnop… Bengal_Sa… Semnop…            11.4              112.
 6 Lemuridae       Varecia Black_and… Vareci…             3.55              31.2
 7 Lemuridae       Eulemur Black_Lem… Eulemu…             2.06              22.6
 8 Cercopithecidae Cercop… Blue_Monk… Cercop…             4.89              75
 9 Cercopithecidae Macaca  Bonnet_Ma… Macaca…             5.14              69.4
10 Hominidae       Pan     Bonobo     Pan_pa…            39.1              330.
# ℹ 80 more rows
# ℹ 1 more variable: Age_Weaning_d <dbl>
Data Expeditions Project
EvAnth Summer ’25
Hypotheses and Predictions
Hypothesis: A higher weaning age signifies a prolonged developmental period necessary for higher cognitive processes.
- As a proxy for higher cognitive processes, we use the ratio of brain mass to body mass, which this report refers to as the encephalization quotient.
Prediction: Primates with a larger encephalization quotient are more likely to have a higher weaning age.
Null hypothesis: No correlation exists between the length of the developmental period (weaning age) and the encephalization quotient.
Alt hypothesis: A correlation exists between the length of the developmental period (weaning age) and the encephalization quotient.
Our analysis tests whether the data allow us to reject the null.
Variables we are interested in:
- mean_BodyMass_kg: The mean body mass of primates in kilograms. In our data exploration, we convert this to grams in a new column, mean_BodyMass_g.
- Mean_brain_mass_g: The mean brain mass of primates in grams.
- Age_Weaning_d: The age, in days, at which a primate is able to consume food other than its mother’s milk.
Data Exploration
We began by selecting the following columns from our dataset, primates:
- Family
- Genus
- CommonName
- Species
- mean_BodyMass_kg
- Mean_brain_mass_g
- Age_Weaning_d
We then removed the rows with NA values in the numerical columns and saved the result as primates_selected. A preview of primates_selected (90 rows, 7 columns) appears at the top of this report.
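The selection and NA-removal steps described above can be sketched in R roughly as follows. This is a minimal sketch, assuming dplyr is available. The small primates data frame here is a made-up stand-in (including the hypothetical Some_other_column), and the code actually used for this report may differ.

```r
library(dplyr)

# Toy stand-in for the `primates` data. Every value is a made-up placeholder,
# not a value from the project dataset.
primates <- data.frame(
  Family            = c("Fam_A", "Fam_A", "Fam_B", "Fam_C"),
  Genus             = c("Gen_a", "Gen_b", "Gen_c", "Gen_d"),
  CommonName        = c("Name_1", "Name_2", "Name_3", "Name_4"),
  Species           = c("Gen_a_sp", "Gen_b_sp", "Gen_c_sp", "Gen_d_sp"),
  mean_BodyMass_kg  = c(5.0, NA, 3.0, 40.0),
  Mean_brain_mass_g = c(60, 45, 70, 330),
  Age_Weaning_d     = c(300, 200, NA, 1000),
  Some_other_column = c(1, 2, 3, 4)  # hypothetical extra column that is dropped
)

primates_selected <- primates %>%
  # Keep only the seven columns of interest
  select(Family, Genus, CommonName, Species,
         mean_BodyMass_kg, Mean_brain_mass_g, Age_Weaning_d) %>%
  # Drop any row with a missing value in a numerical column
  filter(!is.na(mean_BodyMass_kg),
         !is.na(Mean_brain_mass_g),
         !is.na(Age_Weaning_d))

print(primates_selected)
```

In this toy example, the two rows containing an NA are removed, leaving two complete rows and seven columns.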
Next, we mutated the dataset to do the following:
- Convert mean_BodyMass_kg from kilograms to grams, stored in a new column, mean_BodyMass_g.
- Find the encephalization quotient (enceph_quotient) by dividing Mean_brain_mass_g by mean_BodyMass_g.
- Save the result as primates_enceph.
A glimpse of the dataset is below.
# A tibble: 90 × 9
   Family          Genus   CommonName Species mean_BodyMass_kg Mean_brain_mass_g
   <chr>           <chr>   <chr>      <chr>              <dbl>             <dbl>
 1 Cercopithecidae Alleno… Allen_s_S… Alleno…             5.3               58.0
 2 Daubentoniidae  Dauben… Aye_aye    Dauben…             2.54              45.2
 3 Pitheciidae     Cacajao Bald_Uaca… Cacaja…             3.17              74.3
 4 Cercopithecidae Macaca  Barbary_M… Macaca…            13.5               87.7
 5 Cercopithecidae Semnop… Bengal_Sa… Semnop…            11.4              112.
 6 Lemuridae       Varecia Black_and… Vareci…             3.55              31.2
 7 Lemuridae       Eulemur Black_Lem… Eulemu…             2.06              22.6
 8 Cercopithecidae Cercop… Blue_Monk… Cercop…             4.89              75
 9 Cercopithecidae Macaca  Bonnet_Ma… Macaca…             5.14              69.4
10 Hominidae       Pan     Bonobo     Pan_pa…            39.1              330.
# ℹ 80 more rows
# ℹ 3 more variables: Age_Weaning_d <dbl>, mean_BodyMass_g <dbl>,
#   enceph_quotient <dbl>
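The unit conversion and ratio described above can be sketched as follows. This is a minimal sketch, assuming dplyr. The two rows use the body and brain masses as printed (and rounded) in the preview, and the code actually used for this report may differ.

```r
library(dplyr)

# Two rows taken from the preview above (values rounded as displayed there)
primates_selected <- data.frame(
  CommonName        = c("Aye_aye", "Bonobo"),
  mean_BodyMass_kg  = c(2.54, 39.1),
  Mean_brain_mass_g = c(45.2, 330)
)

primates_enceph <- primates_selected %>%
  mutate(
    # 1 kg = 1000 g
    mean_BodyMass_g = mean_BodyMass_kg * 1000,
    # Brain-to-body mass ratio, both masses in grams
    enceph_quotient = Mean_brain_mass_g / mean_BodyMass_g
  )

print(primates_enceph)
```

Because both masses are in grams, enceph_quotient is a unitless ratio. For the Aye_aye row it is 45.2 / 2540, a little under 0.018.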
In making our figure, we took the log of the enceph_quotient and Age_Weaning_d columns so that the relationship is closer to linear when plotted on a scatter plot. We added labels and a line of best fit to the figure.
[Figure: scatter plot of log weaning age and log encephalization quotient, with a line of best fit added by geom_smooth()]
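A figure of this kind could be built along the following lines. This is a minimal sketch, assuming ggplot2. The data values are made-up placeholders, the column name log_enceph is hypothetical, and putting log weaning age on the x-axis is an assumption based on the predictor name log_weaning_age in the model output.

```r
library(ggplot2)

# Made-up placeholder values, used only to make the sketch runnable
primates_enceph <- data.frame(
  Age_Weaning_d   = c(100, 200, 400, 800, 1600),
  enceph_quotient = c(0.020, 0.018, 0.013, 0.012, 0.009)
)

# Log-transform both variables (natural log)
primates_enceph$log_weaning_age <- log(primates_enceph$Age_Weaning_d)
primates_enceph$log_enceph      <- log(primates_enceph$enceph_quotient)

p <- ggplot(primates_enceph, aes(x = log_weaning_age, y = log_enceph)) +
  geom_point() +
  # method = "lm" draws a straight least-squares line;
  # stating the formula explicitly avoids the "using formula" message
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(
    x = "log(weaning age in days)",
    y = "log(brain mass / body mass)",
    title = "Encephalization quotient and weaning age"
  )

print(p)
```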
Now that we have a visualization of the relationship, we fit a linear model to the two transformed variables to obtain a p-value for the slope. The p-value lets us test whether the relationship between weaning age and the encephalization quotient is statistically significant. If the p-value is less than an alpha value of 0.05, we reject the null hypothesis in favor of the alternative and conclude that there is a significant correlation between the variables. If the p-value is 0.05 or above, we fail to reject the null and conclude that we found no evidence of a significant relationship.
# A tibble: 2 × 5
  term            estimate std.error statistic  p.value
  <chr>              <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)       -2.79     0.355      -7.85 9.21e-12
2 log_weaning_age   -0.281    0.0633     -4.44 2.56e- 5
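The model fit and decision rule described above could be reproduced along these lines. This is a minimal sketch in base R with made-up placeholder data, so its numbers will not match the table. The response column name log_enceph is hypothetical, and the code actually used for this report may differ.

```r
# Made-up placeholder values; not the project data
toy <- data.frame(
  Age_Weaning_d   = c(100, 200, 400, 800, 1600),
  enceph_quotient = c(0.020, 0.018, 0.013, 0.012, 0.009)
)
toy$log_weaning_age <- log(toy$Age_Weaning_d)
toy$log_enceph      <- log(toy$enceph_quotient)

# Simple linear regression on the log-transformed variables
fit <- lm(log_enceph ~ log_weaning_age, data = toy)

# Coefficient matrix: Estimate, Std. Error, t value, Pr(>|t|).
# broom::tidy(fit) would give the same numbers as a tibble with the columns
# term, estimate, std.error, statistic, and p.value.
coefs <- summary(fit)$coefficients
print(coefs)

# Decision rule: compare the slope's p-value with alpha
alpha       <- 0.05
slope_p     <- coefs["log_weaning_age", "Pr(>|t|)"]
reject_null <- slope_p < alpha
print(reject_null)
```

The p-value that matters for the hypothesis test is the one in the log_weaning_age row. The intercept's p-value only tests whether the intercept differs from zero.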
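The p-value for the slope, 2.56 × 10⁻⁵ (about 0.000026), is less than 0.05, so we reject the null in favor of the alternative hypothesis. Thus, there is a statistically significant relationship between weaning age and the encephalization quotient. Note, however, that the estimated slope is negative (-0.281): in this model, a higher weaning age is associated with a lower brain-to-body mass ratio, which is the opposite direction from our prediction.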
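To get a feel for the size of the slope, the estimate can be translated back to the original scale. This is a rough worked example, assuming both variables were logged with the same base, so that the slope acts as the exponent of a power-law relationship. The variable names are illustrative.

```r
# Slope estimate for log_weaning_age, taken from the model table above
slope <- -0.281

# In a log-log model, doubling the predictor multiplies the response by 2^slope
fold_change    <- 2^slope
percent_change <- (fold_change - 1) * 100

print(round(fold_change, 3))
print(round(percent_change, 1))
```

Under that assumption, doubling the weaning age goes with a brain-to-body mass ratio that is about 0.82 times as large, or roughly 18% lower.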