How many ducks were in FCUL’s 2023 duck farm?
Introduction
As is usually the case, in the 5th lecture of Ecologia Numérica at FCUL (9th October 2023) I conducted an experiment that aims to illustrate a number of relevant sampling concepts in a relaxed environment.
This document describes the challenge set to the students in the “Ecologia Numérica” class on that day. As with any good survey, there were pilot surveys in the previous two years, on the 4th October 2022 and the 27th September 2021. Those had different settings and different animals.
The experiment
Some key terms that might be important for students in Ecologia Numérica are highlighted in bold.
In fact, the challenge was announced on Twitter 90 minutes before class, but the students had not been explicitly told about it. The original post is here
The challenge was for the students to guess how many ducks were in a duck farm. Well, kind of: the ducks were small rubber and plastic ducks in a transparent glass jar. The image in the next slide shows the jar from different perspectives.
The duck farm
The details
The students were told that they would have to guess the number of ducks in the jar, with one twist that depended on their student number:
- students with an even number would get a cue
- students with an odd number would close their eyes and get no cue (and hence have no clue!)
The cue was that I showed the students with even numbers (and only them) 5 small ducks and 5 large ducks, and then added them to the jar.
The estimators and the truth
A key aspect of the exercise is that there were in reality 346 animals in the jar after the 10 from the cue were put back in. It is this true number that we want to estimate.
We will consider (at least) 3 estimators:
- the average of all students;
- the average of the odd students (this estimator was uncalibrated);
- the average of the even students (this estimator was somewhat calibrated).
The survey
Then the students in class guessed the number of animals and the results were reported on a Google Form here
The data
This form created a Google spreadsheet (as Google Forms do!) that was downloaded by TAM and is further processed here. The file was downloaded as a .csv file (“Patos na Quinta da FCUL.csv”).
Read the data in
# A tibble: 6 × 5
Timestamp How many ducks are l…¹ If you are a student…² Would you mind shari…³
<chr> <dbl> <dbl> <chr>
1 2023/10/… 250 NA Female
2 2023/10/… 156 52553 Male
3 2023/10/… 123 NA Male
4 2023/10/… 200 NA Female
5 2023/10/… 63 40169 Male
6 2023/10/… 137 NA Female
# ℹ abbreviated names:
# ¹`How many ducks are living in the farm (do not write anything but a number in this field - comments can be added below)`,
# ²`If you are a student at FCUL, what is your student number (if not a student, please ignore this question)?`,
# ³`Would you mind sharing your gender?`
# ℹ 1 more variable:
# `Any other comment you might have about the FCUL duck farm and this exercise?` <chr>
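The import step that produced the printout above is not shown. A minimal sketch of how the file could be read is given here, assuming base R rather than whatever the original used (the tibble printout suggests readr). The helper name `read_ducks` is hypothetical, and the small stand-in file is built from the first three rows printed above so that the sketch runs without the real file.

```r
# Sketch only: base R stand-in for the original import step.
# `read_ducks` is a hypothetical helper name, not from the original analysis.
read_ducks <- function(path) {
  # check.names = FALSE keeps the long Google Forms questions as column names
  read.csv(path, check.names = FALSE, stringsAsFactors = FALSE)
}

# Stand-in file built from the first three rows shown above (timestamps omitted),
# so the sketch runs without the real "Patos na Quinta da FCUL.csv".
demo <- data.frame(
  guess  = c(250, 156, 123),
  id     = c(NA, 52553, NA),
  gender = c("Female", "Male", "Male")
)
names(demo) <- c(
  "How many ducks are living in the farm (do not write anything but a number in this field - comments can be added below)",
  "If you are a student at FCUL, what is your student number (if not a student, please ignore this question)?",
  "Would you mind sharing your gender?"
)
tmp <- tempfile(fileext = ".csv")
write.csv(demo, tmp, row.names = FALSE)

patos <- read_ducks(tmp)
head(patos)

# With the real file in the working directory, the call would be:
# patos <- read_ducks("Patos na Quinta da FCUL.csv")
```

Keeping the original question text as column names makes the import easy to check against the form, at the cost of unwieldy names.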
Data summary
Let's take a look at a summary of the data:
Timestamp
Length:76
Class :character
Mode :character
How many ducks are living in the farm (do not write anything but a number in this field - comments can be added below)
Min. : 54.00
1st Qu.: 99.25
Median : 150.00
Mean : 201.62
3rd Qu.: 220.50
Max. :1230.00
If you are a student at FCUL, what is your student number (if not a student, please ignore this question)?
Min. :40169
1st Qu.:56611
Median :57912
Mean :56803
3rd Qu.:57987
Max. :60439
NA's :11
Would you mind sharing your gender?
Length:76
Class :character
Mode :character
Any other comment you might have about the FCUL duck farm and this exercise?
Length:76
Class :character
Mode :character
Data wrangling
We recode the gender as a factor, keep the student number, and collect both together with the estimates in a new data.frame with shorter column names, for easier use. Now we take a look at the data again:
estimativa studentid gender
Min. : 54.00 Min. :40169 Female :50
1st Qu.: 99.25 1st Qu.:56611 Male :24
Median : 150.00 Median :57912 Rather not tell: 1
Mean : 201.62 Mean :56803 NA's : 1
3rd Qu.: 220.50 3rd Qu.:57987
Max. :1230.00 Max. :60439
NA's :11
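The wrangling code itself is not shown, so the following is only a sketch of one way to get a data.frame with the three columns summarised above. The helper name `wrangle_ducks` is hypothetical. It assumes the columns sit in the positions shown in the tibble printout (2 = guess, 3 = student number, 4 = gender). The six rows used are the first six responses printed earlier.

```r
# Sketch only: `wrangle_ducks` is a hypothetical helper; columns are taken by
# position (2 = guess, 3 = student number, 4 = gender), as in the printed tibble.
wrangle_ducks <- function(raw) {
  data.frame(
    estimativa = as.numeric(raw[[2]]),
    studentid  = as.numeric(raw[[3]]),
    gender     = factor(raw[[4]])
  )
}

# First six responses shown earlier; the timestamps are truncated in the
# printout, so they are left as NA here rather than guessed.
raw <- data.frame(
  Timestamp = NA_character_,
  guess     = c(250, 156, 123, 200, 63, 137),
  number    = c(NA, 52553, NA, NA, 40169, NA),
  gender    = c("Female", "Male", "Male", "Female", "Male", "Female")
)

patos <- wrangle_ducks(raw)
summary(patos)
```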
We also create an indicator of whether the student number is odd or even.
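One way such an indicator could be built is with the modulo operator. This is a sketch under that assumption. The helper name `cue_group` is hypothetical, and respondents without a student number are left as NA.

```r
# Sketch only: `cue_group` is a hypothetical helper name.
cue_group <- function(studentid) {
  # %% is the modulo operator; missing student numbers (non-students) stay NA
  parity <- ifelse(studentid %% 2 == 0, "even", "odd")
  factor(parity, levels = c("even", "odd"))
}

# Two student numbers from the rows shown earlier, one missing value,
# and one hypothetical even number (50000) to show both levels
grupo <- cue_group(c(52553, 40169, NA, 50000))
grupo
```

Even numbers correspond to the students who got the cue, odd numbers to those who did not.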
The estimates
We can plot the estimates:
The estimates and the truth I
Let’s add a few things to the plot:
- the true number of ducks, as a green vertical line;
- the mean of the guesses, as a red dashed vertical line;
- a naive 95% confidence interval, as blue dashed lines.
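The plotting code is not shown, so here is a hedged sketch of how the three additions listed above could be drawn with base graphics. The helper names `naive_ci` and `plot_estimates` are hypothetical. Reading "naive" as the normal approximation (mean plus or minus 1.96 standard errors) is an assumption. The six guesses used are the first six shown earlier, standing in for the full set of 76.

```r
# Sketch only: `naive_ci` is a hypothetical helper; the normal-approximation
# interval (mean +/- 1.96 standard errors) is an assumption about "naive".
naive_ci <- function(x, level = 0.95) {
  x <- x[!is.na(x)]
  se <- sd(x) / sqrt(length(x))
  z <- qnorm(1 - (1 - level) / 2)
  c(lower = mean(x) - z * se, mean = mean(x), upper = mean(x) + z * se)
}

# Histogram of the guesses with the truth, the mean and the interval overlaid
plot_estimates <- function(x, truth = 346) {
  ci <- naive_ci(x)
  hist(x, main = "Guesses", xlab = "Number of ducks", xlim = range(c(x, truth)))
  abline(v = truth, col = "green", lwd = 2)                   # true number
  abline(v = ci["mean"], col = "red", lty = 2)                # mean of guesses
  abline(v = ci[c("lower", "upper")], col = "blue", lty = 2)  # naive 95% CI
  invisible(ci)
}

# First six guesses shown earlier, as a stand-in for the full 76 responses
ci <- plot_estimates(c(250, 156, 123, 200, 63, 137))
ci
```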
The estimates and the truth II
Conclusion
Clearly, one would reject the null hypothesis that the mean is equal to 346 (we have not covered the t-test in class yet, but you have seen it before). The estimates are too low compared to the truth.
One Sample t-test
data: estimativas
t = -6.5073, df = 75, p-value = 7.667e-09
alternative hypothesis: true mean is not equal to 346
95 percent confidence interval:
157.4187 245.8181
sample estimates:
mean of x
201.6184
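As a sanity check, the t statistic and p-value can be recovered from the numbers printed above alone, since the half-width of the confidence interval equals the t quantile times the standard error. The sketch below uses only reported values. The commented `t.test` call at the end is a guess at the original call, based on the `data: estimativas` line.

```r
# Values reported in the output above
xbar <- 201.6184
mu0  <- 346
n    <- 76
ci   <- c(157.4187, 245.8181)

df <- n - 1
# The CI half-width equals qt(0.975, df) * SE, so the standard error can be backed out
se <- diff(ci) / (2 * qt(0.975, df))
t_stat  <- (xbar - mu0) / se
p_value <- 2 * pt(-abs(t_stat), df)  # two-sided p-value

round(c(se = se, t = t_stat), 3)
p_value

# With the vector of guesses available, the test itself is presumably a call like:
# t.test(estimativas, mu = 346)
```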
Does gender matter?
Let's separate the estimates by gender.
There is not really much of a difference between males and females, and the sample size is too small to make inferences about those who “Rather not tell”.
Does a cue give you a clue? I
What about those who got the cue versus those who got no cue? We can check the mean value obtained by the different students.
Does a cue give you a clue? II
As expected, the mean value for students with even numbers (253) was closer to the truth than the mean value for students with odd numbers (153).
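The group means can be computed by splitting the guesses according to the parity of the student number. The sketch below shows the mechanics only. The helper name `mean_by_cue` is hypothetical and the toy data are invented for illustration, so the numbers it prints are not the class results.

```r
# Sketch only: `mean_by_cue` is hypothetical, and so is the toy data below.
mean_by_cue <- function(estimativa, studentid) {
  group <- ifelse(is.na(studentid), "no number",
                  ifelse(studentid %% 2 == 0, "even (cue)", "odd (no cue)"))
  # tapply applies mean() to the guesses within each group
  tapply(estimativa, group, mean)
}

# Toy guesses and student numbers, invented purely to show the mechanics
toy <- data.frame(
  estimativa = c(300, 260, 150, 120, 240),
  studentid  = c(50002, 50004, 50001, 50003, NA)
)
res <- mean_by_cue(toy$estimativa, toy$studentid)
res
```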
What about non EN students?
Interestingly, the few respondents who were not EN students, and hence did not get a cue either, were less biased (mean 235) than the students who got no cue in class. But there were only 11 of those, so that might just be a fluke and should not be interpreted as yet another proof that EN students are not very good at this game ;)
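To put the reported means side by side, the observed difference from the truth can be computed for each group. The sketch below uses only the (rounded) means quoted in the text. The difference between a sample mean and the truth is an estimate of the bias, not the bias itself.

```r
truth <- 346

# Group means as reported in the text (rounded)
means <- c(all = 201.6, even_cue = 253, odd_no_cue = 153, non_EN = 235)

bias     <- means - truth                  # observed mean minus truth
rel_bias <- round(100 * bias / truth, 1)   # as a percentage of the truth

data.frame(mean = means, bias = bias, rel_bias_pct = rel_bias)
```

All four groups underestimate the number of ducks. The group that got the cue is the closest, and the in-class group without a cue is the furthest.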
Take home messages
- Estimators can be biased (as human guesses often are)
- Humans are not in general good at counting ducks in jars (and other things too!)
- Information provided as cues (information about estimator bias) can help in the estimation process
- The larger the sample size, the more accurate an (unbiased) estimator becomes (not illustrated here, only shown in class)
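The last message can be illustrated with a small simulation. This is a sketch and not the in-class demonstration. It assumes hypothetical unbiased guessers whose guesses are normally distributed around the truth with a standard deviation of 190, which is roughly the spread implied by the t-test output. The helper name `sim_means` is hypothetical.

```r
set.seed(2023)
truth <- 346

# Hypothetical unbiased guessers: guesses centred on the truth, SD of 190
# (roughly the spread of the real guesses; the normal shape is an assumption).
# Returns the sample means of `reps` simulated classes of size n.
sim_means <- function(n, reps = 2000) {
  replicate(reps, mean(rnorm(n, mean = truth, sd = 190)))
}

sizes  <- c(10, 100, 1000)
spread <- sapply(sizes, function(n) sd(sim_means(n)))
names(spread) <- sizes

round(spread, 1)  # shrinks roughly as 190 / sqrt(n)
```

The spread of the sample mean falls as the sample size grows, but this only helps if the estimator is unbiased. A larger sample of biased guesses just gives a more precise wrong answer.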