Introduction

This page documents how I simulate data with the R package ‘squidSim’ to test different sample sizes and variances for a linear discriminant analysis (LDA) before running it on my observed data. Both the analysis and data simulation are new to me, so I am writing down the steps for my own future reference and so that I have simulated results to compare my observed-data analysis against.

Background

I have a small data set of adult American barn owls (Tyto furcata; hereafter, barn owls) on which I am running an LDA before applying the same analysis to nestling barn owls. My goal is to identify when the reversed sexual dimorphism seen in this species emerges and to determine whether an LDA can help sex individuals at an earlier age (i.e., before full body plumage coloration develops).

Linear discriminant analyses have been used to sex individuals of many bird species for decades (cite). A recent paper explained how differences in the way this analysis is applied may have led to poorly understood and even misused results. I aim to reduce that uncertainty for my data and species by using ‘squidSim’, an R package for simulating populations of data, to determine whether it is plausible to use an LDA on my birds before I run my own data.

EDIT First, I will run an LDA simulation on adult barn owls based on some of my observed data. The simulation inputs will include the mean hallux, culmen, and mass measurements for each sex, and the estimated variance of each morphometric for each sex. I will add more inputs afterwards if anything else comes to mind.
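
As a rough illustration of these inputs, the sketch below draws one simulated population from per-sex means and variances. It is plain Python, not the squidSim API, and every number in it is a hypothetical placeholder, not a measured barn owl value. It also assumes the three traits are independent and normally distributed, which the real data may not satisfy.

```python
import random

# Hypothetical placeholders, NOT measured barn owl data.
# Each trait maps a sex to (mean, variance); units are illustrative only.
TRAITS = {
    "hallux": {"F": (20.0, 1.0), "M": (19.0, 1.0)},
    "culmen": {"F": (22.0, 1.0), "M": (21.0, 1.0)},
    "mass": {"F": (520.0, 900.0), "M": (470.0, 900.0)},
}


def simulate_population(n_per_sex, traits, rng):
    """Draw n_per_sex females and n_per_sex males with independent normal traits."""
    rows = []
    for sex in ("F", "M"):
        for _ in range(n_per_sex):
            row = {"sex": sex}
            for name, by_sex in traits.items():
                mean, variance = by_sex[sex]
                # random.gauss takes a standard deviation, so convert the variance
                row[name] = rng.gauss(mean, variance ** 0.5)
            rows.append(row)
    return rows


rng = random.Random(1)  # fixed seed so the draw is reproducible
population = simulate_population(n_per_sex=5, traits=TRAITS, rng=rng)
print(len(population), population[0])
```

Treating the sample size as a per-sex count is an assumption made for this sketch; the same function could just as easily take a total that is split between the sexes.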


I have measurements for 37 adult females and 30 adult males. My simulations will start at 5 individuals and increase in increments of 10 up to a maximum of 100 (5, 15, ..., 95), giving 10 sample sizes. I will run each sample size at variances from 0.5 to 2 in increments of 0.5 (0.5, 1, 1.5, and 2). This comes out to 40 simulations (10 sample sizes × 4 variances), which I will summarize in a table, with figures showing how the results change across sample sizes. Each simulation will be run with 5000 populations so that the estimates are stable.
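
The design can be written out as a small grid to check the count. This is a minimal sketch in plain Python, assuming the sample sizes are 5, 15, ..., 95 (the reading that gives 40 combinations); the variable names are illustrative and do not come from squidSim.

```python
from itertools import product

sample_sizes = list(range(5, 100, 10))  # assumed reading: 5, 15, ..., 95
variances = [0.5, 1.0, 1.5, 2.0]
n_populations = 5000  # simulated populations per combination

# One entry per (sample size, variance) combination
design = [{"n": n, "variance": v} for n, v in product(sample_sizes, variances)]

print(len(design))                  # number of simulation settings
print(len(design) * n_populations)  # total simulated populations
```

Under this reading the grid has 40 settings, so 5000 populations each gives 200,000 simulated populations in total, which is worth keeping in mind when estimating run time.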

2023-07-11

Steps for running an LDA

1. Box’s M test (homogeneity of covariances)

2. Homogeneity of variances

3. Identify outliers

4. Use the jackknife (leave-one-out) method for cross-validation

5. Report the LDA coefficients, the percentage accuracy, and the confidence interval (CI) for each
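
Steps 4 and 5 can be sketched as follows: a two-group LDA with a pooled covariance matrix and equal priors, leave-one-out classification, and a confidence interval for the accuracy. This is a plain Python illustration on hypothetical standardized traits, not the squidSim workflow or a substitute for an established LDA routine. The Wilson interval is one possible choice that I am assuming here; the steps above do not say how the CI should be computed, and the sketch does not cover a CI for the coefficients.

```python
import math
import random


def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def pooled_covariance(groups):
    """Pooled within-group covariance (the equal-covariance assumption of LDA)."""
    p = len(groups[0][0])
    total = [[0.0] * p for _ in range(p)]
    for rows in groups:
        mu = mean_vector(rows)
        for x in rows:
            d = [xi - mi for xi, mi in zip(x, mu)]
            for i in range(p):
                for j in range(p):
                    total[i][j] += d[i] * d[j]
    dof = sum(len(rows) - 1 for rows in groups)
    return [[v / dof for v in row] for row in total]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_lda(females, males):
    """Return (coefficients, cutoff) for a two-group LDA with equal priors."""
    mu_f, mu_m = mean_vector(females), mean_vector(males)
    w = solve(pooled_covariance([females, males]),
              [f - m for f, m in zip(mu_f, mu_m)])
    cutoff = sum(wi * (f + m) / 2 for wi, f, m in zip(w, mu_f, mu_m))
    return w, cutoff


def predict(model, x):
    w, cutoff = model
    return "F" if sum(wi * xi for wi, xi in zip(w, x)) > cutoff else "M"


def loo_accuracy(females, males):
    """Refit without each individual in turn, then classify that individual."""
    data = [(x, "F") for x in females] + [(x, "M") for x in males]
    correct = 0
    for i, (x, label) in enumerate(data):
        train = data[:i] + data[i + 1:]
        f = [r for r, s in train if s == "F"]
        m = [r for r, s in train if s == "M"]
        correct += predict(fit_lda(f, m), x) == label
    return correct / len(data)


def wilson_interval(p_hat, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Hypothetical standardized trait means (hallux, culmen, mass), NOT observed data
rng = random.Random(42)
draw = lambda means, n: [[rng.gauss(mu, 1.0) for mu in means] for _ in range(n)]
females = draw([0.0, 0.0, 0.0], 20)
males = draw([-1.5, -1.0, -2.0], 20)

coefficients, cutoff = fit_lda(females, males)
accuracy = loo_accuracy(females, males)
low, high = wilson_interval(accuracy, len(females) + len(males))
print(coefficients, accuracy, (low, high))
```

Because each individual is classified by a model that never saw it, the leave-one-out accuracy is usually a little lower, and more honest, than the accuracy obtained by classifying the same birds used to fit the function.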