Simulating Mendel’s First Law (the Law of Segregation)

When modeling Mendel’s law of segregation, we can simulate the offspring of all possible pairs of parental genotypes:

        dd         dD         DD
dd      dd x dd    dd x dD    dd x DD
dD      dD x dd    dD x dD    dD x DD
DD      DD x dd    DD x dD    DD x DD

For each of these 9 scenarios, we expect the following offspring genotype probabilities:

Expected Offspring Genotypes

Parental Genotypes    Offspring Genotypes
                      dd      dD      DD
dd x dd               1       0       0
dd x dD               0.5     0.5     0
dd x DD               0       1       0
dD x dd               0.5     0.5     0
dD x dD               0.25    0.5     0.25
dD x DD               0       0.5     0.5
DD x dd               0       1       0
DD x dD               0       0.5     0.5
DD x DD               0       0       1
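
The expected probabilities can be derived by enumerating the four equally likely allele pairings for each cross, as in a Punnett square. The following is a minimal Python sketch of that enumeration; the function names `normalize` and `expected_offspring` are illustrative and not taken from the original analysis.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

GENOTYPES = ("dd", "dD", "DD")

def normalize(a, b):
    """Join two alleles with the lowercase allele first, so 'Dd' becomes 'dD'."""
    return "".join(sorted(a + b, reverse=True))

def expected_offspring(parent1, parent2):
    """Enumerate the four equally likely allele pairings for one cross."""
    counts = Counter(normalize(a, b) for a, b in product(parent1, parent2))
    return {g: Fraction(counts[g], 4) for g in GENOTYPES}

# Print one row per parental pair, in the column order dd, dD, DD.
for p1, p2 in product(GENOTYPES, repeat=2):
    probs = expected_offspring(p1, p2)
    print(f"{p1} x {p2}:", [float(probs[g]) for g in GENOTYPES])
```

Exact fractions are used here so that the values can be compared against the table without floating-point rounding.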


We can simulate this phenomenon by generating offspring for each possible pair of parental genotypes. For each of these 9 pairs, we randomly sample one allele from each parent and combine the two alleles into a single string that represents the offspring’s genotype. This is repeated until the desired sample size is reached (n = 100 offspring per pair). Then, the frequencies and conditional probabilities of homozygous dominant (DD), heterozygous (dD), and homozygous recessive (dd) offspring, given the parental genotypes, can be computed.
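
The sampling procedure described above might be sketched in Python as follows. This is an illustration under stated assumptions rather than the code used to produce the results in this document: the function name `simulate_cross` and the seed are arbitrary choices, and the output will differ from the table below because the random draws differ.

```python
import random
from collections import Counter

GENOTYPES = ("DD", "dD", "dd")

def simulate_cross(parent1, parent2, n=100, rng=None):
    """Simulate n offspring of one cross; return observed genotype proportions."""
    rng = rng if rng is not None else random.Random()
    counts = Counter()
    for _ in range(n):
        # Draw one allele at random from each parent.
        alleles = rng.choice(parent1) + rng.choice(parent2)
        # Store the pair as one string, lowercase allele first ('Dd' -> 'dD').
        counts["".join(sorted(alleles, reverse=True))] += 1
    return {g: counts[g] / n for g in GENOTYPES}

rng = random.Random(2024)  # arbitrary seed, used only for reproducibility
for p1 in ("dd", "dD", "DD"):
    for p2 in ("dd", "dD", "DD"):
        print(f"{p1} x {p2}:", simulate_cross(p1, p2, n=100, rng=rng))
```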

Simulated/Observed Offspring Genotypes

Parental Genotypes    Offspring Genotypes
                      DD      dD      dd
dd x dd               0.00    0.00    1.00
dd x dD               0.00    0.59    0.41
dd x DD               0.00    1.00    0.00
dD x dd               0.00    0.57    0.43
dD x dD               0.27    0.52    0.21
dD x DD               0.46    0.54    0.00
DD x dd               0.00    1.00    0.00
DD x dD               0.52    0.48    0.00
DD x DD               1.00    0.00    0.00


Fully Penetrant Recessive Model

Let D represent the allele that encodes the phenotype, and let Y denote phenotype status: Y = 1 indicates that an individual has the phenotype, and Y = 0 indicates that the individual does not. The probability of having the phenotype is conditional on the individual's genotype, G. For a fully penetrant recessive model,
\[P(Y = 1|G = DD) = 1\] \[P(Y = 1|G = dd) = P(Y = 1|G = dD) = 0\]

In other words, an individual needs two copies of the phenotype allele D (genotype DD) in order to present the phenotype. If the individual possesses just one copy (dD or Dd) or zero copies (dd) of the phenotype allele, they will not present this recessive phenotype. Note that the letter case in this scenario does not distinguish a dominant allele from a recessive one; instead, it distinguishes the allele that encodes the phenotype of interest (D) from the allele that does not (d).
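
As a minimal sketch, the penetrance function above can be written as a lookup table and used to assign a phenotype to simulated individuals. The names `PENETRANCE`, `phenotype`, and `proportion_with_phenotype` are hypothetical and chosen for illustration only.

```python
import random

# P(Y = 1 | G) under the fully penetrant recessive model described above.
PENETRANCE = {"DD": 1.0, "dD": 0.0, "dd": 0.0}

def phenotype(genotype, rng):
    """Return Y (1 = phenotype present, 0 = absent) for one individual."""
    key = "".join(sorted(genotype, reverse=True))  # treat 'Dd' the same as 'dD'
    # rng.random() lies in [0, 1), so penetrance 1 always gives Y = 1
    # and penetrance 0 always gives Y = 0.
    return 1 if rng.random() < PENETRANCE[key] else 0

def proportion_with_phenotype(genotype, n=100, rng=None):
    """Proportion of n simulated individuals of one genotype with Y = 1."""
    rng = rng if rng is not None else random.Random()
    return sum(phenotype(genotype, rng) for _ in range(n)) / n

for g in PENETRANCE:
    print(g, proportion_with_phenotype(g))
```

Because penetrance is either 0 or 1 here, the simulated proportions are deterministic; the random draw only matters if the table is changed to model incomplete penetrance.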

Fully Penetrant Recessive Model

Genotype    Phenotype Present?    Simulated Sample Size    Proportion with Phenotype
DD          Yes                   100                      1
dD          No                    100                      0
dd          No                    100                      0


Pink & White Flowers

When crossbreeding a pure pink flower (homozygous dominant, AA) with a pure white flower (homozygous recessive, aa), the first generation, \(F_1\), will be composed entirely of heterozygous pink flowers (aA). Self-pollination of heterozygous \(F_1\) pink plants would bring about a mixture of pink- and white-flowered offspring, where we expect to see 25% of offspring inheriting two copies of the dominant pink allele (i.e., being homozygous dominant), 50% of offspring inheriting one dominant pink allele and one recessive white allele (i.e., being heterozygous), and the remaining 25% of offspring inheriting two recessive white alleles. This \(F_2\) generation is thus expected to consist of 75% pink-flowered offspring and 25% white-flowered offspring.

Figure: Flower generations P through \(F_3\). Representation of Mendel’s basic experimental design for the law of segregation. Source: Mange and Mange (1999)


Among these \(F_2\) individuals, self-pollination of white flowers will perpetually produce more white flowers in \(F_3\) and beyond. Similarly, self-pollination of the homozygous pink flowers will produce more homozygous pink flowers in the next generation and beyond. Meanwhile, self-pollination of the heterozygous pink flowers will produce a mixture of homozygous pink, heterozygous pink, and homozygous white offspring at the aforementioned ratio of 1:2:1, respectively.

We will simulate two successive generations of self-pollination, starting from the heterozygous \(F_1\) plants, with 100 offspring per cross.
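
One way to sketch this simulation in Python is shown below. It assumes that self-pollination can be modeled as drawing both alleles at random from the same parental genotype; the function name `self_pollinate` and the seed are illustrative, and the counts it prints will not match the tables in this document because the random draws differ.

```python
import random
from collections import Counter

GENOTYPES = ("AA", "aA", "aa")

def self_pollinate(genotype, n=100, rng=None):
    """Simulate n offspring of a self-pollinating plant; return genotype counts."""
    rng = rng if rng is not None else random.Random()
    counts = Counter()
    for _ in range(n):
        # Selfing: both alleles are drawn from the same parental genotype.
        alleles = rng.choice(genotype) + rng.choice(genotype)
        # Store the pair with the lowercase allele first ('Aa' -> 'aA').
        counts["".join(sorted(alleles, reverse=True))] += 1
    return {g: counts[g] for g in GENOTYPES}

rng = random.Random(1999)  # arbitrary seed, used only for reproducibility

# F1 heterozygotes (aA) self-pollinate to produce F2.
print("F2 from aA x aA:", self_pollinate("aA", rng=rng))

# Each F2 genotype self-pollinates to produce F3.
for parent in GENOTYPES:
    print(f"F3 from {parent} x {parent}:", self_pollinate(parent, rng=rng))
```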

Here are the results from the simulated self-pollination of \(F_1\) heterozygotes, producing \(F_2\):

F2 Offspring (aA x aA)

Genotype    Frequency    Proportion
AA          29           0.29
aA          45           0.45
aa          26           0.26


With a sample size of 100, the simulated proportions do not exactly match the 25%/50%/25% breakdown of AA, aA, and aa we were anticipating, but they are close. A larger sample size would be expected to bring the simulated \(F_2\) generation closer to this expectation. Even so, 74% of \(F_2\) offspring (AA and aA combined) are pink-flowered, which is roughly the expected 3/4, and the remaining 26% are white-flowered.
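
The effect of sample size can be explored with a short sketch such as the one below, which repeats the aA x aA cross at several sample sizes and reports the largest gap between the simulated and expected proportions. The sample sizes, seed, and function names are assumptions chosen for illustration. The gap typically shrinks as n grows, although any single run may not decrease monotonically.

```python
import random

EXPECTED = {"AA": 0.25, "aA": 0.50, "aa": 0.25}

def f2_proportions(n, rng):
    """Simulate n offspring of aA x aA and return genotype proportions."""
    counts = dict.fromkeys(EXPECTED, 0)
    for _ in range(n):
        alleles = rng.choice("aA") + rng.choice("aA")
        # Store the pair with the lowercase allele first ('Aa' -> 'aA').
        counts["".join(sorted(alleles, reverse=True))] += 1
    return {g: c / n for g, c in counts.items()}

def max_deviation(props):
    """Largest absolute gap between simulated and expected proportions."""
    return max(abs(props[g] - EXPECTED[g]) for g in EXPECTED)

rng = random.Random(7)  # arbitrary seed, used only for reproducibility
for n in (100, 1_000, 10_000, 100_000):
    props = f2_proportions(n, rng)
    print(f"n={n}: {props}, max deviation = {max_deviation(props):.4f}")
```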


Next, simulating self-pollination for each of the three \(F_2\) genotypes resulted in the following genotype frequencies in \(F_3\), broken down by \(F_2\) genotype:

F3 Offspring (AA x AA, aA x aA, aa x aa)

F2 Genotype    F3 Genotype    Frequency    Proportion
AA             AA             100          1.00
AA             aA             0            0.00
AA             aa             0            0.00
aA             AA             18           0.18
aA             aA             62           0.62
aA             aa             20           0.20
aa             AA             0            0.00
aa             aA             0            0.00
aa             aa             100          1.00