Preliminary Summary of a Plant Selection Model for DPR

Author

John M. Mola

Introduction

This document provides a preliminary look at the plant selection process. It uses structured decision analysis, a blend of quantitative and qualitative methods in which a user starts with a dataset of attributes of interest and decides how those attributes should be valued in the decision-making process.

The current version of the plant selection model includes 30 plant species, all of them forbs listed on DPR’s existing plant list. Each species has over 20 attributes, though only 12 are considered here due to various limitations (described below).

There are several steps in this process, which I’ll try to summarize briefly.

Description of the procedure

Initial dataset

First, you need some sort of dataset. In our case, we have 120 plant species from several regionally relevant pollinator-friendly plant lists, along with measured traits for each. Traits include things like the height of the plant, when it flowers, iNaturalist records of associations with insects, the color of the flower, root length, tissue density, etc. Each of these traits can be used as a proxy for a value we might hold. For example, color is associated with pollinator attraction but also with aesthetic value for park goers. Root length is associated with drought resilience.

Here is a dataframe of all 30 plant species and their full list of attributes (i.e. not just the 12 we consider in the rest of the workflow below):

NOTE: ALL TABLES IN THIS DOCUMENT ARE INTERACTIVE. Feel free to scroll, sort, etc.

Hypothetical plant mixes

Next, we have to combine these plants into possible “plant mixes”, because we are not interested in planting just a single species. As a result, traits like color or root length are no longer attributes of a single plant. Instead, we may care about having a variety of flower colors, or variation in height or root length, since that variability may be preferable to a selection of plants that all share similar traits.

Here is a small sample of the plant mix data (just 10 randomly selected mixes out of the 142,506 combinations considered!):
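To make the step from plant-level traits to mix-level traits concrete, here is a minimal Python sketch. It is not the code behind this report: the species names, trait values, column names, and the `summarize_mix` helper are all hypothetical, and the mix size is reduced to 3 so the toy example stays small.

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical species and trait values, for illustration only (not the DPR data).
plants = {
    "species_a": {"height_cm": 30.0, "root_depth_cm": 20.0},
    "species_b": {"height_cm": 60.0, "root_depth_cm": 45.0},
    "species_c": {"height_cm": 120.0, "root_depth_cm": 80.0},
    "species_d": {"height_cm": 45.0, "root_depth_cm": 150.0},
}


def summarize_mix(names, traits=("height_cm", "root_depth_cm")):
    """Collapse plant-level traits into a mix-level mean and standard deviation."""
    row = {"mix": " + ".join(names)}
    for trait in traits:
        values = [plants[name][trait] for name in names]
        row[f"{trait}_mean"] = mean(values)
        row[f"{trait}_sd"] = stdev(values)  # sample standard deviation
    return row


MIX_SIZE = 3  # the model described here uses 5; 3 keeps the toy example small
mixes = [summarize_mix(names) for names in combinations(sorted(plants), MIX_SIZE)]

for row in mixes:
    print(row)
```

Each row is one candidate mix, and the `_mean` and `_sd` columns are the kind of mix-level attributes that get scored in the next step.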

Assigning “attribute values” to different traits (“Single Attribute Value Function” or SAVF)

Now that we have our potential mixes and their associated traits, we need to decide how we will evaluate these traits. Using the example of plant height, let’s assume that we are interested in having lots of variability in plant height - we don’t just want the plant mix with the maximum height, but instead, we’d prefer a mix of short, medium, and tall plants. One way we can achieve that is by assigning higher value to plant mixes with a greater standard deviation of plant height.

Here’s what that might look like visually. In this example, the plant mix in our dataset with the lowest amount of variability is assigned an attribute value of 0, and the mix with the highest variability is assigned a value of 1.

The following plot represents the SAVF for the standard deviation of plant height (i.e. how variable the heights of the plants in the mix are). In SAVFs, higher values on the y-axis (SAVF Score) are “better” options, so in this case a high standard deviation of plant height results in a high SAVF score. The blue dot represents the mean for all plant mixes in the dataframe.
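As a rough illustration of the 0-to-1 scaling, here is a sketch that assumes the simplest possible shape: a straight line between the lowest and highest values in the dataset. The model itself uses exponential curves (see Current Limitations), so this shows only the endpoints and direction. The function name `linear_savf` and the sample values are made up.

```python
def linear_savf(values, increasing=True):
    """Rescale raw attribute values to [0, 1] along a straight line.

    Assumption for illustration: a linear value function. The lowest value
    in the data scores 0 and the highest scores 1 (or the reverse when
    increasing=False).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; the attribute cannot be scored")
    scores = [(v - lo) / (hi - lo) for v in values]
    return scores if increasing else [1.0 - s for s in scores]


# Hypothetical standard deviations of plant height for four mixes.
height_sd = [5.0, 12.5, 20.0, 35.0]
scores = linear_savf(height_sd)
print(scores)  # [0.0, 0.25, 0.5, 1.0]
```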

We can also assign different shapes or different directions to our attribute value functions:

SAVF for the mean number of insects per 100 photos on iNaturalist. The blue dot represents the mean for all plant mixes.

SAVF for specific leaf area (SLA), the ratio of leaf area to dry leaf mass. Lower values indicate greater drought tolerance, so here the function is declining: low SLA values receive a higher score. The blue dot represents the mean for all plant mixes.

Using these value functions (i.e. the curves shown above), we assign each attribute a score ranging from 0 (worst) to 1 (best).
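The curved and declining shapes can be sketched with an exponential value function. This is one common form, offered as an assumption rather than the exact function used here: v(z) = (1 - exp(-c z)) / (1 - exp(-c)) on the rescaled attribute z in [0, 1]. The curvature c is chosen so that the median mix scores 0.5, which is also an assumed rule. All function names are hypothetical.

```python
import math
from statistics import median


def exp_value(z, c):
    """Exponential value curve on z in [0, 1]; c sets curvature (c near 0 is linear)."""
    if abs(c) < 1e-12:
        return z
    return math.expm1(-c * z) / math.expm1(-c)


def solve_curvature(z_mid, target=0.5, lo=-500.0, hi=500.0, iters=200):
    """Bisection for the c at which exp_value(z_mid, c) equals target.

    Assumed rule: the curve passes through 0.5 at the median. This works
    because exp_value is increasing in c for any fixed z strictly
    between 0 and 1.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if exp_value(z_mid, mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def exponential_savf(values, increasing=True):
    """Score raw values on [0, 1]; set increasing=False when lower is better."""
    lo, hi = min(values), max(values)
    z_all = [(v - lo) / (hi - lo) for v in values]
    if not increasing:
        z_all = [1.0 - z for z in z_all]
    c = solve_curvature(median(z_all))
    return [exp_value(z, c) for z in z_all]


# Hypothetical mean SLA values for five mixes; lower SLA is treated as better.
sla_mean = [10.0, 12.0, 15.0, 20.0, 30.0]
sla_scores = exponential_savf(sla_mean, increasing=False)
print([round(s, 3) for s in sla_scores])
```

In this toy input the lowest SLA scores 1, the highest scores 0, and the median mix (SLA of 15) lands at 0.5 by construction.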

Here’s an example of what that looks like for the first 10 plant mixes in the dataset (note, the plant names are dropped here, but get added back in later). You’d read this as each row representing a different plant mix. So, all the mixes here have the same flowering duration - but they vary substantially in how variable their plant heights are, with lower values being less preferred (as they have lower variability).

Assigning priorities (using “weighting”) in our selection (“Multiple Attribute Value Function” or MAVF)

Lastly, we decide where our priorities lie in choosing the mix of plants. Because traits like variation in height, insect records, or root length may be associated with different restoration priorities (biodiversity, pollinator attraction, and drought tolerance, respectively), we can design plant mixes that prioritize these different traits.

So, for our 12 measured traits in the database, we can simply weight all of them equally, assigning 100/12 = ~8.3% of the value to each trait. Or we can prioritize some attributes over others: for example, we might assign 30% of the value to iNaturalist insect records (i.e. a measure of pollinator attractiveness) and split the remaining weight equally among the other traits (70/11 = ~6.36% each).
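The weighting arithmetic above can be sketched as follows. The helper `build_weights` and the attribute names are hypothetical placeholders, apart from `inat_insects`, which is the short name used later in this document. The rule shown (prioritized attributes get their stated share, the rest split the remainder evenly) is the one just described in the text.

```python
def build_weights(attributes, priorities=None):
    """Give each prioritized attribute its stated share; split the rest evenly."""
    priorities = priorities or {}
    unknown = set(priorities) - set(attributes)
    if unknown:
        raise ValueError(f"unknown attributes: {sorted(unknown)}")
    remaining = 1.0 - sum(priorities.values())
    if remaining < 0:
        raise ValueError("priority weights exceed 100%")
    others = [a for a in attributes if a not in priorities]
    weights = dict(priorities)
    for attribute in others:
        weights[attribute] = remaining / len(others)
    return weights


# Placeholder names for the 12 attributes considered in this document.
attributes = [
    "flower_duration", "height_sd", "inat_insects", "leaf_tissue_mean",
    "leaf_tissue_sd", "species_richness", "root_depth_mean", "root_depth_sd",
    "sla_mean", "sla_sd", "srl_mean", "srl_sd",
]

even = build_weights(attributes)
pollinator = build_weights(attributes, {"inat_insects": 0.30})

print(round(even["height_sd"] * 100, 2))        # 8.33
print(round(pollinator["height_sd"] * 100, 2))  # 6.36
```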

We then multiply each attribute value for a plant mix by that attribute’s weight and add up the weighted values, which gives the mix’s final MAVF score (see below).
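A minimal sketch of this scoring step, assuming the standard additive form (a weighted sum of SAVF scores). The function name, the three attributes, and all numbers are illustrative, not taken from the actual dataset.

```python
def mavf_score(savf_scores, weights):
    """Weighted sum of single-attribute scores, assuming an additive MAVF.

    Dividing by the total weight keeps the result on a 0-to-1 scale even
    if the weights were rounded and do not sum exactly to 1.
    """
    if set(savf_scores) != set(weights):
        raise ValueError("scores and weights must cover the same attributes")
    total_weight = sum(weights.values())
    weighted = sum(weights[a] * savf_scores[a] for a in weights)
    return weighted / total_weight


# Hypothetical SAVF scores for one mix, and one possible weighting.
savf_scores = {"height_sd": 0.8, "inat_insects": 0.5, "root_depth_mean": 0.2}
weights = {"height_sd": 0.25, "inat_insects": 0.50, "root_depth_mean": 0.25}

score = mavf_score(savf_scores, weights)
print(round(score, 3))  # 0.5
```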

Current Limitations

Only 5 plants at a time are considered in each mix because we don’t have reliable cost data to constrain mix size. If we tried to make every possible combination of the 30 plants, we’d have over a billion possibilities (2^30), and that breaks my computer.
We have 120+ plants with data, but because of the previously mentioned problem, we only consider the 30 already in the DPR database. I have ideas for how to address these problems in subsequent iterations.
We assume an exponential function for all SAVF calculations, with the lowest value in the dataset assigned a value of 0, the highest assigned a value of 1, and the median defining the shape of the function. That shape may not always be appropriate. For example, we may decide that a plant mix containing 2 colors is twice as good as a mix with 1 color, but that a mix with 4 colors is far better than a mix with 2, in which case we would want a more aggressive curve for higher values.
I provide a few examples of weighting options below, but there are many possibilities, and eventually I’d like to let the user choose the parameters themselves. One other limitation is that most traits are associated with drought tolerance, so an “even” weighting is still skewed in importance towards drought tolerance, and there may be collinearity among those traits.
Some attributes would be good to add, and some current ones are silly. It would be nice to have more attributes for evaluating pollinator attraction. One attribute that is silly in its current form, and therefore not used below, is “floral form”, which is hard to break down into something easily scored. Instead, I think I’ll incorporate family-level diversity at a later point. That will likely capture lots of broad-scale variation and help us avoid plant lists that just have lots of closely related plants (e.g. different legume or sunflower genera).
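To put the size of the search space in perspective, the counts behind the first two limitations can be checked directly. This is a small sketch with illustrative variable names. The 120-species figure is the approximate size of the full trait dataset mentioned above.

```python
import math

N_SPECIES = 30      # forbs currently in the model
N_CANDIDATES = 120  # approximate size of the full trait dataset
MIX_SIZE = 5

fixed_size_mixes = math.comb(N_SPECIES, MIX_SIZE)      # 5 plants chosen from 30
any_size_mixes = 2 ** N_SPECIES - 1                    # every non-empty subset of 30
larger_pool_mixes = math.comb(N_CANDIDATES, MIX_SIZE)  # 5 plants chosen from 120

print(f"{fixed_size_mixes:,}")   # 142,506
print(f"{any_size_mixes:,}")     # 1,073,741,823
print(f"{larger_pool_mixes:,}")  # 190,578,024
```

Even at a fixed mix size of 5, moving from 30 to 120 candidate species multiplies the number of mixes by more than a thousand.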

Preliminary Results

Each of the tables below shows the best or worst 10 plant mixes under a given scenario. The first column lists the plants in the mix, the second column gives the mix’s MAVF Score, and the remaining columns show the underlying attributes in the dataset (the values from which the SAVF scores were calculated).
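Producing "best" and "worst" tables like these amounts to sorting the mixes by MAVF score. The sketch below is illustrative only: `rank_mixes`, the mix names, and the scores are made up.

```python
def rank_mixes(scored_mixes, n=10):
    """Return the n highest- and n lowest-scoring mixes by MAVF score."""
    ordered = sorted(scored_mixes, key=lambda row: row["mavf"], reverse=True)
    best = ordered[:n]
    worst = ordered[-n:][::-1]  # lowest score first
    return best, worst


# Hypothetical scored mixes.
scored_mixes = [
    {"mix": "mix_1", "mavf": 0.62},
    {"mix": "mix_2", "mavf": 0.35},
    {"mix": "mix_3", "mavf": 0.71},
    {"mix": "mix_4", "mavf": 0.48},
]

best, worst = rank_mixes(scored_mixes, n=2)
print([row["mix"] for row in best])   # ['mix_3', 'mix_1']
print([row["mix"] for row in worst])  # ['mix_2', 'mix_4']
```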

Here’s what the weight assignments were in these two scenarios:

Attribute	Weight (even scenario)	Weight (pollinator preference scenario)
Flower duration	8.3%	8.7%
Height Variability	8.3%	8.7%
iNaturalist insect records	8.3%	30%
Leaf Tissue mean	8.3%	5.8%
Leaf Tissue variability	8.3%	5.8%
Species richness	8.3%	5.8%
Root depth mean	8.3%	5.8%
Root depth variability	8.3%	5.8%
Specific leaf area mean	8.3%	5.8%
Specific leaf area variability	8.3%	5.8%
Specific root length mean	8.3%	5.8%
Specific root length variability	8.3%	5.8%

Even weight to all attributes

Best 10 plant mixes from an “even weight” scenario – if you look these up, you’ll see there’s lots of variation in height, floral form, etc. in these plants!

Worst 10 plant mixes from an “even weight” scenario – if you look these up, you’ll see there is not much variation in the height or growth characteristics of these plants.

Priority given to pollinator attributes

Best 10 plant mixes from a “pollinator weight” scenario – 30% of the weight was given to the attribute “mean number of photos with insects foraging on the flower per 100 photos on iNaturalist” (inat_insects, for short). Interestingly, the top mix is still the same! But lots of others change.

Worst 10 plant mixes from a “pollinator weight” scenario

Closing thoughts

Lots to do still, but it works and it’s pretty cool! Improving the quality of the input data will improve the quality of the output.

We should have fun conversations about how to score/fit functions to different attributes.

There are ways to examine which plants in the dataset are most influential, but I’ve run into some issues implementing that and need time to troubleshoot. It’ll be super cool once done.