Species Average Trait Value Calculation

Installing and Loading Packages
How your imported data should look
Summarizing Species Averages in New Dataframe
How your result data should look
Write .csv of new species averages
Now try with your own data!

This tutorial walks you through calculating average trait values for each species in a dataset that contains multiple trait measurements for multiple species. We will create a new data-frame that groups the data by species and provides an average trait value for each species present in the original dataset. Species trait averages are useful for several purposes, including creating figures and using the resulting dataset in a secondary analysis or index calculation.

Installing and Loading Packages

To calculate trait averages and organize the resulting data-frame we will be using tidyr and dplyr (this method is very similar to the one we used in the community weighted mean calculation tutorial). Start by installing tidyverse if you have not already done so (use install.packages("tidyverse")). Then load the appropriate packages as shown here:

# Load packages
library(tidyr)
library(dplyr)

Be sure to set your working directory! HINT - setwd()

How your imported data should look

To calculate species averages for traits, all you need is a data-frame with a species column and the trait values for each sample of each species.

Your data should look something like this:

Ex. Dataframe
Species	Height..cm.	SLA..cm2.g.	X.N..N.mass.	N15	X.C..Cmass.	C13
ammophila_breviligulata	156.0	70.570	1.03	1.74	49.21	-26.91
cyperus_esculentes	130.0	66.726	0.87	0.85	47.84	-13.14
panicum_amarum	127.0	293.776	1.27	-4.14	42.74	-11.87
setaria_parvifolia	77.0	138.056	0.51	-2.92	46.27	-11.68
spartina_patens	116.0	169.155	1.25	3.76	46.86	-12.69
ammophila_breviligulata	68.5	51.094	0.88	-0.54	48.85	-25.78
andropogon_virginicus	102.5	78.732	0.86	0.70	39.24	-13.19
conyza_canadensis	58.0	141.304	1.63	-2.61	47.22	-29.12
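Before summarizing, it can help to confirm that the imported columns are what you expect. The sketch below is only an illustration: it rebuilds the first four rows of the example table by hand (the object name example.traits is made up for this demo), inspects it with str() and head(), and checks that the trait columns are numeric. It also assumes that the original spreadsheet headers looked something like "Height (cm)", which would explain the dotted column names, since read.csv() by default renames columns using the same rule as make.names().

```r
# Rebuild the first four rows of the example table by hand (illustration only;
# with real data you would use read.csv() instead).
example.traits <- data.frame(
  Species     = c("ammophila_breviligulata", "cyperus_esculentes",
                  "panicum_amarum", "setaria_parvifolia"),
  Height..cm. = c(156.0, 130.0, 127.0, 77.0),
  SLA..cm2.g. = c(70.570, 66.726, 293.776, 138.056)
)

str(example.traits)    # column names and types
head(example.traits)   # first rows of the data

# Trait columns need to be numeric for mean() to work
trait.is.numeric <- sapply(example.traits[-1], is.numeric)

# Assumption: the spreadsheet headers looked like "Height (cm)".
# make.names() applies the renaming rule that read.csv() uses by default.
cleaned.names <- make.names(c("Height (cm)", "SLA (cm2/g)"))
```

If a trait column shows up as character instead of numeric, the file probably contains stray text in that column, and it is worth fixing before calculating averages.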

Summarizing Species Averages in New Dataframe

First we will direct the result to a new data-frame, which will help us check our work easily in R (in this example I am naming the data-frame summarize.spp.avg). In the code chunk below you will notice an operator that may be new to you (%>%). This is called “piping”. It takes the output of one statement and makes it the input of the next statement. It is a commonly used operator in dplyr.
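As a small illustration of what piping does, the sketch below computes the same value twice, once with nested function calls and once with %>%. The vector and object names are made up for this demo.

```r
library(dplyr)  # makes the %>% operator available

values <- c(1, 4, 9)

# Nested calls: read from the inside out
nested.result <- sum(sqrt(values))

# Piped calls: read from left to right
piped.result <- values %>% sqrt() %>% sum()
```

Both versions give the same answer; the piped form simply lists the steps in the order they happen.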

In the example below, we take all the data from nutnet.spp.avg and use it as the input for the group_by() function. Grouping by “Species” tells R that we want our final average trait values to be grouped by species name. We then use the output of group_by(Species) as the input for the summarize() function. The summarize() function is where we apply statistical functions to each group. We can use the mean() function to calculate averages.

In the summarize() statement you first name each column of the summary (here I use the name of the trait that will be averaged for each species). Next, you define the statistical function you want R to perform. Here we want mean().

# Calculating species averages using dplyr functions
summarize.spp.avg <-              # New data-frame where we can inspect the result
  nutnet.spp.avg %>%              # Imported data, passed into the first step
  group_by(Species) %>%           # Groups the rows by species name
  summarize(                      # Defines the summary columns for each species
    Height = mean(Height..cm.),   # Mean of each trait within a species
    SLA = mean(SLA..cm2.g.),
    Nmass = mean(X.N..N.mass.),
    N15 = mean(N15),
    Cmass = mean(X.C..Cmass.),
    C13 = mean(C13),
    CNratio = mean(C.N..ratio.)   # Needs a C.N..ratio. column (not shown in the example table)
  )
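To see the calculation on a small scale, here is a minimal sketch that applies the same group_by() and summarize() steps to the eight example rows shown earlier, using only the Height and SLA columns. The object names example.traits and example.avg are made up for this demo, and the averages will not match the result table in the next section, which comes from the full dataset.

```r
library(dplyr)

# The eight example rows shown earlier, Height and SLA columns only
example.traits <- data.frame(
  Species = c("ammophila_breviligulata", "cyperus_esculentes", "panicum_amarum",
              "setaria_parvifolia", "spartina_patens", "ammophila_breviligulata",
              "andropogon_virginicus", "conyza_canadensis"),
  Height..cm. = c(156.0, 130.0, 127.0, 77.0, 116.0, 68.5, 102.5, 58.0),
  SLA..cm2.g. = c(70.570, 66.726, 293.776, 138.056, 169.155, 51.094, 78.732, 141.304)
)

example.avg <- example.traits %>%
  group_by(Species) %>%            # one group per species name
  summarize(
    Height = mean(Height..cm.),    # mean height within each species
    SLA    = mean(SLA..cm2.g.)     # mean SLA within each species
  )

# ammophila_breviligulata has two samples: (156.0 + 68.5) / 2 = 112.25
ammophila.height <- example.avg$Height[example.avg$Species == "ammophila_breviligulata"]
```

One thing to keep in mind: if a trait column contains missing values (NA), mean() returns NA for that species. Writing mean(Height..cm., na.rm = TRUE) drops the missing values before averaging.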

How your result data should look

Your new summarize.spp.avg data-frame should now appear in your R environment. It should look something like this:

Ex. Result Dataframe
Species	Height	SLA	Nmass	N15	Cmass	C13
ammophila_breviligulata	110.74500	61.31740	0.9440000	1.6260000	48.70100	-26.12250
andropogon_virginicus	110.15455	65.51773	0.7690909	2.1372727	47.38909	-12.94273
conyza_canadensis	76.91429	102.59471	1.0414286	-1.4557143	47.52714	-20.26143
cyperus_esculentes	122.25000	152.98050	0.8675000	0.6650000	46.22375	-12.56625
fimbrystylis_castanae	76.00000	169.70050	1.0350000	-2.3150000	48.40500	-20.56000
gnaphalium_purpureum	57.00000	247.14300	1.6300000	-2.4200000	45.98000	-30.19000
panicum_amarum	83.60000	183.56900	1.4560000	-0.1250000	46.08400	-20.69900
setaria_parvifolia	96.57692	181.21869	0.8192308	0.9023077	45.94692	-12.12231

You should notice that you no longer have repeating species observations because we grouped by species identity.
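If you would like to confirm this in code, one option is sketched below using a hypothetical toy dataset (the species names and values are made up). It adds an n() column that records how many samples went into each average, and uses anyDuplicated() to check that no species name is repeated in the result.

```r
library(dplyr)

# Hypothetical toy data: species_a sampled twice, species_b once
toy.traits <- data.frame(
  Species     = c("species_a", "species_b", "species_a"),
  Height..cm. = c(10, 20, 30)
)

toy.avg <- toy.traits %>%
  group_by(Species) %>%
  summarize(
    Height    = mean(Height..cm.),
    n.samples = n()                # number of rows averaged for each species
  )

# anyDuplicated() returns 0 when no species name appears more than once
duplicate.index <- anyDuplicated(toy.avg$Species)
```

A sample count next to each average is also a useful reminder that some species means are based on very few observations.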

Write .csv of new species averages

As a final step you should write your result data-frame as a .csv and save it so that it can be easily read into R or other statistical software if you are interested in running further analysis on the data.

write.csv(summarize.spp.avg, "summarize_spp_avg.csv")
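By default write.csv() also writes the row numbers as an extra unnamed first column. If you would rather leave that column out, a possible variation is sketched below with a hypothetical two-row table standing in for summarize.spp.avg. It writes to a temporary folder so that the demo does not touch your working directory, then reads the file back in as a quick check.

```r
# Hypothetical small result table standing in for summarize.spp.avg
toy.avg <- data.frame(
  Species = c("species_a", "species_b"),
  Height  = c(20, 20)
)

# Temporary location used only for this demo
out.file <- file.path(tempdir(), "toy_spp_avg.csv")

# row.names = FALSE leaves out the extra column of row numbers
write.csv(toy.avg, out.file, row.names = FALSE)

# Read the file back in to confirm it matches what was written
toy.check <- read.csv(out.file)
```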

Now try with your own data!

# Not Run
# Load packages
library(tidyr)
library(dplyr)

# SET WORKING DIRECTORY!

# Calculating species averages and summarizing by species - using dplyr
nutnet.spp.avg <- read.csv("fnxl.trait.nutnet_spp.avg.csv")
summarize.spp.avg <-
  nutnet.spp.avg %>%
  group_by(Species) %>%
  summarize(Height = mean(Height..cm.),
            SLA = mean(SLA..cm2.g.), 
            Nmass = mean(X.N..N.mass.),
            N15 = mean(N15),
            Cmass = mean(X.C..Cmass.),
            C13 = mean(C13),
            CNratio = mean(C.N..ratio.)
            )
write.csv(summarize.spp.avg, "summarize_spp_avg.csv")