This document contains information about candidate varieties to grow to test predictions about CDBN variety performance. There are four types of variety for which I’d like to test these predictions:
1. Varieties that are predicted to perform well across all sites (with low GxE).
2. Varieties that are predicted to perform well in major bean growing areas (with high GxE).
3. Varieties that are predicted to perform well at specific sites.
4. Varieties that are predicted to perform poorly at specific sites.
The tables below also contain information on possible seed sources for these field trials - I am bulking seed in Puerto Rico currently, but many of these varieties are also in the MDP, DDP, or ADP, and you might have seed for these varieties already.
These predictions are based on a Finlay-Wilkinson (FW) model fitted using a Bayesian Gibbs sampler and a matrix of variety relatedness (A). The Gibbs sampler shrinks the estimates for each variety towards the model's average performance, and generally gives better predictive power than an ordinary least squares model. The A matrix was calculated in Tassel using its recommended methodology (centered IBS). The SNP matrix used in Tassel was generated from GBS data collected by Phil McClean, Rian Lee, and Alice MacQueen using the ApeKI enzyme; reads were aligned to the P. vulgaris genome v2.0 using bwa mem, and SNPs were called using NGSEP.
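For reference, here is the Finlay-Wilkinson regression written out. The notation is my assumption, chosen to match the column names used in the code in this document (g, b, h, yhat); it is a sketch of the standard form of the model, not a transcription of the fitting code.

```latex
\begin{aligned}
y_{ij} &= \mu + g_i + (1 + b_i)\,h_j + \varepsilon_{ij}, \\
\hat{y}_{ij} &= \hat{\mu} + \hat{g}_i + (1 + \hat{b}_i)\,\hat{h}_j .
\end{aligned}
```

Here y_ij is the yield of variety i at location j, mu is the overall mean, g_i is the genetic main effect of variety i, h_j is the effect of location j, and 1 + b_i is the slope of variety i across locations. Assuming the usual formulation, the relatedness matrix A enters as the covariance structure of the variety effects (for example, g ~ N(0, A sigma_g^2)), which is what pulls the estimates for related varieties towards each other.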
The FW model was fitted for 312 bean varieties across 30 CDBN locations. The 30 locations were picked from 77 CDBN locations by selecting locations that had grown ten check varieties (CELRK, Fleetwood, Montcalm, NW63, Othello, Midnight, Redkloud, UI114, UI59, & Viva) at least once. The year each variety was grown was ignored in this model. Future work will account for the effect of year by incorporating daily weather data into the model.
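The location filter described above can be sketched as follows. This is a minimal illustration on made-up data, and it assumes that "had grown ten check varieties at least once" means every one of the checks appears at the location in at least one year. The location codes and the three-variety check list are stand-ins, not the real CDBN records.

```r
# Toy trial records: one row per variety grown at a location (hypothetical data).
trials <- data.frame(
  Location_code = c("AAAA", "AAAA", "AAAA", "BBBB", "BBBB",
                    "CCCC", "CCCC", "CCCC", "CCCC"),
  CDBN_ID       = c("Othello", "Viva", "UI59", "Othello", "Viva",
                    "Othello", "Viva", "UI59", "Viva"),
  stringsAsFactors = FALSE
)

# Stand-in for the ten check varieties (shortened to three for the example).
checks <- c("Othello", "Viva", "UI59")

# For each location, count how many distinct checks were grown at least once.
check_rows <- trials[trials$CDBN_ID %in% checks, ]
n_checks <- tapply(check_rows$CDBN_ID, check_rows$Location_code,
                   function(ids) length(unique(ids)))

# Keep only the locations where every check appears.
kept_locations <- names(n_checks)[n_checks == length(checks)]
kept_locations
```

In this toy example location BBBB is dropped because it never grew UI59.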
In all sections below, you can optionally view the R code by clicking on the “Code” button on the right. Just below are some sections for loading the data and preparing the dataframes.
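library(XLConnect)  # loadWorkbook(), readWorksheet(), getSheets()
library(tidyverse)  # as_tibble(), dplyr verbs, ggplot2
# load_all_experiments() is a project-specific helper that is not defined in this document.
load_all_experiments(laptop = TRUE)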
wbA <- loadWorkbook("FW_GibbsA_Full-312var-30env_for-R-and-Tassel_2018-01-05_v03.xlsx")
FW2b_GibbsA_lst = readWorksheet(wbA, sheet = getSheets(wbA))
FW2b_GibbsA_lst$FW_data <- as_tibble(FW2b_GibbsA_lst$Data)
FW2b_GibbsA_lst$FW_data_var <- as_tibble(FW2b_GibbsA_lst$Varieties)
FW2b_GibbsA_lst$FW_data_env <- as_tibble(FW2b_GibbsA_lst$Environment)
wbB <- loadWorkbook("../../CDBN Variety Info/CDBN_Metadata_PR_2017-10-27.xlsx")
PR_list = readWorksheet(wbB, sheet = getSheets(wbB))
PR_rows <- as_tibble(PR_list$`Shipping Manifest`)
FW_g <- load_Tassel_MLM(path = "FW_GWAS_MLM_Outputs/", phenotype = "FW_GibbsA_312var_30env_g")
FW_b <- load_Tassel_MLM(path = "FW_GWAS_MLM_Outputs/", phenotype = "FW_GibbsA_312var_30env_b")
FW_SDg <- load_Tassel_MLM(path = "FW_GWAS_MLM_Outputs/", phenotype = "FW_GibbsA_312var_30env_SD_g")
# Load the Tassel GWAS outputs for type II stability (deviation of each variety from the FW model slope, b) and the genetic effect (intercept for each variety, g).
Seed_data <- Germplasm %>%
dplyr::select(CDBN_ID, Seq_ID, Market_class_ahm:Race, Year, GBS_Panel, MDP_ID, In.DDP, Seed.From.1, Seed.From.2)
Seed_data <- Seed_data %>%
left_join(PR_rows)
FWPred <- FW2b_GibbsA_lst$FW_data %>%
left_join(Seed_data)
# Join the prediction data with information about possible seed sources - are varieties in the MDP, DDP, or ADP? Are they in the set I'm bulking in Puerto Rico? If not, I likely won't be able to test predictions for these varieties.
The following plot displays Finlay-Wilkinson results for three check varieties: Fleetwood, Viva, and Montcalm. The 30 CDBN locations are arranged along the x-axis in order of how well bean varieties yield, on average, at that location. Othello, WA (WAOT) is the highest-yielding location in the dataset, and Lubbock, TX (TXLU) is the lowest-yielding. Each location code starts with the two-letter state abbreviation, followed by the first two letters of the site name, so you can usually guess the site if you already know the CDBN locations. I’ve labeled all 30 locations on the x-axis of this plot, so the labels are unreadable in some sections; in the remaining plots I remove some location labels so that every label shown is readable.
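As a quick illustration of the naming convention, here is a small sketch that builds codes from a state abbreviation and a site name. The function name make_location_code is hypothetical; the site and state pairs are the ones named in this document.

```r
# Build a four-letter code: state abbreviation + first two letters of the site name.
make_location_code <- function(state, site) {
  toupper(paste0(state, substr(site, 1, 2)))
}

sites <- data.frame(
  state = c("WA", "TX", "MO", "ND", "MI", "NE"),
  site  = c("Othello", "Lubbock", "Columbia", "Hatton", "Saginaw", "Scottsbluff"),
  stringsAsFactors = FALSE
)
sites$Location_code <- make_location_code(sites$state, sites$site)
sites
```

This reproduces WAOT, TXLU, MOCO, NDHA, MISA, and NESB. I have not checked the rule against all 77 CDBN locations, so treat it as a guide rather than a guarantee.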
In these plots, the points indicate actual data: here, actual yield data from a year in the CDBN at that location. The lines indicate the predictions for variety performance. The dotted line is the predicted average variety performance across all sites. Vertical deviation from this dotted line indicates a genetic effect of that variety on performance (the values labeled “g” in the tables below). A change in the slope of a variety’s line relative to the dotted line indicates a difference in the type II stability of this variety, which is a measure of GxE (the values labeled “b” in the tables below).
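To make the roles of g and b concrete, here is a small sketch that computes prediction lines for three made-up varieties. All numbers are hypothetical and chosen only for illustration; the one value taken from this document is the intercept of 2500 used for the dotted line in the plotting code. The function predict_fw is my own name for the calculation, assuming a variety's line has slope 1 + b.

```r
# Hypothetical values; none of these numbers come from the fitted model.
mu <- 2500                           # intercept of the dotted average line (kg/ha)
h  <- c(-1000, -500, 0, 500, 1000)   # environment effects, poor to good sites

varieties <- data.frame(
  CDBN_ID = c("flat", "average", "steep"),
  g = c(100, 0, 100),     # vertical shift relative to the dotted line
  b = c(-0.5, 0, 0.5),    # change in slope relative to the dotted line
  stringsAsFactors = FALSE
)

# Predicted yield for one variety across all environments.
predict_fw <- function(g, b, h, mu) mu + g + (1 + b) * h

# One row per variety, one column per environment.
yhat <- t(mapply(predict_fw, varieties$g, varieties$b,
                 MoreArgs = list(h = h, mu = mu)))
rownames(yhat) <- varieties$CDBN_ID
colnames(yhat) <- paste0("h=", h)
yhat

# Spread of predictions across environments: smaller means a flatter line.
yield_range <- apply(yhat, 1, function(x) max(x) - min(x))
yield_range
```

The "flat" variety (b = -0.5) spans 1000 kg/ha across these environments, the "average" variety spans 2000, and the "steep" variety (b = 0.5) spans 3000, even though "flat" and "steep" share the same g.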
Here is a table of the ten varieties with the flattest slopes across 30 CDBN locations for which I anticipate having a good amount of seed to send out (I have at least one 28 foot row of each in my Puerto Rico growout). The slope of the FW model here is equivalent to 1 + b.
(Low_gxe <- Seed_data %>%
left_join(FW2b_GibbsA_lst$Varieties) %>%
filter(Type.of.Bulk %in% c("2x 28 foot row", "28 foot row")) %>%
arrange(b) %>%
dplyr::select(CDBN_ID, g, SD_g, b, SD_b, Type.of.Bulk, Market_class_ahm:GBS_Panel, In.DDP:Seed.From.2) %>%
rename("Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(CDBN_ID != "Sapphire") %>%
head(10))
As you can see from the following table, the yields predicted for each variety at the five sites you all are running are very similar (Predicted_Yield), even though the actual yields (Yield_kg_ha) can diverge from these predictions.
Five_Loc <- c("MOCO", "NDHA", "MISA", "NESB", "WAOT")
FWPred %>%
filter(Location_code %in% Five_Loc & CDBN_ID %in% Low_gxe$CDBN_ID) %>%
arrange(CDBN_ID, yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Location_code, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, Location_code, .keep_all = TRUE) #%>%
# filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row"))
Tentatively, here are the three varieties with flat slopes and little GxE that I would like to grow at five CDBN sites this year.
Here are plots of the other six varieties with flat slopes, except for Redkloud, a check variety that we probably already know enough about! Some of these varieties have few or no datapoints on the left half of the x-axis to inform the model, which makes them less reliable candidates in my mind. I’d welcome any feedback about line choice.
FW2b_data %>%
left_join(FW2b_GibbsA_env, by = "Location_code") %>%
filter(CDBN_ID %in% c("88728", "Fleetside", "Garnet")) %>%
ggplot(mapping = aes(x = h, y = yhat)) +
geom_line(aes(group = CDBN_ID, color = CDBN_ID)) +
geom_point(aes(y = y, color = CDBN_ID), shape = 21) +
theme(legend.position = c(0.25, .9), axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_abline(intercept = 2500, slope = 1, linetype = "dotted") +
scale_x_continuous(breaks = envlabel$h, labels = envlabel$Location_code) +
coord_cartesian(ylim=c(-200, 5200)) +
labs(x = "Environment", y = "Yield (kg/ha)")
FW2b_data %>%
left_join(FW2b_GibbsA_env, by = "Location_code") %>%
filter(CDBN_ID %in% c("Ivory", "JM126", "Mogul")) %>%
ggplot(mapping = aes(x = h, y = yhat)) +
geom_line(aes(group = CDBN_ID, color = CDBN_ID)) +
geom_point(aes(y = y, color = CDBN_ID), shape = 21) +
theme(legend.position = c(0.25, .9), axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_abline(intercept = 2500, slope = 1, linetype = "dotted") +
scale_x_continuous(breaks = envlabel$h, labels = envlabel$Location_code) +
coord_cartesian(ylim=c(-200, 5200)) +
labs(x = "Environment", y = "Yield (kg/ha)")
Here is a table of ten varieties with the steepest slopes that I would like to send out. The idea here is that it might be more valuable to improve beans for the actual areas of the country where they are grown, at the expense of most of the poor sites in the CDBN, which are not in common bean growing regions.
(High_gxe <- Seed_data %>%
left_join(FW2b_GibbsA_lst$Varieties) %>%
filter(Type.of.Bulk %in% c("2x 28 foot row", "28 foot row")) %>%
arrange(b) %>%
dplyr::select(CDBN_ID, g, SD_g, b, SD_b, Type.of.Bulk, Market_class_ahm:GBS_Panel, In.DDP:Seed.From.2) %>%
rename("Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
tail(10))
Many of these varieties are predicted to perform very poorly in Columbia, but quite well elsewhere in the US.
FWPred %>%
filter(Location_code %in% Five_Loc & CDBN_ID %in% High_gxe$CDBN_ID) %>%
arrange(CDBN_ID, yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Location_code, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, Location_code, .keep_all = TRUE) #%>%
# filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row"))
Here are the four varieties I’d like to grow at all five CDBN sites from this category. I could reduce this number if need be.
FW2b_data %>%
left_join(FW2b_GibbsA_env, by = "Location_code") %>%
filter(CDBN_ID %in% c("Buster", "AC_Ole", "UI465", "Matterhorn")) %>%
ggplot(mapping = aes(x = h, y = yhat)) +
geom_line(aes(group = CDBN_ID, color = CDBN_ID)) +
geom_point(aes(y = y, color = CDBN_ID), shape = 21) +
theme(legend.position = c(0.2, .9), axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_abline(intercept = 2500, slope = 1, linetype = "dotted") +
scale_x_continuous(breaks = envlabel$h, labels = envlabel$Location_code) +
coord_cartesian(ylim=c(-200, 5200)) +
labs(x = "Environment", y = "Yield (kg/ha)")
FW2b_data %>%
left_join(FW2b_GibbsA_env, by = "Location_code") %>%
filter(CDBN_ID %in% c("115M", "Avalanche", "US1140")) %>%
ggplot(mapping = aes(x = h, y = yhat)) +
geom_line(aes(group = CDBN_ID, color = CDBN_ID)) +
geom_point(aes(y = y, color = CDBN_ID), shape = 21) +
theme(legend.position = c(0.25, .9), axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_abline(intercept = 2500, slope = 1, linetype = "dotted") +
scale_x_continuous(breaks = envlabel$h, labels = envlabel$Location_code) +
coord_cartesian(ylim=c(-200, 5200)) +
labs(x = "Environment", y = "Yield (kg/ha)")
FW2b_data %>%
left_join(FW2b_GibbsA_env, by = "Location_code") %>%
filter(CDBN_ID %in% c("BillZ", "Max", "Mackinac")) %>%
ggplot(mapping = aes(x = h, y = yhat)) +
geom_line(aes(group = CDBN_ID, color = CDBN_ID)) +
geom_point(aes(y = y, color = CDBN_ID), shape = 21) +
theme(legend.position = c(0.25, .9), axis.text.x = element_text(angle = 45, hjust = 1)) +
geom_abline(intercept = 2500, slope = 1, linetype = "dotted") +
scale_x_continuous(breaks = envlabel$h, labels = envlabel$Location_code) +
coord_cartesian(ylim=c(-200, 5200)) +
labs(x = "Environment", y = "Yield (kg/ha)")
I ordered the sites from worst for beans to best for beans here, according to the FW analysis. That order is: MOCO, NDHA, MISA, NESB, WAOT.
For each site, three tables follow. The first shows the five varieties with the highest predicted yields for which I could have a good amount of seed from Puerto Rico. The second shows the ten highest-predicted varieties that are in the MDP, DDP, or ADP, which other breeders might have seed for. The third shows the 20 varieties predicted to perform the best at that site.
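The per-site tables all follow the same pattern, so they could be produced with one helper. Below is a minimal sketch on toy data; the function rank_site and its arguments are hypothetical and not part of the original analysis, and the toy yhat values are invented. It assumes a data frame with the CDBN_ID, Location_code, yhat, and Type.of.Bulk columns used in this document.

```r
library(dplyr)

# Hypothetical helper: rank one site's varieties by predicted yield.
# best = TRUE ranks highest first; pr_seed_only keeps only the varieties
# with a 28 foot row (or two) in the Puerto Rico growout.
rank_site <- function(pred, site, n, best = TRUE, pr_seed_only = FALSE) {
  out <- filter(pred, Location_code == site)
  out <- if (best) arrange(out, desc(yhat)) else arrange(out, yhat)
  out <- distinct(out, CDBN_ID, .keep_all = TRUE)  # one row per variety
  if (pr_seed_only) {
    out <- filter(out, Type.of.Bulk %in% c("2x 28 foot row", "28 foot row"))
  }
  head(out, n)
}

# Toy predictions (invented values).
toy_pred <- data.frame(
  CDBN_ID       = c("A", "A", "B", "C", "D", "A"),
  Location_code = c("MOCO", "MOCO", "MOCO", "MOCO", "MOCO", "WAOT"),
  yhat          = c(1500, 1500, 1800, 1200, 2000, 3900),
  Type.of.Bulk  = c("28 foot row", "28 foot row", NA,
                    "2x 28 foot row", "28 foot row", "28 foot row"),
  stringsAsFactors = FALSE
)

top2    <- rank_site(toy_pred, "MOCO", n = 2, pr_seed_only = TRUE)
bottom2 <- rank_site(toy_pred, "MOCO", n = 2, best = FALSE)
top2
bottom2
```

With this toy data, the top two varieties with Puerto Rico seed at MOCO are D and A (B ranks higher than A but has no growout row), and the two lowest predictions are C and A.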
My top choices for varieties at Columbia are: 55012, JM126, & Ivory.
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "MOCO") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "MOCO") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "MOCO") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
My top choices for varieties at Hatton are: Buster, BillZ, & Yolano.
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "NDHA") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "NDHA") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "NDHA") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
My top choice for Saginaw is UNS_117. MISA has grown almost everything…
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "MISA") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "MISA") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "MISA") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
My top choices for this category for Scottsbluff are: Buster and Montrose (However, Buster is already in the high GxE set).
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "NESB") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "NESB") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "NESB") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
My top choices for this category at Othello are: Max, Jackpot, and Lariat.
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "WAOT") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "WAOT") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "WAOT") %>%
arrange(desc(yhat)) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
I ordered the sites from worst for beans to best for beans here, as determined by the Finlay-Wilkinson analysis.
My top choices for this category for Columbia are: Max, 115M
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "MOCO") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "MOCO") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "MOCO") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
My top choices for low predicted yield for Hatton are: AC_Calmont, CDC_Expression
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "NDHA") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "NDHA") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "NDHA") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
My top choices for low predicted yield for Saginaw are: Cardinal, CDC_Expression
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "MISA") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "MISA") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "MISA") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
My top choices for low predicted yield at Scottsbluff are: Mogul & Cardinal.
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "NESB") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "NESB") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "NESB") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
My top choices for low predicted yield at Othello are: 88728, Sapphire, Emerson, & Pindak.
If Alice’s bulk in Puerto Rico goes well, there should be seed from 1-2 28 foot rows for these varieties.
FWPred %>%
filter(Location_code == "WAOT") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(PR_Growout_Size %in% c("2x 28 foot row", "28 foot row")) %>%
head(5)
FWPred %>%
filter(Location_code == "WAOT") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
filter(GBS_Panel %in% c("MDP","ADP") | In_DDP == "DDP") %>%
head(10)
FWPred %>%
filter(Location_code == "WAOT") %>%
arrange(yhat) %>%
dplyr::select(CDBN_ID, y, yhat, SD_yhat, Market_class_ahm, Race, Type.of.Bulk, GBS_Panel, In.DDP, Year, Seed.From.1, Seed.From.2) %>%
rename("Yield_kg_ha" = y, "Predicted_Yield" = yhat, "SD_of_Pred_Yield" = SD_yhat, "Market_class" = Market_class_ahm, "PR_Growout_Size" = Type.of.Bulk, "In_DDP" = In.DDP, "First_CDBN_Year" = Year, "Seed_Source_1" = Seed.From.1, "Seed_Source_2" = Seed.From.2) %>%
distinct(CDBN_ID, .keep_all = TRUE) %>%
head(20)
Here is a summary of the varieties I’m tentatively thinking of growing to test predictions from the Finlay-Wilkinson model. I decided to focus mostly on Durango varieties, as these seem to have the most variation in GxE in the CDBN dataset. In the list below, a question mark after a variety name means that I might not have seed for that variety, but it is in another diversity panel, so someone else might have seed to grow for it.
So, focusing mostly on Durango varieties, here are the varieties I think should be grown at:
I’d be happy to grow this set of varieties, but please note that this is not my final word on varieties! I am hoping to finish sets of models of variety performance along climate gradients by the end of March. I think this approach will help address some of the non-linearity in the Finlay-Wilkinson model and improve the predictive ability of this model, which right now hovers at ~66%. I’m particularly hoping it will correct how some varieties outperform linear predictions in the NDHA - MISA region of the plots.