gmaR tutorial

Introduction

Generation Mean Analysis (GMA) is a statistical approach for analyzing quantitative traits in plant breeding, genetics and agricultural research. Plant breeders use it to study the inheritance of traits, identify favorable combinations of genes and support the development of superior cultivars. The objective of GMA is to estimate the pooled additive, dominance and epistatic effects of the genes involved in the expression of a trait and to predict the best combination of genes for enhancing trait expression. GMA uses data from a cross between two parents with contrasting phenotypes and compares the performance of the parents with that of the offspring generations derived from them.

gmaR is an R package that provides tools for the analysis of GMA data. Its functionality includes the A, B, C and D scaling tests, the joint scaling test, the six-parameter model, heritability and heterosis.

Installation

Install the development version from GitHub with:

# devtools::install_github("Oyindamolajames/gmaR")

# To build the HTML vignette use
# devtools::install_github("Oyindamolajames/gmaR", build_vignettes = TRUE)

Note: Windows users should first download and install the latest version of Rtools.

Load Package and Data

To illustrate the use of gmaR, we use a simulated data set of plant height for two pure-breeding parental tobacco lines (P1 and P2), their F1 and F2 progeny, and the first backcross families B1 and B2 (labelled BC1 and BC2 in the data) derived from the cross between P1 and P2. The example follows Mather and Jinks's seminal textbook Introduction to Biometrical Genetics.

We first load the gmaR package and import the data:

library(gmaR)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
# tobacco_height <- read.csv("data/tobacco_height.csv")

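Your data should have the format shown below. The Gen column holds the generation label of each plant (p1, p2, f1, f2, bc1 or bc2; the example data uses upper case, e.g. P1 and BC2), and the next column holds the trait value measured on that plant. The examples in this tutorial assume the data frame is stored in an object named data.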

head(data)
#>   Gen       PH
#> 1  P1 115.3908
#> 2  P1 116.2923
#> 3  P1 115.1580
#> 4  P1 117.8634
#> 5  P1 116.4547
#> 6  P1 115.1749
tail(data)
#>     Gen       PH
#> 495 BC2 111.7725
#> 496 BC2 108.4152
#> 497 BC2 109.3819
#> 498 BC2 108.4165
#> 499 BC2 109.4093
#> 500 BC2 108.5760
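
If no CSV file is at hand, a data frame in this long format can be assembled directly in R. The following is a minimal sketch using base R only; the generation means, the standard deviation, the sample sizes and the object name `toy` are made-up values for illustration and are not part of gmaR or of the tobacco data.

```r
# Simulate a toy data set in the two-column layout shown above.
# All means, the standard deviation and the sample sizes are hypothetical.
set.seed(1)

gen_means <- c(P1 = 116, P2 = 98, F1 = 118, F2 = 112, BC1 = 116, BC2 = 109)
gen_sizes <- c(P1 = 20,  P2 = 20, F1 = 60,  F2 = 160, BC1 = 120, BC2 = 120)

toy <- data.frame(
  # One generation label per plant
  Gen = rep(names(gen_means), times = gen_sizes),
  # One simulated trait value per plant, centred on its generation mean
  PH  = rnorm(sum(gen_sizes), mean = rep(gen_means, times = gen_sizes), sd = 1)
)

str(toy)
table(toy$Gen)
```

The resulting columns `toy$Gen` and `toy$PH` play the same role as `data$Gen` and `data$PH` in the calls below.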

Usage

Get summary of data

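get_summary() reports, for each generation, the number of plants (n), the mean and variance of the trait, the variance of the mean (mean_var, the variance divided by n) and the weight (weight_var, the reciprocal of mean_var).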

gmaR::get_summary(data$Gen, data$PH)
#> # A tibble: 6 × 6
#>       n Gen    mean variance mean_var weight_var
#>   <int> <ord> <dbl>    <dbl>    <dbl>      <dbl>
#> 1   120 BC1   116      0.489  0.00408      245. 
#> 2   120 BC2   109.     0.613  0.00511      196. 
#> 3    60 F1    118.     0.97   0.0162        61.9
#> 4   160 F2    112.     0.492  0.00308      325. 
#> 5    20 P1    116.     1.03   0.0516        19.4
#> 6    20 P2     98.4    1.45   0.0726        13.8
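
The columns of this table can be reproduced by hand, which helps when checking a result. The sketch below uses base R on a tiny made-up data set (the vectors `ph` and `gen` are hypothetical); it mirrors the relationships visible in the output above, where mean_var equals variance / n and weight_var equals 1 / mean_var, but it is not the package's own code.

```r
# Hypothetical toy data: three P1 plants and four P2 plants.
ph  <- c(10, 12, 14, 20, 22, 24, 26)
gen <- c("P1", "P1", "P1", "P2", "P2", "P2", "P2")

n        <- tapply(ph, gen, length)  # plants per generation
gen_mean <- tapply(ph, gen, mean)    # generation mean
variance <- tapply(ph, gen, var)     # within-generation variance
mean_var <- variance / n             # variance of the generation mean
weight   <- 1 / mean_var             # weight of the generation mean

by_hand <- data.frame(
  Gen        = names(n),
  n          = as.vector(n),
  mean       = as.vector(gen_mean),
  variance   = as.vector(variance),
  mean_var   = as.vector(mean_var),
  weight_var = as.vector(weight)
)
print(by_hand)
```

Generations with many plants and little spread get large weights, so they influence weighted analyses such as the joint scaling test the most.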

Scale Test

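scale_test() computes the A, B, C and D scaling tests. For each test it returns the estimate, its standard error, the t value, a significance flag and a comment on whether the additive-dominance model is supported.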

gmaR::scale_test(data$Gen, data$PH)
#> # A tibble: 4 × 6
#>   `Scaling Test` Parameter   S.E T_Value Sig   comment                          
#>   <chr>              <dbl> <dbl>   <dbl> <chr> <chr>                            
#> 1 A                  -1.97 0.290   -6.79 *     Dosen't support the additive dom…
#> 2 B                   2.20 0.330    6.66 *     Dosen't support the additive dom…
#> 3 C                  -2.97 0.488   -6.09 *     Dosen't support the additive dom…
#> 4 D                  -1.60 0.147  -10.9  *     Dosen't support the additive dom…
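
The four scaling tests are simple linear combinations of the generation means, each expected to be zero when the additive-dominance model holds. The following sketch applies Mather's standard formulas to made-up values; the function name `scaling_sketch` and the inputs `gm` and `gv` are hypothetical, and it is an assumption that scale_test() follows exactly this formulation.

```r
# Mather's A, B, C and D scaling tests from generation means.
# gm: named vector of generation means.
# gv: named vector of the variances of those means (variance / n).
scaling_sketch <- function(gm, gv) {
  estimate <- c(
    A = 2 * gm[["BC1"]] - gm[["P1"]] - gm[["F1"]],
    B = 2 * gm[["BC2"]] - gm[["P2"]] - gm[["F1"]],
    C = 4 * gm[["F2"]] - 2 * gm[["F1"]] - gm[["P1"]] - gm[["P2"]],
    D = 2 * gm[["F2"]] - gm[["BC1"]] - gm[["BC2"]]
  )
  # Variance of a linear combination: squared coefficients times variances
  variance <- c(
    A = 4 * gv[["BC1"]] + gv[["P1"]] + gv[["F1"]],
    B = 4 * gv[["BC2"]] + gv[["P2"]] + gv[["F1"]],
    C = 16 * gv[["F2"]] + 4 * gv[["F1"]] + gv[["P1"]] + gv[["P2"]],
    D = 4 * gv[["F2"]] + gv[["BC1"]] + gv[["BC2"]]
  )
  se <- sqrt(variance)
  data.frame(test = names(estimate), estimate = estimate, se = se,
             t_value = estimate / se, row.names = NULL)
}

# Hypothetical means: additive-dominance values except BC1, shifted by +1
gm <- c(P1 = 110, P2 = 90, F1 = 105, F2 = 102.5, BC1 = 108.5, BC2 = 97.5)
gv <- setNames(rep(0.01, 6), names(gm))

res <- scaling_sketch(gm, gv)
print(res)  # estimates: A = 2, B = 0, C = 0, D = -1
```

In this toy case only the tests that involve BC1 (A and D) depart from zero, which shows how the pattern of significant tests points to the generations that deviate from the simple model.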

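Joint Scaling Test

joint_scale_test() runs the joint scaling test: it fits the three-parameter model (m, d and h) to all six generation means by weighted least squares and tests the goodness of fit with a chi-square statistic. The result is a list with the fitted model table ($model), the chi-square value ($chi_square) and the parameter estimates with their standard errors ($genpara).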

gmaR::joint_scale_test(data$Gen, data$PH)
#> $model
#> # A tibble: 6 × 11
#>   Generation  Mean Variance Mean_of_variance Weight m     d     h     Observed
#>   <chr>      <dbl>    <dbl>            <dbl>  <dbl> <chr> <chr> <chr>    <dbl>
#> 1 p1         116      0.489          0.00408  245.  1     1     0        116  
#> 2 p2         109.     0.613          0.00511  196.  1     -1    0        109. 
#> 3 f1         118.     0.97           0.0162    61.9 1     0     1        118. 
#> 4 f2         112.     0.492          0.00308  325.  1     0     1/2      112. 
#> 5 bc1        116.     1.03           0.0516    19.4 1     1/2   1/2      116. 
#> 6 bc2         98.4    1.45           0.0726    13.8 1     -1/2  1/2       98.4
#> # ℹ 2 more variables: Expected <dbl>, xsqur <dbl>
#> 
#> $chi_square
#> [1] 3732.962
#> 
#> $genpara
#>   Genetic.Parameters      Value Standard.Error sig
#> m                  m 111.991239     0.04602198   *
#> d                  d   3.716961     0.04744982   *
#> h                  h   1.718010     0.10936878   *
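
The coefficients in the m, d and h columns above form the design matrix of a weighted least squares fit. The sketch below shows that calculation in base R on made-up generation means and weights (`obs` and `w` are hypothetical, built from m = 100, d = 10 and h = 5 so that the model fits exactly); it illustrates the general method and is not the package's own implementation.

```r
# Design matrix of the additive-dominance model, one row per generation
gen <- c("P1", "P2", "F1", "F2", "BC1", "BC2")
X <- cbind(
  m = 1,
  d = c(1, -1, 0, 0,   0.5, -0.5),
  h = c(0,  0, 1, 0.5, 0.5,  0.5)
)
rownames(X) <- gen

# Hypothetical observed means (generated from m = 100, d = 10, h = 5)
obs <- c(110, 90, 105, 102.5, 107.5, 97.5)
# Hypothetical weights, i.e. 1 / variance of each generation mean
w <- c(20, 20, 60, 160, 120, 120)

XtW  <- t(X * w)          # X'W: row i of X is multiplied by w[i]
info <- XtW %*% X         # information matrix X'WX
est  <- drop(solve(info, XtW %*% obs))  # weighted least squares estimates
se   <- sqrt(diag(solve(info)))         # standard errors of m, d and h

expected <- drop(X %*% est)
chi_sq   <- sum(w * (obs - expected)^2)   # goodness of fit
df       <- length(obs) - ncol(X)         # 6 means - 3 parameters = 3
p_value  <- pchisq(chi_sq, df = df, lower.tail = FALSE)

print(round(cbind(estimate = est, se = se), 4))
print(c(chi_sq = chi_sq, df = df, p_value = p_value))
```

Because the toy means were generated from the model itself, the estimates come back as m = 100, d = 10 and h = 5 with a chi-square of essentially zero. A large chi-square on three degrees of freedom indicates instead that the additive-dominance model does not describe the generation means adequately.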

Perfect Fit

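perfectf_fit() estimates the six parameters m, d, h, i, j and l of the perfect-fit (six-parameter) model and reports the expected generation means and a chi-square value.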

gmaR::perfectf_fit(data$Gen, data$PH)
#> $`Genetic Parameters`
#>   GEN  VALUE    Stand_err    Val Sig
#> 1   m 153.76 54.580329239 153.76   *
#> 2   d   3.42  0.003267288   3.42   *
#> 3   h -61.15  0.535476903 -61.15   *
#> 4   i -17.62  0.178620324 -17.62   *
#> 5   j  27.19  0.179001079  27.19   *
#> 6   l  48.62  0.358127289  48.62   *
#> 
#> $model
#>   Gen   Mean expected     chisqr
#> 1  p1 116.00 139.5600 136214.380
#> 2  p2 109.16 132.7200 108660.411
#> 3  f1 117.67 141.2300  34334.449
#> 4  f2 111.78 135.3400 180511.740
#> 5 bc1 116.30 139.4425  10369.319
#> 6 bc2  98.45 122.4275   7919.015
#> 
#> $chi_square
#> [1] "The Chiquare value is 478009.314631461"
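
For reference, the textbook perfect-fit solution expresses each of the six parameters as a linear combination of the six generation means. The sketch below implements those standard formulas on made-up means; the function name `six_param_sketch` and the input values are hypothetical, and it is an assumption that gmaR uses the same parameterization, so its results need not match the output above.

```r
# Perfect-fit estimates of m, d, h, i, j and l from six generation means.
# g: named vector with elements P1, P2, F1, F2, BC1 and BC2.
six_param_sketch <- function(g) {
  P1 <- g[["P1"]]; P2 <- g[["P2"]]; F1 <- g[["F1"]]
  F2 <- g[["F2"]]; B1 <- g[["BC1"]]; B2 <- g[["BC2"]]
  c(
    m = 0.5 * P1 + 0.5 * P2 + 4 * F2 - 2 * B1 - 2 * B2,
    d = 0.5 * P1 - 0.5 * P2,
    h = 6 * B1 + 6 * B2 - 8 * F2 - F1 - 1.5 * P1 - 1.5 * P2,
    i = 2 * B1 + 2 * B2 - 4 * F2,
    j = 2 * B1 - P1 - 2 * B2 + P2,
    l = P1 + P2 + 2 * F1 + 4 * F2 - 4 * B1 - 4 * B2
  )
}

# Hypothetical means generated from m = 100, d = 10, h = 4, i = 2, j = -4, l = 8
g <- c(P1 = 112, P2 = 92, F1 = 112, F2 = 104, BC1 = 108.5, BC2 = 100.5)

params <- six_param_sketch(g)
print(params)  # recovers m = 100, d = 10, h = 4, i = 2, j = -4, l = 8
```

With six parameters estimated from six means there are no degrees of freedom left, so under this parameterization the expected means equal the observed means.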

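Genetic Parameters

genetic_parameters() computes variance components, heritability, genetic advance, heterosis and related parameters. In this example the estimated genetic variance (VG) is negative, so its square root is undefined; this triggers the warning below and the NaN reported for GCV.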

gmaR::genetic_parameters(data$Gen, data$PH)
#> Warning in sqrt(VG): NaNs produced
#>                 Components         Value
#> 1                       VP  4.920000e-01
#> 2                       VE  7.605000e-01
#> 3                       VG -2.685000e-01
#> 4                       VA -3.002000e+00
#> 5                       VD  4.930000e+00
#> 6                       HB -5.457000e+01
#> 7                       HN -3.050800e+02
#> 8                       GA -7.885000e-01
#> 9                      GAM -7.008000e-01
#> 10               heterosis  5.090000e+00
#> 11                     PCV  6.233888e-01
#> 12                     GCV           NaN
#> 13                     MPH  5.090000e+00
#> 14                     BPH  1.670000e+00
#> 15                      ID  5.890000e+00
#> 16 Wright Effective factor -1.223473e+01
#> 17 Bruton Effective factor -2.284968e+06
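
Several of these quantities follow from simple textbook definitions. The sketch below computes a few of them from made-up generation means and variances (`gm` and `s2` are hypothetical); the formulas are one common formulation, and it is an assumption that genetic_parameters() uses exactly these definitions.

```r
# Hypothetical generation means and within-generation variances
gm <- c(P1 = 110, P2 = 90, F1 = 105, F2 = 102.5)
s2 <- c(P1 = 1,   P2 = 1,  F1 = 1,   F2 = 4)

# Variance components
VP <- s2[["F2"]]                                       # phenotypic variance (F2)
VE <- (s2[["P1"]] + s2[["P2"]] + 2 * s2[["F1"]]) / 4   # environmental variance
VG <- VP - VE                                          # genetic variance
HB <- 100 * VG / VP                                    # broad-sense heritability (%)

# Heterosis and inbreeding depression, in trait units
mid_parent <- (gm[["P1"]] + gm[["P2"]]) / 2
MPH <- gm[["F1"]] - mid_parent                 # mid-parent heterosis
BPH <- gm[["F1"]] - max(gm[["P1"]], gm[["P2"]]) # better-parent heterosis
ID  <- gm[["F1"]] - gm[["F2"]]                 # inbreeding depression

out <- c(VP = VP, VE = VE, VG = VG, HB = HB, MPH = MPH, BPH = BPH, ID = ID)
print(out)  # VP = 4, VE = 1, VG = 3, HB = 75, MPH = 5, BPH = -5, ID = 2.5
```

When the non-segregating generations (P1, P2, F1) vary more than the F2, VE exceeds VP and VG becomes negative, which is the situation seen in the tutorial output.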

Conclusion

In summary, gmaR is a useful tool for the analysis of GMA data in plant breeding. It provides functions for the estimation of genetic parameters and prediction of the best genotypic combinations for enhancing trait expression. Each analysis shown here is a single function call that takes the generation labels and the trait values as its two arguments.