Maize Genetics Cooperation Newsletter vol 81 2007

 

Pergamino, Argentina

EEA INTA Pergamino

Cordoba, Argentina

FCA-UNC

Buenos Aires, Argentina

FAUBA

 

Prediction of maize (Zea mays L.) combining ability using molecular markers and mixed linear models theory

--Ornella, L*; Eyherabide, G; di Rienzo, J; Cantet, J; Balzarini, M

*Present address:  Area Comunicaciones (FCIA-UNR), Rosario, Argentina

 

       Predicting the performance of untested single crosses is important in hybrid breeding programs.  The cost of field testing makes it impossible to evaluate all new inbreds and their possible combinations.  The traditional fixed linear model, coupled with the ordinary least squares estimation used by most plant breeders, is too restrictive because of its independence assumption.  Error structure is often more complex than the one used in standard linear models (Balzarini, In Quantitative Genetics, Genomics and Plant Breeding, 2002).  In contrast, the general linear mixed model (Henderson, Applications of Linear Models in Animal Breeding, Univ. Guelph, 1984) can easily accommodate covariances among observations.  The inclusion of a numerator relationship matrix generates unbiased heritability estimates when likelihood-based methodologies are used (ML, REML and Bayes), mainly because it accounts for the correlation between observations due to covariance between relatives and for the variation due to genetic drift, which is important in finite populations under selection (Sorensen and Kennedy, Theor. Appl. Genet., 1983).  The objective of this study was to analyze the effectiveness of best linear unbiased prediction (BLUP) based on molecular (microsatellite) marker data.  Field data were obtained from Nestares et al. (Pesq. Agropec. Bras. 34:1399-1406, 1999): topcrosses between a collection of 48 inbred lines and four tester populations (sB73 and sMo17 from the Reid x Lancaster pattern, and HP3 and P5L2 from the local orange flint pattern) were evaluated for grain yield during the 1991/92 season in four environments.  All lines but two (B73 and Mo17) were orange flint germplasm developed by INTA from twenty different sources (synthetics, composites, landraces, planned crosses and a commercial hybrid).
Molecular data were obtained for twenty-six of the 48 parent lines and for the four tester populations using 21 microsatellite markers evenly distributed in the genome (Morales Yokobori et al., MNL 79:36, 2005).  We had some problems in the molecular characterization of the testers HP3 and P5L2, but used the data nonetheless, considering the robustness of BLUP predictors (Bernardo, Crop Sci. 36:862-866, 1996).

       Relatedness (r) between parents was estimated using MER (Moment Estimate of Relatedness) software (Wang, Genetics 160:1203-1215, 2002).  r = 2Θ, where Θ, the coefficient of coancestry, is the probability that, for any autosomal locus, a random gene taken from individual x is identical by descent with a random gene taken from individual y.  Three different variance-covariance structures were compared, using molecular and/or pedigree data, under the following linear mixed model (Henderson, 1984):

       y = Xb + Zl al + Zt at + Zd d + Zge (ge) + e


       Where: y is the response vector (yield data of hybrids derived from the crosses between lines and testers), and X, Zl, Zt, Zd and Zge are known design matrices.  b is the vector of fixed parameters and al, at, d are vectors of random effects associated with additive effects of lines, additive effects of testers and dominance effects, respectively.  e is the vector of residuals.  (ge) is a random effects vector associated with genotype-environment interaction.  For the sake of simplicity, we assumed that  Cge, the covariance matrix for (ge), is an  identity matrix (no correlation between interactions).  Residuals were also considered independent.
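
       The variance-covariance structure implied by this model can be sketched as follows, assuming (as is usual for models of this kind, although the text does not state it explicitly) that the random effects have zero means and are mutually uncorrelated.  The σ² terms denote the variance component of each random effect, Al, At and D stand for the relationship matrices of the line, tester and dominance effects, and Cge and the residual covariance are identity matrices as assumed above:

```latex
\operatorname{Var}(y) =
    \sigma^2_l \, Z_l A_l Z_l'
  + \sigma^2_t \, Z_t A_t Z_t'
  + \sigma^2_d \, Z_d D Z_d'
  + \sigma^2_{ge} \, Z_{ge} Z_{ge}'
  + \sigma^2_e \, I
```

       Under this reading, the three models compared in the study differ only in what is plugged in for Al, At and D.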

       Assumptions regarding relatedness between parents allow the definition of the covariance matrices Al, At and D:

Model 1) Variance components, Parents unrelated.  Al, At and D are identity matrices.

Model 2) Lines and testers are derived from two different ancestral populations, so Al={rij(l)}, At={rij(t)} and D={dij=0.25 rij(l) rij(t)}, where, for hybrids i and j, rij(l) is the relatedness between their parent lines and rij(t) is the relatedness between their testers.

Model 3) Lines and testers are derived from the same ancestral population, so al and at can be combined into one vector a of additive effects of parents, with A={axy}, where axy is the relatedness between parents x and y (lines and/or testers), and D={dij=0.25(rij(ll)rij(tt)+rij(tl)rij(lt))}.  Here rij(xx) is the relatedness between parents of hybrids i and j, with l standing for lines and t for testers.
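
       As an illustration of the Model 2 definition, the dominance matrix D can be assembled from the two relatedness matrices as in the minimal sketch below.  The function name, the index-pair encoding of hybrids and all numeric values are hypothetical and chosen only for illustration; none of them come from the study.  The diagonal values of 2 assume fully inbred parents (coancestry of 1 with themselves, so r = 2Θ = 2).

```python
import numpy as np

def dominance_matrix_model2(r_lines, r_testers, hybrids):
    """Return D with d_ij = 0.25 * r_ij(l) * r_ij(t) for every pair of hybrids.

    r_lines, r_testers: square relatedness matrices among lines / testers.
    hybrids: list of (line index, tester index) pairs, one per hybrid.
    """
    lines = np.array([line for line, _ in hybrids])
    testers = np.array([tester for _, tester in hybrids])
    # Pick, for each pair of hybrids, the relatedness of their lines and testers.
    r_l = r_lines[np.ix_(lines, lines)]
    r_t = r_testers[np.ix_(testers, testers)]
    return 0.25 * r_l * r_t

# Toy relatedness values (made up; diagonal of 2 assumes fully inbred parents).
r_lines = np.array([[2.0, 0.5],
                    [0.5, 2.0]])
r_testers = np.array([[2.0, 0.2],
                      [0.2, 2.0]])
hybrids = [(0, 0), (0, 1), (1, 0), (1, 1)]  # two lines crossed to two testers

D = dominance_matrix_model2(r_lines, r_testers, hybrids)
print(D)
```

       Hybrids that share neither a line nor a tester still receive a non-zero entry whenever both pairs of parents are related, which is what allows information to flow from tested to untested crosses.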

       All models were evaluated by restricted likelihood (resLL) and Akaike's information criterion (AIC) (Table 1).

Table 1.  Variance analysis of proposed models.

Model    Data            Additive variance       Dominance variance  GE variance  Error        -2resLL  AIC
Model 1  -               σ²l=3.23   σ²t=7.00     σ²d=15.33           σ²ge=3.87    σ²e=151.36   6348.8   6358.8
Model 2  Pedigree        σ²l=3.38   σ²t=14.00    σ²d=15.47           σ²ge=3.88    σ²e=151.36   6348.7   6358.7
Model 2  Microsatellite  σ²l=3.23   σ²t=11.11    σ²d=15.60           σ²ge=3.78    σ²e=151.36   6348.6   6358.6
Model 3  Pedigree        σ²a=13.88               σ²d=14.43           σ²ge=3.83    σ²e=151.36   6349.7   6357.7
Model 3  Microsatellite  σ²a=12.34               σ²d=14.59           σ²ge=3.82    σ²e=151.37   6349.3   6357.3

* Variance components were estimated via restricted maximum likelihood (REML) using SAS (SAS Institute) PROC MIXED.
** σ²l, additive variance due to parent lines; σ²t, additive variance due to parent testers; σ²a, additive variance of testers and lines (both groups belong to the same population).

       Cross-validation statistics were calculated to assess and compare the predictive ability of some of the proposed models.  For each genetic model, the performance of a set of m missing crosses was predicted based on the formula (Balzarini, 2002):

       yM = C V^-1 yP

Where yM is the m x 1 vector of predicted yields of missing crosses, yP is the p x 1 vector of average yields of predictor hybrids, C is the m x p matrix of genetic covariances between missing and predictor hybrids, and V is the p x p phenotypic variance-covariance matrix among the predictor hybrids.  We performed predictions for the 25 x 4 hybrids (m=4, p=100) and did not consider the 4 hybrids based on a missing line.  Effectiveness of predictions was measured by Spearman rank correlation (Table 2).
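
       The prediction step, the product of C, the inverse of V and yP, can be sketched numerically as follows.  This is a minimal illustration rather than the code used in the study: the function name and the toy matrices are hypothetical, and the toy yields are assumed to be expressed as deviations from a common mean.  The linear system is solved directly instead of forming the inverse of V, which gives the same result and is usually more stable numerically.

```python
import numpy as np

def predict_missing(C, V, y_p):
    """Predict yields of missing crosses as C V^-1 yP.

    C:   (m x p) genetic covariances between missing and predictor hybrids.
    V:   (p x p) phenotypic variance-covariance matrix of predictor hybrids.
    y_p: (p,) average yields of the predictor hybrids.
    """
    # Solve V x = y_p, then project through C; avoids an explicit inverse.
    return C @ np.linalg.solve(V, y_p)

# Toy example (values made up): two predictor hybrids, one missing cross.
V = np.array([[2.0, 0.5],
              [0.5, 2.0]])
C = np.array([[0.6, 0.3]])
y_p = np.array([1.0, -1.0])

y_m = predict_missing(C, V, y_p)
print(y_m)
```

       The missing cross is pulled towards the predictor hybrids with which it shares the largest genetic covariance, weighted by how much the predictors co-vary among themselves.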

 

Table 2.  Spearman rank correlation between observed (BLUP) and predicted hybrid yields (Model 2).

Population                            Pedigree data   Microsatellite data
26 lines                              0.40**          0.36**
Lines derived from synthetics         0.45*           0.44
Lines derived from composites         0.52**          0.49**
Lines unrelated or highly divergent   0.08            0.02

* Indicates significance at P = 0.05.
** Indicates significance at P = 0.01.
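
       For readers who want to reproduce this kind of comparison, the Spearman rank correlation is the Pearson correlation computed on ranks.  The sketch below is a plain-Python illustration with tied values given their average rank; the function names and the yield values are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Toy observed and predicted yields (made up for illustration).
observed = [5.1, 6.3, 4.8, 7.0, 5.9]
predicted = [5.0, 6.0, 5.2, 6.8, 6.1]
rho = spearman(observed, predicted)
print(rho)
```

       Because only ranks enter the calculation, the statistic measures whether the predictions order the hybrids correctly, which is what matters when choosing which crosses to test in the field.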

 

       Conclusions.  Inclusion of a numerator relationship matrix (built from pedigree or molecular data) generates more precise variance estimates and higher values of heritability than traditional fixed effects models.  For the type of crosses studied here (genetically divergent parental populations), molecular data did not provide information beyond that provided by pedigree data.

 

 

Please Note: Notes submitted to the Maize Genetics Cooperation Newsletter may be cited only with consent of authors.