Maize Genetics Cooperation Newsletter vol 81 2007
EEA INTA Pergamino
FCA-UNC
FAUBA
Predicting the performance of untested single crosses is important in hybrid breeding programs. The cost of field testing makes it impossible to evaluate all new inbreds and their possible combinations. The traditional fixed linear model, coupled with the ordinary least squares estimation used by most plant breeders, is too restrictive because it assumes independent observations. The error structure is often more complex than the one used in standard linear models (Balzarini, In Quantitative Genetics, Genomics and Plant Breeding, 2002).
In contrast, the general linear mixed model (Henderson, Applications of Linear Models in Animal Breeding, Univ. Guelph, 1984) can easily accommodate covariances among observations.
The inclusion of a numerator relationship matrix generates unbiased heritability estimates when maximum likelihood methodologies are used (ML, REML and Bayes), mainly because it accounts for the correlation between observations due to covariance between relatives and for the variation due to genetic drift, which is important in finite populations under selection (Sorensen and Kennedy, Theor. Appl. Genet., 1983). The objective of this study was to analyze the effectiveness of best linear unbiased prediction (BLUP) based on molecular (microsatellite) marker data.
Field data were obtained from Nestares et al. (Pesq. Agropec. Bras. 34:1399-1406, 1999): topcrosses between a collection of 48 inbred lines and four tester populations (sB73 and sMo17 from the Reid x Lancaster pattern, and HP3 and P5L2 from the local orange flint pattern) were evaluated for grain yield during the 1991/92 season in four environments. All lines but two (B73 and Mo17) were orange flint germplasm developed by INTA from twenty different sources (synthetics, composites, landraces, planned crosses and a commercial hybrid). Molecular data were obtained for 26 of the 48 parent lines and for the four tester populations using 21 microsatellite markers evenly distributed in the genome (Morales Yokobori et al., MNL 79:36, 2005). We had some problems in the molecular characterization of the testers HP3 and P5L2, but used the data, considering the robustness of BLUP predictors (Bernardo, Crop Sci. 36:862-866, 1996).
Relatedness (r) between parents was estimated using the MER (Moment Estimate of Relatedness) software (Wang, Genetics 160:1203-1215, 2002):
r = 2Q,
where Q, the coefficient of coancestry, is the probability that, for any autosomal locus, a random gene taken from individual x is identical by descent with a random gene taken from individual y.
Three different variance-covariance structures were compared, using molecular and/or pedigree data, under the following linear mixed model (Henderson, 1984):
$$y = Xb + Z_l a_l + Z_t a_t + Z_d d + Z_{ge}(ge) + e$$
where y is the response vector (yield data of hybrids derived from the crosses between lines and testers), and X, Zl, Zt, Zd and Zge are known design matrices. b is the vector of fixed parameters, and al, at and d are vectors of random effects associated with additive effects of lines, additive effects of testers and dominance effects, respectively. (ge) is a vector of random effects associated with genotype-environment interaction, and e is the vector of residuals.
For the sake of simplicity, we assumed that Cge, the covariance matrix for (ge), is an identity matrix (no correlation between
interactions). Residuals
were also considered independent.
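As a reading aid, the covariance structure implied by this model can be written out explicitly. The expression below is a sketch that assumes, in addition to what the note states, that the random effect vectors are mutually uncorrelated; the symbols \sigma^2 correspond to the variance components reported as s2l, s2t, s2d, s2ge and s2e in Table 1, and A_l, A_t and D are the relationship matrices defined under each model.

```latex
\operatorname{Var}(y) = V
  = Z_l A_l Z_l' \,\sigma^2_l
  + Z_t A_t Z_t' \,\sigma^2_t
  + Z_d D Z_d' \,\sigma^2_d
  + Z_{ge} C_{ge} Z_{ge}' \,\sigma^2_{ge}
  + I \,\sigma^2_e ,
\qquad C_{ge} = I .
```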
Assumptions regarding relatedness between parents allow the definition of the covariance matrices Al, At and D:
Model 1) Variance components, parents unrelated. Al, At and D are identity matrices.
Model 2) Lines and testers are derived from two different ancestral populations, so: Al={rij(l)}, At={rij(t)} and D={dij=0.25 rij(l) rij(t)} (given hybrids i and j, rij(l) is the relatedness between their parent lines and rij(t) is the relatedness between their testers).
Model 3) Lines and testers are derived from the same ancestral population. al and at can be combined into one vector a of additive effects of parents, with A={axy}, where axy is the relatedness between parents x and y (lines and/or testers), and D={dij=0.25(rij(ll)rij(tt)+rij(tl)rij(lt))}, where rij(xx) is the relatedness between the parents of hybrids i and j, with l standing for lines and t standing for testers.
All models were evaluated by restricted likelihood (resLL) and Akaike's information criterion (AIC) (Table 1).
Table 1. Variance analysis of proposed models.
| Model | Additive variance | Dominance variance | GE variance | Error | -2resLL | AIC |
|---|---|---|---|---|---|---|
| Model 1 | s2l=3.23, s2t=7.00 | s2d=15.33 | s2ge=3.87 | s2e=151.36 | 6348.8 | 6358.8 |
| Model 2, pedigree | s2l=3.38, s2t=14.00 | s2d=15.47 | s2ge=3.88 | s2e=151.36 | 6348.7 | 6358.7 |
| Model 2, microsatellite | s2l=3.23, s2t=11.11 | s2d=15.60 | s2ge=3.78 | s2e=151.36 | 6348.6 | 6358.6 |
| Model 3, pedigree | s2a=13.88 | s2d=14.43 | s2ge=3.83 | s2e=151.36 | 6349.7 | 6357.7 |
| Model 3, microsatellite | s2a=12.34 | s2d=14.59 | s2ge=3.82 | s2e=151.37 | 6349.3 | 6357.3 |
* Variance components were estimated via restricted maximum likelihood (REML) using SAS (SAS Institute) PROC MIXED.
** s2l: additive variance due to parent lines; s2t: additive variance due to parent testers; s2a: additive variance of testers and lines (both groups belong to the same population).
Cross-validation statistics were calculated to assess and compare the predictive ability of some of the proposed models.
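The AIC column of Table 1 can be checked against the -2resLL column. The sketch below assumes the smaller-is-better form AIC = -2resLL + 2k, with k taken as the number of variance components in each model (five for Models 1 and 2, four for Model 3); that count is a reading of the table, not something the note states, and the function name is hypothetical.

```python
def aic(minus_2_res_ll, n_cov_params):
    """Smaller-is-better AIC: -2 * restricted log-likelihood + 2 * parameters."""
    return minus_2_res_ll + 2 * n_cov_params


# model: (-2resLL, assumed number of variance components, AIC reported in Table 1)
table1 = {
    "Model 1": (6348.8, 5, 6358.8),
    "Model 2 pedigree": (6348.7, 5, 6358.7),
    "Model 2 microsatellite": (6348.6, 5, 6358.6),
    "Model 3 pedigree": (6349.7, 4, 6357.7),
    "Model 3 microsatellite": (6349.3, 4, 6357.3),
}

recomputed = {name: round(aic(m2ll, k), 1) for name, (m2ll, k, _) in table1.items()}
best = min(recomputed, key=recomputed.get)  # model with the lowest AIC

for name, value in recomputed.items():
    print(f"{name}: recomputed AIC = {value}, reported AIC = {table1[name][2]}")
print("Lowest AIC:", best)
```

Under this assumption the recomputed values agree with the reported ones, and the lower AIC of Model 3 comes from its having one fewer variance component rather than from a better likelihood.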
For each genetic model, the performance of a set of m missing crosses was predicted using the formula (Balzarini, 2002):
$$y_M = C V^{-1} y_P$$
where yM is the m x 1 vector of predicted yields of missing crosses, yP is the p x 1 vector of average yields of predictor hybrids, C is the m x p matrix of genetic covariances between missing and predictor hybrids, and V is the p x p phenotypic variance-covariance matrix among the predictor hybrids. We performed predictions for the (25 x 4) hybrids (m=4, p=100) and did not consider 4 hybrids based on a missing line. Effectiveness of predictions was measured by Spearman rank correlation (Table 2).
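The prediction step can be illustrated with a small standard-library Python sketch. It reads the definitions above as the predictor yM = C V^-1 yP, which is the only product consistent with the stated dimensions; the helper names and all numbers are hypothetical toy values, not data from this study.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def predict_missing(C, V, y_p):
    """Return y_M = C V^-1 y_P without forming the inverse of V explicitly."""
    w = solve(V, y_p)  # w = V^-1 y_P
    return [sum(c * wi for c, wi in zip(row, w)) for row in C]


# Toy example: p = 2 predictor hybrids, m = 1 missing cross (made-up values).
V = [[2.0, 1.0], [1.0, 2.0]]  # phenotypic covariances among predictor hybrids
C = [[0.5, 0.25]]             # genetic covariances, missing x predictor hybrids
y_p = [3.0, 3.0]              # yields of the predictor hybrids

y_m = predict_missing(C, V, y_p)
print(y_m)
```

Solving the linear system instead of inverting V is a numerical convenience; with p = 100 predictor hybrids either approach would be inexpensive.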
Table 2. Spearman rank correlation between observed (BLUP) and predicted hybrid yields (Model 2).
| Population | Pedigree data | Microsatellite data |
|---|---|---|
| 26 lines | 0.40** | 0.36** |
| Lines derived from synthetics | 0.45* | 0.44 |
| Lines derived from composites | 0.52** | 0.49** |
| Lines unrelated or highly divergent | 0.08 | 0.02 |
* Indicates significance at P = 0.05.
** Indicates significance at P = 0.01.
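For readers who want to reproduce this kind of comparison, the Spearman rank correlation used in Table 2 can be computed as the Pearson correlation of the ranks. The sketch below is a minimal standard-library version with average ranks for ties; the function names and the yield values are hypothetical and are not the data behind Table 2.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Made-up observed and predicted yields for five hybrids.
observed = [5.1, 6.3, 4.8, 7.0, 5.9]
predicted = [5.0, 6.0, 5.2, 6.8, 6.1]

rho = spearman(observed, predicted)
print(round(rho, 2))
```

A rank-based measure suits this use because a breeder mainly needs the predicted ordering of hybrids to match the observed ordering, not the yields themselves.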
Conclusions. Inclusion of a numerator relationship matrix (using pedigree or molecular data) generates more precise variance estimates and higher values of heritability when compared with traditional fixed effects models. For this type of cross (genetically divergent parental populations), molecular data did not provide any information beyond that provided by pedigree data.
Please Note: Notes submitted to the Maize Genetics Cooperation
Newsletter may be cited only with consent of authors.