1. Testing for overdominance. Testing for
overdominance in grain yield by regression of F1 on homozygous
parent (1946 and 1947 Letters) has now been done with 17 sets of data, giving
unweighted odds of 14:3 for overdominance.
Testing by regression of F1 on mean
performance of parent, as proposed in the 1947 Letter, has been done with many
more sets of data, with inconclusive results. Estimates of the natural
selection equilibrium are mostly higher by this second technic than those
obtained with the first, especially in the more recent tests.
A possible explanation of the discrepancy is
bias occasioned by each parent being rated with a different tester group in
samples where each parent is crossed with every other parent.
Such bias may be avoided by resort to constant
tester groups. Thus, if the sample of parents is divided into two groups and
only crosses between groups are considered, each group is a constant tester for
individuals of the other group. From any one of the many sets of records available
on the 45 F1s of 10 inbred lines many 5x5 or 4x6 tables may be
extracted for analysis. Each parent must occur in only one group. The data
table must be complete.
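
The extraction of a constant-tester table can be sketched as below. This is a minimal illustration only: the line names, the placeholder yields, and the function name between_group_table are hypothetical and not taken from any of the data sets mentioned above.

```python
from itertools import combinations
from math import comb

def between_group_table(f1, group_a, group_b):
    """Return the complete len(group_a) x len(group_b) table of between-group F1s.

    f1 maps an unordered pair of line names (a frozenset) to a yield record.
    """
    if set(group_a) & set(group_b):
        raise ValueError("each parent must occur in only one group")
    table = []
    for a in group_a:
        row = []
        for b in group_b:
            key = frozenset((a, b))
            if key not in f1:
                raise ValueError(f"table incomplete: missing cross {a} x {b}")
            row.append(f1[key])
        table.append(row)
    return table

# Hypothetical records: 10 inbred lines, 45 single crosses, placeholder yields.
lines = [f"L{i}" for i in range(1, 11)]
f1 = {frozenset(pair): float(i) for i, pair in enumerate(combinations(lines, 2))}

table_5x5 = between_group_table(f1, lines[:5], lines[5:])  # one of the 5x5 tables
table_4x6 = between_group_table(f1, lines[:4], lines[4:])  # one of the 4x6 tables

# Number of distinct ways to split all 10 lines into two groups.
n_splits_5x5 = comb(10, 5) // 2  # the two groups of 5 are interchangeable
n_splits_4x6 = comb(10, 4)
```

When all 10 lines are used, the two counts at the end show how many such tables one set of records can furnish. The tables overlap heavily and are not independent samples.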
Under the assumption of no epistasy, which seems
reasonably well warranted for corn yield, theory of regression analysis of data
tables as outlined above is simple. The problem essentially is regression of
individual F1s on row and column means, that is, prediction of
the single cross from general performance of the two parents. This problem has
been solved by methods employed in the two previous Letters for regression of F1
phenotype on parent phenotype per se.
If $v_1, v_2, \ldots, v_n$ are proportions of A in the lines of one group, $\bar{v}$ is the
proportion A for the whole group. Employ $w$ similarly for the other group.
Expectations for row and column means may then be expressed in terms of $v$ and
$\bar{w}$ for one set, and $w$ and $\bar{v}$ for the other. Solving these expressions for $v$ and
$w$ respectively and substituting in the general F1 function provides
the desired theoretical regression of F1 phenotype on general
performance of parents (phenotype):

$$F_1 = b_1 G_1 + b'_1 G_2 - b_2 G_1 G_2 + C$$

$$b_2 = \frac{-2k}{nd\,(1+k-2k\bar{v})(1+k-2k\bar{w})}$$

$$b_p = \frac{1+k-2kw}{1+k-2k\bar{w}} \quad (G_2 \text{ constant})$$

$$b_p = \frac{1+k-2kv}{1+k-2k\bar{v}} \quad (G_1 \text{ constant})$$
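
The substitution may be sketched as follows. The per-locus phenotype scale (aa = 0, Aa = (1+k)d, AA = 2d) is an assumption here, since it is not restated in this Letter. It is chosen because it reproduces the expressions for bp and b2 given above.

```latex
% Assumed per-locus scale (not restated in this section): aa = 0, Aa = (1+k)d, AA = 2d.
% v, w: proportions of A in the two parents; n loci of equal effect, no epistasy.
\begin{align*}
F_1 &= nd\,[(1+k)(v+w) - 2kvw] \\
G_1 &= nd\,[(1+k-2k\bar{w})\,v + (1+k)\bar{w}]
      && \text{(mean of the F1s of line } v \text{ over the } w \text{ group)} \\
G_2 &= nd\,[(1+k-2k\bar{v})\,w + (1+k)\bar{v}]
      && \text{(mean of the F1s of line } w \text{ over the } v \text{ group)} \\
\left.\frac{\partial F_1}{\partial G_1}\right|_{w}
    &= \frac{nd\,(1+k-2kw)}{nd\,(1+k-2k\bar{w})}
     = \frac{1+k-2kw}{1+k-2k\bar{w}} = b_p
      && (G_2 \text{ constant}) \\
\frac{d b_p}{d G_2}
    &= \frac{-2k/(1+k-2k\bar{w})}{nd\,(1+k-2k\bar{v})}
     = \frac{-2k}{nd\,(1+k-2k\bar{v})(1+k-2k\bar{w})} \\
b_p = 0 &\iff w = \frac{1+k}{2k}
\end{align*}
```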
The natural selection equilibrium gene frequency is
(1+k)/2k. Thus, if any parent line has that gene frequency with respect to n
loci considered, regression of the F1 row or column which it heads
on the parallel row or column of G (general performance, row or column means)
has the expectation of zero. The test is carried through, as before, by
calculating simple regressions (bp) of each row and column of F1 on
the parallel row or column of G. Regression of bp on G then provides the
estimate of b2 (which is the same for either rows or columns)
and the estimate of G for bp = 0.
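
The computation just described can be sketched in a few lines. This is an illustration under stated assumptions, not the worksheet actually used: the function names are hypothetical, and the example table is generated from the no-epistasy model with made-up values of n, d, k and of the gene frequencies v and w, assuming the per-locus scale aa = 0, Aa = (1+k)d, AA = 2d.

```python
def mean(xs):
    return sum(xs) / len(xs)

def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sxx

def fit_bp_on_g(g, bp):
    """Regress bp on G; return mean bp, the slope b2, and G at bp = 0."""
    b2 = slope(g, bp)
    intercept = mean(bp) - b2 * mean(g)
    # critical_G is undefined when b2 is zero (no dominance bias).
    return {"mean_bp": mean(bp), "b2": b2, "critical_G": -intercept / b2}

def constant_tester_test(table):
    """table[i][j] is the F1 of line i of one group by line j of the other."""
    rows = [list(r) for r in table]
    cols = [list(c) for c in zip(*rows)]
    g_rows = [mean(r) for r in rows]  # general performance of row parents
    g_cols = [mean(c) for c in cols]  # general performance of column parents
    # Each F1 row is regressed on the parallel row of G (the column means),
    # and each F1 column on the parallel column of G (the row means).
    bp_rows = [slope(g_cols, r) for r in rows]
    bp_cols = [slope(g_rows, c) for c in cols]
    return {"rows": fit_bp_on_g(g_rows, bp_rows),
            "cols": fit_bp_on_g(g_cols, bp_cols)}

# Hypothetical error-free 5x5 table built from the model (all values assumed).
n, d, k = 20, 1.0, 1.5
v = [0.3, 0.4, 0.5, 0.6, 0.7]
w = [0.35, 0.45, 0.5, 0.55, 0.65]
table = [[n * d * ((1 + k) * (vi + wj) - 2 * k * vi * wj) for wj in w] for vi in v]
result = constant_tester_test(table)
```

With an error-free table of this kind the two estimates of b2 agree with each other and with the theoretical expression. With field records they will differ through sampling error.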
Sampling variances of these estimates from samples
of, e.g., 25 F1s must be large, but precision may be increased by
analysis of additional sets of data. The three analyses we have run so far on
yield data have not been very consistent.
We must note too that if the mean gene frequency $\bar{v}$
or $\bar{w}$ of either tester group should closely approach the equilibrium value,
(1+k)/2k, genetic variance of G of lines of the tested group would approach
zero. Estimates of bp and b2 might then have little genetic meaning.
It is clear from the theoretical formulas for bp
that mean bp has the expectation of plus 1.00, regardless of degree of
dominance or of gene frequency. Calculated values for three analyses are:
1.0000, 1.0001; 1.0000, 0.9982; 0.9998, 0.9988. Deviations from the expectation
of 1.0000 may be due to dropping decimal places and to metrical bias. These
obtained values seem to be in agreement with the hypothesis that epistasy and
metrical bias are of little moment in grain yield of corn. Further research is
required to establish the sensitivity of this test for linearity.
It is also clear that the estimates of v and w for
the parent lines are independent of homozygosity or heterozygosity of the
parents. The analysis outlined here may be employed equally well with
heterozygous clones or with lines isolated by mild inbreeding, provided, of
course, that epistasy and metrical bias are not disturbing factors.
If calculated mean bp is found far from the
expectation of 1.0000 in any case, the estimates of b2 and critical G
must be suspected. Such deviation may, of course, be evidence of epistatic
interaction.
In general, if v is the proportion A in any corn
plant of a crossbreeding variety, the best estimate of phenotype, with respect
to n loci, is a second degree function ($v^2$) if there is dominance
bias, that is, if mean k is not zero. If w is proportion A in any other plant, the best
estimate of phenotype of F1 is a second degree function (vw) as
shown in previous Letters. Hence, the best estimate of F1 phenotype
from phenotypes of heterozygous parents is a fourth degree function if there is
dominance bias.
This general function was reduced to second degree
by employing homozygous parents in the technic outlined in the two previous
Letters. In the present technic the fourth degree function is reduced to second
degree by experimental design of constant tester groups. Row and column means
are linear functions of gene frequencies of parents. In both cases the second
degree function is reduced to first degree by holding one parent constant to
calculate bp for each F1 row or column. Thus, the least square fit
of a straight line is done only where the Mendelian expectation is linear.
Regression of bp on P or G is also theoretically linear under the stated
assumption.
We may note that the difficulty of the fourth degree
relation of offspring phenotype to phenotypes of heterozygous parents is
avoided in some theoretical considerations by treating the case of no
dominance, or by treating the case of complete dominance with mean gene
frequency at 0.5. The difficulty does not appear if consideration is restricted
to regression of offspring on parent gamete. Fisher, Immer and Tedin (Genetics,
1932) have avoided some of the difficulty by considering offspring of two
homozygotes thus to obtain a mean gene frequency of 0.5. I think they are
fitting straight lines, however, where the theoretical relations are curved if
there is dominance bias. If so, their methods must be less efficient than those
discussed above.
Fisher has noted that regression analysis and
Analysis of Variance are fundamentally the same. Present technic may perhaps be
employed or improved to study interaction of main effects and interactions
of environmental factors with main effects and interactions of genes, where the
same crosses are tested at different locations or with different treatments.
Omitting any one parent to make an (n1 -1) x (n2) table
may provide specific information on that parent.
The six single crosses of four parents provide for
three (2x2) tables equivalent to the three double crosses. In each case, the
two tester groups are the parent single crosses; the four items in the table
provide the predicted double cross.
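
The three 2x2 tables can be enumerated as in the sketch below. The parent names and yields are hypothetical. Taking the mean of the four items as the predicted double cross is an assumption (the usual convention for prediction from the non-parental single crosses); the text above says only that the four items provide the prediction.

```python
def double_cross_tables(parents, single_cross):
    """Return the three 2x2 between-group tables for four parents.

    single_cross maps a frozenset of two parent names to a yield record.
    Each result is (table, predicted double cross), keyed by the double cross.
    """
    a, b, c, d = parents
    pairings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    out = {}
    for (p, q), (r, s) in pairings:
        # Rows are headed by the parents of one single cross,
        # columns by the parents of the other.
        table = [[single_cross[frozenset((x, y))] for y in (r, s)] for x in (p, q)]
        predicted = sum(sum(row) for row in table) / 4  # assumed: plain mean
        out[f"({p}x{q})x({r}x{s})"] = (table, predicted)
    return out

# Hypothetical yields for the six single crosses of parents A, B, C, D.
yields = {
    frozenset("AB"): 60.0, frozenset("AC"): 64.0, frozenset("AD"): 58.0,
    frozenset("BC"): 70.0, frozenset("BD"): 66.0, frozenset("CD"): 62.0,
}
predictions = double_cross_tables("ABCD", yields)
```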