
1. Testing for overdominance. Testing for overdominance in grain yield by regression of F₁ on homozygous parent (1946 and 1947 Letters) has been done now with 17 sets of data to provide unweighted odds of 14:3 for overdominance.

Testing by regression of F₁ on mean performance of parent, as proposed in the 1947 Letter, has been done with many more sets of data but has given inconclusive results. Estimates of the natural selection equilibrium are mostly higher by this second technic than those obtained with the first, especially in the more recent tests.

A possible explanation of the discrepancy may be bias occasioned by each parent being rated with a different tester group in samples where each parent is crossed with every other parent.

Such bias may be avoided by resort to constant tester groups. Thus, if the sample of parents is divided into two groups and only crosses between groups are considered, each group is a constant tester for individuals of the other group. From any one of the many sets of records available on the 45 F₁s of 10 inbred lines many 5x5 or 4x6 tables may be extracted for analysis. Each parent must occur in only one group. The data table must be complete.
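The extraction of constant tester tables described above can be sketched as follows. This is only an illustration: the storage layout (a dictionary keyed by unordered pairs of lines), the function names, and the F₁ values are all hypothetical and are not taken from the records referred to in this Letter.

```python
from itertools import combinations
from math import comb


def between_group_table(f1, group_a, group_b):
    """Return the complete table of crosses between two disjoint groups.

    f1 maps frozenset({line_i, line_j}) -> F1 value (assumed layout).
    Rows follow group_a, columns follow group_b.
    """
    if set(group_a) & set(group_b):
        raise ValueError("each parent must occur in only one group")
    table = []
    for a in group_a:
        row = []
        for b in group_b:
            key = frozenset((a, b))
            if key not in f1:
                raise KeyError(f"missing cross {a} x {b}; the table must be complete")
            row.append(f1[key])
        table.append(row)
    return table


def count_splits(n_lines, size_a):
    """Count the ways to split n_lines into one group of size_a and the rest."""
    ways = comb(n_lines, size_a)
    # Equal-sized groups are interchangeable, so each split is counted twice.
    return ways // 2 if 2 * size_a == n_lines else ways


# Illustrative data only: ten hypothetical lines with made-up F1 values.
lines = list(range(10))
f1 = {frozenset(pair): 100.0 + pair[0] + 2 * pair[1]
      for pair in combinations(lines, 2)}

table_5x5 = between_group_table(f1, lines[:5], lines[5:])
table_4x6 = between_group_table(f1, lines[:4], lines[4:])
```

When all ten lines are used, `count_splits(10, 5)` gives 126 distinct 5x5 tables and `count_splits(10, 4)` gives 210 distinct 4x6 tables from any one set of 45 F₁ records.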

Under the assumption of no epistasy, which seems reasonably well warranted for corn yield, the theory of regression analysis of data tables as outlined above is simple. The problem essentially is regression of individual F₁s on row and column means, that is, prediction of the single cross from general performance of the two parents. This problem has been solved by methods employed in the two previous Letters for regression of F₁ phenotype on parent phenotype per se.

If v₁, v₂, ..., vₙ are proportions of A in the lines of one group, v̄ is the proportion A for the whole group. Employ w similarly for the other group. Expectations for row and column means may then be expressed in terms of v and w̄ for one set, and w and v̄ for the other. Solving these expressions for v and w respectively and substituting in the general F₁ function provides the desired theoretical regression of F₁ phenotype on general performance of parents (phenotype):

F₁ = b₁G₁ + b'₁G₂ ‑ b₂G₁G₂ + C

b₂ = -2k / [nd(1+k-2kv̄)(1+k-2kw̄)]

bp = (1+k-2kw)/(1+k-2kw̄); G₂ constant

   = (1+k-2kv)/(1+k-2kv̄); G₁ constant.
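The steps behind the expressions for bp and b₂ may be written out as below. This is a sketch under an assumed per-locus scale that is not restated in this Letter: aa = 0, Aa = (1+k)d, AA = 2d, summed over n loci with no epistasy, the parents contributing A with frequencies v and w. The constant C₀ is a symbol introduced here only for illustration.

```latex
% Assumed scale per locus: aa = 0, Aa = (1+k)d, AA = 2d; n loci, no epistasy.
% C_0 is a hypothetical additive constant, not a symbol from the Letter.
\begin{align*}
F_1(v,w) &= nd\,\bigl[(1+k)(v+w) - 2kvw\bigr] + C_0 \\
G_1(v)   &= nd\,\bigl[(1+k)\bar{w} + (1+k-2k\bar{w})\,v\bigr] + C_0
          && \text{(mean of the row headed by } v\text{)} \\
G_2(w)   &= nd\,\bigl[(1+k)\bar{v} + (1+k-2k\bar{v})\,w\bigr] + C_0
          && \text{(mean of the column headed by } w\text{)} \\
b_p      &= \frac{\partial F_1/\partial w}{\partial G_2/\partial w}
          = \frac{1+k-2kv}{1+k-2k\bar{v}}
          && (G_1 \text{ constant}) \\
\frac{d\,b_p}{d\,G_1}
         &= \frac{-2k/(1+k-2k\bar{v})}{nd\,(1+k-2k\bar{w})}
          = \frac{-2k}{nd\,(1+k-2k\bar{v})(1+k-2k\bar{w})}
\end{align*}
```

The last line is the slope of bp on G, which agrees with the expression given for b₂; interchanging v and w gives the same value for columns. Setting bp = 0 gives v = (1+k)/2k.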

The natural selection equilibrium gene frequency is (1+k)/2k. Thus, if any parent line has that gene frequency with respect to the n loci considered, regression of the F₁ row or column which it heads on the parallel row or column of G (general performance, row or column means) has the expectation of zero. The test is carried through, as before, by calculating simple regressions (bp) of each row and column of F₁ on the parallel row or column of G. Regression of bp on G then provides the estimate of b₂ (which is the same for either rows or columns) and the estimate of G for bp = 0.
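The procedure just described can be sketched in a few lines of NumPy. The function name, the returned dictionary, and the demonstration table are hypothetical. The table is generated from an assumed no-epistasy model with made-up values of nd, k, v and w, so it shows only that the arithmetic recovers the theoretical b₂; it is not an analysis of yield data.

```python
import numpy as np


def slope(x, y):
    """Least-squares slope of y on x (simple regression)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def constant_tester_analysis(f1):
    """Regress each F1 row/column on the parallel G, then regress bp on G."""
    f1 = np.asarray(f1, dtype=float)
    g_rows = f1.mean(axis=1)  # general performance of each row parent
    g_cols = f1.mean(axis=0)  # general performance of each column parent

    # bp for a row: regression of that row of F1 on the column means.
    bp_rows = np.array([slope(g_cols, f1[i, :]) for i in range(f1.shape[0])])
    # bp for a column: regression of that column of F1 on the row means.
    bp_cols = np.array([slope(g_rows, f1[:, j]) for j in range(f1.shape[1])])

    out = {}
    for name, g, bp in (("rows", g_rows, bp_rows), ("cols", g_cols, bp_cols)):
        b2 = slope(g, bp)                       # regression of bp on G
        intercept = bp.mean() - b2 * g.mean()
        out[name] = {
            "bp": bp,
            "mean_bp": float(bp.mean()),
            "b2": b2,
            "critical_G": -intercept / b2,      # value of G at which bp = 0
        }
    return out


# Illustrative table from an assumed model; all numbers are made up.
nd, k = 10.0, 2.0
v = np.array([0.2, 0.4, 0.5, 0.7, 0.9])    # proportion A, row parents
w = np.array([0.1, 0.3, 0.6, 0.8, 0.95])   # proportion A, column parents
demo_table = nd * ((1 + k) * (v[:, None] + w[None, :])
                   - 2 * k * v[:, None] * w[None, :])
demo = constant_tester_analysis(demo_table)
theory_b2 = -2 * k / (nd * (1 + k - 2 * k * v.mean()) * (1 + k - 2 * k * w.mean()))
```

For this noise-free table, `demo["rows"]["b2"]` and `demo["cols"]["b2"]` both agree with `theory_b2`, and both critical G values equal nd(1+k)²/2k, the general performance expected of a parent at the equilibrium gene frequency. As computed here, the row slopes average to 1 for any complete table, because the least-squares slope is linear in the dependent variable and the average row is itself the regressor.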

Sampling variances of these estimates from samples, e.g., of 25 F₁s must be large, but precision may be increased by analysis of additional sets of data. The three analyses we have run so far on yield data have not been very consistent.

We must note too that if the mean gene frequency v̄ or w̄ of either tester group should closely approach the equilibrium value, (1+k)/2k, genetic variance of G of lines of the tested group would approach zero. Estimates of bp and b₂ might then have little genetic meaning.

It is clear from the theoretical formulas for bp that mean bp has the expectation of plus 1.00, regardless of degree of dominance or of gene frequency. Calculated values for three analyses are: 1.0000, 1.0001; 1.0000, 0.9982; 0.9998, 0.9988. Deviations from the expectation of 1.0000 may be due to dropping decimal places and to metrical bias. These obtained values seem to be in agreement with the hypothesis that epistasy and metrical bias are of little moment in grain yield of corn. Further research is required to establish the sensitivity of this test for linearity.
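The expectation of 1.00 for mean bp follows in one step from the theoretical formula for bp. In the sketch below, n₁ is taken to be the number of lines in the group with proportions v₁, v₂, and so on; that reading of n₁ is an assumption.

```latex
% n_1 = assumed number of lines in the v group; \bar{v} is their mean proportion of A.
\[
\frac{1}{n_1}\sum_{i=1}^{n_1} \frac{1+k-2kv_i}{1+k-2k\bar{v}}
  = \frac{1+k-2k\bar{v}}{1+k-2k\bar{v}} = 1 .
\]
```

Neither k nor the individual gene frequencies remain in the result, and the same step applies to the other group with w in place of v.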

It is also clear that the estimates of v and w for the parent lines are independent of homozygosity or heterozygosity of the parents. The analysis outlined here may be employed equally well with heterozygous clones or with lines isolated by mild inbreeding, provided, of course, that epistasy and metrical bias are not disturbing factors.

If calculated mean bp is found far from the expectation of 1.0000 in any case the estimates of b₂ and critical G must be suspected. Such deviation may, of course, be evidence of epistatic interaction.

In general, if v is the proportion A in any corn plant of a crossbreeding variety, the best estimate of phenotype, with respect to n loci, is a second degree function (v²) if there is dominance bias, that is, if mean k is not zero. If w is proportion A in any other plant, the best estimate of phenotype of F₁ is a second degree function (vw) as shown in previous Letters. Hence, the best estimate of F₁ phenotype from phenotypes of heterozygous parents is a fourth degree function if there is dominance bias.

This general function was reduced to second degree by employing homozygous parents in the technic outlined in the two previous Letters. In the present technic the fourth degree function is reduced to second degree by experimental design of constant tester groups. Row and column means are linear functions of gene frequencies of parents. In both cases the second degree function is reduced to first degree by holding one parent constant to calculate bp for each F₁ row or column. Thus, the least square fit of a straight line is done only where the Mendelian expectation is linear. Regression of bp on P or G is also theoretically linear under the stated assumption.

We may note that the difficulty of the fourth degree relation of offspring phenotype to phenotypes of heterozygous parents is avoided in some theoretical considerations by treating the case of no dominance, or by treating the case of complete dominance with mean gene frequency at 0.5. The difficulty does not appear if consideration is restricted to regression of offspring on parent gamete. Fisher, Immer and Tedin (Genetics, 1932) have avoided some of the difficulty by considering offspring of two homozygotes thus to obtain a mean gene frequency of 0.5. I think they are fitting straight lines, however, where the theoretical relations are curved if there is dominance bias. If so, their methods must be less efficient than those discussed above.

Fisher has noted that regression analysis and Analysis of Variance are fundamentally the same. Present technic may perhaps be employed or improved to study interaction of main effects and interactions of environmental factors with main effects and interactions of genes, where the same crosses are tested at different locations or with different treatments. Omitting any one parent to make an (n₁ -1) x (n₂) table may provide specific information on that parent.

The six single crosses of four parents provide for three (2x2) tables equivalent to the three double crosses. In each case, the two tester groups are the parent single crosses; the four items in the table provide the predicted double cross.
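The three 2x2 tables can be laid out as in the sketch below. Averaging the four items of each table to obtain the predicted double cross is an assumption made here for illustration, since the text says only that the four items provide the prediction. The function name, parent labels, and single-cross values are hypothetical.

```python
def double_cross_predictions(single, parents):
    """Predict the three double crosses of four parents from their six single crosses.

    single maps frozenset({p, q}) -> single-cross value (assumed layout).
    Each double cross (p x q) x (r x s) is predicted as the mean of the
    four between-group single crosses p x r, p x s, q x r, q x s
    (assumed averaging rule).
    """
    a, b, c, d = parents
    pairings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    predictions = {}
    for (p, q), (r, s) in pairings:
        # The 2x2 table: tester groups are the two parent single crosses.
        cells = [single[frozenset((x, y))] for x in (p, q) for y in (r, s)]
        predictions[f"({p}x{q})x({r}x{s})"] = sum(cells) / 4
    return predictions


# Illustrative values only; not taken from any yield records.
single_crosses = {
    frozenset("AB"): 62.0, frozenset("AC"): 71.0, frozenset("AD"): 80.0,
    frozenset("BC"): 93.0, frozenset("BD"): 100.0, frozenset("CD"): 110.0,
}
predictions = double_cross_predictions(single_crosses, "ABCD")
```

With these made-up values the predictions are 86.0 for (AxB)x(CxD), 86.25 for (AxC)x(BxD) and 85.75 for (AxD)x(BxC). Each prediction omits the two single crosses that serve as parents of that double cross.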