Heterosis, grain yield

For homozygous parents and linear interaction of non-allelic genes, in the notation of Fisher et al., Genetics 17:107, 1932, d is (AA-aa)/2 and h is the deviation of aA from the midpoint between aa and AA.

P₁ = 2n₁d + R	F₁ = n(d + h) + R	B₁ = 1/2 n(d + h) + n₁d + R
P₂ = 2n₂d + R	F₂ = n(d + 1/2 h) + R	B₂ = 1/2 n(d + h) + n₂d + R
P = 2nd + 2R	F = 2nd + 3/2 nh + 2R	B = 2nd + nh + 2R

Ø is the phenotype, n is the number of loci heterozygous in F₁, and R is the least homozygote available by segregation. P, F and B are the sums P₁ + P₂, F₁ + F₂ and B₁ + B₂, with n₁ + n₂ = n.

Analysis of data

(All records per cent of F₁)

Estimate of 2nh	Maize yield, Neal¹	Maize yield, Lindstrom²	Tomato height, Powers³, Danmark × Red Current	Tomato fruit wt., Powers³, Danmark × Red Current	Tomato fruit wt., Powers³, Johannisfeur × Red Current
4(F₁-F₂)	148.1	136.8	76.0		36.0
(2F₁-P)	124.4	127.6	58.5	-751.7	-625.1
2(2F₁-B)		113.2	62.8	-241.6	-228.8
2(2F₂-P)	130.3	118.4	41.0	-1510.6	-1486.3
4/3(F-P)	126.4	124.5	52.6	-1004.7	-845.0
4(F-B)		89.6	49.6	-490.4	-493.6
2(B-P)		142.0	54.2	-1261.8	-1021.5

Mean 2nh	132.3	121.7	56.4	-750.5	-666.3

(F₂-1/2B)		-5.9	-3.3	-67.5	67.3

P	75.6	72.4	141.5	950.7	836.6

¹J.A.S.A. 27:666, 1935.
²Proc. 7 Int. G.C.
³J.A. Res. 63:149, 1941.

The close agreement of Neal's and Lindstrom's data in the above analysis seems to indicate strongly that grain yield is a function of heterozygosis. For any locus, (aA-aa) - (AA-aA) = (h+d) - [2d-(h+d)] = 2h. The interval from the least homozygote to the heterozygote minus the interval from the heterozygote to the top homozygote is 2h for one locus or 2nh for n loci, if h and d values are essentially the same for all loci.

For all values of h or h/d (any degree of dominance) the 7 estimates of 2nh in the table are a homogeneous set, except for non-genetic fluctuations. Heterogeneity indicates interaction of non-alleles.
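The seven estimates can be checked against the generation formulas given at the start of this note. The following is a minimal sketch in Python, assuming allelic interaction only; the function names and the illustrative values of n, d, h and R are hypothetical and are not taken from any of the records.

```python
def estimates_of_2nh(P, F1, F2, B):
    """Return the seven estimates of 2nh from generation means.

    P = P1 + P2 (sum of parents), B = B1 + B2 (sum of backcrosses),
    F = F1 + F2, all on the same scale.
    """
    F = F1 + F2
    return {
        "4(F1-F2)": 4 * (F1 - F2),
        "(2F1-P)": 2 * F1 - P,
        "2(2F1-B)": 2 * (2 * F1 - B),
        "2(2F2-P)": 2 * (2 * F2 - P),
        "4/3(F-P)": 4 * (F - P) / 3,
        "4(F-B)": 4 * (F - B),
        "2(B-P)": 2 * (B - P),
    }


def generation_means(n, d, h, R):
    """Expected P, F1, F2, B with no non-allelic interaction (hypothetical inputs)."""
    P = 2 * n * d + 2 * R
    F1 = n * (d + h) + R
    F2 = n * (d + h / 2) + R
    B = 2 * n * d + n * h + 2 * R
    return P, F1, F2, B


# Illustrative values only: 10 loci, d = 1, h = 1.5, R = 2, so 2nh = 30.
P, F1, F2, B = generation_means(n=10, d=1.0, h=1.5, R=2.0)
seven = estimates_of_2nh(P, F1, F2, B)
for label, value in seven.items():
    print(f"{label:10s} {value:6.2f}")
```

With data free of non-allelic interaction and sampling error, all seven values coincide; spread among them in real records is what the table is meant to expose.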

The three quantities, (P = 2nd+2R) > (F₁ = nh+nd+R) > 2nh, must lie in that or the reverse order, with each interval in any case equal to ±[n(h-d)-R]. If h = d (dominance complete) the intervals are estimates of R. On that assumption the mean estimate of R for the two maize records is minus 26.5% F₁. If R cannot be negative, the minimum estimate of R, equal to zero, provides the minimum estimate of h, equal to 1.7d.
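The arithmetic behind these two figures can be retraced from the table. This is a minimal sketch in Python, using the P and mean 2nh entries of the two maize columns; the variable names are illustrative only.

```python
# Values copied from the table (per cent of F1): sum of parents P and mean 2nh.
records = {
    "Neal": {"P": 75.6, "two_nh": 132.3},
    "Lindstrom": {"P": 72.4, "two_nh": 121.7},
}
F1 = 100.0  # all records are expressed as per cent of F1

# With h = d, both intervals (P - F1) and (F1 - 2nh) are estimates of R.
intervals = []
for rec in records.values():
    intervals.append(rec["P"] - F1)
    intervals.append(F1 - rec["two_nh"])
mean_R = sum(intervals) / len(intervals)

# With R = 0, P = 2nd, so h/d = 2nh / P, taken on the means of the two records.
mean_P = sum(rec["P"] for rec in records.values()) / len(records)
mean_2nh = sum(rec["two_nh"] for rec in records.values()) / len(records)
h_over_d = mean_2nh / mean_P

print(round(mean_R, 1), round(mean_P, 1), round(h_over_d, 2))
```

The mean P of 74.0 obtained here is also the upper limit for the top homozygote (P-R) when R is not allowed to be negative.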

The top homozygote is (P-R). For these records it cannot be estimated larger than 74%F₁ if negative R is to be avoided.

The data on tomato weight and estimates of 2nh from them may seem to suggest a complication of interactions, although the two sets of 2nh are quite similar. It is proposed to separate allelic from any regular non-allelic interaction graphically. The points P₁, B₁, F₁, F₂, B₂ and P₂ are plotted with the scale on the Ø axis being that of the actual data and on the x axis that of allelic but no non-allelic interaction. Lay off a wide interval from P₁ to P₂ on the x axis. Trial positions of F₁ may then be taken with F₂ midway between F₁ and the mean of parents and each backcross midway from F₁ to the recurrent parent. The best trial position of F₁ should be 2(F₁-F₂) from the mean of parents in the direction indicated by the data, since F₁ and F₂ have the same gene number and their comparison will be least affected by non-allelic interaction. If the 6 plotted points do not seem to lie on a smooth curve, F₁ is to be shifted right or left, with F₂ and backcross shifts being 1/2 of the F₁ shift, until the best fit to a smooth curve is obtained. The curve presumably represents regular non-allelic interaction or regular interaction with environment. Allelic interaction is evident in the 7 estimates of 2nh, which should be a uniform set.
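The x-axis bookkeeping of the graphical method may be sketched as follows. This is only an illustration in Python of where the six points fall for a given trial position of F₁; the placement of P₁ at 0 and P₂ at 1, the function name and the trial values are assumptions, and the fitting of the smooth curve itself is left to the eye as described above.

```python
def x_positions(f1_x, p1_x=0.0, p2_x=1.0):
    """x-axis positions of the six generations for one trial position of F1.

    F2 lies midway between F1 and the mean of the parents; each backcross
    lies midway between F1 and its recurrent parent.
    """
    mid_parent = (p1_x + p2_x) / 2
    return {
        "P1": p1_x,
        "B1": (f1_x + p1_x) / 2,
        "F1": f1_x,
        "F2": (f1_x + mid_parent) / 2,
        "B2": (f1_x + p2_x) / 2,
        "P2": p2_x,
    }


# Two trial positions of F1; shifting F1 by 0.1 shifts F2, B1 and B2 by 0.05.
first = x_positions(0.5)
second = x_positions(0.6)
for name in first:
    print(name, first[name], "->", second[name])
```

The observed generation means would then be plotted on the Ø axis against these positions, and the trial repeated until the points lie on a smooth curve.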

In this way, close fits to smooth curves were obtained with Powers' data on the crosses Danmark × Red Current and Johannisfeur × Red Current with F₁s just slightly to the right of the parental midpoint towards heavier fruit. The curves lie between Ø = kx³ and Ø = b^x over most of the range. Both agree closely with the hypothesis of very slight dominance of heavier fruit and strong, regular interaction of non-alleles. The interaction may of course be little more than the cubic relation of weight or volume to linear dimension.

A slightly poorer fit was obtained for Johannisfeur × Bonny Best but the same dominance bias and interaction is evident. The two records on Danmark × Johannisfeur did not provide consistent solutions, perhaps because the parents are too close together. That difficulty would always appear with yield records on inbred maize.

Complementary interaction is not regular in the above sense. It might become evident in the (F₂-1/2B) comparison and in aberrations from regular interaction in the above graphical analysis. With 2-factor interaction, F₂ is 9/16 and 1/2B is 8/16 of the interval from 1/2P to F₁; both are 8/16 without interaction. There is no evidence of complementary interaction as a factor of heterosis of maize yield or of tomato plant height. There seems to be no evidence for complementary interaction for tomato weight except in the cross Johannisfeur × Bonny Best. If the curve for that cross is plotted by neglecting the F₂ to obtain the best fit with F₁ and backcrosses, the F₂ deviation from the curve is large and positive, which may indicate complementary interaction for heavier fruit. Plotting ∛Ø or log Ø might bring the complementary interaction out more clearly.
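The 9/16 and 8/16 fractions can be verified by enumerating gametes for two unlinked loci. The sketch below is in Python and rests on assumptions not stated in the text: the parents are taken as AAbb and aaBB, the complementary phenotype is 1 only when A and B are both present, and the comparison case is complete dominance at each locus with the two loci adding. All function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def gametes(genotype):
    """Gamete frequencies for a two-locus genotype such as ("Aa", "Bb")."""
    freq = {}
    for a, b in product(genotype[0], genotype[1]):
        freq[(a, b)] = freq.get((a, b), 0) + Fraction(1, 4)
    return freq


def progeny_mean(parent1, parent2, phenotype):
    """Mean phenotype of the cross parent1 x parent2 (loci unlinked)."""
    total = Fraction(0)
    for (g1, f1), (g2, f2) in product(gametes(parent1).items(),
                                      gametes(parent2).items()):
        total += f1 * f2 * phenotype(g1[0] + g2[0], g1[1] + g2[1])
    return total


def complementary(locus_a, locus_b):
    """1 only when A and B are both present, else 0 (assumed model)."""
    return Fraction(int("A" in locus_a and "B" in locus_b))


def no_interaction(locus_a, locus_b):
    """Complete dominance at each locus, loci adding (assumed model)."""
    return Fraction(int("A" in locus_a) + int("B" in locus_b), 2)


def fractions_of_interval(phenotype):
    """Positions of F2 and 1/2B as fractions of the interval from 1/2P to F1."""
    p1, p2, f1 = ("AA", "bb"), ("aa", "BB"), ("Aa", "Bb")
    mid_parent = (phenotype(*p1) + phenotype(*p2)) / 2
    span = phenotype(*f1) - mid_parent
    f2 = progeny_mean(f1, f1, phenotype)
    half_b = (progeny_mean(f1, p1, phenotype)
              + progeny_mean(f1, p2, phenotype)) / 2
    return (f2 - mid_parent) / span, (half_b - mid_parent) / span


for model in (complementary, no_interaction):
    f2_pos, half_b_pos = fractions_of_interval(model)
    print(model.__name__, str(f2_pos), str(half_b_pos))
```

The difference of 1/16 of the interval between F₂ and 1/2B under the complementary model is the quantity the (F₂-1/2B) row of the table is looking for.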

The reader should be warned that application of the above graphical analysis to data involving little or no non-allelic interaction and strong interaction of alleles as in tomato plant height may produce a straight line with the 6 values spaced the same on both axes or a smooth curve through P₁, B₁, F/2, B₂ and P₂. In the latter event the six values will agree with the hypothesis of no allelic interaction on the x axis. The factor of curvature here is h. I do not now have the function.

For linear interaction of non-alleles, theoretical regressions in F₂ and backcross of Ø on x (gene number) are:

F₂:	Ø = [-hx² + (2n-1)dx + 2nhx]/(2n-1) + R,	dØ/dx = d + (2n-2x)h/(2n-1)

Bn:	Ø = [nd + (n-2n_b)h]x/n + 2n_b²h/n + R,	dØ/dx = d + (n-2n_b)h/n

n is the number of loci heterozygous in F₁; n_b is the number of the n loci fixed AA in the recurrent parent.

These equations seem to be mainly useful for the solution of theoretical problems. For example, the backcross distribution is not skewed by any degree of dominance even though the recurrent parent is fixed AA at all n loci (n_b = n). The slope is then (d-h), or zero if h = d. If h>d the slope is negative -- Ø decreases as the number of plus genes increases. If n_b is zero the slope is (d+h) -- positive unless h is negative and greater than d.
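The F₂ regression and the two slopes can be checked numerically. The sketch below, in Python, compares the F₂ formula with the exact mean phenotype at each gene number obtained by enumerating all genotypes at a small number of loci, assuming equal d and h at every locus and no non-allelic interaction; the function names and the numerical values are illustrative.

```python
from itertools import product


def f2_regression(x, n, d, h, R):
    """Phenotype on x in F2: [-h x^2 + (2n-1) d x + 2 n h x] / (2n-1) + R."""
    return (-h * x**2 + (2 * n - 1) * d * x + 2 * n * h * x) / (2 * n - 1) + R


def f2_slope(x, n, d, h):
    """Slope of the F2 regression at gene number x."""
    return d + (2 * n - 2 * x) * h / (2 * n - 1)


def backcross_slope(n, n_b, d, h):
    """Slope of the backcross regression; n_b loci are fixed AA in the recurrent parent."""
    return d + (n - 2 * n_b) * h / n


def f2_conditional_means(n, d, h, R):
    """Exact mean phenotype for each x, by enumerating all F2 genotypes at n loci."""
    weight = {0: 1, 1: 2, 2: 1}          # aa, aA, AA in the ratio 1:2:1
    sums, totals = {}, {}
    for genotype in product((0, 1, 2), repeat=n):
        w = 1
        for g in genotype:
            w *= weight[g]
        x = sum(genotype)                # number of plus genes
        hets = genotype.count(1)         # number of heterozygous loci
        value = R + d * x + h * hets
        sums[x] = sums.get(x, 0.0) + w * value
        totals[x] = totals.get(x, 0) + w
    return {x: sums[x] / totals[x] for x in sorted(sums)}


n, d, h, R = 3, 1.0, 1.5, 2.0            # illustrative values only
for x, exact in f2_conditional_means(n, d, h, R).items():
    print(x, round(exact, 4), round(f2_regression(x, n, d, h, R), 4))

print(backcross_slope(n=5, n_b=5, d=1.0, h=1.0))   # recurrent parent all AA, h = d
print(backcross_slope(n=5, n_b=0, d=1.0, h=1.5))   # recurrent parent all aa
```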

F₂ regression is a second degree parabola with slope a function of -2hx. The F₂ distribution is skewed by dominance. The familiar case (h = d) involves the left branch of the parabola from (0,R) rising with decreasing slope to the vertex at (x = 2n-1/2), then dropping slightly to (x = 2n). This function may be employed with the normal frequency table to construct a theoretical distribution for any number of loci and any degree of dominance to show that maximum skewness is reached when h = d, and that skewness then decreases with increasing h. The demonstration is facilitated by working with one pair of genes. Thus if A'A' equals AA, and A'A is some greater value, d is zero and h is relatively large. The F₂, (1/4A'A' + 1/2A'A + 1/4AA), becomes (1/2A'A',AA + 1/2A'A). This distribution or the product of any number of such distributions is symmetrical. If d is now allowed to take increasing positive values, skewness increases up to h = d. East's alleles of divergent function would not intensify skewness of F₂.
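The one-pair demonstration can be followed numerically. This is a minimal sketch in Python computing the moment skewness of the F₂ distribution for a single locus, with aa taken as zero; the function name and the grid of h values are illustrative, and the moment coefficient is an assumed measure of skewness, since the text does not name one.

```python
def f2_skewness(d, h):
    """Moment skewness of the one-locus F2 distribution 1/4 aa : 1/2 aA : 1/4 AA."""
    values = (0.0, d + h, 2 * d)      # aa, aA, AA measured from aa
    freqs = (0.25, 0.5, 0.25)
    mean = sum(f * v for f, v in zip(freqs, values))
    m2 = sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))
    m3 = sum(f * (v - mean) ** 3 for f, v in zip(freqs, values))
    return m3 / m2 ** 1.5


d = 1.0
for h in (0.0, 0.5, 1.0, 1.5, 2.0, 4.0):
    print(f"h = {h:3.1f}d  skewness = {f2_skewness(d, h):+.3f}")

# d = 0 with h large: the two homozygotes coincide and the distribution is symmetrical.
print(f2_skewness(0.0, 1.0))
```

For n independent loci of equal effect the skewness is the one-locus value divided by the square root of n, so the position of the maximum at h = d is not changed by the number of loci.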

The conclusion of h>d for maize yield is supported by failure of mass and ear row selection, by failure of synthetic combinations of selected inbreds, by superiority of hybrids of inbreds of diverse origins and by the success of modern maize breeding itself. If h is not greater than d, mass or ear row selection will probably continue to surpass present maize breeding technic, because of more frequent recurrence of selection. But if h>d, present technic is the only method so far tried which should effect appreciable improvement. No degree of allelic interaction will confuse selection among F₁ hybrids of homozygous lines. However, selection favoring the heterozygote loses efficiency rapidly. It is questionable if the expectation of continuing success with present technic can be supported in Mendelian theory.

Selection may be measured by the deviation of the mean of a selected group from the original mean in terms of the standard deviation of the original. Thus "Student" noted selection effects of 12 and 7 sigma for high and low oil in the Illinois experiments. If the selected group may be represented by a tail of the normal area cut off above x = t, and the mean of the tail is s, then s = (ordinate at t)/(area beyond t), where the area beyond t is P_t. Then 1/P_t is the number of individuals from which selection of the top one may be expected to effect a selection differential of the given value of s. The highest value of s calculable from a 15-place table of areas and ordinates of the normal curve (W.P.A. City of New York) is 8, for which 1/P_t is 222,222,000,000,000. This is roughly 2000 times the number of maize plants grown in the world in one season. That the low oil result (s = 7) might have been obtained by selection among 400,000,000 homozygous lines is plausible. The high oil result (s = 12) is 4 billion million times as difficult. Selection of the top 10 from 26 provides an s of one in the absence of gene interaction and environmental effects. Eight recurrences of such selection will effect an s value of 8 if variability is maintained as it was in the selection for oil. A total of 208 plants is required. From this viewpoint the oil selection results do not seem improbable as the work was done; they do seem very improbable in the face of much inbreeding.
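The mean of the normal tail can be computed directly in place of the table. The following is a minimal sketch in Python using the standard library; the function names are illustrative, and the area beyond t is taken from the complementary error function so that it stays accurate far out in the tail.

```python
from math import erfc, sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def area_beyond(t):
    """P_t, the area of the standard normal curve above t."""
    return 0.5 * erfc(t / sqrt(2.0))


def mean_of_tail(t):
    """s = (ordinate at t) / (area beyond t)."""
    return STANDARD_NORMAL.pdf(t) / area_beyond(t)


def selection_differential(selected, total):
    """s when the top `selected` of `total` individuals are kept."""
    p = selected / total
    t = STANDARD_NORMAL.inv_cdf(1.0 - p)
    return mean_of_tail(t)


s_one_cycle = selection_differential(10, 26)
print(round(s_one_cycle, 2))     # close to 1, as stated for the top 10 of 26
print(8 * 26)                    # plants handled in eight such cycles
```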

The s value of the top one of 11,185 singlecrosses from at least 150 inbred lines is about 4. This might be a yield increase of about 40% over original stock. The genetic variance of singlecrosses is the same as for single plants of original crossbred stock. Sigma in this case is then 10% of the original mean yield. This seems a fair estimate of the present Florida situation. The problem now is how much effort will be required for further gains. If each cycle of inbreeding must begin at the same level as the first, as indicated by the yield of synthetic combinations of selected lines and nearly all other available evidence, it will be necessary to identify the best single cross among 1,300,000 from 1600 homozygous lines to effect a further improvement of 10%. Gaining 10% again beyond that will be truly difficult, even though the genetic variation may remain unimpaired in the process as suggested by oil selection results.
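The relation between s, the number of singlecrosses and the number of lines can be retraced approximately. The sketch below, in Python, solves for the truncation point t giving a stated s by bisection and then finds the smallest number of lines whose singlecrosses number at least 1/P_t. The function names and the bisection limits are assumptions, and the results should agree with the figures in the text only roughly, since those were read from a printed table.

```python
from math import comb, erfc, exp, pi, sqrt


def area_beyond(t):
    """P_t, the area of the standard normal curve above t."""
    return 0.5 * erfc(t / sqrt(2.0))


def mean_of_tail(t):
    """s = (ordinate at t) / (area beyond t)."""
    return exp(-0.5 * t * t) / sqrt(2.0 * pi) / area_beyond(t)


def truncation_point(s, lo=0.0, hi=10.0):
    """Solve mean_of_tail(t) = s for t by bisection (mean_of_tail rises with t)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if mean_of_tail(mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def lines_needed(s):
    """Smallest number of lines giving at least 1/P_t singlecrosses, and 1/P_t itself."""
    crosses = 1.0 / area_beyond(truncation_point(s))
    k = 2
    while comb(k, 2) < crosses:
        k += 1
    return k, crosses


for s in (4.0, 5.0):
    k, crosses = lines_needed(s)
    print(f"s = {s}: about {crosses:,.0f} singlecrosses, {k} lines")

print(comb(150, 2), comb(1600, 2))   # singlecrosses among 150 and among 1600 lines
```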

A breeding technic has been proposed to deal with the case h>d, Hull, Recurrent Selection for Specific Combining Ability in Corn. J.A.S.A. in press. The method is recurrent selection in a crossbred lot for combining ability with a specific homozygous line. Selection is among testcrosses of single plants of the crossbred lot to the homozygous tester line. For any locus heterozygous in the crossbred lot and aa in the tester the testcrosses are: aa, (aa+aA)/2, and aA, or if the tester is AA they are: aA, (aA+AA)/2, and AA. The three testcrosses are separated by equal intervals, (d+h)/2 in the first case and (d-h)/2 in the second. The essential point is that the three values are equally spaced as would be the three genotypes in a crossbred population without dominance. This type of selection avoids the confusion of dominance or allelic interaction even though h>d. The price is some loss of variance. It also allows maximum frequency of recurrence of selection. Maximum frequency of recurrence with respect to resistance to insects and diseases as well as to yield and any other desirable characters would seem to be obtained by simultaneous selection.

Tomato weight and height have been included for contrast with maize yield. Estimates of 2nh involving (-B) are smaller than those involving (-P) for both maize yield and tomato weight. B values might suffer less distortion from non-allelic interaction than P values since the former are nearer the center. The slightly excessive value of B in Lindstrom's data may indicate nothing more than a little heterozygosity remaining in the parent lines. Strong allelic interaction is indicated for maize yield. Tomato weight records indicate very slight allelic interaction but strong non-allelic interaction. Both the maize yield and tomato weight situations seem improbable. If the tomato weight interaction is the cubic relation of volume to linear dimension, why does not this function appear in the relations of aa, aA and AA at one locus? Why would it not appear in the maize yield between non-alleles? Why does h>d appear only in grain yield of maize; not in components, e.g. ear length and diameter, plant height, stalk diameter etc.? Tomato height in F₁ exceeds the greater parent but not the sum of parents (P). There is no evidence here of h>d and slight evidence of non-allelic interaction.

The enormous selection intensities available by properly controlled recurrent selection provide a tool for investigation of physiological limits, limits of recombination, and perhaps detection of aggregates of natural or induced mutations in a group of numerous small genes.

Appendix - January 10, 1945: Hayes et al, J.A.S.A. 36:998, 1944; data on synthetic, mean of parent lines and mean F₁. From F₁ minus synthetic the estimate of 2nh is 160% F₁. The (2F₁ - P) estimate of 2nh is 127% F₁. If h = d, and R = 0, then F₁ = 2nh. Decline from F₁ to F₂ or synthetic is 2nh/2N, where N is number of lines. On the foregoing assumptions, expected decline of Hayes' synthetic is 100/16 or 6.25% F₁. If R is 20% F₁, expected decline of synthetic is 5% F₁.

The actual decline of 10% F₁ may be evidence of h>d, non-allelic interaction, or R<0. Taking R = 0 and no interaction, h = 4d for the F₁ - synthetic comparison, and h = 1.74d for (2F₁ - P).
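The appendix figures follow from F₁ = n(d+h) + R with F₁ set at 100. The sketch below, in Python, retraces them; the function names are illustrative, and N = 8 lines is inferred from the divisor 16 in the text rather than stated there.

```python
def h_over_d(two_nh, R=0.0, F1=100.0):
    """Ratio h/d from an estimate of 2nh, using F1 = n(d + h) + R."""
    nh = two_nh / 2
    nd = F1 - R - nh
    return nh / nd


def expected_decline(N, R=0.0, F1=100.0):
    """Decline from F1 to a synthetic of N lines, 2nh/2N, when h = d."""
    two_nh = F1 - R          # h = d gives F1 - R = 2nh
    return two_nh / (2 * N)


N = 8                        # inferred from 100/16; an assumption
print(expected_decline(N), expected_decline(N, R=20.0))
print(10.0 * 2 * N)                      # 2nh implied by an actual decline of 10% F1
print(round(h_over_d(160.0), 2), round(h_over_d(127.0), 2))
print(round(h_over_d(127.0, R=10.0), 2)) # a positive R raises the estimate of h/d
```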

Kiesselbach, J.A.S.A. 22:614, 1930; F₂ and F₃ of 21 singlecrosses, h = 1.98d.

Richey et al, J.A.S.A. 26:196, 1934; F₂ 10 double crosses, h = 1.55d.

Neal, loc. cit., F₂ 10 double crosses, h = 1.72d.

If R is some positive value all of the above estimates of h must be revised upward.

Fred H. Hull