Dual ancestry of Zea: Sequence evidence at the adh1, adh2, sh1 and o2 loci
--Bird, RMcK
When last year (MNL 69:100-101) I suggested that comparisons of gene sequences should allow one to test models of evolution and domestication, I had forgotten a paper I read a decade ago, and I had yet to read several more recent papers, all revealing unusual variance within species of Zea. Werr et al. (EMBO J. 4:1373-1380, 1985) noted such great differences between two maize alleles of the sh1 locus that they estimated that the two alleles reflected millions of years of separate evolution. They found 16 silent (synonymous, third codon position) base differences (3.0%) among 540 silent positions in 2100 bp of exon DNA that they could compare in genomic and cDNA sequences from two maize lines. They also found 10 base differences (3.7%) in 270 bp of 3'-untranslated DNA. Using the evolutionary rate of 5.37 base substitutions per 1000 silent positions per million years determined by Miyata et al. (J. Mol. Evol. 19:28-35, 1982) for several animal genes, these differences indicate that the two alleles separated 3.0 million years ago (Mya) (evolutionary distance = (26/810) / (5.37 / (1000 x 1 My)) = 6.0 My; age of separation = 1/2 x distance = 3.0 My).
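The arithmetic in the preceding paragraph can be checked with a short sketch. It assumes, as the text does, that the silent exon sites and the 3'-untranslated sites can be pooled and that one rate applies to both; the function and variable names are illustrative and do not come from Werr et al. or Miyata et al.

```python
def separation_age_mya(differences, sites, rate_per_1000_sites_per_my):
    """Return (evolutionary distance in My, age of separation in Mya).

    distance = (differences / sites) / rate
    age      = distance / 2, since both lineages accumulate changes
    """
    proportion_different = differences / sites
    rate_per_site_per_my = rate_per_1000_sites_per_my / 1000.0
    distance_my = proportion_different / rate_per_site_per_my
    return distance_my, distance_my / 2.0


# sh1 figures quoted in the text: 16 silent exon differences plus
# 10 differences in the 3'-untranslated region, over 540 + 270 sites.
sh1_distance, sh1_age = separation_age_mya(16 + 10, 540 + 270, 5.37)
print(f"distance = {sh1_distance:.2f} My, separation = {sh1_age:.2f} Mya")
```

With these inputs the separation comes out just under 3.0 Mya, in agreement with the rounded figure given above.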
Gaut and Clegg (PNAS 88:2060-2064, 1991) estimated that the adh1-1S and adh1-1F alleles of maize separated about 2.6 Mya, calibrated on the separation of rice from other grasses at 50 Mya, the separation of Pennisetum from Sorghum and Zea at 25 Mya, and a mean coding region substitution rate of 3.63 x 10^-9 per site per year. Their estimate of the Pennisetum-Zea substitution rate at silent exon sites was 7.90 x 10^-9 per site per year over the 25 million years. Later they reported (PNAS 90:5095-5099, 1993) on 8 alleles from Z. mays, Z. diploperennis and Z. luxurians, finding 81 polymorphic nucleotide sites in 1483 silent positions and at these sites up to 46 nucleotide differences (between the adh1-Pollo allele and the adh1+1S, adh1+Coroico and Zea luxurians alleles). This can be used to estimate a maximum separation for these alleles of 2.0 Mya, based on the 7.90 x 10^-9 /site/year silent site substitution rate (= (46/1483) / (7.90 / (1000 x 1 My)) x 1/2).
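The adh1 estimate uses a rate quoted per site per year, whereas the formula is written per 1000 sites per million years. The sketch below makes that unit conversion explicit and repeats the calculation. It assumes the 46 differences are counted over all 1483 silent positions, as the formula in the text implies; the function name is illustrative only.

```python
def max_separation_mya(differences, silent_sites, rate_per_site_per_year):
    """Age of separation in Mya from a per-site, per-year substitution rate."""
    # 7.90e-9 per site per year is 7.90e-3 per site per My,
    # i.e. 7.90 substitutions per 1000 sites per My.
    rate_per_site_per_my = rate_per_site_per_year * 1_000_000
    distance_my = (differences / silent_sites) / rate_per_site_per_my
    return distance_my / 2.0  # both lineages diverge from the common ancestor


adh1_age = max_separation_mya(46, 1483, 7.90e-9)
print(f"maximum adh1 separation = {adh1_age:.2f} Mya")
```

The result is slightly under 2.0 Mya, which rounds to the figure given in the text.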
Based on the same rate, Goloubinoff et al. (PNAS 90:1997-2001, 1993) concluded that polymorphism in a 315 ± 15 bp segment of the adh2 locus indicates that "the gene pool of maize must be at least several million years old" (p. 2000). They included a wide range of materials in their study--a tripsacum, several teosintes and modern and archaeological maize. A further analysis of their data (below) provides yet another conclusion about the evolution of Zea. And, most recently, Hartings et al. (MNL 69:18-19) calculated two ages of separation for alleles at the o2 locus: 1.06 and 1.86 Mya, based on Kimura's neutral nucleotide substitution rate of 5 x 10^-9 per site per year.
These estimates, of course, are subject to redefinition of the times when rice separated from other grasses and when Pennisetum separated from Sorghum and Zea, and to adjustment of the synonymous and other silent substitution rates. Also, given that these great differences within the maize and the several teosinte gene pools are due to introgression between two very different ancestors (below), these are minimum estimates, in large part because recombination within the loci will have created many alleles with reduced differences over time. I have found no reports of divergence between species of Zea on the level of thousands of years.
This variation may be explained by the Intersectional Introgression Model of maize origin (Bird, MNL 69:100-101, 1995), which holds that the two sections of Zea were separated for several million years and, within the last 5000 years, were involved in a hybridization and mutual introgression between a domesticated pure maize and a wild teosinte. This is more parsimonious (straightforward, simple) than proposing that there were multiple domestications of very different Zea species, which have since combined into one species, or that there are extremely high base substitution rates in maize and teosinte, or that several species of Zea are separately maintaining shared ancestral polymorphism.
I would like to demonstrate further evidence that two long-separated ancestors were involved. Figure 1 is the result of taking the partial sequences of alleles of the adh2 locus presented in Figure 3 of Goloubinoff et al., deleting all portions with no or one shift, placing only the differences in a table, sorting the table to place similar sequences together and dissimilar ones far apart, and marking with either light or dark background those "shifts" which belong to one of two very different "linked sets". Thus, for nucleotides 56-103, the g-0000-g-t-gct-t-c linked set (Set T), from alleles 9A and 9B of mexicana teosinte, 12B of Z. luxurians, 4 of Tabloncillo, and 1A and 1B of Northern Flint, is shaded darkly, while the a-agct-a-c-000-c-g set (Set B), from alleles 7A of the Cabuza (Chile) archaeological kernels and BF of a Corn Belt inbred, is shaded lightly. For this zone of the locus, the other alleles are mostly recombinations of the two opposite linked sets. There is very little possibility that such linked sets are due to fairly recent independent mutational events. Rather they are most likely the result of the accumulation of shifts over millions of years in two separate taxa, followed by a relatively recent mutual introgression and recombination of the two sequences. On the other hand, the shift to "t" at nucleotide 75 in allele 8A and to "c" at nucleotide 79 in alleles 7B and 8B could be independent events. Possibly the identical sequences of alleles 11B of Z. diploperennis and 6 of the charred Junín (Peru) cobs and kernels represent a third pattern and ancestor. Here the 56-103 set is g-agct-a-c-gct-t-g, and nucleotides 30, 52 and 125 are often "g" in these and alleles 12A of Z. luxurians and 5 of Kculli. However, this pattern can be explained as a subset derived from the introgression of two ancestors plus early independent change. 
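The tabulation steps described above (dropping positions with no shift or only one, keeping the differences, and placing similar sequences together) might be sketched as follows. This is one reading of the procedure, not the method actually used to build Figure 1: a "shift" is taken here to mean an allele that differs from the most common state at a position, and the ordering is a plain lexical sort standing in for the manual arrangement. The four short sequences are invented placeholders, not data from Goloubinoff et al.

```python
from collections import Counter


def shared_shift_columns(alignment, min_shifts=2):
    """Return positions where at least `min_shifts` alleles differ
    from the most common state (assumed meaning of a non-unique shift)."""
    sequences = list(alignment.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must be aligned to the same length")
    kept = []
    for pos in range(length):
        column = [seq[pos] for seq in sequences]
        most_common_count = Counter(column).most_common(1)[0][1]
        if len(column) - most_common_count >= min_shifts:
            kept.append(pos)
    return kept


def difference_table(alignment, columns):
    """Reduce each allele to its states at the kept positions."""
    return {name: "".join(seq[c] for c in columns)
            for name, seq in alignment.items()}


# Hypothetical toy alignment (placeholder alleles, not real adh2 data).
toy = {
    "allele_A": "gatcgta",
    "allele_B": "gatcgta",
    "allele_C": "aatcata",
    "allele_D": "aaccata",
}

columns = shared_shift_columns(toy)
table = difference_table(toy, columns)

# Crude stand-in for sorting similar sequences together.
for name, states in sorted(table.items(), key=lambda item: item[1]):
    print(name, states)
```

In the toy data, position 2 is dropped because only one allele differs there, while positions 0 and 4 are kept and separate the alleles into two groups, the analogue of the two linked sets discussed above.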
What the two ancestors supplying sets T and B might be is not revealed here where a relatively small sample has been studied.
There also seems to be some linkage of the T and B sets to numbers of repeats in the GA microsatellite region (nucleotides 10-35 upstream of the transcription start site): (GA)12-13 in alleles 9A, 12B and 1A, (GA)4 in alleles 7A and 10A, and even (GA)8 in alleles 6 and 11B. At least in this microsatellite zone the polymorphism seems conservative.
Another, perhaps as interesting, feature is the remarkably high yet parallel polymorphism in all the species studied. Seven shifts separate the two alleles of the Z. luxurians sample, ten separate the parviglumis alleles, and 11 separate two of the alleles from the archaeological Cabuza kernels. But the variation runs in parallel such that an allele from mexicana teosinte is identical to one from Z. luxurians, and one from Z. diploperennis is identical to that from the 440 year-old Junín cobs and kernels! As Goloubinoff et al. say (p. 2001), "a phylogenetic analysis [of these data] yields no evidence in support of the notion that modern races of maize emerged from a single common ancestor, such as a specific line of Z. mays parviglumis or Z. mays mexicana." However, the II Model does explain the evidence, though perhaps needing to be expanded to include all the Zea species as products of the introgression of the last four millennia.
Figure 1. Non-unique nucleotide shifts noted in 13 modern and ancient Zea and Tripsacum materials by Goloubinoff, Pääbo, and Wilson (1993).