State University of New York
Codon usage tables for zein and non-zein genes: an update
--Douglas A. Hamilton and Joseph P. Mascarenhas
In an earlier codon usage table for maize based on 25 nuclear genes (D. M. Bashe and J. P. Mascarenhas, MNL 63: 4-5, 1989) it was found that at least one of the zeins (22 kD family) was not typical of the codon usage profile of other maize genes. With the additional sequences now available we have updated the codon usage table. A total of 56 maize nuclear genes have now been analyzed. This analysis shows that zeins of the 19 kD and 22 kD families exhibit a codon usage pattern that is different from that of the bulk of other nuclear encoded genes. Accordingly we have created two tables, one for the 19 and 22 kD zeins and the second for nuclear genes other than those of the 19 and 22 kD zeins. Codon usage in maize has also been discussed by W. H. Campbell and G. Gowri (Plant Physiol. 92: 1-11, 1990).
Genes were selected from the GenBank database (Release 64, 6/90) on the basis of the presence of a complete coding sequence, either as mRNA or as combined exons from a genomic sequence. Codon usage tables were created by summing the tables generated for individual genes with the Genetics Computer Group (GCG) Sequence Analysis Software (Version 6.2) program "CodonFrequency" (J. Devereux, P. Haeberli and O. Smithies, Nucleic Acids Res. 12: 387-395, 1984). Fourteen sequences for the 19 and 22 kD zein genes and forty-two sequences of other nuclear genes were used (Table 1). Table 2 shows the number of occurrences of each codon for an amino acid, and the number per 100 codons for that amino acid. Interestingly, GAG, the most frequently used codon in the table for the bulk of nuclear genes, is entirely absent from the table for the 19 and 22 kD zein genes.
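The table-building step described above (count codons gene by gene, add the per-gene tables together, then express each codon per 100 codons of its amino acid) can be sketched in Python. This is only an illustration, not the GCG "CodonFrequency" program: the function names are hypothetical, the two short sequences are invented toy inputs rather than GenBank entries, and the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codon -> one-letter amino acid ('*' = stop).
GENETIC_CODE = {"".join(c): aa for c, aa in zip(
    product("TCAG", repeat=3),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")}


def count_codons(cds):
    """Count the codons of one complete coding sequence, read in frame 1."""
    cds = cds.upper()
    return Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))


def usage_table(sequences):
    """Sum per-gene codon counts; return {codon: (occurrences, % usage)}.

    % usage is the number of occurrences per 100 codons for the same
    amino acid. Codons of amino acids that never occur are left out.
    """
    total = Counter()
    for cds in sequences:
        total += count_codons(cds)          # addition of per-gene tables
    per_aa = Counter()
    for codon, n in total.items():
        if codon in GENETIC_CODE:           # skip codons with ambiguous bases
            per_aa[GENETIC_CODE[codon]] += n
    return {codon: (total[codon], 100.0 * total[codon] / per_aa[aa])
            for codon, aa in GENETIC_CODE.items() if per_aa[aa]}


# Hypothetical toy coding sequences, for illustration only.
toy_genes = ["ATGGAAGAGGAGTAA", "ATGGAGTGA"]
table = usage_table(toy_genes)
for codon in ("GAA", "GAG"):
    print(codon, table[codon])
```

Applied to the Glu counts reported for the non-zein genes (GAA 165, GAG 780), the same arithmetic gives 165/945 and 780/945, which round to the 17 and 83 per 100 listed in Table 2.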
To test the ability of the tables to distinguish the correct reading frame of a sequence, and whether the sequence utilized a "19 & 22 kD zein" or "other nuclear gene" codon preference, the tables were used to analyze several maize sequences whose reading frames were known. The results are shown in Table 3. As in our earlier report, the scoring was calculated as follows: for each codon in the sequence a score was assigned as the percent occurrence of that codon divided by the highest percent occurrence of any codon for that amino acid, as determined by the table. The scores for all codons in the sequence were summed and divided by the total number of codons to give a percent similarity to the table. The genes in Table 3 were scored in each of the three reading frames, over the same region, using both of the tables. The first frame is that which has been identified as the coding frame. All nuclear genes tested, as well as zeins of the 15-16 kD classes, are correctly distinguished by the "other nuclear gene" table. In contrast, the coding frames of zeins of the 19 and 22 kD classes are not correctly distinguished by a codon usage table made from other nuclear genes, but are distinguished by the "19 & 22 kD zein" table. Note that the correct reading frames of two other seed storage proteins tested (MZEGLUT2E and MZEGLB1SA) are also correctly distinguished by the "other nuclear gene" table. The alternate coding preference raises important questions about the evolutionary origin of the 19 and 22 kD families of zein genes.
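The scoring rule described above can be sketched as follows. This is a hedged Python illustration and not David Bashe's program: the function names are hypothetical, the percentages are transcribed from the % Usage columns of Table 2, the standard genetic code is assumed, and two choices are assumptions of this sketch rather than statements about the original method (stop codons are scored from the "Term." row like any other codon, and codons containing ambiguous bases are skipped).

```python
from itertools import product

# Standard genetic code, codon -> one-letter amino acid ('*' = stop).
CODE = {"".join(c): aa for c, aa in zip(
    product("TCAG", repeat=3),
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")}

# Triples "codon  other-nuclear %  zein %", transcribed from Table 2.
TABLE2 = """
CGA 4 8   CGC 39 0  CGG 16 11 CGT 9 8   AGA 6 5   AGG 27 68
CTA 3 20  CTC 36 11 CTG 40 12 CTT 13 22 TTA 1 11  TTG 8 24
TCA 7 26  TCC 30 13 TCG 19 8  TCT 10 27 AGC 30 21 AGT 5 6
ACA 10 33 ACC 47 38 ACG 28 13 ACT 15 16
CCA 17 48 CCC 29 18 CCG 38 7  CCT 16 26
GCA 10 29 GCC 39 21 GCG 30 13 GCT 21 38
GGA 12 7  GGC 50 24 GGG 21 5  GGT 17 64
GTA 4 19  GTC 37 15 GTG 44 50 GTT 15 16
AAA 13 38 AAG 87 62 AAC 81 89 AAT 19 11
CAA 12 69 CAG 88 31 CAC 73 45 CAT 27 55
GAA 17 100 GAG 83 0 GAC 76 67 GAT 24 33
TAC 89 77 TAT 11 23 TGC 86 58 TGT 14 42
TTC 85 63 TTT 15 37 ATA 6 24  ATC 73 39 ATT 21 37
ATG 100 100 TGG 100 100 TAA 17 0 TAG 24 100 TGA 59 0
"""
_f = TABLE2.split()
OTHER = {_f[i]: float(_f[i + 1]) for i in range(0, len(_f), 3)}
ZEIN = {_f[i]: float(_f[i + 2]) for i in range(0, len(_f), 3)}


def relative_weights(percent):
    """Percent usage of each codon divided by the highest percent usage
    of any codon for the same amino acid."""
    best = {}
    for codon, pct in percent.items():
        aa = CODE[codon]
        best[aa] = max(best.get(aa, 0.0), pct)
    return {c: (p / best[CODE[c]] if best[CODE[c]] else 0.0)
            for c, p in percent.items()}


def frame_score(seq, percent, offset=0):
    """Mean relative weight of the codons read from `offset`, as a percent."""
    w = relative_weights(percent)
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
    codons = [c for c in codons if c in w]   # skip ambiguous codons
    return 100.0 * sum(w[c] for c in codons) / len(codons) if codons else 0.0


def score_all_frames(seq, percent):
    """Scores for reading frames 1, 2 and 3 against one usage table."""
    return [frame_score(seq, percent, k) for k in range(3)]


# Hypothetical toy sequence, for illustration only.
toy = "ATGGCCAAGGAGCTCGCCTGA"
print("other nuclear:", [round(s) for s in score_all_frames(toy, OTHER)])
print("19 & 22 kD zein:", [round(s) for s in score_all_frames(toy, ZEIN)])
```

Comparing the six numbers for a sequence (three frames against each table) and taking the highest is the comparison summarized in Table 3.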
We thank David Bashe for his program to calculate reading frame scores in Table 3.
Permission of the authors is not required for citing the codon usage tables.
Table 1. GenBank sequences used in the preparation of the codon usage tables.
GenBank locus | Description |
"Other Nuclear Genes" | |
MZEA1G | A1 gene for NADPH-dependent reductase |
MZEACT1G | Actin 1 gene |
MZEADH1F | Alcohol dehydrogenase (ADH1-F) mRNA |
MZEADH1FA | Alcohol dehydrogenase (ADH1-1F) gene |
MZEADH2NR | Alcohol dehydrogenase (ADH2-N) mRNA |
MZEALBB32 | Albumin b-32 mRNA |
MZEALD | Aldolase mRNA |
MZEANT | ATP/ADP translocator mRNA |
MZEBRNZW | UDP glucose flavonoid glycosyl transferase (Bz-W22) |
MZEBRNZZA | UDP glucose flavonoid glycosyl transferase (Bz-McC) |
MZECAT1I | Catalase-1 isoenzyme (cat-1) mRNA |
MZECAT3I | Catalase-3 isoenzyme (cat-1) mRNA |
MZEEG2R | Endosperm glutelin-2 protein mRNA |
MZEGAPDH | Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) mRNA |
MZEGGST3B | Glutathione-S-transferase (GSTIII) mRNA |
MZEGLB1SA | Embryo globulin S allele mRNA |
MZEGLUT2E | Endosperm glutelin-2 gene |
MZEGSTI | Glutathione-S-transferase I mRNA |
MZEGSTIB | Glutathione-S-transferase (GST-I) mRNA |
MZEH3C2 | Histone 3 gene |
MZEH3C4 | Histone 3 gene |
MZEH4C14 | Histone 4 gene |
MZEH4C7 | Histone 4 gene |
MZEHSP701,2 | Heat shock protein 70, exons 1+2 |
MZELHCP | Light-harvesting chlorophyll a/b binding protein mRNA |
MZEMPL3 | Major lipid body protein L3 mRNA |
MZENAR,1 | NADH:nitrate reductase (NR) mRNA (5' + 3' ends) |
MZENDMEX | NADP-dependent malic enzyme (Me1) mRNA |
MZEPCSSU | RuBisCo small subunit mRNA |
MZEPEPCR | Phosphoenolpyruvate carboxylase (PEPCase) mRNA |
MZEPLTP | Phospholipid transfer protein mRNA |
MZEPOD | Pyruvate, orthophosphate dikinase mRNA |
MZERBCS | rbcS gene for RuBisCo small subunit |
MZEREGG | Lc regulatory protein mRNA |
MZESOD2A | Superoxide dismutase 2 (SOD2) mRNA |
MZESOD3I | Superoxide dismutase-3 isoenzyme mRNA |
MZESUSYSG | Sucrose synthase gene (shrunken) |
MZETPI1 | Triosephosphate isomerase 1, exon 1 |
MZEWAXY | Amyloplast-specific transit protein (waxy locus) |
MZEZE15A3 | 15kD zein |
MZEZE15G | 15kD zein |
MZEZE16 | 16kD zein |
"19 & 22 kD Zein Genes" | |
MZEI19 | 19kD zein |
MZEZE19A | 19kD zein |
MZEZE19B1 | 19kD zein |
MZEZE19C1 | 19kD zein |
MZEZE19C2 | 19kD zein |
MZEZE19D1 | 19kD zein |
MZEZE22A | 22kD zein |
MZEZE22B | 22kD zein |
MZEZEA20M | 19kD zein |
MZEZEA30M | 19kD zein |
MZEZEAZ124 | 19kD zein |
MZEZEPCM1 | 22kD zein |
MZEZEZ4G | 22kD zein |
MZEZEZG3A | 22kD zein |
Table 2. Table of codon usage with the number of occurrences of
each codon, and the occurrences per 100 codons for the same amino acid.
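Amino Acid | Codon Used | Occurrences (other nuclear genes) | % Usage (other nuclear genes) | Occurrences (19 & 22 kD zeins) | % Usage (19 & 22 kD zeins) |
Arg | CGA | 34 | 4 | 3 | 8 |
Arg | CGC | 332 | 39 | 0 | 0 |
Arg | CGG | 134 | 16 | 4 | 11 |
Arg | CGT | 80 | 9 | 3 | 8 |
Arg | AGA | 48 | 6 | 2 | 5 |
Arg | AGG | 231 | 27 | 25 | 68 |
Leu | CTA | 39 | 3 | 130 | 20 |
Leu | CTC | 466 | 36 | 70 | 11 |
Leu | CTG | 515 | 40 | 79 | 12 |
Leu | CTT | 165 | 13 | 141 | 22 |
Leu | TTA | 9 | 1 | 70 | 11 |
Leu | TTG | 104 | 8 | 157 | 24 |
Ser | TCA | 59 | 7 | 60 | 26 |
Ser | TCC | 272 | 30 | 30 | 13 |
Ser | TCG | 174 | 19 | 18 | 8 |
Ser | TCT | 88 | 10 | 62 | 27 |
Ser | AGC | 269 | 30 | 49 | 21 |
Ser | AGT | 45 | 5 | 13 | 6 |
Thr | ACA | 77 | 10 | 34 | 33 |
Thr | ACC | 361 | 47 | 39 | 38 |
Thr | ACG | 217 | 28 | 14 | 13 |
Thr | ACT | 117 | 15 | 17 | 16 |
Pro | CCA | 159 | 17 | 148 | 48 |
Pro | CCC | 268 | 29 | 56 | 18 |
Pro | CCG | 347 | 38 | 23 | 7 |
Pro | CCT | 144 | 16 | 80 | 26 |
Ala | GCA | 147 | 10 | 137 | 29 |
Ala | GCC | 598 | 39 | 99 | 21 |
Ala | GCG | 459 | 30 | 61 | 13 |
Ala | GCT | 311 | 21 | 179 | 38 |
Gly | GGA | 151 | 12 | 4 | 7 |
Gly | GGC | 610 | 50 | 14 | 24 |
Gly | GGG | 254 | 21 | 3 | 5 |
Gly | GGT | 211 | 17 | 37 | 64 |
Val | GTA | 41 | 4 | 24 | 19 |
Val | GTC | 411 | 37 | 19 | 15 |
Val | GTG | 500 | 44 | 64 | 50 |
Val | GTT | 172 | 15 | 20 | 16 |
Lys | AAA | 97 | 13 | 5 | 38 |
Lys | AAG | 673 | 87 | 8 | 62 |
Asn | AAC | 392 | 81 | 132 | 89 |
Asn | AAT | 90 | 19 | 17 | 11 |
Gln | CAA | 67 | 12 | 407 | 69 |
Gln | CAG | 496 | 88 | 185 | 31 |
His | CAC | 265 | 73 | 15 | 45 |
His | CAT | 96 | 27 | 18 | 55 |
Glu | GAA | 165 | 17 | 15 | 100 |
Glu | GAG | 780 | 83 | 0 | 0 |
Asp | GAC | 552 | 76 | 6 | 67 |
Asp | GAT | 176 | 24 | 3 | 33 |
Tyr | TAC | 381 | 89 | 82 | 77 |
Tyr | TAT | 48 | 11 | 24 | 23 |
Cys | TGC | 242 | 86 | 21 | 58 |
Cys | TGT | 39 | 14 | 15 | 42 |
Phe | TTC | 493 | 85 | 106 | 63 |
Phe | TTT | 87 | 15 | 63 | 37 |
Ile | ATA | 42 | 6 | 35 | 24 |
Ile | ATC | 476 | 73 | 56 | 39 |
Ile | ATT | 134 | 21 | 53 | 37 |
Met | ATG | 391 | 100 | 69 | 100 |
Trp | TGG | 187 | 100 | 1 | 100 |
Term. | TAA | 7 | 17 | 0 | 0 |
Term. | TAG | 10 | 24 | 14 | 100 |
Term. | TGA | 24 | 59 | 0 | 0 |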
Table 3. Scoring of some maize coding regions on the basis of
the codon usage tables (highest score in bold print).
Gene tested | "Other nuclear gene" table, frame 1 | frame 2 | frame 3 | "19 & 22 kD zein" table, frame 1 | frame 2 | frame 3 |
MZEA1G | 77 | 55 | 50 | 46 | 50 | 49 |
MZEGGST3 | 77 | 60 | 56 | 48 | 49 | 46 |
MZEGLB1SA | 74 | 62 | 45 | 44 | 49 | 41 |
MZEGLUT2E | 74 | 72 | 57 | 46 | 46 | 66 |
MZEH3C4 | 79 | 58 | 55 | 42 | 54 | 46 |
MZELHCP | 84 | 57 | 48 | 48 | 49 | 46 |
MZEPEPCR | 73 | 58 | 46 | 49 | 54 | 49 |
MZERBCS | 89 | 53 | 51 | 54 | 56 | 48 |
MZESUSYSG | 81 | 57 | 47 | 52 | 48 | 43 |
MZEZE15G | 83 | 68 | 50 | 48 | 52 | 50 |
MZEZE16 | 79 | 68 | 54 | 43 | 46 | 60 |
MZEI19 | 47 | 60 | 49 | 76 | 64 | 70 |
MZEZE19A | 46 | 62 | 48 | 76 | 63 | 67 |
MZEZE22A | 51 | 65 | 46 | 73 | 68 | 65 |
MZEZE22B | 49 | 63 | 45 | 71 | 67 | 65 |