AMES, IOWA
Iowa State University
Analysis of the maize Myb gene superfamily: conserved motifs and functional characterization
--Jiang, C, Peterson, T
Myb proteins are defined by a highly conserved DNA-binding domain termed Myb, which is composed of approximately 50 amino acids with regularly spaced tryptophan residues. Myb genes encode one of the largest and most diverse families of transcription factors in plants. However, with the exception of a few well-studied cases, little is known about the functions of most Myb genes. In our study, we attempted to classify and predict the functions of Myb genes from maize and other plants. First, we clustered closely related Myb genes into subgroups on the basis of sequence similarity and phylogeny. By consulting the published literature, we found that Myb genes with similar functions clustered within the same subgroup. Interestingly, AtMyb33, AtMyb65, AtMyb101, AtMyb104 and At3g60460 are complementary, with few mismatches, to the Arabidopsis microRNA (noncoding RNA) miR159 (Rhoades et al., Cell 110:513–520, 2002). These five Arabidopsis Myb genes fall in the same subgroup in our analysis, which provides further evidence for the reliability of our subgroup designations. In addition, gene structure analysis shows that Myb genes in the same subgroup have conserved exon-intron structures and intron phases. Second, we used the motif-searching program MEME to identify conserved motifs in the C-terminal regions of the Myb proteins in each subgroup. We identified 38 candidate motifs with E-value ≤ 1e-10. These motifs were then tested with the similarity-scoring program PlotSimilarity from the GCG package (Genetics Computer Group; http://www.accelrys.com/products/gcg_wisconsin_package/) and with the nonsynonymous substitution test implemented in the program YN00 from the PAML package (Yang and Nielsen, Mol Biol Evol, 17:32–43, 2000). These tests eliminated 20 putative motifs, leaving a total of 18 qualified motifs. Each of these 18 motifs was found to be specific to its subgroup.
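The miR159 observation above rests on counting mismatches between a microRNA and a candidate target site. The following Python sketch shows one simple way such a count could be made. It is an illustration only, not the procedure used in this study or by Rhoades et al.: the function names are hypothetical, the sequences in the usage note are toy strings rather than real miR159 or Myb sequences, and G:U wobble pairs are counted as mismatches, which is an assumption.

```python
# Minimal sketch: count mismatches between a microRNA and candidate target
# sites in a transcript. All names here are hypothetical illustrations.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def normalize(seq):
    """Upper-case a sequence and convert DNA 'T' to RNA 'U'."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(normalize(rna)))


def count_mismatches(mirna, site):
    """Count positions at which `site` fails to pair with `mirna`.

    Both sequences are written 5'->3'. A perfectly complementary site
    equals the reverse complement of the miRNA. G:U wobble pairs are
    counted as mismatches here (an assumption of this sketch).
    """
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    perfect_site = reverse_complement(mirna)
    return sum(1 for a, b in zip(perfect_site, normalize(site)) if a != b)


def best_site(mirna, transcript):
    """Slide the miRNA along the transcript; return (offset, mismatches)
    for the first window with the fewest mismatches, or None if the
    transcript is shorter than the miRNA."""
    n = len(mirna)
    best = None
    for offset in range(len(transcript) - n + 1):
        mismatches = count_mismatches(mirna, transcript[offset:offset + n])
        if best is None or mismatches < best[1]:
            best = (offset, mismatches)
    return best
```

For example, with the toy miRNA `UUGGAC`, calling `best_site("UUGGAC", "AAAGUCCAAGG")` locates the fully complementary window `GUCCAA` at offset 3. A real analysis would use the published miRNA and transcript sequences and a scoring scheme suited to plant miRNA targets.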
To further test the specificity of these motifs, we performed homology searches against the Swiss-Prot database and the GenBank EST database. Only genes containing Myb domains were detected in both datasets. For some motifs, no hits were detected in the EST search, possibly due to low expression levels. Finally, we predicted the functions of a large proportion of previously uncharacterized Myb genes. The resulting functional classification table may provide a useful starting point for determining Myb gene function (Jiang, Gu and Peterson, submitted).