VICTORIA, AUSTRALIA
La Trobe University

UNITED KINGDOM
University of Bristol

Real-time mining of maize genome data for simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) molecular genetic markers

— Savage, D; Logan, E; Erwin, T; Robinson, A; Love, CG; Barker, G; Batley, J; Edwards, D

The discovery of molecular genetic markers has moved from the laboratory to the computer, driven by the increasing availability of expressed and genomic DNA sequence and by the development of bioinformatics tools for mining these data. We have developed tools for the rapid, real-time identification of SNPs and SSRs from maize sequence data and have made these tools publicly available online.

The SNP discovery tool is based on a method previously applied to the en masse discovery of maize SNPs (Batley et al. 2003). The new tool extends that method by allowing users to enter a sequence of interest for BLAST comparison with a maize EST database. Matching sequences may then be selected for assembly and subsequent SNP discovery. Alternatively, users may submit their own file of gene sequences for assembly and SNP discovery, or a file of previously assembled sequences for SNP discovery alone. All parameters for BLAST comparison, assembly with CAP3 and SNP discovery may be set at the web interface. This flexibility lets users fine-tune the discovery parameters for their sequences of interest.
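To illustrate the kind of check that SNP discovery performs on an assembly, the following is a minimal Python sketch, not the tool's own implementation. It assumes the reads have already been aligned to equal length (with "-" marking gaps) and applies a simple redundancy filter: a column is reported only when at least two different bases are each seen in a minimum number of reads. The names candidate_snps and min_minor_count are hypothetical, and the threshold of 2 is an illustrative choice rather than a parameter taken from the tool.

```python
from collections import Counter
from typing import NamedTuple


class CandidateSNP(NamedTuple):
    column: int    # 0-based column in the alignment
    alleles: dict  # base -> number of reads carrying that base


def candidate_snps(aligned_reads, min_minor_count=2):
    """Report alignment columns where two or more bases are each
    supported by at least `min_minor_count` reads.

    `min_minor_count` is a hypothetical redundancy threshold used
    here to screen out polymorphisms seen in only one read, which
    are more likely to be sequencing errors.
    """
    if not aligned_reads:
        return []
    width = len(aligned_reads[0])
    if any(len(read) != width for read in aligned_reads):
        raise ValueError("all aligned reads must have the same length")

    hits = []
    for col in range(width):
        counts = Counter(read[col].upper() for read in aligned_reads)
        # Keep only unambiguous bases with enough supporting reads;
        # gaps ("-") and ambiguity codes such as "N" are ignored.
        supported = {
            base: n
            for base, n in counts.items()
            if base in "ACGT" and n >= min_minor_count
        }
        if len(supported) >= 2:
            hits.append(CandidateSNP(col, supported))
    return hits


if __name__ == "__main__":
    reads = [
        "ACGTACGTAC",
        "ACGTACGTAC",
        "ACGAACGTAC",
        "ACGAACGTAC",
        "ACGTACCTAC",  # single-read difference at column 6
        "ACG-ACGTAC",  # gap at column 3
    ]
    for snp in candidate_snps(reads):
        print(snp.column, snp.alleles)
```

In this example only column 3 is reported, because the T/A difference is supported by several reads each, whereas the C at column 6 occurs in a single read and is filtered out.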

The SSR discovery tool was recently published (Robinson et al. 2004). Users submit DNA sequences over the web in FASTA format. The sequences are then processed using a combination of SPUTNIK (Abajian 1994) and Primer3 (Rozen and Skaletsky 2000) to identify SSRs and then design PCR primers for their amplification. This tool is available online. We have applied it to identify SSR primer pairs from both expressed sequences and genomic shotgun sequences from maize. Processing of 58,582 unique maize expressed sequences assembled by TIGR (TIGR gene indices) led to the design of PCR primers for a total of 11,585 SSRs, comprising 1,480 dinucleotide, 7,048 trinucleotide, 1,680 tetranucleotide and 1,377 pentanucleotide repeats. Further processing of 1.8 million maize genomic sequences identified PCR primer pairs for 73,933 dinucleotide, 126,026 trinucleotide, 69,184 tetranucleotide and 58,602 pentanucleotide SSRs. The unusually high frequency of trinucleotide SSRs within the maize genomic sequences reflects the enrichment of these sequences for expressed genomic regions. The designed SSR primer pairs are freely available for download.
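The SSR identification step described above can be pictured with a short Python sketch. This is not SPUTNIK's algorithm: it is a plain regular-expression search for perfect tandem repeats of 2 to 5 bases in FASTA input, and it does not design primers. The function names and the minimum copy numbers (6, 4, 3 and 3 copies for di-, tri-, tetra- and pentanucleotide motifs) are assumptions chosen for illustration, not settings taken from the published tool.

```python
import re
from typing import Iterator, NamedTuple


class SSR(NamedTuple):
    seq_id: str
    start: int   # 0-based start of the repeat tract
    motif: str
    copies: int


def parse_fasta(text: str) -> Iterator[tuple]:
    """Yield (identifier, sequence) pairs from FASTA-formatted text."""
    seq_id, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks).upper()
            header = line[1:].split()
            seq_id, chunks = (header[0] if header else ""), []
        else:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks).upper()


def find_ssrs(seq_id, seq, min_copies=None):
    """Find perfect tandem repeats with motif lengths of 2 to 5 bases."""
    if min_copies is None:
        # Hypothetical thresholds: motif length -> minimum copy number.
        min_copies = {2: 6, 3: 4, 4: 3, 5: 3}
    hits = []
    for unit_len, copies in min_copies.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "ATAT" (really "AT") or "AAA" (a homopolymer).
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            n_copies = len(match.group(0)) // unit_len
            hits.append(SSR(seq_id, match.start(), motif, n_copies))
    return sorted(hits, key=lambda hit: hit.start)


if __name__ == "__main__":
    fasta = ">s1 example\nGGC" + "AT" * 7 + "CCGTA" + "CAG" * 5 + "TT\n"
    for seq_id, seq in parse_fasta(fasta):
        for ssr in find_ssrs(seq_id, seq):
            print(ssr)
```

Run on the example sequence, the sketch reports an (AT)7 dinucleotide repeat and a (CAG)5 trinucleotide repeat. Tallying the motif lengths of such hits across a sequence set gives the kind of per-class counts reported above.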

References

Abajian, C (1994) SPUTNIK.

Batley J, Barker G, O'Sullivan H, Edwards KJ and Edwards D (2003) Mining for Single Nucleotide Polymorphisms and Insertions/Deletions in Maize Expressed Sequence Tag Data. Plant Physiology 132: 84–91.

Robinson AJ, Love CG, Batley J, Barker G and Edwards D (2004) Simple Sequence Repeat Marker Loci Discovery using SSR Primer. Bioinformatics 20 (9): 1475–1476.

Rozen S. and Skaletsky H.J. (2000) Primer3 on the WWW for general users and for biologist programmers. In: Krawetz S, Misener S. (eds) Bioinformatics Methods and Protocols: Methods in Molecular Biology. Humana Press, Totowa, NJ, pp 365–386. Available online.

The TIGR Gene Index Databases, The Institute for Genomic Research, Rockville, MD 20850 (available online) [December, 2004].



Please Note: As is the policy with the printed version, notes submitted to the Maize Genetics Cooperation Newsletter may be cited only with consent of the authors.
