Announcements
- July 15-18, 2012: Annual meeting of the Mycological Society of America, New Haven, CT
- June 17-22, 2012: Gordon Conference on Cellular & Molecular Fungal Biology, Holderness, NH
- May 14-18, 2012: MGM workshop at JGI, Walnut Creek, CA
- Apr 2-4, 2012: Biocuration 2012, Washington, DC
Releases
- May 4, 2012: Piloderma croceum F 1598 v1.0
- April 30, 2012: Hyphopichia burtonii NRRL Y-1933 v1.0
- April 30, 2012: Metschnikowia bicuspidata NRRL YB-4993 v1.0
- April 19, 2012: Mixia osmundae IAM 14324 v1.0
- April 19, 2012: Conidiobolus coronatus NRRL28638 v1.0
Validation of Short Read ESTs and Genomic Assemblies
Motivation
The current JGI Annotation Pipeline is optimized for use with hybrid 454-Illumina genomic assemblies and with 454 ESTs and the contigs derived from them. The complete replacement of 454 sequencing by Illumina sequencing poses new challenges to current practice: Illumina reads are shorter than 454 reads and therefore more difficult to assemble into accurate genomic scaffolds or EST contigs, which may have significant downstream effects on annotation quality. At the same time, the much deeper EST coverage provided by Illumina sequencing poses computational challenges to some of the software currently standard in the JGI Annotation Pipeline. This project aims to assess the utility of Illumina-only ESTs, EST contigs, and genomic assemblies for whole-genome annotation. To perform these assessments, we are choosing and developing simple metrics relevant to gene prediction quality, performing systematic and controlled annotation experiments to measure the effects of substituting Illumina for 454 inputs, and developing, as needed, new protocols and programs to compensate for any deleterious effects.
Results
Assessment of ESTs and EST contigs is nearly complete. Changes to the annotation process in response to these assessments are being considered and implemented. Genomic assemblies are not yet available for assessment, but planning for such assessments has begun. Remaining problems include overclustering of EST contigs, which leads to chimaeric transcript models.