Validation of Short Read ESTs and Genomic Assemblies
Motivation
The current JGI Annotation Pipeline is optimized for hybrid 454-Illumina genomic assemblies and for 454 ESTs and the contigs derived from them. The complete replacement of 454 sequencing by Illumina sequencing poses new challenges to current practice: Illumina reads are shorter than 454 reads and therefore more difficult to assemble into accurate genomic scaffolds or EST contigs, which may have significant downstream effects on annotation quality. At the same time, the much deeper EST coverage that Illumina sequencing provides (relative to 454) poses computational challenges to some of the software currently standard in the JGI Annotation Pipeline. This project aims to assess the utility of Illumina-only ESTs, EST contigs, and genomic assemblies for whole-genome annotation. To perform these assessments, we are choosing and developing simple metrics relevant to gene prediction quality, performing systematic and controlled annotation experiments to measure the effects of substituting Illumina for 454 inputs, and developing, as needed, new protocols and programs to compensate for any deleterious effects.
Results
Assessment of ESTs and EST contigs is nearly complete. Changes to the annotation process in response to these assessments are being considered and implemented. Genomic assemblies are not yet available for assessment, but planning for such assessments has begun. Remaining problems include overclustering of EST contigs, which leads to chimaeric transcript models.