CombEST: annotation of fungal genomes using Illumina EST data
Motivation
Gene modeling from RNA-Seq ESTs struggles to handle very large numbers of short ESTs. Ab initio EST assemblies, built with genome assemblers such as Velvet and then mapped to genomic sequences to build gene models, suffer from fragmentation, chimerism, and misinterpretation of alternative splicing. CombEST is a new approach based on genome-based EST assembly that offers better performance, higher-quality gene models, and simpler, parallelizable computations than ab initio methods.
Results
The CombEST algorithm consists of three steps. In the first step, ESTs are mapped to genome sequences using one of the public alignment tools GMAP, TopHat, or BLAT. In the second step, these alignments are sorted by genomic location and grouped into congregations (sets of overlapping alignments), which are then assembled into gene models. In the final step, chimeric gene models are detected and split using base coverage profiles. In addition, fragmented models predicted by CombEST can be improved by combining them with gene models predicted by other methods.
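The second and third steps can be illustrated with a minimal Python sketch. It is not the CombEST implementation (which is written in C++), and the details are assumptions made for illustration: the `Alignment` record, the function names, half-open coordinates, and the `min_fraction` threshold are hypothetical. Alignments are also treated as ungapped intervals, ignoring introns. The sketch sorts alignments by genomic location, groups overlapping ones into congregations, computes a per-base coverage profile, and reports low-coverage runs as candidate positions for splitting a chimeric model.

```python
from typing import List, NamedTuple, Tuple


class Alignment(NamedTuple):
    """One EST-to-genome alignment (hypothetical record, half-open coordinates)."""
    contig: str
    start: int  # inclusive
    end: int    # exclusive


def group_into_congregations(alignments: List[Alignment]) -> List[List[Alignment]]:
    """Sort alignments by genomic location and group the overlapping ones."""
    groups: List[List[Alignment]] = []
    current: List[Alignment] = []
    current_contig = None
    current_end = -1
    for aln in sorted(alignments, key=lambda a: (a.contig, a.start, a.end)):
        overlaps = (
            bool(current)
            and aln.contig == current_contig
            and aln.start < current_end
        )
        if overlaps:
            current.append(aln)
            current_end = max(current_end, aln.end)
        else:
            if current:
                groups.append(current)
            current = [aln]
            current_contig = aln.contig
            current_end = aln.end
    if current:
        groups.append(current)
    return groups


def coverage_profile(group: List[Alignment]) -> Tuple[int, List[int]]:
    """Return the group's leftmost position and its per-base EST depth."""
    lo = min(a.start for a in group)
    hi = max(a.end for a in group)
    depth = [0] * (hi - lo)
    for a in group:
        for pos in range(a.start, a.end):
            depth[pos - lo] += 1
    return lo, depth


def low_coverage_runs(
    lo: int, depth: List[int], min_fraction: float = 0.2
) -> List[Tuple[int, int]]:
    """Return genomic (start, end) runs where depth < min_fraction * peak depth.

    Assumed heuristic: such runs mark candidate split points for a chimera.
    """
    threshold = min_fraction * max(depth)
    runs: List[Tuple[int, int]] = []
    run_start = None
    for i, d in enumerate(depth):
        if d < threshold:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            runs.append((lo + run_start, lo + i))
            run_start = None
    if run_start is not None:
        runs.append((lo + run_start, lo + len(depth)))
    return runs
```

Because each congregation depends only on its own alignments, the groups returned by `group_into_congregations` could be processed independently, which is in line with the parallelizable computation described in the motivation.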
The algorithm is implemented in object-oriented C++ with a focus on performance and modularity. Tested on a single CPU on 10+ genomes with variable EST coverage, CombEST demonstrated a 1,000- to 10,000-fold speed-up compared to PASA and good quality of predicted gene models.