Nematostella vectensis v1.0

Unless otherwise noted, all files are in FASTA format and compressed with GNU zip (gzip). To uncompress files: on Mac use StuffIt; on PC use WinZip or WinRAR.
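The files can also be decompressed programmatically. The following is a minimal Python sketch using only the standard library; the function name gunzip_file is hypothetical, and it assumes the file has already been downloaded to a local path.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_file(src, dest=None):
    """Decompress a .gz file and return the path of the output file.

    gunzip_file is an illustrative helper, not part of any JGI tooling.
    """
    src = Path(src)
    # Default output name: the input name with its final ".gz" suffix removed,
    # e.g. "Nemve1.fasta.gz" -> "Nemve1.fasta".
    dest = Path(dest) if dest is not None else src.with_suffix("")
    with gzip.open(src, "rb") as fin, open(dest, "wb") as fout:
        # Stream in chunks so large scaffold files are not held in memory.
        shutil.copyfileobj(fin, fout)
    return dest
```

For example, gunzip_file("Nemve1.fasta.gz") would write Nemve1.fasta next to the compressed file.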

NOTE: Masked regions are represented with lowercase characters; gaps in the assembly are represented with Ns.
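To illustrate the convention in the note above, here is a rough Python sketch that tallies masked (lowercase) bases and gap (N) bases per sequence. The function names are hypothetical, and the sketch assumes plain FASTA with ">" header lines; it reads the gzip file directly, so no prior decompression is needed.

```python
import gzip


def count_masked_and_gaps(handle):
    """Return {record_id: (length, masked_bases, gap_bases)} for a FASTA text handle."""
    stats = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            header = line[1:].split()
            name = header[0] if header else ""
            stats[name] = (0, 0, 0)
        elif name is not None:
            length, masked, gaps = stats[name]
            # Lowercase characters are masked regions; uppercase "N" marks a gap.
            masked += sum(1 for c in line if c.islower())
            gaps += line.count("N")
            stats[name] = (length + len(line), masked, gaps)
    return stats


def count_in_gz(path):
    """Open a gzip-compressed FASTA file in text mode and count per sequence."""
    with gzip.open(path, "rt") as handle:
        return count_masked_and_gaps(handle)
```

Calling count_in_gz("Nemve1.allmasked.gz") on the masked scaffold file would return one (length, masked, gaps) tuple per scaffold. Note that a lowercase "n" is counted here as masked rather than as a gap, which is an assumption of this sketch.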

Assembly:

Assembled scaffolds (unmasked): Nemve1.fasta.gz

Assembled scaffolds (masked): Nemve1.allmasked.gz

Tandem repeats: NvTRjug.fasta.gz *

* These are tandemly repeated sequences in the Nematostella genome that have been reconstructed using the juggernaut program (Chapman, unpublished). They constitute roughly 30% of the sequenced genome.

Annotation:

"Filtered Models" is the filtered set of models representing the best gene model for each locus.

"GenBankSubmission" is the mapping of JGI protein IDs to GenBank accession numbers.

Filtered Models ("best")

Proteins: proteins.Nemve1FilteredModels1.fasta.gz

Transcripts: transcripts.Nemve1FilteredModels1.fasta.gz

Genes: Nemve1.FilteredModels1.gff.gz

Functional Annotations

GO annotations: Nvectensis.goinfo_FilteredModels1.tab.gz

KOG annotations: Nvectensis.koginfo_FilteredModels1.tab.gz

InterPro domains: Nvectensis.domaininfo_FilteredModels1.tab.gz

All Models, Filtered and Not

Proteins: proteins.Nemve1AllModels.fasta.gz

Transcripts: transcripts.Nemve1AllModels.fasta.gz

Genes: Nemve1.AllModels.gff.gz

GenBankSubmission

JGI_NCBI_id_mapping: N.vectensis_ABAV.modified.scflds.p2g.gz

ESTs

Clustered ESTs: Nemve_JGIestCL.fasta.gz

ESTs: Nemve_JGIest.fasta.gz