Genome annotation using MAKER-P on the JetStream cloud
Description
The promise of genome research depends on our ability to accurately annotate sequence data and derive meaning from it. In plants such as maize, extreme genome sizes push the limits of the current algorithms, expertise, and computational power available to researchers. Furthermore, new sequencing technologies are driving research communities to move beyond reliance on a single reference genome to represent a species. As we head into the “pan-genome age”, researchers need access to reliable computational resources, standardized annotation workflows, and consistent evidence sets for annotating multiple reference strains. Here, we describe a freely available, robust, and reusable resource for automated annotation of maize genomes that uses the NSF JetStream cloud service. With this service, users can check out a virtual image pre-installed with the MAKER-P annotation engine and its underlying software, plus configuration files and curated evidence datasets (e.g. a repeat library and transcriptome data), for standardized maize-specific annotation. As a proof of concept, we demonstrate the utility of this resource by annotating three maize lines (B73, NC350, and W22), using the Message Passing Interface (MPI) and a Work Queue system (WQ-MAKER) for scalability.

This work is supported in part by NSF grants DBI-1265383 (for the CyVerse project), IOS-1445025 (for the MaizeCode project), ACI-1445604 (for the JetStream project), and OAC-1642409 (for the CCtools project).