The ViReflow pipeline enables user friendly large scale viral consensus genome reconstruction.

Niema Moshiri, Kathleen M Fisch, Amanda Birmingham, Peter DeHoff, Gene W Yeo, Kristen Jepsen, Louise C Laurent, Rob Knight

Scientific Reports · 2022 · Vol. 12 (1) · p. 5077

Abstract

Throughout the COVID-19 pandemic, massive sequencing and data sharing efforts enabled the real-time surveillance of novel SARS-CoV-2 strains throughout the world, the results of which provided public health officials with actionable information to prevent the spread of the virus. However, with great sequencing comes great computation, and while cloud computing platforms bring high-performance computing directly into the hands of all who seek it, optimal design and configuration of a cloud compute cluster requires significant system administration expertise. We developed ViReflow, a user-friendly viral consensus sequence reconstruction pipeline enabling rapid analysis of viral sequence datasets leveraging Amazon Web Services (AWS) cloud compute resources and the Reflow system. ViReflow was developed specifically in response to the COVID-19 pandemic, but it is general to any viral pathogen. Importantly, when utilized with sufficient compute resources, ViReflow can trim, map, call variants, and call consensus sequences from amplicon sequence data from 1000 SARS-CoV-2 samples at 1000X depth in < 10 min, with no user intervention. ViReflow's simplicity, flexibility, and scalability make it an ideal tool for viral molecular epidemiological efforts.
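The final step named in the abstract, calling a consensus sequence from mapped reads, can be illustrated with a toy sketch. This is not ViReflow's implementation, and the abstract does not say which tools or thresholds the pipeline uses. The function name `call_consensus`, the gapless `(start, sequence)` read representation, and the `min_depth` and `min_freq` parameters are all assumptions made for illustration.

```python
from collections import Counter


def call_consensus(ref_len, mapped_reads, min_depth=10, min_freq=0.5, ambig="N"):
    """Toy majority-vote consensus caller (hypothetical, for illustration only).

    mapped_reads: iterable of (start, sequence) pairs, where start is the
    0-based reference position of the first base and the alignment is
    assumed to be gapless (no insertions or deletions).
    """
    # One base counter per reference position.
    counts = [Counter() for _ in range(ref_len)]
    for start, seq in mapped_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_len:
                counts[pos][base] += 1

    consensus = []
    for column in counts:
        depth = sum(column.values())
        # Positions with too little coverage are masked as ambiguous.
        if depth < min_depth:
            consensus.append(ambig)
            continue
        base, n = column.most_common(1)[0]
        # Call the top base only if it is a strict majority above min_freq.
        consensus.append(base if n / depth > min_freq else ambig)
    return "".join(consensus)


if __name__ == "__main__":
    reads = [(0, "ACGT")] * 3 + [(0, "ACTT")]
    print(call_consensus(4, reads, min_depth=2))
```

Masking low-coverage positions rather than guessing a base is a common design choice in consensus calling, which is one reason sequencing depth (such as the 1000X figure in the abstract) matters for the result.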

Publication Types

Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't; Research Support, U.S. Gov't, P.H.S.

MeSH Terms

COVID-19; Genome, Viral; Humans; Pandemics; SARS-CoV-2; Software

Funding

S10 OD026929 NIH HHS (United States)

75D30120C09795 CDC HHS (United States)

UL1 TR001442 NCATS NIH HHS (United States)

2038509 National Science Foundation

2028040 National Science Foundation

Potentially Related Datasets (1)

These accessions were text-mined from the PMC full text. They may be referenced for comparison, cited from other studies, or otherwise mentioned without being primary data for this paper.

PRJEB47823 PRJEB BioProject