NASP

An accurate, rapid method for the identification of SNPs in WGS datasets that supports flexible input and output formats

Jason W. Sahl, Darrin Lemmer, Jason Travis, James M. Schupp, John D. Gillece, Maliha Aziz, Elizabeth M. Driebe, Kevin P. Drees, Nathan D. Hicks, Charles Hall Davis Williamson, Crystal M. Hepp, David Earl Smith, Chandler Roe, David M. Engelthaler, David M. Wagner, Paul Keim

Research output: Contribution to journal › Article

Abstract

Whole-genome sequencing (WGS) of bacterial isolates has become standard practice in many laboratories. Applications for WGS analysis include phylogeography and molecular epidemiology, using single nucleotide polymorphisms (SNPs) as the unit of evolution. NASP was developed as a reproducible method that scales well with the hundreds to thousands of WGS data typically used in comparative genomics applications. In this study, we demonstrate how NASP compares with other tools in the analysis of two real bacterial genomics datasets and one simulated dataset. Our results demonstrate that NASP produces similar, and often better, results in comparison with other pipelines, but is much more flexible in terms of data input types, job management systems, diversity of supported tools and output formats. We also demonstrate differences in results based on the choice of the reference genome and choice of inferring phylogenies from concatenated SNPs or alignments including monomorphic positions. NASP represents a source-available, version-controlled, unit-tested method and can be obtained from tgennorth.github.io/NASP.

Original language: English (US)
Pages (from-to): 1-12
Number of pages: 12
Journal: Microbial genomics
Volume: 2
Issue number: 8
DOIs: https://doi.org/10.1099/mgen.0.000074
State: Published - Aug 1 2016

Fingerprint

Genome
Genomics
Single Nucleotide Polymorphism
Phylogeography
Bacterial Genomes
Molecular Epidemiology
Phylogeny
Datasets

Keywords

  • Bioinformatics
  • Phylogeography
  • SNPs

ASJC Scopus subject areas

  • Genetics
  • Molecular Biology
  • Microbiology
  • Epidemiology

Cite this

NASP: An accurate, rapid method for the identification of SNPs in WGS datasets that supports flexible input and output formats. / Sahl, Jason W.; Lemmer, Darrin; Travis, Jason; Schupp, James M.; Gillece, John D.; Aziz, Maliha; Driebe, Elizabeth M.; Drees, Kevin P.; Hicks, Nathan D.; Williamson, Charles Hall Davis; Hepp, Crystal M.; Smith, David Earl; Roe, Chandler; Engelthaler, David M.; Wagner, David M.; Keim, Paul.

In: Microbial genomics, Vol. 2, No. 8, 01.08.2016, p. 1-12.

Sahl, JW, Lemmer, D, Travis, J, Schupp, JM, Gillece, JD, Aziz, M, Driebe, EM, Drees, KP, Hicks, ND, Williamson, CHD, Hepp, CM, Smith, DE, Roe, C, Engelthaler, DM, Wagner, DM & Keim, P 2016, 'NASP: An accurate, rapid method for the identification of SNPs in WGS datasets that supports flexible input and output formats', Microbial genomics, vol. 2, no. 8, pp. 1-12. https://doi.org/10.1099/mgen.0.000074

Sahl, Jason W.; Lemmer, Darrin; Travis, Jason; Schupp, James M.; Gillece, John D.; Aziz, Maliha; Driebe, Elizabeth M.; Drees, Kevin P.; Hicks, Nathan D.; Williamson, Charles Hall Davis; Hepp, Crystal M.; Smith, David Earl; Roe, Chandler; Engelthaler, David M.; Wagner, David M.; Keim, Paul. / NASP: An accurate, rapid method for the identification of SNPs in WGS datasets that supports flexible input and output formats. In: Microbial genomics. 2016; Vol. 2, No. 8. pp. 1-12.

@article{f43aa25df6c7424a9d782f43a75acefc,
title = "{NASP}: An accurate, rapid method for the identification of {SNPs} in {WGS} datasets that supports flexible input and output formats",
abstract = "Whole-genome sequencing (WGS) of bacterial isolates has become standard practice in many laboratories. Applications for WGS analysis include phylogeography and molecular epidemiology, using single nucleotide polymorphisms (SNPs) as the unit of evolution. NASP was developed as a reproducible method that scales well with the hundreds to thousands of WGS data typically used in comparative genomics applications. In this study, we demonstrate how NASP compares with other tools in the analysis of two real bacterial genomics datasets and one simulated dataset. Our results demonstrate that NASP produces similar, and often better, results in comparison with other pipelines, but is much more flexible in terms of data input types, job management systems, diversity of supported tools and output formats. We also demonstrate differences in results based on the choice of the reference genome and choice of inferring phylogenies from concatenated SNPs or alignments including monomorphic positions. NASP represents a source-available, version-controlled, unit-tested method and can be obtained from tgennorth.github.io/NASP.",
keywords = "Bioinformatics, Phylogeography, SNPs",
author = "Sahl, {Jason W.} and Darrin Lemmer and Jason Travis and Schupp, {James M.} and Gillece, {John D.} and Maliha Aziz and Driebe, {Elizabeth M.} and Drees, {Kevin P.} and Hicks, {Nathan D.} and Williamson, {Charles Hall Davis} and Hepp, {Crystal M.} and Smith, {David Earl} and Chandler Roe and Engelthaler, {David M.} and Wagner, {David M.} and Paul Keim",
year = "2016",
month = "8",
day = "1",
doi = "10.1099/mgen.0.000074",
language = "English (US)",
volume = "2",
pages = "1--12",
journal = "Microbial genomics",
issn = "2057-5858",
publisher = "Microbiology Society",
number = "8",

}

TY - JOUR

T1 - NASP

T2 - An accurate, rapid method for the identification of SNPs in WGS datasets that supports flexible input and output formats

AU - Sahl, Jason W.

AU - Lemmer, Darrin

AU - Travis, Jason

AU - Schupp, James M.

AU - Gillece, John D.

AU - Aziz, Maliha

AU - Driebe, Elizabeth M.

AU - Drees, Kevin P.

AU - Hicks, Nathan D.

AU - Williamson, Charles Hall Davis

AU - Hepp, Crystal M.

AU - Smith, David Earl

AU - Roe, Chandler

AU - Engelthaler, David M.

AU - Wagner, David M.

AU - Keim, Paul

PY - 2016/8/1

Y1 - 2016/8/1

N2 - Whole-genome sequencing (WGS) of bacterial isolates has become standard practice in many laboratories. Applications for WGS analysis include phylogeography and molecular epidemiology, using single nucleotide polymorphisms (SNPs) as the unit of evolution. NASP was developed as a reproducible method that scales well with the hundreds to thousands of WGS data typically used in comparative genomics applications. In this study, we demonstrate how NASP compares with other tools in the analysis of two real bacterial genomics datasets and one simulated dataset. Our results demonstrate that NASP produces similar, and often better, results in comparison with other pipelines, but is much more flexible in terms of data input types, job management systems, diversity of supported tools and output formats. We also demonstrate differences in results based on the choice of the reference genome and choice of inferring phylogenies from concatenated SNPs or alignments including monomorphic positions. NASP represents a source-available, version-controlled, unit-tested method and can be obtained from tgennorth.github.io/NASP.

KW - Bioinformatics

KW - Phylogeography

KW - SNPs

UR - http://www.scopus.com/inward/record.url?scp=85028623011&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85028623011&partnerID=8YFLogxK

U2 - 10.1099/mgen.0.000074

DO - 10.1099/mgen.0.000074

M3 - Article

VL - 2

SP - 1

EP - 12

JO - Microbial genomics

JF - Microbial genomics

SN - 2057-5858

IS - 8

ER -