Phylogenetically typing bacterial strains from partial SNP genotypes observed from direct sequencing of clinical specimen metagenomic data

Jason W. Sahl, James M. Schupp, David A. Rasko, Rebecca E. Colman, Jeffrey T. Foster, Paul S. Keim

Research output: Contribution to journal › Article

16 Citations (Scopus)

Abstract

We describe an approach for genotyping bacterial strains from low coverage genome datasets, including metagenomic data from complex samples. Sequence reads from unknown samples are aligned to a reference genome where the allele states of known SNPs are determined. The Whole Genome Focused Array SNP Typing (WG-FAST) pipeline can identify unknown strains with much less read data than is needed for genome assembly. To test WG-FAST, we resampled SNPs from real samples to understand the relationship between low coverage metagenomic data and accurate phylogenetic placement. WG-FAST can be downloaded from https://github.com/jasonsahl/wgfast.
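The core idea in the abstract (determine allele states at known SNP positions from whatever reads align, then place the sample using only the positions that were actually observed) can be illustrated with a minimal sketch. This is not the WG-FAST implementation: the reference panel, the pileup structure, and the function names below are hypothetical, and a simple mismatch fraction over shared positions stands in for true phylogenetic placement.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

# Hypothetical reference panel: strain name -> allele at each known SNP position.
REFERENCE_PANEL: Dict[str, Dict[int, str]] = {
    "strain_A": {101: "A", 250: "G", 512: "T", 733: "C"},
    "strain_B": {101: "A", 250: "T", 512: "T", 733: "G"},
    "strain_C": {101: "C", 250: "T", 512: "G", 733: "G"},
}


def call_partial_genotype(
    pileup: Dict[int, List[str]], snp_positions: List[int], min_depth: int = 1
) -> Dict[int, str]:
    """Call the most frequent base at each known SNP position that has coverage.

    `pileup` maps a reference position to the bases observed in aligned reads.
    Positions with fewer than `min_depth` observations are left uncalled,
    which is what makes the genotype partial.
    """
    genotype: Dict[int, str] = {}
    for pos in snp_positions:
        bases = pileup.get(pos, [])
        if len(bases) < min_depth:
            continue  # no usable coverage at this SNP
        genotype[pos] = Counter(bases).most_common(1)[0][0]
    return genotype


def closest_strain(
    genotype: Dict[int, str], panel: Dict[str, Dict[int, str]]
) -> Optional[Tuple[str, int, int]]:
    """Return (strain, mismatches, shared_positions) for the best-matching strain.

    Strains are ranked by mismatch fraction over the positions called in the
    sample; ties are broken in favour of more shared positions. This is an
    assumed stand-in for phylogenetic placement, not the published method.
    """
    best: Optional[Tuple[str, int, int]] = None
    best_key: Optional[Tuple[float, int]] = None
    for name, alleles in panel.items():
        shared = [pos for pos in genotype if pos in alleles]
        if not shared:
            continue  # nothing to compare against this strain
        mismatches = sum(genotype[pos] != alleles[pos] for pos in shared)
        key = (mismatches / len(shared), -len(shared))
        if best_key is None or key < best_key:
            best_key = key
            best = (name, mismatches, len(shared))
    return best


# Hypothetical low-coverage sample: position 250 has no reads at all.
EXAMPLE_PILEUP: Dict[int, List[str]] = {
    101: ["A", "A"],
    512: ["T"],
    733: ["G", "G", "C"],
}
EXAMPLE_GENOTYPE = call_partial_genotype(EXAMPLE_PILEUP, [101, 250, 512, 733])
EXAMPLE_MATCH = closest_strain(EXAMPLE_GENOTYPE, REFERENCE_PANEL)
```

In this toy example only three of the four known SNPs are covered, yet the sample still matches `strain_B` on every called position. Raising `min_depth` trades the number of called SNPs against confidence in each call, which is the kind of relationship between coverage and placement accuracy that the resampling test in the abstract examines.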

Original language: English (US)
Article number: 52
Journal: Genome Medicine
Volume: 7
Issue number: 1
DOI: 10.1186/s13073-015-0176-9
State: Published - Jun 9 2015

Fingerprint

Bacterial Typing Techniques
Metagenomics
Single Nucleotide Polymorphism
Genotype
Genome
Alleles

ASJC Scopus subject areas

  • Genetics(clinical)
  • Genetics
  • Molecular Biology
  • Molecular Medicine

Cite this

Phylogenetically typing bacterial strains from partial SNP genotypes observed from direct sequencing of clinical specimen metagenomic data. / Sahl, Jason W.; Schupp, James M.; Rasko, David A.; Colman, Rebecca E.; Foster, Jeffrey T.; Keim, Paul S.

In: Genome Medicine, Vol. 7, No. 1, 52, 09.06.2015.

@article{1c37ca76ab654490bf98920c4316653e,
title = "Phylogenetically typing bacterial strains from partial SNP genotypes observed from direct sequencing of clinical specimen metagenomic data",
abstract = "We describe an approach for genotyping bacterial strains from low coverage genome datasets, including metagenomic data from complex samples. Sequence reads from unknown samples are aligned to a reference genome where the allele states of known SNPs are determined. The Whole Genome Focused Array SNP Typing (WG-FAST) pipeline can identify unknown strains with much less read data than is needed for genome assembly. To test WG-FAST, we resampled SNPs from real samples to understand the relationship between low coverage metagenomic data and accurate phylogenetic placement. WG-FAST can be downloaded from https://github.com/jasonsahl/wgfast.",
author = "Sahl, {Jason W.} and Schupp, {James M.} and Rasko, {David A.} and Colman, {Rebecca E.} and Foster, {Jeffrey T.} and Keim, {Paul S.}",
year = "2015",
month = "6",
day = "9",
doi = "10.1186/s13073-015-0176-9",
language = "English (US)",
volume = "7",
journal = "Genome Medicine",
issn = "1756-994X",
publisher = "BioMed Central",
number = "1",

}

TY - JOUR

T1 - Phylogenetically typing bacterial strains from partial SNP genotypes observed from direct sequencing of clinical specimen metagenomic data

AU - Sahl, Jason W.

AU - Schupp, James M.

AU - Rasko, David A.

AU - Colman, Rebecca E.

AU - Foster, Jeffrey T.

AU - Keim, Paul S.

PY - 2015/6/9

Y1 - 2015/6/9

N2 - We describe an approach for genotyping bacterial strains from low coverage genome datasets, including metagenomic data from complex samples. Sequence reads from unknown samples are aligned to a reference genome where the allele states of known SNPs are determined. The Whole Genome Focused Array SNP Typing (WG-FAST) pipeline can identify unknown strains with much less read data than is needed for genome assembly. To test WG-FAST, we resampled SNPs from real samples to understand the relationship between low coverage metagenomic data and accurate phylogenetic placement. WG-FAST can be downloaded from https://github.com/jasonsahl/wgfast.

UR - http://www.scopus.com/inward/record.url?scp=84934275888&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84934275888&partnerID=8YFLogxK

U2 - 10.1186/s13073-015-0176-9

DO - 10.1186/s13073-015-0176-9

M3 - Article

VL - 7

JO - Genome Medicine

JF - Genome Medicine

SN - 1756-994X

IS - 1

M1 - 52

ER -