The effects of signal erosion and core genome reduction on the identification of diagnostic markers

Jason W. Sahl, Adam J. Vazquez, Carina M. Hall, Joseph D. Busch, Apichai Tuanyok, Mark Mayo, James M. Schupp, Madeline Lummis, Talima R Pearson, Kenzie Shippy, Rebecca E. Colman, Christopher J. Allender, Vanessa Theobald, Derek S. Sarovich, Erin P. Price, Alex Hutcheson, Jonas Korlach, John J. LiPuma, Jason Ladner, Sean Lovett, Galina Koroleva, Gustavo Palacios, Direk Limmathurotsakul, Vanaporn Wuthiekanun, Gumphol Wongsuwan, Bart J. Currie, Paul S Keim, David M Wagner

Research output: Contribution to journal (Article)

13 Citations (Scopus)

Abstract

Whole-genome sequence (WGS) data are commonly used to design diagnostic targets for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the strict core genome that is specific to the target pathogen. As additional genomes are analyzed, the core genome size is reduced and there is erosion of the target-specific regions due to commonality with related species, potentially resulting in the identification of false positives and/or false negatives.

IMPORTANCE: A comparative analysis of 1,130 Burkholderia genomes identified unique markers for many named species, including the human pathogens B. pseudomallei and B. mallei. Due to core genome reduction and signature erosion, only 38 targets specific to B. pseudomallei/mallei were identified. By using only public genomes, a larger number of markers were identified, due to undersampling, and this larger number represents the potential for false positives. This analysis has implications for the design of diagnostics for other species where the genomic space of the target and/or closely related species is not well defined.
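The two effects named in the abstract can be illustrated with plain set operations. The following is a minimal sketch, not the authors' pipeline: each genome is modeled as a set of made-up region identifiers, the strict core is taken as the intersection across sampled target genomes, and a candidate marker is a core region absent from every sampled non-target genome. The function names, the toy data, and the sampling schedule are all assumptions made for illustration.

```python
from typing import List, Set


def strict_core(genomes: List[Set[str]]) -> Set[str]:
    """Regions present in every sampled genome (empty if none are sampled)."""
    return set.intersection(*genomes) if genomes else set()


def target_specific(targets: List[Set[str]], non_targets: List[Set[str]]) -> Set[str]:
    """Core regions of the target that are absent from every non-target genome."""
    seen_elsewhere: Set[str] = set().union(*non_targets)
    return strict_core(targets) - seen_elsewhere


# Hypothetical toy data: each genome is a set of made-up region IDs.
TARGETS = [{"r1", "r2", "r3", "r4"}, {"r1", "r2", "r3"}, {"r1", "r2", "r4"}]
NON_TARGETS = [{"r3", "n1"}, {"r2", "n2"}]


def marker_counts() -> List[int]:
    """Count candidate markers as the sampled database grows."""
    counts = []
    for n_targets, n_non_targets in [(1, 0), (2, 1), (3, 2)]:
        markers = target_specific(TARGETS[:n_targets], NON_TARGETS[:n_non_targets])
        counts.append(len(markers))
    return counts


if __name__ == "__main__":
    print(marker_counts())  # [4, 2, 1]
```

In this toy setup the smallest database yields four candidate markers, but three of them do not survive fuller sampling: r4 is missing from one target genome (a potential false negative), while r3 and r2 also occur in non-target genomes (potential false positives). Only r1 remains once all genomes are included.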

Original language: English (US)
Article number: e00846-16
Journal: mBio
Volume: 7
Issue number: 5
DOI: https://doi.org/10.1128/mBio.00846-16
State: Published - Sep 1 2016

Fingerprint

Genome
Malleus
Burkholderia
Genome Size
Genomics
Databases

ASJC Scopus subject areas

  • Microbiology
  • Virology

Cite this

Sahl, J. W., Vazquez, A. J., Hall, C. M., Busch, J. D., Tuanyok, A., Mayo, M., ... Wagner, D. M. (2016). The effects of signal erosion and core genome reduction on the identification of diagnostic markers. mBio, 7(5), [e00846-16]. https://doi.org/10.1128/mBio.00846-16

@article{02286f7f9cb249f0ab588cbdf3699578,
title = "The effects of signal erosion and core genome reduction on the identification of diagnostic markers",
abstract = "Whole-genome sequence (WGS) data are commonly used to design diagnostic targets for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the strict core genome that is specific to the target pathogen. As additional genomes are analyzed, the core genome size is reduced and there is erosion of the target-specific regions due to commonality with related species, potentially resulting in the identification of false positives and/or false negatives. IMPORTANCE A comparative analysis of 1,130 Burkholderia genomes identified unique markers for many named species, including the human pathogens B. pseudomallei and B. mallei. Due to core genome reduction and signature erosion, only 38 targets specific to B. pseudomallei/mallei were identified. By using only public genomes, a larger number of markers were identified, due to undersampling, and this larger number represents the potential for false positives. This analysis has implications for the design of diagnostics for other species where the genomic space of the target and/or closely related species is not well defined.",
author = "Sahl, {Jason W.} and Vazquez, {Adam J.} and Hall, {Carina M.} and Busch, {Joseph D.} and Apichai Tuanyok and Mark Mayo and Schupp, {James M.} and Madeline Lummis and Pearson, {Talima R} and Kenzie Shippy and Colman, {Rebecca E.} and Allender, {Christopher J.} and Vanessa Theobald and Sarovich, {Derek S.} and Price, {Erin P.} and Alex Hutcheson and Jonas Korlach and LiPuma, {John J.} and Jason Ladner and Sean Lovett and Galina Koroleva and Gustavo Palacios and Direk Limmathurotsakul and Vanaporn Wuthiekanun and Gumphol Wongsuwan and Currie, {Bart J.} and Keim, {Paul S} and Wagner, {David M}",
year = "2016",
month = "9",
day = "1",
doi = "10.1128/mBio.00846-16",
language = "English (US)",
volume = "7",
journal = "mBio",
issn = "2161-2129",
publisher = "American Society for Microbiology",
number = "5",
}

TY - JOUR
T1 - The effects of signal erosion and core genome reduction on the identification of diagnostic markers
AU - Sahl, Jason W.
AU - Vazquez, Adam J.
AU - Hall, Carina M.
AU - Busch, Joseph D.
AU - Tuanyok, Apichai
AU - Mayo, Mark
AU - Schupp, James M.
AU - Lummis, Madeline
AU - Pearson, Talima R
AU - Shippy, Kenzie
AU - Colman, Rebecca E.
AU - Allender, Christopher J.
AU - Theobald, Vanessa
AU - Sarovich, Derek S.
AU - Price, Erin P.
AU - Hutcheson, Alex
AU - Korlach, Jonas
AU - LiPuma, John J.
AU - Ladner, Jason
AU - Lovett, Sean
AU - Koroleva, Galina
AU - Palacios, Gustavo
AU - Limmathurotsakul, Direk
AU - Wuthiekanun, Vanaporn
AU - Wongsuwan, Gumphol
AU - Currie, Bart J.
AU - Keim, Paul S
AU - Wagner, David M
PY - 2016/9/1
Y1 - 2016/9/1

AB - Whole-genome sequence (WGS) data are commonly used to design diagnostic targets for the identification of bacterial pathogens. To do this effectively, genomics databases must be comprehensive to identify the strict core genome that is specific to the target pathogen. As additional genomes are analyzed, the core genome size is reduced and there is erosion of the target-specific regions due to commonality with related species, potentially resulting in the identification of false positives and/or false negatives. IMPORTANCE A comparative analysis of 1,130 Burkholderia genomes identified unique markers for many named species, including the human pathogens B. pseudomallei and B. mallei. Due to core genome reduction and signature erosion, only 38 targets specific to B. pseudomallei/mallei were identified. By using only public genomes, a larger number of markers were identified, due to undersampling, and this larger number represents the potential for false positives. This analysis has implications for the design of diagnostics for other species where the genomic space of the target and/or closely related species is not well defined.

UR - http://www.scopus.com/inward/record.url?scp=84994416001&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994416001&partnerID=8YFLogxK
U2 - 10.1128/mBio.00846-16
DO - 10.1128/mBio.00846-16
M3 - Article
C2 - 27651357
AN - SCOPUS:84994416001
VL - 7
JO - mBio
JF - mBio
SN - 2161-2129
IS - 5
M1 - e00846-16
ER -