Ascertainment bias and the pattern of nucleotide diversity at the human ALDH2 locus in a Japanese population

Benjamin T. Brown, August Woerner, Jason A. Wilder

Research output: Contribution to journal (Article)

3 Citations (Scopus)

Abstract

Many East Asian human populations harbor a high-frequency deficiency allele for the aldehyde dehydrogenase 2 (ALDH2) enzyme, a critical protein involved in the metabolism of ethanol. Here we use resequencing and long-range SNP haplotype data from a Japanese sample to test whether patterns of nucleotide diversity and linkage disequilibrium at this locus are compatible with a standard neutral model of evolution. Examination of the pattern of polymorphism at a locus such as this, where the frequency of a common allele is known a priori, introduces an ascertainment bias that must be corrected for in analyses of the frequency spectrum of polymorphisms. We apply a flexible and generally applicable simulation approach to correct for this bias in our ALDH2 data and, also, to explore the effect of bias on the commonly used summary statistics Tajima's D, Fu and Li's D, and Fay and Wu's H. Our study finds no evidence that the pattern of genetic variation at ALDH2 differs from that expected under a standard neutral model. However, our general examination of ascertainment bias indicates that a priori knowledge of segregating alleles greatly affects the expected distributions of summary statistics. Under many parameter combinations we find that ascertainment bias introduces an elevated rate of false positives when summary statistics are used to test for deviations from a standard neutral model. However, we also show that over a wide range of conditions the power of all summary statistics can be greatly increased by incorporating prior knowledge of segregating alleles.
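As a rough illustration of one of the summary statistics named in the abstract, the sketch below computes Tajima's D from per-site derived-allele counts using the standard published formula. The helper `contains_known_allele` is hypothetical: it shows one way simulated neutral replicates could be filtered so that the null distribution is conditioned on an allele already known to segregate at a given count. That filtering step is an assumption made for illustration and is not taken from the paper's methods.

```python
import math


def tajimas_d(derived_counts, n):
    """Tajima's D for a sample of n chromosomes.

    derived_counts holds, for each site, how many of the n chromosomes
    carry the derived allele. Returns NaN when D is undefined.
    """
    # Only segregating sites (0 < count < n) contribute.
    counts = [k for k in derived_counts if 0 < k < n]
    s = len(counts)
    if s == 0 or n < 2:
        return float("nan")

    # pi: mean number of pairwise differences, summed over sites.
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in counts)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 1e-12:
        # The variance vanishes for very small samples (n = 2 or 3).
        return float("nan")
    return (pi - s / a1) / math.sqrt(variance)


def contains_known_allele(derived_counts, known_count):
    """Hypothetical filter: keep a simulated replicate only if some site
    segregates at the count of the allele known a priori."""
    return known_count in derived_counts
```

Under this assumed setup, a null distribution would be built by calling `tajimas_d` only on simulated replicates for which `contains_known_allele` returns `True`, and the observed value would be compared against that conditioned distribution rather than the unconditioned one.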

Original language: English (US)
Pages (from-to): 375-385
Number of pages: 11
Journal: Journal of Molecular Evolution
Volume: 64
Issue number: 3
DOI: 10.1007/s00239-006-0149-0
State: Published - Mar 2007
Externally published: Yes

Keywords

  • Aldehyde dehydrogenase deficiency
  • ALDH2
  • Ascertainment bias
  • Natural selection
  • Nucleotide diversity

ASJC Scopus subject areas

  • Genetics
  • Biochemistry
  • Biochemistry, Genetics and Molecular Biology(all)
  • Genetics(clinical)
  • Ecology, Evolution, Behavior and Systematics
  • Molecular Biology
  • Agricultural and Biological Sciences(all)
  • Agricultural and Biological Sciences (miscellaneous)

Cite this

Ascertainment bias and the pattern of nucleotide diversity at the human ALDH2 locus in a Japanese population. / Brown, Benjamin T.; Woerner, August; Wilder, Jason A.

In: Journal of Molecular Evolution, Vol. 64, No. 3, 03.2007, p. 375-385.

@article{11c231830e1d49f5b1c520729a388f41,
title = "Ascertainment bias and the pattern of nucleotide diversity at the human ALDH2 locus in a Japanese population",
abstract = "Many East Asian human populations harbor a high-frequency deficiency allele for the aldehyde dehydrogenase 2 (ALDH2) enzyme, a critical protein involved in the metabolism of ethanol. Here we use resequencing and long-range SNP haplotype data from a Japanese sample to test whether patterns of nucleotide diversity and linkage disequilibrium at this locus are compatible with a standard neutral model of evolution. Examination of the pattern of polymorphism at a locus such as this, where the frequency of a common allele is known a priori, introduces an ascertainment bias that must be corrected for in analyses of the frequency spectrum of polymorphisms. We apply a flexible and generally applicable simulation approach to correct for this bias in our ALDH2 data and, also, to explore the effect of bias on the commonly used summary statistics Tajima's D, Fu and Li's D, and Fay and Wu's H. Our study finds no evidence that the pattern of genetic variation at ALDH2 differs from that expected under a standard neutral model. However, our general examination of ascertainment bias indicates that a priori knowledge of segregating alleles greatly affects the expected distributions of summary statistics. Under many parameter combinations we find that ascertainment bias introduces an elevated rate of false positives when summary statistics are used to test for deviations from a standard neutral model. However, we also show that over a wide range of conditions the power of all summary statistics can be greatly increased by incorporating prior knowledge of segregating alleles.",
keywords = "Aldehyde dehydrogenase deficiency, ALDH2, Ascertainment bias, Natural selection, Nucleotide diversity",
author = "Brown, {Benjamin T.} and Woerner, August and Wilder, {Jason A.}",
year = "2007",
month = "3",
doi = "10.1007/s00239-006-0149-0",
language = "English (US)",
volume = "64",
pages = "375--385",
journal = "Journal of Molecular Evolution",
issn = "0022-2844",
publisher = "Springer New York",
number = "3",
}

TY - JOUR

T1 - Ascertainment bias and the pattern of nucleotide diversity at the human ALDH2 locus in a Japanese population

AU - Brown, Benjamin T.

AU - Woerner, August

AU - Wilder, Jason A.

PY - 2007/3

Y1 - 2007/3

N2 - Many East Asian human populations harbor a high-frequency deficiency allele for the aldehyde dehydrogenase 2 (ALDH2) enzyme, a critical protein involved in the metabolism of ethanol. Here we use resequencing and long-range SNP haplotype data from a Japanese sample to test whether patterns of nucleotide diversity and linkage disequilibrium at this locus are compatible with a standard neutral model of evolution. Examination of the pattern of polymorphism at a locus such as this, where the frequency of a common allele is known a priori, introduces an ascertainment bias that must be corrected for in analyses of the frequency spectrum of polymorphisms. We apply a flexible and generally applicable simulation approach to correct for this bias in our ALDH2 data and, also, to explore the effect of bias on the commonly used summary statistics Tajima's D, Fu and Li's D, and Fay and Wu's H. Our study finds no evidence that the pattern of genetic variation at ALDH2 differs from that expected under a standard neutral model. However, our general examination of ascertainment bias indicates that a priori knowledge of segregating alleles greatly affects the expected distributions of summary statistics. Under many parameter combinations we find that ascertainment bias introduces an elevated rate of false positives when summary statistics are used to test for deviations from a standard neutral model. However, we also show that over a wide range of conditions the power of all summary statistics can be greatly increased by incorporating prior knowledge of segregating alleles.

KW - Aldehyde dehydrogenase deficiency

KW - ALDH2

KW - Ascertainment bias

KW - Natural selection

KW - Nucleotide diversity

UR - http://www.scopus.com/inward/record.url?scp=33847641387&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33847641387&partnerID=8YFLogxK

U2 - 10.1007/s00239-006-0149-0

DO - 10.1007/s00239-006-0149-0

M3 - Article

VL - 64

SP - 375

EP - 385

JO - Journal of Molecular Evolution

JF - Journal of Molecular Evolution

SN - 0022-2844

IS - 3

ER -