Complex Disease Application

  1. Pediatric high-risk acute lymphoblastic leukemia: In collaboration with the Children’s Oncology Group (COG), we are sequencing over 1,700 genes from the matched germline and leukemia DNA of 346 children enrolled in prospective COG treatment protocols. We are looking for genes enriched for rare or private variants, which would suggest a role for those genes in disease incidence or outcome. We are currently working to validate our initial findings, which show that abundant functional germline variation in certain genes related to energy and drug metabolism correlates with poor therapeutic outcomes.
  2. Infant acute leukemia: Infant acute leukemia is the worst of all pediatric leukemias. We are sequencing exomes from matched mothers and their infants who developed leukemia to look for congenital variation that may contribute to developing leukemia at such a young age.
  3. Longevity: In collaboration with our CGS colleagues, Michael Province and Ingrid Borecki, we are sequencing over 450 genes in nearly 5,000 people enrolled in the Long Life Family Study in an effort to identify genetic variants that protect individuals from the most common diseases and allow them to live to an extreme old age in good health.
  4. Age- and tissue-specific methylation: We have designed a genome-wide hybridization capture array to query the mouse methylome. We will compare six different organs from four newborn and four aged mice in an effort to characterize tissue-specific and age-specific methylation patterns.

Technology development:

  1. Pooled DNA sequencing: We report a targeted, cost-effective method to quantify rare single-nucleotide polymorphisms from pooled human genomic DNA using second-generation sequencing. We pooled DNA from 1,111 individuals and targeted four genes to identify rare germline variants. Our base-calling algorithm, SNPSeeker, derived from large deviation theory, detected single-nucleotide polymorphisms present at frequencies below the raw error rate of the sequencing platform.
  2. SPLINTER algorithm for pooled sequencing analysis: Pooled-DNA sequencing strategies enable fast, accurate, and cost-effective detection of rare variants, but current approaches cannot accurately identify short insertions and deletions (indels), despite their pivotal role in genetic disease. Furthermore, the sensitivity and specificity of these methods depend on arbitrary, user-selected significance thresholds, whose optimal values change from experiment to experiment. Here, we present a combined experimental and computational strategy that pairs a synthetically engineered DNA library inserted in each run with a new computational approach named SPLINTER, which detects and quantifies short indels and substitutions in large pools. SPLINTER integrates information from the synthetic library to select the optimal significance thresholds for every experiment.
  3. Characterizing somatic mosaicism from next-generation sequencing: Single DNA molecule labeling combined with hybridization capture for ultra-rare variant detection across a variety of loci in heterogeneous DNA populations.
  4. Pooled hybridization capture with individual indexing: A highly scalable methodology for pooled hybridization capture with indexed multiplexing of up to 48 individuals or non-indexed pooled sequencing of up to 92 individuals with as little as 70 ng of DNA per person.
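
The pooled DNA sequencing entry above notes that rare variants can be called even when their frequency is below the raw sequencing error rate. The sketch below illustrates the general idea with a standard Chernoff bound from large deviation theory; it is not the SNPSeeker algorithm itself. The function names, the per-position error rate, and the read depth are hypothetical values chosen for illustration, and the pool is assumed to be diploid, so a single heterozygous carrier among 1,111 individuals contributes 1 of 2,222 alleles (about 0.045%).

```python
import math


def kl_bernoulli(q, p):
    """KL divergence D(q || p) between Bernoulli(q) and Bernoulli(p), in nats."""
    def term(a, b):
        # By convention 0 * log(0 / b) = 0.
        return 0.0 if a == 0.0 else a * math.log(a / b)
    return term(q, p) + term(1.0 - q, 1.0 - p)


def chernoff_log_pvalue(variant_reads, depth, error_rate):
    """Upper bound on log P(X >= variant_reads) for X ~ Binomial(depth, error_rate).

    Chernoff bound: P(X >= k) <= exp(-depth * D(k/depth || error_rate))
    whenever k/depth exceeds error_rate. Otherwise the bound is trivial (log 1 = 0).
    """
    if depth <= 0 or not 0.0 < error_rate < 1.0:
        raise ValueError("depth must be positive and error_rate must lie in (0, 1)")
    q = variant_reads / depth
    if q <= error_rate:
        return 0.0
    return -depth * kl_bernoulli(q, error_rate)


def singleton_frequency(n_individuals, ploidy=2):
    """Allele frequency of a variant carried on one chromosome in the pool."""
    return 1.0 / (ploidy * n_individuals)


if __name__ == "__main__":
    # Hypothetical numbers: error rate for one specific substitution at one
    # position, and the read depth at that position.
    error_rate = 0.001
    depth = 200_000
    freq = singleton_frequency(1111)  # 1 / 2222
    # Reads expected to show the variant base: true carriers plus errors.
    expected_reads = round(depth * (freq + error_rate))
    print("singleton allele frequency:", freq)
    print("log p-value bound:", chernoff_log_pvalue(expected_reads, depth, error_rate))
```

The point of the sketch is that detection depends on the excess of variant reads over what the error model predicts, not on the variant frequency alone: at sufficient depth, a small excess yields a strongly negative log p-value bound even when the variant is rarer than the error rate.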
© 2021 by Washington University in St. Louis
One Brookings Drive, St. Louis, MO 63130