Analyzing Data from Next Generation Sequencers Using the PowerSeq® Auto/Mito/Y System

Seth A. Faith and Melissa Scheible

NC State University
Forensic Sciences Institute
1060 William Moore Dr.
Raleigh, NC 27607
919-513-8099
safaith@ncsu.edu

Publication Date: 2016
PiD-2016-NGS

Introduction

Next-generation sequencing (NGS) is a powerful technology that can provide forensic DNA profiles compatible with current databases and deliver additional levels of genetic data that may open new doors for investigations, mixture interpretation, missing persons cases and more. The earliest published reports on the use of NGS for analysis of forensic markers date only to 2011–12 (1) (2) (3). Since then, more than 60 peer-reviewed studies using NGS laboratory methods and/or custom software tools have been reported for forensic examination of short tandem repeats (STRs), mitochondrial DNA and single nucleotide polymorphisms (SNPs). A full list can be viewed online at: www.genomicidlab.com/opendata

Promega Corporation developed the PowerSeq® Auto System, an autosomal STR kit optimized for downstream NGS that amplifies 23 loci, including the full CODIS extended STR panel. Developmental validation studies with the PowerSeq® Auto System have demonstrated backwards compatibility with existing STR databases and extremely high sequence diversity in some CODIS core loci (4) (5). These sequence data may be used to augment STR analysis of samples for which capillary electrophoresis results are inconclusive (e.g., degraded samples and mixtures) or for which additional information could assist interpretation and reporting (e.g., familial searching and kinship). The PowerSeq® Auto System has undergone further development to include Y-chromosome STRs and the control region of the mitochondrial genome. This multiplex forensic NGS kit, the PowerSeq® Auto/Mito/Y System, features small amplicons (129–303bp), high sensitivity (~100pg DNA) and data for three forensic panels (22 autosomal STRs plus amelogenin, 23 Y-STRs and 10 amplicons covering the mitochondrial control region). Herein we report on the evaluation of the PowerSeq® Auto/Mito/Y System for analysis of reference samples.

Materials and Methods

Laboratory
One-half nanogram of single-source genomic DNA from each reference sample, Standard Reference Material SRM2391c (National Institute of Standards and Technology) and 2800M Control DNA (Promega Corporation), was amplified using the PowerSeq® Auto/Mito/Y System according to the manufacturer’s protocol. Five hundred nanograms of column-purified amplification product were used to construct Illumina sequencing libraries with the KAPA Hyper Prep Kit (Kapa Biosystems) and barcoded adapters. Individual libraries were quantified using the KAPA Library Quantification Kit (Kapa Biosystems), pooled without normalization and diluted to 4nM. Pooled libraries were sequenced (300bp single-end) on an Illumina MiSeq (NC State University Genomic Sciences Laboratory) using the MiSeq Reagent Kit V2 (Illumina). Raw data (FASTQ) were generated for each indexed sample and may be downloaded at: www.genomicidlab.com/opendata
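The dilution of the pooled library to 4nM is ordinary C1·V1 = C2·V2 arithmetic. The sketch below is illustrative only: the function names, the example concentrations and the volumes are assumptions, not values from this study. The mass-to-molarity helper uses the common approximation of 660 g/mol per base pair of double-stranded DNA and is only relevant when a mass-based reading (ng/µL) is available; a qPCR-based quantification typically reports molarity directly.

```python
from typing import Tuple

# Approximate average mass of one base pair of double-stranded DNA (g/mol).
BP_MASS_G_PER_MOL = 660.0


def library_molarity_nM(conc_ng_per_ul: float, mean_fragment_bp: float) -> float:
    """Convert a dsDNA mass concentration (ng/uL) to molarity (nM)."""
    return conc_ng_per_ul * 1e6 / (BP_MASS_G_PER_MOL * mean_fragment_bp)


def dilution_volumes(stock_nM: float, target_nM: float, final_ul: float) -> Tuple[float, float]:
    """Return (stock volume, diluent volume) in uL, using C1*V1 = C2*V2."""
    if stock_nM < target_nM:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_nM * final_ul / stock_nM
    return stock_ul, final_ul - stock_ul


# Hypothetical pool at 20 nM, diluted to 4 nM in a final volume of 100 uL.
stock_ul, diluent_ul = dilution_volumes(stock_nM=20.0, target_nM=4.0, final_ul=100.0)
print(f"Mix {stock_ul:.1f} uL pool with {diluent_ul:.1f} uL diluent")
```
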

Data Analysis
FASTQ files were trimmed of adapters and low-quality ends using the single-end module of Trimmomatic v0.33 (6) with the following arguments: phred33, SLIDINGWINDOW:4:15, MINLEN:40. Autosomal and Y-chromosome STR data were analyzed using a modified version of STRaitRazor 2.0 (7) in Altius, the authors’ custom forensic cloud environment (Amazon Web Services). Fragment allele data for autosomal STRs were used to calculate random match probabilities (RMPs) with the Federal Bureau of Investigation extended-set allele frequencies (8) according to the NRC II guidelines (9). Mitochondrial data were analyzed with CLCBio Genomics Workbench v8.0.2 (Qiagen) using published parameters (10). Variants relative to the revised Cambridge Reference Sequence (rCRS) (11) (12) within the control region were identified using the software’s variant caller and manually reviewed. Mitochondrial haplotype frequencies and haplogroup estimates were generated using EMPOP3 (13) (n=26,127), and haplogroup estimates were confirmed using Phylotree Build 16 (14).
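To make the trimming arguments concrete, the following is a simplified sketch of what SLIDINGWINDOW:4:15 followed by MINLEN:40 means for a single read with Phred+33 quality encoding. It is not Trimmomatic's code and may differ from it in edge cases (for example, how the cut point inside a failing window is chosen); the function names are hypothetical and adapter clipping is not shown.

```python
def phred33_scores(quality_string):
    """Decode a FASTQ quality string (Phred+33) into integer scores."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_cut(qualities, window=4, min_mean=15):
    """Return how many leading bases to keep.

    Scans from the 5' end and cuts at the start of the first window whose
    mean quality drops below min_mean. Reads shorter than the window are
    left untouched here, which is a simplification.
    """
    for start in range(len(qualities) - window + 1):
        if sum(qualities[start:start + window]) / window < min_mean:
            return start
    return len(qualities)


def trim_read(sequence, quality_string, window=4, min_mean=15, min_len=40):
    """Apply the window cut, then drop the read if it is shorter than min_len."""
    keep = sliding_window_cut(phred33_scores(quality_string), window, min_mean)
    if keep < min_len:
        return None  # read discarded, as MINLEN would do
    return sequence[:keep], quality_string[:keep]


# Toy read: 50 high-quality bases (Q40) followed by 10 low-quality bases (Q2).
result = trim_read("A" * 60, "I" * 50 + "#" * 10)
print(len(result[0]))
```

In this toy read the cut falls inside the low-quality tail rather than exactly at its first base, because the window mean only drops below 15 once most of the window is low quality.
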

Results

NGS Data
FASTQ data from two separate sequencing runs demonstrated that each sample provided high-quality sequence information (92.58 ± 0.1% of reads passed filter), and less than 1% of the sequence reads were removed by quality filtering. Data sets of 250,000 reads were analyzed using Altius, and each data file was processed in less than 2 minutes. For each sample, 44–56% of the reads were identified as matching autosomal or Y-STRs, and the remaining reads aligned to the control region of the mitochondrial genome. Interlocus balance, heterozygous intralocus balance and sensitivity for STRs were similar to previously reported values (5). Thus, the PowerSeq® Auto/Mito/Y System reproducibly generated high-quality data for reference samples when using NGS workflows.

STRs
For the four reference samples evaluated, full autosomal STR profiles were produced for 22 loci (Table 1). The fragment sizes and sequences were concordant with the capillary electrophoresis data reported in the SRM2391c Certificate of Analysis (NIST), and fragment sizes were concordant with the 2800M product literature (Promega). For sample SRM2391c Component A, the D2S441 marker was homozygous by fragment size but heterozygous by sequence (Tables 1 and 2). In addition, within this cohort, four autosomal STR loci (D2S441, D8S1179, D12S391 and vWA) showed alleles of identical length (fragment size) that could be distinguished as separate alleles by sequence (Table 2).
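The distinction between a length-based and a sequence-based call can be sketched as follows. This is a toy illustration, not the STRaitRazor or Altius logic: the function name is hypothetical, the two sequences are invented rather than observed D2S441 alleles, and the input is assumed to be the two called alleles of a single-source sample after noise and stutter filtering.

```python
from collections import defaultdict


def zygosity_by_length_and_sequence(allele_sequences):
    """Return (call by length, call by sequence) for one locus.

    Alleles of equal length collapse into a single fragment-size allele,
    while distinct sequences of that length remain separate sequence alleles.
    """
    by_length = defaultdict(set)
    for seq in allele_sequences:
        by_length[len(seq)].add(seq)

    n_length_alleles = len(by_length)
    n_sequence_alleles = sum(len(seqs) for seqs in by_length.values())

    def label(n):
        return "homozygous" if n == 1 else "heterozygous"

    return label(n_length_alleles), label(n_sequence_alleles)


# Two invented 40-base alleles that differ by a single base in one repeat.
toy_alleles = ["TCTA" * 10, "TCTA" * 8 + "TCTG" + "TCTA"]
print(zygosity_by_length_and_sequence(toy_alleles))
```
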

Table 1. Summary of PowerSeq Auto/Mito/Y System for Four Reference Samples.

Table 2. Shared Fragment Alleles with Sequence Differences.

The RMP was calculated for all 22 autosomal STRs based on fragment size and provided discrimination ranging from 1E-29 to 1E-37, depending on population group. Sequence allele frequency data are still in development, so the discriminatory power is expected to increase once the RMP can be calculated using sequence allele frequencies from appropriate databases. This increase in power may be especially valuable when analyzing samples with partial profiles or interpreting mixtures with likelihood ratios.

For the male samples (SRM2391c Component B, SRM2391c Component C and 2800M Control DNA), full Y-STR profiles for 23 loci were produced for both size and sequence. The fragment size and sequence data were concordant with the certified haplotypes in the SRM2391c Certificate of Analysis (NIST). Thus, the PowerSeq® Auto/Mito/Y System may be used to generate full fragment and sequence profiles for autosomal and Y-STRs in casework applications such as familial searching.
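As a rough illustration of how a profile-wide RMP of this magnitude arises, the sketch below multiplies per-locus genotype frequencies using the NRC II formulas: 2pq for heterozygotes and p² + p(1 − p)θ for homozygotes. The allele frequencies and the θ value of 0.01 are placeholders chosen for the example; they are not the FBI frequencies or the settings used in this study, and the function names are hypothetical.

```python
import math


def genotype_frequency(p, q=None, theta=0.01):
    """Expected genotype frequency at one locus.

    Heterozygote (q given): 2pq.
    Homozygote (q omitted): p^2 + p(1 - p) * theta.
    """
    if q is None:
        return p * p + p * (1.0 - p) * theta
    return 2.0 * p * q


def random_match_probability(profile, theta=0.01):
    """Multiply per-locus genotype frequencies, assuming independent loci.

    profile is a list of (p, q) tuples; use q=None for a homozygous locus.
    """
    rmp = 1.0
    for p, q in profile:
        rmp *= genotype_frequency(p, q, theta)
    return rmp


# Placeholder two-locus profile: one heterozygous locus, one homozygous locus.
toy_profile = [(0.10, 0.20), (0.10, None)]
rmp = random_match_probability(toy_profile)
print(f"RMP = {rmp:.3e}, log10 = {math.log10(rmp):.2f}")
```

With 22 loci each contributing a factor on the order of 0.01 to 0.1, the product falls quickly to the very small values reported above; more finely resolved sequence alleles would generally have lower frequencies and so reduce the product further.
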

Mitochondrial DNA
To date, mitochondrial DNA data have been underutilized by most forensic laboratories. Here we show that the PowerSeq® Auto/Mito/Y System concurrently generates high-resolution, complete coverage of the mitochondrial control region (Figure 1). Four distinct mitochondrial haplotypes were observed in our reference samples (Table 1). The mitochondrial control region haplotypes were concordant with Sanger sequencing-based analysis of the samples (data not shown). Further, haplogroup estimates and population frequencies could be determined from the data, providing an additional level of information from the reference samples.

Figure 1. Coverage across the mitochondrial control region (positions 16024–576) for sample SRM2391c Component A.
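The range 16024–576 appears to run backwards because the mitochondrial genome is circular and the control region spans the origin of the rCRS numbering (the rCRS is 16,569 bases long). The sketch below shows one way to map rCRS positions onto a continuous control-region axis, for example when ordering per-position coverage for a plot. The function names are hypothetical and this is not the method used by the analysis software in this study.

```python
RCRS_LENGTH = 16569  # length of the revised Cambridge Reference Sequence
CR_START = 16024     # first control-region position
CR_END = 576         # last control-region position, after wrapping past 16569


def control_region_length():
    """Bases from 16024 to 16569 plus bases from 1 to 576."""
    return (RCRS_LENGTH - CR_START + 1) + CR_END


def control_region_offset(position):
    """Map a 1-based rCRS position to a 0-based offset along the control region."""
    if CR_START <= position <= RCRS_LENGTH:
        return position - CR_START
    if 1 <= position <= CR_END:
        return (RCRS_LENGTH - CR_START + 1) + (position - 1)
    raise ValueError(f"position {position} is outside the control region")


def ordered_coverage(depth_by_position):
    """Sort {rCRS position: read depth} into control-region order."""
    return [depth for _, depth in
            sorted(depth_by_position.items(),
                   key=lambda item: control_region_offset(item[0]))]


print(control_region_length())  # 546 + 576 = 1122
```
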

Forensic laboratories not currently generating mitochondrial DNA data may benefit from the additional information generated with the PowerSeq® Auto/Mito/Y System. The data gained might prove useful for samples with low-quantity or fragmented DNA, as well as for cases involving closed or small populations, extended kinship and missing persons identification.

Discussion

The PowerSeq® Auto/Mito/Y System is a powerful new system that can be added to the forensic DNA analysis toolkit to help meet the needs of complex sample analysis. The ability to simultaneously analyze autosomal and Y-STRs along with mitochondrial data from one sample will likely add value for many forensic casework, databasing and missing persons laboratories. However, for the potential of this technology to be fully realized, several needs must be addressed. First, forensic sequence databases must be developed to statistically interpret profiles. We are currently addressing this need through a sequence population database to be presented to the criminal justice community in late 2016 (NIJ Award 2015-DN-BX-K062). Second, laboratory methods need to be streamlined, optimized and validated in simple, low-cost workflows. Third, the software and analysis tools need to be fully developed and validated for the forensic laboratory. Lastly, NGS standards and guidelines need to be defined by governing bodies to allow implementation in accredited laboratory systems. Once these items are addressed, a system such as the PowerSeq® Auto/Mito/Y System can be used routinely by crime laboratories to maximize DNA data output from challenging forensic samples.

Funding/Acknowledgments

Portions of this work were funded by the Kenan Collaboratory Fund through the Kenan Institute for Ethics at Duke University and the NC State University Chancellor's Faculty Excellence Program in Forensic Sciences.

References

  1. Holland, M.M. et al. (2011) Second generation sequencing allows for mtDNA mixture deconvolution and high resolution detection of heteroplasmy. Croatian Medical Journal 52, 299–313.
  2. Bornman, D.M. et al. (2012) Short-read, high-throughput sequencing technology for STR genotyping. Biotech Rapid Dispatches 2012, 1–6.
  3. Gymrek, M. et al. (2012) lobSTR: A short tandem repeat profiler for personal genomes. Genome Research 22, 1154–1162.
  4. Gettings, K.B. et al. (2016) Sequence variation of 22 autosomal STR loci detected by next generation sequencing. Forensic Science International: Genetics 21, 15–21.
  5. Zeng, X. et al. (2015) An evaluation of the PowerSeq Auto System: A multiplex short tandem repeat marker kit compatible with massively parallel sequencing. Forensic Science International: Genetics 19, 172–179.
  6. Bolger, A.M. et al. (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–2120.
  7. Warshauer, D.H. et al. (2015) STRait Razor v2.0: the improved STR Allele Identification Tool–Razor. Forensic Science International: Genetics 14, 182–186.
  8. Federal Bureau of Investigation (2015) Notice of release of the 2015 FBI Population Data for the expanded CODIS core STR loci. 1–11.
  9. National Research Council (1996) The evaluation of forensic DNA evidence.
  10. Parson, W. et al. (2015) Massively parallel sequencing of complete mitochondrial genomes from hair shaft samples. Forensic Science International: Genetics 15, 8–15.
  11. Anderson, S. et al. (1981) Sequence and organization of the human mitochondrial genome. Nature 290, 457–465.
  12. Andrews, R.M. et al. (1999) Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nature Genetics 23, 147.
  13. Parson, W. and A. Dur (2007) EMPOP–a forensic mtDNA database. Forensic Science International: Genetics 1, 88–92.
  14. van Oven, M. and M. Kayser (2009) Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Human Mutation 30, E386–394.

How to Cite This Article

Faith, S.A. and Scheible, M. Analyzing Data from Next Generation Sequencers Using the PowerSeq® Auto/Mito/Y System. [Internet] 2016. [cited: year, month, date]. Available from: http://www.promega.com/resources/profiles-in-dna/2016/analyzing-data-from-next-generation-sequencers-using-the-powerseq-automitoy-system/

Faith, S.A. and Scheible, M. Analyzing Data from Next Generation Sequencers Using the PowerSeq® Auto/Mito/Y System. Promega Corporation Web site. http://www.promega.com/resources/profiles-in-dna/2016/analyzing-data-from-next-generation-sequencers-using-the-powerseq-automitoy-system/ Updated 2016. Accessed Month Day, Year.

PowerSeq is a registered trademark of Promega Corporation.
