A Comprehensive Analysis of Short Tandem Repeat Polymorphisms: Detection Methods, Population Genetics, NRC II, and Proficiency Tests

 

Kevin McElfresh1, Deborah DiPerro1, Susie DelRio1, Amy Hayden1, Dawn Jarvis1, Monroe Chinsee2, and Martin Tracey3
1The Bode Technology Group, Inc., Sterling, Virginia
2Metro Dade Criminal Lab, Miami, Florida
3Department of Biological Sciences, Florida International University, University Park, Miami, FL

INTRODUCTION

The Laboratories at the Bode Technology Group, Inc., have been actively engaged in the validation and use of short tandem repeat polymorphisms (STRP) since 1993. Initially, the STRPs were analyzed using monoplex PCR reactions and silver staining. Over time, the analysis process has been continually upgraded. Currently, we are using two multiplex reactions of 4 loci each, with fluorescent detection. The 8 STRP loci in current use are: HUMCSF1PO (CSF), HUMTPOX (TPOX), HUMTH01 (TH01), HUMvWA1 (vWA), HUMF13A01 (F13A), HUMFESFPS (FESFPS), HUMF13B (F13B), and HUMLPL (LPL). CSF, TPOX, TH01, and vWA comprise the CTTv quadriplex and F13A01, FESFPS, F13B, and LPL comprise the FFFL quadriplex (both from Promega). An additional four loci have recently been added to the routine laboratory repertoire: D16S539, D7S820, D13S317, and D5S818. These 4 loci, along with the CTTv loci, comprise the Promega PowerPlex™. Currently, there are approximately 2,000 samples in our databases encompassing 4 main ethnic groups: Caucasians, Blacks, Hispanics, and Asians.

Validation studies have included both the molecular and population aspects of STRP technology. Molecular validation studies included: the examination of the best polyacrylamide gel length, effective methods for the isolation of DNA from various substrates for subsequent STRP analysis, and a complete analysis of the PCR conditions needed for optimal results. Analysis of population genetic parameters associated with the calculation of genotype probabilities included standard Hardy-Weinberg tests on the individual allele frequencies as well as comparison of the independence of loci. Specific attention was paid to the population genetic parameters set forth by the National Research Council’s second report on the use of DNA in forensics (NRCII), specifically the use of theta (Ø) as a measure of population substructure effects on genotype probabilities.

METHODS

DNA Isolation

DNA was isolated from whole blood using a commercial kit (Wizard Kit™ from Promega). The protocol provided by Promega was modified slightly; most of the changes were to centrifuge times and speeds. First, the cell lysate mixture was centrifuged for 3 minutes at 10,000 rcf to separate the white blood cells. Second, the centrifugation step to precipitate the protein was run at 10,000 rcf for 6 minutes. Third, the centrifugation step after the addition of isopropanol and the final centrifugation step were both changed to 5 minutes at 12,000 rcf in a refrigerated centrifuge. Lastly, the DNA was rehydrated in 50 µl Tris-HCl, pH 8.0, at 56°C for 1 hour.

PCR

DNA amplification was accomplished using the Promega GenePrint™ Fluorescent STR Systems. The two multiplex systems employed were: CSF1PO, TPOX, TH01, vWA (CTTv) and F13A01, FESFPS, F13B, LPL (FFFL). A 25 µl reaction was used with the Perkin Elmer PCR 9600. However, the mineral oil and aluminum foil suggested in the protocol were not used. Typically between 20 and 100 ng of DNA was added per reaction along with 22.50 µl of master mix. The master mix was composed of 17.30 µl sterile, deionized, distilled water, 2.50 µl STR 10x Buffer, 2.50 µl multiplex 10x primer pair, and 1 unit of Taq DNA Polymerase (5 u/µl). The thermal cycling conditions were as follows: 96°C for 2 minutes; followed by 10 cycles of [ramp 50 seconds to 94°C, hold for 1 minute, ramp 34 seconds to 60°C, hold for 1 minute, ramp 25 seconds to 70°C, and hold for 1.5 minutes]; then 20 cycles of [ramp 45 seconds to 90°C, hold for 1 minute, ramp 30 seconds to 60°C, hold for 1 minute, ramp 25 seconds to 70°C, and hold for 1.5 minutes]; this was followed by a 4°C soak.
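The master mix volumes above can be checked and scaled with a few lines of arithmetic. The following is a minimal sketch, not part of the laboratory protocol: the Taq volume (0.20 µl) is derived from 1 unit at 5 u/µl, the 2.5 µl template volume is inferred as the remainder of the 25 µl reaction, and the `master_mix` helper and its `overage` parameter are hypothetical names introduced for illustration.

```python
# Per-reaction master mix volumes in µl, taken from the text.
# The Taq volume is derived: 1 unit at 5 u/µl = 0.20 µl.
PER_REACTION_UL = {
    "water": 17.30,
    "STR 10x buffer": 2.50,
    "multiplex 10x primer pair": 2.50,
    "Taq DNA polymerase": 1 / 5,
}

REACTION_VOLUME_UL = 25.0  # total reaction volume stated in the text


def master_mix(n_reactions, overage=0.0):
    """Scale each component for n reactions.

    `overage` is a hypothetical fractional allowance for pipetting loss
    (e.g. 0.1 for 10% extra); it is not mentioned in the text.
    """
    scale = n_reactions * (1.0 + overage)
    return {name: volume * scale for name, volume in PER_REACTION_UL.items()}


# The components should sum to the 22.50 µl of master mix quoted per reaction.
per_reaction_total = sum(PER_REACTION_UL.values())

# Assumption: the remainder of the 25 µl reaction is the DNA template volume.
template_volume = REACTION_VOLUME_UL - per_reaction_total

if __name__ == "__main__":
    print(f"master mix per reaction: {per_reaction_total:.2f} µl")
    print(f"inferred template volume: {template_volume:.2f} µl")
    for name, volume in master_mix(10).items():
        print(f"{name}: {volume:.2f} µl for 10 reactions")
```

The per-component volumes sum to the 22.50 µl stated in the text, which confirms that the listed recipe is internally consistent.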

Silver Staining

The gel plates were treated, and all solutions and developing times were used, as stated in the silver staining protocol of the Promega GenePrint™ STR Systems Technical Manual.

Acrylamide Analytical Gels

For analysis on the FMBIO®, optically clear borosilicate glass SA-32 (Hitachi) plates were used. The plates (one short and one long) were cleaned with deionized distilled water and thoroughly dried with a Kimwipe. The glass plates were assembled according to the Promega GenePrint™ Fluorescent STR Systems Technical Manual. However, SA-32 casting boots were used to hold the glass sandwich in place rather than clamps. GEL-MIX 6 (Life Technologies) was used and 450 µl of 10% ammonium persulfate was added, then mixed gently. The gel was poured according to the directions in the manual and the plates positioned horizontally. The gel polymerized for approximately one hour. If the gel was made the day before, it was stored overnight in the refrigerator with a paper towel saturated in deionized distilled water placed around the top of the gel with the entire gel wrapped in plastic wrap.

 

Gel Electrophoresis, Analysis and Fluorescent Detection


Multiplex PCR products were analyzed by gel electrophoresis. Gels were positioned in the SA-32 gel box according to the technical manual and allowed to pre-run (using 1x TBE) for 20-40 minutes at a constant 30 watts. For each sample lane that was loaded, 1.8 µl bromophenol blue loading solution and 3.0 µl of the sample were combined. The ladder was diluted with an equal volume of 1x STR Buffer and likewise combined with 1.8 µl bromophenol blue loading solution. Prior to loading the gel, the samples were denatured at 95°C for two minutes and immediately placed on ice. The gel was run under the same conditions as the pre-run (FFFL quadriplex for 1 hour and 30 minutes, CTTv quadriplex for 1 hour and 50 minutes). After electrophoresis, the gels were removed from the apparatus, and both sides of the plate sandwich were cleaned with deionized distilled water before the sandwich was placed into the scanner.

RESULTS

Silver vs. Fluorescence

Initially, STRPs were detected using silver staining. However, silver staining has several limitations, not the least of which is the generation of hazardous wastes. The other serious limitation of silver staining is the post-gel staining time required to actually visualize the results. After using silver staining systems for over a year, it became clear that fluorescent detection and multiplexing would be necessary to meet the high throughput requirements of our laboratories.

Two methods of fluorescent detection, distinguished by the instrument needed to visualize the results, were compared to the data obtained using silver stain. One method used an ABI 373A and the other used a Hitachi FMBIO® 100. Silver stain data were obtained by amplifying the genomic DNA samples using the Promega monoplex kit for silver staining (e.g. F13B) and running the amplified DNA out on 6% polyacrylamide gels using multiple loads per gel. Each load was accompanied by an allelic ladder as the size marker. The same genomic DNA samples were then amplified using Perkin Elmer ABI monoplex reagents, with Rox 500 as the sizing marker. The F13B data for the Hitachi FMBIO® 100 were generated using the Promega FFFL Quadriplex and sized using the allelic ladder provided with the Promega Quadriplex kit and the allele calling software on the FMBIO® 100.

The data in Table 1 for the F13B allele 10, with a known size of 185 base pairs, show that there is no difference between silver stain and the Hitachi FMBIO® 100 in the size of the bands obtained. The ABI 373A data appear to differ by approximately 1.5 base pairs, probably because the sizing was done without an allelic ladder. Since the ABI fluorescent STR reagents were not yet all available, and since the ABI sizings differ, although explainably, the Promega Quadriplex STR multiplexes were chosen for continued validation of fluorescent detection methods.

Allelic (Band) Resolution

Having chosen the fluorescent Promega Quadriplexes based on the ease and accuracy of detection, the next issue to validate was the resolving capability of the overall gel/detection system. For illustration purposes, the F13A01 locus was used to examine resolving power (Fig. 1). The data show that no allele sizes overlap each other, i.e. all alleles are clearly resolved. The alleles of the other loci within the CTTv and FFFL quadriplexes are also clearly resolved (data not shown).

The length of the polyacrylamide gel used to size fractionate the DNA bands will affect the resolving power of the system. Conventionally, the longer the gel, or the further the DNA is run through the resolving matrix, the better the resolution will be. We tested a prototype extension unit for the Life Technologies SA32 gel box that allowed a 43 cm gel to be run instead of the standard 32 cm gel. The data in Figure 1 are based on the standard 32 cm gel run; resolution of the higher molecular weight bands appeared to be better on the 43 cm gel run. Comparison of the 307 base pair band gave 307.24 ± 0.5 bp for the 32 cm gel and 307.05 ± 0.26 bp (n=5) for the 43 cm gel. The small sample size prevents any real conclusion concerning significance; however, these data are consistent with the anecdotal evaluation of the longer gels. The extension unit for the SA32 is not yet available; therefore, all gels are being run on standard 32 cm gels.

Population Genetics


In any validation study, all of the population statistical data must be carefully analyzed before it is considered ready for application. This has been done for the data in our laboratory. However, the recent National Research Council Report on The Evaluation of Forensic DNA Evidence (NRC II) has made great strides to address a number of population genetic issues. It is one of the NRC II recommendations that we wish to address here; specifically the issue of Ø, or coefficient of inbreeding, and its impact on the actual numbers generated by an STRP genotype.

The NRC II report recommends that, for any genetic system in which discrete alleles can be identified unequivocally, the frequency of a homozygote be calculated as p² + p(1-p)Ø rather than p², with 2pq used for heterozygotes, and that a conservative value for Ø is 0.01. For this study we addressed the following: How does a Ø value of 0.01 compare with reality, and what effect does including Ø in the calculations really have on the outcome of a genotype probability that would be presented in court?

The data in Table 2 clearly show that the value of Ø is less than 0.01. There is also a good deal of variability in the Ø values generated from our databases, with the largest value generated for the Black/Hispanic comparison (0.00885) and the smallest between Caucasian population samples (0.00037). Also worth noting is a comparison of North American Blacks with Haitian Blacks (Haitian data was provided by Cecelia Crouse of West Palm Beach, FL) where Ø = 0.0063, which is less than the Black/Hispanic calculation. What is clear from this data is that the Ø value of 0.01 is conservative for virtually all calculations made in most cases from forensic labs of North America.

To gain insight into the actual numerical effect of using Ø in the calculation of the frequency of a homozygote pattern, we chose the 10 allele of the FESFPS locus, which had a frequency of 0.273 in our Hispanic database. Using the standard Hardy-Weinberg Equilibrium (HWE) calculation, i.e. p², the expected frequency of a 10/10 homozygote for FESFPS would be 0.074, or 1 in 13.5. Using the NRC II recommended calculation of p² + p(1-p)Ø and the Hispanic/Hispanic value for Ø of 0.00576 (from Table 2), the new homozygote frequency becomes 1 in 13.3. Substituting the NRC II recommended value of 0.01 for Ø, the frequency of the FESFPS 10/10 homozygote becomes 1 in 13.1. Clearly the difference between the non-Ø frequency (1 in 13.5) and the Ø frequency (1 in 13.1) is negligible, as one would predict based on the social mores against inbreeding and the cosmopolitan nature of North American societies. Given the negligible, yet conservative, effect on the calculated homozygote frequencies, we have adopted the recommended Ø value of 0.01 for standard practice within our laboratories, using our databases.
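The homozygote calculation can be reproduced directly. The sketch below is illustrative only: the function name `homozygote_frequency` is hypothetical, and the inputs are the allele frequency (0.273) and Ø values quoted in the text. Because it carries full precision rather than the rounded p² of 0.074, it gives roughly 1 in 13.4, 1 in 13.2, and 1 in 13.1 for the three cases, slightly different from the rounded figures in the text but leading to the same conclusion.

```python
def homozygote_frequency(p, theta=0.0):
    """Homozygote frequency p^2 + p(1 - p)*theta (the NRC II form quoted in the text).

    With theta = 0 this reduces to the Hardy-Weinberg value p^2.
    """
    return p * p + p * (1.0 - p) * theta


# Frequency of the FESFPS 10 allele in the Hispanic database, from the text.
P_FESFPS_10 = 0.273

# Theta (written Ø in the text) values discussed in the text.
CASES = [
    ("HWE, no theta", 0.0),
    ("Hispanic/Hispanic theta", 0.00576),
    ("NRC II conservative theta", 0.01),
]

if __name__ == "__main__":
    for label, theta in CASES:
        freq = homozygote_frequency(P_FESFPS_10, theta)
        print(f"{label}: frequency {freq:.4f}, about 1 in {1.0 / freq:.1f}")
```

Since p(1-p)Ø is always non-negative, including Ø can only raise the homozygote frequency, which is why the correction is conservative from the defendant's point of view.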

Ultimately, the measure of any DNA based system of identity testing depends on the power of the genetic loci to actually exclude a wrongfully accused individual. During the validation of the STRPs in use in our lab, we of course calculated the power of exclusion (pe) for the complete set of loci for each population. As expected, the pe was greater than 99.9% for each population. In fact, within our databases, we have not seen a 5 locus STR genotype that had a random match (n > 1800). Within our databases, the most common 8 locus STR genotype has a frequency of 1 in 669,700, while the rarest genotype has a frequency of 1 in 2.2 × 10^41, or 1 in 2.2 sagans (1 sagan = billions and billions, therefore 669,700 = 66 µsagans).

Validation and Proficiency Testing

One interesting aspect of our validation studies was the examination of the STRPs within a family. Specifically, we examined the inheritance patterns in a large extended Amish family that included maternal and paternal grandparents, parents and 12 children. All of the STRPs exhibited Mendelian inheritance patterns and there was sufficient polymorphism to clearly identify each of the individuals.

The final phase of implementation of the STRP technology was to proficiency test the laboratory using STRPs. This was accomplished in standard formats by handing out unknown samples to the staff and checking the results against the known results in the databases. In addition, we participate in the Cellmark IQAS proficiency testing program. However, we have also had what clearly meets the definition of a blind proficiency test. During the final phase of data collection for entry into our databases, each genotype from every new sample is compared against all the genotypes in the database. On three separate occasions, one of the organizations that has submitted samples to us did not realize that samples had been drawn on the same individual more than once. Consequently, the same individual had his blood drawn twice, was given two different submission numbers and the samples were sent to us for analysis as two different individuals. In one instance, almost a year had elapsed between the first draw and the second draw. When the routine scan of the database was performed to identify duplicates, the sample re-submissions were identified. For the purposes of confirming the duplicate genotypes of what were represented to be different samples, mtDNA from each sample was sequenced for the HV1 and HV2 regions. Again, the samples were identical. Given the high probability (> 1.0 sagans) that the two samples had to be duplicates, the submitting organization was notified of the suspected duplication and after they checked the records, it was clear that duplicate samples had indeed been submitted. In each and every proficiency test, STRPs have performed flawlessly.
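The routine duplicate scan described above can be sketched as a grouping of samples by their complete multi-locus genotype. This is a minimal illustration under stated assumptions, not the laboratory's actual database software: the function names, the dictionary layout, and the sample identifiers and genotypes are all made up for the example.

```python
from collections import defaultdict


def genotype_key(profile):
    """Build an order-independent key from a profile.

    `profile` maps locus name -> pair of alleles. Alleles within a locus are
    sorted so that (6, 9.3) and (9.3, 6) compare equal.
    """
    return tuple(
        sorted((locus, tuple(sorted(alleles))) for locus, alleles in profile.items())
    )


def find_duplicates(database):
    """Return lists of sample IDs that share an identical multi-locus genotype."""
    groups = defaultdict(list)
    for sample_id, profile in database.items():
        groups[genotype_key(profile)].append(sample_id)
    return [sorted(ids) for ids in groups.values() if len(ids) > 1]


# Made-up example data: two loci only, hypothetical submission numbers.
EXAMPLE_DB = {
    "S001": {"TH01": (6, 9.3), "TPOX": (8, 11)},
    "S002": {"TH01": (7, 9), "TPOX": (8, 8)},
    "S003": {"TH01": (9.3, 6), "TPOX": (11, 8)},  # same person as S001, resubmitted
}

if __name__ == "__main__":
    for ids in find_duplicates(EXAMPLE_DB):
        print("possible duplicate submission:", ", ".join(ids))
```

Hashing each genotype once keeps the scan linear in the number of samples, rather than comparing every new sample against every stored genotype pair by pair.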

We have also used STRPs in casework very successfully. One interesting case involved the receipt of a hairbrush that clearly had been a personal favorite of the owner and just as clearly had never been cleaned in its long and arduous life. The owner of the hairbrush was missing, presumed dead, and no body had been found. We were asked to isolate DNA for future reference and were able to type the DNA from single hairs, with roots, at all STRP loci, as well as sequence the mtDNA for the HV1 and HV2 regions of the D-loop. The case is still pending.

DISCUSSION

Fundamental to any DNA-based identity test is the amount of polymorphism that a specific test system has at its disposal for the purposes of discriminating between individuals. Historically, RFLPs have been the benchmark by which all testing systems have been judged, given their very high degree of polymorphism. However, the high degree of polymorphism of RFLPs provided its own set of statistical problems, based on the fact that the alleles of an RFLP were not discretely identifiable. DQA and Polymarker, the first PCR based DNA tests, were marginally polymorphic relative to RFLPs, but they had discrete alleles that could, under general circumstances, be clearly defined. STRPs clearly split the difference, in that while they are not as polymorphic as RFLPs, they are far more polymorphic than DQA and Polymarker, and they have discrete alleles.

Given that the more polymorphic a locus is, the more informative it is for identity testing, we have begun to investigate micro variation at the STR loci. Properly quantified, micro variation would add to the total amount of polymorphism and therefore increase the utility of STR loci for identity testing. Micro variation is operationally defined as alleles that are not exact multiples of the basic repeat motif, or that are sequence variants of the repeat motif, or both. For example, the repeat motif of the F13A01 locus is a tetranucleotide of the sequence AAAG. An allele with the motif AAG would be considered a micro variant, as would AATG. The classic example of this type of variation is the 9.3 allele of the TH01 locus, which is a 1 bp deletion of a TH01 10 allele.

We have begun to examine each locus by graphing the actual sizing of each allele analyzed; Fig. 2 is the graph for the locus F13A01. What we have begun looking for are alleles that have been called as a given repeat, but size at the limit of the call window. Arguably, given a 1 bp resolution capability, a 2 bp difference, which would fall between alleles, would in all likelihood indicate a micro variant, provided that the inheritance and molecular basis for that designation can be conclusively established. Looking at Fig. 2, there are alleles from 297 bp to 333 bp, and the "whole" alleles between 313 bp and 325 bp are: 313, 317, 321, and 325 bp. However, there are putative micro variants that size at 315 and 319 bp, between the "whole" classes. Our research is focused on characterizing these alleles and hopefully adding them to the list of known alleles, therefore increasing the polymorphism of the STR loci, and consequently increasing their utility in DNA identity testing.
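The screening idea can be illustrated with a small classifier that compares each measured size against the "whole" allele sizes and flags anything falling outside a call window. This is a simplified sketch rather than the laboratory's allele calling software: the ±0.5 bp window is an assumption chosen for illustration, only the four "whole" sizes quoted in the text are used, and the function and constant names are hypothetical.

```python
# "Whole" F13A01 allele sizes between 313 and 325 bp, as listed in the text.
WHOLE_ALLELES_BP = [313, 317, 321, 325]

# Assumed call window (± bp) around each whole allele; not a value from the text.
CALL_WINDOW_BP = 0.5


def classify_size(size_bp, whole_alleles=WHOLE_ALLELES_BP, window=CALL_WINDOW_BP):
    """Classify a measured fragment size.

    Returns (label, nearest_whole_allele, offset_bp), where label is "whole"
    if the size falls within the call window of a whole allele and
    "putative micro variant" otherwise.
    """
    nearest = min(whole_alleles, key=lambda allele: abs(allele - size_bp))
    offset = size_bp - nearest
    label = "whole" if abs(offset) <= window else "putative micro variant"
    return label, nearest, offset


if __name__ == "__main__":
    # Illustrative sizes only; 315 and 319 are the putative micro variants in the text.
    for size in (313.1, 315.0, 317.2, 319.0, 324.8):
        label, nearest, offset = classify_size(size)
        print(f"{size:.1f} bp: {label} (nearest whole allele {nearest}, offset {offset:+.1f} bp)")
```

A flag from such a screen is only a starting point; as the text notes, a micro variant designation still requires that the inheritance and molecular basis be established.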

CONCLUSION

STR-based DNA identification systems have all the advantages of PCR-based tests. One of the more subtle advantages of STRs, but probably the biggest, is that any given STR locus is independent of manufacturer or test methodology, so long as the same repeats are assayed. This is unlike the situation in RFLP tests, where the restriction enzyme dictates the loci that can be used and data from the same locus, assayed using different restriction enzymes, cannot be reconciled. The basis for the STR advantage is that the alleles are discrete and the genotype is based on the number of uniquely identifiable repeats. Therefore, once a given locus has been validated using a specific standard, inter-laboratory comparisons of data can be made, even if different labs use different manufacturers’ reagents.

We have found short tandem repeat polymorphisms to be valid, reliable DNA identification systems that are statistically robust and technically simple to use.

 

Table 1. Band Size Comparison

F13B - 10 Repeat (185 bp)

Silver           ABI 373A         FMBIO
184.92 ± 0.22    186.46 ± 0.04    184.92 ± 0.25

Comparison of the F13B 10 repeat allele sizes (in bp) obtained using silver staining and two methods of fluorescent detection. See text for details.

Table 2. Comparison of Ø Values

Population    Ø Value    Population    Ø Value
C/B/H         0.00579    C/C           0.00037
C/B           0.00642    B/B           0.00211
C/H           0.00106    H/H           0.00546
B/H           0.00885    B/B*          0.0063

Theta values calculated using data for the STR locus FESFPS.

C = Caucasian, B = Black, H = Hispanic, and B* = Haitian Blacks.

 

 

Figure 1. Allele Sizes for the F13A01 Locus

 

Figure 2. Band Resolution at the F13A01 Locus

