Presenting DNA Statistics in Court

B.S. Weir
Program in Statistical Genetics, Department of Statistics
North Carolina State University, Raleigh, NC 27695-8203


INTRODUCTION
INTERPRETATION OF THE EVIDENCE
CONDITIONAL FREQUENCIES
TYPING ERRORS
PROFILE FREQUENCIES
SINGLE-LOCUS FREQUENCIES
INDEPENDENCE WITHIN LOCI
MULTI-LOCUS FREQUENCIES
CONFIDENCE INTERVALS
POPULATION STRUCTURE
MIXED STAINS
CONCLUSION
REFERENCES
TABLES

INTRODUCTION

On October 5, 1994, when the defense sought to bar the admission of DNA evidence in the Simpson case (Superior Court of the State of California for the County of Los Angeles, Case No. BA097211) many of the objections centered on the use of statistics: "The statistical estimates being offered for Cellmark's RFLP, polymarker, and DQ Alpha tests, DOJ's D1S80 and DQ Alpha tests, and LAPD's DQ Alpha tests should not be admitted because the statistical methods used by the laboratories are not generally accepted as reliable." The Motion to Exclude DNA Evidence made specific mention of:

  1. "The general acceptance of the methods used to determine the probability of a coincidental match for each test."
  2. "The general acceptance of the methods used to determine the false positive error rates of the laboratories for each test."
  3. "The general acceptance of the methods used to express the probability of that the [sic] defendant is the source of DNA evidence: whether it is appropriate to express the probabilities of a coincidental match and a false positive error as one statistical estimate, two statistical estimates, or in some other fashion."

Although this was a somewhat unusual case, challenges to the numbers attached to DNA profiles are quite frequent. Early challenges centered on the question of independence of alleles within and between loci, and the variation of allele frequencies among subgroups of a population. Questions about the related issue of the adequacy of current databases continue, although attention is shifting towards probabilities of errors. There are sometimes more specialized questions having to do with relatedness between suspects and perpetrators, or with the interpretation of mixed stains.

All these issues will be treated in this paper, along with suggestions of how best to present statistics in court. Such testimony may not receive good press, as I found out in the Simpson case. After my testimony, the New York Times was relieved on June 27, 1995 to be "Back to the tangible after days of statistics."

Although I believe it is possible to give meaningful numbers in court, I also believe that the need for numbers addressing the frequency of profiles in a population is almost over. The 1992 report of the National Research Council was being hasty in asserting that "To say that two patterns match, without providing any scientifically valid estimate (or at least an upper bound) of the frequency with which such matches might occur by chance, is meaningless." - report page 9. It is the number itself that becomes meaningless when the profile consists of genotypes at 14 VNTR loci and 7 PCR loci, as was the case for one of the stains in the Simpson case (Weir 1995). Even for the now-routine application of seven or more VNTR loci, the chance of unrelated people in the same large population sharing the same set of 14 alleles is vanishingly small.

INTERPRETATION OF THE EVIDENCE

Much of the confusion surrounding the statistical treatment of DNA profiles is avoided if standard forensic methods are used. A comprehensive account was given by Aitken (1995), and the language of that text will be used here. For simplicity, suppose a stain from the scene of the crime has DNA profile A and this is known to be from the perpetrator P of the crime. Person S, who is a suspect in the crime, has profile B. In the conventional treatment, S is removed from suspicion if $B \neq A$, although this match-based approach can be avoided with a continuous treatment (Evett et al. 1993). Suppose the evidence of a match is denoted by E. There are two explanations for E.

  1. C: Person S is the perpetrator P (the prosecution explanation).
  2. $\bar{C}$: Person S is not the perpetrator P (the defense explanation).

If S denies being the perpetrator, then the issue that needs to be determined by the trier(s) of fact (judge or jury) is whether $C$ or $\bar{C}$ is true. Before the evidence of matching profiles is found, there may be prior probabilities $\Pr(C)$ and $\Pr(\bar{C})$ for the two explanations. The ratio of these two is called the "prior odds." It is unlikely that a forensic scientist can, or should, assign a value to this ratio. The scientist may be able, however, to present probabilities of the evidence if either of the explanations is true. These quantities are written as $\Pr(E \mid C)$ and $\Pr(E \mid \bar{C})$. It needs to be stressed that the forensic scientist is not claiming that either explanation is true, but is merely pointing to the consequences of either being true. The ratio of probabilities

$$L = \frac{\Pr(E \mid C)}{\Pr(E \mid \bar{C})}$$

is called a likelihood ratio. In paternity testing, it has been called the "paternity index," suggesting the term "forensic index" for forensic testing.

It is a consequence of Bayes' theorem that the posterior odds is the product of the likelihood ratio and the prior odds:

$$\frac{\Pr(C \mid E)}{\Pr(\bar{C} \mid E)} = L \times \frac{\Pr(C)}{\Pr(\bar{C})}.$$

The calculation of L, however, is not a "Bayesian analysis" as this term usually implies the assignment of prior probabilities.
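As a small numerical illustration of the odds form of Bayes' theorem described above, the sketch below multiplies a likelihood ratio by prior odds. The function names and the example values (a likelihood ratio of one million, prior odds of 1 to 1,000) are hypothetical and chosen only to show the arithmetic; assigning prior odds remains a matter for the trier of fact, not the scientist.

```python
def posterior_odds(likelihood_ratio, prior_odds):
    """Posterior odds = likelihood ratio x prior odds (odds form of Bayes' theorem)."""
    return likelihood_ratio * prior_odds


def odds_to_probability(odds):
    """Convert odds in favor of an explanation into a probability."""
    return odds / (1.0 + odds)


# Hypothetical values, for illustration only.
L = 1_000_000        # likelihood ratio reported by the scientist
prior = 1 / 1000     # prior odds, which the scientist does not assign
post = posterior_odds(L, prior)
print(post, odds_to_probability(post))
```

The same likelihood ratio gives very different posterior odds for different prior odds, which is why the scientist reports L and leaves the prior to the court.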

It is worth spending some time during testimony to lay out this foundation, and to explain that it leads to statements such as "The evidence is L times more likely if the prosecution explanation is correct, than if the defense explanation is correct," or "The evidence is L times more likely if S left the stain than if someone unrelated to S left the stain," or in some cases "The evidence is [...]." It is also worth pointing to the danger, and ease, of transposing the conditional to produce statements like "After considering the evidence of a match, it is L times more likely that S left the stain than it is that someone else did." The forensic scientist needs to guard against the tendency of both prosecutors and defense attorneys to make such statements (the Simpson defense made it in the quotations at the beginning of this paper), and simple analogies are useful. The probability that a man weighs over 250 pounds if he is a professional footballer is high, but the probability that a man is a professional footballer if he weighs over 250 pounds is quite low and depends on the (low) proportion of the population who play football for a living.

The reluctance of courts to embrace the use of likelihood ratios has been encouraged by the possibility of simplifying the ratios in many cases. If the DNA profiling technique is error-free, then a match is guaranteed when the suspect is actually the perpetrator. The evidence of a match is certain: $\Pr(E \mid C) = 1$. Furthermore, if the suspect is not the perpetrator and these two people have independent chances of being of DNA profile type A, then the denominator of L reduces to the probability of a random member of the population having profile A. In other words, L is just the reciprocal of the profile frequency and it makes little difference whether A is said to have a frequency of 1 in a million, or is said to give a likelihood ratio of a million. Although this is expedient, it glosses over some issues and is of no help in accommodating population structure, relatives or mixed stains.

CONDITIONAL FREQUENCIES

A little algebra is helpful here, even if it does not make for good testimony. The evidence E is actually a compound event: both S and P have profile A, and this will be written in an informal way as $S_A, P_A$. The two explanations refer to S and P either being the same or different people, written here as $S = P$ or $S \neq P$. Then the forensic index is

$$
\begin{aligned}
L &= \frac{\Pr(S_A, P_A \mid S = P)}{\Pr(S_A, P_A \mid S \neq P)} \\
  &= \frac{\Pr(P_A \mid S_A, S = P)\,\Pr(S_A \mid S = P)}{\Pr(P_A \mid S_A, S \neq P)\,\Pr(S_A \mid S \neq P)} \\
  &= \frac{1}{\Pr(P_A \mid S_A, S \neq P)}.
\end{aligned}
$$

The last line follows because P is certain to have profile A if S does and they are the same person, and the probability of S having profile A is the same whether or not S and P are the same person. This algebra emphasizes that the interpretation of single-contributor stains requires the conditional frequency with which one person (P) has a particular profile given that another person (S) has been seen to have that profile. Only under the special circumstance of independent profile frequencies can this conditional frequency be replaced by an unconditional frequency,

$$\Pr(P_A \mid S_A, S \neq P) = \Pr(P_A),$$

and the likelihood ratio can also be expressed in terms of this frequency:

$$L = \frac{1}{\Pr(P_A)}.$$

TYPING ERRORS

The symbols $S_A$ and $P_A$ in the previous section were said to mean that S or P had profile A. Strictly, they mean that these people are declared to have profile A, and it is possible that one or both of these declarations are in error. Although there has been discussion of how such errors could be quantified (Thompson 1995), it is difficult to envisage how this could be done. A DNA typing laboratory can participate in various proficiency tests, and could report the numbers of successes and failures from such tests. The fact that a failure is likely to cause changes in protocols or personnel makes it difficult to speak about error "rates." A rate indicates a proportion of some outcome under repeated trials, and a change in conditions following an error means that trials are not being exactly repeated. The fact that the laboratory either did or did not make errors in these tests is, of course, relevant information to the trier of fact.

Trying to incorporate proficiency test results into likelihood ratios faces the additional complication that each case is unique. Circumstances surrounding the analysis of a case may be quite different from those surrounding a proficiency test. Known (person S) and query (person P) profiles may be determined some months apart, for example. There is no repetition of the analysis that could give rise to a rate at which errors arose: an error was either made or not made. Having said that, it must be acknowledged that the forensic scientist can discount the possibility of errors with greater confidence when multiple items of evidence provide the same matching profiles, or when different laboratories either get duplicate results or each type different ones of the matching profiles (Lempert 1995).

For the rest of this paper, profile determinations will be assumed to be error free. The numerical issues refer only to profile frequencies.

PROFILE FREQUENCIES

Assigning a numerical value to the forensic index has the dual problem of needing to be based on sound science and to convey meaning to the trier of fact. The most straightforward demonstration of the meaning of the index can be provided by experiments conducted on databases.

Simulating the case of the prosecution explanation C being true can be done by evaluating the index for every individual represented in a database, and regarding that person as both suspect and perpetrator. For the case where the defense explanation is true, each pair of people in a database can be regarded as representing suspect and perpetrator. A good DNA typing system should give large forensic index values when suspect and perpetrator are the same person, and low values when they are unrelated people. Both the system, and an appreciation of the range of index values, can be explained by reference to these experiments. The value for the case at hand can be put into context with a minimum of assumptions or background theory.

These simple experiments will work only if matching profiles occur in a database, for otherwise the index is not defined in the second experiment. Evett et al. (1993) got around this problem for VNTR loci by using a continuous analysis that does not require a declaration of matching. For STR loci, four-locus matches were found in some United Kingdom databases (Evett et al. 1996).

The point is that profiles at several loci tend to be so rare that duplicates are not found in samples of a few hundred people. Likewise, the profile(s) of interest in a particular case are not found in these samples. Procedures such as adding the profile to a database of n profiles will give an estimated frequency of $1/(n+1)$ under $\bar{C}$ or $2/(n+2)$ under $C$, but this has the unsatisfactory aspect of giving the same answer whether the profile not seen in a database was based on a single VNTR locus or 14 VNTR loci plus 7 PCR loci. Any numeric statements must recognize that a DNA profile is not a single entity, but is actually a composite of information from several loci.

SINGLE-LOCUS FREQUENCIES

The first step is to show the frequencies at each of the loci in the matching profile, using a range of databases. For example, in Table 1 are shown the frequencies at six of the VNTR loci in the profile found to match between a bloodstain on a sock found in O.J. Simpson's bedroom and a blood sample from Nicole Brown. The typing was performed by the California Department of Justice DNA Laboratory, and frequencies were found by seeking matching one-locus genotypes in four FBI databases. Presenting such a chart in court makes it very clear that matches at single loci are not common, and they are not common whether frequencies based on African Americans, Caucasians or Hispanics are used. At this point there have been no genetic assumptions. The statistical assumption is that the databases are sufficiently random with respect to genotypes at these loci that the observed proportions provide appropriate estimates of population frequencies.

INDEPENDENCE WITHIN LOCI

Another advantage of presenting Table 1 is that it shows that a one-locus profile matching the one of interest may not be found in a database, even though alleles matching each of the two alleles at that locus are found in reasonable frequencies. It can be pointed out that a basic law in population genetics suggests that frequencies of genotypes are given by the products of frequencies of alleles. Although the biological conditions which lead to that law being true certainly do not hold in human populations, it is a simple matter to see whether or not the data are consistent with the law. Table 2 is a very useful demonstration of the similarity between observed genotypic frequencies in the FBI databases and the frequencies expected under the law of independence.

Table 2 shows that similar conclusions are reached whether the observed or the expected genotype counts are used. For the profile in this example, three loci were single-banded, and the table also shows that doubling the single-allele frequency rather than squaring it provides an overestimate of the genotype count. Although it will not be necessary in every case, it will sometimes be helpful to use tables like Table 2 to answer questions about "binning" strategies. The counts in Table 1 and the middle block of columns of Table 2 use floating bins: every fragment in a database that matches the profile fragment contributes to the estimated frequency of that fragment. The right-hand block of columns in Table 2 is for fixed bins, whereby a profile fragment is assigned to one of the fixed bins defined by Budowle et al. (1991) and all database fragments falling into that same bin contribute to the estimated frequency of the fragment. Similar frequencies are obtained from both binning methods, although the six-locus product is larger (more conservative) for the fixed bins in this case.
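The relationship between the "Exp." and "2p" columns of Table 2 for the single-banded loci can be checked with a little arithmetic. The sketch below assumes that the "2p" entry is the count n × 2p and that the expected count for a single-banded (apparently homozygous) genotype is n × p²; the function name is hypothetical, and the input values are the floating-bin entries for D1S7 and D4S139 in Table 2.

```python
def expected_homozygote_count(n, two_p_count):
    """Expected count n * p**2, with p recovered from the '2p' count n * 2p."""
    p = two_p_count / (2 * n)
    return n * p * p


# Single-banded loci, floating bins: (locus, n, '2p' count, reported Exp.)
rows = [("D1S7", 595, 91.0, 3.5), ("D4S139", 594, 103.0, 4.5)]
for locus, n, two_p, reported in rows:
    computed = round(expected_homozygote_count(n, two_p), 1)
    print(locus, computed, reported)
```

The large gap between the "2p" count and the squared-frequency count is what makes the doubling rule conservative for single bands.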

Finally, although the comparison of observed and expected counts in tables such as Table 2 makes clear the consistency of databases with the law of independence of allele frequencies within loci, mention can be made of the many statistical tests for independence that have been conducted and published. Recent publications include Maiste and Weir (1995), Evett et al. (1996) and Hamilton et al. (1996). Publications challenging independence (e.g. Geisser and Johnson 1993) generally ignore the single-band issue and do not use conventional binning strategies.

MULTI-LOCUS FREQUENCIES

The discriminatory power of DNA profiles comes from the rarity of matching profiles when many loci are considered. Quantifying rarity is even more difficult than in the one-locus case because a specific profile is very unlikely to be seen in any database. To be 99% sure of seeing at least one copy of a profile that occurs once in a million people would require a sample of size 4.6 million. Once again, population genetics theory suggests that allele frequencies at different loci are independent and may be multiplied together to provide an estimate of the multi-locus frequency.
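The sample size quoted above follows from requiring that the probability of seeing no copy of the profile among n independently sampled people, (1 − f)^n, be at most 1%. A minimal sketch of that arithmetic, assuming independent sampling (the function name is hypothetical):

```python
import math


def sample_size_to_see_profile(frequency, confidence):
    """Smallest n with P(at least one copy among n people) >= confidence."""
    # Solve 1 - (1 - f)**n >= confidence for n; log1p keeps precision for tiny f.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-frequency))


n = sample_size_to_see_profile(1e-6, 0.99)
print(n)  # about 4.6 million
```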

Demonstrating independence follows the same path as for one locus, but only so far. Tables of observed and expected counts can be prepared, but beyond two loci most observed counts will be zero or one, and expected counts less than one. Reference can be made to published statistical tests (e.g. Zaykin et al. 1995) that show general consistency of forensic DNA databases with independence between loci. Indeed, it is difficult to imagine a biological reason for dependence among the frequencies for unlinked neutral markers, even in structured populations. Published accounts of linkage disequilibrium in human populations invariably refer to loci that are tightly linked. Once again, the best approach may be to demonstrate the effects of assuming independence by experiments on databases. The experiments conducted by Evett et al. (1993, 1996) show that likelihood ratios for several loci calculated under this assumption are large when suspect and perpetrator are the same person and small when they are different people.

There is the technical issue of bias in the product rule. Even at one locus, the genotype frequencies formed as the products of allele frequencies have an expected value (the average over many replicates of the same procedure) that is less than the true value for heterozygotes. If the true and sample allele frequencies for the ith allele are written as $p_i$ and $\tilde{p}_i$, then the expected value of $2\tilde{p}_i\tilde{p}_j$ is $2p_ip_j(2n-1)/2n$ when the database is from n individuals and the population does satisfy the Hardy-Weinberg law of independent allele frequencies. For a database of size 250, the expected bias when the true allele frequencies are 0.1 is 0.00004, or 0.2% of the true value of 0.02. There are similar biases for products over several loci. Balding (1995) suggests increasing allele frequency estimates to compensate for this bias. If a database contains $n_i$ copies of the ith allele among the 2n alleles listed, he would estimate $p_i$ as $(n_i+2)/(2n+4)$ instead of the usual $n_i/2n$. This can be thought of as the estimate resulting from adding the profiles of both suspect and perpetrator to the database. For a database of size n=250 and true allele frequencies 0.1, this is expected to provide heterozygote frequency estimates of 0.0213, or 6.25% higher than the true value. An alternative procedure is to report confidence intervals for profile frequency estimates.
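The bias figures in this paragraph can be reproduced directly. The sketch below assumes that the 2n allele counts in the database follow a multinomial distribution, so that the expected product of two different allele counts is 2n(2n − 1)p_i p_j; the function names are hypothetical.

```python
def expected_product_estimate(p_i, p_j, n):
    """Expected value of 2 * (n_i / 2n) * (n_j / 2n) under multinomial sampling."""
    big_n = 2 * n
    return 2 * p_i * p_j * (big_n - 1) / big_n


def expected_adjusted_estimate(p_i, p_j, n):
    """Expected value of the heterozygote estimate using (n_i + 2) / (2n + 4)."""
    big_n = 2 * n
    # E[(n_i + 2)(n_j + 2)] = E[n_i n_j] + 2 E[n_i] + 2 E[n_j] + 4
    e_counts = big_n * (big_n - 1) * p_i * p_j + 2 * big_n * (p_i + p_j) + 4
    return 2 * e_counts / (big_n + 4) ** 2


true_value = 2 * 0.1 * 0.1
usual = expected_product_estimate(0.1, 0.1, 250)
adjusted = expected_adjusted_estimate(0.1, 0.1, 250)
print(true_value - usual)             # downward bias of the usual estimate
print(adjusted, adjusted / true_value - 1.0)  # adjusted estimate and its relative excess
```

The usual estimate falls short of the true value by a fraction 1/(2n), while the adjusted estimate errs on the side favorable to the defendant.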

CONFIDENCE INTERVALS

One of the most frequent questions raised about profile frequency estimates of the order of 1 in a million or less is how they can be justified from databases of size 250 or so. The response should point out that, because of independence of allele frequencies, there are effectively separate databases of size 500 for alleles at each locus and that this size gives allele frequency estimates with low sampling variances. An effective response is to remind the questioner that statistical estimates can have levels of confidence attached to them, as is routine in public opinion surveys. Statements such as '47% of those surveyed support the President (based on a telephone survey of 1,063 registered voters, margin of error ± 3 percentage points)' are common. Estimating m-locus profile frequencies amounts to asking a series of 2m questions, where each question can have many answers. Approximate formulas have been given to calculate confidence limits on products of frequencies (Chakraborty et al. 1993) but a better procedure seems to be to use bootstrapping.
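The survey analogy can be made concrete. The sketch below uses the usual normal approximation for a 95% margin of error on a proportion, which is an assumption about how such survey margins are computed rather than something stated in the text; the allele frequency of 0.1 in the second call is a hypothetical value.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% interval for a proportion (normal approximation)."""
    return z * math.sqrt(p * (1.0 - p) / n)


# Survey example from the text: 47% support among 1,063 respondents.
print(margin_of_error(0.47, 1063))
# Hypothetical allele frequency of 0.1 estimated from the 500 alleles of 250 people.
print(margin_of_error(0.10, 500))
```

For the survey figures this gives a half-width of about three percentage points, matching the quoted margin.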

It can be explained that the profile frequency estimate, or the likelihood ratio, is based on certain databases. If new databases were constructed for the same populations, different numerical values would result simply because different people were typed. It is not feasible to construct new databases, but the numerical resampling technique of "bootstrapping" allows new databases to be formed from the present one. The technique is standard, and was described in a monograph by Efron (1982), and a recent book by Efron and Tibshirani (1993). It is also treated by Weir (1996). For any particular case, a thousand bootstrap databases can be constructed and the value that cuts off the most extreme 1% of the values found. Only those values more favorable to the defense are likely to be of interest, and for the profile in Table 2 the estimate and lower 99% confidence limit on the likelihood ratio are $8.7 \times 10^{12}$ and $1.6 \times 10^{12}$ for floating bins and $1.6 \times 10^{12}$ and $0.5 \times 10^{12}$ for fixed bins. There is 99% confidence in the statement that $1.6 \times 10^{12}$ is less than the forensic index for floating bins or that $0.5 \times 10^{12}$ is less than the forensic index for fixed bins. It is quite common for the confidence limit to be up to 10 times smaller than the original estimate, and common also for the fixed and floating bin estimates to lie within the confidence intervals of the other.
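The bootstrap procedure described above can be sketched as follows. This is a minimal illustration, not the procedure used in any particular case: the database is simulated, the allele labels and frequencies are hypothetical, alleles are treated as discrete (no binning), and the profile frequency is formed by the product rule. Individuals are resampled whole, and the value cutting off the highest 1% of bootstrap frequencies gives the limit most favorable to the defense; its reciprocal is the corresponding lower limit on the likelihood ratio.

```python
import random


def profile_frequency(database, profile):
    """Product-rule frequency of a multi-locus profile.

    database: list of individuals, each a list of (allele, allele) tuples per locus.
    profile:  list of (allele, allele) tuples, one per locus.
    """
    n_alleles = 2 * len(database)
    freq = 1.0
    for locus, (a, b) in enumerate(profile):
        alleles = [x for person in database for x in person[locus]]
        p_a = alleles.count(a) / n_alleles
        p_b = alleles.count(b) / n_alleles
        freq *= p_a * p_a if a == b else 2 * p_a * p_b
    return freq


def bootstrap_upper_limit(database, profile, n_boot=1000, level=0.99, seed=0):
    """Upper confidence limit on the profile frequency from resampled databases."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_boot):
        # Resample whole individuals with replacement to keep their genotypes intact.
        resample = rng.choices(database, k=len(database))
        values.append(profile_frequency(resample, profile))
    values.sort()
    index = min(n_boot - 1, int(round(level * n_boot)))
    return values[index]


def simulate_database(n_people, locus_freqs, rng):
    """Hypothetical database with independent alleles (illustration only)."""
    database = []
    for _ in range(n_people):
        person = []
        for freqs in locus_freqs:
            names = list(freqs)
            weights = [freqs[name] for name in names]
            person.append(tuple(rng.choices(names, weights=weights, k=2)))
        database.append(person)
    return database


rng = random.Random(1)
locus_freqs = [{"a": 0.1, "b": 0.2, "c": 0.7}] * 3   # hypothetical frequencies
database = simulate_database(250, locus_freqs, rng)
profile = [("a", "b")] * 3
estimate = profile_frequency(database, profile)
upper = bootstrap_upper_limit(database, profile)
print("estimated likelihood ratio:", 1.0 / estimate)
print("99% lower limit on likelihood ratio:", 1.0 / upper)
```

Working with the frequency rather than its reciprocal avoids division by zero in resamples where an allele happens to be absent.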

A criticism of these confidence limits points to the discrepancy in orders of magnitude between estimates such as $10^{-6}$ and confidence limits based on the most extreme $10^{-2}$ of the values. A confidence limit of 99% is equivalent to acknowledging a probability of 1% of underestimating the frequency, which may have been estimated as 0.0001%. It may be better to have confidence limits of 99.9999% for such values. Bootstrapping would therefore require at least a million samples. A more practical approach would be based on confidence limits for each allele, as implicit in the approach of Chakraborty et al. (1993), but exact limits should be used instead of appealing to normal theory (Weir 1996). Bootstrapping by repeatedly sampling individuals has the great advantage of preserving whatever dependencies there are among alleles within and between loci. A fuller account of this issue is given by B.S. Weir and J.S. Buckleton (in preparation).

Rather than attempting to answer questions such as "How large should a database be?" a forensic scientist can cite the confidence limits for the databases used and explain that these limits take into account the number of people sampled.

POPULATION STRUCTURE

Simple calculation of profile frequencies is not sufficient when there are dependencies between the different people, suspect and perpetrator, featured in the defense explanation. The most common source of dependency is a result of membership in the same population and having similar evolutionary histories. The mere fact of populations being finite means that two people taken at random from a population have a nonzero chance of having relatively recent common ancestors.

The appropriate theory is well understood for pairs of alleles, as opposed to pairs of genotypes. The probability of choosing allele A from a population given that A has already been chosen is

$$\Pr(A \mid A) = \theta + (1 - \theta)\,p_A.$$

A full treatment was given by Weir (1994), but here it is sufficient to note that $p_A$ is the allelic frequency in the whole population and $\theta$ is often written as $F_{ST}$. Balding and Nichols (1994) have suggested that 0.05 is likely to be an upper bound for $\theta$ in large populations. This conditional allelic frequency is greater than both $\theta$ and $p_A$.

Theory for pairs of genotypes was given by Cockerham (1971), but there are difficulties with estimating the three- and four-gene analogs of theta that this exact treatment requires. A good approximation appears to be that given by Balding and Nichols (1994).
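To show how allowing for population structure changes a single-locus likelihood ratio, the sketch below evaluates the conditional allele frequency θ + (1 − θ)p_A together with conditional genotype frequencies. The genotype formulas are the commonly quoted form of the Balding and Nichols (1994) approximation and are stated here as an assumption, since the expressions themselves are not reproduced in this text. The allele frequencies 0.05 and 0.1 and the function names are hypothetical.

```python
def conditional_allele_frequency(p, theta):
    """Pr(A | A) = theta + (1 - theta) * p."""
    return theta + (1.0 - theta) * p


def heterozygote_match_probability(p_a, p_b, theta):
    """Pr(AB | AB) in the commonly quoted Balding-Nichols form (assumed here)."""
    numerator = 2.0 * (theta + (1.0 - theta) * p_a) * (theta + (1.0 - theta) * p_b)
    return numerator / ((1.0 + theta) * (1.0 + 2.0 * theta))


def homozygote_match_probability(p_a, theta):
    """Pr(AA | AA) in the commonly quoted Balding-Nichols form (assumed here)."""
    numerator = (2.0 * theta + (1.0 - theta) * p_a) * (3.0 * theta + (1.0 - theta) * p_a)
    return numerator / ((1.0 + theta) * (1.0 + 2.0 * theta))


# Hypothetical allele frequencies; theta values span published estimates to the upper bound.
for theta in (0.0, 0.001, 0.01, 0.05):
    lr = 1.0 / heterozygote_match_probability(0.05, 0.10, theta)
    print(theta, round(lr, 1))
```

With θ = 0 the expressions reduce to the familiar 2p_A p_B and p_A², and the likelihood ratio shrinks as θ increases; over several loci these reductions multiply.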

It is useful to present a table showing the effects of different values of $\theta$, from published estimates like 0.001 (Weir 1994) to the likely upper bound of 0.05. Such values are shown in Table 3. Clearly, allowing for a $\theta$ value of 0.05 can diminish the likelihood ratio by a factor of 1,000. Just as clearly, however, likelihood ratios in the billions reflect the implausibility of finding the same nine-band VNTR profile in two different people.

MIXED STAINS

The evidential value of a DNA profile can be substantially reduced when there is clearly more than one contributor to the profile and the prosecution explanation does not account for the whole profile. The issue has been dealt with by Evett et al. (1991) and Aitken (1995), and a more detailed account is given by Weir et al. (1996). The problem should cause no difficulty, provided the likelihood ratio framework is adopted.

In the language of Weir et al. (1996), suppose the evidence profile has a set of alleles {e} at a locus. Under the prosecution explanation, there are some known contributors but they lack alleles {u} in the profile. The probability that x unknown people have these alleles among them, and do not have any alleles outside the profile among them, is $P_x(\{u\} \mid \{e\})$. Likewise, the defense explanation may include certain known contributors, but leave alleles {u} unaccounted for, and the probability of finding this set among y unknowns is $P_y(\{u\} \mid \{e\})$. The likelihood ratio is

$$L = \frac{P_x(\{u\} \mid \{e\})}{P_y(\{u\} \mid \{e\})},$$

where {u} in the numerator and in the denominator is the set of alleles left unaccounted for under the prosecution and the defense explanation, respectively.

As an example, suppose the evidentiary profile in a rape case has alleles abcd at a locus, the victim has alleles ab and a suspect has alleles cd. The prosecution explanation for the evidence is that it is from the victim and the suspect. No alleles in the evidence need to be accounted for. If $\phi$ denotes the empty set, then the numerator of the likelihood ratio is $P_0(\phi \mid abcd) = 1$. The defense explanation is that the evidence is from the victim and some unknown person. Alleles cd need to be accounted for, and $P_1(cd \mid abcd) = 2p_cp_d$. The likelihood ratio is the usual

$$L = \frac{1}{2p_cp_d}.$$

If, however, the stain could not be positively said to contain the victim's DNA, the defense explanation may be that both contributors are unknown and there is a need to account for all four alleles abcd. The arrangement of these four alleles among two people is unknown, and the people could be ab, cd or ac, bd or ad, bc and the order could be reversed in each case. The probability $P_2(abcd \mid abcd)$ is $24p_ap_bp_cp_d$. If the prosecution explanation is still that the victim and suspect were the contributors, then

$$L = \frac{1}{24p_ap_bp_cp_d}.$$

The evidence against the suspect is stronger (provided $12p_ap_b < 1$), because of there being more unknown contributors in the defense explanation.

Suppose, instead, that the evidence is from a murder committed by two people, both of whom left blood at the crime scene. The evidence profile has alleles abcd but there is only one suspect, with profile cd. The prosecution explanation must be that the contributors were the suspect and an unknown person. The probability $P_1(ab \mid abcd) = 2p_ap_b$ is needed. The defense explanation is that both contributors are unknown, and the likelihood ratio is

$$L = \frac{2p_ap_b}{24p_ap_bp_cp_d} = \frac{1}{12p_cp_d}.$$

The strength of the evidence against the suspect is weaker because of there being more unknown contributors in the prosecution explanation.
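The three worked examples above can be checked numerically. The sketch below computes $P_x(\{u\} \mid \{e\})$ by inclusion-exclusion, assuming that the 2x alleles of the x unknown contributors are drawn independently (no population structure); the allele frequencies and the function name are hypothetical.

```python
from itertools import combinations
from math import isclose


def mixture_probability(unaccounted, evidence, freqs, n_unknown):
    """P_x({u} | {e}): 2x independent alleles all lie in the evidence set
    and together include every unaccounted allele (inclusion-exclusion)."""
    unaccounted = list(unaccounted)
    total = 0.0
    for k in range(len(unaccounted) + 1):
        for removed in combinations(unaccounted, k):
            # Probability that a single allele falls in the evidence set minus 'removed'.
            remaining = sum(freqs[a] for a in evidence if a not in removed)
            total += (-1) ** k * remaining ** (2 * n_unknown)
    return total


freqs = {"a": 0.10, "b": 0.20, "c": 0.05, "d": 0.15}   # hypothetical frequencies
pa, pb, pc, pd = (freqs[x] for x in "abcd")

p0 = mixture_probability("", "abcd", freqs, 0)          # nothing to account for
p1_cd = mixture_probability("cd", "abcd", freqs, 1)     # one unknown carries c and d
p1_ab = mixture_probability("ab", "abcd", freqs, 1)     # one unknown carries a and b
p2 = mixture_probability("abcd", "abcd", freqs, 2)      # two unknowns carry all four

assert isclose(p0, 1.0)
assert isclose(p1_cd, 2 * pc * pd, abs_tol=1e-12)
assert isclose(p2, 24 * pa * pb * pc * pd, abs_tol=1e-12)

print("victim + suspect vs victim + unknown:", p0 / p1_cd)
print("victim + suspect vs two unknowns:    ", p0 / p2)
print("suspect + unknown vs two unknowns:   ", p1_ab / p2)
```

The same function handles three or four unknown contributors by changing n_unknown, which is the kind of calculation needed when the number of contributors is in dispute.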

This discussion has assumed that the circumstances of the crime dictated there were exactly two contributors to the evidence. There will be cases when the number of contributors is unknown, and then a more extensive analysis is required. The simplest way is to give separate answers for each possible number of contributors, although a more satisfactory approach would allow the forensic scientist to express a professional opinion as to the number. This issue has been discussed by Evett et al. (1996).

The effect of the number of unknown contributors was quite substantial in the Simpson case. A stain on the steering wheel of O.J. Simpson's Bronco had three or four VNTR alleles at each of three loci, as shown in Table 4. The defense argued that the number of contributors was not known, and the court ordered calculations to be performed as though there were two, three or four contributors. The prosecution explanation was that O.J. Simpson (OS) and Ronald Goldman (RG) were contributors. Using five FBI databases, the ranges of values for the likelihood ratios are shown in Table 5 for a variety of pairs of explanations. It would clearly be in the best interest of the defense to insist that the prosecution include as many unknowns as possible in their explanation, while including as few unknowns as possible in their own explanation. The absence of likelihood ratios in the trial meant that only the frequencies with which two, three or four contributors had the profile between them were presented. These frequencies correspond to the highest possible likelihood ratios. There is an additional issue of allowing for unknown contributors to have unseen VNTR alleles (Weir and Buckleton 1996).

CONCLUSION

Until such time as DNA profiles are of such detail and general acceptance that the fact of a match between two profiles can be presented without numerical qualification, the best approach in court rests on likelihood ratios. Even if the term "likelihood ratio" is not explicitly mentioned, forensic scientists are likely to convey the quantitative meaning of a match in the clearest possible way by casting their own thinking in this framework.

REFERENCES

Aitken C.G.G. (1995) Statistics and the Evaluation of Evidence for Forensic Scientists. New York: Wiley.

Balding D.J. (1995) Estimating Products in Forensic Identification Using DNA Profiles. J. Am. Stat. Assoc. 90:839-844.

Balding D.J. and Nichols R.A. (1994) DNA Profile Match Probability Calculation: How to Allow for Population Stratification, Relatedness, Database Selection and Single Bands. Forensic Sci. Intl. 64:125-140.

Budowle B., Giusti A.M., Waye J.S., Baechtel F.S., Fourney R.M., Adams D.E., Presley, L.A., Deadman H.A. and Monson K.L. (1991) Fixed-bin Analysis for Statistical Evaluation of Continuous Distributions of Allelic Data from VNTR Loci, for Use in Forensic Comparisons. Am. J. Hum. Genet. 48:841-855.

Chakraborty R., Srinivasan M.R. and Daiger S.P. (1993) Evaluation of Standard Error and Confidence Interval of Estimated Multilocus Genotype Probabilities, and their Implications in DNA Forensics. Am. J. Hum. Genet. 52:60-70.

Cockerham C.C. (1971) Higher Order Probability Functions of Identity of Alleles by Descent. Genetics 69:235-246.

Efron B. (1982) The Jackknife, the Bootstrap, and Other Resampling Plans. CBMS-NSF Regional Conference Series in Applied Mathematics, Monograph 38. Philadelphia: SIAM.

Efron B. and Tibshirani R.J. (1993) An Introduction to the Bootstrap. New York: Chapman and Hall.

Evett I.W., Buffery C., Willott G. and Stoney D. (1991) A Guide to Interpreting Single Locus Profiles of DNA Mixtures in Forensic Cases. J. Foren. Sci. Soc. 31:41-47.

Evett I.W., Scranage J.K. and Pinchin R. (1993) An Illustration of the Advantages of Efficient Statistical Methods for RFLP Analysis in Forensic Science. Am. J. Hum. Genet. 52:498-505.

Evett I.W., Gill P.D., Scranage J.K. and Weir B.S. (1996) Establishing the Robustness of STR Statistics for Forensic Applications. Am. J. Hum. Genet. (Forthcoming).

Geisser S. and Johnson W. (1993) Testing Independence of Fragment Lengths within VNTR Loci. Am. J. Hum. Genet. 53:1103-1106.

Hamilton J.F., Starling L., Cordiner S.J., Monahan D.L., Buckleton J.S., Chambers G.K. and Weir B.S. (1995) New Zealand Population Data at Five VNTR Loci: Validation as Databases for Forensic Identity Testing. Science and Justice (forthcoming).

Lempert R. (1995) The Honest Scientist's Guide to DNA Evidence. Genetica 96:119-124.

Maiste P.J. and Weir B.S. (1995) A Comparison of Tests for Independence in the FBI RFLP Databases. Genetica 96:125-138.

National Research Council (NRC) (1992) DNA Technology in Forensic Science. Washington, D.C.: National Academy Press.

Thompson W.C. (1995) Subjective Interpretation, Laboratory Error and the Value of Forensic Evidence: Three Case Studies. Genetica 96:153-168.

Weir B.S. (1994) Effects of Inbreeding on Forensic Calculations. Ann. Rev. Genet. 28:597-621.

Weir B.S. (1995) DNA Statistics in the Simpson Matter. Nature Genetics 11:365-368.

Weir B.S. (1996) Genetic Data Analysis II. Sunderland, MA: Sinauer.

Weir B.S. and Buckleton J.S. (1996) Statistical Issues in DNA Profiling. Proc 16th Int. Cong. Forensics Haemogenetics Soc.

Weir B.S., Triggs C.M., Starling L., Stowell L.I., Walsh K.A.J. and Buckleton J.S. (1996) Interpreting DNA mixtures. (Submitted).

Zaykin D., Zhivotovsky L. and Weir B.S. (1995) Exact Tests for Association Between Alleles at Arbitrary Numbers of Loci. Genetica 96:169-178.


Table 1. One-locus frequency estimates for Nicole Brown profile.

| Locus  | African American | Caucasian | SE Hispanic | SW Hispanic | Total   |
|--------|------------------|-----------|-------------|-------------|---------|
| D1S7   | 2/359            | 6/595     | 3/305       | 2/288       | 13/1547 |
| D2S44  | 2/475            | 3/792     | 2/300       | 0/284       | 7/1851  |
| D4S139 | 2/448            | 12/594    | 2/311       | 3/265       | 19/1618 |
| D5S110 | 4/353            | 2/511     | 4/286       | 1/165       | 11/1315 |
| D10S28 | 3/288            | 4/429     | 1/230       | 6/283       | 14/1230 |
| D14S13 | 0/524            | 0/751     | 1/306       | 0/187       | 1/1768  |

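Table 2. Observed and expected counts in FBI Caucasian database for Nicole Brown profile.

| Locus  | n   | Frag. Lengths | Floating Bins: Obs. | Floating Bins: Exp. | Floating Bins: '2p' | Fixed Bins: Obs. | Fixed Bins: Exp. | Fixed Bins: '2p' |
|--------|-----|---------------|---------------------|---------------------|---------------------|------------------|------------------|------------------|
| D1S7   | 595 | 5319          | 6                   | 3.5                 | 91.0                | 2                | 2.1              | 71.0             |
| D2S44  | 792 | 2638, 2528    | 3                   | 4.4                 |                     | 2                | 2.7              |                  |
| D4S139 | 594 | 5606          | 12                  | 4.5                 | 103.0               | 8                | 6.9              | 128.0            |
| D5S110 | 511 | 4341, 2353    | 2                   | 4.0                 |                     | 6                | 7.9              |                  |
| D10S28 | 429 | 3597, 1449    | 4                   | 2.5                 |                     | 4                | 2.2              |                  |
| D14S13 | 751 | 6433          | 0                   | 0.0                 | 1.0                 | 0                | 0.0              | 10.0             |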

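Table 3. Likelihood ratios for Nicole Brown profile, using FBI Caucasian database.

| $\theta$ | Likelihood ratio      | 99% Confidence Limit  |
|----------|-----------------------|-----------------------|
| 0.000    | $1.63 \times 10^{12}$ | $5.19 \times 10^{11}$ |
| 0.001    | $1.20 \times 10^{12}$ | $4.08 \times 10^{11}$ |
| 0.010    | $1.59 \times 10^{11}$ | $6.86 \times 10^{10}$ |
| 0.050    | $1.76 \times 10^{9}$  | $1.08 \times 10^{9}$  |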

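Table 4. RFLP Profiles for Bronco Center Console.

| Locus  | Allele | Sample | OS    | RG   |
|--------|--------|--------|-------|------|
| D2S44  | a      | 2931   | 2925  | 3017 |
|        | b      | 1874   | 1877  |      |
|        | c      | 1684   |       | 1689 |
| D4S139 | a      | 8899   | 8915  |      |
|        | b      | 3281   | 3301  |      |
|        | c      | 7203   |       | 7192 |
|        | d      | 5683   |       | 5733 |
| D5S110 | a      | 11356  | 11355 |      |
|        | b      | 4777   | 4778  |      |
|        | c      | 5717   |       | 5772 |
|        | d      | 3015   |       | 3022 |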

Table 5. Likelihood ratios for interpreting evidence.

| Prosecution Explanation | Defense Explanation | Likelihood Ratio        |
|-------------------------|---------------------|-------------------------|
| Two Contributors        |                     |                         |
| OS+RG                   | OS+U1               | 65,000-150,000          |
| OS+RG                   | RG+U1               | 38,000-73,000           |
| OS+RG                   | U1+U2               | 100,000,000-220,000,000 |
| OS+U1                   | U1+U2               | 720-20,000              |
| RG+U1                   | U1+U2               | 1,000-5,800             |
| Three Contributors      |                     |                         |
| OS+RG+U1                | OS+U1+U2            | 2,000-5,000             |
| OS+RG+U1                | RG+U1+U2            | 1,200-3,600             |
| OS+RG+U1                | U1+U2+U3            | 1,000,000-4,600,000     |
| OS+U1+U2                | U1+U2+U3            | 400-1,100               |
| RG+U1+U2                | U1+U2+U3            | 650-2,100               |
| Four Contributors       |                     |                         |
| OS+RG+U1+U2             | OS+U1+U2+U3         | 880-2,200               |
| OS+RG+U1+U2             | RG+U1+U2+U3         | 550-1,500               |
| OS+RG+U1+U2             | U1+U2+U3+U4         | 260,000-1,300,000       |
| OS+U1+U2+U3             | U1+U2+U3+U4         | 110-680                 |
| RG+U1+U2+U3             | U1+U2+U3+U4         | 410-1,100               |

U1, U2, U3, and U4 are distinct unknown people.

