Statistical Evidence Versus Legal Evidence
This is the job of a juror in the US legal system described in statistical terms:
Compute the posterior probability of a defendant’s guilt conditioned on the admissible evidence, starting with a prior biased toward innocence. Report “guilty” if the posterior mean probability of guilt is above a level referred to as “beyond reasonable doubt.”
A juror is not to compute a probability conditioned on all evidence, only admissible evidence. One of the purposes of voir dire is to identify potential jurors who do not understand this concept and strike them from the jury pool. Very few jurors would understand or use the language of conditional probability, but a competent juror must understand that some facts are not to be taken into consideration in reaching a verdict.
For example, the fact that someone has been arrested, indicted by a grand jury, and brought to trial is not itself to be considered evidence of guilt. It is not legal evidence, but it certainly is statistical evidence: People on trial are more likely to be guilty of a crime than people who are not on trial.
This sort of split thinking is entirely proper. Statistical tendencies apply to populations, but trials are about individuals. The goal of a trial is to make a correct decision in an individual case, not to make correct decisions on average. Also, the American legal system embodies the belief that false positives are much worse than false negatives.
Thinking of a verdict as a conditional probability allows a juror to simultaneously believe personally that someone is probably guilty while remaining undecided for legal purposes.
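To make the distinction concrete, here is a minimal sketch of the juror's computation in Python. Everything numeric in it is an assumption made up for illustration: the prior, the likelihood ratios, the 0.95 stand-in for "beyond reasonable doubt," and the simplification that pieces of evidence are independent. The function names are hypothetical as well. It shows the same juror holding two posteriors, one conditioned on all evidence and one conditioned only on admissible evidence.

```python
# Illustrative sketch only: the prior, likelihood ratios, and threshold
# below are made-up numbers, not values any court prescribes.

def posterior_guilt(prior, likelihood_ratios):
    """Update a prior probability of guilt with pieces of evidence.

    Each likelihood ratio is P(evidence | guilty) / P(evidence | innocent).
    Treating the pieces as independent is a simplifying assumption.
    """
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)


def verdict(prior, evidence, threshold):
    """Condition only on admissible evidence, then compare to the threshold."""
    admissible = [lr for lr, is_admissible in evidence if is_admissible]
    p = posterior_guilt(prior, admissible)
    return ("guilty" if p > threshold else "not guilty"), p


# Hypothetical case: (likelihood ratio, admissible?)
evidence = [
    (20.0, True),   # e.g. eyewitness testimony
    (5.0, True),    # e.g. physical evidence
    (50.0, False),  # e.g. something the judge ruled inadmissible
]

PRIOR = 0.01       # prior biased toward innocence (assumed value)
THRESHOLD = 0.95   # stand-in for "beyond reasonable doubt" (assumed value)

# Legal posterior: admissible evidence only.
legal_verdict, legal_p = verdict(PRIOR, evidence, THRESHOLD)

# Personal posterior: everything the juror happens to know.
personal_p = posterior_guilt(PRIOR, [lr for lr, _ in evidence])

print(f"legal posterior:    {legal_p:.3f} -> {legal_verdict}")
print(f"personal posterior: {personal_p:.3f}")
```

With these invented numbers the legal posterior is about 0.50 and the personal posterior is about 0.98, so the juror can think the defendant is probably guilty and still properly vote "not guilty."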
Jury instructions are implicitly Bayesian rather than frequentist, in the sense that jurors are asked to arrive at a degree of belief. They are not asked to imagine an infinite sequence of similar trials.
As an example of the view that false positives are much worse than false negatives, Benjamin Franklin said “That it is better 100 guilty Persons should escape than that one innocent Person should suffer, is a Maxim that has been long and generally approved.” In decision theory vocabulary, this is a highly asymmetric loss function.
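An asymmetric loss function translates directly into a threshold on the posterior probability. As a rough sketch, let p be the posterior probability of guilt, and let c_FP and c_FN be the losses assigned to convicting an innocent person and acquitting a guilty one. These symbols are introduced here for illustration, and reading Franklin's maxim literally as a 100-to-1 cost ratio is an assumption, not something the quote strictly says. Convicting has the lower expected loss when:

```latex
\begin{align*}
(1 - p)\, c_{FP} &< p\, c_{FN} \\
\iff \quad p &> \frac{c_{FP}}{c_{FP} + c_{FN}}
\end{align*}
% With c_{FP} = 100\, c_{FN}:
\[
p > \frac{100}{101} \approx 0.990
\]
```

Under that reading, "beyond reasonable doubt" would correspond to a posterior probability of guilt of roughly 99%. With symmetric losses the threshold would drop to 50%.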