Extraordinary claims require extraordinary evidence

a probabilistic analysis

In #religion

Extraordinary claims require extraordinary evidence - Carl Sagan

I think no pithy quote has caused as much angst among apologists as this one from Carl Sagan, which is directed in particular at religious miracle claims. Before looking into it more specifically, I want to point out that the word "claim" here refers to an "explanation" or "proposition" and not to testimony (which would itself be a form of evidence). I'll have more to say about this in another post. If we are talking about testimony as the evidence, then we need to break the "claim" into the "explanation" part and the "evidence" part (see my ramble on claims and evidence here).

For an overly technical treatment of this by an apologist, we can look to Tim McGrew's discussion on Erik Manning's channel, as well as one of his own videos on the related topic of Hume.

I'll give more detailed responses to these videos later, but here I just want to map the Sagan quote to probabilities.

The analysis

To me, this seems straightforward to set up. We start with some notation.

  • <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>D</mi></mrow><annotation encoding="application/x-tex">D</annotation></semantics></math> = data, or the evidence
  • <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>M</mi><mi>i</mi></msub></mrow><annotation encoding="application/x-tex">M_i</annotation></semantics></math> = the various models, or explanations, or claims. We'll treat <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>M</mi><mi>o</mi></msub></mrow><annotation encoding="application/x-tex">M_o</annotation></semantics></math> as the extraordinary explanation and <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>M</mi><mn>1</mn></msub><mo separator="true">,</mo><msub><mi>M</mi><mn>2</mn></msub><mo separator="true">,</mo><mo>…</mo></mrow><annotation encoding="application/x-tex">M_1, M_2, \ldots</annotation></semantics></math> as the mundane explanations

Bayes' theorem then gives

<math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable columnalign="right left" columnspacing="0em" rowspacing="0.25em"><mtr><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mi mathvariant="normal">∣</mi><mi>D</mi><mo stretchy="false">)</mo></mrow></mstyle></mtd><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mrow></mrow><mo>=</mo><mfrac><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo></mrow><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mo>+</mo><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mo>+</mo><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mn>2</mn></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mn>2</mn></msub><mo stretchy="false">)</mo><mo>+</mo><mo>⋯</mo></mrow></mfrac></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow></mrow></mstyle></mtd><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mrow></mrow><mo>=</mo><mfrac><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo></mrow><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi 
mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mo>+</mo><munderover><mo>∑</mo><mrow><mi>i</mi><mo>=</mo><mn>1</mn></mrow><mi>N</mi></munderover><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>i</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>i</mi></msub><mo stretchy="false">)</mo></mrow></mfrac></mrow></mstyle></mtd></mtr></mtable><annotation encoding="application/x-tex"> \begin{aligned} P(M_o|D) &= \frac{P(D|M_o)P(M_o)}{P(D|M_o)P(M_o)+P(D|M_1)P(M_1)+P(D|M_2)P(M_2)+\cdots} \\ &=\frac{P(D|M_o)P(M_o)}{P(D|M_o)P(M_o)+\sum_{i=1}^{N}P(D|M_i)P(M_i)} \end{aligned} </annotation></semantics></math> We can simplify this by merging all of the mundane explanations together into <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><msub><mi>M</mi><mn>1</mn></msub></mrow><annotation encoding="application/x-tex">M_1</annotation></semantics></math>, which is also equivalent to <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>∼</mo><msub><mi>M</mi><mi>o</mi></msub></mrow><annotation encoding="application/x-tex">\sim M_o</annotation></semantics></math> <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable columnalign="right left" columnspacing="0em" rowspacing="0.25em"><mtr><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mi mathvariant="normal">∣</mi><mi>D</mi><mo stretchy="false">)</mo></mrow></mstyle></mtd><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mrow></mrow><mo>=</mo><mfrac><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo 
stretchy="false">)</mo></mrow><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mo>+</mo><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo></mrow></mfrac></mrow></mstyle></mtd></mtr></mtable><annotation encoding="application/x-tex"> \begin{aligned} P(M_o|D) &= \frac{P(D|M_o)P(M_o)}{P(D|M_o)P(M_o)+P(D|M_1)P(M_1)} \end{aligned} </annotation></semantics></math> which leads to <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mtable columnalign="right left" columnspacing="0em" rowspacing="0.25em"><mtr><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mi mathvariant="normal">∣</mi><mi>D</mi><mo stretchy="false">)</mo></mrow></mstyle></mtd><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mrow></mrow><mo>=</mo><mfrac><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo></mrow><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mo>+</mo><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mn>1</mn></msub><mo 
stretchy="false">)</mo></mrow></mfrac></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow></mrow></mstyle></mtd><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mrow></mrow><mo>=</mo><mfrac><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo></mrow><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo></mrow></mfrac><mrow><mo fence="true">(</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>+</mo><mfrac><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo></mrow><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo></mrow></mfrac></mrow></mfrac><mo fence="true">)</mo></mrow></mrow></mstyle></mtd></mtr><mtr><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow></mrow></mstyle></mtd><mtd><mstyle displaystyle="true" scriptlevel="0"><mrow><mrow></mrow><mo>≡</mo><mfrac><mn>1</mn><mrow><mn>1</mn><mo>+</mo><mi>r</mi></mrow></mfrac></mrow></mstyle></mtd></mtr></mtable><annotation encoding="application/x-tex"> \begin{aligned} P(M_o|D) &= \frac{P(D|M_o)P(M_o)}{P(D|M_o)P(M_o)+P(D|M_1)P(M_1)} \\ &=\frac{P(D|M_o)P(M_o)}{P(D|M_o)P(M_o)}\left(\frac{1}{1+\frac{P(D|M_1)P(M_1)}{P(D|M_o)P(M_o)}}\right)\\ &\equiv \frac{1}{1+r} \end{aligned} </annotation></semantics></math> where <math 
xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mo>=</mo><mfrac><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo></mrow><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo></mrow></mfrac></mrow><annotation encoding="application/x-tex">r=\frac{P(D|M_1)P(M_1)}{P(D|M_o)P(M_o)}</annotation></semantics></math> is the posterior odds of the mundane explanation against the extraordinary one. This was a long-winded way of getting to the odds form of Bayes' theorem.
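The reduction to 1/(1+r) can be checked numerically. The following is a minimal Python sketch, not part of the original analysis: the function names and all of the numbers are made up for illustration. It assumes the merged M_1 has prior equal to the sum of the mundane priors and likelihood equal to their prior-weighted average, which is what merging the explanations amounts to.

```python
def posterior_full(prior_o, like_o, priors_mundane, likes_mundane):
    """P(M_o | D) using the full sum over the mundane explanations."""
    numerator = like_o * prior_o
    mundane = sum(l * p for l, p in zip(likes_mundane, priors_mundane))
    return numerator / (numerator + mundane)


def posterior_odds_form(prior_o, like_o, priors_mundane, likes_mundane):
    """The same quantity via the merged M_1 and the ratio r."""
    prior_1 = sum(priors_mundane)  # P(M_1) = P(~M_o)
    # P(D | M_1) is the prior-weighted average of the individual likelihoods
    like_1 = sum(l * p for l, p in zip(likes_mundane, priors_mundane)) / prior_1
    r = (like_1 * prior_1) / (like_o * prior_o)
    return 1.0 / (1.0 + r)


# Hypothetical numbers, chosen only to illustrate the algebra
prior_o = 0.001                      # extraordinary explanation: low prior
priors_mundane = [0.6, 0.3, 0.099]   # together with prior_o these sum to 1
like_o = 0.9                         # the data fit M_o well
likes_mundane = [0.01, 0.02, 0.05]   # the data are unlikely under each mundane M_i

p_full = posterior_full(prior_o, like_o, priors_mundane, likes_mundane)
p_odds = posterior_odds_form(prior_o, like_o, priors_mundane, likes_mundane)
print(p_full, p_odds)  # the two forms agree up to floating-point rounding
```

With these invented values the posterior stays well below 0.5, even though the data are far more likely under M_o than under any single mundane explanation.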

For the posterior of the extraordinary claim to exceed one half, <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mi mathvariant="normal">∣</mi><mi>D</mi><mo stretchy="false">)</mo><mo>></mo><mn>0.5</mn></mrow><annotation encoding="application/x-tex">P(M_o|D) >0.5</annotation></semantics></math>, we need <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mo><</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">r<1</annotation></semantics></math>. To map Sagan's phrase onto these quantities, we have

  • extraordinary claim <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>≡</mo></mrow><annotation encoding="application/x-tex">\equiv</annotation></semantics></math> low prior: <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mo>≪</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">P(M_o)\ll 1</annotation></semantics></math>
  • mundane claim <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mo>≡</mo></mrow><annotation encoding="application/x-tex">\equiv</annotation></semantics></math> high prior: <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mo>∼</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">P(M_1) \sim 1</annotation></semantics></math>

This means that <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mi mathvariant="normal">/</mi><mi>P</mi><mo stretchy="false">(</mo><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mo>≫</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">P(M_1)/P(M_o)\gg 1</annotation></semantics></math>, which immediately implies that <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>r</mi><mo><</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">r<1</annotation></semantics></math> requires <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mo>≫</mo><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo></mrow><annotation encoding="application/x-tex">P(D|M_o) \gg P(D|M_1)</annotation></semantics></math>.
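To put a number on "much greater than": rearranging r < 1 says the likelihood ratio P(D|M_o)/P(D|M_1) has to exceed the prior odds P(M_1)/P(M_o). Here is a small Python sketch of that threshold. The function name is my own, the priors are made-up values, and it assumes P(M_1) = 1 - P(M_o), since M_1 stands for everything other than M_o.

```python
def min_likelihood_ratio(prior_o):
    """Threshold that P(D|M_o)/P(D|M_1) must exceed for r < 1.

    Assumes P(M_1) = 1 - P(M_o), i.e. M_1 is the merged "not M_o".
    """
    if not 0.0 < prior_o < 1.0:
        raise ValueError("prior_o must lie strictly between 0 and 1")
    return (1.0 - prior_o) / prior_o


# Hypothetical priors for the extraordinary explanation
for prior_o in (0.1, 1e-3, 1e-6):
    print(f"P(M_o) = {prior_o:g}: likelihood ratio must exceed "
          f"{min_likelihood_ratio(prior_o):.6g}")
```

The lower the prior, the larger the ratio the evidence has to supply, which is the quantitative content of "extraordinary evidence".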

Now, let's assume that the extraordinary claim perfectly fits the observed data, so that we have <math xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mi>o</mi></msub><mo stretchy="false">)</mo><mo>∼</mo><mn>1</mn></mrow><annotation encoding="application/x-tex">P(D|M_o)\sim 1</annotation></semantics></math>. What the last statement implies is that to justify your extraordinary claim, you must have <math display="block" xmlns="http://www.w3.org/1998/Math/MathML"><semantics><mrow><mi>P</mi><mo stretchy="false">(</mo><mi>D</mi><mi mathvariant="normal">∣</mi><msub><mi>M</mi><mn>1</mn></msub><mo stretchy="false">)</mo><mo>≪</mo><mn>1</mn></mrow><annotation encoding="application/x-tex"> P(D|M_1)\ll 1 </annotation></semantics></math> In other words, every mundane explanation must be nearly ruled out, not just unlikely. This is exactly what science does -- construct situations (i.e. controlled experiments) to make any mundane explanation nearly impossible. The level of "nearly impossible" depends, of course, on the extraordinariness of the primary claim.
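As a rough illustration of how "nearly impossible" scales with the prior, here is a Python sketch. It rests on assumptions that are mine, not part of the analysis above: the data consist of n independent runs of an experiment, each run has probability q of producing the result under the merged mundane explanation, and the extraordinary explanation fits every run perfectly. The function names and numbers are hypothetical.

```python
import math


def max_mundane_likelihood(prior_o, like_o=1.0):
    """Upper bound on P(D|M_1) for r < 1, taking P(M_1) = 1 - P(M_o)."""
    if not 0.0 < prior_o < 1.0:
        raise ValueError("prior_o must lie strictly between 0 and 1")
    return like_o * prior_o / (1.0 - prior_o)


def replications_needed(prior_o, q, like_o=1.0):
    """Smallest n with q**n below the bound.

    Assumes independent runs, each with mundane probability q of
    producing the data, so that P(D|M_1) = q**n.
    """
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    bound = max_mundane_likelihood(prior_o, like_o)
    n = max(1, math.ceil(math.log(bound) / math.log(q)))
    # Guard against floating-point ties right at the boundary
    while q ** n >= bound:
        n += 1
    return n


# Hypothetical example: a one-in-a-million prior for the extraordinary claim
for q in (0.5, 0.05):
    print(f"q = {q}: need {replications_needed(1e-6, q)} independent runs")
```

With a one-in-a-million prior, a mundane explanation that produces the result 5% of the time per run needs five independent runs before the posterior for the extraordinary claim passes 0.5, and one that produces it half the time needs twenty. The independence assumption is carrying most of the weight here, which is the reason for controlling the experiment in the first place.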

I find this consequence so straightforward that it baffles me that it is at all contentious.