Section 01

Why Methodology Comes First

Codex Numerica catalogues numerical patterns claimed in sacred texts across more than fifteen traditions. Some of those patterns are verifiable arithmetic — reproducible from the source text by anyone with a pencil. Others are found patterns — produced by searching a large text for arrangements that confirm a prior expectation. The two look identical on the page. They are not the same kind of evidence.

This page is the credibility anchor of the entire site. Every claim catalogued elsewhere — the 37 × 73 factoring of Genesis 1:1, the 19-based structure of the Quran, the Dresden Codex Venus table, the 231 gates of Sefer Yetzirah — is measured against the toolkit described below. Where a claim survives the methodology, it earns the verified or remarkable badge. Where it fails, it earns disputed. Where it has not yet been tested rigorously, it earns exploratory.

The Single Question

For every pattern, this site asks the same question: Would the pattern survive if the analyst had committed in writing to the search procedure, the input data, and the success criterion before looking at the text?

That question collapses most of the disputed claims in sacred numerics into a single methodological criterion: the a priori commitment. Sections 03–08 of this page enumerate the specific failure modes that arise when it is missing.

verified — The methodological framework presented here reproduces the consensus of contemporary statistical practice in the relevant peer-reviewed journals (Statistical Science, Chance) and is directly applied below to a documented controversy.

Section 02

Case Study — WRR vs. MBBK

verified controversy   contested claim   peer-reviewed rebuttal

The cleanest, most thoroughly documented controversy in sacred numerics is the “Bible Codes” debate, conducted entirely within the pages of a single mainstream statistics journal. Both sides are peer-reviewed. Both papers are open-access. No background in religion is required to follow the argument — only ordinary statistical literacy.

The Claim — Witztum, Rips & Rosenberg (1994)

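In 1994, Doron Witztum, Eliyahu Rips, and Yoav Rosenberg published “Equidistant Letter Sequences in the Book of Genesis” in Statistical Science, the flagship survey journal of the Institute of Mathematical Statistics. The paper’s claim: equidistant letter sequences (ELSs) spelling the names of famous rabbis occur in Genesis unusually close to ELSs spelling their dates of birth or death, with a reported significance of p < 16 / 1,000,000.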

The paper passed peer review. The editor, Robert Kass, included an unusual note describing the paper as “a challenging puzzle” rather than an endorsement — an early signal that the reviewers had not converged on a confident verdict.

Paper: Witztum, D., Rips, E., Rosenberg, Y. (1994)
Title: "Equidistant Letter Sequences in the Book of Genesis"
Venue: Statistical Science 9(3), 429–438
Reported: p < 16 / 1,000,000 (Monte Carlo permutation test, 32 rabbi-pairs)
Status: PEER-REVIEWED claim, subsequently REFUTED in the same journal.

The Rebuttal — McKay, Bar-Natan, Bar-Hillel & Kalai (1999)

Five years later the same journal published “Solving the Bible Code Puzzle” by Brendan McKay, Dror Bar-Natan, Maya Bar-Hillel, and Gil Kalai — four professional mathematicians and statisticians with no theological stake in the outcome. Their findings reduce to three independent failure modes:

MBBK’s Three Findings

  1. Flexibility in data selection. The list of rabbi appellations and the spellings used were chosen by the WRR team. MBBK demonstrated that with the freedom retained in that selection, comparably extreme “significance” could be produced in a Hebrew translation of War and Peace. The signal tracked the choices, not the text.
  2. ELS “messages” appear in any sufficiently long text. MBBK exhibited equally striking ELS results in the Unabomber Manifesto and in the Hebrew translation of Crime and Punishment. The combinatorial space of equidistant subsequences is large enough that almost any target string will appear at some skip in a 300,000-letter corpus.
  3. No independent replication survived. Every attempt to reproduce WRR’s results using a priori-locked appellation lists — lists fixed before computation began — failed to reach the published significance. The result was an artifact of the data-selection freedom, not a property of Genesis.

Paper: McKay, B., Bar-Natan, D., Bar-Hillel, M., Kalai, G. (1999)
Title: "Solving the Bible Code Puzzle"
Venue: Statistical Science 14(2), 150–173
DOI: 10.1214/ss/1009212243
Verdict: The WRR effect is attributable to flexibility in appellation selection. The same procedure produces comparable "signals" in War and Peace, in the Unabomber Manifesto, and in any long Hebrew text.

The Popularisation — Drosnin (1997)

Michael Drosnin’s 1997 trade book The Bible Code reached mass readership by applying ELS searches with no a priori commitment whatsoever — searching for any politically resonant word at any skip, then declaring a match meaningful after the fact. The original WRR authors themselves disavowed Drosnin’s method. McKay subsequently published equally striking ELS “predictions” of assassinations in Moby Dick, illustrating that with sufficient post-hoc freedom, any text predicts anything.

disputed — The WRR claim is the only ELS result ever to pass peer review in a top statistics journal. It was refuted in the same journal five years later. The methodological lesson is durable: a published p-value is not a property of the text alone — it is a joint property of the text, the search procedure, and the analyst’s freedom of choice. Full primary sources are linked in References & Sources; readers are encouraged to consult both papers directly.

Section 03

A Priori vs. Post Hoc Hypotheses

verified principle

The single most important distinction in sacred-numerics methodology is whether a hypothesis was committed to before the data was examined (a priori) or after (post hoc). The arithmetic of an a priori test and a post hoc test can look identical; their evidential weight is not.

Working Definitions

  • A priori hypothesis. The claim is stated, the data and procedure are specified, and the success criterion is fixed — all before the test is executed. Pre-registration in modern empirical science formalises this commitment.
  • Post hoc hypothesis. The pattern is noticed in the data first, and then a test is constructed that the pattern is guaranteed to pass.

Why the Distinction Matters

Imagine an honest archer who shouts “bullseye”, shoots one arrow, and hits the centre. That is impressive. Now imagine the archer shoots one arrow at a blank wall and paints the bullseye around the hole afterwards. The arrow landed in the “centre” in both cases. Only the first event is evidence of skill.

Sacred numerics frequently runs on post hoc patterns: a striking number is found in a text, a thematic interpretation is constructed around it, and the result is presented as if it had been predicted in advance. The arithmetic is real. The evidential leap from “this number appears” to “this number was deliberately encoded” depends entirely on whether the encoding was specified before the search.

How to Apply This on Codex Numerica

Section 04

Multiple Comparisons & the Look-Elsewhere Effect

verified principle

If a test has a 1-in-1000 false-positive rate and the analyst runs it 1000 times under different parameter choices, a single “significant” result is expected by chance. This is the multiple-comparisons problem. In physics it is called the look-elsewhere effect. In sacred numerics, it is the most pervasive failure mode after post-hoc selection.
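The arithmetic behind that paragraph can be checked in a few lines. This is a minimal sketch that assumes the 1000 runs are statistically independent, which real parameter sweeps rarely are; the function names are illustrative and not part of the site's tooling.

```python
def expected_false_positives(alpha: float, n_tests: int) -> float:
    """Mean number of false positives across n_tests tests of a true null."""
    return alpha * n_tests


def prob_at_least_one(alpha: float, n_tests: int) -> float:
    """Chance that one or more of n_tests independent null tests 'succeeds'."""
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    alpha, n_tests = 0.001, 1000  # the figures used in the paragraph above
    print(f"expected false positives: {expected_false_positives(alpha, n_tests):.2f}")
    print(f"P(at least one):          {prob_at_least_one(alpha, n_tests):.3f}")
```

Under the independence assumption, the figures from the text give an expectation of one false positive and roughly a 63% chance of seeing at least one.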

The Arithmetic of Looking Everywhere

Consider an ELS search for a 7-letter word in Genesis. The text is approximately 78,000 letters long. The number of distinct equidistant subsequences of length 7 with skip between 2 and 1000 is on the order of 10^7. If each subsequence has even a 10^-5 chance of matching some meaningful Hebrew word from a permissive word list, the expected number of “hits” is on the order of 100.

Expected coincidences ≈ (subsequences searched) × (chance of match)
                      = 10^7 × 10^-5 = 100

A claim of “astonishing” significance must therefore correct for the size of the search space. The Bonferroni correction in plain language: divide the headline significance threshold by the number of comparisons attempted. A “p < 0.001” result after 10,000 silent comparisons is no longer surprising at all.
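The correction described above can be sketched as follows. The headline threshold of 0.05 is an assumed convention, not a figure from this page, and `count_els_positions` is a hypothetical helper that counts forward (start, skip) pairs only; pass it the letter count of whichever text is being searched.

```python
def count_els_positions(n_letters: int, word_len: int,
                        min_skip: int, max_skip: int) -> int:
    """Count forward (start, skip) pairs whose ELS fits inside the text."""
    total = 0
    for skip in range(min_skip, max_skip + 1):
        span = (word_len - 1) * skip  # distance from first to last letter
        total += max(0, n_letters - span)
    return total


def bonferroni_threshold(alpha: float, n_comparisons: int) -> float:
    """Per-comparison threshold that keeps the overall error rate near alpha."""
    return alpha / n_comparisons


def survives_correction(p_value: float, alpha: float, n_comparisons: int) -> bool:
    """True if p_value is still significant after the Bonferroni correction."""
    return p_value < bonferroni_threshold(alpha, n_comparisons)


if __name__ == "__main__":
    # The example from the text: p < 0.001 after 10,000 comparisons.
    print(bonferroni_threshold(0.05, 10_000))        # corrected threshold
    print(survives_correction(0.001, 0.05, 10_000))  # does the claim survive?
```

Bonferroni is deliberately conservative; it is shown here because it is the correction the text names, not because it is the only option.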

Silent vs. Reported Comparisons

The Two Failure Modes

  • Reported multiple testing. The analyst tries several configurations, reports them all, and applies a Bonferroni-style correction. This is honest practice.
  • Silent multiple testing. The analyst tries dozens of configurations privately, reports only the best, and quotes the single uncorrected p-value. This is the dominant failure mode in popular numerology.

Silent multiple testing is invisible to a reader who only sees the final result. The defence against it is the same as the defence against post-hoc hypotheses: a priori pre-registration of the search procedure.
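A small simulation makes the cost of silent testing visible. It is a sketch under two stated assumptions: every configuration tests a true null (so its p-value is uniform on [0, 1]), and the configurations are independent. The names and the 0.05 threshold are illustrative.

```python
import random


def best_of_k_pvalue(k: int, rng: random.Random) -> float:
    """Smallest of k null p-values: what a silent analyst would report."""
    return min(rng.random() for _ in range(k))


def silent_false_positive_rate(k: int, alpha: float = 0.05,
                               n_analysts: int = 10_000, seed: int = 0) -> float:
    """Fraction of simulated analysts whose best-of-k result beats alpha."""
    rng = random.Random(seed)
    hits = sum(best_of_k_pvalue(k, rng) < alpha for _ in range(n_analysts))
    return hits / n_analysts


if __name__ == "__main__":
    for k in (1, 5, 30):
        print(k, silent_false_positive_rate(k))
```

With one configuration the false-positive rate stays near the nominal 5%; with thirty silent configurations it approaches 1 − 0.95^30, which is close to 79%, even though no real effect exists anywhere in the simulation.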

Section 05

The Texas Sharpshooter Fallacy

verified principle

The Texas sharpshooter fallacy is the visual form of post-hoc selection: a shooter fires many bullets at the side of a barn, then walks over and paints a target around the tightest cluster. The cluster is real. The marksmanship is fiction.

In sacred numerics the fallacy manifests as thematic clustering: the analyst identifies a small region of text or a small set of numbers where a pattern happens to hold, declares that region the “target,” and ignores all the regions where the same pattern does not hold. The result is invariably impressive within the painted target and unremarkable outside it.

Diagnosis Checklist

A claim is likely sharpshooter-affected if any of the following hold:

  • The text is a single verse, a single chapter, or a single book selected from a larger canon — and no a priori reason is offered for the selection.
  • The “remarkable” counts apply to a hand-picked subset of letters (only initials; only consonants; only the “divine” words).
  • Adjacent verses or chapters were checked and found to lack the pattern, but were omitted from the report.
  • The reported number depends on which edition, manuscript, or vocalisation is used.

The sharpshooter fallacy interacts multiplicatively with multiple comparisons: a freely chosen target region, multiplied by silent multiple testing within that region, can produce arbitrarily extreme apparent significance from random data.

Section 06

Degrees of Freedom in Encoding

verified principle

Every gematria or isopsephy claim is computed inside an encoding system. The same word in two different encoding systems yields two different numbers — see the comparison matrix on Numeral Systems. The same Hebrew word can carry at least three standard values:

Encoding Degrees of Freedom — Hebrew Example

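| Choice | Method | Result for חי (Chai) |
| --- | --- | --- |
| Standard | Mispar Hechrachi (full values) | 18 |
| Ordinal | Mispar Siduri (position 1–22) | 18 |
| Reduced | Mispar Katan (digital root) | 9 |
| Integral reduced | Mispar Katan Mispari | 9 |
| Squared | Mispar Perati (each letter value squared) | 164 |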

These are five widely-used standard methods; the rabbinic literature describes more than seventy named variants.
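The first four rows of the table can be reproduced with a short script. It is a sketch under stated assumptions: "reduced" is read as summing each letter's value after reducing it to a single digit, "integral reduced" as the digital root of the standard total, and final letter forms take the value of their ordinary forms. Other conventions exist, and the function names are illustrative.

```python
LETTERS = "אבגדהוזחטיכלמנסעפצקרשת"  # the 22 letters in alphabetical order
STANDARD = dict(zip(LETTERS, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50,
                              60, 70, 80, 90, 100, 200, 300, 400]))
ORDINAL = {letter: index + 1 for index, letter in enumerate(LETTERS)}
FINALS = {"ך": "כ", "ם": "מ", "ן": "נ", "ף": "פ", "ץ": "צ"}


def letters_of(word: str) -> list[str]:
    """Map final forms to ordinary forms and drop anything that is not a letter."""
    mapped = (FINALS.get(ch, ch) for ch in word)
    return [ch for ch in mapped if ch in STANDARD]


def digital_root(n: int) -> int:
    return 0 if n == 0 else 1 + (n - 1) % 9


def standard(word: str) -> int:
    return sum(STANDARD[ch] for ch in letters_of(word))


def ordinal(word: str) -> int:
    return sum(ORDINAL[ch] for ch in letters_of(word))


def reduced(word: str) -> int:
    return sum(digital_root(STANDARD[ch]) for ch in letters_of(word))


def integral_reduced(word: str) -> int:
    return digital_root(standard(word))


if __name__ == "__main__":
    for method in (standard, ordinal, reduced, integral_reduced):
        print(method.__name__, method("חי"))
```

Each function is a separate, defensible way to compute "the" value of a word, which is exactly why the choice among them has to be fixed before the search begins.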

The Hidden Multiple Test

When an analyst is free to choose which encoding system “reveals” the target number, every additional encoding silently multiplies the search space. If five gematria methods are available and a claim is reported under the one that produces the intended result, the analyst has performed a hidden 5-way comparison.

This problem compounds when the analyst is also free to choose:

Each independent freedom is a silent multiple test. Five binary choices give 2^5 = 32 ways to compute “the” value of one word. A 1-in-32 coincidence is precisely the kind of result that 32 silent comparisons make routine.
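The count of 32 and its consequence can be made concrete. This sketch treats the five freedoms as abstract on/off switches and assumes each resulting configuration has an independent 1-in-32 chance of producing the target number; both are simplifying assumptions for illustration.

```python
from itertools import product


def enumerate_configurations(n_binary_choices: int) -> list[tuple[bool, ...]]:
    """Every on/off combination of the analyst's independent binary choices."""
    return list(product((False, True), repeat=n_binary_choices))


def prob_some_configuration_hits(p_single: float, n_configs: int) -> float:
    """Chance that at least one configuration hits the target by coincidence."""
    return 1.0 - (1.0 - p_single) ** n_configs


if __name__ == "__main__":
    configs = enumerate_configurations(5)
    print(len(configs))
    print(round(prob_some_configuration_hits(1 / 32, len(configs)), 3))
```

Under those assumptions a "1-in-32" coincidence turns up in about 64% of searches that quietly try all 32 configurations.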

verified — The presence of multiple encoding traditions is independently documented (see Numeral Systems for primary attestations). The methodological consequence — that freedom of encoding choice is a silent multiple test — is a direct application of the multiple-comparisons principle in §04.

Section 07

Monte Carlo & Permutation Testing

verified method

Monte Carlo testing compares an observed pattern against the distribution of patterns produced under an explicit null hypothesis — typically by simulating many random “texts” or randomly permuting the input and running the same procedure. The method is mathematically sound. WRR (1994) used a permutation test correctly in form. The disagreement with MBBK (1999) is not about Monte Carlo arithmetic; it is about the discipline applied to the inputs.

The Null Model is Half the Test

A Monte Carlo p-value depends on:

  1. the observed statistic on the real text;
  2. the null distribution of that statistic on simulated or permuted data.

Both halves can be silently manipulated. If the analyst is free to choose which letters to permute, what counts as a “hit,” or which simulation procedure represents “chance,” the same observed pattern can be made to look astronomically significant or completely ordinary.

Interactive Monte Carlo Demo

A text of 5,000 random letters is generated with a non-uniform letter-frequency distribution (loosely modelled on Hebrew). You choose a 2–6 letter target and a skip interval; the demo counts equidistant-letter-sequence (ELS) matches of the target in the text, then permutes the letters many times to build a null distribution. The p-value is the fraction of null trials with at least as many matches as the observed text.

Teaching point: with enough places to look, rare patterns are guaranteed.
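For readers without access to the interactive demo, the procedure it describes can be approximated offline. This is a sketch, not the site's actual implementation: the 22-symbol Latin alphabet and its decreasing weights are hypothetical stand-ins for a Hebrew-like frequency distribution, and the p-value follows the definition given above (fraction of permuted texts with at least as many matches).

```python
import random

ALPHABET = "abcdefghijklmnopqrstuv"                   # 22 placeholder symbols
WEIGHTS = [22 - i for i in range(len(ALPHABET))]      # hypothetical frequencies


def generate_text(n_letters: int, rng: random.Random) -> str:
    """Random text with a non-uniform letter distribution."""
    return "".join(rng.choices(ALPHABET, weights=WEIGHTS, k=n_letters))


def count_els(text: str, target: str, skip: int) -> int:
    """Count forward occurrences of target read at every skip-th letter."""
    span = (len(target) - 1) * skip
    return sum(
        text[start:start + span + 1:skip] == target
        for start in range(len(text) - span)
    )


def permutation_p_value(text: str, target: str, skip: int,
                        n_trials: int, rng: random.Random) -> tuple[int, float]:
    """Observed match count and the share of shuffles that match at least as often."""
    observed = count_els(text, target, skip)
    letters = list(text)
    at_least = 0
    for _ in range(n_trials):
        rng.shuffle(letters)  # same letters, order destroyed
        if count_els("".join(letters), target, skip) >= observed:
            at_least += 1
    return observed, at_least / n_trials


if __name__ == "__main__":
    rng = random.Random(42)
    text = generate_text(5000, rng)
    observed, p_value = permutation_p_value(text, "abc", 7, 200, rng)
    print(f"observed matches: {observed}, p-value: {p_value:.3f}")
```

Because the "real" text here is itself random, any small p-value the script reports for some target and skip is a false positive by construction. Trying many targets and skips and keeping the best one reproduces the failure mode described in §04.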

verified — Monte Carlo / permutation testing is a standard inferential method in modern statistics. Its correct application is documented in the WRR paper; its misapplication is documented in the MBBK rebuttal. Both are linked in §10.

Section 08

Survivorship of Texts

verified principle

Letter-exact numerical claims presuppose letter-exact transmission. For most ancient corpora that presupposition is false — or at least untested. A canonical text is an edited object: it has manuscript variants, transcription errors, intentional emendations, and an editorial history spanning centuries. A pattern visible in the modern critical edition is not automatically a pattern present in the autograph.

Transmission Stability by Tradition

Approximate Letter-Level Stability

| Corpus | Stability of letter-exact form | Implication |
| --- | --- | --- |
| Masoretic Hebrew Bible | High: standardised c. 10th c. CE; Dead Sea Scrolls confirm broad fidelity but reveal real variants | Gematria-on-MT claims must declare an edition; small letter differences propagate into different sums |
| Quran (‘Uthmanic recension) | High at the rasm (consonantal skeleton) level; vowel/diacritic layer added later | Letter-count claims must distinguish rasm from the fully-pointed text |
| Greek New Testament | Moderate: thousands of manuscripts with documented variants | Isopsephy claims sensitive to spelling (e.g., Iesous vs. Iesous abbreviated) shift between manuscripts |
| Rigveda | Very high: oral transmission preserved verse-exact form for millennia | Verse-count and metrical claims are robust; letter-count claims less applicable to oral traditions |
| Maya codices | Catastrophically incomplete: only four authenticated codices survive Landa’s 1562 mass burning | Numerical-structural claims survive only where the surviving manuscripts directly attest the structure |
| Popol Vuh | Single colonial-era K’iche’ transcription | Verse-count claims unreliable; structural claims must note the transmission layer |
| Norse Eddas | 13th-century Icelandic manuscripts of much older oral material | Numerical motifs (nine, three) are attested; canonical “list of nine worlds” is a modern reconstruction |

See the Structural Dashboard for per-corpus edition notes and canon-by-canon raw counts with edition sources.

verified — The transmission history of each major canon is independently documented in mainstream textual criticism. The methodological consequence — that fragile transmission makes letter-exact claims correspondingly fragile — is a direct logical implication, not a contested interpretation.

Section 09

Applying the Toolkit to Codex Numerica

The four-tier evidence-grading system used across the site (defined on the home page) is the direct expression of the methodology above:

How Evidence Tiers Map to the Methodology

| Tier | Means | Methodological standard |
| --- | --- | --- |
| verified | Independently confirmable arithmetic | Result invariant under reasonable a priori choice; no silent multiple testing involved; transmission of the source text is stable on the relevant scale. |
| remarkable | Striking pattern, design unproven | Arithmetic is verified; the leap from “pattern present” to “pattern deliberate” depends on extra-mathematical evidence (cultural, philological, archaeological) that has not yet conclusively settled the question. |
| disputed | Contested by peer-reviewed scholarship | The claim fails one or more of: a priori commitment, multiple-comparisons correction, sharpshooter avoidance, encoding-DoF discipline, or transmission stability. |
| exploratory | Speculative, data-dependent | Not yet tested rigorously; presented for reader inspection with explicit notice. |

Worked Application: The 37 × 73 Factoring of Genesis 1:1

The Hebrew of Genesis 1:1 consists of 7 words whose letter values (standard gematria) sum to 2701 = 37 × 73. Apply the methodology:

Net assessment: the arithmetic is verified; the broader claim that the factoring is intentional design is remarkable. It survives the strongest objections, but it has never been subjected to an a priori test, so the question of design remains open. This is the honest reading.
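The arithmetic half of this claim is easy to reproduce. The sketch below assumes the unpointed consonantal text of the verse as commonly printed and standard gematria with final letter forms valued as their ordinary forms; a different edition or convention would need its own check.

```python
VALUES = dict(zip("אבגדהוזחטיכלמנסעפצקרשת",
                  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 30, 40, 50,
                   60, 70, 80, 90, 100, 200, 300, 400]))
# Final forms take the value of the ordinary letter (assumed convention).
VALUES.update({"ך": 20, "ם": 40, "ן": 50, "ף": 80, "ץ": 90})

GENESIS_1_1 = "בראשית ברא אלהים את השמים ואת הארץ"


def word_values(verse: str) -> list[int]:
    """Standard gematria value of each space-separated word."""
    return [sum(VALUES[ch] for ch in word) for word in verse.split()]


if __name__ == "__main__":
    values = word_values(GENESIS_1_1)
    total = sum(values)
    print(values)
    print(total, total == 37 * 73)
```

The script confirms the sum and the factoring; it says nothing about whether the factoring was intended, which is the part no calculation can settle.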

Worked Application: Dresden Codex Eclipse Table

The Dresden Codex eclipse table records 405 consecutive lunations across 11,960 days, structured as 46 × 260 to lock the lunar count into the Tzolk’in 260-day cycle. Apply the methodology:

Net assessment: verified — one of the strongest numerical-design claims on the entire site. The structure is intentional, attested in the primary source, and arithmetically exact. See Maya Numerics §1.4.
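The figures quoted for the table can be checked directly. In this sketch the mean synodic month of 29.530588 days is a modern astronomical value supplied for comparison; it is an assumption brought in from outside the codex, and the function name is illustrative.

```python
MEAN_SYNODIC_MONTH_DAYS = 29.530588  # modern mean value, not from the codex


def check_eclipse_table(lunations: int = 405, days: int = 11_960,
                        tzolkin_days: int = 260) -> dict:
    """Compare the table's day count with the Tzolk'in cycle and the lunar month."""
    return {
        "tzolkin_cycles": days / tzolkin_days,
        "is_whole_tzolkin_multiple": days % tzolkin_days == 0,
        "implied_lunation_days": days / lunations,
        "drift_days": days - lunations * MEAN_SYNODIC_MONTH_DAYS,
    }


if __name__ == "__main__":
    for key, value in check_eclipse_table().items():
        print(f"{key}: {value}")
```

Under that assumed modern value, 405 lunations fall short of 11,960 days by about 0.11 days, under three hours across roughly 33 years, while the day count remains an exact multiple of 260.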

Section 10

References & Sources

The Primary Controversy — Peer-Reviewed Sources

  • Witztum, D., Rips, E., Rosenberg, Y. (1994). “Equidistant Letter Sequences in the Book of Genesis.” Statistical Science 9(3), 429–438. — The original claim. PEER-REVIEWED
  • McKay, B., Bar-Natan, D., Bar-Hillel, M., Kalai, G. (1999). “Solving the Bible Code Puzzle.” Statistical Science 14(2), 150–173. DOI 10.1214/ss/1009212243. — The peer-reviewed refutation. PEER-REVIEWED
  • Bar-Hillel, M., Bar-Natan, D., McKay, B. (1998). “The Torah Codes: Puzzle and Solution.” Chance 11, 13–19. — ASA-published popular-statistical summary. PEER-REVIEWED

Author Archives & Working Code

Methodological References

  • Bonferroni, C. E. (1936). “Teoria statistica delle classi e calcolo delle probabilità.” Pubblicazioni del R. Istituto Superiore di Scienze Economiche e Commerciali di Firenze. — The original multiple-comparisons correction.
  • Gross, E., Vitells, O. (2010). “Trial factors for the look elsewhere effect in high energy physics.” European Physical Journal C 70, 525–530. — Modern treatment of the look-elsewhere effect.
  • Simmons, J., Nelson, L., Simonsohn, U. (2011). “False-Positive Psychology.” Psychological Science 22(11), 1359–1366. — Empirical demonstration of researcher degrees of freedom.

Cross-Site References