Mapping the Canon: Quantitative Approaches to Literary History · Darmstadt · June 18–19, 2026
01
Rarely the whole poem—usually a line someone can teach with.
Authority made through use—lines cut, copied, and repurposed.
“le jeu du découpage et du collage” (“the play of cutting and pasting”) Compagnon, 1979
02
Pages per year
03
Not an anthology of great literature—books that make verse legible, audible, and teachable.
Poets as instruments. Names invoked to settle a rule—which poets become usable, repeatable common ground when writers explain how verse works.
Their lines. Lines quoted to make a point concrete—cut into couplets and stanzas, made to do explanatory work, and able to travel without their poems.
3.1
with Petr Plecháč & Artjoms Šeļa · [forthcoming]
A poet’s name is not a stable search object — so we resolve every mention to one identity.
01 surface forms
02 neural linking
03 knowledge base
04 countable signal
Method
Named-entity recognition only says “Shelley is a person.” Entity linking says which Shelley — Mary or Percy Bysshe — and pulls “Mrs. Shelley,” “Mary Wollstonecraft Shelley,” and the bare surname onto one identity across two centuries.
Final poet universe: 3,164 poets · 737,194 mentions across 3,149 works, 1559–1929
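A minimal sketch of the step from surface forms to a countable signal. The talk's linker is neural; here a toy alias table and keyword cues stand in for it. The identifiers, cue words, and the `link` and `count_mentions` helpers are all hypothetical, chosen only to show why a bare surname needs context before it can be counted.

```python
from collections import Counter

# Hypothetical knowledge-base identifiers, not the project's real ones.
CANDIDATES = {
    "Mrs. Shelley": ["mary_shelley"],
    "Mary Wollstonecraft Shelley": ["mary_shelley"],
    "Percy Bysshe Shelley": ["percy_shelley"],
    "Shelley": ["mary_shelley", "percy_shelley"],  # bare surname: ambiguous
}

# Toy context cues standing in for a neural linker's use of context.
CUES = {
    "mary_shelley": {"frankenstein", "novel"},
    "percy_shelley": {"adonais", "ode", "skylark"},
}


def link(surface, context):
    """Resolve one mention to an identity, or None if it stays ambiguous."""
    candidates = CANDIDATES.get(surface, [])
    if len(candidates) == 1:
        return candidates[0]
    words = set(context.lower().split())
    scored = sorted(((len(CUES[c] & words), c) for c in candidates), reverse=True)
    if not scored or scored[0][0] == 0:
        return None  # unknown surface form, or no cue matched
    if len(scored) > 1 and scored[0][0] == scored[1][0]:
        return None  # tie between candidates: refuse to guess
    return scored[0][1]


def count_mentions(mentions):
    """Count resolved mentions per identity; unresolved mentions are skipped."""
    counts = Counter()
    for surface, context in mentions:
        identity = link(surface, context)
        if identity is not None:
            counts[identity] += 1
    return counts
```

Only resolved mentions reach the tally, so the count per identity is what becomes comparable across two centuries of spelling and naming habits.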
A handful of poets carry most of the attention.
The top 1% — just 32 poets — account for 59% of all mentions (Gini 0.92). Shakespeare, Milton, Chaucer, Homer, Virgil at the head.
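For readers who want to reproduce the two concentration figures, a short sketch of how a Gini coefficient and a top-share can be computed from per-poet mention counts. It uses the standard rank-based Gini formula. Rounding the 1% cut-off up to a whole poet is an assumption, though it does give 32 for 3,164 poets.

```python
import math


def gini(counts):
    """Gini coefficient of non-negative counts (0 = equal, near 1 = concentrated)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # G = sum_i (2i - n - 1) * x_i / (n * total), i = 1..n over ascending x
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def top_share(counts, fraction=0.01):
    """Share of all mentions held by the top `fraction` of poets."""
    xs = sorted(counts, reverse=True)
    k = max(1, math.ceil(len(xs) * fraction))  # assumption: round up
    return sum(xs[:k]) / sum(xs)
```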
The leaderboard reshuffles for two centuries — yet a few anchors never leave the head.
Cumulative distinct works, by decade · top 10 · PPA Literary collection
but a name isn’t a line
3.2
with Meredith Martin, Rebecca Sutton Koeser & Laure Thompson · [under review]
A quoted poem is not a search object — so instead of searching, we let the archive collide with nearly all of English poetry, and see what aligns.
01 quoted fragment · source corpus
The first line, “The curfew tolls the knell of parting day,” has been adopted from Dante’s Purgatorio.
a PPA host work, 1857
02 text reuse
03 reference poem · reference corpus
04 countable signal
passim finds far too much. Most of the method is deciding what counts.
| Filtering stage | Count |
|---|---|
| Raw passim alignments | 1,263,587 |
| Poem appearances (poem × host work) | 509,030 |
| Literary + Linguistic · reprints collapsed · 1694–1920 | 274,206 |
| Implausible & misattributed matches removed | 252,893 |
252,893 poem appearances · 37,406 poems · 3,962 host works
Method
A quotation is an asymmetric match: we align the PPA, where quotations surface, against a reference body of poetry — the Chadwyck-Healey English Poetry Database, deduplicated to 246,724 reference poems, plus Shakespeare’s plays, which Chadwyck-Healey leaves out.
1,263,587 raw alignments → 509,030 appearances → 252,893 appearances of 37,406 poems across 3,962 host works
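The funnel is easier to read as retention rates. The sketch below uses only the four counts reported on these slides; the `retention` helper is illustrative. The first stage counts alignments and the later ones count appearances, so the first ratio mixes units and should be read as indicative only.

```python
# Stage labels and counts as reported in the funnel.
STAGES = [
    ("Raw passim alignments", 1_263_587),
    ("Poem appearances (poem x host work)", 509_030),
    ("Literary + Linguistic, reprints collapsed, 1694-1920", 274_206),
    ("Implausible & misattributed matches removed", 252_893),
]


def retention(stages):
    """Return (label, count, share of previous stage, share of first stage)."""
    rows = []
    first = stages[0][1]
    previous = first
    for label, count in stages:
        rows.append((label, count, count / previous, count / first))
        previous = count
    return rows


for label, count, step, overall in retention(STAGES):
    print(f"{count:>9,}  {step:6.1%} of previous  {overall:6.1%} of raw  {label}")
```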
| # | Work | Author | Appearances | In the PPA |
|---|---|---|---|---|
| 1 | Paradise Lost | John Milton | 1,326 | 1679–1929 |
| 2 | Hamlet | William Shakespeare | 1,287 | 1702–1929 |
| 3 | Psalms | King James Bible | 1,145 | 1644–1928 |
| 4 | Julius Caesar | William Shakespeare | 909 | 1702–1929 |
| 5 | Macbeth | William Shakespeare | 877 | 1702–1929 |
| 6 | The Merchant of Venice | William Shakespeare | 844 | 1718–1928 |
| 7 | An Essay on Criticism | Alexander Pope | 833 | 1712–1929 |
| 8 | Proverbs | King James Bible | 757 | 1644–1927 |
| 9 | Elegy Written in a Country Churchyard | Thomas Gray | 735 | 1756–1929 |
| 10 | Job | King James Bible | 698 | 1644–1927 |
| 11 | Don Juan | Lord Byron | 694 | 1810–1929 |
| 12 | Christus: A Mystery | Henry Wadsworth Longfellow | 687 | 1825–1929 |
| ⋯ 38,898 more · 42% quoted just once · Gini 0.72 ⋯ | ||||
| 38,911 | The Mohawk | John D. M’Kinnon | 1 | 1909 |
| 38,912 | Seldom “can’t” | Christina Georgina Rossetti | 1 | 1915 |
| 38,913 | Greater Memory | Arthur William Edgar O’Shaughnessy | 1 | 1916 |
| 38,914 | To a Cricket | William Cox Bennett | 1 | 1912 |
When a match lies
passim finds real text reuse — but a reference “poem” isn’t always original. Chadwyck-Healey’s long tail of the mute inglorious hides centos and adaptations.
Ode to the Human Heart — Laman Blanchard, 1842
Canto LXXX — Ezra Pound, 1917
‘The evil that men do lives after them’ Shakespeare
well, that is from Julius Caesar
unless memory trick me
who crossed the Rubicon up near Rimini
So we hand-remove centos & adaptations, then apply a biographical-plausibility filter — no work can quote a poet not yet born (it discards Pound, b. 1885, correctly).
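The plausibility rule can be sketched as a one-line predicate. The `Match` record, its field names, and the choice to keep matches with an unknown birth year are assumptions made for illustration. The host year 1857 is likewise only an example value, and the birth years are standard biographical dates.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    poem: str
    poet: str
    poet_birth_year: Optional[int]
    host_year: int


def is_plausible(match: Match) -> bool:
    """A host work cannot quote a poet who was not yet born when it appeared."""
    if match.poet_birth_year is None:
        return True  # assumption: keep matches when the birth year is unknown
    return match.host_year >= match.poet_birth_year


# The same quoted line aligns with both Shakespeare's play and Pound's canto.
matches = [
    Match("Julius Caesar", "William Shakespeare", 1564, 1857),
    Match("Canto LXXX", "Ezra Pound", 1885, 1857),
]
kept = [m for m in matches if is_plausible(m)]
print([m.poem for m in kept])
```

The filter cannot tell which poem the host work had in mind; it only removes the attributions that are chronologically impossible.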
Distinct poems quoted per 100 pages of prosody writing — a rate, so it climbs only if quotation itself broadened, not merely because the archive grew.
Method
A raw count of distinct poems can’t separate a widening practice from a thickening archive — more host pages simply mean more chances to quote. So we model a rate, with the archive’s size netted out.
Modelled window 1694–1920 · 1694 = first year with ≥5 PPA host works · 1920 = the PPA’s coverage horizon
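A descriptive sketch of the rate itself, before any modelling. The function name and the toy numbers are invented for illustration. In the example the number of distinct poems doubles while the rate stays flat, because the page count doubled too.

```python
from collections import defaultdict


def distinct_poems_per_100_pages(quotes, pages_by_decade):
    """quotes: iterable of (decade, poem_id); pages_by_decade: {decade: pages}."""
    distinct = defaultdict(set)
    for decade, poem_id in quotes:
        distinct[decade].add(poem_id)
    return {
        decade: 100 * len(distinct[decade]) / pages
        for decade, pages in pages_by_decade.items()
        if pages > 0
    }


# Invented toy data: twice the distinct poems, but also twice the pages.
toy_quotes = [
    (1700, "a"), (1700, "a"), (1700, "b"),
    (1800, "a"), (1800, "b"), (1800, "c"), (1800, "d"),
]
toy_pages = {1700: 200, 1800: 400}
print(distinct_poems_per_100_pages(toy_quotes, toy_pages))
```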
Which poems did prosodists reach for more over time — and which did they quietly set aside?
\( p_{it} \) — a poem’s prevalence: the share of a decade’s prosody works that quote it.
\( k_{it} \sim \text{Beta-Binomial}\,(n_t,\, p_{it}) \) — counts are overdispersed, so not a plain binomial.
Hierarchical Bayesian · 896 sustained-presence poems (≥30 host works across ≥10 decades)
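To make "overdispersed" concrete, the sketch below compares the variance of a Beta-Binomial with that of a plain binomial at the same \( n \) and \( p \). The mean-precision parameterization (a precision `phi`, with \( \alpha = p\phi \) and \( \beta = (1-p)\phi \)) and the example values are assumptions; the slides state only the likelihood family.

```python
import math


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, p, phi):
    """P(K = k) for a Beta-Binomial with mean p and precision phi (assumed form)."""
    a, b = p * phi, (1 - p) * phi
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def mean_and_variance(pmf_values):
    """Mean and variance of a distribution over 0..n given its pmf values."""
    mean = sum(k * w for k, w in enumerate(pmf_values))
    variance = sum((k - mean) ** 2 * w for k, w in enumerate(pmf_values))
    return mean, variance


# Illustrative values: 40 host works in a decade, prevalence 0.25.
n, p, phi = 40, 0.25, 10.0
pmf = [beta_binomial_pmf(k, n, p, phi) for k in range(n + 1)]
mean, variance = mean_and_variance(pmf)
binomial_variance = n * p * (1 - p)
print(mean, variance, binomial_variance)
```

Under this parameterization the variance is \( n p (1-p)\,\frac{\phi + n}{\phi + 1} \), so it always exceeds the binomial variance for \( n > 1 \) and approaches it as \( \phi \) grows.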
Medieval and Romantic verse climbs; neoclassical translation fades; the perennials hold.
Method
Eligibility starts 18 yrs after the poet’s birth — a Tennyson zero in the 1700s is impossibility, not neglect · population = ≥30 host works across ≥10 decades → 896 poems · Bambi / PyMC (NUTS), R̂ ≤ 1.02
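The two selection rules can be sketched as follows. Two readings are assumptions: a decade counts as eligible if any of its years falls on or after the poet's eighteenth year, and "≥30 host works across ≥10 decades" means at least 30 host works spread over at least 10 distinct decades. The function names are illustrative, and Tennyson's birth year (1809) is the standard date.

```python
def eligible_decades(birth_year, decades, min_age=18):
    """Decades in which a zero counts as neglect rather than impossibility."""
    first_year = birth_year + min_age
    # Assumption: a decade is eligible if its last year reaches first_year.
    return [d for d in decades if d + 9 >= first_year]


def has_sustained_presence(host_years, min_works=30, min_decades=10):
    """host_years: one publication year per distinct host work quoting the poem."""
    decades = {year // 10 * 10 for year in host_years}
    return len(host_years) >= min_works and len(decades) >= min_decades


# Tennyson's zeros before the 1820s are excluded, not scored as neglect.
print(eligible_decades(1809, range(1700, 1930, 10))[:3])
```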
Fading (most negative slopes first)

| # | Work | Author / Translator | Slope (log-odds per decade) | Host works |
|---|---|---|---|---|
| 1 | Pastoral V (transl. 1697) | Virgil / John Dryden | −.27 | 31 |
| 2 | The Art of Poetry (transl. 1680) | Horace / Earl of Roscommon | −.25 | 92 |
| 3 | Prince Arthur (1695) | Sir Richard Blackmore | −.25 | 58 |
| 4 | On Roscommon’s Translation of Horace (1680) | Edmund Waller | −.24 | 32 |
| 5 | The Art of Poetry (transl. 1757) | Horace / William Duncombe | −.24 | 42 |
| 6 | The Georgics (transl. 1753) | Virgil / Joseph Warton | −.24 | 50 |
| 7 | Medulla Poetarum Romanorum (transl. 1737) | Henry Baker | −.23 | 214 |
| 8 | Pastoral I (transl. 1697) | Virgil / John Dryden | −.23 | 45 |
| 9 | The First Satire of Persius (transl. 1693) | Persius / John Dryden | −.22 | 39 |
| 10 | An Essay on Translated Verse (1684) | Earl of Roscommon | −.22 | 125 |
| 11 | Pastoral III (transl. 1697) | Virgil / John Dryden | −.22 | 35 |
| 12 | Metamorphoses, Book XII (transl. 1700) | Ovid / John Dryden | −.22 | 31 |
| 13 | Metamorphoses, Book I (transl. 1693) | Ovid / John Dryden | −.21 | 48 |
| 14 | An Essay upon Poetry (1682) | Duke of Buckingham | −.21 | 79 |
| 15 | Virgil’s Æneid (transl. 1740) | Virgil / Christopher Pitt | −.20 | 140 |

Climbing (most positive slopes first)

| # | Work | Author | Slope (log-odds per decade) | Host works |
|---|---|---|---|---|
| 1 | The Rime of the Ancient Mariner (1798) | Samuel Taylor Coleridge | +.35 | 287 |
| 2 | The Daffodils (1807) | William Wordsworth | +.31 | 109 |
| 3 | Ode: Intimations of Immortality (1807) | William Wordsworth | +.30 | 263 |
| 4 | Tintern Abbey (1798) | William Wordsworth | +.26 | 170 |
| 5 | The Eve of St Agnes (1820) | John Keats | +.24 | 108 |
| 6 | Prometheus Unbound (1820) | Percy Bysshe Shelley | +.23 | 128 |
| 7 | Is There for Honest Poverty (1795) | Robert Burns | +.21 | 64 |
| 8 | Christabel (1816) | Samuel Taylor Coleridge | +.21 | 157 |
| 9 | The Education of Nature (1800) | William Wordsworth | +.20 | 74 |
| 10 | On His Blindness (1673) | John Milton | +.19 | 99 |
| 11 | Peter Bell (1819) | William Wordsworth | +.19 | 63 |
| 12 | To Daffodils (1648) | Robert Herrick | +.19 | 55 |
| 13 | Sonnet 73 (1609) | William Shakespeare | +.18 | 70 |
| 14 | To a Waterfowl (1818) | William Cullen Bryant | +.18 | 71 |
| 15 | Ae Fond Kiss (1791) | Robert Burns | +.18 | 41 |

Holding (near-zero slopes, most host works first)

| # | Work | Author | Slope (log-odds per decade) | Host works |
|---|---|---|---|---|
| 1 | Psalms (1611) | King James Bible | −.011 | 1,110 |
| 2 | Proverbs (1611) | King James Bible | +.007 | 738 |
| 3 | Job (1611) | King James Bible | −.005 | 683 |
| 4 | Othello (1623) | William Shakespeare | +.017 | 661 |
| 5 | King Lear (1623) | William Shakespeare | +.021 | 579 |
| 6 | The Faerie Queene (1596) | Edmund Spenser | +.023 | 560 |
| 7 | Henry VIII (1623) | William Shakespeare | +.026 | 544 |
| 8 | Night Thoughts (1742–46) | Edward Young | −.003 | 526 |
| 9 | Henry IV, Part 1 (1598) | William Shakespeare | +.027 | 499 |
| 10 | Essay on Man, Epistle IV (1734) | Alexander Pope | −.018 | 490 |
| 11 | King John (1623) | William Shakespeare | +.026 | 467 |
| 12 | Essay on Man, Epistle I (1733) | Alexander Pope | −.019 | 462 |
| 13 | Henry IV, Part 2 (1600) | William Shakespeare | +.009 | 459 |
| 14 | Henry V (1623) | William Shakespeare | +.007 | 450 |
| 15 | Measure for Measure (1623) | William Shakespeare | +.020 | 379 |
Method
No new model — the same per‑poem slopes b¹ᵢ, re‑cut by the literary period each poem belongs to. Labels are Chadwyck‑Healey’s bibliographic categories (the database’s organizing logic).
Slopes in log‑odds per decade vs the shared trend · Middle English sits furthest right — Chaucer and the founders the nineteenth century canonized
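Re-cutting existing slopes by period is a grouping step, not a new fit. A minimal sketch, assuming each poem arrives as a (period label, slope) pair; the median as the summary statistic, the function name, and the toy values are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import median


def slopes_by_period(poems):
    """poems: iterable of (period_label, slope). Returns {period: (median, n)}."""
    groups = defaultdict(list)
    for period, slope in poems:
        groups[period].append(slope)
    summary = {period: (median(vals), len(vals)) for period, vals in groups.items()}
    # Order periods from the strongest rise to the strongest decline.
    return dict(sorted(summary.items(), key=lambda item: item[1][0], reverse=True))


# Invented toy values; real labels would come from the reference database.
toy = [
    ("period_a", 0.2), ("period_a", 0.4),
    ("period_b", -0.1), ("period_b", -0.3), ("period_b", -0.2),
]
print(slopes_by_period(toy))
```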