OTA Core Collection
Datasets and texts collected from a variety of people and projects since 1976. This collection excludes 'Legacy' and 'Text Creation Partnership' items in the Oxford Text Archive; its contents are thought to be of reasonable quality and usefulness.
Recent Submissions
Encyclopedia Britannica, Seventh Edition: A Machine-Readable Text Transcription (version 3.1) (Temple University, 2025) Logan, Peter M.; Kretz, Don; Jockers, Matthew L.; Tennis, Joseph T.; Greenberg, Jane; Bigenheimer, Marcus; Flanders, Julia; Grabus, Samantha; Pascua, Sonia; Huang, Luling; Scales, Gary; Siotto, Andrea; Farrell, Bethany; Kopaczewski, James; Rasing, Joyce; Gates, Ian; Gittelman, Tyler; Hammell, Madeline; Nguyen, Nhan; Rogers, Katie; Stover, Rachel; Hample, Jordan; Lacy, David

Encoded transcriptions and other files associated with a Social Edition of the Devonshire MS (BL Add. MS 17492) (Wikibooks, 2012) Siemens, Ray; Armstrong, Karin; Bond, Barbara; Crompton, Constance; Dickson, Terra; Paquette, Johanne; Podracky, Jonathan; Weber, Ingrid; Leitch, Cara; Chernyk, Melanie; Hirtsch, Bret D.; Powell, Daniel; Gaudet, Chris; Haswell, Eric; Ciula, Arianna; Starza-Smith, Daniel; Cummings, James; Holmes, Martin; Newton, Greg; Gibson, Jonathan; Remley, Paul; Kwakkel, Erik; Shirkie, Aimie

HeliPaD: the Heliand Parsed Database (Ghent University, 2015) Walkden, George; Sievers, Eduard

Gavin Douglas's Eneados, annotated, tagged, and aligned with its source texts Virgil's Aeneid and Maffeo Vegio's Supplement (University of Oxford, 2021) Douglas, Gavin; Virgil; Vegio, Maffeo; Bushnell, Megan; Coldwell, David F.C.; Greenough, J.B.; Ketrridge, G.L.; Ascensius, Jodocus Badius; Brinton, A.C.

The Cambrian Register for the Year 1795 (E. and T. Williams, 2025) Pughe, William Owen; Clare, Joshua

PREMOVE – A diachronic dataset of Ancient Greek and Latin annotated PREverbed MOtion VErbs (King's College London, 2025-06-16) Farina, Andrea

Anglo-Norman Dictionary Transcriptions (Anglo-Norman Dictionary, Aberystwyth University, 2005-2022) De Wilde, Geert; Rothwell, William; Gabel de Aguirre, Jennifer; Pagan, Heather; Cavell, Emma

Novel450 (Figshare, 2016-01-28) Piper, Andrew

Open English WordNet (2024 version) (University of Galway, 2024-10-31) McCrae, John Philip; Rademaker, Alexandre; Bond, Francis; Rudnicka, Ewa; Fellbaum, Christiane

The Corpus of the Canon of Western Literature (University of Oxford, 2018-01-01) Green, Clarence

The Corpus of Late Modern English Texts, version 3.1 (KU Leuven, 2015-10) De Smet, Hendrik; Flach, Susanne; Diller, Hans-Jürgen; Tyrkkö, Jukka

The Corpus of English Novels (KU Leuven, 2024) De Smet, Hendrik

Interactional Variation Online (Cardiff University, 2024-09-05) Knight, Dawn; O’Keeffe, Anne; Fitzgerald, Christopher; McNamara, Justin; Geraldine, Mark; Fahey Palma, Tania; Farr, Fiona; Cowan, Benjamin; Adolphs, Svenja

Open English WordNet (2022 version) (University of Galway, 2022-12-31) McCrae, John Philip; Rademaker, Alexandre; Bond, Francis; Rudnicka, Ewa; Fellbaum, Christiane

Open English WordNet (2021 version) (University of Galway, 2021-11-09) McCrae, John Philip; Rademaker, Alexandre; Bond, Francis; Rudnicka, Ewa; Fellbaum, Christiane

Open English WordNet (2020 version) (University of Galway, 2020-04-17) McCrae, John Philip; Rademaker, Alexandre; Bond, Francis; Rudnicka, Ewa; Fellbaum, Christiane

Open English WordNet (2019 version) (University of Galway, 2019-04-17) McCrae, John Philip; Rademaker, Alexandre; Bond, Francis; Rudnicka, Ewa; Fellbaum, Christiane

Open English WordNet (2023 version) (University of Galway, 2023-10-31) McCrae, John Philip; Rademaker, Alexandre; Bond, Francis; Rudnicka, Ewa; Fellbaum, Christiane

Frequently repeated clusters of words in Early English Books Online (University of Oxford, 2024-02-01) Wynne, Martin
Lists of repeated clusters of words, lemmata and part-of-speech tags derived from the 60238 works in the public domain from the Early English Books Online collection, as made available in the Oxford Text Archive collections in late 2023. In each case, the list contains the top 4000 most frequent clusters (or "n-grams"). The lists are made available as a lexical resource for exploring n-grams in historical English texts.

Ancient Greek semantic annotation datasets (University of Oxford, 2021-11-12) Viivi Lähteenoja; Alessandro Vatri
Datasets containing semantic annotation of the Ancient Greek words mus, harmonia, and kosmos in the Diorisis Ancient Greek corpus. The files are in a tab-separated format. Authors: Viivi Lähteenoja (dataset for kosmos); Alessandro Vatri (datasets for mus and harmonia).

