Please use the following text to cite this item:
De Smet, Hendrik; Flach, Susanne; Diller, Hans-Jürgen and Tyrkkö, Jukka, 2015, The Corpus of Late Modern English Texts, version 3.1, CLARIN DSpace, http://hdl.handle.net/20.500.14106/2574.
dc.contributor.author: De Smet, Hendrik
dc.contributor.author: Flach, Susanne
dc.contributor.author: Diller, Hans-Jürgen
dc.contributor.author: Tyrkkö, Jukka
dc.date.accessioned: 2024-11-25T15:04:52Z
dc.date.available: 2024-11-25T15:04:52Z
dc.date.issued: 2015-10
dc.description: The Corpus of Late Modern English Texts (CLMET) is a corpus of roughly 35 million words of British English from 1710–1920, grouped into three 70-year periods. The history, versions, and details of corpus composition are documented on the CLMET3.0 website. CLMET3.0 is currently distributed in three formats: (i) plain text, (ii) plain text with one sentence per line, and (iii) a tagged version (one sentence per line). CLMET3.1 makes the corpus available in CQP format for use in CWB- and CQPweb-based corpus environments. The selection of texts is unchanged, but CLMET3.1 includes additions and changes in linguistic annotation. These changes are of three general types: (a) retokenization and retagging, (b) fixes for some systematic issues that arise with historical data, and (c) enhanced annotation through added lemmas and simplified part-of-speech class tags.
dc.identifier: 2574
dc.identifier.uri: http://hdl.handle.net/20.500.14106/2574
dc.language: English
dc.language.iso: eng
dc.publisher: KU Leuven
dc.relation.ispartof: Oxford Text Archive Core Collection
dc.relation.isreferencedby: https://essenglish.org/messenger/wp-content/uploads/sites/2/2016/01/192-29-35.pdf
dc.rights: Creative Commons - Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
dc.rights.label: PUB
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source.uri: https://fedora.clarin-d.uni-saarland.de/clmet/clmet.html
dc.source.uri: https://perswww.kuleuven.be/%7Eu0044428/clmet3_0.htm
dc.subject: Linguistic corpora
dc.subject.lcsh: Linguistic analysis (Linguistics)
dc.subject.lcsh: Linguistics
dc.title: The Corpus of Late Modern English Texts, version 3.1
dc.title.alternative: CLMET3.1
dc.type: corpus
local.branding: Literary and Linguistic Data Service
local.contact.person: Hendrik De Smet hendrik.desmet@kuleuven.be KU Leuven
local.files.count: 5
local.files.size: 723314347
local.has.files: yes
local.hasCMDI: false
local.hidden: false
local.language.name: English
local.size.info: 34386225 tokens
local.size.info: 333 texts
local.size.info: 212 other
local.size.info: 687 MB
metashare.ResourceInfo#ContentInfo.mediaType: text
otaterms.date.range: 1700-1799
otaterms.date.range: 1800-1899
otaterms.date.range: 1900-1999