Bergen Corpus of London Teenage Language
The Bergen Corpus of London Teenage Language (COLT) is a corpus of spoken English compiled in 1993 from tape-recorded and transcribed conversations among teenagers aged 13 to 17 in schools throughout London, England.[1][2] The corpus, which has been tagged for part of speech using the CLAWS 6 tagset, is one of the linguistic research projects housed at the University of Bergen in Norway.[3]
[1] "COLT Summary" (PDF). Retrieved March 29, 2015.
[2] González-Díaz, Victorina (2008). English Adjective Comparison: A Historical Perspective. John Benjamins. p. 9.
[3] "COLT: The Bergen Corpus of London Teenage Talk". November 20, 2003. Retrieved March 29, 2015.