The Brown University Standard Corpus of Present-Day American English, better known simply as the Brown Corpus, is an electronic collection of text samples of American English and the first major structured corpus of varied genres. It set the standard for the scientific study of the frequency and distribution of word categories in everyday language use. Compiled by Henry Kučera and W. Nelson Francis at Brown University in Rhode Island, it is a general language corpus containing 500 samples of English, totaling roughly one million words, drawn from works published in the United States in 1961.
genres. The Brown Corpus was the first computerized corpus designed for linguistic research. Kučera and Francis subjected the Brown Corpus to a variety...
about its accuracy. The lowest perplexity that had been published on the Brown Corpus (1 million words of American English of varying topics and genres) as...
The Quranic Arabic Corpus (Arabic: المدونة القرآنية العربية, romanized: al-modwana al-Qurʾāni al-ʿArabiyya) is an annotated linguistic resource consisting...
The British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. The corpus covers British...
English. American National Corpus British National Corpus Bank of English Brown Corpus Milana, Prior (2021). A Comparative Corpus Study on Intensifier Usage...
The Enron Corpus is a database of over 600,000 emails generated by 158 employees of the Enron Corporation in the years leading up to the company's collapse...
The Oxford English Corpus (OEC) is a text corpus of 21st-century English, used by the makers of the Oxford English Dictionary and by Oxford University...
legomena. Thus, in the Brown Corpus of American English, about half of the 50,000 distinct words are hapax legomena within that corpus. Hapax legomenon refers...
as three corpora: a corpus from the Survey of English Usage, the Lancaster-Oslo-Bergen Corpus (UK English), and the Brown Corpus (US English). In 1988...
Naval Air Station Corpus Christi (IATA: NGP, ICAO: KNGP, FAA LID: NGP) is a United States Navy naval air base located six miles (10 km) southeast of the...
corpora includes British National Corpus, Brown Corpus, Cambridge Academic English Corpus and Cambridge Learner Corpus, CHILDES corpora of child language...
used for the Brown Corpus. Unlike Brown or the Lancaster-Oslo-Bergen (LOB) Corpus (or indeed mega-corpora such as the British National Corpus), however,...
Thus, early text projects such as Roberto Busa's Index Thomisticus, the Brown Corpus, and others had to resort to conventions such as keying an asterisk preceding...
common nouns, NP for singular proper nouns (see the POS tags used in the Brown Corpus). Other tagging systems use a smaller number of tags and ignore fine...
sequences or a specific part of the corpus. The first text corpora were created in the 1960s, such as the 1-million-word Brown Corpus of American English. Over time...
The Cambridge International Corpus (CIC) is a collection of over 800 million words of real spoken and written English. The texts are stored in a database...
The Switchboard Telephone Speech Corpus is a corpus of spoken English consisting of almost 260 hours of speech. It was created in 1990 by Texas...
697 lexical items. BulPosCor has been compiled from the Structured "Brown" Corpus of Bulgarian by sampling 300+ word-excerpts (expanded to sentence boundary)...
vs. "NNS" for plural noun, vs. "NNS$" for plural possessive noun (see Brown Corpus). Others provide more explicit separation of features, even formalizing...
is a corpus that is annotated with verbal propositions and their arguments—a "proposition bank". Although "PropBank" refers to a specific corpus produced...