Corpus linguistics information


Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora).[1] Corpora are balanced, often stratified collections of authentic, "real world" texts, spoken or written, that aim to represent a given linguistic variety.[1] Today, corpora are generally machine-readable data collections.

Corpus linguistics proposes that a reliable analysis of a language is more feasible with corpora collected in the field, the natural context ("realia") of that language, with minimal experimental interference. Large collections of text (though corpora may also be small in terms of running words) allow linguists to run quantitative analyses on linguistic concepts that may be difficult to test qualitatively.[2]
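As a rough illustration of the kind of quantitative analysis a machine-readable corpus allows, the following sketch counts running words (tokens) and distinct word forms (types) in a tiny corpus. The three sentences, the function names, and the simple regular-expression tokenizer are all assumptions made for this example; real corpus work typically relies on far larger collections and more careful tokenization.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_profile(texts):
    """Return a Counter mapping each word form to its frequency across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(tokenize(text))
    return counts


# Hypothetical three-sentence corpus, invented for illustration only.
toy_corpus = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
    "A cat and a dog met.",
]

counts = frequency_profile(toy_corpus)
tokens = sum(counts.values())  # running words in the corpus
types = len(counts)            # distinct word forms

print(f"tokens={tokens}, types={types}")
print(counts.most_common(3))
```

The ratio of types to tokens is one simple measure that can be compared across corpora, although it is sensitive to corpus size.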

The text-corpus method uses the body of texts in any natural language to derive the set of abstract rules which govern that language. Those results can be used to explore the relationships between that subject language and other languages which have undergone a similar analysis. The first such corpora were manually derived from source texts, but now that work is automated.

Corpora have been used not only for linguistic research; since 1969 they have also been increasingly used to compile dictionaries (starting with The American Heritage Dictionary of the English Language in 1969) and reference grammars, with A Comprehensive Grammar of the English Language, published in 1985, as the first.

Experts in the field have differing views about the annotation of a corpus. These views range from those of John McHardy Sinclair, who advocates minimal annotation so that texts speak for themselves,[3] to those of the Survey of English Usage team (University College London), who advocate annotation as allowing greater linguistic understanding through rigorous recording.[4]

  1. ^ a b Meyer, Charles F. (2023). English Corpus Linguistics (2nd ed.). Cambridge: Cambridge University Press. p. 4.
  2. ^ Hunston, S. (1 January 2006), "Corpus Linguistics", in Brown, Keith (ed.), Encyclopedia of Language & Linguistics (Second Edition), Oxford: Elsevier, pp. 234–248, doi:10.1016/b0-08-044854-2/00944-5, ISBN 978-0-08-044854-1, retrieved 31 October 2023
  3. ^ Sinclair, J. 'The automatic analysis of corpora', in Svartvik, J. (ed.) Directions in Corpus Linguistics (Proceedings of Nobel Symposium 82). Berlin: Mouton de Gruyter. 1992.
  4. ^ Wallis, S. 'Annotation, Retrieval and Experimentation', in Meurman-Solin, A. & Nurmi, A.A. (eds.) Annotating Variation and Change. Helsinki: Varieng, [University of Helsinki]. 2007. e-Published

24 related articles for: Corpus linguistics

Corpus linguistics

Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora). Corpora are balanced, often stratified collections...

Word Count : 2576

Text corpus

In linguistics and natural language processing, a corpus (pl.: corpora) or text corpus is a dataset, consisting of natively digital and older, digitalized...

Word Count : 879

Linguistics

Linguistics is the scientific study of language. Linguistics is based on a theoretical as well as a descriptive study of language and is also interlinked...

Word Count : 9246

Brown Corpus

Zipf's law. Although the Brown Corpus pioneered the field of corpus linguistics, by now typical corpora (such as the Corpus of Contemporary American English...

Word Count : 1056

Corpus of Contemporary American English

created by Mark Davies, retired professor of corpus linguistics at Brigham Young University (BYU). The Corpus of Contemporary American English (COCA) is...

Word Count : 1135

Corpus

corpus, in linguistics, a large set of speech audio files Corpus linguistics, a branch of linguistics Corpus (album), by Sebastian Santa Maria Corpus...

Word Count : 315

British National Corpus

of spoken and written British English of that time. It is used in corpus linguistics for analysis of corpora. The project to create the BNC involved the...

Word Count : 3894

Computational linguistics

M. (1993). "Building a large annotated corpus of English: The Penn Treebank" (PDF). Computational Linguistics. 19 (2): 313–330. Archived (PDF) from the...

Word Count : 1069

Law and Corpus Linguistics

Law and corpus linguistics (LCL) is an academic sub-discipline that uses large databases of examples of language usage equipped with tools designed by...

Word Count : 1535

List of linguistics journals

Journal Colombian Applied Linguistics Journal Corpus Linguistics and Linguistic Theory International Journal of Corpus Linguistics Archiv für das Studium...

Word Count : 182

Quranic Arabic Corpus

English sources, instead of producing a new translation of the Qur'an. Corpus linguistics Quran Classical Arabic Treebank K. Dukes, E. Atwell and N. Habash...

Word Count : 599

Language contact

sociolinguistics (the study of language use in society), from corpus linguistics and from formal linguistics are used in the study of language contact. The most...

Word Count : 1552

Applied linguistics

Applied linguistics is a practical use of language. Applied linguistics is an interdisciplinary field. Major branches of applied linguistics include bilingualism...

Word Count : 1365

Chinese computational linguistics

noun recognition, natural language understanding and generation, corpus linguistics, and machine translation. Chinese character Information Technology...

Word Count : 2290

Collocation

In corpus linguistics, a collocation is a series of words or terms that co-occur more often than would be expected by chance. In phraseology, a collocation...

Word Count : 1320

Outline of linguistics

linguistic factors that place a discourse in context. Contrastive linguistics Corpus linguistics Dialectology Discourse analysis Grammar Interlinguistics Language...

Word Count : 1759

Internet linguistics

computational linguistics using large corpora" (PDF). Computational Linguistics. 19 (1): 1–24. McEnery, Tony; Wilson, Andrew (1996). Corpus Linguistics (PDF)...

Word Count : 8247

Cognitive linguistics

linguistics. Models and theoretical accounts of cognitive linguistics are considered as psychologically real, and research in cognitive linguistics aims...

Word Count : 3346

Linguistic description

In the study of language, description or descriptive linguistics is the work of objectively analyzing and describing how language is actually used (or...

Word Count : 1373

Structural linguistics

Structural linguistics, or structuralism, in linguistics, denotes schools or theories in which language is conceived as a self-contained, self-regulating...

Word Count : 4385

Hapax legomenon

In corpus linguistics, a hapax legomenon (/ˈhæpəks lɪˈɡɒmɪnɒn/ also /ˈhæpæks/ or /ˈheɪpæks/; pl. hapax legomena; sometimes abbreviated to hapax, plural...

Word Count : 3548

Geoffrey Leech

published papers. His main academic interests were English grammar, corpus linguistics, stylistics, pragmatics, and semantics. Leech was born in Gloucester...

Word Count : 1985

Word

"type"+"writ"+"er", and "can"+"not"). Since the beginning of the study of linguistics, numerous attempts at defining what a word is have been made, with many...

Word Count : 3882

Corpus language

be close to classical Arabic. Corpus languages are studied using the methods of corpus linguistics, but corpus linguistics can be used (and is commonly...

Word Count : 259