Source: Gistik Blog

Gistik Blog Corpus linguistics

How can corpus linguistics be useful in our everyday life?

Corpus linguistics is the study of language, and a method of linguistic analysis, that uses a collection of natural or "real world" texts known as a corpus. It is used to investigate a wide range of linguistic questions and offers a unique insight into the dynamics of language, which has made it one of the most widely used linguistic methodologies.

Main functions:

1. To give access to naturalistic linguistic information. Corpora are a valuable research source for dialectology, sociolinguistics and stylistics.
2. To facilitate linguistic research. Electronically readable corpora have dramatically reduced the time needed to find particular words or phrases.
3. To enable the study of wider patterns and the collocation of words. Before the advent of computers, corpus linguistics studied only single words and their frequency. Modern technology has made it possible to study wider patterns and collocations.
4. To allow analysis of multiple parameters at the same time. Various corpus linguistics software programmes, online marketing and analytical tools allow researchers to analyse a larger number of parameters simultaneously.
5. To support second-language study. Studying a second language through natural texts helps students get a better "feeling" for the language and learn it as it is used in real rather than "invented" situations.

There are things that corpus linguistics cannot explain:

1. It does not explain why. The study of corpora tells us what happened and how, but not why; for instance, it does not tell us why the frequency of a particular word has increased over time.
2. It does not represent the entire language. Linguistic analyses that use the methods and tools of corpus linguistics describe only the texts in the corpus; they do not represent the entire language.

A list of corpora resources can be found here.

To build your own corpus from the Oxford Text Archive on a Unix-based OS, you can run the following in a shell that supports brace expansion, such as bash:

    wget http://ota.ahds.ac.uk/text/{3001..5730}.txt

It will download 2730 texts, which is a good way to start playing with corpora.

Sources: Yauhen's NLProc blog; Northern Arizona Univ. docs; Corpus Linguistics and Text Corpora at Quora; Practical Introduction (pdf).
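As a rough illustration of the analyses listed under the main functions above (finding particular words, counting word frequency, and looking at collocations), here is a minimal Python sketch. It uses only the standard library, and the function names (tokenize, word_frequencies, collocations, concordance) and the sample sentence are made up for this example. Real corpus tools use far more careful tokenization and statistics, so treat this as a toy rather than a reference implementation.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def word_frequencies(tokens):
    """Count how often each word occurs."""
    return Counter(tokens)


def collocations(tokens, window=2):
    """Count ordered word pairs whose positions are less than `window` apart.

    With the default window=2 this counts adjacent pairs (bigrams).
    """
    pairs = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + window]:
            pairs[(left, right)] += 1
    return pairs


def concordance(tokens, keyword, width=3):
    """Return each occurrence of `keyword` with `width` words of context."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width) : i])
            right = " ".join(tokens[i + 1 : i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text standing in for a real corpus file.
sample = "The cat sat on the mat. The cat saw the dog."
tokens = tokenize(sample)

print(word_frequencies(tokens).most_common(3))
print(collocations(tokens).most_common(3))
for line in concordance(tokens, "cat", width=2):
    print(line)
```

To try it on downloaded texts, the sample string could be replaced with the contents of a file read from disk. Note that, as the limitations above say, the counts describe what is in the texts, not why a word is frequent.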
