Corpora are collections of spoken and/or written language that are stored and processed electronically. They are used in corpus linguistics to find patterns in language use, e.g. the collocations of words, grammatical structures in spoken language, word frequencies and types of error. Corpora in ELT can contain a range of different types of texts or be based on particular genres, e.g. ESP texts, or on proficient-speaker or learner English. Examples of English language corpora are the British National Corpus (BNC), the Corpus of Contemporary American English (COCA), the Cambridge English Corpus (CEC) and the Louvain Corpus of Native English Essays (LOCNESS).
"The corpus shows this word collocates more frequently with 'slightly' than with 'a little'."
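The kind of comparison made in this example sentence can be illustrated with a short script. What follows is only a minimal sketch: the five-sentence toy corpus, the node word "different", and the helper names `tokenize` and `count_left_collocates` are all invented for illustration, and a real study would query a large corpus such as those named above with dedicated corpus software.

```python
import re

# Toy corpus invented for illustration only; not taken from any real corpus.
TOY_CORPUS = """
The results were slightly different this year.
Her answer was slightly different from mine.
The second draft is a little different.
Prices are slightly higher in the city.
He felt a little tired after the walk.
"""


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def count_left_collocates(tokens, node, collocates):
    """Count how often each collocate occurs immediately before the node word.

    A collocate may be one word ("slightly") or several ("a little").
    """
    counts = {c: 0 for c in collocates}
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        for c in collocates:
            words = c.split()
            # Compare the words directly to the left of the node word.
            if i >= len(words) and tokens[i - len(words):i] == words:
                counts[c] += 1
    return counts


counts = count_left_collocates(
    tokenize(TOY_CORPUS), "different", ["slightly", "a little"]
)
print(counts)  # {'slightly': 2, 'a little': 1}
```

In this toy data "different" follows "slightly" twice and "a little" once. Counts this small prove nothing about real usage; the sketch only shows the mechanics of counting collocates to the left of a chosen word.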
International Journal of Corpus Linguistics, John Benjamins Publishing.
Carter, R. & McCarthy, M. (1995). Grammar and the spoken language. Applied Linguistics 16:2, 141-158.
Cheng, W., Warren, M. & Xu, X.F. (2003). The language learner as language researcher: putting corpus linguistics on the timetable. System 31:2, 173-186.
Granger, S., Hung, J. & Petch-Tyson, S. (eds) (2002). Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam: John Benjamins.
O’Keeffe, A., McCarthy, M. & Carter, R. (2007). From Corpus to Classroom. Cambridge: Cambridge University Press.
Sinclair, J. (ed.) (2004). How to Use Corpora in Language Teaching. Amsterdam: John Benjamins.