COMPARE TO THE STANDARD COCA INTERFACE

This site is based on frequency data from the 450-million-word Corpus of Contemporary American English (COCA), which is the largest and most up-to-date corpus of English that is freely available online. The corpus is composed of more than 170,000 texts from 1990-2012, and its total size is divided evenly among five genres: spoken, fiction, popular magazines, newspapers, and academic. In other words, it represents many different genres of recent English.

The following comparison shows what is available here and what is available via the regular COCA website.

Audience

WordAndPhrase.info: Mainly students and teachers, as well as the general public

COCA (standard interface): Mainly linguists, but also students, teachers and others

Unique features

WordAndPhrase.info:

- Enter texts and then get detailed information on words
- Enter texts, choose phrases in the text, and then find related phrases in COCA
- Browse through and search frequency lists
- Lots of information on one screen: overall frequency, frequency by genre, collocates, concordances, and synonyms
- Easy to move from one related word to another (e.g. synonyms)
- Integration of WordNet, for more specific / more general words
- Results are "pre-processed", for much faster retrieval

COCA (standard interface):

- Search by phrase and grammatical construction
- See frequency over time (1990-2012), e.g. which words or phrases are increasing over time
- See frequency in sub-genres (e.g. Magazine-Financial or Academic-Legal)
- More control over collocates, e.g. just to the left or right of the node word, and varying number of words to the left or right
- Sort results by Mutual Information score, to see how closely related two words are
- Compare the collocates of two related words, to compare meaning and usage (e.g. men and women, small and little, rob and steal)
- Compare two sections of the corpus, e.g. collocates of chair in fiction and academic, or all adjectives that are much more common in Academic than Fiction
- Can save queries and results for later use and viewing, and can save customized word lists for later use as a part of queries (e.g. words related to the family, or emotions, or colors)
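
The idea of sorting collocates by Mutual Information score, mentioned in the COCA features above, can be illustrated with a minimal Python sketch. It uses a common pointwise mutual information formula with a span adjustment. This is an assumption: this page does not state the exact formula the COCA interface uses. The function name and all counts below are hypothetical and made up for illustration; they are not real COCA data.

```python
import math


def mutual_information(pair_count, node_count, collocate_count, corpus_size, span=1):
    """Pointwise mutual information (in bits) between a node word and a collocate.

    Assumed formula (not taken from this page):
        MI = log2((pair_count * corpus_size) / (node_count * collocate_count * span))
    """
    if min(pair_count, node_count, collocate_count, corpus_size, span) <= 0:
        raise ValueError("all counts must be positive")
    return math.log2(
        (pair_count * corpus_size) / (node_count * collocate_count * span)
    )


# Hypothetical counts, for illustration only (not real COCA data).
CORPUS_SIZE = 450_000_000
node_count = 20_000  # occurrences of the node word in the corpus
collocates = {
    # collocate: (occurrences near the node word, occurrences in the whole corpus)
    "the": (6_000, 25_000_000),
    "rare_word": (300, 5_000),
}

# Score every collocate, assuming a window of 8 words around the node word.
scores = {
    word: mutual_information(pair, node_count, total, CORPUS_SIZE, span=8)
    for word, (pair, total) in collocates.items()
}

# Print collocates from the highest to the lowest score.
for word, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{word}: {score:.2f}")
```

With these made-up counts, the rare collocate scores much higher than the frequent function word, even though the function word occurs near the node word far more often. In that sense the score reflects how closely two words are related, not just how often they appear together.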