DOWNLOADING DATA FOR OFFLINE USE

This site contains information on the top 60,000 words (lemmas) in English: their overall frequency, their frequency by genre (spoken, fiction, magazine, newspaper, and academic), 20-30 collocates (nearby words) for each word, which provide insight into meaning and usage, and about two hundred re-sortable concordance lines for each word. However, there are some limits on use via the web interface.

If you need more data than the web interface provides, you might consider downloading some of this data for offline use. The downloadable data is often more extensive: for example, 200-300 collocates per word (vs. 20-30 here), and frequency information for not just the five main genres but also the forty sub-genres (e.g. Newspaper-Financial, Academic-Medicine, etc.). Some of the data is available for free, and some must be purchased (there is a discount for teachers or students at a school, college, or university). More information...

And if you're looking for free downloadable data on the frequency of phrases, you might consider http://www.ngrams.info, which contains the top one million phrases for two-, three-, four-, and five-word sequences in English.