N-GRAMS

These n-grams are based on the largest publicly available, genre-balanced corpus of English -- the 450 million word Corpus of Contemporary American English (COCA). With this n-gram data (2-, 3-, 4-, and 5-word sequences, each with its frequency), you can carry out powerful queries offline -- without needing to access the corpus via the web interface.
A few examples (from among an unlimited number of searches) might be:
The data is available in several different formats:
If you're interested in the frequency of single words (including frequency by genre and sub-genre), or collocates (all words "nearby" a given word), you might look at http://www.wordfrequency.info.
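
As a rough illustration of the kind of offline query mentioned above, the sketch below loads n-gram lines and lists the most frequent continuations of a given first word. It is only a sketch under stated assumptions: the tab-separated layout (frequency, then the words), the function names `load_ngrams` and `top_followers`, and the sample counts are all hypothetical and are not taken from the actual COCA files, whose format may differ.

```python
import io
from collections import Counter

# ASSUMPTION: each line is "frequency<TAB>word1<TAB>word2[...]".
# The real download format may differ; adjust the parsing to match.
# The counts below are invented for illustration, not real COCA frequencies.
SAMPLE = (
    "275\tstrong\ttea\n"
    "120\tstrong\twind\n"
    "48\tpowerful\tengine\n"
    "9\tstrong\tengine\n"
)


def load_ngrams(handle):
    """Read n-gram lines into a Counter keyed by a tuple of lowercased words."""
    table = Counter()
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        # Skip blank or malformed lines (need a numeric count and >= 2 words).
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        words = tuple(word.lower() for word in fields[1:])
        table[words] += int(fields[0])
    return table


def top_followers(table, first_word, limit=3):
    """Return the most frequent continuations of n-grams starting with first_word."""
    matches = Counter()
    target = first_word.lower()
    for words, freq in table.items():
        if words[0] == target:
            matches[words[1:]] += freq
    return matches.most_common(limit)


# In practice the handle would be an opened data file rather than a string.
table = load_ngrams(io.StringIO(SAMPLE))
followers = top_followers(table, "strong", limit=2)
```

Here `followers` holds the two most frequent words after "strong" in the sample, each paired with its count. Keeping the whole table in memory is the simplest choice for a sketch; for the full data set, a database or an on-disk index would likely be more practical.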