Here is another free word frequency list, a must-have for learning English.

N-GRAMS
from the COCA and COHA corpora of American English


These n-grams are based on the largest publicly available, genre-balanced corpus of English -- the 450 million word Corpus of Contemporary American English (COCA). With this n-gram data (2-, 3-, 4-, and 5-word sequences, each with its frequency), you can carry out powerful queries offline -- without needing to access the corpus via the web interface.

A few examples (from among an unlimited number of searches) might be:

 NOUN + NOUN sequences
 VERB + the + NOUN sequences
 like + word + word
 three-word strings with a preposition in the middle position
 two-word strings, where the words begin or end with certain letters
 (potential) phrasal verb: VERB + ADV particle
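
As a rough illustration of what such offline queries could look like, here is a minimal Python sketch. It assumes a tab-separated 2-gram file with the columns frequency, word 1, word 2, part-of-speech tag 1, part-of-speech tag 2, and CLAWS-style tags (nn... for nouns, vv... for lexical verbs, rp for adverbial particles). The actual layout of the downloaded files may differ, so check it before reusing this. The sample rows, the Bigram type, and the helper names are hypothetical and the counts are made up for the example.

```python
import csv
import io
from typing import Iterable, Iterator, List, NamedTuple


class Bigram(NamedTuple):
    freq: int
    w1: str
    w2: str
    pos1: str
    pos2: str


# Made-up rows for illustration only; these are NOT real COCA counts.
# Assumed column order: frequency, word1, word2, pos1, pos2 (tab-separated).
SAMPLE = (
    "120\tlook\tup\tvv0\trp\n"
    "95\topen\tthe\tvv0\tat\n"
    "80\thealth\tcare\tnn1\tnn1\n"
    "40\tgive\tup\tvvi\trp\n"
    "15\ttime\tperiod\tnn1\tnn1\n"
)


def read_bigrams(handle: Iterable[str]) -> Iterator[Bigram]:
    """Yield one Bigram per well-formed line, skipping anything else."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 5 or not row[0].isdigit():
            continue  # skip headers, blank lines, or malformed rows
        freq, w1, w2, pos1, pos2 = row
        yield Bigram(int(freq), w1, w2, pos1, pos2)


def by_freq(items: Iterable[Bigram]) -> List[Bigram]:
    """Sort matches from most to least frequent."""
    return sorted(items, key=lambda b: b.freq, reverse=True)


def match_pos(bigrams: Iterable[Bigram], tag1: str, tag2: str) -> List[Bigram]:
    """Keep 2-grams whose POS tags start with the given prefixes."""
    return by_freq(
        b for b in bigrams
        if b.pos1.startswith(tag1) and b.pos2.startswith(tag2)
    )


def match_letters(bigrams: Iterable[Bigram], start1: str, end2: str) -> List[Bigram]:
    """Keep 2-grams where word 1 starts with start1 and word 2 ends with end2."""
    return by_freq(
        b for b in bigrams
        if b.w1.startswith(start1) and b.w2.endswith(end2)
    )


# For a real file, replace io.StringIO(SAMPLE) with
# open(path, encoding="utf-8", newline="") for whatever path you saved it to.
bigrams = list(read_bigrams(io.StringIO(SAMPLE)))
noun_noun = match_pos(bigrams, "nn", "nn")       # NOUN + NOUN sequences
phrasal = match_pos(bigrams, "vv", "rp")         # VERB + ADV particle
letters = match_letters(bigrams, "t", "d")       # t... + ...d

for b in noun_noun:
    print(b.freq, b.w1, b.w2)
```

Matching on tag prefixes rather than full tags keeps the queries short (nn covers singular and plural nouns alike), at the cost of depending on the tagset actually used in the files.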

The data is available in several different formats:

1. Free lists

The million most frequent 2, 3, 4, and 5-grams

2. Inexpensive data sets

All n-grams that occur three times or more: 6.2 million 2-grams, 11.9 million 3-grams, and 8.3 million 4-grams

3. All 2, 3, and 4-grams

Up to 155 million distinct strings -- searchable by word form and part of speech (as above), and also by lemma

If you're interested in the frequency of single words (including frequency by genre and sub-genre), or collocates (all words "near by" a given word), you might look at http://www.wordfrequency.info.
