ASU 25 million tweets
http://www.public.asu.edu/~mdechoud/datasets.html
This dataset comprises about 10.5 million tweets from 200,000 users, along with each user's time zone, location, status count, favorite count, follower and following counts, and social graph information. The tweets date from 2006 to 2009. The dataset is useful for temporal analysis of posting activity on Twitter with respect to information flow and social network topology. I'm puzzled why the original paper and TI both used a larger set of 25 million tweets.
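As a rough sketch of the kind of temporal analysis mentioned above, the snippet below buckets tweet timestamps by month. The function name `tweets_per_month` and the sample timestamps are made up for illustration; the actual file layout of the ASU dataset is not described here, so parsing the raw files into `datetime` objects is assumed to have been done already.

```python
from collections import Counter
from datetime import datetime


def tweets_per_month(timestamps):
    """Count tweets per (year, month) bucket, sorted chronologically.

    `timestamps` is any iterable of datetime objects (hypothetical input;
    the real dataset would need its own parsing step first).
    """
    counts = Counter((ts.year, ts.month) for ts in timestamps)
    return dict(sorted(counts.items()))


# Invented sample timestamps, not taken from the dataset.
sample = [
    datetime(2008, 5, 1, 9, 30),
    datetime(2008, 5, 17, 22, 5),
    datetime(2009, 1, 3, 12, 0),
]
monthly = tweets_per_month(sample)
```

The same bucketing could be done per user or per time zone to compare posting activity across parts of the social graph.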
Edinburgh Twitter Corpus, 97 million tweets
http://homepages.inf.ed.ac.uk/miles/papers/socmed10.pdf
User information is anonymized. Only the date is associated with each message.
Twitter TREC, 16 million tweets
Twitter User Graph
http://an.kaist.ac.kr/traces/WWW2010.html
I have downloaded this fairly large dataset. The current zip file (4.6 GB) is stored on dlib's 146.245 server.
Sina Weibo (新浪微博): One group has crawled information on about 30 million users within ten days. I wrote a simple crawler this summer and fetched information on only 10 million users. Compared with Twitter, no thorough analysis has yet been conducted on this new platform.