Fetching Twitter Data

Fetching Tweets

Applying for Twitter API Access

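  1. Register a Twitter account.
  2. Go to https://dev.twitter.com/apps and click "Create apps".
  3. Fill in the Twitter API application form.
  4. Click Confirm to finish.
  5. In Application Management, open the Keys and Access Tokens tab and click Generate Access Token.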
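
Once step 5 is done you hold four values: a consumer key, a consumer secret, an access token and an access token secret. One way to keep them out of the script is to read them from environment variables. The sketch below is only an illustration: the variable names and the `load_credentials` helper are hypothetical and not part of tweepy or the Twitter API.

```python
import os

# Hypothetical environment variable names; use whatever naming suits your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_KEY",
    "TWITTER_ACCESS_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing or blank."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("missing credentials: " + ", ".join(missing))
    # Strip stray whitespace that often sneaks in when copying keys from a web page.
    return {name: env[name].strip() for name in REQUIRED_VARS}
```

The returned values can then be passed to the authentication calls in place of hard-coded strings.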

Using tweepy to fetch Trump's recent tweets

# -*- coding: utf-8 -*-
"""
Created on Fri Jan  6 18:31:59 2017

@author: caofk
"""
import random
import re
import time

import pandas as pd
import tweepy  # https://github.com/tweepy/tweepy

#Twitter API credentials
consumer_key = " "
consumer_secret = " "
access_key = " "
access_secret = " "

screen_name = "realDonaldTrump"

#authorize twitter, initialize tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

#initialize a list to hold all the tweepy Tweets
alltweets = []  

#make initial request for the most recent tweets (50 per request here; 200 is the maximum allowed count)
new_tweets = api.user_timeline(screen_name = screen_name,count=50)

#save most recent tweets
alltweets.extend(new_tweets)

#save the id of the oldest tweet less one (None if the timeline came back empty)
oldest = alltweets[-1].id - 1 if alltweets else None


#keep grabbing tweets until there are no tweets left to grab
while len(new_tweets) > 0:
    print("getting tweets before %s" % (oldest))
    #all subsequent requests use the max_id param to prevent duplicates
    while True:
        try:
            new_tweets = api.user_timeline(screen_name = screen_name,count=50,max_id=oldest)
            break
        except Exception as e:
            #the request failed (for example a rate limit or network error):
            #wait 5 to 10 minutes, then retry the same request
            print(e)
            time.sleep(random.randint(300, 599))
    #save the newly fetched tweets
    alltweets.extend(new_tweets)
    #update the id of the oldest tweet less one
    oldest = alltweets[-1].id - 1
    print("...%s tweets downloaded so far" % (len(alltweets)))


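#collect the id, creation time and whitespace-normalized text of every tweet
outtweets = pd.DataFrame()
outtweets["Tweet ID"] = [tweet.id_str for tweet in alltweets]
outtweets["Created At"] = [tweet.created_at for tweet in alltweets]
outtweets["Text"] = [re.sub(r'\s+'," ", tweet.text) for tweet in alltweets]
#written as .xlsx because pandas 2.0 and later can no longer write the legacy .xls format
outtweets.to_excel("E:\\"+ screen_name + 's_tweets.xlsx')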
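
The paging logic in the script (request a page, remember the oldest id minus one, pass it as `max_id`, stop on an empty page) can be tried without any network access. The sketch below is a minimal illustration under assumptions: `fetch_all`, `make_fake_timeline` and `FakeTweet` are hypothetical names, and the fake fetcher only imitates the behavior the script relies on from `api.user_timeline`, namely newest-first results whose ids are no greater than `max_id`.

```python
from collections import namedtuple

# Stand-in for a tweepy status object; only the fields used here.
FakeTweet = namedtuple("FakeTweet", ["id", "text"])


def make_fake_timeline(ids):
    """Build a fetch function over an in-memory timeline, newest first."""
    tweets = sorted(
        (FakeTweet(i, "tweet %d" % i) for i in ids),
        key=lambda t: t.id,
        reverse=True,
    )

    def fetch_page(count, max_id=None):
        # Like the real endpoint, return at most `count` tweets with id <= max_id.
        eligible = [t for t in tweets if max_id is None or t.id <= max_id]
        return eligible[:count]

    return fetch_page


def fetch_all(fetch_page, page_size=50):
    """Walk backwards through a timeline using max_id until a page is empty."""
    collected = []
    max_id = None
    while True:
        page = fetch_page(count=page_size, max_id=max_id)
        if not page:
            break  # an empty page means nothing older is left
        collected.extend(page)
        # One below the oldest id seen, so the next page cannot repeat it.
        max_id = page[-1].id - 1
    return collected


if __name__ == "__main__":
    everything = fetch_all(make_fake_timeline(range(1, 121)), page_size=50)
    print("%s tweets collected" % len(everything))
```

Because the empty-page check happens before anything is indexed, this variant also copes with an account that has no tweets at all.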