Using twitter-pandas to find friends who don't follow you back
Over the past couple of months we’ve been gradually working on twitter-pandas, a pandas DataFrame-based interface to Twitter data (powered by tweepy behind the scenes). I’ve previously posted about the first limited release here.
The initial release was focused on just replicating the tweepy API as best as we could as a first building block to a more concise and usable library. Similarly to the development of git-pandas, we do that by being users ourselves. So one by one, we pick off useful types of analysis that may be done with twitter data, and do them with twitter-pandas, adding functionality, clarity and stability to the code used along the way.
In twitter-pandas, the first such analysis is simple: “which of the people I follow don’t follow me back?”.
To answer this question in twitter-pandas, we only need to hit one method:
from twitterpandas import TwitterPandas
from keys import TWITTER_OAUTH_SECRET, TWITTER_OAUTH_TOKEN, TWITTER_CONSUMER_SECRET, TWITTER_CONSUMER_KEY

if __name__ == '__main__':
    # create a twitter pandas client object
    tp = TwitterPandas(
        TWITTER_OAUTH_TOKEN,
        TWITTER_OAUTH_SECRET,
        TWITTER_CONSUMER_KEY,
        TWITTER_CONSUMER_SECRET
    )

    # get our own user id
    user_id = tp.api_id

    # use it to find all of our own friends (people we follow)
    df = tp.friends_friendships(id_=user_id, rich=True)
    total_friends = df.shape[0]

    # filter the df down to only those who don't follow us back
    df = df[df['target_follows_source'] == False]

    # print out the info:
    print("A total of %d of those who I follow on twitter, don't follow me back." % (df.shape[0], ))
    print("...that's about %4.2f%% of them.\n" % ((float(df.shape[0]) / total_friends) * 100, ))
    print(df['target_user_screen_name'].values.tolist())
Which will yield:
A total of 109 of those who I follow on twitter, don't follow me back.
...that's about 59.89% of them.
['user1', ... , 'user2']
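If you don't have API credentials handy, the filtering step can be tried on a hand-built frame. This is only a minimal sketch: it assumes nothing beyond the two column names used in the example above, and the rows are invented for illustration rather than taken from a real friends_friendships() result.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by friends_friendships();
# only the two columns used in the example are modelled, and the rows are made up.
friends = pd.DataFrame({
    'target_user_screen_name': ['alice', 'bob', 'carol', 'dave'],
    'target_follows_source': [True, False, False, True],
})

total_friends = len(friends)

# keep only the rows where the friend does not follow us back
not_following_back = friends[~friends['target_follows_source']]
share = 100.0 * len(not_following_back) / total_friends

print('%d of %d friends do not follow back (%.2f%%)'
      % (len(not_following_back), total_friends, share))
print(not_following_back['target_user_screen_name'].tolist())
```

The boolean mask with `~` is equivalent to the `== False` comparison above as long as the column holds real booleans.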
Twitter’s API limits the number of requests you can issue in a 15-minute window, connections can time out, requests can fail, and plenty of other problems can arise in the minutes or hours this takes to run (depending on how many people you follow). Unlike with a lower-level library such as tweepy (which we use under the hood), twitter-pandas handles all of that for you.
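To give a feel for the kind of bookkeeping involved, here is a rough sketch of a wait-and-retry wrapper. It is not how twitter-pandas or tweepy implement it; `RateLimitError`, `call_with_retry`, and `flaky_request` are hypothetical names made up for this illustration, and the 15-minute wait simply mirrors the window mentioned above.

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises on a rate limit."""


def call_with_retry(func, max_attempts=5, wait_seconds=15 * 60, sleep=time.sleep):
    """Call func(); on RateLimitError, wait and try again up to max_attempts times."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts, let the caller see the error
            sleep(wait_seconds)


# Fake request that fails twice before succeeding, so the sketch runs offline.
calls = {'count': 0}


def flaky_request():
    calls['count'] += 1
    if calls['count'] < 3:
        raise RateLimitError('rate limit exceeded')
    return ['user1', 'user2']


# Record the waits instead of actually sleeping for 15 minutes.
waits = []
result = call_with_retry(flaky_request, sleep=waits.append)
print(result, waits)
```

Passing the sleep function in as an argument keeps the wrapper testable without real delays; a long-running job would just use the default `time.sleep`.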
The goal is to allow data scientists, researchers, analysts and others to get their data, in the format they want it, simply.
This example uses the current master branch, which is not yet released to PyPI but will be in version 0.0.2 of twitter-pandas. To install from master and try this out before the release, just use:
pip install git+https://github.com/wdm0006/twitter-pandas.git
And if you’re interested in contributing, there are a few of us working on this, and there’s a ton left to do. Find us at:
Translated from: https://www.pybloggers.com/2016/07/using-twitter-pandas-to-find-friends-who-dont-follow-you-back/