Twitter-Pandas: like git-pandas, but for Twitter

I’ve got a Python library called git-pandas that I’ve posted about here before and that people seem to like.  The idea is to provide a pandas-centric interface to the data in a git repository.  To start with, we added simple representations of common datasets (commits, file changes, branches, etc.), and as the library grew, we added specialized processing methods to ease common analyses (cumulative blame, bus factor, file owners, etc.).

This week, I’ve started a very similar project: twitter-pandas.

The goal of twitter-pandas is pretty much the same as the goal of git-pandas: to provide a simple, intuitive way to get data out of Twitter and into an easily usable format for data science and data analytics.  Because Twitter is a public API and not a privately held database (like git), this library carries the added responsibility of being a good API citizen.  That means not going over rate limits, and not forcing the user to reason about that kind of thing in the first place.  To do this, we combine two fantastic libraries: tweepy and pandas.  The library is still in active development, help is welcome from anyone who would like to pitch in, and we are targeting a v1.0.0 release in the next month or so.
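
To make the rate-limit point concrete, here is a minimal sketch of the general idea: a paging loop that backs off when the API says so, so the caller never has to think about it. This is not twitter-pandas code. `RateLimitError`, `fetch_page`, and `fetch_all` are hypothetical names made up for illustration, and the example uses only the standard library. (tweepy itself offers a `wait_on_rate_limit` option on its `API` class that serves a similar purpose.)

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises when rate limited."""

    def __init__(self, retry_after):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def fetch_all(fetch_page, limit, sleep=time.sleep):
    """Collect up to `limit` rows, pausing whenever the API asks us to back off.

    `fetch_page(cursor)` is a hypothetical callable returning
    (rows, next_cursor); next_cursor is None on the last page.
    """
    rows, cursor = [], None
    while len(rows) < limit:
        try:
            page, cursor = fetch_page(cursor)
        except RateLimitError as err:
            # The cursor is unchanged here, so the same page is retried.
            sleep(err.retry_after)
            continue
        rows.extend(page)
        if cursor is None:
            break
    return rows[:limit]
```

Passing `sleep` as a parameter is a small design choice that keeps the loop testable without actually waiting.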

The interface to twitter-pandas is very simple, and reminiscent of the interface to tweepy:

from twitterpandas import TwitterPandas

# create a twitter pandas client object
tp = TwitterPandas(
    TWITTER_OAUTH_TOKEN,
    TWITTER_OAUTH_SECRET,
    TWITTER_CONSUMER_KEY,
    TWITTER_CONSUMER_SECRET
)

# create a dataframe with 10 of my own followers
df = tp.followers(limit=10)
print(df.head())

# create a dataframe with my own information
df = tp.me()
print(df)

# get a dataframe with the information of user willmcginnis
df = tp.get_user(screen_name='willmcginnis')
print(df)

# get back 10 users who match the query willmcginnis
df = tp.search_users(query='willmcginnis', limit=10)
print(df)
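
As a rough illustration of what turning API results into a dataframe involves, the sketch below flattens nested records into flat rows with dotted column names. It is an assumption about the general technique, not the library's actual implementation; `flatten` is a hypothetical helper and the sample payloads are made up.

```python
def flatten(record, prefix=""):
    """Flatten nested dicts into a single level with dotted column names."""
    row = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            row.update(flatten(value, prefix=f"{name}."))
        else:
            row[name] = value
    return row


# Made-up payloads shaped loosely like user objects; the fields are illustrative only.
users = [
    {"screen_name": "alice", "followers_count": 120, "status": {"text": "hello"}},
    {"screen_name": "bob", "followers_count": 45, "status": {"text": "hi"}},
]
rows = [flatten(u) for u in users]
print(rows[0])
```

A list of flat dicts like `rows` can be handed straight to `pandas.DataFrame(rows)` to get one column per key.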

The different methods and API endpoints in tweepy are broken into a few categories:

  • user methods
  • timeline methods
  • status methods
  • direct message methods
  • friendship methods
  • account methods
  • favorite methods
  • block methods
  • saved search methods
  • help methods
  • list methods
  • trend methods
  • geo methods

So far I’ve implemented TwitterPandas methods only for the user methods.  Over the next few weeks the plan is to work through the methods in the remaining groups that return datasets (TwitterPandas is intended to be read-only at this stage).  The target for the version 1 release is a full implementation of all of these methods, including tests and documentation.

In version 2, the plan is to add in the higher level analysis methods on top of these building blocks, with functionality like:

  • People who I follow, but who don’t follow me back (and vice versa)
  • Top users of a hashtag (by different metrics)
  • Top followers of mine (by different metrics)
  • Follower growth charts
  • Any other useful features we think up along the way
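
The first item in that list comes down to a set difference between the accounts I follow and the accounts that follow me. A minimal sketch, assuming both are available as plain lists of screen names (`follow_gaps` is a hypothetical helper, not a planned twitter-pandas method):

```python
def follow_gaps(following, followers):
    """Return (not_following_back, not_followed_back) as sorted lists."""
    following, followers = set(following), set(followers)
    # Accounts I follow that do not follow me.
    not_following_back = sorted(following - followers)
    # Accounts that follow me that I do not follow.
    not_followed_back = sorted(followers - following)
    return not_following_back, not_followed_back


# Made-up sample data for illustration.
following = ["alice", "bob", "carol"]
followers = ["bob", "dave"]
not_following_back, not_followed_back = follow_gaps(following, followers)
print(not_following_back, not_followed_back)
```

With two dataframes instead of lists, an outer `DataFrame.merge` with `indicator=True` would be one way to get the same split.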

The code is up on GitHub, so check it out, open issues with suggestions, or pitch in if you’d like to help implement some of the methods mentioned above.

Original article: https://www.pybloggers.com/2016/05/twitter-pandas-like-git-pandas-but-for-twitter/
