Python: Detecting Twitter Bots with Graphs and Machine Learning


The uptick in Twitter activity during the recent lockdown made the platform seem like a good place to look for a quarantine project that would build my machine learning skills. Specifically, as misinformation and baffling conspiracy theories took hold of the U.S.’s online population, finding new ways to identify bad actors seemed like an increasingly relevant task.

In this post, I’ll be demonstrating, with the help of some useful Python network graphing and machine learning packages, how to build a model for predicting whether Twitter users are humans or bots, using only a minimum viable graph representation of each user.
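
The graph representation itself is not spelled out at this point, so the following is only a sketch under one assumption: that a user's "minimum viable graph" is a small ego network built from who follows whom. The function name `ego_edge_list`, the `follows` dictionary, and the account names are hypothetical toy stand-ins, not the code or data used in this post.

```python
from collections import deque


def ego_edge_list(user, follows, max_hops=1):
    """Return (edges, index) for the ego network around `user`.

    `follows` maps an account name to the accounts it follows (toy data).
    Nodes are relabeled 0..n-1 with the ego as node 0; edges are treated
    as undirected and deduplicated.
    """
    # Breadth-first walk outward from the ego, up to `max_hops` steps.
    index = {user: 0}
    queue = deque([(user, 0)])
    while queue:
        name, depth = queue.popleft()
        if depth == max_hops:
            continue
        for friend in follows.get(name, []):
            if friend not in index:
                index[friend] = len(index)
                queue.append((friend, depth + 1))

    # Keep every follow relationship whose two endpoints were both reached.
    edges = set()
    for name, friends in follows.items():
        if name not in index:
            continue
        for friend in friends:
            if friend in index and friend != name:
                a, b = index[name], index[friend]
                edges.add((min(a, b), max(a, b)))
    return sorted(edges), index


# Hypothetical toy data: who follows whom.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["carol"],
    "carol": ["alice", "dave"],
    "dave": [],
}

edges, index = ego_edge_list("alice", follows)
print(edges)  # [(0, 1), (0, 2), (1, 2)]
```

An integer-labeled edge list like this could then be loaded into a networkx graph and passed to a whole-graph embedding model such as Graph2Vec, which, per the karateclub documentation, expects nodes indexed with consecutive integers starting at 0.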

Outline

1. Preliminary Research
2. Data Collection
3. Data Conversion
4. Training the Classification Model
5. Closing Thoughts / Room for Improvement

Technical Notes

All programming and data collection were done in a Jupyter Notebook. Libraries used:

tweepy
pandas
igraph
networkx
numpy
json
csv
ast
itemgetter (from operator)
re
Graph2Vec (from karateclub)
xgboost

Finally, four resources were key to this task; I will discuss them later in this writeup.

Let’s get to it!

Preliminary Research

While bot detection as a goal is nothing new, to the extent that a project like this would have been impossible without drawing on
