Stop Toxic Messages in Real Time with an AI-Powered Moderation Bot for Discord

In this post, I’ll show you how to build an AI-powered moderator bot for the Discord chat platform using the Perspective API.

I’ve been fascinated by the topic of moderation (deciding who gets to post what on the internet) ever since I started working at the online dating site OkCupid five years ago. The moderation team there was responsible for the near-impossible task of drawing the line between which messages counted as risqué flirtation (usually ok), illicit come-ons (possibly ok), and sexual harassment (which would get you banned). As RadioLab put it in their excellent podcast episode on the topic, “How much butt is too much butt?” Questions like these are tough enough, and then, if you’re Twitter, you have to decide what to do when the President’s tweets violate your Terms of Service.

It’s a dirty job, but someone’s got to do it. Or do they? Can an AI handle moderation instead?

Want to see the video version of this post? Check it out here:

Some policy questions, like what to do with the President’s tweets or how to define hate speech, have no right answer. But in many more instances, and for many more platforms, bad content is easy to spot. You probably can’t share any sort of nudity or gore or hate speech on a professional networking app or an educational site for children. Plus, since most apps aren’t public forums like Facebook or Twitter (where we have strong expectations of free speech), the consequences of being too harsh or conservative in filtering risky content are lower. For these applications, machine learning can really help.

In this post, I’ll show you how to build your own AI-powered moderation bot for the chat platform Discord. Don’t worry if you’ve never done any machine learning before: we’ll use the Perspective API, a completely free tool from Google, to handle the complicated bits.

But, before we get into the tech-y details, let’s talk about some high-level moderation strategies. Most companies use one of three approaches:

  1. Pre-Moderation is when a team of human moderators reviews every single piece of content before it’s ever posted. It’s a good approach when it’s very, very important that no “bad” content slips through the cracks. Apple, for example, requires every app submitted to the App Store to be reviewed by an employee before it’s published.

  2. Post-Moderation is the opposite: content is allowed to be posted before it’s reviewed by moderators. The job of flagging posts usually gets crowdsourced to users, who are able to “flag” or “report” content they believe violates a site’s TOS. You see this almost everywhere (YouTube, Facebook, Instagram, and many more).

[Image: With post-moderation, users are encouraged to flag inappropriate content.]

Both of these approaches clearly come with drawbacks. Pre-moderation requires a large human moderation team, and doesn’t work for real-time applications (chat, or any type of streaming). Post-moderation scales better, but forces your users to consume potentially offensive or disturbing media. Which leads us to the third approach: AI-powered moderation.

AI moderators work around the clock and react near-instantaneously, so they can spot, for example, a livestream of a gory video seconds after it starts. Usually, AIs don’t replace a human moderation team, but work in conjunction with one, helping to automatically prioritize content that needs to be checked.

So, without further ado, let’s build an AI-powered moderator bot for the chat platform Discord.

Building an AI-Powered Moderation Bot for Discord

Want to jump straight to the code? Check it out here.

Today I’ll show you how to build an AI-powered moderation bot for Discord.

The bot sits in a Discord channel and analyzes all the messages users send to see if they’re toxic, nonsensical, flirty, insulting, or spam-y. When it detects that a message fits one of these buckets (e.g., toxic), it “reacts” to the message with an emoji (🧨). Later, you can check how many reactions each user has received using the “!karma” hot word. And if a user sends too many toxic messages, they get kicked from the channel.

Let’s see how to build it.

Setup

First let’s grab the code for our moderator bot. Clone the `making_with_ml` GitHub repo and navigate to the `discord_moderator` folder:

Here’s all the code you’ll need to run your ML moderator bot. Before we can run the bot, we’ll need to get some (completely free) services set up. First, make a copy of the file .env_template and name it .env.

Open that file up in your favorite text editor:

As you can see, we’ll need a couple of different API and developer tokens to get started: one for the Perspective API, which we’ll use for analyzing messages, and one for Discord (more on that in a second).

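If it helps to picture what that file holds, here is a minimal sketch that parses `.env`-style text and reports which settings are still missing. `PERSPECTIVE_API_KEY` and `KICK_THRESHOLD` are names used in this post; `DISCORD_TOKEN`, the placeholder values, and the helper names `parseEnv` and `missingKeys` are assumptions for illustration. A real project would more likely load the file with a package such as dotenv.

```javascript
// Illustrative .env contents. PERSPECTIVE_API_KEY and KICK_THRESHOLD are named
// in the post; DISCORD_TOKEN and all values here are assumed placeholders.
const exampleEnv = `
# Tokens for the moderator bot
PERSPECTIVE_API_KEY=paste-your-perspective-key-here
DISCORD_TOKEN=paste-your-discord-token-here
KICK_THRESHOLD=4
`;

// Parse KEY=VALUE lines, skipping blank lines and # comments.
function parseEnv(text) {
  const config = {};
  for (const rawLine of text.split('\n')) {
    const line = rawLine.trim();
    if (line === '' || line.startsWith('#')) continue;
    const eq = line.indexOf('=');
    if (eq === -1) continue;
    config[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return config;
}

// Return the names of required settings that are absent or empty.
function missingKeys(config, required) {
  return required.filter((key) => !config[key]);
}

const REQUIRED_KEYS = ['PERSPECTIVE_API_KEY', 'DISCORD_TOKEN', 'KICK_THRESHOLD'];
const config = parseEnv(exampleEnv);
console.log(missingKeys(config, REQUIRED_KEYS)); // empty array when every key is set
```

Checking for missing keys at startup gives a clearer error than a failed API call later on.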

What is the Perspective API? The Perspective API is a free tool built by Jigsaw, a unit within Google that, in their own words, “forecasts and confronts emerging threats, creating future-defining research and technology to keep our world safer.” The Perspective API is one such tool they provide for keeping the (digital) world safer. It takes text as input (“You stink like butts”) and analyzes it for attributes like toxicity, insults, profanity, identity attacks, sexual explicitness, flirtation, threats, and more. You can quickly try it out in their interactive web demo:

[Image: You can test the Perspective API on perspectiveapi.com.]

Today, The New York Times uses Perspective to help automatically moderate their comments section.

Unfortunately, the documentation for actually using the Perspective API in JavaScript is a bit sparse, so I’ll fill in some of the details here.

Enabling Perspective

First, sign in to your Google Cloud account (it’s free to get started), and either create a new project or select an existing one. In your project, enable the Perspective Comment Analyzer API. You’ll have to fill out a short survey to gain access (you should receive an email in a couple of hours).

Next you’ll need to generate an API key to access the API in code. In the Google Cloud console’s left-hand menu, click APIs & Services -> Credentials. On that screen, click “+ Create Credentials” -> “API key”. Copy that API key.

Now go back to the .env file you created earlier and drop the key into the PERSPECTIVE_API_KEY field:

Analyzing Messages

Now you should be able to use the Perspective API to analyze text in code. To see an example, check out the file perspective.js. At the top of the file, you'll see all of the possible attributes the API can recognize:

On the next line, you’ll see the attributes we’ll actually be using in our bot:

See all those numbers next to each attribute? When you ask the Perspective API to analyze a comment (“You’re soooo sexy”), it returns a “summaryScore” for each attribute:

The score represents roughly how confident the machine learning model is that a comment really is flirtatious or toxic or threatening, etc. It’s then up to you, the developer, to choose a “cutoff” for deciding when a comment should really get a label. That’s what all those numbers mean in the attributeThreshold object I posted above. I'll only consider a comment insulting or toxic or threatening if the summaryScore is above 0.75.

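To make the cutoff idea concrete, here is a minimal sketch of comparing each returned summaryScore against a per-attribute threshold. The 0.75 cutoff comes from this post and the `attributeScores` / `summaryScore.value` nesting follows the Perspective API response format. The attribute selection, the made-up scores in `sampleResponse`, and the helper name `labelsFor` are assumptions, not the repo's actual code.

```javascript
// Per-attribute cutoffs. 0.75 is the value used in this post; the attribute
// selection is an assumption based on what the bot is described as detecting.
const attributeThresholds = {
  TOXICITY: 0.75,
  INSULT: 0.75,
  FLIRTATION: 0.75,
  SPAM: 0.75,
  INCOHERENT: 0.75,
};

// Hand-written stand-in for an API response (the scores are made up).
const sampleResponse = {
  attributeScores: {
    TOXICITY: { summaryScore: { value: 0.12, type: 'PROBABILITY' } },
    INSULT: { summaryScore: { value: 0.08, type: 'PROBABILITY' } },
    FLIRTATION: { summaryScore: { value: 0.91, type: 'PROBABILITY' } },
  },
};

// Map each attribute to true when its score is above the cutoff.
// Attributes missing from the response are treated as not detected.
function labelsFor(response, thresholds) {
  const labels = {};
  for (const [attribute, cutoff] of Object.entries(thresholds)) {
    const score = response.attributeScores?.[attribute]?.summaryScore?.value;
    labels[attribute] = typeof score === 'number' && score > cutoff;
  }
  return labels;
}

console.log(labelsFor(sampleResponse, attributeThresholds));
```

With these sample scores only FLIRTATION comes back true. Keeping one cutoff per attribute, rather than a single global number, makes it easy to be stricter about some labels than others.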

Pro tip: In your own application, you’ll want to choose a cutoff that aligns with how your human moderation team (if you have one) is already moderating content. For example, on Tinder, sending flirtatious messages is totally ok, and we might have a higher cutoff for filtering sexually explicit messages than, say, a site like LinkedIn.

Meanwhile, take a look at the function analyzeText to see how we actually call the Perspective API:

To actually connect with the API, we call `const analyzer = new googleapis.commentanalyzer_v1alpha1.Commentanalyzer();`. We then package up a request on line 21, specifying our language and the attributes we want to analyze, and send it to the API. That's it! On line 30, we check to see if the scores returned from the Perspective API are above our threshold (0.75).

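The “package up a request” step can be sketched on its own, without any network call. The field names (`comment.text`, `languages`, `requestedAttributes`) follow the Comment Analyzer request format; the helper name `buildAnalyzeRequest` and the default language are assumptions for illustration, and actually sending the request with your API key is left out here.

```javascript
// Build the request body for a comment analysis call.
// Each requested attribute maps to an empty object, meaning default settings.
function buildAnalyzeRequest(text, attributes, language = 'en') {
  const requestedAttributes = {};
  for (const attribute of attributes) {
    requestedAttributes[attribute] = {};
  }
  return {
    comment: { text },
    languages: [language],
    requestedAttributes,
  };
}

const request = buildAnalyzeRequest("You're soooo sexy", ['TOXICITY', 'FLIRTATION']);
console.log(JSON.stringify(request, null, 2));
```

Building the body in a separate function keeps it easy to inspect or unit test before any quota is spent on real API calls.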

Congratulations, you can now use machine learning to analyze text! Now let’s throw that useful functionality into a Discord bot.

Setting Up a Discord Bot

If you’ve never used Discord, it’s a voice, video, and text chat platform that’s popular with gamers. You can use the methods in this post to build a bot for other messaging platforms, like Hangouts or Slack, but I chose Discord because it’s got such a delightful developer experience.

To get started, download Discord (or use the web version), and sign up for a Discord developer account. Once you’re in, click “New Application,” and give your new app a name and a description.

[Image: Create a new Discord application.]

On the left hand panel, choose “Bot” to create a new bot. Select “Add Bot.” Give your new Bot a username and upload a cute or intimidating user icon.

[Image: Give your bot a username and a profile pic.]

To be able to control your bot in code, you’ll need a Discord developer token, which you can grab straight from that bot page by clicking “Copy.” Drop that token in your `.env` file:

Now that you’ve created a bot in Discord, you can immediately add it to a channel. In any Discord app (I’m using the desktop app), log in and create a new server:

[Image: After you download the Discord app, just click on the plus button to create a new server.]

Now let’s add your bot to the server. Back in the Discord Developer Portal, in your application, click on OAuth on the left side panel.

Discord has a very nice system for handling bot permissions. Under “SCOPES,” tick off the box next to “bot.” This should open a “BOT PERMISSIONS” section below.

[Image: Check off whatever permissions your bot requires (ours are the three named below).]

Tick the permissions “Send Messages,” “Add Reactions” (for reacting to messages with emojis), and “Kick Members” (to kick bad members from the channel).

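Those three permissions end up as a single integer in the invite URL’s `permissions` parameter. As a rough sketch of how that number is assembled: the bit positions below are the ones Discord documents for these permissions, while the helper names `combinePermissions` and `inviteUrl` are made up for illustration.

```javascript
// Bit flags for the three permissions ticked above, per Discord's permission docs.
const PERMISSION_BITS = {
  KICK_MEMBERS: 1 << 1,   // 2
  ADD_REACTIONS: 1 << 6,  // 64
  SEND_MESSAGES: 1 << 11, // 2048
};

// OR the selected flags together into one permissions integer.
function combinePermissions(names) {
  return names.reduce((total, name) => total | PERMISSION_BITS[name], 0);
}

// Build an invite URL in the same shape as the one the Developer Portal shows.
function inviteUrl(clientId, permissions) {
  return `https://discord.com/api/oauth2/authorize?client_id=${clientId}&permissions=${permissions}&scope=bot`;
}

const permissions = combinePermissions(['SEND_MESSAGES', 'ADD_REACTIONS', 'KICK_MEMBERS']);
console.log(permissions); // 2114
console.log(inviteUrl('YOUR_CLIENT_ID', permissions));
```

Since 2 + 64 + 2048 = 2114, this reproduces the `permissions=2114` value in the example invite URL.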

If you scroll up, just above the BOT PERMISSIONS panel, you should see a URL, like: https://discord.com/api/oauth2/authorize?client_id=YOUR_CLIENT_ID&permissions=2114&scope=bot. Paste that URL in your browser. If everything's set up correctly, you should be able to add your bot to your server. And there you go! You created your first Discord bot without writing a single line of code. But right now, it doesn't do anything. You need to give it a brain.

Building a Discord Moderator

In this project, our bot’s brain lives in the file discord.js.

At the top of the file, we import the Perspective file I talked about earlier:

const perspective = require('./perspective.js');

Then we set up an emoji map, which tells the bot how to react when various attributes are detected:

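As an illustration of what such a map might look like, here is a small sketch. The 🧨 reaction for toxic messages is the one mentioned earlier in this post; the other emojis, the attribute keys, and the helper name `reactionsFor` are placeholders of my own choosing, not necessarily what the repo uses.

```javascript
// Attribute-to-emoji map. Only the 🧨 for toxicity is taken from the post;
// the remaining entries are placeholders.
const emojiMap = {
  TOXICITY: '🧨',
  INSULT: '👊',
  FLIRTATION: '💋',
  SPAM: '🐟',
  INCOHERENT: '🤪',
};

// Given per-attribute true/false labels, list the emojis to react with,
// in the order the attributes appear in emojiMap.
function reactionsFor(labels) {
  return Object.keys(emojiMap)
    .filter((attribute) => labels[attribute])
    .map((attribute) => emojiMap[attribute]);
}

console.log(reactionsFor({ TOXICITY: true, FLIRTATION: false, SPAM: true }));
```

For the sample labels this yields the 🧨 and 🐟 emojis. In a real bot, each one would then be added to the message as a reaction, for example with discord.js's `message.react`.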

Feel free to change those to whatever you’d like.

What we’d like our bot to do is analyze every message in a channel. We do that by creating a new Discord client and passing our developer token (that’s the last line in the file):

To listen for messages, we write a function that listens for the client’s message event:

There’s a lot going on here.

  • On line 4, we make sure that the messages we’re analyzing only come from other users (not bots).

  • On line 8, we allocate some memory to keep track of our users. This is how we’ll remember how many emojis we’ve given them, and how many times they’ve said toxic things.

  • On line 14, we call the evaluateMessage function, which uses the Perspective API to analyze a user's message. That function (which you can further investigate in the file) passes text to the Perspective API, responds with an emoji reaction if an attribute is found, and counts up the number of times a user has said something toxic. If it's more than KICK_THRESHOLD, a value set in our .env file, the function returns true (i.e., we should kick the user from the channel).

  • On line 19, we actually kick the user from the channel, using the function kickBaddie.

  • Finally, on line 27, we watch for the “!karma” hot word. If a user types this hot word, we’ll send a message with a roundup of the stats for users in the channel.

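The bookkeeping described in the steps above can be sketched without Discord at all. This is a simplified stand-in, not the repo's evaluateMessage: the function names `recordMessage` and `karmaReport`, the shape of the `users` object, the report format, and the threshold value of 2 are all assumptions, and the labels are passed in directly instead of coming from the Perspective API.

```javascript
// Record which attributes a message triggered for a user. Returns true when
// the user's toxic-message count is more than kickThreshold.
function recordMessage(users, userId, labels, kickThreshold) {
  if (!users[userId]) users[userId] = {};
  const stats = users[userId];
  for (const [attribute, flagged] of Object.entries(labels)) {
    if (flagged) stats[attribute] = (stats[attribute] || 0) + 1;
  }
  return (stats.TOXICITY || 0) > kickThreshold;
}

// Build the text a "!karma" reply might contain: one line per user.
function karmaReport(users) {
  const lines = Object.entries(users).map(([userId, stats]) => {
    const counts = Object.entries(stats)
      .map(([attribute, count]) => `${attribute}=${count}`)
      .join(', ');
    return `${userId}: ${counts || 'no reactions yet'}`;
  });
  return lines.length > 0 ? lines.join('\n') : 'No karma yet!';
}

// In-memory stats keyed by user id; these are lost when the process restarts.
const users = {};
const KICK_THRESHOLD = 2; // assumed value for the demo

recordMessage(users, 'alice', { TOXICITY: true, INSULT: true }, KICK_THRESHOLD);
recordMessage(users, 'alice', { TOXICITY: true }, KICK_THRESHOLD);
const shouldKick = recordMessage(users, 'alice', { TOXICITY: true }, KICK_THRESHOLD);
recordMessage(users, 'bob', { FLIRTATION: true }, KICK_THRESHOLD);

console.log(shouldKick); // true: three toxic messages is more than 2
console.log(karmaReport(users));
```

Because the counts live only in memory, a production bot would probably want to persist them somewhere so a restart does not wipe everyone's record.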

To really see what’s going on here in detail, you’ll have to look at the functions evaluateMessage and kickBaddie in the file. I've added lots of inline documentation. But in a nutshell, that's all there is to it.

So there you go: you have your own AI-powered moderator bot for Discord. What do you think?

Originally published at https://daleonai.com on June 30, 2020.

Translated from: https://medium.com/google-cloud/stop-toxic-messages-in-real-time-with-an-ai-powered-moderation-bot-for-discord-ea5c17d669c0
