How to Generate Mock Streaming Data Using Kinesis Data Generator

Kinesis Data Generator is a tool that generates mock data and sends it to Kinesis Firehose or Kinesis Streams. It is an easy way to test your pipeline with streaming data when you do not have enough real data to work with.

In this article, I use Kinesis Data Generator to send mock Stack Overflow data that mimics the JSON structure of the original data, which I streamed using StackAPI.

Step 1: Go to https://awslabs.github.io/amazon-kinesis-data-generator/web/help.html, create an Amazon Cognito user, and download the CloudFormation template.

Step 2: Configure the stack. Choose the "Template is ready" option.

Step 3: Upload the JSON file that you downloaded in Step 1.

Step 4: Specify a username and password.

Step 6: Leave the default options and create the stack.

Step 7: Open Kinesis Data Generator at the URL below and sign in with your credentials.

https://awslabs.github.io/amazon-kinesis-data-generator/web

Step 8: Generate streams using the template provided by Kinesis Data Generator.

You will need to select the region in which the Firehose delivery stream was created.

Choose the number of records per second to send.

Create a template similar to the actual Stack Overflow data.

Sample of the JSON Stack Overflow data

Below is the sample template I used:

{
  "questionid": {{random.number(100000)}},
  "view_count": {{random.number({"min":0, "max":1000})}},
  "is_answered": "{{random.arrayElement(["True","False"])}}",
  "answer_count": {{random.number({"min":0, "max":20})}},
  "score": {{random.number({"min":0, "max":50})}},
  "creation_date": {{random.arrayElement([1546300800])}}
}
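To preview the shape of the records before sending anything, the template can be approximated locally. The following is a minimal Python sketch, not part of Kinesis Data Generator: `mock_record` is a hypothetical helper, and it assumes that `random.number` treats its bounds as inclusive integers.

```python
import json
import random


def mock_record(rng: random.Random) -> dict:
    """Build one record with the same fields and ranges as the template.

    Assumption: the template's random.number bounds are inclusive.
    """
    return {
        "questionid": rng.randint(0, 100000),
        "view_count": rng.randint(0, 1000),
        # The template emits the strings "True"/"False", not JSON booleans.
        "is_answered": rng.choice(["True", "False"]),
        "answer_count": rng.randint(0, 20),
        "score": rng.randint(0, 50),
        # A one-element list always yields the same timestamp.
        "creation_date": rng.choice([1546300800]),
    }


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so repeated runs print the same records
    for _ in range(3):
        print(json.dumps(mock_record(rng)))
```

Checking a few printed records against the real Stack Overflow JSON is a quick way to catch a misspelled field name before the data lands in S3.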

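Here, I want creation_date to be constant. Because random.arrayElement is given a single-element array, every record gets the same value: the Unix timestamp 1546300800 (2019-01-01 00:00:00 UTC).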
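The value 1546300800 is a Unix timestamp in seconds. As a small sketch of how to check which date it stands for, and how to compute the equivalent value for another day, the Python standard library is enough. The helper names `epoch_to_utc` and `utc_midnight_epoch` are illustrative only.

```python
from datetime import datetime, timezone

FIXED_CREATION_DATE = 1546300800  # the constant used in the template


def epoch_to_utc(ts: int) -> datetime:
    """Convert a Unix timestamp (seconds) to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


def utc_midnight_epoch(day: datetime) -> int:
    """Return the Unix timestamp of 00:00:00 UTC on the given aware datetime's day."""
    midnight = day.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return int(midnight.timestamp())


if __name__ == "__main__":
    print(epoch_to_utc(FIXED_CREATION_DATE).isoformat())  # 2019-01-01T00:00:00+00:00
    # Value to paste into the template if today's date is wanted instead:
    print(utc_midnight_epoch(datetime.now(timezone.utc)))
```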


Step 10: Send data to Kinesis Firehose
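Kinesis Data Generator sends the records from the browser. For comparison, here is a hedged sketch of sending similar records from Python through the Firehose `PutRecordBatch` API. It is not what the tool does internally. `send_records` is a hypothetical helper that takes an already-constructed client (for example `boto3.client("firehose", region_name=...)`), and the trailing newline added to each record is my own choice so that records are line-delimited in S3.

```python
import json
from typing import Any, Iterable, List

MAX_BATCH = 500  # PutRecordBatch accepts at most 500 records per call


def _flush(client: Any, stream_name: str, batch: List[dict]) -> int:
    """Send one batch and return how many records Firehose reported as failed."""
    response = client.put_record_batch(
        DeliveryStreamName=stream_name, Records=batch
    )
    return response["FailedPutCount"]


def send_records(client: Any, stream_name: str, records: Iterable[dict]) -> int:
    """Send records in batches of up to MAX_BATCH; return the total failed count."""
    failed = 0
    batch: List[dict] = []
    for record in records:
        # Assumption: one JSON document per line, encoded as UTF-8 bytes.
        batch.append({"Data": (json.dumps(record) + "\n").encode("utf-8")})
        if len(batch) == MAX_BATCH:
            failed += _flush(client, stream_name, batch)
            batch = []
    if batch:
        failed += _flush(client, stream_name, batch)
    return failed
```

A non-zero return value means some records were rejected and would need to be retried; this sketch only counts them.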


Firehose delivers the streamed records to your S3 bucket in the format you specified when you created the delivery stream.
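When checking the delivered objects, keep in mind that Firehose concatenates records as it receives them and does not add a delimiter of its own, so an object may hold several JSON documents back to back. The sketch below shows one way to split such content in Python. It assumes the object was written as plain, uncompressed JSON text and has already been read into a string; `iter_json_objects` is a hypothetical helper name.

```python
import json
from typing import Iterator


def iter_json_objects(text: str) -> Iterator[dict]:
    """Yield each JSON document from text holding concatenated documents.

    Works whether the documents are separated by whitespace, newlines,
    or nothing at all.
    """
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # Skip whitespace between documents; raw_decode does not accept it.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


if __name__ == "__main__":
    sample = '{"questionid": 1, "score": 5}{"questionid": 2, "score": 7}\n'
    for record in iter_json_objects(sample):
        print(record["questionid"], record["score"])
```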


Translated from: https://medium.com/@snehamehrin22/how-to-generate-mock-streaming-data-using-kinesis-data-generator-a3dce7d43236
