无服务器:使用AWS SNS构建小型生产者/消费者数据管道

我想用Serverless创建一个小的数据管道,其主要用途是每天运行一次,调用API,然后将该数据加载到数据库中。

它主要用于从该API提取最新数据,但我也希望能够手动调用它并指定日期范围。

我创建了以下一对Lambda, 它们通过SNS主题相互通信

编码

无服务器

service: marks-blog

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python3.6
  timeout: 180
  iamRoleStatements:
    - Effect: 'Allow'
      Action:
        - "sns:Publish"
      Resource:
        - ${self:custom.BlogTopic}

custom:
  BlogTopic:
    Fn::Join:
      - ":"
      - - arn
        - aws
        - sns
        - Ref: AWS::Region
        - Ref: AWS::AccountId
        - marks-blog-topic

functions:
  message-consumer:
    name: MessageConsumer
    handler: handler.consumer
    events:
      - sns:
          topicName: marks-blog-topic
          displayName: Topic to process events
  message-producer:
    name: MessageProducer
    handler: handler.producer
    events:
      - schedule: rate(1 day)

handler.py

import boto3
import json
import datetime
from datetime import timezone
 
def producer(event, context):
    sns = boto3.client('sns')
 
    context_parts = context.invoked_function_arn.split(':')
    topic_name = "marks-blog-topic"
    topic_arn = "arn:aws:sns:{region}:{account_id}:{topic}".format(
        region=context_parts[3], account_id=context_parts[4], topic=topic_name)
 
    now = datetime.datetime.now(timezone.utc)
    start_date = (now - datetime.timedelta(days=1)).strftime("%Y-%m-%d")
    end_date = now.strftime("%Y-%m-%d")
 
    params = {"startDate": start_date, "endDate": end_date, "tags": ["neo4j"]}
 
    sns.publish(TopicArn= topic_arn, Message= json.dumps(params))
 
 
def consumer(event, context):
    for record in event["Records"]:
        message = json.loads(record["Sns"]["Message"])
 
        start_date = message["startDate"]
        end_date = message["endDate"]
        tags = message["tags"]
 
        print("start_date: " + start_date)
        print("end_date: " + end_date)
        print("tags: " + str(tags))

尝试一下

我们可以通过执行以下命令来模拟本地接收的消息:

$ serverless invoke local \
    --function message-consumer \
    --data '{"Records":[{"Sns": {"Message":"{\"tags\": [\"neo4j\"], \"startDate\": \"2017-09-25\", \"endDate\": \"2017-09-29\"  }"}}]}'
 
start_date: 2017-09-25
end_date: 2017-09-29
tags: ['neo4j']
null

这似乎很好。 如果我们在AWS上调用消息生成器怎么办?

$ serverless invoke --function message-producer
 
null

消费者收到消息了吗?

$ serverless logs --function message-consumer
 
START RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f Version: $LATEST
start_date: 2017-09-29
end_date: 2017-09-30
tags: ['neo4j']
END RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f
REPORT RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f	Duration: 0.46 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 32 MB

看起来像它! 我们还可以直接在AWS上调用使用者:

$ serverless invoke \
    --function message-consumer \
    --data '{"Records":[{"Sns": {"Message":"{\"tags\": [\"neo4j\"], \"startDate\": \"2017-09-25\", \"endDate\": \"2017-09-26\"  }"}}]}'
 
null

现在,如果我们查看使用者的日志,我们将看到两条消息:

$ serverless logs --function message-consumer
 
START RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f Version: $LATEST
start_date: 2017-09-29
end_date: 2017-09-30
tags: ['neo4j']
END RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f
REPORT RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f	Duration: 0.46 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 32 MB	
 
START RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed Version: $LATEST
start_date: 2017-09-25
end_date: 2017-09-26
tags: ['neo4j']
END RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed
REPORT RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed	Duration: 16.46 ms	Billed Duration: 100 ms 	Memory Size: 1024 MB	Max Memory Used: 32 MB

成功!

翻译自: https://www.javacodegeeks.com/2017/10/serverless-building-mini-producerconsumer-data-pipeline-aws-sns.html

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值