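I wanted to create a small data pipeline with Serverless whose main job is to run once a day, call an API, and load the resulting data into a database.
It mostly pulls the latest data from that API, but I also want to be able to invoke it manually and specify a date range.
I created the following pair of lambdas, which communicate with each other via an SNS topic.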
The code
serverless.yml
service: marks-blog

frameworkVersion: ">=1.2.0 <2.0.0"

provider:
  name: aws
  runtime: python3.6
  timeout: 180
  iamRoleStatements:
    - Effect: 'Allow'
      Action:
        - "sns:Publish"
      Resource:
        - ${self:custom.BlogTopic}

custom:
  BlogTopic:
    Fn::Join:
      - ":"
      - - arn
        - aws
        - sns
        - Ref: AWS::Region
        - Ref: AWS::AccountId
        - marks-blog-topic

functions:
  message-consumer:
    name: MessageConsumer
    handler: handler.consumer
    events:
      - sns:
          topicName: marks-blog-topic
          displayName: Topic to process events
  message-producer:
    name: MessageProducer
    handler: handler.producer
    events:
      - schedule: rate(1 day)
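The Fn::Join in the custom section glues the ARN parts together with ":" so that the IAM statement can refer to the topic, and the producer in handler.py rebuilds the same string at runtime from its own function ARN. As a rough illustration of what that string looks like, here is a minimal sketch. The helper names build_topic_arn and topic_arn_from_function_arn are my own, and the account id and region in the example are made up.

```python
def build_topic_arn(region, account_id, topic_name, partition="aws"):
    """Join the ARN parts with ':' in the same order as the Fn::Join list."""
    return ":".join(["arn", partition, "sns", region, account_id, topic_name])


def topic_arn_from_function_arn(function_arn, topic_name):
    """Derive the topic ARN from a Lambda function ARN.

    A Lambda function ARN has the form
    arn:aws:lambda:<region>:<account-id>:function:<name>
    so index 3 is the region and index 4 is the account id.
    """
    parts = function_arn.split(":")
    return build_topic_arn(parts[3], parts[4], topic_name, partition=parts[1])


if __name__ == "__main__":
    # Hypothetical region and account id, for illustration only.
    example = "arn:aws:lambda:us-east-1:123456789012:function:MessageProducer"
    print(topic_arn_from_function_arn(example, "marks-blog-topic"))
    # prints arn:aws:sns:us-east-1:123456789012:marks-blog-topic
```

This assumes the topic lives in the same region and account as the function, which is the case for the configuration above.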
handler.py
import boto3
import json
import datetime
from datetime import timezone


def producer(event, context):
    sns = boto3.client('sns')

    # Rebuild the topic ARN from the region and account id in this function's own ARN
    context_parts = context.invoked_function_arn.split(':')
    topic_name = "marks-blog-topic"
    topic_arn = "arn:aws:sns:{region}:{account_id}:{topic}".format(
        region=context_parts[3], account_id=context_parts[4], topic=topic_name)

    # Default window: yesterday to today (UTC)
    now = datetime.datetime.now(timezone.utc)
    start_date = (now - datetime.timedelta(days=1)).strftime("%Y-%m-%d")
    end_date = now.strftime("%Y-%m-%d")

    params = {"startDate": start_date, "endDate": end_date, "tags": ["neo4j"]}

    sns.publish(TopicArn=topic_arn, Message=json.dumps(params))


def consumer(event, context):
    # A single invocation may carry several SNS records
    for record in event["Records"]:
        message = json.loads(record["Sns"]["Message"])

        start_date = message["startDate"]
        end_date = message["endDate"]
        tags = message["tags"]

        print("start_date: " + start_date)
        print("end_date: " + end_date)
        print("tags: " + str(tags))
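The producer above always publishes a window of yesterday to today. Since I also want to be able to pass a date range when invoking it by hand, one possible approach is to read optional keys from the incoming event and fall back to the default window otherwise. This is only a sketch: build_params is a hypothetical helper, and the convention that a manual invocation passes startDate, endDate and tags at the top level of the event is my assumption, not something the code above does.

```python
import datetime
from datetime import timezone


def build_params(event, now=None, tags=("neo4j",)):
    """Build the message payload, preferring values supplied in the event.

    A scheduled invocation carries none of these keys, so it gets the
    default window of yesterday to today (UTC).
    """
    event = event or {}
    now = now or datetime.datetime.now(timezone.utc)
    default_start = (now - datetime.timedelta(days=1)).strftime("%Y-%m-%d")
    default_end = now.strftime("%Y-%m-%d")
    return {
        "startDate": event.get("startDate", default_start),
        "endDate": event.get("endDate", default_end),
        "tags": list(event.get("tags", tags)),
    }


if __name__ == "__main__":
    # Manual invocation with an explicit range
    print(build_params({"startDate": "2017-09-01", "endDate": "2017-09-05"}))
    # Scheduled invocation: no date keys, so the defaults apply
    print(build_params({}))
```

With something like this in place, the producer could call build_params(event) instead of computing the dates inline, and a manual run could pass the range via the --data flag of serverless invoke.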
Trying it out
We can simulate a message being received locally by running the following command:
$ serverless invoke local \
--function message-consumer \
--data '{"Records":[{"Sns": {"Message":"{\"tags\": [\"neo4j\"], \"startDate\": \"2017-09-25\", \"endDate\": \"2017-09-29\" }"}}]}'
start_date: 2017-09-25
end_date: 2017-09-29
tags: ['neo4j']
null
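The escaped JSON passed to --data above is easy to get wrong by hand, because the SNS message is itself a JSON string nested inside the event. A small script can generate it instead. This is a minimal sketch: sns_event is a hypothetical helper, and it only fills in the fields the consumer actually reads, whereas a real SNS event contains many more.

```python
import json


def sns_event(message):
    """Wrap a dict in the minimal SNS event shape that the consumer reads."""
    return {"Records": [{"Sns": {"Message": json.dumps(message)}}]}


if __name__ == "__main__":
    payload = {"tags": ["neo4j"], "startDate": "2017-09-25", "endDate": "2017-09-29"}
    # Print JSON that can be pasted into the --data argument
    print(json.dumps(sns_event(payload)))
```

Letting json.dumps do the escaping twice, once for the message and once for the whole event, avoids the manual backslashes.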
That seems to work. What happens if we invoke the message producer on AWS?
$ serverless invoke --function message-producer
null
Did the consumer receive the message?
$ serverless logs --function message-consumer
START RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f Version: $LATEST
start_date: 2017-09-29
end_date: 2017-09-30
tags: ['neo4j']
END RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f
REPORT RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f Duration: 0.46 ms Billed Duration: 100 ms Memory Size: 1024 MB Max Memory Used: 32 MB
Looks like it did! We can also invoke the consumer directly on AWS:
$ serverless invoke \
--function message-consumer \
--data '{"Records":[{"Sns": {"Message":"{\"tags\": [\"neo4j\"], \"startDate\": \"2017-09-25\", \"endDate\": \"2017-09-26\" }"}}]}'
null
Now if we look at the consumer's logs, we'll see both messages:
$ serverless logs --function message-consumer
START RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f Version: $LATEST
start_date: 2017-09-29
end_date: 2017-09-30
tags: ['neo4j']
END RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f
REPORT RequestId: 0ef5be87-a5b1-11e7-a905-f1387e68c65f Duration: 0.46 ms Billed Duration: 100 ms Memory Size: 1024 MB Max Memory Used: 32 MB
START RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed Version: $LATEST
start_date: 2017-09-25
end_date: 2017-09-26
tags: ['neo4j']
END RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed
REPORT RequestId: 4cb42bc9-a5b1-11e7-affb-99fa6b4dc3ed Duration: 16.46 ms Billed Duration: 100 ms Memory Size: 1024 MB Max Memory Used: 32 MB
Success!