Going Serverless: My Linear Regression API on AWS Lambda

About a year ago, I had a fun idea for a small side project. I wanted to build an API to do a simple linear regression on two series of data. If you don’t know what a linear regression is, it’s basically a way of fitting a straight line that describes how one variable moves with another. It’s not super important to understanding this post, but if you want to learn more, you can get an idea of what it is here.

It was a nice way to refresh some of my econometrics knowledge from college. Since I also wanted to improve my Python skills, I chose to do it in Python. Specifically, I used Flask to turn my regression code into an API. After getting Flask set up and my regression code hooked up to it, I wrote a few quick tests and slapped together a simple frontend with some HTML, CSS, and JavaScript (vanilla for most of it, D3 for some charting).

When I was done and it came time to deploy my code, my first instinct was to reach for Heroku, because that was how I had deployed previous projects, and that’s what I did at first. However, at the time I was in the middle of a job search, and I was noticing a lot of job listings asking for some AWS experience, so I decided to dip my toe in by deploying my regression project there. Googling “deploying a flask app to AWS” brought up the official docs on deploying a Flask app to Elastic Beanstalk.

Having never worked with AWS before, it was tough getting everything set up, but I was able to push through and get my code up and running. I could go to my messy AWS URL and see the project I had built. That was a very exciting day. However, at that point my project was pretty much done, and I put it in the back of my mind.

Fast forward to this August. I’ve got a new job that wants me to get AWS certified. Awesome! I take some lessons and pass my Certified Cloud Practitioner exam. Even more awesome! While I was studying up for it, I found myself thinking back to what I had done to deploy my project. As (relatively) straightforward as deploying to Elastic Beanstalk was, it may have been overkill. I didn’t need a lot of the features that it gives me. It may have been more straightforward to deploy it directly to EC2. Then I learned about Lambda.

AWS Lambda Logo
Ooh. Ah! Lambda!

Lambda is AWS’ main serverless offering. It abstracts away provisioning and running servers to the point that you can pretty much upload your code and have it run when you need it to. That’s where I put the code that actually runs the regression. That sits behind a service called API Gateway, which lets you (among other things) run your Lambda functions in response to HTTP calls. In my case, my regression function runs in response to a POST call with two arrays of numbers in the body.
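To make that request shape concrete, here is a minimal sketch of how a handler might pull the two series out of the incoming event. It assumes API Gateway's Lambda proxy integration, where the POST body arrives as a JSON string under the event's body key. The field names ind and dep and the helper read_series are made up for illustration and are not taken from the original project.

```python
import json


def read_series(event):
    """Extract the two number series from an API Gateway proxy event.

    Assumes the POST body is JSON like {"ind": [...], "dep": [...]}.
    The key names are hypothetical, chosen for this sketch only.
    """
    # The proxy integration delivers the body as a string (or None if empty).
    body = json.loads(event.get("body") or "{}")
    ind = [float(x) for x in body.get("ind", [])]
    dep = [float(x) for x in body.get("dep", [])]
    return ind, dep


# A trimmed-down event, shaped like what the proxy integration passes in.
sample_event = {
    "httpMethod": "POST",
    "body": json.dumps({"ind": [1, 2, 3], "dep": [2, 4, 6]}),
}

ind, dep = read_series(sample_event)
print(ind, dep)
```

Converting every value with float() up front means a stray string in the payload fails early, before any regression math runs.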

I was able to get this working by following along with the official AWS guide for building an API with Lambda. Getting the lambda function working was pretty easy: I just pasted my old regression code in and wrapped it in some of the Flask code to mediate between the HTTP call and the regression function. After tweaking some variables, I had a working AWS Lambda function.

The tough part came from hooking the Lambda function up to API gateway. Actually, hooking it up was pretty easy. I was able to follow the AWS docs to get my Lambda function to respond to a POST. The hard part was getting it to respond in a meaningful way.

AWS API Gateway Logo
The next step: API Gateway

You see, my old code had a couple of branches the regression code could follow, depending on what the request looked like. If the data was good, it created a Python dictionary with a statusCode key set to 200, and the whole thing got wrapped in json.dumps so that Flask could send the response back to the browser as a JSON string. AWS didn’t like that. It turns out API Gateway expects the whole response object (with keys statusCode, body, and headers) to just be a dictionary. The contents of the body key, however, do need to be a JSON string. So my code went from:

# ind and dep are the two variables from the HTTP request
# Old version: the whole response is serialized to one JSON string
return json.dumps({
    "statusCode": 200,
    "body": regression(ind, dep)
})

To this:

# New version: the response stays a dict; only the body is a JSON string
return {
    "statusCode": 200,
    "body": json.dumps(regression(ind, dep))
}
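For a fuller picture, here is a rough sketch of a complete handler with both a success branch and a bad-data branch. It is not the original project's code: the regression function is a plain least-squares stand-in, the ind and dep keys are hypothetical, and returning a 400 for bad input is an assumption about what the error branch might do. It also assumes the Lambda proxy integration.

```python
import json


def regression(ind, dep):
    """Stand-in least-squares fit (not the post's original regression code)."""
    n = len(ind)
    if n < 2 or n != len(dep):
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(ind) / n
    mean_y = sum(dep) / n
    sxx = sum((x - mean_x) ** 2 for x in ind)
    if sxx == 0:
        raise ValueError("independent series must not be constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ind, dep))
    slope = sxy / sxx
    return {"slope": slope, "intercept": mean_y - slope * mean_x}


def respond(status_code, payload):
    """The outer object stays a dict; only the body is serialized to JSON."""
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    try:
        data = json.loads(event.get("body") or "{}")
        result = regression(data["ind"], data["dep"])
    except (KeyError, TypeError, ValueError) as exc:
        # Bad-data branch: missing keys, malformed JSON, or unusable series.
        return respond(400, {"error": str(exc)})
    return respond(200, result)
```

Routing every branch through one respond helper keeps the "dict outside, JSON string inside" rule in a single place, so an error path cannot drift back to the old shape.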

So, that got my tests working for the API, but I still had to ensure I could get that response from my own website, not just from AWS’ test service. As a first step, I took my testing to Postman, my API development tool of choice. Using my API Gateway URL and some sample data from my original Python unit tests, I sent a request to my API URL, and got back… a 403 with the body:

{"message": "Missing Authentication Token"}

I went to Google. I verified again that I had set this API up as a public API: I had. Then I saw a suggestion that the “Missing Authentication Token” error shows up when one tries to send the wrong HTTP method. I checked, and in fact, I was trying to do a GET on my API that only accepts POSTs. Why the error for that is "Missing Authentication Token" and not something more descriptive is anyone’s guess. Still, changing that to a POST got me:

A Postman screenshot with the error text "message": "Missing Authentication Token"
Did you guess the same error?

That’s right, I was still getting the same error while sending a POST. I think at that point I had some discourteous words for my computer and took a break for a little bit. When I came back to my project, I combed through the resources I had been consulting before — AWS Docs, StackOverflow questions, Medium articles — and couldn’t figure out what I was doing wrong. I had done every step that they suggested. I’d dotted all my i’s and crossed all my t’s, but still it wasn’t working.

Or had I? While looking at a StackOverflow question for the dozenth time, I finally saw something in the URL. You see, API Gateway gives you an “Invoke URL” to access your API. It should look something like: https://abunchofrandomletters.execute-api.us-west-1.amazonaws.com/environment. I had thought that was where I had to go to access my API. However, the person asking the question on SO was sending their request to https://abunchofrandomletters.execute-api.us-west-1.amazonaws.com/environment/resource. See that /resource on the end there? That’s the name of the API Gateway resource that they set up initially. I had one of those too, called /regress, so I thought I’d try tacking that on to the end of my URL.
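As a sketch of the fix, the snippet below builds the full endpoint from the Invoke URL plus the resource name and prepares a POST against it. The host is the placeholder from the example URL above, the ind and dep payload keys are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

# Placeholder values copied from the example URL in the text.
INVOKE_URL = "https://abunchofrandomletters.execute-api.us-west-1.amazonaws.com/environment"
RESOURCE = "regress"


def build_request(invoke_url, resource, payload):
    """Build (but do not send) a POST to <invoke URL>/<resource>."""
    url = invoke_url.rstrip("/") + "/" + resource.lstrip("/")
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_request(INVOKE_URL, RESOURCE, {"ind": [1, 2, 3], "dep": [2, 4, 6]})
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

As far as I can tell, API Gateway answers with "Missing Authentication Token" whenever the path and method pair does not match anything that is deployed, which would explain why both the GET and the POST to the bare Invoke URL produced the same 403.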

A successful Postman call
HA! It works!

That was it. Nowhere in the tutorial I was following was that mentioned, but in retrospect, it makes sense that I would have had to do that. With that working, I could move on to finally integrating my API with my frontend.

Since I wanted to access the frontend through my personal website, the first step was taking the HTML, CSS, and JavaScript I had been serving with my Flask server and putting it on GitHub Pages. That was easy: just an mv command, a git commit, and a couple of button presses in GitHub. The next step was somewhat harder: getting my frontend to talk to my API. I started by finding the function that was sending off the uploaded CSV, and swapping out my old Flask route for my new API Gateway URL.

I loaded some data in, sent off my request, and it failed. Not even a nice 400 or 500 error. Just that nasty (failed) message in my Network tab. Since I couldn’t see just what was going on, it was time to use another one of the services I’d been learning about: CloudWatch.

AWS CloudWatch Logo
I think this logo is the metrics you see. Or maybe the cloud you’re watching?

CloudWatch is one of AWS’ monitoring services. It can be used to monitor performance, and even send alerts if there’s something wrong with your applications. For my purposes though, I was interested in the logging functionality. I turned on logging, and was able to track down the problem. It was CORS. Now, CORS has derailed a personal project of mine more than once. However, in those cases, I didn’t usually control both the browser and the server (or rather, the lack of a server). A quick Access-Control-Allow-Origin header in my Lambda function, and I had a working product. I could go to my site, upload a CSV, and get a linear regression.
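Here is a small sketch of what adding that header could look like, assuming the function returns a proxy-style response dict. The helper name with_cors and the origin https://example.github.io are placeholders for illustration, not values from the original project.

```python
import json

# Hypothetical origin; replace with the site that hosts the frontend.
ALLOWED_ORIGIN = "https://example.github.io"


def with_cors(response, origin=ALLOWED_ORIGIN):
    """Return a copy of a Lambda response with the CORS header added."""
    headers = dict(response.get("headers") or {})
    headers["Access-Control-Allow-Origin"] = origin
    # Build a new dict so the caller's response object is left untouched.
    return {**response, "headers": headers}


response = with_cors({"statusCode": 200, "body": json.dumps({"slope": 2.0})})
print(response["headers"])
```

One caveat worth checking in your own setup: if the browser sends a preflight OPTIONS request first (for example because the POST uses a JSON content type), the API Gateway resource also has to answer that OPTIONS call with matching CORS headers. The post does not say whether that was needed here.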

A screenshot of my regression app
The final product

Go ahead, check it out!

Translated from: https://medium.com/swlh/going-serverless-my-linear-regression-api-on-aws-lambda-51ab84403755
