Going full reactive with Spring Webflux and the new CosmosDB API v3

cunxiedian8614

Going full reactive?

Following this blog post on Spring Boot and MongoDB, I decided to port that application in order to be "fully reactive":

Migrating from the standard Spring Web stack to Spring Webflux, which uses Project Reactor in order to have a reactive API.
Migrating from the CosmosDB with MongoDB API to the newest CosmosDB SDK, using the SQL API. If you want more information on CosmosDB, here is the documentation.

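The idea of this post is that we should be fully reactive, from the database up to the web layer, so we can study the required APIs and understand the performance and scalability of this architecture.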

Giving a spin to the new Azure CosmosDB SDK

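Please note that this uses the latest Azure CosmosDB SDK, which is not finished yet, so this post is also a worldwide preview of that SDK, so that we can test it and discuss it. I (Julien Dubois) am in direct contact with the SDK team at Microsoft, so if you have any comments or questions, feel free to post them on this article or to contact me directly!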

This new SDK is available on https://github.com/Azure/azure-cosmosdb-java/tree/v3.
At the time of this writing, the documentation and sample applications are not ready yet.
If you are using the previous SDK, please note that the Maven artifactId has changed, and that this SDK is now available at com.microsoft.azure:azure-cosmos.

As this is some new code, you can expect a few issues. For example I found this one while doing this blog post, and I proposed a fix here. But as you'll see along this blog post, this new API is much better than the previous one, and behaves really well.

Doing a CRUD with the new CosmosDB SDK

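The really great news about the new CosmosDB SDK is that it uses Project Reactor, like Spring Webflux, instead of the aging RxJava v1 API used by the previous version. This means it returns Mono and Flux objects, and those are exactly what Spring Webflux likes, so integrating both is going to be very smooth and easy.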

The whole demo project is available at https://github.com/jdubois/spring-webflux-cosmosdb-sql/, but let's focus on the repository layer as this is where all the CosmosDB SDK magic lives.

CosmosDB configuration and connection

We have created a specific Spring Boot configuration properties class in order to hold our configuration. This is used in our repository layer, which looks like this:

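// Use the "direct mode" connection policy instead of the default "gateway" one.
ConnectionPolicy connectionPolicy = new ConnectionPolicy();
connectionPolicy.connectionMode(ConnectionMode.DIRECT);

// accountHost and accountKey are read from the configuration properties class.
client = CosmosClient.builder()
    .endpoint(accountHost)
    .key(accountKey)
    .connectionPolicy(connectionPolicy)
    .build();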

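The important part is that it uses the "direct mode" connection policy instead of the default "gateway" policy: in our tests this clearly made a big difference, as our reactive code was probably too efficient for the gateway and was flooding it very quickly. As a result, we had many connection errors with the gateway, and they disappeared when we switched to direct mode: if you can use it, it is strongly recommended in this kind of "reactive" scenario.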

In the init() method (available here) we also did 2 blocking calls to create the database and its container. There are a couple of interesting tweaks here:

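Our container is configured without an indexing policy, as we used indexingPolicy.automatic(false);. By default, CosmosDB indexes all fields of all stored objects, which has quite a big cost during insertion. We don't need this in our tests, but we also think it is too aggressive and should be tuned for each specific use-case.
The container is created with a default RU/s of 400, using database.createContainerIfNotExists(containerSettings, 400). Be careful with this setting, as it can quickly cost a lot of money if it is set too high. Oddly, the default is 1000 when using the MongoDB API and 400 when using the SQL API, but in any case this is a very important setting, so it is better to fix it than to rely on a default value.
When doing new CosmosContainerProperties(CONTAINER_NAME, "/id");, we used the id as the partition key. That's why we use container.getItem(id, id): the first parameter is the id and the second is the partition key, which happens to also be the id. This works well, and in our demo project the id should indeed be used to partition everything, so this makes sense from a business point of view.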

Creating, finding and deleting an item

For simple operations, when we have the id (and partition key) of an item, we can directly use a simple API, for example for creating:

public Mono<Project> save(Project project) {
    project.setId(UUID.randomUUID().toString());
    return container.createItem(project)
        .map(i -> {
            Project savedProject = new Project();
            savedProject.setId(i.item().id());
            savedProject.setName(i.properties().getString("name"));
            return savedProject;
        });
}

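As no ORM is provided, we need to manually map the returned result to our domain object. For bigger objects this is a lot of boilerplate code, but that is quite usual with this kind of technology. Of course, the good news is that we can easily return a Mono<Project> here, which is exactly what Spring Webflux wants.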
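The mapping boilerplate can be factored into a single function and reused by every repository method. The sketch below is an assumption for illustration, not code from the demo project: it uses a plain Map<String, Object> as a stand-in for the SDK's item properties so that it runs with only the JDK, and the ProjectMapper helper is hypothetical.

```java
import java.util.Map;
import java.util.function.Function;

// Illustrative domain object, mirroring the Project used in this post.
class Project {
    private String id;
    private String name;

    String getId() { return id; }
    void setId(String id) { this.id = id; }
    String getName() { return name; }
    void setName(String name) { this.name = name; }
}

// Hypothetical helper: not part of the CosmosDB SDK or of the demo project.
class ProjectMapper {
    // A Map stands in for the SDK's item properties (assumption, to stay JDK-only).
    static final Function<Map<String, Object>, Project> FROM_PROPERTIES = props -> {
        Project project = new Project();
        project.setId((String) props.get("id"));
        project.setName((String) props.get("name"));
        return project;
    };

    public static void main(String[] args) {
        Project p = FROM_PROPERTIES.apply(Map.of("id", "42", "name", "demo"));
        System.out.println(p.getId() + " " + p.getName());
    }
}
```

Since Reactor's map operator accepts a Function, a mapper written this way could be passed to it directly, which keeps the mapping code in one place.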

Querying

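Doing a SQL query is a bit more complex, and there are two gotchas here:

As our id is also our partition key, we must allow cross-partition queries in order to fetch all the data, using options.enableCrossPartitionQuery(true);. Of course, this has a performance cost.
As we want paginated data, we used TOP 20 in our SQL query in order to get only 20 items and not flood the system.

Here is the resulting code: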

FeedOptions options = new FeedOptions();
options.enableCrossPartitionQuery(true);

return container.queryItems("SELECT TOP 20 * FROM Project p", options)
    .map(i -> {
        List<Project> results = new ArrayList<>();
        i.results().forEach(props -> {
            Project project = new Project();
            project.setId(props.id());
            project.setName(props.getString("name"));
            results.add(project);
        });
        return results;
    });

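Be careful: when trying to limit the number of returned values, you might be tempted to configure the FeedOptions instance with options.maxItemCount(20). This will not work, and it is quite tricky:

The query returns paginated values, and maxItemCount is in fact the number of values in each page. This comes from the CosmosDB API (it is actually the name of the HTTP header used underneath when doing the query), so there is some logic behind the name, but it will definitely cause trouble because it is misleading. If you set it to 20, you will still get the whole list of items, only in small pages, and that will be very expensive.
Please note that the documentation does not say what the default value of maxItemCount is, but it is hard-coded to 100.
Because of this API, our query returns a Flux<List<Project>> instead of a Flux<Project>: we have a flux of pages, not just a flux of values.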
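The difference between a page size and a result limit can be illustrated without the SDK. The following JDK-only sketch is an assumption for illustration (the Paging class and its methods are hypothetical): it splits a result set into pages, the way maxItemCount does, and shows that the total number of items is unchanged, whereas a limit such as TOP 20 truncates the result. It also flattens the pages back into single values; with Reactor, Flux.flatMapIterable plays a similar role for a flux of pages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper used only to illustrate paging versus limiting.
class Paging {
    // Splits items into pages of at most pageSize elements: every item is still returned.
    static <T> List<List<T>> pages(List<T> items, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < items.size(); start += pageSize) {
            int end = Math.min(start + pageSize, items.size());
            result.add(new ArrayList<>(items.subList(start, end)));
        }
        return result;
    }

    // Keeps only the first n items, which is what "SELECT TOP n" asks the database to do.
    static <T> List<T> top(List<T> items, int n) {
        return new ArrayList<>(items.subList(0, Math.min(n, items.size())));
    }

    // Flattens a list of pages back into a single list of values.
    static <T> List<T> flatten(List<List<T>> pages) {
        return pages.stream().flatMap(List::stream).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> items = IntStream.range(0, 45).boxed().collect(Collectors.toList());
        List<List<Integer>> paged = pages(items, 20);
        System.out.println(paged.size());          // 3 pages: 20 + 20 + 5
        System.out.println(flatten(paged).size()); // 45: paging did not limit anything
        System.out.println(top(items, 20).size()); // 20: an actual limit
    }
}
```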

Performance testing

At last, we arrive at performance testing! We're going to do something similar to the blog post on Spring Boot and MongoDB so you can have a look at both results, but don't compare apples and oranges, as this other application was created using JHipster. JHipster does not (yet) fully support reactive programming, so the Spring Webflux application was coded manually, and is thus quite different:

The JHipster application had security, auditing and metrics: these all cost quite a lot of performance, but they are essential if you want to deploy real applications in production. Our Spring Webflux demo is much simpler.
Also, JHipster provides several performance tweaks that we don't have in the Spring Webflux demo, for example it uses Afterburner.

So while it is interesting to compare both applications, keep in mind that they are not exactly 1-to-1 equivalents.

Going to production

As we did in the Spring Boot/MongoDB blog post, we deployed the application on Azure Web Apps using the provided Maven plugin (see it here in our pom.xml).

Test scenario

Our test scenario is made with Gatling, and is available at https://github.com/jdubois/spring-webflux-cosmosdb-sql/blob/master/src/test/gatling/user-files/simulations/ProjectGatlingTest.scala. It's a simple script that simulates users going through the API: creating, querying and deleting items.

Running with 100 users

Our first test was run with 100 users and, as expected, everything went fine, as there are not many concurrent requests.

Nothing to see here, let's move on!

Going to 500 users

Things get interesting with 500 users: the application still runs quite well, but we had 3 errors.

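This is because deleting an item is a costly operation on CosmosDB (it uses more than 5 RU), so doing it while the application is under full load means that we hit the API limit. It happens because the application is more performant and stable than with the classical Spring Web framework: we are hitting the back-end harder, and we need to take this into account.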
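The throttling can be estimated with simple arithmetic: the number of operations per second a container can sustain is at most its provisioned RU/s divided by the RU cost of one operation. The sketch below is a rough, illustrative estimate (the RuBudget class is hypothetical). It takes 5 RU as the cost of a delete, which the measurement above gives only as a lower bound, so the real ceiling is lower, and it ignores the other requests that share the same budget.

```java
// Hypothetical helper for a back-of-the-envelope RU budget estimate.
class RuBudget {
    // Upper bound on operations per second for a given provisioned throughput.
    static long maxOpsPerSecond(int provisionedRuPerSecond, double ruPerOperation) {
        if (ruPerOperation <= 0) {
            throw new IllegalArgumentException("ruPerOperation must be positive");
        }
        return (long) Math.floor(provisionedRuPerSecond / ruPerOperation);
    }

    public static void main(String[] args) {
        System.out.println(maxOpsPerSecond(400, 5.0));  // 80 deletes/s at the 400 RU/s of this container
        System.out.println(maxOpsPerSecond(1200, 5.0)); // 240 deletes/s at 1,200 RU/s
    }
}
```

With creations and queries competing for the same 400 RU/s, it is not surprising that a few deletes get rejected once enough users are active at the same time.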

Reaching 1,000 users

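To go over 500 users, we need to increase the CosmosDB RU/s, as we did with Spring Web. Here, 1,200 RU/s seemed to be enough but, to be honest, we pushed it to 5,0000 RU/s so that we didn't have to worry about it for the rest of the tests.

Once again, everything went fine, without any issue, so let's scale up!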

10,000 users

Going to 10,000 users had an interesting side effect: our Gatling tests started to fail on the client side. So we had to increase the limits on our load testing machine: this is quite usual, but it didn't happen with Spring Web, so here again we see that going fully reactive has an impact, as the application runs too fast for our load testing machine. We still had some client-side errors, as Gatling couldn't resolve the server's host name: that's why we couldn't reach 20,000 users...

We also started to have some server errors after reaching 5,000 users: those are basically the same as on the client side, with too many open files on the server. As we are using Azure Web Apps we couldn't modify anything on the server, but we could easily scale it out. From our tests, it seems that 2 or 3 servers would be enough, but we used 5 just to be sure. Please note that with Spring Web we used 20 servers: once again, the two tests are not 1-to-1 equivalents and should be refined, but it's pretty clear that we use fewer resources with the reactive stack.

Please also note that our 99th percentile performance is very good, and that we scaled very easily to 1,0000 requests/second in one minute, with a very clean graph.

Profiling

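As everything looked really great in our graphs, but our load testing tool prevented us from going further, we decided to do some profiling with YourKit, in order to be sure that nothing was blocking or holding us back.

When running with 5,000 users on a local machine, we could see that no thread was blocking.

Our CPU usage was also extremely low, with a stable number of threads, and memory usage that was both low and stable.

We also used YourKit to look for bottlenecks, locks, or objects using a lot of memory: we'll spare you the details, as we couldn't find anything!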

Conclusion and final thoughts

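By going "fully reactive", we gained quite a few advantages:

The application starts faster, and uses less CPU and memory.
It has a very stable throughput.
It scales easily.

However, not everything is perfect:

There is a lot more code, it is complex, and it requires a good technical background.
Everything needs to be non-blocking: it's awesome in this simple use-case, but in real life it's a bit more complex. For instance, I love to use Spring Cache: it's easy to use, and using a Memcached or Redis server is probably way cheaper than scaling Cosmos DB. But as this is a blocking operation, we can't use it here!
There is only real interest in going "fully reactive" when you have a large number of users. If you only have 500 requests/second, this is probably over-engineering.

We also had a first look at the new v3 version of the CosmosDB SDK: we showed that it works very well under high load, and we were lucky to have it work with the same reactive framework as Spring Webflux. There are surely still some bugs, as well as APIs that could be improved, for example:

It does not use setters/getters: for example, we use options.maxItemCount(20) to set the maximum item count, and options.maxItemCount() to get that count. Personally, I don't find this very easy to use.
I find it odd that to create an item you just need to use container.createItem(project), but if you need to read it, you receive a CosmosItemResponse and then need to create the POJO manually. I think we could have some automatic POJO mapper, like we have with MongoDB.
For queries, we could have a fluent query builder.

As there is still time to improve this API, please take some time to read the demo code and to provide feedback: add a comment on this post, send a Twitter message... Please don't hesitate, I will be happy to pass your feedback on to the Microsoft SDK team.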

from: https://dev.to//azure/going-full-reactive-with-spring-webflux-and-the-new-cosmosdb-api-v3-1n2a
