sql分页和ajax分页_sql分页你可能做错了

最新推荐文章于 2022-10-17 14:37:04 发布

杨_明

最新推荐文章于 2022-10-17 14:37:04 发布

阅读量145

点赞数

文章标签： sql ajax

原文链接：https://medium.com/swlh/sql-pagination-you-are-probably-doing-it-wrong-d0f2719cc166

版权

sql分页和ajax分页

Am I crazy assuming you are doing pagination wrong? I mean how hard is pagination? it’s so simple how can you do it wrong?

假设您的分页错误，我会发疯吗？我的意思是分页有多难？如此简单，您怎么能做错呢？

Well, you might not be doing it wrong, but you are probably using LIMIT with OFFSET to do pagination. That has performance limitations you might not be aware of and you probably can do better or you need to secure some of your URL's.

好吧，您可能没有做错错误，但是您可能正在使用LIMIT和OFFSET进行分页。这具有您可能不知道的性能限制，您可能会做得更好，或者需要保护某些URL。

You do not believe me? Well just check this graph down below.

你不相信我？好吧，请检查下面的图表。

This is done on a small dataset of a 100.000 lines with small 10 line pages. Using both Limit & Offset for the first data series and the alternative method which I have called “optimized” which in reality is called keyset pagination or seek pagination where only a Limit is used. This was done on mysql 5.7.

这是在一个100.000行的小型数据集和10行的小页面上完成的。在第一个数据序列中同时使用Limit＆Offset和我称为“最优化”的替代方法，实际上称为键集分页或仅使用Limit的寻求分页 。这是在mysql 5.7上完成的。

What you can see here is by using the offset, that the first pages are relatively fast to open. The first 1200 pages which allows us to load at the end12K lines of data have all taken less then 2ms each. So if you are displaying a page to a user it’s still “okay”. But if the users also opens the last page of the site, then performance will drop. And the more pages you have the slower the last page will be to load.

您在这里看到的是通过使用偏移量，可以相对快速地打开前几页。允许我们在末尾加载12K行数据的前1200页每页花费的时间少于2ms。因此，如果您要向用户显示页面，则仍然可以。但是，如果用户还打开网站的最后一页，则性能将下降。而且您拥有的页面越多，最后一页的加载速度就越慢。

So in this article we are going to clear out 3 questions:

因此，在本文中，我们将清除3个问题：

Do we need an order by when doing pagination? If so why
分页时需要命令吗？如果是这样，为什么
Why does using the offset slow downs the queries?
为什么使用偏移量会使查询变慢？
How can you get better performance across all pages without using the offset?
不使用偏移量如何在所有页面上获得更好的性能？

Databases allows us to query a data with little knowledge of how they work; but some very simple queries can create performance issues. It’s important to understand the basics of how things work to be able to write better code. I do a lot of simplifications in this article to make it clear. I try and point out those simplifications as we go.

数据库使我们可以在不了解数据如何工作的情况下查询数据。但是一些非常简单的查询可能会导致性能问题。重要的是要了解事物工作原理的基础，以便能够编写更好的代码。为了清楚起见，我在本文中做了很多简化。我尝试指出我们所做的简化。

Before getting into the subject if you wish to see the difference in performance of both methods for yourself you can check this gist with the code I used. I didn’t post the code here as I don’t think it’s relevant (and it’s ugly).

在进入本主题之前，如果您希望自己了解两种方法在性能上的差异，则可以使用我使用的代码检查要点。我没有在此处发布代码，因为我认为它无关紧要(而且很丑陋)。

不使用带限制和抵消的订单 (Not using an order by with limit & offset)

You probably have seen, or even written queries such as

您可能已经看过，甚至写过查询，例如

SELECT * FROM test LIMIT 100, 10;

At first glance this query doesn’t seems to be wrong; but it is.

乍一看，这个查询似乎没有错；但它是。

First of all let’s see what might go wrong, then why it can go wrong.

首先，让我们看一下可能出问题的地方，然后看看为什么会出问题。

Let’s use an example dataset with 10 rows, which contains only an ID.

让我们使用一个包含10行的示例数据集，其中仅包含一个ID。

╔════╦════╦════╦════╦════╦════╦════╦════╦════╦════╗
║ 10 ║ 13 ║ 15 ║ 17 ║ 21 ║ 22 ║ 24 ║ 25 ║ 27 ║ 28 ║
╚════╩════╩════╩════╩════╩════╩════╩════╩════╩════╝

Our first query SELECT * FROM test LIMIT 0, 5; should return the following data:

我们的第一个查询SELECT * FROM test LIMIT 0，5; 应该返回以下数据：

╔════╦════╦════╦════╦════╗
║ 10 ║ 13 ║ 15 ║ 17 ║ 21 ║
╚════╩════╩════╩════╩════╝

Our second data the following

我们的第二个数据如下

╔════╦════╦════╦════╦════╗
║ 22 ║ 24 ║ 25 ║ 27 ║ 28 ║
╚════╩════╩════╩════╩════╝

The order here is not really important, what we really wish is to get all 10 elements. In 2 different pages so that are user can see all the numbers. And we think they can. Or can they?

这里的顺序并不是很重要，我们真正希望的是获得所有10个元素。在2个不同的页面中，用户可以看到所有数字。而且我们认为他们可以。还是可以？

In reality nothing prevents the second page to contain a row that was already loaded on the first. So we can have the following 2 pages:

实际上，没有什么可以阻止第二页包含已经在第一页上加载的行。因此，我们可以拥有以下两个页面：

╔════╦════╦════╦════╦════╗
║ 10 ║ 13 ║ 15 ║ 17 ║ 21 ║
╚════╩════╩════╩════╩════╝
╔════╦════╦════╦════╦════╗
║ 22 ║ 24 ║ 10 ║ 27 ║ 28 ║
╚════╩════╩════╩════╩════╝

And we have “10” displayed in both the first page and the second page and “25” never displayed.

我们在第一页和第二页上都显示了“ 10”，而从未显示过“ 25”。

Why can this happen? Well we have not said to to the sql server to order the data. This means the execution plan of the query may differ between 2 queries in which case the way it loads the data changes and therefore the order of the data might change. Adding the Order By you insure that your data is reliable.

为什么会发生这种情况？好吧，我们还没有说要向sql server订购数据。这意味着查询的执行计划在两个查询之间可能有所不同，在这种情况下，其加载数据的方式会发生变化，因此数据的顺序可能会发生变化。添加订单您可以确保数据可靠。

This is a documented behavior and not a bug!

这是记录在案的行为，而不是错误！

Do you loose something by adding the order by and securing the behavior? Well no.

您是否通过添加订单并确保行为来放松某些东西？好吧

Pagination without and order by is not faster. Let’s see now why such queries are slow.

分页没有和排序不是更快。现在让我们看看为什么这样的查询很慢。

为什么使用偏移会减慢查询速度。 (Why does using offset slow down queries.)

To understand this we need first to understand how RDBMS will order the data. If our database schema is well done, we probably have an index on the field we are ordering by. We will assume in this article that our RDBMS uses B trees for those indexes. This is a simplification as the method used depends on the data type and even then they will rather use B tree variants. But for simplifications sake let’s assume that a simple B tree is used.

要了解这一点，我们首先需要了解RDBMS如何对数据进行排序。如果我们的数据库架构做得好，我们可能在排序依据的字段上有一个索引。在本文中，我们将假定RDBMS将B树用于这些索引。这是一种简化，因为所使用的方法取决于数据类型，即使那样，他们还是会使用B树形变体。但是为了简化起见，我们假设使用了一个简单的B树。

For our previous 10 rows the index would look as follows.

对于我们之前的10行，索引将如下所示。

So when we execute our query to get first 5 results, the tree allow the 5 elements to be found immediately. So the following query will go through the the tree as follows:

因此，当我们执行查询以获取前5个结果时，该树允许立即找到5个元素。因此，以下查询将遍历树，如下所示：

SELECT * FROM my_table ORDER BY id LIMIT 5

In this example just going through 5 leafs it was able to find the 5 smallest numbers. Let’s now load the second page and see what happens:

在此示例中，仅经过5个叶子，便能够找到5个最小的数字。现在让我们加载第二页，看看会发生什么：

SELECT * FROM my_table ORDER BY id LIMIT 5 5

The index didn’t help, we still had to go through all the leafs in order to find the last 5. This is because there is no way for the RDBMS to know where it needs to start. So it has to find the first 5(in red) elements in order to load the next 5 elements.

该索引没有帮助，我们仍然必须遍历所有叶子才能找到最后5个。这是因为RDBMS无法知道它从何处开始。因此，它必须找到前5个(红色)元素才能加载后5个元素。

优化分页 (Optimizing the pagination)

How can we make our pagination work and fast in that case?

在这种情况下，如何使分页工作快速进行？

Simple, by profiting from the indexes.

简单，通过从索引中获利。

The hypothesis we need to make here is that you are never going to open page 2 without opening page 1 before.

我们在这里需要做的假设是，您永远不会在没有打开第1页之前就打开第2页。

With this hypothesis we know that the last Id on page 1 was 21 we can therefore ask the 5 first elements whose id is superior to 21.

有了这个假设，我们知道第1页上的最后一个ID是21 ，因此我们可以询问ID大于21的前5个元素。

SELECT * FROM my_table WHERE id > 21 ORDER BY id LIMIT 5

This way the index is utilized and we are checking very few unnecessary rows. This is called seek method or keyset pagination.

通过这种方式利用索引，我们正在检查很少的不必要的行。这称为搜索方法或键集分页。

We can also see this by using the EXPLAIN command on both queries.

我们还可以通过在两个查询上使用EXPLAIN命令来看到这一点。

EXPLAIN EXTENDED SELECT * FROM test LIMIT 5, 5\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: test
 partitions: NULL
 type: ALL
possible_keys: NULL
 key: NULL
 key_len: NULL
 ref: NULL
 rows: 100104
 filtered: 100.00
 Extra: NULL

In this example there are no keys being used. While with the keyset pagination the primary key will be used.

在此示例中，没有使用任何键。在使用键集分页时，将使用主键。

EXPLAIN EXTENDED SELECT * FROM test WHERE col1 > 21 LIMIT 5\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: test
 partitions: NULL
 type: range
possible_keys: PRIMARY
 key: PRIMARY
 key_len: 4
 ref: NULL
 rows: 50052
 filtered: 100.00
 Extra: Using where
1 row in set, 2 warnings (0.00 sec)

Lets also note that some storage solutions such as Elasticsearch will even prevent you from querying a important offset for this same reason.

还需要注意的是，出于同样的原因，某些存储解决方案(例如Elasticsearch)甚至会阻止您查询重要的偏移量。

其他一些例子 (A few additional examples)

按日期排序 (Sorting per date)

If you are sorting per date there always is a risk that multiple rows have the same date. If those rows are just between 2 pages you are going to have an issue not loading some of those rows on neither page. To fix this you need to order on a second primary field.

如果按日期排序，则始终存在多个行具有相同日期的风险。如果这些行仅在2页之间，那么您将有一个问题，无法在这两个页面上都加载其中的某些行。要解决此问题，您需要在第二个主要字段上订购。

SELECT *
FROM my_table
WHERE (update_date = '2017-12-21' AND id > 21) 
    OR update_date > '2017-12-21'
ORDER BY update_date,id LIMIT 5

Basically we are fetching all the rows that has the same update date as the last row on the previous page, but whose id is different. And also rows that were updated later. This of course make the query more complicated.

基本上，我们将获取所有更新日期与上一页的最后一行相同的行，但其ID不同。以及稍后更新的行。这当然会使查询更加复杂。

结论 (Conclusion)

You need to remember that; you must use an order by when doing pagination.

您需要记住这一点； 分页时必须使用命令。

There is an alternative to pagination, but it’s has limitations

分页有替代方法，但有局限性

It won’t allow you to open a page without opening the previous page.
它不允许您在不打开上一页的情况下打开页面。
If you have multiple order by’s the query will become very complicated.
如果您有多个订单，查询将变得非常复杂。

Therefore if you are displaying a pager with page numbers to your users; you are probably using the correct method but must limit the max page number.

因此，如果您要向用户显示带有页码的寻呼机； 您可能使用了正确的方法，但必须限制最大页数。

Limiting the number of usable pages is important. If you have let’s say 100 Million entries well a few users trying to open the last page can create more load on your servers then hundreds of users. So you should limit the number of usable pages and force the user to use filters & additional sorting to find what he needs.

限制可用页面数很重要。如果您说有1亿条条目，那么尝试打开最后一页的一些用户可能会在您的服务器上产生更多的负载，然后是数百名用户。因此，您应该限制可用页面的数量，并强制用户使用过滤器和其他排序来查找他需要的内容。

Or just have a next & previous buttons using the the keyset pagination.

或仅使用键集分页有一个下一个和上一个按钮。

If you are using pagination for batch processes and you are using the offset you are using the wrong method.

如果对分批处理使用分页，并且使用偏移量，则使用的方法错误。

Using a seek method/keyset pagination to load pages is very efficient; it will make your processes faster for no cost if you do it that way from the beginning. It will also secure the development for the future. If it works today with 10.000 rows it will work in 10 years when you have 10 Million rows.

使用搜索方法/键集分页加载页面非常有效；如果您从一开始就这样做，它将使您的流程更快，更免费。它还将确保未来的发展。如果今天可以使用10.000行，那么当您有1000万行时，它将在10年后起作用。

If you have an infinite scroll or a simpler page with just a next & previous button you must also use the seek method/keyset pagination.

如果您具有无限滚动或仅带有下一个和上一个按钮的简单页面，则还必须使用seek方法/键集分页。

Keyset pagination’s only limitations is jumping on a specific page. But that’s is in my opinion rarely needed. With proper UX/Design it can be avoided.

键集分页的唯一限制是在特定页面上跳转。但这在我看来很少需要。使用正确的UX /设计可以避免这种情况。

Keyset pagination has no size limit, you can query in a 100 million row table and if you have the right indexes all the pages will be fast. So in my opinion there is no reason not to use it instead of what we are more used of doing.

键集分页没有大小限制，您可以在1亿行表中进行查询，如果索引正确，则所有页面都将很快。因此，我认为没有理由不使用它，而不是我们更习惯使用它。

Sadly we are used to the classic offset pagination, and most frameworks allows us to do such pagination's; and actually even make it easier for us to use them.

可悲的是，我们已经习惯了经典的胶印分页，并且大多数框架都允许我们进行这种分页。实际上甚至使我们更容易使用它们。

We should avoid using offset as much as possible and only use it when absolutely necessary and putting proper protection in place.

我们应避免使用偏移量，而应仅在绝对必要时使用偏移量，并提供适当的保护。

要阅读的其他内容 (Additional content to read)

This also points to a very interesting website which makes a similar point. I would recommend you reading it: https://use-the-index-luke.com/

这也指向了一个非常有趣的网站，它提出了类似的观点。我建议您阅读： https : //use-the-index-luke.com/

Thank you for reading.

感谢您的阅读。

翻译自: https://medium.com/swlh/sql-pagination-you-are-probably-doing-it-wrong-d0f2719cc166

sql分页和ajax分页

杨_明

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
sql分页和ajax分页_sql分页你可能做错了

sql分页和ajax分页Am I crazy assuming you are doing pagination wrong? I mean how hard is pagination? it’s so simple how can you do it wrong? 假设您的分页错误，我会发疯吗？我的意思是分页有多难？如此简单，您怎么能做错呢？ Well, you might not be...
复制链接

扫一扫