Dealing with the n+1 Problem: Optimising Django, Part 4

Latest recommended article published on 2023-10-28 08:46:27

weixin_26755331

Original article: https://levelup.gitconnected.com/dealing-with-the-n-1-problem-optimising-django-part-4-f02010c7931d

The n+1 problem is notorious among Django REST Framework (DRF) users, so it’s always worth a refresher. If you use DRF or intend to, this article explains how the n+1 problem can rear its ugly head in even the most innocuous of use cases, and how you can deal with it easily.

If you haven’t tried DRF, I wholeheartedly recommend you give it a try. It cuts out a lot of the development work needed to get a REST API up and running, while still being flexible enough to fulfil almost any programming need you have.

If you have worked with DRF, you are no doubt familiar with its Serializer. It validates incoming data and sometimes shapes it into the form your backend needs. It also translates your model data into whatever format your REST endpoints have promised their clients.

There are many other things you can do with the Serializer, which I won’t cover here. What I will cover, however, is a relatively common use case which can cause those SQL queries to explode again: The n+1 problem.

So what is the n+1 problem?

The n+1 problem goes something like this. Take the example we used in Part 3:

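import uuid
from django.contrib.auth.models import AbstractUser
from django.db import models

class User(AbstractUser):
    def __str__(self):
        return ' '.join([self.first_name, self.last_name])

class Event(models.Model):
    # Pass the callable, not uuid.uuid4(), so every row gets its own UUID.
    id = models.UUIDField(default=uuid.uuid4, primary_key=True)
    event_name = models.TextField(default='')

class UserTicket(models.Model):  # name matches `userticket_set` used below
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    event = models.ForeignKey(Event, on_delete=models.CASCADE, null=True)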

So we have Users, Events, and Tickets for users to go to events.

Now let’s say we want to create a view that, given a user_id, returns the username, the user’s full name, and a list of the events the user is attending. A view and serializer for that might look something like this:

from rest_framework import status
from rest_framework.response import Response
from rest_framework.serializers import ModelSerializer, SerializerMethodField
from rest_framework.views import APIView

from .models import User, UserTicket  # assumed module layout


class UserTicketSerializer(ModelSerializer):
    event_name = SerializerMethodField()

    class Meta:
        model = UserTicket
        fields = ('event_name',)

    def get_event_name(self, ticket):
        # Accessing ticket.event lazily loads the related Event.
        return 'Event {}'.format(ticket.event.event_name)


class UserSerializer(ModelSerializer):
    name = SerializerMethodField()
    events = UserTicketSerializer(many=True, source='userticket_set')

    class Meta:
        model = User
        fields = ('username', 'name', 'events')

    def get_name(self, user):
        return str(user)


class UserView(APIView):
    def get(self, request, user_id, *args, **kwargs):
        user_obj = User.objects.get(id=user_id)
        serializer = UserSerializer(user_obj)
        return Response(serializer.data, status=status.HTTP_200_OK)

And the result of a simple call looks like this:

GET /user/1/
{
    "username": "admin",
    "name": "Mark Ang",
    "events": [
        { "event_name": "Event Event 25" },
        { "event_name": "Event Event 27" },
        ... [6 more]
    ]
}

So what’s the problem? Get to it already!

Yeah yeah, we’re getting there.

[Image: We’re not done, this is only the first bit.]

The problem is in the SQL. The SQL log for this API call looks like this:

And as you can see, it is long.

Prohibitively long.

So where is the problem?

It’s simple, really. It happens because of the nested serializer: for every Ticket it serializes, it issues a separate SQL query to load that ticket’s related Event. For n tickets, that’s n queries, and with the single query for the User, that’s n+1 SQL queries.
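
The pattern can be reproduced outside Django. The following is a minimal sketch using Python’s built-in sqlite3 module, not DRF’s actual internals: the table layout, the `CountingDB` wrapper, and `event_names_lazy` are hypothetical names chosen for illustration. It fetches a user’s tickets with one query and then looks up each ticket’s event with its own query, which is roughly what lazily loading `ticket.event` inside the nested serializer amounts to.

```python
import sqlite3


class CountingDB:
    """Hypothetical wrapper that counts every query sent through it."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def make_db(n_tickets):
    """Create one user (id 1) holding n_tickets tickets, one event each."""
    db = CountingDB()
    db.conn.executescript(
        "CREATE TABLE event (id INTEGER PRIMARY KEY, event_name TEXT);"
        "CREATE TABLE ticket (id INTEGER PRIMARY KEY,"
        " user_id INTEGER, event_id INTEGER);"
    )
    for i in range(1, n_tickets + 1):
        db.conn.execute("INSERT INTO event VALUES (?, ?)", (i, f"Event {i}"))
        db.conn.execute(
            "INSERT INTO ticket (user_id, event_id) VALUES (1, ?)", (i,)
        )
    return db


def event_names_lazy(db, user_id):
    # One query for the list of tickets...
    tickets = db.query(
        "SELECT event_id FROM ticket WHERE user_id = ? ORDER BY id", (user_id,)
    )
    names = []
    for (event_id,) in tickets:
        # ...then one more query per ticket to load its event.
        rows = db.query(
            "SELECT event_name FROM event WHERE id = ?", (event_id,)
        )
        names.append(rows[0][0])
    return names


db = make_db(n_tickets=8)
names = event_names_lazy(db, user_id=1)
print(len(names), "events loaded with", db.count, "queries")
```

With 8 tickets the counter ends at 9: one query for the ticket list plus one per ticket. The count grows linearly with the number of tickets, which is the defining trait of the n+1 problem.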

The times you see here are small because they were collected on a setup where the DB and the server run on the same machine. If each query took even 10 ms to reach your DB, those 14 queries would add an extra 140 ms to every request. As your web application’s data grows, this can become quite an issue to deal with.
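
The arithmetic is simple enough to check by hand, but a small sketch makes it easy to plug in your own numbers. The function name and the 10 ms latency are illustrative assumptions; the query counts of 14 and 5 are the ones reported in this article for the view before and after optimisation.

```python
def round_trip_overhead_ms(query_count, latency_ms):
    """Total network overhead if every query pays one full round trip."""
    return query_count * latency_ms


LATENCY_MS = 10  # hypothetical per-query latency between server and DB

unoptimised = round_trip_overhead_ms(14, LATENCY_MS)
optimised = round_trip_overhead_ms(5, LATENCY_MS)
print(unoptimised, optimised, unoptimised - optimised)
```

This ignores query execution time entirely; it only models the cost of talking to a database that is not on the same machine.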

那么，我们如何处理呢？ (So, how do we deal with this?)

Finally we get to the answer: The Prefetch!

So what’s a prefetch? Well, think of it as caching stuff related to the items you’re looking for.

When you make a SELECT query (Django’s QuerySet does that for you so you don’t see it directly), you’re querying for specific data on a table. Well, the prefetch can help you collect the data for related objects from different tables in advance. In our case, we can prefetch Tickets related to the user, as well as the Events related to those tickets.

Why is this helpful? Well, the serializer tries to get the information it needs from the queryset. Since Django loads related objects lazily, the queryset doesn’t carry the related object data by default, and the serializer is forced to trigger a new query each time it touches a relation.

But once the related data has been prefetched, the serializer finds it already cached on the queryset and doesn’t have to do any of that!
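
To make the idea concrete, here is a minimal sketch of the prefetch strategy using Python’s built-in sqlite3 module. It is an illustration under assumptions, not Django’s implementation: the tables, the `query` helper, and `event_names_prefetched` are hypothetical. It mirrors what Django’s documentation describes for `prefetch_related`, namely one extra lookup per relationship with the joining done in Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE event (id INTEGER PRIMARY KEY, event_name TEXT);"
    "CREATE TABLE ticket (id INTEGER PRIMARY KEY,"
    " user_id INTEGER, event_id INTEGER);"
)
for i in range(1, 9):
    conn.execute("INSERT INTO event VALUES (?, ?)", (i, f"Event {i}"))
    conn.execute("INSERT INTO ticket (user_id, event_id) VALUES (1, ?)", (i,))

query_count = 0


def query(sql, params=()):
    """Run a query and count it (hypothetical helper)."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def event_names_prefetched(user_id):
    # Query 1: all of the user's tickets.
    tickets = query(
        "SELECT event_id FROM ticket WHERE user_id = ? ORDER BY id", (user_id,)
    )
    event_ids = sorted({event_id for (event_id,) in tickets})
    if not event_ids:
        return []
    # Query 2: every related event in a single IN (...) lookup.
    placeholders = ", ".join("?" * len(event_ids))
    rows = query(
        f"SELECT id, event_name FROM event WHERE id IN ({placeholders})",
        event_ids,
    )
    # Join in Python: map each ticket to its already-loaded event.
    names_by_id = dict(rows)
    return [names_by_id[event_id] for (event_id,) in tickets]


names = event_names_prefetched(1)
print(len(names), "events loaded with", query_count, "queries")
```

The query count stays at 2 whether the user holds 8 tickets or 800; only the size of the IN list changes.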

Let’s see how this is done!

Let’s modify our queryset to include a prefetch like this:

class UserView(APIView):
    def get(self, request, user_id, *args, **kwargs):
        # Load the user's tickets and each ticket's event up front.
        user_obj = (
            User.objects.filter(id=user_id)
            .prefetch_related('userticket_set__event')
            .first()
        )
        serializer = UserSerializer(user_obj)
        return Response(serializer.data, status=status.HTTP_200_OK)

And the resultant query log is much shorter:

The SQL is uglier, but who cares?! We just solved our n+1 problem! By prefetching the related Tickets and their Event data, we just cut down our query count from 14 to 5.

So what’s the moral of the story?

Do not underestimate the power of the prefetch! When dealing with multiple related objects in the same serializer (you’ll normally run into them when you nest a serializer in a serializer), prefetching can be your best friend!

Of course, the caveat is to prefetch only what you know you need. Don’t prefetch willy-nilly: prefetching makes your SQL a little more complex and more expensive for the database to execute. However, as you can see, the potential performance gain is nothing to be sneezed at. If in doubt, always measure and compare the results to make sure your optimisations are doing what you want them to do.
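
Measuring can be as simple as counting the statements a piece of code sends to the database. The sketch below does this with the `set_trace_callback` hook of Python’s built-in sqlite3 module; the `count_selects` helper and the two access functions are hypothetical names for illustration. In a Django project the analogous tools would be the test-case method `assertNumQueries` or `django.test.utils.CaptureQueriesContext`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    "CREATE TABLE event (id INTEGER PRIMARY KEY, event_name TEXT);"
    "INSERT INTO event VALUES (1, 'Event 1'), (2, 'Event 2'), (3, 'Event 3');"
)


def count_selects(conn, work):
    """Run work(conn); return its result and the SELECTs it issued."""
    statements = []

    def record(statement):
        if statement.lstrip().upper().startswith("SELECT"):
            statements.append(statement)

    conn.set_trace_callback(record)
    try:
        result = work(conn)
    finally:
        conn.set_trace_callback(None)  # stop tracing
    return result, statements


def one_by_one(conn):
    # One query per event, like the unoptimised serializer.
    sql = "SELECT event_name FROM event WHERE id = ?"
    return [conn.execute(sql, (i,)).fetchone()[0] for i in (1, 2, 3)]


def all_at_once(conn):
    # A single query for all events, like a prefetch.
    sql = "SELECT event_name FROM event WHERE id IN (1, 2, 3) ORDER BY id"
    return [name for (name,) in conn.execute(sql)]


slow_result, slow_log = count_selects(conn, one_by_one)
fast_result, fast_log = count_selects(conn, all_at_once)
print(len(slow_log), "queries vs", len(fast_log), "query")
```

Both functions return the same data, so comparing the two logs isolates the cost of the access pattern itself.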

Thank you for reading through this article! If you would like to know how to set up an SQL log in your Django app, check out Part One of this series, where I cover just that. Or check out my article explaining why remote work is taking over the world (okay, that was too clickbaity, but do check it out anyway!).
