Blog.objects.filter（）反查外键，django数据库models中的跨表查询，相当于sql的join

最新推荐文章于 2023-10-02 22:49:56 发布

梓沂

最新推荐文章于 2023-10-02 22:49:56 发布

阅读量7.7k

点赞数

分类专栏：研究过文章标签： django models 多表查询

研究过专栏收录该内容

41 篇文章 1 订阅

订阅专栏

https://docs.djangoproject.com/en/2.0/topics/db/queries/#lookups-that-span-relationships

跨表查询（join）不管再sql中还是django models中都需要编程者有很强的逻辑性，虽然django已经把语言的复杂度降到最低了，但是其中还是有许多弯弯绕，得研究清楚。

首先，在看下面的文字之前，得了解ManyToManyField和ForeignKey，另外下面这段文字中得例子得定义来自：

https://docs.djangoproject.com/en/1.11/topics/db/queries/

注：我们只关注原文中的例子即可。不是每句都翻译，我根据翻译对例子进行了解释，例子懂了文章的意思就懂了。

跨表查询

Lookups that span relationships¶

django提供直观而强大得方法处理跨表查询，能在后台自动为用户处理sql语言中得join子句。（如果你研究过sql的join，会发现它虽然无所不能，但是复杂难懂，基本上写完就忘了，再想看懂还得从头开始研究）。而django想跨表查询，就和在同一张表中查询一样，只需要将另一张表的field写在“__”双下划线后面即可。

Django offers a powerful and intuitive way to “follow” relationships inlookups, taking care of the SQL JOINs for you automatically, behind thescenes. To span a relationship, just use the field name of related fieldsacross models, separated by double underscores, until you get to the field youwant.

This example retrieves all Entry objects with a Blog whose nameis 'Beatles Blog':

 
  >>> Entry.objects.filter(blog__name='Beatles Blog')

看例子就能懂，blog是Entry的外键，name是Blog的field，这句的意思是查Entry对象中外键blog的name是'Beatles Blog'的Entry实例。

class Blog(models.Model):
    name = models.CharField(max_length=100)

class Entry(models.Model):
    blog = models.ForeignKey(Blog, on_delete=models.CASCADE)

This spanning can be as deep as you’d like.

It works backwards, too. To refer to a “reverse” relationship, just use thelowercase name of the model.

This example retrieves all Blog objects which have at least one Entrywhose headline contains 'Lennon':

 
  >>> Blog.objects.filter(entry__headline__contains='Lennon')

反向查询也行，语法类似，上面例子查Blog中关联的entry的headline包含'Lennon'的blog实例，注意"__"双下划线出现两次，如果关联的表更多，则需要更多的双下划线。（我没想到反向查询这么简单，但是确实很简单）

If you are filtering across multiple relationships and one of the intermediatemodels doesn’t have a value that meets the filter condition, Django will treatit as if there is an empty (all values are NULL), but valid, object there.All this means is that no error will be raised. For example, in this filter:

 
  Blog.objects.filter(entry__authors__name='Lennon')

这个例子演示了三表联合条件查询，Blog是起始的表，entry是与Blog相关的表，authors与Blog没有直接关系，但与entry是多对多关系，这句代码查询的是作者'Lennon'写文章所在的Blog。

(if there was a related Author model), if there was no authorassociated with an entry, it would be treated as if there was also no nameattached, rather than raising an error because of the missing author.Usually this is exactly what you want to have happen. The only case where itmight be confusing is if you are using isnull. Thus:

 
  Blog.objects.filter(entry__authors__name__isnull=True)

这个例子会出现歧义，本意是查作者名为空的文章所在的Blog，但查询结果也可能是没有作者外键的文章的Blog，因为按顺序查询，如果文章没有关联任何作者的外键，name作为作者的field，也不存在，即被视为空。

will return Blog objects that have an empty name on the author andalso those which have an empty author on the entry. If you don’t wantthose latter objects, you could write:

 
  Blog.objects.filter(entry__authors__isnull=False, entry__authors__name__isnull=True)

如果向避免上个例子中的歧义，需要加两个条件，1、文章的外键Authors不为空，2、Authors的name为空。

Spanning multi-valued relationships¶

When you are filtering an object based on aManyToManyField or a reverseForeignKey, there are two different sorts of filteryou may be interested in. Consider the Blog/Entry relationship(Blog to Entry is a one-to-many relation). We might be interested infinding blogs that have an entry which has both “Lennon” in the headline andwas published in 2008. Or we might want to find blogs that have an entry with“Lennon” in the headline as well as an entry that was publishedin 2008. Since there are multiple entries associated with a single Blog,both of these queries are possible and make sense in some situations.

The same type of situation arises with aManyToManyField. For example, if an Entry has aManyToManyField called tags, we might want tofind entries linked to tags called “music” and “bands” or we might want anentry that contains a tag with a name of “music” and a status of “public”.

To handle both of these situations, Django has a consistent way of processingfilter() calls. Everything inside asingle filter() call is appliedsimultaneously to filter out items matching all those requirements. Successivefilter() calls further restrict the setof objects, but for multi-valued relations, they apply to any object linked tothe primary model, not necessarily those objects that were selected by anearlier filter() call.

That may sound a bit confusing, so hopefully an example will clarify. Toselect all blogs that contain entries with both “Lennon” in the headlineand that were published in 2008 (the same entry satisfying both conditions),we would write:

 
   Blog.objects.filter(entry__headline__contains='Lennon', entry__pub_date__year=2008)

这个例子和下面的例子要联合看，这个例子的意思是找到文章（对象）的头条（field）是'Lennon'，同时文章的日期时2008年的Blog。

To select all blogs that contain an entry with “Lennon” in the headlineas well as an entry that was published in 2008, we would write:

 
   Blog.objects.filter(entry__headline__contains='Lennon').filter(entry__pub_date__year=2008)

entry__headline__contains='Lennon' or entry__pub_date__year=2008

and

entry__headline__contains='Lennon' 、entry__pub_date__year=2008属于同一个Blog

这个例子是，先找文章的头条是'Lennon'的Blog，再在前面的过滤条件查出的Blog结果中找文章的日期是2008年的Blog实例。

注意：第一个例子查询过程中headline__contains='Lennon'和pub_date__year=2008对应的是同一个entry对象，而第二个例子中的两个条件不是同时起作用的，有可能出现这样的结果：一个Blog中存在两篇文章，第一篇头条是'Lennon'但日期不是2008，第二篇日期是2008但头条不是'Lennon'，且不存在头条是'Lennon'同时日期是2008的文章。这样的情况下，第一个例子查询没有结果，第二个例子查询有结果。造成这种差异的原因是：Blog.objects.filter().filter()两个串联的filter不是同时起作用，而是第一次过滤出一个queryset结果集，再在这个结果集的基础上执行第二次过滤。这就是两者的区别，挺绕的。

Suppose there is only one blog that had both entries containing “Lennon” andentries from 2008, but that none of the entries from 2008 contained “Lennon”.The first query would not return any blogs, but the second query would returnthat one blog.

In the second example, the first filter restricts the queryset to all thoseblogs linked to entries with “Lennon” in the headline. The second filterrestricts the set of blogs further to those that are also linked to entriesthat were published in 2008. The entries selected by the second filter may ormay not be the same as the entries in the first filter. We are filtering theBlog items with each filter statement, not the Entry items.

Note

The behavior of filter() for queriesthat span multi-value relationships, as described above, is not implementedequivalently for exclude(). Instead,the conditions in a single exclude()call will not necessarily refer to the same item.

For example, the following query would exclude blogs that contain bothentries with “Lennon” in the headline and entries published in 2008:

 
    Blog.objects.exclude(
    entry__headline__contains='Lennon',
    entry__pub_date__year=2008,
)

exclude的例子比上面的filter更绕，exclude是把符合条件的结果排除在外的作用。这个例子是排除文章头条是'Lennon'的Blog,或排除文章日期是2008的Blog，但不能排除同一篇文章头条即是'Lennon'，日期又是2008的情况。

注：虽然原文写的是Both...and，但我觉得应该是“或”的关系，不是“和”的关系，而且不同于普通的或，为了表示清楚，需通过sql表达式说明：

!(entry__headline__contains='Lennon' or entry__pub_date__year=2008)

and

(entry__headline__contains='Lennon' and entry__pub_date__year=2008)

However, unlike the behavior when usingfilter(), this will not limit blogsbased on entries that satisfy both conditions. In order to do that, i.e.to select all blogs that do not contain entries published with “Lennon”that were published in 2008, you need to make two queries:

 
    Blog.objects.exclude(
    entry__in=Entry.objects.filter(
        headline__contains='Lennon',
        pub_date__year=2008,
    ),
)
 
   

这个例子才是把两个条件同时满足的文章的Blog，排除在结果集之外，它需要两次查询过程。这里与sql的join相比绕了个大弯

sql：

！(entry__headline__contains='Lennon' and entry__pub_date__year=2008)

len(LngLat.objects.filter(enborbs__gsm_lte__gsm_or_lte="LTE",enborbs__type__cell_type__contains="宏"))

1912

这个条件更严格，所以结果相对较少。

len(LngLat.objects.filter(enborbs__type__cell_type__contains="宏"))
3240
len(LngLat.objects.filter(enborbs__gsm_lte__gsm_or_lte="LTE"))

2552

以下两个条件不算太严格，逻辑上可能出现想查LTE宏站，但查到的经纬度点是LTE室分与GSM宏蜂窝重合的点。

len(LngLat.objects.filter(enborbs__type__cell_type__contains="宏").filter(enborbs__gsm_lte__gsm_or_lte="LTE"))
1999
len(LngLat.objects.filter(enborbs__gsm_lte__gsm_or_lte="LTE").filter(enborbs__type__cell_type__contains="宏"))
1999

以下是GSM宏蜂窝和LTE站重合的情况：

len(LngLat.objects.filter(enborbs__type__cell_type__contains="宏蜂窝").filter(enborbs__gsm_lte__gsm_or_lte="LTE"))
75
len(LngLat.objects.filter(enborbs__gsm_lte__gsm_or_lte="LTE").filter(enborbs__type__cell_type__contains="宏蜂窝"))

再来一个例子，说明一下外键的不确定性：

LngLat.objects.exclude(enborbs__type__cell_type__contains="宏").count()
包含“宏”的经纬度点的数量：846
LngLat.objects.all().count()
全部经纬度点的数量：3896
LngLat.objects.filter(enborbs__type__cell_type__contains="宏").count()

不包含“宏”字的经纬度点的数量：3240

为什么846+3240不是3896，原因是有部分经纬度点，既有宏站，又有室分，所以既满足条件1，又满足条件3。所以这两个条件查询结果的和比总数还多。