官网:https://docs.djangoproject.com/en/2.1/topics/db/aggregation/
摘简单的例子:
from django.db import models
class Author(models.Model):
name = models.CharField(max_length=100)
age = models.IntegerField()
class Publisher(models.Model):
name = models.CharField(max_length=300)
class Book(models.Model):
name = models.CharField(max_length=300)
pages = models.IntegerField()
price = models.DecimalField(max_digits=10, decimal_places=2)
rating = models.FloatField()
authors = models.ManyToManyField(Author)
publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE)
pubdate = models.DateField()
class Store(models.Model):
name = models.CharField(max_length=300)
books = models.ManyToManyField(Book)
# Total number of books.
>>> Book.objects.count()
2452
# Total number of books with publisher=BaloneyPress
>>> Book.objects.filter(publisher__name='BaloneyPress').count()
73
# Average price across all books.
>>> from django.db.models import Avg
>>> Book.objects.all().aggregate(Avg('price'))
{'price__avg': 34.35}
# Max price across all books.
>>> from django.db.models import Max
>>> Book.objects.all().aggregate(Max('price'))
{'price__max': Decimal('81.20')}
# Difference between the highest priced book and the average price of all books.
>>> from django.db.models import FloatField
>>> Book.objects.aggregate(
... price_diff=Max('price', output_field=FloatField()) - Avg('price'))
{'price_diff': 46.85}
# All the following queries involve traversing the Book<->Publisher
# foreign key relationship backwards.
# Each publisher, each with a count of books as a "num_books" attribute.
>>> from django.db.models import Count
>>> pubs = Publisher.objects.annotate(num_books=Count('book'))
>>> pubs
<QuerySet [<Publisher: BaloneyPress>, <Publisher: SalamiPress>, ...]>
>>> pubs[0].num_books
73
# Each publisher, with a separate count of books with a rating above and below 5
>>> from django.db.models import Q
>>> above_5 = Count('book', filter=Q(book__rating__gt=5))
>>> below_5 = Count('book', filter=Q(book__rating__lte=5))
>>> pubs = Publisher.objects.annotate(below_5=below_5).annotate(above_5=above_5)
>>> pubs[0].above_5
23
>>> pubs[0].below_5
12
# The top 5 publishers, in order by number of books.
>>> pubs = Publisher.objects.annotate(num_books=Count('book')).order_by('-num_books')[:5]
>>> pubs[0].num_books
1323
简单说明:
Django 提供两种产生聚合的方法,第一种方法是产生整个结果集的总体统计值。
Django provides two ways to generate aggregates. The first way is to generate summary values over an entire QuerySet
.
......
第二种方法,产生结果集中每个对象的独立的总体统计值。注意,结果集有多少对象就生成多少个统计值,这是aggregate()与
annotate()的主要区别。
The second way to generate summary values is to generate an independent summary for each object in a QuerySet
.
理解这两个概念,就清楚怎么使用aggregate()与
annotate()
这两个函数了。
以前没有了解这个概念,觉得这两个函数很像,分不清怎么用。
下面是实际使用的例子:
from django.db.models import Count
source_list.values('freq').distinct().annotate(num=Count('id'))
<QuerySet [{'freq': 'D1', 'num': 2}]>
source_list.values('freq','pci').distinct().annotate(num=Count('id'))
<QuerySet [{'freq': 'D1', 'pci': 331, 'num': 1}, {'freq': 'D1', 'pci': 332, 'num': 1}]>
source_list.values('freq').distinct()
<QuerySet [{'freq': 'D1'}]>
# 请注意,下面的查询语句中,filter(freq_exact='D1')中一个下划线写错了,这里强调一下需要用2个下划线
source_list.filter(freq_exact='D1').values('pci').distinct()
django.core.exceptions.FieldError: Cannot resolve keyword 'freq_exact' into field. Choices are: adj, azimuth, cellname, cgi, eci, enb, enb_id, freq, id, lnglat, lnglat_id, m3, pci, zjcell
freq_exact被当成连续的一个对象名,实际上应该是freq__exact,但两个下划线连接处没有间隙,看起来与一个下划线一样。
source_list.filter(freq__exact='D1')
<QuerySet [<ZJcell: Z736934嘉兴质检所D_129>, <ZJcell: Z736934嘉兴质检所D_131>]>
source_list.filter(freq__exact='D1').values('pci').distinct()
<QuerySet [{'pci': 331}, {'pci': 332}]>
source_list.filter(freq__exact='D1').values_list('pci').distinct()
<QuerySet [(331,), (332,)]>
# order_by()函数有下划线,下面的例子错写成orderby()
source_list.filter(freq__exact='D1').values_list('pci').orderby('pci').distinct()
Traceback (most recent call last):
File "<input>", line 1, in <module>
AttributeError: 'QuerySet' object has no attribute 'orderby'
orderby应为order_by
source_list.filter(freq__exact='D1').values_list('pci').order_by('pci').distinct()
<QuerySet [(331,), (332,)]>
# 注意,order_by('-pci')字段名pci前加负号“-”表示反序
source_list.filter(freq__exact='D1').values_list('pci').order_by('-pci').distinct()
<QuerySet [(332,), (331,)]>
source_list.filter(freq__exact='D1').values('pci').order_by('pci').distinct()
<QuerySet [{'pci': 331}, {'pci': 332}]>
source_list.filter(freq__exact='D1').values('pci').order_by('-pci').distinct()
<QuerySet [{'pci': 332}, {'pci': 331}]>
order_by('-pci')字段名pci前加负号“-”表示反序
在不明白聚合的语法之前,还犯过一个错误:
source_list.values('freq', Count('id')).distinct()
Traceback (most recent call last):
File "<input>", line 1, in <module>
File "C:\Users\Administrator\PycharmProjects\cellsmap\venv\lib\site-packages\django\db\models\query.py", line 742, in values
clone = self._values(*fields, **expressions)
File "C:\Users\Administrator\PycharmProjects\cellsmap\venv\lib\site-packages\django\db\models\query.py", line 737, in _values
clone.query.set_values(fields)
File "C:\Users\Administrator\PycharmProjects\cellsmap\venv\lib\site-packages\django\db\models\sql\query.py", line 1935, in set_values
self.add_fields(field_names, True)
File "C:\Users\Administrator\PycharmProjects\cellsmap\venv\lib\site-packages\django\db\models\sql\query.py", line 1683, in add_fields
join_info = self.setup_joins(name.split(LOOKUP_SEP), opts, alias, allow_many=allow_m2m)
AttributeError: 'Count' object has no attribute 'split'
后来明白Count只能用在聚合函数中,如aggregate()与
annotate()