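[Performance] Django Performance Optimization and Caching Strategies: 13 Techniques to Make Your Application Up to 10x Faster
Preface: Why Is Performance Key to a Django Application's Success?
In today's competitive web application market, performance is no longer optional; it is a core driver of user experience and business success. Widely cited industry studies report that each additional second of page load time lowers conversion rates by roughly 7%, and that 53% of mobile users abandon a page that takes longer than 3 seconds to load. Django is a "batteries-included" framework that covers everything from database access to template rendering, and every one of those layers has room for optimization. Yet many developers stop once the application "works" and never tap the framework's performance potential. This article walks through performance optimization across the whole Django stack, from database queries to cache deployment, using 13 core techniques to help you build responsive, scalable web applications that run several times faster.
1. Database Optimization: Improving Query Performance
Database operations are usually the main performance bottleneck in a Django application, so optimizing queries is the key to improving overall performance.
1.1 Efficient Index Design
Setting up indexes correctly on Django models can significantly improve query performance: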
# models.py
from django.db import models
class Article(models.Model):
    title = models.CharField(max_length=200)
    slug = models.SlugField(max_length=200, unique=True)
    author = models.ForeignKey('auth.User', on_delete=models.CASCADE)
    content = models.TextField()
    published_at = models.DateTimeField(auto_now_add=True)
    status = models.CharField(max_length=10, choices=[
        ('draft', 'Draft'),
        ('published', 'Published')
    ])
    category = models.ForeignKey('Category', on_delete=models.CASCADE)
    views_count = models.PositiveIntegerField(default=0)
    class Meta:
        # Composite indexes for filter combinations that are used often
        indexes = [
            models.Index(fields=['status', 'published_at']),
            models.Index(fields=['author', 'status']),
            models.Index(fields=['category', 'status', 'published_at']),
            # Descending index for ordering by publication date, newest first
            models.Index(fields=['-published_at']),
        ]
        # Default ordering (an ordering, not an index; the descending index above supports it)
        ordering = ['-published_at']
Indexing a low-cardinality field (such as status) on its own is usually not worth the cost, but when that field is regularly combined with other conditions, a composite index becomes very valuable.
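The effect of a composite index can be checked outside Django. The following is a minimal sketch using the standard-library sqlite3 module rather than the article's PostgreSQL or MySQL setup; the table, the index name idx_article_status_published, and the query_plan helper are assumptions made for illustration.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return SQLite's query-plan detail text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article ("
    "id INTEGER PRIMARY KEY, status TEXT, published_at TEXT, title TEXT)"
)
sql = "SELECT title FROM article WHERE status = ? AND published_at >= ?"
params = ("published", "2024-01-01")

# Without an index the planner has to scan the whole table.
plan_before = query_plan(conn, sql, params)

# Composite index mirroring models.Index(fields=['status', 'published_at']).
conn.execute(
    "CREATE INDEX idx_article_status_published "
    "ON article (status, published_at)"
)
plan_after = query_plan(conn, sql, params)

print(plan_before)
print(plan_after)
```

On a real project the same check is usually done with QuerySet.explain() or the database's own EXPLAIN output, since planners differ between engines.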
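1.2 Query Optimization Techniques
Fix these common inefficient query patterns:
from datetime import timedelta
from django.utils import timezone
last_week = timezone.now() - timedelta(days=7)
# Inefficient query - cannot use an index
articles = Article.objects.filter(
    content__contains='django'  # full-text search needs a dedicated solution
)
# More efficient query - uses the (status, published_at) index
articles = Article.objects.filter(
    status='published',
    published_at__gte=last_week
).select_related('author', 'category')
# Avoid querying inside a loop - the N+1 problem
# Bad
for article in Article.objects.all():
    print(article.author.username)  # one extra query per article
# Good
articles = Article.objects.select_related('author').all()
for article in articles:
    print(article.author.username)  # no extra queries
# Use bulk operations
# Bad
for title in new_titles:
    Article.objects.create(title=title, author=user)
# Good
Article.objects.bulk_create([
    Article(title=title, author=user) for title in new_titles
])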
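To make the N+1 cost concrete without a Django project, here is a small sketch that counts statements against an in-memory SQLite database. The CountingConnection wrapper and the two helper functions are hypothetical names introduced for this illustration; the JOIN version stands in for what select_related does.

```python
import sqlite3

class CountingConnection:
    """Wrap a sqlite3 connection and count every statement sent through it."""
    def __init__(self, conn):
        self.conn = conn
        self.query_count = 0

    def execute(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params)

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE author (id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, author_id INTEGER);
        INSERT INTO author VALUES (1, 'ann'), (2, 'bob');
        INSERT INTO article VALUES (1, 'First', 1), (2, 'Second', 2), (3, 'Third', 1);
    """)
    return CountingConnection(conn)

def usernames_n_plus_one(db):
    """One query for the articles, then one more per article for its author."""
    names = []
    rows = db.execute("SELECT id, author_id FROM article ORDER BY id").fetchall()
    for _article_id, author_id in rows:
        author = db.execute(
            "SELECT username FROM author WHERE id = ?", (author_id,)
        ).fetchone()
        names.append(author[0])
    return names

def usernames_joined(db):
    """A single JOIN, which is roughly what select_related('author') produces."""
    rows = db.execute(
        "SELECT author.username FROM article "
        "JOIN author ON author.id = article.author_id ORDER BY article.id"
    ).fetchall()
    return [row[0] for row in rows]

slow_db, fast_db = make_db(), make_db()
slow_names = usernames_n_plus_one(slow_db)
fast_names = usernames_joined(fast_db)
print(slow_db.query_count, fast_db.query_count)
```

Both functions return the same usernames; only the number of round trips to the database differs, and that gap grows linearly with the number of articles.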
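1.3 Optimizing Model Design
Sound model design is critical to performance:
# Less convenient design: a hand-maintained join model
class ArticleTag(models.Model):
    article = models.ForeignKey('Article', on_delete=models.CASCADE)
    tag = models.ForeignKey('Tag', on_delete=models.CASCADE)
    # One row is created for every tag on every article
    # This can mean a large number of rows and heavy join queries
# Improved design
class Article(models.Model):
    # ...other fields...
    tags = models.ManyToManyField('Tag', related_name='articles')
    # Django's built-in ManyToManyField simplifies the design
    # (Django still creates an equivalent join table behind the scenes)
# Avoid redundant data in general, but measured denormalization can speed up reads
class Article(models.Model):
    # ...other fields...
    # Store a redundant comment count so it is not recomputed on every read
    comment_count = models.PositiveIntegerField(default=0)
    def update_comment_count(self):
        """Refresh the stored comment count."""
        self.comment_count = self.comments.count()
        self.save(update_fields=['comment_count'])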
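2. ORM Optimization Techniques
The Django ORM is a powerful abstraction layer, but careless use leads to serious performance problems.
2.1 Efficient Relationship Queries
from django.db.models import Prefetch
# select_related: for single-valued relations (ForeignKey, OneToOneField)
# Uses one SQL JOIN to preload the related objects
article = Article.objects.select_related('author', 'category').get(id=1)
# Accessing the related objects needs no extra queries
print(article.author.username)
print(article.category.name)
# prefetch_related: for ManyToMany or reverse relations
# Runs a separate query per relation, then joins the results in Python
articles = Article.objects.prefetch_related('tags', 'comments').all()
# Iterating over the related objects needs no extra queries
for article in articles:
    print([tag.name for tag in article.tags.all()])
    print([comment.text for comment in article.comments.all()])
# Optimizing nested relations
articles = Article.objects.prefetch_related(
    Prefetch('comments', queryset=Comment.objects.select_related('user'))
).all()
# Comments and their users can now be accessed without extra queries
for article in articles:
    for comment in article.comments.all():
        print(comment.user.username)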
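2.2 Advanced Query Optimization
# Fetch only the fields you need
authors = User.objects.values('id', 'username').filter(is_staff=True)
# Equivalent SQL: SELECT id, username FROM auth_user WHERE is_staff = TRUE
# Use defer to skip large fields
articles = Article.objects.defer('content').all()
# The content field (possibly large) is not loaded until it is accessed
# Use only to state exactly which fields to load
articles = Article.objects.only('id', 'title', 'published_at').all()
# Only the listed fields are loaded
# Aggregation
from django.db.models import Count, Avg, Sum
stats = Article.objects.values('category_id').annotate(
    article_count=Count('id'),
    avg_views=Avg('views_count')
)
# Per-category statistics in a single query
# Bulk update
Article.objects.filter(status='draft').update(status='published')
# A single SQL statement updates many rows (save() and signals are not called)
# Bulk delete
Article.objects.filter(published_at__lt=one_year_ago).delete()
# Deletes all matching rows at once; Django may run extra queries for cascades and signals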
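2.3 Controlling Query Execution
Control when queries run to improve performance:
# Django querysets are lazy - they run only when the results are needed
# Build the query
query = Article.objects.filter(status='published')
# No query has run yet
# Add more filters
if category:
    query = query.filter(category__slug=category)
# Still no query
# Operations that run a query
articles = list(query)  # runs the query
count = query.count()  # runs a COUNT query
exists = query.exists()  # runs an EXISTS query
# Avoid running the same query repeatedly
articles = list(Article.objects.all())  # runs the query and keeps the results
count = len(articles)  # no extra query
first = articles[0] if articles else None  # no extra query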
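3. Query Optimization Tools and Monitoring
3.1 Using django-debug-toolbar
Install django-debug-toolbar to analyze query performance:
# settings.py (development only)
INSTALLED_APPS = [
    # ...
    'debug_toolbar',
]
MIDDLEWARE = [
    # ...
    'debug_toolbar.middleware.DebugToolbarMiddleware',
]
INTERNAL_IPS = [
    '127.0.0.1',
]
# urls.py must also include: path('__debug__/', include('debug_toolbar.urls'))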
3.2 Writing Query Performance Tests
# tests/test_performance.py
from django.test import TestCase
from django.test.utils import CaptureQueriesContext
from django.db import connection
from django.urls import reverse
from blog.models import Article  # app label assumed for this example
class ArticleListPerformanceTest(TestCase):
    def setUp(self):
        # Create test data
        # ...
        pass
    def test_article_list_query_count(self):
        """Check the number of queries run by the article list page."""
        url = reverse('article_list')
        # Capture the SQL queries that are executed
        with CaptureQueriesContext(connection) as queries:
            response = self.client.get(url)
        # Check that the query count stays within an acceptable range
        self.assertLess(len(queries), 5, "The article list page ran too many queries")
    def test_article_detail_query_count(self):
        """Check the number of queries run by the article detail page."""
        article = Article.objects.first()
        url = reverse('article_detail', kwargs={'slug': article.slug})
        with CaptureQueriesContext(connection) as queries:
            response = self.client.get(url)
        # Check the query count
        self.assertLess(len(queries), 4, "The article detail page ran too many queries")
3.3 Logging Database Performance
Create a middleware that records slow queries:
# middleware.py
import logging
from django.conf import settings
from django.db import connection
logger = logging.getLogger('django.db.backends')
class DatabasePerformanceMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        # Slow-query threshold (ms)
        self.threshold = 500
    def __call__(self, request):
        # Clear the connection's query log
        connection.queries_log.clear()
        # Handle the request
        response = self.get_response(request)
        # Analyze queries only in DEBUG mode or when a specific parameter is passed
        # Note: Django records connection.queries only when DEBUG is True
        if (settings.DEBUG or request.GET.get('analyze_db')) and connection.queries:
            total_time = 0
            # Analyze each query
            for query in connection.queries:
                query_time = float(query.get('time', 0)) * 1000  # convert to milliseconds
                total_time += query_time
                # Record slow queries
                if query_time > self.threshold:
                    logger.warning(
                        f"Slow query ({query_time:.2f}ms): {query['sql']}"
                    )
            # Record the total query count and total time
            logger.debug(
                f"Request {request.method} {request.path} ran "
                f"{len(connection.queries)} queries in {total_time:.2f}ms"
            )
        return response
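The analysis step of that middleware can be pulled out into a plain function, which makes it easy to unit-test without a request cycle. This is only a sketch: summarize_queries and sample_log are hypothetical names, and the sample entries merely imitate the {'sql': ..., 'time': ...} shape of connection.queries.

```python
def summarize_queries(queries, threshold_ms=500.0):
    """Summarize a query log shaped like Django's connection.queries."""
    total_ms = 0.0
    slow = []
    for entry in queries:
        # Django stores the duration as a string of seconds, e.g. '0.002'.
        elapsed_ms = float(entry.get("time", 0)) * 1000
        total_ms += elapsed_ms
        if elapsed_ms > threshold_ms:
            slow.append((elapsed_ms, entry.get("sql", "")))
    return {"count": len(queries), "total_ms": total_ms, "slow": slow}

# Hand-written sample data for illustration only.
sample_log = [
    {"sql": "SELECT ... FROM blog_article", "time": "0.002"},
    {"sql": "SELECT ... FROM blog_comment", "time": "0.750"},
]
summary = summarize_queries(sample_log)
print(summary["count"], len(summary["slow"]))
```

Keeping the logic in a pure function also lets the same code feed a log line, a metrics counter, or a test assertion.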
4. Implementing a Caching Strategy
Caching is one of the most effective ways to speed up a Django application.
4.1 Basic Cache Configuration
# settings.py
# The CLIENT_CLASS and IGNORE_EXCEPTIONS options belong to the django-redis package,
# so the backend below is django-redis rather than Django's built-in Redis backend
CACHES = {
    'default': {
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
        'TIMEOUT': 300,  # default cache lifetime (seconds)
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
            'IGNORE_EXCEPTIONS': True,
        },
        'KEY_PREFIX': 'myapp',  # avoids cache key collisions
    },
    'session': {  # dedicated session cache
        'BACKEND': 'django_redis.cache.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/2',
        'TIMEOUT': 86400,  # 1 day
        'OPTIONS': {
            'CLIENT_CLASS': 'django_redis.client.DefaultClient',
        },
    },
    'staticfiles': {  # static files cache
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'TIMEOUT': 3600,  # 1 hour
        'LOCATION': 'staticfiles',
    }
}
# Use the cache-backed session engine in production
SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
SESSION_CACHE_ALIAS = 'session'
# Cache middleware
MIDDLEWARE = [
    # ...
    'django.middleware.cache.UpdateCacheMiddleware',  # should come first in the list
    # ... other middleware ...
    'django.middleware.cache.FetchFromCacheMiddleware',  # should come last in the list
]
# Site-wide cache settings
CACHE_MIDDLEWARE_ALIAS = 'default'
CACHE_MIDDLEWARE_SECONDS = 600  # 10 minutes
CACHE_MIDDLEWARE_KEY_PREFIX = 'myapp_page'
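4.2 View-Level Caching
Apply fine-grained cache control inside views:
# views.py
from django.db.models import F
from django.shortcuts import get_object_or_404, render
from django.utils.cache import patch_response_headers
from django.utils.decorators import method_decorator
from django.views.decorators.cache import cache_page, cache_control
from django.views.generic import ListView
from .models import Article
# Caching a function-based view
@cache_page(60 * 15)  # cache for 15 minutes
@cache_control(public=True)  # allow CDNs/proxies to cache
def article_list(request):
    # View logic
    articles = Article.objects.filter(status='published')
    return render(request, 'blog/article_list.html', {'articles': articles})
# For class-based views, use method_decorator
class ArticleListView(ListView):
    model = Article
    @method_decorator(cache_page(60 * 15))
    def dispatch(self, *args, **kwargs):
        return super().dispatch(*args, **kwargs)
# Conditional caching - only responses for anonymous users get cache headers
def article_detail(request, slug):
    # Fetch the article
    article = get_object_or_404(Article, slug=slug)
    # Increment the view count - caching must not block this update
    Article.objects.filter(id=article.id).update(views_count=F('views_count') + 1)
    # Build the response
    response = render(request, 'blog/article_detail.html', {'article': article})
    # Set cache headers for anonymous users only
    if not request.user.is_authenticated:
        patch_response_headers(response, cache_timeout=60 * 5)  # 5 minutes
    return response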
4.3 Template Fragment Caching
Cache specific parts of a template rather than the whole page:
{% load cache %}
{# Cache the sidebar for 24 hours #}
{% cache 86400 'sidebar' %}
<div class="sidebar">
    <h3>Categories</h3>
    <ul>
    {% for category in categories %}
        <li>{{ category.name }} ({{ category.article_count }})</li>
    {% endfor %}
    </ul>
    <h3>Tag cloud</h3>
    {% include "tags/tag_cloud.html" %}
</div>
{% endcache %}
{# Conditional caching of user-specific content #}
{% if user.is_authenticated %}
    {# Content for logged-in users - cached separately per user #}
    {% cache 3600 'user_recommendations' user.id %}
    <div class="recommendations">
        <h3>Recommended for you</h3>
        {% for article in user_recommendations %}
            {% include "articles/card.html" with article=article %}
        {% endfor %}
    </div>
    {% endcache %}
{% else %}
    {# Content for anonymous users - shared cache entry #}
    {% cache 3600 'trending_articles' %}
    <div class="trending">
        <h3>Trending articles</h3>
        {% for article in trending_articles %}
            {% include "articles/card.html" with article=article %}
        {% endfor %}
    </div>
    {% endcache %}
{% endif %}
4.4 The Low-Level Cache API
Use the low-level cache API when you need more flexibility:
# utils.py
from django.core.cache import cache, caches
from functools import wraps
import hashlib
import json
# Caching data with the low-level API
def get_popular_articles(category=None, limit=10):
    """Return the most popular articles."""
    # Build the cache key
    cache_key = f'popular_articles:{category}:{limit}'
    # Try the cache first
    result = cache.get(cache_key)
    if result is not None:
        return result
    # Cache miss: query the database
    queryset = Article.objects.filter(status='published')
    if category:
        queryset = queryset.filter(category__slug=category)
    # Fetch the popular articles
    popular = list(
        queryset.order_by('-views_count')[:limit]
        .values('id', 'title', 'slug', 'views_count')
    )
    # Store in the cache (1 hour)
    cache.set(cache_key, popular, 60 * 60)
    return popular
# Custom caching decorator
def cached_result(timeout=3600, key_prefix=''):
    """Decorator that caches a function's return value."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Build a unique cache key
            key_parts = [key_prefix, func.__name__]
            # Add positional arguments to the key
            for arg in args:
                key_parts.append(str(arg))
            # Add keyword arguments to the key
            for k, v in sorted(kwargs.items()):
                key_parts.append(f"{k}:{v}")
            # Hash with MD5 to get a deterministic key
            cache_key = hashlib.md5(':'.join(key_parts).encode()).hexdigest()
            # Try the cache first (note: a cached None is treated as a miss)
            result = cache.get(cache_key)
            if result is not None:
                return result
            # Cache miss: call the function
            result = func(*args, **kwargs)
            # Store in the cache
            cache.set(cache_key, result, timeout)
            return result
        return wrapper
    return decorator
# Usage example
@cached_result(timeout=600, key_prefix='user_stats')
def get_user_statistics(user_id):
    """Return statistics for a user."""
    # Complex queries...
    return stats
4.5 Cache Invalidation Strategies
Implement targeted cache invalidation:
# models.py
from django.db.models.signals import post_save, post_delete
from django.dispatch import receiver
from django.core.cache import cache
class Article(models.Model):
    # ... field definitions ...
    def save(self, *args, **kwargs):
        """Clear related cache entries automatically on save."""
        # Save the object
        super().save(*args, **kwargs)
        # Clear the cache entries for this article
        cache.delete(f'article:{self.pk}')
        cache.delete(f'article:{self.slug}')
        # Clear the affected list caches
        cache.delete_many([
            'article_list',
            f'category_articles:{self.category_id}',
            'popular_articles',
            'homepage_articles'
        ])
# Clearing the cache automatically with signals
@receiver(post_save, sender=Article)
def clear_article_cache(sender, instance, **kwargs):
    """Clear caches when an article is saved."""
    # Clear the list caches
    cache.delete('article_list')
    cache.delete(f'category_articles:{instance.category_id}')
@receiver(post_delete, sender=Article)
def clear_article_list_cache(sender, instance, **kwargs):
    """Clear the list caches when an article is deleted."""
    cache.delete('article_list')
    cache.delete(f'category_articles:{instance.category_id}')
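Deleting keys by exact name only works when every key is known in advance. Keys that embed parameters, such as popular_articles:{category}:{limit} from section 4.4, are not removed by deleting the bare name 'popular_articles'. One alternative is namespace versioning: bump a version number and let old entries expire on their own. The sketch below uses a plain dict as a stand-in for the cache; the function names are hypothetical.

```python
store = {}  # stand-in for the cache backend (illustration only)

def namespace_version(namespace):
    """Return the current version of a namespace, starting at 1."""
    return store.setdefault(f"version:{namespace}", 1)

def versioned_key(namespace, *parts):
    """Build a key that embeds the namespace's current version."""
    suffix = ":".join(str(part) for part in parts)
    return f"{namespace}:v{namespace_version(namespace)}:{suffix}"

def invalidate_namespace(namespace):
    """Bump the version so every key built earlier becomes unreachable."""
    store[f"version:{namespace}"] = namespace_version(namespace) + 1

key_before = versioned_key("popular_articles", "django", 10)
store[key_before] = ["cached list"]

invalidate_namespace("popular_articles")  # e.g. called from a post_save handler
key_after = versioned_key("popular_articles", "django", 10)
print(key_before, key_after)
```

The trade-off is that stale entries stay in the backend until their timeout passes, so this pattern suits caches with reasonably short lifetimes.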
5. Network and Front-End Optimization
5.1 Static File Optimization
Optimize how static files are served and loaded:
# settings.py
# Note: Django 4.2+ configures this through the STORAGES setting instead;
# STATICFILES_STORAGE applies to older versions
# Development
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.StaticFilesStorage'
# Production
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'
# Or use cloud storage
# STATICFILES_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
# Compression and minification
INSTALLED_APPS = [
    # ...
    'compressor',
]
COMPRESS_ENABLED = True
COMPRESS_OFFLINE = True  # pre-build the compressed files
COMPRESS_CSS_FILTERS = [
    'compressor.filters.css_default.CssAbsoluteFilter',
    'compressor.filters.cssmin.rCSSMinFilter',
]
COMPRESS_JS_FILTERS = [
    'compressor.filters.jsmin.JSMinFilter',
]
STATICFILES_FINDERS = [
    'django.contrib.staticfiles.finders.FileSystemFinder',
    'django.contrib.staticfiles.finders.AppDirectoriesFinder',
    'compressor.finders.CompressorFinder',
]
Use compression in templates:
{% load static compress %}
{% compress css %}
<link rel="stylesheet" href="{% static 'css/bootstrap.css' %}">
<link rel="stylesheet" href="{% static 'css/main.css' %}">
<link rel="stylesheet" href="{% static 'css/responsive.css' %}">
{% endcompress %}
{% compress js %}
<script src="{% static 'js/jquery.js' %}"></script>
<script src="{% static 'js/bootstrap.js' %}"></script>
<script src="{% static 'js/main.js' %}"></script>
{% endcompress %}
5.2 Template Optimization
Optimize template rendering performance:
{# Optimize loops #}
{% for article in article_list %}
    {# Use the with tag to store a computed value #}
    {% with article_url=article.get_absolute_url %}
    <div class="article-card">
        <h2><a href="{{ article_url }}">{{ article.title }}</a></h2>
        <p>{{ article.content|truncatewords:30 }}</p>
        <a href="{{ article_url }}" class="read-more">Read more</a>
    </div>
    {% endwith %}
{% endfor %}
{# Use include for shared fragments; "only" keeps the passed context small #}
{% include "articles/meta_info.html" with article=article only %}
{# Prefer built-in filters over custom filters #}
{{ article.votes }} vote{{ article.votes|pluralize }}
A custom caching template tag:
# templatetags/cache_tags.py
from django import template
from django.core.cache import cache
import hashlib
register = template.Library()
@register.simple_tag(takes_context=True)
def cached_include(context, template_name, timeout=3600, **kwargs):
    """Cache the rendered output of an included template."""
    # Build a cache key from all the arguments
    key_components = [template_name]
    # Add every keyword argument that was passed
    key_components.extend([f"{k}:{v}" for k, v in sorted(kwargs.items())])
    # Add the current user's ID (if available)
    if 'request' in context and hasattr(context['request'], 'user'):
        key_components.append(f"user:{context['request'].user.id}")
    # Create the cache key
    cache_key = f"template_include:{hashlib.md5(':'.join(key_components).encode()).hexdigest()}"
    # Try the cache first
    result = cache.get(cache_key)
    if result is not None:
        return result
    # Cache miss: render the template
    template = context.template.engine.get_template(template_name)
    # Create a new context containing the keyword arguments
    new_context = context.new({**kwargs})
    # Render the template
    result = template.render(new_context)
    # Store in the cache
    cache.set(cache_key, result, timeout)
    return result
Use it in a template:
{% load cache_tags %}
{# Cache components that are used often but change rarely #}
{% cached_include "components/sidebar.html" timeout=86400 %}
{# For user-specific content, the user ID is added to the cache key automatically #}
{% cached_include "components/user_recommendations.html" timeout=3600 %}
5.3 Response Compression and Browser Caching
Configure GZip compression and browser caching:
# settings.py
MIDDLEWARE = [
    # ...
    'django.middleware.gzip.GZipMiddleware',  # should be placed near the top
    # ...
]
# Custom middleware that sets browser cache headers
class CacheControlMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
    def __call__(self, request):
        response = self.get_response(request)
        # Cache headers for static files
        if request.path.startswith('/static/') or request.path.startswith('/media/'):
            # Cache static assets for 1 year
            response['Cache-Control'] = 'public, max-age=31536000, immutable'
        # Do not cache API responses
        elif request.path.startswith('/api/'):
            response['Cache-Control'] = 'no-store, no-cache, must-revalidate, max-age=0'
        # Moderate cache lifetime for ordinary pages
        elif not request.user.is_authenticated and request.method == 'GET':
            response['Cache-Control'] = 'public, max-age=600'  # 10 minutes
        return response
6. View Performance Optimization
6.1 Asynchronous Views
Use the async views available in Django 3.1+ to improve performance:
# views.py
import asyncio
import httpx
from django.http import HttpResponse
from django.shortcuts import render
async def async_view(request):
    """Handle I/O-bound work asynchronously."""
    async with httpx.AsyncClient() as client:
        # Prepare several API requests at once
        weather_task = client.get('https://api.weather.com')
        news_task = client.get('https://api.news.com')
        stock_task = client.get('https://api.stocks.com')
        # Wait for all responses concurrently
        weather_response, news_response, stock_response = await asyncio.gather(
            weather_task, news_task, stock_task
        )
    # Process the responses
    context = {
        'weather': weather_response.json(),
        'news': news_response.json(),
        'stocks': stock_response.json()
    }
    return render(request, 'dashboard.html', context)
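The benefit of asyncio.gather is easiest to see with simulated I/O. The sketch below replaces the HTTP calls with asyncio.sleep, so fake_fetch and load_dashboard are hypothetical stand-ins rather than part of the view above.

```python
import asyncio
import time

async def fake_fetch(name, delay):
    """Simulate an I/O-bound call such as an HTTP request."""
    await asyncio.sleep(delay)
    return f"{name}-data"

async def load_dashboard():
    # gather runs the three coroutines concurrently and
    # returns their results in the order they were passed in.
    return await asyncio.gather(
        fake_fetch("weather", 0.1),
        fake_fetch("news", 0.1),
        fake_fetch("stocks", 0.1),
    )

started = time.perf_counter()
results = asyncio.run(load_dashboard())
elapsed = time.perf_counter() - started
print(results, f"{elapsed:.2f}s")
```

Run one after another, the three calls would need at least 0.3 seconds; run concurrently, the total stays close to the slowest single call.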
6.2 Pagination Optimization
Optimize pagination over large datasets:
# views.py
from django.core.paginator import Paginator, EmptyPage, PageNotAnInteger
from django.shortcuts import render
def article_list(request):
    # Base queryset
    queryset = Article.objects.filter(status='published')
    # Add optimizations
    queryset = queryset.select_related('author', 'category')
    # Key point - use the customized Paginator class
    paginator = CachedPaginator(queryset, 20)  # 20 items per page
    page = request.GET.get('page')
    try:
        articles = paginator.page(page)
    except PageNotAnInteger:
        # If the page number is not an integer, show the first page
        articles = paginator.page(1)
    except EmptyPage:
        # If the page number is out of range, show the last page
        articles = paginator.page(paginator.num_pages)
    return render(request, 'blog/article_list.html', {'articles': articles})
# Customized paginator
class CachedPaginator(Paginator):
    """Paginator that keeps the COUNT result and clamps out-of-range pages.
    Note: Django's own Paginator already caches count per instance, so the
    main behavioral change here is the clamping in validate_number."""
    _count = None
    def validate_number(self, number):
        """Validate the page number."""
        try:
            number = int(number)
        except (TypeError, ValueError):
            raise PageNotAnInteger('Page number is not an integer')
        if number < 1:
            return 1
        if number > self.num_pages:
            if number == 1 and self.allow_empty_first_page:
                return 1
            return self.num_pages
        return number
    @property
    def count(self):
        """Cache the number of items in the collection."""
        if self._count is None:
            self._count = self.object_list.count()
        return self._count
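6.3 Bulk Operation Views
Optimize views that perform bulk operations:
# views.py
from django.contrib import messages
from django.db import transaction
from django.shortcuts import redirect
from django.utils import timezone
from django.views.decorators.http import require_POST
@require_POST
@transaction.atomic
def bulk_publish_articles(request):
    """Publish articles in bulk."""
    article_ids = request.POST.getlist('article_ids')
    # Use one bulk update instead of updating in a loop
    updated_count = Article.objects.filter(
        id__in=article_ids,
        status='draft'
    ).update(
        status='published',
        published_at=timezone.now()
    )
    messages.success(request, f"Published {updated_count} articles")
    return redirect('admin:article_list')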
7. Middleware Optimization
7.1 Custom Cache Middleware
A cache middleware for specific URL patterns:
# middleware.py
from django.utils.deprecation import MiddlewareMixin
from django.core.cache import cache
import re
import hashlib
class CustomCacheMiddleware(MiddlewareMixin):
    """Cache middleware for specific URL patterns."""
    def __init__(self, get_response):
        super().__init__(get_response)
        # URL patterns to cache, with their timeouts
        self.cache_patterns = [
            (re.compile(r'^/articles/$'), 300),  # article list: 5 minutes
            (re.compile(r'^/categories/[\w-]+/$'), 600),  # category pages: 10 minutes
            (re.compile(r'^/tags/[\w-]+/$'), 600),  # tag pages: 10 minutes
            (re.compile(r'^/api/public/'), 60),  # public API: 1 minute
        ]
    def process_request(self, request):
        """Handle the request: return a cached response if there is one."""
        # Skip non-GET requests and logged-in users
        # (request.user requires AuthenticationMiddleware to run before this one)
        if request.method != 'GET' or request.user.is_authenticated:
            return None
        # Check whether the URL matches a cache pattern
        cache_timeout = None
        for pattern, timeout in self.cache_patterns:
            if pattern.match(request.path_info):
                cache_timeout = timeout
                break
        if cache_timeout is None:
            return None  # not cached
        # Build the cache key
        cache_key = self._get_cache_key(request)
        # Try the cache
        response = cache.get(cache_key)
        return response
    def process_response(self, request, response):
        """Handle the response: store it in the cache."""
        # Check whether the response is cacheable
        if self._should_cache_response(request, response):
            # Check whether the URL matches a cache pattern
            for pattern, timeout in self.cache_patterns:
                if pattern.match(request.path_info):
                    # Build the cache key
                    cache_key = self._get_cache_key(request)
                    # Store in the cache
                    cache.set(cache_key, response, timeout)
                    break
        return response
    def _get_cache_key(self, request):
        """Build the cache key."""
        # Based on the full URL path (including query parameters)
        key_parts = [request.path_info]
        if request.GET:
            key_parts.append(request.GET.urlencode())
        # Include the Accept and Accept-Encoding headers in the key
        # to handle different content types and compression formats
        for header in ['HTTP_ACCEPT', 'HTTP_ACCEPT_ENCODING']:
            value = request.META.get(header, '')
            if value:
                key_parts.append(f"{header}:{value}")
        # Hash the parts with MD5
        key = hashlib.md5(':'.join(key_parts).encode()).hexdigest()
        return f"page_cache:{key}"
    def _should_cache_response(self, request, response):
        """Decide whether the response can be cached."""
        # Only cache successful GET requests
        if request.method != 'GET' or response.status_code != 200:
            return False
        # Do not cache responses for logged-in users
        if request.user.is_authenticated:
            return False
        # Do not cache when the request carries certain cookies
        for cookie in request.COOKIES:
            if cookie.startswith('sessionid') or cookie.startswith('csrftoken'):
                return False
        # Do not cache streaming responses
        if getattr(response, 'streaming', False):
            return False
        return True
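The pattern lookup appears twice in the middleware above, once per hook. A small pure function can hold that logic in one place and be tested directly. This is a sketch under that assumption; CACHE_PATTERNS and timeout_for are hypothetical names that reuse three of the patterns from the middleware.

```python
import re

# Compiled once at import time and reused for every request.
CACHE_PATTERNS = [
    (re.compile(r"^/articles/$"), 300),
    (re.compile(r"^/categories/[\w-]+/$"), 600),
    (re.compile(r"^/api/public/"), 60),
]

def timeout_for(path):
    """Return the cache timeout for a path, or None if it is not cacheable."""
    for pattern, timeout in CACHE_PATTERNS:
        if pattern.match(path):
            return timeout
    return None

print(timeout_for("/articles/"), timeout_for("/admin/"))
```

Both process_request and process_response could then call the same helper, so the two hooks cannot drift apart when a pattern is added or changed.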
7.2 Optimizing Middleware Performance
Keep the middleware itself fast:
# middleware.py
import re
class OptimizedMiddleware:
    """Example of a performance-conscious middleware."""
    def __init__(self, get_response):
        self.get_response = get_response
        # Do expensive setup once at initialization, not on every request
        self._compile_patterns()
    def _compile_patterns(self):
        """Precompile the regular expressions."""
        # Compile once, reuse for every request
        self.admin_pattern = re.compile(r'^/admin/')
        self.static_pattern = re.compile(r'^/static/')
        self.api_pattern = re.compile(r'^/api/')
    def __call__(self, request):
        # Fast path - skip requests that need no processing
        if self.static_pattern.match(request.path_info):
            return self.get_response(request)
        # Selective processing - different logic for different paths
        if self.admin_pattern.match(request.path_info):
            # Admin-specific handling
            pass
        elif self.api_pattern.match(request.path_info):
            # API-specific handling
            pass
        else:
            # Ordinary page handling
            pass
        # Handle the response
        response = self.get_response(request)
        # Post-processing
        # ...
        return response
8. Session and Authentication Optimization
8.1 Session Storage Optimization
Optimize session storage:
# settings.py
# Store sessions in the cache
SESSION_ENGINE = 'django.contrib.sessions.backends.cache'
SESSION_CACHE_ALIAS = 'session'
# Or use a cache backed by the database for persistence (choose one engine)
# SESSION_ENGINE = 'django.contrib.sessions.backends.cached_db'
# Session cookie settings
SESSION_COOKIE_AGE = 1209600  # 2 weeks (seconds)
SESSION_COOKIE_SECURE = True  # HTTPS only
SESSION_COOKIE_HTTPONLY = True  # block JavaScript access
SESSION_COOKIE_SAMESITE = 'Lax'
SESSION_SAVE_EVERY_REQUEST = False  # avoid frequent writes
8.2 User Authentication Optimization
Optimize the performance of the authentication flow:
# settings.py
# Custom authentication backends
AUTHENTICATION_BACKENDS = [
    'myapp.auth.EmailOrUsernameModelBackend',
    'django.contrib.auth.backends.ModelBackend',
]
# Custom backend implementation
# auth.py
from django.contrib.auth import get_user_model
from django.contrib.auth.backends import ModelBackend
from django.db.models import Q
User = get_user_model()
class EmailOrUsernameModelBackend(ModelBackend):
    """Allow login with either an email address or a username."""
    def authenticate(self, request, username=None, password=None, **kwargs):
        if username is None:
            username = kwargs.get('email')
        if username is None or password is None:
            return None
        # Key optimization - check username and email in a single query
        # instead of querying one and then the other
        try:
            user = User.objects.get(
                Q(username__iexact=username) | Q(email__iexact=username)
            )
        except User.DoesNotExist:
            # Run the same password hashing as for a real user to resist timing attacks
            User().set_password(password)
            return None
        except User.MultipleObjectsReturned:
            # Ambiguous match (for example, one user's username is another's email)
            return None
        if user.check_password(password) and self.user_can_authenticate(user):
            return user
        return None
8.3 Permission Check Optimization
Optimize permission checks that run frequently:
# utils.py
from django.contrib.auth.views import redirect_to_login
from django.core.cache import cache
from django.core.exceptions import PermissionDenied
from functools import wraps
def cached_permission_required(perm, timeout=3600):
    """Permission-check decorator with caching."""
    def decorator(view_func):
        @wraps(view_func)
        def _wrapped_view(request, *args, **kwargs):
            if not request.user.is_authenticated:
                return redirect_to_login(request.path)
            # Build the cache key
            cache_key = f"perm:{request.user.id}:{perm}"
            # Try to read the permission result from the cache
            has_perm = cache.get(cache_key)
            if has_perm is None:
                # Cache miss: run the permission check
                has_perm = request.user.has_perm(perm)
                # Store in the cache (permission changes take up to `timeout` seconds to apply)
                cache.set(cache_key, has_perm, timeout)
            if not has_perm:
                raise PermissionDenied
            return view_func(request, *args, **kwargs)
        return _wrapped_view
    return decorator
# Usage example
@cached_permission_required('blog.publish_article')
def publish_article(request, article_id):
    # View logic...
    pass
9. Database Connection and Transaction Optimization
9.1 Database Connection Pooling
Reuse database connections instead of opening one per request:
# settings.py
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydatabase',
        'USER': 'mydatabaseuser',
        'PASSWORD': 'mypassword',
        'HOST': '127.0.0.1',
        'PORT': '5432',
        'CONN_MAX_AGE': 60,  # persistent connections: reuse for up to 60 seconds (not a true pool)
        'OPTIONS': {
            # PostgreSQL-specific options
            'sslmode': 'require',
        },
    }
}
For a real connection pool, you can use a third-party library:
# Using django-db-connection-pool
DATABASES = {
    'default': {
        'ENGINE': 'dj_db_conn_pool.backends.postgresql',
        'NAME': 'mydatabase',
        # ...other settings...
        'POOL_OPTIONS': {
            'POOL_SIZE': 20,  # connections kept in the pool
            'MAX_OVERFLOW': 10,  # maximum extra connections
            'RECYCLE': 300,  # connection recycle time (seconds)
        }
    }
}
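9.2 Transaction Optimization
Optimize database transactions:
# views.py
from django.contrib import messages
from django.db import transaction
from django.shortcuts import get_object_or_404, redirect, render
# Method 1: decorator - the whole view runs in a single transaction
@transaction.atomic
def create_article(request):
    # All database operations happen in one transaction
    title = request.POST.get('title')
    content = request.POST.get('content')
    tags = request.POST.getlist('tags')
    # Create the article
    article = Article.objects.create(title=title, content=content)
    # Add tags
    for tag_name in tags:
        tag, _ = Tag.objects.get_or_create(name=tag_name)
        article.tags.add(tag)
    # If any step fails, every operation is rolled back
    return render(request, 'success.html')
# Method 2: context manager - only part of the code runs in a transaction
def update_article(request, article_id):
    article = get_object_or_404(Article, id=article_id)
    # Update the basic fields - no transaction needed
    article.title = request.POST.get('title')
    article.content = request.POST.get('content')
    article.save()
    # Use a transaction for the more complex tag update
    with transaction.atomic():
        # Remove the existing tags
        article.tags.clear()
        # Add the new tags
        for tag_name in request.POST.getlist('tags'):
            tag, _ = Tag.objects.get_or_create(name=tag_name)
            article.tags.add(tag)
    return redirect('article_detail', slug=article.slug)
# Method 3: savepoints - create a restore point inside a transaction
@transaction.atomic
def process_import(request):
    if request.method == 'POST' and request.FILES.get('import_file'):
        # The transaction is already open (atomic decorator)
        # Create a savepoint
        save_point = transaction.savepoint()
        try:
            # Import the data
            import_data(request.FILES['import_file'])
        except Exception as e:
            # An error occurred: roll back to the savepoint
            transaction.savepoint_rollback(save_point)
            messages.error(request, f"Import failed: {str(e)}")
        else:
            # Success: commit the savepoint
            transaction.savepoint_commit(save_point)
            messages.success(request, "Data imported successfully")
    return render(request, 'import.html')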
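Savepoint behavior can be observed directly at the SQL level. The following sketch uses the standard-library sqlite3 module with manual transaction control instead of Django's transaction API; the item table and the savepoint name import_batch are assumptions for illustration.

```python
import sqlite3

# isolation_level=None disables implicit transactions, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE item (name TEXT NOT NULL)")

conn.execute("BEGIN")
conn.execute("INSERT INTO item VALUES ('kept')")

conn.execute("SAVEPOINT import_batch")
try:
    conn.execute("INSERT INTO item VALUES ('discarded')")
    conn.execute("INSERT INTO item VALUES (NULL)")  # violates NOT NULL
except sqlite3.IntegrityError:
    # Undo everything since the savepoint; the outer transaction stays open.
    conn.execute("ROLLBACK TO SAVEPOINT import_batch")
conn.execute("RELEASE SAVEPOINT import_batch")
conn.execute("COMMIT")

names = [row[0] for row in conn.execute("SELECT name FROM item")]
print(names)
```

Work done before the savepoint survives, while the partially applied batch is discarded, which is the same guarantee the savepoint helpers give inside an atomic block.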
9.3 Using select_for_update to Avoid Race Conditions
# views.py
from django.contrib import messages
from django.db import transaction
from django.shortcuts import redirect
@transaction.atomic
def add_to_cart(request, product_id):
    # Lock the product row so concurrent updates cannot corrupt the stock count
    product = Product.objects.select_for_update().get(id=product_id)
    quantity = int(request.POST.get('quantity', 1))
    # Check stock
    if product.stock < quantity:
        messages.error(request, "Insufficient stock")
        return redirect('product_detail', id=product_id)
    # Update stock
    product.stock -= quantity
    product.save()
    # Add to the cart
    cart_item, created = CartItem.objects.get_or_create(
        cart=request.user.cart,
        product=product,
        defaults={'quantity': 0}
    )
    cart_item.quantity += quantity
    cart_item.save()
    messages.success(request, "Product added to cart")
    return redirect('cart')
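Row locking is one way to protect the stock count. A different technique, shown here as an alternative rather than as what the view above does, is a single conditional UPDATE that only succeeds when enough stock remains. In Django this corresponds to filter(id=..., stock__gte=quantity).update(stock=F('stock') - quantity), which returns the number of matched rows. The sketch uses sqlite3, and reserve_stock is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, stock INTEGER NOT NULL)")
conn.execute("INSERT INTO product VALUES (1, 5)")
conn.commit()

def reserve_stock(conn, product_id, quantity):
    """Decrement stock atomically; return True only if enough stock was available."""
    cursor = conn.execute(
        "UPDATE product SET stock = stock - ? WHERE id = ? AND stock >= ?",
        (quantity, product_id, quantity),
    )
    conn.commit()
    # rowcount is 1 when the condition matched, 0 when stock was too low.
    return cursor.rowcount == 1

first = reserve_stock(conn, 1, 3)   # 5 -> 2
second = reserve_stock(conn, 1, 3)  # only 2 left, so nothing changes
remaining = conn.execute("SELECT stock FROM product WHERE id = 1").fetchone()[0]
print(first, second, remaining)
```

Because the check and the decrement happen in one statement, there is no window between reading the stock and writing it back.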
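10. Asynchronous Tasks and Background Processing
10.1 Using Celery for Long-Running Tasks
# tasks.py
from celery import shared_task
from django.core.cache import cache
from django.core.mail import send_mail
from .models import Article
# process_content, generate_summary, extract_keywords and calculate_statistics
# are project helpers assumed to be defined elsewhere
@shared_task
def process_article_submission(article_id):
    """Process a submitted article."""
    article = Article.objects.get(id=article_id)
    # Process the content (formatting, filtering sensitive content, and so on)
    article.content = process_content(article.content)
    # Generate a summary
    article.summary = generate_summary(article.content)
    # Extract keywords
    article.keywords = extract_keywords(article.content)
    # Save the changes
    article.save()
    # Notify the editor
    send_mail(
        f'New article submitted: {article.title}',
        f'Author {article.author} submitted a new article. Please review it.',
        'noreply@example.com',
        ['editor@example.com'],
        fail_silently=False,
    )
    return article.id
@shared_task
def generate_site_stats():
    """Generate site statistics."""
    # Expensive statistics calculation...
    stats = calculate_statistics()
    # Store the result
    cache.set('site_stats', stats, 60 * 60 * 6)  # cache for 6 hours
    return True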
Use it in a view:
# views.py
def submit_article(request):
    if request.method == 'POST':
        form = ArticleForm(request.POST)
        if form.is_valid():
            # Save the article
            article = form.save(commit=False)
            article.author = request.user
            article.status = 'pending'
            article.save()
            # Hand the processing to a background task
            process_article_submission.delay(article.id)
            messages.success(request, "Article submitted; it will be processed in the background")
            return redirect('article_submitted')
    else:
        form = ArticleForm()
    return render(request, 'submit_article.html', {'form': form})
10.2 Periodic Background Tasks
Configure tasks that run on a schedule:
# settings.py
CELERY_BEAT_SCHEDULE = {
    'daily_cleanup': {
        'task': 'myapp.tasks.cleanup_old_data',
        'schedule': 86400.0,  # once a day
    },
    'generate_stats': {
        'task': 'myapp.tasks.generate_site_stats',
        'schedule': 3600.0,  # once an hour
    },
    'update_search_index': {
        'task': 'myapp.tasks.update_search_index',
        'schedule': 1800.0,  # every 30 minutes
    },
}
10.3 Using Asynchronous I/O
Combine Django Channels with WebSockets and async I/O:
# consumers.py
import json
from channels.generic.websocket import AsyncWebsocketConsumer
from channels.db import database_sync_to_async
class NotificationConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        self.user = self.scope['user']
        if not self.user.is_authenticated:
            await self.close()
            return
        self.group_name = f'user_{self.user.id}_notifications'
        # Join the group
        await self.channel_layer.group_add(
            self.group_name,
            self.channel_name
        )
        await self.accept()
        # Send unread notifications
        unread = await self.get_unread_notifications()
        await self.send(text_data=json.dumps({
            'type': 'unread_notifications',
            'notifications': unread
        }))
    async def disconnect(self, close_code):
        # Leave the group
        await self.channel_layer.group_discard(
            self.group_name,
            self.channel_name
        )
    async def receive(self, text_data):
        data = json.loads(text_data)
        action = data.get('action')
        if action == 'mark_read':
            notification_id = data.get('id')
            await self.mark_notification_read(notification_id)
    async def notification_message(self, event):
        """Handle a new notification."""
        await self.send(text_data=json.dumps({
            'type': 'new_notification',
            'notification': event['notification']
        }))
    @database_sync_to_async
    def get_unread_notifications(self):
        """Return unread notifications."""
        return list(
            self.user.notifications.filter(read=False)
            .order_by('-created_at')
            .values('id', 'message', 'created_at')
        )
    @database_sync_to_async
    def mark_notification_read(self, notification_id):
        """Mark a notification as read."""
        self.user.notifications.filter(id=notification_id).update(read=True)
11. Search Optimization
11.1 Integrating a Dedicated Search Engine
Use Elasticsearch for high-performance search:
# settings.py
INSTALLED_APPS = [
    # ...
    'django_elasticsearch_dsl',
]
ELASTICSEARCH_DSL = {
    'default': {
        'hosts': 'localhost:9200'
    },
}
# documents.py
from django.contrib.auth.models import User
from django_elasticsearch_dsl import Document, fields
from django_elasticsearch_dsl.registries import registry
from .models import Article, Category, Tag
@registry.register_document
class ArticleDocument(Document):
    """Search document for articles."""
    author = fields.ObjectField(properties={
        'id': fields.IntegerField(),
        'username': fields.TextField(),
    })
    category = fields.ObjectField(properties={
        'id': fields.IntegerField(),
        'name': fields.TextField(),
        'slug': fields.TextField(),
    })
    tags = fields.NestedField(properties={
        'id': fields.IntegerField(),
        'name': fields.TextField(),
    })
    class Index:
        name = 'articles'
        settings = {
            'number_of_shards': 1,
            'number_of_replicas': 0
        }
    class Django:
        model = Article
        fields = [
            'id',
            'title',
            'slug',
            'content',
            'status',
            'published_at',
            'views_count',
        ]
        # Needed for get_instances_from_related to be triggered
        related_models = [User, Category, Tag]
        queryset_pagination = 500  # index 500 articles per batch
    # Index published articles only
    def get_queryset(self):
        return super().get_queryset().filter(
            status='published'
        ).select_related('author', 'category').prefetch_related('tags')
    def get_instances_from_related(self, related_instance):
        """Update documents when a related object changes."""
        if isinstance(related_instance, User):
            return related_instance.article_set.all()
        elif isinstance(related_instance, Category):
            return related_instance.article_set.all()
        elif isinstance(related_instance, Tag):
            return related_instance.articles.all()
Implement the search view:
# views.py
import math
from django.shortcuts import render
from elasticsearch_dsl import Q
from .documents import ArticleDocument
def search(request):
    """Search articles with Elasticsearch."""
    q = request.GET.get('q', '')
    if not q:
        return render(request, 'search.html', {'query': q})
    # Build the query
    search_query = Q(
        'multi_match',
        query=q,
        fields=['title^3', 'content', 'category.name', 'tags.name'],
        fuzziness='AUTO'
    )
    # Filters
    category = request.GET.get('category')
    if category:
        search_query = search_query & Q('term', category__slug=category)
    # Run the search
    search_results = ArticleDocument.search().query(search_query)
    # Sorting
    sort = request.GET.get('sort')
    if sort == 'recent':
        search_results = search_results.sort('-published_at')
    elif sort == 'popular':
        search_results = search_results.sort('-views_count')
    # Pagination (fall back to page 1 on invalid input)
    try:
        page = max(int(request.GET.get('page', 1)), 1)
    except ValueError:
        page = 1
    per_page = 20
    start = (page - 1) * per_page
    end = start + per_page
    # Total number of results
    total = search_results.count()
    # Results for the current page
    search_results = search_results[start:end].execute()
    context = {
        'query': q,
        'search_results': search_results,
        'total': total,
        'page': page,
        'per_page': per_page,
        # Ceiling division; (total // per_page) + 1 overcounts on exact multiples
        'pages': max(1, math.ceil(total / per_page)),
    }
    return render(request, 'search.html', context)
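The page arithmetic used for search results is easy to get wrong at the boundaries, so it helps to isolate it. In this sketch, page_count and page_slice are hypothetical helpers; they show that a formula such as (total // per_page) + 1 reports one page too many whenever the total is an exact multiple of the page size.

```python
import math

def page_count(total, per_page):
    """Number of pages needed for `total` items (at least one page)."""
    return max(1, math.ceil(total / per_page))

def page_slice(page, per_page, total):
    """Clamp the page number into range and return (start, end) slice bounds."""
    page = min(max(page, 1), page_count(total, per_page))
    start = (page - 1) * per_page
    return start, start + per_page

naive_pages = (40 // 20) + 1      # reports 3 pages for 40 items
correct_pages = page_count(40, 20)  # 2 pages
print(naive_pages, correct_pages, page_slice(99, 20, 41))
```

Clamping the page number before slicing also avoids sending a request for an offset far beyond the last result.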
11.2 Basic Search with the Database
For simple projects, database-backed search can be optimized instead:
# PostgreSQL full-text search
from django.contrib.postgres.search import SearchVector, SearchQuery, SearchRank
from django.core.paginator import Paginator, EmptyPage, PageNotAnInteger
from django.shortcuts import render
def db_search(request):
    """Database-backed search implementation."""
    q = request.GET.get('q', '')
    if not q:
        return render(request, 'search.html', {'query': q})
    # Build the search vector
    vector = SearchVector('title', weight='A') + \
        SearchVector('content', weight='B') + \
        SearchVector('category__name', weight='C') + \
        SearchVector('tags__name', weight='C')
    # Build the search query
    query = SearchQuery(q)
    # Run the search and order by relevance
    search_results = Article.objects.annotate(
        rank=SearchRank(vector, query)
    ).filter(
        rank__gt=0.1,
        status='published'
    ).order_by('-rank').distinct()
    # Pagination
    paginator = Paginator(search_results, 20)
    page = request.GET.get('page')
    try:
        results = paginator.page(page)
    except PageNotAnInteger:
        results = paginator.page(1)
    except EmptyPage:
        results = paginator.page(paginator.num_pages)
    return render(request, 'search.html', {
        'query': q,
        'results': results
    })
12. Monitoring and performance analysis
12.1 Monitoring performance with Django Silk
Django Silk is a powerful profiling tool:
# settings.py
INSTALLED_APPS = [
    # ...
    'silk',
]
MIDDLEWARE = [
    # ...
    'silk.middleware.SilkyMiddleware',
]
# Silk configuration
SILKY_PYTHON_PROFILER = True
SILKY_PYTHON_PROFILER_BINARY = True
SILKY_META = True
SILKY_AUTHENTICATION = True  # require login to access Silk
SILKY_AUTHORISATION = True  # require staff permission
SILKY_MAX_RECORDED_REQUESTS = 10000
SILKY_MAX_RECORDED_REQUESTS_CHECK_PERCENT = 10
# urls.py
from django.urls import include, path

urlpatterns = [
    # ...
    path('silk/', include('silk.urls', namespace='silk')),
]
12.2 Collecting custom performance metrics
A middleware that collects performance data:
# middleware.py
import logging
import time

from django.db import connection, reset_queries

logger = logging.getLogger('performance')


class PerformanceMonitorMiddleware:
    """Monitor request performance."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Start timing
        start_time = time.perf_counter()
        # Clear the query log (connection.queries is only populated when DEBUG=True)
        reset_queries()
        # Handle the request
        response = self.get_response(request)
        # Compute the elapsed time
        duration = time.perf_counter() - start_time
        # Log slow requests
        if duration > 1.0:  # requests that take longer than 1 second
            query_count = len(connection.queries)
            logger.warning(
                f"Slow request: {request.method} {request.path} took {duration:.2f}s, "
                f"ran {query_count} queries"
            )
            # Log every slow query of this request
            for i, query in enumerate(connection.queries):
                sql = query.get('sql', '').replace('\n', ' ')
                time_taken = query.get('time', 0)
                if float(time_taken) > 0.1:  # queries slower than 100 ms
                    logger.warning(
                        f"Slow query {i+1}/{query_count}: took {time_taken}s\n{sql}"
                    )
        return response
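The slow-query filter inside the middleware can be exercised without a running Django project. The sketch below assumes the entries have the same shape as `django.db.connection.queries`, namely dicts with an `'sql'` key and a `'time'` key holding seconds as a string; `slow_queries` and the sample data are hypothetical names introduced for illustration.

```python
def slow_queries(queries, threshold=0.1):
    """Return (position, seconds, sql) for each query slower than `threshold` seconds."""
    slow = []
    for position, query in enumerate(queries, start=1):
        seconds = float(query.get('time', 0))  # 'time' arrives as a string
        if seconds > threshold:
            # Collapse newlines and repeated whitespace so each query logs on one line
            sql = ' '.join(query.get('sql', '').split())
            slow.append((position, seconds, sql))
    return slow


# Hypothetical sample shaped like connection.queries
sample = [
    {'sql': 'SELECT 1', 'time': '0.002'},
    {'sql': 'SELECT *\nFROM blog_article', 'time': '0.250'},
]
for position, seconds, sql in slow_queries(sample):
    print(f"slow query #{position}: {seconds:.3f}s {sql}")
```

Keeping the filter in a plain function like this makes the 100 ms threshold easy to unit test and to tune per environment.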
12.3 Analyzing performance with monitoring tools
# Third-party monitoring integration
# Example for New Relic
# pip install newrelic
# Add to wsgi.py
import newrelic.agent
newrelic.agent.initialize('/path/to/newrelic.ini')
import os
from django.core.wsgi import get_wsgi_application
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "myproject.settings")
application = get_wsgi_application()
application = newrelic.agent.WSGIApplicationWrapper(application)
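13. Deployment optimization
13.1 Gunicorn and Nginx configuration
Tuning the Gunicorn configuration:
# gunicorn_config.py
import multiprocessing
# Number of worker processes
workers = multiprocessing.cpu_count() * 2 + 1
# Threads per worker; this setting only affects the gthread worker class,
# so it has no effect with the gevent workers selected below
threads = 2
# Worker class
worker_class = 'gevent'  # cooperative greenlet-based workers; requires the gevent package
# Timeouts
timeout = 30
keepalive = 2
# Recycle each worker after this many requests (jitter staggers the restarts)
max_requests = 1000
max_requests_jitter = 50
# Logging
accesslog = '/var/log/gunicorn/access.log'
errorlog = '/var/log/gunicorn/error.log'
loglevel = 'warning'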
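The worker count follows the rule of thumb from the Gunicorn documentation, (2 x cores) + 1. The sketch below simply evaluates that formula for a few core counts; `recommended_workers` and its `cap` parameter are hypothetical, added here because memory-constrained hosts often need an upper bound.

```python
import multiprocessing


def recommended_workers(cpu_cores=None, cap=None):
    """Apply the (2 x cores) + 1 rule of thumb, optionally capped."""
    if cpu_cores is None:
        cpu_cores = multiprocessing.cpu_count()
    workers = cpu_cores * 2 + 1
    return min(workers, cap) if cap is not None else workers


for cores in (1, 2, 4, 8):
    print(cores, "cores ->", recommended_workers(cores), "workers")
```

Treat the result as a starting point: the right number depends on how much memory each worker uses and on whether the workload is CPU-bound or I/O-bound, so it is worth validating under load.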
Optimized Nginx configuration:
# nginx.conf
# Static file caching
location /static/ {
    alias /path/to/static/;
    expires 30d;  # also emits Cache-Control: max-age=2592000
    add_header Cache-Control "public";
    access_log off;
}
location /media/ {
    alias /path/to/media/;
    expires 7d;  # also emits Cache-Control: max-age=604800
    add_header Cache-Control "public";
    access_log off;
}
# Gzip compression
gzip on;
gzip_comp_level 5;
gzip_min_length 256;
gzip_proxied any;
gzip_vary on;
gzip_types
    application/javascript
    application/json
    application/xml
    application/rss+xml
    text/css
    text/javascript
    text/plain
    text/xml;
# Proxy buffer settings
proxy_buffering on;
proxy_buffer_size 16k;
proxy_buffers 8 16k;
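The `expires` durations in the static and media blocks translate into `max-age` values expressed in seconds. As a quick sanity check, the conversion can be done in a few lines of Python; `max_age` is a hypothetical helper used only for this arithmetic.

```python
from datetime import timedelta


def max_age(**duration):
    """Convert a duration such as days=30 into a Cache-Control max-age in seconds."""
    return int(timedelta(**duration).total_seconds())


print("static:", max_age(days=30))  # matches `expires 30d`
print("media: ", max_age(days=7))   # matches `expires 7d`
```

Long lifetimes like these are only safe when file names change with their content, for example through hashed static file names; otherwise browsers keep serving stale assets until the cache entry expires.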
13.2 Optimizing static file serving
Serving static files from cloud storage:
# settings.py
# Uses django-storages and boto3
import os

INSTALLED_APPS = [
    # ...
    'storages',
]
# S3 configuration
AWS_ACCESS_KEY_ID = os.environ.get('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.environ.get('AWS_SECRET_ACCESS_KEY')
AWS_STORAGE_BUCKET_NAME = 'my-static-bucket'
AWS_S3_REGION_NAME = 'us-east-1'
AWS_DEFAULT_ACL = 'public-read'
AWS_S3_OBJECT_PARAMETERS = {
    'CacheControl': 'max-age=86400',
}
# Static files
STATICFILES_STORAGE = 'storages.backends.s3boto3.S3Boto3Storage'
STATIC_URL = f'https://{AWS_STORAGE_BUCKET_NAME}.s3.amazonaws.com/'
# Media files
DEFAULT_FILE_STORAGE = 'myproject.storage.MediaStorage'
MEDIA_URL = f'https://{AWS_STORAGE_BUCKET_NAME}.s3.amazonaws.com/media/'
A custom storage class:
# storage.py
from storages.backends.s3boto3 import S3Boto3Storage


class MediaStorage(S3Boto3Storage):
    location = 'media'  # store uploads under the media/ prefix of the bucket
    file_overwrite = False  # keep existing files when names collide
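One version caveat: Django 4.2 introduced the `STORAGES` setting and deprecated `STATICFILES_STORAGE` and `DEFAULT_FILE_STORAGE`, and Django 5.1 removed the two older settings. On those versions, a roughly equivalent configuration might look like the sketch below, which reuses the backend paths from the settings above; adjust it to your project layout.

```python
# settings.py (Django 4.2 and later)
STORAGES = {
    # Replaces DEFAULT_FILE_STORAGE: user-uploaded media
    "default": {
        "BACKEND": "myproject.storage.MediaStorage",
    },
    # Replaces STATICFILES_STORAGE: files gathered by collectstatic
    "staticfiles": {
        "BACKEND": "storages.backends.s3boto3.S3Boto3Storage",
    },
}
```

The remaining `AWS_*` settings, `STATIC_URL`, and `MEDIA_URL` are unaffected by this change.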
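13.3 CDN integration
# settings.py
# Serve static files through a CDN
STATIC_URL = 'https://cdn.example.com/static/'

# middleware.py
# Custom CDN middleware: rewrites hard-coded /static/ URLs in HTML responses
import re

CDN_STATIC_URL = b'https://cdn.example.com/static/'
# Match only root-relative /static/ URLs (directly after a quote or parenthesis),
# so URLs already rendered with the CDN prefix are not rewritten a second time
LOCAL_STATIC_RE = re.compile(rb'(?<=["\'(])/static/')


class CDNMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        response = self.get_response(request)
        # Only process non-streaming HTML responses
        is_html = response.get('Content-Type', '').startswith('text/html')
        if is_html and not response.streaming:
            response.content = LOCAL_STATIC_RE.sub(CDN_STATIC_URL, response.content)
            if response.has_header('Content-Length'):
                response['Content-Length'] = str(len(response.content))
        return response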
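URL rewriting of this kind has a trap: once `STATIC_URL` already points at the CDN, a blind `replace(b'/static/', ...)` also matches the `/static/` inside URLs that are already absolute and corrupts them. The standalone sketch below contrasts a naive replacement with a guarded one that only touches root-relative URLs; the function names are hypothetical and the guard is one possible heuristic, not the only way to do it.

```python
import re

CDN_STATIC_URL = b'https://cdn.example.com/static/'
# Only match /static/ directly after a quote or an opening parenthesis,
# i.e. at the start of a root-relative URL in href="...", src='...' or url(...)
LOCAL_STATIC_RE = re.compile(rb'(?<=["\'(])/static/')


def rewrite_naive(html):
    """Blind replacement: also hits URLs that already carry the CDN prefix."""
    return html.replace(b'/static/', CDN_STATIC_URL)


def rewrite_guarded(html):
    """Rewrite root-relative /static/ URLs and leave absolute URLs alone."""
    return LOCAL_STATIC_RE.sub(CDN_STATIC_URL, html)


relative = b'<link href="/static/app.css">'
absolute = b'<img src="https://cdn.example.com/static/logo.png">'

print(rewrite_guarded(relative))
print(rewrite_guarded(absolute))  # unchanged
print(rewrite_naive(absolute))    # CDN host is duplicated inside the URL
```

Because the guarded version is idempotent, it stays safe even if the middleware ends up in the stack twice or a template already emits CDN URLs.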
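Summary: Core principles for building high-performance Django applications
Optimizing the performance of a Django application is systematic work that touches every layer. The 13 core techniques covered in this article span the full range, from database queries and caching strategy to view optimization and deployment configuration. Applied sensibly, they can significantly improve the performance and scalability of your Django application.
Key principles:
- Optimize database queries: appropriate indexes, fewer queries, and careful use of the ORM are the foundation of performance work
- Use caching effectively: a layered caching strategy greatly reduces database load and computation
- Process asynchronously: move time-consuming tasks to the background so the user experience stays smooth
- Optimize the front end: static file optimization, response compression, and browser caching together speed up page loads
- Monitor and analyze: track performance metrics continuously, find the bottlenecks, and optimize them specifically
Performance optimization is a continuous process. As an application grows, different strategies matter to different degrees. Start with the low-hanging fruit, such as database query optimization and basic caching, then introduce more complex optimizations step by step as actual needs arise.
Most importantly, profile before you optimize. Locate the bottleneck precisely, so that blind optimization does not add code complexity without delivering a real performance gain.
In the next article, we will look at integrating Django with Celery for asynchronous tasks and examine in depth how to build a reliable, efficient asynchronous processing system.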