Why isn't memory released to the system after a large number of queries (or a series of queries) in Django?

In a Django app, memory is not released back to the system after a large number of database queries against the User model, even with DEBUG set to False. The likely cause is Python's memory management: because objects cannot be moved in memory, the heap becomes fragmented, and experiments show that memory may not be fully returned even after references are dropped. The effect is much less pronounced on PyPy, whose compacting garbage collector avoids most of this fragmentation.

First off, DEBUG = False in settings.py, so no, connections['default'].queries is not growing and growing until it uses up all available memory.

Let's start off with the fact that I've loaded the User table from django.contrib.auth.models.User with 10000 users (each named 'test#', where # is a number between 1 and 10000).

Here is the view:

from django.contrib.auth.models import User
from django.http import HttpResponse
import time

def leak(request):
    print "loading users"
    users = []
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    users += list(User.objects.all())
    print "sleeping"
    time.sleep(10)
    return HttpResponse('')

I've attached the view above to the /leak/ url and started the development server (with DEBUG=False; I've tested this, and it has nothing to do with running the development server versus other server setups).

After running:

% curl http://localhost:8000/leak/

The runserver process's memory grows to around the size shown in the ps aux output below and then stays at that level.

USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND

dlamotte 25694 11.5 34.8 861384 705668 pts/3 Sl+ 19:11 2:52 /home/dlamotte/tmp/django-mem-leak/env/bin/python ./manage.py runserver

Running the above curl command again does not seem to grow the instance's memory usage (which is what I expected from a true memory leak), so it must be re-using the memory. However, I feel that something is wrong here: the memory does not get released to the system (though I understand that it may be better for performance that Python does NOT release the memory).

Following this, I naively attempted to see if Python would release large chunks of memory that it allocated. So I attempted the following from a Python session:

>>> a = ''
>>> a += 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa' * 10000000
>>> del a

The memory is allocated on the a += ... line as expected, but when del a happens, the memory is released. Why is the behavior different for Django querysets? Is it something that Django is intending to do? Is there a way to change this behavior?
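To quantify "released" in the session above, here is the kind of check I mean; a minimal sketch of my own (the rss_kb name is made up) that assumes a Linux /proc filesystem:

import os

def rss_kb():
    # Current resident set size in kB, read from /proc (Linux-only).
    with open('/proc/%d/status' % os.getpid()) as f:
        for line in f:
            if line.startswith('VmRSS:'):
                return int(line.split()[1])

Calling rss_kb() before the allocation, after it, and after del a shows the large single block going back to the OS, which is exactly what does not happen in the queryset case.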

I've literally spent two days debugging this behavior with no idea where to go next (I've learned to use guppy and objgraph, which don't seem to point to anything interesting that I can figure out).

UPDATE: This could simply be Python memory management at work and have nothing to do with Django (as suggested on the django-users mailing list), but I'd like to confirm this by somehow replicating it in Python outside of Django.
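For what it's worth, the effect can be reproduced outside Django by building up many small objects the way the view does; a minimal sketch of my own, with FakeUser as a made-up stand-in for a model instance:

class FakeUser(object):
    # Made-up stand-in for a model instance: a small object with a few attributes.
    def __init__(self, i):
        self.id = i
        self.username = 'test%d' % i

rows = [FakeUser(i) for i in xrange(170000)]  # roughly 17 x 10000 rows, as in the view
survivors = rows[::10]  # keep a scattered 10% alive, as long-lived objects tend to be
del rows  # 90% of the objects are freed, but almost no arena becomes completely empty

After the del, RSS typically stays high: the scattered survivors pin the allocator's arenas, which is the fragmentation described in the answer below.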

UPDATE: Using python version 2.6.5

Solution

I decided to move my comments into an answer to make things clearer.

Since Python 2.5, CPython's memory allocator tracks internal memory usage in the small-object allocator and attempts to return completely free arenas to the underlying OS. This works most of the time, but the fact that objects can't be moved around in memory means that fragmentation can be a serious problem.

Try the following experiment (I used 3.2, but 2.5+ should be similar if you use xrange):

# Create the big lists in advance to avoid skewing the memory counts
seq1 = [None] * 10**6  # Big list of references to None
seq2 = seq1[::10]

# Create and reference a lot of smaller lists
seq1[:] = [[] for x in range(10**6)]  # References all the new lists
seq2[:] = seq1[::10]                  # Grab a second reference to 10% of the new lists

# Memory fragmentation in action
seq1[:] = [None] * 10**6  # 90% of the lists are no longer referenced here
seq2[:] = seq1[::10]      # But memory freed only after last 10% are dropped

Note that even if you drop the references to seq1 and seq2, the above sequence will likely leave your Python process holding a lot of extra memory.
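If you want to watch the allocator while stepping through the experiment, newer CPython has a debugging hook for this (a side note of mine, not part of the original answer, and only available from CPython 3.3 on):

import sys

# Dump pymalloc arena and pool statistics to stderr (CPython 3.3+ only).
# Call it between the steps above to see how many arenas stay allocated.
sys._debugmallocstats()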

When people talk about PyPy using less memory than CPython, this is a major part of what they're talking about. Because PyPy doesn't use direct pointer references under the hood, it is able to use a compacting GC, thus avoiding much of the fragmentation problem and more reliably returning memory to the OS.
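As for the "is there a way to change this behavior?" part of the question: a common workaround (my suggestion, not part of the original answer) is to run the allocation-heavy work in a short-lived child process, since the OS reclaims all of a process's memory when it exits. A minimal sketch using the standard multiprocessing module:

from multiprocessing import Process

def heavy_work():
    # Build the large result here; any heap fragmentation is confined
    # to this child process and vanishes when it exits.
    users = [[] for _ in range(10**6)]
    # ... send any results back via a Queue or Pipe if needed ...

p = Process(target=heavy_work)
p.start()
p.join()  # all of the child's memory returns to the OS here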
