I have a sizable table (20M+ rows) in Postgres and I'm trying to run a raw Django query on it:
tweets = TweetX.objects.raw("SELECT * from twitter_tweet").using("twittertest")
I get a RawQuerySet back quickly, but when I try to iterate over its results everything grinds to a halt:
for tweet in tweets:
    # do stuff
Memory usage rises steadily, so I suspect the whole dataset is being transferred to the client.
Is there a way to get a database cursor from .raw so I can iterate over the result set without transferring it all at once?
Solution
It seems that it is rather difficult to persuade Django/Postgres to use server-side database cursors. Instead, Django fetches everything and then puts a client-side iterator (also called a cursor) over it.
I found a solution over here that explicitly creates a server-side cursor. The only downside is that the rows no longer come back as Django model instances.
from django.db import connections

conn = connections['twittertest']

# This is required to populate the connection object properly
if conn.connection is None:
    cursor = conn.cursor()

# Passing a name makes psycopg2 create a server-side cursor,
# so rows are streamed instead of fetched all at once
cursor = conn.connection.cursor(name='gigantic_cursor')
cursor.execute("SELECT * FROM twitter_tweet")

for tweet in cursor:
    pass  # profit
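If you want to bound the number of round trips as well as memory, a server-side cursor also supports `fetchmany()`, so you can pull rows in fixed-size batches. Here is a minimal sketch of such a batching helper; it works against any DB-API cursor (the batch size of 2000 is just an illustrative choice, not something from the original answer):

```python
def fetch_in_batches(cursor, batch_size=2000):
    """Yield rows one at a time, fetching them from the cursor
    in batches of `batch_size` via fetchmany()."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:  # empty list means the result set is exhausted
            break
        for row in rows:
            yield row
```

You would use it in place of the plain `for tweet in cursor:` loop, e.g. `for tweet in fetch_in_batches(cursor):`. With a named (server-side) psycopg2 cursor, each `fetchmany()` call pulls only one batch over the wire, so memory stays flat regardless of table size.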