在使用Django框架的应用中,有一个查询需要根据质量进行排序,并返回前20个结果。但是这个查询有时候很快(少于1秒),有时候却很慢(长达10秒)。
- 已经为相关模型建立了多个索引,但似乎效果有限。
- 怀疑可能是内存不足或索引过多导致的,但不知道如何解决。
2、解决方案
经过分析,问题的根源在于查询中使用了一个复杂的嵌套循环,导致性能下降。为了解决这个问题,可以通过以下步骤优化查询:
- 添加索引:
- 在connectr_twitterpassage_words表中添加一个包含word_id和twitterpassage_id的复合索引。
- 在connectr_twitterpassage表中添加一个包含id和quality(降序)的复合索引。
- 优化查询:
- 将查询重写为使用新添加的索引。
优化后的查询如下:
SELECT "connectr_twitterpassage"."id", "connectr_twitterpassage"."third_party_id", "connectr_twitterpassage"."third_party_created", "connectr_twitterpassage"."source", "connectr_twitterpassage"."text", "connectr_twitterpassage"."author", "connectr_twitterpassage"."raw_data", "connectr_twitterpassage"."retweet_count", "connectr_twitterpassage"."favorited_count", "connectr_twitterpassage"."lang", "connectr_twitterpassage"."location", "connectr_twitterpassage"."author_followers_count", "connectr_twitterpassage"."is_retweet", "connectr_twitterpassage"."url", "connectr_twitterpassage"."author_fk_id", "connectr_twitterpassage"."quality", "connectr_twitterpassage"."is_top_tweet", "connectr_twitterpassage"."created", "connectr_twitterpassage"."modified"
FROM "connectr_twitterpassage" INNER JOIN "connectr_twitterpassage_words"
ON ("connectr_twitterpassage"."id" = "connectr_twitterpassage_words"."twitterpassage_id")
WHERE ("connectr_twitterpassage_words"."word_id" = 19514309
AND NOT (("connectr_twitterpassage"."id"
IN (SELECT U1."passage_id" FROM "connectr_passageviewevent" U1 WHERE (U1."user_id" = 1 AND U1."passage_id" IS NOT NULL)) AND "connectr_twitterpassage"."id" IS NOT NULL)))
ORDER BY "connectr_twitterpassage"."quality" DESC LIMIT 20
代码例子
from django.db import connection
def get_top_quality_tweets(word_id, user_id):
"""
Retrieve the top quality tweets that the user has not seen yet.
Args:
word_id: The ID of the word to search for.
user_id: The ID of the user.
Returns:
A list of tweets.
"""
# Create a cursor.
cursor = connection.cursor()
# Execute the query.
cursor.execute(
"SELECT * FROM connectr_twitterpassage "
"INNER JOIN connectr_twitterpassage_words ON connectr_twitterpassage.id = connectr_twitterpassage_words.twitterpassage_id "
"WHERE connectr_twitterpassage_words.word_id = %s "
"AND NOT (connectr_twitterpassage.id IN (SELECT passage_id FROM connectr_passageviewevent WHERE user_id = %s AND passage_id IS NOT NULL) AND connectr_twitterpassage.id IS NOT NULL) "
"ORDER BY quality DESC LIMIT 20",
[word_id, user_id]
)
# Fetch all results.
results = cursor.fetchall()
# Close the cursor.
cursor.close()
# Return the results.
return results