Cache Design

adamyoungjack

Article link: https://blog.csdn.net/weixin_46072106/article/details/109694393

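This article describes the architecture, granularity, data formats, design approach, eviction policies, and update handling of a Redis cache, as well as the implementation of cached objects and collections. It focuses on the LRU and LFU policies, solutions to cache avalanche and cache penetration, and how multi-level caching and randomized expiration times improve system stability and performance. It also presents Python implementations of the user base data cache and the following-list cache, and stresses the importance of decoupling and object-oriented design.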


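1. Introduction

1.1 Definition

Cache: reduces the access load on the database and improves concurrency

1.2 Architecture

1. Basic architecture

[Figure: basic cache architecture]
The web application serves reads from the Redis cache and performs inserts, updates, and deletes against MySQL. When Redis does not hold the requested data, the application queries MySQL and then backfills the result into Redis

2. Multi-level cache

[Figure: multi-level cache architecture]

Notes: browser cache: sqlite3
Level-1 cache: a global dictionary / global variable in the application process; fast but volatile, since the data is lost on power failure
Level-2 cache: an in-memory database acting as the cache, refreshed once every hour
Level-3 cache: retained for 5 hours; hot data (for example, trending headlines) is stored in the cache rather than in MySQL
If every cache level misses, the application queries MySQL, returns the result to the web application, and backfills the caches at the same time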
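The lookup order described above can be sketched as follows. This is a minimal illustration, not the article's implementation: plain dictionaries stand in for the cache levels, and `MultiLevelCache` and `load_from_db` are hypothetical names introduced only for this example.

```python
class MultiLevelCache:
    """Look up a key level by level; on a full miss, load from the database and backfill."""

    def __init__(self, levels, load_from_db):
        self.levels = levels              # dict-like caches, fastest first (stand-ins)
        self.load_from_db = load_from_db  # hypothetical loader standing in for a MySQL query

    def get(self, key):
        for i, level in enumerate(self.levels):
            if key in level:
                value = level[key]
                # Backfill the faster levels that missed.
                for faster in self.levels[:i]:
                    faster[key] = value
                return value
        # Every level missed: query the database and backfill all levels.
        value = self.load_from_db(key)
        if value is not None:
            for level in self.levels:
                level[key] = value
        return value


local_cache, redis_like = {}, {}
db = {"user:1": {"name": "python"}}
cache = MultiLevelCache([local_cache, redis_like], db.get)
first = cache.get("user:1")    # misses everywhere, loaded from db and backfilled
second = cache.get("user:1")   # now served from local_cache
```

A real deployment would also give each level its own expiration time, which this sketch leaves out.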

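1.3 Granularity

Caching a single value (string)

One key stores only one value, so key utilization is low
Use case: verification codes
Data type: String

Caching a data object (hash)

One database record
Advantage: can be reused many times
Use case: user/article data
Stored data: the user/article id and a dictionary of fields
Data type: Hash

# Basic user information
user = User.query.filter_by(id=1).first()
# user -> a User object, cached as a dictionary such as:
{
    'user_id': 1,
    'user_name': 'python',
    'age': 28,
    'introduction': ''
}

Caching a data collection (list/set/zset)

The result set of a database query
Use case: article lists / following lists
The project mainly caches data collections + data objects; the advantages are high reusability and lower memory usage
Deduplication and membership tests: sismember on a set
Sorting: by score, using a zset
Using a collection to store the keys (ids) of data objects is also called a custom Redis secondary index
- author_id --> user cache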
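To illustrate the "collection + object" combination, here is a small sketch of the secondary-index idea. It assumes in-memory dictionaries in place of Redis hashes and zsets; the sample users and the `get_following` helper are hypothetical.

```python
# In-memory stand-ins for Redis structures (assumption: no real Redis is used here).
user_objects = {                       # one hash per user: user:<id>:basic
    "user:2:basic": {"user_id": 2, "user_name": "alice"},
    "user:3:basic": {"user_id": 3, "user_name": "bob"},
}
following_index = {                    # zset: member = author_id, score = follow timestamp
    "user:1:following": {2: 1600000000.0, 3: 1600000500.0},
}


def get_following(user_id):
    """Resolve the id collection into full objects, most recent follow first."""
    zset = following_index.get("user:{}:following".format(user_id), {})
    author_ids = sorted(zset, key=zset.get, reverse=True)
    return [user_objects["user:{}:basic".format(a)] for a in author_ids]


names = [u["user_name"] for u in get_following(1)]
```

The collection stores only ids, so each user object is cached once and shared by every list that references it.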

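Caching view responses

The response data returned by a view
Disadvantage: poor reusability

1.4 Data formats

Single value

Type: string

Data object

Type: hash

Data collection

list
- when the data needs to be traversed
zset
- when the data needs to be sorted
set
- when membership needs to be tested

View response

string
- key: the request URL
- value: the response as a string, either a JSON string rendered by the front end or an HTML string rendered by the back end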

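2. Design

2.1 Approach

The cache is derived from the database, so it should be designed around the project's database schema
Mapping from schema to cache: base data tables -> data objects, relation tables -> data collections
Design the data objects first, because almost every page depends on the base data tables
Data collections should be designed by analyzing how each page actually uses the data; that usage determines whether a cache is needed and which data format it should use

Data objects (entity tables) should include user base data, article base data, comment base data, channel base data, and announcement base data, stored as hashes
Data collections (relation tables): user following list, user fans list, user favorites list, user channel list, user article list, user search history, and user reading history, stored as zsets

2.2 Examples

User data
Article data
Notes:
Object: article base data
Collection: article comment list
Article like list: set
Comment data

Comment like list: set; no ordering is needed, only whether a member is in the list
Channel data
List of all channels: list
Channel article list: zset, which supports pinning articles to the top
Announcement data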

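3. Strategies

3.1 Expiration

1. Introduction

Every cache entry should have an expiration time. Setting a time to live (TTL) has two advantages:
- Saves space
- Provides weak data consistency: once an entry expires and is backfilled, Redis and MySQL agree again. (Consistency here means Redis and MySQL hold the same data. Weak: consistency is restored through expiration followed by backfill. Strong: consistency is guaranteed by modifying the cache directly.)

2. Categories

2.1 Timed expiration

A timer is created for every key that has an expiration time, and the key is removed as soon as the timer fires. This strategy removes expired data immediately and is very memory-friendly,
but it consumes a lot of CPU for timing and for processing expired data, which hurts the cache's response time and throughput.

2.2 Lazy expiration

A key is checked for expiration only when it is accessed; if it has expired, it is removed and nil is returned. This strategy saves the most CPU but is very unfriendly to memory. In the extreme case, a large number of expired keys are never accessed again, are never removed, and occupy a lot of memory.

2.3 Periodic expiration

At fixed intervals, a portion of the keys that have a TTL is scanned, and the expired ones are removed.
This strategy is a compromise between the previous two. By tuning the scan interval and the time budget of each scan, CPU and memory usage can be balanced for different workloads.

3. Redis expiration strategy

Redis uses both lazy expiration and periodic expiration.

Periodic expiration: by default Redis checks every 100 ms and deletes the expired keys it finds. The check is not sequential; it samples keys at random.
Lazy expiration: reading or writing a key triggers the lazy check, and an expired key is deleted on the spot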
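The combination of lazy and periodic expiration can be sketched in a few lines. This is a simplified model for illustration, not how Redis is implemented internally; `ExpiringStore`, `sweep`, and the injectable `clock` are names made up for this example.

```python
import random
import time


class ExpiringStore:
    def __init__(self, clock=time.time):
        self.data = {}        # key -> value
        self.expires = {}     # key -> absolute expiry time
        self.clock = clock    # injectable clock so the example is deterministic

    def set(self, key, value, ttl):
        self.data[key] = value
        self.expires[key] = self.clock() + ttl

    def _expired(self, key):
        return key in self.expires and self.expires[key] <= self.clock()

    def _delete(self, key):
        self.data.pop(key, None)
        self.expires.pop(key, None)

    def get(self, key):
        # Lazy expiration: the key is checked only when it is accessed.
        if self._expired(key):
            self._delete(key)
            return None
        return self.data.get(key)

    def sweep(self, sample_size=20):
        # Periodic expiration: inspect a random sample of the keys that have a TTL.
        keys = list(self.expires)
        for key in random.sample(keys, min(sample_size, len(keys))):
            if self._expired(key):
                self._delete(key)


now = [0.0]
store = ExpiringStore(clock=lambda: now[0])
store.set("a", 1, ttl=10)
store.set("b", 2, ttl=10)
now[0] = 11.0
lazy_result = store.get("a")   # expired, so it is removed on access
store.sweep()                  # removes "b" even though nobody accessed it
```

In a server, `sweep` would run on a timer (for example every 100 ms) with a bounded time budget per run.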

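3.2 Eviction

1. Introduction

Suppose a key escapes periodic expiration and is not accessed for a long time (so it also escapes lazy expiration); Redis memory usage then keeps growing. When the memory used by Redis reaches the limit, the memory eviction mechanism is triggered.
The memory eviction mechanism determines how Redis evicts existing data and handles new writes once the memory it is allowed to use has reached the limit.

2. Categories

2.1 LRU

1. Definition

LRU stands for Least Recently Used, a policy based on the last access time
The LRU algorithm evicts data according to its access history, evicting the least recently used data first.
In other words, data is evicted in order of access time: the entry whose last access is furthest in the past goes first

2. Basic idea

New data is inserted at the head of the list;
On every cache hit (the cached data is accessed), the data is moved to the head of the list;
When the list is full, the data at the tail of the list is discarded.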
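The three rules above map directly onto Python's `collections.OrderedDict`. The sketch below is a textbook LRU for illustration only; Redis itself uses an approximated LRU based on sampling rather than a full ordered list. `LRUCache` is a hypothetical class name.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # first item = most recently used

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key, last=False)   # hit: move to the head
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key, last=False)   # new or updated data goes to the head
        if len(self.items) > self.capacity:
            self.items.popitem(last=True)         # full: drop the tail


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # "a" becomes the most recently used entry
cache.put("c", 3)     # the cache is full, so "b" at the tail is evicted
remaining = list(cache.items)
```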

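3. Problem

Evicting purely by last access time may remove frequently used data. In the example below, data A was last used earlier than data B, but it has been used far more often and is more likely to be used again

Data         Last used          Use count
Data A       2020-03-15         100
Data B       2020-03-16         2

2.2 LFU

1. Introduction

LFU stands for Least Frequently Used, a policy based on how often data is used
Redis supports LFU policies since version 4.0

2. Principle

It is based on the idea that "if data has been used very few times in the recent past, it is unlikely to be used in the near future", so the least frequently used data is evicted first.
Because newly added data usually has a lower use count than old data, LFU also implements a periodic decay mechanism

3. Disadvantages

A usage counter must be maintained for every entry
The counters also need to be decayed periodically (halved)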
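A minimal LFU sketch makes the counter and the decay step concrete. It assumes exact per-key counters and a plain halving step, which is simpler than the probabilistic counter Redis actually keeps; `LFUCache` and `decay` are hypothetical names.

```python
class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.values = {}
        self.counts = {}    # one usage counter per key

    def get(self, key):
        if key not in self.values:
            return None
        self.counts[key] += 1
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.counts, key=self.counts.get)   # least frequently used key
            del self.values[victim], self.counts[victim]
        self.values[key] = value
        self.counts.setdefault(key, 1)

    def decay(self):
        # Periodic decay: halve every counter so that old popularity fades over time.
        for key in self.counts:
            self.counts[key] //= 2


cache = LFUCache(2)
cache.put("A", "a")
for _ in range(100):
    cache.get("A")
cache.put("B", "b")
cache.get("B")
cache.put("C", "c")           # evicts "B" (count 2) rather than "A" (count 101)
kept = sorted(cache.values)
cache.decay()
```

With the data from the LRU example, A survives here because it is used often, even though B was used more recently.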

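3.3 Redis eviction policies

allkeys-lfu: when memory cannot hold newly written data, evict the least frequently used keys from the whole keyspace.
volatile-lfu: when memory cannot hold newly written data, evict the least frequently used keys among the keys that have an expiration time.
allkeys-lru: when memory cannot hold newly written data, evict the least recently used keys from the whole keyspace.
volatile-lru: when memory cannot hold newly written data, evict the least recently used keys among the keys that have an expiration time.
allkeys-random: when memory cannot hold newly written data, evict a random key from the whole keyspace.
volatile-random: when memory cannot hold newly written data, evict a random key among the keys that have an expiration time.
volatile-ttl: when memory cannot hold newly written data, evict first the keys that expire soonest among the keys that have an expiration time.
noeviction: when memory cannot hold newly written data, new write operations return an error.
By default Redis is greedy and uses as much memory as is available; maxmemory sets the upper limit on memory usage
With volatile-lru, key-value pairs that have a TTL are evicted in order of last access time
With volatile-lfu, key-value pairs that have a TTL are evicted by access frequency

Exercise
Question: MySQL holds 20 million records and Redis stores only 200,000 of them. How do you make sure the data in Redis is the hot data?

Idea: use an LFU (or LRU) eviction policy together with maxmemory, so that of the 20 million records only the roughly 200,000 hottest stay in Redis
Solution: first estimate the memory needed for 200,000 records, then cap Redis memory usage in the configuration and set an eviction policy
Checking Redis memory usage

Option 1: query the built-in statistics (rough estimate)

$ redis-cli
127.0.0.1:6379> dbsize  # number of keys in the current database
(integer) 150

127.0.0.1:6379> info Memory
# Memory
used_memory:1045456  # total memory allocated by Redis in bytes, including internal overhead and the data itself
used_memory_human:1020.95K  # the same total in human-readable form

Option 2: use the third-party analysis tool redis-rdb-tools (precise; the CSV report can be analyzed further with pandas)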

# Install the tool
git clone https://github.com/sripathikrishnan/redis-rdb-tools
cd redis-rdb-tools
sudo python3 setup.py install

# Export a memory report from the Redis persistence file; /path/dump.rdb is the path to the RDB file (the backup file)
$ rdb -c memory /path/dump.rdb > ~/redis_memory_report.csv
$ cat ~/redis_memory_report.csv

# Columns: database number, type, key, size in bytes, encoding, number of elements, size of the largest element, expiry
database,type,key,size_in_bytes,encoding,num_elements,len_largest_element,expiry  
0,sortedset,user:all:art_count,89,ziplist,3,8,
0,sortedset,list2,63,ziplist,1,8,
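Given a report in this format, the memory estimate for the exercise can be sketched as simple arithmetic: average key size times the number of keys to keep. The two sample rows above are reused here only to keep the example self-contained, and `estimate_maxmemory` is a hypothetical helper; a real estimate should use a representative dump and leave headroom for Redis's own overhead.

```python
import csv
import io

REPORT = """database,type,key,size_in_bytes,encoding,num_elements,len_largest_element,expiry
0,sortedset,user:all:art_count,89,ziplist,3,8,
0,sortedset,list2,63,ziplist,1,8,
"""


def estimate_maxmemory(report_text, target_keys):
    """Average key size in the report multiplied by the number of keys to keep."""
    rows = csv.DictReader(io.StringIO(report_text))
    sizes = [int(row["size_in_bytes"]) for row in rows]
    average = sum(sizes) / len(sizes)
    return int(average * target_keys)


# Bytes needed for 200,000 keys of similar size: (89 + 63) / 2 * 200000
estimate = estimate_maxmemory(REPORT, 200_000)
```

The resulting number of bytes is what would go into the `maxmemory` setting.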

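Cap memory usage and set the eviction policy in the Redis configuration

$ sudo vi redis.conf
# Maximum memory in bytes; for example, on a server with 10 GB of RAM, give Redis at most 9 GB
maxmemory 1048576
# Eviction policy
maxmemory-policy volatile-lfu

Project approach

Set a TTL on all cached data
Configure Redis to use volatile-lfu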

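4. Problems

4.1 Updates

1. Introduction

MySQL and Redis are two independent systems, so under concurrency, consistent updates cannot be guaranteed
Taking Redis and MySQL as an example: because of network latency, of two concurrent updates, the one that writes the database first may write the cache last, and the one that writes the database last may write the cache first. The database and the cache then disagree, and the application reads dirty data.

2. Solutions

Option 1: use a distributed lock (Redis SETNX) or a message queue to serialize the updates
- Disadvantage: poor concurrency
Option 2: when updating data, write to MySQL first, then delete the cache
- Mainly used for data objects (updated infrequently)
- For data collections, consider updating the cache instead (collections are expensive to query, so deleting and rebuilding the cache on every change is too inefficient)
- Dictionaries, lists, and sets: delete
- Widely used, for example at Facebook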

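4.2 Penetration

1. Introduction

The cache is just a protective layer added to relieve pressure on the database; when the requested data is not in the cache, it has to be fetched from the database. If an attacker exploits this and repeatedly requests data that is not in the cache, the cache no longer serves its purpose: all the request load lands on the database at once, which can cause database connection failures.

2. Solutions

Option 1: for data that does not exist in the database, also store a default value (a null marker) in the cache
- To avoid wasting resources, its expiration time is usually short
Store an empty-value marker under the key, for example key: {"null": True}

On a cache hit
If the data is the empty-value marker, return None
If the data is real user data, return the user dictionary

Option 2: apply filtering rules
- For example a Bloom filter (a probabilistic data structure for testing set membership): load every possible valid value into the filter, and return None immediately when a value is not in it. A Bloom filter has no false negatives but does have a small false-positive rate, so a nonexistent value may occasionally pass the filter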
  
Bloom filter
Install the package: pip install pybloomfiltermmap3

import pybloomfilter
# Create the filter: capacity, error rate, backing file
bloom = pybloomfilter.BloomFilter(1000000, 0.01, 'words.bloom')
# Add data
bloom.update(('bj', 'sh', 'gz'))
# Test membership
if 'bj' in bloom:
    print('contained')
else:
    print('not contained')
Recommended reading:

Implementing a Bloom filter with Redis: https://blog.csdn.net/u013074999/article/details/88981153

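4.3 Avalanche

1. Introduction

If a large amount of cached data expires at the same moment, the cache fails en masse; all requests then hit the database directly and overload it

2. Solutions

Option 1: add a random value to the expiration time so that expirations are spread out and do not all happen at once.
- For example, where the timeout used to be a fixed 10 minutes, each key can instead expire after a random 8 to 13 minutes, so that different keys expire at different times.
Option 2: use a multi-level cache whose levels have different timeouts; even if one level expires entirely, the other levels act as a fallback.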
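The 8 to 13 minute example from option 1 can be written as a one-line helper. `jittered_ttl` and its parameter names are hypothetical; the seeded generator is used only to make the example repeatable.

```python
import random


def jittered_ttl(base_seconds, max_early, max_late, rng=random):
    """Return the base TTL shifted by a random offset in [-max_early, +max_late]."""
    return base_seconds + rng.randint(-max_early, max_late)


rng = random.Random(42)   # seeded only so that the example is repeatable
# Base of 10 minutes, up to 2 minutes earlier or 3 minutes later: 8 to 13 minutes.
ttls = [jittered_ttl(10 * 60, 2 * 60, 3 * 60, rng) for _ in range(1000)]
```

Each key written with such a TTL expires at a slightly different moment, so the database sees a trickle of reloads instead of a spike.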

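5. Patterns

The core idea of cache design is to read from the cache first and read from the database only on a miss, which relieves read pressure on the database
Concrete cache design patterns fall mainly into two kinds:
- Cache Aside
- Read-through

5.1 Cache Aside

1. Introduction

Cache Aside: literally "cache on the side"; the application manages the cache alongside the database

[Figure: cache miss]
[Figure: cache hit]
[Figure: cache update]

2. Characteristics

Characteristics:
- The application performs the actual read and write operations
Disadvantages:
- Business logic and data access are tightly coupled, which hinders technology upgrades
- All cache reads and writes are handled by the back-end class-based views
  Business logic code and cache read/write code are mixed --> high coupling
  This hinders later architecture and technology upgrades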

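5.2 Read-through

1. Introduction

Read-through: reads go through the cache layer

[Figure: cache miss]
Characteristic: the cache operations are encapsulated in a cache utility
Benefit: removes the tight coupling between cache operation code and business logic code
[Figure: cache hit]

2. Characteristics

Characteristics:
- The cache layer performs the actual read and write operations, so even if the storage solution changes later, the business code does not need to change
Advantages:
- Makes refactoring and architecture upgrades easier
Notes
- Cache: the cache utility class (acts as an intermediary)
- CacheStore: the Redis in-memory database

[Figure: Read-through]

Project approach

Use Read-through
- Build an abstract cache layer that is responsible for database queries and for reading and writing the Redis cache; the Flask view logic works directly with the cache-layer utilities.
For data objects, update the database first and then delete the cache; for data collections, update the database first and then update the cache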
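The separation that Read-through buys can be sketched without any framework. In this illustration a dictionary stands in for Redis and a plain function stands in for the database query; `ReadThroughCache`, `load_user`, and `view` are hypothetical names, not part of the project code.

```python
class ReadThroughCache:
    """The cache layer owns both the store access and the database fallback."""

    def __init__(self, store, loader):
        self.store = store      # dict standing in for Redis (assumption)
        self.loader = loader    # function standing in for the database query

    def get(self, key):
        if key in self.store:
            return self.store[key]
        value = self.loader(key)
        self.store[key] = value
        return value

    def invalidate(self, key):
        # Called after the database has been updated (update first, then delete the cache).
        self.store.pop(key, None)


db_calls = []


def load_user(key):
    db_calls.append(key)
    return {"key": key, "name": "python"}


cache = ReadThroughCache({}, load_user)


def view(user_id):
    # Business code: no Redis or SQL details appear here.
    return cache.get("user:{}:basic".format(user_id))


view(1)                              # miss: loads from the "database"
view(1)                              # hit: served from the store
cache.invalidate("user:1:basic")     # simulate an update to the user
view(1)                              # miss again: reloaded
```

Swapping the store or the loader later requires no change to `view`, which is the decoupling the section describes.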

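6. Cache implementation

6.1 Implementing cached objects

1. Cache class design

The cache layer can be modeled on how applications interact with relational databases through an ORM: data is created, read, updated, and deleted in an object-oriented way
Advantages
- Development can use the three pillars of object-oriented programming (encapsulation, inheritance, polymorphism), which reduces coupling and duplication
- The API is easy to understand, which speeds up development
  Design approach
ORM:
- table -> class
- record -> instance
Redis cache
- data object/collection -> class
- key-value data -> instance
  User class design
Taking user base data as an example, design the cache class

Requirement: cache the user object
Redis type: hash
Redis write commands: hset/hmset key field value
Redis read commands: hget/hmget key field, or hgetall key
Redis key: user:<user_id>:basic
Redis value: the user model object converted to a dictionary

User data cache class UserCache
- Attributes
  - user id: userid
- Methods
  - get cached data: get()
  - delete cached data: clear()

2. Getting the cache

Create a cache package inside the common package to hold the cache-layer model files
Create user.py in the common/cache package to hold the user-related cache classes
Define the user cache class in user.py and implement the method that gets the cache
The project stores cached data in a Redis cluster
- In practice, there is no strict division of roles between master-replica setups and clusters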


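# common/cache/user.py
# ORM: create, read, update, and delete data in an object-oriented way
# ORM: table -> class   record -> object
# Redis cache layer: data object/collection -> class   Redis data -> object

from sqlalchemy.orm import load_only
from app import redis_cluster
from models.user import User

# user:<user id>:basic   hash   {'name': xx, 'mobile': xx}

# User data cache class
# Attributes
#    userid   user id
# Methods
#    get()    get cached data
#    clear()  delete cached data

class UserCache:
    """User data cache class"""

    def __init__(self, userid):
        self.userid = userid  # user id
        self.key = 'user:{}:basic'.format(self.userid)  # Redis key

    def get(self):
        """Get the data"""
        # Read from the cache first; hgetall returns an empty dict if the key does not exist.
        # Assumes the client returns str keys (for example decode_responses=True).
        data = redis_cluster.hgetall(self.key)
        # On a hit, check whether it is the default (null) marker
        if data:
            # The null marker means the user does not exist in the database
            if data.get('null'):
                return None
            # Otherwise return the cached data
            else:
                print('Got user data from the cache')
                return data
        # On a miss, read from the database
        else:
            # Query the database
            user = User.query.options(
                load_only(User.id, User.name, User.profile_photo, User.introduction, User.article_count,
                          User.following_count, User.fans_count)).filter(User.id == self.userid).first()
            # If the database has the user, backfill the cache and return the data
            if user:
                user_dict = user.to_dict()
                # Backfill the cache
                # redis_cluster.hmset(self.key, user_dict)
                redis_cluster.hset(self.key, mapping=user_dict)
                # Set the expiration; a random offset should be added to prevent a cache avalanche (see step 4)
                redis_cluster.expire(self.key, 60 * 60 * 2)
                print('Queried user data and backfilled the cache')
                return user_dict
            # If the database does not have the user, store the null marker and return None
            else:
                # Store the default value (prevents cache penetration).
                # Redis cannot store a Python bool, so 1 is used as the marker.
                # redis_cluster.hmset(self.key, {'null': 1})
                redis_cluster.hset(self.key, mapping={'null': 1})
                redis_cluster.expire(self.key, 60 * 10)
                return None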

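Notes:
- To guard against cache penetration, store a default value for data that does not exist in the database
- User base data is a data object, so no update method is provided; after the database is updated, the cache is simply deleted

3. Clearing the cache

Implement the clear method of the user cache class in user.py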


# common/cache/user.py

class UserCache:
    """User data cache class"""

    def clear(self):
        """Delete the cached data"""
        redis_cluster.delete(self.key)
        print('Deleted the user cache')

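4. Encapsulating the cache expiration time

To guard against cache avalanche, a random value also has to be added to the expiration time
- An expiration-time class can be defined so that inheritance and overriding reduce duplication
Create constants.py in the common/cache package to hold cache-related constants such as expiration times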


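# common/cache/constants.py

import random

# User cache expiration time with a random value (avoids cache avalanche)
# UserCacheTTL = 60 * 60 * 2 + random.randint(0, 600)

# Upgrade 1: wrap the expiration-time calculation in a function
# def get_val():
#     return 60 * 60 * 2 + random.randint(0, 600)

# Upgrade 2: wrap the expiration-time handling in a class
class BaseCacheTTL(object):
    """Base class for expiration times"""
    TTL = 60 * 10  # base expiration time
    MAX_DELTA = 60  # maximum random offset

    @classmethod
    def get_val(cls):
        """Get the expiration time"""
        return cls.TTL + random.randint(0, cls.MAX_DELTA)


class UserCacheTTL(BaseCacheTTL):
    """Expiration time for the user cache"""
    TTL = 60 * 60 * 2  # expiration time
    MAX_DELTA = 600  # maximum random offset


class UserNotExistTTL(BaseCacheTTL):
    """Expiration time for the 'user does not exist' marker"""
    TTL = 60 * 60   # expiration time
    MAX_DELTA = 300  # maximum random offset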

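Modify common/cache/user.py so that cache expiration times include a random value

# common/cache/user.py
from cache.constants import UserCacheTTL, UserNotExistTTL

class UserCache:
    """User data cache class"""

    def get(self):
        """Get the data"""
        # Read from the cache first
        data = redis_cluster.hgetall(self.key)
        if data:
            # The null marker means the user does not exist
            return None if data.get('null') else data

        # On a miss, read from the database
        user = User.query.options(
            load_only(User.id, User.name, User.profile_photo, User.introduction, User.article_count,
                      User.following_count, User.fans_count)).filter(User.id == self.userid).first()

        if user:  # found: backfill the cache, then return the data
            user_dict = user.to_dict()
            redis_cluster.hset(self.key, mapping=user_dict)
            redis_cluster.expire(self.key, UserCacheTTL.get_val())
            print('Queried user data and backfilled the cache')
            return user_dict

        # Not found: store the null marker (prevents cache penetration), then return None
        redis_cluster.hset(self.key, mapping={'null': 1})
        redis_cluster.expire(self.key, UserNotExistTTL.get_val())
        return None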

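5. Updating the endpoints

With the user base data cache class in place, rewrite the profile and change-avatar endpoints to get user data through the cache class
In app/resources/user/profile.py, modify the view functions for getting user information and changing the avatar

# app/resources/user/profile.py

from flask import g, current_app
from flask_restful import Resource
from cache.user import UserCache
# Project-specific imports (login_required, db, User) are omitted, as in the original excerpt


class CurrentUserResource(Resource):
    """Personal center: current user"""
    method_decorators = {'get': [login_required]}

    def get(self):
        # Get the user id
        userid = g.userid

        # Look up the user data through the cache layer
        user_cache = UserCache(userid).get()
        if user_cache:
            return user_cache
        else:
            return {'message': "Invalid User", 'data': None}, 400


class UserPhotoResource(Resource):
    method_decorators = [login_required]

    def patch(self):
        """Change the avatar"""
        userid = g.userid
        # The upload step is omitted in the original; file_name is the name of the uploaded file

        # Update the avatar URL in the database
        User.query.filter(User.id == userid).update({'profile_photo': file_name})
        db.session.commit()

        # Delete the cached data object
        usercache = UserCache(userid)
        usercache.clear()

        # Return the URL
        return {'photo_url': current_app.config['QINIU_DOMAIN'] + file_name}

Test the endpoints with Postman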

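6.2 Implementing cached collections

1. Cache class design

Cached data collections are also implemented in an object-oriented way
Design approach
ORM:
- table -> class
- record -> instance
Redis cache
- data object/collection -> class
- key-value data -> instance
  User following list: class design
Taking the user following list as an example, design the cache class
User following-list cache class UserFollowingCache
- Attributes
  - user id: userid
- Methods
  - get cached data: get()
  - update cached data: update()

2. Getting the cache

Define the user following-list cache class in user.py and implement the method that gets the cache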

# 30 records
# 10 records per page

Page 1: start_index=0 end_index=9 page=1 per_page=10
Page 2: start_index=10 end_index=19 page=2 per_page=10
Page 3: start_index=20 end_index=29 page=3 per_page=10

start_index = (page - 1) * per_page
end_index = start_index + per_page - 1
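The two formulas can be checked with a short sketch. `page_bounds` is a hypothetical helper; the 26-record list illustrates a last page that is only partly filled.

```python
def page_bounds(page, per_page):
    """Return the inclusive (start_index, end_index) for a 1-based page number."""
    start_index = (page - 1) * per_page
    end_index = start_index + per_page - 1
    return start_index, end_index


items = list(range(26))            # 26 records, 10 per page
start, end = page_bounds(3, 10)
last_page = items[start:end + 1]   # slicing past the end simply returns what is left
```

Because Python slices never raise on an out-of-range end index, a short last page needs no special handling; the same inclusive indexes can be passed directly to zrevrange.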


# common/cache/user.py

"""
Class name: UserFollowingCache
Instance attributes: user_id, the current user's id; key used to look up the data: key = "user:<user_id>:following"
Instance methods: get() reads the cache, update() updates the cache

Cache granularity: zset (sorted set) with the author id as the value and the follow time as the score
Data type: zset
Write command: zadd key score1 value1 score2 value2
Write function: redis_cli.zadd("key", {"value1": score1, "value2": score2})
Read command (all): zrevrange key 0 -1
Read command (one page): zrevrange key start_index end_index

Read the cache: UserFollowingCache(user_id).get()
Update the cache: UserFollowingCache(user_id).update()
"""
from models.user import Relation
from cache.constants import UserFollowCacheTTL

# user:<user id>:following  zset  [{value: user id, score: follow time}, {}, {}]
class UserFollowingCache:
    """User following-list cache class"""

    # Outline of get():
    # 0. Custom pagination: start_index = (page - 1) * per_page, end_index = start_index + per_page - 1
    # 1. Check whether the following-list cache exists: redis_cluster.exists(self.key)
    # 2. Cache exists --> return one page sorted by follow time, newest first:
    #    redis_cluster.zrevrange(self.key, start_index, end_index)
    # 3. Cache does not exist
    # 3.1 Look up the user by user_id
    # 3.2 Check that the user exists and that the follow count is greater than 0
    # 3.3 Follow count > 0: query the Relation table for the following list
    #     (Relation.author_id: the followed author's id, Relation.update_time: the follow time)
    # 3.3.1 Backfill: zadd every author id with its follow time, then set an expiration
    #       (UserFollowCacheTTL.get_val()) to guard against cache avalanche
    # 3.3.2 Return the requested page; slicing handles a partly filled last page,
    #       e.g. 26 records, page 3, per_page 10 -> items 20..25
    # 3.4 Follow count is 0: there is no following list, return an empty list

    def __init__(self, userid):
        self.userid = userid  # user primary key
        self.key = "user:{}:following".format(userid)  # Redis key

    def get(self, page, per_page):
        """Get one page of the cached list
        :return: list of primary keys, e.g. [1, 5, 11], or an empty list
        """

        # Check whether the cache exists
        is_key_exist = redis_cluster.exists(self.key)

        # Compute the start and end indexes from the page number and page size
        start_index = (page - 1) * per_page  # start index = (page - 1) * page size
        end_index = start_index + per_page - 1  # end index = start index + page size - 1

        if is_key_exist:  # the cache has data

            # zrevrange returns members in descending score order, always as a list, e.g. ['3', '4', '5']
            print('Read collection data from the cache')
            return redis_cluster.zrevrange(self.key, start_index, end_index)

        else:  # the cache has no data

            # A miss has two possible causes: the database has no data, or it does but the cache expired
            user = UserCache(self.userid).get()

            # Values read back from a Redis hash are strings, so convert before testing
            if user and int(user['follow_count']):  # the user follows someone (data exists, cache expired)

                followings = Relation.query.options(load_only(Relation.author_id, Relation.update_time)). \
                    filter(Relation.user_id == self.userid, Relation.relation == Relation.RELATION.FOLLOW). \
                    order_by(Relation.update_time.desc()).all()  # load and cache everything; caching only one page could give wrong results

                # Backfill the cache and build the result
                following_list = []
                for item in followings:

                    # zadd takes a {member: score} mapping in redis-py 3.0 and later
                    redis_cluster.zadd(self.key, {item.author_id: item.update_time.timestamp()})
                    following_list.append(item.author_id)

                # Set the expiration time
                redis_cluster.expire(self.key, UserFollowCacheTTL.get_val())

                print('Queried collection data and backfilled the cache')
                # Slicing never raises: a short last page returns the remaining items,
                # and a start index past the end returns []
                return following_list[start_index:end_index + 1]

            else:  # no follows: return an empty list (checking the count avoids cache penetration)
                return []

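Cached collections also need an expiration time; define the corresponding TTL class in common/cache/constants.py

# common/cache/constants.py

class UserFollowCacheTTL(BaseCacheTTL):
    """Expiration time for the user following cache"""
    TTL = 60 * 60 * 2  # expiration time
    MAX_DELTA = 600  # maximum random offset

Notes:
- When getting a cached collection, the extra check on the user's follow count (user['follow_count']) confirms whether the user follows anyone; the database is queried only when there are follows, which avoids cache penetration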

3. Updating the cache

Implement the update method of the user following-list cache class in user.py

# common/cache/user.py

class UserFollowingCache:
    """User following-list cache class"""

    def update(self, author_id, timestamp=None, is_follow=True):
        """Follow / unfollow"""
        # 1. Check whether the following list is cached
        is_key_exist = redis_cluster.exists(self.key)
        # 2. No cached following list: nothing to update
        if not is_key_exist:
            return None
        # 3. The following list is cached
        if is_follow:  # 3.1 is_follow=True: follow the author
            # zadd takes a {member: score} mapping in redis-py 3.0 and later
            redis_cluster.zadd(self.key, {author_id: timestamp})

        else:  # 3.2 is_follow=False: unfollow the author
            redis_cluster.zrem(self.key, author_id)

4. Updating the endpoints

With the user following-list cache class in place, rewrite the get-following-list and follow/unfollow endpoints to get data through the cache class
In app/resources/article/following.py, modify the view functions for getting the following list and for following/unfollowing

# app/resources/article/following.py

from datetime import datetime
from flask import g
from flask_restful import Resource
from flask_restful.reqparse import RequestParser
from sqlalchemy.orm import load_only
from models.user import User, Relation
from cache.user import UserFollowingCache, UserCache
# db and login_required come from the project itself; their imports are omitted, as in the original

class FollowUserResource(Resource):
    method_decorators = {'post': [login_required], 'get': [login_required]}

    def post(self):
        # Get the parameters
        userid = g.userid
        parser = RequestParser()
        parser.add_argument('target', required=True, location='json', type=int)
        args = parser.parse_args()
        author_id = args.target

        # Get the current time
        update_time = datetime.now()

        # Query the existing relation
        relation = Relation.query.options(load_only(Relation.id)).filter(Relation.user_id == userid, Relation.author_id == author_id).first()

        if relation:  # found: update the record
            relation.relation = Relation.RELATION.FOLLOW
            relation.update_time = update_time

        else:  # not found: insert a new record
            relation = Relation(user_id=userid, author_id=author_id, relation=Relation.RELATION.FOLLOW)
            db.session.add(relation)

        # Increase the author's fans count by 1
        User.query.filter(User.id == author_id).update({'fans_count': User.fans_count + 1})
        # Increase the user's following count by 1
        User.query.filter(User.id == userid).update({'following_count': User.following_count + 1})

        db.session.commit()

        # Update the cache
        UserFollowingCache(userid).update(author_id, update_time.timestamp(), is_follow=True)

        # Return the result
        return {'target': author_id}

    def get(self):
        """Get the following list"""
        # Get the parameters
        userid = g.userid
        parser = RequestParser()
        parser.add_argument('page', default=1, location='args', type=int)
        parser.add_argument('per_page', default=2, location='args', type=int)
        args = parser.parse_args()
        page = args.page
        per_page = args.per_page

        # Query the current user's following list
        following_list = UserFollowingCache(userid).get(page, per_page)
        author_list = []
        for author_id in following_list:
            author_cache = UserCache(author_id).get()
            # Mutual follow is not implemented here; it is added in the next subsection
            author_dict = {
                'id': author_cache['id'],
                'name': author_cache['name'],
                'photo': author_cache['photo'],
                'fans_count': author_cache['fans_count'],
                'mutual_follow': False
            }
            author_list.append(author_dict)

        # Get the user's following count
        user = UserCache(userid).get()

        # Return the data
        return {'results': author_list, 'per_page': per_page, 'page': page, 'total_count': user['follow_count']}


class UnFollowUserResource(Resource):
    method_decorators = {'delete': [login_required]}

    def delete(self, target):
        # Get the parameters
        userid = g.userid

        # Update the relation (target is the id of the author being unfollowed)
        Relation.query.filter(Relation.user_id == userid, Relation.author_id == target, Relation.relation == Relation.RELATION.FOLLOW).update({'relation': 0, 'update_time': datetime.now()})

        # Decrease the author's fans count by 1
        User.query.filter(User.id == target).update({'fans_count': User.fans_count - 1})
        # Decrease the user's following count by 1
        User.query.filter(User.id == userid).update({'following_count': User.following_count - 1})

        db.session.commit()

        # Update the cache
        UserFollowingCache(userid).update(target, is_follow=False)

        return {'target': target}

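5. Mutual follow

With the view logic so far, implementing mutual follow requires checking whether each author taken from the following list is also in the user's fans list
Since the user's fans list needs to be cached as well, the fans-list cache class can be written first, with a membership-check method implemented in it
Cache class implementation
The fans list is implemented in exactly the same way as the following list, so a base class can be defined to hold the shared code
Define the follow cache base class and the user fans-list cache class in common/cache/user.py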

# common/cache/user.py

class BaseFollowCache:
    """Base class for follow caches"""

    def get(self, page, per_page):
        """
        Get the follow list, one page at a time

        :param page: page number
        :param per_page: items per page
        :return: the data for the requested page as a list [user id, ...] / []
        """
        # Check whether Redis holds the cached data
        is_key_exist = redis_cluster.exists(self.key)

        # Compute the start and end indexes

        # start index = (page - 1) * items per page
        start_index = (page - 1) * per_page
        # end index = start index + items per page - 1
        end_index = start_index + per_page - 1

        if is_key_exist:  # the cache has data: read from the cache

            # zrevrange always returns a list (empty if there is no data), e.g. ['3', '4', '5']
            print('Got data from the cache')
            return redis_cluster.zrevrange(self.key, start_index, end_index)  # descending by score (follow time)

        else:  # the cache has no data: query the database

            # A miss has two possible causes: the database has no such data, or it does but the cache expired
            user = UserCache(self.userid).get()

            # Check whether the database has data; values read back from a Redis hash are strings
            if user and int(user[self.count_key]):   # the count is non-zero: query the database

                # The list for the current user (fields: id, follow time), newest follow first
                followings = self.db_query()

                # Backfill the data into Redis
                following_list = []

                # Choose the field to read depending on the kind of list
                property_name = 'author_id' if self.count_key == 'follow_count' else 'user_id'

                for item in followings:
                    # getattr(object, attribute name as a string) reads the attribute dynamically
                    member_id = getattr(item, property_name)
                    # Add the member to the cached sorted set ({member: score} mapping)
                    redis_cluster.zadd(self.key, {member_id: item.update_time.timestamp()})
                    # Add the id to the list that is returned
                    following_list.append(member_id)

                # Set the expiration time on the cached collection
                print('Queried the database and backfilled the cache')
                redis_cluster.expire(self.key, UserFollowCacheTTL.get_val())

                # Slicing never raises: a short last page returns the remaining items,
                # and a start index past the end returns []
                return following_list[start_index:end_index + 1]

            else:  # the count is zero: return an empty list
                return []

    def update(self, author_id, timestamp=None, is_follow=True):
        """Follow / unfollow"""

        is_key_exist = redis_cluster.exists(self.key)
        if not is_key_exist:  # no cache: nothing to update (avoids creating a partial list)
            return

        if is_follow:  # follow

            redis_cluster.zadd(self.key, {author_id: timestamp})

        else:  # unfollow
            redis_cluster.zrem(self.key, author_id)

# user:<user id>:following  zset  [{value: user id, score: follow time}, {}, {}]
class UserFollowingCache(BaseFollowCache):
    """User following-list cache class"""

    def __init__(self, userid):
        self.userid = userid  # user primary key
        self.key = "user:{}:following".format(userid)  # Redis key
        self.count_key = 'follow_count'

    def db_query(self):
        # The current user's following list (fields: author id, follow time), newest follow first
        return Relation.query.options(load_only(Relation.author_id, Relation.update_time)). \
            filter(Relation.user_id == self.userid, Relation.relation == Relation.RELATION.FOLLOW). \
            order_by(Relation.update_time.desc()).all()  # load everything from the database; caching only one page could give wrong results

# user:<user id>:fans  zset  [{value: user id, score: time of being followed}, {}, {}]
class UserFansCache(BaseFollowCache):
    """User fans-list cache class"""

    def __init__(self, userid):
        self.userid = userid  # author primary key
        self.key = "user:{}:fans".format(userid)  # Redis key
        self.count_key = 'fans_count'

    def db_query(self):
        # The current user's fans list (fields: user id, follow time), newest follow first.
        # The current user is treated as the author whose fans are queried.
        return Relation.query.options(load_only(Relation.user_id, Relation.update_time)). \
            filter(Relation.author_id == self.userid, Relation.relation == Relation.RELATION.FOLLOW). \
            order_by(Relation.update_time.desc()).all()

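In the user fans-list cache class in common/cache/user.py, define a method that checks whether a given fan is in the list


# common/cache/user.py

class UserFansCache(BaseFollowCache):
    """User fans-list cache class"""

    def has_fans(self, fans_id):
        """Check whether the given id is a fan of the current user"""

        # Check whether the cache exists
        is_key_exist = redis_cluster.exists(self.key)
        # No cache (for example, it expired): rebuild it
        if not is_key_exist:
            items = self.get(1, 1)
            # No fans at all
            if len(items) == 0:
                return False

        # zscore returns None when the member is not in the sorted set
        score = redis_cluster.zscore(self.key, fans_id)
        return score is not None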

Updating the endpoint

With the membership check on the user fans list in place, rewrite the get-following-list endpoint
In app/resources/article/following.py, modify the view function for getting the following list

# app/resources/article/following.py

from cache.user import UserFansCache

class FollowUserResource(Resource):

    def get(self):
        """Get the following list"""

        for author_id in following_list:
            author_cache = UserCache(author_id).get()
            author_dict = {
                'id': author_cache['id'],
                'name': author_cache['name'],
                'photo': author_cache['photo'],
                'fans_count': author_cache['fans_count'],
                'mutual_follow': False
            }

            # If the author is also a fan of the current user, the follow is mutual
            if UserFansCache(userid).has_fans(author_id):
                author_dict['mutual_follow'] = True

            author_list.append(author_dict)

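The membership check can also be used in the article detail route, which needs to know whether the current user is a fan of the article's author. Using this method instead of a direct database query makes the lookup faster
In app/resources/article/articles.py, modify the article detail view function


# app/resources/article/articles.py
from cache.user import UserFansCache

class ArticleDetailResource(Resource):
    def get(self, article_id):
        # The code that loads the article (data, article_dict) and the current user id (userid)
        # is omitted in the original excerpt

        # Check whether the user is logged in
        if userid:
            # Look up the follow relationship   user -> author
            has_fans = UserFansCache(data.user_id).has_fans(userid)

            article_dict['is_followed'] = has_fans

        # Return the data
        return article_dict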
