Distributed ID Generator

昔日微醉

Last modified 2023-05-23 19:27:29


Tags: database, java, big data

First published 2023-05-23 15:54:58

Article link: https://blog.csdn.net/qq_41494158/article/details/130828654


The ID generation problem
Established industry solutions
- tddl sequence
- Twitter Snowflake
Design
- ID bit layout
- Implementation details

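The ID generation problem

On a single machine, generating a unique ID is easy: the database's auto-increment primary key does the job. Once databases and tables are sharded across a distributed system, it gets more complicated. The unique IDs that are generated have to meet the following requirements: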

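Uniqueness (IDs only need to be unique within a namespace, not globally. For example, order IDs and user IDs are each unique within their own namespace, but an order ID may coincide with a user ID.)
Time-related (for example, an order number should reflect when the order was created). This can later be used when designing the HBase rowkey for data synchronized to HBase.
Roughly ordered (with second-level precision, an ID from a later second is always larger than one from an earlier second, but within the same second an ID taken later may be smaller than one taken earlier. "Roughly" therefore means ordering is guaranteed across seconds, not within a second.)
High throughput: ID generation must be fast enough for the business system (Taobao's Double 11 peak exceeded 500,000 orders per second).
The service must be highly available.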

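Established industry solutions

tddl sequence

Each application database is configured with a sequence table, and tddl-client is responsible for handing out IDs. tddl-client caches a range of IDs in the application's local memory and fetches the next range from the sequence table in the database once the cached range is used up.

Pros: works with the existing database; high availability relies on that of the database (MySQL, PostgreSQL, etc.).
Cons: the ID carries no time or sharding information, and with sharded databases and tables a separate database service has to be deployed.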
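
The range-caching idea can be sketched roughly as follows. This is an illustrative model only, not tddl's actual API: the class name `SegmentSequence`, the `step` parameter, and the in-memory store that stands in for the sequence table are all assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

/** Illustrative model of range-cached id allocation; not tddl's real API. */
final class SegmentSequence {
    // Hypothetical stand-in for the sequence table:
    // given a step, it reserves `step` ids and returns the first one.
    private final LongUnaryOperator store;
    private final long step;
    private long next;   // next id to hand out from the cached range
    private long limit;  // exclusive upper bound of the cached range

    SegmentSequence(LongUnaryOperator store, long step) {
        this.store = store;
        this.step = step;
        this.next = 0;
        this.limit = 0;  // empty range, forces a fetch on first use
    }

    /** Hands out ids from local memory and only touches the store when the range is used up. */
    synchronized long nextId() {
        if (next >= limit) {
            next = store.applyAsLong(step);
            limit = next + step;
        }
        return next++;
    }

    /** In-memory stand-in for an atomic "value = value + step" update on the sequence table. */
    static LongUnaryOperator inMemoryStore() {
        AtomicLong value = new AtomicLong(1);
        return step -> value.getAndAdd(step);
    }
}
```

Two clients sharing one store receive disjoint ranges, which also shows why IDs are unique but not time-ordered across machines.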

Twitter Snowflake

A Snowflake unique ID is composed of (from high bits to low bits):

41 bits: timestamp (millisecond precision)
10 bits: node ID (datacenter ID 5 bits + worker ID 5 bits)
12 bits: sequence number
63 bits in total (the top bit of the 64-bit value is always 0)

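How a unique ID is generated:

10-bit machine number: obtained from a ZooKeeper cluster when the ID-allocating worker starts (this guarantees that no two workers share a machine number).
41-bit timestamp: each time a new ID is generated, the current timestamp is read, and the sequence number is then produced in one of two ways:
If the current timestamp equals that of the previously generated ID (same millisecond), the new sequence number (12 bits) is the previous sequence number + 1. If all IDs of this millisecond are used up, generation waits for the next millisecond (no new ID can be handed out during the wait).
If the current timestamp is larger than that of the previous ID, a random initial sequence number (12 bits) is chosen as the first sequence number of this millisecond. (Note: the reference implementation that Twitter open-sourced resets the sequence to 0 at this point; a random start is a variant.)
Throughout the process, the only external dependency is at worker startup (fetching the worker number from ZooKeeper). After that each worker runs independently, which makes the scheme decentralized.
Pros: largely decentralized, no single point of failure.
Cons: the timestamp has millisecond precision and the sequence has 12 bits, so a node can issue at most 4096 IDs per millisecond; availability also suffers if the application machine's clock is moved backwards.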
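
The generation steps above can be sketched as below. This is a minimal illustration, not Twitter's code: the class name, the injectable clock, and the epoch value (2023-01-01T00:00:00Z) are assumptions, the node ID is passed in directly instead of being fetched from ZooKeeper, and the sequence restarts at 0 on each new millisecond for simplicity.

```java
import java.util.function.LongSupplier;

/** Minimal Snowflake-style generator; a sketch under stated assumptions. */
final class SnowflakeSketch {
    static final long EPOCH_MILLIS = 1672531200000L; // assumed custom epoch: 2023-01-01T00:00:00Z
    static final int NODE_BITS = 10;
    static final int SEQUENCE_BITS = 12;
    static final long MAX_SEQUENCE = (1L << SEQUENCE_BITS) - 1; // 4095

    private final long nodeId;
    private final LongSupplier clock; // injectable millisecond clock, e.g. System::currentTimeMillis
    private long lastMillis = -1L;
    private long sequence = 0L;

    SnowflakeSketch(long nodeId, LongSupplier clock) {
        if (nodeId < 0 || nodeId >= (1L << NODE_BITS)) {
            throw new IllegalArgumentException("nodeId out of range: " + nodeId);
        }
        this.nodeId = nodeId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        if (now < lastMillis) {
            // Clock rollback: refuse to generate rather than risk a duplicate.
            throw new IllegalStateException("clock moved backwards by " + (lastMillis - now) + " ms");
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & MAX_SEQUENCE;
            if (sequence == 0) {
                // All 4096 ids of this millisecond are used: busy-wait for the next one.
                while ((now = clock.getAsLong()) <= lastMillis) {
                    // spin
                }
            }
        } else {
            sequence = 0L; // new millisecond: restart the sequence
        }
        lastMillis = now;
        return ((now - EPOCH_MILLIS) << (NODE_BITS + SEQUENCE_BITS))
                | (nodeId << SEQUENCE_BITS)
                | sequence;
    }
}
```

In production the clock would be `System::currentTimeMillis`; injecting it here only makes the behavior easy to check with a fixed time.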

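Design

Overall this is a Snowflake variant: it adds sharding bits and uses a Redis Lua script to hand out non-overlapping sequence ranges to different machines within the same timestamp.
The ID carries the shard information, so a lookup by order number can parse the shardingId straight out of the ID.
In the Lua script, the sequenceKey is built from the namespace plus the application machine's current timestamp (EpochSeconds).
Each application machine caches a sequence range per sequenceKey, so it rarely talks to Redis, which raises ID allocation throughput.
Clock rollback and the NTP synchronization gap have to be taken into account.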

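ID bit layout

The bits of the 8-byte long are allocated as follows:

---Sign(1 bit)----Time(32 bits)----Sharding(12 bits)----Sequence(19 bits)---

Sign bit: 1 bit, always 0, so the value stays positive
Timestamp: 32 bits, second precision, covers about 136 years; combined with an epoch offset it can last until the year 2155 (which presumably means a custom epoch around 2019)
Sharding bits: 12 bits, supports 4096 shards
Sequence number: 19 bits, supports at most about 520,000 IDs per second (1L << 19 = 524288)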

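 // Packs seconds since CUSTOM_EPOCH, the shard id and the sequence into one long.
 // From the layout above: TIMESTAMP_SHIFT = 12 + 19 = 31, SHARDING_SHIFT = 19.
 public long id(long seconds, int shardingId, long sequence) {
    long shiftedTimestamp = (seconds - CUSTOM_EPOCH) << TIMESTAMP_SHIFT;
    // Widen to long before shifting so the shift is not done in 32-bit int arithmetic.
    long shiftedSharding = ((long) shardingId) << SHARDING_SHIFT;
    return shiftedTimestamp | shiftedSharding | sequence;
  }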
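
To illustrate how the shardingId can be parsed back out of an ID, here is a small round-trip sketch. The shift values follow from the bit layout; the class name `IdLayout`, the accessor names, and the `CUSTOM_EPOCH` value (2019-01-01T00:00:00Z) are assumptions for the example, since the article does not give them.

```java
/** Packs and unpacks the 1 + 32 + 12 + 19 bit layout; names and epoch are assumed. */
final class IdLayout {
    static final int SEQUENCE_BITS = 19;
    static final int SHARDING_BITS = 12;
    static final int SHARDING_SHIFT = SEQUENCE_BITS;                  // 19
    static final int TIMESTAMP_SHIFT = SEQUENCE_BITS + SHARDING_BITS; // 31
    static final long CUSTOM_EPOCH = 1546300800L; // assumed: 2019-01-01T00:00:00Z, in epoch seconds

    static long pack(long seconds, int shardingId, long sequence) {
        if (shardingId < 0 || shardingId >= (1 << SHARDING_BITS)) {
            throw new IllegalArgumentException("shardingId out of range: " + shardingId);
        }
        if (sequence < 0 || sequence >= (1L << SEQUENCE_BITS)) {
            throw new IllegalArgumentException("sequence out of range: " + sequence);
        }
        return ((seconds - CUSTOM_EPOCH) << TIMESTAMP_SHIFT)
                | ((long) shardingId << SHARDING_SHIFT)
                | sequence;
    }

    /** Epoch seconds at which the id was generated. */
    static long seconds(long id) {
        return (id >>> TIMESTAMP_SHIFT) + CUSTOM_EPOCH;
    }

    /** Shard the id belongs to, recovered without any lookup. */
    static int shardingId(long id) {
        return (int) ((id >>> SHARDING_SHIFT) & ((1L << SHARDING_BITS) - 1));
    }

    /** Sequence number within the second. */
    static long sequence(long id) {
        return id & ((1L << SEQUENCE_BITS) - 1);
    }
}
```

Because the timestamp occupies the highest bits below the sign bit, comparing two IDs as plain longs orders them by second first, which is the "roughly ordered" property from the requirements.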

Implementation details

Interface definition

public interface IdGenerator {
    /**
     * Generates the next id as a long.
     *
     * @param shardingId sharding id
     * @return the id
     */
    long nextId(int shardingId);

    /**
     * Generates the next string id without a prefix.
     *
     * @param shardingId sharding id
     * @return the string id
     */
    default String next(int shardingId) {
        return next("", shardingId);
    }

    /**
     * id format: prefix + time (yyyyMMddHHmmss, 14 digits) + sequence (6 digits) + shardingId (4 digits)
     * <p>
     * the last 4 digits are the sharding id.
     * </p>
     *
     * @param prefix     the id prefix
     * @param shardingId sharding id
     * @return the string id
     */
    String next(String prefix, int shardingId);
}
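
The string format described in the Javadoc could be assembled along these lines. This is only a sketch: the helper class `StringIdFormat`, its method names, and the choice of UTC for rendering the time are assumptions, since the article does not show the implementation of `next(String, int)`.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper for the prefix + time + sequence + shardingId string format. */
final class StringIdFormat {
    // Assumption: the time is rendered in UTC; the article does not name a time zone.
    private static final DateTimeFormatter TIME =
            DateTimeFormatter.ofPattern("yyyyMMddHHmmss").withZone(ZoneOffset.UTC);

    static String format(String prefix, long epochSeconds, long sequence, int shardingId) {
        return prefix
                + TIME.format(Instant.ofEpochSecond(epochSeconds)) // 14 digits
                + String.format("%06d", sequence)                  // 6 digits, max 524287
                + String.format("%04d", shardingId);               // 4 digits, max 4095
    }

    /** The last 4 digits are the sharding id, so routing needs no lookup. */
    static int shardingIdOf(String id) {
        return Integer.parseInt(id.substring(id.length() - 4));
    }
}
```

For example, `StringIdFormat.format("ORD", 0L, 42L, 7)` yields `ORD197001010000000000420007`.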

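Allocating sequence ranges under concurrency
A Redis Lua script handles all the logic for the relevant keys. Redis executes commands one at a time on a single thread, so the script runs without interleaving.
Clock rollback and the NTP synchronization gap
When generating an ID, the application machine's current time (in seconds) + namespace is used as the key to fetch a sequence range from the Redis cluster. Machine clocks differ by some synchronization offset, so the key needs a safe expiry time. 10 minutes is usually enough, and the NTP synchronization interval must be smaller than this expiry time.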

local sequence_key = KEYS[1]
local app_server_time = tonumber(ARGV[1])
local max_sequence = tonumber(ARGV[2])
local size = tonumber(ARGV[3])
local lock_key = 'lock-' .. sequence_key

if redis.call('EXISTS', lock_key) == 1 then
   redis.log(redis.LOG_WARNING, 'Cannot generate ID, waiting for lock to expire.')
   return redis.error_reply('Cannot generate ID, waiting for lock to expire.')
end

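-- Reserve a block of `size` sequence numbers by advancing the counter.
-- redis.call (rather than redis.pcall) lets a failed INCRBY abort the script
-- instead of passing an error table into the arithmetic below.
local end_sequence = redis.call('INCRBY', sequence_key, size)
redis.call('EXPIRE', sequence_key, 600) -- 10 min expiry; the NTP time gap should be less than 10 min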
local start_sequence = end_sequence - size + 1
if end_sequence >= max_sequence then
    redis.log(redis.LOG_WARNING, 'Rolling sequence back to the start, locking for 1s.')
    redis.pcall('PSETEX', lock_key, 1000, 'lock')
    end_sequence = max_sequence
end

return {
    start_sequence,
    end_sequence, -- Doesn't need conversion, the result of INCR or the variable set is always a number.
    app_server_time
}
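
To make the ranges returned by the script easier to follow, here is a plain-Java model of its arithmetic. It is not a Redis client and is only an approximation: the class name `SequenceScriptModel` and the explicit `nowMillis` parameter are assumptions, the counter and the lock live in in-memory maps, and the 10-minute key expiry is not modeled.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory model of the Lua script's range arithmetic; not a Redis client. */
final class SequenceScriptModel {
    private final Map<String, Long> counters = new HashMap<>();
    private final Map<String, Long> lockedUntilMillis = new HashMap<>();

    /** Returns {startSequence, endSequence, appServerTime}; throws while the key is locked. */
    synchronized long[] allocate(String key, long appServerTime,
                                 long maxSequence, long size, long nowMillis) {
        Long until = lockedUntilMillis.get(key);
        if (until != null && nowMillis < until) {
            throw new IllegalStateException("Cannot generate ID, waiting for lock to expire.");
        }
        long end = counters.merge(key, size, Long::sum); // models INCRBY key size
        long start = end - size + 1;
        if (end >= maxSequence) {
            lockedUntilMillis.put(key, nowMillis + 1000); // models PSETEX lock_key 1000
            end = maxSequence;                            // clamp to the cap
        }
        return new long[] {start, end, appServerTime};
    }
}
```

With `maxSequence = 10` and `size = 4`, three calls return 1..4, 5..8 and 9..10. A call made after the lock expires returns a start of 13 with an end of 10, so a caller of logic like this should treat a range whose start exceeds its end as empty.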

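When the currently cached sequence range is used up, another one has to be fetched from the Redis server. Under high concurrency, only one thread on a given machine should perform this refresh; otherwise large numbers of sequence ranges are wasted.
This is a single-producer, multiple-consumer model, which in Java can be implemented with a ReadWriteLock.

The local cache is keyed by sequenceKey; the value is a sequence range (SequenceSegment).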

public class SequenceSegment {
  private volatile long startSequence;
  private volatile long endSequence;
  private volatile long seconds;
  private AtomicLong val = new AtomicLong(0);
  private final ReadWriteLock lock;
	//.....
}
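
The elided part of the class might look roughly like the following. This is a guess at the shape, not the author's code: the method names `takeOne`, `isReachEnd`, `getSeconds`, `getEndSequence` and `getLock` are taken from the calls shown later in the article, while `populate`, the initial values, and the class name `SequenceSegmentSketch` are assumptions.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Sketch of a locally cached sequence range; method bodies are assumed. */
final class SequenceSegmentSketch {
    private volatile long startSequence;
    private volatile long endSequence = -1; // empty until populated, so isReachEnd() is true
    private volatile long seconds;
    private final AtomicLong val = new AtomicLong(0);
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    /** Installs a fresh range; the caller is expected to hold the write lock. */
    void populate(long start, long end, long seconds) {
        this.startSequence = start;
        this.endSequence = end;
        this.seconds = seconds;
        this.val.set(start);
    }

    /** Takes the next value; the caller must check it against getEndSequence(). */
    long takeOne() {
        return val.getAndIncrement();
    }

    /** True once every value up to endSequence has been handed out. */
    boolean isReachEnd() {
        return val.get() > endSequence;
    }

    long getStartSequence() { return startSequence; }
    long getEndSequence() { return endSequence; }
    long getSeconds() { return seconds; }
    ReadWriteLock getLock() { return lock; }
}
```

The atomic counter lets many reader threads take values concurrently under the read lock, while the write lock is only needed for the brief moment a new range is installed.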

The thread that refreshes the sequence range acquires the ReadWriteLock's write lock

private SequenceSegment getSegment(String sequenceKey) {
    try {
      SequenceSegment segment = cache.getIfPresent(sequenceKey);
      if (segment != null) {
        if (segment.isReachEnd()) { // the locally cached sequence range is used up, so refresh it
          synchronized (segment) {
            if (segment.isReachEnd()) {
              Lock wLock = segment.getLock().writeLock();
              wLock.lock();
              try {
                // Logging sits inside try so the write lock is released even if it throws.
                if (log.isDebugEnabled()) {
                  log.debug("update sequenceKey={} id segment={}", sequenceKey, segment);
                }
                List<Long> data = evalLuaScript(sequenceKey);
                populateSegment(segment, data);
              } finally {
                wLock.unlock();
              }
            }
          }
        }
      } else {
        segment = cache.get(sequenceKey, () -> {
          List<Long> data = evalLuaScript(sequenceKey);
          SequenceSegment ans = new SequenceSegment();
          populateSegment(ans, data);
          if (log.isDebugEnabled()) {
            log.debug("insert sequenceKey={} id segment={}", sequenceKey, ans);
          }
          return ans;
        });
      }
      return segment;
    } catch (ExecutionException e) {
      log.error("getRedisResponse occur error, sequenceKey={}", sequenceKey, e);
      throw new SequenceException("", e);
    }
  }

Threads that take a sequence acquire the read lock

private long[] getSecondsAndSequence(int shardingId) {
    String key = getSequenceKey(Instant.now().getEpochSecond());
    long sequence, seconds;
    try {
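      while (true) {
        // Re-fetch on every pass so that getSegment can refresh an exhausted segment;
        // fetching once outside the loop would spin forever on a used-up range.
        SequenceSegment segment = getSegment(key);
        segment.getLock().readLock().lock();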
        try {
          sequence = segment.takeOne();
          seconds = segment.getSeconds();
          if (sequence <= segment.getEndSequence()) {
            return new long[]{sequence, seconds};
          }
        } finally {
          segment.getLock().readLock().unlock();
        }
      }
    } catch (SequenceException e) {
      log.error("redis occur exception, fallback...", e);
      seconds = Instant.now().getEpochSecond();
      sequence = ThreadLocalRandom.current()
          .nextInt(1, DefaultIdDistribution.INSTANCE.getMaxSequence());
      return new long[]{sequence, seconds};
    }
  }

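Service high availability
Redis servers run in a master-replica setup, and important applications get their own dedicated Redis instance. While a Redis server is down, the cached sequence range is used first. Once it is exhausted, generation degrades to using a random number as the sequence, which can no longer strictly guarantee uniqueness.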
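
The degradation step can be isolated in a small helper like the one below. It is a sketch only: the class and method names are hypothetical, and it catches any `RuntimeException`, whereas the article's code catches its own `SequenceException`.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.LongSupplier;

/** Hypothetical helper: use the primary sequence source, or a random sequence if it fails. */
final class SequenceFallback {
    /**
     * @param primary     normal sequence source, e.g. one backed by the Redis range cache
     * @param maxSequence exclusive upper bound for the random fallback, e.g. 1 << 19
     */
    static long sequenceOrRandom(LongSupplier primary, int maxSequence) {
        try {
            return primary.getAsLong();
        } catch (RuntimeException e) {
            // Degraded mode: random value in [1, maxSequence); collisions become possible.
            return ThreadLocalRandom.current().nextInt(1, maxSequence);
        }
    }
}
```

Since two machines can draw the same random value within the same second, callers that rely on this path may want a uniqueness check (for example a unique index) downstream.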
