A Brief Introduction to Snowflake
Introduction
Snowflake is an algorithm designed by Twitter for generating unique IDs in a distributed system.
-
Characteristics
-
Performance
Minimum 10k IDs per second per process; response time of 2 ms (plus network latency).
-
Uncoordinated
For high availability within and across data centers, machines generating ids should not have to coordinate with each other.
-
(Roughly) Time Ordered
-
Directly Sortable
-
Compact
There are many otherwise reasonable solutions to this problem that require 128-bit numbers. For various reasons, we need to keep our ids under 64 bits.
-
-
Algorithm
2.1 Properties
-
Millisecond precision; usable for about 69 years.
time - 41 bits (millisecond precision w/ a custom epoch gives us 69 years)
-
Can be deployed on up to 1024 machines.
configured machine id - 10 bits - gives us up to 1024 machines
-
Each machine can generate up to 4096 IDs per millisecond.
sequence number - 12 bits - rolls over every 4096 per machine (with protection to avoid rollover in the same ms)
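The three capacity figures above follow directly from the bit widths. The sketch below recomputes them; the class and method names are hypothetical and chosen only for illustration, and a year is taken as 365 days, as in the formula in section 2.2.

```java
// Hypothetical helper, not part of Snowflake itself: recomputes the capacity
// figures from the bit widths listed above.
final class SnowflakeCapacity {
    static final int TIMESTAMP_BITS = 41;
    static final int MACHINE_ID_BITS = 10;
    static final int SEQUENCE_BITS = 12;

    /** Largest unsigned value that fits in the given number of bits. */
    static long maxValue(int bits) {
        return (1L << bits) - 1;
    }

    /** Whole 365-day years covered by a millisecond counter of the given width. */
    static long yearsCovered(int bits) {
        long msPerYear = 1000L * 60 * 60 * 24 * 365;
        return maxValue(bits) / msPerYear;
    }

    public static void main(String[] args) {
        System.out.println("years covered:   " + yearsCovered(TIMESTAMP_BITS));
        System.out.println("machines:        " + (maxValue(MACHINE_ID_BITS) + 1));
        System.out.println("ids per ms:      " + (maxValue(SEQUENCE_BITS) + 1));
        // Upper bound per machine per second if every millisecond is fully used.
        System.out.println("ids per second:  " + (maxValue(SEQUENCE_BITS) + 1) * 1000);
    }
}
```

The per-second figure is a theoretical ceiling per machine; the 10k ids per second quoted earlier is a minimum requirement, not this limit.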
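2.2 Structure
- 1 bit, reserved (sign bit): 0 means positive, 1 means negative; it is always 0 in a generated ID
- 41 bits, timestamp in milliseconds (offset from a custom epoch), so (2^41-1)/(1000 * 60 * 60 * 24 * 365) ≈ 69 years
- 10 bits, machine ID, split into a 5-bit datacenterId and a 5-bit workerId. 2^10=1024
- 12 bits, sequence number, distinguishes IDs generated within the same millisecond. 2^12=4096
2.3 Core code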
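To make the layout concrete, here is a minimal sketch that packs the four fields into one 64-bit value and extracts them again. The class and method names are assumptions made for this example; only the bit widths (41, 5 + 5, 12) come from the list above.

```java
// Hypothetical illustration of the 1 + 41 + 5 + 5 + 12 bit layout.
final class SnowflakeLayout {
    static final int SEQUENCE_BITS = 12;
    static final int WORKER_ID_BITS = 5;
    static final int DATACENTER_ID_BITS = 5;

    static final int WORKER_ID_SHIFT = SEQUENCE_BITS;                            // 12
    static final int DATACENTER_ID_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS;       // 17
    static final int TIMESTAMP_SHIFT = DATACENTER_ID_SHIFT + DATACENTER_ID_BITS; // 22

    /** Bit mask with the lowest `bits` bits set. */
    static long mask(int bits) {
        return (1L << bits) - 1;
    }

    /** Combines the fields; callers must pass values that fit their bit widths. */
    static long pack(long millisSinceEpoch, long datacenterId, long workerId, long sequence) {
        return (millisSinceEpoch << TIMESTAMP_SHIFT)
                | (datacenterId << DATACENTER_ID_SHIFT)
                | (workerId << WORKER_ID_SHIFT)
                | sequence;
    }

    static long timestampOf(long id) {
        return id >>> TIMESTAMP_SHIFT;
    }

    static long datacenterIdOf(long id) {
        return (id >>> DATACENTER_ID_SHIFT) & mask(DATACENTER_ID_BITS);
    }

    static long workerIdOf(long id) {
        return (id >>> WORKER_ID_SHIFT) & mask(WORKER_ID_BITS);
    }

    static long sequenceOf(long id) {
        return id & mask(SEQUENCE_BITS);
    }
}
```

Because the timestamp occupies the most significant bits, an ID from a later millisecond compares greater than any ID from an earlier one, regardless of machine or sequence. This is what makes the IDs directly sortable and roughly time ordered.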
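    /** Start epoch (2015-01-01 00:00:00 UTC+8) */
    private final long twepoch = 1420041600000L;

    /** Fields referenced below; the bit widths follow section 2.2 */
    private final long workerIdBits = 5L;
    private final long datacenterIdBits = 5L;
    private final long sequenceBits = 12L;

    /** Timestamp is shifted left by 22 bits (5+5+12) */
    private final long timestampLeftShift = sequenceBits + workerIdBits + datacenterIdBits;

    /** Worker id is shifted left by 12 bits */
    private final long workerIdShift = sequenceBits;

    /** Datacenter id is shifted left by 17 bits (12+5) */
    private final long datacenterIdShift = sequenceBits + workerIdBits;

    /** Mask for the sequence, here 4095 (0b111111111111=0xfff=4095) */
    private final long sequenceMask = -1L ^ (-1L << sequenceBits);

    /** Worker id (0~31) and datacenter id (0~31), assigned at construction */
    private long workerId;
    private long datacenterId;

    /** Sequence within the current millisecond (0~4095) */
    private long sequence = 0L;

    /** Timestamp of the last generated ID */
    private long lastTimestamp = -1L;

    /**
     * Returns the next ID (this method is thread-safe).
     * @return SnowflakeId
     */
    public synchronized long nextId() {
        long timestamp = timeGen();

        // If the current time is earlier than the timestamp of the last ID,
        // the system clock has moved backwards, so throw an exception.
        if (timestamp < lastTimestamp) {
            throw new RuntimeException(
                    String.format("Clock moved backwards. Refusing to generate id for %d milliseconds",
                            lastTimestamp - timestamp));
        }

        if (lastTimestamp == timestamp) {
            // Same millisecond: advance the sequence.
            sequence = (sequence + 1) & sequenceMask;
            // Sequence overflowed within this millisecond.
            if (sequence == 0) {
                // Block until the next millisecond to get a new timestamp.
                timestamp = tilNextMillis(lastTimestamp);
            }
        } else {
            // Timestamp changed: reset the sequence.
            sequence = 0L;
        }

        // Remember the timestamp of this ID.
        lastTimestamp = timestamp;

        // Shift the fields and OR them together into a 64-bit ID.
        return ((timestamp - twepoch) << timestampLeftShift)
                | (datacenterId << datacenterIdShift)
                | (workerId << workerIdShift)
                | sequence;
    }

    /**
     * Blocks until the next millisecond, that is, until a new timestamp is obtained.
     * @param lastTimestamp timestamp of the last generated ID
     * @return the current timestamp
     */
    protected long tilNextMillis(long lastTimestamp) {
        long timestamp = timeGen();
        while (timestamp <= lastTimestamp) {
            timestamp = timeGen();
        }
        return timestamp;
    }

    /** Returns the current time in milliseconds. */
    protected long timeGen() {
        return System.currentTimeMillis();
    }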
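The mask expression `-1L ^ (-1L << sequenceBits)` and the wrap-around `(sequence + 1) & sequenceMask` are the least obvious parts of the code above. The small sketch below isolates them; the class name and the `next` helper are hypothetical and exist only for this illustration.

```java
// Hypothetical demo of the sequence mask and its wrap-around behaviour.
final class SequenceMaskDemo {
    static final long SEQUENCE_BITS = 12L;

    // -1L is all ones; shifting left by 12 clears the low 12 bits, and XOR
    // with all ones flips the result so only the low 12 bits remain set.
    static final long SEQUENCE_MASK = -1L ^ (-1L << SEQUENCE_BITS);

    /** Advances the sequence, wrapping back to 0 once the 12 bits overflow. */
    static long next(long sequence) {
        return (sequence + 1) & SEQUENCE_MASK;
    }

    public static void main(String[] args) {
        System.out.println("mask       = " + SEQUENCE_MASK);
        System.out.println("next(0)    = " + next(0L));
        System.out.println("next(4095) = " + next(4095L));
    }
}
```

A result of 0 from `next` is the signal that all 4096 sequence values of the current millisecond are used up, which is why `nextId()` then waits for the next millisecond.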
-
-
System clock dependency
You should use NTP to keep your system clock accurate. Snowflake protects from non-monotonic clocks, i.e. clocks that run backwards. If your clock is running fast and NTP tells it to repeat a few milliseconds, snowflake will refuse to generate ids until a time that is after the last time we generated an id. Even better, run in a mode where ntp won’t move the clock backwards. See http://wiki.dovecot.org/TimeMovedBackwards#Time_synchronization for tips on how to do this.
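Snowflake depends heavily on the machine clock: if the clock is set back, Snowflake cannot generate IDs until the system clock catches up with the last timestamp it used.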
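The refusal behaviour can be reproduced with a simulated clock. The following is a simplified sketch, not Twitter's implementation: the class name, the single 10-bit `machineId` (instead of separate datacenter and worker ids), and the injected `LongSupplier` clock are all assumptions made so that the example can step time backwards on demand.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified generator with an injectable clock.
final class ClockCheckedGenerator {
    private static final long EPOCH = 1420041600000L; // same value as twepoch in 2.3
    private static final int SEQUENCE_BITS = 12;
    private static final int MACHINE_ID_BITS = 10;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final LongSupplier clock;
    private final long machineId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    ClockCheckedGenerator(long machineId, LongSupplier clock) {
        if (machineId < 0 || machineId >= (1L << MACHINE_ID_BITS)) {
            throw new IllegalArgumentException("machineId out of range: " + machineId);
        }
        this.machineId = machineId;
        this.clock = clock;
    }

    synchronized long nextId() {
        long timestamp = clock.getAsLong();
        if (timestamp < lastTimestamp) {
            // The clock ran backwards: refuse until it catches up again.
            throw new IllegalStateException(
                    "Clock moved backwards by " + (lastTimestamp - timestamp) + " ms");
        }
        if (timestamp == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // Sequence exhausted: spin until the clock reaches a new millisecond.
                while (timestamp <= lastTimestamp) {
                    timestamp = clock.getAsLong();
                }
            }
        } else {
            sequence = 0L;
        }
        lastTimestamp = timestamp;
        return ((timestamp - EPOCH) << (MACHINE_ID_BITS + SEQUENCE_BITS))
                | (machineId << SEQUENCE_BITS)
                | sequence;
    }

    public static void main(String[] args) {
        long[] now = { EPOCH + 1000 };            // simulated wall clock in ms
        ClockCheckedGenerator gen = new ClockCheckedGenerator(1, () -> now[0]);
        long first = gen.nextId();
        now[0] -= 5;                              // simulate a 5 ms step backwards
        try {
            gen.nextId();
        } catch (IllegalStateException e) {
            System.out.println("refused: " + e.getMessage());
        }
        now[0] += 6;                              // clock is now past the last timestamp
        long second = gen.nextId();
        System.out.println("ordered after recovery: " + (second > first));
    }
}
```

Throwing immediately mirrors the code in section 2.3. A service could instead choose to wait out a small backwards step, but that is a design decision outside what the original describes.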