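What is the snowflake algorithm?
The snowflake algorithm was originally used inside Twitter to generate unique IDs in a distributed environment. Twitter open-sourced it in 2010.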
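Properties of the snowflake algorithm
- IDs do not repeat, even in a highly concurrent distributed system
- IDs are timestamp-based, so they are roughly ordered and increasing
- Security: IDs are not strictly consecutive and follow no simple pattern, which makes the data harder to crawl by enumerating keys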
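Structure of a snowflake ID
The binary form of a primary key generated by the snowflake algorithm has four parts. From the high bits to the low bits they are: a 1-bit sign bit, a 41-bit timestamp, a 10-bit worker process ID, and a 12-bit sequence number.
- Sign bit (1 bit): reserved, always zero.
- Timestamp bits (41 bit): milliseconds elapsed since a custom epoch. 2^41 milliseconds is roughly 69.7 years.
- Worker process bits (10 bit): identify a Java process, so each process in the deployment needs its own value. Ten bits allow at most 1024 nodes, and can be split into a 5-bit datacenterId and a 5-bit workerId.
- Sequence bits (12 bit): the sequence distinguishes IDs generated within the same millisecond. At most 4096 (2 to the power of 12) IDs are supported per millisecond.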
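The layout above can be sketched as a few shifts and masks. This is only an illustration: the class `SnowflakeLayout` and its method names are hypothetical and do not come from Twitter's or sharding-jdbc's code, and the 5/5 split of the worker bits is assumed to put datacenterId in the upper half.

```java
/** Illustrative bit layout of a 64-bit snowflake ID; all names here are hypothetical. */
final class SnowflakeLayout {
    static final long SEQUENCE_BITS = 12L;
    static final long WORKER_ID_BITS = 10L;
    static final long TIMESTAMP_BITS = 41L;

    static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;   // 4095
    static final long WORKER_ID_MASK = (1L << WORKER_ID_BITS) - 1; // 1023
    static final long TIMESTAMP_MASK = (1L << TIMESTAMP_BITS) - 1;

    static final long WORKER_ID_SHIFT = SEQUENCE_BITS;                  // 12
    static final long TIMESTAMP_SHIFT = SEQUENCE_BITS + WORKER_ID_BITS; // 22

    private SnowflakeLayout() { }

    /** Packs the three fields into one long; the sign bit stays 0, so the result is non-negative. */
    static long compose(long millisSinceEpoch, long workerId, long sequence) {
        requireInRange(millisSinceEpoch, TIMESTAMP_MASK, "timestamp");
        requireInRange(workerId, WORKER_ID_MASK, "workerId");
        requireInRange(sequence, SEQUENCE_MASK, "sequence");
        return (millisSinceEpoch << TIMESTAMP_SHIFT) | (workerId << WORKER_ID_SHIFT) | sequence;
    }

    static long timestampOf(long id) {
        return (id >>> TIMESTAMP_SHIFT) & TIMESTAMP_MASK;
    }

    static long workerIdOf(long id) {
        return (id >>> WORKER_ID_SHIFT) & WORKER_ID_MASK;
    }

    static long sequenceOf(long id) {
        return id & SEQUENCE_MASK;
    }

    /** Assumed split: upper 5 of the 10 worker bits are the datacenterId. */
    static long datacenterIdOf(long id) {
        return workerIdOf(id) >>> 5;
    }

    /** Assumed split: lower 5 of the 10 worker bits are the per-datacenter workerId. */
    static long machineIdOf(long id) {
        return workerIdOf(id) & 0x1F;
    }

    private static void requireInRange(long value, long max, String name) {
        if (value < 0 || value > max) {
            throw new IllegalArgumentException(name + " out of range: " + value);
        }
    }
}
```

Because the timestamp occupies the highest usable bits, comparing two IDs from the same worker as plain numbers orders them by generation time first and by sequence second.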
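How sharding-jdbc handles it
In sharding-jdbc 4.x, the default distributed primary-key generator uses the snowflake algorithm and addresses problems such as clock rollback.
Clock rollback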
org.apache.shardingsphere.core.strategy.keygen.SnowflakeShardingKeyGenerator
/**
 * A server clock rollback can produce duplicate keys, so the default distributed key generator
 * provides a maximum tolerated clock rollback in milliseconds.
 * If the clock moves back by more than that threshold, the generator throws an error;
 * if the rollback is within the tolerated range, the generator waits until the clock catches up
 * with the time of the last generated key and then continues.
 */
@Override
public synchronized Comparable<?> generateKey() {
    long currentMilliseconds = timeService.getCurrentMillis();
    if (waitTolerateTimeDifferenceIfNeed(currentMilliseconds)) {
        currentMilliseconds = timeService.getCurrentMillis();
    }
    if (lastMilliseconds == currentMilliseconds) {
        // At most 4096 IDs per millisecond; once they are used up, wait for the next millisecond
        if (0L == (sequence = (sequence + 1) & SEQUENCE_MASK)) {
            currentMilliseconds = waitUntilNextTime(currentMilliseconds);
        }
    } else {
        // Set the starting sequence value for this millisecond
        vibrateSequenceOffset();
        sequence = sequenceOffset;
    }
    lastMilliseconds = currentMilliseconds;
    return ((currentMilliseconds - EPOCH) << TIMESTAMP_LEFT_SHIFT_BITS) | (getWorkerId() << WORKER_ID_LEFT_SHIFT_BITS) | sequence;
}

@SneakyThrows
private boolean waitTolerateTimeDifferenceIfNeed(final long currentMilliseconds) {
    if (lastMilliseconds <= currentMilliseconds) {
        return false;
    }
    long timeDifferenceMilliseconds = lastMilliseconds - currentMilliseconds;
    // If the clock moved back by more than the tolerated threshold, fail
    Preconditions.checkState(timeDifferenceMilliseconds < getMaxTolerateTimeDifferenceMilliseconds(),
            "Clock is moving backwards, last time is %d milliseconds, current time is %d milliseconds", lastMilliseconds, currentMilliseconds);
    // Within the tolerated range, wait until the clock catches up with the last generated key
    Thread.sleep(timeDifferenceMilliseconds);
    return true;
}
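To try the rollback behaviour in isolation, here is a minimal standalone sketch with an injectable clock. It is not the sharding-jdbc class: `TolerantClockIdGenerator`, its constructor parameters and the `LongSupplier` clock are assumptions made for illustration. Unlike the code above, it keeps waiting in a loop until the clock has caught up, and it always starts the sequence at 0.

```java
import java.util.function.LongSupplier;

/** Hypothetical snowflake-style generator that tolerates small clock rollbacks. */
final class TolerantClockIdGenerator {
    private static final long SEQUENCE_BITS = 12L;
    private static final long WORKER_ID_BITS = 10L;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final LongSupplier clock;     // returns the current time in milliseconds
    private final long epoch;             // custom epoch in milliseconds
    private final long workerId;          // 0..1023
    private final long maxTolerateMillis; // rollbacks of this size or larger are rejected

    private long lastMillis = -1L;
    private long sequence;

    TolerantClockIdGenerator(LongSupplier clock, long epoch, long workerId, long maxTolerateMillis) {
        if (workerId < 0 || workerId >= (1L << WORKER_ID_BITS)) {
            throw new IllegalArgumentException("workerId out of range: " + workerId);
        }
        this.clock = clock;
        this.epoch = epoch;
        this.workerId = workerId;
        this.maxTolerateMillis = maxTolerateMillis;
    }

    synchronized long nextId() {
        long now = clock.getAsLong();
        // Clock moved backwards: fail on a large rollback, otherwise wait for it to catch up.
        while (now < lastMillis) {
            long drift = lastMillis - now;
            if (drift >= maxTolerateMillis) {
                throw new IllegalStateException("Clock moved backwards by " + drift + " ms");
            }
            sleepQuietly(drift);
            now = clock.getAsLong();
        }
        if (now == lastMillis) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0L) {
                // All 4096 values of this millisecond are used: spin until the next one.
                do {
                    now = clock.getAsLong();
                } while (now <= lastMillis);
            }
        } else {
            sequence = 0L;
        }
        lastMillis = now;
        return ((now - epoch) << (SEQUENCE_BITS + WORKER_ID_BITS)) | (workerId << SEQUENCE_BITS) | sequence;
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for the clock", e);
        }
    }
}
```

In production the clock would typically be `System::currentTimeMillis`. Passing it in makes it possible to replay a rollback deterministically in a test.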
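Distributing data more evenly (keys that are not always even)
The last bits of a snowflake ID are the sequence, a counter that increments for requests within the same millisecond. If concurrency within a millisecond is low, those bits are very likely to be zero, so applications with low concurrency generate even primary keys more often. sharding-jdbc therefore varies the starting sequence value of each millisecond: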
// Set the starting sequence value for each millisecond
private void vibrateSequenceOffset() {
    sequenceOffset = sequenceOffset >= getMaxVibrationOffset() ? 0 : sequenceOffset + 1;
}

// max.vibration.offset is related to the number of nodes; if the sharding strategy takes the key modulo the node count, the node count is the recommended value
private int getMaxVibrationOffset() {
    int result = Integer.parseInt(properties.getProperty("max.vibration.offset", String.valueOf(DEFAULT_VIBRATION_VALUE)));
    Preconditions.checkArgument(result >= 0 && result <= SEQUENCE_MASK, "Illegal max vibration offset");
    return result;
}
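The effect of the vibrating offset can be seen in a small simulation. This sketch assumes one key per millisecond, workerId 0, and sharding by `id % shards`; `VibrationDemo` and its methods are hypothetical names. The demo passes `shards - 1` as the maximum offset only so that the offset cycle has exactly as many values as there are shards.

```java
/** Hypothetical simulation of low-concurrency key generation with a vibrating sequence offset. */
final class VibrationDemo {
    private VibrationDemo() { }

    /** Same rule as vibrateSequenceOffset(): cycles 0, 1, ..., maxVibrationOffset, 0, ... */
    static int nextOffset(int current, int maxVibrationOffset) {
        return current >= maxVibrationOffset ? 0 : current + 1;
    }

    /** Generates one key in each of n consecutive milliseconds and counts keys per shard. */
    static int[] distribute(int n, int shards, int maxVibrationOffset) {
        int[] counts = new int[shards];
        int offset = 0;
        for (int millis = 0; millis < n; millis++) {
            offset = nextOffset(offset, maxVibrationOffset);
            // timestamp in the high bits, workerId 0, sequence starts at the current offset
            long id = ((long) millis << 22) | offset;
            counts[(int) (id % shards)]++;
        }
        return counts;
    }
}
```

With `distribute(1000, 2, 0)` every key is even and lands on shard 0. With `distribute(1000, 2, 1)` the starting sequence alternates between 1 and 0, so the two shards receive 500 keys each.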
How the workerId is obtained
By default the workerId is read from the configuration properties:
private long getWorkerId() {
    long result = Long.valueOf(properties.getProperty("worker.id", String.valueOf(WORKER_ID)));
    Preconditions.checkArgument(result >= 0L && result < WORKER_ID_MAX_VALUE);
    return result;
}
It can also be passed as a JVM startup parameter with the -D option:
-DworkerId=value
System.getProperty("workerId");
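The two sources can be combined in one lookup. The sketch below is an assumption about how one might wire them together, not sharding-jdbc code: `WorkerIdResolver` is a hypothetical class, and giving the `-DworkerId` system property precedence over the `worker.id` configuration entry is a choice made here for illustration.

```java
import java.util.Properties;

/** Hypothetical helper: resolves the workerId from -DworkerId first, then from worker.id. */
final class WorkerIdResolver {
    static final long WORKER_ID_MAX_VALUE = 1L << 10; // 1024, exclusive upper bound for 10 bits

    private WorkerIdResolver() { }

    static long resolve(Properties config) {
        // Assumed precedence: the JVM option wins over the configuration file.
        String raw = System.getProperty("workerId");
        if (raw == null) {
            raw = config.getProperty("worker.id", "0");
        }
        long workerId = Long.parseLong(raw.trim());
        if (workerId < 0L || workerId >= WORKER_ID_MAX_VALUE) {
            throw new IllegalArgumentException("workerId must be in [0, 1024): " + workerId);
        }
        return workerId;
    }
}
```

Validating the range at startup matters because a value that does not fit into 10 bits would spill into the timestamp bits when the ID is assembled.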
How Leaf handles it
Before version 5.0.0, ShardingSphere borrowed Leaf's primary-key ID generation schemes.
Leaf offers two schemes, Leaf-segment and Leaf-snowflake. ShardingSphere implemented Leaf-segment in 4.0.0-RC2-release and Leaf-snowflake in 4.0.0-RC3-release.
Managing the workerId with ZooKeeper persistent nodes
com.sankuai.inf.leaf.snowflake.SnowflakeIDGenImpl
com.sankuai.inf.leaf.snowflake.SnowflakeZookeeperHolder
public boolean init() {
    try {
        CuratorFramework curator = createWithOptions(connectionString, new RetryUntilElapsed(1000, 4), 10000, 6000);
        curator.start();
        Stat stat = curator.checkExists().forPath(PATH_FOREVER);
        if (stat == null) {
            // The root node does not exist, so this machine is starting for the first time: create /snowflake/ip:port-000000000 and upload data
            zk_AddressNode = createNode(curator);
            // workerID defaults to 0
            updateLocalWorkerID(workerID);
            // Periodically report the local time to the forever node
            ScheduledUploadData(curator, zk_AddressNode);
            return true;
        } else {
            Map<String, Integer> nodeMap = Maps.newHashMap(); // ip:port -> 00001
            Map<String, String> realNode = Maps.newHashMap(); // ip:port -> (ipport-000001)
            // The root node exists: first check whether this endpoint already has its own node
            List<String> keys = curator.getChildren().forPath(PATH_FOREVER);
            for (String key : keys) {
                String[] nodeKey = key.split("-");
                realNode.put(nodeKey[0], key);
                nodeMap.put(nodeKey[0], Integer.parseInt(nodeKey[1]));
            }
            Integer workerid = nodeMap.get(listenAddress);
            if (workerid != null) {
                // This endpoint has its own node, zk_AddressNode=ip:port
                zk_AddressNode = PATH_FOREVER + "/" + realNode.get(listenAddress);
                workerID = workerid; // used when the worker starts
                if (!checkInitTimeStamp(curator, zk_AddressNode)) {
                    throw new CheckLastTimeException("init timestamp check error,forever node timestamp gt this node time");
                }
                // Prepare to create the ephemeral node
                doService(curator);
                updateLocalWorkerID(workerID);
                LOGGER.info("[Old NODE]find forever node have this endpoint ip-{} port-{} workid-{} childnode and start SUCCESS", ip, port, workerID);
            } else {
                // A newly started node: create a persistent node, no time check needed
                String newNode = createNode(curator);
                zk_AddressNode = newNode;
                String[] nodeKey = newNode.split("-");
                workerID = Integer.parseInt(nodeKey[1]);
                doService(curator);
                updateLocalWorkerID(workerID);
                LOGGER.info("[New NODE]can not find node on forever node that endpoint ip-{} port-{} workid-{},create own node on forever node and start SUCCESS ", ip, port, workerID);
            }
        }
    } catch (Exception e) {
        LOGGER.error("Start node ERROR {}", e);
        try {
            Properties properties = new Properties();
            properties.load(new FileInputStream(new File(PROP_PATH.replace("{port}", port + ""))));
            workerID = Integer.valueOf(properties.getProperty("workerID"));
            LOGGER.warn("START FAILED ,use local node file properties workerID-{}", workerID);
        } catch (Exception e1) {
            LOGGER.error("Read file error ", e1);
            return false;
        }
    }
    return true;
}
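The part of init() that turns child node names into workerIds can be tried without a ZooKeeper connection. The following sketch assumes child names of the form `ip:port-<zero-padded sequence>`, as in the comments above; `ForeverNodeNames` is a hypothetical helper, and it splits on the last hyphen rather than using `split("-")` so that a hyphen inside the endpoint part would not break parsing.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: maps "ip:port" endpoints to the workerId encoded in the node name. */
final class ForeverNodeNames {
    private ForeverNodeNames() { }

    static Map<String, Integer> workerIdsByEndpoint(List<String> childNames) {
        Map<String, Integer> result = new HashMap<>();
        for (String child : childNames) {
            int dash = child.lastIndexOf('-');
            if (dash <= 0 || dash == child.length() - 1) {
                continue; // not of the form endpoint-sequence, ignore it
            }
            String endpoint = child.substring(0, dash);
            // The sequence suffix is zero-padded; parseInt drops the padding.
            int workerId = Integer.parseInt(child.substring(dash + 1));
            result.put(endpoint, workerId);
        }
        return result;
    }
}
```

A restarted process looks up its own `ip:port` in this map. If an entry exists, it reuses that workerId; otherwise it creates a new sequential node and takes the sequence number it is given.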
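Caching the workerId locally: a weak dependency on ZooKeeper
Besides fetching data from ZooKeeper on every start, Leaf also caches a workerID file on the local file system. If ZooKeeper has a problem and the machine happens to need a restart at the same time, the service can still start normally. This makes the dependency on the third-party component a weak one and improves the SLA to some extent.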
Properties properties = new Properties();
properties.load(new FileInputStream(new File(PROP_PATH.replace("{port}", port + ""))));
workerID = Integer.valueOf(properties.getProperty("workerID"));
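The snippet above shows only the read side. A minimal sketch of both sides follows, assuming a properties file with a single `workerID` key as in the snippet; `WorkerIdFileCache` and the file location are hypothetical, and this is not Leaf's actual implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.OptionalInt;
import java.util.Properties;

/** Hypothetical local cache for the workerId, used when ZooKeeper is unreachable at startup. */
final class WorkerIdFileCache {
    private static final String KEY = "workerID";

    private WorkerIdFileCache() { }

    /** Writes the workerId obtained from ZooKeeper to a local properties file. */
    static void save(Path file, int workerId) {
        Properties properties = new Properties();
        properties.setProperty(KEY, Integer.toString(workerId));
        try {
            if (file.getParent() != null) {
                Files.createDirectories(file.getParent());
            }
            try (OutputStream out = Files.newOutputStream(file)) {
                properties.store(out, "cached snowflake workerID");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the cached workerId; returns empty when no cache file or no entry exists. */
    static OptionalInt load(Path file) {
        if (!Files.isRegularFile(file)) {
            return OptionalInt.empty();
        }
        Properties properties = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            properties.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = properties.getProperty(KEY);
        return raw == null ? OptionalInt.empty() : OptionalInt.of(Integer.parseInt(raw.trim()));
    }
}
```

Returning an empty `OptionalInt` instead of throwing lets the caller tell apart "never registered on this machine" from an unreadable file. Only the former should fail the startup when ZooKeeper is also down.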
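Appendix
1. How to plug a custom third-party ID generator into sharding-jdbc
See:
https://blog.csdn.net/it_freshman/article/details/106075291
2. Obtaining the workerId with sharding-jdbc-plugin
The sharding-jdbc-plugin module of sharding-jdbc provides three ways to obtain a workerId and exposes the generateKey() method for generating distributed unique IDs. See:
https://www.cnblogs.com/hongdada/p/9324473.html
3. Deploying Leaf in Docker: the problem of obtaining the IP
Leaf uses the IP and port to look up its workerId. It obtains the IP as follows: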
/**
 * Get the IP addresses of the active network interfaces.
 *
 * @param interfaceName optional network interface name; if null, addresses of all interfaces are returned
 * @return List<String>
 */
private static List<String> getHostAddress(String interfaceName) throws SocketException {
    List<String> ipList = new ArrayList<String>(5);
    Enumeration<NetworkInterface> interfaces = NetworkInterface.getNetworkInterfaces();
    while (interfaces.hasMoreElements()) {
        NetworkInterface ni = interfaces.nextElement();
        Enumeration<InetAddress> allAddress = ni.getInetAddresses();
        while (allAddress.hasMoreElements()) {
            InetAddress address = allAddress.nextElement();
            if (address.isLoopbackAddress()) {
                // skip the loopback addr
                continue;
            }
            if (address instanceof Inet6Address) {
                // skip the IPv6 addr
                continue;
            }
            String hostAddress = address.getHostAddress();
            if (null == interfaceName) {
                ipList.add(hostAddress);
            } else if (interfaceName.equals(ni.getDisplayName())) {
                ipList.add(hostAddress);
            }
        }
    }
    return ipList;
}
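However, when Leaf is deployed in Docker, the IP address it obtains may belong to the docker0 interface, and that address can change after Docker restarts. One option is to follow the way Eureka picks the IP address to register, shown below: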
spring-cloud-commons
org.springframework.cloud.commons.util.InetUtils#findFirstNonLoopbackAddress
public InetAddress findFirstNonLoopbackAddress() {
    InetAddress result = null;
    try {
        int lowest = Integer.MAX_VALUE;
        for (Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces(); nics
                .hasMoreElements();) {
            NetworkInterface ifc = nics.nextElement();
            if (ifc.isUp()) {
                this.log.trace("Testing interface: " + ifc.getDisplayName());
                if (ifc.getIndex() < lowest || result == null) {
                    lowest = ifc.getIndex();
                }
                else if (result != null) {
                    continue;
                }
                // @formatter:off
                if (!ignoreInterface(ifc.getDisplayName())) {
                    for (Enumeration<InetAddress> addrs = ifc
                            .getInetAddresses(); addrs.hasMoreElements();) {
                        InetAddress address = addrs.nextElement();
                        if (address instanceof Inet4Address
                                && !address.isLoopbackAddress()
                                && isPreferredAddress(address)) {
                            this.log.trace("Found non-loopback interface: "
                                    + ifc.getDisplayName());
                            result = address;
                        }
                    }
                }
                // @formatter:on
            }
        }
    }
    catch (IOException ex) {
        this.log.error("Cannot get first non-loopback address", ex);
    }
    if (result != null) {
        return result;
    }
    try {
        return InetAddress.getLocalHost();
    }
    catch (UnknownHostException e) {
        this.log.warn("Unable to retrieve localhost");
    }
    return null;
}
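Outside of Spring, the same idea (skip interfaces such as docker0 by name, then prefer the interface with the lowest index) can be sketched with the JDK alone. `StableIpPicker` and the ignore patterns are hypothetical and only illustrate the approach. The sketch matches against `NetworkInterface.getName()`, whereas the Spring code above uses `getDisplayName()`.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.Enumeration;
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

/** Hypothetical helper: picks an IPv4 address while skipping interfaces ignored by name. */
final class StableIpPicker {
    private StableIpPicker() { }

    static boolean isIgnored(String interfaceName, List<Pattern> ignoredInterfaces) {
        for (Pattern pattern : ignoredInterfaces) {
            if (pattern.matcher(interfaceName).matches()) {
                return true;
            }
        }
        return false;
    }

    /** Returns an IPv4, non-loopback address from the lowest-index usable interface, if any. */
    static Optional<InetAddress> pick(List<Pattern> ignoredInterfaces) {
        InetAddress best = null;
        int lowestIndex = Integer.MAX_VALUE;
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            while (nics != null && nics.hasMoreElements()) {
                NetworkInterface nic = nics.nextElement();
                if (!nic.isUp() || nic.isLoopback() || isIgnored(nic.getName(), ignoredInterfaces)) {
                    continue;
                }
                if (best != null && nic.getIndex() >= lowestIndex) {
                    continue; // an interface with a lower index already supplied an address
                }
                Enumeration<InetAddress> addresses = nic.getInetAddresses();
                while (addresses.hasMoreElements()) {
                    InetAddress address = addresses.nextElement();
                    if (address instanceof Inet4Address && !address.isLoopbackAddress()) {
                        best = address;
                        lowestIndex = nic.getIndex();
                        break;
                    }
                }
            }
        } catch (SocketException e) {
            return Optional.empty();
        }
        return Optional.ofNullable(best);
    }
}
```

A caller might pass patterns such as `docker.*` and `veth.*`. Which names need to be ignored depends on the host and the container network mode, so the list is best kept configurable.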
References
https://shardingsphere.apache.org/document/legacy/4.x/document/cn/features/sharding/other-features/key-generator/
https://tech.meituan.com/2019/03/07/open-source-project-leaf.html
https://tech.meituan.com/2017/04/21/mt-leaf.html
https://juejin.im/post/6844904153144098823