ULID规范解读与实现原理

前提
最近发现各个频道推荐了很多​​ULID​​​相关文章,这里对​​ULID​​​的规范文件进行解读,并且基于​​Java​​​语言自行实现​​ULID​​​,通过此实现过程展示​​ULID​​的底层原理。

笔者尝试把ULID基于Java实现的代码工程化,见仓库​ ​ulid4j​​,欢迎star或者提交PR

ULID出现的背景

​​ULID​​​全称是​​Universally Unique Lexicographically Sortable Identifier​​,直译过来就是通用唯一按字典排序的标识符,它的原始仓库是​​https://github.com/ulid/javascript​​​,该项目由前端开发者​ ​alizain​​​发起,基于​​JavaScript​​​语言编写。从项目中的​​commit​​​历史来看已经超过了​​5​​年,理论上得到充分的实践验证。​​ULID​​​出现的原因是一些开发者认为主流的​​UUID​​方案在许多场景下可能不是最优的,存在下面的原因:

​​UUID​​​不是​​128 bit​​​随机编码(由​​128 bit​​随机数通过编码生成字符串)的最高效实现方式
​​UUID​​​的​​v1/v2​​​实现在许多环境中是不切实际的,因为这两个版本的的实现需要访问唯一的、稳定的​​MAC​​地址
​​UUID​​​的​​v3/v5​​​实现需要唯一的种子,并且产生随机分布的​​ID​​,这可能会导致在许多数据结构中出现碎片
​​UUID​​​的​​v4​​除了随机性之外不需要提供其他信息,随机性可能会在许多数据结构中导致碎片
这里概括一下就是:​​UUID​​​的​​v1/v2​​​实现依赖唯一稳定​​MAC​​​地址不现实,​​v3/v4/v5​​​实现因为随机性产生的​​ID​​会"碎片化"。

基于此提出了​​ULID​​,它用起来像这样:

ulid() // 01ARZ3NDEKTSV4RRFFQ69G5FAV

​​ULID​​的特点如下:

设计为​​128 bit​​​大小,与​​UUID​​兼容
每毫秒生成​​1.21e+24​​​个唯一的​​ULID​​(高性能)
按字典顺序(字母顺序)排序
标准编码为​​26​​​个字符的字符串,而不是像​​UUID​​​那样需要​​36​​个字符
使用​​Crockford​​​的​​base32​​​算法来提高效率和可读性(每个字符​​5 bit​​)
不区分大小写
没有特殊字符串(​​URL​​​安全,不需要进行二次​​URL​​编码)
单调排序(正确地检测并处理相同的毫秒,所谓单调性,就是毫秒数相同的情况下,能够确保新的​​ULID​​​随机部分的在最低有效位上加​​1​​位)
 

ULID规范

下面的​​ULID​​​规范在​​ULID/javascript​​​类库中实现,此二进制格式目前没有在​​JavaScript​​中实现:

01AN4Z07BY      79KA1307SR9X4MV3
|----------|    |----------------|
 Timestamp          Randomness
   48bits             80bits

组成


时间戳(​​Timestamp​​)

占据​​48 bit​​​(​​high​​)
本质是​​UNIX-time​​,单位为毫秒
直到公元​​10889​​年才会用完


随机数(​​Randomness​​)

占据​​80 bit​​​(​​low​​)
如果可能的话,使用加密安全的随机源


排序
"最左边"的字符必须排在最前面,"最右边"的字符排在最后(词法顺序,或者俗称的字典排序),并且所有字符必须使用默认的​​ASCII​​字符集。在相同的毫秒(时间戳)内,无法保证排序顺序。
 

规范的表示形式
​​ULID​​规范的字符串表示形式如下:

ttttttttttrrrrrrrrrrrrrrrr

where
t is Timestamp (10 characters)
r is Randomness (16 characters)

也就是:

时间戳占据高(左边)​​10​​个(编码后的)字符
随机数占据低(右边)​​16​​个(编码后的)字符


​​ULID​​规范的字符串表示形式的长度是确定的,共占据​​26​​个字符。

编码
使用​​Crockford Base32​​编码算法,这个编码算法的字母表如下:

0123456789ABCDEFGHJKMNPQRSTVWXYZ

该字母表排除了​​I​​​、 ​​L​​​、​​O​​​、​​U​​​字母,目的是避免混淆和滥用。此算法实现不难,它的官网有详细的算法说明(见​​https://www.crockford.com/base32.html​​):

单调性
(如果启用了单调性这个特性为前提下)当在相同的毫秒内生成多个​​ULID​​​时,可以保证排序的顺序。也就是说,如果检测到相同的毫秒,则随机分量在最低有效位上加​​1​​位(带进位)。例如:

monotonicUlid()  // 01BX5ZZKBKACTAV9WEVGEMMVRZ
monotonicUlid()  // 01BX5ZZKBKACTAV9WEVGEMMVS0

溢出错误处理
从技术实现上来看,​​26​​​个字符的​​Base32​​​编码字符串可以包含​​130 bit​​​信息,而​​ULID​​​只包含​​128 bit​​​信息,所以该编码算法是能完全满足​​ULID​​​的需要。基于​​Base32​​​编码能够生成的最大的合法​​ULID​​​其实就是​​7ZZZZZZZZZZZZZZZZZZZZZZZZZ​​​,并且使用的时间戳为​​epoch time​​​的​​281474976710655​​​或者说​​2 ^ 48 - 1​​​。对于任何对大于此值的​​ULID​​进行解码或编码的尝试都应该被所有实现拒绝,以防止溢出错误。
-----------------------------------
©著作权归作者所有:来自51CTO博客作者Throwable的原创作品,请联系作者获取转载授权,否则将追究法律责任
ULID规范解读与实现原理
https://blog.51cto.com/throwable/5788487

二进制布局
二进制布局的多个部分被编码为​​16 byte​​​,每个部分都以最高字节优先(网络字节序,也就是​​big-endian​​)进行编码,布局如下:

0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_time_high                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     16_bit_uint_time_low      |       16_bit_uint_random      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       32_bit_uint_random                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       32_bit_uint_random                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

ULID使用
对于​​script​​标签引用:

<script src="https://unpkg.com/ulid@latest/dist/index.umd.js"></script>
<script>
    ULID.ulid()
</script>

NPM​​安装:

npm install --save ulid

​TypeScript​​​, ​​ES6+​​​, ​​Babel​​​, ​​Webpack​​​, ​​Rollup​​等等下使用:

// import
import { ulid } from 'ulid'

ulid()

// CommonJS env
const ULID = require('ulid')

ULID.ulid()

后端​​Maven​​​项目中使用需要引入依赖,这里选用​​ulid-creator​​实现:

<dependency>
  <groupId>com.github.f4b6a3</groupId>
  <artifactId>ulid-creator</artifactId>
  <version>5.0.2</version>
</dependency>

然后调用​​UlidCreator#getUlid()​​系列方法:

// 常规
Ulid ulid = UlidCreator.getUlid();

// 单调排序
Ulid ulid = UlidCreator.getMonotonicUlid();

实现ULID
前面已经提到​​ULID​​​的规范,其实具体实现​​ULID​​​就是对着规范里面的每一个小节进行编码实现。先看二进制布局,由于使用​​128 bit​​​去存储,可以借鉴​​UUID​​​那样,使用两个​​long​​​类似的成员变量存储​​ULID​​的信息,看起来像这样:

public final class ULID {

    /*
     * The most significant 64 bits of this ULID.
     *
     */
    private final long msb;

    /*
     * The least significant 64 bits of this ULID.
     *
     */
    private final long lsb;

    public ULID(long msb, long lsb) {
        this.msb = msb;
        this.lsb = lsb;
    }
}

按照​​ULID​​的组成来看,可以提供一个入参为时间戳和随机数字节数组的构造:

public ULID(long timestamp, byte[] randomness) {
    if ((timestamp & TIMESTAMP_MASK) != 0) {
        throw new IllegalArgumentException("Invalid timestamp");
    }
    if (Objects.isNull(randomness) || RANDOMNESS_BYTE_LEN != randomness.length) {
        throw new IllegalArgumentException("Invalid randomness");
    }
    long msb = 0;
    long lsb = 0;
    // 时间戳左移16位,低位补零准备填入部分随机数位,即16_bit_uint_random
    msb |= timestamp << 16;
    // randomness[0]左移0位填充到16_bit_uint_random的高8位,randomness[1]填充到16_bit_uint_random的低8位
    msb |= (long) (randomness[0x0] & 0xff) << 8;
    // randomness[1]填充到16_bit_uint_random的低8位
    msb |= randomness[0x1] & 0xff;
    // randomness[2] ~ randomness[9]填充到剩余的bit_uint_random中,要左移相应的位
    lsb |= (long) (randomness[0x2] & 0xff) << 56;
    lsb |= (long) (randomness[0x3] & 0xff) << 48;
    lsb |= (long) (randomness[0x4] & 0xff) << 40;
    lsb |= (long) (randomness[0x5] & 0xff) << 32;
    lsb |= (long) (randomness[0x6] & 0xff) << 24;
    lsb |= (long) (randomness[0x7] & 0xff) << 16;
    lsb |= (long) (randomness[0x8] & 0xff) << 8;
    lsb |= (randomness[0x9] & 0xff);
    this.msb = msb;
    this.lsb = lsb;
}

这是完全按照规范的二进制布局编写代码,可以像​​UUID​​的构造那样精简一下:

long msb = 0;
long lsb = 0;
byte[] data = new byte[16];
byte[] ts = ByteBuffer.allocate(8).putLong(0, timestamp << 16).array();
System.arraycopy(ts, 0, data, 0, 6);
System.arraycopy(randomness, 0, data, 6, 10);
for (int i = 0; i < 8; i++)
    msb = (msb << 8) | (data[i] & 0xff);
for (int i = 8; i < 16; i++)
    lsb = (lsb << 8) | (data[i] & 0xff);

接着可以简单添加下面几个方法:

public long getMostSignificantBits() {
    return this.msb;
}

public long getLeastSignificantBits() {
    return this.lsb;
}

// 静态工厂方法,由UUID实例生成ULID实例
public static ULID fromUUID(UUID uuid) {
    return new ULID(uuid.getMostSignificantBits(), uuid.getLeastSignificantBits());
}

// 实例方法,当前ULID实例转换为UUID实例
public UUID toUUID() {
    return new UUID(this.msb, this.lsb);
}

接着需要覆盖​​toString()​​​方法,这是​​ULID​​​的核心方法,需要通过​​Crockford Base32​​​编码生成规范的字符串表示形式。由于​​128 bit​​​要映射为​​26 char​​​,这里可以考虑分三段进行映射,也就是​​48 bit​​​时间戳映射为​​10 char​​​,剩下的两部分随机数分别做​​40 bit​​​到​​8 char​​​的映射,加起来就是​​26 char​​:
 

|----------|      |----------------|
  Timestamp     Randomness[split to 2 part]
48bit => 10char    80bit => 16char

编写方法:

/**
  * Default alphabet of ULID
  */
private static final char[] DEFAULT_ALPHABET = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C',
        'D', 'E', 'F', 'G', 'H', 'J', 'K', 'M', 'N', 'P', 'Q', 'R', 'S', 'T', 'V', 'W', 'X', 'Y', 'Z'};

/**
  * Default alphabet mask
  */
private static final int DEFAULT_ALPHABET_MASK = 0b11111;

/**
  * Character num of ULID
  */
private static final int ULID_CHAR_LEN = 0x1a;

@Override
public String toString() {
    return toCanonicalString(DEFAULT_ALPHABET);
}

public String toCanonicalString(char[] alphabet) {
    char[] chars = new char[ULID_CHAR_LEN];
    long timestamp = this.msb >> 16;
    // 第一部分随机数取msb的低16位+lsb的高24位,这里(msb & 0xffff) << 24作为第一部分随机数的高16位,所以要左移24位
    long randMost = ((this.msb & 0xffffL) << 24) | (this.lsb >>> 40);
    // 第二部分随机数取lsb的低40位,0xffffffffffL是2^40-1
    long randLeast = (this.lsb & 0xffffffffffL);
    // 接着每个部分的偏移量和DEFAULT_ALPHABET_MASK(31)做一次或运算就行,就是char[index] = alphabet[(part >> (step * index)) & 31]
    chars[0x00] = alphabet[(int) (timestamp >>> 45 & DEFAULT_ALPHABET_MASK)];
    chars[0x01] = alphabet[(int) (timestamp >>> 40 & DEFAULT_ALPHABET_MASK)];
    chars[0x02] = alphabet[(int) (timestamp >>> 35 & DEFAULT_ALPHABET_MASK)];
    chars[0x03] = alphabet[(int) (timestamp >>> 30 & DEFAULT_ALPHABET_MASK)];
    chars[0x04] = alphabet[(int) (timestamp >>> 25 & DEFAULT_ALPHABET_MASK)];
    chars[0x05] = alphabet[(int) (timestamp >>> 20 & DEFAULT_ALPHABET_MASK)];
    chars[0x06] = alphabet[(int) (timestamp >>> 15 & DEFAULT_ALPHABET_MASK)];
    chars[0x07] = alphabet[(int) (timestamp >>> 10 & DEFAULT_ALPHABET_MASK)];
    chars[0x08] = alphabet[(int) (timestamp >>> 5 & DEFAULT_ALPHABET_MASK)];
    chars[0x09] = alphabet[(int) (timestamp & DEFAULT_ALPHABET_MASK)];
    chars[0x0a] = alphabet[(int) (randMost >>> 35 & DEFAULT_ALPHABET_MASK)];
    chars[0x0b] = alphabet[(int) (randMost >>> 30 & DEFAULT_ALPHABET_MASK)];
    chars[0x0c] = alphabet[(int) (randMost >>> 25 & DEFAULT_ALPHABET_MASK)];
    chars[0x0d] = alphabet[(int) (randMost >>> 20 & DEFAULT_ALPHABET_MASK)];
    chars[0x0e] = alphabet[(int) (randMost >>> 15 & DEFAULT_ALPHABET_MASK)];
    chars[0x0f] = alphabet[(int) (randMost >>> 10 & DEFAULT_ALPHABET_MASK)];
    chars[0x10] = alphabet[(int) (randMost >>> 5 & DEFAULT_ALPHABET_MASK)];
    chars[0x11] = alphabet[(int) (randMost & DEFAULT_ALPHABET_MASK)];
    chars[0x12] = alphabet[(int) (randLeast >>> 35 & DEFAULT_ALPHABET_MASK)];
    chars[0x13] = alphabet[(int) (randLeast >>> 30 & DEFAULT_ALPHABET_MASK)];
    chars[0x14] = alphabet[(int) (randLeast >>> 25 & DEFAULT_ALPHABET_MASK)];
    chars[0x15] = alphabet[(int) (randLeast >>> 20 & DEFAULT_ALPHABET_MASK)];
    chars[0x16] = alphabet[(int) (randLeast >>> 15 & DEFAULT_ALPHABET_MASK)];
    chars[0x17] = alphabet[(int) (randLeast >>> 10 & DEFAULT_ALPHABET_MASK)];
    chars[0x18] = alphabet[(int) (randLeast >>> 5 & DEFAULT_ALPHABET_MASK)];
    chars[0x19] = alphabet[(int) (randLeast & DEFAULT_ALPHABET_MASK)];
    return new String(chars);
}

上面的方法​​toCanonicalString()​​​看起来很"臃肿",但是能保证性能比较高,实现思路来自于​​Long#fastUUID()​​​,也就是​​UUID​​​的五段格式化方法。借鉴并且简化一下可以抽取一个​​toCanonicalString0()​​方法:

public String toCanonicalString0() {
    byte[] bytes = new byte[ULID_CHAR_LEN];
    formatUnsignedLong0(this.lsb & 0xffffffffffL, 5, bytes, 18, 8);
    formatUnsignedLong0(((this.msb & 0xffffL) << 24) | (this.lsb >>> 40), 5, bytes, 10, 8);
    formatUnsignedLong0(this.msb >> 16, 5, bytes, 0, 10);
    return new String(bytes, StandardCharsets.US_ASCII);
}

private static void formatUnsignedLong0(long val, int shift, byte[] buf, int offset, int len) {
    int charPos = offset + len;
    long radix = 1L << shift;
    long mask = radix - 1;
    do {
        buf[--charPos] = (byte) DEFAULT_ALPHABET[(int) (val & mask)];
        val >>>= shift;
    } while (charPos > offset);
}

​toCanonicalString0()​​​方法和​​toString()​​​方法会得到相同的​​ULID​​格式化字符串。接着添加常用的工厂方法:

public static ULID ulid() {
    return ulid(System::currentTimeMillis, len -> {
        byte[] bytes = new byte[len];
        ThreadLocalRandom.current().nextBytes(bytes);
        return bytes;
    });
}

public static ULID ulid(Supplier<Long> timestampProvider,
                        IntFunction<byte[]> randomnessProvider) {
    return new ULID(timestampProvider.get(), randomnessProvider.apply(RANDOMNESS_BYTE_LEN));
}

默认使用​​ThreadLocalRandom​​​生成随机数,如果是​​JDK17​​​以上,还可以选用更高性能的新型​​PRNG​​​实现,对应接口是​​RandomGenerator​​​,默认实现是​​L32X64MixRandom​​​。编写一个​​main​​方法运行一下:

public static void main(String[] args) {
    System.out.println(ULID.ulid());
}
// 某次执行结果
01GFGGMBFGB5WKXBN7S84ATRDG

最后实现"单调递增"的​​ULID​​构造,先提供一个"增长"方法:

/**
  * The least significant 64 bits increase overflow, 0xffffffffffffffffL + 1
  */
private static final long OVERFLOW = 0x0000000000000000L;

public ULID increment() {
    long msb = this.msb;
    long lsb = this.lsb + 1;
    if (lsb == OVERFLOW) {
        msb += 1;
    }
    return new ULID(msb, lsb);
}

其实就是低位加​​1​​​,溢出后高位加​​1​​。接着添加一个静态内部子类和响应方法如下:

// 构造函数
public ULID(ULID other) {
    this.msb = other.msb;
    this.lsb = other.lsb;
}


public static byte[] defaultRandomBytes(int len) {
    byte[] bytes = new byte[len];
    ThreadLocalRandom.current().nextBytes(bytes);
    return bytes;
}

public static MonotonicULIDSpi monotonicUlid() {
    return monotonicUlid(System::currentTimeMillis, ULID::defaultRandomBytes);
}

public static MonotonicULIDSpi monotonicUlid(Supplier<Long> timestampProvider,
                                              IntFunction<byte[]> randomnessProvider) {
    return new MonotonicULID(timestampProvider, randomnessProvider, timestampProvider.get(),
            randomnessProvider.apply(RANDOMNESS_BYTE_LEN));
}

// @SPI MonotonicULID
public interface MonotonicULIDSpi {

  ULID next();
}

private static class MonotonicULID extends ULID implements MonotonicULIDSpi {

    @Serial
    private static final long serialVersionUID = -9158161806889605101L;

    private volatile ULID lastULID;

    private final Supplier<Long> timestampProvider;

    private final IntFunction<byte[]> randomnessProvider;

    public MonotonicULID(Supplier<Long> timestampProvider,
                          IntFunction<byte[]> randomnessProvider,
                          long timestamp,
                          byte[] randomness) {
        super(timestamp, randomness);
        this.timestampProvider = timestampProvider;
        this.randomnessProvider = randomnessProvider;
        this.lastULID = new ULID(timestamp, randomness);
    }
    
    // 这里没设计好,子类缓存了上一个节点,需要重写一下increment方法,父类可以移除此方法 
    @Override
    public ULID increment() {
        long newMsb = lastULID.msb;
        long newLsb = lastULID.lsb + 1;
        if (newLsb == OVERFLOW) {
            newMsb += 1;
        }
        return new ULID(newMsb, newLsb);
    }

    @Override
    public synchronized ULID next() {
        long lastTimestamp = lastULID.getTimestamp();
        long timestamp = getTimestamp();
        // 这里做了一个恒为true的判断,后面再研读其他代码进行修改
        if (lastTimestamp >= timestamp || timestamp - lastTimestamp <= 1000) {
            this.lastULID = this.increment();
        } else {
            this.lastULID = new ULID(timestampProvider.get(), randomnessProvider.apply(RANDOMNESS_BYTE_LEN));
        }
        return new ULID(this.lastULID);
    }
}

关于上一个​​ULID​​​和下一个​​ULID​​​之间的时间戳判断,这里从规范文件没看出细节实现,先简单做一个永远为​​true​​的分支判断,后面再深入研究然后修改。使用方式如下:

public static void main(String[] args) {
    MonotonicULIDSpi spi = ULID.monotonicUlid();
    System.out.println(spi.next());
    System.out.println(spi.next());
    System.out.println(spi.next());
    System.out.println(spi.next());
}
// 某次运行输出
01GFGASXXQXD5ZJ26PKSCFGNPF
01GFGASXXQXD5ZJ26PKSCFGNPG
01GFGASXXQXD5ZJ26PKSCFGNPH
01GFGASXXQXD5ZJ26PKSCFGNPJ

这里为了更加灵活,没有采用全局静态属性缓存上一个​​ULID​​实例,而是简单使用继承方式实现。

ULID性能评估
引入​​JMH​​做了一个简单的性能测试,代码如下:

@Fork(1)
@Threads(10)
@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@Warmup(iterations = 1, time = 1)
@Measurement(iterations = 5, time = 3)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class BenchmarkRunner {

    private static ULID.MonotonicULIDSpi SPI;

    @Setup
    public void setup() {
        SPI = ULID.monotonicUlid();
    }


    @Benchmark
    public UUID createUUID() {
        return UUID.randomUUID();
    }

    @Benchmark
    public String createUUIDToString() {
        return UUID.randomUUID().toString();
    }

    @Benchmark
    public ULID createULID() {
        return ULID.ulid();
    }

    @Benchmark
    public String createULIDToString() {
        return ULID.ulid().toString();
    }
    
    @Benchmark
    public String createULIDToCanonicalString0() {
        return ULID.ulid().toCanonicalString0();
    }

    @Benchmark
    public ULID createMonotonicULID() {
        return SPI.next();
    }

    @Benchmark
    public String createMonotonicULIDToString() {
        return SPI.next().toString();
    }

    public static void main(String[] args) throws Exception {
        new Runner(new OptionsBuilder().build()).run();
    }
}

某次测试报告如下(开发环境​​Intel 6700K 4C8T 32G​​​,使用​​OpenJDK-19​​):

Benchmark                                      Mode  Cnt       Score      Error   Units
BenchmarkRunner.createMonotonicULID           thrpt    5   18529.565 ± 3432.113  ops/ms
BenchmarkRunner.createMonotonicULIDToString   thrpt    5   12308.443 ± 1729.675  ops/ms
BenchmarkRunner.createULID                    thrpt    5  122347.702 ± 3183.734  ops/ms
BenchmarkRunner.createULIDToCanonicalString0  thrpt    5   50848.135 ± 3699.334  ops/ms
BenchmarkRunner.createULIDToString            thrpt    5   37346.891 ± 1029.809  ops/ms
BenchmarkRunner.createUUID                    thrpt    5     806.134 ±  218.622  ops/ms
BenchmarkRunner.createUUIDToString            thrpt    5     813.380 ±   46.333  ops/ms

小结
本文就​​ULID​​​的规范进行解读,通过规范和参考现有类库进行​​ULID​​​的​​Java​​​实现。​​ULID​​​适用于一些"排序ID"生成或者需要"单调ID"生成的场景,可以考虑用于数据库键设计、顺序号设计等等场景。从实现上看它性能会优于​​UUID​​​(特别是单调​​ULID​​,因为不需要重新获取随机数部分,吞吐量会提升一个数量级)。

​​Demo​​项目仓库:

​​framework-mesh/ulid4j​​​:​​https://github.com/zjcscut/framework-mesh/tree/master/ulid4j​​
工程化项目仓库:

​​ulid4j​​​:​​https://github.com/zjcscut/ulid4j​​
参考资料:

​ ​ulid-spec​​
​ ​crockford-base32​​
​ ​ulid-creator​
 

----------------------------------------------------------------------

ulidj/ULID.java at master · azam/ulidj · GitHub

/**
 * MIT License
 * 
 * Copyright (c) 2016 Azamshul Azizy
 * 
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 * 
 * The above copyright notice and this permission notice shall be included in all
 * copies or substantial portions of the Software.
 * 
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */
package io.azam.ulidj;

import java.util.Random;

/**
 * ULID string generator and parser class, using Crockford Base32 encoding. Only
 * upper case letters are used for generation. Parsing allows upper and lower
 * case letters, and i and l will be treated as 1 and o will be treated as 0.
 * <br>
 * <br>
 * ULID generation examples:<br>
 * 
 * <pre>
 * String ulid1 = ULID.random();
 * String ulid2 = ULID.random(ThreadLocalRandom.current());
 * String ulid3 = ULID.random(SecureRandom.newInstance("SHA1PRNG"));
 * byte[] entropy = new byte[] { 0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0x9 };
 * String ulid4 = ULID.generate(System.currentTimeMillis(), entropy);
 * </pre>
 * 
 * ULID parsing examples:<br>
 * 
 * <pre>
 * String ulid = "003JZ9J6G80123456789abcdef";
 * assert ULID.isValid(ulid);
 * long ts = ULID.getTimestamp(ulid);
 * assert ts == 123456789000L;
 * byte[] entropy = ULID.getEntropy(ulid);
 * </pre>
 * 
 * @author azam
 * @since 0.0.1
 * 
 * @see <a href="http://www.crockford.com/wrmg/base32.html">Base32 Encoding</a>
 * @see <a href="https://github.com/alizain/ulid">ULID</a>
 */
public class ULID {
	/**
	 * ULID string length.
	 */
	public static final int ULID_LENGTH = 26;

	/**
	 * Minimum allowed timestamp value.
	 */
	public static final long MIN_TIME = 0x0L;

	/**
	 * Maximum allowed timestamp value.
	 */
	public static final long MAX_TIME = 0x0000ffffffffffffL;

	/**
	 * Base32 characters mapping
	 */
	private static final char[] C = new char[] { //
			0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, //
			0x38, 0x39, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, //
			0x47, 0x48, 0x4a, 0x4b, 0x4d, 0x4e, 0x50, 0x51, //
			0x52, 0x53, 0x54, 0x56, 0x57, 0x58, 0x59, 0x5a };

	/**
	 * {@code char} to {@code byte} O(1) mapping with alternative chars mapping
	 */
	private static final byte[] V = new byte[] { //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0x00, (byte) 0x01, (byte) 0x02, (byte) 0x03, //
			(byte) 0x04, (byte) 0x05, (byte) 0x06, (byte) 0x07, //
			(byte) 0x08, (byte) 0x09, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0x0a, (byte) 0x0b, (byte) 0x0c, //
			(byte) 0x0d, (byte) 0x0e, (byte) 0x0f, (byte) 0x10, //
			(byte) 0x11, (byte) 0xff, (byte) 0x12, (byte) 0x13, //
			(byte) 0xff, (byte) 0x14, (byte) 0x15, (byte) 0xff, //
			(byte) 0x16, (byte) 0x17, (byte) 0x18, (byte) 0x19, //
			(byte) 0x1a, (byte) 0xff, (byte) 0x1b, (byte) 0x1c, //
			(byte) 0x1d, (byte) 0x1e, (byte) 0x1f, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0x0a, (byte) 0x0b, (byte) 0x0c, //
			(byte) 0x0d, (byte) 0x0e, (byte) 0x0f, (byte) 0x10, //
			(byte) 0x11, (byte) 0xff, (byte) 0x12, (byte) 0x13, //
			(byte) 0xff, (byte) 0x14, (byte) 0x15, (byte) 0xff, //
			(byte) 0x16, (byte) 0x17, (byte) 0x18, (byte) 0x19, //
			(byte) 0x1a, (byte) 0xff, (byte) 0x1b, (byte) 0x1c, //
			(byte) 0x1d, (byte) 0x1e, (byte) 0x1f, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff, //
			(byte) 0xff, (byte) 0xff, (byte) 0xff, (byte) 0xff //
	};

	/**
	 * Generate random ULID string using {@link java.util.Random} instance.
	 * 
	 * @return ULID string
	 */
	public static String random() {
		byte[] entropy = new byte[10];
		Random random = new Random();
		random.nextBytes(entropy);
		return generate(System.currentTimeMillis(), entropy);
	}

	/**
	 * Generate random ULID string using provided {@link java.util.Random}
	 * instance.
	 * 
	 * @param random
	 *            {@link java.util.Random} instance
	 * @return ULID string
	 */
	public static String random(Random random) {
		byte[] entropy = new byte[10];
		random.nextBytes(entropy);
		return generate(System.currentTimeMillis(), entropy);
	}

	/**
	 * Generate ULID from Unix epoch timestamp in millisecond and entropy bytes.
	 * Throws {@link java.lang.IllegalArgumentException} if timestamp is less
	 * than {@value #MIN_TIME}, is more than {@value #MAX_TIME}, or entropy
	 * bytes is null or less than 10 bytes.
	 * 
	 * @param time
	 *            Unix epoch timestamp in millisecond
	 * @param entropy
	 *            Entropy bytes
	 * @return ULID string
	 */
	public static String generate(long time, byte[] entropy) {
		if (time < MIN_TIME || time > MAX_TIME || entropy == null || entropy.length < 10) {
			throw new IllegalArgumentException("Time is too long, or entropy is less than 10 bytes or null");
		}

		char[] chars = new char[26];

		// time
		chars[0] = C[((byte) (time >>> 45)) & 0x1f];
		chars[1] = C[((byte) (time >>> 40)) & 0x1f];
		chars[2] = C[((byte) (time >>> 35)) & 0x1f];
		chars[3] = C[((byte) (time >>> 30)) & 0x1f];
		chars[4] = C[((byte) (time >>> 25)) & 0x1f];
		chars[5] = C[((byte) (time >>> 20)) & 0x1f];
		chars[6] = C[((byte) (time >>> 15)) & 0x1f];
		chars[7] = C[((byte) (time >>> 10)) & 0x1f];
		chars[8] = C[((byte) (time >>> 5)) & 0x1f];
		chars[9] = C[((byte) (time)) & 0x1f];

		// entropy
		chars[10] = C[(byte) ((entropy[0] & 0xff) >>> 3)];
		chars[11] = C[(byte) (((entropy[0] << 2) | ((entropy[1] & 0xff) >>> 6)) & 0x1f)];
		chars[12] = C[(byte) (((entropy[1] & 0xff) >>> 1) & 0x1f)];
		chars[13] = C[(byte) (((entropy[1] << 4) | ((entropy[2] & 0xff) >>> 4)) & 0x1f)];
		chars[14] = C[(byte) (((entropy[2] << 1) | ((entropy[3] & 0xff) >>> 7)) & 0x1f)];
		chars[15] = C[(byte) (((entropy[3] & 0xff) >>> 2) & 0x1f)];
		chars[16] = C[(byte) (((entropy[3] << 3) | ((entropy[4] & 0xff) >>> 5)) & 0x1f)];
		chars[17] = C[(byte) (entropy[4] & 0x1f)];
		chars[18] = C[(byte) ((entropy[5] & 0xff) >>> 3)];
		chars[19] = C[(byte) (((entropy[5] << 2) | ((entropy[6] & 0xff) >>> 6)) & 0x1f)];
		chars[20] = C[(byte) (((entropy[6] & 0xff) >>> 1) & 0x1f)];
		chars[21] = C[(byte) (((entropy[6] << 4) | ((entropy[7] & 0xff) >>> 4)) & 0x1f)];
		chars[22] = C[(byte) (((entropy[7] << 1) | ((entropy[8] & 0xff) >>> 7)) & 0x1f)];
		chars[23] = C[(byte) (((entropy[8] & 0xff) >>> 2) & 0x1f)];
		chars[24] = C[(byte) (((entropy[8] << 3) | ((entropy[9] & 0xff) >>> 5)) & 0x1f)];
		chars[25] = C[(byte) (entropy[9] & 0x1f)];

		return new String(chars);
	}

	/**
	 * Checks ULID string validity.
	 * 
	 * @param ulid
	 *            ULID string
	 * @return true if ULID string is valid
	 */
	public static boolean isValid(CharSequence ulid) {
		if (ulid == null || ulid.length() != ULID_LENGTH) {
			return false;
		}
		for (int i = 0; i < ULID_LENGTH; i++) {
			/** We only care for chars between 0x00 and 0xff. */
			char c = ulid.charAt(i);
			if (c < 0 || c > V.length || V[c] == (byte) 0xff) {
				return false;
			}
		}
		return true;
	}

	/**
	 * Extract and return the timestamp part from ULID. Expects a valid ULID
	 * string. Call {@link io.azam.ulidj.ULID#isValid(CharSequence)} and check
	 * validity before calling this method if you do not trust the origin of the
	 * ULID string.
	 * 
	 * @param ulid
	 *            ULID string
	 * @return Unix epoch timestamp in millisecond
	 */
	public static long getTimestamp(CharSequence ulid) {
		return (long) V[ulid.charAt(0)] << 45 //
				| (long) V[ulid.charAt(1)] << 40 //
				| (long) V[ulid.charAt(2)] << 35 //
				| (long) V[ulid.charAt(3)] << 30 //
				| (long) V[ulid.charAt(4)] << 25 //
				| (long) V[ulid.charAt(5)] << 20 //
				| (long) V[ulid.charAt(6)] << 15 //
				| (long) V[ulid.charAt(7)] << 10 //
				| (long) V[ulid.charAt(8)] << 5 //
				| (long) V[ulid.charAt(9)];
	}

	/**
	 * Extract and return the entropy part from ULID. Expects a valid ULID
	 * string. Call {@link io.azam.ulidj.ULID#isValid(CharSequence)} and check
	 * validity before calling this method if you do not trust the origin of the
	 * ULID string.
	 * 
	 * @param ulid
	 *            ULID string
	 * @return Entropy bytes
	 */
	public static byte[] getEntropy(CharSequence ulid) {
		byte[] bytes = new byte[10];
		bytes[0] = (byte) ((V[ulid.charAt(10)] << 3) //
				| (V[ulid.charAt(11)] & 0xff) >>> 2);
		bytes[1] = (byte) ((V[ulid.charAt(11)] << 6) //
				| V[ulid.charAt(12)] << 1 //
				| (V[ulid.charAt(13)] & 0xff) >>> 4);
		bytes[2] = (byte) ((V[ulid.charAt(13)] << 4) //
				| (V[ulid.charAt(14)] & 0xff) >>> 1);
		bytes[3] = (byte) ((V[ulid.charAt(14)] << 7) //
				| V[ulid.charAt(15)] << 2 //
				| (V[ulid.charAt(16)] & 0xff) >>> 3);
		bytes[4] = (byte) ((V[ulid.charAt(16)] << 5) //
				| V[ulid.charAt(17)]);
		bytes[5] = (byte) ((V[ulid.charAt(18)] << 3) //
				| (V[ulid.charAt(19)] & 0xff) >>> 2);
		bytes[6] = (byte) ((V[ulid.charAt(19)] << 6) //
				| V[ulid.charAt(20)] << 1 //
				| (V[ulid.charAt(21)] & 0xff) >>> 4);
		bytes[7] = (byte) ((V[ulid.charAt(21)] << 4) //
				| (V[ulid.charAt(22)] & 0xff) >>> 1);
		bytes[8] = (byte) ((V[ulid.charAt(22)] << 7) //
				| V[ulid.charAt(23)] << 2 //
				| (V[ulid.charAt(24)] & 0xff) >>> 3);
		bytes[9] = (byte) ((V[ulid.charAt(24)] << 5) //
				| V[ulid.charAt(25)]);
		return bytes;
	}
}

  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值