[Invitation Code System Design] Implementing an Invitation Code Generation Scheme

Latest recommended article published on 2024-03-07 17:05:51

ATFWUS

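Category: [System Design]. Tags: invitation code design, invitation code generation, invitation code generation schemes, character set mapping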

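This article is an original work by ATFWUS. Reposting is permitted, provided the author's name and a link to this article are included.

Article link: https://blog.csdn.net/ATFWUS/article/details/136482547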

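This article examines how invitation codes are generated. It explains how invitation code generation differs from unique ID generation and stresses the role of uniqueness checks during generation. It introduces a straightforward hash-based method, analyzes its capacity limitation, and addresses that limitation by converting the hash value to base 36. It then uses a distributed ID generation algorithm (such as Snowflake) to improve generation in distributed environments, so that codes stay unique and consistent across a distributed system.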

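1. Prerequisite Article

[Invitation Code System Design] An In-Depth Look at the Design and Implementation of Access-Restricted Invitation Codes

2. Overview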

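Invitation code generation and unique ID generation differ fundamentally in purpose. Unique ID generation aims to guarantee that every generated ID is globally unique; any duplicate is treated as an anomaly. Invitation code generation, by contrast, allows a uniqueness check during generation: if a duplicate is found, the system generates a new code. Although this check-and-retry mechanism exists, a well-designed generation strategy can significantly reduce the number of retries and thereby improve system performance, as described in Section 4 of the prerequisite article.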
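The check-and-retry flow described above can be sketched roughly as follows. This is a minimal illustration, not the article's implementation: `issue_unique_code`, the `generate` callback, the in-memory `used` set (standing in for a database lookup), and `max_attempts` are all hypothetical names chosen for this example.

```python
from typing import Callable, Set, Tuple


def issue_unique_code(
    generate: Callable[[], str],
    used: Set[str],
    max_attempts: int = 10,
) -> Tuple[str, int]:
    """Return (code, attempts_needed), retrying whenever a candidate is already taken."""
    for attempt in range(1, max_attempts + 1):
        candidate = generate()
        # Assumption: `used` stands in for a uniqueness check against persistent storage.
        if candidate not in used:
            used.add(candidate)
            return candidate, attempt
    raise RuntimeError("no unique code found within max_attempts")


# Usage: a canned generator that produces one duplicate before a fresh code.
candidates = iter(["abc123", "abc123", "xyz789"])
used_codes: Set[str] = set()
first = issue_unique_code(lambda: next(candidates), used_codes)
second = issue_unique_code(lambda: next(candidates), used_codes)
print(first, second)
```

The attempt count returned alongside each code is the quantity a good generation strategy tries to keep close to 1.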

3. A Straightforward Hash-Based Approach and Its Optimization

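Concatenate a timestamp $t$ and a sequence number $s$ as strings, hash the result, and take the first 6 characters of the hash as the invitation code. Expressed as a formula:
$I = \text{first\_6\_chars}\left( H(t \,\|\, s) \right)$
This approach is simple and direct, but it sharply reduces the scheme's capacity: a hexadecimal digest uses only 16 distinct characters, so the number of possible codes drops from $36^6$ to $16^6$.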
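To put numbers on that capacity gap, the arithmetic can be checked directly. The constant names below are illustrative only.

```python
CODE_LENGTH = 6

# Six characters drawn from the 16 hexadecimal digits.
HEX_CAPACITY = 16 ** CODE_LENGTH
# Six characters drawn from the 36-character set (a-z, 0-9).
BASE36_CAPACITY = 36 ** CODE_LENGTH

print(HEX_CAPACITY)      # 16777216
print(BASE36_CAPACITY)   # 2176782336
print(f"{BASE36_CAPACITY / HEX_CAPACITY:.1f}x")
```

In other words, keeping only six hex characters leaves under 1% of the codes that a six-character base-36 code could represent.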

import hashlib
import time

CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"
BASE = len(CHARSET)

def hash_function(input_value):
    # Hash the input with SHA-256
    hash_object = hashlib.sha256(str(input_value).encode())
    hex_dig = hash_object.hexdigest()
    # Keep only the first 6 hex characters (at most 16**6 distinct values)
    return int(hex_dig[:6], 16)

def encode(hash_value):
    # Convert the hash value into a 6-character string in base BASE
    code = ''
    for _ in range(6):
        code = CHARSET[hash_value % BASE] + code
        hash_value //= BASE
    return code

def generate_invite_code(timestamp, sequence):
    # Combine the timestamp and sequence number to produce an invitation code
    hash_value = hash_function(f"{timestamp}-{sequence}")
    invite_code = encode(hash_value)
    return invite_code

def unique_check(invite_code):
    # Placeholder uniqueness check; replace with a database lookup later
    # Assumes every generated code is unique
    return True

def get_unique_invite_code():
    timestamp = time.time()
    sequence = 0
    while True:
        invite_code = generate_invite_code(timestamp, sequence)
        if unique_check(invite_code):
            return invite_code
        sequence += 1


from tqdm import tqdm

def test_conflict_rate(iterations):
    # Stores every invitation code generated so far
    generated_codes = set()
    # Counts collisions
    conflicts = 0

    for _ in tqdm(range(iterations)):
        code = get_unique_invite_code()
        if code in generated_codes:
            conflicts += 1
        else:
            generated_codes.add(code)

    print(f"Total iterations: {iterations}")
    print(f"Total conflicts: {conflicts}")
    print(f"Conflict rate: {conflicts / iterations * 100:.6f}%")



# Test generating invitation codes 100 million times
#test_conflict_rate(100_000_000)

To match the capacity of the original character set, we convert the hash value to base 36:

$I = \text{first\_6\_chars}\left(\text{Base36}\left(H(t \,\|\, s)\right)\right)$
The code is as follows:

import hashlib
import time
from tqdm import tqdm

CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"
BASE = len(CHARSET)


def hash_function(input_value):
    hash_object = hashlib.sha256(str(input_value).encode())
    # Convert the hash to a large integer
    return int(hash_object.hexdigest(), 16)


def base36_encode(number):
    if not isinstance(number, int):
        raise TypeError('Number must be an integer')

    base36 = ''
    while number:
        number, i = divmod(number, BASE)
        base36 = CHARSET[i] + base36

    # The zero digit in this character set is CHARSET[0], not the character '0'
    return base36 or CHARSET[0]


def generate_invite_code(timestamp, sequence):
    hash_value = hash_function(f"{timestamp}-{sequence}")
    # Ensure the code is 6 characters long
    invite_code = base36_encode(hash_value)[:6]
    return invite_code


def get_unique_invite_code(sequence):
    timestamp = int(time.time())
    invite_code = generate_invite_code(timestamp, sequence)
    return invite_code


def test_conflict_rate(iterations):
    generated_codes = set()
    conflicts = 0

    for i in tqdm(range(iterations)):
        code = get_unique_invite_code(i)
        if code in generated_codes:
            conflicts += 1
        else:
            generated_codes.add(code)

    print(f"Total iterations: {iterations}")
    print(f"Total conflicts: {conflicts}")
    print(f"Conflict rate: {conflicts / iterations * 100:.6f}%")


# Test the function with a significant number of iterations
test_conflict_rate(100_000_000)

100%|██████████| 100000000/100000000 [11:11<00:00, 148957.67it/s]
Total iterations: 100000000
Total conflicts: 10651577
Conflict rate: 10.651577%
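As a rough sanity check on this result (an analysis added here, not something the original article states), the measured rate can be compared with what uniformly distributed codes would give. The sketch below estimates the expected conflict rate for uniform draws over $36^6$ codes, and also inspects the base-36 form of the largest SHA-256 value. One possible contributor to the gap is that the leading base-36 digit of a 256-bit number can only take a few values, so taking the first 6 characters does not use the full code space evenly. The helper names are hypothetical.

```python
import math

BASE = 36
CODE_LENGTH = 6
HASH_BITS = 256  # SHA-256 output size in bits


def base36_digit_count(number: int) -> int:
    """Number of base-36 digits needed to write a non-negative integer."""
    count = 0
    while number:
        number //= BASE
        count += 1
    return max(count, 1)


def expected_conflict_rate(samples: int, capacity: int) -> float:
    """Approximate share of draws that repeat an earlier code, assuming uniform draws."""
    expected_distinct = -capacity * math.expm1(-samples / capacity)
    return 1 - expected_distinct / samples


max_hash = 2 ** HASH_BITS - 1
digit_count = base36_digit_count(max_hash)
# Largest value the most significant base-36 digit can take for a full-length hash.
max_leading_digit = max_hash // BASE ** (digit_count - 1)

print(digit_count)         # 50
print(max_leading_digit)   # 6
print(f"{expected_conflict_rate(100_000_000, BASE ** CODE_LENGTH):.4%}")
```

Under the uniform assumption the expected rate is a little over 2%, well below the measured figure. Since full-length hashes can only start with digit values 0 through 6, most codes cluster under a handful of leading characters. Taking the low-order six digits instead (the hash value modulo `BASE ** CODE_LENGTH`) would be one way to avoid that skew; that variant is a suggestion here, not part of the original scheme.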

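Although this keeps the character set consistent, it relies on an auto-incrementing sequence number. In a distributed environment, something like a distributed Redis deployment has to be introduced to maintain that sequence, which adds system complexity.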

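5. An Improved Hashing Scheme (Using a Distributed ID Generation Algorithm)

Use a distributed ID generation algorithm (such as Snowflake) to generate a unique ID $DID$, then hash $DID$, convert the hash to base 36, and take the first 6 characters as the invitation code:
$I = \text{first\_6\_chars}\left(\text{Base36}\left(H(DID)\right)\right)$
Compared with hashing a timestamp and sequence number directly, this method is more robust in a distributed environment, because there is no shared sequence number to keep in sync.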
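A minimal sketch of this scheme is shown below. It assumes the commonly cited Snowflake bit layout (41-bit millisecond timestamp, 10-bit worker ID, 12-bit per-millisecond sequence). The class `SnowflakeLikeGenerator`, the custom epoch `EPOCH_MS`, and the helper names are hypothetical and chosen for illustration; a production system would more likely rely on an existing ID service.

```python
import hashlib
import threading
import time

CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"
BASE = len(CHARSET)
CODE_LENGTH = 6
EPOCH_MS = 1_700_000_000_000  # assumption: arbitrary custom epoch in milliseconds


class SnowflakeLikeGenerator:
    """Simplified Snowflake-style ID generator (illustrative only)."""

    def __init__(self, worker_id: int):
        if not 0 <= worker_id < 1024:
            raise ValueError("worker_id must fit in 10 bits")
        self.worker_id = worker_id
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now < self.last_ms:
                now = self.last_ms  # clock moved backwards: keep using the last timestamp
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    # Sequence exhausted for this millisecond: wait for the next one.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            return ((now - EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence


def base36_encode(number: int) -> str:
    digits = ""
    while number:
        number, index = divmod(number, BASE)
        digits = CHARSET[index] + digits
    return digits or CHARSET[0]


def invite_code_from_id(did: int) -> str:
    # I = first_6_chars(Base36(H(DID))), with SHA-256 as H
    hash_value = int(hashlib.sha256(str(did).encode()).hexdigest(), 16)
    return base36_encode(hash_value)[:CODE_LENGTH]


generator = SnowflakeLikeGenerator(worker_id=1)
did = generator.next_id()
print(did, invite_code_from_id(did))
```

Each node only needs its own `worker_id`, so no shared counter has to be coordinated. The uniqueness check on the resulting code is still required, because distinct IDs can hash to the same six characters.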