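Let's consider the continuous case first. For now we do not care about the final integers, so let's sample the X_i uniformly from the simplex, i.e. each X_i lies in the interval [0...1] and their sum equals 1: X_1 + X_2 + ... + X_n = 1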
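This is a well-known distribution, the Dirichlet distribution (here with all parameters equal to 1); the technique also goes by the names gamma variate sampling or simplex sampling. See "Generating N uniform random numbers that sum to M" for details and discussion. One could use random.gammavariate(a,1) or, to handle the corner points properly, note that a gamma variate with parameter 1 is equivalent to an exponential distribution and sample that directly. The direct sampling code is below: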
^{pr2}$
So from simplex_sampling you get back the vector of samples and their sum, which is used for normalization.
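As a rough illustration of the idea (not necessarily the code originally posted here), a function of this shape could look like the sketch below. It assumes simplex_sampling(n) returns a pair (sum, list of samples), which is how it is called further down, and it draws exponential variates by inverse transform. Using 1.0 - random.random() keeps the argument of the logarithm in (0, 1], which is one possible way to avoid log(0); that choice is an assumption of this sketch.

```python
import math
import random


def simplex_sampling(n):
    """Sketch: draw n exponential(1) variates and return (their sum, the list).

    Dividing each value by the returned sum gives a point on the simplex,
    i.e. non-negative numbers that add up to 1.
    """
    r = []
    total = 0.0
    for _ in range(n):
        # random.random() is in [0, 1), so 1 - u is in (0, 1] and log is finite
        u = random.random()
        t = -math.log(1.0 - u)
        r.append(t)
        total += t
    return total, r
```

Dividing every element of the returned list by the returned sum yields values in [0...1] that add up to 1, up to floating-point rounding.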
So, to use it for, say, N = 5:

N = 5
total, r = simplex_sampling(N)
norm = float(N) / total
# normalization together with mapping back to integers
result = []
for k in range(N):
    # t is now a float in [0.0...N], and all the t values sum to N
    t = r[k] * norm
    # not sure if you could have zeros,
    # and a check for boundaries might be useful, but
    # conversion to integers is trivial anyway:
    # values in [0...1) shall be converted to 0,
    # values in [1...2) shall be converted to 1, etc.
    result.append(int(t))
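To try the whole procedure end to end, here is a self-contained sketch. The name sample_integers is hypothetical, and random.expovariate(1.0) stands in for the direct exponential sampling described above. One thing worth keeping in mind: int() truncates toward zero, so while the scaled floats sum to N, the resulting integers sum to at most N and usually to less.

```python
import random


def sample_integers(n):
    """Sketch: scale n exponential variates to sum to n, then truncate.

    sample_integers is a hypothetical helper name used for illustration only.
    """
    # exponential(1) variates, equivalent to gamma variates with parameter 1
    r = [random.expovariate(1.0) for _ in range(n)]
    total = sum(r)
    norm = float(n) / total
    # each r[k] * norm lies in [0.0...n]; int() maps [0...1) to 0, [1...2) to 1, etc.
    return [int(x * norm) for x in r]


if __name__ == "__main__":
    result = sample_integers(5)
    # the integer sum is at most 5 because every value was rounded down
    print(result, sum(result))
```

If the integers must sum to exactly N, the leftover N - sum(result) would have to be distributed by some additional rule, which this sketch does not attempt.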