Weighted random sample in Python

小编典典

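From your code:

weight_sample_indexes = lambda weights, k: random.sample(
    [val for val, cnt in enumerate(weights) for i in range(cnt)], k)

I assume that the weights are positive integers, and that "without replacement" means without replacement from the unraveled sequence.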
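To make that assumption concrete, here is a small sketch of what the unraveled sequence looks like. The helper name `unravel` and the weights `[1, 3, 2]` are illustrative only and do not come from the question.

```python
import random

def unravel(weights):
    """Repeat each index i exactly weights[i] times, as the lambda above does."""
    return [idx for idx, cnt in enumerate(weights) for _ in range(cnt)]

flat = unravel([1, 3, 2])
print(flat)  # [0, 1, 1, 1, 2, 2]

# Sampling without replacement from the flat list: index 0 can appear
# at most once, index 1 at most three times, index 2 at most twice.
picked = random.sample(flat, 4)
print(picked)
```

The drawback of this direct approach is that the flat list needs memory proportional to the sum of the weights.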

Here is a solution based on random.sample and an O(log n) __getitem__:

import bisect
import random
from collections import Counter
from collections.abc import Sequence  # the ABCs were removed from `collections` in Python 3.10

def weighted_sample(population, weights, k):
    return random.sample(WeightedPopulation(population, weights), k)

class WeightedPopulation(Sequence):
    def __init__(self, population, weights):
        assert len(population) == len(weights) > 0
        self.population = population
        self.cumweights = []
        cumsum = 0  # compute cumulative weight
        for w in weights:
            cumsum += w
            self.cumweights.append(cumsum)

    def __len__(self):
        # total weight, i.e. the length of the unraveled sequence
        return self.cumweights[-1]

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        # binary search for the element whose cumulative range contains i
        return self.population[bisect.bisect(self.cumweights, i)]

total = Counter()
for _ in range(1000):
    sample = weighted_sample("abc", [1,10,2], 5)
    total.update(sample)
print(sample)
print("Frequencies %s" % (dict(Counter(sample)),))
# Check that values are sane
print("Total " + ', '.join("%s: %.0f" % (val, count * 1.0 / min(total.values()))
                           for val, count in total.most_common()))

Output from one run (the sample is random, so results vary):

['b', 'b', 'b', 'c', 'c']
Frequencies {'c': 2, 'b': 3}
Total b: 10, c: 2, a: 1
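On Python 3.9 or later, `random.sample` also accepts a `counts` keyword argument that treats each element as if it were repeated the given number of times. That appears to cover the same positive-integer-weight case without a helper class. A minimal sketch, assuming Python 3.9+ (the wrapper name `weighted_sample_counts` is only illustrative):

```python
import random
from collections import Counter

def weighted_sample_counts(population, weights, k):
    # Assumes Python 3.9+: counts=weights makes random.sample behave as if
    # population[i] were repeated weights[i] times, sampled without replacement.
    return random.sample(population, k, counts=weights)

sample = weighted_sample_counts("abc", [1, 10, 2], 5)
print(sample)
print(Counter(sample))
```

As with the class-based version, "a" can appear at most once and "c" at most twice in any single sample.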

2020-07-28
